FoNE: Precise Single-Token Number Embeddings Via Fourier Features

Source

Core Claim

FoNE maps each number into a single-token embedding built from Fourier features, using sine/cosine components at digit-aligned periods so numeric values can be represented without fragmenting them into subword or per-digit tokens.

Key Contributions

  • Defines Fourier Number Embedding as a concatenation of circular embeddings over powers-of-10 periods.
  • Uses each sine/cosine pair to recover a modular component of the number, giving a digit-aligned representation.
  • Adds the Fourier number embedding to a learned [NUM] token, then decodes numbers by matching hidden-state pairs to digit embeddings.
  • Reports stronger arithmetic performance and data efficiency than subword and digit-wise baselines in its controlled experiments.
  • Builds directly on the observation that pretrained LLMs already contain Fourier-like number features.
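The concatenation of circular embeddings described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function name `fone_embedding` and the choice of five periods are assumptions for the example, and the learned [NUM] offset and model-dimension projection are omitted.

```python
import numpy as np

def fone_embedding(x: float, num_digits: int = 5) -> np.ndarray:
    # Hypothetical sketch of a Fourier Number Embedding.
    # For each power-of-10 period T = 10, 100, ..., 10**num_digits,
    # emit the pair (cos(2*pi*x/T), sin(2*pi*x/T)). The pair at period T
    # places x mod T on the unit circle, so adjacent periods jointly
    # pin down each decimal digit.
    feats = []
    for i in range(1, num_digits + 1):
        period = 10.0 ** i
        angle = 2 * np.pi * x / period
        feats.extend([np.cos(angle), np.sin(angle)])
    return np.array(feats)

emb = fone_embedding(472.0)
print(emb.shape)  # (10,): one (cos, sin) pair per period
```

Note that the period-10 pair is identical for 3 and 13: each pair is periodic in its own modulus, which is exactly the digit-aligned property the bullet list describes.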

Method Notes

FoNE is the cleanest source in this batch for a smooth, periodic basis view of number embeddings. Its closest time-series analogy is EIDOS-style point-wise scalar encoding: both use bounded periodic basis functions to map scalar numeric values into higher-dimensional representations.

The difference is semantic and operational. FoNE is designed for literal numbers in language-model text and arithmetic outputs. EIDOS maps observed time-series samples into latent tokens for passive forecasting. The two should not be collapsed into one method, but they support the same broader design question: scalar numeric values may deserve specialized embeddings rather than ordinary tokenization.

Slug note: this page uses the arXiv submission year 2025 in the slug, while the OpenReview venue page lists the paper as an ICLR 2026 poster.

Evidence And Results

The abstract and results report that FoNE reduces the number of tokens per number and improves arithmetic accuracy in controlled language-model experiments. The project page and paper present a concrete tokenization comparison for a decimal number, then show how modular Fourier components represent digits.
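The digit-recovery direction can also be made concrete: a (cos, sin) pair at period T determines x mod T, which can be inverted with a quadrant-aware arctangent. This is a numerical sketch of that modular reading, not FoNE's actual decoder (the paper matches hidden states to digit embeddings rather than computing atan2); the helper names are assumptions.

```python
import numpy as np

def circle_pair(x: float, period: int):
    # Encode x on the unit circle at the given period; the resulting
    # (cos, sin) pair determines x mod period exactly.
    angle = 2 * np.pi * x / period
    return np.cos(angle), np.sin(angle)

def decode_digit(pair, period: int) -> int:
    # Invert the circle encoding with atan2 to recover x mod period,
    # then read off the leading digit of that remainder.
    c, s = pair
    angle = np.arctan2(s, c) % (2 * np.pi)
    remainder = angle / (2 * np.pi) * period
    return int(round(remainder)) % period // (period // 10)

print(decode_digit(circle_pair(472, 100), 100))  # tens digit: 7
print(decode_digit(circle_pair(472, 10), 10))    # units digit: 2
```

The design point this illustrates is why the representation is digit-aligned: each digit of the input is a local function of two adjacent sin/cos pairs, with no carries propagating across the whole number.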

The source is also important historically: it cites Pre-trained Large Language Models Use Fourier Features To Compute Addition as the mechanistic motivation for explicitly building Fourier number embeddings.

Limitations

FoNE’s strongest claims are for controlled arithmetic tasks. BitTokens challenges its generality, arguing that sinusoidal/Fourier encodings are well suited to addition but force non-local decoding and re-encoding for multiplication and division. Treat FoNE as an important representation proposal, not as a settled universal numeric encoding.

Open Questions

  • Can FoNE-style periodic scalar embeddings improve point-wise time-series embeddings beyond arithmetic tasks?
  • Should Fourier number embeddings be combined with bit-level or logarithmic encodings to cover both addition and multiplication-like operations?
  • How should sign, uncertainty, missingness, and measurement units be represented when FoNE-style encodings are applied to auxiliary numeric values?