Sundial: A Family of Highly Capable Time Series Foundation Models

Source

Core Claim

Sundial is a family of native continuous-valued time-series foundation models that uses TimeFlow Loss, a flow-matching objective, to generate multiple probable forecasts without discrete tokenization or a fixed parametric predictive distribution.

Key Contributions

  • Introduces TimeFlow Loss for autoregressive generative forecasting, where a small flow-matching network conditioned on Transformer representations learns to sample future numeric patches (see the training sketch after this list).
  • Uses continuous patch tokenization, context-level re-normalization, a decoder-only Transformer, RoPE, Pre-LN, FlashAttention, KV cache support, and multi-patch prediction for flexible zero-shot forecasting.
  • Curates TimeBench, a pretraining corpus of about one trillion time points from mostly real-world sources plus a small synthetic component.
  • Reports zero-shot results on Time-Series-Library, GIFT-Eval, and FEV, covering point forecasting and probabilistic forecasting.
  • Frames generative forecasting as a way to reduce mode collapse from MSE-style objectives and to estimate arbitrary forecast statistics from sampled trajectories.
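The TimeFlow objective can be pictured as a conditional flow-matching loss on future patches. Below is a minimal PyTorch sketch assuming a rectified-flow-style linear interpolation path; the names (`VelocityNet`, `flow_matching_loss`) and layer sizes are illustrative and do not mirror the paper's released implementation.

```python
import torch
import torch.nn as nn

class VelocityNet(nn.Module):
    """Small MLP that predicts a velocity field for one future patch,
    conditioned on the Transformer representation of the lookback window."""
    def __init__(self, patch_dim: int, cond_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(patch_dim + cond_dim + 1, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, patch_dim),
        )

    def forward(self, x_t, t, cond):
        # x_t: (B, patch_dim) noisy patch, t: (B, 1) flow time, cond: (B, cond_dim)
        return self.net(torch.cat([x_t, t, cond], dim=-1))

def flow_matching_loss(velocity_net, target_patch, cond):
    """Conditional flow matching on one future patch (illustrative sketch)."""
    noise = torch.randn_like(target_patch)                      # x_0 ~ N(0, I)
    t = torch.rand(target_patch.size(0), 1,
                   device=target_patch.device)                  # flow time in [0, 1]
    x_t = (1.0 - t) * noise + t * target_patch                  # linear interpolation path
    target_velocity = target_patch - noise                      # d x_t / d t along the path
    pred_velocity = velocity_net(x_t, t, cond)
    return torch.nn.functional.mse_loss(pred_velocity, target_velocity)
```

Because the loss is computed per patch conditioned on the lookback representation, the same Transformer backbone can be trained autoregressively without committing to a fixed parametric output distribution.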

Benchmarked Models

  • Model: Sundial-Base-128M
  • Role in paper: main released benchmarked checkpoint
  • Notes: base member of the Sundial family; the paper lists patch size 16, context length 2880, prediction lengths 16 and 720, 12 Transformer layers, hidden dimension 768, 12 attention heads, a 3-layer TimeFlow module, and 128M parameters.
  • Official artifact: thuml/sundial-base-128m
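For reference, the hyperparameters listed above can be gathered into a small config object. This is an illustrative container only; the field names are assumptions and do not mirror the released thuml/sundial-base-128m configuration class.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SundialBaseConfig:
    """Hyperparameters of Sundial-Base-128M as listed in the paper
    (illustrative container, not the released config class)."""
    patch_size: int = 16
    context_length: int = 2880
    prediction_lengths: tuple = (16, 720)
    num_layers: int = 12
    hidden_dim: int = 768
    num_heads: int = 12
    timeflow_layers: int = 3   # depth of the TimeFlow flow-matching module
    # total parameter count is roughly 128M
```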

Method Notes

Sundial treats multivariate time series through a univariate pretraining format rather than through explicit cross-channel dynamics. Each variable is normalized and modeled as its own sequence of continuous-valued patches, so the model's main inductive bias is temporal continuity and autoregressive generation rather than channel interaction.
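A minimal sketch of this channel-independent formatting is shown below, assuming simple per-variable standardization over the lookback window as the context-level re-normalization; it is not Sundial's released preprocessing code.

```python
import numpy as np

def to_univariate_patches(series: np.ndarray, patch_size: int = 16):
    """Split a multivariate series (time, channels) into independent
    univariate patch sequences with per-context normalization."""
    time_len, num_channels = series.shape
    usable = (time_len // patch_size) * patch_size
    sequences = []
    for c in range(num_channels):
        context = series[:usable, c]
        # context-level re-normalization: standardize each variable over its lookback
        mean, std = context.mean(), context.std() + 1e-8
        normalized = (context - mean) / std
        patches = normalized.reshape(-1, patch_size)   # (num_patches, patch_size)
        sequences.append({"patches": patches, "mean": mean, "std": std})
    return sequences
```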

TimeFlow Loss conditions a flow-matching sampler on each lookback representation. At inference time, Sundial starts from Gaussian noise, follows a learned velocity field for a fixed number of steps, and repeats sampling to estimate medians, quantiles, and other forecast statistics.
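The sampling loop can be sketched as plain Euler integration of the learned velocity field, reusing the `VelocityNet` interface from the earlier sketch; the step count, sample count, and reported quantiles here are illustrative rather than the paper's settings.

```python
import torch

@torch.no_grad()
def sample_forecasts(velocity_net, cond, patch_dim, num_samples=20, num_steps=10):
    """Integrate the learned velocity field from Gaussian noise and repeat
    sampling to build an empirical forecast distribution (illustrative sketch)."""
    cond = cond.expand(num_samples, -1)                       # cond: (1, cond_dim), reused per sample
    x = torch.randn(num_samples, patch_dim, device=cond.device)  # start from Gaussian noise
    dt = 1.0 / num_steps
    for step in range(num_steps):
        t = torch.full((num_samples, 1), step * dt, device=cond.device)
        x = x + dt * velocity_net(x, t, cond)                 # Euler step along the velocity field
    # estimate forecast statistics from the sampled trajectories
    median = x.median(dim=0).values
    q10, q90 = x.quantile(0.1, dim=0), x.quantile(0.9, dim=0)
    return median, q10, q90
```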

For the knowledge base’s world-model frame, Sundial is a passive probabilistic forecasting model. It improves generative uncertainty modeling for future numeric observations, but it does not introduce explicit action, control input, intervention, or treatment channels.

Evidence And Results

  • On Time-Series-Library long-horizon forecasting, the paper reports Sundial ahead of the compared time-series foundation models on average MSE and MAE, with the family improving as parameter count grows.
  • On GIFT-Eval, Sundial reports the best MASE among the compared zero-shot and supervised models and the second-best CRPS behind TimesFM.
  • On FEV, Sundial reports second place among zero-shot pretrained models behind Chronos while being much faster at inference in the paper’s benchmark.
  • TimeFlow ablations report better average point-forecasting results than diffusion loss and MSE loss, and better CRPS than those objectives on most evaluated datasets.
  • Data-scaling experiments compare Sundial trained on 94B, 230B, and 1032B time points, arguing that the larger TimeBench scale improves zero-shot forecasting.

Limitations

  • The paper notes possible hallucinations in generated forecasts despite larger model capacity.
  • TimeBench is weighted toward middle- and low-frequency time series, so performance on very high-frequency data is not guaranteed.
  • The released approach uses a simple Gaussian-noise sampling procedure, leaving sampling strategy and post-processing as future work.
  • Sundial’s univariate pretraining format does not explicitly model multivariate channel correlations, covariates, actions, control inputs, or interventions.
  • Autoregressive rolling for long horizons can still lead to over-smooth or unreliable predictions.

Open Questions

  • How much of Sundial’s gain comes from TimeFlow Loss itself versus TimeBench scale and the engineering upgrades to the Transformer backbone?
  • Can the generative sampler be made more reliable for high-frequency or long-horizon forecasts without sacrificing inference speed?
  • Would explicit multivariate, covariate, action, control input, or intervention channels preserve Sundial’s zero-shot flexibility while making it more useful as a world-model component?