Chronos-2: From Univariate to Universal Forecasting

Core Claim

Chronos-2 is a zero-shot time-series foundation model that extends the Chronos line from mostly univariate forecasting to a universal forecasting interface: univariate targets, multivariate targets, past-only covariates, known-future covariates, categorical covariates, and cross-learning across related series.

Benchmarked Model Entry

  • Model: Chronos-2
  • Family: Chronos time-series foundation models
  • Organization: Amazon Web Services and collaborators
  • Parameters: 120M for the benchmarked base model
  • Primary task surface: zero-shot probabilistic forecasting over univariate, multivariate, and covariate-informed time-series tasks
  • Official artifact: the Amazon Science Chronos forecasting repository and the amazon/chronos-2 Hugging Face checkpoint.

Key Contributions

  • Introduces group attention, alternating with time attention, so the model can share information across related series within a group while preserving temporal attention within each series.
  • Uses a unified input construction for targets and covariates, including past-only covariates, known-future covariates, and categorical covariates represented as numeric features.
  • Trains on heterogeneous forecasting tasks and relies on synthetic multivariate and covariate-informed data, created by imposing multivariate structure on univariate generators, to teach the model in-context forecasting behavior.
  • Produces direct multi-step quantile forecasts with a 21-quantile grid, including extreme quantiles for rare-event and risk-aware forecasting.
  • Reports state-of-the-art results across fev-bench, GIFT-Eval, and Chronos Benchmark II against pretrained forecasting baselines and statistical forecasting baselines.
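The direct multi-step quantile objective in the list above can be made concrete with the standard pinball (quantile) loss. The 21-level grid below is an assumption for illustration (0.05 to 0.95 in steps of 0.05 plus 0.01 and 0.99 tails), not necessarily the paper's exact grid:

```python
import numpy as np

def pinball_loss(y_true, y_pred_q, quantiles):
    """Mean pinball (quantile) loss over a grid of quantile levels.

    y_true:    (horizon,) observed values
    y_pred_q:  (n_quantiles, horizon) predicted quantiles
    quantiles: (n_quantiles,) levels in (0, 1)
    """
    q = np.asarray(quantiles)[:, None]               # (n_q, 1)
    err = np.asarray(y_true)[None, :] - y_pred_q     # (n_q, horizon)
    return float(np.mean(np.maximum(q * err, (q - 1) * err)))

# an assumed 21-level grid: 0.05..0.95 in steps of 0.05, plus 0.01/0.99 tails
QUANTILES = np.concatenate([[0.01], np.arange(0.05, 0.96, 0.05), [0.99]])
assert len(QUANTILES) == 21

# toy direct multi-step forecast over a 4-step horizon
y = np.array([10.0, 11.0, 12.0, 13.0])
pred = y[None, :] + (QUANTILES[:, None] - 0.5) * 4.0  # spread quantiles around truth
loss = pinball_loss(y, pred, QUANTILES)
print(round(loss, 4))
```

Summing this loss over all quantile levels is what turns a point-forecasting head into a distributional one: the tail levels (0.01, 0.99) are the ones that matter for the rare-event use cases the bullet mentions.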

Method Notes

Chronos-2 treats a group as a flexible unit of relatedness. A group can be a single target series, a batch of related univariate series, variates of one multivariate time series, or targets together with covariates. The group attention layer shares information across series at a matching patch index, while time attention models the temporal sequence inside each input dimension.
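The alternating pattern can be pictured with plain scaled dot-product attention over a (series, patches, dim) tensor. This is an illustrative toy, not the paper's architecture: there are no learned projections or heads, and all names and shapes here are assumptions. Time attention mixes patches within one series; group attention mixes series at the same patch index:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    """Single-head self-attention over the second-to-last axis.

    x: (..., seq, dim); queries, keys, and values are all x itself
    (no learned projections in this toy).
    """
    scores = x @ np.swapaxes(x, -1, -2) / np.sqrt(x.shape[-1])
    return softmax(scores) @ x

def time_attention(h):
    # h: (series, patches, dim) -- attend across patches within each series
    return self_attention(h)

def group_attention(h):
    # attend across series at a matching patch index: make the series axis
    # the sequence axis, attend, then swap back
    h_t = np.swapaxes(h, 0, 1)                 # (patches, series, dim)
    return np.swapaxes(self_attention(h_t), 0, 1)

rng = np.random.default_rng(0)
h = rng.normal(size=(3, 8, 4))                 # 3 related series, 8 patches, dim 4
for _ in range(2):                             # alternate the two attention types
    h = h + time_attention(h)                  # temporal mixing inside each series
    h = h + group_attention(h)                 # cross-series mixing per patch
print(h.shape)
```

The key design point survives even in this sketch: group attention never mixes different patch positions, so cross-series information flow stays aligned in time, while time attention never mixes series.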

This is directly relevant to the knowledge base’s world-model frame: Chronos-2 makes covariates a first-class interface for forecasting, but it does not model actions, control inputs, or interventions as separate causal operators. Its covariates are exogenous variables or known-future numeric features, not explicit decision channels.
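One way to picture the unified interface is as a stack of aligned numeric channels over a shared timeline, with past-only covariates unavailable over the forecast horizon and known-future covariates filled in. The channel names and the NaN-masking convention below are assumptions for illustration, not the paper's exact encoding:

```python
import numpy as np

CONTEXT, HORIZON = 6, 3
T = CONTEXT + HORIZON

# target: observed over the context, unknown (to be forecast) over the horizon
target = np.array([5.0, 6.0, 5.5, 7.0, 6.5, 8.0] + [np.nan] * HORIZON)

# past-only covariate (e.g. observed footfall): no future values available
footfall = np.array([100, 120, 110, 140, 130, 160] + [np.nan] * HORIZON, float)

# known-future covariate (e.g. a holiday flag): fully specified over all of T
holiday = np.array([0, 0, 1, 0, 0, 0, 1, 0, 0], float)

# categorical covariate represented as a numeric feature (e.g. a store id)
store_id = np.full(T, 2.0)

# one group: the target plus covariates as aligned rows of a (channels, T) array
group = np.stack([target, footfall, holiday, store_id])
print(group.shape)  # (4, 9)
```

Under this picture, "actions" would just be one more numeric channel; nothing in the layout distinguishes an intervention from any other known-future covariate, which is the limitation the paragraph above points at.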

Evidence And Results

  • On fev-bench, the paper reports Chronos-2 as the strongest model under scaled quantile loss, with an average win rate of 90.7%, a skill score of 47.3%, and no reported leakage or failures.
  • On GIFT-Eval, Chronos-2 leads the compared pretrained forecasting models under both weighted quantile loss and mean absolute scaled error.
  • On Chronos Benchmark II, Chronos-2 outperforms the compared models under both probabilistic and point forecasting metrics.
  • The largest in-context learning gains appear on covariate-informed fev-bench tasks, where univariate inference ignores useful exogenous variables.
  • Energy and retail case studies show that Chronos-2 uses load, renewable-generation, promotion, holiday, and footfall covariates to improve forecasts over univariate mode.
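To make the aggregate numbers above interpretable, here is a hedged sketch of how a pairwise win rate and a baseline-relative skill score are commonly computed from per-task losses; fev-bench's exact definitions may differ, and all data below is toy:

```python
import numpy as np

def win_rate(model_losses, rival_losses_list):
    """Fraction of (task, rival) comparisons the model wins (lower loss)."""
    wins = [np.mean(model_losses < rival) for rival in rival_losses_list]
    return float(np.mean(wins))

def skill_score(model_losses, baseline_losses):
    """Relative error reduction vs a baseline, aggregated by geometric mean.

    1 - geomean(model / baseline); positive means better than the baseline.
    """
    ratios = np.asarray(model_losses) / np.asarray(baseline_losses)
    return float(1.0 - np.exp(np.mean(np.log(ratios))))

# toy per-task losses for one model, two rivals, and a naive baseline
model = np.array([0.8, 1.1, 0.6, 0.9])
rivals = [np.array([1.0, 1.0, 0.9, 1.2]), np.array([0.9, 1.3, 0.7, 0.8])]
baseline = np.array([1.5, 1.4, 1.2, 1.0])

print(round(win_rate(model, rivals), 3), round(skill_score(model, baseline), 3))
```

A 90.7% win rate therefore means the model beat the compared alternative in roughly nine of ten per-task comparisons, and a 47.3% skill score means roughly halving the baseline's scaled quantile loss on average, under the geometric-mean reading assumed here.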

Limitations

  • The model supports numeric and categorical covariates, but the paper identifies multimodal covariates such as text as future work.
  • The abilities beyond univariate forecasting rely on synthetic multivariate and covariate-informed training data rather than real multivariate pretraining corpora.
  • Chronos-2’s pretraining corpus excludes the GIFT-Eval test sets, but the paper notes partial overlap with the training portions of some GIFT-Eval datasets; it therefore reports a synthetic-only ablation as stricter zero-shot evidence.
  • The reported multivariate gains are modest compared with the larger covariate-informed gains, suggesting that strong univariate modeling still captures much of the useful structure in some multivariate benchmarks.

Open Questions

  • How much of Chronos-2’s covariate advantage transfers to domains where the covariates are noisy forecasts rather than observed future features?
  • Can group attention be extended to explicit action, control input, or intervention channels for counterfactual world-model use cases?
  • How robust is the synthetic multivariatizer training recipe when downstream variables have sparse, delayed, or regime-dependent coupling?