GIFT-Eval: General Time Series Forecasting Model Evaluation

Source

Core Claim

GIFT-Eval is a broad general-purpose forecasting benchmark for comparing time-series foundation models across domains, frequencies, variate counts, and prediction lengths.

Dataset Notes

  • The Hugging Face card describes 144,000 time series, roughly 177 million data points, and 97 forecasting configurations.
  • The suite includes a pretraining dataset curated to exclude the evaluation data, so zero-shot results are not inflated by test leakage.
  • Public summaries and papers report slightly different dataset counts, so any exact count should cite the specific artifact and version it was read from.
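
One way to keep counts tied to an artifact is to store them as a small record that carries its own provenance. The sketch below is illustrative only: the `CountRecord` type, the `differing_fields` helper, and the "UNPINNED" version placeholder are assumptions made for this note, not part of GIFT-Eval. The first record uses the figures quoted above from the Hugging Face card; the second is a deliberately artificial variation used only to show how a disagreement would surface.

```python
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class CountRecord:
    """Benchmark counts plus where and when they were read (hypothetical type)."""
    artifact: str     # where the numbers came from, e.g. a dataset card or paper
    version: str      # commit hash, tag, or access date; placeholder below
    num_series: int
    num_points: int
    num_configs: int


def differing_fields(a: CountRecord, b: CountRecord) -> dict:
    """Return {field: (a_value, b_value)} for every count that disagrees."""
    count_fields = ("num_series", "num_points", "num_configs")
    da, db = asdict(a), asdict(b)
    return {k: (da[k], db[k]) for k in count_fields if da[k] != db[k]}


# Figures as quoted in this note; the version is left unpinned on purpose.
hf_card = CountRecord("Hugging Face card", "UNPINNED", 144_000, 177_000_000, 97)

# Artificial record (NOT a real reported number), built only to exercise the diff.
artificial = replace(hf_card, artifact="artificial example",
                     num_configs=hf_card.num_configs + 1)

print(differing_fields(hf_card, artificial))
```

Filling in `version` with a real commit hash or tag before quoting a count makes later discrepancies traceable to a source rather than to memory.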

Why It Matters

GIFT-Eval is a central benchmark for Toto, Toto 2.0, and many other time-series foundation-model sources in this repository. It supports benchmark hygiene because it keeps training and test data separate and publishes its leaderboard protocol.
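
The train/test separation can be spot-checked on one's own side with a simple overlap test between the identifiers of the pretraining corpora and those of the evaluation datasets. This is a minimal sketch under stated assumptions: the `leaked_ids` helper and every dataset name below are hypothetical placeholders, and matching by normalized name is only a crude proxy for real leakage detection, which would also need to compare the underlying series.

```python
def leaked_ids(pretrain_ids, eval_ids):
    """Return the sorted identifiers that appear in both collections.

    Names are lower-cased and stripped first so trivial formatting
    differences do not hide an overlap.
    """
    def normalize(name):
        return name.strip().lower()

    pretrain = {normalize(n) for n in pretrain_ids}
    evaluation = {normalize(n) for n in eval_ids}
    return sorted(pretrain & evaluation)


# Hypothetical identifiers, for illustration only.
pretrain_corpora = ["corpus_a", "Corpus_B ", "corpus_c"]
evaluation_sets = ["bench_x", "corpus_b"]

overlap = leaked_ids(pretrain_corpora, evaluation_sets)
print(overlap)  # ['corpus_b']
```

An empty result is necessary but not sufficient: the same series can appear under different names, so a clean name check does not prove the absence of leakage.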

Limitations

  • It is not an observability-specific benchmark.
  • Its tasks rarely reach the hundreds-to-thousands channel regime that BOOM and Time-HD are built to stress.
  • Component dataset licenses and terms should be checked for downstream use.