TIME Benchmark
Source
- Dataset metadata snapshot: source.md
- Metadata JSON: metadata.json
- Official Hugging Face: https://huggingface.co/datasets/Real-TSF/TIME
- Official leaderboard: https://huggingface.co/spaces/Real-TSF/TIME-leaderboard
- Paper: https://arxiv.org/abs/2602.12147
Core Claim
TIME is a task-centric zero-shot forecasting benchmark built from fresh datasets to reduce benchmark contamination. Toto 2.0 uses it as part of its scaling-era benchmark evidence.
Dataset Notes
- The Hugging Face card describes 50 fresh datasets and 98 forecasting tasks.
- The artifact includes task-level data and window-level prediction results for benchmark visualization.
- The benchmark's positioning rests on two claims: strict zero-shot evaluation and resistance to contamination.
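
To make the contamination concern concrete, here is a minimal sketch of two checks a reader might run before treating a result as zero-shot. It is not TIME's actual procedure. The function names, the name-normalization rule, and the example dataset names and dates are all hypothetical, chosen only for illustration.

```python
from datetime import date


def normalize(name: str) -> str:
    """Canonicalize a dataset name so trivial spelling variants compare equal."""
    return name.strip().lower().replace("_", "-")


def name_overlap(pretraining_names, benchmark_names):
    """Return benchmark datasets whose normalized name appears in the pretraining corpus."""
    seen = {normalize(n) for n in pretraining_names}
    return sorted(n for n in benchmark_names if normalize(n) in seen)


def stale_datasets(benchmark_release_dates, training_cutoff):
    """Return benchmark datasets released on or before the model's training cutoff.

    Such datasets are not guaranteed to be contaminated, but they cannot be
    assumed fresh with respect to that model.
    """
    return sorted(
        name
        for name, released in benchmark_release_dates.items()
        if released <= training_cutoff
    )


if __name__ == "__main__":
    # Hypothetical inputs, not taken from TIME or any real model card.
    print(name_overlap(["ETT_h1", "Traffic"], ["ett-h1", "fresh-a"]))
    print(stale_datasets(
        {"fresh-a": date(2025, 6, 1), "old-b": date(2024, 1, 1)},
        training_cutoff=date(2024, 12, 31),
    ))
```

Name matching and release dates catch only the most direct overlap. They do not detect renamed or re-hosted copies of the same series, benchmark-specific tuning, or repeated leaderboard exposure.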
Why It Matters
TIME belongs in Time-Series Benchmark Hygiene because it directly addresses the concern that public forecasting benchmarks can become contaminated by pretraining corpora, benchmark-specific tuning, or repeated leaderboard exposure.
Limitations
- TIME is not primarily an observability benchmark.
- It is not primarily an HDTSF benchmark.
- It remains passive forecasting data, not an action-conditioned world-model dataset.