Exploring Large Models for Time Series

Source

Core Claim

This THUML presentation frames large time-series models as an early-stage foundation-model direction, where generalization across domains, task generality, scalability, and usable model interfaces all remain far less mature than in language or vision models.

Curation Notes

Alex’s note: treat this primarily as a strong historical overview of the large-time-series-model landscape at the time of publication. Its value comes from the Tsinghua/THUML time-series research school being an active, capable group in the area; use it as a field map and perspective source before checking specific benchmark claims against primary papers.

Key Contributions

  • Surveys early native pretrained time-series models, including ForecastPFN, Lag-Llama, TimesFM, TimeGPT, MOIRAI, Chronos, MOMENT, and Timer.
  • Separates native time-series pretraining from large-language-model adaptation for time series, with AutoTimes used as the main THUML example of an LLM-based forecasting route.
  • Presents Timer as a task-general decoder-only time-series model built around UTSD, single-series sequence formatting, next-token prediction, and generative formulations for forecasting, imputation, and anomaly detection.
  • Emphasizes long-context forecasting and Timer-XL as a route toward larger context windows for time-series foundation models.
  • Introduces OpenLTM as a codebase for developing and evaluating large time-series model designs.

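The single-series, next-token formulation attributed to Timer above can be sketched minimally: a univariate series is segmented into fixed-length patches ("tokens"), and the model learns to predict each token from the tokens before it, as in a decoder-only language model. This is an illustrative sketch of the interface, not the Timer implementation; function and variable names are hypothetical.

```python
# Hypothetical sketch of a single-series, next-token training
# interface: segment a univariate series into patch tokens and
# build (context, target) pairs for decoder-only training.

def make_next_token_pairs(series, patch_len):
    """Split a 1-D series into patch tokens and return
    (context, target) pairs for next-token prediction."""
    n_patches = len(series) // patch_len
    tokens = [
        series[i * patch_len:(i + 1) * patch_len]
        for i in range(n_patches)
    ]
    # Token t is predicted from tokens [0..t-1].
    return [(tokens[:t], tokens[t]) for t in range(1, n_patches)]

series = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
pairs = make_next_token_pairs(series, patch_len=2)
# First pair: context [[0.1, 0.2]], target [0.3, 0.4]
```

Under this one interface, forecasting, imputation, and anomaly detection can all be cast as generating future or masked tokens, which is the task-generality claim the deck makes for Timer.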
Method Notes

The source is a slide deck, not a full paper. It is useful as a curated THUML perspective on the early large-time-series-model landscape and as a guide to the group’s terminology, resources, and open problems.

For wiki synthesis, treat the deck’s benchmark claims as secondary pointers unless they are checked against the primary papers. Its most durable value is the interface framing: time-series foundation models need data infrastructure, scalable architectures, task-general adaptation, multivariate modeling, textual or exogenous context, and evaluation beyond single benchmark scores.

Evidence And Results

The deck reports Timer few-shot gains, imputation improvements, anomaly-detection results on UCR Anomaly Archive tasks, scaling gains from larger Timer variants, and zero-shot comparisons among several large time-series models. It also highlights long-context forecasting as a separate problem from ordinary long-horizon forecasting.

For LLM-based forecasting, the deck presents AutoTimes as a low-parameter adaptation method for decoder-only LLMs, with in-context forecasting and autoregressive generation as the target interface.
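The autoregressive generation interface can be sketched generically: the model predicts one segment ahead, appends the prediction to its own context, and repeats, the way a decoder-only LLM extends a prompt. This is a hedged illustration of the interface pattern, not the AutoTimes method; `predict_next_segment` is a hypothetical stand-in for the adapted LLM.

```python
# Hypothetical sketch of autoregressive segment-by-segment
# forecasting: each predicted segment is fed back as context
# for the next prediction, mirroring LLM-style decoding.

def autoregressive_forecast(history, predict_next_segment, n_segments):
    context = list(history)
    forecast = []
    for _ in range(n_segments):
        seg = predict_next_segment(context)   # one segment ahead
        forecast.extend(seg)
        context.extend(seg)                   # feed prediction back in
    return forecast

# Toy stand-in model: repeat the last two observed values.
toy = lambda ctx: ctx[-2:]
out = autoregressive_forecast([1, 2, 3, 4], toy, n_segments=3)
# → [3, 4, 3, 4, 3, 4]
```

In-context forecasting fits the same loop: worked (input, output) example pairs are simply prepended to `history` as additional context before generation begins.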

Limitations

  • Many figures and plots are only partially represented in extracted text, so the PDF should remain the source of truth for visual evidence.
  • Several results are presented as overview claims rather than fully specified experiments.
  • The deck itself says current large time-series models remain limited by conservative prediction objectives, single-series learning interfaces, weak use of multivariate correlations and domain knowledge, and unclear transfer behavior across applications.

Open Questions

  • Which claims from the deck should be promoted into topic-level synthesis after checking against the primary Timer, AutoTimes, Timer-XL, and OpenLTM sources?
  • Should OpenLTM and AutoTimes receive dedicated source or entity pages in this wiki?