UniTime: A Language-Empowered Unified Model for Cross-Domain Time Series Forecasting

Source

Core Claim

Domain instructions can help a unified cross-domain time-series forecasting model distinguish between heterogeneous datasets and align time-series tokens with language-model representations.

Key Contributions

  • Proposes UniTime for cross-domain multivariate time-series forecasting.
  • Uses channel independence so the model can handle datasets with different numbers of variables.
  • Patches time-series channels into tokens and projects them into a shared hidden space.
  • Prepends natural-language domain instructions before time-series tokens so causal attention can use the instruction while processing the series.
  • Applies a masking strategy during training to mitigate the convergence-speed imbalance across domains and improve cross-domain learning.
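The pipeline above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the patch length, hidden size, and random projection are placeholder assumptions, and random vectors stand in for real language-model instruction embeddings.

```python
import numpy as np

def patch_series(series, patch_len=16, stride=16):
    """Split one channel (channel independence) into non-overlapping patches;
    each patch becomes one time-series token."""
    n = (len(series) - patch_len) // stride + 1
    return np.stack([series[i * stride : i * stride + patch_len] for i in range(n)])

# Hypothetical dimensions, not the paper's actual configuration.
d_model = 8
rng = np.random.default_rng(0)
proj = rng.normal(size=(16, d_model))        # shared projection into the hidden space

series = np.sin(np.linspace(0.0, 6.0, 96))   # one channel of a multivariate series
patch_tokens = patch_series(series) @ proj   # (6, d_model) time-series tokens

# Placeholder embeddings for a domain instruction like "Forecast electricity load".
instr_tokens = rng.normal(size=(4, d_model))

# The instruction is PREPENDED so causal attention lets series tokens see the text.
inputs = np.concatenate([instr_tokens, patch_tokens], axis=0)
print(inputs.shape)  # (10, 8): 4 instruction tokens + 6 patch tokens
```

Channel independence is what makes this work across datasets with different variable counts: each channel is patched and forecast separately against the same shared backbone.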

Method Notes

UniTime is an early language-empowered time-series model. Its text is mostly domain instruction, not arbitrary natural-language reasoning over a time series.

The input-order detail matters: GPT-style causal masking lets a token attend only to earlier positions, so UniTime places the instruction before the time-series tokens. That way the numeric sequence can attend to the text, rather than the text being invisible to it.
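A standard lower-triangular causal mask makes the ordering argument concrete (token counts here are arbitrary placeholders):

```python
import numpy as np

n_instr, n_series = 3, 4
n = n_instr + n_series

# Standard causal mask: position i may attend to positions j <= i.
mask = np.tril(np.ones((n, n), dtype=bool))

# Every series token (rows n_instr..n-1) sees all instruction tokens (cols 0..n_instr-1).
assert mask[n_instr:, :n_instr].all()
# No instruction token sees any series token: the text conditions the series,
# never the other way around.
assert not mask[:n_instr, n_instr:].any()
```

If the instruction were appended instead, the series tokens would be processed before any text existed in their attention window, and the conditioning would be lost.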

Evidence And Results

The paper reports state-of-the-art cross-domain forecasting performance and zero-shot transferability to unseen domains. For the wiki, its durable contribution is the interface pattern: domain text as a conditioning prefix for time-series tokens.

Alex Notes

  • Alex flagged this as the first Multi-Modal LLM plus time-series model that follows instructions, albeit poorly.

Limitations

  • Instruction following is narrow compared with modern time-series MLLMs.
  • Channel independence improves cross-domain flexibility but can miss native multivariate channel dependence.
  • It is still passive forecasting: the domain instruction is context, not an action, control input, or intervention.

Open Questions

  • How much of UniTime’s gain comes from domain text versus cross-domain masking and channel-independent design?
  • Can instruction conditioning preserve calibration when the text is noisy or misleading?
  • What is the minimal language interface needed for context-aided forecasting?