UniTS: A Unified Multi-Task Time Series Model
Source
- Raw Markdown: paper_units-2024.md
- PDF: paper_units-2024.pdf
- Preprint: arXiv 2403.00131
- Official project page: Zitnik Lab UniTS
- Official code: mims-harvard/UniTS
- Official pretrained weights: UniTS checkpoint release
Core Claim
UniTS argues that forecasting, classification, imputation, and anomaly detection can share one time-series model through task tokenization, prompt tokens, and a unified architecture rather than separate task-specific modules.
Key Contributions
- Defines a universal task specification with sample tokens, prompt tokens, and task tokens such as GEN and CLS.
- Uses a unified time-series architecture with attention over both the time and variable dimensions, plus a dynamic linear operator for temporal relationships.
- Pretrains with masked reconstruction losses that support both generative and predictive tasks.
- Evaluates one shared model over 38 datasets spanning forecasting, classification, imputation, and anomaly detection.
- Releases code, datasets, and checkpoint artifacts for the benchmarked settings.
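The task-token interface above can be sketched as follows. The token names GEN and CLS come from the paper; the assembly logic, dimensions, and random patch embedding here are a hypothetical illustration, not the official UniTS implementation:

```python
import numpy as np

# Hypothetical sketch of UniTS-style task tokenization: the shared backbone
# receives prompt tokens, patched sample tokens, and one task token
# (GEN for generative tasks, CLS for classification). Sizes are illustrative.
D_MODEL = 64      # embedding width (assumed)
PATCH_LEN = 16    # time steps per sample token (assumed)

def build_token_sequence(series, prompt_tokens, task):
    """Assemble [prompt | sample | task] tokens for one univariate series."""
    n_patches = len(series) // PATCH_LEN
    patches = series[: n_patches * PATCH_LEN].reshape(n_patches, PATCH_LEN)
    # Linear patch embedding; a random projection stands in for learned weights.
    rng = np.random.default_rng(0)
    w = rng.standard_normal((PATCH_LEN, D_MODEL)) / np.sqrt(PATCH_LEN)
    sample_tokens = patches @ w
    # Placeholder task-token embeddings; in the model these are learned.
    task_token = {"GEN": np.zeros((1, D_MODEL)),
                  "CLS": np.ones((1, D_MODEL))}[task]
    return np.concatenate([prompt_tokens, sample_tokens, task_token], axis=0)

series = np.sin(np.linspace(0, 8 * np.pi, 128))
prompt = np.zeros((2, D_MODEL))          # two prompt tokens (learned in practice)
seq = build_token_sequence(series, prompt, "CLS")
print(seq.shape)  # (2 prompt + 8 sample + 1 task, 64) -> (11, 64)
```

The point of the interface is that swapping the task token (and prompt tokens) re-targets the same backbone, instead of attaching a separate task head.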
Method Notes
UniTS is trained on time-series data rather than by reprogramming a text LLM. Its tokens are model-interface tokens for numeric time series and task specification, not natural-language tokens.
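The dual-axis attention mentioned above (over time and over variables) can be illustrated with a minimal sketch. The factorization into time-axis then variable-axis self-attention is an assumption for clarity; it is not the UniTS implementation, and the dynamic linear operator is omitted:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    """Plain single-head self-attention over the first axis of x: (n, d)."""
    scores = x @ x.T / np.sqrt(x.shape[-1])
    return softmax(scores, axis=-1) @ x

def time_then_variable_attention(tokens):
    """tokens: (V variables, T time tokens, d).
    Attend along the time axis within each variable, then along the
    variable axis at each time step. Illustrative sketch only."""
    V, T, d = tokens.shape
    out = np.stack([self_attention(tokens[v]) for v in range(V)])          # time axis
    out = np.stack([self_attention(out[:, t]) for t in range(T)], axis=1)  # variable axis
    return out

x = np.random.default_rng(1).standard_normal((3, 5, 8))
y = time_then_variable_attention(x)
print(y.shape)  # (3, 5, 8)
```

Attending over both axes lets the backbone mix information across channels as well as across time, which a purely per-channel forecaster cannot do.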
For this wiki, UniTS sits between forecasting foundation models and classification foundation models. It is broader than a pure forecaster, but it remains a passive time-series model: it has no first-class notion of actions, control inputs, interventions, or counterfactuals unless a downstream task supplies those semantics.
Evidence And Results
- The paper reports strong multi-task performance across forecasting, classification, anomaly detection, and imputation compared with task-specialized and LLM-adapted baselines.
- Few-shot and prompt-learning evaluations suggest that task tokens can adapt the same backbone to new datasets and tasks.
- Ablations study cross-task pretraining, cross-domain pretraining, and prompt-learning behavior across model sizes.
Limitations
- UniTS unifies common passive time-series tasks, but it does not make intervention, control, or action-conditioned rollout a first-class interface.
- Broad task support makes evaluation heterogeneous; scores should be compared task by task rather than collapsed into one foundation-model rank.
- The model still requires careful benchmark hygiene, because multi-domain pretraining can blur the line between zero-shot and in-distribution evaluation.
Links Into The Wiki
- UniTS
- Time-Series Foundation Models
- Time-Series Classification Foundation Models
- Time-Series Benchmark Hygiene
- Self-Supervised Representation Learning
Open Questions
- Is task tokenization a better general interface than separate heads for future broad TSFMs?
- Can the UniTS task-token interface be extended to explicit action, control input, or intervention tokens?
- Which tasks benefit from shared weights, and which tasks suffer negative transfer under a unified backbone?