TempoPFN: Synthetic Pre-training of Linear RNNs for Zero-shot Time Series Forecasting

Source

Core Claim

TempoPFN is a univariate zero-shot forecasting foundation model that combines a Prior-data Fitted Network framing with a GatedDeltaProduct linear RNN backbone, showing that a purely synthetic pretraining pipeline can produce competitive probabilistic forecasts without real-data pretraining or per-dataset fine-tuning.

Benchmarked Model Entry

  • Model: TempoPFN-38M
  • Family: Prior-data Fitted Network for time-series forecasting
  • Organization: University of Freiburg, ELLIS Institute Tübingen, Prior Labs, and collaborators
  • Architecture: 10-layer GatedDeltaProduct linear RNN with state-weaving, direct horizon prediction, and quantile output heads
  • Primary task surface: zero-shot univariate probabilistic forecasting
  • Training data: approximately 10 million synthetic time series from 10 generators, with synthetic augmentations and no real-world benchmark exposure
  • Official artifact: the automl/TempoPFN code repository and the AutoML-org/TempoPFN Hugging Face checkpoint.

Key Contributions

  • Uses GatedDeltaProduct linear RNN layers to obtain parallelizable sequence processing while preserving state tracking over long histories.
  • Introduces state-weaving, where final hidden states from one recurrent layer initialize the next layer, allowing future forecast tokens to use history and horizon context without explicit bidirectional recurrence.
  • Trains entirely on synthetic time series generated from ForecastPFN-style components, KernelSynth and Gaussian-process priors, CauKer-style causal structure, sawtooth and step functions, anomaly and spike processes, sine waves, audio-inspired generators, and a regime-switching Ornstein-Uhlenbeck SDE generator.
  • Adds an augmentation cascade covering scaling, TS-Mixup, temporal reversal, sign inversion, regime changes, shock-recovery dynamics, calendar effects, amplitude modulation, filtering, differencing, integration, convolutional transforms, missing values, and measurement artifacts.
  • Reports competitive zero-shot results on GIFT-Eval, Chronos-ZS, and fev-bench while keeping the training pipeline and training code open.
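To make the last generator family above concrete, a regime-switching Ornstein-Uhlenbeck series can be simulated with Euler-Maruyama steps. This is a minimal sketch: the parameter ranges, switching probability, and step size are illustrative guesses, not the paper's actual prior configuration.

```python
import numpy as np

def regime_switching_ou(n_steps=512, switch_prob=0.01, seed=0):
    """Sketch of a regime-switching Ornstein-Uhlenbeck generator.

    All parameter ranges below are hypothetical, chosen for illustration
    rather than taken from the TempoPFN prior.
    """
    rng = np.random.default_rng(seed)

    def draw_regime():
        # Mean-reversion speed theta, long-run mean mu, volatility sigma.
        return rng.uniform(0.05, 1.0), rng.normal(0.0, 1.0), rng.uniform(0.1, 0.5)

    theta, mu, sigma = draw_regime()
    dt = 1.0
    x = np.empty(n_steps)
    x[0] = mu
    for t in range(1, n_steps):
        if rng.random() < switch_prob:
            # Occasionally jump to a fresh regime with new OU parameters.
            theta, mu, sigma = draw_regime()
        # Euler-Maruyama step: dx = theta * (mu - x) * dt + sigma * dW
        x[t] = x[t - 1] + theta * (mu - x[t - 1]) * dt \
               + sigma * np.sqrt(dt) * rng.normal()
    return x

series = regime_switching_ou()
```

A training pipeline would draw many such series with freshly sampled parameters, then apply the augmentation cascade on top.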

Method Notes

TempoPFN maps the history values, history time features, and future time features into one sequence, then predicts all future quantiles in a single forward pass. This differs from autoregressive forecasting models that unroll the horizon and from patching or windowing methods that compress time steps before prediction.
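At the level of shapes, the packing described above can be sketched as follows. A random linear map stands in for the GatedDeltaProduct stack, and the exact feature layout and NaN-placeholder convention are assumptions for illustration.

```python
import numpy as np

def pack_and_predict(history, hist_time_feats, future_time_feats,
                     quantiles=(0.1, 0.5, 0.9), seed=0):
    """Shape-level sketch of single-pass quantile forecasting.

    History tokens carry (value, time feature); future tokens carry a
    NaN value placeholder plus known time features. A linear map is a
    stand-in for the real backbone.
    """
    rng = np.random.default_rng(seed)
    H, F = len(history), len(future_time_feats)
    hist_tokens = np.column_stack([history, hist_time_feats])
    fut_tokens = np.column_stack([np.full(F, np.nan), future_time_feats])
    tokens = np.vstack([hist_tokens, fut_tokens])   # (H + F, d_in)
    tokens = np.nan_to_num(tokens)                  # mask unknown future values
    # One forward pass emits every horizon step and quantile at once,
    # with no autoregressive unrolling over the horizon.
    W = 0.1 * rng.normal(size=(tokens.shape[1], len(quantiles)))
    out = tokens @ W                                # (H + F, n_quantiles)
    return out[H:]                                  # (F, n_quantiles)

preds = pack_and_predict(np.sin(np.arange(32) / 4.0),
                         np.arange(32) % 7,
                         np.arange(32, 40) % 7)
```

The key point the sketch preserves is that forecast tokens sit in the same sequence as the history, so a single left-to-right pass scores the whole horizon.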

The paper is useful for the knowledge base because it makes synthetic temporal dynamics the primary training substrate rather than a supplement to real corpora. It therefore sits close to the world-model question of how much forecasting behavior can be induced by a simulator family before exposing the model to real event streams, numeric features, control inputs, or interventions.

Evidence And Results

  • On GIFT-Eval, TempoPFN reports overall CRPS of 0.537 and MASE of 0.797, outperforming the synthetic-only TabPFN-TS baseline on CRPS while trailing it on MASE.
  • On fev-bench, the paper reports TempoPFN at rank 6 under both MASE and scaled quantile loss, with zero reported leakage and zero failed tasks.
  • On Chronos-ZS, TempoPFN is reported among the top-performing zero-shot models by average rank across probabilistic and point forecasting metrics.
  • Robustness experiments report that TempoPFN’s normalized CRPS degrades more slowly than TiRex’s as missing-value rates rise, with the paper attributing this to NaN-aware training augmentation and direct sequence modeling.
  • Ablations suggest that the generator mixture and augmentations matter: removing individual priors or training narrower/deeper variants changes GIFT-Eval and Chronos-ZS performance, so the reported gains are not attributable to the architecture alone.
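For reference, the CRPS figures reported on such benchmarks are commonly approximated as twice the mean pinball (quantile) loss over a grid of quantile levels, and MASE scales absolute error by a seasonal-naive baseline. A minimal sketch, noting that the exact quantile grids and seasonality settings used by GIFT-Eval and Chronos-ZS are assumptions here:

```python
import numpy as np

def pinball_loss(y_true, y_pred_q, q):
    """Quantile (pinball) loss for a single quantile level q in (0, 1)."""
    diff = y_true - y_pred_q
    return float(np.mean(np.maximum(q * diff, (q - 1.0) * diff)))

def crps_from_quantiles(y_true, quantile_preds, levels):
    """Approximate CRPS as twice the mean pinball loss over a quantile grid.

    quantile_preds has shape (n_steps, len(levels)); the level grid is
    an illustrative choice, not necessarily the benchmarks' own.
    """
    losses = [pinball_loss(y_true, quantile_preds[:, i], q)
              for i, q in enumerate(levels)]
    return 2.0 * float(np.mean(losses))

def mase(y_true, y_pred, y_insample, season=1):
    """Mean absolute scaled error against a seasonal-naive baseline."""
    scale = np.mean(np.abs(y_insample[season:] - y_insample[:-season]))
    return float(np.mean(np.abs(y_true - y_pred)) / scale)
```

Under this convention, a perfect forecast scores zero on both metrics, and lower is better throughout.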

Limitations

  • TempoPFN is framed as a univariate forecasting model; it does not directly model multivariate time-series interactions, actions, control inputs, interventions, or known future exogenous variables as first-class channels.
  • The main evidence is benchmark-centric and depends on whether GIFT-Eval, Chronos-ZS, and fev-bench cover the temporal regimes a downstream user needs.
  • The synthetic prior is broad but hand-designed; failures may appear when real systems contain sparse delayed coupling, policy-driven interventions, structural breaks, or observation processes outside the generator family.
  • There is a minor naming inconsistency: the released checkpoint is labeled TempoPFN-38M in the official artifacts, while the paper text describes the training configuration as roughly 40M parameters.

Open Questions

  • Which generator families are most important for transfer to industrial event streams with sparse, delayed, or intervention-heavy dynamics?
  • Can the GatedDeltaProduct plus state-weaving design be extended to multivariate time-series models with explicit control inputs and known future exogenous variables?
  • Does synthetic-only pretraining remain competitive when the evaluation requires calibrated counterfactual rollouts rather than unconditional forecast horizons?