Tabular Foundation Models
Summary
Tabular foundation models are adjacent to the time-series cluster because they learn inference procedures from synthetic tasks, but their inputs are static tables rather than ordered observations over time. This page also tracks selected static-tabular deep-learning baselines when they clarify the boundary or the numeric feature interface. Both groups should be treated as methodological analogs, not as time-series models.
What The Wiki Currently Believes
- TabPFN-v2 treats a small supervised table as context and performs fast classification or regression without per-dataset gradient training.
- TabPFN-3 extends the TabPFN line toward larger static supervised tables, many-class prediction, text features in tables through API variants, and test-time compute through a Thinking mode.
- TabICL scales tabular in-context classification by compressing rows before dataset-level attention, making larger training tables feasible.
- TempoPFN is the closest open time-series relative: it keeps the prior-data fitted network idea, but replaces static tabular tasks with synthetic temporal generators and a zero-shot forecasting interface.
- TabM is not a foundation model, but it is a strong per-dataset static-tabular DL baseline built from MLPs plus parameter-efficient ensembling.
- These models are useful analogs for time-series foundation models, but TabPFN-v2, TabPFN-3, and TabICL do not model temporal next-state dynamics, control inputs, interventions, or event streams directly.
Portable Idea
The portable idea is learned inference over structured context. A model can be pretrained on many synthetic tasks, then reused at inference time by conditioning on a new small dataset, table, or history instead of fitting a fresh model with gradient descent.
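As a concrete sketch of this pattern, the snippet below uses the sklearn-style fit/predict interface of the open tabpfn package (an environment assumption; any PFN-style in-context learner with the same interface would do). The point is that fit stores the labeled table as context rather than running gradient steps, and predict conditions on that context in a single forward pass.

```python
# Minimal sketch of learned inference over structured context, assuming the
# sklearn-style interface of the open `tabpfn` package (TabPFNClassifier).
# fit() only stores the labeled table as context; no per-dataset gradient
# training happens, and predict_proba() is one forward pass conditioned on it.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from tabpfn import TabPFNClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = TabPFNClassifier()           # pretrained on many synthetic supervised tasks
clf.fit(X_train, y_train)          # context conditioning, not gradient training
proba = clf.predict_proba(X_test)  # inference for the new task in one pass
print(proba.shape)                 # (n_test, n_classes)
```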
For time-series work, that suggests a useful design pattern: build a context that contains enough history, time features, exogenous variables, and target examples for a model to infer the latent task. TempoPFN makes this explicit for univariate forecasting by training on synthetic time-series priors and predicting future quantiles in one forward pass.
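A minimal sketch of that design pattern follows, under the assumption of a generic PFN-style regressor with a tabpfn-like fit/predict interface; the make_context helper and the commented-out TabPFNRegressor usage are illustrative, not a documented recipe from any of the cited papers.

```python
# Hypothetical illustration: turn a univariate series into a supervised table
# whose rows carry lagged history, a deterministic time feature, and a
# known-in-advance exogenous variable, then hand it to a PFN-style regressor.
import numpy as np
import pandas as pd

def make_context(series: pd.Series, exog: pd.Series, n_lags: int = 7) -> pd.DataFrame:
    """Build one row per time step: lagged history + time feature + exog."""
    df = pd.DataFrame({"y": series, "exog": exog})
    for k in range(1, n_lags + 1):
        df[f"lag_{k}"] = series.shift(k)         # observed history
    df["dayofweek"] = series.index.dayofweek     # deterministic time feature
    return df.dropna()

idx = pd.date_range("2024-01-01", periods=200, freq="D")
y = pd.Series(np.sin(np.arange(200) / 7)
              + np.random.default_rng(0).normal(0, 0.1, 200), index=idx)
exog = pd.Series((idx.dayofweek >= 5).astype(float), index=idx)  # known in advance

table = make_context(y, exog)
context, query = table.iloc[:-1], table.iloc[-1:]  # past rows vs. forecast row
# Note: lag columns are only fully observed one step ahead; multi-step
# horizons need recursive filling or direct per-horizon models.
# reg = TabPFNRegressor(); reg.fit(context.drop(columns="y"), context["y"])
# y_hat = reg.predict(query.drop(columns="y"))
```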
TabPFN-3 adds an important boundary case: its report includes a specialized TabPFN-TS-3 checkpoint for passive time-series forecasting, but the core TabPFN-3 interface is still a static tabular in-context learner. The extension should be cited for PFN-to-time-series transfer, not as evidence that ordinary tabular rows automatically define temporal histories.
Static Tabular DL Baseline
TabM is useful as a counterweight to the foundation-model framing. It is not pretrained on synthetic tasks and does not perform in-context learning; it is trained from scratch on each supervised dataset. Its lesson is that a simple MLP-like backbone can become highly competitive when combined with parameter-efficient ensembling, simultaneous training of submodels, and a good numerical feature interface.
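The sketch below illustrates the BatchEnsemble-style shared-weight ensembling that TabM builds on: k submodels share one weight matrix and differ only through cheap per-member elementwise adapters, so all members are trained simultaneously at roughly the cost of one MLP. The EnsembleLinear module, its shapes, and its initialization are illustrative assumptions, not TabM's actual code.

```python
# Minimal sketch of parameter-efficient ensembling: one shared weight matrix,
# per-member rank-1 multiplicative adapters r_i, s_i, and per-member biases.
import torch
import torch.nn as nn

class EnsembleLinear(nn.Module):
    def __init__(self, d_in: int, d_out: int, k: int):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(d_in, d_out))  # shared by all members
        self.r = nn.Parameter(torch.ones(k, d_in))            # per-member input scale
        self.s = nn.Parameter(torch.ones(k, d_out))           # per-member output scale
        self.bias = nn.Parameter(torch.zeros(k, d_out))       # per-member bias
        nn.init.xavier_uniform_(self.weight)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, k, d_in) -> (batch, k, d_out); each member sees x * r_i
        return (x * self.r) @ self.weight * self.s + self.bias

layer = EnsembleLinear(d_in=8, d_out=16, k=4)
x = torch.randn(32, 1, 8).expand(32, 4, 8)  # same features fed to all k members
print(layer(x).shape)                       # torch.Size([32, 4, 16])
```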
For the time-series cluster, TabM is most relevant to the handling of auxiliary numeric variables. Its supported numerical embedding options include raw scalar features, LinearReLUEmbeddings, an updated implementation of piecewise-linear embeddings, and periodic embeddings. These are typed feature encoders, so they map more naturally onto known exogenous variables, metadata, numeric control inputs, and intervention intensities than onto free-standing text numerals.
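As one example of such a typed encoder, here is a hedged sketch of a periodic numerical embedding in the style of Gorishniy et al.: each scalar feature x is mapped to [cos(2πcx), sin(2πcx)] with trainable frequencies c. The PeriodicEmbedding name, initialization, and shapes are assumptions for illustration, not TabM's exact module.

```python
# Sketch of a periodic numerical embedding: one trainable frequency bank per
# feature, initialized from N(0, sigma^2), producing cos/sin pairs.
import math
import torch
import torch.nn as nn

class PeriodicEmbedding(nn.Module):
    def __init__(self, n_features: int, n_frequencies: int, sigma: float = 1.0):
        super().__init__()
        self.freq = nn.Parameter(torch.randn(n_features, n_frequencies) * sigma)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_features) -> (batch, n_features, 2 * n_frequencies)
        angles = 2 * math.pi * self.freq * x[..., None]
        return torch.cat([torch.cos(angles), torch.sin(angles)], dim=-1)

emb = PeriodicEmbedding(n_features=5, n_frequencies=8)
print(emb(torch.randn(32, 5)).shape)  # torch.Size([32, 5, 16])
```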
Boundary With Time Series
Static supervised tables are not temporal histories. Rows do not inherently define next-state dynamics, and feature columns do not automatically become events, exogenous variables, control inputs, or interventions. To move a PFN-style model into the time-series setting, the context must encode temporal order and must separate observed history from future-conditioning variables.
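One hypothetical way to make that separation explicit is a small column-typing schema, sketched below; the Role and Column names are invented for illustration and are not drawn from any of the cited systems.

```python
# Hypothetical column typing for a PFN-style time-series context: tag each
# column as observed history (past-only), known-future conditioning, or
# static metadata, and only admit non-leaking columns into forecast rows.
from dataclasses import dataclass
from enum import Enum

class Role(Enum):
    PAST_OBSERVED = "past_observed"  # e.g. lagged targets, sensor readings
    KNOWN_FUTURE = "known_future"    # e.g. calendar features, planned controls
    STATIC = "static"                # e.g. series metadata

@dataclass(frozen=True)
class Column:
    name: str
    role: Role

def query_columns(schema: list[Column]) -> list[str]:
    """Columns allowed in forecast-horizon rows: everything except raw
    past-only observations, which are unknown beyond the last time step."""
    return [c.name for c in schema if c.role is not Role.PAST_OBSERVED]

schema = [
    Column("temperature", Role.PAST_OBSERVED),
    Column("dayofweek", Role.KNOWN_FUTURE),
    Column("planned_dose", Role.KNOWN_FUTURE),  # explicit control input
    Column("site_id", Role.STATIC),
]
print(query_columns(schema))  # ['dayofweek', 'planned_dose', 'site_id']
```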
This boundary is especially important for world-model claims. A static tabular in-context learner can be powerful without supporting counterfactual rollouts under actions or interventions. A time-series PFN analogue becomes more world-model-like only when actions, control inputs, or interventions are explicit in the context and evaluation.
Evidence
TabPFN-v2, TabPFN-3, and TabICL all rely on synthetic task generation and in-context prediction over supervised tabular context. TempoPFN shows the open time-series translation: synthetic temporal tasks plus a sequence model can produce a zero-shot probabilistic forecaster without real-data pretraining. TabM provides a different kind of evidence: per-dataset static-tabular training can still be a strong baseline when the architecture is simple, efficiently ensembled, and equipped with useful numerical feature embeddings.
Open Questions
- Which synthetic tabular priors transfer to multivariate time-series models?
- Can row-wise context compression inspire trajectory compression without losing temporal order?
- How should open-weight TabPFN-3, API TabPFN-3-Plus, Thinking mode, and TabPFN-TS-3 be compared without conflating availability (open weights vs. API) with adaptation mode?
- How should tabular in-context learners be compared with passive dynamics models when time order is added?
- Can PFN-style context learning support known future exogenous variables, event streams, or action-conditioned rollouts without collapsing those variables into ordinary static features?
- Which TabM-style typed numeric feature embeddings should be reused for auxiliary variables in multivariate time-series models?