UTICA: Multi-Objective Self-Distillation Foundation Model Pretraining for Time Series Classification

Core Claim

UTICA adapts DINOv2-style non-contrastive self-distillation to time-series classification foundation models, arguing that multi-crop invariance and masked patch prediction are complementary pretraining signals for time-series representations.

Key Contributions

  • Builds on the Mantis tokenizer and Transformer encoder backbone rather than introducing a new architecture family.
  • Combines a DINO-style [CLS] objective over global and local crops with iBOT-style masked patch prediction and a KoLeo regularizer.
  • Uses an EMA teacher network as the final representation model after student-teacher self-distillation.
  • Evaluates on UCR and UEA classification benchmarks under both frozen linear probing and end-to-end fine-tuning.
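The three loss terms listed above can be sketched in numpy to show their shapes. This is a hedged illustration under standard DINOv2 conventions, not the paper's implementation: function names, temperatures, and the masking details are assumptions.

```python
import numpy as np

def softmax(z, temp):
    z = z / temp
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def dino_cls_loss(student_cls, teacher_cls, ts=0.1, tt=0.04):
    # Cross-entropy between teacher and student [CLS] distributions;
    # the teacher uses a sharper (lower) temperature, as in DINO.
    p_t = softmax(teacher_cls, tt)
    p_s = softmax(student_cls, ts)
    return -(p_t * np.log(p_s + 1e-9)).sum(axis=-1).mean()

def ibot_patch_loss(student_patches, teacher_patches, mask, ts=0.1, tt=0.04):
    # Same cross-entropy, but computed only on the patches the student saw masked.
    p_t = softmax(teacher_patches, tt)
    p_s = softmax(student_patches, ts)
    ce = -(p_t * np.log(p_s + 1e-9)).sum(axis=-1)   # (batch, n_patches)
    return (ce * mask).sum() / np.maximum(mask.sum(), 1.0)

def koleo_loss(embeddings, eps=1e-9):
    # KoLeo: spread embeddings by penalizing small nearest-neighbor distances
    # on the unit sphere (collapse-avoidance regularizer).
    x = embeddings / (np.linalg.norm(embeddings, axis=1, keepdims=True) + eps)
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    return -np.log(d.min(axis=1) + eps).mean()
```

The full objective is then a weighted sum of the three terms; the weights are not reproduced here.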

Benchmarked Models

  • Mantis-UTICA-8M: UTICA-pretrained Mantis-8M teacher model. The paper evaluates this model against Mantis, MOMENT, NuTime, and GPT4TS on UCR and UEA classification tasks; the official checkpoint is published at https://huggingface.co/fegounna/Utica.

Method Notes

UTICA is best treated as a classification-focused time-series representation model. It is not an action-conditioned world model: there is no action or control-input channel, and the evaluation is benchmark classification over univariate and multivariate time series.

The pretraining objective is useful for the wiki’s time-series foundation model cluster because it tests whether a vision-style self-distillation recipe transfers to numeric time series. The method also gives a clean comparison point against contrastive Mantis pretraining, where false negatives arise when different samples share temporal structure.

Within the Mantis lineage, UTICA is the self-distillation branch: Mantis provides the original contrastive Mantis-8M baseline, MantisV2 explores synthetic pretraining plus test-time strategies, and UTICA keeps the Mantis tokenizer/backbone while changing the pretraining objective.
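The student-teacher mechanics behind the self-distillation branch follow the standard recipe: the teacher receives no gradients and only tracks the student via an exponential moving average, and it is the teacher weights that are kept as the final representation model. A minimal sketch of that update, with the momentum value assumed rather than taken from the paper:

```python
import numpy as np

def ema_update(teacher_params, student_params, momentum=0.996):
    """EMA update: teacher <- m * teacher + (1 - m) * student.

    The teacher is never trained by backprop; it lags the student, and
    its weights are exported as the final model (here, Mantis-UTICA-8M).
    The momentum value 0.996 is a common default, not the paper's setting.
    """
    return [m_t * momentum + (1.0 - momentum) * m_s
            for m_t, m_s in zip(teacher_params, student_params)]
```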

Evidence And Results

The reported UCR linear-probing result is 0.794 average accuracy with 52 wins out of 128 datasets, compared with 0.792 and 33 wins for Mantis. Under UCR fine-tuning, UTICA reports 0.857 average accuracy and 60 wins, compared with 0.850 and 38 wins for Mantis.

On UEA, UTICA reports the best average rank in both linear probing and fine-tuning. The ablation study reports that the combined UTICA loss outperforms either DINO+KoLeo or iBOT+KoLeo alone on UCR linear probing, supporting the claim that crop invariance and local masked prediction provide complementary supervision.
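The frozen linear-probing protocol used in these comparisons amounts to fitting a linear classifier on encoder features that are never updated. A hedged numpy stand-in (ridge regression onto one-hot labels rather than the logistic probe typically used; the feature matrices are assumed to come from a frozen encoder):

```python
import numpy as np

def linear_probe(train_feats, train_labels, test_feats, l2=1e-2):
    # Frozen-encoder linear probe: closed-form ridge regression onto
    # one-hot labels, with predictions taken by argmax over class scores.
    n_classes = int(train_labels.max()) + 1
    y = np.eye(n_classes)[train_labels]           # one-hot targets
    x = train_feats
    w = np.linalg.solve(x.T @ x + l2 * np.eye(x.shape[1]), x.T @ y)
    return (test_feats @ w).argmax(axis=1)
```

Under this protocol only `w` is fit per dataset, so accuracy differences directly reflect the quality of the pretrained representations rather than any task-specific adaptation of the backbone.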

Open Questions

  • Does the gain persist when the Mantis backbone is scaled beyond the 8M-parameter setting?
  • How much of the improvement comes from the non-contrastive objective versus the augmentation and masking schedule?