Dynamic Tokenization Via Reinforcement Patching: End-to-end Training And Zero-shot Transfer

Source

Core Claim

ReinPatch adapts learned dynamic chunking to time-series forecasting by treating patch-boundary placement as a discrete policy optimized from downstream forecasting loss.

Benchmarked Model Entry

  • Method: ReinPatch.
  • Family: learned adaptive patching for sequence and time-series models.
  • Primary task surface: passive multivariate time-series forecasting.
  • Evaluation surface: ETT subsets, Weather, and Electricity long-horizon forecasting datasets.
  • Transfer setting: a standalone foundation patcher is pretrained on univariate UTSD series and frozen for zero-shot patching on downstream forecasting tasks.

Key Contributions

  • Introduces a patching policy trained with Group Relative Policy Gradient instead of soft boundary relaxations.
  • Keeps the patching policy detached from the downstream sequence backbone, making it possible to freeze and reuse the patcher.
  • Enforces a minimum compression rate in the downstream model environment rather than through a soft auxiliary loss.
  • Supports multi-level hierarchical patching by letting the policy emit boundary levels.
  • Reports that one of the two ReinPatch variants (zero-shot or end-to-end) achieves the best result in most tested dataset/horizon configurations under the paper’s shared forecasting backbone.
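The policy-gradient bullet above can be made concrete with a minimal sketch: a Bernoulli boundary policy updated with a group-relative advantage (reward minus group mean, no learned critic). Everything here is illustrative, assuming sigmoid-parameterized per-step boundary probabilities and a toy stand-in reward; none of the names or hyperparameters are from the paper.

```python
import numpy as np

def group_relative_boundary_update(logits, reward_fn, rng, group=8, lr=0.5):
    """One group-relative policy-gradient step for a Bernoulli boundary policy.

    Illustrative sketch only: sample a group of candidate boundary masks,
    score each with the task reward, and weight the REINFORCE gradient by the
    group-relative advantage (reward minus the group mean), with no critic.
    """
    probs = 1.0 / (1.0 + np.exp(-logits))
    masks = (rng.random((group, logits.size)) < probs).astype(float)
    rewards = np.array([reward_fn(m) for m in masks])
    advantages = rewards - rewards.mean()
    # d/d(logit) log Bernoulli(mask | sigmoid(logit)) = mask - prob
    grad = (advantages[:, None] * (masks - probs)).mean(axis=0)
    return logits + lr * grad

def toy_reward(mask, change_point=8, cost=0.05):
    """Hypothetical stand-in for the downstream forecasting reward: pay for a
    boundary at a known change point, charge a small cost per boundary."""
    return float(mask[change_point]) - cost * mask.sum()

rng = np.random.default_rng(0)
logits = np.zeros(16)  # 16-step window; initial p(boundary) = 0.5 everywhere
for _ in range(200):
    logits = group_relative_boundary_update(logits, toy_reward, rng)
probs = 1.0 / (1.0 + np.exp(-logits))  # mass concentrates on the change point
```

In the paper's actual setup the reward would come from the downstream forecasting loss rather than a hand-written function, but the group-relative baseline mechanics are the same.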

Method Notes

ReinPatch is best read as a time-series case study for H-Net-style dynamic chunking rather than as a new general time-series foundation model. It keeps the central learned-boundary idea, but replaces H-Net’s differentiable similarity/routing mechanism with an RL policy whose reward is the downstream model’s task loss.
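The "minimum compression rate enforced in the environment" idea from the contributions list can be read as a hard constraint applied to sampled boundary masks before the downstream model consumes them, rather than a soft penalty term. A minimal sketch, assuming a simple mask-thinning rule (the paper's exact mechanism is not specified here):

```python
import numpy as np

def enforce_min_compression(mask, min_rate=4, rng=None):
    """Hard, environment-side compression constraint (a sketch, not the
    paper's exact rule): if the sampled boundary mask would produce more than
    len(mask) / min_rate patches, drop randomly chosen boundaries until the
    minimum compression rate holds. No auxiliary loss term is involved."""
    rng = rng if rng is not None else np.random.default_rng()
    mask = mask.copy()
    max_patches = max(1, len(mask) // min_rate)
    idx = np.flatnonzero(mask)
    while idx.size > max_patches:
        mask[rng.choice(idx)] = 0  # thin one boundary at a time
        idx = np.flatnonzero(mask)
    return mask

dense = np.ones(16, dtype=int)  # degenerate policy: a boundary at every step
thinned = enforce_min_compression(dense, min_rate=4, rng=np.random.default_rng(1))
# thinned now contains at most 16 // 4 = 4 boundaries
```

Because the constraint lives in the environment, the policy simply observes lower reward when it over-segments, instead of being pulled by a compression gradient.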

For this wiki, the important interface distinction is that ReinPatch compresses runs of numeric time-series observations into variable-size patches. That is different from point-wise numeric value embeddings in EIDOS and from free-standing number-token encodings in FoNE or BitTokens.
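As a concrete contrast, the interface can be sketched as a function that groups raw observations into variable-size patches from a boundary mask; point-wise embedding schemes would instead map each value independently. Names and conventions here are illustrative, not from the paper.

```python
import numpy as np

def patchify(series, boundary_mask):
    """Split a 1-D series into variable-size patches.

    A 1 at position t in `boundary_mask` starts a new patch at t. Runs of raw
    observations are grouped into patches (the ReinPatch-style interface),
    rather than each value being embedded point-wise.
    """
    starts = np.flatnonzero(boundary_mask)
    if starts.size == 0 or starts[0] != 0:
        starts = np.insert(starts, 0, 0)  # the first patch always starts at t=0
    ends = np.append(starts[1:], len(series))
    return [series[a:b] for a, b in zip(starts, ends)]

x = np.arange(10.0)
mask = np.array([1, 0, 0, 1, 0, 1, 0, 0, 0, 1])
patches = patchify(x, mask)
# → patches of lengths 3, 2, 4, 1
```

Every observation lands in exactly one patch, so the downstream backbone sees a shorter sequence of patch embeddings whose granularity the policy controls.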

The foundation-patcher experiment is a small tokenizer-transfer analogue: the patcher is pretrained on broad univariate time-series data, then reused zero-shot while only the forecasting backbone is trained on the target dataset.

Evidence And Results

The paper reports average MSE/MAE improvements over static patching, selective patching, periodic patching, entropy patching, TimeSqueeze compression, and an H-Net chunking baseline on ETT, Weather, and Electricity forecasting. The zero-shot foundation patcher is reported to outperform the end-to-end ReinPatch variant on the aggregate table, suggesting that pretraining the patch policy on broader data can regularize downstream forecasting.

The ablations emphasize longer look-back windows: ReinPatch is reported to separate more clearly from entropy patching as the input context grows, which makes the source relevant to Time-Series Scaling And Efficiency.

Limitations

ReinPatch remains a passive forecasting method in this paper. It does not expose actions, control inputs, interventions, or counterfactual rollout semantics, so it should not be treated as an action-conditioned world model.

The foundation patcher is pretrained on univariate series and the downstream multivariate forecasting formulation follows a channel-independent setup. That leaves native multivariate patching and cross-channel boundary dependencies open.

The paper checklist mentions an OSF view-only reproduction artifact, but that link was not independently verified as accessible during ingest. Do not treat an official code or data release as confirmed from this source page alone.

Open Questions

  • Can a ReinPatch-style policy learn cross-channel patch boundaries for native multivariate time series rather than channel-independent univariate streams?
  • Does the foundation patcher transfer beyond forecasting into classification, anomaly detection, event streams, audio, or genomics?
  • Should future time-series foundation models prefer point-wise numeric embeddings, fixed patches, entropy-style adaptive patches, or RL-trained dynamic patches?
  • Is the OSF reproduction artifact public and official, or should the wiki omit it permanently from source-level links?