Rank And Flow Methods
Summary
Rank-and-flow methods use low-rank structure and its evolution across layers to understand and compress time-series Transformers.
What The Wiki Currently Believes
- FlowRanks argues that time-series embeddings have sharply decaying singular spectra, making Q/K/V projections and attention layers compressible.
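To make "sharply decaying singular spectra" concrete, the sketch below measures how many singular values are needed to hold 99% of a matrix's squared spectral energy. It is an illustration only, not the FlowRanks procedure: the synthetic embeddings, the 0.99 threshold, and the helper name effective_rank are all assumptions introduced here.

```python
import numpy as np

def effective_rank(matrix, energy=0.99):
    """Smallest k whose top-k singular values hold `energy` of the squared spectrum."""
    s = np.linalg.svd(matrix, compute_uv=False)   # singular values, descending
    cumulative = np.cumsum(s**2) / np.sum(s**2)
    k = int(np.searchsorted(cumulative, energy)) + 1
    return min(k, len(s))                         # guard against float round-off

rng = np.random.default_rng(0)
n_tokens, d_model, true_rank = 512, 64, 4

# Hypothetical stand-in for time-series embeddings: rank-4 signal plus small noise.
signal = rng.normal(size=(n_tokens, true_rank)) @ rng.normal(size=(true_rank, d_model))
embeddings = signal + 0.01 * rng.normal(size=(n_tokens, d_model))

# Baseline with no low-rank structure: i.i.d. Gaussian entries.
dense = rng.normal(size=(n_tokens, d_model))

k_embed = effective_rank(embeddings)
k_dense = effective_rank(dense)
print(f"effective rank: embeddings={k_embed}, dense baseline={k_dense}")
```

The low-rank matrix needs only a handful of components, while the dense baseline needs most of its 64 dimensions. Applying the same measurement to real embeddings would be one way to check the claim on a given model.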
Evidence
FlowRanks applies this low-rank analysis to compress Chronos, reporting large reductions in inference time and memory while preserving accuracy.
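A minimal sketch of why low-rank inputs make a projection compressible, assuming exactly rank-4 inputs and a random query weight: the projection is replaced by two thin factors built from the inputs' top right singular vectors. This is a generic data-dependent factorization, not the FlowRanks algorithm and not tied to Chronos; compress_projection, w_q, and all sizes are hypothetical.

```python
import numpy as np

def compress_projection(weight, inputs, rank):
    """Factor weight (d_in x d_out) into (d_in x rank) and (rank x d_out).

    The basis spans the top-`rank` right singular directions of `inputs`,
    so inputs @ down @ up matches inputs @ weight when inputs lie in that span.
    """
    _, _, vt = np.linalg.svd(inputs, full_matrices=False)
    down = vt[:rank].T          # (d_in, rank): project inputs into the subspace
    up = down.T @ weight        # (rank, d_out): projection restricted to it
    return down, up

rng = np.random.default_rng(0)
n_tokens, d_model, rank = 256, 64, 4

# Hypothetical inputs with exact rank 4, and a dense query projection.
inputs = rng.normal(size=(n_tokens, rank)) @ rng.normal(size=(rank, d_model))
w_q = rng.normal(size=(d_model, d_model))

down, up = compress_projection(w_q, inputs, rank)

exact = inputs @ w_q
approx = (inputs @ down) @ up
rel_error = np.linalg.norm(exact - approx) / np.linalg.norm(exact)

params_full = w_q.size                    # d_model * d_model
params_low_rank = down.size + up.size     # 2 * d_model * rank
print(f"relative error={rel_error:.2e}, params {params_full} -> {params_low_rank}")
```

With real embeddings the spectrum decays rather than cutting off, so the rank becomes a trade-off between approximation error and savings in parameters and compute.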
Open Questions
- Do rank spectra explain why time-series Transformers differ from language and vision Transformers?
- Can rank-aware design replace after-the-fact compression?