Rank And Flow Methods

Summary

Rank-and-flow methods use low-rank structure and its evolution across layers to understand and compress time-series Transformers.

What The Wiki Currently Believes

  • FlowRanks argues that time-series embeddings have sharply decaying singular spectra, making Q/K/V projections and attention layers compressible.
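
The compressibility claim above can be illustrated with a minimal NumPy sketch. It is not the FlowRanks algorithm: the embeddings are synthetic (a rank-8 signal plus small noise), and the helper names effective_rank and compress_projection, the 0.9999 energy threshold, and all shapes are assumptions made for illustration. The idea shown is that when inputs X concentrate in a k-dimensional subspace, a projection W can be replaced by two thin factors with little change in X @ W.

```python
import numpy as np

def effective_rank(singular_values, energy=0.9999):
    """Smallest k whose top-k singular values hold `energy` of the squared spectrum."""
    power = singular_values ** 2
    cumulative = np.cumsum(power) / power.sum()
    k = int(np.searchsorted(cumulative, energy)) + 1
    return min(k, len(singular_values))

def compress_projection(X, W, energy=0.9999):
    """Factor W as A @ B, where A spans the dominant input directions of X."""
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    k = effective_rank(s, energy)
    A = Vt[:k].T   # (d, k): orthonormal basis of the dominant input subspace
    B = A.T @ W    # (k, d_out): W restricted to that subspace
    return A, B, k

rng = np.random.default_rng(0)
n, d, true_rank = 512, 64, 8

# Hypothetical embeddings: low-rank signal plus small noise (an assumption,
# standing in for real time-series embeddings).
X = rng.standard_normal((n, true_rank)) @ rng.standard_normal((true_rank, d))
X += 1e-3 * rng.standard_normal((n, d))

# Hypothetical dense query projection.
W_q = rng.standard_normal((d, d))

A, B, k = compress_projection(X, W_q)
full = X @ W_q
approx = (X @ A) @ B
rel_err = np.linalg.norm(full - approx) / np.linalg.norm(full)

print("kept rank:", k)
print("relative error:", rel_err)
print("parameters:", W_q.size, "->", A.size + B.size)
```

With these shapes, a kept rank of 8 would store 2 * 64 * 8 = 1024 values in place of the 4096 in the dense matrix. Real embeddings will not be exactly low-rank, so the energy threshold trades accuracy against size.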

Evidence

FlowRanks applies this low-rank analysis to compress Chronos, reporting large reductions in inference time and memory while preserving accuracy.

Open Questions

  • Do rank spectra explain why time-series Transformers differ from language and vision Transformers?
  • Can rank-aware design replace after-the-fact compression?