LLM Sleep

Summary

LLM Sleep is a sleep-time memory-consolidation method for SSM-attention hybrid language models. It is best read as a test-time or serving-time compute allocation pattern: before the KV cache is cleared, the model runs offline recurrent passes over the context window and updates its SSM fast weights, so that later wake-time predictions can use the consolidated state with a single forward pass.
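
The control flow described above can be illustrated with a toy sketch. It is not the paper's implementation: the fast weights are a small dense matrix, the update is a normalized delta rule chosen here as a stand-in, and the names `sleep_consolidate`, `wake_read`, `passes`, and `lr` are hypothetical. It only shows where the compute is spent: several passes before eviction, one read afterwards.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def matvec(w: Matrix, x: Vector) -> Vector:
    """Plain matrix-vector product."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


def sleep_consolidate(
    w: Matrix,
    window: List[Tuple[Vector, Vector]],
    passes: int = 4,
    lr: float = 0.5,
) -> Matrix:
    """Sleep phase: repeated offline passes over the cached (key, value) pairs.

    Assumption: a normalized delta rule stands in for whatever update
    the actual method uses. The matrix w is updated in place.
    """
    for _ in range(passes):
        for key, value in window:
            pred = matvec(w, key)
            norm = sum(k * k for k in key) or 1.0
            for i in range(len(w)):
                err = value[i] - pred[i]
                for j in range(len(key)):
                    w[i][j] += lr * err * key[j] / norm
    return w


def wake_read(w: Matrix, query: Vector) -> Vector:
    """Wake phase: a single pass through the consolidated fast weights."""
    return matvec(w, query)


# Toy usage: consolidate a two-entry cache, evict it, then read from the state.
kv_cache = [([1.0, 0.0], [2.0, 3.0]), ([0.0, 1.0], [-1.0, 4.0])]
fast_weights = [[0.0, 0.0], [0.0, 0.0]]
sleep_consolidate(fast_weights, kv_cache, passes=8)
kv_cache.clear()  # the context window is gone; only the fast weights remain
recalled = wake_read(fast_weights, [1.0, 0.0])
```

In this toy setting, more sleep passes bring the recalled value closer to the evicted one, while the cost of each wake-time read stays the same.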

Role In The Wiki

Use this page as the object card for the method. The source page carries the evidence details, limitations, and agenda mapping.

LLM Sleep sits between compact recurrent-state models and recurrent-depth models: the recurrence is spent on fast-weight formation before context eviction, not on every prediction token. For this wiki, its main relevance is the infinite-context streaming analogy: finite windows can roll off if the system has a learned state-refresh step that consolidates them first.
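
The streaming analogy can be sketched as a loop that consolidates each full window before evicting it. This is a schematic under stated assumptions, not the method itself: `stream_with_sleep` and the `consolidate` callback are hypothetical names, and the callback stands in for the learned state-refresh step.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def stream_with_sleep(
    tokens: Iterable[T],
    window_size: int,
    consolidate: Callable[[List[T]], None],
) -> Tuple[List[T], int]:
    """Process an unbounded stream with a bounded window.

    Whenever the window fills up, it is consolidated first and only
    then evicted. Returns the unconsolidated remainder and the number
    of sleep phases that ran.
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    window: List[T] = []
    sleeps = 0
    for tok in tokens:
        window.append(tok)
        if len(window) == window_size:
            consolidate(window)  # sleep: fold the window into persistent state
            window.clear()       # evict: the window can now roll off
            sleeps += 1
    return window, sleeps


# Toy usage: the "state" is just a list of per-window sums.
state: List[int] = []
leftover, sleeps = stream_with_sleep(range(10), 4, lambda w: state.append(sum(w)))
```

The window never grows beyond `window_size`, so memory stays bounded regardless of stream length; what survives eviction is whatever the consolidation step chose to keep.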

Relation To Foundation TSFM Agenda

Use the source-level agenda mapping in language-models-need-sleep-2026 rather than duplicating verdict rows here. At the entity level, this page should stay as the object card; source pages carry slot-level evidence, limitations, and missing pieces.

Evidence

Source page: Language Models Need Sleep

Official Artifacts

Preprint: arXiv 2605.26099

No official code, project page, blog post, or author X thread was verified during ingest.
