Synergy: End-To-End Concept Model
Source
- Raw Markdown: paper_synergy-2025.md
- PDF: paper_synergy-2025.pdf
Core Claim
Synergy learns to bridge byte-level input and higher-level linguistic abstractions through an end-to-end routing mechanism, producing concept tokens without relying on a fixed tokenizer.
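The routing mechanism itself is not detailed in these notes. The following is a minimal PyTorch sketch of one way byte-to-concept routing could work, assuming a learned per-byte boundary scorer whose predicted spans are mean-pooled into concept embeddings; all module names and hyperparameters are illustrative, not taken from the paper.

```python
import torch
import torch.nn as nn

class ByteToConceptRouter(nn.Module):
    """Hypothetical sketch: score each byte position for a span boundary,
    cut the byte sequence at predicted boundaries, and mean-pool each span
    into one concept embedding."""

    def __init__(self, d_model: int = 256, byte_vocab: int = 256):
        super().__init__()
        self.byte_embed = nn.Embedding(byte_vocab, d_model)
        self.boundary_scorer = nn.Linear(d_model, 1)  # per-byte boundary logit

    def forward(self, byte_ids: torch.Tensor) -> torch.Tensor:
        # byte_ids: (seq_len,) raw UTF-8 byte values of one sequence
        h = self.byte_embed(byte_ids)                                     # (seq_len, d_model)
        p_boundary = torch.sigmoid(self.boundary_scorer(h)).squeeze(-1)   # (seq_len,)
        # Hard cuts for illustration only; an end-to-end trainable version
        # would need a differentiable relaxation (e.g. soft pooling weights).
        cuts = (p_boundary > 0.5).nonzero(as_tuple=True)[0].tolist()
        spans, start = [], 0
        for end in cuts + [byte_ids.numel() - 1]:
            spans.append((start, end + 1))
            start = end + 1
        concepts = [h[s:e].mean(dim=0) for s, e in spans if e > s]
        return torch.stack(concepts)                                      # (n_concepts, d_model)

if __name__ == "__main__":
    byte_ids = torch.tensor(list("Synergy routes bytes into concepts.".encode("utf-8")))
    print(ByteToConceptRouter()(byte_ids).shape)  # (n_concepts, 256); arbitrary while untrained
```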
Key Contributions
- Trains end-to-end as a byte-level language model with learned abstraction routing.
- Reports that bytes spontaneously group into fewer concept tokens than BBPE produces, at comparable performance (see the compression sketch after this list).
- Observes benefits from removing positional encodings in the higher-abstraction middle layers.
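The compression claim suggests a simple way to sanity-check a router: count the concept tokens it emits for a text and compare against the raw byte count and a BBPE baseline. The snippet below is a hypothetical measurement helper; it assumes the Hugging Face `tokenizers` package for a throwaway byte-level BPE baseline and takes placeholder boundary positions instead of a trained router, so it does not reproduce any number from the paper.

```python
from tokenizers import ByteLevelBPETokenizer  # assumed baseline tooling, not from the paper

def compression_report(text: str, concept_boundaries: list[int]) -> dict:
    """Compare concept-token count (from a router's predicted boundaries)
    against raw bytes and a small byte-level BPE trained on the fly."""
    raw_bytes = text.encode("utf-8")

    # Toy BBPE baseline; the paper's baseline vocabulary size is not given in these notes.
    bbpe = ByteLevelBPETokenizer()
    bbpe.train_from_iterator([text], vocab_size=512, min_frequency=1)
    bbpe_ids = bbpe.encode(text).ids

    n_concepts = len(concept_boundaries) + 1  # boundaries split the byte stream into spans
    return {
        "bytes": len(raw_bytes),
        "bbpe_tokens": len(bbpe_ids),
        "concept_tokens": n_concepts,
        "bytes_per_bbpe_token": len(raw_bytes) / len(bbpe_ids),
        "bytes_per_concept": len(raw_bytes) / n_concepts,
    }

if __name__ == "__main__":
    # Placeholder boundaries; a trained Synergy-style router would predict these.
    print(compression_report("Synergy routes bytes into concepts.", concept_boundaries=[7, 14, 20, 26]))
```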
Method Notes
Synergy trains end-to-end as a byte-level language model; a learned routing mechanism decides where consecutive bytes merge into higher-level concept tokens, so no fixed tokenizer is required, and the higher-abstraction middle layers run without positional encodings.
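One layout consistent with these notes has position-aware byte layers at the bottom, a position-free concept stack in the middle, and a decoder back to bytes. The PyTorch sketch below shows such a layout; the layer counts, widths, attention settings, and the `concept_pool` hook are assumptions for illustration, not the paper's actual configuration.

```python
import torch
import torch.nn as nn

class SynergyLikeStack(nn.Module):
    """Hypothetical three-stage stack: byte encoder with learned positions,
    a middle concept stack with *no* positional encoding, and a toy head."""

    def __init__(self, d_model=256, n_byte_layers=2, n_concept_layers=4, max_len=2048):
        super().__init__()
        self.byte_embed = nn.Embedding(256, d_model)
        self.byte_pos = nn.Embedding(max_len, d_model)  # positions only at the byte level
        make_layer = lambda: nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.byte_layers = nn.TransformerEncoder(make_layer(), n_byte_layers)
        # Deliberately no positional signal here: concept tokens are treated as
        # position-independent, matching the observation in the notes above.
        self.concept_layers = nn.TransformerEncoder(make_layer(), n_concept_layers)
        self.head = nn.Linear(d_model, 256)  # placeholder; the real model decodes back to bytes

    def forward(self, byte_ids, concept_pool):
        # byte_ids: (batch, seq_len); concept_pool maps byte states to a shorter
        # concept sequence (e.g. a learned router such as the one sketched earlier).
        positions = torch.arange(byte_ids.size(1), device=byte_ids.device)
        h = self.byte_layers(self.byte_embed(byte_ids) + self.byte_pos(positions))
        concepts = self.concept_layers(concept_pool(h))  # (batch, n_concepts, d_model)
        return self.head(concepts)

if __name__ == "__main__":
    model = SynergyLikeStack()
    byte_ids = torch.randint(0, 256, (1, 64))
    out = model(byte_ids, concept_pool=lambda h: h[:, ::4])  # naive 4x downsampling stand-in
    print(out.shape)  # torch.Size([1, 16, 256])
```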
Evidence And Results
The abstract reports an advantage over Llama3 at matched model scale and training-data size, along with emergent position-independent concepts.
Limitations
The paper focuses on low-level linguistic abstraction and still needs comparison against larger-scale byteification and hierarchical chunking systems.
Links Into The Wiki
- Byte-Level Language Models and Latent Tokenization (sibling entries: H-Net, Bolmo)
Open Questions
- Are Synergy’s concept tokens stable across domains and languages?
- Can routing-based abstraction scale to multimodal inputs?