Alex Knowledge Base Overview

This wiki currently centers on alternatives to brittle next-token or observation-space prediction: JEPA-style predictive representations, energy-based and latent-variable modeling, learned tokenization or concept compression, unified multimodal models, and time-series foundation models. The raw paper corpus lives under papers/; this generated layer organizes it into a durable synthesis.

Research Clusters

  • JEPA and world models connect LeCun’s autonomous intelligence proposal, latent-variable energy-based models, LeJEPA, LeWorldModel, NEPA, and VL-JEPA around prediction in representation space rather than direct reconstruction.
  • Representation collapse is a recurring design pressure: LeJEPA and LeWorldModel use Gaussian/SIGReg-style regularization, while other predictive-representation methods use stop-gradient, teacher models, or architectural constraints.
  • Tokenization and concept compression span H-Net, Synergy, Bolmo, and ConceptMoE, which all challenge fixed subword-token pipelines from different angles.
  • Vision and multimodal representation learning connects DINOv3, Prism, Tuna-2, Beyond Language Modeling, and robotic latent-space evaluation.
  • Time-series foundation models connect CauKer, ChatTS, Eidos, FlowRanks, TimeOmni-1, and TimeOmni-VL around synthetic data, reasoning, latent prediction, and modality-specific structure.
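The anti-collapse pattern named above (predict in latent space against a stop-gradient EMA teacher) can be sketched minimally. This is an illustrative toy, not the method of any cited paper: the linear encoders, the names `W_online`/`W_teacher`/`W_pred`, and the tau value are all assumptions for exposition.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear encoders: an online encoder that would be trained,
# and a teacher that is an exponential moving average (EMA) of it.
W_online = rng.normal(size=(8, 4))
W_teacher = W_online.copy()
W_pred = np.eye(4)  # predictor operating in latent space

def encode(W, x):
    return x @ W

def ema_update(W_teacher, W_online, tau=0.99):
    # Teacher tracks the online weights; no gradient flows through it.
    return tau * W_teacher + (1 - tau) * W_online

x_ctx = rng.normal(size=(2, 8))                  # context view
x_tgt = x_ctx + 0.01 * rng.normal(size=(2, 8))   # perturbed target view

z_pred = encode(W_online, x_ctx) @ W_pred
z_tgt = encode(W_teacher, x_tgt)        # treated as a constant (stop-gradient)
loss = np.mean((z_pred - z_tgt) ** 2)   # latent-space prediction loss

W_teacher = ema_update(W_teacher, W_online)
```

The key design choice the cluster debates is exactly this asymmetry: because the target branch contributes no gradient and lags the online branch, the trivial solution of mapping every input to the same latent is no longer a stable attractor, without needing reconstruction of raw observations.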

Useful Entry Points

  • Start with index.md to inspect the migrated paper set, follow cross-paper synthesis, and find named systems.
  • Check contradictions.md before treating a synthesis claim as settled.

Current Synthesis

The strongest through-line is that many papers are searching for the right intermediate representation layer: JEPA papers use latent prediction to avoid reconstructing irrelevant detail; tokenization papers learn segmentation or concepts to avoid brittle fixed tokens; multimodal papers debate whether semantic encoders, pixel embeddings, or representation autoencoders are the right visual substrate; time-series papers argue that forecasting, reasoning, and generation need representations that preserve temporal structure rather than merely fitting surface values.