Source Pages
Files
- amsterdamumcdb-2021.md - European ICU database with longitudinal observations, medications, fluids, and procedures.
- assistments-2009.md - ASSISTments student interaction data with hints, attempts, and tutoring-event sequences.
- beyond-language-modeling-2026.md - Controlled multimodal pretraining study using Transfusion, visual data, world modeling, and MoE scaling.
- bolmo-2025.md - Byteification method for converting subword LMs into competitive byte-level language models.
- cauker-2025.md - Synthetic causally coherent time-series generator for TSFM pretraining.
- causal-chambers-2024.md - Real physical systems with known causal structure and interventional data.
- causalworld-2020.md - Robotic manipulation benchmark for causal structure and transfer learning.
- chatts-2024.md - Synthetic-data-trained time-series MLLM for understanding and reasoning over multivariate series.
- chronograph-2025.md - Graph-structured multivariate microservice time-series dataset with incident labels.
- conceptmoe-2026.md - MoE architecture that merges semantically similar tokens into concept representations.
- criteo-uplift-2018.md - Marketing treatment/control dataset for uplift and treatment-effect modeling.
- d4rl-2020.md - Offline RL benchmark suite of state-action-reward trajectories.
- dinov3-2025.md - Scaled self-supervised vision foundation model with improved dense features.
- ednet-2019.md - Large-scale hierarchical student activity sequence dataset.
- eicu-crd-2018.md - Multi-center ICU database with longitudinal treatments and observations.
- eidos-2026.md - Time-series foundation model family trained through latent-space predictive learning.
- flow-of-ranks-2025.md - Rank-structure analysis and compression recipe for time-series Transformers.
- h-net-2025.md - End-to-end hierarchical byte model with learned dynamic chunking.
- heartsteps-2019.md - Mobile-health micro-randomized intervention data for activity suggestions.
- hirid-2020.md - High-resolution ICU time-series dataset with treatment/event records.
- kdd-cup-2010.md - Student-performance prediction dataset from intelligent tutoring logs.
- kuairand-2022.md - Sequential recommendation dataset with randomly exposed videos.
- latent-variable-energy-based-models-2023.md - Lecture-note introduction to latent-variable energy-based models and H-JEPA.
- lecun-autonomous-machine-intelligence-2022.md - LeCun's proposal for autonomous machine intelligence, centered on world models, intrinsic objectives, and hierarchical JEPA.
- lejepa-2025.md - JEPA theory and SIGReg objective for Gaussian predictive representations.
- leworldmodel-2026.md - Stable end-to-end JEPA world model from pixels using next-embedding prediction and Gaussian regularization.
- mimic-iv-2023.md - Clinical EHR/ICU database with longitudinal measurements, orders, procedures, and treatments.
- nepa-2025.md - Next-embedding predictive autoregression for visual self-supervised learning.
- ohio-t1dm-2018.md - Type-1 diabetes longitudinal glucose, insulin, meal, and activity dataset.
- open-bandit-dataset-2020.md - Logged bandit feedback dataset and pipeline for off-policy evaluation.
- prism-hypothesis-2025.md - Spectral hypothesis unifying semantic and pixel encoders through frequency structure.
- pslc-datashop-2010.md - Learning-science repository with student/tutor event logs.
- reconstruction-or-semantics-2026.md - Evaluation of reconstruction and semantic latent spaces for robotic diffusion world models.
- rl-unplugged-2020.md - Offline RL benchmark suite built from logged transitions.
- synergy-2025.md - Tokenizer-free byte-level language model with learned abstraction routing.
- timeomni-1-2026.md - Time-series reasoning suite and TimeOmni-1 model for complex temporal reasoning.
- timeomni-vl-2026.md - Vision-centric unified model for time-series understanding and generation.
- tuna-2-2026.md - Pixel-space unified multimodal model that removes pretrained vision encoders.
- vl-jepa-2025.md - Vision-language JEPA that predicts text embeddings instead of autoregressive tokens.
- yahoo-contextual-bandit-2010.md - Yahoo! news recommendation contextual-bandit logs and evaluation method.
40 items under this folder.