📚 Weekly Papers

Archive

MARCH 2026

2026-03-30 Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models (9 papers)
2026-03-23 HopChain: Multi-Hop Data Synthesis for Generalizable Vision-Language Reasoning (13 papers)
2026-03-16 Temporal Straightening for Latent Planning (12 papers)
2026-03-09 Geometry-Guided Reinforcement Learning for Multi-view Consistent 3D Scene Editing (6 papers)
2026-03-02 From Scale to Speed: Adaptive Test-Time Scaling for Image Editing (6 papers)

FEBRUARY 2026

2026-02-23 VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training (13 papers)
2026-02-16 Less is Enough: Synthesizing Diverse Data in Feature Space of LLMs (2 papers)
2026-02-09 The Devil Behind Moltbook: Anthropic Safety is Always Vanishing in Self-Evolving AI Societies (13 papers)
2026-02-02 ERNIE 5.0 Technical Report (23 papers)

JANUARY 2026

2026-01-26 Learning to Discover at Test Time (17 papers)
2026-01-19 STEM: Scaling Transformers with Embedding Modules (10 papers)
2026-01-12 GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization (23 papers)
2026-01-05 Step-DeepResearch Technical Report (18 papers)

DECEMBER 2025

2025-12-29 Evaluating AI's ability to perform scientific research tasks (19 papers)
2025-12-22 Memory in the Age of AI Agents (31 papers)
2025-12-15 Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning (16 papers)
2025-12-08 From Code Foundation Models to Agents and Applications: A Comprehensive Survey and Practical Guide to Code Intelligence (18 papers)

NOVEMBER 2025

2025-11-30 General Agentic Memory Via Deep Research (16 papers)
2025-11-23 DeepSeek-OCR: Contexts Optical Compression (6 papers)
2025-11-16 Olympiad-level formal mathematical reasoning with reinforcement learning (24 papers)
2025-11-09 Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm (21 papers)
2025-11-02 General Agentic Memory Via Deep Research (16 papers)

OCTOBER 2025

2025-10-26 On-Policy Distillation (22 papers)
2025-10-19 DeepSeek-OCR: Contexts Optical Compression (6 papers)
2025-10-12 Less is More: Recursive Reasoning with Tiny Networks (13 papers)
2025-10-05 The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain (16 papers)

SEPTEMBER 2025

2025-09-28 SIM-CoT: Supervised Implicit Chain-of-Thought (9 papers)
2025-09-21 FlowRL: Matching Reward Distributions for LLM Reasoning (13 papers)
2025-09-14 A Survey of Reinforcement Learning for Large Reasoning Models (16 papers)
2025-09-07 Why Language Models Hallucinate (20 papers)

AUGUST 2025

2025-08-31 Deep Think with Confidence (13 papers)
2025-08-24 Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning (7 papers)
2025-08-17 Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens (9 papers)
2025-08-10 Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens (14 papers)