-
Can Large Language Models Understand Context?
Paper • 2402.00858 • Published • 24 -
OLMo: Accelerating the Science of Language Models
Paper • 2402.00838 • Published • 86 -
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 156 -
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
Paper • 2401.17072 • Published • 25
Collections
Collections including paper arxiv:2603.09906
-
GLM-5: from Vibe Coding to Agentic Engineering
Paper • 2602.15763 • Published • 151 -
DeepVision-103K: A Visually Diverse, Broad-Coverage, and Verifiable Mathematical Dataset for Multimodal Reasoning
Paper • 2602.16742 • Published • 12 -
From Perception to Action: An Interactive Benchmark for Vision Reasoning
Paper • 2602.21015 • Published • 24 -
Thinking to Recall: How Reasoning Unlocks Parametric Knowledge in LLMs
Paper • 2603.09906 • Published • 76
-
AgentConductor: Topology Evolution for Multi-Agent Competition-Level Code Generation
Paper • 2602.17100 • Published • 4 -
GroupGPT: A Token-efficient and Privacy-preserving Agentic Framework for Multi-User Chat Assistant
Paper • 2603.01059 • Published • 1 -
Multi-Domain Riemannian Graph Gluing for Building Graph Foundation Models
Paper • 2603.00618 • Published -
Heterogeneous Agent Collaborative Reinforcement Learning
Paper • 2603.02604 • Published • 198
-
StockBench: Can LLM Agents Trade Stocks Profitably In Real-world Markets?
Paper • 2510.02209 • Published • 57 -
MM-DREX: Multimodal-Driven Dynamic Routing of LLM Experts for Financial Trading
Paper • 2509.05080 • Published -
TradingGroup: A Multi-Agent Trading System with Self-Reflection and Data-Synthesis
Paper • 2508.17565 • Published • 1 -
QTMRL: An Agent for Quantitative Trading Decision-Making Based on Multi-Indicator Guided Reinforcement Learning
Paper • 2508.20467 • Published
-
Treasure Hunt: Real-time Targeting of the Long Tail using Training-Time Markers
Paper • 2506.14702 • Published • 3 -
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention
Paper • 2506.13585 • Published • 278 -
Scaling Test-time Compute for LLM Agents
Paper • 2506.12928 • Published • 64 -
A Survey on Latent Reasoning
Paper • 2507.06203 • Published • 95
-
MegaFlow: Large-Scale Distributed Orchestration System for the Agentic Era
Paper • 2601.07526 • Published • 23 -
Intelligent AI Delegation
Paper • 2602.11865 • Published • 16 -
ENGRAM: Effective, Lightweight Memory Orchestration for Conversational Agents
Paper • 2511.12960 • Published • 1 -
CityRAG: Stepping Into a City via Spatially-Grounded Video Generation
Paper • 2604.19741 • Published • 17
-
Endless Terminals: Scaling RL Environments for Terminal Agents
Paper • 2601.16443 • Published • 19 -
Linear representations in language models can change dramatically over a conversation
Paper • 2601.20834 • Published • 21 -
Scaling Embeddings Outperforms Scaling Experts in Language Models
Paper • 2601.21204 • Published • 104 -
Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability
Paper • 2601.18778 • Published • 43
-
Mixture of Contexts for Long Video Generation
Paper • 2508.21058 • Published • 35 -
Beyond Memorization: A Multi-Modal Ordinal Regression Benchmark to Expose Popularity Bias in Vision-Language Models
Paper • 2512.21337 • Published • 31 -
SCOPE: Prompt Evolution for Enhancing Agent Effectiveness
Paper • 2512.15374 • Published • 6 -
Fast-weight Product Key Memory
Paper • 2601.00671 • Published • 7
-
lusxvr/nanoVLM-222M
Image-Text-to-Text • 0.2B • Updated • 731 • 99 -
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
Paper • 2503.09516 • Published • 40 -
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
Paper • 2505.24863 • Published • 97 -
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
Paper • 2505.17667 • Published • 89