WUSH: Near-Optimal Adaptive Transforms for LLM Quantization Paper • 2512.00956 • Published 6 days ago • 17
CaptionQA: Is Your Caption as Useful as the Image Itself? Paper • 2511.21025 • Published 10 days ago • 25
Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models Paper • 2511.18890 • Published 12 days ago • 29
Guided Self-Evolving LLMs with Minimal Human Supervision Paper • 2512.02472 • Published 4 days ago • 47
ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration Paper • 2511.21689 • Published 10 days ago • 94
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices Paper • 2512.01374 • Published 5 days ago • 76
Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length Paper • 2512.04677 • Published 2 days ago • 129
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published 4 days ago • 163
From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence Paper • 2511.18538 • Published 13 days ago • 238
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B Paper • 2511.06221 • Published 27 days ago • 128
view article Article Transformers v5: Simple model definitions powering the AI ecosystem +2 6 days ago • 223
The Bestiary Collection Decensored language models made using Heretic (https://github.com/p-e-w/heretic) • 6 items • Updated 20 days ago • 68
GroupRank: A Groupwise Reranking Paradigm Driven by Reinforcement Learning Paper • 2511.11653 • Published 26 days ago • 54
Uni-MoE-2.0-Omni: Scaling Language-Centric Omnimodal Large Model with Advanced MoE, Training and Data Paper • 2511.12609 • Published 20 days ago • 102
Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models Paper • 2511.08577 • Published 25 days ago • 104
GigaEvo: An Open Source Optimization Framework Powered By LLMs And Evolution Algorithms Paper • 2511.17592 • Published 19 days ago • 118
One Small Step in Latent, One Giant Leap for Pixels: Fast Latent Upscale Adapter for Your Diffusion Models Paper • 2511.10629 • Published 23 days ago • 122