Illusions of Confidence? Diagnosing LLM Truthfulness via Neighborhood Consistency • Paper • 2601.05905 • Published 4 days ago • 15
Can We Predict Before Executing Machine Learning Agents? • Paper • 2601.05930 • Published 4 days ago • 22
InnoGym: Benchmarking the Innovation Potential of AI Agents • Paper • 2512.01822 • Published Dec 1, 2025 • 35
Exploring Model Kinship for Merging Large Language Models • Paper • 2410.12613 • Published Oct 16, 2024 • 21