Learning to Optimize Multi-Objective Alignment Through Dynamic Reward Weighting • Paper 2509.11452 • Published Sep 14, 2025
IHEval: Evaluating Language Models on Following the Instruction Hierarchy • Paper 2502.08745 • Published Feb 12, 2025
Multimodal Large Language Models for Inverse Molecular Design with Retrosynthetic Planning • Paper 2410.04223 • Published Oct 5, 2024
Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning • Paper 2406.12050 • Published Jun 17, 2024
Auto-Instruct: Automatic Instruction Generation and Ranking for Black-Box Language Models • Paper 2310.13127 • Published Oct 19, 2023