Nemotron-Pre-Training-Datasets Collection Large-scale pre-training datasets used in the Nemotron family of models. • 15 items • Updated 21 days ago • 175
Luciole LLM Collection Open Source LLM in French, English, German, Spanish, Italian, Portuguese, Dutch and Arabic • 11 items • Updated about 5 hours ago • 10
sentence-transformers/all-mpnet-base-v2 Sentence Similarity • 0.1B • Updated Aug 19, 2025 • 33.2M • 1.32k
Teacher Demonstrations in a BabyLM's Zone of Proximal Development for Contingent Multi-Turn Interaction Paper • 2510.20411 • Published Oct 23, 2025 • 2
Papers Collection Papers Led/Contributed to by ALTA Computer Science & Technology Members • 6 items • Updated Oct 11, 2025
Article Reinforcement Learning for Large Language Models: Beyond the Agent Paradigm royswastik • Mar 19, 2025 • 9