Peter Szemraj PRO

pszemraj

https://pszemraj.carrd.co/

AI & ML interests

metallic intuition

Recent Activity

upvoted an article 1 day ago

Adding Benchmaxxer Repellant to the Open ASR Leaderboard

updated a model 1 day ago

BEE-spoke-data/mega-ar-126m-4k

updated a model 1 day ago

BEE-spoke-data/pegasus-x-base-synthsumm_open-16k

Organizations

upvoted an article 1 day ago

Article

Adding Benchmaxxer Repellant to the Open ASR Leaderboard

bezzam, Steveeeeeeen, eustlb, SBruccoleriAppen, jmss-appen, c-e-ford-appen, wgb14, YukaiHuang, like2026, logicbean, ally-lxl • 8 days ago • 15

updated 2 models 1 day ago

BEE-spoke-data/mega-ar-126m-4k

Text Generation • 0.1B • Updated 1 day ago • 272 • 4

BEE-spoke-data/pegasus-x-base-synthsumm_open-16k

Summarization • 0.3B • Updated 1 day ago • 50 • 2

upvoted a paper 1 day ago

Investigating Efficiently Extending Transformers for Long Input Summarization

Paper • 2208.04347 • Published Aug 8, 2022 • 1

upvoted an article 2 days ago

Article

EMO: Pretraining mixture of experts for emergent modularity

allenai • 5 days ago • 30

liked a model 3 days ago

shb777/Llama-3.3-8B-Instruct-128K

Text Generation • Updated Jan 3 • 3.8k • 49

liked a model 5 days ago

Zyphra/ZAYA1-8B

9B • Updated 2 days ago • 110k • 471

upvoted an article 9 days ago

Article

Multimodal Embedding & Reranker Models with Sentence Transformers

tomaarsen • Apr 9 • 59

updated a model 10 days ago

pszemraj/parakeet-tdt-0.6b-v3-gguf

Automatic Speech Recognition • 0.6B • Updated 10 days ago • 376

upvoted a collection 11 days ago

OlmPool

Collection

Collection of models from the paper "Cracks in the Foundation: Seemingly Minor Architectural Choices Impact Long Context Extension". • 26 items • Updated 13 days ago • 3

updated a Space 11 days ago

FLAN Grammar Correction

✍

Correct grammar in your text with highlighted edits

liked a model 11 days ago

Qwen/Qwen3.6-27B

Image-Text-to-Text • 28B • Updated 20 days ago • 2.77M • 1.27k

upvoted 2 papers 12 days ago

A Survey on LLM-based Conversational User Simulation

Paper • 2604.24977 • Published 17 days ago • 8

Efficient Training on Multiple Consumer GPUs with RoundPipe

Paper • 2604.27085 • Published 15 days ago • 40

upvoted a paper 14 days ago

Why Fine-Tuning Encourages Hallucinations and How to Fix It

Paper • 2604.15574 • Published 28 days ago • 23

upvoted a collection 14 days ago

Olmo 3.1

Collection

The latest members of the Olmo 3 family: another 3 weeks of RL for 32B Think, the 32B Instruct model, large post-training research datasets... • 9 items • Updated Dec 23, 2025 • 51

upvoted an article 14 days ago

Article

Granite 4.1 LLMs: How They’re Built

ibm-granite • 14 days ago • 68

upvoted a paper 14 days ago

Programming with Data: Test-Driven Data Engineering for Self-Improving LLMs from Raw Corpora

Paper • 2604.24819 • Published 17 days ago • 88

upvoted a collection 15 days ago

Laguna XS.2

Collection

Designed for agentic coding and long-horizon work on a local machine. Apache 2.0. • 5 items • Updated 6 days ago • 20

published a model 15 days ago

pszemraj/parakeet-tdt-0.6b-v3-gguf

Automatic Speech Recognition • 0.6B • Updated 10 days ago • 376
