Chew Kok Wah

chewkokwah

AI & ML interests

Open Domain Question Answering

Recent Activity

upvoted an article 7 days ago

What’s MXFP4? The 4-Bit Secret Powering OpenAI’s GPT‑OSS Models on Modest Hardware

liked a dataset 8 days ago

asusevski/kaggle-AI-Mathematical-Olympiad-3-responses

upvoted an article 8 days ago

Transformers v5: Simple model definitions powering the AI ecosystem

upvoted an article 7 days ago

Article

What’s MXFP4? The 4-Bit Secret Powering OpenAI’s GPT‑OSS Models on Modest Hardware

Aug 8

upvoted an article 8 days ago

Article

Transformers v5: Simple model definitions powering the AI ecosystem

9 days ago • 233

upvoted an article 16 days ago

Article

Announcing New Hugging Face and KerasHub integration

Jul 10, 2024

upvoted a collection 17 days ago

MathArena Outputs

Collection

Outputs of models on the MathArena Benchmark. • 16 items • Updated 1 day ago • 1

upvoted an article about 1 month ago

Article

Ultra-Long Sequence Parallelism: Ulysses + Ring-Attention Technical Principles and Implementation

Sep 16

upvoted a paper about 1 month ago

QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs

Paper • 2510.11696 • Published Oct 13 • 176

upvoted an article about 1 month ago

Article

On the Shifting Global Compute Landscape

Oct 29

upvoted 2 papers about 1 month ago

INT v.s. FP: A Comprehensive Study of Fine-Grained Low-bit Quantization Formats

Paper • 2510.25602 • Published Oct 29 • 77

Qwen3 Technical Report

Paper • 2505.09388 • Published May 14 • 317

upvoted 4 articles about 2 months ago

Article

Accelerate ND-Parallel: A guide to Efficient Multi-GPU Training

Aug 8

Article

Sentence Transformers is joining Hugging Face!

Oct 22

Article

Make LLM Fine-tuning 2x faster with Unsloth and 🤗 TRL

Jan 10, 2024

Article

How to Run a Hugging Face Model in JAX (Part 1)

Jul 20

upvoted an article 3 months ago

Article

Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers

Sep 11 • 166

upvoted a paper 4 months ago

Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models

Paper • 2410.07985 • Published Oct 10, 2024 • 32

upvoted a paper 5 months ago

MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning

Paper • 2507.16812 • Published Jul 22 • 63

upvoted an article 5 months ago

Article

OpenReasoning-Nemotron: A Family of State-of-the-Art Distilled Reasoning Models

Jul 18

upvoted a paper 5 months ago

OpenCodeReasoning-II: A Simple Test Time Scaling Approach via Self-Critique

Paper • 2507.09075 • Published Jul 11 • 15

upvoted an article 5 months ago

Article

Ettin Suite: SoTA Paired Encoders and Decoders

Jul 16

upvoted a paper 5 months ago

AceReason-Nemotron 1.1: Advancing Math and Code Reasoning through SFT and RL Synergy

Paper • 2506.13284 • Published Jun 16 • 26
