ykarout
/

Phi4-ThinkMode-fp16

Text Generation

text-generation-inference

Model card Files Files and versions

Phi4-ThinkMode

This is a fine-tuned version of unsloth/Phi-4 with enhanced reasoning capabilities using GRPO (1000 step) on the dataset gsm8k

Model details

Base model: unsloth/Phi-4
Fine-tuning: 16-bit precision
Use case: Improved reasoning and thinking mode

Downloads last month: 5

Safetensors

Model size

15B params

Tensor type

BF16

·

Model tree for ykarout/Phi4-ThinkMode-fp16

Base model

microsoft/phi-4

Finetuned

Finetuned

(84)

this model

Quantizations

1 model

Dataset used to train ykarout/Phi4-ThinkMode-fp16