arxiv:2502.09245
Gleb Gerasimov
gudleifrr
AI & ML interests
NLP, interpretability
Organizations
None yet
models 214
gudleifrr/gpt2_saes
gudleifrr/sae_Qwen_Qwen2.5-Math-7B_diff_blocks.15.hook_resid_post_16384_batchtopk_64_0.001_9715
gudleifrr/sae_Qwen_Qwen2.5-Math-7B_diff_blocks.10.hook_resid_post_16384_batchtopk_64_0.001_1376
gudleifrr/sae_Qwen_Qwen2.5-Math-7B_diff_blocks.10.hook_resid_post_16384_batchtopk_64_0.001_3866
gudleifrr/sae_Qwen_Qwen2.5-Math-7B_diff_blocks.10.hook_resid_post_16384_batchtopk_64_0.001_9689
gudleifrr/sae_Qwen_Qwen2.5-7B_diff_blocks.10.hook_resid_post_16384_batchtopk_64_0.001_8634
gudleifrr/sae_Qwen_Qwen2.5-Math-7B_diff_blocks.10.hook_resid_post_16384_batchtopk_64_0.001_r1
gudleifrr/sae_Qwen_Qwen2.5-7B_diff_blocks.10.hook_resid_post_16384_batchtopk_64_0.001_r1
gudleifrr/sae_Qwen_Qwen2.5-7B_default_ln_final.hook_normalized_16384_batchtopk_64_0.001
gudleifrr/sae_Qwen_Qwen2.5-7B_default_blocks.10.hook_resid_post_16384_batchtopk_64_0.001
datasets 9
gudleifrr/gpt2_saes
Updated • 4
gudleifrr/interpretations
Updated • 5
gudleifrr/OpenThoughts-114k-full-fix
Viewer • Updated • 114k • 13
gudleifrr/steering_diff
Preview • Updated • 80
gudleifrr/MATH-500
Viewer • Updated • 500 • 5
gudleifrr/OpenThoughts-114k-full
Viewer • Updated • 114k • 5
gudleifrr/linear_probing
Viewer • Updated • 8 • 7
gudleifrr/OpenThoughts-114k-thinking
Viewer • Updated • 114k • 15
gudleifrr/text-correction-en
Viewer • Updated • 784k • 5