Update README.md
---

This is [allenai/Olmo-3-7B-Think](https://huggingface.co/allenai/Olmo-3-7B-Think) quantized to W8A8 with [LLM Compressor](https://github.com/vllm-project/llm-compressor) using SmoothQuant. The model is compatible with vLLM (tested with v0.11.2 on an RTX 4090).

How the models perform (token efficiency, accuracy per domain, ...) and how to use them:

[Quantizing Olmo 3: Most Efficient and Accurate Formats](https://kaitchup.substack.com/p/quantizing-olmo-3-most-efficient)

- **Developed by:** [The Kaitchup](https://kaitchup.substack.com/)
- **License:** Apache 2.0
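Since the card states vLLM compatibility but shows no usage, here is a minimal sketch of serving the model with vLLM's OpenAI-compatible server. The repo id below is a placeholder, not taken from this card: substitute the actual Hugging Face repo name of this quantized model.

```shell
# Serve the W8A8 model with vLLM (requires a GPU; tested by the author
# on an RTX 4090 with vLLM v0.11.2).
# NOTE: the model id is a placeholder -- replace it with this repo's name.
vllm serve kaitchup/Olmo-3-7B-Think-W8A8

# From another terminal, query the OpenAI-compatible endpoint:
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "kaitchup/Olmo-3-7B-Think-W8A8",
        "messages": [{"role": "user", "content": "Explain SmoothQuant briefly."}]
      }'
```

Offline batch inference via `vllm.LLM(model=...)` works the same way; only the model id changes.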