Cecilia 2B Instruct v1 - GGUF

This repository contains quantized GGUF versions of the gia-uh/cecilia-2b-instruct-v1 model.

These files are optimized for efficient local inference on CPUs and GPUs with tools such as llama.cpp, Ollama, LM Studio, and GPT4All.
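For example, here is a minimal local-inference sketch using llama-cpp-python and huggingface_hub (both assumed to be installed via `pip install llama-cpp-python huggingface_hub`); the repo id and filename match the files listed below:

```python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Download one of the quantized files from this repo (Q6_K shown here).
model_path = hf_hub_download(
    repo_id="gia-uh/cecilia-2b-instruct-v1-GGUF",
    filename="cecilia-2b-instruct-v1-Q6_K.gguf",
)

# Load the model; n_ctx and n_gpu_layers are illustrative defaults,
# not values recommended by the model authors.
llm = Llama(model_path=model_path, n_ctx=4096, n_gpu_layers=-1)

# llama-cpp-python reads the chat template from the GGUF metadata
# when one is embedded in the file.
response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hola, ¿quién eres?"}],
    max_tokens=128,
)
print(response["choices"][0]["message"]["content"])
```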

📦 Available Files

Below are the available quantization formats. Because small models like this 2B one are more sensitive to aggressive quantization, it is recommended to use the highest precision your hardware allows to maintain coherence (see the selection sketch after the table).

| Filename | Type | Size (approx.) | Description & Recommended Use |
|---|---|---|---|
| cecilia-2b-instruct-v1-Q8_0.gguf | Q8_0 | 2.24 GB | Max fidelity. Almost indistinguishable from the original model. |
| cecilia-2b-instruct-v1-Q6_K.gguf | Q6_K | 1.79 GB | Good balance. High quality with considerable memory savings. |
| cecilia-2b-instruct-v1-Q4_K_M.gguf | Q4_K_M | 1.4 GB | High speed. Best compression, with some quality loss. Best for older laptops or mobile devices. |
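As a rough helper for the "highest precision your hardware allows" advice, the sketch below picks the largest file whose estimated footprint fits in free RAM. The sizes mirror the table; the ~1.3x overhead factor for context/KV cache is a hypothetical rule of thumb, and psutil is an assumed extra dependency:

```python
import psutil

QUANTS = [  # (filename, approx. file size in GB), largest first
    ("cecilia-2b-instruct-v1-Q8_0.gguf", 2.24),
    ("cecilia-2b-instruct-v1-Q6_K.gguf", 1.79),
    ("cecilia-2b-instruct-v1-Q4_K_M.gguf", 1.40),
]

def pick_quant(overhead: float = 1.3) -> str:
    """Return the largest quant whose estimated footprint fits in free RAM."""
    free_gb = psutil.virtual_memory().available / 1024**3
    for filename, size_gb in QUANTS:
        if size_gb * overhead <= free_gb:
            return filename
    return QUANTS[-1][0]  # fall back to the smallest file

print(pick_quant())
```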
