Cecilia 2B Instruct v1 - GGUF

This repository contains quantized GGUF versions of the gia-uh/cecilia-2b-instruct-v1 model.

These files are optimized for efficient local inference on CPUs and GPUs with tools such as llama.cpp, Ollama, LM Studio, and GPT4All.
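For example, here is a minimal local-inference sketch using llama-cpp-python and huggingface_hub (both assumed to be installed via `pip install llama-cpp-python huggingface_hub`); the repo id and filename match the files listed below:

```python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Download one of the quantized files from this repo (Q6_K shown here).
model_path = hf_hub_download(
    repo_id="gia-uh/cecilia-2b-instruct-v1-GGUF",
    filename="cecilia-2b-instruct-v1-Q6_K.gguf",
)

# Load the model; n_ctx and n_gpu_layers are illustrative defaults,
# not values recommended by the model authors.
llm = Llama(model_path=model_path, n_ctx=4096, n_gpu_layers=-1)

# llama-cpp-python reads the chat template from the GGUF metadata
# when one is embedded in the file.
response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hola, ¿quién eres?"}],
    max_tokens=128,
)
print(response["choices"][0]["message"]["content"])
```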

📦 Available Files

Below are the available quantization formats. Because small models like this 2B one are more sensitive to aggressive quantization, it is recommended to use the highest precision your hardware allows to maintain coherence (see the selection sketch after the table).

| Filename | Type | Size (approx.) | Description & Recommended Use |
|---|---|---|---|
| cecilia-2b-instruct-v1-Q8_0.gguf | Q8_0 | 2.24 GB | Max fidelity. Almost indistinguishable from the original model. |
| cecilia-2b-instruct-v1-Q6_K.gguf | Q6_K | 1.79 GB | Good balance. High quality with considerable memory savings. |
| cecilia-2b-instruct-v1-Q4_K_M.gguf | Q4_K_M | 1.4 GB | High speed. Best compression, with some quality loss. Best for older laptops or mobile devices. |
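As a rough helper for the "highest precision your hardware allows" advice, the sketch below picks the largest file whose estimated footprint fits in free RAM. The sizes mirror the table; the ~1.3x overhead factor for context/KV cache is a hypothetical rule of thumb, and psutil is an assumed extra dependency:

```python
import psutil

QUANTS = [  # (filename, approx. file size in GB), largest first
    ("cecilia-2b-instruct-v1-Q8_0.gguf", 2.24),
    ("cecilia-2b-instruct-v1-Q6_K.gguf", 1.79),
    ("cecilia-2b-instruct-v1-Q4_K_M.gguf", 1.40),
]

def pick_quant(overhead: float = 1.3) -> str:
    """Return the largest quant whose estimated footprint fits in free RAM."""
    free_gb = psutil.virtual_memory().available / 1024**3
    for filename, size_gb in QUANTS:
        if size_gb * overhead <= free_gb:
            return filename
    return QUANTS[-1][0]  # fall back to the smallest file

print(pick_quant())
```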
