# Llama 3.1 8B Instruct - Blasphemer (GGUF)

This is an uncensored version of Meta's Llama 3.1 8B Instruct, processed using Blasphemer. The model delivers fully uncensored outputs; adjust the temperature as needed for your use case. The refusal rate is extremely low, and when a refusal does appear, a single follow-up prompt is usually enough to get the previously withheld output.

In testing, this model performed best at a temperature of 0.7 or higher for tool-calling.
## Model Details

- Base Model: meta-llama/Llama-3.1-8B-Instruct
- Method: Abliteration (refusal direction removal)
- Format: GGUF (for llama.cpp, LM Studio, etc.)
- Quality Metrics:
  - Refusals: 3/100 (3%) ⭐ Excellent
  - KL Divergence: 0.06 ⭐ Excellent
  - Trial: #168 of 200
## Quantization Versions
| File | Size | Use Case |
|---|---|---|
| Q4_K_M | ~4.5GB | Best balance - most popular |
| Q5_K_M | ~5.5GB | Higher quality, slightly larger |
| F16 | ~15GB | Full precision (for further quantization) |
## Usage

### LM Studio

1. Download the GGUF file
2. Open LM Studio
3. Click "Import Model"
4. Select the downloaded file
5. Start chatting!
### llama.cpp

```bash
./llama-cli -m Llama-3.1-8B-Blasphemer-Q4_K_M.gguf -p "Your prompt here"
```
### Python (llama-cpp-python)

```python
from llama_cpp import Llama

llm = Llama(
    model_path="Llama-3.1-8B-Blasphemer-Q4_K_M.gguf",
    n_ctx=8192,        # context window
    n_gpu_layers=-1,   # offload all layers to GPU
)

response = llm("Your prompt here", max_tokens=512)
print(response['choices'][0]['text'])
```
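Since this is an Instruct model, chat-style prompting usually works better than raw completion. A minimal sketch using llama-cpp-python's chat-completion API, which applies the Llama 3.1 chat template embedded in the GGUF; it reuses the `llm` object from above, and the system prompt is only illustrative:

```python
# Chat-style usage; the chat template is read from the GGUF metadata.
response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Your prompt here"},
    ],
    max_tokens=512,
    temperature=0.7,  # 0.7+ worked best for tool-calling in testing (see above)
)
print(response["choices"][0]["message"]["content"])
```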
## What is Abliteration?
Abliteration removes refusal behavior from language models by identifying and removing the neural directions responsible for safety alignment. This is done through:
- Calculating refusal directions from harmful/harmless prompt pairs
- Using Bayesian optimization (TPE) to find optimal removal parameters
- Orthogonalizing model weights to these directions
The result is a model that maintains capabilities while removing refusal behavior.
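Concretely, the orthogonalization step projects the refusal direction out of weight matrices that write into the residual stream. Below is a minimal PyTorch sketch of the idea, with placeholder tensors standing in for real activations; it is illustrative only and not Blasphemer's actual API:

```python
import torch

d_model = 4096  # Llama 3.1 8B hidden size

# Placeholders: in practice these are mean residual-stream activations over
# harmful vs. harmless prompt sets, taken at a layer/position chosen by the optimizer.
harmful_mean = torch.randn(d_model)
harmless_mean = torch.randn(d_model)

# Refusal direction = difference of means, normalized to unit length
refusal_dir = harmful_mean - harmless_mean
refusal_dir = refusal_dir / refusal_dir.norm()

def orthogonalize(W: torch.Tensor, r: torch.Tensor) -> torch.Tensor:
    """Remove the component along the unit refusal direction r from a weight
    matrix W of shape (d_model, d_in) that writes into the residual stream."""
    return W - torch.outer(r, r @ W)

# Placeholder for e.g. an attention output projection or MLP down-projection
W_out = torch.randn(d_model, d_model)
W_out_ablated = orthogonalize(W_out, refusal_dir)
```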
## Ethical Considerations
This model has massively reduced safety guardrails. Users are responsible for:
- Ensuring ethical use of the model
- Compliance with applicable laws and regulations
- Understanding the implications of reduced safety filtering
## Performance
Compared to the original Llama 3.1 8B Instruct:
- Follows instructions more directly
- Responds to previously refused queries
- Maintains general capabilities (KL divergence: 0.06; see the sketch below)
- Greatly reduced safety filtering
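The KL divergence metric compares the original and abliterated models' next-token distributions on ordinary prompts; a value near zero means the edit barely perturbed behavior outside of refusals. A rough sketch of how such a number can be computed (hypothetical shapes and data; not necessarily Blasphemer's exact evaluation procedure):

```python
import torch
import torch.nn.functional as F

def mean_token_kl(orig_logits: torch.Tensor, ablated_logits: torch.Tensor) -> torch.Tensor:
    """Mean per-position KL(P_original || P_ablated) for logits of shape (seq_len, vocab)."""
    log_p = F.log_softmax(orig_logits, dim=-1)     # original model
    log_q = F.log_softmax(ablated_logits, dim=-1)  # abliterated model
    # F.kl_div takes the input in log-space and computes KL(target || input)
    return F.kl_div(log_q, log_p, log_target=True, reduction="batchmean")

# Placeholder logits; in practice, run both models on the same evaluation prompts.
orig = torch.randn(128, 128256)      # 128 positions, Llama 3.1 vocab size
ablated = orig + 0.05 * torch.randn(128, 128256)
print(mean_token_kl(orig, ablated).item())
```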
## Credits
- Base Model: Meta AI (Llama 3.1)
- Abliteration Tool: Blasphemer by Christopher Bradford
- Method: Based on "Refusal in Language Models Is Mediated by a Single Direction" (Arditi et al., 2024)
## Citation

If you use this model, please cite:

```bibtex
@software{blasphemer2024,
  author = {Bradford, Christopher},
  title  = {Blasphemer: Abliteration for Language Models},
  year   = {2024},
  url    = {https://github.com/sunkencity999/blasphemer}
}

@article{arditi2024refusal,
  title   = {Refusal in Language Models Is Mediated by a Single Direction},
  author  = {Arditi, Andy and Obmann, Oscar and Syed, Aaquib and others},
  journal = {arXiv preprint arXiv:2406.11717},
  year    = {2024}
}
```
## License
This model inherits the Llama 3.1 license from Meta AI. Please review the Llama 3.1 License for usage terms.