Instructions for using EldritchLabs/Cactus-Dream-Horror-12B with libraries and local apps.

Libraries

How to use EldritchLabs/Cactus-Dream-Horror-12B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="EldritchLabs/Cactus-Dream-Horror-12B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Print the result so it is visible when run as a script
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("EldritchLabs/Cactus-Dream-Horror-12B")
model = AutoModelForCausalLM.from_pretrained("EldritchLabs/Cactus-Dream-Horror-12B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use EldritchLabs/Cactus-Dream-Horror-12B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "EldritchLabs/Cactus-Dream-Horror-12B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "EldritchLabs/Cactus-Dream-Horror-12B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/EldritchLabs/Cactus-Dream-Horror-12B

SGLang

How to use EldritchLabs/Cactus-Dream-Horror-12B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "EldritchLabs/Cactus-Dream-Horror-12B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "EldritchLabs/Cactus-Dream-Horror-12B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "EldritchLabs/Cactus-Dream-Horror-12B" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "EldritchLabs/Cactus-Dream-Horror-12B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
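
The curl calls above can also be made from Python. The following is a minimal sketch using only the standard library, assuming a vLLM or SGLang server is already running locally as shown; the helper names `build_chat_request` and `chat` are illustrative and not part of either project, and the response parsing assumes the usual OpenAI-style `choices[0].message.content` layout.

```python
import json
import urllib.request


def build_chat_request(base_url, model, user_text):
    """Build a POST request for an OpenAI-compatible /v1/chat/completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(base_url, model, user_text, timeout=120):
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(base_url, model, user_text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumes the OpenAI-style response layout.
    return body["choices"][0]["message"]["content"]


# Example call (needs a running server; port 8000 for vLLM, 30000 for SGLang):
# print(chat("http://localhost:8000",
#            "EldritchLabs/Cactus-Dream-Horror-12B",
#            "What is the capital of France?"))
```

Building the request separately from sending it makes the payload easy to inspect before any network call is made.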

Docker Model Runner
How to use EldritchLabs/Cactus-Dream-Horror-12B with Docker Model Runner:
```
docker model run hf.co/EldritchLabs/Cactus-Dream-Horror-12B
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

⚠️ Note: This model requires the ChatML chat template.
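
For frontends that need the prompt assembled by hand, here is a minimal sketch of the conventional ChatML layout. The function name `to_chatml` is illustrative, and the template bundled with the model may differ in details such as BOS handling, so `tokenizer.apply_chat_template` remains the safer choice when Transformers is available.

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML prompt string."""
    parts = []
    for message in messages:
        # Each turn is wrapped in <|im_start|>role ... <|im_end|>
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = to_chatml([
    {"role": "system", "content": "You are a horror storyteller."},
    {"role": "user", "content": "Who are you?"},
])
print(prompt)
```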

🌵 Cactus Dream Horror 12B

This is a merge of pre-trained language models created using mergekit.

The model is partially censored but can be jailbroken or ablated if needed.

Merge Details

Merge Method

This model was merged using the DELLA merge method, with p-e-w/Mistral-Nemo-Instruct-2407-heretic-noslop as the base.

This merge required the enable_fix_mistral_regex_true.md patch for tokenizer stability.

The graph_v18.py patch also helped, allowing the merge to use 8GB of VRAM for acceleration.
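
To give a rough idea of what DELLA does to each model's difference from the base, here is a simplified pure-Python sketch of magnitude-ranked random pruning with rescaling. It is not mergekit's implementation: the function names are hypothetical, the linear mapping from rank to keep probability over `density - epsilon` to `density + epsilon` is an assumption, and the sign-election and weighted-sum steps are left out.

```python
import random


def keep_probabilities(deltas, density, epsilon):
    """Assign each delta a keep probability based on its magnitude rank.

    The smallest magnitude gets density - epsilon, the largest gets
    density + epsilon, with linear interpolation in between (an assumption).
    """
    n = len(deltas)
    if n == 1:
        return [density]
    order = sorted(range(n), key=lambda i: abs(deltas[i]))
    probs = [0.0] * n
    for rank, idx in enumerate(order):
        probs[idx] = (density - epsilon) + 2 * epsilon * rank / (n - 1)
    return probs


def magnitude_ranked_drop(deltas, density, epsilon, rng):
    """Randomly drop deltas and rescale the survivors by 1 / keep probability."""
    probs = keep_probabilities(deltas, density, epsilon)
    pruned = []
    for delta, p in zip(deltas, probs):
        # Rescaling keeps the expected value of each entry unchanged
        pruned.append(delta / p if rng.random() < p else 0.0)
    return pruned


# Toy deltas (fine-tuned weight minus base weight), with the values
# density=0.9 and epsilon=0.099 taken from the configuration in this card.
deltas = [0.5, -2.0, 0.1]
probs = keep_probabilities(deltas, density=0.9, epsilon=0.099)
pruned = magnitude_ranked_drop(deltas, 0.9, 0.099, random.Random(420))
print(probs)
print(pruned)
```

Under this sketch, a density of 0.9 with an epsilon of 0.099 gives keep probabilities between 0.801 and 0.999, so large-magnitude changes almost always survive while small ones are dropped more often.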

Models Merged

The following models were included in the merge:

p-e-w/Mistral-Nemo-Instruct-2407-heretic-noslop
BeaverAI/MN-2407-DSK-QwQify-v0.1-12B
crestf411/MN-Slush
D1rtyB1rd/Egregore-Alice-RP-NSFW-12B
D1rtyB1rd/Looking-Glass-Alice-Thinking-NSFW-RP
Delta-Vector/Francois-PE-V2-Huali-12B
Delta-Vector/Ohashi-NeMo-12B
Delta-Vector/Rei-V3-KTO-12B
Epiculous/Violet_Twilight-v0.2
elinas/Chronos-Gold-12B-1.0
inflatebot/MN-12B-Mag-Mell-R1
MarinaraSpaghetti/NemoMix-Unleashed-12B
Sao10K/MN-12B-Vespa-x1
TheDrummer/Rocinante-12B-v1.1
TheDrummer/UnslopNemo-12B-v4.1
Vortex5/Crimson-Constellation-12B

Brain Scan Audit

Configuration

The following YAML configuration was used to produce this model:

architecture: MistralForCausalLM
base_model: B:/12B/models--p-e-w--Mistral-Nemo-Instruct-2407-heretic-noslop
models:
  - model: B:/12B/models--p-e-w--Mistral-Nemo-Instruct-2407-heretic-noslop
  - model: B:/12B/models--BeaverAI--MN-2407-DSK-QwQify-v0.1-12B
    parameters:
      density: 0.9
      weight: 0.1
      epsilon: 0.099
  - model: B:/12B/models--crestf411--MN-Slush
    parameters:
      density: 0.9
      weight: 0.1
      epsilon: 0.099
  - model: B:/12B/models--D1rtyB1rd--Egregore-Alice-RP-NSFW-12B
    parameters:
      density: 0.9
      weight: 0.1
      epsilon: 0.099
  - model: B:/12B/models--D1rtyB1rd--Looking-Glass-Alice-Thinking-NSFW-RP
    parameters:
      density: 0.9
      weight: 0.1
      epsilon: 0.099
  - model: B:/12B/models--Delta-Vector--Francois-PE-V2-Huali-12B
    parameters:
      density: 0.9
      weight: 0.1
      epsilon: 0.099
  - model: B:/12B/models--Delta-Vector--Ohashi-NeMo-12B
    parameters:
      density: 0.9
      weight: 0.1
      epsilon: 0.099
  - model: B:/12B/models--Delta-Vector--Rei-V3-KTO-12B
    parameters:
      density: 0.9
      weight: 0.1
      epsilon: 0.099
  - model: B:/12B/models--Epiculous--Violet_Twilight-v0.2
    parameters:
      density: 0.9
      weight: 0.1
      epsilon: 0.099
  - model: B:/12B/models--elinas--Chronos-Gold-12B-1.0
    parameters:
      density: 0.9
      weight: 0.1
      epsilon: 0.099
  - model: B:\12B\models--inflatebot--MN-12B-Mag-Mell-R1
    parameters:
      density: 0.9
      weight: 0.1
      epsilon: 0.099
  - model: B:\12B\models--MarinaraSpaghetti--NemoMix-Unleashed-12B
    parameters:
      density: 0.9
      weight: 0.1
      epsilon: 0.099
  - model: B:\12B\models--Sao10K--MN-12B-Vespa-x1
    parameters:
      density: 0.9
      weight: 0.1
      epsilon: 0.099
  - model: B:\12B\models--TheDrummer--Rocinante-12B-v1.1
    parameters:
      density: 0.9
      weight: 0.1
      epsilon: 0.099
  - model: B:\12B\models--TheDrummer--UnslopNemo-12B-v4.1
    parameters:
      density: 0.9
      weight: 0.1
      epsilon: 0.099
  - model: B:\12B\models--Vortex5--Crimson-Constellation-12B
    parameters:
      density: 0.9
      weight: 0.1
      epsilon: 0.099
# --lazy-unpickle --random-seed 420 --cuda --fix-mistral-regex
merge_method: della
parameters:  
  lambda: 1.0
  normalize: false
  int8_mask: false
dtype: float32
out_dtype: bfloat16
tokenizer:  
  source: "union"  
  tokens:  
    # Force ChatML EOS tokens  
    "<|im_start|>":  
      source: "B:/12B/models--D1rtyB1rd--Egregore-Alice-RP-NSFW-12B"  
      force: true  
    "<|im_end|>":  
      source: "B:/12B/models--D1rtyB1rd--Egregore-Alice-RP-NSFW-12B"  
      force: true  
    # Keep Mistral tokens  
    "[INST]":  
      source: "B:/12B/models--p-e-w--Mistral-Nemo-Instruct-2407-heretic-noslop"  
     #  source: "B:/12B/models--mistralai--Mistral-Nemo-Instruct-2407"    
     # The tokenizer system requires all models referenced in token configurations to be present in the merge's model list to build proper embedding permutations. 
    "[/INST]":  
      source: "B:/12B/models--p-e-w--Mistral-Nemo-Instruct-2407-heretic-noslop"  
    # Force </s> as fallback EOS  
    "</s>":  
      source: "B:/12B/models--p-e-w--Mistral-Nemo-Instruct-2407-heretic-noslop"  
      force: true

chat_template: "chatml"
name: 🌵 Cactus-Dream-Horror-12B