Unreleased model. Do not try downloading it. An unreleased checkpoint was hosted here for some time, but it proved unworthy in testing.
🌌 Aurora One
A compact experimental language model built by North ML.
General conversation · Lightweight assistance · Local inference · Small-model research
Overview
Aurora One is a compact, experimental causal language model designed for general conversation, lightweight assistance, educational experiments, and research into small language models.
With approximately 120 million parameters, Aurora One is intended to provide useful text generation while remaining practical for local and lower-resource inference environments.
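To make the "lower-resource" claim concrete, the following is a back-of-the-envelope sketch of how much memory the weights alone would occupy at common precisions. It assumes the rounded figure of 120 million parameters and ignores activations, the KV cache, and runtime overhead; the function name and the dtype table are illustrative, not part of any library.

```python
# Rough weight-memory estimate for a ~120M-parameter model.
# Assumption: only the weights are counted (no activations, KV cache, or framework overhead).
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}


def weight_megabytes(num_params: int, dtype: str) -> float:
    """Return the size of the weights in megabytes (1 MB = 10**6 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e6


approx_params = 120_000_000  # "approximately 120 million" from the overview
for dtype in BYTES_PER_PARAM:
    print(f"{dtype:>8}: {weight_megabytes(approx_params, dtype):.0f} MB")
```

Actual memory use at inference time will be higher than these figures and depends on the runtime, batch size, and context length.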
Experimental model: Aurora One may produce incorrect information, misunderstand complex instructions, repeat phrases, or generate flawed reasoning.
Capabilities
| Capability | Description |
|---|---|
| 💬 Conversation | Hold basic conversations and respond to straightforward questions. |
| 📖 Explanations | Explain simple concepts using concise, accessible language. |
| ✍️ Writing | Assist with brainstorming, rewriting, short stories, and creative responses. |
| 💻 Basic Coding | Answer introductory programming questions and generate short code examples. |
| 🎯 Instruction Following | Follow short, clearly written instructions and formatting requests. |
| 🔬 Research | Support experiments involving compact language models and local inference. |
Intended Uses
- General-purpose conversational applications
- Lightweight personal assistants
- Educational demonstrations and experiments
- Writing and brainstorming tools
- Simple question answering
- Basic coding assistance
- Chatbot prototypes
- Small-model research
- Local and resource-conscious inference
Limitations
Aurora One is a small experimental model and may:
- Generate factually incorrect or invented information
- Struggle with multi-step reasoning
- Misunderstand long or highly detailed instructions
- Repeat words, phrases, or ideas
- Lose consistency during longer conversations
- Produce code that requires correction
- Perform poorly on specialized or expert-level subjects
Aurora One should not be used as the sole source of information for medical, legal, financial, safety-critical, or other high-stakes decisions.
Model Architecture
| Property | Value |
|---|---|
| Parameters | 119,953,152 |
| Architecture | Llama-style causal transformer |
| Transformer layers | 14 |
| Hidden size | 768 |
| Attention heads | 12 |
| Key-value heads | 12 |
| MLP intermediate size | 2,304 |
| Vocabulary size | 16,384 tokens |
| Maximum context | 2,048 tokens |
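The parameter count in the table can be checked against the other rows. The sketch below assumes a standard Llama-style layout that the card does not spell out: bias-free attention and MLP projections, a gated MLP with three weight matrices, one RMSNorm weight vector before attention and one before the MLP in each layer, a final norm, and input and output embeddings that share weights. The function name is illustrative. Under those assumptions the total comes to exactly 119,953,152.

```python
def llama_param_count(vocab, hidden, layers, intermediate, heads, kv_heads,
                      tie_embeddings=True):
    """Count parameters for an assumed bias-free Llama-style decoder."""
    head_dim = hidden // heads

    embed = vocab * hidden                        # token embedding matrix

    attn = hidden * (heads * head_dim)            # q_proj
    attn += 2 * hidden * (kv_heads * head_dim)    # k_proj and v_proj
    attn += (heads * head_dim) * hidden           # o_proj

    mlp = 3 * hidden * intermediate               # gate_proj, up_proj, down_proj
    norms = 2 * hidden                            # two RMSNorm weight vectors per layer

    total = embed + layers * (attn + mlp + norms) + hidden  # + final norm
    if not tie_embeddings:
        total += vocab * hidden                   # separate output projection
    return total


total = llama_param_count(
    vocab=16_384, hidden=768, layers=14,
    intermediate=2_304, heads=12, kv_heads=12,
)
print(f"{total:,}")  # 119,953,152
```

With untied embeddings the same formula gives 132,536,064, so the figure in the table is consistent with shared input and output embeddings.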
Using Transformers
Text-generation pipeline
from transformers import pipeline

# device_map="auto" requires the accelerate package to be installed.
generator = pipeline(
    task="text-generation",
    model="North-ML1/aurora-one",
    device_map="auto",
)

# Sample up to 128 new tokens after the prompt.
result = generator(
    "Explain why the sky appears blue:",
    max_new_tokens=128,
    temperature=0.7,
    do_sample=True,
)

print(result[0]["generated_text"])
Load the tokenizer and causal language model directly
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "North-ML1/aurora-one"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",  # requires the accelerate package
)

prompt = "Write a short explanation of neural networks."
# Tokenize the prompt and move the tensors to the model's device.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Gradients are not needed for inference.
with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=150,
        temperature=0.7,
        do_sample=True,
        repetition_penalty=1.1,
    )

# output[0] contains the prompt tokens followed by the generated tokens.
print(tokenizer.decode(output[0], skip_special_tokens=True))
Use AutoModelForCausalLM, rather than AutoModel, for text generation. AutoModel loads the base transformer without the language-modeling head that generation needs.
Using vLLM
Install vLLM:
pip install vllm
Start an OpenAI-compatible server:
vllm serve "North-ML1/aurora-one"
Send a completion request:
curl -X POST "http://localhost:8000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "North-ML1/aurora-one",
"prompt": "Once upon a time,",
"max_tokens": 256,
"temperature": 0.7
}'
Using SGLang
Install SGLang:
pip install sglang
Start the server:
python3 -m sglang.launch_server \
--model-path "North-ML1/aurora-one" \
--host 0.0.0.0 \
--port 30000
Send a completion request:
curl -X POST "http://localhost:30000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "North-ML1/aurora-one",
"prompt": "Once upon a time,",
"max_tokens": 256,
"temperature": 0.7
}'
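The same completion request can be sent from Python. The following is a minimal sketch using only the standard library; the helper names `build_completion_request` and `complete` are illustrative, and it assumes one of the servers above is already running and returns the usual OpenAI-style completions response, with the generated text under `choices[0].text`.

```python
import json
import urllib.request


def build_completion_request(base_url, prompt, model="North-ML1/aurora-one",
                             max_tokens=256, temperature=0.7):
    """Build (but do not send) a POST request for the /v1/completions endpoint."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(base_url, prompt, timeout=60.0, **kwargs):
    """Send the request and return the text of the first completion choice."""
    request = build_completion_request(base_url, prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumption: OpenAI-style response shape with a "choices" list.
    return body["choices"][0]["text"]
```

For the vLLM server above, call `complete("http://localhost:8000", "Once upon a time,")`; for SGLang, use `http://localhost:30000` as the base URL.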
Docker Model Runner
docker model run hf.co/North-ML1/aurora-one
Recommended Generation Settings
| Setting | Suggested value |
|---|---|
| temperature | 0.6–0.8 |
| top_p | 0.9–0.95 |
| repetition_penalty | 1.05–1.15 |
| max_new_tokens | 64–256 |
Lower temperatures generally produce more predictable output. Higher temperatures may improve variety but can increase factual errors and repetition.
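To illustrate why temperature and top_p have this effect, here is a conceptual sketch in plain Python. It is not the implementation used by Transformers, vLLM, or SGLang; the toy logits and function names are made up for the example. Temperature divides the logits before the softmax, so lower values concentrate probability on the most likely token, and top_p keeps only the smallest set of most likely tokens whose combined probability reaches the threshold.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Zero out tokens outside the nucleus and renormalise the rest."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:  # nucleus is complete once the threshold is reached
            break
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / mass
    return filtered


logits = [2.0, 1.0, 0.5, -1.0]  # toy values for four hypothetical tokens
cool = softmax_with_temperature(logits, 0.6)
warm = softmax_with_temperature(logits, 0.8)
print("top-token probability at 0.6:", round(cool[0], 3))
print("top-token probability at 0.8:", round(warm[0], 3))
print("after top_p=0.9:", [round(p, 3) for p in top_p_filter(warm, 0.9)])
```

With these toy logits, the top token gets a larger share at temperature 0.6 than at 0.8, and top_p=0.9 removes the least likely token entirely.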
Example prompt
Explain photosynthesis in simple language.
Answer clearly and keep the response under five sentences.
Example system-style instruction
You are Aurora One, a helpful and concise assistant.
Follow the user's instructions carefully.
State when you are uncertain.
Do not invent sources or facts.
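The card does not document a chat template, so how the system-style instruction and the user prompt should be combined is left open. The sketch below shows one plain-text way to join them into a single completion prompt; the `User:` and `Assistant:` labels and the `build_prompt` helper are hypothetical and may not match how the model was trained. If the tokenizer ships a chat template, `tokenizer.apply_chat_template` would be the more reliable route.

```python
SYSTEM_INSTRUCTION = (
    "You are Aurora One, a helpful and concise assistant.\n"
    "Follow the user's instructions carefully.\n"
    "State when you are uncertain.\n"
    "Do not invent sources or facts."
)


def build_prompt(user_message, system_instruction=SYSTEM_INSTRUCTION):
    """Join the instruction and the user message into one completion prompt.

    The "User:" / "Assistant:" labels are an assumed layout, not a documented format.
    """
    return (
        f"{system_instruction.strip()}\n\n"
        f"User: {user_message.strip()}\n"
        "Assistant:"
    )


prompt = build_prompt(
    "Explain photosynthesis in simple language.\n"
    "Answer clearly and keep the response under five sentences."
)
print(prompt)
```

The resulting string can be passed as the prompt to the pipeline or to a completions endpoint.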
Evaluation
Aurora One is under active experimental evaluation. Benchmark results should be interpreted cautiously because small changes in prompting, tokenization, and generation settings can significantly affect results.
Future evaluations may cover:
- General knowledge
- Instruction following
- Basic mathematical reasoning
- Code generation
- Reading comprehension
- Repetition and generation stability
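As one possible way to quantify the repetition item above, the sketch below computes the fraction of word-level n-grams in a generated text that duplicate an earlier n-gram. This is an illustrative metric, not the evaluation method used for Aurora One; splitting on whitespace stands in for real tokenization, and the function name is made up for the example.

```python
def repeated_ngram_fraction(text, n=3):
    """Return the share of n-grams that repeat an earlier n-gram (0.0 = no repetition)."""
    tokens = text.split()  # assumption: whitespace words approximate tokens
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0  # text shorter than n words
    return 1.0 - len(set(ngrams)) / len(ngrams)


print(repeated_ngram_fraction("the cat sat on the mat", n=3))
print(repeated_ngram_fraction("a b a b a b", n=2))
```

Values near 0 indicate little repetition; values closer to 1 indicate that most of the text is repeated phrases.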
Responsible Use
Developers using Aurora One should:
- Clearly disclose when users are interacting with an AI system
- Validate factual claims before presenting them as reliable
- Apply additional safeguards for public deployments
- Avoid using the model for high-stakes automated decisions
- Test the model for failure cases relevant to their application
Built by North ML
Aurora One is an experimental model created to explore how much useful capability can be achieved with compact language-model architectures.
Model: North-ML1/aurora-one