Text Generation
Transformers
English
chat
causal-lm
llama
experimental
aurora-one

Unreleased model. Do not try to download it. An unreleased checkpoint was hosted here for some time, but testing proved it unworthy of release.

Aurora One model artwork

🌌 Aurora One

A compact experimental language model built by North ML.

General conversation · Lightweight assistance · Local inference · Small-model research

120M parameters · 2,048-token context · English · Experimental


Overview

Aurora One is a compact, experimental causal language model designed for general conversation, lightweight assistance, educational experiments, and research into small language models.

With approximately 120 million parameters, Aurora One is intended to provide useful text generation while remaining practical for local and lower-resource inference environments.

Experimental model: Aurora One may produce incorrect information, misunderstand complex instructions, repeat phrases, or generate flawed reasoning.

Capabilities

💬 Conversation

Hold basic conversations and respond to straightforward questions.

📖 Explanations

Explain simple concepts using concise, accessible language.

✍️ Writing

Assist with brainstorming, rewriting, short stories, and creative responses.

💻 Basic Coding

Answer introductory programming questions and generate short code examples.

🎯 Instruction Following

Follow short, clearly written instructions and formatting requests.

🔬 Research

Support experiments involving compact language models and local inference.

Intended Uses

  • General-purpose conversational applications
  • Lightweight personal assistants
  • Educational demonstrations and experiments
  • Writing and brainstorming tools
  • Simple question answering
  • Basic coding assistance
  • Chatbot prototypes
  • Small-model research
  • Local and resource-conscious inference

Limitations

Aurora One is a small experimental model and may:

  • Generate factually incorrect or invented information
  • Struggle with multi-step reasoning
  • Misunderstand long or highly detailed instructions
  • Repeat words, phrases, or ideas
  • Lose consistency during longer conversations
  • Produce code that requires correction
  • Perform poorly on specialized or expert-level subjects

Aurora One should not be used as the sole source of information for medical, legal, financial, safety-critical, or other high-stakes decisions.

Model Architecture

Property                 Value
Parameters               119,953,152
Architecture             Llama-style causal transformer
Transformer layers       14
Hidden size              768
Attention heads          12
Key-value heads          12
MLP intermediate size    2,304
Vocabulary size          16,384 tokens
Maximum context          2,048 tokens
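
The parameter count can be cross-checked against the other rows of the table. The sketch below is not taken from the Aurora One codebase. It assumes a standard Llama-style layout: no bias terms, a gated MLP with three weight matrices, two RMSNorm weight vectors per layer plus one final norm, and input and output embeddings that share weights. The weight tying in particular is an assumption, chosen because it is what makes the arithmetic land on the listed figure.

```python
def llama_style_param_count(vocab, hidden, layers, heads, kv_heads,
                            intermediate, tied_embeddings=True):
    """Count weights in a Llama-style decoder (assumes no biases, RMSNorm, gated MLP)."""
    head_dim = hidden // heads
    embedding = vocab * hidden
    # Query and output projections are hidden x hidden; key and value
    # projections shrink when there are fewer key-value heads.
    attention = 2 * hidden * hidden + 2 * hidden * (kv_heads * head_dim)
    mlp = 3 * hidden * intermediate      # gate, up, and down projections
    norms = 2 * hidden                   # two RMSNorm weight vectors per layer
    total = embedding + layers * (attention + mlp + norms) + hidden  # + final norm
    if not tied_embeddings:
        total += vocab * hidden          # separate output projection
    return total


# Values copied from the architecture table; tying is the assumption noted above.
total = llama_style_param_count(
    vocab=16_384, hidden=768, layers=14, heads=12, kv_heads=12, intermediate=2_304
)
print(f"{total:,}")
```

Under these assumptions the total is 119,953,152, matching the table. With untied embeddings the same layout would come to 132,536,064.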

Using Transformers

Text-generation pipeline

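from transformers import pipeline

# device_map="auto" requires the accelerate package and lets it choose the device.
generator = pipeline(
    task="text-generation",
    model="North-ML1/aurora-one",
    device_map="auto",
)

# temperature only takes effect when do_sample=True.
result = generator(
    "Explain why the sky appears blue:",
    max_new_tokens=128,
    temperature=0.7,
    do_sample=True,
)

# generated_text contains the prompt followed by the model's continuation.
print(result[0]["generated_text"])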

Load the tokenizer and causal language model directly

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "North-ML1/aurora-one"

tokenizer = AutoTokenizer.from_pretrained(model_id)

# torch_dtype="auto" loads the weights in the dtype stored in the checkpoint.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)

prompt = "Write a short explanation of neural networks."

# Tokenize the prompt and move the tensors to the model's device.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Gradients are not needed for inference.
with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=150,
        temperature=0.7,
        do_sample=True,
        repetition_penalty=1.1,
    )

# output[0] holds the prompt tokens followed by the generated tokens.
print(tokenizer.decode(output[0], skip_special_tokens=True))

Use AutoModelForCausalLM rather than AutoModel for text generation: AutoModel loads the base transformer without the language-modeling head that generate() relies on.
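
The prompt and the generated tokens share the model's 2,048-token context, so a long prompt may need trimming before generation. The helper below is a minimal sketch of one way to budget for that. The name `fit_to_context` is hypothetical, and keeping only the most recent tokens is an assumed policy, not something the model card prescribes. Note that this policy also drops any beginning-of-sequence token when it truncates.

```python
MAX_CONTEXT = 2048  # maximum context listed in the architecture table


def fit_to_context(token_ids, max_new_tokens, max_context=MAX_CONTEXT):
    """Keep the most recent prompt tokens so prompt + generation fits the window."""
    if not 0 < max_new_tokens < max_context:
        raise ValueError("max_new_tokens must be between 1 and max_context - 1")
    budget = max_context - max_new_tokens  # tokens left over for the prompt
    return token_ids[-budget:]


# A hypothetical 3,000-token prompt trimmed to leave room for 150 new tokens.
trimmed = fit_to_context(list(range(3000)), max_new_tokens=150)
print(len(trimmed))
```

With a real tokenizer, the list would be the prompt's `input_ids`, truncated before they are converted to a tensor and passed to `generate`.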

Using vLLM

Install vLLM:

pip install vllm

Start an OpenAI-compatible server:

vllm serve "North-ML1/aurora-one"

Send a completion request:

curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "North-ML1/aurora-one",
    "prompt": "Once upon a time,",
    "max_tokens": 256,
    "temperature": 0.7
  }'

Using SGLang

Install SGLang:

pip install sglang

Start the server:

python3 -m sglang.launch_server \
  --model-path "North-ML1/aurora-one" \
  --host 0.0.0.0 \
  --port 30000

Send a completion request:

curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "North-ML1/aurora-one",
    "prompt": "Once upon a time,",
    "max_tokens": 256,
    "temperature": 0.7
  }'
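
The same completion request can be sent from Python. The sketch below uses only the standard library and assumes one of the servers above is already running locally. The helper names `build_completion_request` and `complete` are illustrative, and the response parsing assumes the usual OpenAI-style completions shape with a `choices` list.

```python
import json
import urllib.request


def build_completion_request(base_url, model, prompt, max_tokens=256, temperature=0.7):
    """Build a POST request for an OpenAI-compatible /v1/completions endpoint."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(base_url, model, prompt, **kwargs):
    """Send the request and return the text of the first completion."""
    request = build_completion_request(base_url, model, prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

Calling `complete("http://localhost:8000", "North-ML1/aurora-one", "Once upon a time,")` targets the vLLM example; replacing the port with 30000 targets the SGLang example.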

Docker Model Runner

docker model run hf.co/North-ML1/aurora-one

Recommended Generation Settings

Setting               Suggested value
temperature           0.6–0.8
top_p                 0.9–0.95
repetition_penalty    1.05–1.15
max_new_tokens        64–256

Lower temperatures generally produce more predictable output. Higher temperatures may improve variety but can increase factual errors and repetition.
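
To make the effect of these settings concrete, the sketch below applies temperature scaling and top-p (nucleus) filtering to a made-up set of logits. It illustrates the general sampling technique rather than Aurora One's or any library's exact implementation, and the four logit values are hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw logits into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose mass reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / mass
    return filtered


logits = [2.0, 1.0, 0.5, -1.0]  # hypothetical scores for four candidate tokens

# Lower temperatures concentrate probability on the top-scoring token.
for temperature in (0.6, 0.8, 1.2):
    probs = softmax_with_temperature(logits, temperature)
    print(temperature, [round(p, 3) for p in probs])

# top_p then removes the unlikely tail before sampling.
print([round(p, 3) for p in top_p_filter(softmax_with_temperature(logits, 0.7), 0.9)])
```

The printed rows show the top token's share shrinking as the temperature rises, which is why higher values add variety at the cost of more low-probability choices.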

Example prompt

Explain photosynthesis in simple language.

Answer clearly and keep the response under five sentences.

Example system-style instruction

You are Aurora One, a helpful and concise assistant.

Follow the user's instructions carefully.
State when you are uncertain.
Do not invent sources or facts.
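
The card does not document a chat template, so one simple option is to place the system-style instruction ahead of the user prompt as plain text. The sketch below does only that. The function name `build_prompt` and the blank-line separator are assumptions made for illustration, not a format the model is known to expect.

```python
def build_prompt(system_instruction, user_prompt):
    """Join a system-style instruction and a user prompt into one plain-text prompt."""
    parts = [system_instruction.strip(), user_prompt.strip()]
    # Skip empty parts so a missing instruction does not leave stray blank lines.
    return "\n\n".join(part for part in parts if part)


system_instruction = (
    "You are Aurora One, a helpful and concise assistant.\n\n"
    "Follow the user's instructions carefully.\n"
    "State when you are uncertain.\n"
    "Do not invent sources or facts."
)
user_prompt = (
    "Explain photosynthesis in simple language.\n\n"
    "Answer clearly and keep the response under five sentences."
)

print(build_prompt(system_instruction, user_prompt))
```

The resulting string can be passed as the prompt in any of the examples above.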

Evaluation

Aurora One is under active experimental evaluation. Benchmark results should be interpreted cautiously because small changes in prompting, tokenization, and generation settings can significantly affect results.

Future evaluations may cover:

  • General knowledge
  • Instruction following
  • Basic mathematical reasoning
  • Code generation
  • Reading comprehension
  • Repetition and generation stability
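
For the repetition item, one commonly used proxy is the distinct-n ratio: the share of n-grams in an output that are unique. The sketch below is an illustrative metric only, not the evaluation North ML uses, and the whitespace tokenization is a simplifying assumption.

```python
def distinct_n(tokens, n):
    """Return unique n-grams divided by total n-grams (closer to 1.0 means less repetition)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0  # the text is shorter than n tokens
    return len(set(ngrams)) / len(ngrams)


# Hypothetical outputs, split on whitespace for simplicity.
repetitive = "the sky is blue the sky is blue the sky is blue".split()
varied = "sunlight scatters off air molecules in the atmosphere".split()

print(distinct_n(repetitive, 2), distinct_n(varied, 2))
```

Comparing the ratio across generation settings, for example with and without repetition_penalty, gives a rough signal of how often the model loops.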

Responsible Use

Developers using Aurora One should:

  • Clearly disclose when users are interacting with an AI system
  • Validate factual claims before presenting them as reliable
  • Apply additional safeguards for public deployments
  • Avoid using the model for high-stakes automated decisions
  • Test the model for failure cases relevant to their application

Built by North ML

Aurora One is an experimental model created to explore how much useful capability can be achieved with compact language-model architectures.

Model: North-ML1/aurora-one