SQLTemple-1.1B-Alpha

SQLTemple-1.1B-Alpha is a specialized SQL code generation model fine-tuned from TinyLlama-1.1B-Chat-v1.0 using LoRA on the Spider dataset.

Model Details

  • Base Model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
  • Parameters: ~1.1B
  • Training Dataset: Spider (7,000-example training split)
  • Training Method: LoRA (r=16, α=32); see the configuration sketch after this list
  • Training Examples: 1,000 (subset of the Spider training split)
  • Context Length: 512 tokens
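
As a rough illustration of how the LoRA settings above map onto code, the snippet below uses the PEFT library to attach an adapter with r=16 and α=32 to the base model. This is a sketch only: the target modules, dropout, and task type are assumptions not specified by this card.

from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")

# r and lora_alpha match the values reported above; the rest is assumed.
lora_config = LoraConfig(
    r=16,                                 # rank of the low-rank update matrices
    lora_alpha=32,                        # scaling factor (α) for the update
    target_modules=["q_proj", "v_proj"],  # assumption: attention projections only
    lora_dropout=0.05,                    # assumption: not stated on this card
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only a small fraction of the 1.1B weights train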

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load from the Hub (or point at a local checkout such as "./sqltemple-1.1b-alpha-hf").
tokenizer = AutoTokenizer.from_pretrained("victorbona/sqltemple-1.1b-alpha")
model = AutoModelForCausalLM.from_pretrained("victorbona/sqltemple-1.1b-alpha")

# The model expects the prompt format it was fine-tuned with.
prompt = "<|system|>You are an SQL assistant. Answer in valid SQL.\n<|user|>Question: Get all users\n<|assistant|>"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Training Details

  • Epochs: 1
  • Learning Rate: 0.0001
  • Batch Size: 8
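
For orientation, the hyperparameters above could be expressed as Hugging Face TrainingArguments roughly as follows. This is a minimal sketch, not the author's actual training script; the output path, precision, and logging settings are assumptions.

from transformers import TrainingArguments

# Epochs, learning rate, and batch size come from the list above;
# everything else is an assumption for illustration.
training_args = TrainingArguments(
    output_dir="./sqltemple-1.1b-alpha",  # hypothetical output path
    num_train_epochs=1,
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    fp16=True,         # assumption, consistent with the F16 weights of the release
    logging_steps=10,  # assumption: not stated on this card
)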