Model Card
Legal LLM (Fine-Tuned on Indian Supreme Court Synthetic Q&A)
- Developed by: Shivam Pandey (shivvamm)
- License: apache-2.0
- Finetuned from: unsloth/meta-llama-3.1-8b-unsloth-bnb-4bit
This model is a fine-tuned version of LLaMA-3.1-8B, optimized with Unsloth for high-speed training and inference.
It was trained on a synthetic legal Q&A dataset derived from Indian Supreme Court case data, enabling the model to reason over legal facts, summarize judgments, and answer domain-specific legal queries.
Training was conducted with the Hugging Face TRL library and Unsloth's accelerated training pipeline, achieving roughly 2× faster fine-tuning than a standard training setup.
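A minimal usage sketch with the Hugging Face transformers library is shown below. The repository id is a placeholder for wherever this checkpoint is hosted, and the prompt and generation settings are illustrative only.

```python
# Minimal inference sketch (assumes transformers + accelerate are installed).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-username/legal-llm-indian-sc"  # placeholder, not the actual repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Build a Llama-3.1 chat prompt via the tokenizer's chat template.
messages = [{"role": "user", "content": "Explain the doctrine of basic structure in simple terms."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```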
📚 Training Dataset
The training data consists of:
- Synthetic legal questions and answers generated from publicly available Supreme Court of India judgment texts.
- A structured Q&A conversation format (an illustrative record is shown below).
- Emphasis on:
  - Case summaries
  - Legal principles
  - Interpretations
  - Procedural details
  - Outcome classification
No private or confidential data was used.
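For illustration, a record in this conversation format might look like the hypothetical sample below; the exact field names and content in the dataset may differ.

```python
# Hypothetical training record; field names ("conversations", "role",
# "content") are assumptions for illustration, not the dataset's actual schema.
sample = {
    "conversations": [
        {
            "role": "user",
            "content": "What principle did the Supreme Court lay down on anticipatory bail in this case?",
        },
        {
            "role": "assistant",
            "content": "The Court held that anticipatory bail under Section 438 CrPC ...",
        },
    ]
}
```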
🧠 Model Capabilities
The model has been evaluated on test samples from the synthetic dataset and performs reliably on:
- Answering legal domain questions
- Summarizing court judgments
- Explaining legal concepts in simple language
- Classifying legal issues
- Extracting key principles from case texts
⚡ Performance
Initial testing shows:
- Strong understanding of legal terminology
- Accurate reasoning within the context of Supreme Court judgments
- Smooth conversational and structured Q&A performance
- Improved logical consistency after fine-tuning
Further benchmarking (e.g., LawBench, Indian legal QA datasets) is planned.
📊 Fine-Tuning Details
- Framework: Unsloth + Hugging Face TRL
- Technique: PEFT / LoRA
- Precision: 4-bit (bnb-4bit)
- Training Speed: ~2× faster with Unsloth acceleration
- Training Style: Instruction-tuned Q&A format
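A sketch of this setup is shown below, assuming recent versions of Unsloth and TRL; the hyperparameters and data path are illustrative, not the exact values used for this model.

```python
# Illustrative LoRA fine-tuning sketch with Unsloth + TRL.
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/meta-llama-3.1-8b-unsloth-bnb-4bit",  # 4-bit base model
    max_seq_length=2048,
    load_in_4bit=True,
)
# Attach LoRA adapters so only a small fraction of the weights is trained (PEFT).
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
dataset = load_dataset("json", data_files="legal_qa.jsonl", split="train")  # placeholder path
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(
        per_device_train_batch_size=2,
        num_train_epochs=1,
        output_dir="outputs",
    ),
)
trainer.train()
```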
📌 Use Cases
- Legal research assistance
- Court judgment summarization
- Law student Q&A assistant
- Domain-specific legal reasoning
- Automated drafting helpers (non-advisory)
⚠️ Limitations & Disclaimer
- The model does not provide legal advice.
- Outputs may contain inaccuracies and should not be used for professional legal decision-making.
- The dataset includes synthetically generated labels, which may introduce bias or hallucinations.
❤️ Built With
This model was built with Unsloth, LLaMA, and Hugging Face TRL.
