nolanplatt committed on
Commit
158e334
·
verified ·
1 Parent(s): d96a48b

Create README.md

Files changed (1)
  1. README.md +55 -0
README.md ADDED
@@ -0,0 +1,55 @@
+ ---
+ license: apache-2.0
+ language:
+ - en
+ tags:
+ - maritime
+ - AIS
+ - vessel-tracking
+ - navigation
+ - fine-tuned
+ - experimental
+ - research
+ base_model: meta-llama/Llama-3.1-8B-Instruct
+ datasets:
+ - synthetic-maritime-ais-qa
+ model-index:
+ - name: hvf-slm-v2-magistral
+   results: []
+ ---
+
+ # HVF-SLM v2 (Llama): Improved Maritime/AIS LLM with Limitations
+
+ Second iteration in the HVF-SLM series, based on Llama 3.1 8B. It improves substantially on v1-magistral in coordinate extraction but suffers from hallucination issues.
+
+
+ **Improvements over v1:**
+ - 97% better vessel identification accuracy
+ - Successful 131k token context processing
+ - Lower training loss (0.009 → 0.0002)
+
+ **Critical Issues:**
+ - Repeats phrases 50+ times
+ - Invents vessel positions
+ - Extracts wrong vessels when similar names exist
+
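The phrase-repetition issue listed above can be screened for automatically. The following is a minimal sketch, not part of the HVF tooling: the function names, the 4-word window, and the use of 50 as a threshold (taken from the "50+ times" figure above) are all illustrative assumptions.

```python
from collections import Counter


def max_ngram_repeats(text: str, n: int = 4) -> int:
    """Return the highest occurrence count of any n-word phrase in `text`."""
    words = text.lower().split()
    if len(words) < n:
        return 0
    # Count every sliding window of n consecutive words.
    counts = Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
    return max(counts.values())


def looks_degenerate(text: str, n: int = 4, threshold: int = 50) -> bool:
    """Flag outputs in which some n-word phrase repeats `threshold`+ times.

    The threshold of 50 is an assumption based on the issue described
    in this model card, not a tuned value.
    """
    return max_ngram_repeats(text, n) >= threshold
```

A check like this only catches verbatim loops. It says nothing about invented positions or wrong-vessel extraction, which need comparison against the source AIS data.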
+ ## Model Details
+ - **Base Model**: Llama-3.1-8B-Instruct
+ - **Context Length**: 131k tokens (natively supported by Llama 3.1)
+ - **Training Dataset**: ~22,000 synthetic maritime Q&A pairs
+ - **Fine-tuning Method**: QLoRA (4-bit), rank 128
+ - **Status**: Superseded by v3-qwen
+
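To give a sense of scale for a rank-128 adapter, the sketch below estimates the number of trainable LoRA parameters. The model card does not say which modules were adapted, so the choice of the four attention projections is purely an assumption. The layer dimensions are those of the public Llama 3.1 8B configuration.

```python
# Llama 3.1 8B attention dimensions (public model configuration).
HIDDEN = 4096   # hidden size
KV_DIM = 1024   # 8 key/value heads x 128 head dimension
LAYERS = 32     # number of transformer layers
RANK = 128      # LoRA rank stated in the model card

# ASSUMPTION: adapters on the attention projections only.
# The actual target modules are not given in the model card.
ATTN_PROJECTIONS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
}


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """A LoRA adapter adds A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


def total_lora_params(projections: dict, layers: int, rank: int) -> int:
    """Sum adapter parameters over all targeted projections and layers."""
    per_layer = sum(lora_params(i, o, rank) for i, o in projections.values())
    return per_layer * layers


if __name__ == "__main__":
    print(total_lora_params(ATTN_PROJECTIONS, LAYERS, RANK))
```

Under these assumptions the adapter has roughly 109 million trainable parameters. Targeting the MLP projections as well would raise that figure considerably.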
+ ## Research Value
+
+ Although it is not our final SLM, this model demonstrates the challenge of training SLMs on structured maritime data. Its training metrics were good, yet it failed to generalize properly, which led to the development of the improved v3-qwen.
+
+ ## Not Recommended For:
+ - Production maritime systems
+ - Safety-critical applications
+ - Real vessel tracking
+
+
+ ## Citation
+
+ This model is part of a larger, in-depth paper by HVF. A full citation will be available upon publication.