Update README.md
--- a/README.md
+++ b/README.md
@@ -8,7 +8,7 @@ pipeline_tag: text-generation
 tags:
 - table
 ---
-# Model Card for
+# Model Card for TAMA-5e-7
 
 <!-- Provide a quick summary of what the model is/does. -->
 
@@ -16,7 +16,7 @@ Recent advances in table understanding have focused on instruction-tuning large
 
 Through systematic analysis, we show that hyperparameters, such as learning rate, can significantly influence both table-specific and general capabilities. Contrary to the previous table instruction-tuning work, we demonstrate that smaller learning rates and fewer training instances can enhance table understanding while preserving general capabilities. Based on our findings, we introduce TAMA, a TAble LLM instruction-tuned from LLaMA 3.1 8B Instruct, which achieves performance on par with, or surpassing GPT-3.5 and GPT-4 on table tasks, while maintaining strong out-of-domain generalization and general capabilities. Our findings highlight the potential for reduced data annotation costs and more efficient model development through careful hyperparameter selection.
 
-## Model Details
+## 🚀 Model Details
 
 ### Model Description
 
@@ -42,7 +42,7 @@ Through systematic analysis, we show that hyperparameters, such as learning rate
 TAMA is intended for the use in table understanding tasks and to facilitate future research.
 
 
-## How to Get Started with the Model
+## 🔨 How to Get Started with the Model
 
 Use the code below to get started with the model.
 Starting with `transformers >= 4.43.0` onward, you can run conversational inference using the Transformers pipeline abstraction or by leveraging the Auto classes with the generate() function.
@@ -109,7 +109,7 @@ llamafactory-cli train yamls/train.yaml
 - **Cutoff length:** 2048
 - **Learning rate**: 5e-7
 
-## Evaluation
+## 📝 Evaluation
 
 ### Results
 
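The hunk above notes that, from `transformers >= 4.43.0` onward, conversational inference works through the Transformers pipeline abstraction or via the Auto classes with `generate()`, but the actual snippet falls outside the diff context. Below is a minimal sketch of the pipeline route, following the usual Llama-3.1-Instruct chat pattern; the repository ID, dtype, and example table prompt are assumptions for illustration, not taken from the README.

```python
# Sketch only: conversational inference with the text-generation pipeline.
# "MichiganNLP/tama-5e-7" is an assumed placeholder — substitute the actual
# Hugging Face repository ID of this TAMA checkpoint.
import torch
import transformers

model_id = "MichiganNLP/tama-5e-7"  # placeholder repo ID

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

# Chat-style input: the pipeline applies the model's chat template internally.
messages = [
    {"role": "system", "content": "You are a helpful assistant for table understanding tasks."},
    {
        "role": "user",
        "content": "How many rows does this table have?\n"
                   "| id | name | score |\n|----|------|-------|\n| 1 | a | 0.5 |\n| 2 | b | 0.9 |",
    },
]

outputs = pipeline(messages, max_new_tokens=256)
print(outputs[0]["generated_text"][-1])  # the assistant's reply message
```

The Auto-classes route (`AutoTokenizer` plus `AutoModelForCausalLM.from_pretrained(...)` followed by `generate()`) is the equivalent alternative when finer control over tokenization and decoding is needed.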