Update README.md
--- a/README.md
+++ b/README.md
@@ -8,7 +8,7 @@ pipeline_tag: text-generation
 tags:
 - table
 ---
-# Model Card for
+# Model Card for TAMA-5e-7
 
 <!-- Provide a quick summary of what the model is/does. -->
 
@@ -16,7 +16,7 @@ Recent advances in table understanding have focused on instruction-tuning large
 
 Through systematic analysis, we show that hyperparameters, such as learning rate, can significantly influence both table-specific and general capabilities. Contrary to the previous table instruction-tuning work, we demonstrate that smaller learning rates and fewer training instances can enhance table understanding while preserving general capabilities. Based on our findings, we introduce TAMA, a TAble LLM instruction-tuned from LLaMA 3.1 8B Instruct, which achieves performance on par with, or surpassing GPT-3.5 and GPT-4 on table tasks, while maintaining strong out-of-domain generalization and general capabilities. Our findings highlight the potential for reduced data annotation costs and more efficient model development through careful hyperparameter selection.
 
-## Model Details
+## 🚀 Model Details
 
 ### Model Description
 
@@ -42,7 +42,7 @@ Through systematic analysis, we show that hyperparameters, such as learning rate
 TAMA is intended for the use in table understanding tasks and to facilitate future research.
 
 
-## How to Get Started with the Model
+## 🔨 How to Get Started with the Model
 
 Use the code below to get started with the model.
 Starting with `transformers >= 4.43.0` onward, you can run conversational inference using the Transformers pipeline abstraction or by leveraging the Auto classes with the generate() function.
@@ -109,7 +109,7 @@ llamafactory-cli train yamls/train.yaml
 - **Cutoff length:** 2048
 - **Learning rate**: 5e-7
 
-## Evaluation
+## 📝 Evaluation
 
 ### Results
 
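The hunk above notes that, from `transformers >= 4.43.0` onward, conversational inference works through the Transformers pipeline abstraction or via the Auto classes with `generate()`, but the actual snippet falls outside the diff context. Below is a minimal sketch of the pipeline route, following the usual Llama-3.1-Instruct chat pattern; the repository ID, dtype, and example table prompt are assumptions for illustration, not taken from the README.

```python
# Sketch only: conversational inference with the text-generation pipeline.
# "MichiganNLP/tama-5e-7" is an assumed placeholder — substitute the actual
# Hugging Face repository ID of this TAMA checkpoint.
import torch
import transformers

model_id = "MichiganNLP/tama-5e-7"  # placeholder repo ID

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

# Chat-style input: the pipeline applies the model's chat template internally.
messages = [
    {"role": "system", "content": "You are a helpful assistant for table understanding tasks."},
    {
        "role": "user",
        "content": "How many rows does this table have?\n"
                   "| id | name | score |\n|----|------|-------|\n| 1 | a | 0.5 |\n| 2 | b | 0.9 |",
    },
]

outputs = pipeline(messages, max_new_tokens=256)
print(outputs[0]["generated_text"][-1])  # the assistant's reply message
```

The Auto-classes route (`AutoTokenizer` plus `AutoModelForCausalLM.from_pretrained(...)` followed by `generate()`) is the equivalent alternative when finer control over tokenization and decoding is needed.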