---
language: vi
tags:
- summarization
- multi-answer-summarization
- t5
- vietnamese
- absosum
license: apache-2.0
metrics:
- rouge
---
# ABSOSUM Phase 2 V1.0 - Weight-Aware Multi-Answer Summarization
A weight-aware T5 model fine-tuned for multi-answer summarization on Vietnamese Q&A data (ABSOSUM Phase 2).
## Model Description
- **Base Model:** T5-base
- **Architecture:** V2++ with Weight-Aware Cross-Attention
- **Task:** Multi-answer summarization with answer importance weighting
- **Language:** Vietnamese
## Training Details
- Special tokens: `<POST>`, `<ANS>`
- Max sequence length: 512
- Max target length: 400
- Weight injection: log-scaled weights in cross-attention
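The card does not spell out how the log-scaled weights enter the cross-attention. As a rough sketch of the general idea (all names hypothetical, NumPy used for brevity): each source token inherits its answer's importance weight, and the log of that weight is added as a bias to the cross-attention logits before the softmax, so higher-weighted answers receive proportionally more attention.

```python
import numpy as np

def weight_aware_attention(scores, answer_weights, token_to_answer):
    """Sketch of weight injection into cross-attention (hypothetical).

    scores:          (tgt_len, src_len) raw attention logits
    answer_weights:  importance weight per answer (e.g. vote counts)
    token_to_answer: per source token, the index of its answer, or -1
                     for <POST> tokens (which receive no bias)
    """
    bias = np.zeros(scores.shape[1])
    for j, a in enumerate(token_to_answer):
        if a >= 0:
            # log1p keeps the bias bounded even for very large weights
            bias[j] = np.log1p(answer_weights[a])
    logits = scores + bias  # broadcast over target positions
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)
```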
## Performance
ROUGE Scores on Test Set:
- ROUGE-1: 44.98%
- ROUGE-2: 22.26%
- ROUGE-L: 33.65%
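For readers unfamiliar with the metric, ROUGE-N F1 over whitespace-tokenized text can be sketched as follows (a minimal illustration; the scores above were presumably computed with a standard ROUGE implementation, and Vietnamese evaluation typically also involves word segmentation):

```python
from collections import Counter

def rouge_n(candidate, reference, n=1):
    """ROUGE-N F1 between two whitespace-tokenized strings (sketch)."""
    def ngrams(text):
        toks = text.split()
        return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))
    c, r = ngrams(candidate), ngrams(reference)
    overlap = sum((c & r).values())          # clipped n-gram matches
    prec = overlap / max(sum(c.values()), 1)
    rec = overlap / max(sum(r.values()), 1)
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0
```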
## Usage
```python
from transformers import T5Tokenizer, T5ForConditionalGeneration
model = T5ForConditionalGeneration.from_pretrained("HuyTran1301/ABSOSUM_Phase2_v1.2")
tokenizer = T5Tokenizer.from_pretrained("HuyTran1301/ABSOSUM_Phase2_v1.2")
# Format input with special tokens
input_text = "<POST> Your question here </s> <ANS> Answer 1 </s> <ANS> Answer 2 </s>"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids
# Generate summary
outputs = model.generate(input_ids, max_length=150)
summary = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(summary)
```
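The `<POST>`/`<ANS>` input string can be assembled from a question and its answers with a small helper (a hypothetical convenience function, not part of the released code; it assumes the separator format shown above):

```python
def build_input(post, answers):
    """Assemble the <POST>/<ANS> input format expected by the model."""
    parts = [f"<POST> {post} </s>"]
    parts += [f"<ANS> {a} </s>" for a in answers]
    return " ".join(parts)
```

For example, `build_input("Your question here", ["Answer 1", "Answer 2"])` reproduces the input string used in the snippet above.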
## Citation
If you use this model, please cite:
```
@misc{absosum_phase2_v1,
  title={ABSOSUM Phase 2: Weight-Aware Multi-Answer Summarization},
  author={Huy Tran},
  year={2025},
  url={https://huggingface.co/HuyTran1301/ABSOSUM_Phase2_v1.2}
}
```
## Training Date
November 28, 2025