Korean Political News Stance Classifier v2

A KoBERT-based stance classification model for Korean political news.

Model Description

  • Base Model: monologg/kobert
  • Task: 3-class stance classification (옹호/중립/비판: support/neutral/oppose)
  • Language: Korean
  • Training Data: ~12,000 labeled political news articles

Performance

| Metric | Score |
|---|---|
| Test Accuracy | 73.93% |
| Test F1 (macro) | 0.7395 |
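
Macro F1 is the unweighted mean of the per-class F1 scores, so each of the three stance classes counts equally regardless of how many test articles it has. The following is a minimal sketch of that computation using only the standard library. The function name `macro_f1` and the toy labels are illustrative assumptions; this is not the evaluation code that produced the scores above.

```python
def macro_f1(y_true, y_pred, num_classes=3):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); a class with no support and no predictions scores 0
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / num_classes

# Toy example (hypothetical labels): 0 = support, 1 = neutral, 2 = oppose
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
score = macro_f1(y_true, y_pred)
print(f"macro F1 = {score:.4f}")
```

In practice a library routine such as scikit-learn's `f1_score(y_true, y_pred, average="macro")` would typically be used instead.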

Labels

| Label ID | Korean | English | Description |
|---|---|---|---|
| 0 | 옹호 | support | Favorable toward the government/ruling party |
| 1 | 중립 | neutral | Objective reporting of facts |
| 2 | 비판 | oppose | Critical of the government/ruling party |
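
The label IDs above are what the classifier's argmax returns, so downstream code usually needs a lookup from ID to name. Below is a small sketch of such a lookup combined with a softmax over raw logits, written in plain Python so it runs without the model. The names `ID2LABEL` and `decode` and the example logits are assumptions for illustration, not part of the released code.

```python
import math

# Mapping taken from the label table: ID -> (Korean, English)
ID2LABEL = {0: ("옹호", "support"), 1: ("중립", "neutral"), 2: ("비판", "oppose")}

def decode(logits):
    """Apply a numerically stable softmax and return the top label with its probability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    pred = max(range(len(probs)), key=probs.__getitem__)
    korean, english = ID2LABEL[pred]
    return {"id": pred, "korean": korean, "english": english, "prob": probs[pred]}

# Hypothetical logits for one article
result = decode([2.0, 0.5, -1.0])
print(result["korean"], result["english"], f"{result['prob']:.3f}")
```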

Usage

import torch
from transformers import BertModel, AutoTokenizer
from huggingface_hub import hf_hub_download
import torch.nn as nn

# Model definition: KoBERT encoder with dropout and a linear classification head
class StanceClassifier(nn.Module):
    def __init__(self, bert_model, num_classes=3, dropout_rate=0.3):
        super().__init__()
        self.bert = bert_model
        self.dropout = nn.Dropout(dropout_rate)
        self.classifier = nn.Linear(768, num_classes)

    def forward(self, input_ids, attention_mask, token_type_ids=None):
        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask, token_type_ids=token_type_ids)
        pooled_output = outputs.pooler_output
        pooled_output = self.dropout(pooled_output)
        return self.classifier(pooled_output)

# Load the fine-tuned checkpoint from the Hugging Face Hub
model_path = hf_hub_download(repo_id="gaaahee/stance-classifier-v2", filename="pytorch_model.pt")
checkpoint = torch.load(model_path, map_location='cpu')

bert_model = BertModel.from_pretrained('monologg/kobert')
model = StanceClassifier(bert_model)
model.load_state_dict(checkpoint['model_state_dict'])
model.eval()

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained('monologg/kobert', trust_remote_code=True)

# Prediction
# Input sentence: "The government's new policy is expected to contribute greatly to economic growth."
text = "정부의 새 정책이 경제 성장에 크게 기여할 것으로 기대된다"
encoding = tokenizer(text, truncation=True, max_length=512, padding='max_length', return_tensors='pt')

with torch.no_grad():
    logits = model(encoding['input_ids'], encoding['attention_mask'])
    probs = torch.softmax(logits, dim=1)
    pred = torch.argmax(probs, dim=1).item()

labels = ['옹호', '중립', '비판']  # support, neutral, oppose
print(f"Prediction: {labels[pred]} ({probs[0][pred].item()*100:.1f}%)")

Training Details

| Parameter | Value |
|---|---|
| Base Model | monologg/kobert |
| Max Length | 512 |
| Batch Size | 64 |
| Learning Rate | 2e-05 |
| Dropout | 0.3 |
| Loss Function | Focal Loss (gamma=2.0) |
| Early Stopping | patience=3 |
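
Two of the settings above are easier to read with a concrete sketch. Focal loss scales the cross-entropy term by (1 - p_t)^gamma, which shrinks the contribution of examples the model already classifies confidently. Early stopping with patience=3 halts training after three consecutive epochs without improvement. The code below is a plain-Python illustration under stated assumptions: no class-weighting (alpha) term is used, and early stopping is assumed to monitor validation loss, neither of which this card specifies. The names `focal_loss` and `EarlyStopping` and the loss history are hypothetical.

```python
import math

def focal_loss(probs, target, gamma=2.0):
    """Focal loss for one example: -(1 - p_t)^gamma * log(p_t).

    Assumes probs are softmax outputs, so probs[target] > 0.
    """
    p_t = probs[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)

class EarlyStopping:
    """Signal a stop after `patience` consecutive epochs without improvement."""

    def __init__(self, patience=3):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

# A confident correct prediction is down-weighted far more than an uncertain one
easy = focal_loss([0.9, 0.05, 0.05], target=0)
hard = focal_loss([0.2, 0.5, 0.3], target=0)
print(f"easy={easy:.4f} hard={hard:.4f}")

# Hypothetical validation-loss history
stopper = EarlyStopping(patience=3)
history = [0.80, 0.70, 0.72, 0.71, 0.73, 0.69]
stopped_at = None
for epoch, loss in enumerate(history):
    if stopper.step(loss):
        stopped_at = epoch
        break
print("stopped at epoch index:", stopped_at)
```

With gamma=0 the focal term disappears and the loss reduces to ordinary cross-entropy, which is a convenient sanity check for any tensor-based implementation used in training.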

Citation

@misc{korean-stance-classifier-v2,
  title={Korean Political News Stance Classifier v2},
  year={2024},
  publisher={HuggingFace}
}