---
language: ko
license: mit
tags:
- pytorch
- bert
- kobert
- text-classification
- stance-detection
- korean
- news
- political
datasets:
- custom
metrics:
- accuracy
- f1
model-index:
- name: stance-classifier-v2
  results:
  - task:
      type: text-classification
      name: Stance Classification
    metrics:
    - type: accuracy
      value: 73.93
      name: Test Accuracy
    - type: f1
      value: 0.7395
      name: Test F1
---

# Korean Political News Stance Classifier v2

A KoBERT-based model that classifies the stance of Korean political news articles.

## Model Description

- **Base Model**: monologg/kobert
- **Task**: 3-class stance classification (support / neutral / oppose; see Labels below)
- **Language**: Korean
- **Training Data**: ~12,000 labeled political news articles

## Performance

| Metric | Score |
|--------|-------|
| Test Accuracy | 73.93% |
| Test F1 (macro) | 0.7395 |

## Labels

| Label ID | Korean | English | Description |
|----------|--------|---------|-------------|
| 0 | 옹호 | support | Favorable toward the government / ruling party |
| 1 | 중립 | neutral | Objective, factual reporting |
| 2 | 비판 | oppose | Critical of the government / ruling party |

## Usage

```python
import torch
import torch.nn as nn
from transformers import BertModel, AutoTokenizer
from huggingface_hub import hf_hub_download

# Model definition: KoBERT encoder with a dropout + linear classification head
class StanceClassifier(nn.Module):
    def __init__(self, bert_model, num_classes=3, dropout_rate=0.3):
        super().__init__()
        self.bert = bert_model
        self.dropout = nn.Dropout(dropout_rate)
        self.classifier = nn.Linear(768, num_classes)

    def forward(self, input_ids, attention_mask, token_type_ids=None):
        outputs = self.bert(input_ids=input_ids,
                            attention_mask=attention_mask,
                            token_type_ids=token_type_ids)
        pooled_output = outputs.pooler_output  # pooled [CLS] representation
        pooled_output = self.dropout(pooled_output)
        return self.classifier(pooled_output)

# Load the fine-tuned weights from the Hub
model_path = hf_hub_download(repo_id="gaaahee/stance-classifier-v2",
                             filename="pytorch_model.pt")
checkpoint = torch.load(model_path, map_location='cpu')

bert_model = BertModel.from_pretrained('monologg/kobert')
model = StanceClassifier(bert_model)
model.load_state_dict(checkpoint['model_state_dict'])
model.eval()

# Load the tokenizer (KoBERT uses a custom tokenizer, hence trust_remote_code)
tokenizer = AutoTokenizer.from_pretrained('monologg/kobert', trust_remote_code=True)

# Predict on a single article
text = "정부의 새 정책이 경제 성장에 크게 기여할 것으로 기대된다"
encoding = tokenizer(text, truncation=True, max_length=512,
                     padding='max_length', return_tensors='pt')

with torch.no_grad():
    logits = model(encoding['input_ids'], encoding['attention_mask'])
    probs = torch.softmax(logits, dim=1)
    pred = torch.argmax(probs, dim=1).item()

labels = ['옹호', '중립', '비판']
print(f"Prediction: {labels[pred]} ({probs[0][pred].item()*100:.1f}%)")
```

## Training Details

| Parameter | Value |
|-----------|-------|
| Base Model | monologg/kobert |
| Max Length | 512 |
| Batch Size | 64 |
| Learning Rate | 2e-5 |
| Dropout | 0.3 |
| Loss Function | Focal Loss (gamma=2.0) |
| Early Stopping | patience=3 |

## Citation

```bibtex
@misc{korean-stance-classifier-v2,
  title={Korean Political News Stance Classifier v2},
  year={2024},
  publisher={HuggingFace}
}
```
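## Batch Inference

The Usage example scores one article at a time. The same model extends naturally to batches; below is a minimal sketch, assuming `model` and `tokenizer` are loaded as in the Usage section above (the helper `predict_batch` and its `batch_size` default are illustrative, not part of the released code):

```python
import torch

def predict_batch(texts, model, tokenizer, batch_size=32):
    """Return (label, confidence) pairs for a list of Korean news texts."""
    labels = ['옹호', '중립', '비판']
    results = []
    for i in range(0, len(texts), batch_size):
        batch = texts[i:i + batch_size]
        # Pad to the longest text in the batch rather than to max_length
        enc = tokenizer(batch, truncation=True, max_length=512,
                        padding=True, return_tensors='pt')
        with torch.no_grad():
            logits = model(enc['input_ids'], enc['attention_mask'])
            probs = torch.softmax(logits, dim=1)
        preds = probs.argmax(dim=1)
        for j in range(len(batch)):
            k = preds[j].item()
            results.append((labels[k], probs[j, k].item()))
    return results
```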
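## Focal Loss (Reference)

Training Details lists Focal Loss with gamma=2.0: it down-weights examples the model already classifies confidently, so gradients concentrate on hard or ambiguous articles. The exact implementation used in training is not published; the sketch below is one standard multi-class formulation with the documented gamma:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FocalLoss(nn.Module):
    """FL(p_t) = -(1 - p_t)^gamma * log(p_t), averaged over the batch."""
    def __init__(self, gamma=2.0):
        super().__init__()
        self.gamma = gamma

    def forward(self, logits, targets):
        # Per-example cross-entropy, i.e. -log(p_t)
        ce = F.cross_entropy(logits, targets, reduction='none')
        # Recover p_t, the predicted probability of the true class
        p_t = torch.exp(-ce)
        # (1 - p_t)^gamma -> 0 for easy examples, so they contribute little
        return ((1 - p_t) ** self.gamma * ce).mean()
```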