C-BERT v3 (Factorized): Causal Relation Extraction

A factorized multi-task model for extracting fine-grained causal attributions from German text.

📄 Paper: C-BERT: Factorized Causal Relation Extraction
💻 Code: github.com/padjohn/cbert
📊 Dataset: Bundestag Causal Attribution

Model Details

C-BERT extends EuroBERT-610m with two task-specific modules, fine-tuned jointly with LoRA:

| Task | Output | Labels |
|---|---|---|
| 1. Span Recognition | BIOES sequence labeling | 9 tags: B/I/E/S × {INDICATOR, ENTITY}, plus O |
| 2. Relation Classification | 3 parallel heads | Role (CAUSE, EFFECT, NO_RELATION) · Polarity (POS, NEG) · Salience (MONO, PRIO, DIST) |
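A BIOES tag sequence can be turned back into typed spans with a simple decoder. The sketch below is illustrative, not part of the `causalbert` package; tag names follow the scheme above.

```python
def decode_bioes(tokens, tags):
    """Collect (span_text, span_type) pairs from a BIOES-tagged token list."""
    spans, buf, buf_type = [], [], None
    for tok, tag in zip(tokens, tags):
        if tag == "O":                          # outside any span
            buf, buf_type = [], None
            continue
        prefix, span_type = tag.split("-", 1)
        if prefix == "S":                       # single-token span
            spans.append((tok, span_type))
        elif prefix == "B":                     # span begins
            buf, buf_type = [tok], span_type
        elif prefix in ("I", "E") and buf_type == span_type:
            buf.append(tok)
            if prefix == "E":                   # span ends
                spans.append((" ".join(buf), span_type))
                buf, buf_type = [], None
    return spans

decode_bioes(["Pestizide", "verursachen", "Artensterben"],
             ["S-ENTITY", "S-INDICATOR", "S-ENTITY"])
# → [('Pestizide', 'ENTITY'), ('verursachen', 'INDICATOR'), ('Artensterben', 'ENTITY')]
```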

The three relation heads decompose the 14-class label space into linguistically motivated dimensions:

  • Role: syntactic position (cause vs. effect vs. unrelated)
  • Polarity: influence direction (promoting vs. inhibiting)
  • Salience: attribution strength (monocausal vs. prioritized vs. distributive)

Predictions reconstruct to 14-class labels via salience × polarity × role (e.g., MONO_POS_CAUSE) and to a continuous influence scalar I ∈ [−1, +1].
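A minimal sketch of this recombination step: the label is simply the concatenation of the three head predictions, and the influence scalar carries the polarity sign scaled by salience. The salience weights below are illustrative placeholders, not the values used by the released model.

```python
SALIENCE_WEIGHT = {"MONO": 1.0, "PRIO": 0.66, "DIST": 0.33}  # assumed weights
POLARITY_SIGN = {"POS": +1.0, "NEG": -1.0}

def reconstruct(role, polarity, salience):
    """Return (combined label, influence in [-1, +1]) for one span pair."""
    if role == "NO_RELATION":
        return "NO_RELATION", 0.0
    label = f"{salience}_{polarity}_{role}"      # e.g. MONO_POS_CAUSE
    return label, POLARITY_SIGN[polarity] * SALIENCE_WEIGHT[salience]

reconstruct("CAUSE", "POS", "MONO")   # → ('MONO_POS_CAUSE', 1.0)
```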

Usage

from causalbert.infer import load_model, sentence_analysis, extract_tuples

model, tokenizer, config, device = load_model("pdjohn/C-EBERT-V3-610m")
sentences = ["Pestizide und Autoverkehr sind Ursachen von Artensterben."]
analysis = sentence_analysis(model, tokenizer, config, sentences, device=device)
results = extract_tuples(analysis)

for item in results:
    print(f"{item['cause']} --({item['influence']:+.2f})--> {item['effect']}")

Output:

Pestizide --(+1.00)--> Artensterben
Autoverkehr --(+1.00)--> Artensterben

Evaluation

Flagship model (seed 456, epoch 4). Evaluated on the validation set (478 relations).

Relation Classification

| Metric | Score |
|---|---|
| Role accuracy | 88.7% |
| Polarity accuracy | 92.0% |
| Salience accuracy | 92.4% |
| Reconstructed 14-class accuracy | 76.9% |
| Reconstructed 14-class F1 (macro) | 62.2% |

Span Detection (Strict F1)

| Span Type | Precision | Recall | F1 |
|---|---|---|---|
| Entity | 0.771 | 0.759 | 0.765 |
| Indicator | 0.829 | 0.715 | 0.768 |

Multi-Seed Robustness

| | v2 (unified) | v3 (factorized) |
|---|---|---|
| Mean accuracy (5 seeds) | 0.744 ± 0.007 | 0.768 ± 0.009 |
| Best seed | 0.753 | 0.781 |
| Seeds where v3 wins | | 5/5 |

Error Analysis

| Error Type | v2 | v3 |
|---|---|---|
| Role only | 31.0% | 35.9% |
| Polarity only | 22.6% | 24.8% |
| Salience only | 23.8% | 23.1% |
| Multi-head cascade | 22.6% | 16.2% |

The factorized architecture reduces multi-head error cascades and concentrates failures in single, interpretable subtasks.

Training

| Parameter | Value |
|---|---|
| Base model | EuroBERT-610m |
| Architecture | v3 (factorized: role + polarity + salience heads) |
| LoRA | r=16, α=32, dropout=0.05 |
| Learning rate | 3×10⁻⁴ (cosine schedule) |
| Warmup ratio | 0.05 |
| Epochs | 7 (best checkpoint: epoch 4) |
| Batch size | 32 |
| Training seed | 456 |
| Dataset | 2,391 relations, augmented to 7,604 (mode 2) |
| Loss | Sum of 3 weighted cross-entropy terms (role + polarity + salience) |
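The summed loss can be sketched as three weighted cross-entropy terms, one per relation head. The per-head weights below are assumed for illustration; the card states only that the terms are weighted, not the weights themselves. A dependency-free sketch:

```python
import math

def cross_entropy(logits, target):
    """Softmax cross-entropy for a single example (pure-Python sketch)."""
    m = max(logits)                               # subtract max for stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]

def relation_loss(role_logits, pol_logits, sal_logits, targets,
                  weights=(1.0, 1.0, 1.0)):       # assumed per-head weights
    """Weighted sum of the role, polarity, and salience CE terms."""
    heads = (role_logits, pol_logits, sal_logits)
    return sum(w * cross_entropy(h, t)
               for w, h, t in zip(weights, heads, targets))

relation_loss([2.0, 0.0, 0.0], [1.0, 0.0], [0.0, 0.0, 0.0], targets=(0, 0, 0))
```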

Dataset

Trained on 2,391 manually annotated causal relations in German environmental discourse (1990–2020), covering forest dieback, insect death, bee death, and species extinction. 80/20 train/test split at sentence level; augmentation doubles training relations via entity replacement.
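The entity-replacement augmentation can be sketched as follows: each training sentence spawns variants in which annotated entity tokens are swapped for other entities from the corpus vocabulary. The function and vocabulary here are illustrative, not the project's actual augmentation code.

```python
import random

def augment(tokens, entity_indices, entity_vocab, rng):
    """Return a copy of tokens with each entity swapped for a different one."""
    out = list(tokens)
    for i in entity_indices:
        out[i] = rng.choice([e for e in entity_vocab if e != tokens[i]])
    return out

vocab = ["Pestizide", "Artensterben", "Autoverkehr", "Waldsterben"]
augment(["Pestizide", "verursachen", "Artensterben"], [0, 2],
        vocab, random.Random(456))
```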

A publicly releasable subset of 487 relations from German parliamentary debates is available at bundestag-causal-attribution.

Citation

@article{johnson2026cbert,
  title={C-BERT: Factorized Causal Relation Extraction},
  author={Johnson, Patrick},
  year={2026},
  doi={10.26083/tuda-7797}
}

Also Available

  • C-BERT v2 (Unified): Single 14-class classification head. Simpler but lower accuracy (75.3% vs. 76.9%).