Hate Speech Detector — fine-tuned RoBERTa
Fine-tuned from cardiffnlp/twitter-roberta-base-hate on the Davidson et al. (2017) dataset.
Labels
- 0 (Hate Speech): hateful language
- 1 (Offensive): offensive, but not hate speech
- 2 (Neither): normal language
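The label ids above can be written as a lookup table, which is handy when working with raw class scores rather than a ready-made label string. This is a minimal sketch: the names `ID2LABEL`, `LABEL2ID`, and `predicted_class` are illustrative and are not taken from the model's published config.

```python
# Label ids as listed in the Labels section of this model card.
ID2LABEL = {0: "Hate Speech", 1: "Offensive", 2: "Neither"}
# Reverse lookup, built from the same table so the two cannot drift apart.
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def predicted_class(scores):
    """Return the class name whose score is highest (a plain argmax).

    `scores` is assumed to hold exactly one value per class, ordered by id.
    """
    if len(scores) != len(ID2LABEL):
        raise ValueError(f"expected {len(ID2LABEL)} scores, got {len(scores)}")
    best = max(range(len(scores)), key=lambda i: scores[i])
    return ID2LABEL[best]


# Made-up scores for illustration only, not real model output.
print(predicted_class([0.1, 0.7, 0.2]))
```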
Results
| Metric | Score |
|---|---|
| Macro F1 | ~0.77 |
| Hate Speech F1 | ~0.48 |
| Accuracy | ~0.89 |
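Macro F1 is the unweighted mean of the per-class F1 scores, so the weak Hate Speech F1 pulls it well below accuracy. As a rough arithmetic sketch using the approximate scores in the table, the snippet below backs out the average F1 that the other two classes would need to have. That implied value is derived here, not reported for the model, and the function names are illustrative.

```python
def macro_f1(per_class_f1):
    """Macro F1: the unweighted mean of the per-class F1 scores."""
    return sum(per_class_f1) / len(per_class_f1)


def implied_mean_of_rest(macro, known_f1, n_classes):
    """Average F1 the remaining classes must have, given the macro F1
    and the F1 of one known class."""
    return (macro * n_classes - known_f1) / (n_classes - 1)


# Approximate values taken from the results table.
MACRO_F1 = 0.77
HATE_SPEECH_F1 = 0.48

rest = implied_mean_of_rest(MACRO_F1, HATE_SPEECH_F1, n_classes=3)
print(f"implied mean F1 of Offensive and Neither: {rest:.3f}")
```

Because the table values are rounded, the result is only indicative: the Offensive and Neither classes score far higher than Hate Speech, which is worth keeping in mind when relying on class 0 predictions.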
Usage
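from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub
clf = pipeline("text-classification", model="Merikatori/hate-speech-roberta")
# The result is a list with one dict holding the predicted label and its score
print(clf("I hate all people like that"))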
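A text-classification pipeline returns a list of dicts, each with a `label` and a `score`. The sketch below turns such output into readable class names. It assumes the model may emit generic `LABEL_<n>` strings (the default naming in transformers when a config defines no custom label names); if this model already returns descriptive names, they pass through unchanged. The helper `readable` and the sample data are hypothetical.

```python
# Mapping taken from the Labels section of this model card.
ID2LABEL = {0: "Hate Speech", 1: "Offensive", 2: "Neither"}


def readable(predictions):
    """Convert pipeline-style dicts into (class name, rounded score) tuples."""
    results = []
    for pred in predictions:
        label = pred["label"]
        # Translate generic "LABEL_<n>" names; leave anything else as-is.
        if label.startswith("LABEL_"):
            label = ID2LABEL[int(label[len("LABEL_"):])]
        results.append((label, round(pred["score"], 3)))
    return results


# Hand-written sample in the pipeline's output format, not real model output.
sample = [{"label": "LABEL_1", "score": 0.8123}]
print(readable(sample))
```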