---
license: apache-2.0
library_name: transformers
tags:
- propaganda-detection
- multi-label-classification
- modernbert
- text-classification
datasets:
- synapti/nci-propaganda-v5
base_model: answerdotai/ModernBERT-base
language:
- en
metrics:
- f1
pipeline_tag: text-classification
---
|
|
|
|
|
# NCI Technique Classifier v5.2 |
|
|
|
|
|
Multi-label propaganda technique classifier based on ModernBERT, trained to identify 18 propaganda techniques from the SemEval-2020 Task 11 taxonomy. |
|
|
|
|
|
## Model Description |
|
|
|
|
|
This model is part of the NCI (Narrative Coordination Index) Protocol for detecting coordinated influence operations. It classifies text into 18 propaganda techniques with well-calibrated probability outputs. |
|
|
|
|
|
### Key Improvements in v5.2 |
|
|
|
|
|
- **Reduced False Positives**: False-positive rate on scientific/factual content reduced from 35% (v4) to 8.8%
- **Better Calibration**: Asymmetric Loss with clip=0.02 yields more discriminative probability outputs
- **Hard-Negative Training**: Trained on the v5 dataset, which adds 1,000+ hard negative examples (scientific, business, and factual content)
- **Document-Level Analysis**: Works well on full documents; no sentence-level splitting is required
|
|
|
|
|
### Training Details |
|
|
|
|
|
- **Base Model**: `answerdotai/ModernBERT-base`
- **Dataset**: `synapti/nci-propaganda-v5` (24,037 samples)
- **Loss Function**: Asymmetric Loss (ASL), sketched below
  - gamma_neg: 4.0
  - gamma_pos: 1.0
  - clip: 0.02 (reduced from 0.05 to minimize probability shifting)
- **Training**: 3 epochs, lr=2e-5, batch_size=16
- **Validation**: 4/7 tests passed (57%)
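
The loss is not part of the released checkpoint, but for readers who want to reproduce training, here is a minimal PyTorch sketch of ASL with the hyperparameters above; `asymmetric_loss` is an illustrative helper, not the exact training code.

```python
import torch

def asymmetric_loss(logits, targets, gamma_neg=4.0, gamma_pos=1.0, clip=0.02, eps=1e-8):
    """Illustrative Asymmetric Loss (ASL) for multi-label training.

    logits:  raw model outputs, shape (batch, num_labels)
    targets: 0/1 float labels, same shape as logits
    """
    p = torch.sigmoid(logits)
    # Probability shifting: subtract `clip` from negative-class probabilities
    # (v5.2 lowers clip from 0.05 to 0.02 to keep outputs discriminative).
    p_shifted = (p - clip).clamp(min=0)
    # Asymmetric focusing: down-weight easy negatives much harder
    # (gamma_neg=4.0) than easy positives (gamma_pos=1.0).
    loss_pos = targets * (1 - p).pow(gamma_pos) * torch.log(p.clamp(min=eps))
    loss_neg = (1 - targets) * p_shifted.pow(gamma_neg) * torch.log((1 - p_shifted).clamp(min=eps))
    return -(loss_pos + loss_neg).mean()
```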
|
|
|
|
|
## Techniques Detected |
|
|
|
|
|
| ID | Technique | Description |
|----|-----------|-------------|
| 0 | Loaded_Language | Words with strong emotional implications |
| 1 | Appeal_to_fear-prejudice | Building support through fear or prejudice |
| 2 | Exaggeration,Minimisation | Overstating or understating facts |
| 3 | Repetition | Repeating messages for reinforcement |
| 4 | Flag-Waving | Appealing to patriotism/national identity |
| 5 | Name_Calling,Labeling | Using labels to evoke prejudice |
| 6 | Reductio_ad_hitlerum | Comparing to Hitler/Nazis |
| 7 | Black-and-White_Fallacy | Presenting only two choices |
| 8 | Causal_Oversimplification | Assuming a single cause for complex issues |
| 9 | Whataboutism,Straw_Men,Red_Herring | Deflection techniques |
| 10 | Straw_Man | Misrepresenting an opponent's position |
| 11 | Red_Herring | Introducing irrelevant topics |
| 12 | Doubt | Questioning credibility |
| 13 | Appeal_to_Authority | Using authority figures to support claims |
| 14 | Thought-terminating_Cliches | Phrases that end rational thought |
| 15 | Bandwagon | "Everyone is doing it" appeals |
| 16 | Slogans | Catchy phrases for memorability |
| 17 | Obfuscation,Intentional_Vagueness,Confusion | Deliberately confusing language |
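
The same ID-to-name mapping should also be recoverable from the checkpoint itself, assuming the label names were saved in its `config.json` (if not, fall back to the `LABELS` list in the Usage section below):

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("synapti/nci-technique-classifier-v5.2")
# Prints {0: "Loaded_Language", 1: "Appeal_to_fear-prejudice", ...}
# if the mapping was stored with the model.
print(config.id2label)
```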
|
|
|
|
|
## Usage |
|
|
|
|
|
```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

model_id = "synapti/nci-technique-classifier-v5.2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

text = "This is OUTRAGEOUS! They are LYING to you. WAKE UP!"

inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    outputs = model(**inputs)
probs = torch.sigmoid(outputs.logits)[0]

LABELS = [
    "Loaded_Language", "Appeal_to_fear-prejudice", "Exaggeration,Minimisation",
    "Repetition", "Flag-Waving", "Name_Calling,Labeling", "Reductio_ad_hitlerum",
    "Black-and-White_Fallacy", "Causal_Oversimplification",
    "Whataboutism,Straw_Men,Red_Herring", "Straw_Man", "Red_Herring", "Doubt",
    "Appeal_to_Authority", "Thought-terminating_Cliches", "Bandwagon", "Slogans",
    "Obfuscation,Intentional_Vagueness,Confusion"
]

# Report techniques with probability > 0.5
for label, prob in zip(LABELS, probs):
    if prob > 0.5:
        print(f"{label}: {prob:.1%}")
```
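
For scoring several documents at once, a small batched variant of the example above (the sample texts here are purely illustrative):

```python
texts = [
    "The city council approved the budget after a public hearing.",
    "They are DESTROYING everything you love. Fight back before it is too late!",
]
batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True, max_length=512)
with torch.no_grad():
    batch_probs = torch.sigmoid(model(**batch).logits)

for text, doc_probs in zip(texts, batch_probs):
    flagged = [label for label, p in zip(LABELS, doc_probs) if p > 0.5]
    print(f"{text[:40]!r} -> {flagged or 'clean'}")
```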
|
|
|
|
|
## Performance |
|
|
|
|
|
### Validation Results |
|
|
|
|
|
| Test Case | v5.2 Score | v4 Score | Status |
|-----------|------------|----------|--------|
| Pure Propaganda | 66.8% | 70.8% | ✅ Detected |
| Neutral News | 6.9% | 5.5% | ✅ Clean |
| SpaceX Factual | 3.7% | - | ✅ Clean |
| Multi-Label Propaganda | 76.5% | - | ✅ Detected |
| Mixed Content | 7.3% | - | - |
| Fear Appeal | 69.9% | - | ✅ Detected |
| Scientific Report | **8.8%** | 35.4% | ✅ Clean |
|
|
|
|
|
### Key Metrics |
|
|
|
|
|
- **Scientific Report FPR**: 8.8% (vs. 35% in v4), a 75% relative reduction
- **Factual News FPR**: 4.6% (vs. 29% in v4), an 84% relative reduction
- **Propaganda Detection**: Maintained (up to 73.7% confidence on propaganda test cases)
|
|
|
|
|
## Citation |
|
|
|
|
|
```bibtex
@inproceedings{da-san-martino-etal-2020-semeval,
    title = "{S}em{E}val-2020 Task 11: Detection of Propaganda Techniques in News Articles",
    author = "Da San Martino, Giovanni and others",
    booktitle = "Proceedings of the 14th International Workshop on Semantic Evaluation",
    year = "2020",
}
```
|
|
|
|
|
## License |
|
|
|
|
|
Apache 2.0 |
|
|
|