πŸ† IndoBERT-NER (Gold) - State-of-the-Art Indonesian NER

This model represents the current State-of-the-Art (SOTA) for Indonesian Named Entity Recognition (NER).

It uses a two-stage curriculum learning strategy to achieve higher accuracy and robustness than previous benchmarks such as NusaBERT. Despite being a Large architecture (334M parameters), it achieves 46% higher throughput and roughly 31% lower latency than the NusaBERT Base baseline during inference.

πŸš€ Key Performance Highlights

| Metric | Improvement vs. Previous SOTA (NusaBERT) |
|---|---|
| Accuracy (F1) | +3.54 F1 points on the NER_UI dataset |
| Recall | +4.02 recall points (misses fewer entities) |
| Speed | 15.84 ms latency (vs. 23.13 ms) |
| Throughput | 63 samples/sec (vs. 43 samples/sec) |

πŸŽ“ Training Strategy (Curriculum Learning)

To achieve these results, we employed a novel "Silver-to-Gold" training pipeline. We did not simply concatenate datasets; instead, we trained the model in phases to simulate a learning curriculum.

Phase 1: The "Warm-up" (Silver Data)

  • Data Source: A massive synthetic dataset of 130,000+ sentences.
  • Collection Method: We translated high-quality English NER datasets into Indonesian and used GLiNER-Multi-v2.1 to tag entities automatically (see the sketch after this list).
  • Objective: This phase allowed the model to learn general context, sentence structures, and the concept of 19 rich entity labels (Events, Art, Facilities, etc.) which are often missing in standard datasets.
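
The tagging pipeline itself is not published with this model card. The snippet below is a minimal sketch of how such silver annotations can be produced with the `gliner` library; the checkpoint id, label wording, and example sentence are assumptions for illustration only.

```python
# Minimal sketch (assumption): auto-tagging translated Indonesian sentences
# with GLiNER-Multi-v2.1 to build a "silver" NER corpus.
from gliner import GLiNER  # pip install gliner

# Partial label list for illustration; the full set mirrors the 19 tags listed under Supported Labels.
LABELS = ["person", "organization", "location", "geopolitical entity",
          "facility", "event", "work of art", "law", "date", "money"]

model = GLiNER.from_pretrained("urchade/gliner_multi-v2.1")  # assumed checkpoint id

def tag_silver(sentences):
    """Return GLiNER entity predictions for already-translated sentences."""
    return [
        {"text": text, "entities": model.predict_entities(text, LABELS, threshold=0.5)}
        for text in sentences
    ]

# Illustrative usage on a single translated sentence.
print(tag_silver(["Joko Widodo meresmikan Bandara Kertajati pada 20 Januari 2024."]))
```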

Phase 2: The "Refinement" (Gold Data)

  • Data Source: NERGRIT, a high-quality, human-annotated dataset for Indonesian.
  • Method: We took the Phase 1 checkpoint and fine-tuned it on this gold-standard data with a lower learning rate (see the sketch after this list).
  • Objective: This corrected any noise introduced by the synthetic data and aligned the model's predictions with proper Indonesian grammatical standards.
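
The exact hyperparameters for this stage are not listed here. Below is a minimal sketch of the general shape of Phase 2 using the Hugging Face `Trainer`; the checkpoint path, learning rate, and the pre-tokenized NERGRIT splits (`nergrit_train`, `nergrit_eval`) are hypothetical placeholders.

```python
# Minimal sketch (assumption): Phase 2 fine-tuning on gold NERGRIT data,
# resuming from the Phase 1 (silver) checkpoint with a lower learning rate.
from transformers import (AutoModelForTokenClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

PHASE1_CKPT = "./checkpoints/phase1-silver"  # hypothetical local path

tokenizer = AutoTokenizer.from_pretrained(PHASE1_CKPT)
model = AutoModelForTokenClassification.from_pretrained(PHASE1_CKPT)

args = TrainingArguments(
    output_dir="./checkpoints/phase2-gold",
    learning_rate=1e-5,              # deliberately lower than Phase 1 (illustrative value)
    num_train_epochs=3,
    per_device_train_batch_size=16,
    weight_decay=0.01,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=nergrit_train,     # tokenized, label-aligned NERGRIT splits prepared elsewhere
    eval_dataset=nergrit_eval,
    tokenizer=tokenizer,
)
trainer.train()
```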

πŸ“Š Evaluation & Benchmarks

We evaluated this model against the previous best-performing model (cahya/NusaBert-ner-v1.3) using strict, entity-level evaluation scripts on standard academic datasets.

1. Dataset: NER_UI (Universitas Indonesia)

This model achieves a significant improvement in Recall and F1 Score.

| Rank | Model Name | F1 Score | Precision | Recall | Accuracy |
|---|---|---|---|---|---|
| πŸ₯‡ 1 | indobert-ner-gold (Ours) | 79.91% | 79.54% | 80.28% | 94.49% |
| πŸ₯ˆ 2 | NusaBERT (Cahya) | 76.37% | 76.47% | 76.26% | 93.88% |

2. Dataset: NER_UGM (Universitas Gadjah Mada)

Even on the challenging UGM dataset, our model maintains superior performance.

| Rank | Model Name | F1 Score | Precision | Recall | Accuracy |
|---|---|---|---|---|---|
| πŸ₯‡ 1 | indobert-ner-gold (Ours) | 70.21% | 61.74% | 81.37% | 93.48% |
| πŸ₯ˆ 2 | NusaBERT (Cahya) | 67.49% | 58.88% | 79.07% | 93.12% |
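
The evaluation scripts themselves are not reproduced here. The sketch below shows one common way to obtain strict entity-level precision, recall, F1, and accuracy with `seqeval`, using tiny hard-coded BIO sequences in place of the real NER_UI / NER_UGM test sets.

```python
# Minimal sketch (assumption): strict entity-level scoring with seqeval,
# given gold and predicted BIO tag sequences for each test sentence.
from seqeval.metrics import accuracy_score, f1_score, precision_score, recall_score

# Tiny illustrative sequences; a real run would score predictions on the full test split.
y_true = [["B-PER", "I-PER", "O", "B-GPE"],
          ["O", "B-ORG", "I-ORG", "O"]]
y_pred = [["B-PER", "I-PER", "O", "B-GPE"],
          ["O", "B-ORG", "O", "O"]]

print("Precision:", precision_score(y_true, y_pred))
print("Recall:   ", recall_score(y_true, y_pred))
print("F1:       ", f1_score(y_true, y_pred))
print("Accuracy: ", accuracy_score(y_true, y_pred))
```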

⚑ Inference Efficiency (Speed)

We benchmarked inference speed in a standard CUDA environment. Despite being a "Large" model (334M params) compared to NusaBERT's Base architecture (160M params), this model runs significantly faster.

This efficiency is attributed to the optimized tokenizer and to the robustness gained in Phase 1 training, which yields confident predictions at inference time.

| Metric | Our Model (Large) | NusaBERT (Base) | Verdict |
|---|---|---|---|
| Parameters | 334M | 160M | Ours is heavier (more knowledge) |
| Latency | 15.84 ms | 23.13 ms | ⚑ 31% faster |
| Throughput | 63.12 samples/sec | 43.23 samples/sec | πŸš€ 46% more capacity |
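
The benchmark script is not included in this card. A minimal sketch of how latency and throughput can be measured on GPU is shown below; the sentence, batch size, and sample count are arbitrary choices for illustration.

```python
# Minimal sketch (assumption): measuring per-sample latency and throughput
# for the token-classification pipeline on a CUDA device.
import time
import torch
from transformers import pipeline

ner = pipeline("ner", model="treamyracle/indobert-ner-gold",
               aggregation_strategy="simple", device=0)

sentences = ["Presiden Joko Widodo meninjau pembangunan Istana Negara di IKN."] * 256

_ = ner(sentences[:8])          # warm-up so CUDA kernels are initialized before timing
torch.cuda.synchronize()

start = time.perf_counter()
_ = ner(sentences, batch_size=16)
torch.cuda.synchronize()
elapsed = time.perf_counter() - start

print(f"Latency:    {1000 * elapsed / len(sentences):.2f} ms/sample")
print(f"Throughput: {len(sentences) / elapsed:.2f} samples/sec")
```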

🏷️ Supported Labels

Unlike standard models that only detect Person, Location, and Organization, this model supports 19 detailed entity tags:

| Label | Description | Label | Description |
|---|---|---|---|
| PER | Person | ORG | Organization |
| LOC | Location | GPE | Geopolitical Entity |
| FAC | Facility (buildings, airports, etc.) | EVT | Event |
| WOA | Work of Art | LAW | Law / legal references |
| PRO | Product | LAN | Language |
| DAT | Date | TIM | Time |
| MON | Money | QTY | Quantity |
| CRD | Cardinal Number | ORD | Ordinal Number |
| PRC | Percent | NOR | Political/Religious Group |
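
If you need the exact tag inventory programmatically, it is exposed through the model config's `id2label` mapping (standard for Hugging Face token-classification models); the snippet below simply prints it.

```python
# Print the BIO-style tag set shipped with the model (e.g. B-PER, I-PER, ...).
from transformers import AutoConfig

config = AutoConfig.from_pretrained("treamyracle/indobert-ner-gold")
print(config.id2label)
```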

πŸ’» How to Use

```python
from transformers import pipeline

# Load the token-classification pipeline with entity aggregation.
ner = pipeline("ner", model="treamyracle/indobert-ner-gold", aggregation_strategy="simple")

text = """
Presiden Joko Widodo meninjau pembangunan Istana Negara di Ibu Kota Nusantara (IKN)
pada hari Selasa tanggal 20 Januari 2024. Proyek senilai Rp 15 Triliun ini dikerjakan oleh PT Waskita Karya.
"""

results = ner(text)
print(results)
```
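
With aggregation_strategy="simple", the pipeline returns a list of dictionaries containing entity_group, score, word, start, and end. Continuing from the snippet above, a simple confidence filter looks like this; the 0.80 threshold is an arbitrary example value.

```python
# Optional post-processing: keep only confident entity spans.
confident = [r for r in results if r["score"] > 0.80]  # illustrative threshold
for r in confident:
    print(f'{r["entity_group"]:<4} {r["word"]}  ({r["score"]:.2f})')
```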