MAIB Incident Type Classifier
A fine-tuned DeBERTa-v3 model for classifying marine incident types based on accident investigation reports from the Marine Accident Investigation Branch (MAIB).
Model Description
This model is a fine-tuned version of microsoft/deberta-v3-base that classifies marine incidents into 11 categories. It was trained on the MAIB incident reports dataset and reaches 89.0% accuracy on the held-out test set.
- Developed by: Ilia Munaev
- Model type: Text Classification
- Language(s): English
- License: Apache 2.0
- Finetuned from model: microsoft/deberta-v3-base
Model Performance
The model achieves the following performance metrics on the test set:
| Metric | Score |
|---|---|
| Accuracy | 89.0% |
| Weighted F1-Score | 89.0% |
| Macro F1-Score | 70.2% |
Evaluation Results
The evaluation also covers:
- Confusion Matrix: Shows how predictions are distributed across the true incident types
- Per-Class F1 Scores: Reports the F1 score for each incident category
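To make the difference between the weighted and macro F1 scores concrete, the sketch below computes per-class, macro, and weighted F1 from two label lists in plain Python. The toy labels are invented for illustration and are not MAIB results, and the function name `f1_report` is hypothetical.

```python
from collections import Counter

def f1_report(y_true, y_pred):
    """Return (per_class_f1, macro_f1, weighted_f1) for two label lists."""
    classes = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of true samples per class
    per_class = {}
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        per_class[c] = 2 * tp / denom if denom else 0.0
    # Macro: every class counts equally. Weighted: classes count by support.
    macro = sum(per_class.values()) / len(classes)
    weighted = sum(per_class[c] * support[c] for c in classes) / len(y_true)
    return per_class, macro, weighted

# Toy data (not MAIB results): a frequent class predicted well,
# a rare class missed entirely.
y_true = ["Collision"] * 9 + ["Hull Failure"]
y_pred = ["Collision"] * 10
per_class, macro, weighted = f1_report(y_true, y_pred)
print(per_class, round(macro, 3), round(weighted, 3))
```

In this toy case the single missed rare sample halves the macro score while barely moving the weighted one, which is the usual pattern when a few classes have very little data.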
Intended Use
Primary Use Cases
- Maritime Safety Analysis: Classify marine incident reports for safety analysis
- Regulatory Compliance: Automate incident categorization for regulatory reporting
- Risk Assessment: Support risk analysis by categorizing incident types
- Research: Academic and industry research on maritime safety patterns
Out-of-Scope Use Cases
- Real-time Emergency Response: Not suitable for emergency situations requiring immediate response
- Legal Proceedings: Should not be used as primary evidence in legal cases
- Non-English Text: Model is trained only on English incident reports
Training Data
The model was trained on the baker-street/maib-incident-reports-5K dataset, which contains:
- Total Samples: 5,768 incident reports
- Training Set: 5,191 samples
- Validation Set: 288 samples
- Test Set: 289 samples
- Source: Marine Accident Investigation Branch (MAIB) reports
- Language: English
- Time Period: Historical MAIB incident reports
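As a quick arithmetic check, the three splits sum to the stated total and correspond to roughly a 90/5/5 partition. The snippet uses only the counts listed above; how the split was drawn (random, stratified, or by date) is not stated in this card.

```python
# Split sizes as listed in this model card.
splits = {"train": 5191, "validation": 288, "test": 289}

total = sum(splits.values())
print(f"total: {total}")
for name, count in splits.items():
    # Share of the full dataset held by each split.
    print(f"{name}: {count} ({count / total:.1%})")
```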
Training Procedure
Training Hyperparameters
- Learning Rate: 2e-5
- Batch Size: 32
- Epochs: 3
- Max Length: 256 tokens
- Optimizer: AdamW
- Scheduler: Linear with warmup
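The "linear with warmup" schedule can be sketched in plain Python. The number of warmup steps is not given in this card, so the 10% warmup ratio below is an assumption. The step count is also an assumption: it supposes one optimizer step per batch of 32 with no gradient accumulation. The function name `linear_warmup_lr` is hypothetical; the shape is the usual linear ramp followed by linear decay to zero.

```python
import math

PEAK_LR = 2e-5          # from the card
BATCH_SIZE = 32         # from the card
EPOCHS = 3              # from the card
TRAIN_SAMPLES = 5191    # from the card
WARMUP_RATIO = 0.1      # assumption: not stated in the card

steps_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = int(total_steps * WARMUP_RATIO)

def linear_warmup_lr(step):
    """Learning rate at an optimizer step: linear ramp up, then linear decay."""
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return PEAK_LR * remaining / max(1, total_steps - warmup_steps)

for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(step, f"{linear_warmup_lr(step):.2e}")
```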
Training Infrastructure
- Hardware: CUDA-compatible GPU (Tesla T4)
- Training Time: ~16 minutes for 3 epochs
- Framework: PyTorch with Transformers library
Usage
Using Transformers Pipeline
from transformers import pipeline
# Load the model
classifier = pipeline(
    "text-classification",
    model="your-username/maib-incident-classifier",
)
# Classify an incident
result = classifier("A crew member fell overboard from a motorboat")
print(result)
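The pipeline returns a list of dictionaries with `label` and `score` keys, one per input. For analysis or reporting workflows it can help to route low-confidence predictions to a human reviewer. The sketch below shows one way to do that; the `triage` helper, the 0.80 threshold, and the sample outputs are illustrative assumptions, not part of the model or a calibrated recommendation.

```python
def triage(predictions, threshold=0.80):
    """Split pipeline-style outputs into accepted and needs-review lists."""
    accepted, review = [], []
    for pred in predictions:
        (accepted if pred["score"] >= threshold else review).append(pred)
    return accepted, review

# Made-up outputs in the shape returned by a text-classification pipeline.
sample = [
    {"label": "Accident to person(s)", "score": 0.97},
    {"label": "Contact", "score": 0.55},
]
accepted, review = triage(sample)
print("accepted:", [p["label"] for p in accepted])
print("review:", [p["label"] for p in review])
```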
Using Model Directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("your-username/maib-incident-classifier")
model = AutoModelForSequenceClassification.from_pretrained("your-username/maib-incident-classifier")
# Prepare input
text = "Fire broke out in the engine room during routine maintenance"
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
# Get predictions
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
# Get class labels
class_labels = [
    "Accident to person(s)",
    "Capsizing / Listing",
    "Collision",
    "Contact",
    "Damage / Loss Of Equipment",
    "Fire / Explosion",
    "Flooding / Foundering",
    "Grounding / Stranding",
    "Hull Failure",
    "Loss Of Control",
    "Non-accidental Event",
]
# Get top prediction as a plain int so it can index the list and be formatted
top_prediction = torch.argmax(predictions, dim=-1).item()
print(f"Predicted class: {class_labels[top_prediction]}")
print(f"Confidence: {predictions[0][top_prediction].item():.3f}")
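When more than the single best class is needed, the logits can be turned into a ranked list. The sketch below does this in plain Python so the arithmetic is visible; with PyTorch the same result comes from `torch.softmax` followed by `torch.topk`. The logit values and the `top_k_labels` helper are made up for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_labels(logits, labels, k=3):
    """Return the k most probable (label, probability) pairs, highest first."""
    probs = softmax(logits)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Illustrative values only: three labels and invented logits.
labels = ["Collision", "Contact", "Fire / Explosion"]
logits = [0.5, 0.1, 3.0]
for label, prob in top_k_labels(logits, labels, k=2):
    print(f"{label}: {prob:.3f}")
```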
Using the Command Line
# Install the package
pip install maib-incident-classifier
# Run inference
maib-inference --model_path your-username/maib-incident-classifier --text "Incident description"
Class Labels
The model classifies incidents into the following 11 categories:
- Accident to person(s) - Injuries or fatalities to crew or passengers
- Capsizing / Listing - Vessel capsizing or severe listing
- Collision - Collision with another vessel or object
- Contact - Contact with fixed or floating objects
- Damage / Loss Of Equipment - Equipment failure or damage
- Fire / Explosion - Fire or explosion incidents
- Flooding / Foundering - Water ingress or vessel sinking
- Grounding / Stranding - Vessel running aground
- Hull Failure - Structural hull damage
- Loss Of Control - Loss of steering or propulsion control
- Non-accidental Event - Events not classified as accidents
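If the labels need to be addressed by index, for example when reading raw logits, a pair of lookup tables is convenient. The sketch below assumes the model's output indices follow the order of the list above, as the usage example in this card implies; if the mapping stored in the model's own configuration differs, that mapping should be treated as authoritative.

```python
# Label order copied from this card; assumed to match the model's output indices.
CLASS_LABELS = [
    "Accident to person(s)",
    "Capsizing / Listing",
    "Collision",
    "Contact",
    "Damage / Loss Of Equipment",
    "Fire / Explosion",
    "Flooding / Foundering",
    "Grounding / Stranding",
    "Hull Failure",
    "Loss Of Control",
    "Non-accidental Event",
]

id2label = dict(enumerate(CLASS_LABELS))
label2id = {label: idx for idx, label in id2label.items()}

print(len(id2label), label2id["Fire / Explosion"])
```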
Limitations and Bias
Known Limitations
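- Class Imbalance: Some incident types (Hull Failure, Non-accidental Event) have very few samples, which is consistent with the Macro F1-Score (70.2%) being well below the Weighted F1-Score (89.0%)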
- Language: Model only works with English text
- Domain Specific: Trained specifically on MAIB reports and may not generalize to other maritime contexts
- Temporal Bias: Based on historical data and may not reflect current incident patterns
Potential Biases
- Reporting Bias: Reflects biases in how incidents are reported to MAIB
- Geographic Bias: Primarily UK-focused incident reports
- Vessel Type Bias: May be biased toward certain vessel types more commonly reported
Citation
@software{maib_classifier,
  title={MAIB Incident Type Classifier},
  author={Ilia Munaev},
  year={2024},
  url={https://huggingface.co/your-username/maib-incident-classifier}
}
Acknowledgments
- Marine Accident Investigation Branch (MAIB) for providing the dataset
- Microsoft for the DeBERTa-v3 base model
- Hugging Face for the transformers library and platform
- Baker Street for hosting the MAIB incident reports dataset
Contact
For questions, issues, or contributions:
- Repository: [GitHub Repository URL]
- Issues: [GitHub Issues URL]
- Email: [email protected]
License
This model is released under the Apache 2.0 License. See the LICENSE file for more details.