---
license: mit
datasets:
  - financial-fraud-detection
language:
  - en
metrics:
  - auc
  - accuracy
  - f1
  - precision
  - recall
base_model:
  - "None"
library_name: onnx
pipeline_tag: fraud-detection
tags:
  - fraud-detection
  - ensemble
  - financial-security
  - onnx
  - xgboost
  - lightgbm
  - catboost
  - random-forest
  - production
  - cybersecurity
  - mlops
  - real-time-inference
  - deployed
model-index:
  - name: Fraud Detection Ensemble ONNX
    results:
      - task:
          name: Fraud Detection
          type: fraud-detection
        dataset:
          name: CREDIT CARD fraud detection credit card.csv
          type: tabular
        metrics:
          - type: auc
            value: 0.9998
          - type: accuracy
            value: 0.9942
          - type: f1
            value: 0.9756
          - type: precision
            value: 0.9813
          - type: recall
            value: 0.9701
new_version: "true"
---


# 🛡️ Fraud Detection Ensemble Suite - ONNX Format
**Author:** [darkknight25](https://huggingface.co/darkknight25)  
**Models:** XGBoost, LightGBM, CatBoost, Random Forest, Meta Learner  
**Format:** ONNX for production-ready deployment  
**Tags:** `fraud-detection`, `onnx`, `ensemble`, `real-world`, `ml`, `lightweight`, `financial-security`

---

## 🔍 Overview

This repository provides a high-performance **fraud detection ensemble** trained on real-world financial datasets and exported in **ONNX format** for low-latency inference.

Each model is optimized for different fraud signals and then blended via a **meta-model** for enhanced generalization.

---

## 🎯 Real-World Use Cases

✅ Credit card fraud detection  
✅ Transaction monitoring systems  
✅ Risk scoring engines  
✅ Insurance fraud  
✅ Online payment gateways  
✅ Embedded or edge deployments using ONNX

---

## 🧠 Models Included

| Model         | Format | Status     | Notes                                  |
|---------------|--------|------------|----------------------------------------|
| XGBoost       | ONNX   | ✅ Ready    | Best for handling imbalanced data      |
| LightGBM      | ONNX   | ✅ Ready    | Fast, efficient gradient boosting      |
| CatBoost      | ONNX   | ✅ Ready    | Handles categorical features well      |
| RandomForest  | ONNX   | ✅ Ready    | Stable classical ensemble              |
| Meta Model    | ONNX   | ✅ Ready    | Trained on outputs of above models     |
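
The meta model row above describes stacking: base-model outputs become the meta model's input. A minimal sketch of that assembly step follows, assuming the meta model takes one fraud probability per base model in the order XGBoost, LightGBM, CatBoost, Random Forest. That order, the `BASE_MODELS` names, and the helper `build_meta_input` are hypothetical, not taken from this repository. The inference example in this README passes a 29-feature row to `meta_model.onnx`, so check `session.get_inputs()[0].shape` before relying on this layout.

```python
from typing import Dict, List, Sequence

# Assumed column order for the meta model input (not confirmed by the repository).
BASE_MODELS = ("xgb", "lgb", "cat", "rf")


def build_meta_input(base_probs: Dict[str, Sequence[float]]) -> List[List[float]]:
    """Stack per-model fraud probabilities into rows of shape (n_samples, 4)."""
    columns = [base_probs[name] for name in BASE_MODELS]
    n_samples = len(columns[0])
    if any(len(col) != n_samples for col in columns):
        raise ValueError("every base model must score the same number of rows")
    # zip(*columns) turns four per-model columns into one row per transaction.
    return [[float(p) for p in row] for row in zip(*columns)]
```

The resulting rows can be converted with `np.asarray(rows, dtype=np.float32)` before being passed to an ONNX Runtime session.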

---

## 🧾 Feature Schema

`feature_names.json` contains the exact input features expected by all models.

You must preprocess data to match this schema before ONNX inference.

```json
["amount", "time", "is_foreign", "txn_type", ..., "ratio_to_median_purchase_price"]
```
Input shape: `(None, 29)` (any batch size, 29 features per row)

Dtype: `float32`
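
Matching the schema mostly means putting columns in the order given by `feature_names.json`. The sketch below shows one way to do that for a transaction held as a dictionary. The helper `to_feature_row` is hypothetical, and it assumes every feature is already numeric (categorical fields such as `txn_type` would need to be encoded first, in whatever way was used at training time).

```python
from typing import Dict, List, Sequence


def to_feature_row(record: Dict[str, float], feature_names: Sequence[str]) -> List[float]:
    """Order one record's values to match the feature schema."""
    missing = [name for name in feature_names if name not in record]
    if missing:
        # Fail loudly instead of silently shifting columns.
        raise KeyError(f"missing features: {missing}")
    # Keys not listed in the schema are ignored.
    return [float(record[name]) for name in feature_names]
```

A batch can then be built with `np.asarray([to_feature_row(r, feature_names) for r in records], dtype=np.float32)`, which yields the `(None, 29)` float32 array described above when the schema has 29 entries.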

```python
import json

import numpy as np
import onnxruntime as ort

# Load the feature schema (defines the number and order of input columns)
with open("feature_names.json") as f:
    feature_names = json.load(f)

# Dummy input (replace with your real preprocessed data)
X = np.random.rand(1, len(feature_names)).astype(np.float32)

# Load ONNX model
session = ort.InferenceSession("xgb_model.onnx", providers=["CPUExecutionProvider"])

# Inference
input_name = session.get_inputs()[0].name
output = session.run(None, {input_name: X})

# output[0] is the model's first output; inspect session.get_outputs()
# to confirm whether it holds probabilities or predicted labels.
print("Fraud probability:", output[0])
```

## Example Inference Code

```python
import onnxruntime as ort
import numpy as np

session = ort.InferenceSession("meta_model.onnx")
# Replace ... with the preprocessed feature values for one transaction
input_data = np.array([[...]], dtype=np.float32)  # shape (1, 29)
inputs = {session.get_inputs()[0].name: input_data}
outputs = session.run(None, inputs)
print("Fraud Probability:", outputs[0])
```
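
Both examples print the first output as the fraud probability. Classifiers exported to ONNX often return two outputs instead, a label and a per-class probability output, with the probabilities stored either as rows of scores or as one `{class: probability}` dictionary per row. Which layout these files use is not stated here, so the sketch below is an assumption-driven helper (`extract_fraud_probability` is a hypothetical name) that handles those common layouts and treats class `1` as fraud.

```python
from typing import Any, List, Sequence


def extract_fraud_probability(outputs: Sequence[Any], positive_class: int = 1) -> List[float]:
    """Return one fraud probability per row from a session.run() result."""
    # When there are two outputs (label, probabilities), use the last one.
    scores = outputs[-1]
    result = []
    for row in scores:
        if isinstance(row, dict):
            # One {class_label: probability} dict per row.
            result.append(float(row[positive_class]))
        elif hasattr(row, "__len__"):
            # Row of per-class scores; a single column is taken as P(fraud).
            result.append(float(row[positive_class] if len(row) > 1 else row[0]))
        else:
            # Flat vector of scalar scores.
            result.append(float(row))
    return result
```

If a model returns only predicted labels, this helper will return those labels as floats, so it is worth checking `session.get_outputs()` once per model before trusting the values as probabilities.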

## 🧪 Training Pipeline

All models were trained using the following:

✅ Stratified train/test split  
✅ StandardScaler normalization  
✅ Log loss and AUC optimization  
✅ Early stopping and feature importance  
✅ Lightweight autoencoder anomaly filter (not included here)
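
Because the models were trained on standardized inputs, inference data needs the same scaling, using statistics from the training set rather than from the incoming batch. The repository's file list does not show a saved scaler, so the sketch below assumes you have access to the training rows or their per-column means and standard deviations. It mirrors what scikit-learn's `StandardScaler` computes (population standard deviation, constant columns left unscaled); the function names are hypothetical.

```python
from statistics import fmean, pstdev
from typing import List, Sequence, Tuple


def fit_scaler(rows: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Compute per-column mean and scale from training rows."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    # A zero standard deviation is replaced by 1.0 so constant columns are not divided by zero.
    scales = [pstdev(col) or 1.0 for col in columns]
    return means, scales


def transform(rows: Sequence[Sequence[float]], means: Sequence[float],
              scales: Sequence[float]) -> List[List[float]]:
    """Apply the training-set statistics to new rows."""
    return [[(v - m) / s for v, m, s in zip(row, means, scales)] for row in rows]
```

Fitting once on the training split and reusing `means` and `scales` at inference time keeps test and production data from leaking into the normalization.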


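## 🔐 Security Focus

- Ensemble modeling can reduce false positives and soften the impact of model drift.
- Tree-based models are comparatively robust to outliers and moderate data shifts.
- An optional TFLite autoencoder (not included in this repository) can flag previously unseen fraud patterns.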


## 📁 Files

```text
models/
├── xgb_model.onnx
├── lgb_model.onnx
├── cat_model.onnx
├── rf_model.onnx
├── meta_model.onnx
└── feature_names.json
```

## 🛠️ Advanced Users

- The ONNX models may be convertible to TFLite, TensorRT, or CoreML, depending on each target's operator support.
- Deploy via FastAPI, Flask, Streamlit, or ONNX Runtime on edge devices.

## 🤝 License

MIT License. You are free to use, modify, and deploy with attribution.

## 🙌 Author

Made with ❤️ by darkknight25, SUNNYTHAKUR  
Contact for enterprise deployments, smart contract forensics, or advanced ML pipelines.