---
license: mit
datasets:
  
  - financial-fraud-detection
language:
  - en
metrics:
  - auc
  - accuracy
  - f1
  - precision
  - recall
base_model:
  - "None"
library_name: onnx
pipeline_tag: fraud-detection
tags:
  - fraud-detection
  - ensemble
  - financial-security
  - onnx
  - xgboost
  - lightgbm
  - catboost
  - random-forest
  - production
  - cybersecurity
  - mlops
  - real-time-inference
  - deployed
model-index:
  - name: Fraud Detection Ensemble ONNX
    results:
      - task:
          name: Fraud Detection
          type: fraud-detection
        dataset:
          name: CREDIT CARD fraud detection credit card.csv
          type: tabular
        metrics:
          - type: auc
            value: 0.9998
          - type: accuracy
            value: 0.9942
          - type: f1
            value: 0.9756
          - type: precision
            value: 0.9813
          - type: recall
            value: 0.9701
new_version: "true"
---


# 🛡️ Fraud Detection Ensemble Suite - ONNX Format
**Author:** [darkknight25](https://huggingface.co/darkknight25)  
**Models:** XGBoost, LightGBM, CatBoost, Random Forest, Meta Learner  
**Format:** ONNX for production-ready deployment  
**Tags:** `fraud-detection`, `onnx`, `ensemble`, `real-world`, `ml`, `lightweight`, `financial-security`

---

## 🔍 Overview

This repository provides a **fraud detection ensemble** trained on real-world financial datasets and exported in **ONNX format** for low-latency inference.

Each model is optimized for different fraud signals and then blended via a **meta-model** for enhanced generalization.

---

## 🎯 Real-World Use Cases

✅ Credit card fraud detection  
✅ Transaction monitoring systems  
✅ Risk scoring engines  
✅ Insurance fraud  
✅ Online payment gateways  
✅ Embedded or edge deployments using ONNX

---

## 🧠 Models Included

| Model         | Format | Status     | Notes                                  |
|---------------|--------|------------|----------------------------------------|
| XGBoost       | ONNX   | ✅ Ready    | Best for handling imbalanced data      |
| LightGBM      | ONNX   | ✅ Ready    | Fast, efficient gradient boosting      |
| CatBoost      | ONNX   | ✅ Ready    | Handles categorical features well      |
| RandomForest  | ONNX   | ✅ Ready    | Stable classical ensemble              |
| Meta Model    | ONNX   | ✅ Ready    | Trained on outputs of above models     |

---

## 🧾 Feature Schema

`feature_names.json` contains the exact input features expected by all models.

You must preprocess data to match this schema before ONNX inference.

```json
["amount", "time", "is_foreign", "txn_type", ..., "ratio_to_median_purchase_price"]
```
Shape: (None, 29)

Dtype: float32
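
As a minimal sketch of that preprocessing step, the helper below orders a transaction record by the schema and returns a `(1, n_features)` float32 array. `to_model_input` is a hypothetical helper that is not part of this repository, and the three-feature schema and record values are made up for illustration; the real schema comes from `feature_names.json`.

```python
import numpy as np


def to_model_input(record, feature_names):
    """Order a feature dict by the schema and return a (1, n) float32 array."""
    missing = [name for name in feature_names if name not in record]
    if missing:
        raise KeyError(f"missing features: {missing}")
    row = [float(record[name]) for name in feature_names]
    return np.asarray([row], dtype=np.float32)


# Hypothetical three-feature schema; the real one is loaded from feature_names.json.
feature_names = ["amount", "time", "is_foreign"]
# Key order in the record does not matter; the schema decides column order.
record = {"time": 3600, "amount": 42.5, "is_foreign": 0}

X = to_model_input(record, feature_names)
print(X.shape, X.dtype)  # (1, 3) float32
```

Raising on missing keys, rather than silently filling zeros, keeps a schema mismatch from producing a plausible-looking but meaningless score.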

```python
import json

import numpy as np
import onnxruntime as ort

# Load the feature schema so the input has the expected number of columns
with open("feature_names.json") as f:
    feature_names = json.load(f)

# Dummy input (replace with your real preprocessed data)
X = np.random.rand(1, len(feature_names)).astype(np.float32)

# Load the ONNX model on CPU
session = ort.InferenceSession("xgb_model.onnx", providers=["CPUExecutionProvider"])

# Run inference; passing None requests every model output
input_name = session.get_inputs()[0].name
output = session.run(None, {input_name: X})

# Which output holds probabilities depends on how the model was exported;
# check session.get_outputs() if output[0] turns out to contain labels.
print("Fraud probability:", output[0])
```

## Example Inference Code

```python
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("meta_model.onnx")

# Placeholder row; replace with real preprocessed features, shape (1, 29)
input_data = np.zeros((1, 29), dtype=np.float32)

inputs = {session.get_inputs()[0].name: input_data}
outputs = session.run(None, inputs)
print("Fraud Probability:", outputs[0])
```
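
The model table describes the meta-model as trained on the outputs of the base models. Under the assumption that it consumes one fraud probability per base model, the blending step can be sketched as below. `stack_predict` is a hypothetical helper, and the lambdas are stand-ins for ONNX sessions so the sketch runs without the model files; the column order and the averaging meta-model are illustrative assumptions, not the repository's actual configuration.

```python
import numpy as np


def stack_predict(base_predictors, meta_predictor, X):
    """Run each base model on X, stack their fraud probabilities
    column-wise, and pass the stacked matrix to the meta-model."""
    columns = [
        np.asarray(predict(X), dtype=np.float32).reshape(-1)
        for predict in base_predictors
    ]
    stacked = np.stack(columns, axis=1)  # shape (n_rows, n_base_models)
    return meta_predictor(stacked)


# Stand-ins for the four base models: each returns one probability per row.
base_predictors = [
    lambda X: np.full(len(X), 0.9),
    lambda X: np.full(len(X), 0.7),
    lambda X: np.full(len(X), 0.8),
    lambda X: np.full(len(X), 0.6),
]


# Placeholder meta-model: a plain average of the base probabilities.
def meta_predictor(stacked):
    return stacked.mean(axis=1)


X = np.zeros((2, 29), dtype=np.float32)
scores = stack_predict(base_predictors, meta_predictor, X)
print(scores)
```

With the real files, each callable would wrap `session.run(None, {input_name: X})` and select the probability output. Inspect `session.get_inputs()[0].shape` on `meta_model.onnx` to confirm what input width it actually expects before wiring this up.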

## 🧪 Training Pipeline

All models were trained using the following:

✅ Stratified train/test split  
✅ StandardScaler normalization  
✅ Log loss and AUC optimization  
✅ Early stopping and feature importance  
✅ Lightweight autoencoder anomaly filter (not included here)
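
The first two steps above can be sketched with scikit-learn as follows. This is an illustration on synthetic data, not the author's training script: the 80/20 split, the random seed, and the roughly 5% positive rate are assumptions chosen for the example.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the transaction data: 29 features, about 5% fraud.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 29)).astype(np.float32)
y = (rng.random(1000) < 0.05).astype(int)

# stratify=y keeps the fraud rate nearly identical in both splits,
# which matters when positives are rare.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Fit the scaler on the training split only, then reuse it for the test
# split so no test statistics leak into training.
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train).astype(np.float32)
X_test_scaled = scaler.transform(X_test).astype(np.float32)

print("train fraud rate:", y_train.mean())
print("test fraud rate:", y_test.mean())
```

The same fitted scaler has to be applied at inference time, since the ONNX models expect inputs on the scale they were trained on.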


## 🔐 Security Focus

Ensemble modeling is intended to reduce false positives and soften the effect of drift in any single model.  
The tree-based models are comparatively robust to outliers and moderate data shifts.  
An optional TFLite autoencoder can flag unknown fraud patterns.


## 📁 Files

```text
models/
├── xgb_model.onnx
├── lgb_model.onnx
├── cat_model.onnx
├── rf_model.onnx
├── meta_model.onnx
├── feature_names.json
```

## 🛠️ Advanced Users

ONNX models can be converted to TFLite, TensorRT, or CoreML, subject to operator support in the target format.  
Deploy via FastAPI, Flask, Streamlit, or ONNX Runtime on edge devices.

🀝 License

MIT License. You are free to use, modify, and deploy with attribution.
πŸ™Œ Author

Made with ❀️ by darkknight25,SUNNYTHAKUR
Contact for enterprise deployments, smart contract forensics, or advanced ML pipelines