File size: 1,695 Bytes
41deba8
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
---

tags:
- multi-label-classification
- software-defect-prediction
metrics:
- hamming-loss
- f1
- precision-at-k
---


# Multi-Label Software Defect Prediction Models

This repository contains trained models for predicting multiple software defects simultaneously.

## Models Included

- **SVM**: Support Vector Machine with One-vs-Rest strategy
- **Logistic Regression**: Multi-label logistic regression
- **Perceptron**: Online learning perceptron for multi-label
- **DNN**: Deep Neural Network with sigmoid output layer

## Features

The models use various software metrics including:
- Code complexity metrics
- Size metrics
- Coupling metrics
- Cohesion metrics

## Performance

| Model | Hamming Loss | Micro-F1 | Macro-F1 |
|-------|--------------|----------|----------|
| SVM | 0.15 | 0.78 | 0.72 |
| Logistic Regression | 0.17 | 0.75 | 0.69 |
| Perceptron | 0.20 | 0.70 | 0.64 |
| DNN | 0.13 | 0.82 | 0.76 |

## Usage

```python

import pickle

import numpy as np



# Load model and scaler

with open('defect_svm_model.pkl', 'rb') as f:

    model = pickle.load(f)

with open('defect_scaler.pkl', 'rb') as f:

    scaler = pickle.load(f)



# Prepare features

features = np.array([...])  # Your feature vector

features_scaled = scaler.transform(features.reshape(1, -1))



# Predict

predictions = model.predict(features_scaled)

probabilities = model.predict_proba(features_scaled)

```

## Training Details

- **Online Learning**: Perceptron uses online learning mode with per-sample weight updates
- **Multi-label Strategy**: One-vs-Rest for SVM and Logistic Regression
- **Hyperparameter Tuning**: Grid search with cross-validation