Siamese Network for Signature Verification

A deep learning model built entirely from scratch in PyTorch to verify the authenticity of handwritten signatures. It uses a Siamese network trained with triplet loss for metric learning, producing embeddings in which genuine signature pairs lie close together and genuine-forged pairs lie far apart.

Model Details

Custom Architecture Overview

This model was built from scratch without using pre-trained weights. It consists of two main components:

1. SimpleEmbeddingNetwork

A custom CNN-based feature extractor designed specifically for signature analysis:

Architecture Layers:

  • Conv Block 1: 3 → 32 channels (5×5 kernel, MaxPool 2×2)
  • Conv Block 2: 32 → 64 channels (5×5 kernel, MaxPool 2×2)
  • Conv Block 3: 64 → 128 channels (3×3 kernel, MaxPool 2×2)
  • Conv Block 4: 128 → 256 channels (3×3 kernel, MaxPool 2×2)

Fully Connected Layers:

  • Linear(flattened_features, 512) + BatchNorm + ReLU + Dropout(0.5)
  • Linear(512, 256) + BatchNorm + ReLU + Dropout(0.3)
  • Linear(256, embedding_dim=256)

Output: L2-normalized embeddings (256-dimensional)
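
The layer list above can be sketched in PyTorch roughly as follows. This is an illustrative stand-in, not the released SimpleEmbeddingNetwork: the class name EmbeddingNetSketch, the "same" padding, the layer order inside each block, and the Dropout2d rate are all assumptions, since the card does not state them.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def conv_block(in_ch: int, out_ch: int, kernel: int, p_drop: float = 0.1) -> nn.Sequential:
    # Assumption: "same" padding, so only the 2x2 max-pool changes the spatial size.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=kernel, padding=kernel // 2),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(2),
        nn.Dropout2d(p_drop),  # assumed rate; the card does not give one
    )


class EmbeddingNetSketch(nn.Module):
    """Illustrative stand-in for SimpleEmbeddingNetwork, not the released code."""

    def __init__(self, embedding_dim: int = 256, input_size=(224, 224)):
        super().__init__()
        self.features = nn.Sequential(
            conv_block(3, 32, 5),
            conv_block(32, 64, 5),
            conv_block(64, 128, 3),
            conv_block(128, 256, 3),
        )
        # Four 2x2 pools shrink each side by 16 (given the padding assumption).
        h, w = input_size
        n_flat = 256 * (h // 16) * (w // 16)
        self.head = nn.Sequential(
            nn.Linear(n_flat, 512), nn.BatchNorm1d(512), nn.ReLU(inplace=True), nn.Dropout(0.5),
            nn.Linear(512, 256), nn.BatchNorm1d(256), nn.ReLU(inplace=True), nn.Dropout(0.3),
            nn.Linear(256, embedding_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x).flatten(1)
        return F.normalize(self.head(x), p=2, dim=1)  # unit-length embeddings
```

Under these assumptions a 224×224 input reaches the first linear layer as a 256 × 14 × 14 feature map; a different padding choice would change that flattened size.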

2. SiameseNetwork

Wraps the embedding network with shared weights:

  • Triplet Mode: Processes anchor, positive, and negative samples
  • Pair Mode: Processes two images for comparison
  • Distance Computation: Euclidean and cosine distance metrics
  • Similarity Prediction: Built-in threshold-based classification
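
Because the embeddings are L2-normalized, Euclidean distance and cosine similarity are tied together by d² = 2 − 2·cos, so a threshold on one corresponds to a threshold on the other. The small pure-Python check below illustrates this with made-up 3-dimensional vectors; the helper names are hypothetical and are not part of the SiameseNetwork API.

```python
import math


def l2_normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def euclidean_distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Two toy 3-dimensional "embeddings" (made-up values, not model outputs).
z1 = l2_normalize([0.2, 0.9, 0.4])
z2 = l2_normalize([0.1, 0.8, 0.6])

d = euclidean_distance(z1, z2)
cos = cosine_similarity(z1, z2)

# For unit vectors d^2 = 2 - 2 * cos, so d always lies in [0, 2].
assert math.isclose(d * d, 2 - 2 * cos, rel_tol=1e-9, abs_tol=1e-12)

is_match = d < 0.8485  # threshold reported in this card
print(f"euclidean={d:.4f} cosine={cos:.4f} match={is_match}")
```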

Key Features

  • Custom CNN Architecture: Optimized for signature images
  • Metric Learning: Trained using triplet loss
  • L2 Normalization: Embeddings are normalized for consistent distance metrics
  • Batch Normalization & Dropout: Stabilize training and reduce overfitting
  • Flexible Input: Supports both triplet and pair-wise comparisons

Performance

Metric               Value
Validation Accuracy  0.6773
Test Accuracy        0.6673
Precision            0.6840
Recall               0.6218
F1 Score             0.6514
AUC-ROC              0.5042
Optimal Threshold    0.8485

Distance Statistics

  • Genuine Pair Distance: 0.7386 ± 0.4836
  • Forged Pair Distance: 1.1545 ± 0.4977
  • Class Separation: 0.4160

Dataset

Source: siddharthmagesh/signature-verfication (Kaggle)

Structure:

  • Real signatures per user
  • Forged signatures per user
  • Triplet dataset generation: 100 triplets per user
  • Train/Val Split: 80/20
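
One plausible way to draw 100 triplets per user is sketched below. The card does not describe the sampling rule, so this is an assumption: the anchor and positive are two distinct genuine signatures of one user and the negative is a forgery of that user's signature. The function name and file names are hypothetical.

```python
import random


def make_triplets(genuine, forged, n_triplets=100, seed=0):
    """Sample (anchor, positive, negative) paths for one user.

    anchor and positive are two different genuine signatures;
    negative is a forgery of the same user's signature.
    """
    if len(genuine) < 2 or not forged:
        raise ValueError("need at least 2 genuine and 1 forged signature")
    rng = random.Random(seed)
    triplets = []
    for _ in range(n_triplets):
        anchor, positive = rng.sample(genuine, 2)  # two distinct genuine samples
        negative = rng.choice(forged)
        triplets.append((anchor, positive, negative))
    return triplets


# Hypothetical file names for a single user.
genuine = [f"user01/real_{i}.png" for i in range(5)]
forged = [f"user01/forged_{i}.png" for i in range(5)]
triplets = make_triplets(genuine, forged)
print(len(triplets), triplets[0])
```

When applying an 80/20 split to data like this, splitting by user rather than by triplet keeps the same signature images from appearing on both sides.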

Preprocessing:

  • Image Size: 224 × 224
  • Normalization Mean: [0.861, 0.861, 0.861]
  • Normalization Std: [0.274, 0.274, 0.274]
  • Data Augmentation: Random affine transforms, perspective distortion

Training Configuration

Hyperparameters

  • Batch Size: 32
  • Learning Rate: 0.000973 (optimized via Optuna)
  • Weight Decay: 0.000177
  • Triplet Margin: 0.6836
  • Epochs: 50
  • Scheduler Gamma: 0.3953

Training Details

  • Total Training Time: 445.15 minutes (~7.4 hours)
  • Best Epoch: 32
  • Optimizer: Adam
  • Scheduler: StepLR (step_size=5)
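
With StepLR, the learning rate is multiplied by gamma once every step_size epochs. The pure-Python sketch below reproduces that schedule for the values listed above, assuming the scheduler is stepped once per epoch and epochs are counted from 0; the helper name step_lr is hypothetical.

```python
def step_lr(base_lr: float, gamma: float, step_size: int, epoch: int) -> float:
    """Learning rate in effect during `epoch` (0-indexed) under a StepLR-style decay."""
    return base_lr * gamma ** (epoch // step_size)


BASE_LR, GAMMA, STEP_SIZE = 0.000973, 0.3953, 5

for epoch in (0, 5, 10, 32, 49):
    print(f"epoch {epoch:2d}: lr = {step_lr(BASE_LR, GAMMA, STEP_SIZE, epoch):.3e}")
```

Under these assumptions the rate has been decayed six times by epoch 32 and nine times by the final epoch, so late epochs make only very small updates.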

Usage

Simple Inference Example

import torch
from PIL import Image
from torchvision import transforms
from modules.embedding_network import SimpleEmbeddingNetwork
from siamese_network import SiameseNetwork

# Device
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Load model
embedding_net = SimpleEmbeddingNetwork(embedding_dim=256, input_size=(224, 224))
model = SiameseNetwork(embedding_net)
checkpoint = torch.load('best_model.pth', map_location=device)
model.load_state_dict(checkpoint)
model.to(device)
model.eval()

# Image preprocessing
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.861, 0.861, 0.861], [0.274, 0.274, 0.274])
])

# Load and process images (convert to RGB so grayscale scans match the 3-channel input)
img1 = transform(Image.open('sig1.jpg').convert('RGB')).unsqueeze(0).to(device)
img2 = transform(Image.open('sig2.jpg').convert('RGB')).unsqueeze(0).to(device)

# Get embeddings and compare
with torch.no_grad():
    z1, z2 = model(img1, img2, triplet_bool=False)
    distance = model.compute_distance(z1, z2).item()
    is_match = distance < 0.8485  # Optimal threshold

print(f"Distance: {distance:.4f}")
print(f"Match: {is_match}")

Download Checkpoint from HuggingFace

import torch
from huggingface_hub import hf_hub_download

# Download model weights
weights_path = hf_hub_download(
    repo_id="siddharth-magesh/siamese-signature-verification",
    filename="best_model.pth"
)

# `model` and `device` are the objects created in the inference example above
checkpoint = torch.load(weights_path, map_location=device)
model.load_state_dict(checkpoint)

Compare Two Signatures

# `model`, `transform`, and `device` are the objects created in the inference example above

# Load and preprocess images (convert to RGB to match the 3-channel input)
img1 = transform(Image.open('signature1.jpg').convert('RGB')).unsqueeze(0).to(device)
img2 = transform(Image.open('signature2.jpg').convert('RGB')).unsqueeze(0).to(device)

# Predict similarity
with torch.no_grad():
    is_match = model.predict_similarity(
        img1, img2,
        threshold=0.8485  # Optimal threshold from training
    )

print(f"Signatures match: {is_match}")

# Get embeddings and distance
with torch.no_grad():
    emb1 = model.get_embedding(img1)
    emb2 = model.get_embedding(img2)
    distance = model.compute_distance(emb1, emb2).item()

print(f"Embedding distance: {distance:.4f}")

Batch Processing

# Process multiple image pairs
# (`model`, `transform`, and `device` are the objects created in the inference example above)
image_paths = [
    ('sig1.jpg', 'sig2.jpg'),
    ('sig3.jpg', 'sig4.jpg'),
]

with torch.no_grad():
    for path1, path2 in image_paths:
        img1 = transform(Image.open(path1).convert('RGB')).unsqueeze(0).to(device)
        img2 = transform(Image.open(path2).convert('RGB')).unsqueeze(0).to(device)

        match = model.predict_similarity(img1, img2, threshold=0.8485)
        distance = model.compute_distance(
            model.get_embedding(img1),
            model.get_embedding(img2)
        ).item()

        print(f"{path1} vs {path2}: Match={match}, Distance={distance:.4f}")

Configuration Details

Parameter            Value                  Description
Input Size           224×224                Image height and width
Embedding Dimension  256                    Size of the output embedding vector
Distance Metric      Euclidean (L2)         Metric used for comparing embeddings
Threshold            0.8485                 Decision boundary for signature verification
Normalization Mean   [0.861, 0.861, 0.861]  Dataset-specific normalization
Normalization Std    [0.274, 0.274, 0.274]  Dataset-specific normalization

Installation & Setup

Install Dependencies

pip install torch torchvision pillow huggingface_hub

Quick Start

# Download model from HuggingFace
from huggingface_hub import hf_hub_download

weights_path = hf_hub_download(
    repo_id="siddharth-magesh/siamese-signature-verification",
    filename="best_model.pth"
)

# Load and use
import torch
from modules.embedding_network import SimpleEmbeddingNetwork
from siamese_network import SiameseNetwork

embedding_net = SimpleEmbeddingNetwork(embedding_dim=256, input_size=(224, 224))
model = SiameseNetwork(embedding_net)
checkpoint = torch.load(weights_path, map_location='cpu')
model.load_state_dict(checkpoint)
model.eval()

Hyperparameter Tuning

This model's hyperparameters were tuned with the Optuna hyperparameter optimization framework. Key parameters, their optimized values, and the ranges searched:

Parameter        Optimized Value  Range
Learning Rate    0.000973         1e-5 to 1e-2
Weight Decay     0.000177         1e-6 to 1e-2
Triplet Margin   0.6836           0.1 to 2.0
Scheduler Gamma  0.3953           0.1 to 0.9
Batch Size       32               16 to 64
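
The search ranges above could be expressed as an Optuna search space along the following lines. This is a sketch, not the author's tuning script: the function names, the log-scale sampling for the learning rate and weight decay, the discrete batch-size choices, and the number of trials in the commented driver are all assumptions.

```python
def suggest_hyperparameters(trial):
    """Map the ranges listed above onto Optuna's trial.suggest_* calls."""
    return {
        # Log-uniform sampling is an assumption; it is a common choice for rates.
        "lr": trial.suggest_float("lr", 1e-5, 1e-2, log=True),
        "weight_decay": trial.suggest_float("weight_decay", 1e-6, 1e-2, log=True),
        "margin": trial.suggest_float("margin", 0.1, 2.0),
        "gamma": trial.suggest_float("gamma", 0.1, 0.9),
        # Assumed discrete choices within the stated 16 to 64 range.
        "batch_size": trial.suggest_categorical("batch_size", [16, 32, 64]),
    }


def objective(trial):
    params = suggest_hyperparameters(trial)
    # Placeholder: train with `params` and return the validation metric to optimize.
    raise NotImplementedError("plug in the training loop here")


# Typical driver, left commented out because `objective` is only a stub:
# import optuna
# study = optuna.create_study(direction="maximize")
# study.optimize(objective, n_trials=50)
```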

Custom Architecture Implementation

The entire model was built from scratch without pre-trained weights:

SimpleEmbeddingNetwork

  • Input: Images of shape (B, 3, 224, 224)
  • Convolution Blocks: 4 blocks with increasing channels (3 → 32 → 64 → 128 → 256)
  • Block Layers and Regularization: BatchNorm, ReLU activations, MaxPooling, Dropout2d
  • Fully Connected Layers: Feature flattening → 512 → 256 dimensions
  • Output: L2-normalized 256-dimensional embeddings

SiameseNetwork Wrapper

  • Shared Weights: Same embedding network processes all input images
  • Triplet Loss: $\mathcal{L} = \max(0, d(a,p) - d(a,n) + \text{margin})$
    • Where $a$ = anchor, $p$ = positive, $n$ = negative
    • $d(\cdot,\cdot)$ = Euclidean distance
  • Metric Learning: Creates discriminative embedding space
  • Inference Methods:
    • get_embedding(): Extract embedding for single or batch of images
    • compute_distance(): Calculate distance between embeddings
    • predict_similarity(): Threshold-based binary classification
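
The triplet loss formula above can be checked on a small worked example. The pure-Python sketch below places three 1-D toy points so that d(a,p) and d(a,n) equal the mean genuine and forged distances reported in this card; the function names are hypothetical and the points are not model outputs.

```python
import math


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin):
    """max(0, d(a, p) - d(a, n) + margin) for a single triplet."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


MARGIN = 0.6836  # triplet margin reported in this card

# 1-D toy points placed so the distances equal the reported mean distances.
anchor, positive, negative = [0.0], [0.7386], [1.1545]
loss = triplet_loss(anchor, positive, negative, MARGIN)
print(f"loss = {loss:.4f}")
```

A triplet sitting exactly at the reported mean distances still incurs a positive loss, because the gap between the two distances (about 0.416) is smaller than the margin (0.6836). PyTorch also provides torch.nn.TripletMarginLoss for the batched case; the card does not say whether it was used.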

Evaluation Metrics Explained

  • Accuracy: Fraction of correct classifications at the decision threshold (0.8485)
  • Precision: Of predicted genuine pairs, how many were actually genuine
  • Recall: Of actual genuine pairs, how many were correctly identified
  • F1 Score: Harmonic mean of precision and recall
  • AUC-ROC: Area under the Receiver Operating Characteristic curve
  • Optimal Threshold: Distance threshold that maximizes accuracy (0.8485)
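
An accuracy-maximizing threshold like the one described above can be found by sweeping candidate thresholds over validation distances. The sketch below does this in pure Python on made-up distances; the function name and data are illustrative, and the author's actual selection procedure is not documented in this card.

```python
def best_threshold(distances, labels):
    """Return (threshold, accuracy) maximizing accuracy for the rule `distance < threshold`.

    labels: 1 for a genuine pair, 0 for a forged pair.
    """
    candidates = sorted(set(distances))
    # Also try a threshold above every distance (predict "genuine" for all pairs).
    candidates.append(candidates[-1] + 1e-6)
    best_t, best_acc = candidates[0], -1.0
    for t in candidates:
        correct = sum((d < t) == bool(y) for d, y in zip(distances, labels))
        acc = correct / len(labels)
        if acc > best_acc:  # ties keep the smaller threshold
            best_t, best_acc = t, acc
    return best_t, best_acc


# Made-up validation distances, for illustration only.
distances = [0.30, 0.55, 0.70, 0.95, 0.80, 1.10, 1.30, 1.60]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
threshold, accuracy = best_threshold(distances, labels)
print(f"threshold={threshold:.2f} accuracy={accuracy:.3f}")
```

Swapping the accuracy criterion for one that penalizes false accepts more heavily is a natural way to adapt the threshold to stricter use cases.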

Confusion Matrix

                 Predicted Genuine  Predicted Forged
Actual Genuine         684 (TP)          416 (FN)
Actual Forged          316 (FP)          784 (TN)
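
The headline test metrics follow directly from the confusion matrix above, treating "genuine" as the positive class. A short check:

```python
tp, fn, fp, tn = 684, 416, 316, 784  # counts from the confusion matrix

accuracy = (tp + tn) / (tp + fn + fp + tn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.4f} precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
# prints: accuracy=0.6673 precision=0.6840 recall=0.6218 f1=0.6514
```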

Limitations

  • Model trained on specific signature dataset; may have domain bias
  • Performance depends on image quality and signature consistency
  • Works best with clear, full signatures (not partial or heavily degraded)
  • Optimal threshold (0.8485) should be adjusted based on use case requirements

Future Improvements

  • Multi-user signature verification
  • Real-time signature capture support
  • Mobile deployment optimization
  • Cross-domain signature adaptation

Citation

If you use this model in your research, please cite:

@misc{siamese_signature_verification_2025,
  title={Siamese Network for Signature Verification},
  author={Siddharth Magesh},
  year={2025},
  url={https://huggingface.co/siddharth-magesh/siamese-signature-verification}
}

License

MIT License - See LICENSE file for details

Contact & Support

For questions or issues, please open an issue on the model repository.


Model Last Updated: December 2025
Training Framework: PyTorch
Dataset Source: Kaggle (siddharthmagesh/signature-verfication)
