---
license: apache-2.0
base_model:
- facebook/sam2-hiera-tiny
- facebook/sam2-hiera-small
- facebook/sam2-hiera-base-plus
- facebook/sam2-hiera-large
pipeline_tag: image-segmentation
---
# Model Card: CoronarySAM2 - Fine-tuned SAM2 for Coronary Artery Segmentation

## Model Details

### Model Description

CoronarySAM2 is a collection of fine-tuned Segment Anything Model 2 (SAM2) variants specifically optimized for coronary artery segmentation in X-ray angiography images. The models use point-based prompting to enable interactive and precise segmentation of coronary arteries from medical imaging data.

- **Developed by:** Research Team
- **Model Type:** Computer Vision - Image Segmentation
- **Base Architecture:** SAM2 (Segment Anything Model 2) with Hiera backbone
- **Language(s):** Python
- **License:** Apache 2.0
- **Fine-tuned from:** [facebook/segment-anything-2](https://github.com/facebookresearch/segment-anything-2)

### Model Variants

Four model variants are available, offering different trade-offs between speed and accuracy:

| Model | Parameters | Checkpoint | Speed | Accuracy | Use Case |
|-------|-----------|------------|-------|----------|----------|
| **SAM2 Hiera Tiny** | ~38M | `sam2_t/best_model.pt` | ⚡⚡⚡ Fast | ⭐⭐⭐ Good | Quick experiments, real-time feedback |
| **SAM2 Hiera Small** | ~46M | `sam2_s/checkpoint_epoch_70.pt` | ⚡⚡ Medium | ⭐⭐⭐⭐ Very Good | Balanced performance, general use |
| **SAM2 Hiera Base Plus** | ~80M | `sam2_b+/best_model.pt` | ⚡ Slower | ⭐⭐⭐⭐⭐ Excellent | High-quality results, clinical evaluation |
| **SAM2 Hiera Large** | ~224M | `sam2_l/final_model.pt` | ⚡ Slowest | ⭐⭐⭐⭐⭐ Best | Maximum accuracy, research purposes |
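
For scripting, the table can be mirrored in a small lookup. This is only a sketch: the short variant keys and the `checkpoint_path` helper are hypothetical names, and the default `ft_models` root is taken from the Basic Usage example rather than from a documented directory layout.

```python
# Hypothetical lookup mirroring the variants table; keys and helper are illustrative.
CHECKPOINTS = {
    "tiny": "sam2_t/best_model.pt",
    "small": "sam2_s/checkpoint_epoch_70.pt",
    "base_plus": "sam2_b+/best_model.pt",
    "large": "sam2_l/final_model.pt",
}


def checkpoint_path(variant: str, root: str = "ft_models") -> str:
    """Return the checkpoint path for a variant key such as 'small'."""
    if variant not in CHECKPOINTS:
        raise ValueError(
            f"Unknown variant {variant!r}; choose from {sorted(CHECKPOINTS)}"
        )
    # ASSUMPTION: checkpoints live under a single root directory.
    return f"{root}/{CHECKPOINTS[variant]}"
```

Calling `checkpoint_path("small")` yields the same path that the Basic Usage example loads.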

### Model Architecture

The models follow the SAM2 architecture with the following components:

1. **Image Encoder**: Hiera hierarchical vision transformer backbone
2. **Prompt Encoder**: Encodes point prompts (positive/negative) as spatial embeddings
3. **Mask Decoder**: Transformer-based decoder that generates high-quality segmentation masks
4. **Preprocessing Pipeline**:
   - X-ray image normalization using Gaussian blur
   - CLAHE (Contrast Limited Adaptive Histogram Equalization) for vessel enhancement
   - Fixed resolution resizing to 1024×1024 pixels

## Intended Use

### Primary Use Cases

- **Interactive Coronary Artery Segmentation**: Point-based annotation for precise artery delineation
- **Medical Image Analysis**: Semi-automated assistance for cardiologists and radiologists
- **Research**: Computer-aided diagnosis and treatment planning research
- **Educational Purposes**: Training and demonstration of medical image segmentation

### Out-of-Scope Use

- ❌ Clinical diagnosis without expert oversight
- ❌ Automated treatment decisions
- ❌ Real-time interventional guidance without validation
- ❌ Non-coronary vessel segmentation (not trained for this task)
- ❌ Modalities other than X-ray angiography (CT, MRI, etc.)

## Training Data

### Dataset

The models were fine-tuned on coronary X-ray angiography images with annotations for coronary artery structures.

**Training Specifications:**
- **Modality**: X-ray Angiography
- **Target**: Coronary Arteries
- **Annotation Type**: Binary segmentation masks
- **Resolution**: Images resized to 1024×1024 for training

### Preprocessing

All training images underwent the following preprocessing pipeline:

1. **Normalization**: Gaussian blur-based intensity normalization
2. **CLAHE Enhancement**: Adaptive histogram equalization (clip limit: 2.0, tile grid: 8×8)
3. **Resizing**: Fixed 1024×1024 resolution
4. **Format**: RGB format (grayscale images converted to RGB)

## Evaluation

### Metrics

The models should be evaluated using the following metrics:

- **Dice Coefficient**: Measures overlap between predicted and ground truth masks
- **IoU (Intersection over Union)**: Ratio of the overlap between predicted and ground truth masks to their combined area
- **Precision & Recall**: For detecting true vessel pixels
- **Hausdorff Distance**: Measures boundary accuracy
- **Inference Time**: Speed benchmarks on various hardware
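
The overlap-based metrics can be computed from pixel-level true positives, false positives, and false negatives. The following is a minimal NumPy sketch, not an official evaluation script: `segmentation_metrics` is a hypothetical name, and returning 1.0 when a denominator is empty is a convention chosen here. Hausdorff distance and timing are left out.

```python
import numpy as np


def segmentation_metrics(pred, target) -> dict:
    """Dice, IoU, precision and recall for two binary masks of equal shape."""
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    if pred.shape != target.shape:
        raise ValueError("pred and target must have the same shape")

    tp = int(np.logical_and(pred, target).sum())   # vessel pixels found
    fp = int(np.logical_and(pred, ~target).sum())  # background marked as vessel
    fn = int(np.logical_and(~pred, target).sum())  # vessel pixels missed

    def ratio(num: int, den: int) -> float:
        # ASSUMPTION: an empty denominator (e.g. both masks empty) scores 1.0.
        return num / den if den else 1.0

    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "iou": ratio(tp, tp + fp + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
    }
```

Because vessels occupy a small fraction of each frame, Dice and IoU are usually more informative here than plain pixel accuracy, which is dominated by background.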

### Performance Considerations

- **Point Prompt Quality**: Model performance heavily depends on the quality and number of point prompts
- **Image Quality**: Better results with high-contrast angiography images
- **Vessel Complexity**: Performance may vary with vessel overlap and bifurcations
- **Model Selection**: Larger models generally provide better accuracy but slower inference

## How to Use

### Installation

```bash
# Create conda environment
conda create -n sam2_FT_env python=3.10.0 -y
conda activate sam2_FT_env

# Install SAM2
git clone https://github.com/facebookresearch/segment-anything-2.git
cd segment-anything-2
pip install -e .
cd ..

# Install dependencies
pip install gradio opencv-python-headless torch torchvision torchaudio
```

### Basic Usage

```python
import torch
import numpy as np
from sam2.build_sam import build_sam2
from sam2.sam2_image_predictor import SAM2ImagePredictor

# Load model
checkpoint_path = "ft_models/sam2_s/checkpoint_epoch_70.pt"
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

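# The fine-tuned checkpoint stores the SAM2 config alongside the weights
checkpoint = torch.load(checkpoint_path, map_location=device)
model_cfg = checkpoint['model_cfg']
# Build the architecture only; the fine-tuned weights come from the state dict
sam2_model = build_sam2(model_cfg, ckpt_path=None, device=device)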

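# Load the fine-tuned weights, stripping the 'module.' prefix that
# DataParallel/DistributedDataParallel adds to parameter names
state_dict = checkpoint['model_state_dict']
new_state_dict = {k[7:] if k.startswith('module.') else k: v
                  for k, v in state_dict.items()}
sam2_model.load_state_dict(new_state_dict)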
sam2_model.eval()

# Create predictor
predictor = SAM2ImagePredictor(sam2_model)

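# Set image: `preprocessed_image` is an HxWx3 uint8 RGB array that has already
# been through the preprocessing pipeline (normalization, CLAHE, 1024x1024 resize)
predictor.set_image(preprocessed_image)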

# Add point prompts
point_coords = np.array([[512, 300], [520, 310]])  # x, y coordinates
point_labels = np.array([1, 1])  # 1 = positive, 0 = negative

# Predict
masks, scores, logits = predictor.predict(
    point_coords=point_coords,
    point_labels=point_labels,
    multimask_output=True
)
```
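
With `multimask_output=True`, `predict` returns several candidate masks, each with a predicted quality score. A common follow-up, sketched here with the hypothetical helper `select_best_mask`, is to keep the highest-scoring candidate and convert it to a boolean mask. The sketch assumes `masks` has shape `(C, H, W)` and `scores` has shape `(C,)`.

```python
import numpy as np


def select_best_mask(masks: np.ndarray, scores: np.ndarray) -> np.ndarray:
    """Return the highest-scoring candidate as an HxW boolean mask."""
    masks = np.asarray(masks)
    scores = np.asarray(scores)
    if masks.shape[0] != scores.shape[0]:
        raise ValueError("expected one score per candidate mask")
    best = int(np.argmax(scores))
    # Works for both boolean and 0/1 float masks.
    return masks[best] > 0
```

For example, `vessel_mask = select_best_mask(masks, scores)` gives a mask that can be overlaid on the preprocessed image for review.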

### Interactive Application

Launch the Gradio interface:

```bash
python app.py
```

Access at `http://127.0.0.1:7860`

## Limitations

### Technical Limitations

- **Fixed Input Size**: Models expect 1024×1024 input; resizing images to this resolution can blur or erase small vessels
- **Memory Requirements**: Large model requires significant GPU memory (~8GB VRAM recommended)
- **Point Dependency**: Requires manual point prompts; not fully automatic
- **Single Modality**: Optimized only for X-ray angiography
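
One practical consequence of the fixed input size is that point prompts picked on the original image must be mapped into the 1024×1024 frame before prediction. The sketch below assumes the resize stretches the image to a square without padding or cropping. `scale_points` is a hypothetical helper, not part of SAM2.

```python
def scale_points(points, orig_width, orig_height, target=1024):
    """Map (x, y) pixel coordinates from the original image to the resized frame."""
    if orig_width <= 0 or orig_height <= 0:
        raise ValueError("image dimensions must be positive")
    # ASSUMPTION: plain stretch to target x target (no letterboxing).
    return [
        (x * target / orig_width, y * target / orig_height)
        for x, y in points
    ]
```

The same factors apply to vessel width: on a frame twice as wide as the target, a vessel three pixels across shrinks to about one and a half pixels, which is why thin branches are the first to suffer from resizing.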

### Medical Limitations

- **Not FDA Approved**: Not cleared for clinical diagnostic use
- **Requires Expert Review**: All outputs must be validated by qualified professionals
- **Variability**: Performance may vary across different imaging protocols and equipment
- **Edge Cases**: May struggle with severe vessel overlap, calcifications, or poor image quality

### Known Issues

- High-contrast regions may cause over-segmentation
- Thin vessel branches may be missed without precise point placement
- Performance degrades on low-quality or motion-blurred images

## Ethical Considerations

### Medical AI Responsibility

- **Human Oversight Required**: This tool is designed to assist, not replace, medical professionals
- **No Autonomous Decisions**: Should never be used for automated clinical decisions
- **Training Data Bias**: Model performance may reflect biases present in training data
- **Privacy**: Ensure patient data is handled according to HIPAA/GDPR regulations

### Fairness & Bias

- Model performance across different patient demographics should be validated
- Imaging equipment and protocols may affect performance
- Consider potential biases in training dataset composition

### Transparency

- Model predictions should be explainable to medical professionals
- Segmentation boundaries should be reviewable and editable
- Point prompt influence on outputs should be clear to users

## Citation

### Base Model (SAM2)

```bibtex
@article{ravi2024sam2,
  title={SAM 2: Segment Anything in Images and Videos},
  author={Ravi, Nikhila and Gabeur, Valentin and Hu, Yuan-Ting and Hu, Ronghang and Ryali, Chaitanya and Ma, Tengyu and Khedr, Haitham and R{\"a}dle, Roman and Rolland, Chloe and Gustafson, Laura and others},
  journal={arXiv preprint arXiv:2408.00714},
  year={2024}
}
```

### This Work

If you use CoronarySAM2 in your research, please cite:

```bibtex
@software{coronarysam2_2025,
  title={CoronarySAM2: Fine-tuned SAM2 for Coronary Artery Segmentation},
  author={[Your Name/Team]},
  year={2025},
  url={[Repository URL]}
}
```

## Model Card Authors

- [Primary Author Names]
- Last Updated: November 2025

## Contact

For questions, issues, or collaboration inquiries:

- **GitHub Issues**: [Repository URL]/issues
- **Email**: [Contact Email]

## Disclaimer

**⚠️ IMPORTANT MEDICAL DISCLAIMER ⚠️**

This software is provided for **research and educational purposes only**. It is not intended for clinical use, medical diagnosis, or treatment planning. The models have not been validated for clinical deployment and are not FDA-approved or CE-marked medical devices.

**Always consult qualified healthcare professionals** for medical image interpretation and clinical decisions. The developers assume no liability for any clinical use or consequences resulting from the use of this software.

## Additional Resources

- [SAM2 Paper](https://arxiv.org/abs/2408.00714)
- [SAM2 GitHub Repository](https://github.com/facebookresearch/segment-anything-2)
- [Project README](README.md)
- [Application Interface](app.py)

---

**Version**: 1.0  
**Last Updated**: November 18, 2025  
**Status**: Research/Development