---
language: en
license: apache-2.0
tags:
- vision
- image-classification
- medical
- alzheimer
- brain-mri
- vit
- transfer-learning
datasets:
- custom-brain-mri-dataset
metrics:
- accuracy
- f1
model-index:
- name: NotIshaan/vit-large-alzheimer-6layers-75M-final
  results:
  - task:
      type: image-classification
    metrics:
    - type: accuracy
      value: 0.9453
    - type: f1
      value: 0.9451
base_model:
- google/vit-large-patch16-224
---

# ViT-Large for Alzheimer's Detection from Brain MRI Scans

## Model Description

This model is a fine-tuned version of [google/vit-large-patch16-224](https://huggingface.co/google/vit-large-patch16-224) for multiclass classification of brain MRI scans to detect Alzheimer's disease stages.

**Key Features:**
- Base Model: Vision Transformer Large (304M parameters)
- Fine-tuning Strategy: last 6 transformer layers + classifier
- Class Imbalance Handling: weighted cross-entropy loss
- Data Augmentation: rotation, flip, brightness/contrast adjustments

## Model Performance

| Metric | Value |
|--------|-------|
| Accuracy | 0.9453 |
| Precision | 0.9454 |
| Recall | 0.9453 |
| F1 Score | 0.9451 |

### Per-Class F1 Scores

- Class 0: 0.8807
- Class 1: 1.0000
- Class 2: 0.9688
- Class 3: 0.9363

## Training Details

- **Training Data:** 4,352 brain MRI scans
- **Validation Data:** 768 brain MRI scans
- **Epochs:** 25
- **Batch Size:** 4 (effective: 16)
- **Learning Rate:** 1e-05
- **Optimizer:** AdamW with cosine learning-rate schedule
- **Loss Function:** weighted cross-entropy (for class imbalance)
- **Training Time:** 50 minutes

## Intended Use

This model is designed for research purposes in medical image classification, specifically for:
- Alzheimer's disease detection from brain MRI scans
- Multi-stage classification of cognitive decline
- Research and educational purposes in medical AI

**Note:** This model is NOT intended for clinical diagnosis. Always consult qualified medical professionals.
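The weighted cross-entropy mentioned under Training Details needs per-class weights. The exact weighting scheme used for this model is not documented; a minimal sketch of one common choice, inverse-frequency weighting, is shown below (the label counts are illustrative, not the real dataset's):

```python
from collections import Counter

def class_weights(labels, num_classes):
    """Inverse-frequency weights: weight_c = N / (num_classes * count_c),
    so rarer classes contribute more to the loss."""
    counts = Counter(labels)
    total = len(labels)
    return [total / (num_classes * counts[c]) for c in range(num_classes)]

# Toy imbalanced label list (4 classes; class 1 is the rarest)
labels = [0] * 50 + [1] * 5 + [2] * 25 + [3] * 20
weights = class_weights(labels, 4)
# Class 1 receives the largest weight
```

In a PyTorch training loop these weights would typically be passed as a tensor to `torch.nn.CrossEntropyLoss(weight=...)`.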
## How to Use

```python
from transformers import AutoImageProcessor, ViTForImageClassification
from PIL import Image
import torch

# Load model and processor
processor = AutoImageProcessor.from_pretrained("NotIshaan/vit-large-alzheimer-6layers-75M-final")
model = ViTForImageClassification.from_pretrained("NotIshaan/vit-large-alzheimer-6layers-75M-final")
model.eval()

# Load and preprocess image (grayscale MRI scans are converted to RGB)
image = Image.open("brain_mri.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

# Make prediction
with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits
    predicted_class = logits.argmax(-1).item()
    confidence = torch.softmax(logits, dim=1)[0][predicted_class].item()

print(f"Predicted class: {model.config.id2label[predicted_class]}")
print(f"Confidence: {confidence:.2%}")
```

## Label Mapping

{0: '0', 1: '1', 2: '2', 3: '3'}

## Limitations

- Trained on a specific brain MRI dataset; may not generalize to all MRI protocols or scanners
- Class imbalance in the training data may affect minority-class performance
- Expects grayscale MRI images (converted to RGB internally)
- Input image size: 224x224 pixels

## Citation

If you use this model, please cite:

```bibtex
@misc{vit-large-alzheimer,
  author = {Your Name},
  title = {ViT-Large for Alzheimer's Detection},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/NotIshaan/vit-large-alzheimer-6layers-75M-final}}
}
```

## Acknowledgments

- Base model: [google/vit-large-patch16-224](https://huggingface.co/google/vit-large-patch16-224)
- Framework: Hugging Face Transformers
- Original ViT paper: [An Image is Worth 16x16 Words](https://arxiv.org/abs/2010.11929)
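## Appendix: How the Confidence Value Is Computed

The confidence printed in the usage example is simply the softmax probability of the top logit. A dependency-free sketch of that computation (the logit values are hypothetical, not model output):

```python
import math

def softmax(logits):
    """Convert raw logits to probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical 4-class logits
logits = [2.0, 0.5, -1.0, 0.2]
probs = softmax(logits)
pred = max(range(len(probs)), key=probs.__getitem__)
confidence = probs[pred]
```

Note that softmax probabilities reflect the model's relative preference among classes, not calibrated clinical certainty.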