---
language: en
license: apache-2.0
tags:
- vision
- image-classification
- medical
- alzheimer
- brain-mri
- vit
- transfer-learning
datasets:
- custom-brain-mri-dataset
metrics:
- accuracy
- f1
model-index:
- name: NotIshaan/vit-large-alzheimer-6layers-75M-final
  results:
  - task:
      type: image-classification
    metrics:
    - type: accuracy
      value: 0.9453
    - type: f1
      value: 0.9451
base_model:
- google/vit-large-patch16-224
---

# ViT-Large for Alzheimer's Detection from Brain MRI Scans

## Model Description

This model is a fine-tuned version of [google/vit-large-patch16-224](https://huggingface.co/google/vit-large-patch16-224) for multiclass classification of brain MRI scans to detect Alzheimer's disease stages.

**Key Features:**
- Base Model: Vision Transformer Large (304M parameters)
- Fine-tuning Strategy: last 6 transformer layers + classifier
- Class Imbalance Handling: weighted cross-entropy loss
- Data Augmentation: rotation, flip, brightness/contrast adjustments

## Model Performance

| Metric | Value |
|--------|-------|
| Accuracy | 0.9453 |
| Precision | 0.9454 |
| Recall | 0.9453 |
| F1 Score | 0.9451 |

### Per-Class F1 Scores

- Class 0: 0.8807
- Class 1: 1.0000
- Class 2: 0.9688
- Class 3: 0.9363

## Training Details

- **Training Data:** 4,352 brain MRI scans
- **Validation Data:** 768 brain MRI scans
- **Epochs:** 25
- **Batch Size:** 4 (effective: 16)
- **Learning Rate:** 1e-05
- **Optimizer:** AdamW with cosine learning-rate schedule
- **Loss Function:** weighted cross-entropy (for class imbalance)
- **Training Time:** 50 minutes

## Intended Use

This model is designed for research purposes in medical image classification, specifically for:
- Alzheimer's disease detection from brain MRI scans
- Multi-stage classification of cognitive decline
- Research and educational purposes in medical AI

**Note:** This model is NOT intended for clinical diagnosis. Always consult qualified medical professionals.
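The weighted cross-entropy mentioned under Training Details needs per-class weights. The exact weighting scheme used for this model is not documented; a minimal sketch of one common choice, inverse-frequency weighting, is shown below (the label counts are illustrative, not the real dataset's):

```python
from collections import Counter

def class_weights(labels, num_classes):
    """Inverse-frequency weights: weight_c = N / (num_classes * count_c),
    so rarer classes contribute more to the loss."""
    counts = Counter(labels)
    total = len(labels)
    return [total / (num_classes * counts[c]) for c in range(num_classes)]

# Toy imbalanced label list (4 classes; class 1 is the rarest)
labels = [0] * 50 + [1] * 5 + [2] * 25 + [3] * 20
weights = class_weights(labels, 4)
# Class 1 receives the largest weight
```

In a PyTorch training loop these weights would typically be passed as a tensor to `torch.nn.CrossEntropyLoss(weight=...)`.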
## How to Use

```python
from transformers import AutoImageProcessor, ViTForImageClassification
from PIL import Image
import torch

# Load model and processor
processor = AutoImageProcessor.from_pretrained("NotIshaan/vit-large-alzheimer-6layers-75M-final")
model = ViTForImageClassification.from_pretrained("NotIshaan/vit-large-alzheimer-6layers-75M-final")
model.eval()

# Load and preprocess image (grayscale MRI scans are converted to RGB)
image = Image.open("brain_mri.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

# Make prediction
with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits
    predicted_class = logits.argmax(-1).item()
    confidence = torch.softmax(logits, dim=1)[0][predicted_class].item()

print(f"Predicted class: {model.config.id2label[predicted_class]}")
print(f"Confidence: {confidence:.2%}")
```

## Label Mapping

{0: '0', 1: '1', 2: '2', 3: '3'}

## Limitations

- Trained on a specific brain MRI dataset; may not generalize to all MRI protocols or scanners
- Class imbalance in the training data may affect minority-class performance
- Expects grayscale MRI images (converted to RGB internally)
- Input image size: 224x224 pixels

## Citation

If you use this model, please cite:

```bibtex
@misc{vit-large-alzheimer,
  author = {Your Name},
  title = {ViT-Large for Alzheimer's Detection},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/NotIshaan/vit-large-alzheimer-6layers-75M-final}}
}
```

## Acknowledgments

- Base model: [google/vit-large-patch16-224](https://huggingface.co/google/vit-large-patch16-224)
- Framework: Hugging Face Transformers
- Original ViT paper: [An Image is Worth 16x16 Words](https://arxiv.org/abs/2010.11929)
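## Appendix: How the Confidence Value Is Computed

The confidence printed in the usage example is simply the softmax probability of the top logit. A dependency-free sketch of that computation (the logit values are hypothetical, not model output):

```python
import math

def softmax(logits):
    """Convert raw logits to probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical 4-class logits
logits = [2.0, 0.5, -1.0, 0.2]
probs = softmax(logits)
pred = max(range(len(probs)), key=probs.__getitem__)
confidence = probs[pred]
```

Note that softmax probabilities reflect the model's relative preference among classes, not calibrated clinical certainty.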