dronefreak/human-action-classification-stanford40

@@ -48,9 +48,9 @@ model-index:
 # Human Action Classification v2.0
-State-of-the-art human action recognition model trained on Stanford 40 Actions dataset.
-![Demo](looking_through_a_telescope.jpg)
 ## Model Description
@@ -59,7 +59,6 @@ This model performs real-time human action classification from images, recognizi
 - **Developed by:** Saumya Kumaar Saksena ([@dronefreak](https://github.com/dronefreak))
 - **Model type:** Image Classification (Action Recognition)
 - **Language(s):** English (action labels)
-- **License:** MIT
 - **Finetuned from:** ImageNet pretrained ResNet34
 ## Key Features
@@ -249,6 +248,8 @@ Full metrics available in [metrics.json](metrics.json)
 - **Classes:** 40 human action categories
 - **Image resolution:** 224×224 (resized)
 ### Training Procedure
 #### Preprocessing
@@ -260,7 +261,7 @@ transforms.Compose([
     transforms.RandomHorizontalFlip(),
     transforms.ColorJitter(brightness=0.2, contrast=0.2),
     transforms.ToTensor(),
-    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])
 ])
 ```
@@ -276,16 +277,12 @@ transforms.Compose([
 - **Augmentation:** Mixup (α=0.4)
 - **Scheduler:** CosineAnnealingLR
-#### Hardware
-- **GPU:** NVIDIA GTX 1050 Ti (4GB)
-- **Training time:** ~4 hours
 - **Framework:** PyTorch 2.0+
-### Two-Stage Training Strategy
-1. **Stage 1 (20 epochs):** Freeze backbone, train classifier head
-2. **Stage 2 (180 epochs):** Fine-tune entire network with Mixup
 This approach reduced overfitting from 99% train / 62% test → 82% train / 86% test.
@@ -305,11 +302,6 @@ print(f"Accuracy: {metrics['accuracy']:.2%}")
 print(f"F1-Score: {metrics['f1_macro']:.4f}")
 ```
-## Environmental Impact
-- **Hardware:** 1× NVIDIA GTX 1050 Ti
-- **Training time:** 4 hours
-- **Estimated CO2 emissions:** ~0.5 kg CO2eq
 ## Limitations

 # Human Action Classification v2.0
+State-of-the-art human action recognition model trained on Stanford 40 Actions dataset. GitHub project: [human-action-classification](https://github.com/dronefreak/human-action-classification)
+![Demo](demo_result.jpg)
 ## Model Description
 - **Developed by:** Saumya Kumaar Saksena ([@dronefreak](https://github.com/dronefreak))
 - **Model type:** Image Classification (Action Recognition)
 - **Language(s):** English (action labels)
 - **Finetuned from:** ImageNet pretrained ResNet34
 ## Key Features
 - **Classes:** 40 human action categories
 - **Image resolution:** 224×224 (resized)
+Note: the dataset's proposed train-test split is unconventional, so I created a custom 80/20 train-test split, which is standard machine learning practice.
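
A minimal sketch of how an 80/20 split like the one noted above could be built. It assumes the split is stratified per class and uses a fixed seed, neither of which is stated in this model card; `stratified_split`, the file names, and the three sample labels are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(samples, test_fraction=0.2, seed=42):
    """Split (path, label) pairs into train/test lists, class by class."""
    by_label = defaultdict(list)
    for path, label in samples:
        by_label[label].append((path, label))

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for label in sorted(by_label):
        items = by_label[label]
        rng.shuffle(items)
        n_test = round(len(items) * test_fraction)
        test.extend(items[:n_test])
        train.extend(items[n_test:])
    return train, test


# Hypothetical toy data: 3 classes with 10 images each.
samples = [
    (f"{label}_{i:03d}.jpg", label)
    for label in ("applauding", "climbing", "cooking")
    for i in range(10)
]
train, test = stratified_split(samples)
print(len(train), len(test))
```

Splitting per class keeps each action's share of the test set close to 20%, which matters when some classes have fewer images than others.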
 ### Training Procedure
 #### Preprocessing
     transforms.RandomHorizontalFlip(),
     transforms.ColorJitter(brightness=0.2, contrast=0.2),
     transforms.ToTensor(),
+    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])
 ])
 ```
 - **Augmentation:** Mixup (α=0.4)
 - **Scheduler:** CosineAnnealingLR
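
The two settings above can be illustrated with a framework-free sketch. It is written in plain Python rather than PyTorch so that it runs anywhere, and it is not the training code behind this model: `mixup`, `cosine_lr`, the toy vectors, and the learning rate of 0.001 are assumptions for illustration. Mixup draws a blending weight from Beta(α, α) and mixes both the inputs and their one-hot labels; cosine annealing lowers the learning rate along half a cosine wave.

```python
import math
import random


def mixup(x_a, y_a, x_b, y_b, alpha=0.4, rng=random):
    """Blend two samples and their one-hot labels with lam ~ Beta(alpha, alpha)."""
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y, lam


def cosine_lr(step, total_steps, lr_max, lr_min=0.0):
    """Closed-form cosine annealing from lr_max down to lr_min."""
    cosine = math.cos(math.pi * step / total_steps)
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + cosine)


rng = random.Random(0)
# Two toy "images" (2 features each) from two different classes.
x, y, lam = mixup([0.0, 1.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0], rng=rng)
print(f"lam={lam:.3f} mixed_label={y}")

# Hypothetical peak learning rate of 0.001 over 100 steps.
for step in (0, 50, 100):
    print(step, cosine_lr(step, 100, lr_max=0.001))
```

With α=0.4 the Beta distribution is U-shaped, so most mixed samples stay close to one of the two originals and only a few are near-even blends.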
+#### Training Hardware
+- **GPU:** NVIDIA RTX 4070 Super (12GB)
+- **Training time:** ~0.5 hours
 - **Framework:** PyTorch 2.0+
 This approach reduced overfitting from 99% train / 62% test → 82% train / 86% test.
 print(f"F1-Score: {metrics['f1_macro']:.4f}")
 ```
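
For readers unfamiliar with the `f1_macro` key printed above, here is a minimal sketch of macro-averaged F1: the per-class F1 scores are averaged with equal weight per class. How `metrics.json` was actually produced is not shown here, so `macro_f1` and the toy labels are hypothetical.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical predictions for two action classes.
y_true = ["cooking", "cooking", "climbing", "climbing"]
y_pred = ["cooking", "climbing", "climbing", "climbing"]
score = macro_f1(y_true, y_pred)
print(f"F1-Score: {score:.4f}")
```

Because every class counts equally, a model that does poorly on a rare action is penalized as much as one that does poorly on a common action.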
 ## Limitations