MobileNetV4 -- Universal Models for the Mobile Ecosystem
A MobileNetV4 image classification model trained on the eu-common dataset, which covers common European bird species.
The species list is derived from the Collins bird guide [^1].
[^1]: Svensson, L., Mullarney, K., & Zetterström, D. (2022). Collins bird guide (3rd ed.). London, England: William Collins.
Model Type: Image classification and detection backbone
Model Stats:
Dataset: eu-common (707 classes)
Papers: MobileNetV4 -- Universal Models for the Mobile Ecosystem: https://arxiv.org/abs/2404.10518
Image classification:
import birder
from birder.inference.classification import infer_image
(net, model_info) = birder.load_pretrained_model("mobilenet_v4_l_eu-common", inference=True)
# Get the image size the model was trained on
size = birder.get_size_from_signature(model_info.signature)
# Create an inference transform
transform = birder.classification_transform(size, model_info.rgb_stats)
image = "path/to/image.jpeg" # or a PIL image, must be loaded in RGB format
(out, _) = infer_image(net, image, transform)
# out is a NumPy array with shape of (1, 707), representing class probabilities.
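A common next step is to turn the probability row into a short ranked list of species. The following is a minimal sketch of that step, not part of the birder API: `top_k` and the `class_names` list are hypothetical, and the toy values stand in for `out[0]`. It uses only the standard library so that it runs without a model, and it should also accept a 1-D NumPy row.

```python
def top_k(probs, class_names, k=5):
    """Return the k (class_name, probability) pairs with the highest probability."""
    if len(probs) != len(class_names):
        raise ValueError("probs and class_names must have the same length")
    # Indices sorted by descending probability (stable, so ties keep index order)
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(class_names[i], float(probs[i])) for i in order[:k]]


# Toy stand-in for out[0]; a real row from this model would have 707 entries
probs = [0.05, 0.70, 0.20, 0.05]
class_names = ["species_a", "species_b", "species_c", "species_d"]  # hypothetical labels

predictions = top_k(probs, class_names, k=2)
print(predictions)
```

With a real prediction, `probs` would be `out[0]` and `class_names` would need to come from the model's own class mapping, in the same index order as the output.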
Image embeddings:
import birder
from birder.inference.classification import infer_image
(net, model_info) = birder.load_pretrained_model("mobilenet_v4_l_eu-common", inference=True)
# Get the image size the model was trained on
size = birder.get_size_from_signature(model_info.signature)
# Create an inference transform
transform = birder.classification_transform(size, model_info.rgb_stats)
image = "path/to/image.jpeg" # or a PIL image, must be loaded in RGB format
(out, embedding) = infer_image(net, image, transform, return_embedding=True)
# embedding is a NumPy array with shape of (1, 1280)
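Embeddings such as these are often compared with cosine similarity, for example to find visually similar images. The sketch below illustrates that comparison under stated assumptions: `cosine_similarity` is a hypothetical helper, not a birder function, and the short toy vectors stand in for rows like `embedding[0]` (which would have 1280 entries).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy stand-ins for embedding rows
emb_a = [1.0, 0.0, 2.0]
emb_b = [2.0, 0.0, 4.0]  # same direction as emb_a
emb_c = [0.0, 3.0, 0.0]  # orthogonal to emb_a

sim_ab = cosine_similarity(emb_a, emb_b)
sim_ac = cosine_similarity(emb_a, emb_c)
print(sim_ab, sim_ac)
```

Values close to 1 indicate embeddings pointing in nearly the same direction, while values near 0 indicate unrelated ones. For large collections, a vectorized NumPy version would be the more practical choice.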
Detection feature map:
from PIL import Image
import birder
(net, model_info) = birder.load_pretrained_model("mobilenet_v4_l_eu-common", inference=True)
# Get the image size the model was trained on
size = birder.get_size_from_signature(model_info.signature)
# Create an inference transform
transform = birder.classification_transform(size, model_info.rgb_stats)
image = Image.open("path/to/image.jpeg").convert("RGB")  # ensure the image is in RGB mode
features = net.detection_features(transform(image).unsqueeze(0))
# features is a dict (stage name -> torch.Tensor)
print([(k, v.size()) for k, v in features.items()])
# Output example:
# [('stage1', torch.Size([1, 48, 96, 96])),
# ('stage2', torch.Size([1, 96, 48, 48])),
# ('stage3', torch.Size([1, 192, 24, 24])),
# ('stage4', torch.Size([1, 512, 12, 12]))]
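When wiring these feature maps into a detection neck, it helps to know each stage's channel count and how much it is downsampled relative to the first stage. The sketch below derives both from the shapes in the output example above. `summarize_pyramid` is a hypothetical helper, and the shapes are hard-coded so the snippet runs without a model.

```python
# Shapes copied from the output example: (batch, channels, height, width)
feature_shapes = {
    "stage1": (1, 48, 96, 96),
    "stage2": (1, 96, 48, 48),
    "stage3": (1, 192, 24, 24),
    "stage4": (1, 512, 12, 12),
}


def summarize_pyramid(shapes):
    """Return per-stage channels and downsampling relative to the first stage."""
    stages = list(shapes.items())
    (_, _, base_h, base_w) = stages[0][1]
    summary = []
    for name, (_, channels, height, width) in stages:
        summary.append(
            {
                "stage": name,
                "channels": channels,
                # Integer factor by which this stage is smaller than the first one
                "relative_scale": (base_h // height, base_w // width),
            }
        )
    return summary


pyramid = summarize_pyramid(feature_shapes)
for entry in pyramid:
    print(entry)
```

With real tensors, the same dictionary can be built as `{k: tuple(v.size()) for k, v in features.items()}`. In this example each stage halves the spatial resolution of the previous one.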
Citation:
@misc{qin2024mobilenetv4universalmodels,
title={MobileNetV4 -- Universal Models for the Mobile Ecosystem},
author={Danfeng Qin and Chas Leichner and Manolis Delakis and Marco Fornoni and Shixin Luo and Fan Yang and Weijun Wang and Colby Banbury and Chengxi Ye and Berkin Akin and Vaibhav Aggarwal and Tenghui Zhu and Daniele Moro and Andrew Howard},
year={2024},
eprint={2404.10518},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2404.10518},
}