---
tags:
- model_hub_mixin
- pytorch_model_hub_mixin
license: mit
---

## A Reality Check of Vision-Language Pre-training in Radiology: Have We Progressed Using Text?

- Code: [DLILP](https://github.com/jusiro/DLILP)
- Paper: [IPMI 2025](https://link.springer.com/chapter/10.1007/978-3-031-96625-5_20) - [ArXiv](https://arxiv.org/abs/2504.05227)
- Docs: [Documentation](https://github.com/jusiro/DLILP)
- Tutorial: [Notebook](https://colab.research.google.com/drive/1_8Ysd8mCKuLX_Q86e-7pOAHFbSR9F4aZ?usp=sharing)

### About "CONVIRT" weights:

- Pre-trained using a vanilla CLIP contrastive loss, a setup very similar to the pre-training proposed earlier in the [CONVIRT](https://arxiv.org/abs/2010.00747) paper (2020).
- Pre-trained on MIMIC.

If you find this repository useful, please consider citing these papers:

```
@inproceedings{convirt,
    author = {Yuhao Zhang and others},
    booktitle = {MLHC},
    pages = {1-24},
    title = {Contrastive Learning of Medical Visual Representations from Paired Images and Text},
    year = {2022},
}

@inproceedings{dlilp,
    title={A Reality Check of Vision-Language Pre-training in Radiology: Have We Progressed Using Text?},
    author={Julio Silva-Rodríguez and Jose Dolz and Ismail {Ben Ayed}},
    booktitle={Information Processing in Medical Imaging (IPMI)},
    year={2025}
}
```
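
For readers unfamiliar with the "vanilla CLIP contrastive loss" mentioned in the weights description, the following is a minimal, framework-free sketch of a symmetric image-text contrastive objective. It is not taken from the DLILP code base: the function names, the list-based inputs, and the `temperature=0.07` default are illustrative assumptions, and the actual training code may differ (for example, by using PyTorch tensors and a learnable temperature).

```python
import math


def l2_normalize(vec):
    """Scale a non-zero vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def diagonal_cross_entropy(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def clip_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired embeddings.

    image_embs[i] and text_embs[i] are assumed to come from the same
    image-report pair; all other combinations act as negatives.
    The temperature default is an assumption, not a value from the source.
    """
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    # Cosine similarity between every image and every text, scaled by temperature.
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in txts]
        for img in imgs
    ]
    logits_t = [list(col) for col in zip(*logits)]  # text-to-image direction
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(logits_t))


if __name__ == "__main__":
    images = [[1.0, 0.0], [0.0, 1.0]]
    matched_texts = [[1.0, 0.0], [0.0, 1.0]]
    swapped_texts = [[0.0, 1.0], [1.0, 0.0]]
    print("matched pairs:", clip_contrastive_loss(images, matched_texts))
    print("swapped pairs:", clip_contrastive_loss(images, swapped_texts))
```

The loss is low when each image embedding is closest to its own report embedding and high when pairs are mismatched, which is the signal that drives this style of pre-training.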