X2I: Seamless Integration of Multimodal Understanding into Diffusion Transformer via Attention Distillation
Paper: arXiv:2503.06134
If you find our work helpful, please consider citing our paper and giving the project a star.
@misc{ma2025x2i,
title={X2I: Seamless Integration of Multimodal Understanding into Diffusion Transformer via Attention Distillation},
author={Jian Ma and Qirong Peng and Xu Guo and Chen Chen and Haonan Lu and Zhenyu Yang},
year={2025},
eprint={2503.06134},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
This model is released under the Apache 2.0 License.
Base model: OpenGVLab/InternVL2_5-1B