---
library_name: transformers
license: apache-2.0
datasets:
- wmt/wmt14
---

# Quick start guide
To use this model, follow the snippet below:
```python
from transformers import AutoModelForMaskedLM

# model_config_overrides = {}  # Use this to optionally override config parameters
model = AutoModelForMaskedLM.from_pretrained(
    "kuleshov-group/e2d2-wmt",
    trust_remote_code=True,
    # **model_config_overrides,
)
```

# Model details
- Trained from scratch on [`wmt/wmt14`](https://huggingface.co/datasets/wmt/wmt14)
- Qwen3 tokenizer: [`Qwen/Qwen3-0.6B-Base`](https://huggingface.co/Qwen/Qwen3-0.6B-Base)
- Block diffusion parameterization, with block size 4
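
As a rough illustration of what a block size of 4 means: in block diffusion, a sequence is generally divided into fixed-size blocks that are generated one after another, with the tokens inside each block denoised together. The sketch below shows only the partitioning step. The helper `split_into_blocks` and the example token ids are hypothetical and are not part of this model's code.

```python
from typing import List, Sequence

BLOCK_SIZE = 4  # block size stated in this model card


def split_into_blocks(
    token_ids: Sequence[int], block_size: int = BLOCK_SIZE
) -> List[List[int]]:
    """Partition a token sequence into consecutive blocks.

    The final block may be shorter when the sequence length is not a
    multiple of block_size.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [
        list(token_ids[i:i + block_size])
        for i in range(0, len(token_ids), block_size)
    ]


# Hypothetical token ids, used only for illustration.
example_ids = list(range(10))
blocks = split_into_blocks(example_ids)
print(blocks)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Under this assumption, a 10-token sequence yields three blocks, the last of which holds the 2 leftover tokens.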

See the project site for more details and links to the paper and code: https://m-arriola.com/e2d2/

# Citation

```
@inproceedings{
arriola2025e2d2,
title={Encoder-Decoder Diffusion Language Models for Efficient Training and Inference},
author={Marianne Arriola and Yair Schiff and Hao Phung and Aaron Gokaslan and Volodymyr Kuleshov},
booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
year={2025},
url={https://arxiv.org/abs/2510.22852}
}
```