---
library_name: transformers
license: apache-2.0
datasets:
- wmt/wmt14
---
## Quick start guide

To use this model, run the snippet below:
```python
from transformers import AutoModelForMaskedLM

# model_config_overrides = {}  # Use this to optionally override config parameters
model = AutoModelForMaskedLM.from_pretrained(
    "kuleshov-group/e2d2-wmt",
    trust_remote_code=True,
    # **model_config_overrides,
)
```
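As a quick sanity check, the sketch below additionally loads the Qwen3 tokenizer listed under "Model details" and runs a single forward pass. This is an assumption on our part, not part of the official usage: the tokenizer checkpoint choice and the expectation of standard masked-LM outputs (`logits`) are ours, and the actual diffusion generation loop is defined by the model's remote code.

```python
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Assumption: the Qwen3 tokenizer named on this card can be loaded directly
# from the Qwen/Qwen3-0.6B-Base checkpoint.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B-Base")
model = AutoModelForMaskedLM.from_pretrained(
    "kuleshov-group/e2d2-wmt",
    trust_remote_code=True,
)

# A single forward pass; the block-diffusion generation loop itself is
# provided by the model's remote code (see the project site below).
inputs = tokenizer("The weather is nice today.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)  # assumed shape: (batch, sequence_length, vocab_size)
```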
## Model details

- Trained from scratch on [wmt/wmt14](https://huggingface.co/datasets/wmt/wmt14)
- Qwen3 tokenizer: [Qwen/Qwen3-0.6B-Base](https://huggingface.co/Qwen/Qwen3-0.6B-Base)
- Block diffusion parameterization, with block size 4 (see the conceptual sketch below)

See the project site for more details and links to the paper and code: https://m-arriola.com/e2d2/
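To make the block-diffusion idea concrete, here is a minimal, self-contained sketch of block-wise iterative unmasking. It is illustrative only: all names (`MASK_ID`, `toy_denoiser`, the step counts) are hypothetical placeholders, and the actual E2D2 sampler is implemented in the model's remote code and described in the paper.

```python
import torch

# Hypothetical toy setup; NOT the repo's actual sampler.
VOCAB = 100        # toy vocabulary size
MASK_ID = VOCAB    # mask token id kept outside the toy vocab
SEQ_LEN = 16       # length of the toy sequence
BLOCK = 4          # block size, matching the value stated on this card

def toy_denoiser(tokens: torch.Tensor) -> torch.Tensor:
    """Stand-in for the trained model: random per-position logits over the vocab."""
    return torch.randn(tokens.shape[0], VOCAB)

tokens = torch.full((SEQ_LEN,), MASK_ID)
for start in range(0, SEQ_LEN, BLOCK):      # decode one block at a time
    block = slice(start, start + BLOCK)
    for _ in range(BLOCK):                  # a few unmasking steps per block
        logits = toy_denoiser(tokens)
        probs = logits[block].softmax(dim=-1)
        conf, pred = probs.max(dim=-1)
        masked = tokens[block] == MASK_ID
        if not masked.any():
            break
        conf[~masked] = -1.0                # only consider still-masked slots
        pos = int(conf.argmax())
        tokens[start + pos] = pred[pos]     # commit the most confident token
print(tokens)
```

Each block is fully denoised before the sampler moves to the next one, which is what distinguishes block diffusion from fully parallel (whole-sequence) diffusion and from purely autoregressive, one-token-at-a-time decoding.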
## Citation
```bibtex
@inproceedings{arriola2025e2d2,
  title={Encoder-Decoder Diffusion Language Models for Efficient Training and Inference},
  author={Marianne Arriola and Yair Schiff and Hao Phung and Aaron Gokaslan and Volodymyr Kuleshov},
  booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
  year={2025},
  url={https://arxiv.org/abs/2510.22852}
}
```