Uni-DPO

psp-dada's Collections

updated Feb 16

[ICLR 2026] Official repository of "Uni-DPO: A Unified Paradigm for Dynamic Preference Optimization of LLMs". Repo: https://github.com/pspdada/Uni-DPO

Uni-DPO: A Unified Paradigm for Dynamic Preference Optimization of LLMs

Paper • 2506.10054 • Published Feb 11 • 3
psp-dada/Uni-DPO

Preview • Updated Feb 22 • 81 • 1
psp-dada/Qwen2.5-7B-Uni-DPO

Text Generation • 8B • Updated Feb 22 • 4 • 1
psp-dada/Llama-3-8B-Instruct-Uni-DPO-v2-GPT-4o

Text Generation • 8B • Updated Feb 22 • 6 • 1
psp-dada/Llama-3-8B-Instruct-Uni-DPO-v2-ArmoRM

Text Generation • 8B • Updated Feb 22 • 3 • 1
psp-dada/Llama-3-8B-Base-SFT-Uni-DPO

Text Generation • 8B • Updated Feb 22 • 6 • 1
psp-dada/Llama-3-8B-Base-SFT-Uni-DPO-v2-Qwen

Text Generation • 8B • Updated Feb 22 • 3 • 1
psp-dada/Gemma2-9B-IT-Uni-DPO

Text Generation • 9B • Updated Feb 22 • 2 • 1
psp-dada/Llama-3-8B-Base-SFT-Uni-DPO-v2-GPT-4

Text Generation • 8B • Updated Feb 22 • 3 • 1
psp-dada/Llama-3-8B-Instruct-Uni-DPO

Text Generation • 8B • Updated Feb 22 • 3 • 1
psp-dada/Qwen2.5-Math-7B-Uni-DPO

Text Generation • 8B • Updated Feb 22 • 8 • 1
