OneThinker: All-in-one Reasoning Model for Image and Video • Paper • 2512.03043 • Published 7 days ago • 29
Architecture Decoupling Is Not All You Need For Unified Multimodal Model • Paper • 2511.22663 • Published 12 days ago • 28
Think with 3D: Geometric Imagination Grounded Spatial Reasoning from Limited Views • Paper • 2510.18632 • Published Oct 21 • 21
MANZANO: A Simple and Scalable Unified Multimodal Model with a Hybrid Vision Tokenizer • Paper • 2509.16197 • Published Sep 19 • 56