Consensus Entropy: Harnessing Multi-VLM Agreement for Self-Verifying and Self-Improving OCR Paper • 2504.11101 • Published Apr 15, 2025 • 1
DiRL: An Efficient Post-Training Framework for Diffusion Language Models Paper • 2512.22234 • Published 15 days ago • 19
Beyond Real: Imaginary Extension of Rotary Position Embeddings for Long-Context LLMs Paper • 2512.07525 • Published 30 days ago • 57
GeoVista: Web-Augmented Agentic Visual Reasoning for Geolocalization Paper • 2511.15705 • Published Nov 19, 2025 • 93
Article The Heterogeneous Feature of RoPE-based Attention in Long-Context LLMs Nov 15, 2025 • 12
MPJudge: Towards Perceptual Assessment of Music-Induced Paintings Paper • 2511.07137 • Published Nov 10, 2025 • 5
Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm Paper • 2511.04570 • Published Nov 6, 2025 • 211
Sel3DCraft: Interactive Visual Prompts for User-Friendly Text-to-3D Generation Paper • 2508.00428 • Published Aug 1, 2025 • 3
IFDECORATOR: Wrapping Instruction Following Reinforcement Learning with Verifiable Rewards Paper • 2508.04632 • Published Aug 6, 2025 • 2
Beyond Homogeneous Attention: Memory-Efficient LLMs via Fourier-Approximated KV Cache Paper • 2506.11886 • Published Jun 13, 2025 • 20
LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs Paper • 2506.14429 • Published Jun 17, 2025 • 44
TextCenGen: Attention-Guided Text-Centric Background Adaptation for Text-to-Image Generation Paper • 2404.11824 • Published Apr 18, 2024 • 1