Future Optical Flow Prediction Improves Robot Control & Video Generation • Paper • 2601.10781 • Published Jan 15 • 19
Active Video Perception: Iterative Evidence Seeking for Agentic Long Video Understanding • Paper • 2512.05774 • Published Dec 5, 2025 • 7
MCPEval: Automatic MCP-based Deep Evaluation for AI Agent Models • Paper • 2507.12806 • Published Jul 17, 2025 • 21
BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset • Paper • 2505.09568 • Published May 14, 2025 • 99
xGen-MM-Vid (BLIP-3-Video): You Only Need 32 Tokens to Represent a Video Even in VLMs • Paper • 2410.16267 • Published Oct 21, 2024 • 18
xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations • Paper • 2408.12590 • Published Aug 22, 2024 • 35
XGen-MM-1 models and datasets • Collection • A collection of all XGen-MM (Foundation LMM) models! • 18 items • Updated Nov 5, 2025 • 40
Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems • Paper • 2407.01370 • Published Jul 1, 2024 • 89