Xin Zhou

LMD0311

AI & ML interests

None yet

Recent Activity

upvoted a paper 13 days ago

OPD-Evolver: Cultivating Holistic Agent Evolver via On-Policy Distillation

Paper • 2606.17628 • Published 14 days ago • 27

upvoted a paper 20 days ago

Next Forcing: Causal World Modeling with Multi-Chunk Prediction

Paper • 2606.11187 • Published 21 days ago • 6

upvoted a paper about 1 month ago

Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments

Paper • 2605.30280 • Published May 28 • 146

authored a paper about 2 months ago

HERMES++: Toward a Unified Driving World Model for 3D Scene Understanding and Generation

Paper • 2604.28196 • Published Apr 30 • 74

upvoted a paper about 2 months ago

HERMES++: Toward a Unified Driving World Model for 3D Scene Understanding and Generation

Paper • 2604.28196 • Published Apr 30 • 74

submitted a paper to Daily Papers about 2 months ago

HERMES++: Toward a Unified Driving World Model for 3D Scene Understanding and Generation

Paper • 2604.28196 • Published Apr 30 • 74

liked a model 2 months ago

H-EmbodVis/HERMESV2

Image-to-3D • Updated Apr 30 • 1

published a model 2 months ago

H-EmbodVis/HERMESV2

Image-to-3D • Updated Apr 30 • 1

updated a model 2 months ago

H-EmbodVis/HERMESV2

Image-to-3D • Updated Apr 30 • 1

liked a dataset 2 months ago

KlingTeam/HM-World

Updated Apr 22 • 854 • 7

authored a paper 3 months ago

When Numbers Speak: Aligning Textual Numerals and Visual Instances in Text-to-Video Diffusion Models

Paper • 2604.08546 • Published Apr 9 • 116

upvoted a paper 3 months ago

When Numbers Speak: Aligning Textual Numerals and Visual Instances in Text-to-Video Diffusion Models

Paper • 2604.08546 • Published Apr 9 • 116

authored 4 papers 3 months ago

Seeing the Future, Perceiving the Future: A Unified Driving World Model for Future Generation and Perception

Paper • 2503.13587 • Published Mar 17, 2025

More Than Generation: Unifying Generation and Depth Estimation via Text-to-Image Diffusion Models

Paper • 2510.23574 • Published Oct 27, 2025

Less is Enough: Training-Free Video Diffusion Acceleration via Runtime-Adaptive Caching

Paper • 2507.02860 • Published Jul 3, 2025

Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models

Paper • 2603.25716 • Published Mar 26 • 157

upvoted 3 papers 3 months ago

Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models

Paper • 2603.25716 • Published Mar 26 • 157

Generation Models Know Space: Unleashing Implicit 3D Priors for Scene Understanding

Paper • 2603.19235 • Published Mar 19 • 95

Video Streaming Thinking: VideoLLMs Can Watch and Think Simultaneously

Paper • 2603.12262 • Published Mar 12 • 31

upvoted a paper 7 months ago

Memory in the Age of AI Agents

Paper • 2512.13564 • Published Dec 15, 2025 • 160
