Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation Paper • 2511.14993 • Published Nov 19, 2025 • 227
A Style is Worth One Code: Unlocking Code-to-Style Image Generation with Discrete Style Space Paper • 2511.10555 • Published Nov 13, 2025 • 60
Paper2Video: Automatic Video Generation from Scientific Papers Paper • 2510.05096 • Published Oct 6, 2025 • 118
FastVLM: Efficient Vision Encoding for Vision Language Models Paper • 2412.13303 • Published Dec 17, 2024 • 72
VideoRoPE: What Makes for Good Video Rotary Position Embedding? Paper • 2502.05173 • Published Feb 7, 2025 • 65
FreeFlux: Understanding and Exploiting Layer-Specific Roles in RoPE-Based MMDiT for Versatile Image Editing Paper • 2503.16153 • Published Mar 20, 2025 • 2
Beyond Fixed: Variable-Length Denoising for Diffusion Large Language Models Paper • 2508.00819 • Published Aug 1, 2025 • 62
NeuralSVG: An Implicit Representation for Text-to-Vector Generation Paper • 2501.03992 • Published Jan 7, 2025 • 2
Rendering-Aware Reinforcement Learning for Vector Graphics Generation Paper • 2505.20793 • Published May 27, 2025 • 13
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model Paper • 2504.08685 • Published Apr 11, 2025 • 130
BioMed Collection A suite of open-source biomedical foundation models. https://research.ibm.com/projects/biomedical-foundation-models • 28 items • Updated Jul 11, 2025 • 9
Continuous Speech Synthesis using per-token Latent Diffusion Paper • 2410.16048 • Published Oct 21, 2024 • 29