Decoding at the Speed of Thought: Harnessing Parallel Decoding of Lexical Units for LLMs Paper • 2405.15208 • Published May 24, 2024
Benchmarking Multimodal RAG through a Chart-based Document Question-Answering Generation Framework Paper • 2502.14864 • Published Feb 20, 2025
Modality Curation: Building Universal Embeddings for Advanced Multimodal Information Retrieval Paper • 2505.19650 • Published May 26, 2025
TUNA: Comprehensive Fine-grained Temporal Understanding Evaluation on Dense Dynamic Videos Paper • 2505.20124 • Published May 26, 2025
Capybara-OMNI: An Efficient Paradigm for Building Omni-Modal Language Models Paper • 2504.12315 • Published Apr 10, 2025
From Pixels to Tokens: Revisiting Object Hallucinations in Large Vision-Language Models Paper • 2410.06795 • Published Oct 9, 2024
RLEP: Reinforcement Learning with Experience Replay for LLM Reasoning Paper • 2507.07451 • Published Jul 10, 2025
Evaluating Multimodal Large Language Models on Video Captioning via Monte Carlo Tree Search Paper • 2506.11155 • Published Jun 11, 2025
Leanabell-Prover-V2: Verifier-integrated Reasoning for Formal Theorem Proving via Reinforcement Learning Paper • 2507.08649 • Published Jul 11, 2025
AR-GRPO: Training Autoregressive Image Generation Models via Reinforcement Learning Paper • 2508.06924 • Published Aug 9, 2025
Evil Geniuses: Delving into the Safety of LLM-based Agents Paper • 2311.11855 • Published Nov 20, 2023
Accelerating Diffusion LLM Inference via Local Determinism Propagation Paper • 2510.07081 • Published Oct 8, 2025
Open Multimodal Retrieval-Augmented Factual Image Generation Paper • 2510.22521 • Published Oct 26, 2025
CoRe-MMRAG: Cross-Source Knowledge Reconciliation for Multimodal RAG Paper • 2506.02544 • Published Jun 3, 2025