MME-Survey: A Comprehensive Survey on Evaluation of Multimodal LLMs Paper • 2411.15296 • Published Nov 22, 2024 • 21
Video-MMMU: Evaluating Knowledge Acquisition from Multi-Discipline Professional Videos Paper • 2501.13826 • Published Jan 23, 2025 • 23
LLaVA-OneVision-1.5: Fully Open Framework for Democratized Multimodal Training Paper • 2509.23661 • Published Sep 28, 2025 • 47
MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale Paper • 2412.05237 • Published Dec 6, 2024 • 46
Large Multi-modal Models Can Interpret Features in Large Multi-modal Models Paper • 2411.14982 • Published Nov 22, 2024 • 19
Large Language Models are Visual Reasoning Coordinators Paper • 2310.15166 • Published Oct 23, 2023 • 2
OtterHD: A High-Resolution Multi-modality Model Paper • 2311.04219 • Published Nov 7, 2023 • 34
MixEval-X: Any-to-Any Evaluations from Real-World Data Mixtures Paper • 2410.13754 • Published Oct 17, 2024 • 75
Octopus: Embodied Vision-Language Programmer from Environmental Feedback Paper • 2310.08588 • Published Oct 12, 2023 • 38
MIMIC-IT: Multi-Modal In-Context Instruction Tuning Paper • 2306.05425 • Published Jun 8, 2023 • 11
MMBench: Is Your Multi-modal Model an All-around Player? Paper • 2307.06281 • Published Jul 12, 2023 • 5
LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models Paper • 2407.12772 • Published Jul 17, 2024 • 35
LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models Paper • 2407.07895 • Published Jul 10, 2024 • 42