Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models Paper • 2404.13013 • Published Apr 19, 2024 • 31
FeatUp: A Model-Agnostic Framework for Features at Any Resolution Paper • 2403.10516 • Published Mar 15, 2024 • 16
Awesome Document AI Collection A collection of open-source document AI • 27 items • Updated Mar 11, 2024 • 80
DocLLM: A layout-aware generative language model for multimodal document understanding Paper • 2401.00908 • Published Dec 31, 2023 • 189
Recent models: last 100 repos, sorted by creation date Collection The last 100 repos I have created, sorted by creation date descending so the most recently created repos appear at the top. • 121 items • Updated Jan 31, 2024 • 564
LucidDreamer: Towards High-Fidelity Text-to-3D Generation via Interval Score Matching Paper • 2311.11284 • Published Nov 19, 2023 • 21