FG-CLIP 2 Collection FG-CLIP 2 is the foundation model for fine-grained vision-language understanding in both English and Chinese. • 10 items • Updated Nov 6, 2025 • 5
view article Article Introducing smolagents: simple agents that write actions in code. +1 Dec 31, 2024 • 1.16k
Cosmos Collection ⚠️ This collection is archived. 👉 https://huggingface.co/collections/nvidia/nvidia-cosmos-2 • 31 items • Updated about 11 hours ago • 299
Molmo Collection Artifacts for open multimodal language models. • 5 items • Updated 24 days ago • 309