How to Build a Healthcare Robot from Simulation to Deployment with NVIDIA Isaac for Healthcare Oct 28 • 18
NVIDIA Releases 8 Million Sample Open Dataset and Tooling for OCR, Image Reasoning, Image and Video QA Tasks Oct 28 • 16
Llama‑Embed‑Nemotron‑8B Text Embedding Model Ranks First on Multilingual MTEB Leaderboard Oct 21 • 14
📢 NVIDIA Releases Nemotron-CC-Math Pre-Training Dataset: A High-Quality, Web-Scale Math Corpus for Pretraining Large Language Models Aug 18 • 5
NVIDIA Releases Improved Pretraining Dataset: Preserves High Value Math & Code, and Augments with Multi-Lingual Aug 18 • 3
NVIDIA Releases 3 Million Sample Dataset for OCR, Visual Question Answering, and Captioning Tasks Aug 11 • 75
Llama-NeMoRetriever-ColEmbed: Developer-Focused Guide to NVIDIA's State-of-the-Art Text-Image Retrieval Jul 9 • 4
Nemotron-Personas: Improve AI Training With the First Synthetic Personas Dataset Aligned to Real-World Distributions Jun 10 • 21
VADER: Towards Causal Video Anomaly Understanding with Relation-Aware Large Language Models Paper • 2511.07299 • Published 27 days ago • 4
DLER: Doing Length pEnalty Right - Incentivizing More Intelligence per Token via Reinforcement Learning Paper • 2510.15110 • Published Oct 16 • 15
TC-LoRA: Temporally Modulated Conditional LoRA for Adaptive Diffusion Control Paper • 2510.09561 • Published Oct 10 • 7
Temporal Prompting Matters: Rethinking Referring Video Object Segmentation Paper • 2510.07319 • Published Oct 8 • 2
LEAML: Label-Efficient Adaptation to Out-of-Distribution Visual Tasks for Multimodal Large Language Models Paper • 2510.03232 • Published Oct 3 • 1
Non-Intrusive Detection of Adversarial Deep Learning Attacks via Observer Networks Paper • 2002.09772 • Published Feb 22, 2020
V2V-GoT: Vehicle-to-Vehicle Cooperative Autonomous Driving with Multimodal Large Language Models and Graph-of-Thoughts Paper • 2509.18053 • Published Sep 22 • 3
LongSplat: Robust Unposed 3D Gaussian Splatting for Casual Long Videos Paper • 2508.14041 • Published Aug 19 • 59
CorrFill: Enhancing Faithfulness in Reference-based Inpainting with Correspondence Guidance in Diffusion Models Paper • 2501.02355 • Published Jan 4 • 1
ORFormer: Occlusion-Robust Transformer for Accurate Facial Landmark Detection Paper • 2412.13174 • Published Dec 17, 2024 • 1
Spatio-Temporal Context Prompting for Zero-Shot Action Detection Paper • 2408.15996 • Published Aug 28, 2024 • 1
GroPrompt: Efficient Grounded Prompting and Adaptation for Referring Video Object Segmentation Paper • 2406.12834 • Published Jun 18, 2024 • 1
Image-Text Co-Decomposition for Text-Supervised Semantic Segmentation Paper • 2404.04231 • Published Apr 5, 2024 • 1