Abstract
Multi-teacher distillation enables a unified student model that retains and enhances the capabilities of multiple teachers; C-RADIOv4 offers improved performance and efficiency through an updated teacher set and enhanced resolution support.
By leveraging multi-teacher distillation, agglomerative vision backbones provide a unified student model that retains and improves upon the distinct capabilities of multiple teachers. In this tech report, we describe the most recent release of the C-RADIO family of models, C-RADIOv4, which builds upon the AM-RADIO/RADIOv2.5 design and offers strong improvements on key downstream tasks at the same computational complexity. We release the -SO400M (412M parameters) and -H (631M parameters) model variants, both trained with an updated set of teachers: SigLIP2, DINOv3, and SAM3. In addition to improvements on core metrics and new capabilities from imitating SAM3, the C-RADIOv4 model family further improves any-resolution support, brings back the ViTDet option for drastically enhanced efficiency at high resolution, and comes with a permissive license.
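The core mechanism described above, distilling several frozen teacher backbones into one student through per-teacher adaptor heads, can be sketched as follows. This is a minimal illustration with assumed shapes, a simple linear adaptor per teacher, and a cosine-distance loss; it is not the released training code, and it assumes all teachers emit patch tokens on the same grid as the student.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiTeacherDistiller(nn.Module):
    """Minimal multi-teacher distillation sketch (not the released training code).

    A shared student backbone produces spatial features; one linear adaptor per
    teacher maps them into that teacher's feature space, and a cosine-distance
    loss pulls each adaptor output toward the frozen teacher's features.
    """

    def __init__(self, student: nn.Module, teachers: dict[str, nn.Module],
                 student_dim: int, teacher_dims: dict[str, int]):
        super().__init__()
        self.student = student
        self.teachers = nn.ModuleDict(teachers)
        for teacher in self.teachers.values():
            teacher.requires_grad_(False)  # teachers stay frozen
        self.adaptors = nn.ModuleDict({
            name: nn.Linear(student_dim, dim) for name, dim in teacher_dims.items()
        })

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        feats = self.student(images)  # (B, N, student_dim) patch tokens
        loss = 0.0
        for name, teacher in self.teachers.items():
            with torch.no_grad():
                target = teacher(images)  # (B, N, teacher_dims[name])
            pred = self.adaptors[name](feats)
            # 1 - cosine similarity, averaged over tokens and batch
            loss = loss + (1 - F.cosine_similarity(pred, target, dim=-1)).mean()
        return loss / len(self.teachers)
```

In practice the teachers run at different native resolutions and token grids, so a real pipeline resamples features before matching and typically also distills a pooled summary vector alongside the spatial features; the sketch keeps only the spatial term for brevity.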
Community
Combines SigLIP2, SAM3, and DINOv3 into the same feature space.
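As a concrete way to inspect that shared feature space, earlier RADIO releases expose a torch.hub entry point in the NVlabs/RADIO repository that returns both a pooled summary and per-patch spatial features. The sketch below uses the prior-generation `radio_v2.5-h` version string; the identifier for the C-RADIOv4 checkpoints is an assumption and should be checked against the released model cards.

```python
import torch

# Usage sketch based on the torch.hub entry point published for earlier RADIO
# releases; the version string for C-RADIOv4 may differ (check the model cards).
model = torch.hub.load('NVlabs/RADIO', 'radio_model',
                       version='radio_v2.5-h', progress=True)
model.eval()

# Any-resolution input: height/width should be multiples of the patch size.
x = torch.rand(1, 3, 512, 512)
with torch.no_grad():
    summary, spatial_features = model(x)  # pooled summary + per-patch features

print(summary.shape, spatial_features.shape)
```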
The following papers were recommended by the Semantic Scholar API
- AMoE: Agglomerative Mixture-of-Experts Vision Foundation Model (2025)
- CLIMP: Contrastive Language-Image Mamba Pretraining (2026)
- Revisiting Multi-Task Visual Representation Learning (2026)
- LinMU: Multimodal Understanding Made Linear (2026)
- UPLiFT: Efficient Pixel-Dense Feature Upsampling with Local Attenders (2026)
- The Spatial Blindspot of Vision-Language Models (2026)
- SLAP: Scalable Language-Audio Pretraining with Variable-Duration Audio and Multi-Objective Training (2026)