- YOLO-FEDER FusionNet: A Novel Deep Learning Architecture for Drone Detection
  Paper • 2406.11641 • Published
- YOLOv12: Attention-Centric Real-Time Object Detectors
  Paper • 2502.12524 • Published • 12
- DGE-YOLO: Dual-Branch Gathering and Attention for Accurate UAV Object Detection
  Paper • 2506.23252 • Published
- YOLO-Master: MOE-Accelerated with Specialized Transformers for Enhanced Real-time Detection
  Paper • 2512.23273 • Published • 14
Collections
Collections including paper arxiv:2409.08513
- LocalMamba: Visual State Space Model with Windowed Selective Scan
  Paper • 2403.09338 • Published • 8
- GiT: Towards Generalist Vision Transformer through Universal Language Interface
  Paper • 2403.09394 • Published • 26
- Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers
  Paper • 2402.19479 • Published • 35
- Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection
  Paper • 2405.10300 • Published • 30

- Human-inspired Perspectives: A Survey on AI Long-term Memory
  Paper • 2411.00489 • Published • 1
- Multimodal Fusion with LLMs for Engagement Prediction in Natural Conversation
  Paper • 2409.09135 • Published • 2
- Reading Recognition in the Wild
  Paper • 2505.24848 • Published • 1
- EgoLife: Towards Egocentric Life Assistant
  Paper • 2503.03803 • Published • 46