Andres Marafioti

andito

AI & ML interests

Multimodal models, VLM and TTS

Recent Activity

liked a Space 8 days ago

HuggingFaceFW/finephrase

upvoted a paper 10 days ago

Qwen3-TTS Technical Report

liked a Space 10 days ago

lvwerra/agent-ui

published an article 25 days ago

Article

I Let a Lobster Run My Jetson: What OpenClaw Taught Me About the Future of Computing

25 days ago

•

15

published an article 5 months ago

Article

Streaming datasets: 100x More Efficient

+3

Oct 27, 2025

•

84

published an article 5 months ago

Article

Supercharge your OCR Pipelines with Open Models

+5

Oct 21, 2025

•

307

published an article 8 months ago

Article

TimeScope: How Long Can Your Video Large Multimodal Model Go?

+2

Jul 23, 2025

•

48

published an article 8 months ago

Article

Efficient MultiModal Data Pipeline

+3

Jul 8, 2025

•

70

published an article 10 months ago

Article

KV Cache from scratch in nanoVLM

+3

Jun 4, 2025

•

114

published an article 10 months ago

Article

SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data

+7

Jun 3, 2025

•

336

published an article 10 months ago

Article

nanoVLM: The simplest repository to train your VLM in pure PyTorch

+5

May 21, 2025

•

252

published an article 10 months ago

Article

Vision Language Models (Better, faster, stronger)

+3

May 12, 2025

•

600

published an article about 1 year ago

Article

SmolVLM2: Bringing Video Understanding to Every Device

+5

Feb 20, 2025

•

330

published an article about 1 year ago

Article

SmolVLM Grows Smaller – Introducing the 256M & 500M Models!

+1

Jan 23, 2025

•

192

published an article over 1 year ago

Article

SmolVLM - small yet mighty Vision Language Model

+3

Nov 26, 2024

•

416

published an article over 1 year ago

Article

Deploying Speech-to-Speech on Hugging Face

+2

Oct 22, 2024

•

45

published an article over 1 year ago

Article

FineVideo: behind the scenes

+4

Sep 23, 2024

•

35

published an article over 1 year ago

Article

LAVE: Zero-shot VQA Evaluation on Docmatix with LLMs - Do We Still Need Fine-Tuning?

Jul 25, 2024

•

17

published an article over 1 year ago

Article

Docmatix - a huge dataset for Document Visual Question Answering

Jul 18, 2024

•

78

published an article over 1 year ago

Article

Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models

+1

Jun 24, 2024

•

206