---
title: Advanced Multilingual Image Describer
colorFrom: purple
colorTo: indigo
sdk: streamlit
sdk_version: "1.32.0"
app_file: app.py
pinned: false
---
# Advanced Multilingual Image Describer
**No translation APIs • Native multilingual support • Latest vision-language models**
## Features
- **Direct multilingual captioning** - No separate translation step
- **Latest models** - LLaVA 1.5, Qwen-VL, Moondream 2
- **10+ languages** - Native support for English, Chinese, Amharic, Spanish, French, German, Arabic, and more
- **Fast & efficient** - Optimized for Hugging Face Spaces
- **Clean interface** - Simple and intuitive
## Supported Models
### LLaVA 1.5 (7B)
- **Languages**: English, Chinese, Spanish, French, German, Italian, Russian, Japanese, Korean, Arabic
- **Best for**: High-quality detailed descriptions
- **Size**: 7 billion parameters
### Qwen-VL-Chat
- **Languages**: English, Chinese, Japanese, Korean, French, German, Spanish, Russian
- **Best for**: Conversational responses
- **Size**: 9.6 billion parameters
### Moondream 2
- **Languages**: English, Spanish, French, German
- **Best for**: Fast inference, smaller size
- **Size**: 1.4 billion parameters
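
The model list above can be captured as a small lookup table, which makes it easy to check which models claim support for a given output language. This is a minimal sketch only: `ModelSpec`, `MODEL_REGISTRY`, the registry keys, and `models_supporting` are hypothetical names that do not come from this Space's code, and the values are copied from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    """One entry of the model list (hypothetical structure)."""
    name: str
    languages: tuple
    best_for: str
    params_billion: float


# Values copied from the README's model list; keys are illustrative.
MODEL_REGISTRY = {
    "llava-1.5-7b": ModelSpec(
        name="LLaVA 1.5 (7B)",
        languages=("English", "Chinese", "Spanish", "French", "German",
                   "Italian", "Russian", "Japanese", "Korean", "Arabic"),
        best_for="High-quality detailed descriptions",
        params_billion=7.0,
    ),
    "qwen-vl-chat": ModelSpec(
        name="Qwen-VL-Chat",
        languages=("English", "Chinese", "Japanese", "Korean", "French",
                   "German", "Spanish", "Russian"),
        best_for="Conversational responses",
        params_billion=9.6,
    ),
    "moondream-2": ModelSpec(
        name="Moondream 2",
        languages=("English", "Spanish", "French", "German"),
        best_for="Fast inference, smaller size",
        params_billion=1.4,
    ),
}


def models_supporting(language: str) -> list:
    """Return display names of models listing `language`, smallest model first."""
    wanted = language.casefold()
    matches = [
        spec for spec in MODEL_REGISTRY.values()
        if wanted in (lang.casefold() for lang in spec.languages)
    ]
    return [spec.name for spec in sorted(matches, key=lambda s: s.params_billion)]


print(models_supporting("Arabic"))
print(models_supporting("spanish"))
```

With the data as listed, a lookup for a language that no model names (for example Amharic) returns an empty list, so a caller would need to decide how to handle that case.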
## How It Works
1. **Select a model** from the sidebar
2. **Choose an output language**
3. **Upload an image** (JPG, PNG, WebP)
4. **Click "Generate Description"**
5. **Get a native description** in the selected language
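
Steps 2 to 5 above could be wired together roughly as follows. This is a minimal sketch, not the Space's actual `app.py`: `ALLOWED_EXTENSIONS`, `is_supported_image`, `build_caption_prompt`, and `describe_image` are hypothetical names, the prompt wording and the `.jpeg` extension are assumptions, and `generate` stands in for whichever vision-language model was selected in the sidebar. The point it illustrates is that the target language is placed in the model prompt itself, so no translation step follows.

```python
from pathlib import Path
from typing import Callable

# Upload types named in the steps above; ".jpeg" is an assumed alias for JPG.
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def is_supported_image(filename: str) -> bool:
    """Check the file extension case-insensitively."""
    return Path(filename).suffix.lower() in ALLOWED_EXTENSIONS


def build_caption_prompt(language: str) -> str:
    """Ask the model to answer directly in the target language (assumed wording)."""
    return (
        f"Describe this image in detail. "
        f"Write the description in {language} only."
    )


def describe_image(
    filename: str,
    language: str,
    generate: Callable[[str, str], str],
) -> str:
    """Validate the upload, build the prompt, and delegate to the model.

    `generate(filename, prompt)` is a placeholder for the real model call.
    """
    if not is_supported_image(filename):
        raise ValueError(f"Unsupported image type: {filename}")
    return generate(filename, build_caption_prompt(language))


# Stand-in model that just echoes the prompt it receives.
def echo_model(filename: str, prompt: str) -> str:
    return f"[{filename}] {prompt}"


print(describe_image("cat.png", "French", echo_model))
```

Passing the model call in as a function keeps the validation and prompt logic testable without loading any weights.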
## Performance
- **Inference time**: 2-10 seconds per image, depending on model and hardware
- **Memory usage**: ~8-16 GB RAM
- **Quality**: Human-like descriptions
- **Languages**: Native output (not translated)
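
As a rough cross-check of the memory figure above, the weights alone can be estimated as parameter count times bytes per parameter. The sketch below uses the parameter counts listed in this README. The precisions (16-bit and 8-bit) are assumptions, since the README does not state how the models are loaded, and `weight_memory_gb` is a hypothetical helper. Activations, the image encoder's working memory, and framework overhead are not included.

```python
def weight_memory_gb(params_billion: float, bytes_per_param: float = 2.0) -> float:
    """Approximate memory for model weights alone, in decimal gigabytes.

    One billion parameters at one byte each is one gigabyte, so the
    estimate is simply the product of the two arguments.
    """
    return params_billion * bytes_per_param


# Parameter counts as listed in this README; the precisions are assumptions.
for name, size in [
    ("LLaVA 1.5 (7B)", 7.0),
    ("Qwen-VL-Chat", 9.6),
    ("Moondream 2", 1.4),
]:
    fp16 = weight_memory_gb(size)        # 2 bytes per parameter
    int8 = weight_memory_gb(size, 1.0)   # 1 byte per parameter
    print(f"{name}: ~{fp16:.1f} GB at 16-bit, ~{int8:.1f} GB at 8-bit")
```

Under these assumptions the 7B model needs about 14 GB for its weights, which falls inside the quoted range, while the 9.6B model at 16-bit precision would need about 19 GB and so would likely require lower precision or more memory.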
## Technical Details
- **Framework**: Streamlit + Transformers
- **Models**: Latest vision-language models from Hugging Face
- **Deployment**: Hugging Face Spaces (CPU/GPU)
- **Code**: Pure Python, no external APIs
| ## π File Structure |