---
license: apache-2.0
language:
- en
- zh
base_model:
- Tongyi-MAI/Z-Image-Turbo
pipeline_tag: text-to-image
library_name: diffusers
tags:
- text-to-image
- image-generation
- diffusion
- comfyui
- photorealistic
- bilingual
- chinese
- english
- 8-step
- fast-generation
---

# Z-Image-Turbo-AIO | 8-Step Photorealistic Generation

<div align="center">

**Ultra-Fast • Bilingual Text Rendering • All-in-One • FP8 & BF16**

[Apache 2.0 License](https://opensource.org/licenses/Apache-2.0)
[ComfyUI](https://github.com/comfyanonymous/ComfyUI)

</div>

## What is Z-Image-Turbo-AIO?

Z-Image-Turbo-AIO is an **All-in-One repackage** of Z-Image-Turbo, the 6B-parameter photorealistic image generator from Alibaba's Tongyi Lab, optimized for lightning-fast 8-step generation. This version ships with an **integrated VAE and Text Encoder**, so there is only one file to fetch: just download and generate!

### Available Versions

| Version | Size | Best For |
|---------|------|----------|
| **FP8-AIO** | ~10GB | Most users, testing, everyday use |
| **BF16-AIO** | ~20GB | Maximum quality, professional work |

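The 2:1 size ratio follows from bytes per weight. As a rough sketch, the arithmetic below covers only the 6B-parameter image model mentioned above. It assumes 2 bytes per BF16 weight and 1 byte per FP8 weight, and the helper name `transformer_weight_gb` is made up for illustration. Whatever remains of each file size would be the bundled text encoder, VAE, and file overhead.

```python
PARAMS = 6e9  # 6B parameters, as stated in this model card
BYTES_PER_PARAM = {"BF16": 2, "FP8": 1}  # storage size of one weight


def transformer_weight_gb(precision: str) -> float:
    """Approximate weight size of the 6B image model in GB (1 GB = 1e9 bytes)."""
    return PARAMS * BYTES_PER_PARAM[precision] / 1e9


for precision in ("FP8", "BF16"):
    print(f"{precision}: ~{transformer_weight_gb(precision):.0f} GB for the image model alone")
```
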
## Key Features

- **8-step generation** - 10-40 seconds per image
- **All-in-One** - No separate VAE/Text Encoder downloads needed
- **Photorealistic** - Professional quality output
- **Bilingual** - English & Chinese text rendering
- **8GB VRAM** - Works on RTX 4060 and similar
- **Apache 2.0** - Open license for any use

## Which Version Should I Choose?

### FP8-AIO (Recommended for most users)
- Half the file size of BF16-AIO
- Faster downloads
- Excellent quality
- Perfect for 8GB VRAM
- Great for testing & everyday use

### BF16-AIO (Maximum precision)
- BFloat16 weights, no FP8 quantization
- Best available quality
- Professional/commercial grade
- Still works on 8GB VRAM

## Quick Start (ComfyUI)

### Installation

1. Download your preferred version (FP8 or BF16)
2. Place the file in `ComfyUI/models/checkpoints/`
3. Load it with the "Load Checkpoint" node
4. Generate!

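Step 2 can be scripted if you manage several ComfyUI installs. The following is a minimal sketch using only the Python standard library. It assumes the checkpoint is already downloaded; the function name `install_checkpoint` and the example paths are hypothetical.

```python
import shutil
from pathlib import Path


def install_checkpoint(downloaded_file: Path, comfyui_root: Path) -> Path:
    """Copy a downloaded AIO checkpoint into ComfyUI/models/checkpoints/."""
    target_dir = comfyui_root / "models" / "checkpoints"
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / downloaded_file.name
    if not target.exists():  # do not overwrite an existing checkpoint
        shutil.copy2(downloaded_file, target)
    return target


# Example call (hypothetical paths, adjust to your system):
# install_checkpoint(Path("~/Downloads/model.safetensors").expanduser(),
#                    Path("~/ComfyUI").expanduser())
```

After copying, refresh or restart ComfyUI so the file shows up in the "Load Checkpoint" node.
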
### Recommended Settings

| Parameter | Value |
|-----------|-------|
| Steps | 8 |
| CFG | 1.0 |
| Sampler | res_multistep |
| Scheduler | simple |
| Resolution | 1920×1088 |

**That's it! No separate VAE or Text Encoder needed!**

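If you drive ComfyUI through its API rather than the graph editor, the same settings map onto an API-format workflow. The sketch below is illustrative, not an official workflow. The checkpoint file name is hypothetical, the node ids are arbitrary, and using `EmptySD3LatentImage` for the latent is an assumption; swap in whichever empty-latent node your working graph uses.

```python
import json

# Hypothetical file name: use the name of the checkpoint you downloaded.
CHECKPOINT_NAME = "z-image-turbo-fp8-aio.safetensors"


def build_workflow(prompt: str, seed: int = 0) -> dict:
    """Build an API-format ComfyUI graph with the recommended settings."""
    return {
        # "Load Checkpoint" node; outputs are MODEL (0), CLIP (1), VAE (2)
        "1": {"class_type": "CheckpointLoaderSimple",
              "inputs": {"ckpt_name": CHECKPOINT_NAME}},
        "2": {"class_type": "CLIPTextEncode",
              "inputs": {"text": prompt, "clip": ["1", 1]}},
        # KSampler needs a negative input, so an empty prompt is wired in
        "3": {"class_type": "CLIPTextEncode",
              "inputs": {"text": "", "clip": ["1", 1]}},
        "4": {"class_type": "EmptySD3LatentImage",  # assumed latent node
              "inputs": {"width": 1920, "height": 1088, "batch_size": 1}},
        "5": {"class_type": "KSampler",
              "inputs": {"model": ["1", 0], "positive": ["2", 0],
                         "negative": ["3", 0], "latent_image": ["4", 0],
                         "seed": seed, "steps": 8, "cfg": 1.0,
                         "sampler_name": "res_multistep",
                         "scheduler": "simple", "denoise": 1.0}},
        "6": {"class_type": "VAEDecode",
              "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
        "7": {"class_type": "SaveImage",
              "inputs": {"images": ["6", 0], "filename_prefix": "z_image_turbo"}},
    }


payload = json.dumps({"prompt": build_workflow("A cozy bookstore at dusk")})
```

`payload` is the JSON body you would POST to a running ComfyUI server's `/prompt` endpoint.
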
## Performance

All tests on **RTX 4060 (8GB VRAM)** • FP8 • 1920×1088 • 8 steps

| Test | Generation Time |
|------|-----------------|
| Urban Interior | ~32s |
| Architecture | ~32-34s |
| Food Photography | ~32s |
| Bilingual Signage | ~32s |

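To put these timings in perspective, here is a quick back-of-the-envelope calculation. It assumes the reported times are end-to-end, so the per-step figure is an upper bound that also absorbs text encoding and VAE decoding.

```python
WIDTH, HEIGHT = 1920, 1088
STEPS = 8
TOTAL_SECONDS = 32  # approximate figure from the table above

megapixels = WIDTH * HEIGHT / 1_000_000
seconds_per_step = TOTAL_SECONDS / STEPS

print(f"{megapixels:.2f} MP per image")                     # 2.09 MP per image
print(f"{seconds_per_step:.1f} s per step (upper bound)")   # 4.0 s per step (upper bound)
```
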
## Prompting Guide

### Natural Language Works Best!

**Good Example:**
```
A cozy bookstore with floor-to-ceiling wooden shelves filled with
colorful books, comfortable reading nooks with cushions near large
windows, warm pendant lighting, peaceful afternoon atmosphere,
professional interior photography
```

**Bad Example:**
```
bookstore, books, chairs, window, cozy, warm light, interior
```

### Bilingual Text Rendering

**English Text:**
```
Neon sign reading "OPEN 24/7" in bright blue letters above entrance.
Modern sans-serif font, glowing effect against brick wall.
```

**Chinese Text:**
```
Traditional tea house entrance with sign reading "古韵茶坊" in elegant
gold Chinese calligraphy on red wooden board with ornate carved border.
```

**Both Languages:**
```
Modern cafe exterior with bilingual sign. "Morning Brew Coffee" in
white elegant script above, "晨曦咖啡" in matching Chinese characters
below. Both glowing warmly at dusk.
```

### Prompting Tips

| Do | Don't |
|------|---------|
| Use natural language descriptions | Use tag-style prompts (tag1, tag2) |
| Be detailed (100-300 words optimal) | Write very short prompts (<50 words) |
| Include lighting and mood | Add negative prompts (not used) |
| Describe camera angle and style | Include conflicting instructions |
| Specify materials and colors | |

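Two of these tips are easy to check automatically before queueing a batch. The sketch below is a rough heuristic, not part of the model or ComfyUI. The function name `check_prompt` and the tag-style rule (four or more comma-separated fragments averaging at most two words each) are assumptions chosen for illustration.

```python
def check_prompt(prompt: str) -> list[str]:
    """Return warnings for prompts that go against the tips above."""
    warnings = []

    # Length check against the 50-word floor and 100-300 word target
    word_count = len(prompt.split())
    if word_count < 50:
        warnings.append(f"very short prompt ({word_count} words, under 50)")
    elif not 100 <= word_count <= 300:
        warnings.append(f"{word_count} words is outside the 100-300 word range")

    # Tag-style check: many comma-separated fragments of one or two words
    fragments = [part.strip() for part in prompt.split(",") if part.strip()]
    if len(fragments) >= 4:
        average_words = sum(len(part.split()) for part in fragments) / len(fragments)
        if average_words <= 2:
            warnings.append("looks tag-style; describe the scene in sentences")

    return warnings


for warning in check_prompt("bookstore, books, chairs, window, cozy, warm light, interior"):
    print(warning)
```
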
## Credits & Acknowledgments

### Original Model
- **Developer:** Tongyi Lab (Alibaba Group)
- **Architecture:** Single-Stream Diffusion Transformer (6B parameters)
- **Algorithm:** Decoupled-DMD + DMDR
- **License:** Apache 2.0

### AIO Conversion
- **Created by:** [SeeSee21](https://huggingface.co/SeeSee21)
- **Format:** Integrated VAE + Text Encoder
- **Purpose:** Simplified single-file deployment

### Resources
- [Original HuggingFace](https://huggingface.co/Tongyi-MAI/Z-Image-Turbo)
- [GitHub Repository](https://github.com/Tongyi-MAI/Z-Image)
- [ComfyUI Files](https://huggingface.co/Comfy-Org/z_image_turbo)
- [CivitAI Page](https://civitai.com/models/2173571)

## Version History

### v1.0 - Initial AIO Release
- FP8-AIO version (10GB)
- BF16-AIO version (20GB)
- Integrated VAE + Text Encoder
- Single-file deployment
- Based on Tongyi-MAI/Z-Image-Turbo
- Tested on RTX 4060 8GB
- Optimized for 1920×1088

---

<div align="center">

**Download, load with "Load Checkpoint", and generate professional photos in seconds!**

</div>