v0.29.1
See https://github.com/quic/ai-hub-models/releases/v0.29.1 for changelog.
- README.md +55 -55
- TextEncoder.bin +0 -3
- TextEncoderQuantizable.bin +0 -3
- TextEncoderQuantizable_w8a16.bin +0 -3
- TextEncoderQuantizable_w8a16.onnx.zip +0 -3
- TextEncoder_Quantized.bin +0 -3
- UNet_Quantized.bin +0 -3
- Unet.bin +0 -3
- UnetQuantizable.bin +0 -3
- UnetQuantizable_w8a16.bin +0 -3
- VAEDecoder_Quantized.bin +0 -3
- VaeDecoder.bin +0 -3
- VaeDecoderQuantizable.bin +0 -3
- VaeDecoderQuantizable.so +0 -3
- VaeDecoderQuantizable_w8a16.bin +0 -3
README.md
CHANGED
|
@@ -8,7 +8,7 @@ pipeline_tag: unconditional-image-generation
|
|
| 8 |
|
| 9 |
---
|
| 10 |
|
| 11 |
-
 | Peak Memory Range (MB) | Primary Compute Unit | Target Model
|
| 39 |
|---|---|---|---|---|---|---|---|---|
|
| 40 |
-
| TextEncoderQuantizable | w8a16 | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN | 9.
|
| 41 |
-
| TextEncoderQuantizable | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN | 4.
|
| 42 |
-
| TextEncoderQuantizable | w8a16 | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN | 4.
|
| 43 |
-
| TextEncoderQuantizable | w8a16 | SA7255P ADP | Qualcomm® SA7255P | QNN | 9.
|
| 44 |
-
| TextEncoderQuantizable | w8a16 | SA8255 (Proxy) | Qualcomm® SA8255P (Proxy) | QNN | 4.
|
| 45 |
-
| TextEncoderQuantizable | w8a16 | SA8650 (Proxy) | Qualcomm® SA8650P (Proxy) | QNN | 4.
|
| 46 |
-
| TextEncoderQuantizable | w8a16 | SA8775P ADP | Qualcomm® SA8775P | QNN | 4.
|
| 47 |
-
| TextEncoderQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN | 4.
|
| 48 |
-
| TextEncoderQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | ONNX | 4.
|
| 49 |
-
| TextEncoderQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN | 3.
|
| 50 |
-
| TextEncoderQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | ONNX | 3.
|
| 51 |
-
| TextEncoderQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN | 3.
|
| 52 |
-
| TextEncoderQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | ONNX | 3.
|
| 53 |
-
| TextEncoderQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN | 4.
|
| 54 |
-
| TextEncoderQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | ONNX | 4.
|
| 55 |
-
| UnetQuantizable | w8a16 | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN | 269.
|
| 56 |
-
| UnetQuantizable | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN | 114.
|
| 57 |
-
| UnetQuantizable | w8a16 | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN | 108.
|
| 58 |
-
| UnetQuantizable | w8a16 | SA7255P ADP | Qualcomm® SA7255P | QNN | 269.
|
| 59 |
-
| UnetQuantizable | w8a16 | SA8255 (Proxy) | Qualcomm® SA8255P (Proxy) | QNN | 114.
|
| 60 |
-
| UnetQuantizable | w8a16 | SA8650 (Proxy) | Qualcomm® SA8650P (Proxy) | QNN |
|
| 61 |
-
| UnetQuantizable | w8a16 | SA8775P ADP | Qualcomm® SA8775P | QNN | 108.
|
| 62 |
-
| UnetQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN |
|
| 63 |
-
| UnetQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | ONNX |
|
| 64 |
-
| UnetQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN |
|
| 65 |
-
| UnetQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | ONNX |
|
| 66 |
-
| UnetQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN | 71.
|
| 67 |
-
| UnetQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | ONNX |
|
| 68 |
-
| UnetQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN | 116.
|
| 69 |
-
| UnetQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | ONNX |
|
| 70 |
-
| VaeDecoderQuantizable | w8a16 | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN | 720.
|
| 71 |
-
| VaeDecoderQuantizable | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN |
|
| 72 |
-
| VaeDecoderQuantizable | w8a16 | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN | 250.
|
| 73 |
-
| VaeDecoderQuantizable | w8a16 | SA7255P ADP | Qualcomm® SA7255P | QNN | 720.
|
| 74 |
-
| VaeDecoderQuantizable | w8a16 | SA8255 (Proxy) | Qualcomm® SA8255P (Proxy) | QNN |
|
| 75 |
-
| VaeDecoderQuantizable | w8a16 | SA8650 (Proxy) | Qualcomm® SA8650P (Proxy) | QNN |
|
| 76 |
-
| VaeDecoderQuantizable | w8a16 | SA8775P ADP | Qualcomm® SA8775P | QNN | 250.
|
| 77 |
-
| VaeDecoderQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN |
|
| 78 |
-
| VaeDecoderQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | ONNX |
|
| 79 |
-
| VaeDecoderQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN |
|
| 80 |
-
| VaeDecoderQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | ONNX |
|
| 81 |
-
| VaeDecoderQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN |
|
| 82 |
-
| VaeDecoderQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | ONNX |
|
| 83 |
-
| VaeDecoderQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN | 266.
|
| 84 |
-
| VaeDecoderQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | ONNX |
|
| 85 |
|
| 86 |
|
| 87 |
|
|
@@ -91,7 +91,7 @@ More details on model performance across various devices, can be found
|
|
| 91 |
|
| 92 |
Install the package via pip:
|
| 93 |
```bash
|
| 94 |
-
pip install "qai-hub-models[stable-diffusion-v1-5
|
| 95 |
```
|
| 96 |
|
| 97 |
|
|
@@ -115,7 +115,7 @@ The package contains a simple end-to-end demo that downloads pre-trained
|
|
| 115 |
weights and runs this model on a sample input.
|
| 116 |
|
| 117 |
```bash
|
| 118 |
-
python -m qai_hub_models.models.
|
| 119 |
```
|
| 120 |
|
| 121 |
The above demo runs a reference implementation of pre-processing, model
|
|
@@ -124,7 +124,7 @@ inference, and post processing.
|
|
| 124 |
**NOTE**: If you want running in a Jupyter Notebook or Google Colab like
|
| 125 |
environment, please add the following to your cell (instead of the above).
|
| 126 |
```
|
| 127 |
-
%run -m qai_hub_models.models.
|
| 128 |
```
|
| 129 |
|
| 130 |
|
|
@@ -137,7 +137,7 @@ device. This script does the following:
|
|
| 137 |
* Accuracy check between PyTorch and on-device outputs.
|
| 138 |
|
| 139 |
```bash
|
| 140 |
-
python -m qai_hub_models.models.
|
| 141 |
```
|
| 142 |
```
|
| 143 |
Profiling Results
|
|
@@ -145,7 +145,7 @@ Profiling Results
|
|
| 145 |
TextEncoderQuantizable
|
| 146 |
Device : cs_8275 (ANDROID 14)
|
| 147 |
Runtime : QNN
|
| 148 |
-
Estimated inference time (ms) : 9.
|
| 149 |
Estimated peak memory usage (MB): [0, 9]
|
| 150 |
Total # Ops : 533
|
| 151 |
Compute Unit(s) : npu (533 ops) gpu (0 ops) cpu (0 ops)
|
|
@@ -154,7 +154,7 @@ Compute Unit(s) : npu (533 ops) gpu (0 ops) cpu (0 ops)
|
|
| 154 |
UnetQuantizable
|
| 155 |
Device : cs_8275 (ANDROID 14)
|
| 156 |
Runtime : QNN
|
| 157 |
-
Estimated inference time (ms) : 269.
|
| 158 |
Estimated peak memory usage (MB): [0, 8]
|
| 159 |
Total # Ops : 4041
|
| 160 |
Compute Unit(s) : npu (4041 ops) gpu (0 ops) cpu (0 ops)
|
|
@@ -164,7 +164,7 @@ VaeDecoderQuantizable
|
|
| 164 |
Device : cs_8275 (ANDROID 14)
|
| 165 |
Runtime : QNN
|
| 166 |
Estimated inference time (ms) : 720.6
|
| 167 |
-
Estimated peak memory usage (MB): [0,
|
| 168 |
Total # Ops : 189
|
| 169 |
Compute Unit(s) : npu (189 ops) gpu (0 ops) cpu (0 ops)
|
| 170 |
```
|
|
@@ -188,7 +188,7 @@ provides instructions on how to use the `.so` shared library in an Android appl
|
|
| 188 |
|
| 189 |
|
| 190 |
## View on Qualcomm® AI Hub
|
| 191 |
-
Get more details on Stable-Diffusion-v1.5's performance across various devices [here](https://aihub.qualcomm.com/models/
|
| 192 |
Explore all available models on [Qualcomm® AI Hub](https://aihub.qualcomm.com/)
|
| 193 |
|
| 194 |
|
|
|
|
| 8 |
|
| 9 |
---
|
| 10 |
|
| 11 |
+

|
| 12 |
|
| 13 |
# Stable-Diffusion-v1.5: Optimized for Mobile Deployment
|
| 14 |
## State-of-the-art generative AI model used to generate detailed images conditioned on text descriptions
|
|
|
|
| 21 |
|
| 22 |
This repository provides scripts to run Stable-Diffusion-v1.5 on Qualcomm® devices.
|
| 23 |
More details on model performance across various devices, can be found
|
| 24 |
+
[here](https://aihub.qualcomm.com/models/stable_diffusion_v1_5).
|
| 25 |
|
| 26 |
|
| 27 |
### Model Details
|
|
|
|
| 37 |
|
| 38 |
| Model | Precision | Device | Chipset | Target Runtime | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit | Target Model
|
| 39 |
|---|---|---|---|---|---|---|---|---|
|
| 40 |
+
| TextEncoderQuantizable | w8a16 | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN | 9.359 ms | 0 - 9 MB | NPU | Use Export Script |
|
| 41 |
+
| TextEncoderQuantizable | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN | 4.49 ms | 0 - 3 MB | NPU | Use Export Script |
|
| 42 |
+
| TextEncoderQuantizable | w8a16 | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN | 4.954 ms | 0 - 10 MB | NPU | Use Export Script |
|
| 43 |
+
| TextEncoderQuantizable | w8a16 | SA7255P ADP | Qualcomm® SA7255P | QNN | 9.359 ms | 0 - 9 MB | NPU | Use Export Script |
|
| 44 |
+
| TextEncoderQuantizable | w8a16 | SA8255 (Proxy) | Qualcomm® SA8255P (Proxy) | QNN | 4.541 ms | 0 - 2 MB | NPU | Use Export Script |
|
| 45 |
+
| TextEncoderQuantizable | w8a16 | SA8650 (Proxy) | Qualcomm® SA8650P (Proxy) | QNN | 4.619 ms | 0 - 2 MB | NPU | Use Export Script |
|
| 46 |
+
| TextEncoderQuantizable | w8a16 | SA8775P ADP | Qualcomm® SA8775P | QNN | 4.954 ms | 0 - 10 MB | NPU | Use Export Script |
|
| 47 |
+
| TextEncoderQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN | 4.56 ms | 0 - 10 MB | NPU | Use Export Script |
|
| 48 |
+
| TextEncoderQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | ONNX | 4.728 ms | 0 - 164 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
|
| 49 |
+
| TextEncoderQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN | 3.271 ms | 0 - 18 MB | NPU | Use Export Script |
|
| 50 |
+
| TextEncoderQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | ONNX | 3.346 ms | 0 - 14 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
|
| 51 |
+
| TextEncoderQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN | 3.046 ms | 0 - 14 MB | NPU | Use Export Script |
|
| 52 |
+
| TextEncoderQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | ONNX | 3.189 ms | 0 - 14 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
|
| 53 |
+
| TextEncoderQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN | 4.891 ms | 1 - 1 MB | NPU | Use Export Script |
|
| 54 |
+
| TextEncoderQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | ONNX | 4.915 ms | 157 - 157 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
|
| 55 |
+
| UnetQuantizable | w8a16 | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN | 269.398 ms | 0 - 8 MB | NPU | Use Export Script |
|
| 56 |
+
| UnetQuantizable | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN | 114.529 ms | 0 - 2 MB | NPU | Use Export Script |
|
| 57 |
+
| UnetQuantizable | w8a16 | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN | 108.714 ms | 0 - 8 MB | NPU | Use Export Script |
|
| 58 |
+
| UnetQuantizable | w8a16 | SA7255P ADP | Qualcomm® SA7255P | QNN | 269.398 ms | 0 - 8 MB | NPU | Use Export Script |
|
| 59 |
+
| UnetQuantizable | w8a16 | SA8255 (Proxy) | Qualcomm® SA8255P (Proxy) | QNN | 114.507 ms | 1 - 3 MB | NPU | Use Export Script |
|
| 60 |
+
| UnetQuantizable | w8a16 | SA8650 (Proxy) | Qualcomm® SA8650P (Proxy) | QNN | 113.487 ms | 0 - 2 MB | NPU | Use Export Script |
|
| 61 |
+
| UnetQuantizable | w8a16 | SA8775P ADP | Qualcomm® SA8775P | QNN | 108.714 ms | 0 - 8 MB | NPU | Use Export Script |
|
| 62 |
+
| UnetQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN | 114.344 ms | 0 - 2 MB | NPU | Use Export Script |
|
| 63 |
+
| UnetQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | ONNX | 112.155 ms | 0 - 4 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
|
| 64 |
+
| UnetQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN | 81.714 ms | 0 - 19 MB | NPU | Use Export Script |
|
| 65 |
+
| UnetQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | ONNX | 79.459 ms | 0 - 14 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
|
| 66 |
+
| UnetQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN | 71.239 ms | 0 - 14 MB | NPU | Use Export Script |
|
| 67 |
+
| UnetQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | ONNX | 71.488 ms | 0 - 15 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
|
| 68 |
+
| UnetQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN | 116.593 ms | 0 - 0 MB | NPU | Use Export Script |
|
| 69 |
+
| UnetQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | ONNX | 114.443 ms | 842 - 842 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
|
| 70 |
+
| VaeDecoderQuantizable | w8a16 | QCS8275 (Proxy) | Qualcomm® QCS8275 (Proxy) | QNN | 720.65 ms | 0 - 9 MB | NPU | Use Export Script |
|
| 71 |
+
| VaeDecoderQuantizable | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN | 268.706 ms | 0 - 3 MB | NPU | Use Export Script |
|
| 72 |
+
| VaeDecoderQuantizable | w8a16 | QCS9075 (Proxy) | Qualcomm® QCS9075 (Proxy) | QNN | 250.387 ms | 0 - 12 MB | NPU | Use Export Script |
|
| 73 |
+
| VaeDecoderQuantizable | w8a16 | SA7255P ADP | Qualcomm® SA7255P | QNN | 720.65 ms | 0 - 9 MB | NPU | Use Export Script |
|
| 74 |
+
| VaeDecoderQuantizable | w8a16 | SA8255 (Proxy) | Qualcomm® SA8255P (Proxy) | QNN | 273.815 ms | 0 - 2 MB | NPU | Use Export Script |
|
| 75 |
+
| VaeDecoderQuantizable | w8a16 | SA8650 (Proxy) | Qualcomm® SA8650P (Proxy) | QNN | 274.195 ms | 0 - 2 MB | NPU | Use Export Script |
|
| 76 |
+
| VaeDecoderQuantizable | w8a16 | SA8775P ADP | Qualcomm® SA8775P | QNN | 250.387 ms | 0 - 12 MB | NPU | Use Export Script |
|
| 77 |
+
| VaeDecoderQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN | 270.703 ms | 0 - 3 MB | NPU | Use Export Script |
|
| 78 |
+
| VaeDecoderQuantizable | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | ONNX | 268.632 ms | 0 - 66 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
|
| 79 |
+
| VaeDecoderQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN | 205.905 ms | 0 - 21 MB | NPU | Use Export Script |
|
| 80 |
+
| VaeDecoderQuantizable | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | ONNX | 206.342 ms | 3 - 23 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
|
| 81 |
+
| VaeDecoderQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | QNN | 192.889 ms | 0 - 15 MB | NPU | Use Export Script |
|
| 82 |
+
| VaeDecoderQuantizable | w8a16 | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | ONNX | 175.944 ms | 3 - 17 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
|
| 83 |
+
| VaeDecoderQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | QNN | 266.828 ms | 0 - 0 MB | NPU | Use Export Script |
|
| 84 |
+
| VaeDecoderQuantizable | w8a16 | Snapdragon X Elite CRD | Snapdragon® X Elite | ONNX | 264.883 ms | 63 - 63 MB | NPU | [Stable-Diffusion-v1.5.onnx](https://huggingface.co/qualcomm/Stable-Diffusion-v1.5/blob/main/Stable-Diffusion-v1.5_w8a16.onnx) |
|
| 85 |
|
| 86 |
|
| 87 |
|
|
|
|
| 91 |
|
| 92 |
Install the package via pip:
|
| 93 |
```bash
|
| 94 |
+
pip install "qai-hub-models[stable-diffusion-v1-5]"
|
| 95 |
```
|
| 96 |
|
| 97 |
|
|
|
|
| 115 |
weights and runs this model on a sample input.
|
| 116 |
|
| 117 |
```bash
|
| 118 |
+
python -m qai_hub_models.models.stable_diffusion_v1_5.demo
|
| 119 |
```
|
| 120 |
|
| 121 |
The above demo runs a reference implementation of pre-processing, model
|
|
|
|
| 124 |
**NOTE**: If you want running in a Jupyter Notebook or Google Colab like
|
| 125 |
environment, please add the following to your cell (instead of the above).
|
| 126 |
```
|
| 127 |
+
%run -m qai_hub_models.models.stable_diffusion_v1_5.demo
|
| 128 |
```
|
| 129 |
|
| 130 |
|
|
|
|
| 137 |
* Accuracy check between PyTorch and on-device outputs.
|
| 138 |
|
| 139 |
```bash
|
| 140 |
+
python -m qai_hub_models.models.stable_diffusion_v1_5.export
|
| 141 |
```
|
| 142 |
```
|
| 143 |
Profiling Results
|
|
|
|
| 145 |
TextEncoderQuantizable
|
| 146 |
Device : cs_8275 (ANDROID 14)
|
| 147 |
Runtime : QNN
|
| 148 |
+
Estimated inference time (ms) : 9.4
|
| 149 |
Estimated peak memory usage (MB): [0, 9]
|
| 150 |
Total # Ops : 533
|
| 151 |
Compute Unit(s) : npu (533 ops) gpu (0 ops) cpu (0 ops)
|
|
|
|
| 154 |
UnetQuantizable
|
| 155 |
Device : cs_8275 (ANDROID 14)
|
| 156 |
Runtime : QNN
|
| 157 |
+
Estimated inference time (ms) : 269.4
|
| 158 |
Estimated peak memory usage (MB): [0, 8]
|
| 159 |
Total # Ops : 4041
|
| 160 |
Compute Unit(s) : npu (4041 ops) gpu (0 ops) cpu (0 ops)
|
|
|
|
| 164 |
Device : cs_8275 (ANDROID 14)
|
| 165 |
Runtime : QNN
|
| 166 |
Estimated inference time (ms) : 720.6
|
| 167 |
+
Estimated peak memory usage (MB): [0, 9]
|
| 168 |
Total # Ops : 189
|
| 169 |
Compute Unit(s) : npu (189 ops) gpu (0 ops) cpu (0 ops)
|
| 170 |
```
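The profiling summary printed above is plain `key : value` text, so it can be read back into a dictionary for comparison across runs. The following is a minimal sketch, assuming the layout shown in this README; `parse_profile_block` is a hypothetical helper written for illustration and is not part of `qai_hub_models`.

```python
def parse_profile_block(text: str) -> dict[str, str]:
    """Split each 'key : value' line at the first colon (hypothetical helper)."""
    result = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon, such as the model name
            result[key.strip()] = value.strip()
    return result


# Sample copied from the TextEncoderQuantizable block in this README.
sample = """\
TextEncoderQuantizable
Device : cs_8275 (ANDROID 14)
Runtime : QNN
Estimated inference time (ms) : 9.4
Estimated peak memory usage (MB): [0, 9]
Total # Ops : 533
Compute Unit(s) : npu (533 ops) gpu (0 ops) cpu (0 ops)
"""

profile = parse_profile_block(sample)
latency_ms = float(profile["Estimated inference time (ms)"])
print(profile["Runtime"], latency_ms)
```

Values are kept as strings because the fields mix numbers, ranges, and free text; convert only the fields you need, as done for the latency here.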
|
|
|
|
| 188 |
|
| 189 |
|
| 190 |
## View on Qualcomm® AI Hub
|
| 191 |
+
Get more details on Stable-Diffusion-v1.5's performance across various devices [here](https://aihub.qualcomm.com/models/stable_diffusion_v1_5).
|
| 192 |
Explore all available models on [Qualcomm® AI Hub](https://aihub.qualcomm.com/)
|
| 193 |
|
| 194 |
|
TextEncoder.bin
DELETED
|
@@ -1,3 +0,0 @@
|
|
| 1 |
-
version https://git-lfs.github.com/spec/v1
|
| 2 |
-
oid sha256:5ea609803056cc46b35aaf7db04e7091a2cdeee823e64bbd569faf594b7e6e8b
|
| 3 |
-
size 163545088
TextEncoderQuantizable.bin
DELETED
|
@@ -1,3 +0,0 @@
|
|
| 1 |
-
version https://git-lfs.github.com/spec/v1
|
| 2 |
-
oid sha256:9ed3b67ad0b0725f72b42427afef780be75fdfd138b874cb891e2af34dcbac8e
|
| 3 |
-
size 163545088
TextEncoderQuantizable_w8a16.bin
DELETED
|
@@ -1,3 +0,0 @@
|
|
| 1 |
-
version https://git-lfs.github.com/spec/v1
|
| 2 |
-
oid sha256:d311113834583b852501aee90ffbb25a35f128fc43fb712600d65c674f974040
|
| 3 |
-
size 163548336
TextEncoderQuantizable_w8a16.onnx.zip
DELETED
|
@@ -1,3 +0,0 @@
|
|
| 1 |
-
version https://git-lfs.github.com/spec/v1
|
| 2 |
-
oid sha256:3cf1a0900cd118efd4d11f5067ba79a8f99d8995b5c11947d6b5228154086857
|
| 3 |
-
size 127241529
TextEncoder_Quantized.bin
DELETED
|
@@ -1,3 +0,0 @@
|
|
| 1 |
-
version https://git-lfs.github.com/spec/v1
|
| 2 |
-
oid sha256:aad7cc2d5c4ae1ceb59264d47880c109bdc963aa1d0841d47dfcd34032556abe
|
| 3 |
-
size 163275152
UNet_Quantized.bin
DELETED
|
@@ -1,3 +0,0 @@
|
|
| 1 |
-
version https://git-lfs.github.com/spec/v1
|
| 2 |
-
oid sha256:e7523141556997cc2e6b4a1bacc0dc59b38b05fd18aae8c64004987d05f0eb7e
|
| 3 |
-
size 878473240
Unet.bin
DELETED
|
@@ -1,3 +0,0 @@
|
|
| 1 |
-
version https://git-lfs.github.com/spec/v1
|
| 2 |
-
oid sha256:a8057f09a165388abfdbfc1520983ff368bf58dd5abf0fd29affafbee68e3e1b
|
| 3 |
-
size 879088632
UnetQuantizable.bin
DELETED
|
@@ -1,3 +0,0 @@
|
|
| 1 |
-
version https://git-lfs.github.com/spec/v1
|
| 2 |
-
oid sha256:a8acc84d9be477334dc746e4d9c7ac94ec82a0aa538fb186a1452f4a377f2bec
|
| 3 |
-
size 879088632
UnetQuantizable_w8a16.bin
DELETED
|
@@ -1,3 +0,0 @@
|
|
| 1 |
-
version https://git-lfs.github.com/spec/v1
|
| 2 |
-
oid sha256:3b0f06fd2f9fb9d3ec1e5a0b5d46242eab182a187a70f45b6639023338ac2e1e
|
| 3 |
-
size 881209680
VAEDecoder_Quantized.bin
DELETED
|
@@ -1,3 +0,0 @@
|
|
| 1 |
-
version https://git-lfs.github.com/spec/v1
|
| 2 |
-
oid sha256:7789b6a8b8aa6ae02f20f2817b54b410c45f0fddee9cf231cf3aac83724f8975
|
| 3 |
-
size 59072424
VaeDecoder.bin
DELETED
|
@@ -1,3 +0,0 @@
|
|
| 1 |
-
version https://git-lfs.github.com/spec/v1
|
| 2 |
-
oid sha256:9a2d7e70ba95a2749d73f9785233d25ebf5abb1e34351d87c0f1c9e0adb00d49
|
| 3 |
-
size 64693320
VaeDecoderQuantizable.bin
DELETED
|
@@ -1,3 +0,0 @@
|
|
| 1 |
-
version https://git-lfs.github.com/spec/v1
|
| 2 |
-
oid sha256:8b613581e3a9ff71c1637918dce76ba296d55abc8807f0faef12556ed60525d3
|
| 3 |
-
size 64693320
VaeDecoderQuantizable.so
DELETED
|
@@ -1,3 +0,0 @@
|
|
| 1 |
-
version https://git-lfs.github.com/spec/v1
|
| 2 |
-
oid sha256:a89f37abd5657cf80a35936d75058bfb964fc36920eaa0d66bc9b2fe37822d83
|
| 3 |
-
size 50386176
VaeDecoderQuantizable_w8a16.bin
DELETED
|
@@ -1,3 +0,0 @@
|
|
| 1 |
-
version https://git-lfs.github.com/spec/v1
|
| 2 |
-
oid sha256:c818562619cfc7622ab57fb139afc9033df650f4049d8d2b9443210e5a7b7846
|
| 3 |
-
size 64701512
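Each deleted binary above was stored as a Git LFS pointer: three `key value` lines giving the spec version, the object ID (hash algorithm and digest), and the size in bytes. The following is a minimal sketch of reading such a pointer, assuming only the three fields shown in this diff; `LfsPointer` and `parse_lfs_pointer` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LfsPointer:
    version: str
    hash_algo: str
    digest: str
    size: int  # size of the real object in bytes


def parse_lfs_pointer(text: str) -> LfsPointer:
    """Parse the 'key value' lines of a Git LFS pointer file (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form '<algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return LfsPointer(fields["version"], algo, digest, int(fields["size"]))


# Pointer contents copied from VaeDecoderQuantizable_w8a16.bin in this diff.
pointer_text = """\
version https://git-lfs.github.com/spec/v1
oid sha256:c818562619cfc7622ab57fb139afc9033df650f4049d8d2b9443210e5a7b7846
size 64701512
"""

pointer = parse_lfs_pointer(pointer_text)
print(pointer.hash_algo, pointer.size)
```

The pointer only describes the object; the weights themselves live in LFS storage, which is why each deletion shows as three removed lines rather than a binary diff.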