Update README.md
README.md
@@ -24,7 +24,21 @@ Here it is, the BPModel, a Stable Diffusion model you may love or hate.
 Trained with 5k high quality images that suit my taste (not necessarily yours unfortunately) from [Sankaku Complex](https://chan.sankakucomplex.com) with annotations. Not the best strategy since pure combination of tags may not be the optimal way to describe the image, but I don't need to do extra work. And no, I won't feed any AI generated image
 to the model even if it might outlaw the model from being used in some countries.
 
-The training of a high resolution model requires a significant amount of GPU
+The training of a high resolution model requires a significant amount of GPU
+hours and can be costly. In this particular case, 10 V100 GPU hours were spent
+on training 30 epochs with a resolution of 512, while 60 V100 GPU hours were spent
+on training 30 epochs with a resolution of 768. An additional 100 V100 GPU hours
+were also spent on training a model with a resolution of 1024, although **ONLY** 10
+epochs were run. The results of the training on the 1024 resolution model did
+not show a significant improvement compared to the 768 resolution model, and the
+resource demands were high: only a batch size of 1 fit on a V100 with 32G VRAM.
+However, training on the 768 resolution did yield better results than
+training on the 512 resolution, and it is worth considering as an option. It is
+worth noting that Stable Diffusion 2.x also chose to train on a 768 resolution
+model. However, it may be more efficient to start with training on a 512
+resolution model due to the slower training process and the need for additional
+prior knowledge to speed up the training process when working with a 768
+resolution.
 
 [Mikubill/naifu-diffusion](https://github.com/Mikubill/naifu-diffusion) is used as training script and I also recommend
 checking out [CCRcmcpe/scal-sdt](https://github.com/CCRcmcpe/scal-sdt).
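The GPU-hour figures quoted in the added README paragraph are easier to compare once normalized per epoch. The sketch below is not part of the commit; it only redoes the arithmetic from the three reported runs. The names `RUNS` and `cost_table` are made up for illustration, and the 512 run is used as the baseline by assumption.

```python
# Figures as quoted in the README paragraph:
# (training resolution, V100 GPU hours spent, epochs run).
RUNS = [
    (512, 10, 30),
    (768, 60, 30),
    (1024, 100, 10),
]


def cost_table(runs):
    """Return (resolution, hours per epoch, cost vs. first run, pixel ratio vs. first run)."""
    base_res, base_hours, base_epochs = runs[0]
    base_per_epoch = base_hours / base_epochs
    rows = []
    for res, hours, epochs in runs:
        per_epoch = hours / epochs
        relative_cost = per_epoch / base_per_epoch
        # Pixel count scales with the square of the side length.
        pixel_ratio = (res / base_res) ** 2
        rows.append((res, per_epoch, relative_cost, pixel_ratio))
    return rows


if __name__ == "__main__":
    for res, per_epoch, rel, pixels in cost_table(RUNS):
        print(f"{res:>5}px  {per_epoch:6.2f} h/epoch  {rel:5.1f}x cost  {pixels:4.2f}x pixels")
```

Per epoch this works out to roughly 0.33, 2 and 10 V100 hours, so the 768 and 1024 runs cost about 6x and 30x the 512 run while only processing 2.25x and 4x as many pixels per image. The paragraph itself points to the batch size of 1 at 1024 as one reason the cost grows so much faster than the pixel count.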