first version of CoRe^2

Files changed (5)

.gitattributes +35 -0
readme.md +83 -0
sample_img.py +2 -2
weights/sd35_noise_model.pth +3 -0
weights/sdxl_noise_model.pth +3 -0

.gitattributes ADDED

	@@ -0,0 +1,35 @@

+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
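
The patterns above route matching paths through Git LFS, which is why the two `.pth` files in this commit appear as pointer files rather than binary content. As a rough illustration, the sketch below checks a path against a subset of those patterns. It assumes that `fnmatchcase` applied to the basename is a close enough stand-in for how gitattributes matches slash-free patterns; `LFS_PATTERNS` and `is_lfs_tracked` are hypothetical names and not part of the repository.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Subset of the slash-free patterns from .gitattributes (illustrative only).
LFS_PATTERNS = ["*.pth", "*.pt", "*.safetensors", "*.bin", "*.ckpt"]


def is_lfs_tracked(path: str, patterns=LFS_PATTERNS) -> bool:
    """Return True if the basename of `path` matches any pattern.

    Assumption: slash-free gitattributes patterns are matched against
    the basename, which fnmatchcase approximates for simple globs.
    """
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in patterns)


if __name__ == "__main__":
    for candidate in ["weights/sd35_noise_model.pth", "sample_img.py"]:
        print(candidate, is_lfs_tracked(candidate))
```

Patterns containing a slash, such as `saved_model/**/*`, follow different rules and are deliberately left out of this sketch.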

readme.md ADDED

	@@ -0,0 +1,83 @@

+# The Official Implementation of Our arXiv 2025 Paper:
+> **[CoRe^2: _Collect, Reflect and Refine_ to Generate Better and Faster](https://arxiv.org/abs/2503.09662)** <br>
+Authors:
+>**<em>Shitong Shao, Zikai Zhou, Dian Xie, Yuetong Fang, Tian Ye, Lichen Bai</em> and <em>Zeke Xie*</em>** <br>
+> xLeaf Lab, HKUST (GZ) <br>
+> *: Corresponding author
+## New
+- [x] Release the inference code of SD3.5 and SDXL.
+- [ ] Release the inference code of FLUX.
+- [ ] Release the inference code of LlamaGen.
+- [ ] Release the implementation of the Collect phase.
+- [ ] Release the implementation of the Reflect phase.
+## Overview
+This guide explains how to use CoRe^2.
+We provide inference code that supports ***Stable Diffusion XL*** and ***Stable Diffusion 3.5 Large***.
+## Requirements
+- `python version == 3.8`
+- `pytorch` (CUDA-enabled build)
+- `diffusers`
+- `Pillow` (imported as `PIL`)
+- `bitsandbytes`
+- `numpy`
+- `timm`
+- `argparse` (part of the Python standard library)
+- `einops`
+## Installation🚀️
+Make sure you have a working `python` environment with a CUDA-enabled build of `pytorch` installed. Before running the script, install the remaining required packages:
+```bash
+pip install diffusers Pillow bitsandbytes numpy timm einops
+```
+## Usage👀️
+To use the CoRe^2 pipeline, you need to run the `sample_img.py` script with appropriate command-line arguments. Below are the available options:
+### Command-Line Arguments
+- `--pipeline`: Select the model pipeline (`sdxl`, `sd35`). Default is `sdxl`.
+- `--prompt`: The textual prompt based on which the image will be generated. Default is "Mickey Mouse painting by Frank Frazetta."
+- `--inference-step`: Number of inference steps for the diffusion process. Default is 50.
+- `--cfg`: Classifier-free guidance scale. Default is 5.5.
+- `--pretrained-path`: Path to the pretrained model weights. The default is a path set in the script.
+- `--size`: The size (height and width) of the generated image. Default is 1024.
+- `--method`: Select the inference method (`standard`, `core`, `zigzag`, `z-core`).
+### Running the Script
+Run the script from the command line by navigating to the directory containing `sample_img.py` and executing:
+```bash
+python sample_img.py --pipeline sdxl --prompt "A banana on the left of an apple." --size 1024
+```
+This command will generate an image based on the prompt using the Stable Diffusion XL model with an image size of 1024x1024 pixels.
+### Output🎉️
+The script will save one image:
+## Pre-trained Weights Download❤️
+We provide the pre-trained CoRe^2 weights for Stable Diffusion XL and Stable Diffusion 3.5 Large at https://drive.google.com/drive/folders/1alJco6X3cFw4oHTD9SifvS7apc3AwG8I?usp=drive_link
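
The command-line options documented in the README can be sketched as an `argparse` parser. This is a minimal illustration built only from the README text, not the parser in `sample_img.py`: `build_parser` is a hypothetical name, and the defaults for `--pretrained-path` and `--method` are left as `None` because the README does not state them.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the options listed in the README (illustrative)."""
    parser = argparse.ArgumentParser(description="CoRe^2 sampling options (sketch)")
    parser.add_argument("--pipeline", choices=["sdxl", "sd35"], default="sdxl")
    parser.add_argument("--prompt", default="Mickey Mouse painting by Frank Frazetta.")
    parser.add_argument("--inference-step", type=int, default=50)
    parser.add_argument("--cfg", type=float, default=5.5)
    # The README says the default lives in the script; it is not given there.
    parser.add_argument("--pretrained-path", default=None)
    parser.add_argument("--size", type=int, default=1024)
    # The README lists the choices but no default.
    parser.add_argument(
        "--method", choices=["standard", "core", "zigzag", "z-core"], default=None
    )
    return parser


# Parse the example invocation from the README without touching sys.argv.
args = build_parser().parse_args(
    ["--pipeline", "sdxl", "--prompt", "A banana on the left of an apple.", "--size", "1024"]
)
print(args.pipeline, args.size, args.inference_step, args.cfg)
```

Note that the `sample_img.py` hunk in this commit reads `args.model`, so the actual script may name or map the pipeline option differently from what the README documents.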

sample_img.py CHANGED

@@ -103,14 +103,14 @@ if __name__ == '__main__':
                   replace_linear_with_lora(refine_model, rank=64, alpha=1.0, number_of_lora=28)
                   lora_true(refine_model, lora_idx=0)
-                  checkpoint = torch.load('./weights/sd35_ckpt_v9.pth', map_location='cpu')
+                  checkpoint = torch.load('./weights/sd35_noise_model.pth', map_location='cpu')
                   refine_model.load_state_dict(checkpoint)
             elif args.model == 'sdxl':
                   refine_model = PromptSDXLNet()
                   replace_linear_with_lora(refine_model, rank=48, alpha=1.0, number_of_lora=50)
                   lora_true(refine_model, lora_idx=0)
-                  checkpoint = torch.load('./weights/sdxl_ckpt_v9.pth', map_location='cpu')
+                  checkpoint = torch.load('./weights/sdxl_noise_model.pth', map_location='cpu')
                   refine_model.load_state_dict(checkpoint)
             print("Load Lora Success")
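
This hunk only renames the checkpoint files that `torch.load` reads. If the weights are missing, or were cloned without Git LFS so that only the small pointer file is present, loading would likely fail with an unhelpful error. A guard along the following lines could surface a clearer message. It is a standard-library sketch under those assumptions, and `CHECKPOINTS`, `LFS_MAGIC`, and `resolve_checkpoint` are hypothetical names that do not appear in `sample_img.py`.

```python
from pathlib import Path

# Checkpoint paths as referenced by the new side of the diff.
CHECKPOINTS = {
    "sd35": "weights/sd35_noise_model.pth",
    "sdxl": "weights/sdxl_noise_model.pth",
}

# First line of a Git LFS pointer file, as shown in this commit.
LFS_MAGIC = b"version https://git-lfs.github.com/spec/v1"


def resolve_checkpoint(model: str, root: str = ".") -> Path:
    """Return the checkpoint path for `model`, failing early with a clear error."""
    if model not in CHECKPOINTS:
        raise ValueError(f"unknown model {model!r}; expected one of {sorted(CHECKPOINTS)}")
    path = Path(root) / CHECKPOINTS[model]
    if not path.is_file():
        raise FileNotFoundError(f"checkpoint not found: {path}")
    # A pointer file means the real weights were never fetched.
    with path.open("rb") as handle:
        head = handle.read(len(LFS_MAGIC))
    if head == LFS_MAGIC:
        raise RuntimeError(f"{path} is a Git LFS pointer; fetch the weights first")
    return path
```

The returned path could then be passed to `torch.load(path, map_location='cpu')` in place of the hard-coded string.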

weights/sd35_noise_model.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:6f99a9b437fba4da9c3fb87516c6285bd9bac07f1969a4ba4d631734412edaf2
+size 2881450254

weights/sdxl_noise_model.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ad620bd3a604908abfe8178e05f34a83434db246cd63f151755f26de14c5f241
+size 2034660755
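
The two pointer files record a SHA-256 digest and a byte size for each checkpoint, which makes it possible to check a downloaded file for corruption. The sketch below parses a pointer and compares it against a local file. It assumes the three-line `key value` layout shown above; `parse_lfs_pointer` and `verify_file` are hypothetical helper names, not Git LFS APIs.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the `version`, `oid`, and `size` lines of a Git LFS pointer."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def verify_file(path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file's size and hash match the pointer."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    # Stream in chunks so multi-gigabyte checkpoints are not read into memory.
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer contents for weights/sdxl_noise_model.pth, copied from this commit.
SDXL_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:ad620bd3a604908abfe8178e05f34a83434db246cd63f151755f26de14c5f241
size 2034660755
"""

pointer = parse_lfs_pointer(SDXL_POINTER)
print(pointer["algo"], pointer["size"])
```

Calling `verify_file("weights/sdxl_noise_model.pth", pointer)` after downloading would then confirm that the local file matches what the commit recorded.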