How to use maxin-cn/Cinemo with Diffusers:

    pip install -U diffusers transformers accelerate

    import torch
    from diffusers import DiffusionPipeline
    from diffusers.utils import load_image, export_to_video

    # switch to "mps" for apple devices
    pipe = DiffusionPipeline.from_pretrained("maxin-cn/Cinemo", dtype=torch.bfloat16, device_map="cuda")
    pipe.to("cuda")

    prompt = "A man with short gray hair plays a red electric guitar."
    image = load_image(
        "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/guitar-man.png"
    )

    output = pipe(image=image, prompt=prompt).frames[0]
    export_to_video(output, "output.mp4")

How do I generate a longer video?
For example, I provide an image, input.png, and the model generates a 2-second video, temp.mp4. My goal is a 4-second video, so I extracted the last frame of temp.mp4, saved it as input_1.png, and passed it back to the model. It returned temp2.mp4, but this second video has no movement. temp.mp4 has movement, while temp2.mp4 is just a still image, identical to the last frame of temp.mp4. Why can't the model generate movement after the first pass?
Why can't it generate a video (with movement) from the last frame of the output video?
Chaining generations autoregressively like this is the simplest way to produce longer videos, but the results are usually not very good. Try running the second stage a few more times and you should get a good result. Changing the prompt for the second stage should also help.
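
A minimal sketch of the retry idea described in this reply, under some assumptions: `generate_fn(image, prompt, seed)` is a hypothetical wrapper around whatever call produces one clip (it is not part of Cinemo or Diffusers), `motion_score` is an illustrative still-clip check, and the `min_motion` threshold is an arbitrary placeholder that would need tuning on real frames.

```python
import numpy as np


def motion_score(frames):
    """Mean absolute pixel difference between consecutive frames (0.0 for a still clip)."""
    arr = np.stack([np.asarray(f, dtype=np.float32) for f in frames])
    if len(arr) < 2:
        return 0.0
    return float(np.abs(arr[1:] - arr[:-1]).mean())


def extend_video(generate_fn, first_image, prompts, min_motion=1.0, max_tries=3):
    """Chain clips: each clip is conditioned on the last frame of the previous one.

    generate_fn is a hypothetical callable (image, prompt, seed) -> list of frames.
    One prompt is used per segment, so later segments can use a modified prompt.
    A segment is regenerated with a new seed (up to max_tries) if it looks static.
    """
    all_frames = []
    image = first_image
    seed = 0
    for prompt in prompts:
        best, best_score = None, -1.0
        for _ in range(max_tries):
            frames = generate_fn(image, prompt, seed)
            seed += 1
            score = motion_score(frames)
            if score > best_score:
                best, best_score = frames, score
            if score >= min_motion:
                break  # enough movement, stop retrying this segment
        all_frames.extend(best)
        image = best[-1]  # the last frame seeds the next segment
    return all_frames
```

With the pipeline from the snippet at the top of the page, `generate_fn` could wrap a call such as `pipe(image=image, prompt=prompt).frames[0]`, passing a seeded `torch.Generator` if the pipeline accepts a `generator` argument (an assumption not confirmed here). The combined frames can then be written out with `export_to_video`.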