---
library_name: diffusers
tags:
- fp8
- safetensors
- lora
- low-rank
- diffusion
- converted-by-gradio
---
# FP8 Model with Low-Rank LoRA

- **Source**: `https://huggingface.co/Kijai/WanVideo_comfy`
- **File**: `Wan2_1_VAE_bf16.safetensors`
- **FP8 Format**: `E5M2`
- **LoRA Rank**: 32
- **LoRA File**: `Wan2_1_VAE_bf16-lora-r32.safetensors`
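
For readers unfamiliar with the `E5M2` layout listed above, the following is a minimal pure-Python sketch of how one FP8 byte maps to a real value: 1 sign bit, 5 exponent bits (bias 15), and 2 mantissa bits, with IEEE-style subnormals, infinities, and NaNs. The helper name `decode_e5m2` is hypothetical and exists only for illustration; in practice the conversion is done by casting to and from `torch.float8_e5m2`.

```python
import math


def decode_e5m2(byte: int) -> float:
    """Decode one E5M2 byte (hypothetical helper, for illustration only)."""
    sign = -1.0 if byte & 0x80 else 1.0
    exponent = (byte >> 2) & 0x1F  # 5 exponent bits, bias 15
    mantissa = byte & 0x03         # 2 mantissa bits

    if exponent == 0x1F:
        # All-ones exponent: infinity when the mantissa is zero, NaN otherwise.
        return sign * math.inf if mantissa == 0 else math.nan
    if exponent == 0:
        # Subnormal numbers: no implicit leading 1, fixed exponent of -14.
        return sign * (mantissa / 4) * 2.0 ** -14
    # Normal numbers: implicit leading 1.
    return sign * (1 + mantissa / 4) * 2.0 ** (exponent - 15)


print(decode_e5m2(0x3C))  # 1.0
print(decode_e5m2(0x7B))  # 57344.0, the largest finite E5M2 value
```

With only 2 mantissa bits, each power-of-two interval holds just four representable values, which is why a low-rank correction is shipped alongside the FP8 weights.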

## Usage (Inference)

```python
from safetensors.torch import load_file
import torch

# Load the FP8 weights and the low-rank correction factors
fp8_state = load_file("Wan2_1_VAE_bf16-fp8-e5m2.safetensors")
lora_state = load_file("Wan2_1_VAE_bf16-lora-r32.safetensors")

# Reconstruct approximate original weights in float32
reconstructed = {}
for key in fp8_state:
    if f"lora_A.{key}" in lora_state and f"lora_B.{key}" in lora_state:
        A = lora_state[f"lora_A.{key}"].to(torch.float32)
        B = lora_state[f"lora_B.{key}"].to(torch.float32)
        lora_weight = B @ A  # (out, rank) @ (rank, in) -> (out, in)
        fp8_weight = fp8_state[key].to(torch.float32)
        reconstructed[key] = fp8_weight + lora_weight
    else:
        # No LoRA pair for this tensor: use the dequantized FP8 value as-is
        reconstructed[key] = fp8_state[key].to(torch.float32)
```

> Requires PyTorch ≥ 2.1 for FP8 support (the `torch.float8_e5m2` dtype).