Can't finetune stable diffusion with --enable_xformers_memory_efficient_attention #2234

LucasSloan · 2023-02-03T19:34:54Z

Describe the bug

I'm trying to finetune stable diffusion, and I'm trying to reduce the memory footprint so I can train with a larger batch size (and thus fewer gradient accumulation steps, and thus faster).

Setting --enable_xformers_memory_efficient_attention results in numeric instability of some kind, I think. The safety_checker tripped (training on the Pokemon dataset, validation prompt "Yoda"). If I disable the safety_checker, I get black images anyway, along with this warning:

/home/lucas/.local/lib/python3.8/site-packages/diffusers/pipelines/pipeline_utils.py:813: RuntimeWarning: invalid value encountered in cast
  images = (images * 255).round().astype("uint8")
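
For context, NumPy emits this warning when NaN or infinite values are cast to an integer type, which is consistent with the decoded images containing non-finite values. A minimal sketch of how one might confirm that before the cast, assuming the images are a float NumPy array (the helper name `count_non_finite` is hypothetical and not part of diffusers):

```python
import numpy as np

def count_non_finite(images: np.ndarray) -> int:
    """Count values that are NaN or +/-inf, i.e. values that cannot be cast to uint8 safely."""
    return int((~np.isfinite(images)).sum())

# Hypothetical 1x2x2x3 image batch with a single NaN value injected
images = np.full((1, 2, 2, 3), 0.5, dtype=np.float32)
images[0, 0, 0, 0] = np.nan

bad = count_non_finite(images)
print(f"{bad} non-finite value(s) out of {images.size}")
```

A non-zero count here would point at the model output rather than at the image post-processing.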

If I instead set --enable_xformers_memory_efficient_attention, but disable --gradient_checkpointing, everything hums along nicely, but the model doesn't actually fine tune.

I attempted to force xformers to use Flash Attention (using the snippet in #2049), because #1997 suggested there were issues with the other xformers attention kernels, but I get this error:

ValueError: Operator `memory_efficient_attention` does not support inputs:
     query       : shape=(8, 256, 1, 160) (torch.float16)
     key         : shape=(8, 256, 1, 160) (torch.float16)
     value       : shape=(8, 256, 1, 160) (torch.float16)
     attn_bias   : <class 'NoneType'>
     p           : 0.0
`flshattF` is not supported because:
    max(query.shape[-1] != value.shape[-1]) > 128
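
Reading that message: the last dimension of the query/key/value tensors (160 here) is the per-head embedding size, and the Flash Attention forward operator in this build rejects anything above 128. A small sketch of that constraint, with the 128 limit taken only from the error text above (the function name is hypothetical, not an xformers API):

```python
def flash_head_dim_ok(query_shape, value_shape, max_head_dim=128):
    """Mirror the constraint quoted in the error: the larger head dim must not exceed the limit."""
    return max(query_shape[-1], value_shape[-1]) <= max_head_dim

# Shapes copied from the error message above
q_shape = (8, 256, 1, 160)
v_shape = (8, 256, 1, 160)

print(flash_head_dim_ok(q_shape, v_shape))
```

With a head dimension of 160, the check fails, which matches the operator refusing these inputs.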

Reproduction

Here's the command I ran with --enable_xformers_memory_efficient_attention, but not with --gradient_checkpointing:

accelerate launch train_text_to_image.py   --pretrained_model_name_or_path=$MODEL_NAME   --dataset_name=$dataset_name   --use_ema   --resolution=512 --center_crop --random_flip   --train_batch_size=1   --gradient_accumulation_steps=8   --mixed_precision="fp16"   --max_train_steps=15000   --learning_rate=1e-05   --max_grad_norm=1   --lr_scheduler="constant" --lr_warmup_steps=0   --output_dir="sd-pokemon-model"  --validation_prompt=Yoda --num_validation_images=8  --validation_steps=1000  --enable_xformers_memory_efficient_attention
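
To make the batch-size motivation concrete: the effective batch size per optimizer step is the per-device batch size times the gradient accumulation steps times the number of processes. A rough sketch using the flags from the command above (the function name and the "target" configuration are illustrative assumptions):

```python
def effective_batch_size(train_batch_size, gradient_accumulation_steps, num_processes=1):
    """Samples contributing to one optimizer step."""
    return train_batch_size * gradient_accumulation_steps * num_processes

# Flags from the command above: --train_batch_size=1 --gradient_accumulation_steps=8, single GPU
current = effective_batch_size(1, 8)

# Hypothetical target if memory savings allowed a batch of 8 with no accumulation
target = effective_batch_size(8, 1)

print(current, target)
```

Both configurations see the same number of samples per optimizer step; the second needs one forward/backward pass instead of eight, which is where the hoped-for speedup comes from.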

I'm running with #2157, because that gives me images to see how training is progressing (which is how I noticed it wasn't finetuning), but I've observed it at HEAD.

Logs

No response

System Info

diffusers version: 0.13.0.dev0
Platform: Linux-5.15.79.1-microsoft-standard-WSL2-x86_64-with-glibc2.29
Python version: 3.8.10
PyTorch version (GPU?): 1.13.1+cu117 (True)
Huggingface_hub version: 0.11.1
Transformers version: 0.15.0
Accelerate version: using accelerate, but configured to run on a single gpu
xFormers version: 0.0.16
Using GPU in script?: 3090

EandrewJones · 2023-02-05T06:06:24Z

@LucasSloan There's a good chance your issues are related to a problem in xformers v0.0.16 where the Stable Diffusion attention head dims are too large on certain GPU architectures (sm86/89):

facebookresearch/xformers#631

Try updating to one of the newer xformers dev releases that include the patch from that issue:
pip install xformers==0.0.17.dev435
pip install xformers==0.0.17.dev441
pip install xformers==0.0.17.dev442

If that doesn't work, would you mind sharing the output from python -m xformers.info?

LucasSloan · 2023-02-06T05:20:08Z

That fixed it, thanks!

Dragonswords102 · 2023-02-21T03:03:22Z

Hey, I stumbled upon your response while trying to fix my own issues with xformers 0.0.16. However, all of the dev versions you suggested fail to install:

(base) C:\Users\orins\OneDrive\Documents\SDlocal>pip install xformers==0.0.17.dev441
ERROR: Could not find a version that satisfies the requirement xformers==0.0.17.dev441 (from versions: 0.0.1, 0.0.2, 0.0.3, 0.0.4, 0.0.5, 0.0.6, 0.0.7, 0.0.8, 0.0.9, 0.0.10, 0.0.11, 0.0.12, 0.0.13, 0.0.16rc424, 0.0.16rc425, 0.0.16, 0.0.17.dev447, 0.0.17.dev448, 0.0.17.dev449, 0.0.17.dev451, 0.0.17.dev461)
ERROR: No matching distribution found for xformers==0.0.17.dev441

Since I assume it will be helpful, here is the output of python -m xformers.info as well:

(base) C:\Users\orins\OneDrive\Documents\SDlocal>python -m xformers.info
Traceback (most recent call last):
  File "C:\Users\orins\miniconda3\lib\runpy.py", line 187, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "C:\Users\orins\miniconda3\lib\runpy.py", line 110, in _get_module_details
    __import__(pkg_name)
  File "C:\Users\orins\miniconda3\lib\site-packages\xformers\__init__.py", line 10, in <module>
    from . import _cpp_lib
  File "C:\Users\orins\miniconda3\lib\site-packages\xformers\_cpp_lib.py", line 127, in <module>
    _build_metadata = _register_extensions()
  File "C:\Users\orins\miniconda3\lib\site-packages\xformers\_cpp_lib.py", line 117, in _register_extensions
    torch.ops.load_library(ext_specs.origin)
AttributeError: module 'torch' has no attribute 'ops'

Hope this is enough info, thanks

EandrewJones · 2023-02-21T04:22:25Z

Hi,

TL;DR: If you read the error from the attempted install, you'll see that xformers version 0.0.17.dev441 is no longer available on PyPI. Instead, try installing one of the newer dev releases, which should include the fix: 0.0.17.dev447, 0.0.17.dev448, 0.0.17.dev449, 0.0.17.dev451, or 0.0.17.dev461.

See PyPI for all releases: https://pypi.org/project/xformers/#history

You may wonder why the version I posted doesn't exist anymore. Projects have limited space on PyPI to host different versions. They keep stable versions available, but as new development releases of the upcoming version are published, they have to remove older dev releases to stay within their quota.
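
The "from versions:" hint in pip's error already lists what is still installable, so the newest dev build can be read straight from it. A minimal sketch, assuming the error text has the shape shown earlier in this thread (the helper name `newest_dev_release` and the shortened sample string are illustrative):

```python
import re

def newest_dev_release(pip_error: str, prefix: str = "0.0.17.dev"):
    """Pick the highest-numbered dev release listed in pip's 'from versions:' hint."""
    match = re.search(r"from versions: ([^)]*)\)", pip_error)
    if match is None:
        return None
    versions = [v.strip() for v in match.group(1).split(",")]
    # Keep only entries like 0.0.17.dev461 and compare them by their numeric suffix
    dev_builds = [v for v in versions if v.startswith(prefix) and v[len(prefix):].isdigit()]
    if not dev_builds:
        return None
    return max(dev_builds, key=lambda v: int(v[len(prefix):]))

# Shortened copy of the error posted above
error_text = (
    "ERROR: Could not find a version that satisfies the requirement "
    "xformers==0.0.17.dev441 (from versions: 0.0.13, 0.0.16, "
    "0.0.17.dev447, 0.0.17.dev448, 0.0.17.dev449, 0.0.17.dev451, 0.0.17.dev461)"
)

print(f"pip install xformers=={newest_dev_release(error_text)}")
```

Because dev builds are rotated out over time, the version this picks will differ depending on when the error was produced.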

Dragonswords102 · 2023-02-24T08:51:35Z

Hi, that seemed to fix my issue, thank you. While we are here, I have another issue that you may know something about. I created a hypernetwork and followed a GitHub guide on training settings. However, when I press "Train Hypernetwork", the command prompt tells me that CUDA is out of memory, which does not make much sense to me as I have plenty of space: 16 GB of RAM and about 128 MB of VRAM.

[image: false error] <https://user-images.githubusercontent.com/125940602/221134518-52d4df32-0033-4ff9-9e67-62bb62354f9d.png>

EandrewJones · 2023-02-24T15:36:15Z

It appears you actually only have 6 GB of VRAM on your GPU, which is probably too limited for training most image models unless you run an extremely optimized algorithm. In any case, if you have questions about the Auto1111 web UI, I would take them over there.

technologiespro · 2023-03-21T18:13:22Z

issue

ERROR: Could not find a version that satisfies the requirement xformers==0.0.17.dev441 (from versions: 0.0.1, 0.0.2, 0.0.3, 0.0.4, 0.0.5, 0.0.6, 0.0.7, 0.0.8, 0.0.9, 0.0.10, 0.0.11, 0.0.12, 0.0.13, 0.0.16rc424, 0.0.16rc425, 0.0.16, 0.0.17.dev466, 0.0.17.dev473, 0.0.17.dev474, 0.0.17.dev476, 0.0.17.dev480, 0.0.17.dev481)

pip install xformers==0.0.17.dev481

patrickvonplaten · 2023-03-23T12:37:06Z

Hey @technologiespro ,

This looks like a problem with xformers: https://github.com/facebookresearch/xformers - could you please post the issue there?

kime541200 · 2023-04-16T14:28:41Z

Hi everyone, I encountered this issue while training my Dreambooth model and found a solution that may be helpful to you.
In the setup.bat file, change the packages to be installed to:

pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117
pip install pyre-extensions==0.0.23
pip install --use-pep517 --upgrade -r requirements.txt
pip install xformers==0.0.16

After making these changes, you should be able to start training your Dreambooth model.

For your information, I am using Windows 10 and a 3060 GPU.

Additionally, the documentation at https://huggingface.co/docs/diffusers/optimization/xformers suggests that xFormers v0.0.16 may not be suitable for training (fine-tuning or Dreambooth) on certain GPUs. If you encounter any issues, refer to the comment on that page and install the recommended development version to test whether it resolves the problem.

liuchenbaidu · 2023-04-19T02:25:58Z

I encountered the issue while running the following code with the stable diffusion 1.6 model:

txt2img_pipe.enable_xformers_memory_efficient_attention(attention_op=MemoryEfficientAttentionFlashAttentionOp)

python -m xformers.info:
xFormers 0.0.16
memory_efficient_attention.cutlassF: available
memory_efficient_attention.cutlassB: available
memory_efficient_attention.flshattF: available
memory_efficient_attention.flshattB: available
memory_efficient_attention.smallkF: available
memory_efficient_attention.smallkB: available
memory_efficient_attention.tritonflashattF: available
memory_efficient_attention.tritonflashattB: available
swiglu.fused.p.cpp: available
is_triton_available: True
is_functorch_available: False
pytorch.version: 1.13.1+cu117
pytorch.cuda: available
gpu.compute_capability: 8.6
gpu.name: NVIDIA GeForce RTX 3090
build.info: available
build.cuda_version: 1107
build.python_version: 3.8.16
build.torch_version: 1.13.1+cu117
build.env.TORCH_CUDA_ARCH_LIST: 5.0+PTX 6.0 6.1 7.0 7.5 8.0 8.6
build.env.XFORMERS_BUILD_TYPE: Release
build.env.XFORMERS_ENABLE_DEBUG_ASSERTIONS: None
build.env.NVCC_FLAGS: None
build.env.XFORMERS_PACKAGE_FROM: wheel-v0.0.16
source.privacy: open source
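
For anyone comparing their own `python -m xformers.info` output against this one, the two fields most relevant to this thread are the xFormers version and `gpu.compute_capability`. A rough sketch that parses the text and applies the rule of thumb from EandrewJones's comment above (0.0.16 on sm86/89); both function names are hypothetical and the check is a heuristic from this thread, not an official compatibility test:

```python
def parse_xformers_info(text: str) -> dict:
    """Turn the 'key: value' lines of `python -m xformers.info` into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
        elif line.startswith("xFormers "):
            # The first line has no colon, e.g. "xFormers 0.0.16"
            info["xFormers"] = line.split()[1]
    return info

def possibly_affected(info: dict) -> bool:
    """Heuristic from this thread only: xFormers 0.0.16 on compute capability 8.6 or 8.9."""
    return info.get("xFormers") == "0.0.16" and info.get("gpu.compute_capability") in {"8.6", "8.9"}

# A few lines copied from the output above
sample = """xFormers 0.0.16
memory_efficient_attention.flshattF: available
pytorch.version: 1.13.1+cu117
gpu.compute_capability: 8.6
gpu.name: NVIDIA GeForce RTX 3090"""

info = parse_xformers_info(sample)
print(possibly_affected(info))
```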

liuxz-cs · 2023-07-18T01:40:01Z

Quoting LucasSloan: "That fixed it, thanks!"

How did you solve this problem, and which version of xformers did you install? I tried 0.0.17, 0.0.17rc481, and 0.0.17rc482, but none of them solved the problem.

liuxz-cs · 2023-07-18T01:40:29Z

issue

ERROR: Could not find a version that satisfies the requirement xformers==0.0.17.dev441 (from versions: 0.0.1, 0.0.2, 0.0.3, 0.0.4, 0.0.5, 0.0.6, 0.0.7, 0.0.8, 0.0.9, 0.0.10, 0.0.11, 0.0.12, 0.0.13, 0.0.16rc424, 0.0.16rc425, 0.0.16, 0.0.17.dev466, 0.0.17.dev473, 0.0.17.dev474, 0.0.17.dev476, 0.0.17.dev480, 0.0.17.dev481)

pip install xformers==0.0.17.dev481

Did you solve this problem in the end?

qingcong1224 · 2024-05-08T03:30:37Z

Hi, I have encountered the following problem with torch==2.3.0+cu118 and xformers==0.0.26.post1+cu118:

    return self._call_impl(*args, **kwargs)
  File "C:\Users\Admin\anaconda3\envs\svd\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\Admin\anaconda3\envs\svd\lib\site-packages\sgm\modules\diffusionmodules\model.py", line 263, in forward
    h = self.attention(h)
  File "C:\Users\Admin\anaconda3\envs\svd\lib\site-packages\sgm\modules\diffusionmodules\model.py", line 249, in attention
    out = xformers.ops.memory_efficient_attention(
AttributeError: module 'xformers' has no attribute 'ops'
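
One general Python cause of an AttributeError of this shape is that a submodule only becomes an attribute of its package once something has imported it. Whether that is what happened here is not confirmed anywhere in this thread, so treat the following as an illustration of the mechanism only. It uses a throwaway package with a made-up name (`demo_pkg_2234`) rather than xformers itself:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package (hypothetical name) that has an `ops` submodule
root = Path(tempfile.mkdtemp())
pkg = root / "demo_pkg_2234"
pkg.mkdir()
(pkg / "__init__.py").write_text("")  # the package itself does not import `ops`
(pkg / "ops.py").write_text("def attention():\n    return 'ok'\n")
sys.path.insert(0, str(root))
importlib.invalidate_caches()

import demo_pkg_2234

# Importing only the package does not load the submodule
before = hasattr(demo_pkg_2234, "ops")

# Same effect as writing `import demo_pkg_2234.ops`
importlib.import_module("demo_pkg_2234.ops")
after = hasattr(demo_pkg_2234, "ops")

print(before, after)
```

If the same mechanism applies, an explicit import of the submodule before the failing call would be the thing to try; a failed or mismatched installation could produce the same symptom.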

LucasSloan added the bug label · Feb 3, 2023

LucasSloan closed this as completed · Feb 6, 2023

adhikjoshi mentioned this issue Feb 9, 2023: xFormers attention op arg #2049 (Merged)

pcuenca mentioned this issue Feb 20, 2023: Train text to image slower with xformers #2416 (Closed)

williamberman mentioned this issue Mar 6, 2023: Objects From Dreambooth Training Are Not in Output #2371 (Closed)

JingyeChen mentioned this issue Jul 7, 2023: TextDiffuser - When does the model starts to predict plausible results? microsoft/unilm#1180 (Open)
