The issue here is that a lot of RAM is consumed during model creation, because the submodules are initialized with random parameters in full precision. I worked around this in the meantime by casting each submodule to bfloat16 as it is created.
For anyone hitting the same problem: you need to add .to(torch.bfloat16) in flux/model.py here:
self.double_blocks = nn.ModuleList(
    [
        DoubleStreamBlock(
            self.hidden_size,
            self.num_heads,
            mlp_ratio=params.mlp_ratio,
            qkv_bias=params.qkv_bias,
        ).to(torch.bfloat16)
        for _ in range(params.depth)
    ]
)
and here:
self.single_blocks = nn.ModuleList(
    [
        SingleStreamBlock(self.hidden_size, self.num_heads, mlp_ratio=params.mlp_ratio).to(torch.bfloat16)
        for _ in range(params.depth_single_blocks)
    ]
)
When loading the Flux model, the entire model is created in full precision before the cast happens, which is what causes the RAM spike.
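A minimal, self-contained sketch of the same idea (the module and width used here are hypothetical stand-ins, not the actual Flux blocks): casting each block right after it is constructed means at most one block ever exists in float32 at a time, so peak RAM during creation stays close to the bfloat16 model size rather than the float32 size.

```python
import torch
import torch.nn as nn

hidden = 1024  # hypothetical width, not Flux's actual hidden_size

# Naive approach: build every block in float32, then cast the whole
# model at the end. Peak RAM during creation is roughly the full
# float32 model plus the bfloat16 copy being produced.
naive = nn.ModuleList(
    [nn.Linear(hidden, hidden) for _ in range(4)]
).to(torch.bfloat16)

# Per-submodule cast: each block is converted as soon as it is built,
# so only one block at a time is ever held in float32.
lean = nn.ModuleList(
    [nn.Linear(hidden, hidden).to(torch.bfloat16) for _ in range(4)]
)

# Both variants end up with identical dtypes; only the peak memory
# during construction differs.
assert all(p.dtype == torch.bfloat16 for p in naive.parameters())
assert all(p.dtype == torch.bfloat16 for p in lean.parameters())
```

The final models are equivalent; the per-submodule cast only changes how much transient float32 memory is alive at once during construction.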