
Update FSDPPrecision to support 16-true precision setting #17657

Closed
awaelchli opened this issue May 17, 2023 · 4 comments · Fixed by #17807
Labels: fabric (lightning.fabric.Fabric), feature (is an improvement or enhancement), refactor, strategy: fsdp (Fully Sharded Data Parallel)
Comments


awaelchli commented May 17, 2023

Description & Motivation

We recently added true precision support in Fabric:
#17287
#17576

The FSDPPrecision plugin needs an update to support the strings "16-true" and "bf16-true".

Pitch

Update the plugin's code by following the same pattern as in #17576.
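
The pattern referenced above boils down to accepting the new precision strings and selecting the matching parameter dtype. A minimal, hypothetical sketch of that logic follows; the function name is illustrative, not Lightning's actual API, and plain dtype-name strings stand in for the `torch.float16` / `torch.bfloat16` objects the real plugin would use:

```python
# Hypothetical sketch of the pattern from #17576 applied to FSDPPrecision.
# In the real plugin the return values would be torch dtypes; strings keep
# this sketch dependency-free.

_SUPPORTED = ("16-mixed", "bf16-mixed", "16-true", "bf16-true")

def fsdp_param_dtype(precision: str) -> str:
    """Map a supported precision string to the dtype name it implies."""
    if precision not in _SUPPORTED:
        raise ValueError(
            f"precision={precision!r} is not supported; choose from {_SUPPORTED}"
        )
    # "-true" variants cast the model weights themselves, while "-mixed"
    # relies on FSDP's MixedPrecision configuration for compute.
    return "float16" if precision.split("-")[0] == "16" else "bfloat16"
```

For example, `fsdp_param_dtype("bf16-true")` would select `bfloat16`, while an unknown string like `"64-true"` raises a `ValueError`.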

Alternatives

No response

Additional context

No response

cc @Borda @justusschock @awaelchli @carmocca

@awaelchli awaelchli added feature Is an improvement or enhancement needs triage Waiting to be triaged by maintainers refactor strategy: fsdp Fully Sharded Data Parallel and removed needs triage Waiting to be triaged by maintainers labels May 17, 2023
@awaelchli awaelchli added this to the 2.1 milestone May 17, 2023
@awaelchli awaelchli added the fabric lightning.fabric.Fabric label May 17, 2023
@carmocca
Contributor

I think #17670 fixes this issue too.

@speediedan
Contributor

You may have observed this internally already, but if not: although #17670 partially addresses this issue, patching at least the following two sections was required to enable FSDP 16-true support for the downstream package I maintain: https://github.com/Lightning-AI/lightning/blob/0cc458e237c19a725d57ed33b3abe90dd7a0d3a4/src/lightning/pytorch/plugins/precision/fsdp.py#L36-L43 and https://github.com/Lightning-AI/lightning/blob/0cc458e237c19a725d57ed33b3abe90dd7a0d3a4/src/lightning/pytorch/trainer/connectors/accelerator_connector.py#L542-L544
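
(The patch described above amounts to widening the set of precision strings the FSDP precision plugin and the accelerator connector accept. A hypothetical sketch of that idea, with stand-in names rather than Lightning's actual identifiers:)

```python
# Hypothetical illustration only: widen the accepted FSDP precision strings
# to include the "-true" variants. Names are stand-ins, not Lightning's code.

_SUPPORTED_FSDP_PRECISION = {"16-mixed", "bf16-mixed"}

def patch_supported_precision() -> set:
    """Accept the 16-true variants in addition to the mixed ones."""
    _SUPPORTED_FSDP_PRECISION.update({"16-true", "bf16-true"})
    return _SUPPORTED_FSDP_PRECISION

def validate_precision(precision: str) -> None:
    """Mimic the connector-side check that rejects unsupported strings."""
    if precision not in _SUPPORTED_FSDP_PRECISION:
        raise ValueError(f"FSDP does not support precision={precision!r}")
```

After calling `patch_supported_precision()`, `validate_precision("16-true")` no longer raises.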

Thanks for all your continued work making Lightning better every day!
@awaelchli @carmocca

@awaelchli
Contributor Author

@speediedan #17670 was a different issue. We definitely want to bring support for 16-true to the Lightning Trainer (#17609), and with it also to the special FSDP precision plugin. What you point out is the change required to make the plugin accept the strings, which is basically what I opened this ticket for, just for the Fabric side of things. But yes, it applies to both Fabric and PyTorch :)

@speediedan
Contributor

Awesome, I figured it was all planned, but thanks for the thorough (as always!) clarification @awaelchli! It reminds me that I'm looking forward to contributing to and collaborating with you all more in the future!
