Using empty_init results in 0 gradient
#19720
Unanswered
RuABraun asked this question in DDP / multi-GPU / multi-node
Replies: 1 comment
-
@RuABraun We encountered the same problem when running a model with only one GPU ( |
-
My code looks like
This results in all parameters having 0 gradient. This changes when I remove the init_module line. I'm guessing I'm using it wrong; should I be wrapping everything model-related in it?
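One possible explanation, not confirmed anywhere in this thread: empty_init=True skips weight initialization and leaves parameters as uninitialized memory, which the Fabric docs describe as intended for cases where a checkpoint is loaded afterwards. If that memory happens to contain zeros (an assumption), the chain rule alone gives zero gradients for stacked weights. The framework-free sketch below checks the arithmetic on a hypothetical two-weight network y = w2 * relu(w1 * x); the function name, inputs, and loss are made up for illustration and are not taken from the original code.

```python
def relu(v):
    return v if v > 0 else 0.0


def grads(w1, w2, x=1.5, target=2.0):
    """Return (dL/dw1, dL/dw2) for L = (w2 * relu(w1 * x) - target) ** 2."""
    pre = w1 * x                 # pre-activation of the hidden unit
    h = relu(pre)                # hidden activation
    y = w2 * h                   # network output
    dy = 2.0 * (y - target)      # dL/dy
    dw2 = dy * h                 # vanishes when the hidden activation is 0
    dh = dy * w2                 # vanishes when w2 is 0
    dpre = dh * (1.0 if pre > 0 else 0.0)
    dw1 = dpre * x
    return dw1, dw2


# Weights left at zero, as uninitialized memory might be (assumption).
zero_grads = grads(0.0, 0.0)

# Arbitrary nonzero weights, standing in for a normal initialization.
init_grads = grads(0.5, -0.3)

print("zero weights:   ", zero_grads)
print("nonzero weights:", init_grads)
```

With both weights at zero, each gradient is multiplied by a zero factor, so neither weight can move away from zero; with nonzero weights, both gradients are nonzero. If this is the cause, loading a checkpoint or explicitly initializing the parameters after leaving the init_module block should restore nonzero gradients.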