getting loss as 'nan' after 1st epoch only? #47

Open
shubhamk16 opened this issue Jun 11, 2020 · 3 comments

Comments

shubhamk16 commented Jun 11, 2020

Hello guys,
Just like with GloVe, I created a dictionary of all the possible words, with words as keys and 768-dimensional BERT embedding vectors as values. But when I use this dictionary to train the model, the loss becomes NaN within the 1st epoch.

  1. How can I handle this problem?
  2. What are the possible reasons for getting a NaN loss?
  3. Is making a dictionary of embedding vectors like this a good approach? (A sketch of the setup is shown after this list.)
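For context, here is a minimal sketch of the kind of dictionary described above, assuming a recent Hugging Face transformers version and bert-base-uncased. The word list, the subword averaging, and all names here are illustrative assumptions, not the poster's actual code:

```python
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

def word_vector(word: str) -> torch.Tensor:
    """Context-free 768-dim embedding: average of the word's subword vectors."""
    inputs = tokenizer(word, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs)[0]  # last hidden state: (1, seq_len, 768)
    return hidden[0, 1:-1].mean(dim=0)  # drop [CLS]/[SEP], average subwords

# Hypothetical vocabulary; the real one would cover "all the possible words".
embedding_dict = {w: word_vector(w) for w in ["select", "where", "table"]}
```
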
@alkaideemo

I ran into a similar problem. The loss computation here is not numerically stable:

IRNet/src/model.py

Lines 308 to 309 in c329460

sketch_prob_var = torch.stack(
    [torch.stack(action_probs_i, dim=0).log().sum() for action_probs_i in action_probs], dim=0)

IRNet/src/model.py

Lines 479 to 480 in c329460

lf_prob_var = torch.stack(
    [torch.stack(action_probs_i, dim=0).log().sum() for action_probs_i in action_probs], dim=0)

I added a small number before the log operation, and the problem was solved.
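For illustration, a minimal sketch of that fix. The helper name and the epsilon value are assumptions, not taken from the actual change:

```python
import torch

EPS = 1e-8  # hypothetical constant; the actual value used is not given above

def stable_log_prob_sum(action_probs):
    """Per-example sum of log action probabilities, guarded against log(0)."""
    return torch.stack(
        [(torch.stack(probs_i, dim=0) + EPS).log().sum()
         for probs_i in action_probs],
        dim=0)

# Example: the second sequence contains an exact zero probability, which
# would produce -inf (and then NaN gradients) without the epsilon.
action_probs = [
    [torch.tensor(0.9), torch.tensor(0.5)],
    [torch.tensor(0.7), torch.tensor(0.0)],
]
print(stable_log_prob_sum(action_probs))
```
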

@liguozhanglearner

I don't understand how the loss function is computed here.

@ersaurabhverma

Try reducing the learning rate.
Your gradient is exploding due to a high learning rate.
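
A hypothetical sketch of that suggestion in PyTorch. The model, data, loss, and both learning-rate values are stand-ins, and the gradient-clipping line is an extra guard not mentioned in the comment above:

```python
import torch
import torch.nn as nn

# Stand-in model and data; the real IRNet model and loss would go here.
model = nn.Linear(768, 1)
x, y = torch.randn(4, 768), torch.randn(4, 1)

# Drop the learning rate by an order of magnitude, e.g. 1e-4 instead of 1e-3
# (both values are illustrative, not taken from the repo's config).
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

loss = nn.functional.mse_loss(model(x), y)
loss.backward()
# Gradient clipping is a common companion fix for exploding gradients.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()
```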
