
[fix] Layer positions labelling & layernorm #348

Conversation

blefaudeux
Contributor

What does this PR do?

Fixes #347

Before submitting

  • Did you have fun?
    • Make sure you had fun coding 🙃
  • Did you read the contributor guideline?
  • Was this discussed/approved via a GitHub issue? (not needed for typos or doc improvements)
    • N/A
  • Did you make sure to update the docs?
    • N/A
  • Did you write any new necessary tests?
    • N/A
  • Did you update the changelog? (if needed)
    • N/A

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jul 3, 2022
@blefaudeux blefaudeux changed the base branch from main to hierachical_models_improvement July 3, 2022 13:10
@@ -149,10 +149,12 @@ def __init__(
for i in range(config.num_layers):
    # Label where this layer is in the stack
    # (for instance useful for the positional encoding, or late layer norm)
    if i > 0:
blefaudeux (Contributor Author) commented on this line:

This would count within the repeated layers, not within the overall layer stack.
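
To make this concrete, below is a minimal, hypothetical sketch (the `BlockConfig` / `label_positions` names are made up, not the actual xformers factory code) of the difference between counting inside the repeat loop and counting against the whole stack: per-loop indices make every repeated block claim its own "first" and "last" layer, whereas whole-stack indices give exactly one of each.

```python
# Hypothetical sketch, not the actual xformers code: BlockConfig and
# label_positions are invented names used only to illustrate the bug fixed here.
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class BlockConfig:
    num_layers: int  # how many times this block is repeated


def label_positions(block_configs: List[BlockConfig]) -> List[Dict[str, bool]]:
    total = sum(c.num_layers for c in block_configs)
    labels = []
    global_index = 0
    for config in block_configs:
        for i in range(config.num_layers):
            # Buggy variant: checking `i == 0` / `i == config.num_layers - 1`
            # only looks at the repeat loop, so every repeated block gets its
            # own "first" and "last" layer (and, downstream, a superfluous
            # final layer norm).
            labels.append(
                {
                    "is_first": global_index == 0,          # first in the whole stack
                    "is_last": global_index == total - 1,   # last in the whole stack
                }
            )
            global_index += 1
    return labels


# Two blocks repeated 2x and 3x -> exactly one first and one last layer overall.
print(label_positions([BlockConfig(2), BlockConfig(3)]))
```

With the whole-stack flags, things such as the positional encoding or the late layer norm only get attached where the configuration actually expects them.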

@blefaudeux blefaudeux force-pushed the hierachical_models_improvement branch from e5f22e4 to ebc4f6f Compare July 3, 2022 13:31
@@ -175,10 +175,7 @@ def __init__(self, config: xFormerEncoderConfig, **kwargs):
# Optional patch embedding
self.patch_emb: Optional[nn.Module] = None

if (
blefaudeux (Contributor Author) commented on this block:

This was not respecting the config while trying to do the right thing: if the config asked for a patch embedding but the layer was not first, the embedding would not be instantiated. In retrospect I think that is risky, since it does not do what the API says it will do, and it only worked in practice because is_first() was often wrong. I now think it is better to respect the config no matter what, and not silently diverge.
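
As a rough illustration of the "respect the config" point, here is a hedged sketch with hypothetical names (this is not the real xFormerEncoderConfig or factory API): if the config asks for a patch embedding, it is built unconditionally rather than silently skipped when the layer is not first.

```python
# Hypothetical sketch only: the patch_config dict layout below is assumed for
# illustration and is not the real xformers configuration schema.
from typing import Optional

import torch.nn as nn


def build_patch_embedding(patch_config: Optional[dict], is_first: bool) -> Optional[nn.Module]:
    if patch_config is None:
        return None
    # Old behaviour (risky): `if patch_config is not None and is_first`, which
    # silently dropped the embedding the config asked for on non-first layers.
    # New behaviour: the config is the source of truth, so `is_first` is
    # intentionally no longer consulted here.
    return nn.Conv2d(
        in_channels=patch_config["in_channels"],
        out_channels=patch_config["out_channels"],
        kernel_size=patch_config["kernel_size"],
        stride=patch_config["stride"],
    )
```

The upside is predictability: whatever the config declares is what gets instantiated, and any mismatch surfaces as an explicit configuration error rather than a silent behavioural change.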

@codecov-commenter

Codecov Report

Merging #348 (f8f4972) into hierachical_models_improvement (ebc4f6f) will not change coverage.
The diff coverage is 100.00%.

@@                       Coverage Diff                       @@
##           hierachical_models_improvement     #348   +/-   ##
===============================================================
  Coverage                           93.95%   93.95%           
===============================================================
  Files                                  70       70           
  Lines                                3984     3984           
===============================================================
  Hits                                 3743     3743           
  Misses                                241      241           
Flag      Coverage Δ
Python    93.95% <100.00%> (ø)

Flags with carried forward coverage won't be shown.

Impacted Files                        Coverage Δ
xformers/factory/block_factory.py     97.03% <100.00%> (ø)
xformers/factory/model_factory.py     98.16% <100.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update ebc4f6f...f8f4972.

dianaml0 (Contributor) left a comment

Nice catch!

@dianaml0 dianaml0 merged commit 58b36eb into facebookresearch:hierachical_models_improvement Jul 5, 2022
blefaudeux added a commit that referenced this pull request Jul 14, 2022

* handling different normalizations + layer repetition

* bugfix localizing the layers in the stack (#348)

* renaming the layer_norm_style param when building from config

Co-authored-by: Benjamin Lefaudeux <lefaudeux@Benjamins-MacBook-Pro.local>
Development

Successfully merging this pull request may close these issues.

[factory] First and last layers are not properly tagged -> superfluous layernorms