
RoBERTa model #908

Closed · stefan-it opened this issue Jul 27, 2019 · 3 comments

stefan-it (Contributor) commented Jul 27, 2019

Hi,

thanks for releasing the RoBERTa model ❤️

I have one question regarding the output features:

features = roberta.extract_features(tokens)
features.size()
torch.Size([1, 5, 1024])

Are these features the output of the last layer (layer no. 24) of the Transformer model? Is it currently possible to select a specific layer?

Thanks in advance,

Stefan

myleott (Contributor) commented Jul 27, 2019

Yes, it’s the last layer before lm_head. I will add an option to expose specific layers, but for now you can copy this: https://github.com/pytorch/fairseq/blob/master/fairseq/models/roberta.py#L106

If you add return_all_hiddens=True, then the second element of the returned tuple will contain all of the inner states.
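For reference, a minimal sketch of what the linked extract_features helper roughly does; the exact keyword names (features_only, return_all_hiddens) and the extra['inner_states'] key are assumptions based on this thread and the fairseq code of that time, so see the linked roberta.py for the authoritative version:

# Hedged sketch of the hub helper, not the authoritative implementation.
def extract_features(roberta, tokens, return_all_hiddens=False):
    if tokens.dim() == 1:
        tokens = tokens.unsqueeze(0)  # add a batch dimension
    # Assumption: the underlying model returns (features, extra), and with
    # return_all_hiddens=True, extra['inner_states'] holds every layer.
    features, extra = roberta.model(
        tokens,
        features_only=True,
        return_all_hiddens=return_all_hiddens,
    )
    if return_all_hiddens:
        # inner states come out as T x B x C, so transpose to B x T x C
        return [state.transpose(0, 1) for state in extra['inner_states']]
    return features  # last layer only, shape B x T x C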

myleott (Contributor) commented Jul 27, 2019

Exposed new functionality in #909:

>>> roberta.eval()

>>> tokens = roberta.encode('Hello world.')

>>> last_layer_features = roberta.extract_features(tokens)
>>> last_layer_features.size()
torch.Size([1, 6, 1024])

>>> all_layers = roberta.extract_features(tokens, return_all_hiddens=True)
>>> len(all_layers)
25
>>> all_layers[-1].size()
torch.Size([1, 6, 1024])

>>> torch.all(all_layers[-1] == last_layer_features)
tensor(1, dtype=torch.uint8)
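As an illustrative follow-up (not from the thread): once all_layers is available, a specific layer can be picked out by index. The layer index and the mean-pooling below are hypothetical choices, and the printed shapes assume intermediate layers match the last layer's shape:

>>> layer_12 = all_layers[12]  # hypothetical layer choice; index 0 is the embedding output
>>> layer_12.size()
torch.Size([1, 6, 1024])
>>> sentence_vec = layer_12.mean(dim=1)  # illustrative mean-pooling over tokens
>>> sentence_vec.size()
torch.Size([1, 1024])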

myleott closed this as completed Jul 27, 2019
stefan-it (Contributor, Author) commented Aug 1, 2019

Hi @myleott, thanks for your kind help and for extending the interface ❤️ I was now able to integrate RoBERTa into an upcoming version of Flair 🤗

facebook-github-bot pushed a commit that referenced this issue Nov 14, 2019
Summary:
(1) Enable printing the iterative refinement history for all NAT models by setting --retain-iter-history during decoding;
(2) Fix a small bug in the decoding process in Levenshtein Transformer.
Pull Request resolved: fairinternal/fairseq-py#908

Differential Revision: D18493234

Pulled By: MultiPath

fbshipit-source-id: 9e7702adcea49f39d3c10b5349b5a9ae66399a24
ebetica pushed a commit to ebetica/fairseq that referenced this issue Nov 20, 2019
moussaKam pushed a commit to moussaKam/language-adaptive-pretraining that referenced this issue Sep 29, 2020
yfyeung pushed a commit to yfyeung/fairseq that referenced this issue Dec 6, 2023