
RoBERTa model #908

Closed · stefan-it opened this issue Jul 27, 2019 · 3 comments

stefan-it (Contributor) commented Jul 27, 2019

Hi,

thanks for releasing the RoBERTa model ❤️

I have one question regarding the output features:

features = roberta.extract_features(tokens)
features.size()
torch.Size([1, 5, 1024])

Are these features the output of the last layer (layer no. 24) of the Transformer model? Is it currently possible to select a specific layer?

Thanks in advance,

Stefan

myleott (Contributor) commented Jul 27, 2019

Yes, it’s the last layer before lm_head. I will add an option to expose specific layers, but for now you can copy this: https://github.com/pytorch/fairseq/blob/master/fairseq/models/roberta.py#L106

If you add return_all_hiddens=True, then the second element of the returned tuple will contain all of the inner states.
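For reference, a minimal sketch of what the linked extract_features helper roughly does; the exact keyword names (features_only, return_all_hiddens) and the extra['inner_states'] key are assumptions based on this thread and the fairseq code of that time, so see the linked roberta.py for the authoritative version:

# Hedged sketch of the hub helper, not the authoritative implementation.
def extract_features(roberta, tokens, return_all_hiddens=False):
    if tokens.dim() == 1:
        tokens = tokens.unsqueeze(0)  # add a batch dimension
    # Assumption: the underlying model returns (features, extra), and with
    # return_all_hiddens=True, extra['inner_states'] holds every layer.
    features, extra = roberta.model(
        tokens,
        features_only=True,
        return_all_hiddens=return_all_hiddens,
    )
    if return_all_hiddens:
        # inner states come out as T x B x C, so transpose to B x T x C
        return [state.transpose(0, 1) for state in extra['inner_states']]
    return features  # last layer only, shape B x T x C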

myleott (Contributor) commented Jul 27, 2019

Exposed new functionality in #909:

>>> roberta.eval()

>>> tokens = roberta.encode('Hello world.')

>>> last_layer_features = roberta.extract_features(tokens)
>>> last_layer_features.size()
torch.Size([1, 6, 1024])

>>> all_layers = roberta.extract_features(tokens, return_all_hiddens=True)
>>> len(all_layers)
25
>>> all_layers[-1].size()
torch.Size([1, 6, 1024])

>>> torch.all(all_layers[-1] == last_layer_features)
tensor(1, dtype=torch.uint8)
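As an illustrative follow-up (not from the thread): once all_layers is available, a specific layer can be picked out by index. The layer index and the mean-pooling below are hypothetical choices, and the printed shapes assume intermediate layers match the last layer's shape:

>>> layer_12 = all_layers[12]  # hypothetical layer choice; index 0 is the embedding output
>>> layer_12.size()
torch.Size([1, 6, 1024])
>>> sentence_vec = layer_12.mean(dim=1)  # illustrative mean-pooling over tokens
>>> sentence_vec.size()
torch.Size([1, 1024])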

myleott closed this as completed Jul 27, 2019
stefan-it (Contributor, Author) commented Aug 1, 2019

Hi @myleott, thanks for your kind help and for extending the interface ❤️ I was now able to integrate RoBERTa into an upcoming version of Flair 🤗

facebook-github-bot pushed a commit that referenced this issue Nov 14, 2019
Summary:
(1) Enable printing the iterative refinement history for all NAT models by setting --retain-iter-history during decoding;
(2) Fix a small bug in the decoding process in Levenshtein Transformer.
Pull Request resolved: fairinternal/fairseq-py#908

Differential Revision: D18493234

Pulled By: MultiPath

fbshipit-source-id: 9e7702adcea49f39d3c10b5349b5a9ae66399a24
ebetica pushed a commit to ebetica/fairseq that referenced this issue Nov 20, 2019
moussaKam pushed a commit to moussaKam/language-adaptive-pretraining that referenced this issue Sep 29, 2020
yfyeung pushed a commit to yfyeung/fairseq that referenced this issue Dec 6, 2023