
Add new ELMoTransformerEmbeddings class #399

Merged: 3 commits into flairNLP:master on Jan 31, 2019
Conversation

@stefan-it (Member) commented Jan 16, 2019

Hi,

this PR introduces a new ELMoTransformerEmbeddings class. With help from @brendan-ai2 it is possible to get embeddings from a transformer-based ELMo model.

That model was proposed in Dissecting Contextual Word Embeddings: Architecture and Representation.

Embeddings from a transformer-based ELMo model can now be used in flair. The new CUDA semantics are also used, and training a model works on both CPU and GPU.

A pretrained transformer-based ELMo model for Basque can be downloaded from:

wget https://schweter.eu/cloud/elmo-transformer-models/eu-elmo-transformer-model.tar.gz

Downstream task example

To train a model for PoS tagging (in this example for Basque) with the new ELMoTransformerEmbeddings, follow these instructions:

Clone a recent version of allennlp and install it, e.g.:

git clone https://github.com/allenai/allennlp.git
cd allennlp
pip3 install -e .

I tested it with commit 4c5de57 of allennlp.
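If you prefer pip over a local clone, installing allennlp directly from that pinned commit should also work (untested here; this is just the standard pip VCS install syntax):

pip3 install git+https://github.com/allenai/allennlp.git@4c5de57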

Then download a pretrained transformer-based ELMo model.

wget https://schweter.eu/cloud/elmo-transformer-models/eu-elmo-transformer-model.tar.gz

The training can be started with the following script:

from typing import List

from flair.data_fetcher import NLPTaskDataFetcher, NLPTask
from flair.embeddings import TokenEmbeddings, WordEmbeddings, StackedEmbeddings, ELMoTransformerEmbeddings

# Load the Basque UD corpus, downsampled to 10% for a quick test run
corpus = NLPTaskDataFetcher.load_corpus(NLPTask.UD_BASQUE).downsample(0.1)

tag_type = 'upos'
tag_dictionary = corpus.make_tag_dictionary(tag_type=tag_type)

# Stack classic Basque word embeddings with the new transformer-based ELMo embeddings
custom_embedding = WordEmbeddings('eu')

embedding_types: List[TokenEmbeddings] = [
    custom_embedding,
    ELMoTransformerEmbeddings(model_file='eu-elmo-transformer-model.tar.gz')
]

embeddings: StackedEmbeddings = StackedEmbeddings(embeddings=embedding_types)

from flair.models import SequenceTagger

# pickle_module='dill' is required for saving models that use ELMoTransformerEmbeddings
tagger: SequenceTagger = SequenceTagger(hidden_size=512,
                                        embeddings=embeddings,
                                        tag_dictionary=tag_dictionary,
                                        tag_type=tag_type,
                                        use_crf=False,
                                        pickle_module='dill')

from flair.trainers import ModelTrainer
from flair.training_utils import EvaluationMetric

trainer: ModelTrainer = ModelTrainer(tagger, corpus)

trainer.train('resources/taggers/ud-basque-elmo-transformer',
              EvaluationMetric.MICRO_ACCURACY,
              learning_rate=0.1,
              mini_batch_size=8,
              max_epochs=1)
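After training finishes, the tagger can be loaded back and applied to new text. A short sketch, assuming the flair loading API of this version (load_from_file) and that the trainer writes final-model.pt into the base path:

from flair.data import Sentence
from flair.models import SequenceTagger

# load the trained model written by the trainer above
tagger = SequenceTagger.load_from_file('resources/taggers/ud-basque-elmo-transformer/final-model.pt')

# tag a (Basque) example sentence and print the result
sentence = Sentence('Kaixo mundua')
tagger.predict(sentence)
print(sentence.to_tagged_string())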

@alanakbik (Collaborator)

I am getting the following error when running the script:

Traceback (most recent call last):
  File "/home/aakbik/PycharmProjects/flair/local_test_local.py", line 15, in <module>
    ELMoTransformerEmbeddings(model_file='/home/aakbik/Documents/Data/Embeddings/eu-elmo-transformer-model.tar.gz')
  File "/home/aakbik/PycharmProjects/flair/flair/embeddings.py", line 337, in __init__
    self.allen_nlp_utils = ELMoTransformerEmbeddings.AllenNlpUtils(model_file)
  File "/home/aakbik/PycharmProjects/flair/flair/embeddings.py", line 360, in __init__
    requires_grad=False
  File "/home/aakbik/.environments/flair/lib/python3.6/site-packages/allennlp/modules/token_embedders/bidirectional_language_model_token_embedder.py", line 68, in __init__
    archive = load_archive(archive_file, overrides=json.dumps(overrides))
  File "/home/aakbik/.environments/flair/lib/python3.6/site-packages/allennlp/models/archival.py", line 156, in load_archive
    cuda_device=cuda_device)
  File "/home/aakbik/.environments/flair/lib/python3.6/site-packages/allennlp/models/model.py", line 321, in load
    return cls.by_name(model_type)._load(config, serialization_dir, weights_file, cuda_device)
  File "/home/aakbik/.environments/flair/lib/python3.6/site-packages/allennlp/common/registrable.py", line 58, in by_name
    raise ConfigurationError("%s is not a registered name for %s" % (name, cls.__name__))
allennlp.common.checks.ConfigurationError: 'language_model is not a registered name for Model'

Since I haven't worked with allennlp much: Any ideas where this error is coming from?

@stefan-it (Member, Author)

@alanakbik You should use a recent master version of allennlp; I think the one from pip is too old.

@alanakbik (Collaborator)

Ah ok, I will!

@stefan-it (Member, Author) commented Jan 19, 2019

@alanakbik It would work with the latest allennlp version, but for saving the trained model the dill package needs to be used:

torch.save(model_state, str(model_file), pickle_module=dill)

Do you think we can add a new parameter to SequenceTagger that allows using another library for saving a torch model? And of course the default value would be to use pickle :)
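A minimal sketch of what such a parameter could look like (the helper name save_torch_model and the lookup of the pickle module by name are assumptions for illustration, not the final implementation):

import importlib

import torch

def save_torch_model(model_state, model_file, pickle_module='pickle'):
    # Resolve the pickle implementation by name; dill can serialize
    # objects that the standard pickle module cannot.
    module = importlib.import_module(pickle_module)
    torch.save(model_state, str(model_file), pickle_module=module)

# default keeps standard pickle:
# save_torch_model(state, 'model.pt')
# for ELMoTransformerEmbeddings, use dill:
# save_torch_model(state, 'model.pt', pickle_module='dill')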

@alanakbik (Collaborator)

@stefan-it yes I think that would be better since it would make it easier for people to use the class (just install allennlp and dill)!

@stefan-it (Member, Author) commented Jan 21, 2019

I rebased the code onto the latest master and refactored the ELMoTransformerEmbeddings class code. I also introduced a new helper method for saving a torch model using a user-defined pickle module (like dill, but the default is the normal pickle module).

Testing was done successfully on CPU; on GPU I'm currently not able to train a model, see #407.

@stefan-it changed the title from "WIP: Add new ELMoTransformerEmbeddings class" to "Add new ELMoTransformerEmbeddings class" on Jan 21, 2019
@tabergma (Collaborator)

Mmhh... I guess something went wrong with the rebasing, as changes that are already in the master branch are shown as changes of this PR. Would you mind updating the branch again so that we only see the changes you actually made? Thanks!

Commits:

This supports different pickle modules (like the dill library).

SequenceTagger class gets a new variable for specifying a pickle module. Default is the standard pickle module.

In order to use the new ELMoTransformerEmbeddings class, just use dill.
@tabergma (Collaborator)

👍 Thanks! Looks good.

@stefan-it (Member, Author)

Sorry for the confusion!

@alanakbik (Collaborator)

Hi @stefan-it - I'm testing the current version locally with allennlp 0.8.1. When running this code:

from flair.data import Sentence
from flair.embeddings import ELMoTransformerEmbeddings

embeddings = ELMoTransformerEmbeddings('eu-elmo-transformer-model.tar.gz')
embeddings.embed(Sentence('I love Berlin'))

I get the error:

Traceback (most recent call last):
  File "/home/aakbik/PycharmProjects/stefan-flair/flair/train.py", line 21, in <module>
    embeddings = ELMoTransformerEmbeddings('eu-elmo-transformer-model.tar.gz')
  File "/home/aakbik/PycharmProjects/stefan-flair/flair/flair/embeddings.py", line 356, in __init__
    requires_grad=False
  File "/home/aakbik/.environments/stefan/lib/python3.6/site-packages/allennlp/modules/token_embedders/bidirectional_language_model_token_embedder.py", line 68, in __init__
    archive = load_archive(archive_file, overrides=json.dumps(overrides))
  File "/home/aakbik/.environments/stefan/lib/python3.6/site-packages/allennlp/models/archival.py", line 156, in load_archive
    cuda_device=cuda_device)
  File "/home/aakbik/.environments/stefan/lib/python3.6/site-packages/allennlp/models/model.py", line 321, in load
    return cls.by_name(model_type)._load(config, serialization_dir, weights_file, cuda_device)
  File "/home/aakbik/.environments/stefan/lib/python3.6/site-packages/allennlp/common/registrable.py", line 58, in by_name
    raise ConfigurationError("%s is not a registered name for %s" % (name, cls.__name__))
allennlp.common.checks.ConfigurationError: 'language_model is not a registered name for Model'

Any idea where this error comes from?

@stefan-it (Member, Author) commented Jan 31, 2019

0.8.1 is too old :( Please try the latest master (commit 2b5249); I also tested 55b9b.
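To double-check which allennlp is actually installed (a quick sanity check; the exact output format may vary across pip versions):

pip3 show allennlp
# or locate the package from Python to confirm it points at your clone:
python3 -c "import allennlp; print(allennlp.__file__)"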

With the latest master version you should see something like:

from flair.data import Sentence
from flair.embeddings import ELMoTransformerEmbeddings

sentence = Sentence("It is no longer snowing in Munich .")
embeddings = ELMoTransformerEmbeddings(model_file='eu-elmo-transformer-model.tar.gz')
embeddings.embed(sentence)

for token in sentence.tokens:
    print(token.embedding)

tensor([  6.3030, -25.0204,   3.3425,  ...,   4.0137, -29.3740,   7.8902])
tensor([ 12.4846, -19.3540,  -3.5304,  ...,   6.7518,  -6.8744,  -7.1798])
tensor([ 16.5951, -19.1923,  -6.8334,  ...,  12.1934,  -0.0000,  -0.0000])
tensor([ 4.6706, -0.0000, -0.0000,  ...,  6.8242, -8.0350, -3.6316])
tensor([  5.2432, -15.5042,  -4.4524,  ...,   2.1873, -12.7269,  -5.6938])
tensor([ 20.1224, -21.2971,   2.6443,  ...,   9.2506,   1.6562,  -4.3060])
tensor([ 2.4384, -0.0000, -7.1844,  ...,  0.0000, -0.0000,  3.2789])
tensor([ 16.3643,  -0.0000,  -0.0000,  ..., -14.8744,  16.0890,  13.4172])

@alanakbik (Collaborator)

Thanks, this works! Sorry, with the dill change I somehow thought this meant that it works with pip-installed allennlp :)

For now, we could include this class but advise people that it is experimental and requires checking out the current master branch of allennlp. As soon as a new version of allennlp that allows this class to work is pushed to pip, we could remove the experimental tag. What do you think?

@stefan-it (Member, Author) commented Jan 31, 2019

I fully agree with you :) Even in the allennlp library the ELMo Transformer is experimental, see their disclaimer:

ExperimentalFeatureWarning: This particular transformer implementation 
is a provisional feature that's intended for AI2 internal use and might 
be deleted at any time

You can add me as a kind of maintainer for the ELMoTransformerEmbeddings class, because I'll use it regularly :)

@alanakbik (Collaborator)

Ok, sounds good! Will merge as soon as tests run through.

@alanakbik merged commit df21e6e into flairNLP:master on Jan 31, 2019
@alanakbik (Collaborator)

Thanks for adding this - really looking forward to seeing what people do with this and how it compares!
