
Add new ELMoTransformerEmbeddings class #399

Merged: 3 commits into flairNLP:master on Jan 31, 2019
Conversation

@stefan-it (Member) commented Jan 16, 2019

Hi,

this PR introduces a new ELMoTransformerEmbeddings class. With help from @brendan-ai2 it is possible to get embeddings from a transformer-based ELMo model.

That model was proposed in Dissecting Contextual Word Embeddings: Architecture and Representation.

Embeddings from a transformer-based ELMo model can now be used in flair. The new CUDA semantics are also used, and training a model works on both CPU and GPU.

A pretrained transformer-based ELMo model for Basque can be downloaded from:

wget https://schweter.eu/cloud/elmo-transformer-models/eu-elmo-transformer-model.tar.gz

Downstream task example

To train a model for PoS tagging (in this example for Basque) with the new ELMoTransformerEmbeddings, follow these instructions:

Clone a recent version of allennlp and install it, e.g.:

git clone https://github.com/allenai/allennlp.git
cd allennlp
pip3 install -e .

I tested it with commit 4c5de57 of allennlp.
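If you prefer pip over a local clone, installing allennlp directly from that pinned commit should also work (untested here; this is just the standard pip VCS install syntax):

pip3 install git+https://github.com/allenai/allennlp.git@4c5de57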

Then download a pretrained transformer-based ELMo model.

wget https://schweter.eu/cloud/elmo-transformer-models/eu-elmo-transformer-model.tar.gz

The training can be started with the following script:

from typing import List

from flair.data_fetcher import NLPTaskDataFetcher, NLPTask
from flair.embeddings import TokenEmbeddings, WordEmbeddings, StackedEmbeddings, ELMoTransformerEmbeddings

# Load the Basque UD corpus, downsampled to 10% for a quick test run
corpus = NLPTaskDataFetcher.load_corpus(NLPTask.UD_BASQUE).downsample(0.1)

tag_type = 'upos'
tag_dictionary = corpus.make_tag_dictionary(tag_type=tag_type)

# Stack classic Basque word embeddings with the new transformer-based ELMo embeddings
custom_embedding = WordEmbeddings('eu')

embedding_types: List[TokenEmbeddings] = [
    custom_embedding,
    ELMoTransformerEmbeddings(model_file='eu-elmo-transformer-model.tar.gz')
]

embeddings: StackedEmbeddings = StackedEmbeddings(embeddings=embedding_types)

from flair.models import SequenceTagger

# pickle_module='dill' is required for saving models that use ELMoTransformerEmbeddings
tagger: SequenceTagger = SequenceTagger(hidden_size=512,
                                        embeddings=embeddings,
                                        tag_dictionary=tag_dictionary,
                                        tag_type=tag_type,
                                        use_crf=False,
                                        pickle_module='dill')

from flair.trainers import ModelTrainer
from flair.training_utils import EvaluationMetric

trainer: ModelTrainer = ModelTrainer(tagger, corpus)

trainer.train('resources/taggers/ud-basque-elmo-transformer',
              EvaluationMetric.MICRO_ACCURACY,
              learning_rate=0.1,
              mini_batch_size=8,
              max_epochs=1)
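After training finishes, the tagger can be loaded back and applied to new text. A short sketch, assuming the flair loading API of this version (load_from_file) and that the trainer writes final-model.pt into the base path:

from flair.data import Sentence
from flair.models import SequenceTagger

# load the trained model written by the trainer above
tagger = SequenceTagger.load_from_file('resources/taggers/ud-basque-elmo-transformer/final-model.pt')

# tag a (Basque) example sentence and print the result
sentence = Sentence('Kaixo mundua')
tagger.predict(sentence)
print(sentence.to_tagged_string())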

@alanakbik (Collaborator)

I am getting the following error when running the script:

Traceback (most recent call last):
  File "/home/aakbik/PycharmProjects/flair/local_test_local.py", line 15, in <module>
    ELMoTransformerEmbeddings(model_file='/home/aakbik/Documents/Data/Embeddings/eu-elmo-transformer-model.tar.gz')
  File "/home/aakbik/PycharmProjects/flair/flair/embeddings.py", line 337, in __init__
    self.allen_nlp_utils = ELMoTransformerEmbeddings.AllenNlpUtils(model_file)
  File "/home/aakbik/PycharmProjects/flair/flair/embeddings.py", line 360, in __init__
    requires_grad=False
  File "/home/aakbik/.environments/flair/lib/python3.6/site-packages/allennlp/modules/token_embedders/bidirectional_language_model_token_embedder.py", line 68, in __init__
    archive = load_archive(archive_file, overrides=json.dumps(overrides))
  File "/home/aakbik/.environments/flair/lib/python3.6/site-packages/allennlp/models/archival.py", line 156, in load_archive
    cuda_device=cuda_device)
  File "/home/aakbik/.environments/flair/lib/python3.6/site-packages/allennlp/models/model.py", line 321, in load
    return cls.by_name(model_type)._load(config, serialization_dir, weights_file, cuda_device)
  File "/home/aakbik/.environments/flair/lib/python3.6/site-packages/allennlp/common/registrable.py", line 58, in by_name
    raise ConfigurationError("%s is not a registered name for %s" % (name, cls.__name__))
allennlp.common.checks.ConfigurationError: 'language_model is not a registered name for Model'

Since I haven't worked with allennlp much: Any ideas where this error is coming from?

@stefan-it (Member, Author)

@alanakbik You should use a recent master version of allennlp; I think the one from pip is too old.

@alanakbik (Collaborator)

Ah ok, I will!

@stefan-it (Member, Author) commented Jan 19, 2019

@alanakbik It would work with the latest allennlp version, but for saving the trained model the dill package needs to be used:

torch.save(model_state, str(model_file), pickle_module=dill)

Do you think we can add a new parameter to SequenceTagger that allows using another library for saving a torch model? And of course the default value would be to use pickle :)
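A minimal sketch of what such a parameter could look like (the helper name save_torch_model and the lookup of the pickle module by name are assumptions for illustration, not the final implementation):

import importlib

import torch

def save_torch_model(model_state, model_file, pickle_module='pickle'):
    # Resolve the pickle implementation by name; dill can serialize
    # objects that the standard pickle module cannot.
    module = importlib.import_module(pickle_module)
    torch.save(model_state, str(model_file), pickle_module=module)

# default keeps standard pickle:
# save_torch_model(state, 'model.pt')
# for ELMoTransformerEmbeddings, use dill:
# save_torch_model(state, 'model.pt', pickle_module='dill')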

@alanakbik (Collaborator)

@stefan-it yes I think that would be better since it would make it easier for people to use the class (just install allennlp and dill)!

@stefan-it (Member, Author) commented Jan 21, 2019

I rebased the code onto the latest master and refactored the ELMoTransformerEmbeddings class code. I also introduced a new helper method for saving a torch model using a user-defined pickle module (like dill, but the default is the normal pickle module).

Testing was done successfully on CPU; on GPU I'm currently not able to train a model, see #407.

@stefan-it changed the title from "WIP: Add new ELMoTransformerEmbeddings class" to "Add new ELMoTransformerEmbeddings class" on Jan 21, 2019
@tabergma (Collaborator)

Mmhh... I guess something went wrong with the rebasing, as changes that are already in the master branch are shown as changes of this PR. Would you mind updating the branch again so that we only see the changes you actually made? Thanks!

Commits:

This supports different pickle modules (like the dill library).

SequenceTagger class gets a new variable for specifying a pickle module. Default is the standard pickle module.

In order to use the new ELMoTransformerEmbeddings class, just use dill.
@tabergma (Collaborator)

👍 Thanks! Looks good.

@stefan-it (Member, Author)

Sorry for the confusion!

@alanakbik (Collaborator)

Hi @stefan-it - I'm testing the current version locally with allennlp 0.8.1. When running this code:

from flair.data import Sentence
from flair.embeddings import ELMoTransformerEmbeddings

embeddings = ELMoTransformerEmbeddings('eu-elmo-transformer-model.tar.gz')
embeddings.embed(Sentence('I love Berlin'))

I get the error:

Traceback (most recent call last):
  File "/home/aakbik/PycharmProjects/stefan-flair/flair/train.py", line 21, in <module>
    embeddings = ELMoTransformerEmbeddings('eu-elmo-transformer-model.tar.gz')
  File "/home/aakbik/PycharmProjects/stefan-flair/flair/flair/embeddings.py", line 356, in __init__
    requires_grad=False
  File "/home/aakbik/.environments/stefan/lib/python3.6/site-packages/allennlp/modules/token_embedders/bidirectional_language_model_token_embedder.py", line 68, in __init__
    archive = load_archive(archive_file, overrides=json.dumps(overrides))
  File "/home/aakbik/.environments/stefan/lib/python3.6/site-packages/allennlp/models/archival.py", line 156, in load_archive
    cuda_device=cuda_device)
  File "/home/aakbik/.environments/stefan/lib/python3.6/site-packages/allennlp/models/model.py", line 321, in load
    return cls.by_name(model_type)._load(config, serialization_dir, weights_file, cuda_device)
  File "/home/aakbik/.environments/stefan/lib/python3.6/site-packages/allennlp/common/registrable.py", line 58, in by_name
    raise ConfigurationError("%s is not a registered name for %s" % (name, cls.__name__))
allennlp.common.checks.ConfigurationError: 'language_model is not a registered name for Model'

Any idea where this error comes from?

@stefan-it (Member, Author) commented Jan 31, 2019

0.8.1 is too old :( Please try the latest master (commit 2b5249); I also tested 55b9b.
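To double-check which allennlp is actually installed (a quick sanity check; the exact output format may vary across pip versions):

pip3 show allennlp
# or locate the package from Python to confirm it points at your clone:
python3 -c "import allennlp; print(allennlp.__file__)"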

With the latest master version you should see something like:

from flair.data import Sentence
from flair.embeddings import ELMoTransformerEmbeddings

sentence = Sentence("It is no longer snowing in Munich .")
embeddings = ELMoTransformerEmbeddings(model_file='eu-elmo-transformer-model.tar.gz')
embeddings.embed(sentence)

for token in sentence.tokens:
    print(token.embedding)

tensor([  6.3030, -25.0204,   3.3425,  ...,   4.0137, -29.3740,   7.8902])
tensor([ 12.4846, -19.3540,  -3.5304,  ...,   6.7518,  -6.8744,  -7.1798])
tensor([ 16.5951, -19.1923,  -6.8334,  ...,  12.1934,  -0.0000,  -0.0000])
tensor([ 4.6706, -0.0000, -0.0000,  ...,  6.8242, -8.0350, -3.6316])
tensor([  5.2432, -15.5042,  -4.4524,  ...,   2.1873, -12.7269,  -5.6938])
tensor([ 20.1224, -21.2971,   2.6443,  ...,   9.2506,   1.6562,  -4.3060])
tensor([ 2.4384, -0.0000, -7.1844,  ...,  0.0000, -0.0000,  3.2789])
tensor([ 16.3643,  -0.0000,  -0.0000,  ..., -14.8744,  16.0890,  13.4172])

@alanakbik (Collaborator)

Thanks, this works! Sorry, with the dill change I somehow thought this meant that it works with pip-installed allennlp :)

For now, we could include this class but advise people that it is experimental and requires checking out the current master branch of allennlp. As soon as a new version of allennlp that allows this class to work is pushed to pip, we could remove the experimental tag. What do you think?

@stefan-it (Member, Author) commented Jan 31, 2019

I fully agree with you :) Even in the allennlp library the ELMo Transformer is experimental, see their disclaimer:

ExperimentalFeatureWarning: This particular transformer implementation 
is a provisional feature that's intended for AI2 internal use and might 
be deleted at any time

You can add me as a kind of maintainer for the ELMoTransformerEmbeddings class, because I'll use it regularly :)

@alanakbik (Collaborator)

Ok, sounds good! Will merge as soon as tests run through.

@alanakbik merged commit df21e6e into flairNLP:master on Jan 31, 2019
@alanakbik (Collaborator)

Thanks for adding this - really looking forward to seeing what people do with this and how it compares!
