Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flair Regression #564

Merged
merged 14 commits into from
Apr 16, 2019
Merged

Flair Regression #564

merged 14 commits into from
Apr 16, 2019

Conversation

heukirne
Copy link
Contributor

Building model and trainer for regression task with Flair framework #440

Still need some improvements:

  • fix loggin at RegressorTrainer._calculate_evaluation_results_for()
  • need to return properly the result in Metric() format
  • add commit messages with Support for regression #440

@heukirne
Copy link
Contributor Author

Hello @alanakbik and @rnditdev,
Do you have any suggestions for this PR?
Thanks

@alanakbik
Copy link
Collaborator

Hi @heukirne it looks good. However, I ran the current code with the following script:

# get corpus
corpus = NLPTaskDataFetcher.load_corpus(NLPTask.REGRESSION, 'tests/resources/tasks')

# init document embeddings
document_embeddings: DocumentRNNEmbeddings = DocumentRNNEmbeddings(
    [WordEmbeddings('glove'),
     FlairEmbeddings('news-forward', use_cache=True),
     FlairEmbeddings('news-backward', use_cache=True)],
    128, 1, False, 64, False, False)

# init regressor
model = TextRegressor(document_embeddings, Dictionary(), False)

# train
trainer = RegressorTrainer(model, corpus)

trainer.train('resources/taggers/regression',
              max_epochs=150,
              mini_batch_size=4,
              embeddings_in_memory=True,
              )

and it gave me the following results for the final model after 72 epochs:

AVG: mse 0.021794370514728552 - mae 0.13093452751636503 - pearson -1.0 - spearman -0.9999999999999999

Is this correct? In particular, the pearson and spearman numbers look odd. Are you getting similar results?

@heukirne
Copy link
Contributor Author

heukirne commented Mar 1, 2019

Hello @alanakbik, yes, I got a similar result:

glove_embedding: WordEmbeddings = WordEmbeddings('glove')
document_embeddings: DocumentRNNEmbeddings = DocumentRNNEmbeddings([glove_embedding], 128, 1, False, 64, False, False)
modelR = TextRegressor(document_embeddings, Dictionary(), False)
trainerR = RegressorTrainer(modelR, corpus)
trainerR.train('regression_train/',  max_epochs=150, mini_batch_size=4, embeddings_in_memory=True)

after 75 epochs

AVG: mse 0.04095809102497722 - mae 0.1944224864244461 - pearson 0.7231063539038256 - spearman 0.39999999999999997

Seems an possible result. Is a little weird the negative result in pearson and spearman, but probably the corpus is upside-down.

There is something wrong with the corpus files, probably the test is part of train.

@alanakbik
Copy link
Collaborator

@heukirne Ok! Could you double-check the corpus and the implementation before we merge?

@heukirne
Copy link
Contributor Author

heukirne commented Mar 6, 2019

Hello @alanakbik, I re-run the tests and add a new evaluation metric name.
But the trainer is not logging in the file yet. I'm return an Metric object.
I'll create a new class Metric for Regression and perform more tests.

@alanakbik
Copy link
Collaborator

Cool, thanks!

@heukirne
Copy link
Contributor Author

heukirne commented Mar 6, 2019

Hi @alanakbik , now the MetricRegression object is more compatible with the Metric one. The results now is working properly (I fix a problem when add MetricRegression).
Now it's logging the results in loss.tsv file also, but the header name and the visual Plotter still print the wrong label (there is a static method Metric used in train() function 77 and 196 lines).
I believe now is fine to merge as a beta feature.

@alanakbik
Copy link
Collaborator

@heukirne thanks! It looks like the new unit tests for the regressor are failing as of the last commit, with the message:

NameError: name 'log' is not defined

You probably need to define the logger in the class first.

@heukirne
Copy link
Contributor Author

heukirne commented Mar 7, 2019

@alanakbik, my bad, now it's fixed! ;)

@alanakbik
Copy link
Collaborator

thanks :)

@heukirne
Copy link
Contributor Author

heukirne commented Apr 9, 2019

👍

@alanakbik
Copy link
Collaborator

@heukirne sorry for the delay - only now got back from my vacation. I ran some tests and I think we're good to go to add this as a beta feature. We're planning a refactoring of the flair.nn.Model interface and data loaders soon so it is really good to have this in the project! Thank you very much for adding this and also for your patience :)

@alanakbik
Copy link
Collaborator

👍

@heukirne
Copy link
Contributor Author

heukirne commented Apr 15, 2019

Excelent, @alanakbik !

@alanakbik
Copy link
Collaborator

👍

@kashif
Copy link
Contributor

kashif commented Apr 16, 2019

👍

@alanakbik alanakbik merged commit 11850ee into flairNLP:master Apr 16, 2019
alanakbik pushed a commit that referenced this pull request May 27, 2019
alanakbik pushed a commit that referenced this pull request May 27, 2019
alanakbik pushed a commit that referenced this pull request May 27, 2019
alanakbik pushed a commit that referenced this pull request May 27, 2019
alanakbik pushed a commit that referenced this pull request May 27, 2019
alanakbik pushed a commit that referenced this pull request May 27, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants