Flair Regression #564

heukirne · 2019-02-24T16:18:39Z

This PR builds a model and a trainer for regression tasks with the Flair framework (#440):

new model TextRegressor
new trainer RegressorTrainer
new dataset for regression task: WASSA-2017 Shared Task on Emotion Intensity
basic unit test for TextRegressor

Some improvements are still needed:

fix logging in RegressorTrainer._calculate_evaluation_results_for()
return the result properly in Metric() format
add commit messages with "Support for regression #440"

heukirne · 2019-02-28T22:24:59Z

Hello @alanakbik and @rnditdev,
Do you have any suggestions for this PR?
Thanks

alanakbik · 2019-03-01T13:14:49Z

Hi @heukirne it looks good. However, I ran the current code with the following script:

# get corpus
corpus = NLPTaskDataFetcher.load_corpus(NLPTask.REGRESSION, 'tests/resources/tasks')

# init document embeddings
document_embeddings: DocumentRNNEmbeddings = DocumentRNNEmbeddings(
    [WordEmbeddings('glove'),
     FlairEmbeddings('news-forward', use_cache=True),
     FlairEmbeddings('news-backward', use_cache=True)],
    128, 1, False, 64, False, False)

# init regressor
model = TextRegressor(document_embeddings, Dictionary(), False)

# train
trainer = RegressorTrainer(model, corpus)

trainer.train('resources/taggers/regression',
              max_epochs=150,
              mini_batch_size=4,
              embeddings_in_memory=True,
              )

and it gave me the following results for the final model after 72 epochs:

AVG: mse 0.021794370514728552 - mae 0.13093452751636503 - pearson -1.0 - spearman -0.9999999999999999

Is this correct? In particular, the pearson and spearman numbers look odd. Are you getting similar results?
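One possible explanation for correlations of exactly -1, offered only as a sketch: the thread does not say how many sentences the test split in tests/resources/tasks contains, but if it held just two, Pearson could only ever come out as +1 or -1, regardless of how close the predictions are. The scores below are made up for illustration, and `pearson` is a plain hand-written helper, not Flair code.

```python
import math


def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical gold scores and predictions for a two-sentence test set.
gold = [0.30, 0.70]
pred = [0.52, 0.48]  # small absolute error, but ranked the wrong way round

r = pearson(gold, pred)
mae = sum(abs(g - p) for g, p in zip(gold, pred)) / len(gold)
print(f"pearson {r:.3f} - mae {mae:.3f}")
```

With only two points the absolute error stays modest while the correlation is pinned to the extreme, so the correlation numbers say little until the test set is larger.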

heukirne · 2019-03-01T22:19:57Z

Hello @alanakbik, yes, I got a similar result:

glove_embedding: WordEmbeddings = WordEmbeddings('glove')
document_embeddings: DocumentRNNEmbeddings = DocumentRNNEmbeddings([glove_embedding], 128, 1, False, 64, False, False)
modelR = TextRegressor(document_embeddings, Dictionary(), False)
trainerR = RegressorTrainer(modelR, corpus)
trainerR.train('regression_train/',  max_epochs=150, mini_batch_size=4, embeddings_in_memory=True)

after 75 epochs

AVG: mse 0.04095809102497722 - mae 0.1944224864244461 - pearson 0.7231063539038256 - spearman 0.39999999999999997

This seems like a plausible result. The negative Pearson and Spearman values are a little odd, but the corpus is probably upside-down.

There is something wrong with the corpus files; the test set is probably part of the training set.
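A quick way to check the suspicion about the test set leaking into the training set is to look for lines that appear in both files. This is a minimal sketch: `overlapping_examples` is a hypothetical helper, and the tab-separated "score, text" lines are made-up stand-ins for the real corpus format.

```python
def overlapping_examples(train_lines, test_lines):
    """Return the test examples that also occur verbatim in the training data."""
    train_set = {line.strip() for line in train_lines if line.strip()}
    return [line.strip() for line in test_lines if line.strip() in train_set]


# Hypothetical file contents; in practice these would be read from the
# train and test files of the corpus.
train = ["0.50\tI am so happy today", "0.10\tnothing special"]
test = ["0.50\tI am so happy today", "0.90\tbest day ever"]

leaked = overlapping_examples(train, test)
print(f"{len(leaked)} of {len(test)} test examples also appear in train")
```

Any non-empty result would mean the evaluation numbers are at least partly measured on data the model has already seen.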

alanakbik · 2019-03-03T14:47:31Z

@heukirne Ok! Could you double-check the corpus and the implementation before we merge?

heukirne · 2019-03-06T22:09:58Z

Hello @alanakbik, I re-ran the tests and added a new evaluation metric name.
The trainer is not logging to the file yet, though. I'm returning a Metric object.
I'll create a new Metric class for regression and perform more tests.
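To make the idea of a separate metric class for regression concrete, here is a rough, self-contained sketch that accumulates gold and predicted values and reports MSE and MAE. The class name `RegressionMetricSketch` and its methods are hypothetical and are not the MetricRegression class from this PR.

```python
class RegressionMetricSketch:
    """Collects (gold, predicted) pairs and reports simple error metrics."""

    def __init__(self, name):
        self.name = name
        self.true = []
        self.pred = []

    def add(self, true_value, predicted_value):
        self.true.append(true_value)
        self.pred.append(predicted_value)

    def mean_squared_error(self):
        return sum((t - p) ** 2 for t, p in zip(self.true, self.pred)) / len(self.true)

    def mean_absolute_error(self):
        return sum(abs(t - p) for t, p in zip(self.true, self.pred)) / len(self.true)

    def __str__(self):
        return (f"{self.name}: mse {self.mean_squared_error():.4f}"
                f" - mae {self.mean_absolute_error():.4f}")


# Made-up values, only to show how the object would be filled and printed.
metric = RegressionMetricSketch("TEST")
metric.add(1.0, 0.5)
metric.add(0.0, 0.5)
print(metric)
```

Keeping the raw values rather than running sums makes it easy to add correlation measures later, at the cost of holding every prediction in memory.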

alanakbik · 2019-03-06T22:32:17Z

Cool, thanks!

heukirne · 2019-03-06T23:35:17Z

Hi @alanakbik, the MetricRegression object is now more compatible with the Metric one. The results are now working properly (I fixed a problem that occurred when adding MetricRegression).
It now also logs the results to the loss.tsv file, but the header name and the visual Plotter still print the wrong label (a static method of Metric is used in the train() function, at lines 77 and 196).
I believe it is now fine to merge as a beta feature.

alanakbik · 2019-03-07T13:07:27Z

@heukirne thanks! It looks like the new unit tests for the regressor are failing as of the last commit, with the message:

NameError: name 'log' is not defined

You probably need to define the logger in the class first.
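For reference, a NameError like this usually means the module calls `log` without ever creating it. A minimal sketch of the usual pattern follows; the logger name "flair" and the `TrainerSketch` class are assumptions for illustration, not the actual fix committed in this PR.

```python
import logging

# Module-level logger, created once at import time so every class and
# function in the module can use it. The name "flair" is an assumption.
log = logging.getLogger("flair")


class TrainerSketch:
    """Hypothetical stand-in for a trainer that logs its evaluation results."""

    def report(self, epoch, loss):
        message = f"EPOCH {epoch}: loss {loss:.4f}"
        log.info(message)  # would raise NameError if `log` were never defined
        return message


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    TrainerSketch().report(1, 0.5)
```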

heukirne · 2019-03-07T13:32:38Z

@alanakbik, my bad, now it's fixed! ;)

alanakbik · 2019-03-07T13:41:49Z

thanks :)

heukirne · 2019-04-09T07:50:11Z

👍

alanakbik · 2019-04-15T12:36:40Z

@heukirne sorry for the delay - only now got back from my vacation. I ran some tests and I think we're good to go to add this as a beta feature. We're planning a refactoring of the flair.nn.Model interface and data loaders soon so it is really good to have this in the project! Thank you very much for adding this and also for your patience :)

alanakbik · 2019-04-15T12:36:45Z

👍

heukirne · 2019-04-15T13:55:24Z

Excellent, @alanakbik!

alanakbik · 2019-04-16T09:48:28Z

👍

kashif · 2019-04-16T09:49:15Z

👍

GH-564: regression datasets

heukirne added 7 commits March 6, 2019 19:00

beta version of flair-regression, first commit flairNLP#440

f57cefa

flairNLPGH-440 fix trainer torch import

a47c31f

flairNLPGH-404 improve tests

8486d98

flairNLPGH-440 fix assert test

7189cf3

flairNLPGH-440 add final_test to RegressorTrainer

58724c4

TODO: still need a self-contained MSE and MAE metric

flairNLPGH-440 add correlation metrics for regression

385e876

flairNLPGH-440 add new evaluation metrics

3129192

add mean squared error as default for regression

heukirne force-pushed the regression branch from 07e26ab to 3129192 Compare March 6, 2019 22:00

heukirne added 2 commits March 6, 2019 20:11

flairNLPGH-440 add MetricRegresssion for compatibility with classifier

6808ca4

still need unit test for MetricRegression

flairNLPGH-440 remove extra hyphen

0922a18

flairNLPGH-440 add experimental warning

a7f414e

Update text_regression_model.py

f6530b6

Alan Akbik added 3 commits April 16, 2019 10:00

Merge branch 'master' into regression

bc6a48e

Update data_fetcher.py

7e5b9de

flairNLPGH-652: code formatting

330dd56

alanakbik merged commit 11850ee into flairNLP:master Apr 16, 2019

alanakbik pushed a commit that referenced this pull request May 27, 2019

GH-564: fix loss function

740acc4

alanakbik pushed a commit that referenced this pull request May 27, 2019

GH-564: add WASSA dataset

b6825fc

alanakbik pushed a commit that referenced this pull request May 27, 2019

GH-564: better error message in cached_path

2b063a5

alanakbik pushed a commit that referenced this pull request May 27, 2019

GH-564: add WASSA datasets

9f777cb

alanakbik pushed a commit that referenced this pull request May 27, 2019

GH-564: identify as Flair

5b728fd

alanakbik pushed a commit that referenced this pull request May 27, 2019

Merge pull request #756 from zalandoresearch/GH-564-regression-datasets

723ecbe

GH-564: regression datasets
