Pretrained Model for Keyphrase Extraction #1647

Closed
whoisjones opened this issue May 28, 2020 · 5 comments

Comments

@whoisjones
Member

Since we've integrated keyphrase detection datasets, it might be cool to have a pretrained model for this. As a foundation, we can use the datasets from either #1621 or #1646.
I'll train some models and post the results here so we can check whether integrating one makes sense.

@whoisjones whoisjones added the feature A new feature label May 28, 2020
@whoisjones whoisjones self-assigned this May 28, 2020
whoisjones added a commit that referenced this issue Jun 1, 2020
whoisjones added a commit that referenced this issue Jun 2, 2020
whoisjones added a commit that referenced this issue Jun 2, 2020
whoisjones added a commit that referenced this issue Jun 5, 2020
whoisjones added a commit that referenced this issue Jun 5, 2020
whoisjones added a commit that referenced this issue Jun 5, 2020
@whoisjones
Member Author

whoisjones commented Jun 5, 2020

According to midas-research, SciBERT gives (almost) the best results on all three keyphrase datasets in Flair (metric: F1 score):

| | INSPEC | SEMEVAL2010 | SEMEVAL2017 |
|---|---|---|---|
| midas | 0.593 | 0.357 | 0.521 |
| flairl | 0.5835 | 0.2391 | 0.4143 |
| scibert | 0.6047 | 0.3501 | 0.4706 |
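
For readers unsure what an F1 score over keyphrases means here, the following is a minimal sketch of exact-match, span-level F1 computed from BIO tag sequences. It is an assumption that the numbers above were computed this way; the thread does not say which F1 variant Flair or midas-research report. The tag names (`B-KEY`, `I-KEY`) and the helpers `bio_to_spans` and `span_f1` are hypothetical and exist only for illustration.

```python
from typing import List, Optional, Set, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect keyphrase spans from a BIO tag sequence (hypothetical helper)."""
    spans: Set[Span] = set()
    start: Optional[int] = None
    for i, tag in enumerate(tags):
        if tag.startswith("B") or (tag.startswith("I") and start is None):
            # A new phrase begins; close any phrase that is still open.
            if start is not None:
                spans.add((start, i))
            start = i
        elif tag == "O" and start is not None:
            # An outside tag ends the open phrase.
            spans.add((start, i))
            start = None
    if start is not None:
        spans.add((start, len(tags)))
    return spans


def span_f1(gold: List[str], pred: List[str]) -> float:
    """Exact-match span F1: a predicted span counts only if both ends match."""
    gold_spans = bio_to_spans(gold)
    pred_spans = bio_to_spans(pred)
    true_positives = len(gold_spans & pred_spans)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred_spans)
    recall = true_positives / len(gold_spans)
    return 2 * precision * recall / (precision + recall)


# Illustrative tags only, not taken from any of the datasets above.
gold = ["B-KEY", "I-KEY", "O", "B-KEY", "O"]
pred = ["B-KEY", "I-KEY", "O", "O", "B-KEY"]
print(span_f1(gold, pred))
```

In this toy case one of two predicted spans matches one of two gold spans, so precision and recall are both 0.5. Token-level or partial-match scoring would give different values, which is one reason scores from different sources are not always directly comparable.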

@alanakbik
Collaborator

What are the embeddings used in our model?

@whoisjones
Member Author

SciBert, here is also the data from midas:
image

whoisjones added a commit that referenced this issue Jun 5, 2020
whoisjones added a commit that referenced this issue Jun 5, 2020
whoisjones added a commit that referenced this issue Jun 5, 2020
whoisjones added a commit that referenced this issue Jun 5, 2020
whoisjones added a commit that referenced this issue Jun 11, 2020
whoisjones added a commit that referenced this issue Jun 11, 2020
whoisjones added a commit that referenced this issue Jun 11, 2020
…-keyphrase-tagger-model

Conflicts:
	flair/datasets/sequence_labeling.py
whoisjones added a commit that referenced this issue Jun 11, 2020
alanakbik added a commit that referenced this issue Jun 11, 2020
@alanakbik
Collaborator

Model added to Flair in #1689

@djstrong
Contributor

djstrong commented Sep 3, 2020

@whoisjones How was the model trained? Was SciBERT fine-tuned in the process? Which head is used (linear, RNN, CRF)?
