GH-1449: labeling logic is part of DataPoint #1450

Merged 5 commits on Feb 24, 2020

@alanakbik (Collaborator) commented Feb 24, 2020

This PR refactors the DataPoint class and the classes that inherit from it (Token, Sentence, Image, Span, etc.) so that they all share the same methods for adding and accessing labels.

For example, instead of:

sentence_1 = Sentence("this is great", labels=[Label("POSITIVE")])

you should now do:

sentence_1 = Sentence("this is great")
sentence_1.add_label('sentiment', 'POSITIVE')

or:

sentence_1 = Sentence("this is great").add_label('sentiment', 'POSITIVE')

Note that Sentence labels now have a label_type (in the example that's 'sentiment').
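
The shared interface can be pictured with a small standalone sketch. This is not the library's actual implementation: the class bodies, the `annotation_layers` attribute, the `score` parameter, and the `get_labels` accessor are assumptions made for illustration. Only the `add_label(label_type, value)` call shape and its chainable use come from the examples above.

```python
from collections import defaultdict
from typing import Dict, List


class Label:
    """A label value with a confidence score (hypothetical stand-in)."""

    def __init__(self, value: str, score: float = 1.0):
        self.value = value
        self.score = score


class DataPoint:
    """Base class holding the label logic shared by all data points."""

    def __init__(self) -> None:
        # Maps a label_type such as 'sentiment' to the labels of that type.
        self.annotation_layers: Dict[str, List[Label]] = defaultdict(list)

    def add_label(self, label_type: str, value: str, score: float = 1.0) -> "DataPoint":
        self.annotation_layers[label_type].append(Label(value, score))
        # Returning self is what makes Sentence(...).add_label(...) chainable.
        return self

    def get_labels(self, label_type: str) -> List[Label]:
        # Return a copy so callers cannot mutate the stored list by accident.
        return list(self.annotation_layers.get(label_type, []))


class Sentence(DataPoint):
    def __init__(self, text: str) -> None:
        super().__init__()
        self.text = text


class Token(DataPoint):
    def __init__(self, text: str) -> None:
        super().__init__()
        self.text = text


# Sentence and Token inherit the same label methods from DataPoint.
sentence_1 = Sentence("this is great").add_label('sentiment', 'POSITIVE')
token_1 = Token("great").add_label('pos', 'ADJ')
```

Keying labels by `label_type` lets one data point carry several independent annotations (for instance a sentiment and a topic) without them colliding.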

  • The Corpus method _get_class_to_count is renamed to _count_sentence_labels
  • The Corpus method _get_tag_to_count is renamed to _count_token_labels
  • Span is now a DataPoint (so it has an embedding and labels)
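
To give a rough sense of what a label-counting helper might do, here is a minimal sketch. The function name `count_labels` and the plain-dict data layout are assumptions for illustration; the renamed `_count_sentence_labels` and `_count_token_labels` methods live on Corpus, and their internals are not shown in this description.

```python
from collections import Counter
from typing import Dict, Iterable, List


def count_labels(data_points: Iterable[Dict[str, List[str]]], label_type: str) -> Counter:
    """Count how often each label value of `label_type` occurs.

    Each data point is modelled here as a dict mapping a label_type
    to the list of label values of that type (a hypothetical layout).
    """
    counts: Counter = Counter()
    for labels_by_type in data_points:
        # Data points without labels of this type contribute nothing.
        counts.update(labels_by_type.get(label_type, []))
    return counts
```

Passing the label type explicitly mirrors the change described above: once every label carries a `label_type`, any count has to say which type it is counting.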

@alanakbik alanakbik merged commit 1d24ca9 into master Feb 24, 2020

Successfully merging this pull request may close these issues.

All DataPoint classes should have the same Label logic