TokenClassifier model #3203
  from flair.embeddings import TokenEmbeddings

  log = logging.getLogger("flair")


- class WordTagger(flair.nn.DefaultClassifier[Sentence, Token]):
+ class TokenClassifier(flair.nn.DefaultClassifier[Sentence, Token]):
      """This is a simple class of models that tags individual words in text."""
I think we shouldn't just remove the WordTagger, as this breaks models that use it (especially if they are part of a MultiTaskModel).
I would suggest that we use a DeprecationHelper as specified here, but state the version in which we delete it (let's say 0.14.0, incrementing by 2).
That way we give users a chance to upgrade across multiple versions.
We can also discuss this further in private.
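A minimal sketch of what such a deprecation shim could look like, using only the standard `warnings` module. The `DeprecationHelper` class below is a hypothetical illustration, not the helper linked above and not part of flair, and `TokenClassifier` is a stand-in class so the example runs on its own. The removal version 0.14.0 is the one proposed in this comment.

```python
import warnings


class TokenClassifier:
    """Stand-in for the real model class; only here so the sketch is self-contained."""

    def __init__(self, label_type: str = "pos") -> None:
        self.label_type = label_type


class DeprecationHelper:
    """Hypothetical wrapper: forwards to the new class and warns on each use."""

    def __init__(self, new_target, old_name: str, removal_version: str) -> None:
        self.new_target = new_target
        self.old_name = old_name
        self.removal_version = removal_version

    def _warn(self) -> None:
        warnings.warn(
            f"{self.old_name} is deprecated and will be removed in version "
            f"{self.removal_version}; use {self.new_target.__name__} instead.",
            DeprecationWarning,
            stacklevel=3,
        )

    def __call__(self, *args, **kwargs):
        # Constructing the old name builds an instance of the new class.
        self._warn()
        return self.new_target(*args, **kwargs)

    def __getattr__(self, name):
        # Only reached for attributes the helper itself lacks, e.g. class methods.
        self._warn()
        return getattr(self.new_target, name)


# The old name stays importable until the announced removal version.
WordTagger = DeprecationHelper(TokenClassifier, "WordTagger", "0.14.0")
```

With this in place, `WordTagger(label_type="ner")` returns a `TokenClassifier` instance and emits a `DeprecationWarning`. The sketch only covers construction and attribute access through the old name; it does not by itself address how previously saved models are loaded.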
Thanks, good idea!
This PR introduces the `TokenClassifier` class, a renamed and extended version of `WordTagger`. It directly inherits from `DefaultClassifier` and should be used for all token-level prediction tasks that do not require an LSTM-CRF decoder (for such tasks, the `SequenceTagger` should be used).

The main idea is to offer a model that inherits from `DefaultClassifier` for each label type we predict, i.e.:

- `TokenClassifier` for predicting `Token` labels
- `TextPairClassifier` for predicting `TextPair` labels
- `RelationClassifier` for predicting `Relation` labels
- `SpanClassifier` for predicting `Span` labels (this class is currently called `EntityLinker` and should be renamed)
- `TextClassifier` for predicting `Sentence` labels (might need to be renamed to SentenceClassifier)

An advantage of such a structure is that most functionality (such as new decoders) needs to be implemented only once in `DefaultClassifier` and is then immediately usable for all model classes.

Edit: This class also changes the default behavior of the make_label_dictionary method. The UNK token is no longer automatically added to a dictionary. We now skip unknown labels to handle loss computation in such cases.
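
To illustrate the idea of skipping unknown labels rather than mapping them to an UNK entry, here is a small, library-independent sketch. The function names `build_label_dictionary` and `encode_for_loss` are hypothetical and are not flair's implementation; they only show one way the described behavior could work on plain Python data.

```python
from typing import Dict, List, Tuple


def build_label_dictionary(labels: List[str]) -> Dict[str, int]:
    """Map each distinct label to an index, in first-seen order; no UNK entry is added."""
    dictionary: Dict[str, int] = {}
    for label in labels:
        dictionary.setdefault(label, len(dictionary))
    return dictionary


def encode_for_loss(
    data_points: List[Tuple[str, str]], dictionary: Dict[str, int]
) -> Tuple[List[str], List[int]]:
    """Keep only data points whose gold label is in the dictionary.

    Data points with an unknown label are skipped, so they contribute
    nothing to the loss instead of being trained against an UNK class.
    """
    kept: List[str] = []
    targets: List[int] = []
    for text, label in data_points:
        if label not in dictionary:
            continue  # unknown label: skip this data point
        kept.append(text)
        targets.append(dictionary[label])
    return kept, targets


# Assumed toy data: labels seen at training time, then a batch with one unseen label.
label_dictionary = build_label_dictionary(["NOUN", "VERB", "NOUN"])
batch = [("dog", "NOUN"), ("quickly", "ADV"), ("runs", "VERB")]
kept_tokens, target_indices = encode_for_loss(batch, label_dictionary)
```

In this toy batch, the token labeled `ADV` is dropped because that label never appeared when the dictionary was built, and the remaining two tokens are encoded as usual.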