get_spans_from_bio: Start new span for previous S- if class also changed #3195
In previous flair versions (e.g. 0.10), get_spans_from_bio would start a new span if the previous tag was an "S-" tag of a different class than the current token's tag, see:
https://github.com/flairNLP/flair/blob/v0.10/flair/data.py#L698
Our production model semi-frequently produces this kind of prediction (which is invalid BIOES), and the new span extraction performs worse on our data.
This PR adds back the special check for a previous "S-" tag. Span calculation becomes more similar to what it was in 0.10, and the result is unchanged for any 100% valid BIOES tagging.
Also included are some minor cosmetic tweaks to the function.
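
To illustrate the behaviour described above, here is a minimal standalone sketch. It is not the flair implementation: `extract_spans` is a hypothetical helper that works on plain tag strings, assumes every non-"O" tag carries a prefix such as "B-" or "S-", and labels each span with the class of its first tag.

```python
from typing import List, Optional, Tuple


def extract_spans(tags: List[str]) -> List[Tuple[int, int, str]]:
    """Hypothetical helper: return (start, end_exclusive, label) spans from BIOES tags."""
    spans: List[Tuple[int, int, str]] = []
    start: Optional[int] = None
    label = ""
    prev_prefix, prev_cls = "O", ""

    for i, tag in enumerate(tags):
        if tag in ("", "O"):
            prefix, cls = "O", ""
        else:
            # "S-PER" -> prefix "S", class "PER"
            prefix, _, cls = tag.partition("-")

        in_span = prefix != "O"
        starts_new_span = prefix in ("B", "S")

        # Special check: the previous tag was "S-" with a different class,
        # so the current token cannot continue that single-token span.
        if prev_prefix == "S" and prev_cls != cls and in_span:
            starts_new_span = True

        # Close the open span when a new one starts or we leave a span.
        if (starts_new_span or not in_span) and start is not None:
            spans.append((start, i, label))
            start = None

        # Open a span if we are inside one and none is open.
        if in_span and start is None:
            start, label = i, cls

        prev_prefix, prev_cls = prefix, cls

    # Close a span that runs to the end of the sequence.
    if start is not None:
        spans.append((start, len(tags), label))
    return spans


# Invalid BIOES: "S-PER" followed directly by "I-LOC".
invalid = extract_spans(["S-PER", "I-LOC", "E-LOC", "O"])
# Valid BIOES: the special check never fires.
valid = extract_spans(["B-PER", "E-PER", "O", "S-LOC"])
```

In this sketch, removing the special check would merge "S-PER" and the following "I-LOC"/"E-LOC" tokens into one span, while the valid sequence gives the same spans either way.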