GH-1445: Targeted Sentiment Analysis #1758
…-speculation-model
👍 |
@whoisjones thanks for adding this! |
@whoisjones: Interesting PR! I was curious why you opted for a sequence tagging approach rather than a plain text classification approach for aspect-based sentiment analysis. It would help me understand how you framed the problem and what the pros and cons are. Thanks! |
@nipunsadvilkar a sequence tagger gives far more information than a text classifier. Classification by definition aggregates information. Consider a document with nested negation and/or speculation: instead of assigning the whole document to a class, we can say which part of the sentence or document is negated or speculative. |
@whoisjones Thanks for the prompt reply! Appreciate it 👍
Entities/Aspects would be Disease mentions - Negated -
Approach A) In the text classifier approach, a plausible setting would be to label the above sentence with its 4 different mentions and train an aspect-based text classifier.
Approach B) Sequence tagger, from what I see in your repo, whoisjones/BioScopeSequenceLabelingData. Would you tag it like the following (IOB tagging scheme)?
The O
patient O
states O
of O
no B-NEGATION
fever I-NEGATION
or I-NEGATION
chills I-NEGATION
but O
has O
asthma O
and O
a O
sore O
throat O
that O
has O
been O
going O
on O
for O
3 O
months O
. O
I agree that the scope of negation or speculation becomes tricky to handle. In the above example, the negation scope for the 2 entities is consecutive. How do you handle disjoint spans? I would also like to hear your thoughts on Approach A). |
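For reference, an IOB tag sequence like the one above can be decoded into scopes with a few lines of Python. This is only a minimal sketch: `bio_to_spans` is a hypothetical helper, not part of the PR or of any library, and it assumes one tag per token. Because every `B-` tag opens a new span, disjoint scopes simply come out as separate entries.

```python
from typing import List, Tuple


def bio_to_spans(tags: List[str]) -> List[Tuple[str, int, int]]:
    """Decode IOB tags into (label, start, end) spans, end exclusive.

    Hypothetical helper for illustration only.
    """
    spans = []
    start, label = None, None
    for i, tag in enumerate(tags):
        # An I- tag continues the open span only if the label matches.
        if tag.startswith("I-") and label == tag[2:]:
            continue
        # Any other tag closes the open span, if there is one.
        if label is not None:
            spans.append((label, start, i))
            start, label = None, None
        # B- opens a new span; a stray I- is treated as an opening tag too.
        if tag.startswith(("B-", "I-")):
            start, label = i, tag[2:]
    # Close a span that runs to the end of the sequence.
    if label is not None:
        spans.append((label, start, len(tags)))
    return spans


tokens = (
    "The patient states of no fever or chills but has asthma and a sore "
    "throat that has been going on for 3 months ."
).split()
tags = ["O"] * 4 + ["B-NEGATION"] + ["I-NEGATION"] * 3 + ["O"] * 16

for label, start, end in bio_to_spans(tags):
    print(label, " ".join(tokens[start:end]))
```

With two separate negation scopes in one sentence, the decoder returns two spans, so downstream code can treat each scope independently.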
@nipunsadvilkar sorry for the late reply. Here's what I think:
If we only assigned labels to a phrase like this, a text classifier would learn that fajitas are bad in general and salads are delicious in general. However, we want to know which item is rated good or bad, and that depends on where fajitas or salads appears in the phrase. So we would frame it as a sequence labeling problem.
Approach B: Let our tagger annotate which parts of your text are speculation or negation. Then search for all the diseases you would like to know about. This obviously only covers the diseases you specify beforehand. |
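The Approach B lookup could be sketched roughly as follows. This is an illustration under assumptions, not the code from this PR: the negation scopes are hard-coded as token offsets where a real pipeline would take them from the tagger output, and `find_mentions` and `mark_negated` are hypothetical helpers doing a plain case-insensitive match against a user-supplied disease list.

```python
from typing import List, Tuple


def find_mentions(tokens: List[str], terms: List[str]) -> List[Tuple[str, int, int]]:
    """Return (term, start, end) for every occurrence of each term (end exclusive)."""
    lowered = [t.lower() for t in tokens]
    mentions = []
    for term in terms:
        parts = term.lower().split()
        n = len(parts)
        for i in range(len(lowered) - n + 1):
            if lowered[i:i + n] == parts:
                mentions.append((term, i, i + n))
    return mentions


def mark_negated(
    tokens: List[str],
    scopes: List[Tuple[int, int]],
    terms: List[str],
) -> List[Tuple[str, int, bool]]:
    """Flag each mention as negated if it lies fully inside a negation scope."""
    results = []
    for term, start, end in find_mentions(tokens, terms):
        negated = any(s <= start and end <= e for s, e in scopes)
        results.append((term, start, negated))
    return results


tokens = (
    "The patient states of no fever or chills but has asthma and a sore "
    "throat that has been going on for 3 months ."
).split()
# Assumption: the tagger marked tokens 4..7 ("no fever or chills") as NEGATION.
negation_scopes = [(4, 8)]
# Diseases have to be specified beforehand, as noted above.
diseases = ["fever", "chills", "asthma", "sore throat"]

for term, position, negated in mark_negated(tokens, negation_scopes, diseases):
    print(term, position, "negated" if negated else "affirmed")
```

Exact string matching is the simplest choice here; anything not in the predefined list is never reported, which is the limitation mentioned in the comment.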
closes gh-1445: we've added a pretrained tagger model for negation and speculation based on the bioscope data