NER + RE #2726
Hey @igormis, unfortunately, there is no tutorial yet for training the relation extractor, unlike for other models such as the sequence tagger. I'm currently working on another relation extractor architecture implementation and plan to add a tutorial. For now, you can train and use the existing relation extractor as follows:

from flair.data import Sentence
from flair.datasets import RE_ENGLISH_CONLL04
from flair.embeddings import TransformerWordEmbeddings
from flair.models import RelationExtractor
from flair.trainers import ModelTrainer


def train() -> None:
    # Hyperparameters
    transformer: str = 'xlm-roberta-large'
    learning_rate: float = 5e-5
    mini_batch_size: int = 8

    # Step 1: Create the training data
    # The relation extractor is *not* trained end-to-end.
    # A corpus for training the relation extractor requires annotated entities and relations.
    corpus: RE_ENGLISH_CONLL04 = RE_ENGLISH_CONLL04()

    # Print examples
    sentence: Sentence = corpus.test[0]
    print(sentence)
    print(sentence.get_spans('ner'))  # 'ner' is the entity label type
    print(sentence.get_relations('relation'))  # 'relation' is the relation label type

    # Step 2: Make the label dictionary from the corpus
    label_dictionary = corpus.make_label_dictionary('relation')
    label_dictionary.add_item('O')
    print(label_dictionary)

    # Step 3: Initialize fine-tunable transformer embeddings
    embeddings = TransformerWordEmbeddings(
        model=transformer,
        layers='-1',
        subtoken_pooling='first',
        fine_tune=True
    )

    # Step 4: Initialize relation classifier
    model: RelationExtractor = RelationExtractor(
        embeddings=embeddings,
        label_dictionary=label_dictionary,
        label_type='relation',
        entity_label_type='ner',
        entity_pair_filters=[  # Define valid entity pair combinations, used as relation candidates
            ('Loc', 'Loc'),
            ('Peop', 'Loc'),
            ('Peop', 'Org'),
            ('Org', 'Loc'),
            ('Peop', 'Peop')
        ]
    )

    # Step 5: Initialize trainer
    trainer: ModelTrainer = ModelTrainer(model, corpus)

    # Step 6: Run fine-tuning
    trainer.fine_tune(
        'conll04',
        learning_rate=learning_rate,
        mini_batch_size=mini_batch_size,
        main_evaluation_metric=('macro avg', 'f1-score')
    )


def predict_example() -> None:
    # Step 1: Load trained relation extraction model
    model: RelationExtractor = RelationExtractor.load('conll04/final-model.pt')

    # Step 2: Create sentences with entity annotations (as these are required by the relation extraction model)
    # In production, use another sequence tagger model to tag the relevant entities.
    sentence: Sentence = Sentence('On April 14, while attending a play at the Ford Theatre in Washington, '
                                  'Lincoln was shot in the head by actor John Wilkes Booth.')
    sentence[15:16].add_label(typename='ner', value='Peop', score=1.0)  # Lincoln -> Peop
    sentence[23:26].add_label(typename='ner', value='Peop', score=1.0)  # John Wilkes Booth -> Peop

    # Step 3: Predict
    model.predict(sentence)
    print(sentence.get_relations('relation'))


if __name__ == '__main__':
    train()
    predict_example()

In this example I've used an integrated dataset. You can also load your own, e.g. in the form of a
Example:
I hope that this helps. :)
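To make the role of `entity_pair_filters` in the script above more concrete, here is a minimal plain-Python sketch of how relation candidates could be enumerated from tagged entities. This is not flair's internal implementation: `Entity` and `candidate_pairs` are hypothetical names, the sketch assumes that candidates are ordered (head, tail) pairs, and the `Loc` tag on "Washington" is added purely for illustration.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Entity:
    """Hypothetical stand-in for a tagged span: its text and its 'ner' label."""
    text: str
    label: str


def candidate_pairs(
    entities: List[Entity],
    pair_filters: Set[Tuple[str, str]],
) -> List[Tuple[Entity, Entity]]:
    """Return ordered (head, tail) pairs whose label combination is allowed."""
    return [
        (head, tail)
        for head, tail in permutations(entities, 2)
        if (head.label, tail.label) in pair_filters
    ]


# Same label combinations as in the training script above
filters = {('Loc', 'Loc'), ('Peop', 'Loc'), ('Peop', 'Org'), ('Org', 'Loc'), ('Peop', 'Peop')}

# Illustrative entities; 'Washington' as 'Loc' is an assumption for this sketch
entities = [
    Entity('Lincoln', 'Peop'),
    Entity('John Wilkes Booth', 'Peop'),
    Entity('Washington', 'Loc'),
]

pairs = candidate_pairs(entities, filters)
for head, tail in pairs:
    print(f'{head.text} ({head.label}) -> {tail.text} ({tail.label})')
```

Under these assumptions, a pair such as (`Loc`, `Peop`) is never scored because it is not listed in the filters, which keeps the number of candidates the classifier has to consider small.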
Hi @dobbersc, it looks clear, thanks. I have only one question:
That is correct. The flair RelationExtractor does not handle end-to-end relation extraction, as it requires pre-tagged entities. One easy way to obtain them is to run another sequence tagger first, e.g. an NER model, and then predict with the relation extractor. When using a sequence tagger to predict the relation entities, be sure to use the same label type and specified labels as in the
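As a rough illustration of the label-matching advice, the sketch below remaps entity labels from a tagger onto the labels the relation extractor was trained with. It is plain Python and only an assumption-laden example: it assumes the tagger emits CoNLL-03-style labels (`PER`, `LOC`, `ORG`, `MISC`) while the relation extractor expects the CoNLL04 labels from the training script (`Peop`, `Loc`, `Org`). `LABEL_MAP`, `Span`, and `remap_entities` are hypothetical names, and the tagged spans are made up.

```python
from typing import Dict, List, Tuple

# Assumed mapping from CoNLL-03-style tagger labels to the CoNLL04 entity labels
LABEL_MAP: Dict[str, str] = {'PER': 'Peop', 'LOC': 'Loc', 'ORG': 'Org'}

# A span is (start_token, end_token, label), with end_token exclusive
Span = Tuple[int, int, str]


def remap_entities(spans: List[Span], label_map: Dict[str, str]) -> List[Span]:
    """Translate tagger labels; drop spans whose label has no counterpart."""
    return [
        (start, end, label_map[label])
        for start, end, label in spans
        if label in label_map
    ]


# Hypothetical tagger output for the Lincoln sentence used earlier in this thread
tagged: List[Span] = [(15, 16, 'PER'), (23, 26, 'PER'), (10, 12, 'MISC')]

remapped = remap_entities(tagged, LABEL_MAP)
print(remapped)
```

Each remapped span could then be attached to the sentence the same way as in the prediction example, i.e. `sentence[start:end].add_label(typename='ner', value=label, score=1.0)`, before calling `model.predict(sentence)`.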
Closing since the question is answered (thanks @dobbersc), but feel free to reopen if there are more questions!
Hello, how do we handle sentences without relations? Our approach looked like this:
But we received the following error:
Since the if-check is
Thanks a lot!
@alanakbik are there any tutorials on how to train NER together with Relation Extraction on top of it? What I need is the input data format, the training process and the inference.