
Update documentation for Hunflair2 release #3410

Merged: 24 commits, Apr 5, 2024
Conversation

mariosaenger (Collaborator):

This PR updates the documentation for HunFlair, lifting it to the new release. More specifically, the PR contains:

  • Documentation pages highlighting the new features and usage of HunFlair2
  • Minor adaptations to smooth the user experience
  • Integration of warning messages when HunFlair (version 1) models are used
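The last bullet describes a deprecation-style warning for version 1 models. A minimal sketch of how such a check might look; the function name, message text, and the exact set of v1 model names are illustrative assumptions, not the PR's actual code:

```python
import logging

logger = logging.getLogger("flair")

def warn_if_hunflair_v1(model_name: str) -> bool:
    """Log a hint when a HunFlair (version 1) model is requested.

    Illustrative sketch only: the real PR wires a similar warning into
    flair's model-loading code.
    """
    # Match v1 names exactly so that HunFlair2 models are not flagged
    # (see the review discussion below about prefix matching).
    if model_name == "hunflair" or model_name == "bioner":
        logger.warning(
            "You are loading a HunFlair (version 1) model. "
            "Consider switching to the improved HunFlair2 models."
        )
        return True
    return False
```

Returning a boolean alongside the log message makes the behavior easy to test without capturing log output.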

@@ -260,6 +260,14 @@ def _fetch_model(model_name) -> str:

cache_dir = Path("models")
if model_name in model_map:
if model_name.startswith("hunflair") or model_name == "bioner":
Collaborator:

"hunflair2" also starts with "hunflair", so I think this warning would always be printed.

Collaborator Author:

True. However, HunFlair2 will not be loaded as a MultitaskModel. I'll fix it anyway.
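The pitfall the reviewer points out is that `str.startswith` matches any name that merely begins with the prefix, so `"hunflair2"` models would trigger the v1 warning too. A small illustrative predicate (model names here are assumptions) that excludes the v2 prefix explicitly:

```python
def is_hunflair_v1(model_name: str) -> bool:
    # "hunflair2-disease".startswith("hunflair") is True, so a bare
    # prefix test would flag v2 models as well. Exclude the "hunflair2"
    # prefix before applying the v1 check.
    return (
        model_name.startswith("hunflair")
        and not model_name.startswith("hunflair2")
    ) or model_name == "bioner"

assert is_hunflair_v1("hunflair-disease")
assert not is_hunflair_v1("hunflair2-disease")
```

An exact-match check against a known list of v1 names would work equally well and is even harder to get wrong.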

@@ -781,6 +781,14 @@ def _fetch_model(model_name) -> str:
elif model_name in hu_model_map:
model_path = cached_path(hu_model_map[model_name], cache_dir=cache_dir)

if model_name.startswith("hunflair"):
Collaborator:

Same here

Collaborator Author:

True. However, HunFlair2 will not be loaded as a SequenceTaggerModel. I'll fix it anyway.

@@ -648,6 +649,8 @@ def p(text: str) -> str:
emb = emb / torch.norm(emb)
dense_embeddings.append(emb.cpu().numpy())
sent.clear_embeddings()

# empty cuda cache if device is a cuda device
if flair.device.type == "cuda":
Collaborator:

@sg-wbi Is this really required?

# Sanity conversion: if flair.device was set as a string, convert to torch.device
if isinstance(flair.device, str):
flair.device = torch.device(flair.device)

if flair.device.type == "cuda":
Collaborator:

@sg-wbi Is this really required?

alanakbik (Collaborator) left a comment:

All good, thanks for adding this and thanks for your patience! There are some smaller points that we will address with follow-up PRs.

One question: is the manual deleting of the cuda cache really necessary?

sg-wbi commented Apr 5, 2024

Great, thanks! I am not sure either. @helpmefindaname / @mariosaenger, can you remember why this was added?

@alanakbik alanakbik merged commit 223f346 into master Apr 5, 2024
1 check passed
@alanakbik alanakbik deleted the hunflair2-release branch April 5, 2024 10:50
helpmefindaname (Collaborator):

Hi @alanakbik @sg-wbi
The GPU cache clear is done after each batch, as otherwise we would get memory errors if the embeddings of the whole dataset are larger than the memory available on the GPU (beyond what the model itself needs).
Since the search is always performed on CPU, we don't need to keep the embeddings cached on the GPU, so there is no real advantage to keeping them there.
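The pattern described above can be sketched as follows: embed one batch at a time, immediately move each embedding to the CPU, and release the CUDA cache so GPU memory stays bounded by a single batch rather than the whole dataset. `embed_fn` and the function signature are placeholders for illustration; this is not the PR's actual code:

```python
import torch

def embed_batches(sentences, embed_fn, batch_size=32):
    """Embed sentences batch-wise, keeping only CPU copies.

    `embed_fn` is a stand-in for the actual embedding call and is
    expected to return one tensor per sentence in the batch.
    """
    dense_embeddings = []
    for i in range(0, len(sentences), batch_size):
        batch = sentences[i : i + batch_size]
        for emb in embed_fn(batch):
            emb = emb / torch.norm(emb)                 # normalize, as in the diff
            dense_embeddings.append(emb.cpu().numpy())  # move off the GPU
        # The search runs on CPU, so the GPU copies are never needed
        # again; releasing the cache avoids OOM on large datasets.
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
    return dense_embeddings
```

On a CPU-only machine the `empty_cache()` branch is simply skipped, so the same code runs everywhere.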

alanakbik (Collaborator):

Alright, thanks!
