Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CU-862hyd5wx Rosalind down #287

Merged
merged 4 commits into from
Jan 4, 2023
Merged

CU-862hyd5wx Rosalind down #287

merged 4 commits into from
Jan 4, 2023

Conversation

mart-r
Copy link
Collaborator

@mart-r mart-r commented Dec 22, 2022

Rosalind is currently down. Some MedCAT tests rely on downloading a Vocab from medcat.rosalind.kcl.ac.uk/media/vocab.dat. When that is not available (currently returns 503), the tests fail since the Vocab is (understandably) unable to be loaded from the web response (the 503).

Because of the above, GHA/CIs keep failing due to the test failures.

What this PR does:

  • Unify some of the Vocab downloads
    • Different tests did that separately using identical (or near identical) code
  • Checks if the current tmp_vocab.dat has the 503 info
    • Which cannot be loaded
    • If so, tries fetching again
  • If rosalind is unavailable
    • Generates a simple Vocab to use
  • If rosalind is available and the current Vocab is the generated simple one
    • Downloads the new one

Made sure the tests run and succeed in 4 cases

  • No tmp_vocab.dat
    • Running in new environment
  • Existing simple, generated tmp_vocab.dat
    • Running in no-Rosalind situation
  • The tmp_vocab.dat from Rosalind
    • Running in existing environment
    • Or Rosalind is available
  • The tmp_vocab.dat that contains the 503
    • Running in existing environment
    • Where the tests were first run in a no-Rosalind situation

PS:
I've made sure all tests run and are successful with the generated Vocab. But I have not checked specifically whether they run in the same exact way they would be expected with the real Vocab. Because this generated Vocab has only 2 words, iterating over those may result in different behaviour compared to the real deal.

@mart-r
Copy link
Collaborator Author

mart-r commented Dec 22, 2022

I will be merging this in around 2h if nobody has any objections before that.

@mart-r
Copy link
Collaborator Author

mart-r commented Dec 22, 2022

It seems the Rosalind issue is now resolved. And the CIs have been re-run and are passing. So there's no rush in merging this PR.

So I'll leave it up until the new year in case someone has concerns.

But I think it's still good to have an alternative in case the link becomes unresponsive again at some point in the future. Plus, it'll be able to identify the 503 saved as the vocab and redownload (i.e when someone first ran the tests on a new environment and simply saved the 503).

@mart-r mart-r merged commit 4c72cc6 into CogStack:master Jan 4, 2023
@mart-r mart-r deleted the rosalindDown branch February 10, 2023 11:10
mart-r added a commit to mart-r/MedCAT that referenced this pull request Jun 14, 2023
CU-862hyd5wx Workaround for when Rosalind is down and/or fixes the downloaded 503
mart-r added a commit that referenced this pull request Oct 12, 2023
CU-862hyd5wx Workaround for when Rosalind is down and/or fixes the downloaded 503
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants