Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nextclade Web: don't store unnecessary dataset info in local storage #1464

Open
ivan-aksamentov opened this issue May 30, 2024 · 0 comments
Open
Labels
t:bug Type: bug, error, something isn't working

Comments

@ivan-aksamentov
Copy link
Member

ivan-aksamentov commented May 30, 2024

Back in the day, when datasets description objects were very small, I decided (lazily) to store in the local storage the dataset object as-is, as it comes from the index.json. The "current dataset" Recoil state atom and local storage is neatly synchronized using Recoil plugin.

Time flies by and new features were added, like custom dataset servers, single datasets without servers and since #1455 also Auspice-based datasets.

The dataset description has grown in size. When switching dataset servers you now can receive errors, because new server might not have the dataset the old server had and local storage preserves the old dataset (e.g. its name).

This has aggravated with #1455, when I decided to also store the whole Auspice JSON in the dataset description. Very convenient, not very responsible! Trees might reach up to around 50-100MB in size and local storage sometimes refuses to update (with corresponding errors in the console).

Long story short, this needs to change. Only minimal information should be retained and when external conditions change, the local storage should be discarded.

We need to enumerate all possibilities for different input types and handle them correctly.

@ivan-aksamentov ivan-aksamentov added t:bug Type: bug, error, something isn't working good first issue Good for newcomers help wanted Extra attention is needed needs triage Mark for review and label assignment and removed good first issue Good for newcomers help wanted Extra attention is needed needs triage Mark for review and label assignment labels May 30, 2024
ivan-aksamentov added a commit that referenced this issue Jun 4, 2024
After `dataset-json-url` param is used, the entire auspice json was stored in local storage. For large files this would be bigger that what browsers allow and the store operation would fail. Subsequent reads would retrieve either nothing or some fragment of data.

Here I add some logic to make sure we store auspice json in memory only. Sadly, this means that page reload wipes all the data and we currently don't have a good mechanism to store the previous dataset URL in order to re-fetch it. This will need to be implemented. Currently I opted for discarding the stored dataset if it was auspice dataset. This should be alright, as the feature was primarily designed to be used when navigating to Nextclade with url params set.

This PR only implements a workaround for something that's fundamentally poorly designed and broken. The full dataset description object is still saved to local storage. The definitive fix requires serious restructuring and altering many components. It will be implemented later (#1464)

But this workaround should allow most use-cases of `dataset-json-url` to work and makes this auspice dataset feature releasable.

Can be tested with a large tree like this:
```
?dataset-json-url=https://nextstrain.org/ncov/gisaid/global/all-time
```
ivan-aksamentov added a commit that referenced this issue Jun 4, 2024
After `dataset-json-url` param is used, the entire auspice json was stored in local storage. For large files this would be bigger that what browsers allow and the store operation would fail. Subsequent reads would retrieve either nothing or some fragment of data.

Here I add some logic to make sure we store auspice json in memory only. Sadly, this means that page reload wipes all the data and we currently don't have a good mechanism to store the previous dataset URL in order to re-fetch it. This will need to be implemented. Currently I opted for discarding the stored dataset if it was auspice dataset. This should be alright, as the feature was primarily designed to be used when navigating to Nextclade with url params set.

This PR only implements a workaround for something that's fundamentally poorly designed and broken. The full dataset description object is still saved to local storage. The definitive fix requires serious restructuring and altering many components. It will be implemented later (#1464)

But this workaround should allow most use-cases of `dataset-json-url` to work and makes this auspice dataset feature releasable.

Can be tested with a large tree like this:

```
?dataset-json-url=https://nextstrain.org/ncov/gisaid/global/all-time
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
t:bug Type: bug, error, something isn't working
Projects
None yet
Development

No branches or pull requests

1 participant