Cache parsed results to speed up initial load #5659

Rheeseyb · 2024-05-13T09:23:04Z

Since parsing of large projects takes so long, and we do not store the parsed model (only the code) on the server (mainly due to the size of the requests), we would like to cache the parsed results locally. There are a few possible goals we can achieve here, so it is important when taking this ticket to assess a solution's suitability based on which of these goals it can achieve, how long it will take to implement and test, and how complex it is.

Possible goals:

Improve loading time of an existing (previously opened) project
Improve loading time of a project that is very similar to a previously opened project (e.g. creating a new project from a template that you've used before, or a fork of someone else's project that you were viewing, or a pair of projects with maybe a single file difference)
Reduce the time taken to clone a project from GitHub which has previously been cloned (probably the same problem as above, but not necessarily)

Important caveats:

We need to make absolutely sure that the parsed model is still valid. Any editor change which could in any way affect the parsed model must cause a cache invalidation. The fallback option here is to invalidate the cache based on the editor's commit sha, but that of course means the cache will always be very short lived. We could try something which only invalidates it based on changes which actually affect the parsed model, but that would likely be either brittle (prone to human error, since it would involve manually flagging these kinds of changes and updating the projectVersion each time) or time consuming (moving all parsing into a separate package so we can use the version number of that package).
We do not want a cache that will grow out of control, so e.g. keying based on the file contents itself would require aggressive cleaning.
The filename is important, as without it we will definitely see UID clashes.
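
To make the three caveats concrete, here is a minimal sketch of one possible shape for such a cache. It is not taken from the codebase: `BoundedParseCache`, `CacheEntry`, and `maxEntries` are hypothetical names, and the `Map` passed to the constructor only stands in for whatever persistent storage is chosen. Entries are keyed by filename, stamped with an editor version (for example the commit sha), compared against the current contents on lookup, and evicted least-recently-used first so the cache cannot grow without bound.

```typescript
// Hypothetical sketch: none of these names come from the actual codebase.
interface CacheEntry<T> {
  editorVersion: string // e.g. the editor's commit sha at the time of parsing
  contents: string // the exact source text that was parsed
  parsed: T
}

class BoundedParseCache<T> {
  private editorVersion: string
  private maxEntries: number
  // A Map iterates in insertion order, so its first key is the least recently used.
  private entries: Map<string, CacheEntry<T>>

  constructor(
    editorVersion: string,
    maxEntries: number,
    entries: Map<string, CacheEntry<T>> = new Map(), // stand-in for persistent storage
  ) {
    this.editorVersion = editorVersion
    this.maxEntries = maxEntries
    this.entries = entries
  }

  get(filename: string, contents: string): T | undefined {
    const entry = this.entries.get(filename)
    if (entry === undefined) {
      return undefined
    }
    // Invalidate if the editor changed or the file contents changed.
    if (entry.editorVersion !== this.editorVersion || entry.contents !== contents) {
      this.entries.delete(filename)
      return undefined
    }
    // Re-insert to mark this entry as most recently used.
    this.entries.delete(filename)
    this.entries.set(filename, entry)
    return entry.parsed
  }

  set(filename: string, contents: string, parsed: T): void {
    this.entries.delete(filename)
    this.entries.set(filename, { editorVersion: this.editorVersion, contents, parsed })
    // Evict least recently used entries until we are back under the limit.
    while (this.entries.size > this.maxEntries) {
      const oldest = this.entries.keys().next().value
      if (oldest === undefined) {
        break
      }
      this.entries.delete(oldest)
    }
  }

  get size(): number {
    return this.entries.size
  }
}
```

Keying by filename alone keeps at most one entry per filename, which bounds growth but means two projects with the same filename and different contents will evict each other's entries. Keying by filename plus a content hash would avoid that, at the cost of the more aggressive cleaning mentioned above.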

Note this work should not be started until #5655 has been completed

This PR introduces an `IndexedDB` cache for parse results, keyed by file name and string contents (so every change to a file's contents invalidates its cached parse result). This greatly improves load time on every load of a project after the first.

<video src="https://github.com/user-attachments/assets/faa3560b-81ca-4f1d-9d92-52526bf7f004"></video>

Important points:

1. The cache is implemented at the worker level, to keep the parse flow as close to the current one as possible. If the feature flag is on, the worker looks for the file in the cache and compares the contents; on a cache hit it returns the cached result instead of parsing the file.
2. When parsing does happen and the feature flag is on, the parsed results are stored in the cache.
3. The cache is not project specific, so cached results can be shared between projects (if the file name and contents match).
4. Arbitrary code (chunks of code we send as `code.tsx`) is currently not cached; this can be controlled using a feature flag.
5. This PR also adds a settings pane for controlling cache behavior. The pane allows toggling the cache and the cache log, and manually clearing the cache if necessary.

The cache settings toggle (with sound 🔉):

<video src="https://github.com/user-attachments/assets/a3c901c6-1b87-4013-a9ab-a2f531fb4457"></video>

**Commit Details:**

- The main logic changes are in `parser-printer-worker.ts` and `parse-cache-utils.worker.ts`.
- The worker now gets a `parsingCacheOptions` argument, which controls whether or not to use the cache (and also logging).

**Manual Tests:**
I hereby swear that:

- [X] I opened a hydrogen project and it loaded
- [X] I could navigate to various routes in Preview mode

Fixes #5659
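
The lookup-then-parse flow described in points 1, 2 and 4 can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from `parser-printer-worker.ts`: the field names on `ParsingCacheOptions`, the `parseWithCache` helper, and the exact-match check against `code.tsx` are all hypothetical, and a synchronous `Map` stands in for the `IndexedDB` store (whose real API is asynchronous).

```typescript
// Hypothetical option names; the PR only says the worker receives a
// `parsingCacheOptions` argument controlling cache use and logging.
interface ParsingCacheOptions {
  useCache: boolean
  logCache: boolean
  cacheArbitraryCode: boolean
}

interface CachedParse<T> {
  contents: string
  parsed: T
}

type ParseFn<T> = (filename: string, contents: string) => T

// Assumption: arbitrary code chunks arrive under exactly this filename.
const ARBITRARY_CODE_FILENAME = 'code.tsx'

function parseWithCache<T>(
  filename: string,
  contents: string,
  options: ParsingCacheOptions,
  store: Map<string, CachedParse<T>>, // synchronous stand-in for IndexedDB
  parse: ParseFn<T>,
  log: (message: string) => void = console.log,
): T {
  const cacheable =
    options.useCache &&
    (options.cacheArbitraryCode || filename !== ARBITRARY_CODE_FILENAME)

  if (cacheable) {
    const cached = store.get(filename)
    // A hit requires the same filename and identical contents.
    if (cached !== undefined && cached.contents === contents) {
      if (options.logCache) {
        log(`parse cache hit: ${filename}`)
      }
      return cached.parsed
    }
  }

  // Cache miss (or caching disabled): fall back to the normal parse path.
  const parsed = parse(filename, contents)
  if (cacheable) {
    store.set(filename, { contents, parsed })
    if (options.logCache) {
      log(`parse cache stored: ${filename}`)
    }
  }
  return parsed
}
```

Because the store is not scoped to a project, a second project containing a file with the same name and contents would hit the entry written by the first, which is the sharing behavior described in point 3.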

Rheeseyb added the Parser Performance label May 13, 2024

liady mentioned this issue Sep 24, 2024

feat(parser): parse cache #6381

Merged

2 tasks

liady closed this as completed in #6381 Oct 4, 2024
