Cache parsed results to speed up initial load #5659

Closed
Rheeseyb opened this issue May 13, 2024 · 0 comments · Fixed by #6381

Rheeseyb commented May 13, 2024

Since parsing of large projects takes so long, and we do not store the parsed model (only the code) on the server (mainly due to the size of the requests), we would like to cache the parsed results locally. There are a few possible goals we can achieve here, so it is important when taking this ticket to assess a solution's suitability based on which of these goals it can achieve, how long it will take to implement and test, and how complex it is.

Possible goals:

  • Improve loading time of an existing (previously opened) project
  • Improve loading time of a project that is very similar to a previously opened project (e.g. creating a new project from a template that you've used before, or a fork of someone else's project that you were viewing, or a pair of projects with maybe a single file difference)
  • Reduce the time taken to clone a project from github which has previously been cloned (probably the same problem as above, but not necessarily)

Important caveats:

  • We need to make absolutely sure that the parsed model is still valid. Any editor change which could in any way affect the parsed model must cause a cache invalidation. The fallback option here is to invalidate the cache based on the editor's commit sha, but that of course means the cache will always be very short-lived. We could try something which only invalidates it based on changes which actually affect the parsed model, but that would likely either be brittle (prone to human error since it would involve manually flagging these kinds of changes and updating the projectVersion each time) or time-consuming (moving all parsing into a separate package so we can use the version number of that package).
  • We do not want a cache that will grow out of control, so e.g. keying based on the file contents themselves would require aggressive cleaning
  • The filename is important, as without that we will definitely see UID clashes
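One way to picture how these caveats interact is the minimal sketch below. It is not the project's implementation: every name in it (`CacheEntry`, `ParseCache`, `readCache`, `writeCache`, `maxEntries`) is hypothetical, and an in-memory `Map` stands in for whatever local storage would actually be used. It assumes one entry per filename, invalidated whenever the editor version (for example the commit sha) or the file contents change, with a least-recently-used bound so the cache cannot grow without limit.

```typescript
// Hypothetical sketch: none of these names come from the actual codebase.
interface CacheEntry<T> {
  editorVersion: string // assumed to be e.g. the editor's commit sha
  contents: string // the source text the parsed model was produced from
  parsed: T
}

// Keyed by filename, so two files with the same contents never share an entry.
type ParseCache<T> = Map<string, CacheEntry<T>>

function readCache<T>(
  cache: ParseCache<T>,
  editorVersion: string,
  filename: string,
  contents: string,
): T | null {
  const entry = cache.get(filename)
  if (entry === undefined) {
    return null
  }
  // Any editor change or contents change makes the stored model untrustworthy.
  if (entry.editorVersion !== editorVersion || entry.contents !== contents) {
    cache.delete(filename)
    return null
  }
  // Re-insert so this entry becomes the most recently used one.
  cache.delete(filename)
  cache.set(filename, entry)
  return entry.parsed
}

function writeCache<T>(
  cache: ParseCache<T>,
  editorVersion: string,
  filename: string,
  contents: string,
  parsed: T,
  maxEntries: number,
): void {
  cache.delete(filename)
  cache.set(filename, { editorVersion, contents, parsed })
  // A Map iterates in insertion order, so the first keys are the least recently used.
  for (const oldest of cache.keys()) {
    if (cache.size <= maxEntries) {
      break
    }
    cache.delete(oldest)
  }
}
```

Keying on the filename and storing the contents alongside the result keeps at most one entry per file, which is one possible answer to the growth concern; keying on filename plus contents would instead need the more aggressive cleanup mentioned above.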

Note this work should not be started until #5655 has been completed

liady mentioned this issue Sep 24, 2024
liady added a commit that referenced this issue Oct 4, 2024
This PR introduces an `IndexedDB` cache for parse results, keyed by file
name and string contents (so every change in the file contents
invalidates the parse cache). This greatly improves load time on every
load of a project after the first.
<video
src="https://github.com/user-attachments/assets/faa3560b-81ca-4f1d-9d92-52526bf7f004"></video>

Important points:
1. The cache is implemented at the worker level, to keep our current
parse flow as close to the existing one as possible. If the feature flag
is on, the worker looks for the file in the cache and compares the
content; if there is a cache hit, it returns the cached result instead
of parsing the file.
2. When parsing does happen and the feature flag is on, the parsed
results are stored in the cache.
3. The cache is not project specific, allowing cached results to be
shared between projects (if the file name and contents are identical).
4. Currently, arbitrary code (chunks of code we send as `code.tsx`) is
not cached; this can be controlled using a feature flag.
5. This PR also adds a settings pane for controlling cache behavior. The
pane allows controlling the cache and the cache log, and manually
clearing the cache if necessary.
The cache settings toggle (with sound 🔉):
<video
src="https://github.com/user-attachments/assets/a3c901c6-1b87-4013-a9ab-a2f531fb4457"></video>
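The lookup-then-parse flow described in points 1, 2 and 4 can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the PR: `ParsingCacheOptions`, its two fields, `CachedResult`, and `parseWithCache` are hypothetical names, the parser is passed in as a plain function, and a synchronous `Map` stands in for `IndexedDB` (whose real API is asynchronous) to keep the control flow easy to follow.

```typescript
// Hypothetical sketch: the real worker, options shape, and storage API may differ.
interface ParsingCacheOptions {
  useParsingCache: boolean // assumed feature flag for the cache as a whole
  cacheArbitraryCode: boolean // assumed feature flag for `code.tsx` chunks
}

interface CachedResult<T> {
  contents: string
  parsed: T
}

const ARBITRARY_CODE_FILENAME = 'code.tsx'

function parseWithCache<T>(
  filename: string,
  contents: string,
  options: ParsingCacheOptions,
  cache: Map<string, CachedResult<T>>,
  parse: (filename: string, contents: string) => T,
): T {
  // Arbitrary code is only cached when its own flag is switched on.
  const cacheable =
    options.useParsingCache &&
    (options.cacheArbitraryCode || filename !== ARBITRARY_CODE_FILENAME)

  if (cacheable) {
    const cached = cache.get(filename)
    // A hit requires the stored contents to match exactly.
    if (cached !== undefined && cached.contents === contents) {
      return cached.parsed
    }
  }

  // Cache miss (or cache disabled): parse as before.
  const parsed = parse(filename, contents)
  if (cacheable) {
    cache.set(filename, { contents, parsed })
  }
  return parsed
}
```

Because the cache sits in front of the parse call and falls through to it on any miss, the rest of the parse flow does not need to know whether a result came from the cache.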

**Commit Details:**
- The main logic changes are in `parser-printer-worker.ts` and
`parse-cache-utils.worker.ts`.
- The worker now gets a `parsingCacheOptions` argument, which controls
whether or not to use the cache (and also logging)

**Manual Tests:**
I hereby swear that:

- [X] I opened a hydrogen project and it loaded
- [X] I could navigate to various routes in Preview mode

Fixes #5659