Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add user to cache filename; better handle cache load/save failures #1179

Merged
merged 5 commits into from
Sep 11, 2022

Conversation

klauer
Copy link
Contributor

@klauer klauer commented Aug 12, 2022

Context

Closes #1178

Description

This is my attempt at fixing the issue I reported (#1178) regarding cache files generated by different users on the same system.

Consider this a proposal - I'd be happy to adjust it to your preference.

Three main changes were made:

  • Added $USER (via getpass.getuser()) to the cache filename (77d8277)
  • Moved the try/except block to include cache load / open(cache_fn) (1c36098)
    • This means that if, for any reason, the cache file fails to load, the normal loading procedure can continue
  • Wrapped the cache save section in a try/except block and a similar logger.exception message (835e692)
    • I believe cache save failures should not be fatal

@erezsh
Copy link
Member

erezsh commented Aug 16, 2022

Thanks, I agree with the changes.

Moved the try/except block to include cache load

The one thing that bugs me is that now it will always say "Loading grammar from cache ...", even when the cache file doesn't exist. At least, we should change it to "checking for cache for grammar", or something of the sort.

@MegaIng Mind also taking a look?

Copy link
Member

@MegaIng MegaIng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@erezsh The behavior of always printing loading from cache might just be only when running tests and not when actually running the code with a real FileSystem. I would think the call to FS.open should already cause the Exception before getting to the logger.debug statement.

lark/lark.py Outdated
@@ -308,28 +309,30 @@ def __init__(self, grammar: 'Union[Grammar, str, IO[str]]', **options) -> None:
if self.options.cache is not True:
raise ConfigurationError("cache argument must be bool or str")

cache_fn = tempfile.gettempdir() + '/.lark_cache_%s_%s_%s.tmp' % (cache_md5, *sys.version_info[:2])
cache_fn = tempfile.gettempdir() + "/.lark_cache_%s_%s_%s_%s.tmp" % (getpass.getuser(), cache_md5, *sys.version_info[:2])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure what the gurantees are about the value returned from getpass.getuser(). It could potentially return a string that results in an invalid file name, but we could also wait till someone actually reports a problem with this. In a similar vein, getuser might fail on some weird OSes and raise an exception.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whether anything should be done here depends on whether or not this might fail on some platforms and how often this happens. I would say keep it like this, and we take another look if someone reports this breaking caching for them. But we should keep this in the back of our minds.


if FS.exists(cache_fn):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing the call to FS.exists might break the test suite. The MockFS.open function does not result in an exception being generated if the file is missing.

Copy link
Contributor Author

@klauer klauer Aug 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A call to file.exists() and then subsequently open() is redundant and susceptible to a race condition.

It looks like the test suite would need reworking to appropriately handle this change.
Would you like prefer that a test suite fix be added to this PR, or would you prefer the exists() check to be re-added?

Edit: the fix is actually relatively minor. I'll make that change and will backpedal if desired.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, what you implemented is pretty much the same I also came to conclude to be the best solution.

@klauer
Copy link
Contributor Author

klauer commented Sep 6, 2022

Anything else I can do to help move this along?

@erezsh
Copy link
Member

erezsh commented Sep 6, 2022

@klauer Sorry, I've been a little busy. I'll try to look into it this weekend.

erezsh added a commit that referenced this pull request Sep 11, 2022
erezsh added a commit that referenced this pull request Sep 11, 2022
@erezsh erezsh merged commit 90f9af5 into lark-parser:master Sep 11, 2022
@erezsh
Copy link
Member

erezsh commented Sep 11, 2022

Thanks for the PR!

@klauer klauer deleted the fix_cache_filename branch September 12, 2022 15:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

lark.Lark cache=True default cache filename can be problematic on multi-user systems
3 participants