Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggestion: Set lazy = FALSE by default on Windows? #1266

Closed
christopherkenny opened this issue Aug 6, 2021 · 15 comments
Closed

Suggestion: Set lazy = FALSE by default on Windows? #1266

christopherkenny opened this issue Aug 6, 2021 · 15 comments

Comments

@christopherkenny
Copy link

Hello!

The new speed is great with version 2.0.0. Thank you guys for all you do for the R community.

I've noticed across a few command checks unexpected issues with using readr, especially with tempfiles on Windows. For example, PL94171, which we are submitting an update using lazy = FALSE. We see a similar issue within the redist package where a tempfile is written and can't be overwritten once opened with readr. (That particular test is skipped on CRAN because it's time consuming.)

After some looking around, I found that this is explained and expected, see the readr 2.0.0 post

I was wondering if you would ever consider setting lazy = FALSE by default if it's a Windows machine. One approach could be a simple function, like the following:

is_windows <- function() {
    Sys.info()[['sysname']] == 'Windows'
}

Then the default for lazy would be lazy = !is_windows()

I understand if this type of change could cause other concerns, so feel free to disregard it.

Thanks,
Chris

This is loosely related to #1263.

@Robinlovelace
Copy link

I'm not 100% sure but suspect this issue that would be fixed in this suggestion is the same that caused tests to fail on GitHub Actions on 2 of my packages.

The fix implemented (credit: @agila5 ) there was to add the following lines before using readr code:

  # Set the local edition for readr.
  # See https://github.com/ropensci/stats19/issues/205
  if (.Platform$OS.type == "windows" && utils::packageVersion("readr") >= "2.0.0") {
    readr::local_edition(1)
  }

I think an upstream fix would be ideal to prevent others being affected by this platform-specific and potentially hard to diagnose issue so big 👍 on the suggestion.

@Robinlovelace
Copy link

Commits that stopped readr code from failing in case of use: ropensci/stats19@23e61a1 and ITSLeeds/pct@8035434

@jimhester
Copy link
Collaborator

Why did you not just put lazy = FALSE in your calls?

@Robinlovelace
Copy link

Robinlovelace commented Aug 20, 2021

Why did you not just put lazy = FALSE in your calls?

I was unaware of this solution. The solution was a suggestion form Andrea and it fixed the failing actions so I merged it: ropensci/stats19#206

Do you think updating the relevant section would be better? Happy to update the code in both packages if so if an upstream fix on the readr side is not planned.

Are you considering implementing Christopher's suggestion?

@andoomcz
Copy link

This is a similar suggestion but instead of having lazy = FALSE would it be possible to create a function that kills the lazy reading without having to restart the session?

@jimhester
Copy link
Collaborator

@andoomcz you can currently use the unexported vroom:::vroom_materialize() function to eagerly read the rest of the data. It is possible we could supply a function in readr to call this.

@natbprice
Copy link

I am often finding that readr is locking files. After reading a file in R and editing the file I find I cannot save the file again: "Error: Cannot open file for writing". How do I close the file connection and unlock the file so it can be overwritten?

@jimhester
Copy link
Collaborator

Either use lazy = FALSE on the file when reading, or make sure the data is fully read via, vroom:::vroom_materialize().

@natbprice
Copy link

@jimhester Thanks. What are the arguments for vroom:::vroom_materialize()? It looks like it wants an argument x and an argument replace.

Do you have any ideas for making this more seamless? Would it be possible to catch the file lock error and retry after calling vroom:::vroom_materialize()?

@jimhester
Copy link
Collaborator

You should just use lazy = FALSE when you read the file.

@natbprice
Copy link

After the file is locked it is too late to set lazy = FALSE. To add some context, this is not a minor issue for me. Reading files is a low level workhorse function and if the defaults are not reasonable and/or maintainers are not open to developing a fix (@hadley), I will probably just find another package to use.

  1. Is there a way to unlock a file which readr has locked without killing my R session? Is vroom:::vroom_materialize() the suggested workaround and if so what are the arguments?

  2. Would you be open to using options(...) to override the default lazy = TRUE argument?

@hadley
Copy link
Member

hadley commented Sep 30, 2021

@natbprice if using lazy = FALSE is not possible for you, it definitely sounds like another package might be a good fit for your needs. data.table::fread() is also very very fast.

@jimhester
Copy link
Collaborator

jimhester commented Sep 30, 2021

I added a readr.read_lazy global option to let users control the default behavior of lazy reading

@natbprice
Copy link

natbprice commented Sep 30, 2021

Thanks for adding the new option! Really appreciate your help.

@christopherkenny
Copy link
Author

Thank you @jimhester! The readr.read_lazy option should fix my concern, which is focused on when readr is used within a package, rather than directly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants