Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix race condition in fs::create_dir_all #39799

Merged
merged 8 commits into from
Mar 19, 2017
Merged

Conversation

dpc
Copy link
Contributor

@dpc dpc commented Feb 14, 2017

The code would crash if the directory was created after create_dir_all
checked whether the directory already existed. This was contrary to
the documentation which claimed to create the directory if it doesn't
exist, implying (but not stating) that there would not be a failure
due to the directory existing.

@rust-highfive
Copy link
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @aturon (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@dpc
Copy link
Contributor Author

dpc commented Feb 14, 2017

This is resubmission of #30152 , that I've discussed in: #33707 .

I'd like to add a test for it, but how exactly should it work?

@frewsxcv frewsxcv added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Feb 14, 2017
@alexcrichton
Copy link
Member

Thanks for the PR! Could this actually lift the implementation in the compiler which I believe tries to minimize syscalls in the "normal" case.

Also yeah you can just add a test with a few threads hammering at creating directories and just make sure they don't fail.

@seppo0010
Copy link
Contributor

There's still room for a race condition if the directory is created after mkdir but before the second is_dir (in the pattern matching), right? If that's the case having threads hammering the function may trigger it. I'm not sure that's solvable, it would return AlreadyExists when the directory was a directory and no longer exists.

@dpc
Copy link
Contributor Author

dpc commented Feb 14, 2017

@alexcrichton How to "lift it"? Move from librustc to libstd and then use that new location instead in librustc code (or just call directly from the old librustc name)?

@seppo0010 : I don't understand. If another thread called mkdir after current one did it, it would fail. Can you plese explain? I need more coffee. Anyway, I'd probably use the version @alexcrichton pointed to.

@seppo0010
Copy link
Contributor

@dpc sorry, my case is quite weird, let me write it down a more detailed version. Let's say process1 and process2 are calling create_dir_all with the same Path that's just a directory (no recursion) and then calling rmdir. The directory does not exist before this scenario.

process1: is_dir(path) # false
process2: is_dir(path) # false
process1: mkdir(path) # Ok(())
process2: mkdir(path) # Err(ErrorKind::AlreadyExists)
process1: rmdir # Ok(())
process2: is_dir(path) # false
process2 returns Err(ErrorKind::AlreadyExists)

I'm not saying it's a bug, but it might be unexpected, and I don't even know what may be better in that case.

@alexcrichton
Copy link
Member

@dpc yeah we can just copy the code and then update the compiler to call libstd (as libstd would handle this race)

src/libstd/fs.rs Outdated
@@ -1769,11 +1769,20 @@ impl DirBuilder {
}

fn create_dir_all(&self, path: &Path) -> io::Result<()> {
use io::{ErrorKind};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this import get moved up to the top of the file with the other imports?

@alexcrichton
Copy link
Member

ping @dpc, thoughts on updating this?

@dpc
Copy link
Contributor Author

dpc commented Mar 7, 2017

I agree with all feedback, and will get to it, probably on a weekend. If anyone would like you to pick it up, that's also fine with me.

@alexcrichton
Copy link
Member

Thanks @dpc!

@dpc dpc force-pushed the create_dir_all branch 2 times, most recently from bc9df80 to 9865187 Compare March 11, 2017 07:58
@dpc
Copy link
Contributor Author

dpc commented Mar 11, 2017

Ooops. Sorry. I wasn't able to compile anything on current master, but I was able to work with auto, and I pushed that.

Also, seems like create_dir_racy has a bug. If it was a file that was created, it won't detect it. It doesn't ever call path.is_dir... But it's late, so I'll double check everything during the weekend.

src/libstd/fs.rs Outdated
Err(e) => return Err(e),
}
match path.parent() {
Some(p) => try!(create_dir_all(p)),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this call self.create_dir_all to ensure reuse of configuration in self? (if we add it in the future)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

@alexcrichton
Copy link
Member

Looking good to me, thanks! Could you also add a test for specifically this use case?

@dpc
Copy link
Contributor Author

dpc commented Mar 13, 2017

Done

@alexcrichton
Copy link
Member

Looks good! I think Travis has a tidy error though

(r=me otherwise)

@petrochenkov
Copy link
Contributor

@dpc
Maybe you could replace create_dir_racy with fs::create_dir_all in build_helper as well?

@alexcrichton
Copy link
Member

@petrochenkov @dpc oh not yet, we'll need to wait for a new release. That's compiled against the stage0 libstd which doesn't have this fix.

@alexcrichton
Copy link
Member

@bors: r+

@bors
Copy link
Contributor

bors commented Mar 13, 2017

📌 Commit 4e2114f has been approved by alexcrichton

@bors
Copy link
Contributor

bors commented Mar 13, 2017

⌛ Testing commit 4e2114f with merge e1be448...

@bors
Copy link
Contributor

bors commented Mar 13, 2017

💔 Test failed - status-appveyor

@alexcrichton
Copy link
Member

@bors: retry

@bors
Copy link
Contributor

bors commented Mar 13, 2017

⌛ Testing commit 4e2114f with merge 4d01a7d...

frewsxcv added a commit to frewsxcv/rust that referenced this pull request Mar 18, 2017
Fix race condition in fs::create_dir_all

The code would crash if the directory was created after create_dir_all
checked whether the directory already existed.  This was contrary to
the documentation which claimed to create the directory if it doesn't
exist, implying (but not stating) that there would not be a failure
due to the directory existing.
frewsxcv added a commit to frewsxcv/rust that referenced this pull request Mar 18, 2017
Fix race condition in fs::create_dir_all

The code would crash if the directory was created after create_dir_all
checked whether the directory already existed.  This was contrary to
the documentation which claimed to create the directory if it doesn't
exist, implying (but not stating) that there would not be a failure
due to the directory existing.
@frewsxcv
Copy link
Member

@dpc
Copy link
Contributor Author

dpc commented Mar 19, 2017

I don't understand what happened, and I don't have Windows to debug. If one of the threads have failed, I would expect check! to print panic!("{} failed with: {}", stringify!($e), e), and I don't see it anywhere (and I did the last time).

Could someone with Windows help?

src/libstd/fs.rs Outdated
@@ -2261,11 +2282,41 @@ mod tests {
}

#[test]
fn concurrent_recursive_mkdir() {
for _ in 0..50 {
let mut dir = tmpdir().join("a");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the main problem here is this use of tmpdir(). It deletes the path it returns when it goes out of scope and because it's not assigned to a variable that's immediately. See here. Changing this to the following should fix that:

let tmpdir = tmpdir();
let mut dir = tmpdir.join("a");

src/libstd/fs.rs Outdated
@@ -2261,11 +2282,41 @@ mod tests {
}

#[test]
fn concurrent_recursive_mkdir() {
for _ in 0..50 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this test really need to be run 50 times? It fails the first time for me on Windows.

Copy link
Member

@retep998 retep998 Mar 19, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's because it relies on there being a race condition between the threads, but race conditions don't occur very often. By doing the test multiple times, it is much more likely to hit the race condition.

src/libstd/fs.rs Outdated
fn concurrent_recursive_mkdir() {
for _ in 0..50 {
let mut dir = tmpdir().join("a");
for _ in 0..100 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you reduced the number of times this test is run rather than the the length of the path 😉.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh dear. 🤦‍

Increase lifetime of `tmpdir`, and really change the length of test
path.
@alexcrichton
Copy link
Member

@bors: r+

@bors
Copy link
Contributor

bors commented Mar 19, 2017

📌 Commit 088696b has been approved by alexcrichton

@bors
Copy link
Contributor

bors commented Mar 19, 2017

⌛ Testing commit 088696b with merge 6eb9960...

bors added a commit that referenced this pull request Mar 19, 2017
Fix race condition in fs::create_dir_all

The code would crash if the directory was created after create_dir_all
checked whether the directory already existed.  This was contrary to
the documentation which claimed to create the directory if it doesn't
exist, implying (but not stating) that there would not be a failure
due to the directory existing.
@bors
Copy link
Contributor

bors commented Mar 19, 2017

☀️ Test successful - status-appveyor, status-travis
Approved by: alexcrichton
Pushing 6eb9960 to master...

@bors bors merged commit 088696b into rust-lang:master Mar 19, 2017
@emk
Copy link
Contributor

emk commented Mar 22, 2017

@dpc Thank you for fixing one of my least favorite Rust annoyances! The path to merge was a bit rock, but it's greatly appreciated.

@dpc
Copy link
Contributor Author

dpc commented Mar 22, 2017

@emk You're welcome. I'm very happy about it myself. First, I got on contributor's list. Second, it was a torn for me too. :)

emk added a commit to faradayio/xsv that referenced this pull request Mar 22, 2017
This is a minor tweak to handle some problems which I've encountered
several times now.  The only real excuse for this change is that it's
unobstrusive and has natural semantics.

This patch modifies --filename to support two use cases:

    xsv partition --filename {}/cities.csv state . all-cities.csv
    xsv partition --filename {}/monuments.csv state . all-monuments.csv
    xsv partition --filename {}/parks.csv state . all-parks.csv

Above, we want to partition our records by state into separate
directories, but we have multiple kinds of data.

    xsv partition --filename {}/$(uuidgen).csv . input1.csv
    xsv partition --filename {}/$(uuidgen).csv . input2.csv

Above, we're running multiple (possibly parallel) copies of xsv and we
want to parition the data into multiple directories without a filename
clash.

There's one limitation to the implementation: We might theoretically hit
the `create_dir_all` race condition recently fixed by
rust-lang/rust#39799.  I'm planning to supply a final patch which works
around this race condition in stable Rust.
@emk emk mentioned this pull request Mar 22, 2017
8 tasks
emk added a commit to faradayio/xsv that referenced this pull request Mar 23, 2017
bors added a commit that referenced this pull request Jun 12, 2017
rustdoc: Use `create_dir_all` to create output directory

Currently rustdoc will fail if passed `-o foo/doc` if the `foo`
directory doesn't exist.

Also remove unneeded `mkdir` as `create_dir_all` can now handle
concurrent invocations since #39799.
bors added a commit that referenced this pull request Aug 8, 2017
rustbuild: Replace create_dir_racy with create_dir_all

`create_dir_all` has since been fixed (in #39799) so no need for `create_dir_racy`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.