initial version of checksum based freshness #14137
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @weihanglo (or someone else) some time within the next two weeks. Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (
Add unstable support for outputting file checksums for use in cargo Adds an unstable option that appends file checksums and expected lengths to the end of the dep-info file such that `cargo` can read and use these values as an alternative to file mtimes. This PR powers the changes made in this cargo PR rust-lang/cargo#14137 Here's the tracking issue for the cargo feature rust-lang/cargo#14136.
☔ The latest upstream changes (presumably #13947) made this pull request unmergeable. Please resolve the merge conflicts.
Merge conflicts resolved.
use cargo_test_support::{basic_lib_manifest, basic_manifest, project, rustc_host, rustc_host_env};

#[cargo_test]
fn checksum_actually_uses_checksum() {
At minimum, could you structure your PR so it's
- A commit with these tests without `-Zchecksum-freshness`
- A commit with the checksum work that also updates the tests to pass `-Zchecksum-freshness`
A big benefit to this is it shows to reviewers / the community how this feature is comparing to what was being done before
(sometimes, I also break out "adding an unstable feature" into its own commit which is the flag + docs)
Note: I've not dug deep into the tests, waiting on this change
So it's worth noting that `freshness_checksum.rs` derives very heavily from `freshness.rs`. So a version of `freshness_checksum.rs` without the freshness flag would just be a subset of `freshness.rs`. I'm not sure how this provides new information. There are two tests which are truly unique to `freshness_checksum.rs`, which are `same_size_different_content()` and `checksum_actually_uses_checksum()`.
One might debate the merit of duplicating the tests like that. If you really wanted to deduplicate the tests then this would likely require a special case be added to the test runner code.
☔ The latest upstream changes (presumably #14493) made this pull request unmergeable. Please resolve the merge conflicts.
Looks like we are almost there! Thanks for your efforts, epage and Xaeroxe. I thought it would be a tough review, but honestly it was a happy time :)
## checksum-freshness
* Tracking issue: [#14136](https://github.com/rust-lang/cargo/issues/14136)

The `-Z checksum-freshness` flag will replace the use of file mtimes in cargo's
It may be worth noting that build script execution is not included in the current implementation.
@@ -23,6 +23,7 @@ anstream = "0.6.15"
anstyle = "1.0.8"
anyhow = "1.0.86"
base64 = "0.22.1"
blake3 = "1.5.2"
Note: need to check the compatibility of blake3 for Tier 1 with Host Tools and Tier 2 with Host Tools, as it contains some assembly code.
});
}
let Ok(checksum) = Checksum::compute(prior_checksum.algo, file) else {
    return Some(StaleItem::MissingFile(path.to_path_buf()));
Why is checksum computation failure also a `StaleItem::MissingFile`?
let Ok(file) = File::open(path) else {
    return Some(StaleItem::MissingFile(path.to_path_buf()));
};
let Ok(current_file_len) = file.metadata().map(|m| m.len()) else {
    return Some(StaleItem::FailedToReadMetadata(path.to_path_buf()));
};
Since the `stat` syscall is generally faster than `open`, should we reorder this part a bit so that we first compare the file size and only then open the file?
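The reordering suggested here could look roughly like the sketch below. It is not the PR's code: the `Check` enum and `check_len_then_open` are hypothetical names made up for illustration, and cargo's real `StaleItem` type differs. Only `std::fs::metadata` and `File::open` are real standard-library calls.

```rust
use std::fs::{self, File};
use std::io;
use std::path::Path;

/// Hypothetical outcome type for this sketch; cargo's real `StaleItem` differs.
#[derive(Debug, PartialEq)]
enum Check {
    Missing,
    LengthChanged { old: u64, new: u64 },
    NeedsChecksum,
}

/// Compare the recorded length using only `fs::metadata` (a `stat`-style call),
/// and open the file only when the length still matches.
fn check_len_then_open(path: &Path, recorded_len: u64) -> io::Result<(Check, Option<File>)> {
    let meta = match fs::metadata(path) {
        Ok(m) => m,
        // A missing file is reported as such; other errors are passed up.
        Err(e) if e.kind() == io::ErrorKind::NotFound => return Ok((Check::Missing, None)),
        Err(e) => return Err(e),
    };
    if meta.len() != recorded_len {
        // Cheap exit: the content must have changed, so no open or hashing is needed.
        let changed = Check::LengthChanged { old: recorded_len, new: meta.len() };
        return Ok((changed, None));
    }
    // Same length: the caller still has to hash the contents to decide.
    let file = File::open(path)?;
    Ok((Check::NeedsChecksum, Some(file)))
}
```

One trade-off of this ordering is that the file could change between the `metadata` call and the `open`, whereas calling `metadata()` on an already opened handle describes exactly the file that gets hashed.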
let dep_info = target_root.join(dep_info);
let cargo_exe = cargo_exe;
let cargo_exe = cargo_exe;
@@ -2250,3 +2479,201 @@ pub fn parse_rustc_dep_info(rustc_dep_info: &Path) -> CargoResult<RustcDepInfo>
Ok(ret)
}
}

/// Some algorithms are here to ensure compatibility with possible rustc outputs.
nit: checksum part could potentially be split into a module. The LoC here in this file is already frightening.
(though I believe it will not help much 😓)
@@ -2102,13 +2271,14 @@ pub struct RustcDepInfo {
struct EncodedDepInfo {
    files: Vec<(DepInfoPathType, PathBuf)>,
    env: Vec<(String, Option<String>)>,
    checksum: Vec<(DepInfoPathType, PathBuf, u64, String)>,
(not a blocker)
I feel like `files` and `checksum` should eventually merge into one field.
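One possible shape for such a merged field is sketched below. This is purely illustrative: `DepInfoFile`, `FileChecksum`, and `merge` are hypothetical names, and the `DepInfoPathType` enum here is a stand-in whose variant names are assumed rather than taken from the diff.

```rust
use std::path::PathBuf;

/// Stand-in for cargo's path-type tag; variant names are assumed for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DepInfoPathType {
    PackageRootRelative,
    TargetRootRelative,
}

/// Hypothetical per-file checksum record: expected length plus encoded digest.
#[derive(Debug, Clone, PartialEq, Eq)]
struct FileChecksum {
    file_len: u64,
    checksum: String,
}

/// Hypothetical merged entry: one record per file, with the checksum optional.
#[derive(Debug, Clone, PartialEq, Eq)]
struct DepInfoFile {
    ty: DepInfoPathType,
    path: PathBuf,
    checksum: Option<FileChecksum>,
}

/// Fold the two parallel lists from the diff into the merged shape.
fn merge(
    files: Vec<(DepInfoPathType, PathBuf)>,
    checksums: Vec<(DepInfoPathType, PathBuf, u64, String)>,
) -> Vec<DepInfoFile> {
    let mut merged: Vec<DepInfoFile> = files
        .into_iter()
        .map(|(ty, path)| DepInfoFile { ty, path, checksum: None })
        .collect();
    for (ty, path, file_len, checksum) in checksums {
        let info = FileChecksum { file_len, checksum };
        // Attach the checksum to a matching file entry, or add a new entry.
        match merged.iter_mut().find(|f| f.ty == ty && f.path == path) {
            Some(entry) => entry.checksum = Some(info),
            None => merged.push(DepInfoFile { ty, path, checksum: Some(info) }),
        }
    }
    merged
}
```

With a single list, a file cannot end up in one collection but not the other, which is presumably part of the appeal of merging the fields.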
"the file `{}` has changed (checksum didn't match, {} != {})", | ||
file.display(), | ||
stored_checksum, | ||
new_checksum, |
nit: use arg capture whenever possible and sensible
-"the file `{}` has changed (checksum didn't match, {} != {})",
-file.display(),
-stored_checksum,
-new_checksum,
+"the file `{}` has changed (checksum didn't match, {stored_checksum} != {new_checksum})",
+file.display(),
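For readers unfamiliar with inline argument capture, here is a small standalone sketch of the suggested form. The function `checksum_mismatch_message` is a hypothetical wrapper added only so the snippet compiles on its own; the format string is the one from the suggestion.

```rust
use std::path::Path;

/// Hypothetical helper wrapping the message from the review suggestion.
fn checksum_mismatch_message(file: &Path, stored_checksum: &str, new_checksum: &str) -> String {
    // Plain identifiers can be captured inline; `file.display()` is an
    // expression, so it still has to be passed as a positional argument.
    format!(
        "the file `{}` has changed (checksum didn't match, {stored_checksum} != {new_checksum})",
        file.display(),
    )
}
```

This is why the suggestion keeps `file.display()` as a trailing argument while the two checksum variables move into the string.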
@@ -183,6 +187,16 @@ impl DirtyReason {
DirtyReason::PrecalculatedComponentsChanged { .. } => {
    s.dirty_because(unit, "the precalculated components changed")
}
DirtyReason::ChecksumUseChanged { old, new: _ } => {
If we are not using `new` anywhere, should we just remove this field?
Oops. The test module might need to be rewritten. See #14039
We can sort this out later as follow-ups. Don't worry
This file looks like a copy of `freshness.rs` with `-Zchecksum-freshness` and `masquerade_as_nightly_cargo` everywhere.
Could you point out which tests are new and worth a review? I guess they are
- `checksum_actually_uses_checksum`
- `same_size_different_content`
- `modifying_and_moving`
Also, could you add a test verifying that `-Zchecksum-freshness` is gated behind nightly? (A test without `masquerade_as_nightly_cargo` is sufficient.)
Implementation for #14136 and resolves #6529
This PR implements the use of checksums in cargo fingerprints as an alternative to using mtimes. This is most useful on systems with poor mtime implementations.
This has a dependency on rust-lang/rust#126930. It's expected this will increase the time it takes to declare a build to be fresh. Still, this loss in performance may be preferable to the issues the ecosystem has had with the use of mtimes for determining freshness.
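The idea of content-based freshness can be illustrated with a minimal sketch. It is not cargo's implementation: `Recorded`, `record`, and `is_fresh` are hypothetical names, and the standard library's `DefaultHasher` stands in for a real checksum algorithm such as BLAKE3 only to keep the example dependency-free.

```rust
use std::collections::hash_map::DefaultHasher;
use std::fs;
use std::hash::Hasher;
use std::io;
use std::path::Path;

/// Recorded state for one source file: length plus a content digest.
#[derive(Debug, Clone, PartialEq)]
struct Recorded {
    len: u64,
    digest: u64,
}

/// Read the file and record its length and digest.
/// `DefaultHasher` is a stand-in; its output is not guaranteed stable across
/// Rust releases, so a persisted fingerprint would need a fixed algorithm.
fn record(path: &Path) -> io::Result<Recorded> {
    let bytes = fs::read(path)?;
    let mut hasher = DefaultHasher::new();
    hasher.write(&bytes);
    Ok(Recorded { len: bytes.len() as u64, digest: hasher.finish() })
}

/// A file is fresh when its current length and digest match the recorded
/// ones, regardless of what its mtime says.
fn is_fresh(path: &Path, prior: &Recorded) -> io::Result<bool> {
    Ok(record(path)? == *prior)
}
```

Rewriting a file with identical bytes changes its mtime but leaves it fresh under this scheme, while a same-size edit with different bytes is detected through the digest. The cost is that every check reads the whole file, which is the slowdown the description anticipates.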