Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slow compilation on large project when nothing changes #40858

Closed
jonhoo opened this issue Mar 27, 2017 · 5 comments
Closed

Slow compilation on large project when nothing changes #40858

jonhoo opened this issue Mar 27, 2017 · 5 comments
Labels
A-incr-comp Area: Incremental compilation C-enhancement Category: An issue proposing an enhancement or a PR with one. I-compiletime Issue: Problems and improvements with respect to compile times.

Comments

@jonhoo
Copy link
Contributor

jonhoo commented Mar 27, 2017

I'm working on a decently large Rust project, and have lately found compilation times (especially for cargo test) to be excruciatingly slow. In particular, from a complete build, followed by a touch src/lib.rs takes ~80s with the latest nightly and CARGO_INCREMENTAL=1:

$ env | grep CARGO
CARGO_INCREMENTAL=1
$ git clone https://github.com/mit-pdos/distributary.git
$ git checkout 2fd16356a15ab3b0fbb7dbd9cbc85add984251b7 # HEAD has a workaround
$ cd distributary
$ cargo --version
cargo 0.19.0-nightly (c995e9eb5 2017-03-17)
$ rustc --version
rustc 1.17.0-nightly (7846dbe0c 2017-03-26)
$ cargo clean
$ cargo t --no-run
   Compiling nom v1.2.4
   Compiling unicode-normalization v0.1.4
   Compiling serde v0.9.11
...
   Compiling distributary v0.1.0 (file:///home/jon/dev/projects/distributary)
    Finished dev [unoptimized + debuginfo] target(s) in 362.66 secs
$ cargo t --no-run
    Finished dev [unoptimized + debuginfo] target(s) in 0.1 secs
$ touch src/lib.rs
$ cargo t --no-run
   Compiling distributary v0.1.0 (file:///home/jon/dev/projects/distributary)
    Finished dev [unoptimized + debuginfo] target(s) in 80.84 secs

@mbrubeck suggested on #rust that I try using ld.gold. That helps, but the slowdown is still there:

$ cargo clean
$ env RUSTFLAGS="-C link-args=-fuse-ld=gold" cargo t --no-run
   ...
   Compiling distributary v0.1.0 (file:///home/jon/dev/projects/distributary)
    Finished dev [unoptimized + debuginfo] target(s) in 276.47 secs
$ touch src/lib.rs
$ env RUSTFLAGS="-C link-args=-fuse-ld=gold" cargo t --no-run
   Compiling distributary v0.1.0 (file:///home/jon/dev/projects/distributary)
    Finished dev [unoptimized + debuginfo] target(s) in 45.35 secs

Part of the issue seems to be that rust re-links the library and all binaries (of which this particular project has 11) each time, and double-compiles a bunch of things (due to #[cfg(test)]). Compiling just the unit tests in lib is better, though still somewhat slow, and obviously insufficient (I also want to compile and run tests/):

$ touch src/lib.rs
$ env RUSTFLAGS="-C link-args=-fuse-ld=gold" cargo rustc --lib -- --test
   Compiling distributary v0.1.0 (file:///home/jon/dev/projects/distributary)
    Finished dev [unoptimized + debuginfo] target(s) in 11.72 secs

@aidanhs observed that it is actually necessary to compile the binaries, because the tests script may depend on them (e.g., cargo), or because the binaries themselves contain tests. I tried using cargo test --test lib --lib to avoid compiling binaries, but that doesn't seem to work correctly.

I'm not entirely sure exactly what the right fix for this is, but 12s compilation time for the core library (and 80s for the full crate) when nothing has changed seems excessive, and just what incremental compile was intended to help with. Having a way to say "don't compile binaries when testing" would also be nice, and save a decent chunk of compilation time for projects with many binaries, but I'm guessing that's more of an orthogonal cargo issue?

@michaelwoerister michaelwoerister added the A-incr-comp Area: Incremental compilation label Mar 28, 2017
@michaelwoerister
Copy link
Member

Would you mind running things with RUSTFLAGS=-Ztime-passes? The output of that would provide some useful information where the compiler spends its time during re-compilation.

@jonhoo
Copy link
Contributor Author

jonhoo commented Mar 28, 2017

On HEAD, where all binaries are hidden behind a feature flag (and thus aren't compiled):

$ env RUSTFLAGS=-Ztime-passes cargo t --no-run > /dev/null
$ touch src/lib.rs
$ env RUSTFLAGS=-Ztime-passes cargo t --no-run
   Compiling distributary v0.1.0 (file:///home/jon/dev/projects/distributary)
time: 0.053; rss: 63MB	parsing
time: 0.000; rss: 63MB	recursion limit
time: 0.000; rss: 63MB	crate injection
time: 0.052; rss: 63MB	parsing
time: 0.000; rss: 63MB	recursion limit
time: 0.000; rss: 63MB	crate injection
time: 0.001; rss: 66MB	plugin loading
time: 0.000; rss: 66MB	plugin registration
time: 0.001; rss: 66MB	plugin loading
time: 0.000; rss: 66MB	plugin registration
time: 0.245; rss: 214MB	expansion
time: 0.000; rss: 214MB	maybe building test harness
time: 0.003; rss: 214MB	maybe creating a macro crate
time: 0.000; rss: 214MB	checking for inline asm in case the target doesn't support it
time: 0.008; rss: 214MB	early lint checks
time: 0.003; rss: 214MB	AST validation
time: 0.039; rss: 222MB	name resolution
time: 0.008; rss: 222MB	complete gated feature checking
time: 0.025; rss: 235MB	lowering ast -> hir
time: 0.008; rss: 237MB	indexing hir
time: 0.003; rss: 237MB	attribute checking
time: 0.001; rss: 232MB	language item collection
time: 0.006; rss: 232MB	lifetime resolution
time: 0.000; rss: 232MB	looking for entry point
time: 0.000; rss: 232MB	looking for plugin registrar
time: 0.015; rss: 242MB	region resolution
time: 0.003; rss: 242MB	loop checking
time: 0.001; rss: 242MB	static item recursion checking
time: 0.389; rss: 231MB	expansion
time: 0.017; rss: 232MB	maybe building test harness
time: 0.003; rss: 232MB	maybe creating a macro crate
time: 0.000; rss: 232MB	checking for inline asm in case the target doesn't support it
time: 0.013; rss: 232MB	early lint checks
time: 0.003; rss: 232MB	AST validation
time: 0.089; rss: 243MB	compute_incremental_hashes_map
time: 0.069; rss: 241MB	name resolution
time: 0.010; rss: 241MB	complete gated feature checking
time: 0.043; rss: 256MB	lowering ast -> hir
time: 0.020; rss: 257MB	indexing hir
time: 0.005; rss: 257MB	attribute checking
time: 0.003; rss: 246MB	language item collection
time: 0.009; rss: 246MB	lifetime resolution
time: 0.000; rss: 246MB	looking for entry point
time: 0.000; rss: 246MB	looking for plugin registrar
time: 0.023; rss: 254MB	region resolution
time: 0.007; rss: 254MB	loop checking
time: 0.001; rss: 254MB	static item recursion checking
time: 0.262; rss: 258MB	load_dep_graph
time: 0.005; rss: 249MB	stability index
time: 0.012; rss: 250MB	stability checking
time: 0.034; rss: 262MB	type collecting
time: 0.003; rss: 262MB	variance inference
time: 0.004; rss: 262MB	impl wf inference
time: 0.151; rss: 255MB	compute_incremental_hashes_map
time: 0.068; rss: 281MB	coherence checking
time: 0.071; rss: 283MB	wf checking
time: 0.056; rss: 284MB	item-types checking
time: 0.302; rss: 262MB	load_dep_graph
time: 0.006; rss: 262MB	stability index
time: 0.016; rss: 264MB	stability checking
time: 0.048; rss: 276MB	type collecting
time: 0.007; rss: 276MB	variance inference
time: 0.007; rss: 278MB	impl wf inference
time: 0.072; rss: 297MB	coherence checking
time: 0.085; rss: 299MB	wf checking
time: 0.117; rss: 312MB	item-types checking
time: 1.647; rss: 321MB	item-bodies checking
time: 0.066; rss: 323MB	const checking
time: 0.027; rss: 323MB	privacy checking
time: 0.010; rss: 323MB	intrinsic checking
time: 0.005; rss: 323MB	effect checking
time: 0.018; rss: 323MB	match checking
time: 0.013; rss: 325MB	liveness checking
time: 0.049; rss: 325MB	rvalue checking
time: 0.128; rss: 367MB	MIR dump
  time: 0.019; rss: 367MB	SimplifyCfg
  time: 0.029; rss: 367MB	QualifyAndPromoteConstants
  time: 0.040; rss: 367MB	TypeckMir
  time: 0.002; rss: 367MB	SimplifyBranches
  time: 0.010; rss: 367MB	SimplifyCfg
time: 0.100; rss: 367MB	MIR cleanup and validation
time: 0.159; rss: 369MB	borrow checking
time: 0.003; rss: 369MB	reachability checking
time: 0.014; rss: 369MB	death checking
time: 0.000; rss: 369MB	unused lib feature checking
time: 0.096; rss: 369MB	lint checking
time: 0.000; rss: 369MB	resolving dependency formats
  time: 0.003; rss: 369MB	NoLandingPads
  time: 0.008; rss: 369MB	SimplifyCfg
  time: 0.022; rss: 369MB	EraseRegions
  time: 0.004; rss: 369MB	AddCallGuards
  time: 0.083; rss: 375MB	ElaborateDrops
  time: 0.002; rss: 375MB	NoLandingPads
  time: 0.018; rss: 375MB	SimplifyCfg
  time: 0.000; rss: 375MB	Inline
  time: 0.012; rss: 375MB	InstCombine
  time: 0.005; rss: 375MB	Deaggregator
  time: 0.002; rss: 375MB	CopyPropagation
  time: 0.013; rss: 375MB	SimplifyLocals
  time: 0.003; rss: 375MB	AddCallGuards
  time: 0.001; rss: 375MB	PreTrans
time: 0.178; rss: 375MB	MIR optimisations
  time: 0.025; rss: 380MB	write metadata
time: 2.632; rss: 362MB	item-bodies checking
time: 0.105; rss: 368MB	const checking
time: 0.052; rss: 368MB	privacy checking
time: 0.016; rss: 369MB	intrinsic checking
time: 0.008; rss: 369MB	effect checking
time: 0.026; rss: 369MB	match checking
time: 0.020; rss: 371MB	liveness checking
time: 0.068; rss: 371MB	rvalue checking
time: 0.196; rss: 443MB	MIR dump
  time: 0.026; rss: 443MB	SimplifyCfg
  time: 0.042; rss: 443MB	QualifyAndPromoteConstants
  time: 0.067; rss: 443MB	TypeckMir
  time: 0.004; rss: 443MB	SimplifyBranches
  time: 0.015; rss: 443MB	SimplifyCfg
time: 0.154; rss: 443MB	MIR cleanup and validation
time: 0.266; rss: 443MB	borrow checking
time: 0.002; rss: 443MB	reachability checking
time: 0.015; rss: 443MB	death checking
time: 0.000; rss: 443MB	unused lib feature checking
time: 0.139; rss: 443MB	lint checking
time: 0.013; rss: 443MB	resolving dependency formats
  time: 0.003; rss: 443MB	NoLandingPads
  time: 0.015; rss: 443MB	SimplifyCfg
  time: 0.036; rss: 447MB	EraseRegions
  time: 0.006; rss: 447MB	AddCallGuards
  time: 0.123; rss: 455MB	ElaborateDrops
  time: 0.003; rss: 455MB	NoLandingPads
  time: 0.024; rss: 455MB	SimplifyCfg
  time: 0.000; rss: 455MB	Inline
  time: 0.016; rss: 455MB	InstCombine
  time: 0.007; rss: 455MB	Deaggregator
  time: 0.004; rss: 455MB	CopyPropagation
  time: 0.018; rss: 455MB	SimplifyLocals
  time: 0.005; rss: 455MB	AddCallGuards
  time: 0.002; rss: 455MB	PreTrans
time: 0.261; rss: 455MB	MIR optimisations
  time: 0.000; rss: 455MB	write metadata
  time: 2.681; rss: 438MB	translation item collection
  time: 0.637; rss: 480MB	codegen unit partitioning
  time: 2.946; rss: 529MB	translation item collection
  time: 0.286; rss: 499MB	internalize symbols
time: 5.346; rss: 499MB	translation
time: 0.000; rss: 499MB	assert dep graph
time: 0.477; rss: 489MB	serialize dep graph
  time: 0.002; rss: 395MB	codegen passes [3]
time: 0.083; rss: 395MB	LLVM passes
time: 0.000; rss: 395MB	serialize work products
  time: 0.685; rss: 554MB	codegen unit partitioning
time: 1.331; rss: 399MB	linking
time: 0.007; rss: 55MB	parsing
time: 0.000; rss: 55MB	recursion limit
time: 0.000; rss: 55MB	crate injection
time: 0.000; rss: 55MB	plugin loading
time: 0.000; rss: 55MB	plugin registration
time: 0.151; rss: 196MB	expansion
time: 0.004; rss: 198MB	maybe building test harness
time: 0.000; rss: 198MB	maybe creating a macro crate
time: 0.000; rss: 198MB	checking for inline asm in case the target doesn't support it
time: 0.003; rss: 198MB	early lint checks
time: 0.001; rss: 198MB	AST validation
time: 0.034; rss: 204MB	name resolution
time: 0.001; rss: 204MB	complete gated feature checking
time: 0.006; rss: 204MB	lowering ast -> hir
time: 0.002; rss: 206MB	indexing hir
time: 0.000; rss: 206MB	attribute checking
time: 0.000; rss: 210MB	language item collection
time: 0.001; rss: 210MB	lifetime resolution
time: 0.000; rss: 210MB	looking for entry point
time: 0.000; rss: 210MB	looking for plugin registrar
time: 0.002; rss: 210MB	region resolution
time: 0.000; rss: 210MB	loop checking
time: 0.000; rss: 210MB	static item recursion checking
time: 0.015; rss: 210MB	compute_incremental_hashes_map
time: 0.054; rss: 218MB	load_dep_graph
time: 0.000; rss: 218MB	stability index
time: 0.003; rss: 218MB	stability checking
time: 0.003; rss: 218MB	type collecting
time: 0.000; rss: 218MB	variance inference
time: 0.000; rss: 218MB	impl wf inference
time: 0.000; rss: 218MB	coherence checking
time: 0.002; rss: 218MB	wf checking
time: 0.013; rss: 220MB	item-types checking
  time: 0.337; rss: 557MB	internalize symbols
time: 5.774; rss: 557MB	translation
time: 0.000; rss: 557MB	assert dep graph
time: 0.414; rss: 268MB	item-bodies checking
time: 0.011; rss: 268MB	const checking
time: 0.003; rss: 268MB	privacy checking
time: 0.002; rss: 268MB	intrinsic checking
time: 0.001; rss: 268MB	effect checking
time: 0.003; rss: 268MB	match checking
time: 0.002; rss: 268MB	liveness checking
time: 0.006; rss: 268MB	rvalue checking
time: 0.028; rss: 276MB	MIR dump
  time: 0.003; rss: 276MB	SimplifyCfg
  time: 0.006; rss: 276MB	QualifyAndPromoteConstants
  time: 0.008; rss: 276MB	TypeckMir
  time: 0.000; rss: 276MB	SimplifyBranches
  time: 0.002; rss: 276MB	SimplifyCfg
time: 0.020; rss: 276MB	MIR cleanup and validation
time: 0.032; rss: 276MB	borrow checking
time: 0.000; rss: 276MB	reachability checking
time: 0.003; rss: 276MB	death checking
time: 0.000; rss: 276MB	unused lib feature checking
time: 0.022; rss: 276MB	lint checking
time: 0.014; rss: 276MB	resolving dependency formats
  time: 0.000; rss: 276MB	NoLandingPads
  time: 0.002; rss: 276MB	SimplifyCfg
  time: 0.004; rss: 278MB	EraseRegions
  time: 0.001; rss: 278MB	AddCallGuards
  time: 0.020; rss: 280MB	ElaborateDrops
  time: 0.000; rss: 280MB	NoLandingPads
  time: 0.004; rss: 280MB	SimplifyCfg
  time: 0.000; rss: 280MB	Inline
  time: 0.003; rss: 280MB	InstCombine
  time: 0.000; rss: 280MB	Deaggregator
  time: 0.000; rss: 280MB	CopyPropagation
  time: 0.002; rss: 280MB	SimplifyLocals
  time: 0.001; rss: 280MB	AddCallGuards
  time: 0.000; rss: 280MB	PreTrans
time: 0.038; rss: 280MB	MIR optimisations
  time: 0.000; rss: 280MB	write metadata
time: 0.668; rss: 542MB	serialize dep graph
  time: 0.003; rss: 406MB	codegen passes [9]
time: 0.081; rss: 401MB	LLVM passes
time: 0.000; rss: 401MB	serialize work products
  time: 0.465; rss: 300MB	translation item collection
  time: 0.100; rss: 304MB	codegen unit partitioning
  time: 0.046; rss: 320MB	internalize symbols
time: 1.056; rss: 320MB	translation
time: 0.000; rss: 320MB	assert dep graph
time: 0.110; rss: 324MB	serialize dep graph
  time: 0.002; rss: 307MB	codegen passes [1]
time: 0.021; rss: 308MB	LLVM passes
time: 0.000; rss: 308MB	serialize work products
  time: 3.697; rss: 401MB	running linker
time: 3.711; rss: 401MB	linking
  time: 3.605; rss: 308MB	running linker
time: 3.620; rss: 308MB	linking
    Finished dev [unoptimized + debuginfo] target(s) in 16.75 secs

@michaelwoerister
Copy link
Member

These numbers look in line with incr. compilation's current capabilities to me, as there's almost no time spent in LLVM. translation item collection takes up quite a bit of time, which is unfortunate. Hopefully, we'll find ways to speed that up in the future. item-bodies checking, which is the bulk of type-checking and inference, also shows up high on the list. Incrementalizing that is being worked on.

It would be nice to integrate Gold's and MSVC's incremental linking capabilities but nobody has gotten around to doing that yet. Here's a related issue: #37543

In summary: This is not the final state of how fast incremental compilation will be.

@Mark-Simulacrum Mark-Simulacrum added C-enhancement Category: An issue proposing an enhancement or a PR with one. I-compiletime Issue: Problems and improvements with respect to compile times. labels Jul 27, 2017
@ghost
Copy link

ghost commented Sep 2, 2017

I'm also suffering with extreme compilation times. My Ztime-passes stats look a lot like @jonhoo's, but actually a bit worse.

@alexcrichton
Copy link
Member

I'm going to close this in favor of #42293 as it sounds like this is "working as is" today and otherwise #42293 should help us improve quite a bit here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-incr-comp Area: Incremental compilation C-enhancement Category: An issue proposing an enhancement or a PR with one. I-compiletime Issue: Problems and improvements with respect to compile times.
Projects
None yet
Development

No branches or pull requests

4 participants