-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Optimize calls for utility and lottery #9339
Conversation
/benchmark runtime pallet pallet_utility |
Finished benchmark for branch: gui-optimize-calls Benchmark: Benchmark Runtime Pallet cargo run --release --features=runtime-benchmarks --manifest-path=bin/node/cli/Cargo.toml -- benchmark --chain=dev --steps=50 --repeat=20 --pallet=pallet_utility --extrinsic="*" --execution=wasm --wasm-execution=compiled --heap-pages=4096 --output=./frame/utility/src/weights.rs --template=./.maintain/frame-weight-template.hbs ResultsPallet: "pallet_utility", Extrinsic: "batch", Lowest values: [], Highest values: [], Steps: [50], Repeat: 20
|
… gui-optimize-calls
/benchmark runtime pallet pallet_lottery |
…/node/cli/Cargo.toml -- benchmark --chain=dev --steps=50 --repeat=20 --pallet=pallet_utility --extrinsic=* --execution=wasm --wasm-execution=compiled --heap-pages=4096 --output=./frame/utility/src/weights.rs --template=./.maintain/frame-weight-template.hbs
Finished benchmark for branch: gui-optimize-calls Benchmark: Benchmark Runtime Pallet cargo run --release --features=runtime-benchmarks --manifest-path=bin/node/cli/Cargo.toml -- benchmark --chain=dev --steps=50 --repeat=20 --pallet=pallet_lottery --extrinsic="*" --execution=wasm --wasm-execution=compiled --heap-pages=4096 --output=./frame/lottery/src/weights.rs --template=./.maintain/frame-weight-template.hbs ResultsPallet: "pallet_lottery", Extrinsic: "buy_ticket", Lowest values: [], Highest values: [], Steps: [50], Repeat: 20
|
…/node/cli/Cargo.toml -- benchmark --chain=dev --steps=50 --repeat=20 --pallet=pallet_lottery --extrinsic=* --execution=wasm --wasm-execution=compiled --heap-pages=4096 --output=./frame/lottery/src/weights.rs --template=./.maintain/frame-weight-template.hbs
@thiolliere we can't really trust the output of the benchmarking bot right now as @joao-paulo-parity is working to remove some of the overhead that snuck in recently: paritytech/bench-bot#47 My question is also: would this benchmark be able to capture the impact of this change? |
No it didn't, but I changed them in 5a93791 |
@thiolliere do you think we should add this by default into the whole benchmarking pipeline? |
I think it is relevant, also it doesn't look too difficult to do: #9343 Though most manual benchmark don't do that |
Yeah manual benchmarks, we will just need to add some documentation 🤷 |
/benchmark runtime pallet pallet_lottery |
Finished benchmark for branch: gui-optimize-calls Benchmark: Benchmark Runtime Pallet cargo run --release --features=runtime-benchmarks --manifest-path=bin/node/cli/Cargo.toml -- benchmark --chain=dev --steps=50 --repeat=20 --pallet=pallet_lottery --extrinsic="*" --execution=wasm --wasm-execution=compiled --heap-pages=4096 --output=./frame/lottery/src/weights.rs --template=./.maintain/frame-weight-template.hbs ResultsPallet: "pallet_lottery", Extrinsic: "buy_ticket", Lowest values: [], Highest values: [], Steps: [50], Repeat: 20
|
… gui-optimize-calls
…/node/cli/Cargo.toml -- benchmark --chain=dev --steps=50 --repeat=20 --pallet=pallet_lottery --extrinsic=* --execution=wasm --wasm-execution=compiled --heap-pages=4096 --output=./frame/lottery/src/weights.rs --template=./.maintain/frame-weight-template.hbs
/benchmark runtime pallet pallet_utility |
Finished benchmark for branch: gui-optimize-calls Benchmark: Benchmark Runtime Pallet cargo run --release --features=runtime-benchmarks --manifest-path=bin/node/cli/Cargo.toml -- benchmark --chain=dev --steps=50 --repeat=20 --pallet=pallet_utility --extrinsic="*" --execution=wasm --wasm-execution=compiled --heap-pages=4096 --output=./frame/utility/src/weights.rs --template=./.maintain/frame-weight-template.hbs ResultsPallet: "pallet_utility", Extrinsic: "batch", Lowest values: [], Highest values: [], Steps: [50], Repeat: 20
|
// Standard Error: 5_000 | ||
.saturating_add((301_000 as Weight).saturating_mul(n as Weight)) | ||
.saturating_add((786_000 as Weight).saturating_mul(n as Weight)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be the difference with encoding? Double the weight per call
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can't say for sure until we compare with the benchmark with the decoding but without the Box, but anyway this seems ok to me
comparing to the benchmark result in this PR #9343 it is indeed from the difference in decoding.
…/node/cli/Cargo.toml -- benchmark --chain=dev --steps=50 --repeat=20 --pallet=pallet_utility --extrinsic=* --execution=wasm --wasm-execution=compiled --heap-pages=4096 --output=./frame/utility/src/weights.rs --template=./.maintain/frame-weight-template.hbs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to bump TX version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code lgtm
Although I can't really speak too the merit of the performance trade offs of this change
@@ -149,7 +149,7 @@ pub mod pallet { | |||
})] | |||
pub fn batch( | |||
origin: OriginFor<T>, | |||
calls: Vec<<T as Config>::Call>, | |||
calls: Vec<Box<<T as Config>::Call>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't make any sense. When you use a Vec, these values are already "boxed". What you want to achieve here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
inside a vec, the array of element is boxed, not each element individually.
because Call is around 400B, for a Vec<Call>
of capacity 10 allocate an array of size 10*400 B.
whereas a Vec<Box<Call>>
of capacity 10 allocate an array of size 10*size_of<usize>() B
and also allocate 10 individual calls.
This way the maximum length of an individual allocation is minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that is right, but what is the reason for doing this?
We will not have a call vec with 83886 calls anyway?
let's do in another better way |
This PR changes for call argument from
Vec<Call>
toVec<Box<Call>>