perf: improve performance of nested cache store #14444

Closed

Conversation

@yihuang (Collaborator) commented Dec 29, 2022

Description

Closes: #14377

Solution:

  • Unify cachekv's cache content into a single tidwall/btree.
  • Use the copy-on-write support of tidwall/btree to implement cheap Clone/Restore methods in the cache store.
  • Add a RunAtomic method to Context to provide more efficient store branching;
    it does not notify the tracing module because there is no longer a separate Write step.
  • API breaking: Context always holds a CacheMultiStore instead of a MultiStore.
  • Refactor runTx and other module logic to use the new RunAtomic method instead of CacheContext (a usage sketch follows below).
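
For illustration, a minimal usage sketch of the new RunAtomic method (the doSomething handler is hypothetical and this is not code from this PR; it assumes the usual sdk alias for github.com/cosmos/cosmos-sdk/types):

```go
// runBranch shows the intended call pattern: writes made through cacheCtx are
// committed into ctx's store only if the callback returns nil, and discarded
// otherwise. doSomething is a hypothetical handler used for illustration.
func runBranch(ctx sdk.Context, doSomething func(sdk.Context) error) error {
	return ctx.RunAtomic(func(cacheCtx sdk.Context) error {
		if err := doSomething(cacheCtx); err != nil {
			return err // all writes made through cacheCtx are discarded
		}
		return nil // writes are restored (committed) into ctx's underlying store
	})
}
```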

Benchmarks

Comparison of go test -run=^$ -bench=. -benchmem ./store/cachekv/... -count=5 between 1bbf3b5 and the benchmark ref (a backport of the benchmark changes to the main branch):

name                                 old time/op    new time/op    delta
DeepCacheStack1-12                     5.67µs ± 1%    4.53µs ± 1%   -20.17%  (p=0.008 n=5+5)
DeepCacheStack3-12                     8.00µs ± 1%    4.51µs ± 1%   -43.60%  (p=0.016 n=5+4)
DeepCacheStack10-12                    29.1µs ± 1%     4.5µs ± 1%   -84.46%  (p=0.008 n=5+5)
DeepCacheStack13-12                    48.2µs ± 1%     4.5µs ± 2%   -90.59%  (p=0.008 n=5+5)
BlankParentIteratorNextKeySize32-12    1.04µs ± 3%    0.05µs ± 2%   -94.90%  (p=0.008 n=5+5)
BlankParentAppendKeySize32-12           729ns ±13%     380ns ± 1%   -47.91%  (p=0.008 n=5+5)
SetKeySize32-12                        1.42µs ±50%    1.46µs ± 5%      ~     (p=1.000 n=5+5)
IteratorOnParentWith1MDeletes-12        1.16s ± 4%     0.00s ± 1%  -100.00%  (p=0.008 n=5+5)
CacheKVStoreGetNoKeyFound-12            708ns ± 2%     678ns ± 1%    -4.26%  (p=0.008 n=5+5)
CacheKVStoreGetKeyFound-12              176ns ±11%     336ns ± 3%   +90.47%  (p=0.008 n=5+5)
CacheKVStoreSetAndCommit-12            9.69µs ± 0%    6.12µs ± 2%   -36.85%  (p=0.016 n=4+5)

name                                 old alloc/op   new alloc/op   delta
DeepCacheStack1-12                     2.10kB ± 0%    1.74kB ± 0%   -17.18%  (p=0.008 n=5+5)
DeepCacheStack3-12                     2.83kB ± 0%    1.74kB ± 0%   -38.70%  (p=0.008 n=5+5)
DeepCacheStack10-12                    5.41kB ± 0%    1.74kB ± 0%   -67.90%  (p=0.008 n=5+5)
DeepCacheStack13-12                    6.51kB ± 0%    1.74kB ± 0%   -73.34%  (p=0.008 n=5+5)
BlankParentIteratorNextKeySize32-12      205B ± 0%        0B       -100.00%  (p=0.008 n=5+5)
BlankParentAppendKeySize32-12            238B ±16%      112B ± 0%   -52.94%  (p=0.008 n=5+5)
SetKeySize32-12                          327B ±19%      164B ± 0%   -49.76%  (p=0.008 n=5+5)
IteratorOnParentWith1MDeletes-12        253MB ± 0%       0MB           ~     (p=0.079 n=4+5)
CacheKVStoreGetNoKeyFound-12             283B ± 1%      163B ± 0%   -42.32%  (p=0.008 n=5+5)
CacheKVStoreGetKeyFound-12              3.00B ± 0%     3.00B ± 0%      ~     (all equal)
CacheKVStoreSetAndCommit-12            4.83kB ± 0%    3.02kB ± 0%   -37.54%  (p=0.008 n=5+5)

name                                 old allocs/op  new allocs/op  delta
DeepCacheStack1-12                       19.0 ± 0%      12.0 ± 0%   -36.84%  (p=0.008 n=5+5)
DeepCacheStack3-12                       31.0 ± 0%      12.0 ± 0%   -61.29%  (p=0.008 n=5+5)
DeepCacheStack10-12                      73.0 ± 0%      12.0 ± 0%   -83.56%  (p=0.008 n=5+5)
DeepCacheStack13-12                      91.0 ± 0%      12.0 ± 0%   -86.81%  (p=0.008 n=5+5)
BlankParentIteratorNextKeySize32-12      2.00 ± 0%      0.00       -100.00%  (p=0.008 n=5+5)
BlankParentAppendKeySize32-12            1.00 ± 0%      0.00       -100.00%  (p=0.008 n=5+5)
SetKeySize32-12                          1.40 ±43%      0.00       -100.00%  (p=0.008 n=5+5)
IteratorOnParentWith1MDeletes-12        3.07M ± 0%     0.00M           ~     (p=0.079 n=4+5)
CacheKVStoreGetNoKeyFound-12             4.00 ± 0%      2.00 ± 0%   -50.00%  (p=0.008 n=5+5)
CacheKVStoreGetKeyFound-12               1.00 ± 0%      1.00 ± 0%      ~     (all equal)
CacheKVStoreSetAndCommit-12              97.0 ± 0%      45.0 ± 0%   -53.61%  (p=0.008 n=5+5)

The benchmark results look pretty good: it's a huge win in the nested and iterator cases, and it's even much faster in the SetAndCommit case, which combines several Set operations with a cache.Write. CacheKVStoreGetKeyFound is the main regression.


Author Checklist

All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.

I have...

  • included the correct type prefix in the PR title
  • added ! to the type prefix if API or client breaking change
  • targeted the correct branch (see PR Targeting)
  • provided a link to the relevant issue or specification
  • followed the guidelines for building modules
  • included the necessary unit and integration tests
  • added a changelog entry to CHANGELOG.md
  • included comments for documenting Go code
  • updated the relevant documentation or specification
  • reviewed "Files changed" and left comments if necessary
  • confirmed all CI checks have passed

Reviewers Checklist

All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.

I have...

  • confirmed the correct type prefix in the PR title
  • confirmed ! in the type prefix if API or client breaking change
  • confirmed all author checklist items have been addressed
  • reviewed state machine logic
  • reviewed API design and naming
  • reviewed documentation is accurate
  • reviewed tests and test coverage
  • manually tested (if applicable)

store/cachemulti/store.go
Comment on lines +132 to +136
for key, store := range cms.stores {
	stores[key] = store.Clone()
}

Check failure (Code scanning / gosec): the value in the range statement should be _ unless copying a map: want: for key := range m
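
For context, this class of linter finding is usually addressed by ranging over keys only and indexing the map inside the loop body; a sketch of that shape (not necessarily the fix adopted in this PR):

```go
// Range over keys only and index the map in the body, which satisfies the
// "value should be _ unless copying a map" rule while still cloning each store.
for key := range cms.stores {
	stores[key] = cms.stores[key].Clone()
}
```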
x/group/keeper/msg_server.go
Comment on lines +70 to +86
err := ctx.RunAtomic(func(cacheCtx sdk.Context) error {
	messages, err := proposal.GetMsgs()
	if err == nil {
		for idx, msg = range messages {
			handler := keeper.Router().Handler(msg)

			var res *sdk.Result
			res, err = handler(cacheCtx, msg)
			if err != nil {
				break
			}

			events = append(events, res.GetEvents()...)
		}
	}
	return err
})

Check warning (Code scanning / CodeQL, reported on several of the changed lines): Panic in BeginBlock or EndBlock consensus methods. Possible panics in BeginBlock- or EndBlock-related consensus methods could cause a chain halt.
store/cachemulti/store.go
Comment on lines +132 to +136
for key, store := range cms.stores {
	stores[key] = store.Clone()
}

Check warning (Code scanning / CodeQL): Iteration over map. Iteration over map may be a possible source of non-determinism.
@alexanderbez (Contributor):

@yihuang please ensure CI is green across the board, perhaps consider leaving this in draft mode until it is?

We can discuss and review this PR collectively in the WG too :)

@yihuang yihuang marked this pull request as draft December 29, 2022 18:40
@yihuang yihuang force-pushed the copy-cache-store branch 3 times, most recently from f454f7a to 9b12aa8 Compare December 30, 2022 02:23
@yihuang yihuang force-pushed the copy-cache-store branch 3 times, most recently from c90c7a3 to acf0ff6 Compare December 30, 2022 03:22
@yihuang yihuang marked this pull request as ready for review December 30, 2022 03:27
Comment on lines +754 to +783
_ = ctx.RunAtomic(func(cacheCtx sdk.Context) error {
	results, err := k.doExecuteMsgs(cacheCtx, k.router, proposal, addr, decisionPolicy)
	if err != nil {
		proposal.ExecutorResult = group.PROPOSAL_EXECUTOR_RESULT_FAILURE
		logs = fmt.Sprintf("proposal execution failed on proposal %d, because of error %s", id, err.Error())
		k.Logger(ctx).Info("proposal execution failed", "cause", err, "proposalID", id)
		return err
	}

	proposal.ExecutorResult = group.PROPOSAL_EXECUTOR_RESULT_SUCCESS
	for _, res := range results {
		// NOTE: The sdk msg handler creates a new EventManager, so events must be correctly propagated back to the current context
		ctx.EventManager().EmitEvents(res.GetEvents())
	}

	return nil
})

Check warning (Code scanning / gosec): Returned error is not propagated up the stack.
baseapp/baseapp.go
@yihuang yihuang force-pushed the copy-cache-store branch 2 times, most recently from 522bb1e to 4eaafd1 Compare December 30, 2022 04:35
@yihuang (Collaborator, Author) commented Jan 20, 2023:

@08d2 I removed the CloneMultiStore() method and the changes related to it; those call sites keep using the existing CacheContext, and only RunAtomic is kept. I appreciate the review suggestions. Let's discuss later whether we should replace all the use cases of CacheContext and deprecate it completely.

@@ -740,15 +716,15 @@ func (app *BaseApp) runTx(mode runTxMode, txBytes []byte) (gInfo sdk.GasInfo, re
	if mode == runTxModeDeliver {
		// When block gas exceeds, it'll panic and won't commit the cached store.
		consumeBlockGas()

		msCache.Write()
Contributor:

When is this write called now?

Collaborator Author (@yihuang):

It's part of RunAtomic: RunAtomic commits on success and discards on failure.

Contributor:

RunAtomic itself doesn't call Write, as far as I can see?

Collaborator Author (@yihuang), Jan 20, 2023:

> RunAtomic itself doesn't call Write, as far as I can see?

The commit semantics are implemented directly in RunAtomic; there is no separate Write step with this API.

Contributor (@08d2), Jan 20, 2023:

I don't understand what you mean. runTx must call msCache.Write, at some point, if mode == runTxModeDeliver. I see that the Write call here is deleted, but I don't see where another Write call is made. Where does that Write happen, now?

Collaborator Author (@yihuang):

> I don't understand what you mean. runTx must call msCache.Write, at some point, if mode == runTxModeDeliver. I see that the Write call here is deleted, but I don't see where another Write call is made. Where does that Write happen, now?

msCache is gone as well :)

Contributor:

I understand, that's fine, but it doesn't address my point. My point is that runTx operates on a fork of app state, and (assuming mode == runTxModeDeliver) that state must be committed if the runTx operation succeeds, and must be discarded if the runTx operation fails. Where is that logic now?
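
The pattern being discussed here, where commit and discard are implicit in RunAtomic rather than an explicit msCache.Write, would look roughly like the sketch below (the runMsgs call and surrounding variables are illustrative, not the PR's actual runTx body):

```go
// Sketch only: the callback writes into a clone of ctx's CacheMultiStore;
// RunAtomic restores that clone into ctx's store when the callback returns nil
// (replacing the old msCache.Write), and drops it when an error is returned.
err = ctx.RunAtomic(func(cacheCtx sdk.Context) error {
	res, msgsErr := app.runMsgs(cacheCtx, msgs, mode) // illustrative call
	if msgsErr != nil {
		return msgsErr // nothing is committed
	}
	result = res
	return nil // the branched writes become visible in ctx's store
})
```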

Comment on lines +151 to +157
for key, store := range cms.stores {
	otherStore, ok := ms.stores[key]
	if !ok {
		panic("Invariant violation: Restore should only be called on a store cloned from itself")
	}
	store.Restore(otherStore)
}

Check failure (Code scanning / gosec): the value in the range statement should be _ unless copying a map: want: for key := range m; expected exactly 1 statement (either append, delete, or copying to another map) in a range with a map, got 3
Comment on lines +308 to +315
cacheCtx := c.WithCacheMultiStore(c.cms.Clone()).WithEventManager(NewEventManager())
if err := fn(cacheCtx); err != nil {
	return err
}

c.cms.Restore(cacheCtx.cms)
c.EventManager().EmitEvents(cacheCtx.EventManager().Events())
return nil
Contributor:

This isn't safe: it's possible for another caller to interact with the context's cms between the fn call and the Restore call in this method, and in that case the mutations of that caller would be overwritten and lost. Equivalently, it's possible for another caller to write to the context's cms after this method calls Restore but before it returns, which would overwrite and lose the mutations of the RunAtomic caller. So this basically doesn't work.

Collaborator Author (@yihuang), Jan 20, 2023:

The consensus state machine runs sequentially; that's the implicit assumption here.
Queries run concurrently with each other, but each query itself runs sequentially, and each query gets its own branched-out cache store at the beginning of its life cycle.
So the cache store is not accessed concurrently at all inside the SDK; the mutex is mainly there for backward compatibility, since some third parties use the cache store concurrently.

Contributor:

A context is not guaranteed to be used exclusively in a single-threaded manner by the consensus state machine, right? A context can absolutely be accessed by concurrent callers.

Collaborator Author (@yihuang):

> A context is not guaranteed to be used exclusively in a single-threaded manner by the consensus state machine, right? A context can absolutely be accessed by concurrent callers.

I think that's the case in the current SDK and our chain, but I can't speak for other chains.

Contributor:

It doesn't matter how a package happens to be used by consumers right now; what matters is how the package can possibly be used by any consumer in the future. If a package has a Foo method that takes an int, and all current consumers happen to call Foo with a value of 0, that doesn't mean Foo can assume it will be called with a value of 0; it has to deal with any valid int. Similarly, just because a context happens to be called in a single-threaded way by the chains you care about, that doesn't mean the context can assume it will always be called in a single-threaded way by all consumers.

Collaborator Author (@yihuang), Jan 20, 2023:

The caller can decide whether their use case is concurrent or not. The callback works on a branched-out ctx, so it's fine for the callback to use concurrency internally. For example, it's safe to call this function in runTx, while the message handler is still free to use concurrency inside. Does that make sense?

Contributor:

@yihuang is right -- there isn't the capability to do this in the SDK. We are planning on it and when we do, this will be taken into account.

Contributor (@08d2), Jan 22, 2023:

> the callback work on a branched out ctx

That branched context is the result of calling c.cms.Clone, but if the "atomic" transaction succeeds, you then call c.cms.Restore, which mutates the original c.cms value. And because c.cms is an interface, and therefore has reference semantics, the Restore call can happen concurrently with any other operation on that cms value. But CacheMultiStore doesn't say that its implementations need to be safe for concurrent use. So any other goroutine with a context that has the same CacheMultiStore can potentially do a read while that Restore is doing a write. That's a data race, and as long as the CacheMultiStore is not documented as being safe for concurrent use, preventing that data race is the responsibility of the context and its methods.

> it's ok for the callback to do concurrency inside

As the type exists today, no, it actually isn't.

The CacheMultiStore type doesn't document that it is safe for concurrent use. That means that any code which uses a CacheMultiStore, and can be called concurrently, must serialize access to those stores.

An SDK context's methods can be called concurrently. This isn't a point of debate: the context type is exported, and its methods are part of the SDK's public API. That means callers can call its types and methods however they like. If the context type were documented to make it clear that it isn't safe for concurrent use, that would be a different story, of course. But that documentation doesn't exist, and so that assumption can't be made.

> there isn't the capability to do this in the SDK. We are planning on it and when we do, this will be taken into account

The SDK absolutely provides this capability, simply due to the fact that the context type is exported and usable by downstream consumers. If the SDK relies on an assumption that context methods will not be called concurrently, that assumption must be documented.

Collaborator Author (@yihuang), Jan 22, 2023:

AFAIK, the SDK has rarely made it clear in the docs whether a method is thread-safe or not; our default assumption has always been "not thread-safe" when it's unclear.
RunAtomic is not thread-safe when called on the same ctx, no doubt about that, so the issue is improving the doc to make that clearer. Or do you think we shouldn't make it a Context method?
Also, naming suggestions are welcome; I don't know whether "atomic" is the best word for it. What I'm trying to say is that the state changes made by the callback are either all committed or all discarded.
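
A doc comment along those lines, layered on the RunAtomic implementation quoted earlier in this thread, might read as follows (the wording and exact signature are suggestions inferred from the snippets above, not text from the PR):

```go
// RunAtomic branches the Context's CacheMultiStore, runs fn against the branch,
// and merges the branch back into the original store only if fn returns nil;
// otherwise every write made inside fn is discarded.
//
// RunAtomic is not safe for concurrent use on the same Context.
func (c Context) RunAtomic(fn func(cacheCtx Context) error) error {
	cacheCtx := c.WithCacheMultiStore(c.cms.Clone()).WithEventManager(NewEventManager())
	if err := fn(cacheCtx); err != nil {
		return err
	}

	c.cms.Restore(cacheCtx.cms)
	c.EventManager().EmitEvents(cacheCtx.EventManager().Events())
	return nil
}
```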

Comment on lines +151 to +157 (the same Restore loop shown above)

Check warning (Code scanning / CodeQL): Iteration over map. Iteration over map may be a possible source of non-determinism.
Comment on lines +57 to +65
// Restore restores the store cache to a given snapshot.
func (store *Store) Restore(s types.CacheKVStore) {
	cache := s.(*Store).swapCache()

	store.mtx.Lock()
	defer store.mtx.Unlock()

	store.cache = cache
}
Contributor:

This now means that calling a.Restore(b) invalidates/clears the state of b. That's surprising, potentially breaks existing code, and isn't part of the documentation of the Restore method! 😬

Collaborator Author (@yihuang):

Yeah, I can make that clearer in the documentation later. The most common use case is that the snapshot is not reused afterwards; to reuse it, the user can do a clone explicitly.
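
In code, the explicit-clone workaround mentioned above would look roughly like this (an illustrative helper, assuming the Clone/Restore signatures shown in this thread; not part of the PR):

```go
// restoreKeepingSnapshot passes a clone to Restore so that snapshot itself is
// not invalidated and can still be read from or restored again later.
func restoreKeepingSnapshot(store, snapshot *cachekv.Store) {
	store.Restore(snapshot.Clone())
	// Without the extra Clone, Restore would swap snapshot's internal cache out,
	// leaving snapshot empty for any later use.
}
```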

Contributor:

I don't think it's reasonable for a method called Restore to destroy the state of its parameter as a side effect. But, whatever.

Comment on lines +699 to +705
store := cachekv.NewStore(mem)
store1 := store.Clone()
for j := 0; j < 10; j++ {
	store1.Set(sdk.Uint64ToBigEndian(uint64(i+j)), []byte{byte(i)})
}
store.Restore(store1)
store.Write()
Contributor:

This benchmark allocates a new store, clones that store, sets a few values in the clone, restores the initial store from the clone, and then persists the initial store to an underlying persistence layer, with each iteration of b.N. That's an enormous amount of work which is dominated by allocations. You're not getting any meaningful information out of this benchmark.

Collaborator Author (@yihuang):

It just simulates a common real-world path: two layers of cache store, a bunch of writes, and a commit.

Contributor:

That's fine, but if you do that work as part of a benchmark, the benchmark isn't providing any useful information, because that "common path on real world case" is dominated by external factors unrelated to the code in the package.

Collaborator Author (@yihuang), Jan 20, 2023:

It shows whether this code path gets faster or slower; if the difference can be reproduced stably, it must be because of the new code changes in this repo.

Contributor:

It does not show that, not really, no. The benchmark is measuring allocation performance; whatever changes are made to the code in the package are tiny in comparison to the other work happening.

Collaborator Author (@yihuang):

Do you mean allocation speed can stably and magically improve by 36% for reasons unrelated to the code changes?

Contributor:

I believe this benchmark is showing the performance of a common pattern used in the SDK, which is pretty useful. Even if it is influenced heavily by external factors in a production environment, having insight into it is still useful.

Contributor:

This is pretty simple: the code in the b.N loop allocates a new store, clones it, and then calls Restore in each iteration. The overhead of those operations dominates whatever you're trying to measure here.
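
For comparison, a benchmark that isolates just the Clone/Set/Restore path would hoist store construction out of the timed region, for example (illustrative only, not a benchmark from this PR; it assumes the same imports as the existing benchmark file, i.e. cachekv, dbadapter, dbm, and sdk):

```go
func BenchmarkCloneSetRestore(b *testing.B) {
	// Build and pre-populate the parent cache store once, outside the timed loop.
	store := cachekv.NewStore(dbadapter.Store{DB: dbm.NewMemDB()})
	for i := 0; i < 1024; i++ {
		store.Set(sdk.Uint64ToBigEndian(uint64(i)), []byte{byte(i)})
	}

	b.ReportAllocs()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		clone := store.Clone()
		clone.Set(sdk.Uint64ToBigEndian(uint64(i)), []byte{1})
		store.Restore(clone)
	}
}
```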

@tac0turtle (Member):

I've been thinking about this PR. I forget who, but someone mentioned that it's API breaking in a decently sized manner, and asked whether we want to push this breakage onto users now when we may push another store-breaking change on them in 2-4 months. I'd like to hear from others whether we think this is fine or whether we should aim for a single API-breaking change instead of two within a 2-4 month period.

@yihuang (Collaborator, Author) commented Jan 27, 2023:

> we may push another store breaking change on them in 2-4 months

Do you mean a bigger refactoring of the cache store?

@alexanderbez (Contributor):

> we may push another store breaking change on them in 2-4 months
>
> do you mean a bigger refactoring of cache store?

Yeah, it might be best to hold off until we have a clearer picture.

@tac0turtle (Member):

Hey, sorry for the long delay. Let's close this PR for now and talk about it in the next storage working group call. With the ADR for storage being open, I'd hate to break the API on users twice instead of once. Sorry again for the long delay.

Development

Successfully merging this pull request may close these issues.

EPIC: CacheKV Improvements
7 participants