
Ledger block cache #4303

Merged (2 commits) on Sep 2, 2023

Conversation

yacovm
Contributor

@yacovm yacovm commented Jun 30, 2023

This commit introduces an in-memory cache for the block storage of the ledger. It caches new blocks that are committed and assumes blocks are committed in-order and with consecutive sequences.

The block iterators now attempt to retrieve the blocks from the cache if possible before going to the block storage.

The intent is twofold:

  1. Speed up the block Deliver API by not doing disk I/O when clients (peers, orderers) fetch blocks.

  2. Reduce the impact of the deliver API on the performance of writing new blocks into the ledger.

@yacovm yacovm requested a review from a team as a code owner June 30, 2023 19:27
Contributor

@cendhu cendhu left a comment


This is a useful optimisation.

common/ledger/blkstorage/cache.go: three resolved review threads (outdated)
Comment on lines 63 to 77
if c.maxSeq > seq {
return
}

// Insert the block to the cache
c.maxSeq = seq
Contributor

If seq can be lower than maxSeq, it can be higher too. It may be good to also check that c.maxSeq + 1 == seq.

Contributor Author

If it's higher it means we have appended a block out of order. If we did that, we should panic here, no?

Contributor Author

Please look at the new version I pushed. It protects against all cases except where the cache is not initialized. I did not want to initialize the cache with the latest block number for simplicity reasons.

common/ledger/blkstorage/blockfile_mgr.go: resolved review thread (outdated)
Comment on lines +16 to +18
const (
estimatedBlockSize = 512 * 1024
)
Contributor

Maybe we can get rid of this to keep it simple.

Contributor Author

I would rather not keep magic numbers if possible :-)

Comment on lines 81 to 86
if itr.cachedPrev {
itr.cachedPrev = false
if itr.stream != nil {
itr.stream.close()
itr.stream = nil
}
}
Contributor

There is a rare corner case where we might be opening and closing the stream too frequently. This depends on the blocks present in the cache and the requested blocks.

Example scenario: The cache currently contains blocks 100 to 199 (with a cache size of 100 blocks), and the iterator is reading from block 99 to 200 while new blocks are being added to the cache. In this situation, we may encounter cache hits for some blocks and cache misses for others, resulting in the opening and closing of the stream.

It would be beneficial if we could incorporate a Seek(fileNum, offset) function into the existing stream instead of closing it.

Contributor Author

The only way to open the stream and then close it is to retrieve a block from the cache and then get a cache miss.

If that happens, it means you were pulling the "tail" of the cache (since it is contiguous) and the ledger is being appended to with so many transactions per second that the client pulling them cannot keep up with the rate at which blocks are appended.

If the client is the peer, or something equivalent to the peer that processes transactions, and it can't keep up with the ordering service, then we have a much bigger problem, as the effective latency of transactions will explode.

Contributor

Just for the sake of completeness: this could happen in the normal scenario as well, since you first append the block to the file and then add it to the cache. Occasionally, the iterator may keep oscillating between the file and the cache. But yes, that is not expected to be frequent at regular transaction rates.

However, one note on code readability: why do you have this if itr.cachedPrev { block? Can't you simply close the stream inside the if existsInCache { block and get rid of the cachedPrev concept altogether?

Contributor Author

Done

common/ledger/blkstorage/cache_test.go and common/ledger/blkstorage/cache.go: resolved review threads (outdated)
@yacovm yacovm force-pushed the fastLedger branch 3 times, most recently from 5d7350a to bdac9cb Compare July 22, 2023 11:22
Contributor

@manish-sethi manish-sethi left a comment


I left a couple of comments; otherwise LGTM.

Just curious, have you measured the improvement from this change? I am raising this because, for the most recent blocks, the data mostly comes from the filesystem cache anyway. Moreover, the underlying iterator uses buffered I/O (the default buffer is 4K, but it can be increased). So what this cache mostly saves is one deserialization, unlike the disk I/O you mentioned in the commit message.

common/ledger/blkstorage/blockfile_mgr.go: resolved review thread (outdated)
Comment on lines 81 to 86
if itr.cachedPrev {
itr.cachedPrev = false
if itr.stream != nil {
itr.stream.close()
itr.stream = nil
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for the sake of completeness - This could happen in the normal scenario as well as you first append the block to the file and then add it to the cache. Occasionally, the iterator may keep oscillating between the file and the cache. But, yes, that's not expected to be frequent at the regular transaction rates.

However, one note on code readability here. Why do you have this if itr.cachedPrev { block? Can't you simply close the stream inside the if existsInCache { block and get rid of cachedPrev concept altogether?

This commit introduces an in-memory cache for the block storage of the ledger.
It caches new blocks that are committed and assumes blocks are committed in-order and with consecutive sequences.

The block iterators now attempt to retrieve the blocks from the cache if possible before going to the block storage.

The intent is twofold:
1) Speed up the block Deliver API by not doing disk I/O when clients (peers, orderers) fetch blocks.
2) Reduce the impact of the Deliver API on the performance of writing new blocks into the ledger.

Signed-off-by: Yacov Manevich <yacov.manevich@ibm.com>
Signed-off-by: Yacov Manevich <yacov.manevich@ibm.com>
@manish-sethi manish-sethi merged commit 8a33e5f into hyperledger:main Sep 2, 2023
14 checks passed