Ledger block cache #4303
Conversation
This is a useful optimisation.
common/ledger/blkstorage/cache.go
Outdated
if c.maxSeq > seq {
	return
}

// Insert the block to the cache
c.maxSeq = seq
If seq can be lower than maxSeq, it can be higher too. It may be good to check if c.maxSeq + 1 == seq too.
If it's higher it means we have appended a block out of order. If we did that, we should panic here, no?
Please look at the new version I pushed. It protects against all cases except where the cache is not initialized. I did not want to initialize the cache with the latest block number for simplicity reasons.
const (
	estimatedBlockSize = 512 * 1024
)
Maybe we can get rid of this to keep it simple.
I would rather not keep magic numbers if possible :-)
if itr.cachedPrev {
	itr.cachedPrev = false
	if itr.stream != nil {
		itr.stream.close()
		itr.stream = nil
	}
}
There is a rare corner case where we might be opening and closing the stream too frequently. This depends on the blocks present in the cache and the requested blocks.
Example scenario: The cache currently contains blocks 100 to 199 (with a cache size of 100 blocks), and the iterator is reading from block 99 to 200 while new blocks are being added to the cache. In this situation, we may encounter cache hits for some blocks and cache misses for others, resulting in the opening and closing of the stream.
It would be beneficial if we could incorporate a Seek(fileNum, offset) function into the existing stream instead of closing it.
The only way to open the stream and then close it is if you retrieved a block from the cache and then got a cache miss.
If you retrieve a block from the cache and then get a cache miss, it means you were pulling the "tail" of the cache (since it is continuous) and then the ledger is appended with so many transactions per second that the client that pulls the transactions can't keep up with the speed of blocks being appended to the ledger.
If the client is the peer or something equivalent to the peer that processed transactions, and it can't keep up with the ordering service, then we have a much bigger problem here as the effective latency of transactions will explode.
Just for the sake of completeness - This could happen in the normal scenario as well as you first append the block to the file and then add it to the cache. Occasionally, the iterator may keep oscillating between the file and the cache. But, yes, that's not expected to be frequent at the regular transaction rates.
However, one note on code readability here. Why do you have this if itr.cachedPrev { block? Can't you simply close the stream inside the if existsInCache { block and get rid of the cachedPrev concept altogether?
Done
Force-pushed from 5d7350a to bdac9cb
I left a couple of comments; Otherwise LGTM.
Just curious, have you measured the improvement because of this change? I am raising this because, for the most recent blocks, the blocks mostly come from the filesystem cache. Moreover, the underlying iterator uses a buffered I/O (the default value is 4K, but can be increased). So, basically what this cache saves is one de-serialization; unlike what you mentioned in the commit message about disk IO.
This commit introduces an in-memory cache for the block storage of the ledger. It caches new blocks that are committed and assumes blocks are committed in-order and with consecutive sequences. The block iterators now attempt to retrieve the blocks from the cache if possible before going to the block storage.

The intent is twofold:
1) Speed up the block Deliver API by not doing disk I/O when clients (peers, orderers) fetch blocks.
2) Reduce the impact of the Deliver API on the performance of writing new blocks into the ledger.

Signed-off-by: Yacov Manevich <yacov.manevich@ibm.com>