Ledger block cache #4303
Conversation
This is a useful optimisation.
common/ledger/blkstorage/cache.go
Outdated
if c.maxSeq > seq {
	return
}

// Insert the block to the cache
c.maxSeq = seq
If seq can be lower than maxSeq, it can be higher too. It may be good to check if c.maxSeq + 1 == seq too.
If it's higher it means we have appended a block out of order. If we did that, we should panic here, no?
Please look at the new version I pushed. It protects against all cases except where the cache is not initialized. I did not want to initialize the cache with the latest block number for simplicity reasons.
const (
	estimatedBlockSize = 512 * 1024
)
Maybe we can get rid of this to keep it simple.
I would rather not keep magic numbers if possible :-)
if itr.cachedPrev {
	itr.cachedPrev = false
	if itr.stream != nil {
		itr.stream.close()
		itr.stream = nil
	}
}
There is a rare corner case where we might be opening and closing the stream too frequently. This depends on the blocks present in the cache and the requested blocks.
Example scenario: The cache currently contains blocks 100 to 199 (with a cache size of 100 blocks), and the iterator is reading from block 99 to 200 while new blocks are being added to the cache. In this situation, we may encounter cache hits for some blocks and cache misses for others, resulting in the opening and closing of the stream.
It would be beneficial if we could incorporate a Seek(fileNum, offset) function into the existing stream instead of closing it.
The only way to open the stream and then close it is if you retrieved a block from the cache and then got a cache miss.
If you retrieve a block from the cache and then get a cache miss, it means you were pulling the "tail" of the cache (since it is continuous) and then the ledger is appended with so many transactions per second that the client that pulls the transactions can't keep up with the speed of blocks being appended to the ledger.
If the client is the peer or something equivalent to the peer that processed transactions, and it can't keep up with the ordering service, then we have a much bigger problem here as the effective latency of transactions will explode.
Just for the sake of completeness - This could happen in the normal scenario as well as you first append the block to the file and then add it to the cache. Occasionally, the iterator may keep oscillating between the file and the cache. But, yes, that's not expected to be frequent at the regular transaction rates.
However, one note on code readability here. Why do you have this if itr.cachedPrev { block? Can't you simply close the stream inside the if existsInCache { block and get rid of the cachedPrev concept altogether?
Done
Force-pushed from 5d7350a to bdac9cb
I left a couple of comments; Otherwise LGTM.
Just curious, have you measured the improvement because of this change? I am raising this because, for the most recent blocks, the blocks mostly come from the filesystem cache. Moreover, the underlying iterator uses a buffered I/O (the default value is 4K, but can be increased). So, basically what this cache saves is one de-serialization; unlike what you mentioned in the commit message about disk IO.
This commit introduces an in-memory cache for the block storage of the ledger. It caches new blocks that are committed and assumes blocks are committed in-order and with consecutive sequences. The block iterators now attempt to retrieve the blocks from the cache if possible before going to the block storage.

The intent is twofold:
1) Speed up the block Deliver API by not doing disk I/O when clients (peers, orderers) fetch blocks.
2) Reduce the impact of the Deliver API on the performance of writing new blocks into the ledger.

Signed-off-by: Yacov Manevich <yacov.manevich@ibm.com>