Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix missing err check in the commit path #1543

Merged
merged 1 commit into from
Jul 16, 2020

Conversation

cendhu
Copy link
Contributor

@cendhu cendhu commented Jul 7, 2020

Type of change

  • Bug fix

Description

This PR addresses some of the missing err check in the ledger commit path.

Additional details

A careful analysis must be done to ensure that this PR is not creating a possibility of a fork.

Signed-off-by: senthil <cendhu@gmail.com>
@cendhu cendhu marked this pull request as ready for review July 15, 2020 14:09
@cendhu cendhu requested a review from a team as a code owner July 15, 2020 14:09
@@ -67,7 +67,9 @@ func (combiner *itrCombiner) Next() (commonledger.QueryResult, error) {
}
}
kv := combiner.kvAt(smallestHolderIndex)
combiner.moveItrAndRemoveIfExhausted(smallestHolderIndex)
if _, err := combiner.moveItrAndRemoveIfExhausted(smallestHolderIndex); err != nil {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

an err can happen here due leveldb internal error (i.e., iter.Next()). As this error is related to the system resources and does not change depending on the block content, there is no possibility of a fork.

@@ -106,7 +106,10 @@ func (helper *RangeQueryResultsHelper) Done() ([]*kvrwset.KVRead, *kvrwset.Query
return helper.pendingResults, nil, err
}
}
helper.mt.done()
if err := helper.mt.done(); err != nil {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An err can happen here only when the hash computation returns an error., i.e., either GetHash() or hash.Write(). Given that this error would not change with respect to the block content, no possibility of a fork.

helper.mt.update(hash)
return nil
// TODO: Need to analyze any fork possibilities.
return helper.mt.update(hash)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as the helper.mt.done()

@@ -106,7 +106,9 @@ func NewLockBasedTxMgr(initializer *Initializer) (*LockBasedTxMgr, error) {
return nil, errors.New("create new lock based TxMgr failed: passed in nil ledger hasher")
}

initializer.DB.Open()
if err := initializer.DB.Open(); err != nil {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Open() always return a nil and it is a nop. We can remove this API in a separate PR.

@@ -119,7 +119,10 @@ func (v *rangeQueryHashValidator) validate() (bool, error) {
return equals, nil
}
versionedKV := result.(*statedb.VersionedKV)
v.resultsHelper.AddResult(rwsetutil.NewKVRead(versionedKV.Key, versionedKV.Version))
if err := v.resultsHelper.AddResult(rwsetutil.NewKVRead(versionedKV.Key, versionedKV.Version)); err != nil {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An err can happen here only if either the proto marshal fails or the hash function returns an error. As the validator reads the value from the DB, the content of the block should not change the result. Hence, there is no possibility of a fork here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll summarize it a bit differently... Though, practically its a remote chance, but theoretically, this error miss could already cause a fork, as per the existing code. Because, the peer that would miss this error will continue with evaluating the results further and will eventually find the query results different and hence will mark the transaction invalid. Now, this change, instead of marking the transaction invalid, will cause a panic in upstream - So, this should be an acceptable change without an additional capability check.

On a side note, having said the above, I see the possibility of this error getting generated only if someone is using HSM for hash computation. Because, golang's crypto hash library does not return error. Also, as you mentioned the other possibility of the error is the proto marshal down stream, which is only possible if there would have been a programming error in our code, potentially passing nil (which is not happening here)

l.pvtdataStore.ResetLastUpdatedOldBlocksList()

return nil
return l.pvtdataStore.ResetLastUpdatedOldBlocksList()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An err can occur due to a leveldb internal error. Further, this is part of the peer recovery code and executes during the peer startup. Hence, it is safe to process this error and it would not create any fork or make difficulties in joining a new peer (which would fetch and process all blocks). A leveldb error during the peer startup would result in a peer panic and is the expected behavior.

@@ -78,7 +78,9 @@ func (p *PurgeMgr) UpdateExpiryInfoOfPvtDataOfOldBlocks(pvtUpdates *privacyenabl
builder := newExpiryScheduleBuilder(p.btlPolicy)
pvtUpdateCompositeKeyMap := pvtUpdates.ToCompositeKeyMap()
for k, vv := range pvtUpdateCompositeKeyMap {
builder.add(k.Namespace, k.CollectionName, k.Key, util.ComputeStringHash(k.Key), vv)
if err := builder.add(k.Namespace, k.CollectionName, k.Key, util.ComputeStringHash(k.Key), vv); err != nil {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An err can occur here while retrieving collection info (NoSuchCollectionError or DB read error). At this point, NoSuchCollectionError error would not occur because the private data coordinator return err (which eventually results in peer panic) if a collection does not exist. Hence, an err can happen here only due to DB access. For DB access error, peer panic would occur, and no forks possibility as such.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess that you mistakenly linked this function to the regular commit path. This function is for old missing pvt data which is in the reconciliation path and does cause a panic but rather retries.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. Correct. My mistake.

@@ -78,12 +78,16 @@ func prepareExpiryEntries(committingBlk uint64, dataEntries []*dataEntry, missin

// 1. prepare expiryData for non-missing data
for _, dataEntry := range dataEntries {
prepareExpiryEntriesForPresentData(mapByExpiringBlk, dataEntry.key, btlPolicy)
if err := prepareExpiryEntriesForPresentData(mapByExpiringBlk, dataEntry.key, btlPolicy); err != nil {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An err can occur here only if the collection does not exist or if DB access results in an error. As NoSuchCollectionError is not possible here, it is safe to process DB access error and it would not cause a fork.

}

// 2. prepare expiryData for missing data
for missingDataKey := range missingDataEntries {
prepareExpiryEntriesForMissingData(mapByExpiringBlk, &missingDataKey, btlPolicy)
if err := prepareExpiryEntriesForMissingData(mapByExpiringBlk, &missingDataKey, btlPolicy); err != nil {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

@@ -776,7 +776,9 @@ func (s *Store) purgeExpiredData(minBlkNum, maxBlkNum uint64) error {
for _, missingDataKey := range missingDataKeys {
batch.Delete(encodeMissingDataKey(missingDataKey))
}
s.db.WriteBatch(batch, false)
if err := s.db.WriteBatch(batch, false); err != nil {
return err
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only results in a warning message.

@manish-sethi
Copy link
Contributor

@cendhu - thanks for catching these. I am still looking at 2/3 of these more closely to understand the fallout a bit better.
Also, I guess we should perhaps add errcheck linter in CI.

Copy link
Contributor

@manish-sethi manish-sethi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to take notes, I have made a few representative comments. The rest of the errors miss cases are either of these (or not affecting the critical path)

@@ -18,8 +18,9 @@ import (
func prepareTxOps(rwset *rwsetutil.TxRwSet, txht *version.Height,
precedingUpdates *publicAndHashUpdates, db *privacyenabledstate.DB) (txOps, error) {
txops := txOps{}
txops.applyTxRwset(rwset)
//logger.Debugf("prepareTxOps() txops after applying raw rwset=%#v", spew.Sdump(txops))
if err := txops.applyTxRwset(rwset); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather, in the current code, the error miss could have caused a state fork (because of missing applying the metadata). Now, instead, this would cause panic in the upstream code instead of committing the transaction without metadata - So, safe and desired change as such.

However, this error should never happen, as this is mainly a programming error (e.g., passing nil to proto.Marshal.

@@ -78,7 +78,9 @@ func (p *PurgeMgr) UpdateExpiryInfoOfPvtDataOfOldBlocks(pvtUpdates *privacyenabl
builder := newExpiryScheduleBuilder(p.btlPolicy)
pvtUpdateCompositeKeyMap := pvtUpdates.ToCompositeKeyMap()
for k, vv := range pvtUpdateCompositeKeyMap {
builder.add(k.Namespace, k.CollectionName, k.Key, util.ComputeStringHash(k.Key), vv)
if err := builder.add(k.Namespace, k.CollectionName, k.Key, util.ComputeStringHash(k.Key), vv); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess that you mistakenly linked this function to the regular commit path. This function is for old missing pvt data which is in the reconciliation path and does cause a panic but rather retries.

@@ -228,7 +234,9 @@ func (v *validator) validateRangeQuery(ns string, rangeQueryInfo *kvrwset.RangeQ
logger.Debug(`Hashing results are not present in the range query info hence, initiating raw KVReads based validation`)
qv = &rangeQueryResultsValidator{}
}
qv.init(rangeQueryInfo, combinedItr)
if err := qv.init(rangeQueryInfo, combinedItr); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To summarize it in simple terms, this does not change the current behavior but rather returns the error early on.

@manish-sethi manish-sethi merged commit 329c858 into hyperledger:master Jul 16, 2020
cendhu added a commit to cendhu/fabric that referenced this pull request Sep 8, 2020
mastersingh24 pushed a commit that referenced this pull request Sep 9, 2020
Signed-off-by: senthil <cendhu@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants