
Pass unreconciled missingData info to ledger from reconciler #1797

Merged: 1 commit merged into hyperledger:master from cendhu:pass-deprioritizedList on Sep 3, 2020

Conversation

@cendhu (Contributor) commented Aug 24, 2020

Type of change

  • New feature

Description

The reconciler needs to pass the unreconciled missing data info to the ledger so that the ledger can deprioritize this unreconcilable data.
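As a rough illustration of the idea (the names below are hypothetical simplifications, not Fabric's actual ledger types), the unreconciled info can be pictured as a map from block/transaction coordinates to the private collections that are still missing:

```go
package main

import "fmt"

// Illustrative sketch only: Fabric's real types differ; the names
// missingDataKey and missingPvtDataInfo here are hypothetical.
type missingDataKey struct {
	BlkNum uint64
	TxNum  uint64
}

// collections (namespace/collection) still missing for each key
type missingPvtDataInfo map[missingDataKey][]string

func main() {
	// The reconciler would hand entries like these to the ledger, which
	// then moves them from the prioritized to the deprioritized list.
	unreconciled := missingPvtDataInfo{
		{BlkNum: 5, TxNum: 0}: {"marbles/collectionMarbles"},
	}
	for k, colls := range unreconciled {
		fmt.Printf("block %d, tx %d: missing %v\n", k.BlkNum, k.TxNum, colls)
	}
}
```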

@cendhu cendhu requested a review from a team as a code owner August 24, 2020 14:18
@cendhu cendhu force-pushed the pass-deprioritizedList branch 2 times, most recently from b5cd580 to 4216698 Compare August 26, 2020 14:34
@cendhu cendhu changed the title [WIP] Pass unreconciled missingData info to ledger from reconciler Pass unreconciled missingData info to ledger from reconciler Aug 26, 2020
@denyeart (Contributor) left a comment

I've tested e2e scenarios with this PR and it is working as intended. The deprioritized missing data is not getting in the way of new reconciliations. And the deprioritized missing data eventually gets reconciled on the 1000th cycle.
Just a couple nit comments...

@@ -140,6 +143,7 @@ func (p *Provider) OpenStore(ledgerid string) (*Store, error) {
return nil, err
}
s.launchCollElgProc()
s.deprioritizedMissingDataPeriodicity = 1000
Contributor:

Could we make this a configurable value?

Contributor Author:

It is just that we have too many configuration parameters already; even for performance benchmarking, I do not use anything other than disabling certain gossip features when they are not needed. For example, we allow the historyDB to be disabled, yet I don't think many are aware of this even though it is present in the config. When there is too much, even important things get lost; our configuration space is vast. Another problem it creates is that people really need to understand some internal details of the peer implementation to configure this value.

Having said that, I understand why you are asking. Let's discuss this during the scrum today.

Contributor:

Sorry, I missed this response last week... I also wish we didn't have so many config options in general; however, I feel this one is important for users. With all the defaults taken, every 1000th reconciliation cycle would come once every 16 hours. Furthermore, the counter resets after every peer restart. So if users fixed their missing private data, they would have to wait patiently for 16 hours before it would be tried again, and hope that the peer doesn't get restarted in the meantime. This type of user may prefer the deprioritized reconciliation to be hourly, while another user with private data that is missing forever may never want the reconciliation attempted, and would therefore want to set it to a very high value. With such different and unpredictable usages, I think we need it to be configurable.

Contributor Author:

In that case, we can change the default value from 1000 to 100. For both of the above scenarios, giving the deprioritized missing data a chance every hour or so seems adequate. When we change the deprioritization logic in the future (with multiple buckets), this default might vary across buckets too. If we at least had a way to configure parameters dynamically without stopping the peer, I would be all in.

Based on the mailing list and Rocket.Chat questions, I strongly believe that configuration parameters are very much under-utilized (other than those enabling/disabling certain gossip features and a few ledger-related parameters). If you think that our own deployments require this to be configurable for some reason, and operators would be okay with restarting the peer after every reconfiguration, I will add a config parameter to core.yaml. Otherwise, I would prefer to wait for some feedback and then decide on making it configurable.

When we do that, I think it is necessary to allow the user to query the number of missing entries in the prioritized/deprioritized lists so that they can decide whether they need to update the configuration now or later.

Contributor Author:

For now, I have reduced this value from 1000 to 100.

@@ -403,16 +412,27 @@ func (s *Store) GetMissingPvtDataInfoForMostRecentBlocks(maxBlock int) (ledger.M
return nil, nil
}

if s.iterSinceDeprioMissingDataAccess == s.deprioritizedMissingDataPeriodicity {
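The gating in the hunk above can be sketched as a simple counter (a hypothetical simplification of the store fields shown, not Fabric's actual code): every call serves the prioritized list, except that each time the counter reaches the periodicity it resets and the deprioritized list gets one turn:

```go
package main

import "fmt"

// Simplified stand-in for the pvtdatastorage store fields in the hunk.
type store struct {
	deprioritizedMissingDataPeriodicity int
	iterSinceDeprioMissingDataAccess    int
}

// nextList reports which missing-data list this cycle would consult.
func (s *store) nextList() string {
	s.iterSinceDeprioMissingDataAccess++
	if s.iterSinceDeprioMissingDataAccess == s.deprioritizedMissingDataPeriodicity {
		s.iterSinceDeprioMissingDataAccess = 0
		return "deprioritized"
	}
	return "prioritized"
}

func main() {
	s := &store{deprioritizedMissingDataPeriodicity: 3}
	for i := 0; i < 6; i++ {
		fmt.Println(s.nextList())
	}
	// Every third cycle serves the deprioritized list; the rest serve
	// the prioritized list.
}
```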
Contributor:

Let's add a debug message stating that this is a deprioritized reconciliation cycle.
If nothing else, it helps with my own trials when I am investigating reconciliations.

Contributor Author:

Sure.

Contributor Author:

Done.

@cendhu (Contributor Author) commented Aug 28, 2020

> I've tested e2e scenarios with this PR and it is working as intended. The deprioritized missing data is not getting in the way of new reconciliations. And the deprioritized missing data eventually gets reconciled on the 1000th cycle.
> Just a couple nit comments...

Thanks, Dave, for verifying this.

Signed-off-by: senthil <cendhu@gmail.com>
@manish-sethi manish-sethi merged commit 75d4c8d into hyperledger:master Sep 3, 2020
cendhu added a commit to cendhu/fabric that referenced this pull request Sep 9, 2020
denyeart pushed a commit that referenced this pull request Sep 16, 2020