Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Improve ChainSelectionSubsystem to revert based on disputes concluding off-chain #4804

Closed
2 tasks
rphmeier opened this issue Jan 28, 2022 · 8 comments · Fixed by #6512
Closed
2 tasks

Improve ChainSelectionSubsystem to revert based on disputes concluding off-chain #4804

rphmeier opened this issue Jan 28, 2022 · 8 comments · Fixed by #6512
Assignees
Labels
I4-annoyance Code behaves within expectations, however this “expected behaviour” itself is at issue. T5-parachains_protocol This PR/Issue is related to Parachains features and protocol changes.

Comments

@rphmeier
Copy link
Contributor

rphmeier commented Jan 28, 2022

Currently, the chain-selection subsystem only reverts a block B when one of its descendants B' issues a Revert log in its header targeting B. In practice, this happens when B' receives the dispute statements necessary to conclude a dispute against some parachain candidate contained within B. And it also means that nodes would be willing to build upon B long after they have the evidence to prove that B is garbage.

This behaves very inefficiently in the presence of many disputes. In such cases, nodes will abandon one chain only to build upon alternative blocks that they very well know contain garbage. This can lead to a lot of churn in the growth of the chain and even chain stalls when session lengths are too low (encountering the dreaded 'Unexpected Epoch Change'). Nodes are stupid. Let's be smarter.

The solution is to plug the dispute coordinator more directly into chain-selection. When a dispute is concluded against a candidate, a few things need to happen:

  1. We need to notify chain-selection that a dispute has just concluded against a candidate (and in which session)
  2. We need chain-selection to discover all the blocks that contain the candidate. This can be fairly simply done with the addition of a new runtime API fn included_at(CandidateHash) -> Option<BlockNumber> which just accesses the Disputes::Included storage map. We can invoke this for all the viable leaves in chain-selection. If we don't want to muck around with a runtime API, we can also probably just dig in the approval-voting DB. That might be easier.
  3. We need chain-selection to revert all those blocks with something similar to chain-selection::tree::apply_reversions.
  4. [OPTIONAL] with Integrate ChainSelectionSubsystem fork-choice into Substrate's fork-choice polkadot-sdk#875 update the best block in Substrate's Client.

This will make Revert logs more or less obsolete. They're still good as a last-resort mechanism, but by doing this work off-chain nodes will much more eagerly abandon disputed forks.

  • Implement
  • Clearly document in code/guide(probably both) the two remaining reasons for vote import on chain:
    • Slashing
    • Chain Scraping: By importing votes on chain, we enable other validators to scrape votes from chain and thus learn about them, this is in addition to dispute-distribution. I keep forgetting about this reason, hence I would like to have it documented as it changes reason on when/how important chain import actually is.E.g. without this there would be a good reason to prioritize concluded disputes on import, over ongoing ones. With this, ongoing disputes are actually also quite important.
@rphmeier rphmeier added the I4-annoyance Code behaves within expectations, however this “expected behaviour” itself is at issue. label Jan 28, 2022
@eskimor
Copy link
Member

eskimor commented Jan 28, 2022

Note about that: While working through spam considerations, @drahnr and I decided that we should be smarter with dispute imports. E.g. Have a runtime API to ask the chain what disputes it already knows about and only try to import disputes it does not yet know about. With this together with ordered disputes, in practice the building on alternative blocks will mean importing the relevant dispute and issuing a revert log on that chain as well. We should therefore only create one more block on each fork, ensuring all forks are marked as invalid.

That does not sound too bad, in particular I can't quite follow how this can lead to the dreaded unexpected epoch change, assuming the number of considered forks is reasonable limited.

@rphmeier
Copy link
Contributor Author

That is orthogonal. This issue was created in reaction to an incident where an alternative block was built and included a bad parachain candidate. Within 2 seconds of the block being built, all nodes had concluded that it was bad. But 10 minutes later as the result of other reversions, nodes were fine with building on top of this block and reverting it only then.

@rphmeier
Copy link
Contributor Author

With #5048 the included_at runtime API can be added as a staging_included_at.

@eskimor
Copy link
Member

eskimor commented Apr 26, 2022

@tdimitrov that's the issue we were talking about.

@ordian ordian added the T5-parachains_protocol This PR/Issue is related to Parachains features and protocol changes. label Aug 16, 2022
@eskimor
Copy link
Member

eskimor commented Aug 30, 2022

Related: https://github.com/paritytech/srlabs_findings/issues/126 .. in particular we should clearly define what to do with chains if a dispute won't conclude over a longer period of time.

@BradleyOlson64
Copy link
Contributor

Would it be advantageous to also inform chain-selection of candidates which have a local invalid vote? Then chain-selection for an honest validator could avoid building on a relay chain block B when that validator knows B contains an invalid candidate. This way we wouldn't necessarily have to wait on disputes concluding, though I suppose the theoretical worst case wouldn't be improved.

@burdges
Copy link
Contributor

burdges commented Dec 27, 2022

Yes, we should not build upon chains we know to be invalid, or that have lost dispute votes, but the other question is what to do with chains where someone else claims invalidity. Alistair was always quite worried people could DoS this way, if we dumped that fork proactively, but really the false positive penalties should be high enough to prevent this. I think the real issue is that disputes are usually just bugs, so if they're going to conclude as valid then maybe we should keep building upon the current chain. This is how I read the original issue here, but not sure the current situation.

@eskimor
Copy link
Member

eskimor commented Dec 27, 2022

Note about the new runtime API: Dispute coordinator is already tracking inclusion events on all forks.

@BradleyOlson64 Good thinking, but I am dubious this would really be beneficial and worth the complication: Either the dispute is concluding quickly (others see it and vote), then this optimization will do nothing - or it is concluding slowly, in that case we would revert a chain, build on that reverted chain, but everyone else would disregard our fork because it is not the best and will continue building on the original fork anyway.

By waiting for conclusion we ensure to have consensus on what really is the best fork.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
I4-annoyance Code behaves within expectations, however this “expected behaviour” itself is at issue. T5-parachains_protocol This PR/Issue is related to Parachains features and protocol changes.
Projects
No open projects
Development

Successfully merging a pull request may close this issue.

6 participants