Improve ChainSelectionSubsystem to revert based on disputes concluding off-chain #4804

rphmeier · 2022-01-28T05:41:10Z

Currently, the chain-selection subsystem only reverts a block B when one of its descendants B' issues a Revert log in its header targeting B. In practice, this happens when B' receives the dispute statements necessary to conclude a dispute against some parachain candidate contained within B. And it also means that nodes would be willing to build upon B long after they have the evidence to prove that B is garbage.

This behaves very inefficiently in the presence of many disputes. In such cases, nodes will abandon one chain only to build upon alternative blocks that they very well know contain garbage. This can lead to a lot of churn in the growth of the chain and even chain stalls when session lengths are too low (encountering the dreaded 'Unexpected Epoch Change'). Nodes are stupid. Let's be smarter.

The solution is to plug the dispute coordinator more directly into chain-selection. When a dispute is concluded against a candidate, a few things need to happen:

We need to notify chain-selection that a dispute has just concluded against a candidate (and in which session)
We need chain-selection to discover all the blocks that contain the candidate. This can be fairly simply done with the addition of a new runtime API fn included_at(CandidateHash) -> Option<BlockNumber> which just accesses the Disputes::Included storage map. We can invoke this for all the viable leaves in chain-selection. If we don't want to muck around with a runtime API, we can also probably just dig in the approval-voting DB. That might be easier.
We need chain-selection to revert all those blocks with something similar to chain-selection::tree::apply_reversions.
[OPTIONAL] with Integrate ChainSelectionSubsystem fork-choice into Substrate's fork-choice polkadot-sdk#875 update the best block in Substrate's Client.

This will make Revert logs more or less obsolete. They're still good as a last-resort mechanism, but by doing this work off-chain nodes will much more eagerly abandon disputed forks.

Implement
Clearly document in code/guide(probably both) the two remaining reasons for vote import on chain:
- Slashing
- Chain Scraping: By importing votes on chain, we enable other validators to scrape votes from chain and thus learn about them, this is in addition to dispute-distribution. I keep forgetting about this reason, hence I would like to have it documented as it changes reason on when/how important chain import actually is.E.g. without this there would be a good reason to prioritize concluded disputes on import, over ongoing ones. With this, ongoing disputes are actually also quite important.

The text was updated successfully, but these errors were encountered:

eskimor · 2022-01-28T10:14:36Z

Note about that: While working through spam considerations, @drahnr and I decided that we should be smarter with dispute imports. E.g. Have a runtime API to ask the chain what disputes it already knows about and only try to import disputes it does not yet know about. With this together with ordered disputes, in practice the building on alternative blocks will mean importing the relevant dispute and issuing a revert log on that chain as well. We should therefore only create one more block on each fork, ensuring all forks are marked as invalid.

That does not sound too bad, in particular I can't quite follow how this can lead to the dreaded unexpected epoch change, assuming the number of considered forks is reasonable limited.

rphmeier · 2022-01-28T16:33:48Z

That is orthogonal. This issue was created in reaction to an incident where an alternative block was built and included a bad parachain candidate. Within 2 seconds of the block being built, all nodes had concluded that it was bad. But 10 minutes later as the result of other reversions, nodes were fine with building on top of this block and reverting it only then.

rphmeier · 2022-04-21T20:46:09Z

With #5048 the included_at runtime API can be added as a staging_included_at.

eskimor · 2022-04-26T10:33:49Z

@tdimitrov that's the issue we were talking about.

eskimor · 2022-08-30T13:45:44Z

Related: https://github.com/paritytech/srlabs_findings/issues/126 .. in particular we should clearly define what to do with chains if a dispute won't conclude over a longer period of time.

BradleyOlson64 · 2022-12-26T22:03:56Z

Would it be advantageous to also inform chain-selection of candidates which have a local invalid vote? Then chain-selection for an honest validator could avoid building on a relay chain block B when that validator knows B contains an invalid candidate. This way we wouldn't necessarily have to wait on disputes concluding, though I suppose the theoretical worst case wouldn't be improved.

burdges · 2022-12-27T00:57:52Z

Yes, we should not build upon chains we know to be invalid, or that have lost dispute votes, but the other question is what to do with chains where someone else claims invalidity. Alistair was always quite worried people could DoS this way, if we dumped that fork proactively, but really the false positive penalties should be high enough to prevent this. I think the real issue is that disputes are usually just bugs, so if they're going to conclude as valid then maybe we should keep building upon the current chain. This is how I read the original issue here, but not sure the current situation.

eskimor · 2022-12-27T09:50:43Z

Note about the new runtime API: Dispute coordinator is already tracking inclusion events on all forks.

@BradleyOlson64 Good thinking, but I am dubious this would really be beneficial and worth the complication: Either the dispute is concluding quickly (others see it and vote), then this optimization will do nothing - or it is concluding slowly, in that case we would revert a chain, build on that reverted chain, but everyone else would disregard our fork because it is not the best and will continue building on the original fork anyway.

By waiting for conclusion we ensure to have consensus on what really is the best fork.

rphmeier added the I4-annoyance Code behaves within expectations, however this “expected behaviour” itself is at issue. label Jan 28, 2022

eskimor mentioned this issue May 2, 2022

More efficient dispute vote import #4573

Closed

3 tasks

eskimor assigned tdimitrov Jul 28, 2022

ordian added the T5-parachains_protocol This PR/Issue is related to Parachains features and protocol changes. label Aug 16, 2022

BradleyOlson64 assigned BradleyOlson64 and unassigned tdimitrov Nov 7, 2022

BradleyOlson64 linked a pull request Jan 5, 2023 that will close this issue

Issue 4804: Notify chain selection of concluded disputes directly #6512

Merged

paritytech-processbot bot closed this as completed in #6512 Jan 19, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve ChainSelectionSubsystem to revert based on disputes concluding off-chain #4804

Improve ChainSelectionSubsystem to revert based on disputes concluding off-chain #4804

rphmeier commented Jan 28, 2022 •

edited by eskimor

Loading

eskimor commented Jan 28, 2022 •

edited

Loading

rphmeier commented Jan 28, 2022

rphmeier commented Apr 21, 2022

eskimor commented Apr 26, 2022

eskimor commented Aug 30, 2022

BradleyOlson64 commented Dec 26, 2022

burdges commented Dec 27, 2022

eskimor commented Dec 27, 2022

Improve ChainSelectionSubsystem to revert based on disputes concluding off-chain #4804

Improve ChainSelectionSubsystem to revert based on disputes concluding off-chain #4804

Comments

rphmeier commented Jan 28, 2022 • edited by eskimor Loading

eskimor commented Jan 28, 2022 • edited Loading

rphmeier commented Jan 28, 2022

rphmeier commented Apr 21, 2022

eskimor commented Apr 26, 2022

eskimor commented Aug 30, 2022

BradleyOlson64 commented Dec 26, 2022

burdges commented Dec 27, 2022

eskimor commented Dec 27, 2022

rphmeier commented Jan 28, 2022 •

edited by eskimor

Loading

eskimor commented Jan 28, 2022 •

edited

Loading