Assignment of availability-chunk indices to validators #47

Merged (11 commits), Jan 25, 2024

File: text/0047-random-assignment-of-availability-chunks.md (212 additions)

# RFC-0047: Random assignment of availability chunks to validators

| | |
| --------------- | ------------------------------------------------------------------------------------------- |
| **Start Date** | 03 November 2023 |
| **Description** | An evenly-distributing indirection layer between availability chunks and validators. |
| **Authors** | Alin Dima |

## Summary

Propose a way of randomly permuting the availability chunk indices assigned to validators for a given core and relay
chain block, in the context of
[recovering available data from systematic chunks](https://github.com/paritytech/polkadot-sdk/issues/598).

## Motivation

Currently, the ValidatorIndex is always identical to the ChunkIndex. Since the validator array is only shuffled once
per session, naively using the ValidatorIndex as the ChunkIndex would place unreasonable stress on the first N/3
validators during an entire session, when favouring availability recovery from systematic chunks.

Therefore, the relay chain node needs a deterministic way of evenly distributing the first ~(N_VALIDATORS / 3)
systematic availability chunks to different validators, based on the session, relay chain block number and core.
The main purpose is to ensure fair distribution of network bandwidth usage for availability recovery in general, and
for systematic chunk holders in particular.

## Stakeholders

Relay chain node core developers.

## Explanation

### Systematic erasure codes

An erasure coding algorithm is considered systematic if it preserves the original unencoded data as part of the
resulting code.
[The implementation of the erasure coding algorithm used for polkadot's availability data](https://github.com/paritytech/reed-solomon-novelpoly) is systematic. Roughly speaking, the first N_VALIDATORS/3
chunks of data can be cheaply concatenated to retrieve the original data, without running the resource-intensive and
time-consuming reconstruction algorithm.
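
To illustrate this property, here is a rough sketch of systematic reconstruction following the "cheaply concatenated"
description above. It is only illustrative: the actual chunk layout and API of reed-solomon-novelpoly differ, and the
padding assumption is ours.

```rust
// Rough sketch of systematic recovery as described above: if the chunks with
// indices 0..threshold are all available and in order, the original data is
// (roughly) their concatenation. Assumes the encoder padded the data to a
// multiple of the chunk size; the real reed-solomon-novelpoly layout differs.
fn reconstruct_from_systematic(
    systematic_chunks: Vec<Vec<u8>>, // chunks with indices 0..threshold, in order
    original_len: usize,             // length of the unencoded data
) -> Vec<u8> {
    let mut data: Vec<u8> = systematic_chunks.into_iter().flatten().collect();
    data.truncate(original_len); // drop the encoder's padding
    data
}
```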

### Availability recovery now

Currently, the availability recovery process differs depending on the estimated size of the available data:

(a) for small PoVs (up to 128 KiB), sequentially try requesting the unencoded data from the backing group, in a random
order. If this fails, fall back to option (b).

(b) for large PoVs (over 128 KiB), launch N parallel requests for the erasure coded chunks (currently, N has an upper
limit of 50), until enough chunks have been recovered. Validators are tried in a random order. Then, reconstruct the
original data.

### Availability recovery from systematic chunks

As part of the effort of
[increasing polkadot's resource efficiency, scalability and performance](https://github.com/paritytech/roadmap/issues/26),
work is under way to modify the Availability Recovery subsystem by leveraging systematic chunks. See
[this comment](https://github.com/paritytech/polkadot-sdk/issues/598#issuecomment-1792007099) for preliminary
performance results.

In this scheme, the relay chain node will first attempt to retrieve the N/3 systematic chunks from the validators that
should hold them, before falling back to recovering from regular chunks, as before.
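
A sketch of the resulting strategy ordering follows; the names are illustrative and do not match the actual
availability-recovery subsystem code.

```rust
// Illustrative sketch of the recovery strategy ordering described above.
enum RecoveryStrategy {
    /// Request the full unencoded data from the backing group (small PoVs).
    FullFromBackers,
    /// Request the ~N/3 systematic chunks from their assigned holders and
    /// concatenate them.
    SystematicChunks,
    /// Request any N/3 chunks and run the full reconstruction algorithm.
    RegularChunks,
}

fn strategy_order(estimated_pov_size: usize) -> Vec<RecoveryStrategy> {
    const SMALL_POV_LIMIT: usize = 128 * 1024; // 128 KiB

    let mut order = Vec::new();
    if estimated_pov_size <= SMALL_POV_LIMIT {
        order.push(RecoveryStrategy::FullFromBackers);
    }
    // New in this proposal: prefer systematic recovery, then fall back to
    // regular chunk recovery as before.
    order.push(RecoveryStrategy::SystematicChunks);
    order.push(RecoveryStrategy::RegularChunks);
    order
}
```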

### Chunk assignment function

#### Properties

The function that decides the chunk index for a validator should be parametrised by at least
`(validator_index, session_index, block_number, core_index)`
and have the following properties:

1. deterministic
1. pseudo-random
1. relatively quick to compute and resource-efficient
1. when considering the other params besides `validator_index` as fixed,
the function should describe a random permutation of the chunk indices
1. considering `session_index` and `block_number` as fixed arguments, the validators that map to the first N/3 chunk
indices should have as little overlap as possible for different cores.

> **Contributor:** Thinking more about my argument to use `session_index` here, I think it's a very niche edge case
> that's not worth worrying about. An alternative to `(session_index, block_number)` would be the block hash, which
> would also sidestep that issue.
>
> **Author:** Sounds fair. Another issue I just thought of with using the block number is that for disputes happening
> on unimported forks, the ChainAPI call for getting the block number would also fail.

#### Proposed function and runtime API

Pseudocode:

```rust
pub fn get_chunk_index(
    n_validators: u32,
    validator_index: ValidatorIndex,
    session_index: SessionIndex,
    block_number: BlockNumber,
    core_index: CoreIndex
) -> ChunkIndex {
    let threshold = systematic_threshold(n_validators); // Roughly n_validators / 3.
    let seed = derive_seed(session_index, block_number);
    let mut rng: ChaCha8Rng = SeedableRng::from_seed(seed);

    // Create a pseudo-random permutation of all chunk indices.
    let mut chunk_indices: Vec<ChunkIndex> = (0..n_validators).map(Into::into).collect();
    chunk_indices.shuffle(&mut rng);

    // Offset the starting position by core, so that the systematic chunks of
    // different cores map to (mostly) disjoint sets of validators.
    let core_start_pos = threshold * core_index.0;

    chunk_indices[((core_start_pos + validator_index.0) % n_validators) as usize]
}
```
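
Property (4) above can be checked directly against this pseudocode: with everything but `validator_index` fixed, the
mapping must be a permutation of the chunk indices. A minimal sketch of such a check, assuming the pseudocode compiles
and that `ValidatorIndex`, `ChunkIndex` and `CoreIndex` are `u32` newtype wrappers with public fields:

```rust
// Sketch of a permutation property check for `get_chunk_index`; the concrete
// parameter values are arbitrary.
#[test]
fn mapping_is_a_permutation() {
    let n_validators = 300u32;
    let mut seen = vec![false; n_validators as usize];

    for v in 0..n_validators {
        let chunk = get_chunk_index(n_validators, ValidatorIndex(v), 1, 100, CoreIndex(5));
        // Every chunk index must be assigned to exactly one validator.
        assert!(!seen[chunk.0 as usize], "chunk index assigned twice");
        seen[chunk.0 as usize] = true;
    }
    assert!(seen.iter().all(|&s| s), "some chunk index was never assigned");
}
```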

The function should be implemented as a runtime API, because:

1. it's critical to the consensus protocol that all validators have a common view of the Validator -> Chunk mapping.
1. it enables future atomic changes to the shuffling algorithm.
1. it enables alternative client implementations (in other languages) to use it.
1. it mitigates the problem of third-party libraries changing the implementations of `ChaCha8Rng` or `rand::shuffle`
in future versions, which would stall parachains. This would be quite an "easy" attack.
> **Contributor:** I didn't get this argument. Are you talking about supply chain attacks? How is it specific to this
> RFC?
>
> **Author:** Here's an example:
>
> 1. All validators are running normally, using systematic recovery.
> 2. A contributor on the rand crate (malicious or not) makes a perfectly valid change to the shuffling algorithm,
> that results in a different output for `shuffle()`.
> 3. There's a polkadot node release that bumps rand to this new version.
> 4. Some of the validators upgrade their node.
> 5. As a result, some/all parachains are stalled, because some validators have a differing view of the
> validator->chunk mapping. Also, the validators that upgraded stop receiving rewards.
>
> IMO this has a high chance of happening in the future.
>
> I view this mapping in a similar way to the per-session validator shuffle we do in the runtime to choose the
> validator active set.
>
> **Author:** Now, we could implement our own version of shuffle, which would mitigate this issue. The thing I'm
> concerned about is implementing our own version of ChaCha8Rng. How feasible is it to assume that it won't change in
> the future? CC: @burdges
>
> **@burdges (23 Nov 2023):** It's not a supply chain attack so much as rand having a history of churn & instability.


Additionally, so that client code can efficiently get the mapping from the runtime, another API will be added for
retrieving the chunk indices of all validators in bulk, at a given block and core.
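
A hypothetical shape for these two runtime API entries is sketched below; the trait and method names are illustrative,
not the final API. The session and block are implied by the block hash the API is called at.

```rust
// Hypothetical sketch of the runtime API surface described above.
sp_api::decl_runtime_apis! {
    pub trait ChunkAssignment {
        /// The chunk index held by a single validator for a candidate
        /// occupying the given core.
        fn chunk_index(validator: ValidatorIndex, core: CoreIndex) -> ChunkIndex;

        /// Bulk variant: the full ValidatorIndex -> ChunkIndex mapping for a
        /// core, retrievable with a single runtime call.
        fn chunk_indices(core: CoreIndex) -> Vec<ChunkIndex>;
    }
}
```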

#### Upgrade path

Considering that the Validator->Chunk mapping is critical to para consensus, the change needs to be enacted atomically
via governance, only after all validators have upgraded the node to a version that is aware of this mapping.
It needs to be explicitly stated that after the runtime upgrade and governance enactment, validators that run older
client versions that don't support this mapping will not be able to participate in parachain consensus.

### Getting access to core_index

Availability recovery can currently be triggered in the following parts of the polkadot protocol:
1. During the approval voting process.
1. By other collators of the same parachain.
1. During disputes.

The `core_index` refers to the index of the core that the candidate was occupying while it was pending availability
(from backing to inclusion).
Getting the right core index for a candidate can prove troublesome. Here's a breakdown of how different parts of the
node implementation can get access to this data:

1. The approval-voting process for a candidate begins after observing that the candidate was included. Therefore, the
node has easy access to the block where the candidate got included (and also the core that it occupied).
1. The `pov_recovery` task of the collators starts availability recovery in response to noticing a candidate getting
backed, which enables easy access to the core index the candidate started occupying.
1. Disputes may be initiated on a number of occasions:

3.a is initiated by the validator as a result of finding an invalid candidate while participating in the
approval-voting protocol. In this case, availability-recovery is not needed, since the validator already issued their
vote.

3.b is initiated by the validator noticing dispute votes recorded on-chain. In this case, we can safely
assume that the backing event for that candidate has been recorded and kept in memory.

3.c is initiated as a result of getting a dispute statement from another validator. It is possible that the dispute
is happening on a fork that was not yet imported by this validator, so the subsystem may not have seen this candidate
being backed.

As a solution to 3.c, a new version of the disputes request-response networking protocol will be added.
This message type will include the relay block hash where the candidate was included. This information will be used
to query the runtime API and retrieve the core index that the candidate was occupying.
> **Author:** On second thought: is this even possible? If the validator is not aware of the fork, how can it call a
> runtime API on that fork?
>
> **Contributor:** If the validator hasn't seen that fork, it can't call into the runtime API of that fork.

Use of the systematic data availability recovery feature will therefore also be conditional on all nodes using the V2
disputes networking protocol.
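
For illustration, the V2 request could look roughly like the V1 dispute request with one extra field. The field names
below are assumptions, not the final protocol definition.

```rust
// Illustrative sketch of a V2 dispute request. V1 carries the candidate
// receipt, session index and a pair of opposing votes; V2 additionally
// carries the hash of the relay block where the candidate was included, so
// receivers can query the runtime for the candidate's core index.
struct DisputeRequestV2 {
    candidate_receipt: CandidateReceipt,
    session_index: SessionIndex,
    invalid_vote: InvalidDisputeVote,
    valid_vote: ValidDisputeVote,
    /// New in V2: relay chain block where the candidate was included.
    included_relay_block_hash: Hash,
}
```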

#### Alternatives to using core_index

As an alternative to `core_index`, the `ParaId` could be used. It has the advantage of being readily available in the
`CandidateReceipt`, which would avoid changing the dispute communication protocol and would simplify the
implementation.
However, in the context of [CoreJam](https://github.com/polkadot-fellows/RFCs/pull/31), `ParaId`s will no longer exist
(at least not in their current form).
> **@ordian (13 Nov 2023):** IMHO we shouldn't worry too much about CoreJam compatibility. IIUC there might be a
> drop-in replacement for ParaId (AuthId), so we should first explore an avenue with ParaId perhaps.
>
> The (bigger) problem with ParaId is that it's claimable by users, so in theory one could create a collision attack
> where multiple paras have the same systematic chunks. In practice, I believe such an attack would be high cost and
> low benefit. And, perhaps, it could be mitigated by using a cryptographic hash of ParaId (or AuthId or core_index in
> the future).
>
> **Commenter:** We should use the core index here probably, not the para id. I'd thought the core indexes could be
> spaced somewhat evenly, but with the actual sequence being random.
>
> **Commenter:** I changed my view here above in #47 (comment)
>
> It's probably more future proof if we do not require the core index here, so the map uses num_cores, num_validators,
> relay parent, and paraid. We've other options but this looks pretty future proof. CoreJam cannot really work without
> something like paraid.
>
> **Author:** Using paraid was my first attempt, but it complicates things quite a bit.
> As Andronik notes, it's claimable by a user. I don't think it's the biggest problem though, they can't get a
> different ParaId every day (can they?).
>
> paraid's biggest problem is that it doesn't form a strictly monotonically increasing sequence, so we'd have to do
> something else with it. My initial attempt was to seed a ChaCha8 RNG with it and generate one random number. Then
> this number was the offset into the chunk index vector.
>
> But then if we complicate it so much, we may want to embed it into the runtime, so that future updates to the rand
> crate don't break us (which creates all sorts of problems that could be avoided).
>
> We've had this paraid -> core_index back and forth for a while now, it'd be great if we could all (the ones
> interested in this RFC) hop on a call or somehow reach a conclusion. There are pros and cons to both, AFAICT.
>
> I think this remains the only bit that prevents this RFC from moving on (anybody correct me if I'm wrong).

Using the candidate hash as a random seed for a shuffle is another option.

## Drawbacks

This proposal involves a fair amount of technical complexity:

- Introduces another runtime API that is going to be queried by multiple subsystems. With adequate client-side caching
(as sketched below), this should be acceptable.

- Requires a networking protocol upgrade on the disputes request-response protocol.
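
As an illustration of the caching mentioned in the first point, a hypothetical client-side cache keyed by relay block
hash and core index could look as follows. Types and names are assumptions; the key types are assumed to implement
`Eq` and `Hash`.

```rust
// Hypothetical client-side cache for the bulk mapping, keyed by
// (relay block hash, core index), so each block/core pair triggers at most
// one runtime API call.
use std::collections::HashMap;

struct ChunkMappingCache {
    map: HashMap<(Hash, CoreIndex), Vec<ChunkIndex>>,
}

impl ChunkMappingCache {
    fn get_or_fetch(
        &mut self,
        block_hash: Hash,
        core: CoreIndex,
        fetch: impl FnOnce() -> Vec<ChunkIndex>, // the runtime API call
    ) -> &[ChunkIndex] {
        self.map.entry((block_hash, core)).or_insert_with(fetch)
    }
}
```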

## Testing, Security, and Privacy

Extensive testing, both automated and manual, will be conducted.
This proposal doesn't affect security or privacy.

## Performance, Ergonomics, and Compatibility

### Performance

This is a necessary DA optimisation, as reed-solomon erasure coding has proven to be a top consumer of CPU time in
polkadot as we scale up the parachain block size and number of availability cores.

With this optimisation, preliminary performance results show that CPU time used for reed-solomon coding can be halved
and total PoV recovery time can decrease by 80% for large PoVs. See more
[here](https://github.com/paritytech/polkadot-sdk/issues/598#issuecomment-1792007099).

### Ergonomics

Not applicable.

### Compatibility

This is a breaking change. See "upgrade path" section above.
All validators need to have upgraded their node versions before the feature will be enabled via a runtime upgrade and
governance call.

## Prior Art and References

See comments on the [tracking issue](https://github.com/paritytech/polkadot-sdk/issues/598) and the
[in-progress PR](https://github.com/paritytech/polkadot-sdk/pull/1644).

## Unresolved Questions

- Is it a future-proof idea to utilise the core index as a parameter of the chunk index compute function?
Is there a better alternative that avoids complicating the implementation?
- Is there a better upgrade path that would preserve backwards compatibility?

## Future Directions and Related Material

This enables future optimisations for the performance of availability recovery, such as retrieving batched systematic
chunks from backers/approval-checkers.