Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Submit a slot number alongside nonce #5297

Open
wants to merge 19 commits into
base: main
Choose a base branch
from

Conversation

marcellorigotti
Copy link
Contributor

@marcellorigotti marcellorigotti commented Sep 25, 2024

Pull Request

Closes: PRO-1635

Checklist

Please conduct a thorough self-review before opening the PR.

  • I am confident that the code works.
  • I have written sufficient tests.
  • I have written and tested required migrations.
  • I have updated documentation where appropriate.

Summary

Updated the Change Electoral system to Nonce Electoral system, which takes both nonce value and slot as votes and update the nonce to the new value only if the new value is different than the previous one and the slot is higher than the previous one.

Copy link

codecov bot commented Sep 25, 2024

Codecov Report

Attention: Patch coverage is 34.96503% with 93 lines in your changes missing coverage. Please review.

Project coverage is 71%. Comparing base (ab7104b) to head (5b18b78).

Files with missing lines Patch % Lines
...in/pallets/cf-elections/src/vote_storage/change.rs 2% 58 Missing ⚠️
...lections/src/electoral_systems/monotonic_change.rs 72% 7 Missing and 6 partials ⚠️
engine/src/witness/sol.rs 0% 9 Missing ⚠️
engine/src/witness/sol/nonce_witnessing.rs 0% 9 Missing ⚠️
state-chain/pallets/cf-environment/src/lib.rs 0% 3 Missing ⚠️
...cf-elections/src/electoral_systems/mocks/access.rs 86% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@          Coverage Diff           @@
##            main   #5297    +/-   ##
======================================
- Coverage     71%     71%    -0%     
======================================
  Files        488     489     +1     
  Lines      84898   84876    -22     
  Branches   84898   84876    -22     
======================================
- Hits       60375   60229   -146     
- Misses     21822   21932   +110     
- Partials    2701    2715    +14     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Collaborator

@dandanlen dandanlen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, just some minor comments mostly around naming. I'll take a closer look at tests on Monday.

@@ -724,6 +724,7 @@ impl pallet_cf_governance::Config for Runtime {
type UpgradeCondition = (
pallet_cf_validator::NotDuringRotation<Runtime>,
pallet_cf_swapping::NoPendingSwaps<Runtime>,
pallet_cf_environment::NoUsedNonce<Runtime>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you write this as a nested tuple then you don't need to change the trait definition:

	type UpgradeCondition = (
		pallet_cf_validator::NotDuringRotation<Runtime>,
		(
			pallet_cf_swapping::NoPendingSwaps<Runtime>,
			pallet_cf_environment::NoUsedNonce<Runtime>,
		)
	);

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a typo in the file name.

Settings: Member + Parameter + MaybeSerializeDeserialize + Eq,
Hook: OnChangeHook<Identifier, Value> + 'static,
ValidatorId: Member + Parameter + Ord + MaybeSerializeDeserialize,
> Change<Identifier, Value, Settings, Hook, ValidatorId>
> NonceWitnessing<Identifier, Value, Slot, Settings, Hook, ValidatorId>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming nit: we rename this to something specific (nonce witnessing instead of change, 'Slot' which is Solana-specific) but haven't renamed any of the generics to match this more specific purpose (for example nonce account instead of identifier).

I would suggest either we keep a more general name (MonotonicChange? BlockHeight?), or we rename the generics to match the more specific use case. This would make the code easier to reason about (no need to remember whether value is the nonce or the address etc.).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

imo, making the names more specific is better in this case

Copy link
Collaborator

@dandanlen dandanlen Oct 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? (Would be good to give a reason otherwise it's just a question of who has the most willingness to argue 😅 )

Copy link
Contributor

@kylezs kylezs Oct 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah i don't think it matters much, but unless it has a general use better to name it specific. Like we don't name every parameter "variable", the name becomes more general as what it is actually doing becomes more general. Also there are no doubts about how it's intended to be used when it's specifically named.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but in this case, we're implementing a type with a bunch of generics, so it's pretty clear that it's supposed to work generically. And of course I'm not saying we should name everything variable and thing. What I mean is that naming should be as specific as possible within its own context. In this particular case, for example, there's nothing in the functionality of the implementation that is nonce-specific.

) -> Result<VoteComponents<Self>, CorruptStorageError> {
Ok(VoteComponents {
bitmap_component: Some(partial_vote.value),
individual_component: Some((_properties, partial_vote.slot)),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are the properties used here?

edit: I see now, these are vote properties () not election properties. Still, they value should have a leading underscore unless it's unused.

use crate::{CorruptStorageError, SharedDataHash};

#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Encode, Decode, TypeInfo)]
pub struct NonceVote<Value, Slot> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the other comment, this vote type seems like it could be used more generally, so maybe there is a better / more general name than NonceVote? MonotonicChangeVote or something?

Copy link
Contributor

@kylezs kylezs Oct 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can generalise it when it necessary? else it's a little confusing it has a general name and is used for one (quite specific) thing. Though, if we go the way of composing the existing vote storage then this conversation is void.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My view is that if we implement something generically, then it should have generic names. Reading generic code with specifically-named values is confusing because you always have the specific use-case in mind.

counts
.entry(vote.value)
.and_modify(|(count, slots)| {
*count += 1;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't really need this count, it's already implicitly tracked in the length of slots.

Some(vote.clone())
let mut slots = slots.clone();
let num_slots = slots.len() as u32;
let (_, median_vote, _) = {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's avoid calling it the median vote, it's not a median ;)

consensus_slot or consensus_height if we're using more general naming?

if previous_value != value {
if let Some((value, slot)) = election_access.check_consensus()?.has_consensus() {
let (identifier, previous_value, previous_slot) = election_access.properties()?;
if previous_value != value && slot > previous_slot {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Due to the implementation of is_vote_valid, I think this statement should always be true (It's not possible to vote, and therefore not possible to gain consensus, if this condition is violated). So as an extra safety measure, we could add and else { log_or_panic!(..) here to catch any logic errors?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we have the on-chain logic, which should prevent it on its own, without having to rely on correct behaviour from the engines. So I don't mind having this here as a way of ensuring the chain logic is valid independent of engine behaviour - though I do think there should be a comment here that we don't expect this to be false because in general the engines should behave.

Copy link
Collaborator

@dandanlen dandanlen Oct 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I'm not saying we remove it, just that we should add something so that we don't silently fail if the assumption is wrong.


impl<T: Config> ExecutionCondition for NoUsedNonce<T> {
fn is_satisfied() -> bool {
SolanaAvailableNonceAccounts::<T>::get().len() == 10
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought it was 7 😅

Could we not use SolanaUnavailableNonceAccounts::<T>::iter().next().is_none()?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was originally 7 indeed. However, I did some optimizations to be able to bump it up to ten, as that is the bottleneck.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, should been less indirect: we can't assume that it will always be 10. We've already change it once, we might change it again. So the implementation should not assume the number.

engine/src/witness/sol.rs Outdated Show resolved Hide resolved
Settings: Member + Parameter + MaybeSerializeDeserialize + Eq,
Hook: OnChangeHook<Identifier, Value> + 'static,
ValidatorId: Member + Parameter + Ord + MaybeSerializeDeserialize,
> Change<Identifier, Value, Settings, Hook, ValidatorId>
> NonceWitnessing<Identifier, Value, Slot, Settings, Hook, ValidatorId>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

imo, making the names more specific is better in this case

use crate::{CorruptStorageError, SharedDataHash};

#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Encode, Decode, TypeInfo)]
pub struct NonceVote<Value, Slot> {
Copy link
Contributor

@kylezs kylezs Oct 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can generalise it when it necessary? else it's a little confusing it has a general name and is used for one (quite specific) thing. Though, if we go the way of composing the existing vote storage then this conversation is void.

@marcellorigotti
Copy link
Contributor Author

I kept everything generic in the end!

vote: &Self::Vote,
mut _h: H,
) -> Self::PartialVote {
(*vote).clone()
Copy link
Contributor

@kylezs kylezs Oct 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means we don't have a partial vote, and we always vote the full vote. The plan is to extend this electoral system for the contract witnessing, and so there the value would be much larger in size - and we don't want every validator submitting all that data - we should probably store a hash of the value in the bitmap, and then construct the full vote from the shared data - like we do in the Bitmap VoteStorage impl

let mut blocks_height = blocks_height.clone();
let (_, consensus_block_height, _) = {
blocks_height
.select_nth_unstable(threshold_from_share_count(num_votes) as usize)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are computing a threshold here based on the number of votes rather than the number of authorities. Is this intentional? Why?

(
For comparison, for monotonic_median consensus we do:

active_votes.select_nth_unstable((num_authorities - success_threshold) as usize);

)

Copy link
Contributor Author

@marcellorigotti marcellorigotti Oct 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My idea was to keep only the valid votes into consideration for the slot.
Cause if someone voted for a wrong value their slot could be way off and it doesn't make much sense to keep that into consideration.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but if we are in this branch of the code, then we know that there are at least success_threshold valid votes. Our thresholds should always be based on the total number of authorities.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I'll switch back to use the authority_count

pub async fn get_durable_nonce<SolRetryRpcClient>(
sol_client: &SolRetryRpcClient,
nonce_account: SolAddress,
) -> Result<Option<SolHash>>
previous_slot: SlotNumber,
) -> Result<Option<MonotonicChangeVote<SolHash, SlotNumber>>>
Copy link
Collaborator

@dandanlen dandanlen Oct 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: To keep the idea of a vote within the VoterApi, this fn should not return a vote, it can just return (SolHash, SlotNumber). (and it's the VoterApi's job to convert that into a vote)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a test that checks that we can only form consensus if the block increases. (ie. everyone votes for a new value, but at a lower block).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually this is not true, we can form consensus even if the block decrease.
Votes are filtered at the moment of voting only, so if we manually create votes that have a lower block we can still form consensus.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am implementing is_vote_valid() for the mock so that we can test that these bad votes are correctly rejected!

@@ -152,6 +156,37 @@ fn consensus_when_all_votes_the_same_but_different_slot() {
);
}

#[test]
fn votes_with_old_value_or_lower_block_are_rejected() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👌

@dandanlen dandanlen added this pull request to the merge queue Oct 10, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Oct 10, 2024
@dandanlen dandanlen self-requested a review October 11, 2024 10:28
@dandanlen
Copy link
Collaborator

As discussed: the upgrade condition check doesn't work, we can delete it.

We need to migrate by deleting all the old elections and then requesting nonce witness elections for any missing nonces.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants