Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add creation time for DHT authority discovery records #91

Merged

Conversation

alexggh
Copy link
Contributor

@alexggh alexggh commented May 20, 2024

This adds a new creation time for authority discovery records stored in the DHT.

This RFC has already an implementation in polkadot-sdk here: paritytech/polkadot-sdk#3786

Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>

## Motivation

Currently, we use the Kademlia DHT for storing records regarding the p2p address of an authority discovery key, the problem is that if the nodes decide to change its PeerId/Network key it will publish a new record, however because of the distributed and replicated nature of the DHT there is no way to tell which record is newer so both old PeerId and the new PeerId will live in the network until the old one expires(36h), that creates all sort of problem and leads to the node changing its address not being properly connected for up to 36h.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the problem is that if the nodes decide to change its PeerId/Network key it will publish a new record, however because of the distributed and replicated nature of the DHT there is no way to tell which record is newer so both old PeerId and the new PeerId will live in the network until the old one expires(36h)

I don't think that's true? The new record is very much supposed to overwrite the old one.

Copy link
Contributor

@tomaka tomaka May 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Putting a record" consists in finding the 20 peers whose PeerId is the closest to the relevant key (here the relevant key is the authority discovery key) and sending them the record. If later you want to modify the record, you find these 20 peers again (they are normally still the same) and send them a new record, which overwrites the old one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that's true? The new record is very much supposed to overwrite the old one.

I've tested this several times on different network configurations and the old record still survives in the network until it expires, as far as I can understand this happens because you do publish the record initially to the closest 20 nodes, but then when the record reaches for different reasons other nodes will still store it locally with this formula: https://github.com/libp2p/rust-libp2p/blob/ad7ad5b3fc5b4bc9a431ece90e9a5ce8c33ca0e2/protocols/kad/src/behaviour.rs#L1796, so not just the initial 20 nodes would store a record.

Copy link
Contributor

@tomaka tomaka May 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when the record reaches for different reasons

Well, this isn't really supposed to happen. It can happen in some rare cases where the publisher has been lied to during its iterative query, for example.

Even if the record reaches nodes other than the initial 20 closest nodes, reading the record (i.e. querying it) is always done against the 20 closest nodes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, double checked again, nodes don't actually have perfect connectivity, so their view of what are peer ids are not always the same, especially after your restart, so we end up in a situation where we don't always publish to the same 20 closest nodes, so you need just one node to still have the old record and then because that will replicate the record to its view of 20 closest nodes, we end up in a situation where the old record will override the new record.

The issue can be replicated systematically, by running a network of around 30-40 nodes, you let it run for sometime then you restart one of the node with a different PeerID.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, the exact scenario I'm testing is like this:

  1. Run a stable polkadot network.
  2. Just restart a single node with a new network key(PeerId) which makes the node to publish a new value on the dht for the key it is responsible.

What I'm observing is that the old record of the single node is actually on more than the 20 closest nodes that get updated at step 2) and because those nodes do replication https://github.com/libp2p/rust-libp2p/blob/ad7ad5b3fc5b4bc9a431ece90e9a5ce8c33ca0e2/protocols/kad/src/behaviour.rs#L2520 regularly, it is overwriting the newer record.

I consider this to be a valid use case since it is what happens when nodes upgrade and forget to persist their network key, it happened a few times in the past on kusama.

For the tests I ran the network does not seem to be split since I see everyone communicating with everyone on the other protocols, so what do you think am I missing here ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole discussion is off-topic for this RFC, but the record could be on more than 20 nodes simply because during step 1 you have temporary net splits. What's important is that the 20 nodes where the record is updated at step 2 are the same as the ones queried by the rest of the network when they want to read the record. The logs would tell you that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's important is that the 20 nodes where the record is updated at step 2 are the same as the ones queried by the rest of the network when they want to read the record. The logs would tell you that

Right, but because the one(outside of the closest 20) still has the old record and does replication on the closest 20, all will have the old record.

This whole discussion is off-topic for this RFC

It is the reason I started the RFC in the first place, but I agree it shouldn't affect the RFC, I'll go ahead and apply your suggestion about using a single signature.

Copy link
Contributor

@tomaka tomaka May 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but because the one(outside of the closest 20) still has the old record and does replication on the closest 20, all will have the old record.

As far as I can remember only the original publisher periodically re-publishes and can periodically re-publish.
When receiving a record, a node ensures that the PeerId of the sender of the record (i.e. the "live" PeerId of the connection where the record is being received) can be found in the list of PeerIds that is found in the record and that is covered by the signature.
Otherwise you would have a pretty obvious vulnerability, and we thought of these kind of things when designing the system.

It is the reason I started the RFC in the first place, but I agree it shouldn't affect the RFC, I'll go ahead and apply your suggestion about using a single signature.

I see this RFC as a hack to work around a bug somewhere, and it would be a better idea to actually fix the bug. However I'm personally not really involved in Polkadot anymore, so I'm not going to vote against.

Copy link
Contributor Author

@alexggh alexggh Jun 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Back to this.

As far as I can remember only the original publisher periodically re-publishes and can periodically re-publish.
When receiving a record, a node ensures that the PeerId of the sender of the record (i.e. the "live" PeerId of the connection where the record is being received) can be found in the list of PeerIds that is found in the record and that is covered by the signature.
Otherwise you would have a pretty obvious vulnerability, and we thought of these kind of things when designing the system.

Yes, you are correct, re-publishing is done only by the original published, but there is also replication which is done by all peers that store the record, the integrity of the record is guaranteed by the fact that the signature is checked, so you can't replicate anything, you need to replicate the original record.

I see this RFC as a hack to work around a bug somewhere, and it would be a better idea to actually fix the bug. However I'm personally not really involved in Polkadot anymore, so I'm not going to vote against.

I've investigated this and I couldn't find anything wrong with the way it works, the scenario this fixes it is just a consequence of the fact that not all nodes have the same routing table always, so at different point in time the 20 closest 20 nodes could differ slightly.

However I'm personally not really involved in Polkadot anymore, so I'm not going to vote against.

Then if you don't strongly oppose this RFC, I intend to move forward with adding this new field(creation_time) since I see benefits in having it, so we can use it to always pick the newest record.

... to sign it only once

Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
Copy link
Contributor

@bkchr bkchr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just two minor questions, otherwise it looks good to me 👍

+
```

Each time a node wants to resolve an authorithy ID it will issue a query with a certain redundancy factor, and from all the results it receives it will decide to pick only the newest record. Additionally, the nodes that answer with old records will be updated with the newer record.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally, the nodes that answer with old records will be updated with the newer record.

So, the node that send the DHT request will inform these nodes?

Strictly speaking, I don't think that this needs to be part of the RFC, because even a node not doing this would behave well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's correct, I was just trying to explain how a well behaving validator should/could use this field, I can explicitly state that this part is optional or remove it.

text/0091-dht-record-creation-time.md Show resolved Hide resolved
Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
@bkchr
Copy link
Contributor

bkchr commented Jul 5, 2024

/rfc propose

@paritytech-rfc-bot
Copy link
Contributor

Hey @bkchr, here is a link you can use to create the referendum aiming to approve this RFC number 0091.

Instructions
  1. Open the link.

  2. Switch to the Submission tab.

  1. Adjust the transaction if needed (for example, the proposal Origin).

  2. Submit the Transaction


It is based on commit hash fbd4296f2a8a14cb2a17b2b0f0260b1df6e80783.

The proposed remark text is: RFC_APPROVE(0091,3df081c71e09b40d9057410b431accdf8fd0a7b20e58d176d47411fb1aa7083d).

Copy link

github-actions bot commented Jul 5, 2024

Voting for this referenda is ongoing.

Vote for it here

Copy link

PR can be merged.

Write the following command to trigger the bot

/rfc process 0x997290e7221101d7fe625a2f4eb58dd73393845a696d3ac3682c9f546a25c00b

@bkchr
Copy link
Contributor

bkchr commented Jul 15, 2024

/rfc process 0x997290e7221101d7fe625a2f4eb58dd73393845a696d3ac3682c9f546a25c00b

@paritytech-rfc-bot paritytech-rfc-bot bot merged commit 65cf83e into polkadot-fellows:main Jul 15, 2024
@paritytech-rfc-bot
Copy link
Contributor

The on-chain referendum has approved the RFC.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Implemented Is merged or live as a feature/service.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants