Leader VRF value no longer settling ties #4051
In case it was an intentional decision (to randomise slot battle outcomes), the following is worth noting:
Therefore, the way things currently stand enables stake pools to game the system by running custom software to re-order or select transactions in order to generate a low block VRF result, maximising their chances of winning a potential slot battle. It is probably not good to have an extra incentive for block producers to manipulate transactions.
You are absolutely correct @TerminadaPool, and we've made sure not to incentivize this kind of behavior.
It does not! (To your point above.) The value of the VRF in the block header depends on:
Hi, I'd appreciate it if you could elaborate on why, in these 2 cases, CCYT lost 2 blocks to a pool with a low VRF after the Vasil HF. Is it completely random now? https://pooltool.io/realtime/7825147
@HT-Moh Yes, it is currently completely random, without the slight advantage to smaller pools that existed in Alonzo. This issue was opened to see whether the change was intentional or unintentional. If unintentional, it can be filed as a bug and corrected in the node.
@AndrewWestberg thanks
My 2 cents: here is some documentation about it, near the bottom of the page (that website is down right now, so I'm linking the Wayback Machine URL)
I think you might be missing the point @reqlez. Slot battles used to be determined in favour of the lowest VRF score (which would favour smaller pools), but now the determination is random. Small pools no longer have the slot battle advantage your link refers to.
I'm fully aware. I'm just saying that I believe the lowest-VRF-score preference (how it was before) was a design decision rather than a bug. I don't have any information from IOG to back that up, of course. The reason I believe it is that it makes sense to give smaller pools a preference: a lost block is barely noticeable to a big pool, but hurts a smaller pool quite a bit. What I don't know is what the potential security issues with this are. I mean... it worked like this for 2 years and seemed fine. Has anybody found a way to abuse it reliably?
It was a deliberate design feature to settle slot battles in favour of the lower VRF score, but the changes in the most recent version, 1.35.3, removed this feature. That is why I was surprised to notice the change, as was the rest of the community. It still hasn't been established whether the removal of this feature was intentional or not.
Here is some more context that not everyone may be aware of. At the Vasil hard fork, we switched from using the
Do we have any ETA for this fix? It should have a higher priority, as it is already affecting the ratings and ROI of smaller pools. In my personal experience, I have already lost 5 substantial delegators to large pools. Thank you.
@cardanoinvest I assume that a fix for this is unlikely to be rolled out separately, because the larger stake pools have no incentive to upgrade: it would slightly disadvantage them. If only part of the network upgrades, then slot battles will become non-deterministic (since upgraded nodes will award a different winner half of the time), resulting in more chain forks needing to be settled by the longest-chain rule. Yes, it is a blow for small pools, especially when the mandatory 340 Ada minimum fee is also working against them recruiting delegation.
@TerminadaPool if they fix it and release it as 1.35.4, with a strong recommendation to update and a note to exchanges to update, I believe it will become the main version, because 1.35.5, 1.35.6, etc. will come anyway with that fix implemented. It's not like we have any other major alternative Cardano node development team.
@cardanoinvest If you update your node and the majority doesn't, then your node might produce a block on the non-consensus fork, causing your block to become orphaned.
This only affects slot battles. Whether someone is running 1.35.4 with the patch or 1.35.3 without it won't make much difference, as only about 5% of blocks are battles. What matters is whether the node making the block AFTER yours is upgraded. You cannot improve your chances by being on the same or a different version as the node coming after your block. All you can do is upgrade yourself, to be nice to any smaller pool making a block before yours. Whether your block gets adopted or not is out of your hands.
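The "5% of blocks are battles" figure quoted above can be sanity-checked with a back-of-the-envelope model. This is a simplified sketch of my own, assuming independent per-pool leader election with the mainnet active slot coefficient f = 0.05 and stake split evenly across a hypothetical 3000 pools; it counts only same-slot battles, not height battles caused by slow propagation:

```python
def phi(f, sigma):
    # Ouroboros Praos leader-election probability for a pool holding
    # a stake fraction sigma, with active slot coefficient f.
    return 1.0 - (1.0 - f) ** sigma

f = 0.05            # mainnet active slot coefficient
n = 3000            # hypothetical number of equal-stake pools (assumption)
p = phi(f, 1.0 / n) # per-pool, per-slot election probability

# With independent elections, the number of leaders per slot is Binomial(n, p).
p_no_leader = (1.0 - p) ** n                 # equals (1 - f) by construction
p_one_leader = n * p * (1.0 - p) ** (n - 1)
p_battle = (1.0 - p_no_leader - p_one_leader) / (1.0 - p_no_leader)
print(f"share of block-producing slots with more than one leader: {p_battle:.2%}")
```

Under these assumptions the model gives a couple of percent of block-producing slots being contested, in the same ballpark as the figure quoted above (which presumably also includes height battles).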
Agreed. But wouldn't this sort of fix better fit a CUE event anyway? In that case the upgrade discussion would be moot, no?
This is indeed hurting small pool operators' pool performance.
There is a misconception that small pools decentralize. Making slot battles random instead of favoring small pools is an incentive in the right direction: to consolidate pools. A single pool with 30MM stake is better than 10 pools with 3MM stake each. We need to stop incentivizing thousands of pools with low stake and begin incentivizing K desirable pools.
Could you please explain this misconception?
You know the pools in the above photo are part of a large group because they are marketed that way. If each of those pools had different metadata, website, etc., they would still be part of a group. Not all groups are so blatant. Assuming that any pool with low pledge is independent because it's marketed that way is naive. Many people (especially ITN OGs) are running tons of pools with low pledge and low stake to farm minPoolCost. Slot battle preference paired with a giant minPoolCost are two incentives for SPOs to run many small pools instead of one large pool. By making slot battles random, you incentivize delegators and operators to consolidate their stake into one large pool instead of dozens of small pools.
Well, if they're splitting into multiple small pools, they still have to pay for the resources (CPU/RAM, etc.), so the 340 Ada should cover that. They still make it decentralized.
I've now had a chance to talk to the researchers, the cryptographers, and the folks that did the implementation work. It was always intended that ties be settled uniformly at random. The behavior in Praos is expected; the behavior in TPraos is not. The Praos paper does not make assumptions about how tie-breaking is done; it is assumed to be controlled by the adversary. The problem is that this was an unintended incentive mechanism: the incentive was not intentionally added to TPraos, nor was it intentionally removed from Praos. The assumption was always that ties were being resolved fairly. @kaskjabhdlf is right to remind us that we cannot equate "good for small pools" with "better for decentralization". Though @cardanoinvest is almost certainly correct that the small-pool advantage is dwarfed by other advantages, the fact is that the incentive mechanism designed by our researchers already takes into account that we want decentralized block production. I'm going to close this issue now, not because I want to end the discussion, but because my original question has been answered. Please feel free to keep the discussion going here, elsewhere on GitHub, on Discord, etc.
@JaredCorduan Regardless of intentions, "uniformly random" does not have uniform consequences. Slot battles already hurt a small pool much more than a large pool: losing even a single block is a huge hit to a small pool's ROI, while for a pool near saturation it's insignificant. I believe we will see decentralization decrease and fewer new pools able to compete with established pools. I would ask you to reconsider the decision not to fix it; otherwise we will likely see our first community-supported fork of the IOG code.
@AndrewWestberg I guess we can all manage to gather a large enough community to make that fork; even @CharlesHoskinson's "RATS" pool is small enough to be negatively affected by this. I would like to hear his comments on this issue.
@JaredCorduan - wouldn't that incentivise those operators to create more adversarial forks (as in running multiple BP nodes, which was evident in the ITN, where chain selection was the primary reason those were run) to try and gain an advantage? It's not going to be uniformly random if there are ways to tilt (even if not guarantee) results in one's favour by creating forks.
@TerminadaPool don't worry :-) I wasn't taking it as an accusation. Indeed I'm flattered that you thought I was so clever that I knew about and introduced this bias in the design (but never wrote it down in the specs).
I'd recommend reading the incentives paper and the incentives section of the Shelley design docs; there were also a load of presentations on the topic by Lars (which no doubt can be found on YouTube). Within the existing design, the main lever to pull to control the level of decentralisation is the K param, but one does need to deal with the issue of pool splitting.
Yes, pool splitting essentially negates the effect of increasing K. It seems that the main lever to pull to control the pool splitting problem is by properly incentivising pledge.
and just to connect the dots for everyone (apologies if this is obvious!), we can achieve this by increasing the |
But doing so, with the current formula, will massively benefit those that can pledge-saturate a pool while making almost no difference to those that pledge in the 100K to 3M Ada range.
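The claim above can be made concrete with a sketch of the maximal pool reward formula from the Shelley delegation design spec, f(s, σ) = R/(1 + a0) · (σ' + s'·a0·(σ' − s'·(z0 − σ')/z0)/z0), with z0 = 1/k, σ' = min(σ, z0), s' = min(s, z0). Here R is an illustrative figure, while a0 = 0.3 and k = 500 match the mainnet parameters; the pledge levels are illustrative assumptions:

```python
def max_pool_reward(R, a0, k, sigma, s):
    """Maximal (fully performing) pool reward f(s, sigma) from the Shelley
    delegation design spec. sigma = pool's total stake fraction,
    s = pledge as a fraction of total stake."""
    z0 = 1.0 / k  # saturation threshold
    sigma_ = min(sigma, z0)
    s_ = min(s, z0)
    return (R / (1.0 + a0)) * (
        sigma_ + s_ * a0 * (sigma_ - s_ * (z0 - sigma_) / z0) / z0
    )

R, a0, k = 1000.0, 0.3, 500  # R illustrative; a0, k as on mainnet
z0 = 1.0 / k

full   = max_pool_reward(R, a0, k, z0, z0)       # pledge-saturated pool
modest = max_pool_reward(R, a0, k, z0, z0 / 20)  # modest pledge (5% of saturation)
none   = max_pool_reward(R, a0, k, z0, 0.0)      # zero pledge

# Raising a0 widens the gap between `full` and everyone else, while the
# difference between `modest` and `none` stays tiny -- the point made above.
print(full, modest, none)
```

A pledge-saturated pool collects the full R/k, while the reward difference between a modest pledge and zero pledge is marginal, which is why raising a0 alone mostly benefits those who can pledge to saturation.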
Unfortunately, this code change that removed the leader VRF as the tie breaker for slot battles was yet another blow to small pools in the decentralisation war against the mega pools. Increasing a0 will not help; the current reward formula is not fit for purpose. Here are some quotes about the a0 parameter from CIP-0050:
Without changing the formula, increasing a0 will exacerbate these problems.
I agree with what @TerminadaPool said here. Making the slot battle tie breaker equally random for all competitors sounds nice superficially, but it actually works against the whole incentive strategy of Cardano network operation. Even though Cardano is a proof-of-stake network, we need human resources such as builders and participants more than ever.
@daehan-koreapool yes, and we had this "bias" towards small pools for 2 years, and I saw nothing bad happen. But with it gone, I see no incentive to delegate to a small pool if you care about your ROI. Plus, big pools get more rewards from transactions: they have 50 transactions per epoch, for example, while a small pool gets 1-5, so their total ADA rewards at the end are larger. There should be some bias for small pools for this to make sense. So where do we go from here? When can this be put to a vote, at least?
Hi @dcoutts, for my edification (trying to learn): in CIP-9, what makes the "Updatable Parameters", aka the reward protocol parameters, "substantially easier" to change, exactly? I know they don't require a hard fork. I heard they only need 4/7 signing keys? Could you explain the differences and maybe point me in the right direction? Thanks!
Changing any of the updatable protocol parameters simply needs a governance decision and then posting a suitably-signed protocol parameter update on the chain (yes, currently that means 4 of the 7 governance keys). That's it. The change then takes place automatically at the next epoch boundary, without any action needed on the part of any other user (SPOs, wallet users, etc.). By contrast, a hard fork is a much more elaborate and time-consuming procedure. It requires all SPOs and all other users to upgrade to a new version of the code, giving people enough time to do so. This often involves a lot of communication and time for developers to test their applications on a testnet. Then, once enough end users and SPOs have upgraded, it also needs a signed protocol parameter update to be posted on the chain (to update the major protocol version). So a hard fork is really a lot more work, and needs a lot more lead time.
Sooo, maybe we can revert it back now? And while everything is back to stable, like it was for 2 years, we can run some simulations of how this change will affect the playing field in the long run? Because I already feel the effects myself, and I know other SPOs with the same issue. My average monthly ROI dropped from a stable 4% to 1.6%: 3 blocks lost out of 8 in 6 epochs, all in battles. Maybe I am just very unlucky, but the number of battles seems bigger to me, and I have run my pool since the beta and the ITN. And the Cardano pool ratings don't take this change into account, so small pool = bad, even though the losses are random now and not down to server lag or something.
I am going to poke my head up and see if I get shot again: I want to disagree that a hard fork is necessary to change how slot battles are settled.

As I understand it, the decider of a slot battle is really the next block producer, because he chooses which fork to build his block upon. If the next block producer after a slot battle is running software that chooses the winner based on the lowest leader VRF, then he will build on the block with the lower leader VRF. Other nodes in the network will then follow his chosen fork, because it will now be the longest chain with his extra block added. In other words, I believe a modification to the software to decide slot battles based on the lowest leader VRF could be rolled out without a hard fork. In fact, if someone pointed out where to make the change in the code, I believe stake pool operators could choose to do this themselves. If a group did this, then the percentage of slot battles decided by leader VRF score would reflect the stake-weighted percentage of pool operators that made this change to their software.

Having said all that, I am not sure that I would want to do such a thing, even though I run a small pool that would benefit from this change. I say this because I agree with Duncan's general principle of wanting the design to properly reflect the original intentions of the research.
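The mechanism described above can be sketched as a toy model (the names here are illustrative, not the actual node internals): the producer of the next block effectively settles the tie by choosing which equal-length candidate chain to extend.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    length: int      # chain length up to and including the tip block
    leader_vrf: int  # leader VRF output of the tip block (lower favours small pools)

def choose_chain(seen_first, seen_second, prefer_low_vrf):
    """Pick which candidate chain to extend: a longer chain wins outright;
    on a length tie, a patched node prefers the lower leader VRF (the
    pre-Vasil behaviour), while a stock node keeps the chain it adopted
    first (arrival order, effectively random)."""
    if seen_first.length != seen_second.length:
        return max(seen_first, seen_second, key=lambda c: c.length)
    if prefer_low_vrf:
        return min(seen_first, seen_second, key=lambda c: c.leader_vrf)
    return seen_first

a = Candidate(length=100, leader_vrf=7)
b = Candidate(length=100, leader_vrf=3)
assert choose_chain(a, b, prefer_low_vrf=True) is b   # lower VRF wins the tie
assert choose_chain(a, b, prefer_low_vrf=False) is a  # first-seen wins the tie
```

Because every honest node extends whichever chain it selected, the extra block then settles the fork for the rest of the network via the longest-chain rule, which is why such a change could in principle propagate without a hard fork.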
Are those all slot battles or mostly height battles? If height battles, are those mostly caused by a high propagation time of the other pools? Because if that is the case, the proposal mentioned/discussed by @TerminadaPool and @dcoutts earlier in this thread (#2913 (comment)) will be a better solution for this problem. Slot battles and 'real' height battles (so not those caused by really bad propagation time) should be much rarer, I think (or you really do have really bad luck now, that's also possible).
I also think a HF isn't necessary, but it would cause a 'wilder' chain with more forks. Pools with high saturation have an incentive not to upgrade, so a majority of the blocks would probably still be minted by pools using the current rules. With a HF, you can force everyone to use the new rules.
2 were slot battles, 1 was a height battle. It was very rare for me before the HF: maybe I lost 1 or 2 height battles in 2 years. During the height battle I won the slot battle but lost the height battle to a stake pool with 8s propagation and 21 nodes confirming.
@TerminadaPool I also think you're right. The winning block in a slot battle isn't enforced by the ledger rules but by the node, so in theory SPOs could run custom code. This would probably lead to longer settlement times for forks. On the actual topic, I don't think there's a need to roll back to the previous selection rule. In tandem with the minPoolCost, it incentivizes pool splitting even more. Most small pools are supposed to die out, so that stake can move to a few of the small pools and they can become saturated. The actual delegation decision should be non-myopic. Delegators seem to be pretty bad at that, although, in their defense, they are not given the right tools. The ranking mechanism proposed by @brunjlar is still not implemented correctly in Daedalus or any other wallet or pool-ranking site.
Well done to the eagle-eyed folks who spotted that, in principle, the tie breaking would not necessarily require a hard fork. That is true, right now. I should caution however that it will not be true in future: there will later be a change to the chain sync protocol that will require both parties to agree exactly on the chain ordering (i.e. including how one does tie breaking), and any disagreement will become a protocol error. So again, it's not a direction that I would encourage anyone to go in.
I would have thought there is no fundamental difference between the next block producer choosing one particular single-block fork (the slot battle winner) versus dropping the block he didn't like as though he never received it. In both cases he will build his block on the slot battle winner he prefers. This sort of thing already happens when one of the blocks from a slot battle is received too late by the next block producer; the Nakamoto longest-chain rule is then used to decide the canonical chain. Are you saying that if a block producer mints a block on the wrong slot battle fork, because he didn't receive the other fork in time, his block will then become invalidated due to a protocol error?
I suppose this has to do with the introduction of input endorsers. I also welcome any change that enforces correct behavior via ledger rules. Right now there's nothing stopping block producers from omitting transactions or reordering their mempool. Input endorsers will make this kind of foul play much more difficult.
If your node builds on top of a block that is not the winner, your block and the fork will be sorted out sooner or later, yes. Take a look at the issue we had with the node a while ago. If you introduce different node decisions in parallel, you will see a lot of those forks again, and a lot of lost blocks too. The only way to avoid this is via a hard fork event.
If you are the next block producer after a slot battle and you have not received one of the blocks, then you will build on the block you did receive, because it is a valid block. Your block is then also a valid block. Other nodes in the network will now see your 2-block fork as longer than the alternative 1-block fork from the slot battle. I mean, it is not like you can wind back time and re-mint your block on the other fork when you eventually receive the other block from the slot battle. Nobody can know whether you received both blocks of the slot battle and selectively dropped one, or only received the block you prefer. I realise that I am a nobody on this forum, but surely my argument is simply restating the "longest chain rule".
Sure, the longest chain wins, but this only happens if you have the majority of the block-producing nodes on that chain. So again, doing this without a hard fork will result in the same behavior we saw before with the node bug. Most pools with large amounts of stake will not upgrade to that newer version without massive social pressure.
With input endorsers, I imagine the analogous scenario would be where two pools are slot leaders to produce a ranking block for the same slot. Depending on the implementation, both of these ranking blocks may be accepted, and everyone will agree on the ordering of these ranking blocks such that the block with the higher "block VRF" is ordered later. In other words, with input endorsers, slot battles will go the way of the dodo, and this whole discussion will be rendered irrelevant.
@AndrewWestberg sorry for going a bit off topic, but does this mean that we can now run 2 BPs for the same pool, side by side, for HA purposes? If there will be no battle between blocks produced by 2 BPs of the same pool... Am I missing something?
@os11k NO. It does not mean you can now run 2 BPs for the same pool.
Sorry to dredge up the past @JaredCorduan, but I'm trying to document how ties are settled in the case of multiple BPs running with the same keys. Plus, this is technically still an open issue, apparently... It appears to me there is no preference given; the `PraosChainSelectView` will get all the way down to EQ here: @rdlrt referenced this issue here: #2014. Which makes it seem like the decision falls down to the block hash, everything else being equal. When I analyzed 9 different well-distributed forker events, I didn't see a clear rule to use. 8097023 lower block hash https://pooltool.io/realtime/8097023 I'm OK saying it's random, depending on how blocks are stored in the `volatileDB` of the block producer that makes the next block. I just want to be accurate, as this is going in this "learning cardano" book.
That's great, thank you! If two blocks are at the same chain depth, and they have the same issuer (ie |
Thank you for your response. The nonces are equal because it's the same pool's secrets making the same block at the same slot. cncli sends pooltool the "block VRFs", and I've indeed documented that the VRF is exactly the same for both submitted blocks with different hashes. (Just to be precise, I don't keep all the LSBs of the VRF, so if the difference is in the lower bits of the VRF then I guess it could differ; for example, for height 8078291 the VRF values are both
I do not know whether or not we've found a bug or just discovered the consequences of an intentional decision.
The final tie breaker for "slot battles" is the leader VRF value:
https://github.com/input-output-hk/ouroboros-network/blob/9249a70ed9e2365f3963e47cb31b4b1589bca8f6/ouroboros-consensus-protocol/src/Ouroboros/Consensus/Protocol/Praos/Common.hs#L62-L68
In the `TPraos` protocol (used prior to the Vasil HF), `csvLeaderVRF` was the leader VRF value. In the `Praos` protocol, however, `csvLeaderVRF` is being set to the single VRF value in the block header (prior to the range extension).

This removes a small advantage that small pools previously enjoyed. Small pools are more likely to win this tie breaker, since, being small, they need a smaller leader VRF value in order to win the leader check. Using the VRF value before the range extension is applied removes this small advantage.
The Evidence:
The view, `PraosChainSelectView`, is populated by the `BlockSupportsProtocol` class method `selectView`, which uses the `ProtocolHeaderSupportsProtocol` class method `pHeaderVRFValue` to set `csvLeaderVRF` in the view.

In `TPraos`, `pHeaderVRFValue` uses the leader VRF value:
https://github.com/input-output-hk/ouroboros-network/blob/9249a70ed9e2365f3963e47cb31b4b1589bca8f6/ouroboros-consensus-shelley/src/Ouroboros/Consensus/Shelley/Protocol/TPraos.hs#L111

In `Praos`, `pHeaderVRFValue` uses the raw header VRF value:
https://github.com/input-output-hk/ouroboros-network/blob/9249a70ed9e2365f3963e47cb31b4b1589bca8f6/ouroboros-consensus-shelley/src/Ouroboros/Consensus/Shelley/Protocol/Praos.hs#L141
This was discovered here: cardano-community/cncli#19
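The small-pool bias described above is easy to demonstrate with a quick Monte-Carlo sketch. This is a simplified model of my own, not the node's code: it treats the leader VRF output as uniform on [0, 1), so a pool with stake fraction σ is elected exactly when its draw falls below φ(σ) = 1 − (1 − f)^σ, and conditioned on being elected, its value is uniform on [0, φ(σ)). The stake fractions are illustrative:

```python
import random

f = 0.05  # active slot coefficient

def phi(sigma):
    # Leader-election threshold for a pool with stake fraction sigma.
    return 1.0 - (1.0 - f) ** sigma

small_thr = phi(0.001)  # illustrative small pool (0.1% of total stake)
large_thr = phi(0.020)  # illustrative large pool (2% of total stake)

random.seed(0)
trials = 100_000
small_wins = 0
for _ in range(trials):
    # Condition on both pools passing the leader check (a slot battle):
    # each pool's VRF value is then uniform on [0, its threshold).
    v_small = random.uniform(0.0, small_thr)
    v_large = random.uniform(0.0, large_thr)
    small_wins += v_small < v_large  # lowest leader VRF wins the tie

print(f"small pool wins the tie-break in {small_wins / trials:.1%} of battles")
```

Under this model the small pool wins the overwhelming majority of ties (roughly 1 − small_thr/(2·large_thr)), which matches the issue's observation that switching from the leader VRF to the raw header VRF removed a small-pool advantage.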