I have gotten some additional clarity from the Chia Network team about this weekend's drop in netspace and pool netspace and the corresponding slight reduction in signage points. It seems that it was an arms race between the fastest timelord on the network and a new one coming online and fighting for the top spot. The netspace drop the pools saw seems to have been largely artificial and not a function of large swathes of farmers having issues. It also seems that the advice to update, while good in general, would not have solved this problem.
I had originally suspected that the issues had something to do with the timelords since, as I understand it, that is where the signage points are generated. Now we have confirmation. But why did the pools report lower netspace? It seems that they calculate their netspace using a static constant for the signage point rate. So when the signage points slowed down, the reported netspace dropped by a corresponding amount. I was able to confirm this with a pool operator, who agreed with the assessment.
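To make that concrete, here is a minimal sketch of the effect. It is not any pool's actual code. The function names and the `points_per_sp_per_tib` yield figure are hypothetical, and the nominal interval assumes a target of 64 signage points per 600-second sub-slot. The only point is that dividing by an assumed signage point count, rather than the observed one, under-reports space whenever the timelords slow down.

```python
# Assumed nominal target: 64 signage points per 600-second sub-slot.
NOMINAL_SP_INTERVAL_S = 600 / 64  # 9.375 seconds


def estimate_space_static(points, elapsed_s, points_per_sp_per_tib):
    """Estimate TiB assuming signage points arrived at the nominal rate."""
    assumed_sps = elapsed_s / NOMINAL_SP_INTERVAL_S
    return points / (assumed_sps * points_per_sp_per_tib)


def estimate_space_observed(points, observed_sps, points_per_sp_per_tib):
    """Estimate TiB using the number of signage points actually seen."""
    return points / (observed_sps * points_per_sp_per_tib)


# Hypothetical scenario: 100 TiB of farmers, one hour of data, and a
# slower timelord that produced 320 signage points instead of 384.
true_tib = 100.0
yield_per_sp_per_tib = 0.001  # made-up yield, for illustration only
elapsed_s = 3600.0
observed_sps = 320

# Farmers can only submit points for signage points that actually occurred.
points = true_tib * observed_sps * yield_per_sp_per_tib

static_estimate = estimate_space_static(points, elapsed_s, yield_per_sp_per_tib)
observed_estimate = estimate_space_observed(points, observed_sps, yield_per_sp_per_tib)

print(f"static constant: {static_estimate:.1f} TiB")
print(f"observed rate:   {observed_estimate:.1f} TiB")
```

In this made-up scenario the static calculation reports roughly a sixth less space than is really there, even though no farmer went offline.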
Summarizing my personal thoughts on this (thank you @altendky and everyone for the analysis and data):
– Chia consensus is farmers (space) + timelords (vdf speed). Either one changing will change block rate and thus difficulty
– If the fastest timelord goes offline, the SP and block rate decreases, since VDF speed decreases
– Difficulty adjustment will then kick in, and bring the SP rate back to normal
– If the fastest comes back online, the SP will go even higher than normal
– More competing timelords (if for example the 2nd and 3rd are close in speed) can cause more forks in the chain
– Worse or different connectivity in the fastest timelords can cause more forks as well
It looks like the consensus algorithm is working exactly as intended. In the netspace calculators, we should always take into account the signage point rate and not assume a constant VDF speed.
Mariano Sorgente, Software Engineer, Chia Network
Chia is recommending that pool operators calculate their signage point rate over 30 minutes of rolling real-world data instead of assuming a standard rate if they want their pool space accurately reported. At the end of the day, it seems like this is just the network working as expected with components coming online and going offline. I think it does speak to how central timelords are to the health of the Chia network, and possibly that functionality should be built straight into the full node software so that, no matter what, the network continues to hum along as long as there are farmers.
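A rolling measurement like that could be sketched roughly as follows. This is only an illustration of the idea, not Chia's or any pool's implementation. The class name, method names, and the way timestamps are fed in are all assumptions.

```python
from collections import deque


class RollingSignagePointRate:
    """Tracks signage point arrivals over a rolling window (hypothetical helper)."""

    def __init__(self, window_s=1800.0):
        self.window_s = window_s  # 30 minutes by default
        self._timestamps = deque()

    def record(self, timestamp_s):
        """Record the arrival time (in seconds) of one signage point."""
        self._timestamps.append(timestamp_s)
        self._evict(timestamp_s)

    def _evict(self, now_s):
        # Drop arrivals that have fallen out of the rolling window.
        while self._timestamps and self._timestamps[0] <= now_s - self.window_s:
            self._timestamps.popleft()

    def rate_per_second(self, now_s):
        """Signage points per second over the last window.

        Until a full window of data has been collected this underestimates
        the rate, so a real system would want a warm-up period.
        """
        self._evict(now_s)
        return len(self._timestamps) / self.window_s


# Usage: feed 30 minutes of arrivals spaced 11.25 s apart (a slowed timelord).
tracker = RollingSignagePointRate()
for i in range(1, 161):
    tracker.record(i * 11.25)

slow_rate = tracker.rate_per_second(now_s=1800.0)
print(f"observed signage points per second: {slow_rate:.4f}")
```

A pool would then plug the measured rate into its space estimate in place of the fixed constant, so a slower or faster timelord no longer shows up as phantom space lost or gained.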
I think this event shows how sensitive the Chia network is to timelord issues, and from what I can tell, if there is going to be a major exploit or issue with consensus on the network, it will come from the timelords. On a decentralized network as distributed as Chia, it is odd to see a single piece of infrastructure so central to the process. I know they are trying to solve this issue with ASIC timelords, but to me that means that the person running the most voltage through their ASIC will control the network, and that it will just create a new arms race on a different battlefield, with everyone trying to get the ASIC clock as high as possible.