This post is also available in the following languages. Japanese

Intern Report - A survey on selection techniques of consensus participants in blockchains

Takahashi Masahiro2021-02-17

Hello, I am Takahashi. I participated in the internship of the engineering on-the-job training course in 2020. I usually research algorithms and data structures at Kyoto University. During my internship, I became a member of the LINE Blockchain Lab team, where I conducted research on various blockchain technologies and proposed functional improvements based on my investigation. In this article, I would like to share the technologies that I found interesting during my research and the impressions I had during my internship.

About blockchain

I believe that many people have heard about blockchain. Blockchain is a technology that in which Bitcoin and other cryptocurrencies are based; it has recently attracted attention as a platform for decentralized applications (dapps). However, only a few people may be familiar with the details of the technology. Blockchain realizes a "distributed system for decentralized decision-making." I will explain it in a step-by-step manner.

First, a distributed system is a system where multiple independent computers execute a specific task while cooperating through communication. For example, the internet is a typical (and the largest) distributed system aimed at exchanging information. Suppose you want to make a decision, for example, to take a majority vote on this kind of a distributed system. If all computers on the system are normal and reliable, the matter is simple. It is the case when the distributed system is operated as an internal system of a particular company. Here, the computers simply inform each other of their opinions, and the majority vote is made based on them.

However, in decentralized peer-to-peer (P2P) distributed systems and others, malicious computers could be mingled with intent to attack the system.

As computers with malicious intent may be infinitely inflated, a risk that hinders the right decision if you simply try to make a majority decision for the entire system exists. Tolerating attacks from such computers and making a majority decision among normal computers are significant problems for distributed systems; a situation known as the Byzantine Generals Problem. A blockchain is a kind of system that realizes consensus building in such a P2P network. For example, Bitcoin, the first blockchain, was developed by Satoshi Nakamoto as a system for consensus building in terms of cryptocurrency payments. Bitcoin is designed to prevent fraudulent payments by requiring many computational resources (work) to process the payments. This kind of method is called a proof-of-work (PoW).

Meanwhile, some blockchains use a structure called a proof-of-stake (PoS). Under a PoS-based blockchain, participants in the consensus building process (consensus) are weighted based on the number of tokens in the network (like currency) they put up. Betted tokens increase by the number of good arguments and are confiscated by the number of bad arguments in the consensus. Then, fraud is addressed by requesting a lot of betted tokens (stakes) to influence the agreement.

Under PoW, security is guaranteed by the value outside a blockchain, such as computational resources. In contrast, under PoS, security is ensured by the value inside a blockchain, such as a token.

Selection of consensus participants

Polkadot, the blockchain that I mainly investigated during my internship, uses a PoS-based consensus. More precisely, the consensus of Polkadot is called Nominated PoS. A subset of participants, elected by a vote among participants, is involved in the consensus process. (Participants who joined in the consensus are called validators in Polkadot.) The reasons for limiting the number of participants in the consensus in this way are as follows.

Reduce latency to reach a consensus

The algorithm used in a PoS-based consensus generally takes significant time to execute as the number of participants increases. In particular, if all participants join the consensus process, it may consume an enormous amount of time for a single consensus. Such a situation can be avoided by only having a representative of the participants in the consensus.

Consensus participants need to be highly available

The consensus in a blockchain is a continuing process that takes place during day and night. Therefore, participants in the consensus need to be continuously connected to the network and secure a high bandwidth. These requirements are not achievable for all participants, so the number of participants needs to be limited. By the way, how exactly should you select validators by voting? If you merely choose validators by the number of votes, you may fail to properly reflect the opinions of voters into them. Ideally, you would want to get validators who keep and condense the structure of the principles that voters have.

BalPhragmms

Web 3.0 Technologies Foundation, the operator of Polkadot, has proposed an algorithm called BalPhragmms, which can be used for elections with the above requirements. This algorithm is quite complicated and difficult to thoroughly explain, but I would like to provide an overview.

BalPhragmms captures the state of voting among participants in a bipartite graph structure, as shown below.

Using this graph, consider how validators can be elected from the candidates through votes cast by voters. The vertices on the left side of the graph represent the participants who are voting (called nominators); the vertices on the right side represent the candidates of validators to be voted. The edge connecting a voter and a candidate means that the voter voted for the candidate. A nominator can have multiple edges. This means that a voter can cast a ballot for multiple candidates, which may differ from a general image of voting. Although not shown in the diagram above, during the vote in the network, each nominator bets (stakes) their Polkadot tokens, called DOT. The more stakes a nominator has, the more weights are placed on their ballots (the PoS principle).

Given the voting situation, how should you measure the "goodness" of the validators you elect from the candidates? BalPhragmms uses an unusual metric to measure the goodness of validators: that is, "how evenly are stakes distributed across the validators based on the assumption that the nominators' stakes are distributed to the validators for whom the nominators voted?" This metric is proven to be useful in the literature to avoid underrepresentation (such as not adopting an opinion that has more than a certain level of support) and overrepresentation (such as adopting an opinion that has less than a certain level of support). Suppose that each nominator stakes (represented by a coin in the figure) and they are distributed along the edges of the graph to elected validators, as shown in the following figure.

If you choose the first and second candidates from the top as validators, you can distribute stakes so that the two validators receive equal values, as follows.

On the other hand, if you choose the second and third candidates from the top, you would only distribute stakes in a slightly skewed manner, as shown below.

In this case, BalPhragmms will evaluate the former validators as better.

Although informally explained here, concepts such as "proportional allocation" and the "goodness of validators" can be rigorously defined using formulas (see the original paper for details). Based on the definition, the algorithm of BalPhragmms can find the best possible validators. (Although finding the best validators ends up to NP-hard, which leads you to abandon the solution, BalPhragmms is an approximation algorithm that theoretically ensures good outputs.)

Results

During the internship, I researched algorithms used in blockchains, including BalPhragmms mentioned above. I delivered a final presentation of my findings at a team meeting in the last week. I received many suggestions from team members about my presentation; it was a very beneficial experience to explain complicated things like algorithms.

Impressions of the internship

Interns at LINE are assigned to work in an actual active team. Accordingly, I participated in conferences and meetings, managed attendance, and so forth, just like a regular employee. I felt that the environment allowed me to gain a more authentic work experience. This year, the internship was almost entirely carried out remotely, but I had an opportunity to visit the Miraina office in Shinjuku just once.

The Miraina office was very clean and spacious, and I felt that there were many ingenious ideas scattered throughout the office to make it comfortable to work.

In the store space near the reception desk, lunch boxes and coffee were available for purchase, and various LINE Friends merchandise was on display.

I believe that the internship at LINE is a valuable one that will give you an exclusive experience. If you find it interesting, please apply for next year's internship.

Blog