Intelligent bribery selfish mining attack algorithm for Ethernet workshop network
Technical Field
The invention belongs to the field of privacy protection, relates to technologies such as block chains, selfish mining, machine learning and the like, provides an attack algorithm with higher income and lower threshold value under a rational environment while improving the intelligence of an attacker, discovers a vulnerability of an identification mechanism in a block chain system, and provides a new idea for further improving the security of the block chain system.
Background
In an ether house network, miners organize and pack legal transaction information in the network into a block. Through the consensus agreement, all miners ' nodes participate in the competition for the accounting right, and finally one miners ' node obtains the accounting right, and the miners add the newly generated block to the distributed account book (i.e., the block chain) by using the link, so that the miners ' nodes obtain the transaction fee and the block-out reward. This has attracted the attention of many attackers, since ethernet coins have a high economic value. It should be noted that, even for miners with high calculation power, the capability of generating new blocks is also high, and the probability of obtaining the accounting right in the consensus protocol is also high. One extreme case is that when an attacker has the vast majority (> 51%) of the computing power, a 51% attack can be performed, and the ledger information can be arbitrarily changed by forking, thereby obtaining illegal profits (e.g., doublespinning). In a blockchain, forking is mainly divided into two cases: normal forking and malicious forking. Normal forking is caused by protocol modification or simultaneous discovery of new blocks by multiple honest miners. Malicious forking is the intentional forking by an attacker through some attack algorithms in order to gain more profit. Miners with less calculation power integrate the calculation power of the miners to form a mine pool, mine excavation is carried out according to the overall calculation power of the mine pool, and if a new block is found in the mine pool, rewards are distributed according to proportion. When the mine is developed to a certain scale, 51% of attacks are extremely easy to carry out, so that miners with little calculation power still have the opportunity to obtain greater benefits. The mine may also cause other attacks such as selfish mining attacks, stubborn attacks, and the like. These strategic attacks seriously undermine the economic ecology of cryptocurrency systems, affecting their benign development. Therefore, the security problem caused by such attacks has been a focus. One solution to this type of attack is to increase the scale of honest miners and construct consensus protocols that encourage compatibility.
Disclosure of Invention
The invention aims to provide a mixed currency quality updating rule with maximized income, which comprises the following specific processes: when a selfish pool digs a new block, the block is temporarily reserved, and then the block is disclosed at a proper time so as to invalidate the blocks of other rational miners and increase the relative income of the selfish pool; when a block is dug out, the selfish pool refers to uncalebock which is not referred to in the network to obtain nephewrewarded; and the selfish mine pool can also be bribery attacked when a new block is dug; the specific method comprises the following steps: when a new block is dug on the private chain from the private mine pool, a certain extra reward is added on the block, and a miner who subsequently digs the next block on the private chain can obtain the extra reward; when competitive forking occurs in the Ethernet factory network, a briy attack can attract a part of rational miners to work on the branches of the selfish pool, so that the probability that the private chain of the selfish pool becomes the longest legal chain is increased, and the income of the selfish pool is increased globally;
establishing a Markov decision process model for a BSM-Ether algorithm through reinforcement learning, wherein the model is defined as a quadruple, S represents a state space, A represents an action set, P represents a state probability transition matrix, and R is a reward matrix; the following section will describe M components in detail:
(1) state space S: the state of the markov decision process at any time is < la, lh, optional, unity >, wherein la represents the private chain length of the selfish pool, lh represents the public chain length, and optional can take any value in the set { irrelevant, relevant, active }, and the meaning of each value is as follows:
a) if the current state is < la, lh, irrelevant, uncle >, the last state is < la-1, lh, optional, uncle >, which means that a new block is dug out from the selfish pool during state transfer;
b) the previous state of < la, lh, relevant, uncle > is < la, lh-1, optional, uncle >, which means that a new block is excavated by a rational miner;
c) if the current state is < la, lh, active, circle >, it means that the private chain loses the leading advantage due to the fact that a rational miner digs a new block, and the selfish pool executes the match action (the definition of the match is given in the action set), so that the block chain network is divided into two parts, namely competition branches;
the value of the uncle is exist or none; exist represents that the calculation of uncle rewarded needs to be considered when the state of the current system is transferred, and none represents that uncle rewarded does not need to be calculated;
(2) action set A: represents a set of policies selectable by a selfish attacker in a certain system state:
a) adopt: the selfish attacker gives up the operation of the private chain and chooses to dig a mine on the public chain; when the private chain length is behind the public chain (i.e. la < lh), selecting to execute the action;
b) wait: an attacker does not publicize any block and continues to dig a mine on the private chain branch; when la > lh, this action is performed;
c) override: when the system state is < la, lh, replace, uncle > or < la, lh, active, exit >, wherein la ═ lh +1 and lh is greater than or equal to 1, the attacker executes the action, and publishes and broadcasts all blocks on the private chain, so that the blocks on the corresponding public chain are invalid;
d) match: a new block is discovered by rational miners, so that the length of the public chain is equal to that of the private chain, namely la equals lh, at the moment, an attacker publishes all blocks on the private chain, and performs bribery attack on the current private chain, so that the probability that the private chain becomes the longest legal chain is increased;
(3) state transition matrix P: in the current Markov model, each state is a quadruple expressed as < la, lh, optional, unity >, the initial state of the model is <0,1, relevant, none >, and then a selfish attacker and a rational miner dig a mine on the Markov model with calculated forces alpha and 1-alpha respectively; an attacker selects an optimal action according to the current state, and then the model is transferred to the next state;
(4) reward matrix R: the attacker selects an action in each state, then the model is transferred to the next state, and when the state is transferred, the attacker obtains corresponding rewards in the form of binary groups < r _ la, r _ lh >, wherein r _ la represents the rewards obtained by the attacker, and r _ lh represents the rewards obtained by rational miners.
The method can construct a Markov decision process for the bribery selfie mining attack under the condition of considering rational miners and intelligent selfish miners, and optimizes an attack algorithm by utilizing reinforcement learning. The invention achieves the following effects: in the EtherFang network, all participants are rational, and there are intelligent attackers, the attackers use the machine learning thought, through constructing Markov decision process, choose the appropriate parameter, achieve the effects of minimum threshold value, maximum income. The method is suitable for the safety of the block chain system adopting the workload certification consensus mechanism.
Drawings
FIG. 1 BSM-Ether state machine diagram
FIG. 2 BSM-Ether graph comparing profits from other algorithms
FIG. 3 is a BSM-Ether and BingtongN network attack algorithm comparison diagram
Detailed Description
Fig. 1 illustrates the state transition process of the system. The system state is initialized to <0,1, Relevant, none > first, and the action space is initialized with the set { adopt, wait, override, match }. When the attacker digs a new block, the function onselfishool (state) is called, the attacker refers to an unreferenced un-referenced un-block in the network and implements fragment attack. When a rational miner digs out a new block, the function OnRationMiners (state) is called, an attacker takes corresponding action according to the current system state, and then the function computeReward (state) is called to calculate the reward.
The BSM-Ether algorithm comprises the following specific steps:
(1)state←<0,1,relevant,none>
(2)actionSpace←{adopt,wait,override,match}
(3)Function OnSelfishPool(state)
(4)references some unreferenced uncle blocks
(5)perform bribery attack
(6)la←la+1
(7)If la==lh+1 and lh>=1 then
(8)take the action==override
(9)state←<la,lh,active,exist>
(10)computeReward(state,override)
(11)Else
(12) take the action==wait
(13)state←<la,lh,irrelevant,none>
(14)computeReward(state,wait)
(15)EndIf
(16)EndFunction
(17)Function OnRationalMiners(state)
(18)the miner references some unreferenced uncle blocks
(19)lh←lh+1
(20)If la<lh then
(21)take the action==adopt
(22)state←<la,lh,relevant,exist>
(23)computeReward(state,adopt)
(24)Else If la==lh then
(25)take the action==match
(26)state←<la,lh,relevant,none>
(27)computeReward(state,match)
(28)Else If la==lh+1 then
(29)take the action==override
(30)state←<la,lh,relevant,none>
(31)computeReward(state,override)
(32)Else
(33)take the action==match
(34)state←<la,lh,active,exist>
(35)computeReward(state,match)
(36)EndIf
(37)EndFunction
validation of the invention
To verify the validity and superiority of the algorithm, BSM-Ether is compared with SM1 in Ethereum and Honest Mining in FIG. 2. For the SM1 algorithm, the SM1 algorithm will only obtain higher relative gains than honest mining if the attacker's computing power is higher than 16.3%, whereas for the BSM-Ether algorithm, the attacker only needs to have more than 8.5% computing power to obtain higher relative gains than honest mining. Moreover, the BSM-Ether algorithm has a higher relative gain than the SM1 algorithm in case the attacker has equal strength. Therefore, the BSM-Ether algorithm not only obviously reduces the calculation threshold of the selfish excavation, but also increases the relative benefit of the selfish excavation. In fig. 3, the BSM-Ether algorithm is compared with the bitcoin network attack algorithm. BSM-Ether in etherhouse has a higher relative yield curve than IPBSM algorithm in bitcoin, because the uncle rewarded existing in etherhouse can provide additional yield for the attacker. Not only here, uncle rewarded also compensates for the cost of partially implementing selfish mining, so that BSM-Ether has a lower computational threshold than IPBSM. This indicates that the etherhouses are more vulnerable to selfish mining than bitcoin, and therefore there is a need to improve the security of the etherhouses.