bitcoin-learning-old

Warning

This article is a total 100% piece of s^_^t that only myself can understand it.

I kindly request you to close the tab and refrain from reading more.

Youtube Princeton Bitcoin/Blockchain Course

Bitcoin is a blockchain-based cryptocurrency. Block addresses, transaction values, and other information are all made public on the blockchain. Hash and signature are the two cryptographic concepts that Bitcoin primarily uses.

Hash function

A hash function accepts any string( or bitstream ) as input, and outputs a fixed-length string. It is efficiently computable, thus it can be used extensively.

Security properties

Each of the hash functions MUST meet these standards:

Collision-free
Hiding
Puzzle-friendly

Collision-free

We define a collsion that

$ a ≠ b\space and\space H(a) = H(b) $

Commonly, collisions can be discovered using the following methods:

Brute-force: Calculates all potential inputs.
Identify exploits in hash functions.

It is unavoidable since a hash function maps inifinite inputs to finite outputs. However, being collision-free does not mean there is no collision. It actually means that no one can FIND a collision because it is extremely hard to find it.

Therefore, we choose to believe it has no collisions

$H(x)=H(y)$, it is safe to assume that $x=y$.

What can hash do?

Message digest: to check whether or not the message has been altered.

Hiding

Given $H(x)$, it is infeasible to find $x$. Hash functions should be one-way.

The prerequisite is that $x$ must be taken from a fairly wide range in order to guarantee hiding. If not, we must concatenate a $nonce$ from a probability distribution with a high-min entropy, such as a set of 256-bit values, and then perform hash operations like $H(nonce||\space x)$.

Puzzle-friendly

For every possible output value $y$, if $k$ is chosen from a distribution with high min-entropy, then it is infeasible to find $x$ such that $H(k\space ||\space x)=y$.

There is no solving strategy that is much better than trying random values of $x$.

In summary, it have to be arduous to solve, but easy to verify.

SHA-256

Secure hash algorithm 256-bit, SHA-256, which the bitcoin uses.

SHA-256 Working Mechanism

Hash pointers

Hash pointers is NOT a pointer, it is a STRUCT that contains a pointer to the previous block and a hash of that block.

Hash Pointer

Blockchain

Most data structures, including Chained List( for example the Blockchain ), can be constructed using hash pointers.

Blockchain

The initial block produced by the system is known as the Genesis Block, and the most recent block we added, is pointed to by a hash pointer that we should remember and record.

Tampering Detection in Blockchain

We can quickly determine whether the entire chain has been modified thanks to the hash pointers.

Consider the scenario when an attacker changed data in Block 2. However, block 3 contains a hash pointer that points to block 2 and stores a hash of it; since the current hash differs from that stored there, we can infer that the data in block 2 has been altered.
Attackers must edit block 3, block 4, all the way to the final block in order to continue the story. The final block, though, is what we remebered and recorded, so attackers cannot modify it!

Merkle Tree

Merkle tree

A binary tree with hash pointers.

Advantages:

It is easier to demonstrate membership in it through $\Theta(\log{}n)$ time and space.

Variant: Sorted Merkle tree.

Can verify non-membership in $\Theta(\log{}n)$.

Not a cycle

You can replace plain pointers to hash pointers in any pointer-based data structure WITHOUT CYCLES. Because hashes in cycled structure couldn’t match.

Digital Signatures

What we desire:

Anyone can verify, but only you can sign.
A signature is attached to a certain document and cannot be copied and pasted to another.

Generate a pair – (secret signing key, public verification key). Use a secret key to sign a message, and returns a signature. Publish your public key to anyone who wants to verify the message, so they can easily verify your message using the public key, message, and signature.

Requirements:

Valid signatures verify

1	verify(public_key, message, sign(secret_key, message)) == true

can’t forge signatures

An adversary who knows the public key and gets to see signatures on messages of his choice can’t produce a verifiable signature on another message. Attackers can’t forge a message with a valid signature.

Practical things…

Algorithms are randomized: If not, the private key might be leaked.
Limit on message size: fix – Hash(message)

Sign a hash pointer

Not to sign the STRUCT, but to sign the whole previous chain.

ECDSA

Elliptic Curve Digital Signature Algorithm.

It is a standard that relies on hairy math and has good randomness.

Bitcoin uses the ECDSA standard.

Public Keys as Identities

If verify(public_key, message, signature) == true, then we can think: public_key says “{message}”. To speak as public_key, you need to know private_key to sign your message.

How to make a new identity

Create a new random key pair (secret_key,public_key). public_key is your public “name” you can use (usually better to use Hash(public_key), while secret_key lets you “speak for” the identity.

Since only you know the secret_key, you are in charge of the identity.

Decentralized identity management

Anybody can create as many different identities as they wish at any moment.

No central point of control, and no one can control you.

These identities are called *addresses* in Bitcoin. The address is a public key or a hash of the public key.

Privacy

Addresses are not directly related to a person’s identity in the actual world.
But observers can infer your identity by your behavioral pattern over time.

A Simple Cryptocurrency

A-Very-Simple-Coin

Admin can create new coins. New coins belong to him.

Create

Coin owners can pay them to someone.

Trade

Double-spending attack

The coin owner can pay one coin to multiple identities without awareness.

Double-Spending Attack

A-Better-Coin

The admin will publish a history of all transactions by a signed blockchain. Every block is marked with a transaction ID.

Better coin chain

Coins Creation:

Better creation

Trade:

Better payment

Coins are Immutable

Coins can’t be transferred, subdivided, or combined.

However, you can perform these activities using transactions, like subdividing.

You can start a new transaction, spend your coin, and pay out TWO NEW COINS to yourself since transactions work by consuming old coins and producing new coins with an identical total value.

Centralized Problem

In these two case, every transaction must be SIGNED BY THE ADMIN, AND THE ADMIN MUST BE RELIABLE. If he wants to mess it up, he can simply stop signing or sign the wrong blocks.

Decentralization

I want to start by stating that hardly any systems are actually decentralized or centralized.

For instance, webmail services are centralized, despite the fact that email uses a decentralized protocol (SMTP).

Aspects of decentralization in Bitcoin:

Who maintains the ledger?
Who has authority over which transactions are valid?
Who creates new bitcoins?
Who determines how the rules of the system change?
How do bitcoins acquire exchange value?

Other centralized things BEYOND PROTOCOL: exchanges, wallet software, and service providers.

P2P: open to anyone, low barrier to entry.
Mining: open to anyone, but need high costs.
Updates to software: core developers trusted by the community, have great power.

Distributed consensus

A distributed system may have a massive amount of nodes. Values in each node have to be the same.

Bitcoin is a peer-to-peer system

When a person wants to pay another person, he or she broadcasts the transaction to all Bitcoin nodes.

The one who received Bitcoin doesn’t need to run a Bitcoin node to receive the Bitcoin.

How consensus could work in Bitcoin

At any given time:

All nodes have a sequence of blocks of transactions they’ve reached a consensus on
Each node has a set of outstanding transactions it’s heard about.

Consensus is hard

Nodes may be malicious
Nodes may crash
Network is imperfect
- Not all pairs of nodes connected
- Faults in network
- Latency

Impossibility results:

Byzantine generals problem.
Fischer-Lynch-Paterson: consensus impossible with a single faulty node.

Well-known protocols: Paxos

Paxos never produces an inconsistent result, but can rarely get stuck.

These results say more about the model than about the problem. The models were developed to study the system of distributive databases. Bitcoin consensus works better in practice than in theory.

But theory is important, can help us predict unforeseen attacks.

Some things Bitcoin does differently

Introduces incentives: possible only because it is a currency.
Embraces randomness.
- Does away with the notion of a specific end-point.
- Consensus happens over long time scales – about 1 hour.

The probability of your transactions being merged into the consensus blockchain rises as time goes on, at the end, you won’t say it’s 100%, but the probability of you’re wrong goes to a very small value. This is a kind of guarantee that Bitcoin gives you.

Consensus without identity

Every bitcoin node does not have a long-lasting identity.

Why identity?

Pragmatic: some protocols need node IDs.
Security: assume less than 50% nodes are malicious.

Why don’t Bitcoin nodes have identities?

Identity is hard in a P2P system — it is decentralized, and no authorities could prove it. (Sybil attack)
Pseudonymity is the goal of Bitcoin.

A simplified consensus algorithm

New transactions are broadcast to all nodes.
Each node collects new transactions into a block.
In each round, a random node gets to broadcast its block.
Other nodes accept the block only if all transactions in it are valid (unspent, valid signatures..).
Nodes express their acceptance of the block silently by including its hash in the next block they create.

What can a malicious node do?

Double-spending attack.

double spending attack

The one that nodes previously got will be preferred. But we are unsure whether transaction is earlier due to the latency. It is easy to get completely opposite results.

If the next node proposes to extend the bad block, the longest valid chain will contain the bad block, and the chain that holds the good block will be ignored.

How to solve it?

In this case, the receiver trusts the attacker, he accepts the trade once he hears about the Bitcoin transaction with 0 confirmations.

He can wait until the following block is suggested. to determine whether the faulty block is exisited or continued.

confirmations to avoid Double spending attacking

With more confirmations, there is an exponentially lower chance of double spending. The most common heuristic: wait for 6 confirmations.

Incentives and PoW

Can we penalize the node that created the malicious block? No, because nodes have no identity.

Can we reward nodes that created honest blocks? Yes, by giving them Bitcoin.

Incentive 1: block reward

The creator of the block gets to…

include a special coin-creation transaction in the block
choose the recipient address of this transaction

The Block creator gets to collect the reward only if the block ends up on a long-term consensus branch.

Reward value: currently (At 2022) 6.25 BTC, halves every 4 years.

Therefore, there is a limited supply of bitcoins, 21 million in total.

This is the only way of Bitcoin creation!

Incentive 2: transaction fees

The creator of a transaction can choose to make the output value less than the input value. Purely voluntary.

Proof of work(PoW)

To approximate selecting a random node: select nodes in proportion to a resource that no one can monopolize.

In proportion to computing power: proof-of-work
In proportion to ownership: proof-of-stake

Select nodes in proportion to computing power. Let nodes compete for the right to create a block. And make it moderately hard to create new identities.

Hash puzzles

To create block, find $nonce$ such that $H(nonce\space ||\space prevHash \space ||\space tx\space ||\space ..\space ||\space tx)$ is very small($<\space target$). Where $tx$ stands for transactions, $target$ is much smaller than 1% of total outputs.

target space in hash puzzles

If H() is secure, the only way to succeed is to try enough nonces until you get lucky.
Attacks are infeasible if a majority of miners weighted by hash power follow the protocol.

Pow property 1: difficult to compute

As of Aug 2014: about $10^{20}$ hashes each block.

Only some nodes bother to compete — miners.

Pow property 2: parameterizable cost

Nodes automatically re-calculate the $target$ every two weeks.

Goal: average time between blocks = 10 min.

$ Prob(\text{Alice wins next block})\space =\space \text{fraction of global hash power she controls} $

For individual miners:

$ \text{mean time to find block} \space =\space \frac{ \text{10 minutes} }{ \text{fraction of hash power} } $

Pow property 3: trivial to verify

$nonce$ must be published as part of the block. Easy for other miners to verify via 1 time Hash function.

Mining economics

$ \mathop{}_{\text{block reward + Tx fees}}^{\text{mining reward}}\space < \space\text{hardware + electricity cost} \Rightarrow \text{Profit} $

Bitcoin has three types of consensus

Value
State
Rules

Append-only ledger

Decentralized consenus

Miners to validate transactions.

Mechanics in Bitcoin

Bitcoin transactions

Account-based ledger is not used in Bitcoin because it is hard to tell the value of an account.

Alternatively, Bitcoin uses a trasaction-based ledger, where each of the transactions has a unique identifier, an input value, and an output value. You must spend all your Bitcoins in this kind of transactions, even you can’t pay all your Bitcoins to others, in this case you have to pay the Bitcoin you left to your self, and this is known as address changing. How to check if your Bitcoins are enough to pay? Just check the inputs where a pointer points to the previous block which contains your transaction.

A transacation-based ledger

And yes, you can do multiple to single transactions call joint transaction, it only needs multiple signatures from every payers.

Joint transaction

An trasaction in JSON-like text

In real case, this will be a compact binary format which couldn’t be understood by human.

{
  "hash": "5a42590fbe0a90ee8e8747244d6c84f0db1a3a24e8f1b95b10c9e050990b8b6b",
  "ver": 1,
  "vin_sz": 2,
  "vout_sz": 1,
  "lock time": 0,
  "size": 404,
  "in": [
    {
      "prev_out": {
        "hash": "3be4ac9728a0823cf5e2deb2e86fc0bd2aa503a91d307b42ba76117d79280260",
        "n": 0
      },
      "scriptSig": "30440....3f3a4ce81"
    },
    {
      "prev_out": {
        "hash": "7508e6ab259b4df0fd5147bab0c949d81473db4518f81afc5c3f52f91ff6b34e",
        "n": 0
      },
      "scriptSig": "304602210....3f3a4ce81"
    }
  ],
  "out": [
    {
      "value": "10.12287097",
      "scriptPubKey": "OP_DUP OP_HASH160 69e02e18b5705a05dd6b28ed517716c894b3d42e OP_EQUALVERIFY OP CHECKSIG"
    }
  ]
}

Metadata
- hash: unique identifier
- ver, vin_sz, vout_sz, size: housekeeping information
- lock_time: not valid before… (later for details)
Inputs: A list
- prev_out: previous trasaction
- scriptSig: signature, sign to pay
Outputs: A list
- value: output value
- scriptPubKey: Bitcoin scripts, recipient address

Bitcoin Scripts

Addresses in both inputs and outputs are all scripts. We concatenate input scripts and output scripts, if it can run with no error, then this transaction is valid.

Design Goals

Built for Bitcoin.
Simple, Compact.
Support for crypotography.
Stack-based: No variable, easy for data processing.
Limits on time / memory.
No looping: It is not turing complete, so we can ignore halting problem. So node owners shouldn’t worry about infinite-loop scripts that random users submit.

Simple script

1	<sig> <pubkey> OP_DUP OP_HASH160 <pubKeyHash?> OP_EQUALVERIFY OP_CHECKSIG

<sig> <pubKey>
OP_DUP: duplicate the top value and push it into top, in this case the <pubKey>
OP_HASH160: take the top value and hash it, in this case it converts <pubKey> to a hash of <pubKey> – <pubKeyHash>
<pubKeyHash?>: specified by sender
OP_EQUALVERIFY: to verify are the two values at top of the stack equal, if not, errors occur, script stop excuting. After the verification, that two values will pop off, in this case only <sig> and that duplicated <pubKey> is on the stack now.
OP_CHECKSIG: to check if the transaction is signed.
No error, returns OK, the transaction is being merged into the chain.