Ethereum Smart Contract Virtual Machine (EVM): Principles and Implementation

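At its core, Ethereum supports contract execution and invocation through the EVM module. On a call, the code is fetched by contract address, an execution environment is built, and the code is loaded into the EVM and run. The usual development workflow is to write the logic in Solidity, compile it with the compiler into bytecode and metadata, and then deploy it to Ethereum.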

[Figure: 01220829_zhgu[1].png]

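Code structure
.
├── analysis.go            // jump destination analysis
├── common.go
├── contract.go            // contract data structure
├── contracts.go           // precompiled contracts
├── errors.go
├── evm.go                 // executor; exposes the external interfaces
├── gas.go                 // call gas calculation; gas tiers for basic instructions
├── gas_table.go           // table of per-instruction gas cost functions
├── gen_structlog.go
├── instructions.go        // instruction implementations
├── interface.go
├── interpreter.go         // interpreter; the core of execution
├── intpool.go             // pool of reusable big.Int values
├── int_pool_verifier_empty.go
├── int_pool_verifier.go
├── jump_table.go          // maps each opcode to its operation (execute, gas cost, validation)
├── logger.go              // state logging
├── memory.go              // EVM memory
├── memory_table.go        // EVM memory size functions: how much memory each operation needs
├── noop.go
├── opcodes.go             // opcode definitions and name mappings
├── runtime
│   ├── env.go             // execution environment
│   ├── fuzz.go
│   └── runtime.go         // run interface, used for testing
├── stack.go               // stack
└── stack_table.go         // stack validation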

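Instructions
OpCode
opcodes.go defines all the OpCodes. An OpCode is a single byte: in the bytecode a contract compiles to, each OpCode occupies one byte. The opcodes fall into 9 groups by function (arithmetic, block operations, cryptography, and so on).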

// Arithmetic opcodes
    const (
        // 0x0 range - arithmetic ops
        STOP OpCode = iota
        ADD
        MUL
        SUB
        DIV
        SDIV
        MOD
        SMOD
        ADDMOD
        MULMOD
        EXP
        SIGNEXTEND
    )
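
To make the "one OpCode, one byte" point concrete, here is a minimal sketch of walking bytecode byte by byte. The `opNames` table and the `disassemble` function are hypothetical helpers written for this illustration and cover only four opcodes; they are not part of go-ethereum.

```go
package main

import "fmt"

// OpCode is a single byte of EVM bytecode.
type OpCode byte

// A handful of opcodes with their standard byte values.
const (
	STOP  OpCode = 0x00
	ADD   OpCode = 0x01
	MUL   OpCode = 0x02
	PUSH1 OpCode = 0x60
)

// opNames is a hypothetical lookup table covering only the opcodes above.
var opNames = map[OpCode]string{STOP: "STOP", ADD: "ADD", MUL: "MUL", PUSH1: "PUSH1"}

// disassemble walks the bytecode one byte at a time. PUSH1 is followed by
// one byte of immediate data, which is not an opcode and must be skipped.
func disassemble(code []byte) []string {
	var out []string
	for pc := 0; pc < len(code); pc++ {
		op := OpCode(code[pc])
		name, ok := opNames[op]
		if !ok {
			name = fmt.Sprintf("UNKNOWN(0x%02x)", byte(op))
		}
		if op == PUSH1 && pc+1 < len(code) {
			pc++
			name = fmt.Sprintf("PUSH1 0x%02x", code[pc])
		}
		out = append(out, name)
	}
	return out
}

func main() {
	// PUSH1 2, PUSH1 3, ADD, STOP
	fmt.Println(disassemble([]byte{0x60, 0x02, 0x60, 0x03, 0x01, 0x00}))
	// prints: [PUSH1 0x02 PUSH1 0x03 ADD STOP]
}
```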

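Instruction
jump_table.go defines four instruction sets, each of which is in essence an array of length 256. They are named Frontier, Homestead, Byzantium, and Constantinople, after four stages of Ethereum's development. Each later set keeps the instructions of the earlier sets and adds new ones.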

    frontierInstructionSet       = NewFrontierInstructionSet()
    homesteadInstructionSet      = NewHomesteadInstructionSet()
    byzantiumInstructionSet      = NewByzantiumInstructionSet()
    constantinopleInstructionSet = NewConstantinopleInstructionSet()
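
The way a later set builds on an earlier one can be sketched as follows. This is a simplified illustration, not the go-ethereum code: the `operation` struct is reduced to a name and a validity flag, and the constructor names are made up. The opcode assignments used (DELEGATECALL added in Homestead; STATICCALL and REVERT added in Byzantium) are the real ones.

```go
package main

import "fmt"

// operation is a pared-down stand-in for the real struct: only a name and
// a validity flag.
type operation struct {
	name  string
	valid bool
}

// newFrontierSet fills in a few Frontier-era opcodes.
func newFrontierSet() [256]operation {
	var set [256]operation
	set[0x00] = operation{"STOP", true}
	set[0x01] = operation{"ADD", true}
	set[0xf1] = operation{"CALL", true}
	return set
}

// newHomesteadSet starts from the Frontier set and adds DELEGATECALL.
func newHomesteadSet() [256]operation {
	set := newFrontierSet() // arrays are values in Go, so this is a copy
	set[0xf4] = operation{"DELEGATECALL", true}
	return set
}

// newByzantiumSet starts from Homestead and adds STATICCALL and REVERT.
func newByzantiumSet() [256]operation {
	set := newHomesteadSet()
	set[0xfa] = operation{"STATICCALL", true}
	set[0xfd] = operation{"REVERT", true}
	return set
}

func main() {
	frontier, byzantium := newFrontierSet(), newByzantiumSet()
	fmt.Println(frontier[0xfd].valid, byzantium[0xfd].valid) // false true
	fmt.Println(frontier[0x01].name == byzantium[0x01].name) // true
}
```

Slots that are never assigned keep `valid == false`, which is how an unknown byte ends up rejected as an invalid opcode.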

Each instruction has the following structure; the comments explain the fields.

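type operation struct {
    // execute is the function that performs the operation
    execute executionFunc
    // gasCost computes the gas the operation consumes
    gasCost gasFunc
    // validateStack checks the stack depth
    validateStack stackValidationFunc
    // memorySize returns the memory the operation needs
    memorySize memorySizeFunc

    halts   bool // halts execution
    jumps   bool // jumps: the program counter is not auto-incremented
    writes  bool // modifies state
    valid   bool // the operation is valid
    reverts bool // reverts state (REVERT)
    returns bool // sets the return data
}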

Take the ADD instruction below as an example.

Definition
    ADD: {
        execute:       opAdd,
        gasCost:       constGasFunc(GasFastestStep),
        validateStack: makeStackFunc(2, 1),
        valid:         true,
    },

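Operation
Operations differ from instruction to instruction. Depending on the instruction, an operation may affect the stack, memory, or the statedb.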

func opAdd(pc *uint64, evm *EVM, contract *Contract, memory *Memory, stack *Stack) ([]byte, error) {
        // Pop x and peek at y; y stays on the stack and is overwritten with the result
        x, y := stack.pop(), stack.peek()
        // Add, wrapping the result to 256 bits
        math.U256(y.Add(x, y))
        // Return x to the integer pool for reuse
        evm.interpreter.intPool.put(x)
        return nil, nil
    }

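Gas cost
Different operations have different base costs and calculation methods; the methods are all defined in gas_table.go. Taking addition as the example, one ADD has a fixed cost of 3.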

// Fixed cost
    func constGasFunc(gas uint64) gasFunc {
        return func(gt params.GasTable, evm *EVM, contract *Contract, stack *Stack, mem *Memory, memorySize uint64) (uint64, error) {
            return gas, nil
        }
    }

Besides these, two more definitions affect gas calculation; they generally serve as the units in which cost is measured.

//file go-ethereum/core/vm/gas.go
    const (
        GasQuickStep   uint64 = 2
        GasFastestStep uint64 = 3
        GasFastStep    uint64 = 5
        GasMidStep     uint64 = 8
        GasSlowStep    uint64 = 10
        GasExtStep     uint64 = 20

        GasReturn       uint64 = 0
        GasStop         uint64 = 0
        GasContractByte uint64 = 200
    )

    //file go-ethereum/params/gas_table.go
    type GasTable struct {
        ExtcodeSize uint64
        ExtcodeCopy uint64
        Balance     uint64
        SLoad       uint64
        Calls       uint64
        Suicide     uint64

        ExpByte uint64

        // CreateBySuicide occurs when the
        // refunded account is one that does
        // not exist. This logic is similar
        // to call. May be left nil. Nil means
        // not charged.
        CreateBySuicide uint64
    }
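
To see how fixed costs add up, here is a small sketch that charges a sequence of instructions against a gas budget. `gasFunc` is deliberately simplified to take no arguments, and `constGas` and `useGas` are illustrative names. The tier values come from the constants above; PUSH1 and ADD are charged GasFastestStep and MUL is charged GasFastStep.

```go
package main

import "fmt"

// Two of the gas tiers listed in gas.go.
const (
	GasFastestStep uint64 = 3
	GasFastStep    uint64 = 5
)

// gasFunc is simplified: the real signature also receives the gas table,
// EVM, contract, stack, and memory.
type gasFunc func() uint64

// constGas returns a gasFunc that always reports the same cost.
func constGas(gas uint64) gasFunc {
	return func() uint64 { return gas }
}

// useGas deducts cost from *remaining and reports false when there is not
// enough gas left, leaving *remaining untouched in that case.
func useGas(remaining *uint64, cost uint64) bool {
	if *remaining < cost {
		return false
	}
	*remaining -= cost
	return true
}

func main() {
	// PUSH1, PUSH1, ADD (3 gas each), then MUL (5 gas).
	program := []gasFunc{
		constGas(GasFastestStep),
		constGas(GasFastestStep),
		constGas(GasFastestStep),
		constGas(GasFastStep),
	}
	remaining := uint64(10)
	for i, cost := range program {
		if !useGas(&remaining, cost()) {
			fmt.Println("out of gas at step", i, "with", remaining, "left")
			return
		}
	}
	fmt.Println("finished with", remaining, "gas left")
}
```

With a budget of 10, the first three instructions leave 1 gas, so the MUL at step 3 cannot be paid for.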

memorySize
ADD needs no memory, so its memorySize function is left unset (nil) and the computed memory size stays at 0.

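Stack validation
First check that the stack holds enough operands, then check that the stack will not exceed its maximum size. ADD pops two operands and pushes one result, so the stack shrinks by one and only the operand check can fail.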

func makeStackFunc(pop, push int) stackValidationFunc {
        return func(stack *Stack) error {
            // depth check: are there enough operands?
            if err := stack.require(pop); err != nil {
                return err
            }
            // upper-bound check
            // StackLimit       uint64 = 1024
            if stack.len()+push-pop > int(params.StackLimit) {
                return fmt.Errorf("stack limit reached %d (%d)", stack.len(), params.StackLimit)
            }
            return nil
        }
    }
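
The two checks can be exercised with plain integers. The `validate` function below is a simplified sketch that takes the current stack depth instead of a `*Stack`; its name and error messages are made up for this example.

```go
package main

import (
	"errors"
	"fmt"
)

const stackLimit = 1024 // same value as params.StackLimit

// validate is a simplified form of the check: depth is the current number
// of stack items, pop and push are what the instruction consumes and
// produces.
func validate(depth, pop, push int) error {
	if depth < pop {
		return errors.New("stack underflow")
	}
	if depth+push-pop > stackLimit {
		return errors.New("stack limit reached")
	}
	return nil
}

func main() {
	fmt.Println(validate(2, 2, 1))    // ADD with two operands: <nil>
	fmt.Println(validate(1, 2, 1))    // ADD with one operand: stack underflow
	fmt.Println(validate(1024, 0, 1)) // PUSH on a full stack: stack limit reached
}
```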

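Smart contracts
A Contract is the unit in which the EVM holds a smart contract and the basic unit the interpreter executes. It contains the code, the caller, the owner (self), and gas information.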

type Contract struct {
        // CallerAddress is the result of the caller which initialised this
        // contract. However when the "call method" is delegated this value
        // needs to be initialised to that of the caller's caller.
        CallerAddress common.Address
        caller        ContractRef
        self          ContractRef

        jumpdests destinations // result of JUMPDEST analysis.

        Code     []byte
        CodeHash common.Hash
        CodeAddr *common.Address
        Input    []byte

        Gas   uint64
        value *big.Int

        Args []byte

        DelegateCall bool
    }

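The EVM ships with a set of natively implemented (precompiled) contracts, defined in contracts.go. They are mainly used for cryptographic operations.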

// PrecompiledContractsByzantium contains the default set of pre-compiled Ethereum
// contracts used in the Byzantium release.
var PrecompiledContractsByzantium = map[common.Address]PrecompiledContract{
    common.BytesToAddress([]byte{1}): &ecrecover{},
    common.BytesToAddress([]byte{2}): &sha256hash{},
    common.BytesToAddress([]byte{3}): &ripemd160hash{},
    common.BytesToAddress([]byte{4}): &dataCopy{},
    common.BytesToAddress([]byte{5}): &bigModExp{},
    common.BytesToAddress([]byte{6}): &bn256Add{},
    common.BytesToAddress([]byte{7}): &bn256ScalarMul{},
    common.BytesToAddress([]byte{8}): &bn256Pairing{},
}

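Execution machine

The EVM uses a stack to hold operands. Each operand is a *big.Int holding a value of up to 256 bits, which is why many people call the EVM a 256-bit virtual machine. When an opcode executes, its operands are popped from the top of the stack and used as its arguments.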

type Stack struct {
    data []*big.Int
}

func (st *Stack) push(d *big.Int) {
    // NOTE push limit (1024) is checked in baseCheck
    //stackItem := new(big.Int).Set(d)
    //st.data = append(st.data, stackItem)
    st.data = append(st.data, d)
}

func (st *Stack) peek() *big.Int {
    return st.data[st.len()-1]
}

func (st *Stack) pop() (ret *big.Int) {
    ret = st.data[len(st.data)-1]
    st.data = st.data[:len(st.data)-1]
    return
}

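Memory
Memory is used by the memory instructions (MLOAD, MSTORE, MSTORE8) and for copying arguments in contract calls (CALL, CALLCODE).

The memory data structure maintains a byte slice. Reads and writes through MLOAD and MSTORE must specify an offset and a length to hit the right bytes.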

type Memory struct {
        store       []byte
        lastGasCost uint64
    }

    // Set sets offset + size to value
    func (m *Memory) Set(offset, size uint64, value []byte) {
        // length of store may never be less than offset + size.
        // The store should be resized PRIOR to setting the memory
        if size > uint64(len(m.store)) {
            panic("INVALID memory: store empty")
        }

        // It's possible the offset is greater than 0 and size equals 0. This is because
        // the calcMemSize (common.go) could potentially return 0 when size is zero (NO-OP)
        if size > 0 {
            copy(m.store[offset:offset+size], value)
        }
    }

    func (self *Memory) Get(offset, size int64) (cpy []byte) {
        if size == 0 {
            return nil
        }

        if len(self.store) > int(offset) {
            cpy = make([]byte, size)
            copy(cpy, self.store[offset:offset+size])

            return
        }

        return
    }

Memory operations

func opMload(pc *uint64, evm *EVM, contract *Contract, memory *Memory, stack *Stack) ([]byte, error) {
        offset := stack.pop()
        val := evm.interpreter.intPool.get().SetBytes(memory.Get(offset.Int64(), 32))
        stack.push(val)

        evm.interpreter.intPool.put(offset)
        return nil, nil
    }

    func opMstore(pc *uint64, evm *EVM, contract *Contract, memory *Memory, stack *Stack) ([]byte, error) {
        // pop value of the stack
        mStart, val := stack.pop(), stack.pop()
        memory.Set(mStart.Uint64(), 32, math.PaddedBigBytes(val, 32))

        evm.interpreter.intPool.put(mStart, val)
        return nil, nil
    }

    func opMstore8(pc *uint64, evm *EVM, contract *Contract, memory *Memory, stack *Stack) ([]byte, error) {
        off, val := stack.pop().Int64(), stack.pop().Int64()
        memory.store[off] = byte(val & 0xff)

        return nil, nil
    }

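stateDb
A contract itself does not hold its data, so where is the data kept? A contract and the calls made to it resemble a database log: the log records the contract definition and every operation applied to it, and replaying those operations yields the current result. Replaying on every access would be too slow, so the resulting data is persisted in the stateDb. Two instructions, SSTORE and SLOAD, write and read a contract's current state in that db.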

func opSload(pc *uint64, evm *EVM, contract *Contract, memory *Memory, stack *Stack) ([]byte, error) {
        loc := common.BigToHash(stack.pop())
        val := evm.StateDB.GetState(contract.Address(), loc).Big()
        stack.push(val)
        return nil, nil
    }

    func opSstore(pc *uint64, evm *EVM, contract *Contract, memory *Memory, stack *Stack) ([]byte, error) {
        loc := common.BigToHash(stack.pop())
        val := stack.pop()
        evm.StateDB.SetState(contract.Address(), loc, common.BigToHash(val))

        evm.interpreter.intPool.put(val)
        return nil, nil
    }
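
Conceptually, SLOAD and SSTORE address a per-contract key/value space of 32-byte words. The sketch below models that with nested Go maps; the `stateDB` type here is a hypothetical in-memory stand-in and says nothing about how go-ethereum actually persists state.

```go
package main

import "fmt"

type address [20]byte
type word [32]byte

// stateDB is a hypothetical in-memory stand-in: every contract address
// owns its own key/value space of 32-byte words.
type stateDB map[address]map[word]word

// getState returns the stored word, or the zero word if nothing was set.
func (db stateDB) getState(a address, key word) word { return db[a][key] }

// setState stores val under key in the space belonging to a.
func (db stateDB) setState(a address, key, val word) {
	if db[a] == nil {
		db[a] = make(map[word]word)
	}
	db[a][key] = val
}

func main() {
	db := stateDB{}
	contractA, contractB := address{0xaa}, address{0xbb}
	slot0 := word{}
	db.setState(contractA, slot0, word{31: 0x7b}) // store 123 in slot 0 of A
	fmt.Println(db.getState(contractA, slot0)[31]) // 123
	fmt.Println(db.getState(contractB, slot0)[31]) // 0: unset slots read as zero
}
```

The same slot number in two different contracts refers to two independent values.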

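Execution flow
The entry point is defined in evm.go. Its job is to assemble the execution environment (code, caller relationship, arguments, and so on).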

func (evm *EVM) Call(caller ContractRef, addr common.Address, input []byte, gas uint64, value *big.Int) (ret []byte, leftOverGas uint64, err error) {
        if evm.vmConfig.NoRecursion && evm.depth > 0 {
            return nil, gas, nil
        }

        // call depth check
        if evm.depth > int(params.CallCreateDepth) {
            return nil, gas, ErrDepth
        }
        // balance check
        if !evm.Context.CanTransfer(evm.StateDB, caller.Address(), value) {
            return nil, gas, ErrInsufficientBalance
        }

        var (
            to       = AccountRef(addr)
            // save the current state; on error, roll back to this snapshot
            snapshot = evm.StateDB.Snapshot()
        )
        if !evm.StateDB.Exist(addr) {
            // the callee has no state object yet: create one
            precompiles := PrecompiledContractsHomestead
            if evm.ChainConfig().IsByzantium(evm.BlockNumber) {
                precompiles = PrecompiledContractsByzantium
            }
            if precompiles[addr] == nil && evm.ChainConfig().IsEIP158(evm.BlockNumber) && value.Sign() == 0 {
                return nil, gas, nil
            }
            evm.StateDB.CreateAccount(addr)
        }
        // a call to another contract may carry value, which is transferred here
        evm.Transfer(evm.StateDB, caller.Address(), to.Address(), value)

        // build the contract environment
        contract := NewContract(caller, to, value, gas)
        contract.SetCallCode(&addr, evm.StateDB.GetCodeHash(addr), evm.StateDB.GetCode(addr))

        start := time.Now()

        // Capture the tracer start/end events in debug mode
        if evm.vmConfig.Debug && evm.depth == 0 {
            evm.vmConfig.Tracer.CaptureStart(caller.Address(), addr, false, input, gas, value)

            defer func() { // Lazy evaluation of the parameters
                evm.vmConfig.Tracer.CaptureEnd(ret, gas-contract.Gas, time.Since(start), err)
            }()
        }
        // execute
        ret, err = run(evm, contract, input)

        // When an error was returned by the EVM or when setting the creation code
        // above we revert to the snapshot and consume any gas remaining. Additionally
        // when we're in homestead this also counts for code storage gas errors.
        if err != nil {
            // roll back on error
            evm.StateDB.RevertToSnapshot(snapshot)
            if err != errExecutionReverted {
                contract.UseGas(contract.Gas)
            }
        }
        return ret, contract.Gas, err
    }

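There are four similar functions. See the references at the end for the detailed differences.

Call: A calls B; A and B each run in their own environment.

CallCode: like Call, except that B's code runs against A's storage.

DelegateCall: like CallCode, except that msg.sender is A's caller rather than A.

StaticCall: like Call, except that the callee may not modify state.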
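
The differences between the four can be summarized as "whose code, whose storage, which msg.sender". The sketch below encodes that as a small function. The `frame` struct, its field names, and `enter` are illustrative inventions, not go-ethereum types.

```go
package main

import "fmt"

// frame describes what the executing code sees. The field names are
// illustrative, not go-ethereum's.
type frame struct {
	code     string // whose code runs
	storage  string // whose storage SLOAD/SSTORE touch
	sender   string // what msg.sender reports
	readOnly bool   // whether state changes are forbidden
}

// enter computes the new frame when code running in cur invokes the
// contract target with the given call kind.
func enter(kind string, cur frame, target string) frame {
	switch kind {
	case "CALL":
		return frame{code: target, storage: target, sender: cur.storage, readOnly: cur.readOnly}
	case "CALLCODE":
		return frame{code: target, storage: cur.storage, sender: cur.storage, readOnly: cur.readOnly}
	case "DELEGATECALL":
		return frame{code: target, storage: cur.storage, sender: cur.sender, readOnly: cur.readOnly}
	case "STATICCALL":
		return frame{code: target, storage: target, sender: cur.storage, readOnly: true}
	}
	panic("unknown call kind")
}

func main() {
	// A user account called contract A; A now invokes contract B.
	a := frame{code: "A", storage: "A", sender: "user"}
	for _, kind := range []string{"CALL", "CALLCODE", "DELEGATECALL", "STATICCALL"} {
		fmt.Printf("%-12s %+v\n", kind, enter(kind, a, "B"))
	}
}
```

Only DELEGATECALL carries the original sender through, which is why it is the usual choice for library-style code.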

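Once the Contract and its arguments are built, the run function is called. It checks whether the target is one of the precompiled native contracts; if so it runs the native implementation, otherwise it hands the contract to the interpreter.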

// run runs the given contract and takes care of running precompiles with a fallback to the byte code interpreter.
    func run(evm *EVM, contract *Contract, input []byte) ([]byte, error) {
        if contract.CodeAddr != nil {
            precompiles := PrecompiledContractsHomestead
            if evm.ChainConfig().IsByzantium(evm.BlockNumber) {
                precompiles = PrecompiledContractsByzantium
            }
            if p := precompiles[*contract.CodeAddr]; p != nil {
                return RunPrecompiledContract(p, input, contract)
            }
        }
        return evm.interpreter.Run(contract, input)
    }
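
The "precompile first, interpreter otherwise" dispatch looks roughly like this when reduced to the standard library. The sketch registers only two of the entries (SHA-256 at address 2 and the data copy at address 4); the `precompile` function type, `addr` helper, and `interpret` callback are assumptions made for the example, and gas accounting is left out.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

type address [20]byte

// precompile is native Go code that stands in for contract bytecode.
type precompile func(input []byte) []byte

// addr builds the address whose last byte is b, the same shape that
// common.BytesToAddress([]byte{b}) produces.
func addr(b byte) address {
	var a address
	a[19] = b
	return a
}

// precompiles holds two entries: SHA-256 at address 2 and the identity
// (data copy) function at address 4.
var precompiles = map[address]precompile{
	addr(2): func(in []byte) []byte { h := sha256.Sum256(in); return h[:] },
	addr(4): func(in []byte) []byte { return append([]byte(nil), in...) },
}

// run prefers a precompile and otherwise falls back to the interpreter,
// represented here by a hypothetical callback.
func run(codeAddr address, input []byte, interpret func([]byte) []byte) []byte {
	if p, ok := precompiles[codeAddr]; ok {
		return p(input)
	}
	return interpret(input)
}

func main() {
	interp := func([]byte) []byte { return []byte("interpreted") }
	fmt.Printf("%x\n", run(addr(2), []byte("abc"), interp)) // SHA-256 of "abc"
	fmt.Printf("%s\n", run(addr(4), []byte("abc"), interp)) // abc
	fmt.Printf("%s\n", run(addr(9), []byte("abc"), interp)) // interpreted
}
```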

Interpreter

func (in *Interpreter) Run(contract *Contract, input []byte) (ret []byte, err error) {

        // Reset the return data left by the previous call
        in.returnData = nil

        var (
            op    OpCode        // current opcode
            mem   = NewMemory() // memory
            stack = newstack()  // stack
            pc   = uint64(0)    // program counter
            cost uint64         // gas cost
            pcCopy  uint64      // used for debugging
            gasCopy uint64      // used for debugging
            logged  bool        // used for debugging
        )
        contract.Input = input  // call input (function arguments)

        // ***** omitted *****

        for atomic.LoadInt32(&in.evm.abort) == 0 {
            // Fetch the next opcode and look up its operation
            op = contract.GetOp(pc)
            operation := in.cfg.JumpTable[op]
            // Validity check
            if !operation.valid {
                return nil, fmt.Errorf("invalid opcode 0x%x", int(op))
            }
            // Stack check
            if err := operation.validateStack(stack); err != nil {
                return nil, err
            }
            // State-modification check
            if err := in.enforceRestrictions(op, operation, stack); err != nil {
                return nil, err
            }

            var memorySize uint64
            // Compute the memory needed, based on the operands on the stack
            if operation.memorySize != nil {
                memSize, overflow := bigUint64(operation.memorySize(stack))
                if overflow {
                    return nil, errGasUintOverflow
                }
                // Round up to a whole number of 32-byte words
                if memorySize, overflow = math.SafeMul(toWordSize(memSize), 32); overflow {
                    return nil, errGasUintOverflow
                }
            }
            // Compute the cost of this operation with the gasCost function described earlier and charge it
            cost, err = operation.gasCost(in.gasTable, in.evm, contract, stack, mem, memorySize)
            if err != nil || !contract.UseGas(cost) {
                return nil, ErrOutOfGas  // not enough gas: abort
            }
            if memorySize > 0 {
                // The operation needs memory: expand it
                mem.Resize(memorySize)
            }

            // Execute the operation
            res, err := operation.execute(&pc, in.evm, contract, mem, stack)

            if verifyPool {
                verifyIntegerPool(in.intPool)
            }
            // If the operation returns data, record it
            if operation.returns {
                in.returnData = res
            }

            switch {
            case err != nil:
                return nil, err       // error
            case operation.reverts:   // REVERT: roll back
                return res, errExecutionReverted
            case operation.halts:
                return res, nil       // stop
            case !operation.jumps:    // not a jump: advance to the next instruction
                pc++
            }
        }
        return nil, nil
    }
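
The fetch, validate, charge, execute cycle can be reproduced in a toy interpreter. The following is a sketch under stated assumptions, not the real loop: it knows only four opcodes, uses `uint64` stack items instead of 256-bit words, has no memory or jumps, and the `op` struct, `table`, and `run` are names made up for the example. The gas values match the tiers shown earlier.

```go
package main

import (
	"errors"
	"fmt"
)

// Byte values of the four opcodes this sketch understands.
const (
	STOP  byte = 0x00
	ADD   byte = 0x01
	MUL   byte = 0x02
	PUSH1 byte = 0x60
)

// op carries what the loop needs for one instruction.
type op struct {
	gas  uint64
	pops int
	exec func(pc *uint64, code []byte, st *[]uint64)
}

// binary wraps a two-operand function as an exec function.
func binary(f func(x, y uint64) uint64) func(*uint64, []byte, *[]uint64) {
	return func(_ *uint64, _ []byte, st *[]uint64) {
		s := *st
		x, y := s[len(s)-1], s[len(s)-2]
		*st = append(s[:len(s)-2], f(x, y))
	}
}

// push1 pushes the byte that follows the opcode and skips over it.
func push1(pc *uint64, code []byte, st *[]uint64) {
	*pc++
	var v uint64
	if *pc < uint64(len(code)) {
		v = uint64(code[*pc])
	}
	*st = append(*st, v)
}

// table plays the role of the jump table: opcode -> operation.
var table = map[byte]op{
	ADD:   {gas: 3, pops: 2, exec: binary(func(x, y uint64) uint64 { return x + y })},
	MUL:   {gas: 5, pops: 2, exec: binary(func(x, y uint64) uint64 { return x * y })},
	PUSH1: {gas: 3, pops: 0, exec: push1},
}

// run executes code until STOP or the end of the code and returns the top
// of the stack (0 if empty) and the remaining gas.
func run(code []byte, gas uint64) (result, gasLeft uint64, err error) {
	var st []uint64
	for pc := uint64(0); pc < uint64(len(code)) && code[pc] != STOP; pc++ {
		o, ok := table[code[pc]]
		if !ok {
			return 0, gas, fmt.Errorf("invalid opcode 0x%x", code[pc])
		}
		if len(st) < o.pops {
			return 0, gas, errors.New("stack underflow")
		}
		if gas < o.gas {
			return 0, 0, errors.New("out of gas")
		}
		gas -= o.gas
		o.exec(&pc, code, &st)
	}
	if len(st) > 0 {
		result = st[len(st)-1]
	}
	return result, gas, nil
}

func main() {
	// PUSH1 2, PUSH1 3, ADD, PUSH1 4, MUL, STOP computes (2 + 3) * 4.
	code := []byte{0x60, 0x02, 0x60, 0x03, 0x01, 0x60, 0x04, 0x02, 0x00}
	fmt.Println(run(code, 20)) // 20 3 <nil>
	fmt.Println(run(code, 10)) // 0 0 out of gas
}
```

The program costs 3 + 3 + 3 + 3 + 5 = 17 gas, so a budget of 20 leaves 3, while a budget of 10 runs dry at the third PUSH1.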

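Solidity example
As with other languages, once a bytecode machine exists, higher-level languages can be built on top of the bytecode. Solidity is such a language, complete with a compiler; it makes contracts easier to write and helps the spread of Ethereum dapp development.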

pragma solidity ^0.4.17;

contract simple {
      uint num = 0;
    function simple(){
        num = 123;
    }
    
  
    function add(uint i) public returns(uint){
        uint m = 111;
        num =num * i+m;
        return num;
    } 

}

Generated opcodes

JUMPDEST: function entry point

PUSH + JUMPI/JUMP: similar to a function call

CALLDATASIZE + CALLDATALOAD: roughly, fetching the function arguments

.code
  PUSH 80           contract simple {\n      uint ...
  PUSH 40           contract simple {\n      uint ...
  MSTORE            contract simple {\n      uint ...
  PUSH 0            0  // initial value of the member variable
  DUP1          uint num = 0
  // The next instruction shows that member variables are written to the statedb at initialization
  SSTORE            uint num = 0
  CALLVALUE             function simple(){\n        nu...
  DUP1          olidity ^
  ISZERO            a 
  PUSH [tag] 1          a 
  JUMPI             a 
  PUSH 0            r
  DUP1          o
  REVERT            .17;\n\ncontra
tag 1           a 
  // The following is the part that runs the constructor
  JUMPDEST          a 
  POP           function simple(){\n        nu...
  PUSH 7B           123
  PUSH 0            num  
  DUP2          num = 123
  SWAP1             num = 123
  // Every change to a member variable ends up written to the statedb
  SSTORE            num = 123
  POP           num = 123
  PUSH #[$] 0000000000000000000000000000000000000000000000000000000000000000            contract simple {\n      uint ...
  DUP1          contract simple {\n      uint ...
  PUSH [$] 0000000000000000000000000000000000000000000000000000000000000000         contract simple {\n      uint ...
  PUSH 0            contract simple {\n      uint ...
  CODECOPY          contract simple {\n      uint ...
  PUSH 0            contract simple {\n      uint ...
  RETURN            contract simple {\n      uint ...
  // After the initialization above finishes, execution does not continue into the runtime code
.data
  0:
    .code
      // The following code handles the call data: it extracts the function selector and dispatches on it
      PUSH 80           contract simple {\n      uint ...
      PUSH 40           contract simple {\n      uint ...
      MSTORE            contract simple {\n      uint ...
      PUSH 4            contract simple {\n      uint ...
      CALLDATASIZE          contract simple {\n      uint ...
      LT            contract simple {\n      uint ...
      PUSH [tag] 1          contract simple {\n      uint ...
      JUMPI             contract simple {\n      uint ...
      PUSH 0            contract simple {\n      uint ...
      CALLDATALOAD          contract simple {\n      uint ...
      PUSH 100000000000000000000000000000000000000000000000000000000            contract simple {\n      uint ...
      SWAP1             contract simple {\n      uint ...
      DIV           contract simple {\n      uint ...
      PUSH FFFFFFFF         contract simple {\n      uint ...
      AND           contract simple {\n      uint ...
      DUP1          contract simple {\n      uint ...
      PUSH 1003E2D2         contract simple {\n      uint ...
      EQ            contract simple {\n      uint ...
      PUSH [tag] 2          contract simple {\n      uint ...
      JUMPI             contract simple {\n      uint ...
    tag 1           contract simple {\n      uint ...
      JUMPDEST          contract simple {\n      uint ...
      PUSH 0            contract simple {\n      uint ...
      DUP1          contract simple {\n      uint ...
      REVERT            contract simple {\n      uint ...
    tag 2           function add(uint i) public re...
      JUMPDEST          function add(uint i) public re...
      CALLVALUE             function add(uint i) public re...
      DUP1          olidity ^
      ISZERO            a 
      PUSH [tag] 3          a 
      JUMPI             a 
      PUSH 0            r
      DUP1          o
      REVERT            .17;\n\ncontra
    tag 3           a 
      JUMPDEST          a 
      POP           function add(uint i) public re...
      PUSH [tag] 4          function add(uint i) public re...
      PUSH 4            function add(uint i) public re...
      DUP1          function add(uint i) public re...
      CALLDATASIZE          function add(uint i) public re...
      SUB           function add(uint i) public re...
      DUP2          function add(uint i) public re...
      ADD           function add(uint i) public re...
      SWAP1             function add(uint i) public re...
      DUP1          function add(uint i) public re...
      DUP1          function add(uint i) public re...
      CALLDATALOAD          function add(uint i) public re...
      SWAP1             function add(uint i) public re...
      PUSH 20           function add(uint i) public re...
      ADD           function add(uint i) public re...
      SWAP1             function add(uint i) public re...
      SWAP3             function add(uint i) public re...
      SWAP2             function add(uint i) public re...
      SWAP1             function add(uint i) public re...
      POP           function add(uint i) public re...
      POP           function add(uint i) public re...
      POP           function add(uint i) public re...
      PUSH [tag] 5          function add(uint i) public re...
      JUMP          function add(uint i) public re...
    tag 4           function add(uint i) public re...
      JUMPDEST          function add(uint i) public re...
      PUSH 40           function add(uint i) public re...
      MLOAD             function add(uint i) public re...
      DUP1          function add(uint i) public re...
      DUP3          function add(uint i) public re...
      DUP2          function add(uint i) public re...
      MSTORE            function add(uint i) public re...
      PUSH 20           function add(uint i) public re...
      ADD           function add(uint i) public re...
      SWAP2             function add(uint i) public re...
      POP           function add(uint i) public re...
      POP           function add(uint i) public re...
      PUSH 40           function add(uint i) public re...
      MLOAD             function add(uint i) public re...
      DUP1          function add(uint i) public re...
      SWAP2             function add(uint i) public re...
      SUB           function add(uint i) public re...
      SWAP1             function add(uint i) public re...
      RETURN            function add(uint i) public re...
    tag 5           function add(uint i) public re...
      // Function body
      JUMPDEST          function add(uint i) public re...
      // The function's code starts here
      PUSH 0            uint // local variables live on the stack
      DUP1          uint m
      PUSH 6F           111
      SWAP1             uint m = 111
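      POP           uint m = 111 // PUSH 0 through here declares the local variable and assigns it
      DUP1          m
      DUP4          i            // fetch the argument
      PUSH 0            num
      SLOAD             num      // this line and the previous one read the member variable
      MUL           num * i      // multiply
      ADD           num * i+m    // add
      PUSH 0            num
      DUP2          num =num * i+m
      SWAP1             num =num * i+m   // these three lines set up the assignment
      SSTORE            num =num * i+m   // store the member variable
      POP           num =num * i+m
      // The next lines implement return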
      PUSH 0            num
      SLOAD             num
      SWAP2             return num    
      POP           return num
      POP           function add(uint i) public re...
      SWAP2             function add(uint i) public re...
      SWAP1             function add(uint i) public re...
      POP           function add(uint i) public re...
      JUMP [out]            function add(uint i) public re...
    .data

[Figure: 01221030_Z6Do[1].png]

References

Call, CallCode, DelegateCall: https://ethereum.stackexchange.com/questions/3667/difference-between-call-callcode-and-delegatecall

Solidity contract structure: https://solidity.readthedocs.io/en/develop/structure-of-a-contract.html#

Runtime bytecode vs. bytecode: https://ethereum.stackexchange.com/questions/13086/solc-bin-vs-bin-runtime/13087#13087

Remix: https://remix.ethereum.org/

Reposted from: (魂祭心) https://my.oschina.net/hunjixin/blog/1805306
