ted@tedneward.com | Blog: http://blogs.tedneward.com | Twitter: tedneward | Github: tedneward | LinkedIn: tedneward
Building a virtual machine
understanding a VM can come from building one
so let's build one!
while understanding that there are limits to what we can do in a single session
Architecture (simplified)
code memory: byte array(s) holding the program's code
IP register: instruction pointer
call stack: function call frames
FP register: pointer to the current stack frame
global memory (heap): memory for storage/use
processor: fetch-decode-execute mechanism
constant pool: collection of constants
usually anything that isn't a "word" to the machine
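The parts listed above can be sketched as one small data structure. This is a minimal illustration, not a reference design: the names `Machine`, `Frame`, `local_vars`, and `fetch`, and the choice of 16 global slots, are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """One call-stack entry: where to return to, plus the callee's locals."""
    return_ip: int
    local_vars: list = field(default_factory=list)


@dataclass
class Machine:
    code: bytearray = field(default_factory=bytearray)         # code memory
    ip: int = 0                                                # IP register
    frames: list = field(default_factory=list)                 # call stack
    fp: int = -1                                               # FP register: index of current frame
    globals: list = field(default_factory=lambda: [0] * 16)    # global memory (size is an assumption)
    constants: list = field(default_factory=list)              # constant pool

    def fetch(self) -> int:
        """The 'fetch' step of fetch-decode-execute: read one byte, advance IP."""
        byte = self.code[self.ip]
        self.ip += 1
        return byte


vm = Machine(code=bytearray([7, 9]), constants=["hello"])
first = vm.fetch()
```

Here the constant pool holds a string, which is the kind of value that does not fit in a single machine word.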
instructions are formed of two parts
operation code (opcode)
operation parameters (operands)
in assembly-style source, these are sometimes supplemented by
directives (commands to the tools)
labels (symbolic names used)
some opcodes take no operands (NULL, pop, etc)
some take one (a constant value, etc)
some take two (add, subtract, etc)
some may take a varying number
depending on semantics of the opcode
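One way to picture "opcode plus operands" is a decoder that reads the opcode byte and then looks up how many operand bytes follow it. The opcode numbers, the operand-count table, and the two-operand form of ADD below are hypothetical, chosen only for illustration.

```python
# Hypothetical opcode numbering and operand counts, for illustration only.
OPERAND_COUNT = {
    0x00: 0,  # NOP: takes no operands
    0x01: 1,  # CONST: takes one operand (the value)
    0x02: 2,  # ADD: shown here in a two-operand form
}
NAMES = {0x00: "NOP", 0x01: "CONST", 0x02: "ADD"}


def decode(code, ip):
    """Return (name, operands, next_ip) for the instruction starting at ip."""
    opcode = code[ip]
    count = OPERAND_COUNT[opcode]
    operands = list(code[ip + 1 : ip + 1 + count])
    return NAMES[opcode], operands, ip + 1 + count


def disassemble(code):
    """Walk the whole byte sequence, one instruction at a time."""
    ip, listing = 0, []
    while ip < len(code):
        name, operands, ip = decode(code, ip)
        listing.append((name, operands))
    return listing


program = bytes([0x01, 42, 0x02, 1, 2, 0x00])
listing = disassemble(program)
```

An opcode with a varying number of operands would need its count encoded in the instruction itself rather than in a fixed table.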
machine ops
moving data in/out of parts of the machine
mathematical ops
add, subtract, multiply, divide
comparison ops
greater-than, less-than, equal, not-equal, greater-than-or-equal, less-than-or-equal
branching ops
unconditional, branch-if-true
call ops
direct, indirect
storage ops
global store/load, local store/load
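The mathematical and comparison families are regular enough to be table-driven. The sketch below assumes integer-only values, integer division for DIV, and comparisons that yield 1 or 0; the mnemonics and the `apply_op` helper are hypothetical.

```python
import operator

# Mathematical ops: two values in, one value out.
MATH = {
    "ADD": operator.add,
    "SUB": operator.sub,
    "MUL": operator.mul,
    "DIV": operator.floordiv,  # assumption: integer division
}

# Comparison ops: two values in, a truth value out.
COMPARE = {
    "GT": operator.gt,
    "LT": operator.lt,
    "EQ": operator.eq,
    "NE": operator.ne,
    "GTE": operator.ge,
    "LTE": operator.le,
}


def apply_op(name, left, right):
    """Apply a math or comparison op; comparisons produce 1 or 0."""
    if name in MATH:
        return MATH[name](left, right)
    if name in COMPARE:
        return int(COMPARE[name](left, right))
    raise ValueError(f"unknown op: {name}")
```

Branching, call, and storage ops do not fit this table because they change the instruction pointer or memory rather than just computing a value.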
Stack-based virtual machines
simulates a hardware processor w/no general-purpose registers
instructions must use an operand stack to hold temporary values
all operands come from the stack
all results go back onto the stack
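As a minimal sketch of the operand-stack idea, the following evaluates (2 + 3) * 4 using nothing but a stack for temporary values. The tuple-based instruction format and the PUSH mnemonic are assumptions for the example; PUSH is the one instruction here that carries its value inline rather than taking it from the stack.

```python
def run(program):
    """Execute a list of (mnemonic, *operands) tuples on an operand stack."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "PUSH":
            stack.append(instr[1])
        elif op == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)   # result goes back onto the stack
        elif op == "MUL":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown op: {op}")
    return stack


# (2 + 3) * 4
result = run([("PUSH", 2), ("PUSH", 3), ("ADD",), ("PUSH", 4), ("MUL",)])
```

Note that ADD and MUL name no operands at all: where the inputs come from and where the result goes is implied by the stack.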
Register-based virtual machines
all storage/work is in registers
general-purpose registers
floating-point registers
string/data registers
(usually) still a stack involved
which makes register machines a superset of stack machines
No-operand opcodes:
NOP: do nothing
HALT: end execution
Basic value-manipulation opcodes:
LOAD: load value into register
constant value
from memory
from other register
STORE: store register
into memory
Mathematical ops
ADD, SUB, MUL, DIV, MOD
three operands: src1, src2, and dest
or one operand: a value to combine with the value already in a register (the accumulator)
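A register machine with these opcodes might look like the sketch below. It uses the three-operand form of ADD (src1, src2, dest). Splitting LOAD into LOADC (constant), LOADM (from memory), and MOVE (from another register) is an assumption made to keep the decoder trivial, as are the four-register file and the tuple instruction format.

```python
def run(program, memory):
    """Execute (mnemonic, *operands) tuples against a small register file."""
    regs = [0] * 4                       # general-purpose registers r0..r3
    for op, *args in program:
        if op == "NOP":
            pass
        elif op == "HALT":
            break
        elif op == "LOADC":              # LOADC reg, constant
            regs[args[0]] = args[1]
        elif op == "LOADM":              # LOADM reg, address
            regs[args[0]] = memory[args[1]]
        elif op == "MOVE":               # MOVE dest_reg, src_reg
            regs[args[0]] = regs[args[1]]
        elif op == "STORE":              # STORE reg, address
            memory[args[1]] = regs[args[0]]
        elif op == "ADD":                # ADD src1, src2, dest
            src1, src2, dest = args
            regs[dest] = regs[src1] + regs[src2]
        else:
            raise ValueError(f"unknown op: {op}")
    return regs


memory = [10, 0]
regs = run(
    [
        ("LOADC", 0, 5),     # r0 = 5
        ("LOADM", 1, 0),     # r1 = memory[0]
        ("ADD", 0, 1, 2),    # r2 = r0 + r1
        ("STORE", 2, 1),     # memory[1] = r2
        ("HALT",),
    ],
    memory,
)
```

Compared with the stack form, every instruction here names its inputs and its destination explicitly, so there are fewer instructions but each one is wider.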
Steps (1/2)
Basic architecture and scaffolding
Simple (no-operand) ops: NOP, HALT, DUMP
Simple stack ops: CONST, LDC, POP
Globals ops: GSTORE, GLOAD
Math ops: ADD, SUB, etc
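The first half of the steps can be sketched as a single fetch-decode-execute loop over a flat list of integers. The opcode numbering, the four-slot global memory, and the inline operand layout are assumptions for this example; DUMP and LDC are left out to keep it short.

```python
# Hypothetical opcode numbering.
NOP, HALT, CONST, POP, GSTORE, GLOAD, ADD, SUB = range(8)


def run(code, globals_size=4):
    """Run a flat list of opcodes and inline operands; return (stack, globals)."""
    stack = []
    globals_ = [0] * globals_size
    ip = 0
    while ip < len(code):
        opcode = code[ip]                # fetch
        ip += 1
        if opcode == NOP:                # decode + execute
            pass
        elif opcode == HALT:
            break
        elif opcode == CONST:            # CONST value: push an inline value
            stack.append(code[ip])
            ip += 1
        elif opcode == POP:
            stack.pop()
        elif opcode == GSTORE:           # GSTORE slot: pop into global memory
            globals_[code[ip]] = stack.pop()
            ip += 1
        elif opcode == GLOAD:            # GLOAD slot: push from global memory
            stack.append(globals_[code[ip]])
            ip += 1
        elif opcode in (ADD, SUB):
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right if opcode == ADD else left - right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack, globals_


program = [CONST, 7, CONST, 5, SUB, GSTORE, 0, GLOAD, 0, CONST, 40, ADD, NOP, HALT]
stack, globals_ = run(program)
```

The program computes 7 - 5, stores the result in global slot 0, loads it back, and adds 40.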
Steps (2/2)
Comparison ops: EQ, NE, GTE, etc
Branching ops: JMP, JT, JF, etc
Call ops: CALL, CALLI, RET
Locals ops: LSTORE, LLOAD
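The second half of the steps adds control flow. In the sketch below, comparisons push 1 or 0, JT/JF pop that value and branch on it, and CALL moves its arguments off the operand stack into the new frame's locals. The opcode numbers, the `CALL target nargs` operand layout, the four local slots per frame, and the `max_program` helper are all assumptions for illustration; CALLI (an indirect call that would take its target from the stack) is omitted.

```python
# Hypothetical opcode numbering.
(HALT, CONST, GT, EQ, JMP, JT, JF, CALL, RET, LSTORE, LLOAD) = range(11)


def run(code):
    stack = []
    frames = [(None, [0] * 4)]           # call stack of (return address, locals)
    ip = 0
    while True:
        opcode = code[ip]
        ip += 1
        if opcode == HALT:
            return stack
        elif opcode == CONST:
            stack.append(code[ip])
            ip += 1
        elif opcode in (GT, EQ):         # comparisons push 1 (true) or 0 (false)
            right = stack.pop()
            left = stack.pop()
            result = left > right if opcode == GT else left == right
            stack.append(int(result))
        elif opcode == JMP:              # unconditional branch
            ip = code[ip]
        elif opcode in (JT, JF):         # conditional branch on popped value
            target = code[ip]
            ip += 1
            truthy = stack.pop() != 0
            if truthy == (opcode == JT):
                ip = target
        elif opcode == CALL:             # CALL target nargs
            target, nargs = code[ip], code[ip + 1]
            ip += 2
            args = [stack.pop() for _ in range(nargs)][::-1]
            frames.append((ip, args + [0] * (4 - nargs)))
            ip = target
        elif opcode == RET:              # return value stays on the operand stack
            ip, _ = frames.pop()
        elif opcode == LSTORE:
            frames[-1][1][code[ip]] = stack.pop()
            ip += 1
        elif opcode == LLOAD:
            stack.append(frames[-1][1][code[ip]])
            ip += 1
        else:
            raise ValueError(f"unknown opcode: {opcode}")


def max_program(a, b):
    """main: push a, b; call max (at address 8); halt."""
    return [
        CONST, a, CONST, b, CALL, 8, 2, HALT,       # addresses 0..7
        LLOAD, 0, LLOAD, 1, GT, JT, 18,             # 8..14: if a > b goto 18
        LLOAD, 1, RET,                              # 15..17: return b
        LLOAD, 0, RET,                              # 18..20: return a
    ]
```

Running `run(max_program(3, 9))` leaves the larger argument as the only value on the operand stack.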
Futures
add other types beyond ints and functions
memories could be made smaller (blocks/chunks/etc) and demand-allocated
definitions for "structures"
new opcodes
optimize, optimize, optimize, ...
VM implementations to study
Java (JVM), .NET (CLR), Android (ART)
JavaScript (V8, Chakra), WebAssembly
Python, Ruby, Smalltalk (Squeak)
Erlang (BEAM)
SQLite
ScummVM
Books
Language Implementation Patterns
Parr (Pragmatic Publishers)
Virtual Machines
Smith, Nair (Morgan Kaufmann)
Web
C-- ("high-level assembly language")
https://www.cs.tufts.edu/~nr/c--/index.html
BEAM VM Wisdoms (by Dmytro Lytovchenko)
ScummVM Technical Reference
https://wiki.scummvm.org/index.php?title=SCUMM/Technical_Reference
Who is this guy?
Architect, Engineering Manager/Leader, "force multiplier"
Co-founder, Solidify US
http://www.solidify.dev
Principal -- Neward & Associates
Author
Professional F# 2.0 (w/Erickson, et al; Wrox, 2010)
Effective Enterprise Java (Addison-Wesley, 2004)
SSCLI Essentials (w/Stutz, et al; O'Reilly, 2003)
Server-Based Java Programming (Manning, 2000)
See http://www.newardassociates.com