Busy Java Developer's Guide to JVM Bytecode

ted.neward@newardassociates.com | Blog: http://blogs.newardassociates.com | Github: tedneward | LinkedIn: tedneward

Objectives

What are we trying to do today?

JVM Bytecode

From data to code

JVM Bytecode

JVM Bytecode Instruction Set

JVM Bytecode

Why use an intermediate language like this?

JVM Bytecode

JVM provides "managed environment"

JVM Bytecode

Reasons to know bytecode

JVM Bytecode

javap

Hello, Java

package com.newardassociates.demo;

public class App {
    public static void main(String[] args) {
        System.out.println("Hello world!");
    }
}

Hello, Java

$ javap -p -c -l com/newardassociates/demo/App.class
Compiled from "App.java"
public class com.newardassociates.demo.App {
  public com.newardassociates.demo.App();
    Code:
       0: aload_0
       1: invokespecial #1                  // Method java/lang/Object."<init>":()V
       4: return
    LineNumberTable:
      line 6: 0
    LocalVariableTable:
      Start  Length  Slot  Name   Signature
          0       5     0  this   Lcom/newardassociates/demo/App;

  public static void main(java.lang.String[]);
    Code:
       0: getstatic     #7                  // Field java/lang/System.out:Ljava/io/PrintStream;
       3: ldc           #13                 // String Hello world!
       5: invokevirtual #15                 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
       8: return
    LineNumberTable:
      line 8: 0
      line 9: 8
    LocalVariableTable:
      Start  Length  Slot  Name   Signature
          0       9     0  args   [Ljava/lang/String;
}

JVMIS Design Principles

What drives the look-and-feel of JVM bytecode

JVMIS Design Principles

Smallest atom of code: class

JVMIS Design Principles

Stack-based, no registers

JVMIS Design Principles

Names are fully-qualified class names

JVMIS Design Principles

Example names
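
The names that appear in javap output (internal class names and type descriptors) can be reproduced from ordinary Java as a quick check. This is a minimal sketch: the class DescriptorExamples and its helper methods are illustrative names, not part of the deck, and fieldDescriptor only handles non-array reference types.

```java
import java.io.PrintStream;
import java.lang.invoke.MethodType;

// Illustrative helper (hypothetical name) that prints JVM-style names.
final class DescriptorExamples {

    // Internal form of a class name: dots become slashes.
    static String internalName(Class<?> type) {
        return type.getName().replace('.', '/');
    }

    // Field descriptor for a non-array reference type: L<internal name>;
    static String fieldDescriptor(Class<?> type) {
        return "L" + internalName(type) + ";";
    }

    // Method descriptor: (parameter descriptors)return descriptor
    static String mainDescriptor() {
        return MethodType.methodType(void.class, String[].class)
                .toMethodDescriptorString();
    }

    static String addDescriptor() {
        return MethodType.methodType(int.class, int.class, int.class)
                .toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(internalName(PrintStream.class));    // java/io/PrintStream
        System.out.println(fieldDescriptor(PrintStream.class)); // Ljava/io/PrintStream;
        System.out.println(mainDescriptor());                   // ([Ljava/lang/String;)V
        System.out.println(addDescriptor());                    // (II)I
    }
}
```

MethodType is used here only because it ships with the JDK and already knows how to render descriptors; the same strings show up in the javap listings elsewhere in this deck.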

JVM Class File Structure

Not just opcodes

JVM Class File Structure

Overview

JVM Class File Structure

Big Picture

JVM Class File Structure

ClassFile structure

ClassFile {
    u4             magic;
    u2             minor_version;
    u2             major_version;
    u2             constant_pool_count;
    cp_info        constant_pool[constant_pool_count-1];
    u2             access_flags;
    u2             this_class;
    u2             super_class;
    // ...

JVM Class File Structure

ClassFile structure

ClassFile {
    // ...
    u2             interfaces_count;
    u2             interfaces[interfaces_count];
    u2             fields_count;
    field_info     fields[fields_count];
    u2             methods_count;
    method_info    methods[methods_count];
    u2             attributes_count;
    attribute_info attributes[attributes_count];
}
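
The first few items of the ClassFile structure can be read with nothing more than a DataInputStream, since class files are big-endian. This is a rough sketch, not a parser: the names ClassHeaderDump, Header, and headerOf are made up for illustration, and it stops before the constant pool, whose entries are variable-length.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Illustrative reader (hypothetical name) for the fixed-size start of a class file.
final class ClassHeaderDump {

    // Holds the first four items of the ClassFile structure.
    static final class Header {
        final int magic;
        final int minorVersion;
        final int majorVersion;
        final int constantPoolCount;

        Header(int magic, int minorVersion, int majorVersion, int constantPoolCount) {
            this.magic = magic;
            this.minorVersion = minorVersion;
            this.majorVersion = majorVersion;
            this.constantPoolCount = constantPoolCount;
        }
    }

    // Reads the items in the order the ClassFile structure declares them.
    static Header read(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int magic = data.readInt();              // u4 magic
        int minor = data.readUnsignedShort();    // u2 minor_version
        int major = data.readUnsignedShort();    // u2 major_version
        int cpCount = data.readUnsignedShort();  // u2 constant_pool_count
        return new Header(magic, minor, major, cpCount);
    }

    // Locates the .class resource for an already-loaded class and reads its header.
    static Header headerOf(Class<?> type) {
        String resource = "/" + type.getName().replace('.', '/') + ".class";
        try (InputStream in = type.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IOException("no class bytes found for " + type.getName());
            }
            return read(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Header h = headerOf(String.class);
        System.out.printf("magic=0x%08X major=%d minor=%d constant_pool_count=%d%n",
                h.magic, h.majorVersion, h.minorVersion, h.constantPoolCount);
    }
}
```

The magic value should always print as 0xCAFEBABE; the version numbers and constant pool count depend on the JDK that supplied the class.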

JVM Class File Structure

Notes about ClassFile structure

JVM Class File Structure

ClassFile structure

JVM Class File Structure

Essentially, a class...

JVM Class File Structure

Constant pool

JVM Class File Structure

Field

JVM Class File Structure

Method

JVM Class File Structure

Attributes

JVM Class File Structure

CodeAttribute (Section 4.7.3)

JVM Class File Structure

CodeAttribute (Section 4.7.3)

JVM Instruction Set

Organized loosely by function

JVM Instruction Set

Before we begin

JVM Instruction Set

Operand stack manipulation

JVM Instruction Set

Local variables

JVM Instruction Set

Arithmetic operations

JVM Instruction Set

Add method using locals: Java

    public static int add() {
        int lhs = 5;
        int rhs = 28;
        return lhs + rhs;
    }

JVM Instruction Set

Add method using locals: Bytecode (preamble)

  public static int add();
    descriptor: ()I
    flags: (0x0009) ACC_PUBLIC, ACC_STATIC

JVM Instruction Set

Add method using locals: Bytecode (code)

    Code:
      stack=2, locals=2, args_size=0
         0: iconst_5
         1: istore_0
         2: bipush        28
         4: istore_1
         5: iload_0
         6: iload_1
         7: iadd
         8: ireturn

JVM Instruction Set

Add method using locals: Bytecode (tables)

      LineNumberTable:
        line 11: 0
        line 12: 2
        line 13: 5
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            2       7     0   lhs   I
            5       4     1   rhs   I
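
To make the stack-and-slots model concrete, the eight instructions above can be stepped through by hand in plain Java. This is only a sketch of what the operand stack and local variable slots are doing, not how a real JVM executes the method; TinyStackMachine and runAdd are illustrative names.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hand-simulation (hypothetical name) of the add() bytecode shown above.
final class TinyStackMachine {

    static int runAdd() {
        Deque<Integer> stack = new ArrayDeque<>(); // operand stack (stack=2)
        int[] locals = new int[2];                 // local variable slots (locals=2)

        stack.push(5);                  // 0: iconst_5    push the constant 5
        locals[0] = stack.pop();        // 1: istore_0    pop into slot 0 (lhs)
        stack.push(28);                 // 2: bipush 28   push the byte-sized constant 28
        locals[1] = stack.pop();        // 4: istore_1    pop into slot 1 (rhs)
        stack.push(locals[0]);          // 5: iload_0     push lhs
        stack.push(locals[1]);          // 6: iload_1     push rhs
        int right = stack.pop();        // 7: iadd        pop two ints...
        int left = stack.pop();
        stack.push(left + right);       //                ...and push their sum
        return stack.pop();             // 8: ireturn     return the top of the stack
    }

    public static void main(String[] args) {
        System.out.println(runAdd());   // 33
    }
}
```

The stack never holds more than two values at once, which is where the stack=2 in the Code header comes from.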

JVM Instruction Set

Add method using parameters: Java

    public static int add(int lhs, int rhs) {
        return lhs + rhs;
    }

JVM Instruction Set

Add method using parameters: Bytecode (preamble)

  public static int add(int, int);
    descriptor: (II)I
    flags: (0x0009) ACC_PUBLIC, ACC_STATIC

JVM Instruction Set

Add method using parameters: Bytecode (code)

    Code:
      stack=2, locals=2, args_size=2
         0: iload_0
         1: iload_1
         2: iadd
         3: ireturn

JVM Instruction Set

Add method using parameters: Bytecode (tables)

      LineNumberTable:
        line 6: 0
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0       4     0   lhs   I
            0       4     1   rhs   I

JVM Instruction Set

Array operations

JVM Instruction Set

Branching, control flow

JVM Instruction Set

Sum an array: Java

    public static int sumArray() {
        int[] numbers = {12, 21, 37};
        int total = 0;
        for (int i : numbers) {
            total += i;
        }
        return total;
    }

JVM Instruction Set

Initialize an array: Bytecode (code)

         0: iconst_3
         1: newarray       int
         3: dup
         4: iconst_0
         5: bipush        12
         7: iastore
         8: dup
         9: iconst_1
        10: bipush        21
        12: iastore
        13: dup
        14: iconst_2
        15: bipush        37
        17: iastore
        18: astore_0

JVM Instruction Set

Iterate over an array: Bytecode (code)

        19: iconst_0
        20: istore_1
        21: aload_0
        22: astore_2
        23: aload_2
        24: arraylength
        25: istore_3
        26: iconst_0
        27: istore        4
        29: iload         4
        31: iload_3
        32: if_icmpge     52
        35: aload_2
        36: iload         4
        38: iaload
        39: istore        5
        41: iload_1
        42: iload         5
        44: iadd
        45: istore_1
        46: iinc          4, 1
        49: goto          29
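
Reading the loop bytecode back into Java shows what the compiler did with the enhanced for loop: it copies the array reference, caches the length, and drives an int index. The sketch below is a hand-written equivalent with each line mapped to the offsets above; the names array, length, and index are invented, since those compiler-generated locals have no source names.

```java
// Hand-desugared version (hypothetical name) of sumArray() from the slides.
final class SumArrayDesugared {

    static int sumArray() {
        int[] numbers = {12, 21, 37};   // slot 0, offsets 0-18
        int total = 0;                  // slot 1, offsets 19-20
        int[] array = numbers;          // slot 2, offsets 21-22 (copy of the reference)
        int length = array.length;      // slot 3, offsets 23-25 (arraylength)
        for (int index = 0;             // slot 4, offsets 26-27
             index < length;            // offsets 29-32: if_icmpge leaves the loop
             index++) {                 // offsets 46-49: iinc 4, 1 then goto 29
            int i = array[index];       // slot 5, offsets 35-39 (iaload)
            total += i;                 // offsets 41-45 (iadd, istore_1)
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumArray()); // 70
    }
}
```

Note that the branch is inverted relative to the source: the loop condition index < length becomes if_icmpge, which jumps out when the condition is false.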

JVM Instruction Set

Object model instructions

JVM Instruction Set

Method invocation

JVM Instruction Set

Create an object, invoke a method: Java

    public static void main(String[] args) {
        System.out.println(new Greeter().getGreeting());
    }

JVM Instruction Set

Create an object, invoke a method: Bytecode (code)

    Code:
      stack=3, locals=1, args_size=1
         0: getstatic     #7                  // Field java/lang/System.out:Ljava/io/PrintStream;
         3: new           #13                 // class com/newardassociates/demo/Greeter
         6: dup
         7: invokespecial #15                 // Method com/newardassociates/demo/Greeter."<init>":()V
        10: invokevirtual #16                 // Method com/newardassociates/demo/Greeter.getGreeting:()Ljava/lang/String;
        13: invokevirtual #20                 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        16: return

Quiz Time!

Test what you've learned!

Question

String concatenation

Question

"How are inner classes implemented?"

Question

assert keyword

Question

Generics

Question

Lambdas (anonymous methods)

Question

"Records: How are they implemented?"

Summary

JVMS is good to know

Resources

Books

Credentials

Who is this guy?

Appendices

More info for the curious

java.lang.classfile

The Java ClassFile API

java.lang.classfile

Overview

"The java.lang.classfile package contains API models for reading, writing, and modifying Java class files, as specified in Chapter 4 of the Java Virtual Machine Specification."

java.lang.classfile

Overview

Jasmin

A JVM assembler

Jasmin

Introduction

Jasmin

Hello, Jasmin

.class public HelloWorld
.super java/lang/Object

;
; standard initializer (calls java.lang.Object's initializer)
;
.method public <init>()V
    aload_0
    invokenonvirtual java/lang/Object/<init>()V
    return
.end method

Jasmin

Hello, Jasmin

;
; main() - prints out Hello World
;
.method public static main([Ljava/lang/String;)V
    .limit stack 2   ; up to two items can be pushed

    ; push System.out onto the stack
    getstatic java/lang/System/out Ljava/io/PrintStream;

    ; push a string onto the stack
    ldc "Hello Jasmin!"

    ; call the PrintStream.println() method.
    invokevirtual java/io/PrintStream/println(Ljava/lang/String;)V

    ; done
    return
.end method

jasm

A newer JVM assembler

jasm

Introduction

jasm

Hello, jasm

package com.newardassociates.demo;

// Keep in mind that most IDEs and tools will flag the use
// of the MessageProvider below as an error, since they won't
// know how to resolve the name (since they don't know about 
// jasm).
public class App {
    public static void main(String[] args) {
        System.out.println(new MessageProvider().getMessage());
    }
}

public class com/newardassociates/demo/MessageProvider {
    public getMessage()java/lang/String {
         ldc "Hello, World"
         areturn
    }

    public <init>()V {
        aload 0
        invokespecial java/lang/Object.<init>()V
        return
    }
}

Krakatau

A JVM assembler/disassembler

Krakatau

Introduction

Krakatau

Hello Krakatau assembler

.class public Foo 
.super java/lang/Object 

; ([Ljava/lang/String;)V  means "takes a single String[] argument and returns void"
.method public static main : ([Ljava/lang/String;)V
    ; We have to put an upper bound on the number of locals and the operand stack
    ; Machine generated code will usually calculate the exact limits, but that's a pain to do
    ; when writing bytecode by hand, especially as we'll be making changes to the code.
    ; Therefore, we'll just set a value that's way more than we're using, 13 in this case
    .code stack 13 locals 13
        ; Equivalent to "System.out" in Java code
        getstatic Field java/lang/System out Ljava/io/PrintStream;
        ; put our argument on the operand stack
        ldc "Hello World!"
        ; now invoke println()
        invokevirtual Method java/io/PrintStream println (Ljava/lang/Object;)V
        return
    .end code
.end method
.end class

Javassist

A high-level Java construct library

Javassist

Introduction

ProGuard (CORE)

An assembler/disassembler pair

ProGuard (CORE)

Overview

ProGuard (CORE)

Creating HelloWorld programmatically

ProgramClass programClass =
    new ClassBuilder(
        VersionConstants.CLASS_VERSION_1_8,
        AccessConstants.PUBLIC,
        "HelloWorld",
        ClassConstants.NAME_JAVA_LANG_OBJECT)
        .addMethod(
            AccessConstants.PUBLIC | AccessConstants.STATIC,
            "main",
            "([Ljava/lang/String;)V",
            50,
            code -> code
                .getstatic("java/lang/System", "out", "Ljava/io/PrintStream;")
                .ldc("Hello, world!")
                .invokevirtual("java/io/PrintStream", "println", "(Ljava/lang/String;)V")
                .return_())
        .getProgramClass();

ByteBuddy

A codegen and manipulation library

ByteBuddy

Overview

ByteBuddy

Hello, ByteBuddy

Class<?> dynamicType = new ByteBuddy()
  .subclass(Object.class)
  .method(ElementMatchers.named("toString"))
  .intercept(FixedValue.value("Hello World!"))
  .make()
  .load(getClass().getClassLoader())
  .getLoaded();
 
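// Class.newInstance() is deprecated since Java 9; go through the constructor instead
assertThat(dynamicType.getDeclaredConstructor().newInstance().toString(), is("Hello World!"));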

ASM

A low-level bytecode inspection/manipulation library

ASM

Overview

BCEL

Apache Commons jumps into JVM bytecode

BCEL

ByteCode Engineering Library

BCEL

Optimizing boolean expressions

CodeConstraint constraint = new CodeConstraint() {
    public boolean checkCode(InstructionHandle[] match) {
        IfInstruction if1 = (IfInstruction) match[0].getInstruction();
        GOTO g = (GOTO) match[2].getInstruction();
        return (if1.getTarget() == match[3]) &&
            (g.getTarget() == match[4]);
    }
};

BCEL

Optimizing boolean expressions

InstructionFinder f = new InstructionFinder(il);
String pat = "IfInstruction ICONST_0 GOTO ICONST_1 NOP(IFEQ|IFNE)";
for (Iterator e = f.search(pat, constraint); e.hasNext(); ) {
    InstructionHandle[] match = (InstructionHandle[]) e.next();
    ...
    match[0].setTarget(match[5].getTarget()); // Update target
    ...
    try {
        il.delete(match[1], match[5]);
    } catch (TargetLostException ex) {
        ...
    }
}