ted.neward@newardassociates.com | Blog: http://blogs.newardassociates.com | Github: tedneward | LinkedIn: tedneward
... aka "garbage collection"
What it is
What it isn't
How it works
It was said that C programmers knew that memory management was so important, it could not be left up to the system.
And it was said that Lisp programmers knew that memory management was so important, it could not be left up to the programmers.
as contrasted with "manual memory management"
basic idea is pretty simple:
programmers shouldn't have to manage memory
but the system should only release stuff not in use
within that simple idea lies a ton of complexity
to allocate space for new objects
to reclaim space occupied by dead objects
as part of the above, identify live objects (to prevent reclamation)
safety
prime directive: no object still "in use" should go away!
throughput
how many objects (allocations and reclamations) the system can process per unit of time
speed/pause time
how long does it take to do a reclamation?
space overhead
above and beyond the actual allocated space
completeness
do we get all the reclamation-eligible space?
scalability and portability
how well can it take advantage of multi-core hardware?
how easily does it move across platforms and architectures?
first appeared in the early 60s
Smalltalk
Lisp
some functional languages
gained mainstream traction with the rise of the "VM"s
JVM, CLR
Python, Ruby
"allocation": providing memory for use by a program
"deallocation", "reclamation": making memory available for allocation
"conservative": no live object will ever be reclaimed
"automatic" vs "manual": programmer effort requirement
"automatic": the underlying language/platform handles it
"manual": the programmer must take care of it
this applies to both allocation and deallocation
most O-O languages combine allocation with initialization
ObjC is an example of one that didn't
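To make the split concrete, a minimal C sketch (names like point_alloc/point_init are hypothetical): allocation and initialization stay separate steps, much as ObjC's alloc/init pair keeps them, where Java's `new` fuses them into one.

    #include <stdlib.h>

    typedef struct { int x, y; } Point;

    /* allocation: obtain raw memory (like ObjC's +alloc) */
    Point *point_alloc(void) {
        return malloc(sizeof(Point));
    }

    /* initialization: put the object into a valid state (like ObjC's -init) */
    Point *point_init(Point *p, int x, int y) {
        if (p) { p->x = x; p->y = y; }
        return p;
    }

    int main(void) {
        Point *p = point_init(point_alloc(), 3, 4);  /* two distinct steps */
        free(p);                                     /* manual deallocation */
        return 0;
    }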
Reference counting
Mark-sweep
Copying
Mark-compact
Mark-sweep-compact
Generational collectors
some will combine them together
JVM is a perfect example: copying + generational
others will allow for "tuning" in varying ways
JVM offers a gazillion tuning parameters
CLR puts "large objects" into their own heap
"mark" phase: identify (mark) objects still in use
"sweep" phase: remove/reclaim any objects not so marked
start from a known "root set" of references
for each object referenced from that root set:
mark the object as in use
identify each object reference in that object
recursively mark said object
thus, only objects still in use will get marked
different languages/platforms will identify different root sets
Java/JVM:
thread stack
static references
finalizable-queue references
.NET/CLR: essentially identical to JVM, plus:
"large object heap" (LOH)
object is simply reclaimed in place
objects are NOT moved or adjusted in the heap
the space the object occupied is now considered free
potentially may be combined with adjacent free spaces
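A minimal, single-threaded C sketch of the two phases over a toy heap (all names hypothetical; real collectors track objects and headers far more compactly):

    #include <stdbool.h>
    #include <stdlib.h>
    #include <stddef.h>

    #define MAX_REFS 4

    /* every object carries a header holding the mark bit */
    typedef struct Object {
        bool marked;                    /* the mark flag */
        struct Object *refs[MAX_REFS];  /* outgoing references */
        struct Object *next;            /* all-objects list, walked by the sweep */
    } Object;

    static Object *all_objects = NULL;

    Object *gc_alloc(void) {
        Object *o = calloc(1, sizeof(Object));
        o->next = all_objects;          /* track every allocation */
        all_objects = o;
        return o;
    }

    /* mark phase: recursively mark everything reachable from a reference */
    static void mark(Object *o) {
        if (o == NULL || o->marked) return;   /* the marked check stops cycles, too */
        o->marked = true;
        for (int i = 0; i < MAX_REFS; i++)
            mark(o->refs[i]);
    }

    /* sweep phase: reclaim, in place, everything left unmarked */
    static void sweep(void) {
        Object **link = &all_objects;
        while (*link) {
            Object *o = *link;
            if (!o->marked) {
                *link = o->next;        /* unlink the dead object */
                free(o);                /* its space is free; nothing moves */
            } else {
                o->marked = false;      /* clear the mark for the next cycle */
                link = &o->next;
            }
        }
    }

    void gc_collect(Object **roots, size_t nroots) {
        for (size_t i = 0; i < nroots; i++)
            mark(roots[i]);             /* start from the root set */
        sweep();
    }

    int main(void) {
        Object *live = gc_alloc();
        live->refs[0] = gc_alloc();     /* reachable child: survives */
        gc_alloc();                     /* unreachable: reclaimed */
        Object *roots[] = { live };
        gc_collect(roots, 1);
        return 0;
    }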
used in any number of places/languages
most GC implementations start as M-S
concurrency: how many threads are in play?
one for mark and sweep operations
one for mark, one for sweep
multiple threads for each?
size of the heap in question
the larger the heap, the longer the phases
the "depth" of the reference tree
objects must have a header (for the mark bit/flag)
the entire heap must be referenceable
for large heaps, mark/sweep phases can be excessively long
tracer must be able to identify references (as opposed to integers)
also sometimes called "arena" collectors
fast allocation
fast reclamation
but huge space overhead (50% of the heap is held in reserve)
divide the heap in half (!)
allocate new objects out of only the first half
the "fromspace"
leave the other half untouched/unused
the "tospace"
when the current space runs out, run a pass
identify objects still in use
copy those objects over to the "tospace"
empty the "fromspace"
"fromspace" is now "tospace" and vice versa
can we copy/move objects? (handles-vs-pointers)
do we have to limit to 2 semispaces?
concurrency? (can we run this concurrently?)
allocation is fast
complete elimination of fragmentation
compaction happens automatically during copy
loss of half of the available heap
means a lot more GC cycles
breaks down (significantly) for larger heaps
reclamation of the "fromspace" may not be as fast as we'd like
destructors/deinitializers still need to be run
long-lived objects require copying each pass
"mark" phase: identify (mark) objects still in use
"compact" phase: rearrange the heap
start from a known "root set" of references
for each object referenced from that root set:
mark the object as in use
identify each object reference in that object
recursively mark said object
thus, only objects still in use will get marked
different languages/platforms will identify different root sets
Java/JVM:
thread stack
static references
finalizable-queue references
.NET/CLR: essentially identical to JVM, plus:
"large object heap" (LOH)
for each marked object, "slide" it up next to another marked object
remaining space in the heap is considered free
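A minimal C sketch of the compact phase in the classic three-pass "sliding" style (names hypothetical; assumes a mark phase like the one sketched earlier has already set the mark bits):

    #include <stdbool.h>
    #include <stddef.h>
    #include <string.h>

    #define HEAP_SIZE (64 * 1024)
    #define MAX_REFS 4

    typedef struct Object {
        bool marked;                   /* set by a prior mark phase */
        size_t size;
        struct Object *forward;        /* the object's computed new address */
        struct Object *refs[MAX_REFS];
    } Object;

    static char heap[HEAP_SIZE];
    static char *heap_top = heap;      /* allocation frontier; objects are contiguous */

    static Object *next_obj(Object *o) {
        return (Object *)((char *)o + o->size);
    }

    void compact(Object **roots, size_t nroots) {
        /* pass 1: compute each live object's new address (slide toward the bottom) */
        char *dst = heap;
        for (Object *o = (Object *)heap; (char *)o < heap_top; o = next_obj(o))
            if (o->marked) {
                o->forward = (Object *)dst;
                dst += o->size;
            }

        /* pass 2: fix up every reference -- roots and parent fields alike */
        for (size_t i = 0; i < nroots; i++)
            if (roots[i]) roots[i] = roots[i]->forward;
        for (Object *o = (Object *)heap; (char *)o < heap_top; o = next_obj(o))
            if (o->marked)
                for (int j = 0; j < MAX_REFS; j++)
                    if (o->refs[j]) o->refs[j] = o->refs[j]->forward;

        /* pass 3: slide each live object into place; moves only go downward,
           so nothing still unvisited ever gets overwritten */
        Object *o = (Object *)heap;
        while ((char *)o < heap_top) {
            Object *next = next_obj(o);     /* read the header before moving */
            if (o->marked) {
                Object *f = o->forward;
                memmove(f, o, o->size);
                f->marked = false;          /* reset for the next cycle */
                f->forward = NULL;
            }
            o = next;
        }
        heap_top = dst;                     /* everything above is one free block */
    }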
is compaction necessary?
concurrency: how many threads are in play?
one for mark and compact operations
one for mark, one for compact
multiple threads for each?
size of the heap
object references: handles or pointers?
objects must have a header (for the mark bit/flag)
tracer must be able to identify references (as opposed to integers)
the entire heap must be referenceable
larger heaps will take longer to mark
"deeper" heaps will take longer to mark
reference/pointer fixups are necessary/critical
if a root-set object is moved, the root-set reference must also adjust
if a child object is moved, the parent reference to it must also adjust
fast ("bump a pointer") allocation
throughput costs of compaction
long-lived objects (will get copied/moved repeatedly)
requires multiple passes over live objects
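The "bump a pointer" allocation mentioned above, as a minimal C sketch (names hypothetical): because compaction leaves the live objects packed at the bottom of the heap, allocation is just advancing one pointer.

    #include <stddef.h>

    static char heap[1 << 20];
    static char *top = heap;               /* the one pointer that gets "bumped" */

    void *bump_alloc(size_t size) {
        size = (size + 7) & ~(size_t)7;    /* keep 8-byte alignment */
        if (top + size > heap + sizeof heap)
            return NULL;                   /* here a real runtime would collect */
        void *p = top;
        top += size;
        return p;
    }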
each time an object is referenced, count goes up
each time an object is "released", count goes down
at count zero, object is eligible for reclamation
either the object frees itself
or a runtime reclaims it at the time of release
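A minimal C sketch of the counting mechanics, in the spirit of COM's AddRef/Release (names hypothetical; a threaded version needs atomic operations):

    #include <stdlib.h>

    typedef struct RcObject {
        int refcount;                          /* stored as a field in the object */
        void (*destroy)(struct RcObject *);    /* cleanup to run at count zero */
    } RcObject;

    void rc_retain(RcObject *o) {
        o->refcount++;       /* under threads this must be an atomic increment */
    }

    void rc_release(RcObject *o) {
        if (--o->refcount == 0) {              /* count hit zero: eligible now */
            if (o->destroy) o->destroy(o);
            free(o);                           /* reclaimed at the point of release */
        }
    }

    int main(void) {
        RcObject *o = calloc(1, sizeof *o);
        o->refcount = 1;     /* the creating reference */
        rc_retain(o);        /* another reference taken: 1 -> 2 */
        rc_release(o);       /* 2 -> 1 */
        rc_release(o);       /* 1 -> 0: destroyed and freed right here */
        return 0;
    }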
Objective-C (NeXT, Mac, iOS)
Swift (iOS)
Microsoft COM
early (pre-1.0) Java releases
who is responsible for inc/dec operations?
what is the maximum number of references?
memory management costs are distributed throughout the application
no runtime requirements/costs
can factor in as part of libraries (C++/Boost)
easy native language interoperability
pointers to objects are simply pointers
refct is stored as a field in the object
time overhead on the mutator
every operation must manipulate reference counts
ref count manipulations and pointer load/store must be atomic
potential concurrency contention
read-only operations can turn into read-write operations
meaning, system must inc ref count, which is a write op
cyclical references can create unreclaimable garbage (see the sketch after this list)
pause times still possible
destroying one object can trigger a "destruction storm" of cascading releases
heap can get fragmented
no rearrangement/compaction opportunities
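The cycle problem mentioned above, as a minimal C sketch (names hypothetical): two nodes that reference each other keep each other's counts above zero forever.

    #include <stdlib.h>

    typedef struct Node {
        int refcount;
        struct Node *next;
    } Node;

    int main(void) {
        Node *a = calloc(1, sizeof(Node));
        Node *b = calloc(1, sizeof(Node));
        a->next = b;                /* a references b ... */
        b->next = a;                /* ... and b references a */
        a->refcount = 2;            /* the local `a` plus b->next */
        b->refcount = 2;            /* the local `b` plus a->next */
        /* drop the local references: each count falls only to 1, never to 0,
           so neither node is ever reclaimed -- the cycle is leaked */
        a->refcount--; b->refcount--;
        a = b = NULL;
        return 0;
    }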
it can manage memory
it does not manage "external" resources well
different GCs have different sets of assumptions
... yielding vastly different performance curves
"Garbage Collection", by Jones/Lins (Wiley, 1996/1999)
this is my "gold standard" book; best starting point for me
"The Garbage Collection Handbook", by Jones/Hosking/Moss (Wiley, 2011)
this is Jones' "successor" book to the above
"A method for overlapping and erasure of lists." (George E. Collins)
Communications of the ACM, 3(12):655-657, December 1960
"A Lisp garbage collector for virtual memory computer systems." (Robert R. Fenichel and Jerome C. Yochelson)
Communications of the ACM, 12(11):611-612, November 1969
"Recursive functions of symbolic expressions and their computation by machine." (John McCarthy)
Communications of the ACM, 3(4):184-195, April 1960
Architect, Engineering Manager/Leader, "force multiplier"
http://www.newardassociates.com
http://blogs.newardassociates.com
Sr Distinguished Engineer, Capital One
Educative (http://educative.io) Author
Performance Management for Engineering Managers
Books
Developer Relations Activity Patterns (w/Woodruff, et al; APress, forthcoming)
Professional F# 2.0 (w/Erickson, et al; Wrox, 2010)
Effective Enterprise Java (Addison-Wesley, 2004)
SSCLI Essentials (w/Stutz, et al; O'Reilly, 2003)
Server-Based Java Programming (Manning, 2000)