ted.neward@newardassociates.com | Blog: http://blogs.newardassociates.com | Github: tedneward | LinkedIn: tedneward
... aka "garbage collection"
What it is
What it isn't
How it works
It was said that C programmers knew that memory management was so important, it could not be left up to the system.
And it was said that Lisp programmers knew that memory management was so important, it could not be left up to the programmers.
as contrasted with "manual memory management"
basic idea is pretty simple:
programmers shouldn't have to manage memory
but the system should only release stuff not in use
within that simple idea lies a ton of complexity
to allocate space for new objects
to reclaim space occupied by dead objects
as part of the above, identify live objects (to prevent reclamation)
safety
prime directive: no object still "in use" should go away!
throughput
how many objects (allocations and reclamations) the system can process per unit of time
speed/pause time
how long does it take to do a reclamation?
space overhead
above and beyond the actual allocated space
completeness
do we get all the reclamation-eligible space?
scalability and portability
how well can it take advantage of multi-core hardware?
how easily does it move across platforms and architectures?
first appeared in the early 60s
Smalltalk
Lisp
some functional languages
gained mainstream traction with the rise of the "VM"s
JVM, CLR
Python, Ruby
"allocation": providing memory for use by a program
"deallocation", "reclamation": making memory available for allocation
"conservative": no live object will ever be reclaimed
"automatic" vs "manual": programmer effort requirement
"automatic": the underlying language/platform handles it
"manual": the programmer must take care of it
this applies to both allocation and deallocation
most O-O languages combine allocation with initialization
ObjC is an example of one that didn't
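To make the split concrete, a minimal C sketch (names like point_alloc/point_init are hypothetical): allocation and initialization stay separate steps, much as ObjC's alloc/init pair keeps them, where Java's `new` fuses them into one.

    #include <stdlib.h>

    typedef struct { int x, y; } Point;

    /* allocation: obtain raw memory (like ObjC's +alloc) */
    Point *point_alloc(void) {
        return malloc(sizeof(Point));
    }

    /* initialization: put the object into a valid state (like ObjC's -init) */
    Point *point_init(Point *p, int x, int y) {
        if (p) { p->x = x; p->y = y; }
        return p;
    }

    int main(void) {
        Point *p = point_init(point_alloc(), 3, 4);  /* two distinct steps */
        free(p);                                     /* manual deallocation */
        return 0;
    }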
Reference counting
Mark-sweep
Copying
Mark-compact
Mark-sweep-compact
Generational collectors
some will combine them together
JVM is a perfect example: copying + generational
others will allow for "tuning" in varying ways
JVM offers a gazillion tuning parameters
CLR puts "large objects" into their own heap
"mark" phase: identify (mark) objects still in use
"sweep" phase: remove/reclaim any objects not so marked
start from a known "root set" of references
for each object referenced from that root set:
mark the object as in use
identify each object reference in that object
recursively mark said object
thus, only objects still in use will get marked
different languages/platforms will identify different root sets
Java/JVM:
thread stack
static references
finalizable-queue references
.NET/CLR: essentially identical to JVM, plus:
"large object heap" (LOH)
object is simply reclaimed in place
objects are NOT moved or adjusted in the heap
the space the object occupied is now considered free
potentially may be combined with adjacent free spaces
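A minimal, single-threaded C sketch of the two phases over a toy heap (all names hypothetical; real collectors track objects and headers far more compactly):

    #include <stdbool.h>
    #include <stdlib.h>
    #include <stddef.h>

    #define MAX_REFS 4

    /* every object carries a header holding the mark bit */
    typedef struct Object {
        bool marked;                    /* the mark flag */
        struct Object *refs[MAX_REFS];  /* outgoing references */
        struct Object *next;            /* all-objects list, walked by the sweep */
    } Object;

    static Object *all_objects = NULL;

    Object *gc_alloc(void) {
        Object *o = calloc(1, sizeof(Object));
        o->next = all_objects;          /* track every allocation */
        all_objects = o;
        return o;
    }

    /* mark phase: recursively mark everything reachable from a reference */
    static void mark(Object *o) {
        if (o == NULL || o->marked) return;   /* the marked check stops cycles, too */
        o->marked = true;
        for (int i = 0; i < MAX_REFS; i++)
            mark(o->refs[i]);
    }

    /* sweep phase: reclaim, in place, everything left unmarked */
    static void sweep(void) {
        Object **link = &all_objects;
        while (*link) {
            Object *o = *link;
            if (!o->marked) {
                *link = o->next;        /* unlink the dead object */
                free(o);                /* its space is free; nothing moves */
            } else {
                o->marked = false;      /* clear the mark for the next cycle */
                link = &o->next;
            }
        }
    }

    void gc_collect(Object **roots, size_t nroots) {
        for (size_t i = 0; i < nroots; i++)
            mark(roots[i]);             /* start from the root set */
        sweep();
    }

    int main(void) {
        Object *live = gc_alloc();
        live->refs[0] = gc_alloc();     /* reachable child: survives */
        gc_alloc();                     /* unreachable: reclaimed */
        Object *roots[] = { live };
        gc_collect(roots, 1);
        return 0;
    }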
used in any number of places/languages
most GC implementations start as M-S
concurrency: how many threads are in play?
one for mark and sweep operations
one for mark, one for sweep
multiple threads for each?
size of the heap in question
the larger the heap, the longer the phases
the "depth" of the reference tree
objects must have a header (for the mark bit/flag)
the entire heap must be referenceable
for large heaps, mark/sweep phases can be excessively long
tracer must be able to identify references (as opposed to integers)
also sometimes called "arena" collectors
fast allocation
fast reclamation
but huge space overhead (50% of the heap is held in reserve)
divide the heap in half (!)
allocate new objects out of only the first half
the "fromspace"
leave the other half untouched/unused
the "tospace"
when the current space runs out, run a pass
identify objects still in use
copy those objects over to the "tospace"
empty the "fromspace"
"fromspace" is now "tospace" and vice versa
can we copy/move objects? (handles-vs-pointers)
do we have to limit to 2 semispaces?
concurrency? (can we run this concurrently?)
allocation is fast
complete elimination of fragmentation
compaction happens automatically during copy
loss of half of the available heap
means a lot more GC cycles
breaks down (significantly) for larger heaps
reclamation of the "fromspace" may not be as fast as we'd like
destructors/deinitializers still need to be run
long-lived objects require copying each pass
"mark" phase: identify (mark) objects still in use
"compact" phase: rearrange the heap
start from a known "root set" of references
for each object referenced from that root set:
mark the object as in use
identify each object reference in that object
recursively mark said object
thus, only objects still in use will get marked
different languages/platforms will identify different root sets
Java/JVM:
thread stack
static references
finalizable-queue references
.NET/CLR: essentially identical to JVM, plus:
"large object heap" (LOH)
for each marked object, "slide" it up next to another marked object
remaining space in the heap is considered free
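A minimal C sketch of the compact phase in the classic three-pass "sliding" style (names hypothetical; assumes a mark phase like the one sketched earlier has already set the mark bits):

    #include <stdbool.h>
    #include <stddef.h>
    #include <string.h>

    #define HEAP_SIZE (64 * 1024)
    #define MAX_REFS 4

    typedef struct Object {
        bool marked;                   /* set by a prior mark phase */
        size_t size;
        struct Object *forward;        /* the object's computed new address */
        struct Object *refs[MAX_REFS];
    } Object;

    static char heap[HEAP_SIZE];
    static char *heap_top = heap;      /* allocation frontier; objects are contiguous */

    static Object *next_obj(Object *o) {
        return (Object *)((char *)o + o->size);
    }

    void compact(Object **roots, size_t nroots) {
        /* pass 1: compute each live object's new address (slide toward the bottom) */
        char *dst = heap;
        for (Object *o = (Object *)heap; (char *)o < heap_top; o = next_obj(o))
            if (o->marked) {
                o->forward = (Object *)dst;
                dst += o->size;
            }

        /* pass 2: fix up every reference -- roots and parent fields alike */
        for (size_t i = 0; i < nroots; i++)
            if (roots[i]) roots[i] = roots[i]->forward;
        for (Object *o = (Object *)heap; (char *)o < heap_top; o = next_obj(o))
            if (o->marked)
                for (int j = 0; j < MAX_REFS; j++)
                    if (o->refs[j]) o->refs[j] = o->refs[j]->forward;

        /* pass 3: slide each live object into place; moves only go downward,
           so nothing still unvisited ever gets overwritten */
        Object *o = (Object *)heap;
        while ((char *)o < heap_top) {
            Object *next = next_obj(o);     /* read the header before moving */
            if (o->marked) {
                Object *f = o->forward;
                memmove(f, o, o->size);
                f->marked = false;          /* reset for the next cycle */
                f->forward = NULL;
            }
            o = next;
        }
        heap_top = dst;                     /* everything above is one free block */
    }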
is compaction necessary?
concurrency: how many threads are in play?
one for mark and compact operations
one for mark, one for compact
multiple threads for each?
size of the heap
object references: handles or pointers?
objects must have a header (for the mark bit/flag)
tracer must be able to identify references (as opposed to integers)
the entire heap must be referenceable
larger heaps will take longer to mark
"deeper" heaps will take longer to mark
reference/pointer fixups are necessary/critical
if a root-set object is moved, the root-set reference must also adjust
if a child object is moved, the parent reference to it must also adjust
fast ("bump a pointer") allocation
throughput costs of compaction
long-lived objects (will get copied/moved repeatedly)
requires multiple passes over live objects
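The "bump a pointer" allocation mentioned above, as a minimal C sketch (names hypothetical): because compaction leaves the live objects packed at the bottom of the heap, allocation is just advancing one pointer.

    #include <stddef.h>

    static char heap[1 << 20];
    static char *top = heap;               /* the one pointer that gets "bumped" */

    void *bump_alloc(size_t size) {
        size = (size + 7) & ~(size_t)7;    /* keep 8-byte alignment */
        if (top + size > heap + sizeof heap)
            return NULL;                   /* here a real runtime would collect */
        void *p = top;
        top += size;
        return p;
    }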
each time an object is referenced, count goes up
each time an object is "released", count goes down
at count zero, object is eligible for reclamation
either the object frees itself
or a runtime reclaims it at the time of release
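A minimal C sketch of the counting mechanics, in the spirit of COM's AddRef/Release (names hypothetical; a threaded version needs atomic operations):

    #include <stdlib.h>

    typedef struct RcObject {
        int refcount;                          /* stored as a field in the object */
        void (*destroy)(struct RcObject *);    /* cleanup to run at count zero */
    } RcObject;

    void rc_retain(RcObject *o) {
        o->refcount++;       /* under threads this must be an atomic increment */
    }

    void rc_release(RcObject *o) {
        if (--o->refcount == 0) {              /* count hit zero: eligible now */
            if (o->destroy) o->destroy(o);
            free(o);                           /* reclaimed at the point of release */
        }
    }

    int main(void) {
        RcObject *o = calloc(1, sizeof *o);
        o->refcount = 1;     /* the creating reference */
        rc_retain(o);        /* another reference taken: 1 -> 2 */
        rc_release(o);       /* 2 -> 1 */
        rc_release(o);       /* 1 -> 0: destroyed and freed right here */
        return 0;
    }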
Objective-C (NeXT, Mac, iOS)
Swift (iOS)
Microsoft COM
early (pre-1.0) Java releases
who is responsible for inc/dec operations?
what is the maximum number of references?
memory management costs are distributed throughout the application
no runtime requirements/costs
can factor in as part of libraries (C++/Boost)
easy native language interoperability
pointers to objects are simply pointers
refct is stored as a field in the object
time overhead on the mutator
every operation must manipulate reference counts
ref count manipulations and pointer load/store must be atomic
potential concurrency contention
read-only operations can turn into read-write operations
meaning, system must inc ref count, which is a write op
cyclical references can create unreclaimable garbage (see the sketch after this list)
pause times still possible
destroying one object can trigger a "destruction storm" of cascading releases
heap can get fragmented
no rearrangement/compaction opportunities
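The cycle problem mentioned above, as a minimal C sketch (names hypothetical): two nodes that reference each other keep each other's counts above zero forever.

    #include <stdlib.h>

    typedef struct Node {
        int refcount;
        struct Node *next;
    } Node;

    int main(void) {
        Node *a = calloc(1, sizeof(Node));
        Node *b = calloc(1, sizeof(Node));
        a->next = b;                /* a references b ... */
        b->next = a;                /* ... and b references a */
        a->refcount = 2;            /* the local `a` plus b->next */
        b->refcount = 2;            /* the local `b` plus a->next */
        /* drop the local references: each count falls only to 1, never to 0,
           so neither node is ever reclaimed -- the cycle is leaked */
        a->refcount--; b->refcount--;
        a = b = NULL;
        return 0;
    }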
it can manage memory
it does not manage "external" resources well
different GCs have different sets of assumptions
... yielding vastly different performance curves
"Garbage Collection", by Jones/Lins (Wiley, 1996/1999)
this is my "gold standard" book; best starting point for me
"The Garbage Collection Handbook", by Jones/Hosking/Moss (Wiley, 2011)
this is Jones' "successor" book to the above
"A method for overlapping and erasure of lists." (George E. Collins)
Communications of the ACM, 3(12):655-657, December 1960
"A Lisp garbage collector for virtual memory computer systems." (Robert R. Fenichel and Jerome C. Yochelson)
Communications of the ACM, 12(11):611-612, November 1969
"Recursive functions of symbolic expressions and their computation by machine." (John McCarthy)
Communications of the ACM, 3(4):184-195, April 1960
Architect, Engineering Manager/Leader, "force multiplier"
http://www.newardassociates.com
http://blogs.newardassociates.com
Sr Distinguished Engineer, Capital One
Educative (http://educative.io) Author
Performance Management for Engineering Managers
Books
Developer Relations Activity Patterns (w/Woodruff, et al; APress, forthcoming)
Professional F# 2.0 (w/Erickson, et al; Wrox, 2010)
Effective Enterprise Java (Addison-Wesley, 2004)
SSCLI Essentials (w/Stutz, et al; O'Reilly, 2003)
Server-Based Java Programming (Manning, 2000)