ted.neward@newardassociates.com | Blog: http://blogs.newardassociates.com | Github: tedneward | LinkedIn: tedneward
Java has a built-in multithreading model
learn when to use it
learn when not to use it
learn how to use it
Why go concurrent?
Concurrency carries with it some costs
most systems have two characteristics:
small # of CPUs
large # of concurrent tasks to run
so we "task-switch"
task-switching carries overhead
for any collection of work "w", dividing it into "n" parts and task-switching among them takes longer than running "w" straight through, due to the task-switch time "q" (the "quantum")
So why do it?
"wait time"
while the CPU is waiting for a result, let other tasks run
"cancellation"/responsiveness
while a task is waiting for a result, keep the UI (or other foreground activities) responsive
segregation of work
some problems lend themselves naturally to being seen as separate-yet-connected sequences
"massively parallel" problems
shifts in hardware/CPU design
a Java Thread may or may not be backed by a "real" OS thread
this is entirely up to the Java VM implementation
most are 1:1 ("native")
some are m:n
few are n:1 ("green")
fairly simple API model
either subclass Thread & override run()
...
... or construct a Thread and pass in a Runnable (no return value, no checked exceptions), overriding run()
Callable (return value, throws clause), overriding call(), is the related task interface, but it is submitted to an ExecutorService rather than passed to a Thread constructor
generally preferred to not inherit from Thread
call start()
to begin the thread's processing
no guarantees regarding scheduling
Thread t = new Thread(() -> System.out.println("Howdy, threaded world!"));
Thread t2 = new Thread(new Runnable() {
    public void run() { System.out.println("Howdy, old-school Thread!"); }
});
Thread t3 = new Thread() {
    public void run() { System.out.println("Howdy from a Thread subclass!"); }
};
t.start(); t2.start(); t3.start();
JVM will not die until all foreground threads terminate
to set Thread to background status, setDaemon(true)
must be done before call to start()
higher-priority thread gets first crack at CPU
Thread priorities range from 1 (MIN) to 10 (MAX)
default priority is 5 (NORM)
Threads inherit priority of parent Thread
can be changed after start() via setPriority()
Thread is either runnable or not-runnable
in other words, is Thread being scheduled?
runnable does not imply running
Thread could be blocked or waiting
not-runnable does not imply "dead"
Thread could be waiting to be started... or paused...
but once dead, Thread remains dead
... wait for it to finish using join()
... get stack trace (StackTraceElement)
... get name, state, priority, etc.
... suspend, resume, stop, destroy (deprecated!)
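A minimal sketch of the lifecycle APIs above (setDaemon(), setPriority(), getState(), join()); the Runnable body is an illustrative assumption:

public class ThreadApiSketch {
    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() ->
            System.out.println("working on " + Thread.currentThread().getName()));

        worker.setDaemon(false);                  // must be set before start()
        worker.setPriority(Thread.NORM_PRIORITY); // 5; MIN_PRIORITY is 1, MAX_PRIORITY is 10

        worker.start();                           // no guarantee when it gets scheduled
        System.out.println(worker.getState());    // e.g. RUNNABLE

        worker.join();                            // block until the worker finishes
        System.out.println(worker.getState());    // TERMINATED
    }
}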
simple completion
run() completes
voluntary suicide watch
track a variable for signal to complete
(but beware the thread cache!)
interruption
via Thread.interrupt() & InterruptedException
forced termination
via Thread.stop(), which is dangerous
stop() can take a Throwable to throw on the target Thread
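One hedged sketch of cooperative termination, combining the volatile quit-flag and interrupt() approaches above; the class and field names are illustrative assumptions:

public class StoppableWorker implements Runnable {
    private volatile boolean quit = false;          // volatile so the worker sees updates (no thread cache)

    public void requestQuit() { quit = true; }

    public void run() {
        while (!quit && !Thread.currentThread().isInterrupted()) {
            try {
                Thread.sleep(1000);                 // stand-in for a unit of real work
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the interrupt status and exit
                return;
            }
        }
    }
}

Either requestQuit() or Thread.interrupt() will end the loop; Thread.stop() never enters the picture.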
created to decouple the launching of tasks from the management of Thread objects directly
execute()
takes Runnable, executes it
actual behavior depends on Executor implementation
generally a superior tool to use
assumes that the threads want to be tracked and massaged
submit()
returns Future representing future results
shutdown()
stops accepting new tasks
awaitTermination()
blocks until all tasks complete
shutdownNow()
attempts to stop actively running tasks (via interruption)
Executors: a utility class of static factory methods
the starting point for obtaining ExecutorService instances
newSingleThreadExecutor: a single worker Thread
newFixedThreadPool: fixed-size pool of Threads
newCachedThreadPool: reuses idle Threads from a pool, creating new ones as needed
some of these take a ThreadFactory parameter
defaultThreadFactory is fine for most of these
customizing this is only for the brave
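A sketch of the ExecutorService lifecycle described above (submit a Callable, receive a Future, then shut down); the pool size and task body are assumptions:

import java.util.concurrent.*;

public class ExecutorSketch {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);

        Future<Integer> result = pool.submit(() -> 6 * 7);   // Callable<Integer>, not Runnable
        System.out.println(result.get());                    // blocks until the task completes: 42

        pool.shutdown();                                     // stop accepting new tasks
        if (!pool.awaitTermination(5, TimeUnit.SECONDS)) {
            pool.shutdownNow();                              // interrupt anything still running
        }
    }
}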
A concurrent joke: Why did the multithreaded chicken cross the road?
To side get to other the !
get side ! To to other the
eo gTts or ttthde ohei e!
... this joke never ends!
Concurrent activities usually require some degree of synchronization
cars at an intersection
employees working on a project
aircraft landing at an airport
deposits and withdrawals at a bank
processes reading/writing from the same file
threads executing within the same process
Multiple paths of execution create multiple interleaving permutations of execution
this means we must protect variables being accessed
even for simple statements like x = x + 1
32-bit CPUs guarantee 32-bit read/write atomicity
meaning 64-bit values (longs, doubles) are not guaranteed to be read or written atomically!
Concurrent synchronization is a balancing act
Safety: "Nothing bad ever happens"
Mutual exclusion: no more than one process is ever present in a critical region
No deadlock: no process is ever delayed awaiting an event that cannot occur
Partial correctness: if a program terminates, the output is what is required
Liveness: "Something will eventually happen"
Fairness (weak): a process that can execute will eventually be executed
Reliable communication: a message sent by one process to another will be received
Total correctness: a program terminates and the output is what is required
Patterns for safely representing/managing state:
Immutability: change nothing
Locking: enforce access/mutual exclusion
State dependence: define policies for actions that might affect state
policies can include: blind action; inaction; balking; guarding; trying; retrying; timeout; planning
Containment: structurally guarantee exclusive access by encapsulation
Splitting: isolate independent aspects of state into parts
A sample concurrency problem:
write a thread that "pings" every five seconds, until asked to quit by the user
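One possible sketch of a solution, using interrupt() as the quit signal and ENTER on System.in as the stand-in for "asked to quit by the user" (an assumption, since the slides do not specify the input mechanism):

public class Pinger {
    public static void main(String[] args) throws Exception {
        Thread pinger = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                System.out.println("ping");
                try {
                    Thread.sleep(5000);           // five seconds between pings
                } catch (InterruptedException e) {
                    return;                       // asked to quit
                }
            }
        });
        pinger.start();

        System.out.println("Press ENTER to quit");
        System.in.read();
        pinger.interrupt();
        pinger.join();
    }
}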
The JVM memory model
lists requirements regarding memory optimization
allows threads to have per-thread value caches
avoids having to check main heap memory
also means multiple copies...
volatile
tells the JVM not to cache the field per-thread (volatile applies to fields, not locals)
this suffices for simple synchronization...
but suffers in performance
AtomicBoolean
AtomicInteger & AtomicIntegerArray
AtomicLong
AtomicReference
all support the typical collection of operations desired against atomic values: increment, decrement, update-and-return-new-value, update-and-return-old-value, and so on
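A small sketch of those operations using AtomicInteger; the counter name and iteration counts are assumptions:

import java.util.concurrent.atomic.AtomicInteger;

public class AtomicSketch {
    private static final AtomicInteger counter = new AtomicInteger(0);

    public static void main(String[] args) throws InterruptedException {
        Runnable work = () -> {
            for (int i = 0; i < 10_000; i++) {
                counter.incrementAndGet();               // update-and-return-new-value
            }
        };
        Thread a = new Thread(work), b = new Thread(work);
        a.start(); b.start();
        a.join(); b.join();

        System.out.println(counter.get());               // always 20000, no explicit lock needed
        System.out.println(counter.getAndSet(0));        // update-and-return-old-value
        counter.compareAndSet(0, 42);                    // conditional update
    }
}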
A sample concurrency problem
multiple threads access a shared data structure
problem: individual put()s and get()s may be in the middle of bookkeeping when a thread-switch occurs
nodes could get lost
nodes could get double-inserted
existing list could get horribly corrupted
Monitors are the basis for JVM thread synch
every object in VM has an associated monitor
synchronized block tells executing thread to attempt acquisition of indicated object's monitor
thread blocks while waiting
thus, marking a method as synchronized forces the thread to acquire the object's monitor before executing the method body
synchronized also forces synchronization of main heap and cached thread values
this is called a "memory barrier"
Monitor ownership
marking method as synchronized is equivalent to marking entire body of method as synchronized(this)
this means that anyone can attempt to acquire that object's monitor at any time; in other words, concurrency behavior is not encapsulated
alternative solution is to create private "lock" Object, and synchronize on that, instead
be careful: simple synchronization of methods does not eliminate all concurrency bugs
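A sketch of the private-lock-object idiom just described; the counter field is an illustrative assumption:

public class EncapsulatedLock {
    private final Object lock = new Object();   // callers cannot synchronize on this, so the policy stays encapsulated
    private long counter;                       // 64-bit, so unsynchronized writes are not guaranteed atomic

    public void increment() {
        synchronized (lock) {                   // acquire the lock object's monitor
            counter = counter + 1;              // read-modify-write is now safe
        }                                       // releasing the monitor acts as a memory barrier
    }

    public long current() {
        synchronized (lock) {
            return counter;
        }
    }
}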
Locks allow developers to constrain execution
Lock interface provides basic lock/unlock pair
make sure to do this in try/finally block!
ReentrantLock: Lock that re-implements synchronized block behavior
allows only one thread through, but same thread can re-lock() without deadlocking itself
ReadWriteLock allows segregation of reads & writes
ReentrantReadWriteLock combines ReentrantLock with read/write semantics
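A sketch of the Lock idioms above: unlock in finally, and use a ReadWriteLock so readers can proceed in parallel; field names are assumptions:

import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LockSketch {
    private final ReadWriteLock rw = new ReentrantReadWriteLock();
    private int value;

    public void write(int newValue) {
        Lock w = rw.writeLock();
        w.lock();
        try {
            value = newValue;    // exclusive access
        } finally {
            w.unlock();          // always release, even if the body throws
        }
    }

    public int read() {
        Lock r = rw.readLock();
        r.lock();
        try {
            return value;        // many readers may hold the read lock at once
        } finally {
            r.unlock();
        }
    }
}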
A sample concurrency problem
producer thread creates data for processing
consumer thread processes created data
only one data item can be ready at a time
problem: how does producer not overwrite data?
problem: how does consumer not process twice?
problem: how do we have multiple producers and/or consumers?
Threads sometimes need to signal each other
Object.wait()
requires the Thread to own the monitor; it releases the monitor, sleeps until notified, then attempts to re-acquire it and continues
Object.notify()
signals one waiting Thread
Object.notifyAll()
signals all waiting Threads
this forms the basis for signalling, but is low-level & awkward
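A low-level wait()/notifyAll() sketch of the single-slot producer/consumer problem above; the class name and generic slot are illustrative assumptions:

public class WaitNotifySlot<T> {
    private T item;                               // null means "empty"

    public synchronized void put(T newItem) throws InterruptedException {
        while (item != null) {                    // always re-check the condition in a loop
            wait();                               // releases the monitor while sleeping
        }
        item = newItem;
        notifyAll();                              // wake any waiting consumers
    }

    public synchronized T take() throws InterruptedException {
        while (item == null) {
            wait();
        }
        T result = item;
        item = null;
        notifyAll();                              // wake any waiting producers
        return result;
    }
}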
Conditional locking is very common
Condition interface captures this; obtained from Lock
Conditions are thus intrinsically bound against a Lock
obtain new Condition from Lock via newCondition()
call await() where formerly a wait() call appeared
call signal() where formerly a notify() call appeared (and signalAll() for notifyAll())
Condition can provide additional semantics not present in wait/notify methods
guaranteed order, unlocked execution, and so on
Condition implementation must doc these semantics
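The same single-slot buffer sketched with Lock/Condition: await() where wait() appeared, signalAll() where notifyAll() appeared; class and field names are assumptions:

import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class ConditionSlot<T> {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notEmpty = lock.newCondition();
    private final Condition notFull  = lock.newCondition();
    private T item;

    public void put(T newItem) throws InterruptedException {
        lock.lock();
        try {
            while (item != null) {
                notFull.await();          // releases the lock while waiting
            }
            item = newItem;
            notEmpty.signalAll();
        } finally {
            lock.unlock();
        }
    }

    public T take() throws InterruptedException {
        lock.lock();
        try {
            while (item == null) {
                notEmpty.await();
            }
            T result = item;
            item = null;
            notFull.signalAll();
            return result;
        } finally {
            lock.unlock();
        }
    }
}

Separate notEmpty/notFull Conditions on one Lock let producers and consumers wait on distinct signals, something the single per-object monitor cannot express.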
Java 5 introduced some Collections classes with additional concurrency-safe methods
Lists and Sets
CopyOnWriteArrayList/CopyOnWriteArraySet
Maps and Sets
ConcurrentSkipListMap/ConcurrentSkipListSet
BlockingQueue, BlockingDeque
ArrayBlockingQueue(!), LinkedBlockingQueue, DelayQueue, PriorityBlockingQueue, SynchronousQueue
ConcurrentMap
ConcurrentHashMap
Use these as often as possible!
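A sketch showing how a BlockingQueue removes the hand-rolled signalling entirely: put() blocks when full, take() blocks when empty; the capacity and values are assumptions:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BlockingQueueSketch {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(1);   // single slot

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < 5; i++) {
                    queue.put(i);                 // blocks until the consumer takes
                }
            } catch (InterruptedException ignored) { }
        });

        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < 5; i++) {
                    System.out.println(queue.take());
                }
            } catch (InterruptedException ignored) { }
        });

        producer.start(); consumer.start();
        producer.join(); consumer.join();
    }
}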
Summary
multiple threads offer powerful means to enhance application throughput
concurrency is not easy
so rely on the java.util.concurrent API and its implementations
Architect, Engineering Manager/Leader, "force multiplier"
http://www.newardassociates.com
http://blogs.newardassociates.com
Sr Distinguished Engineer, Capital One
Educative (http://educative.io) Author
Performance Management for Engineering Managers
Books
Developer Relations Activity Patterns (w/Woodruff, et al; APress, forthcoming)
Professional F# 2.0 (w/Erickson, et al; Wrox, 2010)
Effective Enterprise Java (Addison-Wesley, 2004)
SSCLI Essentials (w/Stutz, et al; O'Reilly, 2003)
Server-Based Java Programming (Manning, 2000)