ted@tedneward.com | Blog: http://blogs.tedneward.com | Twitter: tedneward | Github: tedneward | LinkedIn: tedneward
Understand the basics of messaging-based communication
what messaging is
how messaging works
Recognize when to use it
what messaging offers
what complications messaging presents
Learn some of the patterns around messaging
message-exchange patterns
architectural communication approaches
See some concrete messaging plumbing
Applications need to talk to one another
no violations of the "Once and Only Once" rule
processing sometimes needs centralization for correctness
processing sometimes needs distribution for scale
either way, machines need to talk to machines
The traditional answer has been RPC
"make it look like a function call"
define an interface (for compile-time type safety)
write server-side implementation
tools generate the plumbing necessary between the two
RPC can't solve all communication issues
What if the network connectivity is flaky or periodically out?
traveling salesmen, dial-up connections, wireless ...
What if we get huge burst loads?
"the Slashdot effect"
What if we need to scale out?
"today Duluth; tomorrow, the world!"
What if we need to offer priority to certain clients?
"It's the VP on port 34865, and he's in a hurry..."
What if we need transactional semantics when communicating with multiple recipients?
"When it absolutely, positively has to be there"
What if we evolve?
"We didn't know we needed it when we wrote the spec!"
RPC enables easy communication, at a cost:
request/response communication
for every request, we expect a response
block caller thread until response is received
servers and servants must be available and well-known
RPC proxies typically "pin" against a given server
servant object must be up & running to answer
if server or servant dies, RPC proxies will fail
proxies and stubs are compile-time constants
strong binding/typing makes programming easier
but makes evolution/change harder
RPC exposes behavior
Strip the veneer off RPC
RPC == request message + response message
break the messages apart, treat them independently
allows for different message exchange patterns
request-response
fire-and-forget
solicit-notify
request-async response
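A minimal sketch of two of the exchange patterns above, assuming in-process `queue.Queue` objects stand in for real destinations. The names `request_q`, `reply_q`, `fire_and_forget`, `request_async`, and `serve_one` are made up for illustration. The point is only that once request and response are separate messages, the sender decides whether a response is wanted at all.

```python
import queue
import threading

request_q = queue.Queue()   # destination for request messages
reply_q = queue.Queue()     # separate destination for response messages

def fire_and_forget(payload):
    """Send a message and return immediately; no response is expected."""
    request_q.put({"reply_to": None, "body": payload})

def request_async(payload):
    """Send a request; the response arrives later on reply_q."""
    request_q.put({"reply_to": reply_q, "body": payload})

def serve_one():
    """Consume one request and answer only if the sender asked for a reply."""
    msg = request_q.get()
    if msg["reply_to"] is not None:
        msg["reply_to"].put({"body": msg["body"].upper()})

fire_and_forget("log this")
request_async("ping")
# The caller is free to do other work here; nothing has blocked.
worker = threading.Thread(target=lambda: (serve_one(), serve_one()))
worker.start()
worker.join()
response = reply_q.get(timeout=1)   # pick up the response when convenient
```

Only the second message produces a response; the first is consumed silently.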
Messaging exposes data
instead of creating contracts based on interfaces...
create contracts based on message types (Messaging)
Note: seductively similar to RPC
"On a NewCustomer message, create a new customer..."
strive for context-complete communications
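One way to picture a contract based on message types is a dispatch table keyed by type name. This is a sketch under assumptions: the JSON envelope with `type` and `body` fields, the `CustomerRenamed` type, and the handler names are all hypothetical, not part of any particular messaging product.

```python
import json

customers = {}

def on_new_customer(body):
    # Context-complete: the message carries everything needed to act on it.
    customers[body["id"]] = {"name": body["name"], "email": body["email"]}

def on_customer_renamed(body):
    customers[body["id"]]["name"] = body["name"]

# The "contract" is the set of message types, not a compiled interface.
HANDLERS = {
    "NewCustomer": on_new_customer,
    "CustomerRenamed": on_customer_renamed,
}

def dispatch(raw):
    """Decode a JSON message and route it by its declared type."""
    msg = json.loads(raw)
    handler = HANDLERS.get(msg["type"])
    if handler is None:
        return False          # unknown types are skipped, not compile errors
    handler(msg["body"])
    return True

dispatch('{"type": "NewCustomer", '
         '"body": {"id": 7, "name": "Fred", "email": "fred@example.com"}}')
```

Adding a new message type means adding a table entry; existing senders and receivers are untouched.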
Anatomy of a MOM system
Message = headers (routing info) + payload
Delivery can be asynchronous
Format is unspecified (loosely bound)
Destination = named message repository
Decouples message producer/consumer
Allows for easier redirection/change of call flow
Runtime = variety of delivery semantics
Reliable, transacted, prioritized, deadline-based, publish-and-subscribe etc
Provides one or more channels to deliver messages
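The three parts can be sketched in a few lines. This is a toy, not a real MOM: `Message`, `Runtime`, and the `priority` header are illustrative names, and "highest priority first" is just one delivery semantic picked for the example.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class Message:
    payload: bytes                                # format is up to the applications
    headers: dict = field(default_factory=dict)   # routing info

class Runtime:
    """Toy runtime: owns named destinations and delivers in priority order."""
    def __init__(self):
        self.destinations = {}

    def send(self, destination, message):
        self.destinations.setdefault(destination, deque()).append(message)

    def receive(self, destination):
        pending = self.destinations.get(destination)
        if not pending:
            return None
        # Delivery semantic chosen for this sketch: highest priority first.
        best = max(pending, key=lambda m: m.headers.get("priority", 0))
        pending.remove(best)
        return best

mom = Runtime()
mom.send("orders", Message(b"routine order"))
mom.send("orders", Message(b"VP order", {"priority": 9}))
first = mom.receive("orders")
```

The producer only knows the destination name `"orders"`; who consumes from it, and in what order, is the runtime's business.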
File transfer
Messages: files
Destination: filesystem subdirectory
Runtime: operating system
Shared database
Messages: data tuples
Destination: database tables
Runtime: database
JMS/MSMQ
Messages: byte, text, object, stream, map
Destination: queues (point-to-point), topics (publish-subscribe)
Runtime: MOM support
SOAP
Messages: SOAP XML format
Destination: (depends on transport)
Runtime: Axis, WebLogic, etc
Benefits of a MOM system
Use messaging for flexibility
Use messaging to smooth out burst loads
Use messaging for system integration
Use messaging for flexibility
Permits more granular processing logic
Routing Slip (EAI): "routes a message consecutively through a series of processing steps when the sequence of steps is not known at design-time and may vary for each message"
Content-Based Router (EAI): "handles a situation where the implementation of a single logical function (e.g., inventory check) is spread across multiple physical systems"
Avoid concurrency deadlocks (due to blocking on RPC response)
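A rough sketch of the two patterns quoted above, assuming messages are plain dictionaries. The destination names, the item-prefix rule, and the `validate` and `price` steps are invented for the example.

```python
def content_based_router(message, routes, default):
    """Pick a destination name by inspecting the message content."""
    for predicate, destination in routes:
        if predicate(message):
            return destination
    return default

def run_routing_slip(message, steps):
    """Pass the message through the steps listed on its own slip, in order."""
    for step_name in message["slip"]:
        message = steps[step_name](message)
    return message

# Hypothetical inventory systems, selected by item prefix.
routes = [
    (lambda m: m["item"].startswith("W"), "widget-inventory"),
    (lambda m: m["item"].startswith("G"), "gadget-inventory"),
]

# Hypothetical processing steps; each message names the ones it needs.
steps = {
    "validate": lambda m: {**m, "valid": bool(m["item"])},
    "price": lambda m: {**m, "price": 10 * m["qty"]},
}

order = {"item": "W-100", "qty": 3, "slip": ["validate", "price"]}
destination = content_based_router(order, routes, "unknown-item")
processed = run_routing_slip(order, steps)
```

In both cases the decision travels with the message, so the sender never needs to know which system or which sequence of steps will handle it.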
Use messaging for flexibility
Application now has many more data flow options
Fire-and-forget, multicast, disconnected, load-balancing, flow control, priority routing, etc
Permits easier maintenance and evolution
change message format without recompiling uninterested clients
"flow" data from one side to another without modifying intermediate peers
Use messaging to smooth out burst loads
queues can fill up with messages waiting to be processed
processors/consumers pull messages as fast as possible
if processor nodes can't keep up...
add more processors, and/or...
wait for the burst load to disperse over time
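The burst-smoothing idea can be sketched with a thread-safe queue and a configurable number of consumers. The helper name `drain`, the burst size of 1000, and the doubling "work" are arbitrary choices for illustration.

```python
import queue
import threading

def drain(backlog, worker_count, handle):
    """Start worker_count consumers and wait until the backlog is empty."""
    def consume():
        while True:
            try:
                item = backlog.get_nowait()
            except queue.Empty:
                return            # burst absorbed; this consumer stops
            handle(item)

    workers = [threading.Thread(target=consume) for _ in range(worker_count)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()

backlog = queue.Queue()
for n in range(1000):             # the burst: producers enqueue and move on
    backlog.put(n)

results = []
lock = threading.Lock()

def handle(item):
    with lock:
        results.append(item * 2)

drain(backlog, worker_count=4, handle=handle)
```

Raising `worker_count` is the "add more processors" option; leaving it alone and letting the queue empty at its own pace is the "wait" option.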
Use messaging for system integration
messaging doesn't require agreement on type system
message is the type
no agreement on "object semantics" necessary
messaging doesn't require strongly-bound APIs
messaging can bridge several systems (Java, .NET, etc)
XML is absolutely perfect for this
other data representations (CSV, NDR, plain text) work too
messaging's flexibility permits easier integration
Message Translators (EAI) permit transformation of messages
Message Routers (EAI) enable processing workflow
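As a small illustration of a message translator, the sketch below turns CSV records from a hypothetical legacy system into JSON messages. The function name `csv_to_json` and the sample data are assumptions.

```python
import csv
import io
import json

def csv_to_json(csv_text):
    """Translate CSV rows (with a header line) into a list of JSON messages."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [json.dumps(row, sort_keys=True) for row in reader]

legacy = "id,name\n1,Fred\n2,Wilma\n"
translated = csv_to_json(legacy)
```

Neither side has to change: the producer keeps emitting CSV, and the consumer sees only JSON.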
Complications of a message-based system
communication is with queues, not objects
bidirectional communication requires at least two queues: one for the request message, one for the response
no sense of conversational state
sequence: messages may arrive out of order
synchronous communication requires additional work
no sense of "object identity"
messages come to queues, not objects
disruption of the usual "client-server" approach
more like "producer-consumer" or "peer-to-peer"
For strongly-typed, language-coupled, blocking request/response communication, stick with RPC!
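The extra work mentioned above usually involves a correlation identifier. Here is a sketch, assuming two in-process queues and a made-up `correlation_id` field; the consumer deliberately answers in reverse order to mimic out-of-order arrival.

```python
import itertools
import queue

request_q = queue.Queue()
reply_q = queue.Queue()
_ids = itertools.count(1)
pending = {}                       # correlation id -> original request body

def send_request(body):
    cid = next(_ids)
    pending[cid] = body
    request_q.put({"correlation_id": cid, "body": body})
    return cid

def answer_all_reversed():
    """Stand-in consumer that replies in reverse order to mimic reordering."""
    received = []
    while not request_q.empty():
        received.append(request_q.get())
    for msg in reversed(received):
        reply_q.put({"correlation_id": msg["correlation_id"],
                     "body": msg["body"] * 2})

def collect_replies():
    """Match each reply to its request by correlation id, not arrival order."""
    matched = {}
    while not reply_q.empty():
        reply = reply_q.get()
        request_body = pending.pop(reply["correlation_id"])
        matched[request_body] = reply["body"]
    return matched

send_request(1)
send_request(2)
answer_all_reversed()
matched = collect_replies()
```

The state that RPC keeps implicitly on the call stack is kept here explicitly in `pending`.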
Messaging systems have long been a part of enterprise systems
batch file processing (file == message)
email (email message == message)
database (database tuple(s) == message)
Plus a few new ones have cropped up in recent years
instant messengers (instant message == message)
HTTP request (request body == message)
When combined with XML (or SOAP) as data payload...
Simple Mail Transfer Protocol (SMTP)
Internet standard since 1982 (RFC 821)
Post Office Protocol (v3) (POP3)
Internet standard for storing and allowing user email download
widely supported, particularly for web-based email systems (Hotmail, Yahoo! mail, etc)
Internet Message Access Protocol (v4) (IMAP4)
Internet standard for storing and accessing email
more sophisticated than POP3, less widely supported
All are straight text-based protocols
File Transfer Protocol (FTP)
simple file transfer from one machine to another
well-known/understood security implications
authentication
ports & firewall accessibility
create applications that "spin" on the FTP incoming directory
when a new file is created, launch, create a lock file, process
when that file is finished, delete the lock file and the message
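The lock-file steps above might look like the following single polling pass over a local directory (the FTP server itself is out of scope here). The `.lock` suffix, the function name `process_incoming`, and the temporary directory are assumptions for the sketch.

```python
import os
import tempfile

def process_incoming(directory, handle):
    """One polling pass: claim each message file with a lock file, then process."""
    handled = []
    for name in sorted(os.listdir(directory)):
        if name.endswith(".lock"):
            continue
        path = os.path.join(directory, name)
        lock_path = path + ".lock"
        try:
            # O_EXCL makes creation fail if another worker already holds the lock.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            continue
        os.close(fd)
        try:
            with open(path, "r", encoding="utf-8") as f:
                handle(f.read())
            os.remove(path)        # the message is consumed
            handled.append(name)
        except FileNotFoundError:
            pass                   # another worker consumed it first
        finally:
            os.remove(lock_path)   # release the lock
    return handled

incoming = tempfile.mkdtemp()
with open(os.path.join(incoming, "order-001.txt"), "w", encoding="utf-8") as f:
    f.write("3 widgets")

seen = []
handled = process_incoming(incoming, seen.append)
```

If `handle` raises, the lock is released but the message file stays in place, so a later pass can retry it.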
Straight sockets: TCP, UDP
TCP/IP: "point-to-point, guaranteed delivery"
UDP/IP: "broadcast, non-guaranteed delivery"
either way, an endpoint opens a socket, listens, and acts on what arrives
up to developers to define message format
for best interop purposes, keep it textual and simple
XML and JSON are best candidates here
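A sketch of a simple textual message format over a stream socket: one JSON document per line. The framing rule and the helper names are assumptions, and `socket.socketpair()` is used only so the example needs no port or server.

```python
import json
import socket

def send_message(sock, message):
    """Encode one message as a single line of JSON text."""
    sock.sendall(json.dumps(message).encode("utf-8") + b"\n")

def receive_message(reader):
    """Read one newline-terminated JSON message from a file-like reader."""
    line = reader.readline()
    return json.loads(line) if line else None

# socketpair() gives two connected stream sockets without needing a port.
left, right = socket.socketpair()
send_message(left, {"type": "NewCustomer", "name": "Fred"})
left.close()

with right, right.makefile("r", encoding="utf-8") as reader:
    received = receive_message(reader)
    end_of_stream = receive_message(reader)
```

Because `json.dumps` escapes newlines inside strings, a newline on the wire always marks the end of a message.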
"Using elements of the client/server, pipe-and-filter, and distributed objects paradigms, this 'representational state transfer' style optimises the network transfer of representations of a resource. A Web-based application can be viewed as a dynamic graph of state representations (pages) and the potential transitions (links) between states. The result is an architecture that separates server implementation from the client's perception of resources, scales well with large numbers of clients, enables transfer of data in streams of unlimited size and type, supports intermediaries (proxies and gateways) as data transformation and caching components, and concentrates application state within the user agent components."
REST takes the position that the Web as it currently exists is all we really need--why reinvent the wheel?
URIs provide unique monikers on the network
HTTP provides commands and request/response
HTML/XML provides content format
a RESTful model seeks to establish "resources" and use the basic CRUD methods provided by HTTP (GET, POST, PUT, DELETE)
find an Employee:
GET /employeeDatabase?name='fred'
returned content body will be employee data
creating a new Employee:
POST /employeeDatabase
content body is the employee data
modify an existing Employee:
PUT /employeeDatabase?name='fred'
goal of RESTful system is to model the data elements
addressable resources (via URIs)
uniform interfaces that apply to all resources
manipulation of resources through representations
stateless self-descriptive messages
'representations'--multiple content types accepted or sent
in essence, we're organizing a distributed application into URI addressable resources that provide the full capabilities of that application solely through HTTP
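To make "addressable resources, uniform interface, representations" concrete, here is an in-memory sketch with no network involved. The `ResourceStore` class, the `/employees/fred` URI, and the choice of two representations are all hypothetical. It assumes one URI per resource, so that PUT to that URI creates or replaces it.

```python
import json

class ResourceStore:
    """In-memory resources addressed by URI path, with a uniform interface."""
    def __init__(self):
        self.resources = {}

    def handle(self, method, uri, body=None, accept="application/json"):
        """Return (status, body); each call is stateless and self-contained."""
        if method == "PUT":
            created = uri not in self.resources
            self.resources[uri] = json.loads(body)
            return (201 if created else 200), ""
        if uri not in self.resources:
            return 404, ""
        if method == "GET":
            data = self.resources[uri]
            if accept == "text/plain":      # a second representation
                return 200, ", ".join(f"{k}={v}" for k, v in sorted(data.items()))
            return 200, json.dumps(data, sort_keys=True)
        if method == "DELETE":
            del self.resources[uri]
            return 204, ""
        return 405, ""

store = ResourceStore()
store.handle("PUT", "/employees/fred", '{"name": "Fred", "dept": "Quarry"}')
status, text = store.handle("GET", "/employees/fred", accept="text/plain")
```

The same few methods apply to every resource; only the URI and the representation vary.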
this is a complete flip from traditional O-O
objects encapsulate data behind processors
REST hides processing behind data elements/structures
consider the World Wide Web:
well-established, well-documented, "debugged"
no new infrastructure to establish
payload-agnostic
well-defined solutions (HTTPS, proxies, gateways)
obviously extensible (WebDAV, explosive growth)
platform-neutral and technology-agnostic
it's hard to argue with success!
REST provides "anarchic scalability"
assumes there is no one central entity of control
architectural elements must continue operating when subjected to unexpected load ("the Slashdot effect")
REST allows for independent deployment
hardware/software can be introduced over time w/o breaking clients (the power of the URI and DNS)
not all participants need change/evolve simultaneously
REST returns us to simplicity
it's all URLs, HTTP, and HTML/XML; nothing else
REST depends a great deal on underlying technology
HTTP uses simple name/value pairs for headers
this leaves out complex headers (a la WS-Security)
REST requires a loosely-bound API
"interface genericity"
no metadata constructs to key from
REST requires more work on your part
JMS defines two modes: point-to-point, and publish-subscribe
APIs are syntactically and semantically similar
differences are in how consumers consume messages
both sets of APIs inherit from common base interfaces
When we say "point-to-point", we mean...
a message is consumed by one consumer
multiple producers may be able to send to a single Queue
depends on provider details; most will
multiple consumers may be able to consume from a Queue
depends on provider details; most will
When we say "publish-subscribe", we mean...
a message is consumed by 0..n consumers
multiple producers may be able to send to a single Topic
depends on provider details; most will
multiple consumers will consume from a Topic
depends on provider details; most will
message doesn't go away until all durable subscribers receive() it
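The difference in consumption between the two modes can be shown without any JMS provider. The sketch below is plain Python, not the JMS API; `PointToPointQueue` and `Topic` are illustrative classes, and the topic here is non-durable.

```python
from collections import deque

class PointToPointQueue:
    """Each message is handed to exactly one consumer."""
    def __init__(self):
        self.messages = deque()

    def send(self, message):
        self.messages.append(message)

    def receive(self):
        return self.messages.popleft() if self.messages else None

class Topic:
    """Each message is copied to every current subscriber (possibly none)."""
    def __init__(self):
        self.subscribers = []

    def subscribe(self):
        inbox = deque()
        self.subscribers.append(inbox)
        return inbox

    def publish(self, message):
        for inbox in self.subscribers:
            inbox.append(message)

q = PointToPointQueue()
q.send("order-1")
first_consumer = q.receive()
second_consumer = q.receive()      # nothing left: the first consumer took it

t = Topic()
t.publish("missed")                # no subscribers yet: consumed by 0 consumers
a, b = t.subscribe(), t.subscribe()
t.publish("price-change")          # both subscribers get their own copy
```

Sending looks the same in both cases; only what happens on the consuming side differs.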
When we say "subscribers can be durable", we mean...
they will be "remembered" by the plumbing when messages are delivered to the Topic
they will receive messages even if they are not currently alive to receive them (later delivery)
durable subscribers are frequently administered
or create a durable subscriber with createDurableSubscriber()
unsubscribe with unsubscribe() (important!)
duplicate subscribers not allowed
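Durable subscription can be sketched as a topic that keeps a named backlog per subscriber. This is plain Python that loosely mirrors the ideas behind `createDurableSubscriber()` and `unsubscribe()`; it is not the JMS API, and rejecting a duplicate name with `ValueError` is this sketch's own assumption about what "not allowed" means.

```python
from collections import deque

class DurableTopic:
    """Topic that remembers named subscribers while they are offline."""
    def __init__(self):
        self.backlogs = {}         # subscription name -> undelivered messages

    def create_durable_subscriber(self, name):
        if name in self.backlogs:
            raise ValueError(f"duplicate subscription: {name}")
        self.backlogs[name] = deque()

    def unsubscribe(self, name):
        # Without this, the backlog for an abandoned name grows forever.
        del self.backlogs[name]

    def publish(self, message):
        for backlog in self.backlogs.values():
            backlog.append(message)

    def receive(self, name):
        backlog = self.backlogs[name]
        return backlog.popleft() if backlog else None

topic = DurableTopic()
topic.create_durable_subscriber("audit")
topic.publish("price-change")      # "audit" is not running right now
later = topic.receive("audit")     # delivered when it comes back
topic.unsubscribe("audit")
```

The comment in `unsubscribe` is why the slide marks it as important: a remembered subscriber that never returns keeps accumulating messages.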
Messages can be of several different formats
raw bytes
text
a series of ordered primitive values (a tuple)
a Serializable Java object
a map of name/value pairs (of primitive values)
Any other formats can usually be managed given the above
SOAP: over text
NDR: over raw bytes
An (incomplete) list of JMS providers:
Open source: ActiveMQ, OpenJMS, JORAM
Special-purpose: DropboxMQ, Somnifugi
Big vendors: WebLogic, IBM (MQ), Oracle (AQ, Sun OpenMQ)
Small vendors: SwiftMQ, SonicMQ, Fiorano, JBossMQ
Certain details will change with the provider used
some will use Properties files and the filesystem
some will use a central database/repository
configuration will change based on those details
some will offer certain features others don't
recoverability, administration, monitoring, ...
choosing the provider can be the hardest part at times
messaging is flexible
messaging is scalable
messaging can operate over many channels
messaging requires more work on the part of the programmer
many different messaging exchange patterns exist
Who is this guy?
Architect, Engineering Manager/Leader, "force multiplier"
Co-founder, Solidify US
http://www.solidify.dev
Principal -- Neward & Associates
Author
Professional F# 2.0 (w/Erickson, et al; Wrox, 2010)
Effective Enterprise Java (Addison-Wesley, 2004)
SSCLI Essentials (w/Stutz, et al; O'Reilly, 2003)
Server-Based Java Programming (Manning, 2000)
See http://www.newardassociates.com