ted.neward@newardassociates.com | Blog: http://blogs.newardassociates.com | Github: tedneward | LinkedIn: tedneward
Explain what REST is
Understand where it came from, and why
Discuss some of the nuances
Figure out when it's useful, and when it's not
Talk about what your next steps are after REST
The Web has gone through several technology iterations
not so much the approach, but the tools and technologies
and the way in which we used it all
and, as a result, how the Web is built and used
Project Xanadu was the brainchild of Ted Nelson
"A word processor capable of storing multiple versions, and displaying the differences between these versions"
"On top of this basic idea, Nelson wanted to facilitate nonsequential writing, in which the reader could choose his or her own path through an electronic document."
http://en.wikipedia.org/wiki/Project_Xanadu
"Xanadu, a global hypertext publishing system, is the longest-running vaporware story in the history of the computer industry. It has been in development for more than 30 years."
"Xanadu was meant to be a universal library, a worldwide hypertext publishing tool, a system to resolve copyright disputes, and a meritocratic forum for discussion and debate. By putting all information within reach of all people, Xanadu was meant to eliminate scientific ignorance and cure political misunderstandings. And, on the very hackerish assumption that global catastrophes are caused by ignorance, stupidity, and communication failures, Xanadu was supposed to save the world."
Source: http://www.wired.com/wired/archive/3.06/xanadu_pr.html
Every Xanadu server is uniquely and securely identified.
Every Xanadu server can be operated independently or in a network.
Every user is uniquely and securely identified.
Every user can search, retrieve, create and store documents.
Every document can consist of any number of parts each of which may be of any data type.
Every document can contain links of any type including virtual copies ("transclusions") to any other document in the system accessible to its owner.
Links are visible and can be followed from all endpoints.
Permission to link to a document is explicitly granted by the act of publication.
Every document can contain a royalty mechanism at any desired degree of granularity to ensure payment on any portion accessed, including virtual copies ("transclusions") of all or part of the document.
Every document is uniquely and securely identified.
Every document can have secure access controls.
Every document can be rapidly searched, stored and retrieved without user knowledge of where it is physically stored.
Every document is automatically moved to physical storage appropriate to its frequency of access from any given location.
Every document is automatically stored redundantly to maintain availability even in case of a disaster.
Every Xanadu service provider can charge their users at any rate they choose for the storage, retrieval and publishing of documents.
Every transaction is secure and auditable only by the parties to that transaction.
The Xanadu client-server communication protocol is an openly published standard. Third-party software development and integration is encouraged.
Show me HTML!
This was the era of the static HTML page
Geocities, MySpace, "home pages", remember? (e.g., http://www2.warnerbros.com/spacejam/movie/jam.htm)
This was an age of entirely HTML-based web pages
Then we wanted some small amount of server interaction
Web page "hit counters"
Temperature conversion
Access to server-side environment values
Maybe even store a form to a database
This was the era of CGI/ISAPI/NSAPI/etc
Perl scripts, bash scripts, C++ plugins
Pages could now be slightly dynamic
HTML is too limiting; give me power in the browser
Applets, Flash, Silverlight (other technologies came and went here, too)
Server remained simple, though its complexity was growing
for example, applet could make CORBA calls to the server
or, applet could (w/the right JDBC drivers) call the database
But applets were ugly and non-uniform, not to mention a "security hole"
So processing shifted to the server
Servlets/JSP/"Model Two"
ASP/ASP.NET
Ruby-on-Rails
PHP
"MVC" server-side designs/patterns emerged
Enterprise systems needed ways to connect with one another
Firewalls made "traditional" interop tools difficult
HTTP was easy to punch through firewalls
Enter SOAP, WSDL, and the WS-DeathStar
At the same time, the Internet itself changed
Alternative access devices appeared (mobile!)
These access devices had their own UI technologies
So "Web APIs" began to emerge
Characterized by "simplicity" and nominally "RESTful"
JSON or XML over HTTP
HTTP/1.1: RFC 2616
"application-level protocol for distributed, collaborative, hypermedia information systems"
generic, stateless protocol
allows for very easy extension
typing and negotiation of data representation
Dependencies
TCP/IP, DNS
underlying communication infrastructure
URL and URI (RFCs 1738, 1630, 1808, 2396)
target server, port and resource to request
MIME (RFCs 2045, 2046, 2047)
description of content formats
TLS (SSL)
secure transmission
Basic details
server listens on well-known port (80)
client initiates communication
client sends request packet, server sends response packet
connection is closed after each send/receive cycle
(HTTP/1.1 keep-alive can reuse the connection, but only as an optimization)
no state retained across cycles
Quick note: stateless
HTTP explicitly assumes no server affinity
any server can answer any request
this is what allows HTTP to scale so well
the ubiquitous "web farm"
browser cookies are NOT(!) part of the HTTP spec
in fact, HTTP authors disdain the use of cookies
making HTTP stateful in some way usually fails miserably
Basic protocol notes
all text is in "7-bit ASCII clean" format
in other words, nothing above ASCII value 127
all text uses CRLF pairs to denote EOL
client/server request/response protocol
client always initiates
client blocks until server responds
packets are always single-line plus header/value pairs
and optional content body
Request packet
GET / HTTP/1.1
Host: www.newardassociates.com
Accept: */*
Response packet
HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 32

<html><body>Howdy!</body></html>
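To make the mechanics concrete, here is a minimal sketch in Python that performs one full request/response cycle over a raw TCP socket, sending essentially the request packet above (a Connection: close header is added so the client can simply read until the server closes):

    import socket

    # Build the request packet: Request-Line, headers, then the
    # mandatory empty line, all terminated with CRLF pairs.
    request = (
        "GET / HTTP/1.1\r\n"
        "Host: www.newardassociates.com\r\n"
        "Accept: */*\r\n"
        "Connection: close\r\n"   # ask the server to close when done
        "\r\n"                    # the mandatory empty line
    )

    with socket.create_connection(("www.newardassociates.com", 80)) as sock:
        sock.sendall(request.encode("ascii"))   # client initiates
        response = b""
        while chunk := sock.recv(4096):         # read until server closes
            response += chunk

    print(response.decode("ascii", errors="replace"))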
Request packet
Request-Line: Method Request-URI HTTP-Version CRLF
Method: the "verb"
Request-URI: the resource
HTTP-Version: "HTTP/1.1" or "HTTP/1.0" (other versions possible, never used)
(Optional) Header: Value CRLF
CRLF
(Optional) Content body
Request methods
GET: retrieve resource idempotently
should have no side effects (cacheable results)
POST: accept the enclosed data as subordinate to the resource
PUT: store this data as the resource
DELETE: remove the resource
Request methods
OPTIONS: describe verbs supported for the resource
HEAD: GET without content body
TRACE: diagnostic trace
CONNECT: for use with a tunneling proxy
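For a feel of the main verbs in practice, here is a sketch using Python's standard http.client; the host api.example.com and the /employees paths are hypothetical, chosen purely for illustration:

    import http.client

    conn = http.client.HTTPConnection("api.example.com")

    for method, path, body in [
        ("GET",    "/employees/fred", None),                               # retrieve
        ("POST",   "/employees",      '{"name": "fred"}'),                 # create
        ("PUT",    "/employees/fred", '{"name": "fred", "title": "PM"}'),  # replace
        ("DELETE", "/employees/fred", None),                               # remove
    ]:
        headers = {"Content-Type": "application/json"} if body else {}
        conn.request(method, path, body=body, headers=headers)
        response = conn.getresponse()
        print(method, path, "->", response.status, response.reason)
        response.read()   # drain the body so the connection can be reused

    conn.close()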
Request-URI
URL minus TCP/IP-related parts
no scheme (http://)
no server (www.google.com)
no port (:80)
used to identify resource requested
absolute resource path
not always a filesystem resource
... though early and simple webservers do map URL paths to filesystem paths
doing so is dangerous: beware relative paths ("../../../etc/passwd")
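A minimal sketch of the guard a filesystem-mapping server needs (DOCROOT here is a hypothetical document root): normalize the path first, then verify the result still lives under the root:

    import os

    DOCROOT = "/var/www/html"   # hypothetical document root

    def resolve(request_uri: str) -> str:
        """Map a Request-URI path onto the filesystem, rejecting escapes."""
        # Normalize away "." and ".." segments, then confirm the result
        # is still inside the document root.
        candidate = os.path.normpath(
            os.path.join(DOCROOT, request_uri.lstrip("/")))
        if candidate != DOCROOT and not candidate.startswith(DOCROOT + os.sep):
            raise PermissionError("path escapes document root: " + request_uri)
        return candidate

    print(resolve("/index.html"))           # /var/www/html/index.html
    print(resolve("/../../../etc/passwd"))  # raises PermissionError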
HTTP-Version
this is the version the client wishes to use
server will respond with its version in response
client can either upgrade or downgrade as necessary
in practice, this is almost always "HTTP/1.1"
ten years ago, negotiation between 1.0 and 1.1 was common
if we ever see an HTTP/2.0, negotiation will become important
Header: Value lines
more on headers later
each header line must be ended with CRLF
each header describes one annotation/extension/adaptation to the request
request and response use same sets of headers
a few are client- or server-specific, but not many
CRLF (empty line)
empty line is mandatory
server will block until it receives this second CRLF!
separates Request-Line/Headers from Content body
must be present, even with no headers or content body
Content body
entirely opaque to HTTP protocol
we use headers to describe the content body
Content-Type, Content-Length most common
recipient then reads exactly that many bytes (no more, no less)
failure to do this is a security hole!
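A sketch of that read-exactly-Content-Length discipline in Python; recv() may return fewer bytes than requested, so a correct reader loops until the advertised count is satisfied:

    import socket

    def read_body(sock: socket.socket, content_length: int) -> bytes:
        """Read exactly content_length bytes -- no more, no less."""
        chunks, remaining = [], content_length
        while remaining > 0:
            chunk = sock.recv(min(remaining, 4096))
            if not chunk:   # peer closed early: the body is truncated
                raise ConnectionError("connection closed before body complete")
            chunks.append(chunk)
            remaining -= len(chunk)
        return b"".join(chunks)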
Response packet
Status-Line: HTTP-Version Status-Code Reason-Phrase CRLF
(Optional) Header: Value CRLF
CRLF
(Optional) Content body
Status-Code
quick integer description of server's results
1xx: Informational
2xx: Success
3xx: Redirect
4xx: Client error
5xx: Server error
Reason-Phrase
textual description of status-code
usually purely for human consumption
these are not standardized except de facto in a few cases
200: OK
404: Resource not found
401: Requires authorization
500: Internal server error
HTTP-Version
this is the version the server is speaking
it may differ from the version the client requested
client can either upgrade or downgrade as necessary
in practice, this is almost always "HTTP/1.1"
ten years ago, negotiation between 1.0 and 1.1 was common
if we ever see an HTTP/2.0, negotiation will become important
Header: Value lines
more on headers later
each header line must be ended with CRLF
each header describes one annotation/extension/adaptation to the response
request and response use same sets of headers
a few are client- or server-specific, but not many
CRLF (empty line)
empty line is mandatory
client will block until it receives this second CRLF!
separates Status-Line/Headers from Content body
must be present, even with no headers or content body
Content body
entirely opaque to HTTP protocol
we use headers to describe the content body
Content-Type, Content-Length most common
recipient then reads exactly that many bytes (no more, no less)
failure to do this is a security hole!
Common headers
Host: the host (and optional port) requested
required by 1.1 for multitenant server scenarios
User-Agent: description field describing the client
not required, but almost always included
this is how we determine client capabilities
Common headers
Content-Type: MIME type of content being sent
Content-Length: size (in octets/bytes) of content
these two are required if content body is present
Common headers
Accept: specify certain media types to be acceptable
comma-delimited list of MIME types
Accept-Encoding: describes content encodings
used to allow for request/response gzip compression
Authorization: client sends to authenticate to server
"Authorization: {credentials}"
credentials: scheme credential-data
where scheme is Basic, Digest, or others
Common headers
Connection: state of the connection
server most often sends "close" in 1.1 exchanges
WWW-Authenticate: required in all 401 responses
contains a challenge that indicates the authentication scheme
"WWW-Authenticate: {challenge}"
challenge is typically either Basic or Digest
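Pulling several of these together, a sketch of a request carrying Host, User-Agent, Accept, Accept-Encoding, and a Basic Authorization header; the host, path, and credentials are made up, and Basic is simply base64("user:password") behind the scheme tag:

    import base64
    import http.client

    credentials = base64.b64encode(b"fred:secret").decode("ascii")

    conn = http.client.HTTPConnection("www.example.com")
    conn.request("GET", "/private/report", headers={
        "Host": "www.example.com",          # required by HTTP/1.1
        "User-Agent": "demo-client/0.1",    # describes this client
        "Accept": "application/json, text/html",
        "Accept-Encoding": "gzip",
        "Authorization": "Basic " + credentials,
    })
    response = conn.getresponse()
    print(response.status, response.reason)
    print(response.getheader("WWW-Authenticate"))   # populated on a 401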
For more information
Consult RFC 2616 for any remaining details
official: https://www.ietf.org/rfc/rfc2616.txt
Consult RFC 2324 for details on how to extend HTTP
Hyper Text Coffee Pot Control Protocol
official: https://www.ietf.org/rfc/rfc2324.txt
Additional useful information
Fielding's dissertation
http://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm
Architecture of the World Wide Web
http://www.w3.org/TR/webarch/
"Using elements of the client/server, pipe-and-filter, and distributed objects paradigms, this 'representational state transfer' style optimises the network transfer of representations of a resource. A Web-based application can be viewed as a dynamic graph of state representations (pages) and the potential transitions (links) between states. The result is an architecture that separates server implementation from the client's perception of resources, scales well with large numbers of clients, enables transfer of data in streams of unlimited size and type, supports intermediaries (proxies and gateways) as data transformation and caching components, and concentrates application state within the user agent components."
REST takes the position that the Web as it currently exists is all we really need--why reinvent the wheel?
URIs provide unique monikers on the network
HTTP provides commands and request/response
HTML/XML provides content format
a RESTful model seeks to establish "resources" and use the basic CRUD methods provided by HTTP (GET, POST, PUT, DELETE)
find an Employee:
GET /employeeDatabase?name='fred'
returned content body will be employee data
creating a new Employee:
POST /employeeDatabase
content body is the employee data
modify an existing Employee:
PUT /employeeDatabase?name='fred'
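A toy server-side sketch of the same idea, using Python's standard http.server; each employee is addressable as its own resource (the /employees/{name} URI shape here is illustrative, not something REST prescribes):

    import json
    from http.server import BaseHTTPRequestHandler, HTTPServer

    # A toy in-memory "employee database" keyed by name.
    EMPLOYEES = {"fred": {"name": "fred", "title": "PM"}}

    class EmployeeHandler(BaseHTTPRequestHandler):
        def _name(self):
            return self.path.rsplit("/", 1)[-1]

        def do_GET(self):                    # find an Employee
            employee = EMPLOYEES.get(self._name())
            if employee is None:
                self.send_response(404)
                self.end_headers()
                return
            body = json.dumps(employee).encode("ascii")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def do_DELETE(self):                 # remove an Employee
            removed = EMPLOYEES.pop(self._name(), None)
            self.send_response(204 if removed else 404)
            self.end_headers()

    HTTPServer(("", 8000), EmployeeHandler).serve_forever()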
goal of RESTful system is to model the data elements
addressable resources (via URIs)
uniform interfaces that apply to all resources
manipulation of resources through representations
stateless self-descriptive messages
'representations'--multiple content types accepted or sent
in essence, we're organizing a distributed application into URI addressable resources that provide the full capabilities of that application solely through HTTP
this is a complete flip from traditional O-O
objects encapsulate data behind processors
REST hides processing behind data elements/structures
consider the World Wide Web:
well-established, well-documented, "debugged"
no new infrastructure to establish
payload-agnostic
well-defined solutions (HTTPS, proxies, gateways)
obviously extensible (WebDAV, explosive growth)
platform-neutral and technology-agnostic
it's hard to argue with success!
REST provides "anarchic scalability"
assumes there is no one central entity of control
architectural elements must continue operating when subjected to unexpected load ("the Slashdot effect")
REST allows for independent deployment
hardware/software can be introduced over time w/o breaking clients (the power of the URI and DNS)
not all participants need change/evolve simultaneously
REST returns us to simplicity
it's all URLs, HTTP, and HTML/XML; nothing else
REST depends a great deal on underlying technology
HTTP uses simple name/value pairs for headers
this leaves out complex headers (a la WS-Sec)
REST requires a loosely-bound API
"interface genericity"
no metadata constructs to key from
REST requires more work on your part
In 2008, Leonard Richardson posited the "Richardson Maturity Model"
http://www.crummy.com/writing/speaking/2008-QCon/act3.html
basically an attempt to distinguish against SOAP/WSDL services
but also useful as a measuring stick for REST adoption
The RMM reads like this:
Stage Zero: POX/SOAP/XML-RPC
one URL (a single service endpoint), one HTTP method/verb (POST)
Stage One: "Resources"
modeling endpoints to represent resources in the system
Stage Two: HTTP Verbs
modeling the endpoints using HTTP verbs (GET, POST, PUT, DELETE, etc)
Stage Three: HATEOAS
Hypermedia As The Engine Of Application State
"replicate a call stack in XML over HTTP"
endpoints represent a service or a "call"
data (often) conveyed in XML, modeled as "request" and "response"
all HTTP interaction is done using POST
discard "services" and "request"/"response" data types
follow the REST model of thinking about "resources"
each resource gets its own endpoint
individual resources are given unique URIs
HTTP verbs model basic CRUD
GET - Retrieve
POST - Create
DELETE - Delete
PUT - Update
Use those against a single resource to indicate intent
Signal reactions/results using HTTP response codes
This helps separate "safe" against "unsafe" (modifying) actions
Now the data returned contains all knowledge of what is available next
similar in spirit to how HTML contains links to next pages
This allows server to modify its URL scheme (in theory)
This also allows for "discovery" of new functionality (in theory)
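A sketch of what a hypermedia-driven representation might look like in JSON; the link relations and URLs are purely illustrative, the point being that a client picks its next transition by relation name rather than by a hard-coded URL scheme:

    import json

    order = {
        "id": 42,
        "status": "open",
        "links": [
            {"rel": "self",    "href": "/orders/42"},
            {"rel": "payment", "href": "/orders/42/payment"},
            {"rel": "cancel",  "href": "/orders/42"},   # DELETE to cancel
        ],
    }
    print(json.dumps(order, indent=2))

    # Choose the next state transition by relation, not by URL:
    next_href = next(l["href"] for l in order["links"] if l["rel"] == "payment")
    print("next:", next_href)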
Issues with HATEOAS
Not all state is public
Not all state is safe to transfer
Not all state is easily represented using Atom or other formats
We've kinda moved on from XML, to boot
Issues with HTTP
client-initiated
request-response
How does the server push anything?
Potential of "chatty" communication with server leads to performance woes
Wrapping up
REST is an interesting architectural style
Pro: Fits in perfectly on top of the Web infrastructure
Con: Doesn't always fit in with what we're building
RMM Level Three takes a ton of work
On both the server and the client(s) side
Mobile apps, for example, will have a hard time consuming an Atom-based format
Not everything called "RESTful", is!
Not everything fits into a client/server communication model
But if the shoe fits....
Architect, Engineering Manager/Leader, "force multiplier"
http://www.newardassociates.com
http://blogs.newardassociates.com
Sr Distinguished Engineer, Capital One
Educative (http://educative.io) Author
Performance Management for Engineering Managers
Books
Developer Relations Activity Patterns (w/Woodruff, et al; Apress, forthcoming)
Professional F# 2.0 (w/Erickson, et al; Wrox, 2010)
Effective Enterprise Java (Addison-Wesley, 2004)
SSCLI Essentials (w/Stutz, et al; O'Reilly, 2003)
Server-Based Java Programming (Manning, 2000)