widely believed falsehoods
incorrect assumptions
lovingly endorsed "alternative facts"
mistakes that are all too easy to repeat
"anti-patterns"
"Essentially everyone, ... makes the following assumptions.
"All turn out to be false in the long run and all cause big trouble and painful learning experiences."
... once you know about them, you can avoid them
... "Everyone makes these assumptions"
largely because they're easy to make
and they help us avoid hard truths
and painful realizations
"Essentially everyone, when they first build an distributed system, makes the following 10 assumptions. All turn out to be false in the long run and all cause big trouble and painful learning experiences."
The network is reliable
Latency is zero
Bandwidth is infinite
The network is secure
Topology doesn't change
There is one administrator
Transport cost is zero
The network is homogeneous
Hardware fails
Routers go down, wires are cut (sometimes catastrophically), power spikes, hurricanes, ...
Sometimes it's even as simple as "who turned off the server?"
Software fails
Processes throw exceptions when they shouldn't need to, or hang, or ...
... or sometimes you get hacked
Physics fails
Not very often (we hope), but sometimes the signal just doesn't travel the wireless airwaves like it should
Assume the remote resource will fail
Code appropriately: timeouts, retries, backups, and so on (see the retry sketch below)
Never timeout infinitely
Local is usually pretty reliable, what can we use there?
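A minimal sketch of the timeouts-plus-bounded-retries advice, assuming Java 11+'s java.net.http client; the timeout values, attempt count, and backoff numbers are all illustrative, not a recommendation:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

public class RetryingClient {
    public static String fetchWithRetry(URI remote, int maxAttempts) throws InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(2))      // never wait forever to connect
                .build();
        HttpRequest request = HttpRequest.newBuilder(remote)
                .timeout(Duration.ofSeconds(5))             // never wait forever on a response
                .build();

        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                HttpResponse<String> resp =
                        client.send(request, HttpResponse.BodyHandlers.ofString());
                if (resp.statusCode() == 200) return resp.body();
            } catch (Exception e) {
                last = e;                                   // network hiccup: remember it, retry
            }
            Thread.sleep(100L << attempt);                  // exponential backoff between attempts
        }
        throw new IllegalStateException("remote failed after " + maxAttempts + " attempts", last);
    }
}
```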
Bits take time
... to move through the networking layers and physical hardware
And remember, they need to do it lots of times (once per intermediary)!
Even fast networks are orders of magnitude slower than slow PC buses
Mitigation: Count the bytes
Be frugal in passing data across the network; the more data passed, the longer it'll take for it all to get there
Remember, TCP/IP tries to "guarantee" delivery of all of those packets, which grows steadily more difficult with a larger number of packets
How can we communicate more on each trip? (batch)
How can we trim down the amount of data being sent?
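To make the batching question concrete, a sketch against a hypothetical CustomerService; the interface and its methods are invented for illustration, not a real API:

```java
import java.util.List;
import java.util.Map;

// Hypothetical remote interface; the names are illustrative only.
interface CustomerService {
    Customer findById(String id);                        // one round trip per call
    Map<String, Customer> findByIds(List<String> ids);   // one round trip for the whole batch
}

record Customer(String id, String name) { }

class BatchingDemo {
    // N ids, N round trips: the latency cost is paid N times
    static void chatty(CustomerService svc, List<String> ids) {
        for (String id : ids)
            System.out.println(svc.findById(id));
    }

    // N ids, 1 round trip: the latency cost is paid once
    static void batched(CustomerService svc, List<String> ids) {
        svc.findByIds(ids).values().forEach(System.out::println);
    }
}
```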
A T-1 line's "phat pipe" gets saturated pretty quickly in the face of VOIP, streaming video, music downloads, graphic-heavy websites, ...
Once we throw Web services into the mix, assume the bandwidth demands double or triple
Once "everything goes over one wire", expect the available bandwidth for your application to be a fraction of what it is now
Remember, laying down new wire (fiber-optic) is an exercise in digging up your street...
Developers frequently write code on small, lightly congested LANs or standalone machines/laptops
But Production looks different than a dev laptop...
Be frugal with the amount of data you send across the wire; send only that which can't be cached (see the caching sketch below)
Ironically, this argues against the browser-based application, since half the data sent is presentation information; hence the rise of the "smart client"
How can we trim down the amount of data being sent?
Keep data and processing close together
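One way to read "send only that which can't be cached": a minimal memoizing wrapper, where fetchRemote stands in for whatever call actually crosses the wire (all names here are illustrative):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

class CachingFetcher {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> fetchRemote;   // the expensive network call

    CachingFetcher(Function<String, String> fetchRemote) {
        this.fetchRemote = fetchRemote;
    }

    String get(String key) {
        // Pay the network cost only on a cache miss; repeats are served locally
        return cache.computeIfAbsent(key, fetchRemote);
    }
}
```

A real cache also needs expiry and invalidation; this only shows the shape of the bandwidth savings.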
"Developers are competent"
Not always... how much do you know about security?
How about your coworkers, including Harvey The Intern?
"Remote data can be trusted"
TCP/IP packets themselves can be source-spoofed
Major impetus for IPv6 and other 'next-gen' efforts
"Remote system can be trusted"
Even if it could at one point, how do you know it hasn't been hacked since then?
"It'll never run outside of our firewall"
Lots of people carrying laptops, phones, Blackberrys...
Lots of wireless networks going up...
Remember that any application listening to the network has at least two client front-ends to it:
the one you wrote
... and Telnet (or curl), the hacker's best friend
If you assume that every byte that comes in off the network has to go through a 12-step recovery program before being used anywhere in your program, that's a good start (see the validation sketch below)
If you find yourself arguing crypto key bit size with another developer, you're arguing over the size of the vault door on your tent
If you find yourself trusting firewalls to take care of your security needs, please don't work for my company
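A minimal sketch of that "12-step program" for bytes off the wire; the length cap and whitelist pattern here are illustrative, not a real protocol spec:

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

class UntrustedInput {
    private static final int MAX_LEN = 256;                                // arbitrary cap
    private static final Pattern SAFE_ID = Pattern.compile("[A-Za-z0-9_-]{1,64}");

    // Nothing off the network is trusted until it passes explicit checks
    static String validateId(byte[] raw) {
        if (raw == null || raw.length > MAX_LEN)
            throw new IllegalArgumentException("missing or oversized input");
        String s = new String(raw, StandardCharsets.UTF_8);
        if (!SAFE_ID.matcher(s).matches())
            throw new IllegalArgumentException("input failed whitelist check");
        return s;  // only now may it travel into the rest of the program
    }
}
```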
Topology: Physical arrangement of networks
Topological changes sometimes happen without planning
Hardware failures, software failures, natural disasters, ...
The code could run on a laptop (or smartphone!) that gets carried from hotel to hotel
The network could be a wireless one, where nodes are constantly coming & going
or worse, it's a combination of wired & wireless
The code could also be "upgraded" to run in an entirely different environment
Mitigation: Make use of layers of indirection
Networking frequently makes available "layers of indirection" to keep physical hardware topology somewhat hidden; use it
This means DNS, NAT, and so on
Some programming models provide one (JNDI)
Consider peer-to-peer tools (WS-Discovery, UDP/IP, Multicast, ...) to help keep track of topological changes
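As a tiny example of leaning on that indirection, resolve a logical DNS name at runtime instead of baking in an IP address; the hostname below is hypothetical:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class ServiceLookup {
    public static void main(String[] args) throws UnknownHostException {
        // If the machine moves, only the DNS record changes, not this code
        InetAddress addr = InetAddress.getByName("orders.internal.example.com");
        System.out.println("orders service currently at " + addr.getHostAddress());
    }
}
```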
"... and he will never quit, get hit by a bus, or take a vacation"
Believe it or not, even hard-core sysadmin geeks like to get away from the computer once in a while
Maybe even date!
"But we control both ends"
For now, perhaps, but what happens if your app is wildly successful? Or your company buys a competitor? Or is bought? Or partners up?
Mitigation: Make the system administrator-friendly
At any point, a relatively competent system administrator should be able to use standard tools and services to install and/or monitor and/or diagnose the system
Make use of O/S management facilities
Build in the management/administrative functionality that isn't otherwise handled (adding/removing users, finding "lost" records, and so on)
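On the Java side, one way to make use of those management facilities is JMX: expose the app's vitals as a standard MBean so a sysadmin can watch it with stock tools like jconsole. A sketch, with a made-up metric and object name:

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// In AppStatsMBean.java -- standard MBean naming: interface = class name + "MBean"
public interface AppStatsMBean {
    int getActiveUsers();
}

// In AppStats.java
public class AppStats implements AppStatsMBean {
    public int getActiveUsers() { return 42; }  // stand-in for a real metric

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        server.registerMBean(new AppStats(),
                new ObjectName("com.example.app:type=AppStats"));  // hypothetical name
        Thread.sleep(Long.MAX_VALUE);  // keep the JVM alive so a console can attach
    }
}
```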
"It's the network! It's fast as light!"
Pointers don't travel well
Networking stacks spend a lot of time shuffling bits into a stream of bytes that can be sent across the wire
Process is called marshaling, and it's not a free action
Platforms (Java, .NET Remoting) use serialization to do the marshaling
Web services have to marshal/unmarshal to SOAP/XML
REST services marshal/unmarshal to XML, JSON, ...
Object graphs can get very large, very quickly
Mitigation: Know what you're sending, and its costs
Measure the full cost of sending data across the wire by measuring the full cost of marshaling
Either recreate the marshaling (by serializing all the parameters and deserializing them back)
Or watch the data go across the wire
Or measure with a profiler
Consider separate models for each tier
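A sketch of the "recreate the marshaling" option using plain Java serialization; the Customer shape is invented, but the byte-counting technique is the point:

```java
import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class MarshalCost {
    static class Customer implements Serializable {
        String name = "Fred";
        List<String> orders = new ArrayList<>();    // object graphs grow fast...
    }

    // Serialize the would-be parameter and count the bytes headed for the wire
    static int marshaledSize(Object o) throws Exception {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(o);
        }
        return buf.size();
    }

    public static void main(String[] args) throws Exception {
        Customer c = new Customer();
        for (int i = 0; i < 10_000; i++) c.orders.add("order-" + i);
        long start = System.nanoTime();
        int bytes = marshaledSize(c);
        long micros = (System.nanoTime() - start) / 1_000;
        System.out.println(bytes + " bytes, ~" + micros + " microseconds to marshal");
    }
}
```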
Most (all?) companies are a mixture of operating systems, languages, and platforms
Not even my home network is homogeneous: Linux, Windows & Mac OS X
Originally an argument for "why Java"
But along came .NET... and Ruby... and ...
Never mind legacy C/C++, COBOL, ...
And the inevitable partnerships, buyouts, mergers, and other corporate activities
You can run, but you can't hide
Never assume it will always be "X" at both ends
Stick to well-known technologies at the edges of your component boundaries
When you do interop, prefer to do so at remoting & component boundaries
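For instance, at a component boundary, prefer marshaling to something platform-neutral over Java-only serialization. A deliberately dependency-free sketch; real code should use a JSON library that handles escaping:

```java
// At the boundary, emit a well-known format any platform can parse,
// rather than Java-only serialized objects. Hand-rolled to stay
// dependency-free; this skips the escaping a real JSON library does.
class BoundaryDto {
    static String toJson(String id, String name) {
        return String.format("{\"id\":\"%s\",\"name\":\"%s\"}", id, name);
    }

    public static void main(String[] args) {
        System.out.println(toJson("42", "Fred"));  // {"id":"42","name":"Fred"}
    }
}
```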
No shame in admitting it!
Learn from the mistakes
Recognize when you're falling into the traps
Avoid the implicit assumptions during design
Hold design reviews against the fallacies
Be aggressive in stamping them out
Who is this guy?
Architect, Engineering Manager/Leader, "force multiplier"
Principal -- Neward & Associates
http://www.newardassociates.com
Educative (http://educative.io) Author
Performance Management for Engineering Managers
Author
Professional F# 2.0 (w/Erickson, et al; Wrox, 2010)
Effective Enterprise Java (Addison-Wesley, 2004)
SSCLI Essentials (w/Stutz, et al; O'Reilly, 2003)
Server-Based Java Programming (Manning, 2000)