Friday, March 25, 2011

Digital Earthquakes and Tsunami

We’ve all been watching the events in Japan recently and felt the shock of seeing the devastation of the earthquake and resulting tsunami.  As of this writing there are still untold victims needing to be rescued.  But there is one element of the disaster that bears consideration - one element of bad news that just didn’t happen; at least one headline you haven’t seen.

If you haven’t already, play the video and carefully watch the tall office buildings sway back and forth.  One can only imagine being inside one of those skyscrapers during such an event.  I work in the US Steel Tower in Downtown Pittsburgh and when the high winds of Spring blow through you can see the light fixtures swing.  There is nothing quite like the sound a building makes when it sways like a Junior High slow dance. It’s terrifyingly cool.

The Tokyo office buildings do not sway by accident.  It is not a design flaw, or a failure of structure or strength.  Quite the reverse - these buildings sway on purpose so that they don’t crumble and fall over.  Think about a pane of glass - it is very hard.  So much so that you practically need a diamond to scratch it.  Try to bend it, and it doesn’t.  Try just a teeny bit harder and it cracks, shatters, and breaks apart.  Glass is either in one piece, or a dozen - there’s no in between.

Tall buildings used to be constructed like glass.  Hard.  Strong.  Rigid.  But if even a small earthquake came along, the result was devastating.  Buildings shattered like glass.

That is the headline you didn’t read.  Amid all the destruction - and make no mistake, there was widespread loss - none of the tall office structures collapsed, because they were designed to lean, bend, ebb and flow, and dissipate the energy of ground movement through their skeletons, connections, and interconnected systems.

In the digital domain, this would be called loose coupling.  Loose coupling allows the failure in one system to be absorbed or deflected by others.  The failure of a back-end process should never cause the user interface to lock up - if they are loosely coupled.
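Here is one way that idea might look in code.  This is only a minimal sketch in Python, not a prescription: the helper name call_with_fallback, the half-second default timeout, and the notion of a "fallback" value are all assumptions made up for illustration.  The caller hands the back-end work to a worker thread and waits only a bounded amount of time, so a slow or broken back end yields a degraded answer instead of a frozen screen.

```python
import concurrent.futures

# Shared worker pool; the size of 4 is an arbitrary choice for this sketch.
_pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)


def call_with_fallback(backend_call, fallback, timeout_seconds=0.5):
    """Run backend_call in a worker thread and wait at most timeout_seconds.

    Returns the back end's result if it arrives in time. If the call raises
    or takes too long, returns the fallback value instead, so the caller
    (for example, a user interface) is never blocked indefinitely.
    """
    future = _pool.submit(backend_call)
    try:
        return future.result(timeout=timeout_seconds)
    except Exception:
        # Covers both a timeout while waiting and any error raised by the
        # back end itself. A slow call keeps running in its worker thread;
        # we simply stop waiting for it.
        return fallback
```

A caller might write something like call_with_fallback(load_recommendations, fallback=[]), where load_recommendations is a hypothetical back-end call.  If that service is down or slow, the page simply renders without recommendations.  The building sways; it doesn’t fall over.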

We work in an environment where high availability and disaster recovery should be natural elements of our solutions.  Like the architects of Tokyo’s office buildings, we should take the time to ensure that our highly distributed, complex applications are loosely coupled so that in the event of the unthinkable (or even the routine!) our systems behave in the manner most conducive to letting our customers and employees achieve as many of their goals as possible.

Servers will fail.  Queues will back up.  Digital earthquakes happen - they are not “unexpected.”  Design for them.
