March 9, 2007

Are We Slowly Losing Control of the Internet?

I have long been intrigued by the question of how we turn the internet into a lifeline-grade infrastructure.  (See, for example, my presentation From Barnstorming to Boeing - Transforming the Internet Into a Lifeline Utility (powerpoint) [with speaker's notes - Adobe Acrobat format].)

My hope that this will occur soon or even within decades is diminishing.

Most of us observe, almost daily, how even well-established infrastructures tend to crumble when stressed, even slightly.  For example, something as small and foreseeable as a typo in someone's name or SSN during a medical visit can generate months of grief when dealing with insurance companies.

I was at the O'Reilly Etel conference last week.  The content was impressive and the people there were frequently the primary actors in the creation and deployment of VOIP.  However, not once during the three days did I hear a serious discussion by a speaker or in the hallways about how this evolving system would be managed, monitored, diagnosed, or repaired.

My mailbox is being filled with IETF announcements for the upcoming meeting in Prague.  I see internet draft after internet draft making proposals that are going to cause implementation errors, security holes, and ultimately service outages.

Take for example the prime candidate protocol for VOIP - SIP.

I've spoken to many people who have implemented SIP components.  There is a common theme - that SIP is far too complex.  Even the basic encoding method is a mess - apparently the SIP working group could not agree among alternatives, so like most committees, they compromised by allowing all alternatives.  The result is that the SIP implementer has to write code to handle many different representations of exactly the same information.  That means that there will probably be code paths that are insufficiently, or never, tested.  It also means that SIP systems will probably be susceptible to failure or misbehavior when introduced, perhaps years after initial installation, to new SIP devices based on different SIP engines.
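To make the "many representations" point concrete, here is a minimal sketch in Python of just one corner of the problem: header names.  SIP (RFC 3261) treats header names as case-insensitive, permits whitespace before the colon, and defines one-letter compact forms for common headers.  The function names and the Call-ID value below are made up for illustration, and the table covers only a subset of the compact forms; this is not how any particular SIP stack does it.

```python
from typing import Tuple

# A subset of the compact header forms defined in RFC 3261.
COMPACT_FORMS = {
    "v": "via",
    "f": "from",
    "t": "to",
    "i": "call-id",
    "m": "contact",
    "l": "content-length",
    "c": "content-type",
}


def canonical_header_name(name: str) -> str:
    """Map any accepted spelling of a header name to one canonical key.

    Hypothetical helper, for illustration only.
    """
    key = name.strip().lower()          # names are case-insensitive
    return COMPACT_FORMS.get(key, key)  # expand one-letter compact forms


def parse_header_line(line: str) -> Tuple[str, str]:
    """Split 'Name: value' into (canonical name, value)."""
    name, _, value = line.partition(":")
    return canonical_header_name(name), value.strip()


# Four spellings that a receiver has to treat as the same header.
# The Call-ID value is invented for this example.
variants = [
    "Call-ID: a84b4c76e66710",
    "call-id: a84b4c76e66710",
    "CALL-ID : a84b4c76e66710",
    "i: a84b4c76e66710",
]
parsed = {parse_header_line(v) for v in variants}
```

All four variants collapse to a single entry in `parsed`.  Each accepted spelling is a separate path through the parser, and a device that only ever saw the long form in testing may meet the compact form for the first time in the field.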

And to top that off, many of the new proposals for SIP use completely different encoding methods (the darling of the moment is XML) from the textual ASCII/UTF-8 form used in the core parts of SIP.  Implementers are going to go gray from the stress of trying to make this mish-mosh work.  And people who have to maintain and troubleshoot VOIP will go bleary-eyed and take hours longer to resolve outages than they would have had there been a consistent and uniform design.

There is a lot of talk about the benefits of network effects, but few people talk about how those same network effects lock in the work of the past and make it difficult, perhaps impossible, to evolve to new and improved mechanisms.

History often survives and reaches out through very long periods of time.  It has been said that the size of modern-day airplanes is derived from the width of the Roman horse: The width of the horse dictated the spacing of wheels on Roman carts.  Those carts created standardized ruts that coerced other carts to conform through the ages.  Early railroads, adopting carts, spaced the rails one rut-pair width apart.  That width dictated cargo load size.  The need to carry those cargos has affected airplane design.

Consider how long it has taken to deploy IPv6 - a technology that celebrated its 10th anniversary a few years ago.  And IPv6 has the luxury of being an alternative to IPv4 rather than a transparently compatible upgrade.  Consider how much longer it will take to deploy VOIP protocol redesigns when the old protocol is embedded in telephones around the world.

We have to admire old Ma Bell for building a reliable and maintainable system.  Yes, it took 100 years of work - and modern telco phones, particularly on the local loop, use a lot of technology created in the late 1800s.

You would have thought that in this internet age we might have learned that clarity of internet protocol design is a great virtue and that management, diagnostics, and security are not afterthoughts but primary design goals.

There is a lot of noise out there about internet stability.  And a lot of people and businesses are staking their actual and economic well-being on the net, and the applications layered on it, really being stable and reliable.

But I have great concern that our approach to the internet resembles a high pillar of round stones piled on top of other round stones - we should not be surprised when it begins to wobble and then falls to the ground.

I am beginning to foresee a future internet in which people involved in management, troubleshooting, and repair are engaged in a Sisyphean effort to provide service in the face of increasingly non-unified design of internet protocols.  And in that future, users will have to learn to expect outages and become accustomed to dealing with service provider customer service "associates" whose main job is to buy time to keep customers from rioting while the technical repair team tries to figure out what happened, where it happened, and what to do about it.

Posted by karl at March 9, 2007 1:33 AM