Central Point of Failure

by Dan Murray

Published January 31, 2001



Microsoft Corp.’s Websites were offline for several days last week due to mysterious problems with the company’s Internet routers. The company claimed that unscrupulous individuals had maliciously attacked its system. Independent security analysts, however, found that the principal source of its problems was internal.

Kerry Newman, London Financial Times reporter, wrote, “Microsoft, like many governments, goes into stealth mode when the chips (or servers) are down. Calls and emails to their press relations people are typically greeted with silence or tepid assurances that the company is ‘working on it.’”

Last month, Microsoft’s Websites were visited 54 million times, AOL’s 61 million times and Yahoo’s 55 million times, according to Jupiter Media Metrix research.

People who book plane or hotel reservations with Expedia couldn’t access the travel information they depended upon. Tech people working with Windows were without answers to questions that plague their daily lives. MSNBC’s news site was blank.

Eighty-four million Hotmail users couldn’t get their email. When asked, Microsoft’s Adam Sohn said that it was difficult to say whether Hotmail’s problems were directly related to the server outage, or to “the normal difficulties that sometimes occur with Hotmail.” Fred Futher of AOL said, “It’s a free service. No offense intended to Microsoft, but you get what you pay for.”

Frustrated Hotmail users pleaded with news organizations, demanding to know when their service would resume. It seems the American public could be indifferent to a declaration of war, but take away their email and they are up in arms.

“I’ve noticed an alarming amount of Microsoft public departments were unavailable for comment to the press,” said John Markham, a retired professor of journalism at Massey College.

Maynard Grepler, an independent security analyst, said, “Microsoft is infamous in this industry for taking two or three weeks to respond fully to bug attacks. They lag on response time often enough that it’s troubling. They just stonewall you, almost as a reflex.”

“This is a typical old-economy response to queries,” Markham commented. “They stonewall you, wait for the story to appear, decide how much damage has been done and respond to that, rather than giving you something that you might not have found out on your own. It’s a distrustful practice.”

The company announced later that a lone technician had erred while changing the configuration of a router on the network that serves the computers that guide Web surfers to Microsoft’s many named sites.

One anonymous Microsoft technician told reporters, “Complete information about our network is from easily accessible records. Anyone could have figured out how we were set up. Everything was so messed up that we weren’t sure if it was an attack. We were like the proverbial deer caught in the headlights.”

Greg Keefe, the owner and operator of domain name service provider HammerNode.com, said, “Microsoft should have backup equipment ready to go at a moment’s notice for this type of emergency. Their inattention to technical details is inexcusable.”

In the wake of the worldwide blackout of Microsoft’s Websites, Keefe (and others) observed Microsoft “frantically off-load the management of their routing to another company.” That company, Akamai, uses the rock-solid open-source Linux operating system. Matt Power of the BindView Corporation observed that the affected Microsoft domain names included microsoft.com, msnbc.com, hotmail.com, carpoint.com, slate.com, zone.com, windowsmedia.com, homeadvisor.com and passport.com.

“Microsoft did not follow industry best practices,” said Michael Warfield, senior researcher at Internet Security Systems’ X-Force, which monitors and studies network security breaches.

“Failure to be prepared, coming from a Fortune 100 company that develops and sells DNS software as part of their core business, contributes to distrust,” added Keefe. For that reason, he doubts that Microsoft was attacked by outside crackers. Daniel Todd, chief technologist at Keynote Systems Inc., echoed that statement: “These do not appear to be denial-of-service attacks against their Websites.”

Whether the cause was human error or mischief, security experts concur that Microsoft’s network configuration problems are far more serious than the company recognizes or admits. Techies digging for information discovered major flaws in its basic network plumbing.

Problems like these can beset any Web business, but most are operational again rather quickly. Standard practice includes at least one separate alternative path to other networks for such a contingency. But in this situation, Microsoft’s safety net was its primary network. The vast resources and technical talent employed at Microsoft were insufficient to identify and remedy the blockage expeditiously.
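For readers who want to see what “a separate alternative path” means in practice, here is a minimal sketch in Python. It assumes a list of name server addresses and simply checks whether they all sit on the same network segment. The addresses are placeholders from ranges reserved for documentation, not Microsoft’s actual servers, and the function names and the /24 grouping are illustrative assumptions rather than an industry-standard test.

```python
import ipaddress

def networks_for(addresses, prefix_len=24):
    """Return the set of networks (at the given prefix length) containing the addresses."""
    return {
        ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        for addr in addresses
    }

def has_network_diversity(addresses, prefix_len=24):
    """True if the addresses span more than one network at this prefix length."""
    return len(networks_for(addresses, prefix_len)) > 1

# Hypothetical setup: every name server sits behind the same network segment.
single_segment = ["192.0.2.10", "192.0.2.11", "192.0.2.12", "192.0.2.13"]

# Hypothetical setup: name servers spread across separate networks.
spread_out = ["192.0.2.10", "198.51.100.10", "203.0.113.10"]

print(has_network_diversity(single_segment))  # False
print(has_network_diversity(spread_out))      # True
```

Sharing an address prefix is only a rough stand-in for sharing a router. Servers on different prefixes can still depend on one piece of equipment, so a check like this can flag an obvious single point of failure but cannot rule one out.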

The incident is an acute embarrassment for a company that promises unprecedented reliability in marketing its Internet products. Analysts like Todd are concerned about the company’s proclamations.

For instance, Microsoft’s newest pet project, called “.Net,” redesigns Windows software so that applications can be delivered over networks on demand. A $200M advertising campaign proclaims the reliability and performance of its “enterprise” products.

Columnist Peter Wayner says, “We’re moving back to a vision of one central point of failure. There are advantages, and dangers. They want to put everything in one central spot. If that breaks, we’re hosed.”

Everyday Web surfers who use a Microsoft site as their startup Web page thought the “Internet was down.” Of course, that’s an exaggeration, but to them, their entire Internet experience had come to a halt.

The modern adage is: “No news is bad news.” Keeping a secret is considerably more difficult in a wired world.