Dodge; Duck; Dip; Dive; DreamHost

August 31, 2007 on 7:11 am | In Funnyish, Insider View by Josh Jones | 32 Comments

Prepare to meet your Dodge-maker!

I’m sure some of you sometimes wonder what exactly is going ON over here when the network goes out at DreamHost.

Stunned into Dodge-paralysis!

Well, we get right down to it. No matter how late, nor how dark, nor how dangerous, everybody heads over to the local park and plays some Dodgeball while we wait for it to hopefully resolve itself!

First you Dodge-run to the balls!

It’s an industry standard practice, let me assure you.

Then, you Dodge-heave the balls!

In fact, lots of times when we get there, we find several other web hosts are already playing! That’s when we take them out with our massive balls.

Dodge-Ball Square Pants!

I usually lead the assault… since nobody can see me in my camouflage shorts!

Watch the hovering red Dodge-orb!

Finally, when one lucky shmoe is declared Ultimate Dodger (small people have an advantage), they get to go back and fix everything!

The Dodge-Champion gets all the balls!

The rest of us are forced to go get drunk and eat onion rings.

It All Falls Down

August 21, 2007 on 3:18 pm | In Insider View, Updates by Josh Jones | 50 Comments

My apologies.

On the off-chance (and judging by that graph of our Level 1 queue, it seems like a pretty good off-chance) that a few of you may have noticed a little problem we had last Thursday afternoon, all the way through Friday morning, I thought I might offer something in the way of an explanation to go along with that apology.

You customers really notice no DNS!

It’s funny how problems cascade.

It all started Wednesday around noon, when we had a sudden and mysterious network problem related to our core 2 router.

There seemed to be some sort of corruption with the ARP tables.. we eventually figured it out, and fixed it thanks to a gazillion sendARPs. Cisco support wasn’t helpful because we weren’t running the latest version of their IOS router operating system. Unfortunately, upgrading is scary stuff since it requires a short network outage, assuming everything works smoothly. We decided we’d do the upgrade Friday night.

Come Thursday at 2pm, exactly 24 hours after our previous outage was fixed, our network started to get wonky again. It seemed like it was most likely due to all the sendARPs from the previous day expiring at the same time. We were pretty much on top of this as soon as it happened though, and re-sent the sendARPs (staggered this time)!

In fact, it wasn’t actually due to an aging issue at all, but it was just an IOS bug on the core router. No big deal, we pretty much had things under control should the same problem pop up again on Friday at 2pm before the planned upgrade Friday night.

One pizza after another, all laid neatly on end.

A Chain of Events

Of course, little did we know, a chain of events had already been set in motion that would ruin everybody’s Friday.. FOREVER.

You see, every hour we have a little script that runs that purges old dead entries from our active nameserver database. Really, it isn’t the end of the world for us to keep that old stale stuff around, but in the name of being good DNS citizens, I guess it’d been decided a while ago to remove them quickly.

Which is fine, I guess. However, the method by which we decided what entries should be removed was a bit suspect.

We first create a hash of ALL good domains “%domids” from our hosting database. Then, we go through all domains (as “$domid”) in our nameserver database and do:

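# %domids holds every good domain id from the hosting database;
# $domid is each domain id found in the nameserver database.
unless ($domids{$domid}) {
    print "- removing stray records under non-existent domain $domid\n";
    $pdb->do("DELETE FROM records WHERE domain_id=" . $pdb->quote($domid));
}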

Which works pretty well, assuming everything is working pretty well.

Well, everything was not working pretty well on Thursday. Because of the network weirdness, the connection to the hosting database apparently didn’t work, leaving %domids blank.

And, due to the excellent error handling and sanity checking of that script, it did not die at that point, or even so much as raise an eyebrow as it happily decided to delete every single domain in our DNS database.
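A minimal sketch of the kind of guard that would have stopped this, written in Python rather than the script’s Perl. It is an illustration only, not DreamHost’s actual code: the function name, the `max_stray_fraction` parameter, and its 5% default are all assumptions made up for the example. The idea is to refuse to purge when the list of good domains comes back empty, and also when an implausibly large share of the nameserver database looks stray.

```python
def find_stray_domain_ids(good_ids, nameserver_ids, max_stray_fraction=0.05):
    """Return nameserver domain ids that are absent from the hosting database.

    Raises RuntimeError instead of returning a suspicious purge list.
    """
    good = set(good_ids)
    present = list(nameserver_ids)

    # An empty "good" set almost certainly means the hosting query failed,
    # not that every customer left overnight.
    if not good:
        raise RuntimeError("no good domains loaded; refusing to purge")

    stray = [domid for domid in present if domid not in good]

    # Hypothetical threshold: a normal run should find few or no strays.
    if present and len(stray) / len(present) > max_stray_fraction:
        raise RuntimeError(
            f"{len(stray)} of {len(present)} domains look stray; refusing to purge"
        )
    return stray
```

With an empty `good_ids`, as on that Thursday, the function raises before a single DELETE is issued.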

I think I can see my site in there..

Now, for bad or good, it didn’t just hose the whole table at once. Instead, it just deleted one domain after another, in order.. which turned out to be a rather slow process on a busy DNS database. In fact, 22 hours later when we finally found it STILL RUNNING (normally it finishes in under a minute since there’s nothing to delete) it had only deleted a third of the domains in the table.. about 300,000. Hooray!

It actually would have been a lot better if it’d just hosed everything at once. It would have been much easier to detect, and rectify, immediately.

Instead, things worsened gradually. It took over two hours before we even started getting reports from customers that their sites were down. At that point, it seemed like the problem was just some sort of residual effect of the network problem, and re-generating DNS for each person who wrote in fixed it right away, and for good.

As time went on, and the problems kept coming in, we realized there was a pretty major data loss in the nameserver database, and started running some scripts to regenerate it all. Those would take a couple of hours, but when they were done everything would be better, we assumed!

It wasn’t until those regeneration scripts finished and we discovered there were still lots of missing domains that it finally dawned on us.. DNS records were continuously being deleted!

And THAT is when we finally found the culprit, fixed the mess, and started trying to make sure this would never happen again!

When it rains, we’re poor.

And where was DreamHost Status for all this?

DreamHost Status was down. (See, if you just read DreamHost Status you would have known that!)

Like they’ve said before, when it rains it pours.

We thought DreamHost Status was down because of the huge crush of people trying to access it due to the network problems. So, when we could finally get into it, we switched it to a static html page to try and lighten the load.

Lighten the load it did not!

Right about then we got a message from our remote data center in San Francisco (both ns2.dreamhost.com and dreamhoststatus.com are kept completely off our main network and in a different city exactly so they wouldn’t be affected by outages like this!)

Your server’s switchport has been de-rated to 10 Mb/s because your server began generating an out-bound storm of packets. This type of event usually indicates a compromise in security.

We have taken this action to mitigate the amount of bandwidth transfer charges incurred by your account related to this activity

Man, what timing! We did not need a DDoS attack right now.

But wait a second. Somehow that just seemed a little bit TOO Murphy-esque. And, indeed, when we probed them further, they told us:

According to my monitor, it appears you’re being DDoS attacked on your DNS service (UDP 53) specifically to IP 208.96.10.221. At 5a, your traffic peaked our threshold for dangerous amounts of packets going through your switch port which was when your server was de-rated.

That “Distributed Denial of Service” attack was actually just honest DNS requests!

Which was super-high because ns1.dreamhost.com was returning “I don’t have any records for that domain” for a ton of domains, due to the deletion of the DNS database entries, due to the haywire script, due to the network blip, due to the IOS bug, due to us not upgrading as quickly as possible because of the network downtime involved!

After Math is Art!

The Aftermath

Well, we did the IOS upgrade and it looks like it fixed the networking problems.

We also made our crazy script do some sanity checking. But more importantly (and in just two lines of code!), we’ve now set all our internal scripts to just DIE MISERABLY if they ever get any kind of un-good data from an sql query. Clearly, ’tis better to not do something you were supposed to than to do something you were not supposed to!
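The post doesn’t show those two lines, so here is only a rough illustration of the fail-fast idea, sketched in Python with the standard `sqlite3` module. The `fetch_or_die` helper and the `domains` table are hypothetical names invented for the example. Errors from the database are left to propagate, and an empty result is treated as a reason to stop rather than something to act on.

```python
import sqlite3


def fetch_or_die(conn, sql, params=()):
    """Run a query and return all rows, raising if nothing usable comes back."""
    # sqlite3 raises sqlite3.Error on a failed query; it is deliberately
    # not caught here, so the calling script stops.
    rows = conn.execute(sql, params).fetchall()
    if not rows:
        raise RuntimeError(f"query returned no rows: {sql}")
    return rows


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema, only here to make the example self-contained.
    conn.execute("CREATE TABLE domains (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO domains (name) VALUES ('example.com')")
    print(fetch_or_die(conn, "SELECT id, name FROM domains"))
```

Whether an empty result is really an error depends on the query; for a "list of all good domains" query like the one in this story, it almost certainly is.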

We’re also going to separate good old DreamHost Status from absolutely everything else DreamHost related.. even if that means moving it to blogger or something!

We must break the cycle!

The Internet is not for People

August 10, 2007 on 2:55 pm | In Insider View, Musings by Josh Jones | 43 Comments

Johnny Five is a jive!

Not good, decent, people like you and me, anyway.

Maybe it used to be. Maybe, back in the old days of akebono.stanford.edu and hit counters and free porn you could find an actual, true, honest-to-goodness, person on the Internet. But not anymore. Nope, not anymore.

You never count your money when you're sitting at the table.

What am I talking about?

Robots man, robots.

Just like in the future, we are living in a world of robots. Or, as they prefer to be called, “bots”. Or, as they prefer us to get used to calling them, “overlords.”

I blame the Japanese.

“Why do I bring this up”, you ask? Why now, when robots have been building our cars, walking on Mars, and marrying our daughters for decades?

I’ll give you a clue. It has something to do with that new DreamHost PS service I mentioned a scant one post ago.

Give up?

Good, I win! Now let me explain.

The only reason you’d want your own Private Server is to be isolated from other sites on your shared server. And the reason you’d want to be isolated from them is so nobody but YOU can crash your server. And the REASON sites crash their server is because they’re getting more visits than they can handle.

It seems to me though, more visits than they can handle is a hugely varying value. For some sites, just one visit is too many. For others, say, a nice static html page, there is pretty much no limit.

Nevertheless, most sites on one of our shared servers, even the really poorly coded ones, really have no problem handling a few thousand visitors a day. Problems only start when a completely dynamic site gets tens of thousands of daily visitors.

In fact, one of the sites we used to test out DreamHost PS fell in exactly this category. It was a frequently updated, decently popular blog (and for SOME reason, blogs just can’t be static html, can they? oh nooooo….), and on an average day, it got over 10,000 unique page visits (that’s not counting images, css files, etc..).

The blog was constantly causing problems on their shared server. We had them turn on caching, but it would still spike frequently and suck bazookas of memory. I guess it was just TOO POPULAR! Imagine, tens of thousands of good old human beings reading that blog, every single day! It only made sense that a site of this magnitude would need its own private server.

In fact, judging by the amount of load we see on servers, we must host a lot of sites in the five figures of daily visitors. But something about that just didn’t sit right with me. Just from running a few of my own stupid sites, I know how hard it is just to get in the ones figure of daily visitors.

So, I decided to look at the log file from yesterday, August 9th, 2007… and lo and behold.. The Internet is not for People:

Pages   Percent   Type
11406   100%      Total Page Views
 8033   70.4%     Spiders.. Yahoo, Google, MSN, Ask.. including 20% mystery spiders (I assume up to no good!)
 1943   17%       Comment Spammers
  798   7%        RSS Readers and Aggregators
  632   5.6%      Actual Humans ©

We’re a minority out there, you and I! The Internet circa 2007 is made of robots, by robots, for robots! By rampant extrapolation, almost 95% of the page views to the entire Internet are made by machines!
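The arithmetic behind that claim can be re-derived from the page counts in the table. The short Python sketch below does that; the dictionary keys and function names are just labels chosen for the example. Computed this way, the human share comes out at roughly 5.5% (a hair under the table’s rounded figure) and the machine share at roughly 94.5%, which is where “almost 95%” comes from.

```python
# Page counts as reported in the table for August 9th, 2007.
page_views = {
    "Spiders": 8033,
    "Comment spammers": 1943,
    "RSS readers and aggregators": 798,
    "Actual humans": 632,
}


def shares(counts):
    """Return each category's percentage of the total page views."""
    total = sum(counts.values())
    return {name: 100.0 * count / total for name, count in counts.items()}


def machine_share(counts, human_key="Actual humans"):
    """Return the percentage of page views not made by humans."""
    total = sum(counts.values())
    return 100.0 * (total - counts[human_key]) / total


if __name__ == "__main__":
    for name, pct in shares(page_views).items():
        print(f"{name}: {pct:.1f}%")
    print(f"Machines overall: {machine_share(page_views):.1f}%")
```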

Our future?

However, in the end, all these machines are doing is trying to organize things a little better for us humans. It would be no fun at all to visit every website in the world each and every single time you wanted to find a picture of a monkey eating ice cream! Better to let our future omnipotent masters do the dirty work for now.

(Also! When I examined the “actual humans” visits more closely, 40% of those hits were the result of an image search.. and 35% were multiple pages by the same human. Meaning overall, only 149 different people actually visited that blog yesterday to actually view it in its intended entirety … barely 1% of the total page visits!)

All these robots cause problems though. It’s been well known since 1994 that 99.99% of the sites on the Internet get absolutely NO traffic. It’s how web hosts make money.

But now, that’s all changed! The only thing safe to say is 99.99% of the sites on the Internet get absolutely no HUMAN traffic! Every site now gets search engine spiders, feed aggregators, and spammers.. a veritable ARMY of undead automotrons! And those undead robot hits hit your site just as hard as living human hits.

A ton of times in the past, a site was crashing a shared server, and it turned out all we had to do was block Googlebot from visiting it and everything was fine. We figured that was better than just disabling the entire site, and yet sometimes we caught some crazy flak for it! Which saddened us so greatly we even made a wiki page to try and explain the situation.

It probably would have been better to just disable these sites.. it’s not like any humans were actually visiting them anyway.
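The post doesn’t say how the blocking was done. One common approach, shown here purely as an assumption, is a `robots.txt` rule, which well-behaved crawlers honor voluntarily. The Python sketch below uses the standard library’s `urllib.robotparser` to check what such a file would allow; the rules and the `example.com` URL are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out Googlebot, leave everyone else alone.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Disallow:
"""


def can_crawl(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "http://example.com/blog/"
    print("Googlebot allowed:", can_crawl(ROBOTS_TXT, "Googlebot", url))
    print("Other bot allowed:", can_crawl(ROBOTS_TXT, "SomeOtherBot", url))
```

Note that this only keeps out crawlers that choose to obey the file; comment spammers and other ill-mannered bots are unlikely to care.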

After blocking Googlebot, people then had two options:

1. Keep blocking.
2. Fix their site’s inefficiencies and un-block.

At least now our Happy Customers have a simple third option:

3. Pay us more money.

Just in time too, I’ve noticed my robot attack insurance premiums have been increasing recently… how strange.

What a CON!

August 2, 2007 on 11:56 am | In Business, Insider View, New Features, Promotions, Tech News by Josh Jones | 66 Comments

Go green? There was much internal debate at the highest of levels!

Well, it wasn’t a TOTAL con.

At least Dallas and I didn’t pay anything to go. He was on a panel about green hosting, and I got free admission by signing myself up as “press”. I guess in a way I’m paying now via this feeling of obligation to blog-post about it though.

Anyways, now I finally understand why we say we don’t go to hosting conferences.

They’re not for us.

Overall, we just got a really “businessy” feel from the whole thing. I mean… we can’t be the only host who’s just doing this until our band makes it big, right? And man, nobody told me to wear a logo collared short-sleeved shirt; the official uniform of hosting cons.

My two Dallii.. yes, a Korean Dallas!

What Happened

Basically, we checked out the display booths (it looks like the new trend is to give away Wiis, iPhones, and Mini Coopers.. sadly, Dallas already has those, and I don’t want a Mini Cooper because I hate the environment), had three meetings, and went to three talks, the best of which by far was Dallas’s panel. And that was just because I interrupted a lot.

We also went to the keynote, which was by some MySpace founder guy, and probably the second most-famous person at the con, Carson Daly, showed up.

(The most famous? Hmmm… well, I don’t see you reading the Last Call Blog, do I?)

The booths really didn’t do anything for me.. it was almost entirely places offering pre-packaged software (we use only open source or develop our own) and out-sourcing / reselling opportunities (again, we try to be as “vertically integrated” as possible, and don’t outsource anything besides our data centers and network connectivity.. plus, any add-on service we do add we develop (and fully control) ourselves).

We were a little shocked to find out that some fairly sizable hosts just use The Planet for their entire infrastructure… they don’t own any of their servers!

The talks didn’t really do anything for me.. I already knew all the gibberish Dallas was going to say.. so predictable, man!

The next talk, from a Tier1 Research guy, allowed me to self-affirm the seemingly irrational disdain I’ve always held for market research companies. His talk was entitled “Marketing Web Hosting Services in a Rapidly Transforming Market” and basically his message was “I think everybody should partner with Microsoft and other value-added resellers to make more money by offering more junk to your customers.”

Exactly what we don’t want to do.

Oh, and he also threw in for good measure “Just offering lots of disk and bandwidth isn’t going to get you any more customers.” Ah, now that actually sounds like a pretty reasonable assumption, Philbert… if only it weren’t 100% exactly WRONG! “Research” is always easier when you just declare your hypothesis correct rather than bothering to actually test it…

(Ouch, my punches are un-pulled!)

A beautifully aesthetic curve at the top.

(Oh yeah, and despite what I said before, bad stuff DID happen while we were away. A $64,000 rack of NetApp storage got dropped on the loading dock by the delivery guys! The gentle curving of the rack you see above is not to reduce wind resistance.)

The last talk we went to, before we decided we had to stop for fear of death (and not by boredom actually, but by freezing in the lecture halls!) was by the founder of Open Hosting, entitled “Virtual Private Server Hosting with Utility Pricing.”

I had some high hopes for this talk; at least the guy giving it actually runs a web host! Unfortunately, it turned out to pretty much be a bust. I guess there’s just not a lot of insight to be gleaned from a host with 2,000 times fewer customers than you!

Also, it turned out what this guy called “utility pricing” wasn’t anything of the sort. It wasn’t something cool like Amazon … instead, he had regular old (and not very generous) monthly plans with hefty overage fees for excess CPU and memory.

The whole point of “utility pricing” is if you don’t actually USE something, you don’t have to PAY for it! Not to still pay $19.95/month minimum no matter what! This guy has taken the worst from both worlds and combined them.. no “overselling” and yet still a high minimum monthly fee! Where’s the VALUE?
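To make the difference concrete, here is a small Python sketch comparing the two billing models. Only the $19.95 minimum comes from the talk as described above; the per-unit price, the included allowance, and the overage rate are invented numbers, and the function names are hypothetical.

```python
def usage_only_bill(units_used, price_per_unit):
    """Pure pay-for-what-you-use: zero usage means a zero bill."""
    return units_used * price_per_unit


def plan_with_overage_bill(units_used, monthly_minimum, included_units, overage_per_unit):
    """Flat monthly plan plus overage fees for usage beyond the included amount."""
    extra_units = max(0, units_used - included_units)
    return monthly_minimum + extra_units * overage_per_unit


if __name__ == "__main__":
    # Assumed rates for illustration: $0.10 per unit on the usage-only model,
    # 100 units included and $0.25 per extra unit on the monthly plan.
    for used in (0, 50, 200):
        utility = usage_only_bill(used, 0.10)
        plan = plan_with_overage_bill(used, 19.95, 100, 0.25)
        print(f"{used} units: usage-only ${utility:.2f}, plan ${plan:.2f}")
```

At zero usage the first model bills nothing, while the second still bills the full monthly minimum.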

The Open Hosting guy also claimed that they were the only Linux-VServer-based host in the U.S. Say whuuuuut?

Who Happened

On the bright side, every person we met was very nice… plus I got to taunt lunarpages, as well as eat lunch with the just-a-little-bit-less-cool-than-us Media Temple entourage. I also got to meet all my secret admirers, and let me tell you, THERE WERE A FEW.

Honestly, I guess if there’s any reason for us to ever go back to a hosting convention, apart from avoiding our smelly employees, it’d probably be the chance to try and recruit some decent “human capital”. That’s what it’s known as in the “biz”, which is what the biz is known as in the “biz.”

P.S… I Love You.

Oh, before I forget, there was maybe one more tiny thing that came out of our three days in sunny Chicago. We got an idea for a brand new feature.. and it’s already ready to go!

Perhaps it was the Tier 1 guy yammering on about upselling, or maybe it was the Open Hosting guy’s illuminating discussion of Linux-VServer, but we’re not here to play the blame game.

Nonetheless, for some reason, we’re now proud to announce our first entirely new product in a lonnnng time: the massively simple, tremendously useful, surprisingly cheap, and enticingly prestigious, currently invite-only DreamHost PS!

(Yep, DreamHost just became one more American host offering Linux-VServer. And Open Hosting just became one American host offering Linux-VServer less special.)

