Anatomy of a(n ongoing) Disaster..
August 1, 2006 on 12:29 pm | In Foobars, Insider View, Updates by Josh Jones |
What a three weeks…
As I’m sure most of you already know, we’ve had nothing but troubles, large troubles, for pretty much the last three weeks. A lot of these troubles were our fault, a couple of them were at least ostensibly beyond our control, and they all compounded each other.
Here I’ll try to go into as much detail as possible about what happened, why, and the steps we’re taking to stop this sort of thing from ever happening again. I can’t excuse what happened, just apologize and hopefully elucidate.
Ironically, all the recent disasters stem somewhat from us attempting to take some proactive steps to head off any sort of future power outages like the kind we experienced last year.

The Back Story
As some of you may know, we are co-located with Switch and Data in The Garland Building in downtown L.A. To say we’re co-located is a bit misleading though, since we’re now basically 95% of their data center.
Why don’t we have our own data center?
Because, believe it or not, we’re still not big enough for it to make sense. Even now, we only use about 1000 sq ft of data center space.. for it to really start to make sense to get our own space, we’d have to be using around 2500 sq ft. Mainly because when you buy a data center, you want to get one big enough to handle a lot of growth.. and although it’s cheaper per square foot than co-locating, you have to pay for all the space you’re not using yet.
And really, The Garland Building is supposed to be an excellent place for data centers. There are more than a dozen in the building. Companies like iPowerWeb, Media Temple, BroadSpire, and even MySpace (now the most popular website in the whole US!) are in there. It’s got FIVE huge generators, UPS for the whole building, on two separate power grids, and a dedicated engineering staff to make it all work flawlessly. Or so we were all assured.
Around last June though, the building informed all its data center tenants that they had essentially run out of power! Not power altogether, but the “good” power that data centers need.. i.e. UPS- and generator-backed power. Because Wells Fargo, who holds the master lease on the building, wasn’t sure if they were going to renew the lease when it is up in three years, they didn’t want to invest the millions of dollars to add more generators and UPSes to increase capacity. This is in fact the primary reason we’re still not selling any more dedicated servers .. they use too much power per dollar!
Of course, none of that was supposed to have any effect on their ability to keep the current power going in the case of an outage. On September 12th, 2005, we discovered they actually couldn’t… when two of the five generators failed!
However, since then, the building has repaired and replaced the faulty generators, and given all their tenants numerous assurances that what happened before would never ever happen again.

Why didn’t we move data centers right then?
That would have been a fairly massive undertaking, resulted in even more down time, been very expensive, and actually we did look around and there weren’t any really good options for moving… data center space is becoming pretty tight (in the LA area at least) and the Garland Building is still one of the best options, believe it or not. Also, this was the first time something like this had ever happened, and it seemed pretty reasonable that it wouldn’t happen again. We even asked around and none of those other tenants mentioned above were moving, so I guess it seemed like people were generally pretty confident it was a one-time freak occurrence.
Nevertheless, we started making contingency plans, searching around for another data center that had some power and would make sense for us. Eventually, we found Alchemy, just down the hall from S+D actually, and began making arrangements for getting some space from them. They had a little bit of power available because they were moving some of their clients out to El Segundo, and because they had gotten permission from the building to install their own generator. With that generator and some UPSes they were able to convert a “dirty” power feed into “clean” (i.e. good for data center use) power.

How the troubles began.
All this took a very, very, very long time. After months of searching and negotiating with Alchemy, we still had to get Switch and Data to allow us to put a cross-connect in from their data center over to their competitors down the hall. After even more months and teeth-pulling, we finally got that up and running. In fact, we finally got the first live server up in Alchemy a little less than a month ago.
All this in an attempt to head off future power problems.
Unfortunately, shortly after setting up the new footprint, we noticed something wasn’t right. Getting to Alchemy from Switch and Data we would lose huge buckets of packets. Just as we were trying to figure out the problem, we started to have problems with one of our file servers.
This resulted in a lot of problems across the board. The web servers that mounted that filer all had problems. The mail servers that mounted that filer all had problems. In fact, one of the mail servers was mis-configured and was logging thousands of errors a second to a remote logging machine… so many in fact that it was saturating its switch and clogging up a whole chunk of our network. Which in turn caused other machines to get slow and crashy because they couldn’t get to their filers, and so on and so on.
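The log flood described above is the kind of failure a rate limit on the sending side can contain. What follows is only a minimal sketch, not what we actually run: it assumes a Python service using the standard `logging` module, and the class name `RateLimitFilter` and its `rate` and `burst` parameters are made up for illustration.

```python
import logging
import time


class RateLimitFilter(logging.Filter):
    """Token-bucket filter (hypothetical helper, not part of the stdlib).

    Lets through at most `rate` records per second on average, with short
    bursts of up to `burst` records, and counts everything it drops.
    """

    def __init__(self, rate, burst, clock=time.monotonic):
        super().__init__()
        self.rate = float(rate)      # tokens added per second
        self.burst = float(burst)    # maximum tokens the bucket can hold
        self.tokens = float(burst)   # start with a full bucket
        self.clock = clock           # injectable so the filter can be tested
        self.last = clock()
        self.dropped = 0

    def filter(self, record):
        # Refill the bucket according to the time elapsed since the last call.
        now = self.clock()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True              # record is passed on to the handler
        self.dropped += 1
        return False                 # record is discarded locally


if __name__ == "__main__":
    handler = logging.StreamHandler()
    limiter = RateLimitFilter(rate=10, burst=5)
    handler.addFilter(limiter)
    log = logging.getLogger("mail")
    log.addHandler(handler)
    for _ in range(1000):
        log.error("cannot reach filer")
    print("dropped", limiter.dropped, "records")
```

Attaching the filter to whatever handler ships records to the remote logging machine would cap the traffic one broken server can generate, at the cost of losing some of the repeated messages.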
It turned out the filer problem seemed to stem from the fact that we had one shelf of 300GB disks and one shelf of 150GB disks on it. Apparently they’re not supposed to be able to support this, or at least it’s a bad idea. So, this was entirely our fault. However, we did have a number of other filers we did this on, and we’d never had problems before. Nonetheless, we will never mix disk shelf types on a file server again.
We eventually cleared all this up.
However, the Alchemy connection problems were still ongoing.
After trying all sorts of things, we eventually decided to replace one of our distribution switches that was acting strangely with a new one. This didn’t really seem to fix the problem either. This was on Friday, July 21st.

On Saturday, July 22nd, the building lost power.
This time, the generators actually worked, but the UPS failed! Honestly, it was much better than last year’s.. but unfortunately, even a brief power outage wreaks havoc on a data center. And this one wasn’t so brief.. here’s the building’s explanation:
At around 5:21pm, on Saturday July 22nd, a brown out occurred due to record high temperatures in downtown Los Angeles. Voltage dropped due to the high demand of electrical current along with equipment failure operated by the Department of Water and Power, City of Los Angeles. This condition caused failure of “ATS-B” switch and to UPS Module #3. Engineering crews were dispatched and began repair of this damaged equipment. A power interruption was required to replace contacts in “ATS-B”.
Repair of “ATS-B” failed contacts was completed on 7-24-06. Power was restored between 4:00am and 4:30am by the Engineering department.
Thank you,
Office of the Building
So, after all the emergency filer stuff going on the previous weekend, just about the entire admin team was back last weekend, working on getting everything back up when power came back on. Even when we had power, it was in a degraded state and so the cooling wasn’t working. As temperatures rose, file servers automatically shut themselves down rather than risk being damaged by the hostile environment. Apparently, MySpace made the decision to just keep all their servers off until cooling was restored.

More network troubles..
After the power outage, we decided to just yank everything back out of Alchemy (they lost power too!) until we could figure out what was going on with the network to there. Unfortunately, this didn’t seem to fix things, and our internal (“red”) network was still really fubar. When our red network isn’t working, the panel isn’t working, webmail isn’t working, and our server configuration system starts having problems (basically, anything that connects to our internal databases).
It took us just about all of Monday to figure out (and then fix) that a lot of the file servers had bad routes after being power-cycled.. and so were sending ALL their traffic through the red network, saturating it. These things are generally pretty stable and a lot hadn’t been rebooted since September 12th, 2005.. and some had apparently had their networking set up by hand instead of correctly configured via our database. We’re making sure that doesn’t happen anymore either.
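To make the “bad routes” problem concrete, here is a rough sketch of the kind of check that would catch hand-configured networking before a reboot exposes it. It is an illustration under assumptions, not our real tooling: the function name `find_route_drift`, the interface names (`green0`, `red0`, `stor0`) and the subnets are all invented, and both route tables are assumed to have already been loaded into plain dictionaries (one from the configuration database, one from the running machine).

```python
def find_route_drift(expected, actual):
    """Compare two route tables given as {destination: interface} dicts.

    Returns {destination: (expected_iface, actual_iface)} for every
    destination whose interface differs; a missing route shows up as None.
    """
    drift = {}
    for dest in sorted(set(expected) | set(actual)):
        want = expected.get(dest)
        have = actual.get(dest)
        if want != have:
            drift[dest] = (want, have)
    return drift


if __name__ == "__main__":
    # What the configuration database says the file server should have
    # (hypothetical values).
    expected = {
        "default": "green0",
        "10.50.0.0/16": "stor0",   # storage traffic
        "10.10.0.0/16": "red0",    # internal databases
    }
    # What the machine came up with after a power cycle: the storage route
    # is gone and the default route points at the internal network.
    actual = {
        "default": "red0",
        "10.10.0.0/16": "red0",
    }
    for dest, (want, have) in find_route_drift(expected, actual).items():
        print(dest, "expected", want, "but found", have)
```

Running a comparison like this on a schedule, rather than only after an outage, would flag machines whose live state no longer matches the database.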
More network troubles..
Once that was fixed, things generally got better. Except there was STILL strange stuff going on (causing slowness and high loads around the system, but not an actual system-wide outage), even without NFS traffic going through red, and even without anything at Alchemy. It started to look like there was a problem with one of our core routers. We called our Cisco consultant and opened a trouble ticket with Cisco themselves..

More power problems..
On Friday, July 28th, we lost power again. The building wrote:
The Garland Building experienced a dead short which resulted in a brief power outage today, July 28, 2006. The air conditioning, elevators, and the electrical utility have all been restored.
While on generator power, a dead short occurred from one of our internal telecom users. We are investigating where the dead short occurred. A follow-up memo will be sent by the end of the business day reconfirming our transfer at 11:30pm tonight. We are currently on DWP power until further notice.
And then:
The Garland Building UPS System is back on-line supplied by DWP. Diesel generators have returned to an on-call status.
The 11:30pm transfer has been cancelled due to the dead short prematurely returning us to utility power. At 4:30pm the engineers engaged the UPS System to protect all tenants at the Garland Building.
Thank you,
Office of the Building
This time, we were able to get our entire system back up much quicker and with close to no problems. Of course, it had been less than a week since our last power outage.
Alchemy was the only data center in the building who did not lose power this time.

More network troubles..
Over the weekend (this last weekend), we kept having the same ongoing weird network problems I mentioned above, and Cisco hasn’t made much progress. Yesterday, we realized the new distribution switch (an Extreme) was causing spanning tree problems with the older Ciscos. Jeremy got it all figured out, but in the process it erroneously blocked our “green” (public!) network for a few brief periods, taking down everything again.
Unfortunately, that fix STILL doesn’t seem to have fixed the ongoing core network problems. We were finally able to get our tickets escalated with Cisco yesterday. It is starting to look like something may have been damaged during the first power failure, although we’re not sure. The replacement/repair cost might be around $80,000 it looks like.

And that’s where things stand today.
Our number one priority right now is getting this nagging network problem understood and fixed. Once that’s the case, we should be able to put things back in Alchemy, who didn’t lose power on Friday at least. Once things are going good there, we’ll be able to add new servers and transition old ones slowly with little to no downtime.
We’re also going to be buying our own UPSes, since we have learned we can’t trust our data center OR our building to do it. We’ll start by putting the core routers on them, then our internal databases and servers, then our file servers, and finally the hundreds of customer mail, web, and database servers.

Finally…
We’re very sorry for what happened. We definitely don’t want it to happen again, and we’re trying to take all the practical steps we can to prevent it. We never want to have another July 2006 again.
Ironically, some of the network problems seem to have stemmed from us trying to better protect ourselves from power failures. I also want to say for the record that none of these problems in my opinion stemmed from “overselling”. Rather, I’d say it’s the result of bad luck. And incompetence on our (and the building’s) part.
I don’t know if we’ll be able to change our luck, but hopefully we’ve at least learned something and will be able to become a tiny bit less incompetent in the future.
I hope you’ll all stay with us to find out.
August 1st, 2006 at 12:41 pm
DreamHost Power Outage…
Downtown Los Angeles was hit by another power outage last Friday. This affected DreamHost’s services and other tenants at the Garland Building (here among MySpace), because the UPS system failed.
DreamHost was affected by a similar incident last …
August 1st, 2006 at 1:00 pm
thank you for explaining what happened over the past few days. these things happen, and you seem to be dealing with it.
some people might say these things shouldn’t happen, and in a perfect world, they wouldn’t. but this world is far from perfect, and as long as you guys (and gals) are trying, that’s fine by me.
thanks again.
August 1st, 2006 at 1:05 pm
It’s refreshing to see honesty and a company hold their hand up and say they made a mistake. Thank you.
Let’s hope your building has the same honesty in admitting their faults.
August 1st, 2006 at 1:09 pm
ditto. thanks for the good post!
August 1st, 2006 at 1:13 pm
Good to see the open and clear explanations. Thanks.
Glad it’s you running all this. ;-)
August 1st, 2006 at 1:14 pm
Josh, will the space with Alchemy allow you to start offering dedicated servers again?
August 1st, 2006 at 1:15 pm
Dedicated servers at dreamhost, ya!!!!
From,
Adam
August 1st, 2006 at 1:19 pm
I’m not going anywhere, and this blog post is a big reason why. Thanks for explaining what’s going on and being up front about everything. May your cluetrain never crash.
August 1st, 2006 at 1:33 pm
I’m still here.. :)
August 1st, 2006 at 1:34 pm
I am a sysadmin/networkadmin/etc.. these things happen. Usually they happen all at once making it seem like we are entirely incompetent, but in reality, it’s chaos taking over - strange attractors and all. Point being, for a year I experienced relatively little downtime, and all at once a bunch. In no way does that cancel out the year. I am staying here for now. - Yossie
P.s. And THANKS for the explanation - knowledge is power, and I feel a bit more powerful today :)
August 1st, 2006 at 1:40 pm
I was about to make the switch to DreamHost last September when you had a major power outage. I monitored your status page and blog and decided that I would give you a try anyway. I’m glad I did and so are my clients. Your honesty about what is going on is definitely appreciated by me.
August 1st, 2006 at 1:41 pm
thanks for keeping your customers in the loop. even with all the problems that you all have had, you’re still the best webhost that i’ve worked with.
thanks!!
August 1st, 2006 at 1:41 pm
Use one vendor for all your switches. Any other way lies pain.
August 1st, 2006 at 1:56 pm
Posts like this are the reason that I love dreamhost and continue to pimp you guys out to all my friends.
August 1st, 2006 at 2:02 pm
Great post :D… thanks for the explanation!
We only joined under a week ago, but since then have been astounded by the level of honesty and customer care you guys provide. They ought to learn a few things from you over here in the UK!!
Keep trying to fix the problems, and try not to beat yourselves up too much over this period of bad luck and trial and error. We’re not moving anywhere for the foreseeable future.
Tom :)
August 1st, 2006 at 2:18 pm
Thank you so much for putting all this out in the open! I feel hugely better now.
August 1st, 2006 at 2:28 pm
The right way to do business…
I have been a Dreamhost customer since 1999, and I only recall one significant outage over that entire period. Until the last couple of weeks. Things have definitely gotten pretty ugly. DreamHost Blog » Anatomy of a(n ongoing) Disaster.. One……
August 1st, 2006 at 2:34 pm
We use DH for non-critical stuff, like our internal company wiki and email routing. It sounds like a number of problems were brought about by the building’s inability to provide good power.
My company has looked at many colo facilities, and I was very impressed by Equinix. When I took a tour of one of their facilities, I could see many big names (Google, Amazon) located there, and it seems like they know how to run a facility. They have 3 locations in the LA area; I’d recommend looking at moving there.
The fact that MySpace is in your building doesn’t mean much. They probably never expected to grow as much as they did, and are too busy dealing with that to focus on their datacenter situation. They started small and haven’t had time to evaluate if their current facility is appropriate for the level of uptime they require.
August 1st, 2006 at 2:55 pm
Hey, if the worst thing that can happen is Dreamhost owning up to their mistakes and promising to do better, I see no reason at all to even consider switching. The federal government should be so honest!
August 1st, 2006 at 3:03 pm
I’m not a DreamHost customer, but I just wanted to come here and leave a comment to say I admire the way you’re owning up to the problems that have been occurring recently. Laying everything out in the open for everyone to see is the way business should be done!
August 1st, 2006 at 3:47 pm
I just wanted to say how refreshing it is to get such honest information these days.
Thanks and all the best for the future.
August 1st, 2006 at 3:50 pm
I still love you guys.
August 1st, 2006 at 3:51 pm
Ah, good… it sounds like you are almost ready to have a local navvy put a shovel through the fibre outside the building…
August 1st, 2006 at 3:56 pm
I have to say that without this post, I’d be outta here. Thankfully, you guys sent out the newsletter with a pointer to this. Well done!
I subscribe to the emergency status feed and I was beginning to wonder if you had just started reporting things that had not been reported before (but happening), or if you had a new person that got chatty about emergency stuff.
Glad to hear this was an anomaly. I’ll stick with you.
Again, very well done on being so upfront about the issues; but do please try your best to prevent it (I’m sure you will).
August 1st, 2006 at 4:12 pm
Thanks for the updates and the honest work. It’s appreciated and understood. Thanks again.
August 1st, 2006 at 4:21 pm
Thanks for being so up front with all of us. It’s rare to find a company who is willing to own up to their mistakes and be honest with customers. I know I appreciate it, and I’m sure most of your other customers do, too.
I’ve been recommending Dreamhost to all of my friends for years, and I’m going to continue to do so.
Keep up the great work.
August 1st, 2006 at 4:26 pm
I too appreciate the update.
I’ve been with DH for ~6 years now and have no intentions of leaving, but I would expect in this context to see something of a bone thrown to your customers. We recently experienced some downtime on Puzzle Pirates and ended up crediting subscribers with four days of free play. The downtime was pretty minimal, measured in minutes rather than hours, but intermittent and annoying, so we ‘paid up’ to our players. I would respectfully suggest doing something similar.
August 1st, 2006 at 4:27 pm
Well, despite it all, I still love you guys. And I try to direct as many people as I can to you, too. That way, maybe you can get that fancy data center (or not).
Keep up the good work….
Nancy
August 1st, 2006 at 4:35 pm
Thanks for being up front throughout this ordeal. I’m hoping that August goes better for you guys. Thanks!
August 1st, 2006 at 4:36 pm
Garland Building in Los Angeles
http://flickr.com/photos/alvy/42973478/
August 1st, 2006 at 4:45 pm
Thanks for the explanation. After four years I’m not planning on going anywhere, but you could include some more information on the customer service issues that have come to light in this. All the problems felt were not technical, and the inaccurate and untimely information being put out was at least as frustrating as the outages themselves to me.
August 1st, 2006 at 4:48 pm
You guys rock! I love the catastrophe pictures strewn about. Really conveys the mood :)
I used to work for PowWeb (before it was sold), who was also hosted at the Garland Building, so I hear where you guys are coming from.
These past three weeks have been rough on me with the outages, just like I’m sure they have been for many customers, but knowing that you guys definitely don’t WANT this to happen, and are constantly trying to make it better goes a long way.
Anyways, brighter days are ahead, and I know I am rooting for you.
August 1st, 2006 at 4:51 pm
DreamHost…
I admire honesty where I see it. Too much of the world is built on lies or at the very least deception. Recently, DreamHost (the company that hosts DigiFiend and a couple of other sites I manage) has been having……
August 1st, 2006 at 5:03 pm
go to hell, dreamhost. ive been with you 7 years and the service has been downhill since i first got on. the cutesy “family and friends” tone isn’t cutting it anymore (unless you’re completely retarded), we’re all pretty aware of the dough being raked in. i can’t believe i’m getting lame copy/paste apology responses instead of credits to my account in some form or another. i pay out my nose for you (in comparison to an equally apt host like 1and1), have referred at least 10 people to you through the years, and keep operating under this illusion that i’m getting the “best” of customer service and features. prove me right, and cut out the cuteness. focus on service and re-imbursing your customers for gigantic boners like this. i don’t care what happened to your facilities, guess what, i have problems in my life, too. do your job as a supplier and amend the situation, and credit us ALL SOMETHING for the bullshit we had to go through for this pathetic display.
August 1st, 2006 at 5:04 pm
Jeesh. Comp time for everyone!
August 1st, 2006 at 5:10 pm
It’s great to see such a detailed account of what happened. That’s why I’m still a customer.
Kind of off-topic, but: Why do internet companies constantly choose to locate in LA, which has chronic power problems in the summer? Why not Dallas, Atlanta, or Richmond? There are tons of other cities with great infrastructure, cheap land, adequate power/no brownouts, and a skilled labor force. But for some reason, LA is chosen despite its lack of adequate power during the summer. I don’t really understand that.
August 1st, 2006 at 5:15 pm
Thank you for the status report, I’m once again pleased I switched to DH earlier this year, excellent customer service and an entertaining newsletter! What more could we want?
August 1st, 2006 at 5:25 pm
Thanks so much for taking the time to give us a status report. I won’t be moving hosting companies anytime soon. You guys really rock in terms of customer support.
August 1st, 2006 at 5:25 pm
I appreciate that a company owns up to its mistakes. Also I have to note that customer support was top-notch during these incidents. While I was a bit pissed off during the occurrences, this explanation brings back the warm and fuzzies.
August 1st, 2006 at 5:26 pm
I have been with Dreamhost for about 4 years now and have recommended Dreamhost to around 20 others and I am happy that I have. Has all of this been hard on us? Yes. But LIFE DOES HAPPEN! Over all I have been VERY happy with Dreamhost and have watched them grow (and learn) and feel like this will only help them come out a little stronger! Thanks for telling us all of this so that we really can understand what happened! That is all that I ask for!
August 1st, 2006 at 5:30 pm
Thank you guys. Your customer service is amazing.
4 years and counting.
August 1st, 2006 at 5:33 pm
I’d say it’s time to hire a dedicated network engineer. You mention “consultant” in the article.
August 1st, 2006 at 5:35 pm
The hosting solution I used before you guys recently had a bunch of problems, some of which stemmed from the terrible flooding in Louisiana. So I could forgive them for the circumstances, but I could not forgive them for the terrible level of communication during and after the initial problems.
Your forthrightness and honesty are absolutely golden, and you can count on me sticking with you guys through the trying times.
keep up the good work!
August 1st, 2006 at 5:37 pm
Thanks for this write-up; we will definitely be staying with DreamHost. This is the kind of status report that keeps customers happy — well done!
August 1st, 2006 at 5:38 pm
[...] How do you treat your customers when you let them down? DreamHost did about as good a job as I can imagine after they were down during a power outage that they couldn’t control. Read the comments where customers are coming back — corporate blogging done right. Thanks to Dylan Bennett for emailing me this. [...]
August 1st, 2006 at 5:39 pm
Without this post I would seriously consider moving. In fact I even hosted one site at Godaddy.com in anticipation of moving, however, I will stay for now. Personally, I think you should at least offer your customers some sort of compensation due to the lack of planning. speaking of which:
I live in Florida, I hosted with DH in CALIFORNIA. Why did I do this? HURRICANES. I know that every year hurricanes are going to take out a large part of Florida’s infrastructure. I always laugh my ass off at the idiots driving hummers and mercedes to water and ice lines in the heat while I drive my beat up Saturn around and go home to hot water and a cold refrigerator sitting on my Air-conditioned ass Watching DVD’s on the projector. I bought a generator rather than a hummer. It just made more sense. Same thing with a gun. my mom always bugs me about owning a worthless gun I never shoot. But if someone ever attacks my family at home, I can protect my loved ones. And yes I keep it LOADED and COCKED and yes the kids know what it is and where it is and how to handle it. That way when they see one they will know how to handle it rather than blowing their damn friends head off.
If you prepare for the worst you will be prepared for the best. Perhaps DH should consider hiring some BOY SCOUTS to help prevent problems rather than putting the flames out when it catches fire.
All in all, If I had been at DH when all the catastrophe happened I would be absolutely beside myself. I’m sure you guys have been sitting around after a hard day at work and the lights go out all at once and someone said “OH YOU HAVE GOT TO BE FUCKING KIDDING ME!!” I worked for a company once that fired me after I told them that the worst can and would happen. They fired me right after I set up the generator plans and the day was set for it to be installed. The funny thing was that the generator was partially installed and all the parts were not available the day they lost power for a week.
They called me and being the nice guy I suggested they send the guy who replaced me to the Home Depot to buy every generator they had. Idiot forgot the extension cords. Anyways, good luck, and godspeed and please heed some advice. IF IT IS GOING TO HAPPEN TO ANYONE IT WILL HAPPEN TO ME. Say that everyday when you wake up and you will never get surprised.
August 1st, 2006 at 5:42 pm
Great job being so honest guys. I’m really happy that I switched to your service after a horrible experience with C I Host… who still owes me $40.
August 1st, 2006 at 5:43 pm
You guys are honest, transparent, and rock our collective hosting world. I have been in your shoes (in NY, not LA) so I understand exactly what you mean — I found myself nodding through most of your post and I even had a flashback or two at various points. Like most of the customers here, I will use this as only further reason to sing your praises loud and long and to recruit more customers for you.
I’m glad you’re putting in your own UPSes. I second the earlier recommendation for a single vendor for your network fabric (I know, the last thing you want is a backseat sysadmin, but think of this as a friendly comment, not a whine), and on top of that, a consistent OS release on those network devices. (Yes, that’s the scar of an IOS version burn showing through.)
I’ll admit if I was spending $1000/mo on hosting with you I’d be expecting some kind of rebate… but then, it’d be in the SLA I’d signed with you. So if you felt like giving us all a free month I wouldn’t argue, but it sounds like you may have six figures of unexpected hardware expenses about to hit, so I understand if that precludes giving each small customer some kind of token of apology. Hopefully others will understand, too.
Thank you, Josh. Tell the rest of the team thanks, too. Best of luck in the coming months.
August 1st, 2006 at 5:45 pm
I really appreciate you guys! The downtime has not been much of an impact, but what HAS been an impact has been being able to go to “dreamhoststatus.com” and find out IMMEDIATELY what the problem was. My last hosting company would basically lie thru their teeth, and I appreciate that you don’t. The pictures made me die laughing, by the way… :)
August 1st, 2006 at 5:47 pm
I switched my company’s hosting over to DreamHost in March after our previous host had continued downtimes with no explanation or support. While I’ve experienced problems on DreamHost, they’ve always been taken care of quickly, in a friendly way, and with an explanation.
Power outages happen. You don’t own the building, so you can’t ensure the generators and UPSes function correctly. Things happen. I understand.
Thanks for the explanation. Even without it, DH support and the features I am given keep me here. I’m proud to work with DreamHost, and wholeheartedly recommend them to anyone who needs a host.
/dan mattia
August 1st, 2006 at 5:51 pm
Thank you for your honesty.
This has hurt our website - as it has many others no doubt - just as we were finally starting to see some traction.
Your ‘tell it like it is’ approach is why I can tell my visitors we’re sticking with Dreamhost despite 3 weeks of torment.
Please get this fixed quickly. I want to be able to get enough traffic to be able to buy a dedicated server from you guys! :-)
August 1st, 2006 at 6:00 pm
You’d have to go down way more often to get me to go somewhere else. Don’t get me wrong, I have experienced plenty of outages with you, but you always get it fixed asap. Brown outs? As long as you’re looking into the backup equipment failures I’m happy.
August 1st, 2006 at 6:03 pm
Yes, I do appreciate the honesty, but really, being honest can’t cover everything and make all well. I don’t pay my bill to dreamhost every month for honesty, I pay for service. There is a point where you have to look at the history and decide if there are other companies with less down time history (whether or not it’s under their control).
Aaron
August 1st, 2006 at 6:08 pm
Rad post, Josh. And don’t worry about the power stuff. Your honesty and humor make up for it.
Oh, and regarding that .1% transfer joke in the newsletter… well, I feel like a stud because I used about 26% of my transfer. Some day, I hope to reach 50%. Ha.
August 1st, 2006 at 6:22 pm
Do your contract with your datacenter and your datacenter’s contract with the building specify compensation for power outages? That seems to me to be a good way to make sure the building people have the right incentives to keep your power going rather than making excuses.
August 1st, 2006 at 6:22 pm
While I am sure most of us appreciate the honesty and being kept in the loop, the fact remains that many of the sites being hosted here are suffering from the problems.
I switched several of the sites I maintain over to dreamhost in March and April. The problems have been sporadic. Some of my sites have been affected more than others, it just depends on use.
After this month (yesterday in fact) I have made arrangements to move at least two of my sites to another hosting service. The rest will likely follow, but aren’t as critical for up-time at the moment.
I really liked the setup at dreamhost. I really liked the demeanor of the staff, the newsletters, the status blog, the wiki, etc… but in the end I need hosting to actually be available and not have these kinds of problems.
I get phone calls when sites are down or the database doesn’t connect, or sites are slow. The best I can tell them is I will see what is wrong, but most likely we will have to wait for it to get sorted out. That makes me look bad, my clients look bad, and you look bad.
I can better support clients somewhere that has better up-time with less features so I can spend my time developing and not answering questions about why things don’t work.
August 1st, 2006 at 6:22 pm
Not a big deal. You might want to consider looking at datacenters a bit further away from California’s power grid, though, or this will probably happen again next year at about the same time :)
(The domain transfer I approved today had nothing to do with the downtime; I agreed to transfer it away several months ago)
August 1st, 2006 at 6:27 pm
you folks at DreamHost got on the Cluetrain in such a cool way. Thanks for the honest/open communication with us (and everyone else). It’s this type of thing that will keep me at DreamHost for a long, long time.
August 1st, 2006 at 6:28 pm
To the poster who wants to switch to 1and1… be my guest! More space for me. I started with 1and1 and it was *slow*. Their service was friendly and professional, but never helpful. IMAP servers were like swimming in molasses. DreamHost is for techies.
August 1st, 2006 at 6:37 pm
Thank you for the informative, honest post Josh! It is truly refreshing to have a hosting company that actually cares about its customers, and keeps them in the loop.
Posts like this are what set you all apart from other hosts. Many other hosts would have just pretended like the problem didn’t happen, and give vague boiler-plate responses. You take the time and energy to keep customers in the loop, which says a lot about the quality of support that Dreamhost provides.
I am constantly amazed at the knowledge and dedication of the support staff, and every one else who makes Dreamhost tick behind the scenes. Regardless of your shortcomings, which will inevitably be ironed out in due time, there are two things which will keep you guys as the leader of the pack (at least in my book): honesty, and excellent customer support. Just keep these two things paramount and Dreamhost will be the only webhost for me. :)
Thanks guys, and good luck!
Zachery
August 1st, 2006 at 6:40 pm
With the exception of hp, it seems the DH’s customers are real computer users that understand that it isn’t if, but rather when problems will occur. The difference between a good company and a great company is how they handle things when the sh@! hits the fan. DH has proven once again that they are an excellent company through their use of great communication tools to keep customers informed of the situation. The fact that they acknowledge some fault in the latest issue is almost unheard-of in the industry. DH is now the gold standard in my book.
Now if they could just get a clue and get the hell out of Hell (L.A.)! This summer is just the start of things to come with power issues. I hope DH will differentiate themselves in the marketplace and move to middle America where space and power are in abundance. I have been a customer for ~6 years and the only reason I would ever consider leaving is if DH fails to diversify or move operations out of California.
Keep up the good communication and take a trip East!
August 1st, 2006 at 6:41 pm
Thanks for the explanation.
August 1st, 2006 at 6:41 pm
Thank you for being honest. This is what we need from a company, not a whole “it will be back on Monday” and then, come next Monday, it still being down.
August 1st, 2006 at 6:54 pm
Like jz I manage a lot of customers via DH, and it has not been easy getting calls from them these past few weeks. More than the hosting, it’s the email problems (this last time was a Friday and a Monday - ouch!) that have been doing most of the damage.
It doesn’t change the past, but it makes a big difference to get the full story. Thanks for posting.
August 1st, 2006 at 6:55 pm
Thanks for the info. I appreciate knowing the details! I’ve been very happy with Dreamhost.
August 1st, 2006 at 6:57 pm
I am a systems administrator by trade and I understand how things like this can happen and compound on each other. Throughout it all I have never doubted that Dreamhost would get the issues resolved and that you would tell us the truth about what was going on.
You guys rock.
August 1st, 2006 at 7:00 pm
From very (very) far away I’m still here :)
Good luck (from Brazil),
Cassiano
August 1st, 2006 at 7:08 pm
Problems happen everywhere, with everyone. Unlike many/most others, DH is honest and upfront, and you don’t have to wait weeks for an explanation. Maybe I’m more forgiving than most because I just have a couple vanity websites with no sort of commerce coming for any of them, or because I’m a techie in the daylight hours also, but the good karma earned from the way you guys handle these episodes far outweighs the negative karma caused by the episodes themselves. Plus, when I think things are going crappy at work, I can remind myself of your last 3 weeks and realize they could be a hell of a lot worse!
August 1st, 2006 at 7:13 pm
Thanks for letting us know. I’m still a happy DreamHost customer. I really like a lot of things about DreamHost, and I’m sticking around. I enjoy your newsletters, too. :)
August 1st, 2006 at 7:16 pm
Sorry to be contrarian, but while this openness is a good thing, it’s not enough.
When someone tells me their techs are working day and night, it tells me that they didn’t have enough redundancy, pre-configured replacements, and a good emergency plan in place. Their techs are working day and night to compensate for a lack of preparation.
I just host personal sites on Dreamhost. “Mission critical” sites get a dedicated solution in a badass datacenter. But until Dreamhost’s problems, I didn’t realize how much I was relying on my personal e-mail addresses for business-related correspondence.
So, save for some simple filehosting that doesn’t matter much, I’ll be migrating all but a few of my sites off of Dreamhost this month and when my subscription comes up for renewal in 6 months, they’ll all have been migrated.
As in anything, you get what you pay for. If you want reliability, redundancy, and an SLA… it will cost. But when it affects your business, it’s worth it. IMO, all the “we’re really trying” and honesty is nice, but it’s not enough if any of your income depends on any of the services you’re hosting through DH.
August 1st, 2006 at 7:16 pm
Hi Dreamhost, thanks for the transparency. I agree with your business approach with regard to overselling, and the point about not being perfect. I still think that there is a market out there which prefers “Premier” shared hosting, something less than a Virtual Private Server, but yet better than hosting on a server shared by thousands without any redundancy. Everyone is following the trend, but there are some businesses which would rather have 99.5% uptime or higher with redundancy. At the very least, the sites would not “disappear” but would just be slow when the traffic is high - a signal to upgrade to VPS or a dedicated server. Being completely down, with fluctuations over 3 weeks or more, isn’t acceptable in this business. I would think that it is ok to host low-traffic, non-critical sites, but not business-critical sites. Ask anyone running a business: that downtime means lost money as well. I will still (probably) continue to host the much less critical sites here, but the more important ones will have to move out. Slow is ok, but totally down and not accessible isn’t. Apart from the events mentioned on the Status page, there are also other times when we cannot access our sites.
August 1st, 2006 at 7:18 pm
I appreciate your hard work, honesty, and sense of humor. Definitely will stick with you!
thanks
August 1st, 2006 at 7:20 pm
I agree with most people here that your honesty is much appreciated and we’re glad you’re doing the best you can regarding the situation.
However, we have noticed a lot of problems occurring since that September Blackout. We’ve been a dreamhoster for almost five years now and all our clients are hosted here, and sad to say all our major complaints regarding your service/uptime have been for the past year or so.
I think most of us would consider one month refund for compensation for this downtime more than acceptable, and this is something we can pass on to our clients who trust us to recommend the best possible hosting for their website.
August 1st, 2006 at 7:24 pm
I agree 100% with Jonathan Feldman. I currently host a site for Monsanto with you guys and when the outage happened you better believe I got some angry calls and emails. It was so nice to be able to point my customer to your site dreamhoststatus.com that listed the issue and what was being done to solve it. Made my job a lot easier. I too had considered leaving due to this issue and a previous hardware problem a few months ago, but this post has kept me as a customer (just don’t let it happen again *wink*).
August 1st, 2006 at 7:27 pm
IIRC, UPS manufacturers such as APC recommend not putting a UPS in series with another UPS. Also, my question is why are you in a data center like this? If you are truly committed to providing top quality service, why not colo in a meet-me facility and enter into a peering relationship with some major carriers? It just seems like you have a real business with a bullshit co-lo facility. Have you considered moving to http://www.laiix.net/?
August 1st, 2006 at 7:34 pm
The honesty is appreciated, it really is.
August 1st, 2006 at 7:38 pm
While I’m thankful for this note to the DH community, I must add I’m conflicted about the whole series of events. Having been a DH user since Feb. 1998, I haven’t really observed something this wide-spread, long and exhaustive. Granted, I’m not running any sort of business operations on my domains, but, I still like them up, and not slow or completely offline. The last few weeks have been really taxing… having webmail issues, ftp issues, anon ftp access by customers, etc… none of which I really ever had to worry about with DH. And, it sucks. We are paying for a service, and it wasn’t provided, in full, for several weeks. And, then, it’s hard to negate the service they did provide for so long… but, again, I was paying for it. If I didn’t pay for it, I wouldn’t have received the service… why should we pay for a service we didn’t fully receive? Hell, even Comcast and Time Warner Cable will credit you for downtimes you’ve had to deal with… should this be any different?
Yea, this is a little rant-ish, but, it’s more food for thought, posing questions, thinking out loud… I probably will still rave about DH, but, will mention this recent event.
August 1st, 2006 at 7:39 pm
start a campfire and everybody start making out. are you people serious? i’m not sure what sort of work you do, but not being able to send e-mails for about 4 hours caused me all sorts of hell. let’s break this down. 1. i pay for a service. 2. i don’t get the service. 3. i’m told, in so many words, to fuck off and understand that this is a super complicated technical thing that my layman self wouldn’t understand, so to just be content with watching dreamhost shuffle their feet in the dirt and do the “Gosh darn it! i’m so stupid!” routine. 4. watch them get the Presidential Medal of Valor for being so fucking charming. 5. risk this happening again when big bad mother earth causes a server fry, then muddle through dreamhoststatus.com and cute e-mails apologizing to me with NO REIMBURSEMENT FOR BEING A LOYAL CUSTOMER. count me out, shills. one more fuck up like this and i’m out.
August 1st, 2006 at 7:40 pm
Sorry, but we agree with ‘hp’.
While it’s great to see “Honesty”, that doesn’t make up for the thousands of dollars my small business has lost due to DH’s incompetence. This is not an isolated incident. It’s become a pattern. 4 major, multi-day outages in 2 years, with constant small outages (email, web, DB) at any given time during any given month. Completely ridiculous.
Where is the offer of compensation for services not rendered? I’ve yet to see that in the “happy-happy-blog”.
If you’re a teenager with a personal website … by all means, DH is a great place, with reasonable rates. The fact that their service is completely unreliable is probably not an issue.
If you’re a professional, or a business, and expect professional hosting. This isn’t it.
We’ll be moving our business to a professional hosting company ASAP. It is not something we look forward to.
Three years ago, DH was great. It’s not anymore. It’s a sub-par hosting company with apparently incompetent network and systems administrators, with no disaster plans or redundancy. You can get shared hosting from any number of companies at similar rates with redundant data centers, never mind reliable dedicated hosts which DH no longer can even provide (Not that it would help when their whole infrastructure is a disaster).
Our cable ISP (Comcast) we use at our business is more reliable, and that’s saying something.
BR
August 1st, 2006 at 7:56 pm
Thanks for the explanation. I probably sent you guys nasty e-mails due to the downtime… I apologize for that. Kudos to you guys for handling these crises so well!
August 1st, 2006 at 8:01 pm
I launched a new website for a new client on the 2nd July. Can you imagine the crap I have gone through for the past month trying to explain why they can’t send email or why their website is down constantly after launch? They don’t care who the host is or what temperature it is in California, their email and website are down and it’s MY fault, TOTALLY UNACCEPTABLE.
August 1st, 2006 at 8:06 pm
I’ve been a customer for about 10 months. I’ve been involved in hosting my web sites for a VERY long time. Outages and problems happen. I’m staying. The Customer Service and the hosting costs are great.
No company is perfect and there is not a hosting company out there that can guarantee 100% uptime all the time.
Thanks Dreamhost for doing your best and thanks for letting us know what the issues are and that you are working to resolve them.
August 1st, 2006 at 8:11 pm
It sounds like there are those that are fine with the apology and openness that you have in your explanation, and those that are not. Honesty is great, and it seems to weigh heavy in certain situations, but there is a service being paid for here that was not delivered.
I am not ready to jump ship, yet.
Reasons, explanations or excuses, I’ll be waiting to see what happens in the next month or two.
August 1st, 2006 at 8:12 pm
Thank you for the explanation, and for all your hard work to fix it. It’s very appreciated :) I have been here over a year, and just renewed. I am glad to see I didn’t make a bad decision.
August 1st, 2006 at 8:14 pm
While I appreciate the explanation, I don’t think it clears you guys from responsibility for this major outage. With so many issues compounding one another, I would hope that your staff would be able to handle at least some of the issues quickly. The fact that you had to bring in a Cisco consultant says something. And routes being configured by hand, and presumably not being saved, thus having problems after a power outage? That is the most pathetic thing I have ever heard of. I hope that this serves as an indicator for you and leads to some big changes. I am a relatively new customer and am definitely considering other options now. Some big changes, explained in detail and some sort of credit might change my mind, but this is incredibly lame.
August 1st, 2006 at 8:24 pm
Wow… After reading everything that happened, I’m quite impressed. I didn’t even notice there was something wrong during July. I signed up for dreamhost in the middle of the month and started building a webpage + forum. There were down times but those lasted less than 1 minute usually and at the most 10 minutes (that I could tell, but I wasn’t online 24/7)
Maybe it was because the place I was with before sucked a LOT. Because even with your power outages, you had better uptime. And since I don’t run a business and no one in my forum will die if they have to wait a couple minutes to access a page about their favorite celebrity, it’s all gravy =D
Thanks for the heads up and explanation =D Hope all goes well from now!
-Fuji
August 1st, 2006 at 8:24 pm
The honesty is appreciated, and hopefully similar problems can be avoided in the future. Overall I have been pretty satisfied with your response time and communication, and hopefully things will get better.
August 1st, 2006 at 8:25 pm
[...] For the last few weeks, you have no doubt noticed the site problems - mostly really slow loading or, for a while there, no loading at all. I believe the issues are all fixed now, but at any rate if you really want to know what happened (warning: lots of technical speak) with my webhost (and their heroic efforts to get the problems all fixed) read this. [...]
August 1st, 2006 at 8:25 pm
Obviously I picked the best time to join DreamHost, becoming a new member on 7/11/06. I guess I could live with the severe outages and problems I experienced the last three weeks. The little blog this here is okay and helps. But what really steams me about DreamHost are two things:
1) Can’t call a phone number. I’m sorry but this is a *HUGE* issue with me. When a system-wide problem like this occurs, it’s a no-brainer to put a broadcast message up on the menu system. Can’t use the panel when it’s down, can I? I had thought to myself — “oh, there won’t be a need to call in, I only had to call my previous hosting provider a couple times in 2 years” — but not so with DreamHost. Not only that but my responses to support requests placed through tickets is s-l-o-w!!!! Oh yeah, and the first time I requested an urgent callback some yoyo told me I wasn’t entitled to any, despite my plan clearly allowing for three per month! (And if you ask me, ANY new customer should be able to talk with someone on the phone their first month of service! Setup can be a bear!) Overall, I am sorry to say I have to give customer service a failing grade.
2) Your policy of no SpamAssassin when having a catch-all email address. Sorry, this is unacceptable. Your web pages say “catch alls ‘generate’ too much spam” but this is not true… they just allow all the spam to be delivered. Unless you have SpamAssassin running. My prior hosting provider has no issue with this… why does DreamHost? Someone promised to have someone set up my own copy of SpamAssassin, but they never came through. Following the instructions on my own only resulted in me hosing my own email for hours until I figured out how to fix it. Sorry, but I am not a penguin-guru.
[Why is this so important? Anytime I register with a site, I provide an address such as "dreamhost@mydomain.com" or "microsoft@mydomain.com" -- This is a very efficient way to track where mail is coming from, to "kill" an address that is being used for spam, and to sort my incoming emails into folders. I have over 200 existing mail rules and that is not even covering the ones I haven't bothered to sort.]
There is a third factor, actually, and that is I really miss cPanel. There are a couple things that the proprietary DH panel offers that are better, but these could be integrated with cPanel. Sorry but the DH panel doesn’t cut it for me. It’s slow (changes sometimes take hours to propagate) and visually/user-interface-wise it’s a total mess.
For this reason I am trying to switch hosting providers. But guess what? My domain is locked as “TRANSFERPROHIBITED” and it’s been murder trying to unlock it to complete the transfer so I can close my DH account. Three emails to support so far. And waiting, waiting, waiting for responses. (I did get one that said it was unlocked, but surprise, surprise, it’s still not unlocked).
Get your act together, DH!
August 1st, 2006 at 8:25 pm
Hey, keep it up DH, the sun will come out tomorrow. As for the guy that mentioned “quit it with the friends and family tone” a big stfu. Way too many companies take themselves so seriously that I’m starting to think that insertion of a barbed dildo into the rectum has become a hiring requirement. Everyone complaining, take a real deep breath. It’s August, all your customers are at the beach, so quit getting red faced and have a Choco-Taco. P.S. DH continually rocks my socks.
August 1st, 2006 at 8:31 pm
There’s lots of good, clean hydro power down here on the South Island of New Zealand… cool too… not much bandwidth though.
Thanks for the explanation and update.
I suggest you put a graph showing the system throughput over time on your status page. Our apps still seem to be grinding to a halt now and then, and it would be good to be able to check whether this was at your end.
August 1st, 2006 at 8:46 pm
How about throwing us a bone and offering a limited ‘777′ deal if we renew for a year?
I of course am appreciative that you took the time to write an explanation of what happened, but I need more than that to keep my service here. If this is even going to be a once a year occurrence, that’s too often. I would expect nothing less than for you to provide some sort of SLA to those who require it. Sure, the mom and pops are going to love your response, but those who are trying to impress clients and build a business are really left holding the bag on this one.
I can’t imagine being in your shoes from a marketing perspective. This fubar is off the charts. I look at the status page and just shake my head and wonder what the hell am I doing hosting my sites with DH. Never thought I’d say that, but this was just too much.
August 1st, 2006 at 8:47 pm
It’s difficult to read the lack of forgiveness in some of these posts. Mistakes happen. Everyone makes them. How can people live with that kind of lack of tolerance? What kind of life is that?
DH gives top notch service at amazingly discounted rates. I can’t find other hosts that would allow me the freedom to manage my server resources like I get here. That kind of flexibility is just unheard of. Any other company would have BSed the causes and wrapped the explanations in red tape. I see that kind of stuff every day and it’s maddening.
DH, as long as you stay this transparent, I’m not going anywhere. I really appreciate being able to tell my clients exactly what is up. I can’t get this anywhere else…
We’re all in this together and I feel like I get something special from DH. If a few power outages come with that, then so be it.
10000 thanks for all you do everyday!
August 1st, 2006 at 8:55 pm
Your post was not very comforting. I think your company has a lot of things ass-backwards:
1. anytime it takes that long to explain something: the person doing the explaining is the one at fault;
2. your support people kept telling me that things were fixed when in fact they were not;
3. I got tired of being asked to be more specific, when your own explanations were not any clearer;
4. even when I was told to directly e-mail someone at support, the e-mail bounced back or went directly to general mail: I can get that service from a bank;
5. if you really feel guilty about how you have messed up, then I suggest you start giving us credit on our accounts. Otherwise, your “confession” is worthless to me. I’ve lost so much business over this, that I am leaving!!!
August 1st, 2006 at 8:57 pm
You still tie a hair lip on GoDaddy.com hosting!
August 1st, 2006 at 8:58 pm
Most of the sites I host here are personal, and as big and heavily-trafficked as some may be, they’re still hobby sites, so when they go down, I’m unhappy, but I’ll live.
Clients are another kettle of fish. I’ve had clients hosted here since 2000, and I can firmly say that I’ve had no problems with DH until about a year ago. That being said, I’ve had many problems since then. Usually friendly customer service has given way to curt replies, replies two weeks after the fact, or copy/paste brushoffs. While I’d love to actually talk at length to a tech about the specific requirements of one of my sites, I never get that far and other hosts are starting to look mighty friendly, given all the downtime.
I used to work for an ISP, I know crap happens, it happened all the time to us, nobody has 100% uptime, etc, but that doesn’t change the fact that the service at DH, so excellent in the past, has gone steadily downwards in recent months. The plans are nice, but the repeated hiccups in service aren’t doing us any favors.
I’ve loved my time at Dreamhost, up until the past 12 months or so. I’ve recommended you highly to everyone, and now I’m starting to regret having hosted so many clients here.
Since we lost three weeks of stable up-time, ANY company or utility wouldn’t bill for that time. A free month won’t fix some of the underlying issues I’m experiencing, but it might stop me from marching over to another host tonight and moving some of my more key domains elsewhere.
Your post was honest and I appreciate the explanation, but I’d also appreciate you owning up further by compensating your customers for lost uptime. Repairs or no repairs, please reinforce your wonderful attitude toward your customers and give where it’s due?
August 1st, 2006 at 8:59 pm
“It’s difficult to read the lack of forgiveness in some of these posts. Mistakes happen. Everyone makes them. How can people live with that kind of lack of tolerance? What kind of life is that?”
No offense, but you obviously don’t expect much from a company that you are paying to provide a service.
If this was a one-time event over the history of DH, your point would be valid. It’s not. It’s a recurring theme.
DH has constant multi-hour outages of basic services like email (I think we have something like 20 tickets from the last 6 months about email alone - either the IMAP servers were hosed, or the SMTP servers were pushed bad configurations, never mind all the times they get black-listed), to major outages of web servers that last all day.
Not *once* have they offered to compensate us for their lack of service. Over 3 years. It wears thin, and in our case, we gave them lots of rope to hang themselves with.
Again, like ‘hp’, I really have no idea what planet you folks are from. When your sites go down on a regular basis, and you can’t send email for hours at a time … you think this is “normal”? And that “Mistakes happen”? It’s incompetence. I don’t know if they lost all the folks that actually knew what they were doing or what, but they have a serious problem.
A high school kid with an SDSL line and a linux box could provide more reliable service.
- BR
August 1st, 2006 at 9:04 pm
[...] Things have improved over the last week, but I was left wondering if my experience was typical of how things are normally with Dreamhost. However yesterday this post showed up in my feeds, and in an email newsletter from Dreamhost. [...]
August 1st, 2006 at 9:04 pm
If the cable is out for 4 days, I get a 3 day credit on my bill.
If the phone is out for 4 days, I get a 3 day credit on my bill.
Hmmm.
August 1st, 2006 at 9:08 pm
This mess has really made me question Dreamhost, and I’ve been here for a very long time. I’d always interpreted the regular mail screw-ups, the bizarre little problems, as the result of rapid growth and maybe not making the conversion from ‘happy fun hobby business’ to ‘real business’ as smoothly as possible. But I assumed that the big picture stuff, like power, and routing tables, and all that was being taken care of, and my little problems with weird quirks of the email control panel were just growing pains, or the founders/bosses playing games with things they didn’t understand anymore (which one time an overly honest, and probably now unemployed, tech told me was the actual reason for an email outage).
Now I’m convinced that 80% of the staff spends all day just jacking around with minor crap like the control panel, and 10% of their time praying nothing goes wrong in the big bad data center they don’t really understand, and 10% of the time looking for pretty pictures of disasters on the web instead of using those extra minutes to look up redundancy in the dictionary.
August 1st, 2006 at 9:16 pm
you guys better get your shit together!
everyone i know who uses dreamhost is very close to dumping your “service” and moving on
August 1st, 2006 at 9:18 pm
For those of you losing thousands of dollars over 24 hours, may I remind you that we pay something like 8 dollar per month. I’ve already shared with my clients where they’re hosted and what other hosting options are. None of us are really sweating this one. Sure it’s a bummer to be down but if I was gonna lose thousands per day I’d switch to redundant co-lo.
August 1st, 2006 at 9:18 pm
Outages can be a real disaster if you’re in the middle of a book launch, and there was one a couple of months ago which really hit us in the gut. However, I agree that transparency is the best policy, and DreamHost gets points for that. I’m really glad you linked all the issues together, because I think DreamHost would have looked like chaos if you kept pushing “isolated events”.
Plus the pictures made me laugh! Thanks for those. And double thanks for not taking out the gross food pictures again. ;-)
August 1st, 2006 at 9:20 pm
what happened to the DREAM in dreamhost?
it’s been a nightmare hosting with you guys!
i’m not going to renew my plan
August 1st, 2006 at 9:23 pm
I can’t say that I’m happy with the downtime, but I do appreciate the transparency. I use quite a few hosting companies and I can honestly say that none of them would have provided this sort of detailed account of what happened over the past month.
One other point I wanted to add - while I generally don’t believe in astrology, I’ve learned to lay low when Mercury is retrograde - which it was from 7/4 - 7/28. July 2006 at Dreamhost was classic Mercury retrograde.
August 1st, 2006 at 9:26 pm
[...] DreamHost, the hosting provider for JaredWSmith.com and about 95% of the other Web things I operate, has had a rough July thanks to the brownouts and such in Los Angeles. Their extremely detailed blog entry shows that they are sincerely working hard to try to mitigate any future disasters, and gives some insight into just how tough it is to run a massive hosting operation such as Dreamhost. It’s a great read, highly recommended. [...]
August 1st, 2006 at 9:39 pm
I am hosting with dreamhost despite living in Kenya, East Africa. DH experiences sound just like what happens to us in East Africa all the time, so I sympathise.
Good luck, keep at it, and thanks for the support.
August 1st, 2006 at 9:48 pm
# Daniel James Says:
I’ve been with DH for ~6 years now and have no intentions of leaving, but I would expect in this context to see something of a bone thrown to your customers. We recently experienced some downtime on Puzzle Pirates and ended up crediting subscribers with four days of free play.
Oddly, as I was reading this, I was thinking exactly the same thing - down to ‘With the recent issues on Puzzle Pirates, they…’
In the past, dreamhost has been excellent about remuneration for issues. I don’t feel ‘entitled’ to anything in particular, but I’ve spent a lot of time recently having to post to my blog (hosted elsewhere), ‘ok, guys, I can’t send mail again, so Rachel, if you’re reading this….’ and I admit starting to wonder if it was time to leave DH and head for some young, hungry company… like the one I signed up with several years ago.
-Stephanie.
August 1st, 2006 at 9:51 pm
While I don’t like downtime, I prefer an honest “I screwed up!” any day. Also, it is little memos like this that give DreamHost character–that is why I stick around.
August 1st, 2006 at 9:55 pm
Dreamhost - Thank you so much for the detailed explanation. I have over 60 of my clients hosted with you guys so I have been feeling the heat (no pun intended) from them as well.
Overall, Dreamhost has been an incredible business partner for me and I continue to partner exclusively with them.
August 1st, 2006 at 9:56 pm
It is very interesting to see all of these comments. The range is quite astounding. I am a recent new customer, so I haven’t had experience with previous outages, and therefore I can’t comment on the reliability.
My opinion is that the nice friendly explanation is nice, refreshing even. The lack of phone support does bite, but then that is why your clients hire you because you are supposed to have a clue, you are support. The prices for dreamhost are good, great even, what is the uptime average over the past 5 years?
A rough spot? possibly. A tech who is learning? possibly. Terminate the wrong guy? possibly. Downtime is certainly part of the datacenter’s contract, and I am sure that with patience DH will make the issues right, and we will all get properly compensated for the event.
The hardware replacement really bites. I wonder if the new switch wasn’t listening to STP priorities, or if they just weren’t set, or if it just started having its own party. That type of network problem is pernicious, difficult to track, and generally a pain in the you know what (especially in multi-vendor environments, there are some things that syslog and snmp just can’t tell you).
All in all I am pretty happy.
And on a side note: my cable modem went down more often this past month, and the cable company knows they’re the only thing going in the neighborhood… so no credits here. Regardless of what it would do for customer loyalty and that general “hey, you got any vaseline?” feeling I get when I think about my utility providers.
If your clients are making such big money off their site, they should put in their own pipe, run their own servers, and see how much fun it can be. Can’t get email? well, if you ran another MX at your location (and you probably should…), you wouldn’t have to worry about that. Drop a lower priority on it and you only get hit when it goes down.
You criticise the disaster plan? where is yours? what are you gonna do when the unlikely happens to you? oh, wait. it just did. Your plan was to sit back and whine that your clients were chewing you out. Great plan. I hope you pay your clients for that.
My mail servers (and my client’s) fall back to me, it is slower, but it’s still up. business gets done.
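To illustrate the backup-MX idea: sending servers try MX hosts in order of preference value, lowest number first, so a fallback server is given a higher number and only receives mail when the primary cannot be reached. Here is a rough Python sketch of that ordering. The hostnames, the preference values, and the is_up check are all made up for illustration, and the sketch does no real DNS lookup or SMTP delivery.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical MX records as (preference, hostname).
# A lower preference number is tried first, so the backup gets the higher number.
MX_RECORDS: List[Tuple[int, str]] = [
    (20, "backup-mx.example.org"),  # assumed fallback server at your own location
    (10, "mail.example.com"),       # assumed primary mail server at the host
]


def pick_mail_server(
    records: List[Tuple[int, str]],
    is_up: Callable[[str], bool],
) -> Optional[str]:
    """Return the reachable host with the lowest preference value, or None."""
    for _preference, host in sorted(records):
        if is_up(host):
            return host
    return None


# Normal day: everything is reachable, so the primary takes the mail.
print(pick_mail_server(MX_RECORDS, lambda host: True))

# Outage at the host: only the backup answers, so mail falls back to it.
print(pick_mail_server(MX_RECORDS, lambda host: host == "backup-mx.example.org"))
```

The backup is slower and only buys you queued or delayed delivery, but as the comment says, it is still up.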
DH, I will stay. It’s not five nines, but I am betting that it is still in the mid 90’s.
Good luck. Hang tough. The admin is never appreciated.
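For a sense of scale on "five nines" versus "the mid 90's", here is a small Python calculation of how much downtime per year each availability figure allows. It is only a back-of-the-envelope sketch: it assumes a 365-day year, and the percentages are sample values, not measured DreamHost figures.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def downtime_hours_per_year(availability_percent: float) -> float:
    """Hours per year a service may be down at the given availability."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


# Sample availability levels, from "five nines" down to 95%.
for pct in (99.999, 99.9, 99.0, 95.0):
    hours = downtime_hours_per_year(pct)
    print(f"{pct}% uptime allows about {hours:.2f} hours of downtime per year")
```

Five nines works out to a little over five minutes a year, while 95% allows roughly 438 hours, or about 18 days.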
August 1st, 2006 at 9:57 pm
As a systems admin, I understand the problems that the not quite so happy Dreamhost crew has been going through.
I also remember the OMG FUN times I had when I hosted all my sites from a box at my house. Script kiddies from Brazil, getting blacklisted because some spammer fell in love with my domain names, etc.
So, to the you-will-be-happy-again Dreamhost crew: thank you for being honest and letting us know what’s up. I truly appreciate your honesty.
And to those who posted negatively as “Anonymous”… move on if you must, but know you’ll most likely not be missed.
August 1st, 2006 at 10:01 pm
Thank you thank you, thank you!
I have been searching for another hosting company for the last couple of weeks. I love dreamhost. You guys have been great for a long time and here recently it’s been crap. I get hard times but I just wanted to know what was going on. Now that I know then I have hope things will get better. I’ll stick it out a few more months and hopefully it will get better.
Thanks again!
August 1st, 2006 at 10:08 pm
[...] As noted over in Half Baked, Dreamhost is starting to talk about the horrible things that have happened to it in the past month. That makes the issues a little less unsettling, just because someone, at least, is admitting to some sort of incompetence mixed with bad luck. Yay them. [...]
August 1st, 2006 at 10:13 pm
First of all, thanks, DH, for the openness and honesty. I’ve been bullshitted constantly by numerous tech companies (most notably, my local ISP, Alltel, who once tried to convince me that the entire Internet slows to 250kbps every evening at 10pm); it’s refreshing to hear someone say “This is what happened. It was [at least partially] our fault.”
I understand completely why businesses who rely on their web presence and e-mail would be angry about last month’s events. I also understand the demands for reimbursement; if you lose money due to someone else’s mistake, that’s a reasonable request. It’s also easy to understand why some DH customers would feel justified in taking their business elsewhere.
With that said, however, some of you guys need to cut the attitude. Shit happens. The world’s an imperfect place. Everyone on this board has screwed up at one point or another. The major difference is that *our* screwups aren’t visible by 300,000+ customers and even more web users. So cut them some slack; when you’re perfect, you can start giving them crap. Until then, get over yourselves.
Also, the fact that DH called a Cisco consultant *does* say something: it says that they’re smart enough to know when they need help. Tell me, folks… Who would you rather have working on the Cisco router hooked up to the server your site is located on: A DH engineer, or an engineer who works for and was trained by the company that ACTUALLY MADE THE ROUTER? I don’t expect DH to know everything about Cisco routers. There are Cisco engineers who are there for that.
Just my 2 cents. OK, maybe more like $2. Either way, I’m sticking with you guys. After all, things can only go up from here…
August 1st, 2006 at 10:33 pm
While I have to agree with the vast majority that Dreamhost’s honesty and integrity are not in question, the performance of the network is definitely something to be concerned about.
I wouldn’t consider moving at this point, but until we get several consecutive months of solid (and I don’t mean perfect) performance out of the network, I can’t in all good conscience recommend the service to anyone else.
August 1st, 2006 at 10:44 pm
[...] Dreamhost Blog: Anatomy of a(n ongoing) disaster [...]
August 1st, 2006 at 10:47 pm
It’s a semi-decent explanation of things but I’m afraid it just doesn’t cut it with me. You’ve explained a few of the problems and it’s interesting to see your note end on a kinda ‘yeah well we’ve fixed a lot of problems but there’s still a few gremlins in the system that we’re trying to fix. we promise it won’t happen again… bla bla… please stay’… yawn.
You’re trying to put a lot of the blame on the infrastructure and the building. Let’s see. Media Temple are hosted there. I know a lot of their customers and they have had a near-faultless system for years, especially over the last few months. MySpace was only offline for an afternoon and service has resumed with no snags. Why on earth is it hitting Dreamhost so hard? Yes you’ve explained it yourselves. It’s part building part your own incompetence but a line really has to be drawn.
I’ve been with you guys for over 7 years and the last 3-4 months have been a JOKE! E-mail up and down like a yoyo, just like my sites. Plus (and here’s the biggie). I’ve spent the last 2 years telling everyone how great you are and have recommended clients and friends to move. Lately I’ve looked like a fool with phonecall after phonecall asking what’s going on with my website… what’s going on with my mail!
Enough!
I’m off elsewhere. To Texas actually, where the power grid seems to be a lot more stable. I’m not going to mention what host I’m off to as I don’t want this rant to turn into an advert. It’s been a great six and a half years but the last few months have been ridiculous and people who buy into this ‘ohh we’re sorry’ explanation need a slap, especially those of us who have recommended you in the past and are looking foolish at this time. NOT a nice position to be in I’ll tell you!
August 1st, 2006 at 10:50 pm
Thanks for the communication. The reason we left our last service provider is because they had a rash of bad problems but their communication skills were ABYSMAL. We had no idea what was happening and any attempts to get a status were met with increasing hostility as the days wore on and the stress levels rose. So thanks for this and please keep us posted. That’s what separates you guys from everybody else.
August 1st, 2006 at 11:00 pm
I am in agreement with most of what has been posted here… the blog and friendly customer service is outstanding… so are your services and the level of control you offer your customers. That said, the level of uptime is not acceptable.
I have no intentions of leaving DreamHost, but I will definitely be evaluating the hosting over the next six months… non-mission critical sites will stay regardless but I may have to move e-commerce sites elsewhere if I have to go through another episode of this magnitude. I would like to recommend that you dedicate some of your efforts towards the infrastructure side of your operations (UPS, Generators…etc.) and perhaps focus less on Goodies and one-click installs and a little more on server backups and redundant architecture.
I used to host mostly with Interland (now Web.com). They are extremely overpriced, customers have hardly any control over their sites and their tech support is in India… however, I don’t recall ever having a crash-and-burn episode like this one. Still, I am very happy with DreamHost and hope you can get your act together in a reasonable time.
August 1st, 2006 at 11:22 pm
Where did the wild dogs come in?
August 1st, 2006 at 11:31 pm
Dreamhost, thank you for being honest, which is rather refreshing. It does remind me of a quote from Yes Prime Minister, talking about being an executive in the City of London:
“The basic rule of the City was that if you are incompetent you have to be honest, and if you are crooked you have to be clever. The reasoning is that, if you are honest, the chaps will rally round and help you if you make a pig’s breakfast out of your business dealings. Conversely, if you are crooked, no one will ask questions so long as you are making substantial profits. The ideal City firm was both honest and clever, although these were in short supply.” (Yes Prime Minister II, p. 109)
You seem to have made your mind up about the data centre - I would have thought instead of buying your own UPS it would be cheaper to move elsewhere (I’m sure we could live with the downtime!) but there’s probably something you know that we don’t :)
August 1st, 2006 at 11:31 pm
I joined Dreamhost at the start of July.
Hmmm - I’ve had a really bad string of luck in my life recently - maybe it’s *me*?!?!? ;)
I will be staying because no host is perfect but you guys are at least human and are trying your best.
And you’re funny. :)
August 1st, 2006 at 11:44 pm
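[...] If a company causes its customers a good deal of trouble and still earns a flood of “flattering” praise afterwards, wouldn’t you like to know how it managed that? The company in question is Dreamhost, the provider that hosts this site. July was a disastrous month for Dreamhost: a power outage at the data center, UPS units that didn’t work, misbehaving file servers, a broken router… one incident after another left Dreamhost struggling to keep up. But while all of this was happening, Dreamhost kept users informed of progress through its website, and once things had calmed down it posted a detailed blog entry that candidly laid out the whole sequence of events. [...]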
August 1st, 2006 at 11:47 pm
Thanks for the awesome post!
August 1st, 2006 at 11:48 pm
The problem with moving elsewhere is that…
…unless I go for a dedicated server or a VPS, there’s nowhere that offers the same features. Even on a shared hosting plan twice the price.
August 1st, 2006 at 11:50 pm
It is looking better today. Hope you can fix all problems and I would suggest you invest a good amount of time to find a much more reliable data center. As a customer I would be happy to spend a bit more for better stability.
The company I am working for is switching colocation for the 3rd time in 4 years and I am very familiar with many of your problems.
August 1st, 2006 at 11:53 pm
Thanks for the update Dreamhost, your honesty is appreciated.
August 2nd, 2006 at 12:01 am
Guys, keep up the good work. Honesty is the best policy. And I still wuv u!!! Better luck in the future!
August 2nd, 2006 at 12:06 am
You know, you guys are the unluckiest guys in the world of hosting… I hear of a few people slagging you off and saying that I’m leaving - those people really get on my nerves… you are the only “human” hosting company I know of, having such an informative company letting us know about every single in and out is such a nice feeling.
I put my trust in you with all of my sites and I will continue to do so, if something goes wrong I know you will be doing your best to fix it straight away, my complete confidence is in you guys.
Thank you for the great service and many laughs you have given to me…
August 2nd, 2006 at 12:11 am
[...] In recent weeks Dreamhost has been having some problems. Actually that’s an understatement. They’ve been having big problems. Not as bad as the problems my last hosting company had which drove me to leave (at least Dreamhost always have backups to restore), but enough to make me question hosting commercial sites with them. [...]
August 2nd, 2006 at 12:18 am
(Okay this turned out to be rather long… but what the heck are you guys doing reading all the way down here at comment one hundred and something anyway?)
This is my second year with DreamHost - I just renewed. Right after I joined last time you guys had the LA power problem and I lost a couple of domains for a day or so. In spite of that I still went ahead and started migrating all my multitude of domains and sites I hosted on my home server - I was able to do that because you guys added unlimited domains just after I joined up, long before anyone else did. Let’s just say I was very happy. I was also able to stop worrying about a Slashdot effect taking out my home DSL line, or even running up an excess bandwidth bill because you guys effectively made that unlimited in my first year. You also added DNS access so I could stop trying to maintain separate DNS services elsewhere and save even more money.
This time around I have all my domains with you, twelve in one account alone (and another three in a separate account I manage for a non-profit). I’ve had a few comments from users of those domains, but they all have enough experience of computer systems to know that every IT department with a flawless uptime record is just one unexpected disaster or screwup from a reputation like s**t. In fact the longer an organization goes without some major problem to deal with the more complacent it gets. Everyone leaving DreamHost now will certainly rant and rave about how good their new place is right up until the time it fails them, and it will.
Right from the start I recognized that DreamHost services are homegrown, kind of folksy, and just plain different. But everyone, you know why you are here in the first place - it’s a deal, it’s a steal, it’s the sale of the ****king century - so what did you expect?
At the end of my second year I won’t have paid more than $20 for my two years of hosting of all my domains and I’ll still have money left over in my rewards. And I’ll bet many of those few (I count less than 10% - hardly “everyone”) unhappy folks rubbed their hands together with glee over how much money they were going to save, or make reselling DreamHost services.
Well guys, you get what you pay for (remember what your el-cheapo plans are called - CRAZY DOMAIN INSANE - get it?) and anything more is gravy, and to me DreamHost has delivered tons of gravy for almost nothing in the last year and they have removed from me the burden of self hosting and fixing all my OWN screwups at the most inconvenient times. What really did it was not even being able to go on vacation for worrying a disk would crash, network equipment fail or just something… Those things still go on but I’m more than happy to pay next to nothing every year for that to be SOMEONE ELSE’S problem no matter how long it takes.
Finally, the apology and complete honesty that screw-ups were made goes a long way with me. I just can’t stand a company or person that won’t own up to their own mistakes, and for every company that never issues such a thing but experiences even 0.001% downtime there’s a bunch of screw-ups and failures they never confess to, and because they never do you’ll never know if you’re just a gnat’s whisker away from the scale of problems DreamHost just had. Sure they may be of different nature, but they could have as big a consequence. And because DreamHost is very honest about its screwups all the information is right out there for buyers to make their decision to become customers - and yet they still do.
I don’t expect DreamHost confessionals to make my hosting problems go away - I never did, and never will - but I admire them for having the balls to put it out there and realize they will be judged by how much they measure up to it in the future. A person or company that never does this and just denies everything or pays off customers “out of court” so to speak to keep its reputation “clean” is just getting a free ride on our inherent greed and need to get “a good deal”.
My advice to DreamHost is - if you don’t want to keep being judged by the crappiness of your building’s power system, then do something about it. It’s good enough for me (clearly better than what I had before) - but obviously not others. I’m glad to hear you are working on that and I’d love to hear more honest reports about progress in that direction and maybe even a timeline. I don’t care too much myself but I think it would make a lot of others happier and show good will on your part to follow through after today’s “confessional”.
August 2nd, 2006 at 12:31 am
[...] I’ve had more bad luck than you can shake a stick at with web hosts, lately. Read about what’s been happening to DreamHost, which hosts this self-same blog. [...]
August 2nd, 2006