Ask DreamHost Customers
August 25, 2006 on 10:44 am | In Business, Hardware, Insider View by Josh Jones | 62 Comments
I’ve got a question.
And I thought, who better to ask, than everybody?
Here goes…
We’ve got pretty serious storage needs. Like, in the next year, we’re estimating needing about 250TB (big T, big B) of additional centralized, networked storage.
Besides needing a lot, we also need very high performance, redundancy, and thrift.

We want it ALL!
Our absolute requirements for our system are as follows:
* RELIABILITY .. we can never ever lose any data, ever.
* PERFORMANCE .. we need something that can serve approximately 3000 NFS ops per second per TB. (See spec.org). (It needs to do NFS.)
* PRICE .. that’s the whole reason we’re looking.
We’ve got a good system of RELIABILITY and PERFORMANCE already.. but the cost per usable GB is $10. The main problem is the 300GB Fibre Channel drives we use, which are $800 each. Is there anything out there that can do the same but with SATA drives that cost more like $100? Even if we needed twice or four times as many drives for the same performance and reliability, it seems possible!
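To put rough numbers on that, here is a back-of-the-envelope sketch using only the figures quoted above. The variable names are made up for illustration, and since no SATA drive capacity is given, the comparison at the end is per drive rather than per GB.

```python
# Figures quoted in the post; the names are only for illustration.
ADDITIONAL_TB = 250          # extra storage expected over the next year
NFS_OPS_PER_TB = 3000        # required NFS ops per second per TB
FC_DRIVE_PRICE = 800         # dollars per 300GB Fibre Channel drive
FC_DRIVE_GB = 300
SATA_DRIVE_PRICE = 100       # dollars, "more like $100"
COST_PER_USABLE_GB = 10      # dollars, current system

# Aggregate NFS load the whole 250TB would need to sustain.
total_ops = ADDITIONAL_TB * NFS_OPS_PER_TB

# Raw drive cost per GB, before RAID, spares, controllers and so on.
fc_raw_per_gb = FC_DRIVE_PRICE / FC_DRIVE_GB

# Ratio of the cost per usable GB to the raw drive cost per GB.
markup = COST_PER_USABLE_GB / fc_raw_per_gb

print(f"Aggregate NFS ops/sec: {total_ops:,}")
print(f"Raw FC drive cost: ${fc_raw_per_gb:.2f}/GB ({markup:.2f}x to reach usable cost)")

# Drive spend if each FC drive were replaced by 2 or 4 SATA drives.
for multiple in (2, 4):
    sata_spend = multiple * SATA_DRIVE_PRICE
    print(f"{multiple} SATA drives: ${sata_spend} vs ${FC_DRIVE_PRICE} for one FC drive")
```

Even at four SATA drives per Fibre Channel drive, the drive spend comes out at half, which is why it seems possible.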
There are also some REALLY WANT TO HAVE features, though they could possibly be passed up if the top three are satisfied.
* SNAPSHOTS .. automatic versioning backups of all files by the OS. We’ve got this now, in a hidden “.snapshot” directory in every folder.. check it out!
* USER QUOTAS .. really, with the amount of space we’re giving out these days, quotas are almost a moot point. They’d be nice to have though.
* HIGH DENSITY/LOW POWER .. it’s always a plus to fit more storage in the same amount of space with the same amount of power, but it’s not really that big a deal.
* RAID 6 SUPPORT .. it’s cool.
Here are some vendors/solutions we’re considering..
* NETAPP .. what we use now.
* BLUEARC
* ONSTOR
* PANASAS
* CORAID (Open Source, ATA over Ethernet.. intereesssssting!)
* LUSTRE (Open Source, Clustering storage)
Soo.. basically, if people could post their suggestions, experience, other solutions, etc.. in the comments, it would be much appreciated. Not to mention you will be doing your patriotic duty to improve your hosting forever!

And remember, we’re talking serious NFS ops.. and we’d be willing to buy 256TB at once if our tests showed this system can do what we want and THE PRICE IS RIGHT!
62 Responses to “Ask DreamHost Customers”
August 25th, 2006 at 10:57 am
That’s all well and good, but what if DreamHost does an update and it somehow wipes out complete domains/subdomains/folders and leaves its customer in limbo for six (going on seven) hours, without even a response like “we’re recovering your data”?
Check it out.
August 25th, 2006 at 11:04 am
Amazon S3. I sync my 12 GB of photos with it.
* $0.15 per GB-Month of storage used.
* $0.20 per GB of data transferred.
It’s not hosting so all the features you asked for would have to be run off another server and interact with the S3 storage.
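For scale, a quick sketch of what those two prices work out to per month. The function name is hypothetical, 1 TB is assumed to be 1000 GB, and the 250TB figure covers storage only because the post gives no transfer volume.

```python
# Amazon S3 prices as quoted in this comment.
STORAGE_PER_GB_MONTH = 0.15   # dollars
TRANSFER_PER_GB = 0.20        # dollars

def s3_monthly_cost(stored_gb, transferred_gb=0):
    """Monthly bill for a given amount stored and transferred."""
    return stored_gb * STORAGE_PER_GB_MONTH + transferred_gb * TRANSFER_PER_GB

photos = s3_monthly_cost(12)              # the 12 GB photo collection
dreamhost = s3_monthly_cost(250 * 1000)   # 250TB, storage only

print(f"12 GB of photos: ${photos:.2f}/month")
print(f"250TB stored:    ${dreamhost:,.2f}/month")
```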
August 25th, 2006 at 11:08 am
The Coraid ATA over Ethernet seems like the best bang for the buck solution out right now. The ability to add shelves and still use commodity hardware as a “NAS controller” makes it a very nice solution. I currently manage several LARGE NetApp appliances and they are nice but the cost per gig is off the scale and the associated support contracts are nearly as expensive as the devices themselves…
August 25th, 2006 at 11:54 am
SATA over Ethernet looks pretty sweet. I really want some SATAoE on my LVM at home, maybe some slim drive enclosures with Ethernet on the back. I have no idea how scalable it is, or how cheap as your storage needs increase. Good luck though! =D
August 25th, 2006 at 12:04 pm
One thing you should do is have telephone tech support.
Or at least allow paying customers the opportunity to call you, even if they only have a set number of calls allowed per month.
8 hours is too long to wait for a response to a critical problem- that DreamHost caused- know what I mean?
August 25th, 2006 at 12:10 pm
Loper, we all realize you’re having problems, but have some common courtesy and post your issues where they are appropriate.
In case you are unsure, that place is not on this post.
August 25th, 2006 at 12:16 pm
Have you thought about a Thumper (Sun x4500)?
That’s what they’re for. 24TB in 4U.
They’re about to be included in the 60 day ‘try and buy’ free trials too, so you can kick the tyres:
http://www.your-sun.com/x64/
(If you’re in a big hurry, have a word with your reseller – ours are offering them already).
August 25th, 2006 at 12:32 pm
What’s the performance of ATA over Ethernet? Does it provide enough throughput to meet DH’s requirements?
August 25th, 2006 at 12:50 pm
mmm… I don’t know about storage, but don’t forget about electricity.
August 25th, 2006 at 1:02 pm
“Loper, we all realize you’re having problems, but have some common courtesy and post your issues where they are appropriate.” – James Asher
Thanks, but while DreamHost did an update, it completely maimed my site. And Josh is here doing advertising.
DreamHost forces its customers into this conundrum when it fails to meet a reasonable response time; this is a result of putting the customers in a “don’t call us, we’ll call you” situation.
Granted, much of the time, the response is fine, but when it fails, DreamHost is failing its customers, some of whom, like me, pay a lot of money for the service.
The poster (Josh) asked for suggestions on improvement, and I’m making one. Where else am I expected to do so- when the email tech support is totally non-responsive?
That’s the problem. There is only one avenue to contact tech support, and it is showing (and has shown) a tendency to get bogged down, completely outside the control of the people affected by the problems.
August 25th, 2006 at 1:16 pm
MAID Technology in 28TB lumps at $3.75 per GB up to 480TB using up to 896 drives (250 or 500 GB SATA) Raid 5 each cabinet.
Power consumption 2,402- 6,368 watts.
200 – 240 VAC, 30 AMP, 3-phase.
Throughput 5.2 TB/Hour.
I’ll have a dozen :)
http://www.copansys.com/index.shtml
August 25th, 2006 at 1:18 pm
Is there a reason you aren’t considering building a GoogleFS or MogileFS setup? If you’re saying your current cost, fully implemented, is ~$10/GB, then the upper bounds for your 256TB setup should be in the ballpark of $2.5M. I’m guessing it has to be possible to build a 256TB total capacity MogileFS setup for less than $2.5M.
Of course, your “it must speak NFS” requirement throws a twist in here. How hard could it be to write a userspace NFS server that could sit in front of MogileFS storage? I might even prefer a userfs driver that “mounts” the MogileFS storage … then if you need to export that over NFS, fine.
Good luck, Josh & the DreamHost team.
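As a rough check on that ballpark, a minimal sketch of the ceiling, shown with both decimal and binary terabytes since the post doesn’t say which it means (the function name is just for illustration):

```python
def budget_ceiling(capacity_tb, dollars_per_gb, gb_per_tb=1000):
    """Total cost of capacity_tb at a given price per usable GB."""
    return capacity_tb * gb_per_tb * dollars_per_gb

decimal_ceiling = budget_ceiling(256, 10)                 # 1 TB = 1000 GB
binary_ceiling = budget_ceiling(256, 10, gb_per_tb=1024)  # 1 TB = 1024 GB

print(f"256TB at $10/GB: ${decimal_ceiling:,} to ${binary_ceiling:,}")
```

Either way it lands a little above $2.5M.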
August 25th, 2006 at 2:50 pm
Don’t forget power requirements on your list. I’d second Sun’s Thumper for that reason. Look into ZFS while you’re at it for some slick snapshot capabilities.
August 25th, 2006 at 3:54 pm
Go Panasas!
August 25th, 2006 at 4:34 pm
I don’t see the point in using SCSI for file storage, if you’re just gonna have an array of discs then SATA is fine. SATA drives are also larger than SCSI drives, use less power, run cooler and are quieter (though sound levels probably aren’t a real big issue in web hosting).
With even just an 8 drive RAID-6 set up with ‘cuda 750GB SATA-2 drives, that’s 4,500 GB right there. And the minimum throughput will be at least double what a gigabit network connection can provide.
However, it seems that with 300 – 400 GB drives you get more GB / $. So it’d have to be a trade-off between price and data density.
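The 4,500 GB figure follows from RAID-6 giving up two drives’ worth of space to parity. A small sketch of that, plus how many such groups 250TB would take. It assumes 1 TB = 1000 GB and ignores hot spares and filesystem overhead, and the function name is only illustrative.

```python
import math

def raid6_usable_gb(drives, drive_gb):
    """Usable capacity of one RAID-6 group: two drives' worth goes to parity."""
    if drives < 4:
        raise ValueError("RAID-6 needs at least 4 drives")
    return (drives - 2) * drive_gb

group_gb = raid6_usable_gb(8, 750)             # the 8 x 750GB example
groups_needed = math.ceil(250_000 / group_gb)  # groups to reach 250TB usable
drives_needed = groups_needed * 8

print(f"{group_gb} GB per group, {groups_needed} groups, {drives_needed} drives")
```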
August 25th, 2006 at 6:21 pm
There are two other enterprise storage vendors that you should be having a serious look at: Pillar and TerraScale. That said, my work place has similar online storage needs (0.5 petabyte and growing), and we loves us some NetApp…
August 25th, 2006 at 7:49 pm
Have you guys looked at the “low-cost fiber channel” solutions offered by EMC? It sits in between FC and SATA in terms of performance and price, and you can get 500GB drives (if I’m not mistaken).
August 25th, 2006 at 9:05 pm
You might want to look at http://www.emc.com/products/systems/symmetrix/DMX_series/pdf/C1304_Symmetrix_DMX3_SS_ldv.pdf
(Apparently the raw 1PB was about $4M at the start of the year, and it has most likely come down in price since. The only concerns are space, and the cost of repairs if something goes belly up.)
August 26th, 2006 at 3:33 am
I don’t know what you’re talking about with your “automatic versioning backups of all files by the OS…in a hidden .snapshot directory in every folder” — I don’t have a single .snapshot dir in my entire filesystem on Dreamhost.
August 26th, 2006 at 3:51 am
“The poster (Josh) asked for suggestions on improvement, and I’m making one”
Josh asked for networked storage, you whined about support response time :-/
“I don’t have a single .snapshot dir”
have you tried?
[shish] on [dorito] Sat Aug 26 04:50:34 ~/shishnet.org
>ls -a
. .. .htaccess catjump.avi favicon.ico index.php index2.html robots.txt
[shish] on [dorito] Sat Aug 26 04:50:38 ~/shishnet.org
>cd .snapshot
[shish] on [dorito] Sat Aug 26 04:50:47 ~/shishnet.org/.snapshot
>ls -a
. .. hourly.0 hourly.1 nightly.0 nightly.1 weekly.0 weekly.1
August 26th, 2006 at 6:05 am
The University I do IT work at was recently presented with the same challenges. We opted to go with a Compellent system (http://www.compellent.com/) because it gives us all the features being looked for. It replaces our old NetApps and gives us a lot more flexibility with provisioning and future growth.
August 26th, 2006 at 6:43 am
Hm, no. I assumed that “hidden” meant “starting with a dot, and thus hidden from the ls output unless you use the -a switch.” (I.e. the standard Unix definition of the word “hidden”.) I did an “ls -a” and .snapshot did not show up, nor did it show up in a “find . -iname ‘*snapshot*’”. So it’s actually not “hidden”, it’s invisible. I never would have guessed to cd into a directory that wasn’t actually there in the directory listing.
August 26th, 2006 at 6:49 am
Maybe not the right place to post it, but there are plenty of knowledgeable people here. Is anyone else still having problems logging into the Control Panel with Firefox? I haven’t been able to do it for weeks, and Internet Explorer makes me feel ill.
August 26th, 2006 at 7:27 am
“Maybe not the right place to post it,”
You’re right–it’s not. Don’t be like Loper.
http://discussion.dreamhost.com
People that can’t follow the simple directions on what to post about here are probably causing their own problems in the first place. Lack of reading comprehension & common sense isn’t exactly known for making things run smoothly.
There is a place for discussing things that don’t belong here: http://discussion.dreamhost.com
There, the forum URL is posted twice. Now let’s see how many imbeciles still post irrelevant trash here.
August 26th, 2006 at 8:31 am
to jon:
try clearing out your cache and delete your cookies and then try again. i had the same problem after a firefox upgrade.
August 26th, 2006 at 9:51 pm
Josh,
As you already know, BlueArc is #1 on SPECsfs for NFS Ops (per your URL link). Titan 2000 also solves the scalability problem, delivering file system capacities to 256 terabytes, or 512 terabytes on a single namespace.
BlueArc offers both high-speed Fibre Channel and high-density SATA drives in a single system, and a lot of our customers are in the Internet services space, storing customer e-mail and hosting Web sites. We would be happy to provide references and have them tell you how Titan has accelerated their business.
You can learn more here: http://www.bluearc.com/html/products/titan.shtml
Give us a call. 1 866 864-1040.
August 27th, 2006 at 1:06 am
As long as your servers don’t die and lose data what’s the difference to us?
August 27th, 2006 at 1:34 am
“As long as your servers don’t die and lose data what’s the difference to us?”
More reliable network storage is less likely to die or lose data; faster storage will make sites more responsive; cheaper storage allows for more space per customer and / or more money to be spent elsewhere (eg support, where even with the help of nagios, it still takes them an hour to notice a dead server; and several hours to notice a server with a load level over 200…)
Do the snapshots take up any space, or are they just links to existing data? If they’re taking space, is that accounted for in “$10 per usable GB”?
August 27th, 2006 at 2:31 am
Common Sense is a prick
August 27th, 2006 at 11:00 am
$10 per usable gigabyte means “free space where you, our beloved customers, can place data.” Snapshots do eat physical space on the disks: a deleted 100MB file is stored once in snapshots.
When Josh says RAID6, he means “two parity disks per raid group”, meaning we can lose two disks in a single raid group and not lose any data. I don’t know if that’s what NetApps do, as they call it RAID4 (single parity) and RAID-DP (dual parity).
You crazy people and your “data.” Why, back in my day, the web ran at 300 baud. Both ways. Uphill. In the snow. Barefoot.
August 27th, 2006 at 1:04 pm
google needs a lot of storage and seems to be handling it with cheap disks and gfs :-)
August 27th, 2006 at 5:05 pm
Why do people keep on mentioning GoogleFS? Last time I checked, it’s proprietary and not for sale. Or is it different now?
August 27th, 2006 at 6:13 pm
Man, I don’t know if it will fit your needs, but a bunch of these just look wonderful –> http://www.apple.com/xserve/raid/
August 28th, 2006 at 12:10 am
It’s already been mentioned once, but I figure it’s worth reiterating. When I next build a scalable storage solution, it’s almost certainly going to be with Sun’s Thumper, Solaris and, the most important ingredient, ZFS. That gives you all of your requirements, I think, plus ZFS just totally rocks.
August 28th, 2006 at 9:33 am
We use iSCSI boxes from Rackable Systems. These boxes are cheap arrays of SATA discs and are fast as they allow you to use SCSI over TCP. This allows you to have close to Fibre Channel speeds but also have fully redundant arrays, because the boxes are commodity hardware and do not cost a lot of money. There is iSCSI software from Wasabi Systems and a few open source iSCSI packages for Linux that have all sorts of features.
August 28th, 2006 at 11:20 am
Someone mentioned it already but take a look at Pillar Data. I can’t speak for its specifications for your environment but their customer service is _GREAT_.
We had some technical complications with an Oracle 10g RAC and they had three people on site for 4 days to assist us with the problem. Not to mention 3 very long days.
J
August 28th, 2006 at 1:23 pm
With your requirements, you need to take a meeting to learn about BlueArc. I know we would be the best solution, and at 250TB our price/TB will be very aggressive. Call me at 408-576-6689 and I will help you set up a 45 minute tech overview.
August 28th, 2006 at 1:47 pm
We’ve been using Coraids for a while… Great stuff… GREAT STUFF. :)
August 28th, 2006 at 3:43 pm
Have you checked out PolyServe or Isilon?
August 29th, 2006 at 12:09 am
You might want to take a read at this post which turned up in my RSS feeds today:
http://cuddletech.com/blog/pivot/entry.php?id=741
August 29th, 2006 at 3:30 am
Hey Guys, this just showed up via Digg:
“Capricorn releases a 120TB storage system”
http://www.theinquirer.net/default.aspx?article=33968
3TB/1U Node, includes an x86 controller – consumes about 80W (inc drives?)
Up to 4x 750GB drives per node.
19U Rack holds 120TB, consuming 3200W
Approx $200K for a rack.
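A quick sanity check on those figures, as a sketch only. It assumes 1 TB = 1000 GB, and note this is raw capacity, so it isn’t directly comparable to DreamHost’s $10 per usable GB.

```python
# Figures from the Capricorn announcement as quoted above.
RACK_TB = 120
RACK_PRICE = 200_000   # dollars, approximate
RACK_WATTS = 3200
NODE_TB = 3
NODE_WATTS = 80

nodes = RACK_TB // NODE_TB                           # nodes per rack
rack_dollars_per_gb = RACK_PRICE / (RACK_TB * 1000)  # raw, decimal TB assumed
watts_per_tb = RACK_WATTS / RACK_TB

print(f"{nodes} nodes x {NODE_WATTS}W = {nodes * NODE_WATTS}W")
print(f"${rack_dollars_per_gb:.2f}/GB raw, {watts_per_tb:.1f} W/TB")
```

Forty nodes at 80W each matches the 3200W rack figure, which suggests (but doesn’t prove) that the 80W is for the whole node, drives included.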
August 29th, 2006 at 7:30 am
For the record:
http://www.capricorn-tech.com/
Just in case you had any difficulty finding them.
August 29th, 2006 at 8:14 pm
Have you thought about the IBRIX Fusion line of products? They’re claiming that you can scale up to 16 PB (petabytes) of data in a single namespace giving you plenty of future room for expansion and they do NFS and CIFS file shares.
Oh, plus they scale linearly in performance the more nodes you add.
August 30th, 2006 at 11:40 am
Isilon is pretty hot these days. They use all SATA drives and allow copious throughput and scalability. Check ‘em out!
August 30th, 2006 at 1:28 pm
I know the sun x4500 server was mentioned earlier, but here is a link:
http://www.sun.com/servers/x64/x4500/
They claim it starts at $2/GB.
August 30th, 2006 at 9:26 pm
You want Isilon. They make a GREAT cluster based file server which scales past petabytes and the performance is great too. Cost is hard to beat!
Not affiliated with them, at least not yet~
August 30th, 2006 at 11:00 pm
Josh,
Thanks for asking. Since I get two votes out of three choices, can I double-check “good” (or Reliable)?
The DreamHost nightmares of July and continuing into August have practically killed my admittedly “niche,” but formerly active, site. People couldn’t log on repeatedly, and many will never come back.
I vote for more robust router and generator systems. Current backup systems seem okay. At least, to DreamHost’s credit, it’s never lost my database. I hope the potential database loss isn’t as inevitable as an earthquake in L.A.
You guys scare me, not with your honesty, but with what I’ve seen happen to my site.
Josh, I love your sense of humor, the photos and your communicative blog. I know you guys work hard, and do your best to communicate during hard times.
I just really want to see my site not go down repeatedly, whatever that takes, even if it costs a few more dollars.
August 31st, 2006 at 5:35 pm
I would also strongly consider getting a Sun x4500 (thumper)
demo box from your friendly Sun sales rep and trying it
out with Solaris 10.
For the uninitiated, this box packs (48) SATA drive bays in
a 4U enclosure, with dual Opterons, boatloads of RAM, and
quad gigabit NICs. Very compelling platform from a real
enterprise vendor.
Solaris is the _only_ NFS server I would consider using
in production and ZFS will meet all your requirements.
If you jerk the wheel off the hardware support road, you
could buy much cheaper and larger drives on the open
market and drop them in yourselves.
* Sun X4500 SATA: $32,995.00 for 12TB: $2.69/gig
* Sun X4500 SATA: $69,995.00 for 24TB: $2.85/gig
Google is handy for these calculations:
http://google.com/search?q=%2433000+%2F+12+terabytes+in+dollars+per+gigabyte
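The same sums can be done in a few lines of Python. This is only a sketch, and the function name is made up. The per-gig figures above come out when a terabyte is taken as 1024 GB; with 1000 GB per TB they are a few cents higher.

```python
def dollars_per_gb(price, terabytes, gb_per_tb=1024):
    """Price per GB for a box of the given capacity."""
    return price / (terabytes * gb_per_tb)

for price, tb in ((32_995, 12), (69_995, 24)):
    binary = dollars_per_gb(price, tb)                   # 1 TB = 1024 GB
    decimal = dollars_per_gb(price, tb, gb_per_tb=1000)  # 1 TB = 1000 GB
    print(f"${price:,} for {tb}TB: ${binary:.2f}/GB binary, ${decimal:.2f}/GB decimal")
```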
September 1st, 2006 at 12:37 pm
Just throw a couple Xserve cabinets in there.
mmmmmm
September 1st, 2006 at 1:36 pm
Hey if the Amazon S3 is good enough for Smugmug’s 300TB of data then it should be good enough for your measly 250TB of data. Just saying.
September 1st, 2006 at 2:28 pm
Well, I can say that you shouldn’t pick Lustre if those are requirements. It’s a very high performance filesystem (one of the highest, in fact), but it’s not high availability. So, you have to rely on RAID, but a storage node could still easily fail. The metadata server failover is also pretty vanilla. It’s mainly designed around large compute clusters, not web hosting, where downtime *will* kill you, not just make you bleed.
September 1st, 2006 at 6:59 pm
We had a requirement for something similar. One proposal we considered was
building nodes out of 1U or 2U boxes holding 4 to 8 drives each. Each node
would run Solaris x86 and would create a raided, exportable iSCSI chunk from
the onboard drives.
At the top level, there would be a Solaris server with tape drives, etc for backup
and it would mount the nodes as iSCSI targets and put them into zfs pools.
Adding additional disk capacity is as simple as building another node and adding
it to the top level zfs storage pool.
September 1st, 2006 at 7:56 pm
You might try looking at IBRIX. It’s an all-software solution that’s $12k/yr and $22k/3yr, which makes it pretty cheap.
September 2nd, 2006 at 1:32 pm
http://www.equallogic.com
http://www.pillardatasystems.com
September 3rd, 2006 at 9:44 am
Is a solution from IBM or EMC out of the question? Price prohibitive?
September 7th, 2006 at 1:38 am
MogileFS is free, designed for this stuff, and unlike Bluearc, actually assumes hardware will fail and deals with it.
September 7th, 2006 at 8:57 am
Just an idea like that… Ever considered offering VPS hosting as well?
September 7th, 2006 at 2:05 pm
Could take a look at this, I’m not sure if it’s what you need, but…
http://www.nexsan.com/products/products/satabeast/advanced.html
September 8th, 2006 at 6:23 pm
“We’ve got a good system of RELIABILITY and PERFORMANCE already.”
If that’s so why has my site been down all evening, and sounds like it will be until MONDAY…
September 27th, 2006 at 6:36 pm
[...] This is fairly cheap considering Dreamhost pins their cost per usable GB at around $10. Amazon Elastic Compute Cloud (ec2) [...]
October 13th, 2006 at 3:44 pm
[...] You may remember when I asked for recommendations on storage. [...]
September 18th, 2008 at 8:58 am
[...] was when I made this post asking our customers for some suggestions on storage. I made the mistake in that post of mentioning [...]