Ask DreamHost Customers
August 25, 2006 on 10:44 am | In Business, Hardware, Insider View by Josh Jones |
I’ve got a question.
And I thought, who better to ask, than everybody?
Here goes…
We’ve got pretty serious storage needs. Like, in the next year, we’re estimating needing about 250TB (big T, big B) of additional centralized, networked, storage.
Besides needing a lot, we also need very high performance, redundancy, and thrift.

We want it ALL!
Our absolute requirements for our system are as follows:
* RELIABILITY .. we can never ever lose any data, ever.
* PERFORMANCE .. we need something that can serve approximately 3000 NFS ops per second per TB. (See spec.org). (It needs to do NFS.)
* PRICE .. that’s the whole reason we’re looking.
We’ve got a good system of RELIABILITY and PERFORMANCE already.. but the cost per usable GB is $10. The main problem is the 300GB Fiber Channel drives we use, which are $800 each. Is there anything out there that can do the same but with SATA drives that cost more like $100? Even if we needed twice or four times as many drives for the same performance and reliability, it seems possible!
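To put rough numbers on those requirements, here is a back-of-the-envelope sketch in Python. It uses only figures from this post (250TB, 3000 NFS ops per second per TB, $800 Fibre Channel drives, SATA drives at roughly $100). The function names are made up for illustration, and it assumes drive price is the only thing that changes, ignoring shelves, controllers, and power.

```python
# Back-of-the-envelope numbers from the post; all names are illustrative only.

CAPACITY_TB = 250        # additional storage estimated for the next year
OPS_PER_TB = 3000        # required NFS ops per second per TB
FC_DRIVE_PRICE = 800     # dollars per 300GB Fibre Channel drive
SATA_DRIVE_PRICE = 100   # dollars per SATA drive ("more like $100")


def aggregate_nfs_ops(capacity_tb, ops_per_tb):
    """Total NFS ops per second the whole system would have to serve."""
    return capacity_tb * ops_per_tb


def sata_cost_ratio(drive_multiplier):
    """SATA drive spend relative to FC drive spend when SATA needs
    `drive_multiplier` times as many drives to do the same job."""
    return drive_multiplier * SATA_DRIVE_PRICE / FC_DRIVE_PRICE


print(aggregate_nfs_ops(CAPACITY_TB, OPS_PER_TB))  # 750000
for multiplier in (1, 2, 4, 8):
    # A ratio below 1.0 means the SATA drives are still the cheaper option.
    print(multiplier, sata_cost_ratio(multiplier))
```

On drive price alone, SATA only stops being cheaper once it takes eight times as many drives as Fibre Channel, which is why two or four times as many still looks attractive.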
There are also some REALLY WANT TO HAVE features, though they could possibly be passed up if the top three are satisfied.
* SNAPSHOTS .. automatic versioning backups of all files by the OS. We’ve got this now, in a hidden “.snapshot” directory in every folder.. check it out!
* USER QUOTAS .. really, with the amount of space we’re giving out these days, quotas are almost a moot point. They’d be nice to have though.
* HIGH DENSITY/LOW POWER .. it’s always a plus to fit more storage in the same amount of space with the same amount of power, but it’s not really that big a deal.
* RAID 6 SUPPORT .. it’s cool.
Here are some vendors/solutions we’re considering..
* NETAPP .. what we use now.
* BLUEARC
* ONSTOR
* PANASAS
* CORAID (Open Source, ATA over Ethernet.. intereesssssting!)
* LUSTRE (Open Source, Clustering storage)
Soo.. basically, if people could post their suggestions, experience, other solutions, etc.. in the comments, it would be much appreciated. Not to mention you will be doing your patriotic duty to improve your hosting forever!

And remember, we’re talking serious NFS ops.. and we’d be willing to buy 256TB at once if our tests showed this system can do what we want and THE PRICE IS RIGHT!
61 Comments

That’s all well and good, but what if DreamHost does an update and it somehow wipes out complete domains/subdomains/folders and leaves its customer in limbo for six (going on seven) hours, without even a response like “we’re recovering your data”?
Check it out.
Comment by loper — August 25, 2006 #
Amazon S3. I sync my 12 GB of photos with it.
# $0.15 per GB-Month of storage used.
# $0.20 per GB of data transferred.
It’s not hosting so all the features you asked for would have to be run off another server and interact with the S3 storage.
Comment by Mike — August 25, 2006 #
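For scale, the S3 storage price quoted above can be multiplied out to the 250TB mentioned in the post. This is only a rough Python sketch: it assumes 1TB = 1000GB, counts storage alone, and leaves out the $0.20 per GB transfer charge because nothing here says how much traffic there would be. The function name is hypothetical.

```python
S3_STORAGE_PER_GB_MONTH = 0.15   # dollars, as quoted in the comment above
GB_PER_TB = 1000                 # assumption: decimal terabytes


def s3_monthly_storage_cost(capacity_tb):
    """Monthly S3 storage bill in dollars, excluding data transfer."""
    return capacity_tb * GB_PER_TB * S3_STORAGE_PER_GB_MONTH


monthly = s3_monthly_storage_cost(250)
print(f"${monthly:,.0f} per month, ${monthly * 12:,.0f} per year")
```

That works out to roughly $37,500 a month for storage before any transfer fees, and it is a recurring cost rather than a one-time purchase.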
The Coraid ATA over Ethernet seems like the best bang-for-the-buck solution out right now. The ability to add shelves and still use commodity hardware as a “NAS controller” makes it a very nice solution. I currently manage several LARGE NetApp appliances, and they are nice, but the cost per gig is off the scale and the associated support contracts are nearly as expensive as the devices themselves…
Comment by Dave — August 25, 2006 #
SATA over Ethernet looks pretty sweet. I really want some SATAoE on my LVM at home, maybe some slim drive enclosures with Ethernet on the back. I have no idea how scalable it is, or how cheap as your storage needs increase. Good luck though! =D
Comment by Adam Backstrom — August 25, 2006 #
One thing you should do is have telephone tech support.
Or at least allow paying customers the opportunity to call you, even if they only have a set number of calls allowed per month.
8 hours is too long to wait for a response to a critical problem- that DreamHost caused- know what I mean?
Comment by loper — August 25, 2006 #
Loper, we all realize you’re having problems, but have some common courtesy and post your issues where they are appropriate.
In case you are unsure, that place is not on this post.
Comment by James Asher — August 25, 2006 #
Have you thought about a Thumper (Sun x4500)?
That’s what they’re for. 24TB in 4U.
They’re about to be included in the 60 day ‘try and buy’ free trials too, so you can kick the tyres:
http://www.your-sun.com/x64/
(If you’re in a big hurry, have a word with your reseller - ours are offering them already).
Comment by Dick Davies — August 25, 2006 #
What’s the performance of ATA over Ethernet? Does it provide enough throughput to meet DH’s requirements?
Comment by Josh — August 25, 2006 #
mmm… I don’t know about storage, but don’t forget about electricity.
Comment by ST — August 25, 2006 #
“Loper, we all realize you’re having problems, but have some common courtesy and post your issues where they are appropriate.” - James Asher
Thanks, but while DreamHost did an update, it completely maimed my site. And Josh is here doing advertising.
DreamHost forces its customers into this conundrum when it fails to meet a reasonable response time; this is a result of putting the customers in a “don’t call us, we’ll call you” situation.
Granted, much of the time, the response is fine, but when it fails, DreamHost is failing its customers, some of whom, like me, pay a lot of money for the service.
The poster (Josh) asked for suggestions on improvement, and I’m making one. Where else am I expected to do so- when the email tech support is totally non-responsive?
That’s the problem. There is only one avenue to contact tech support, and it is showing (and has shown) a tendency to get bogged down, completely outside the control of the people affected by the problems.
Comment by Anonymous — August 25, 2006 #
MAID Technology in 28TB lumps at $3.75 per GB up to 480TB using up to 896 drives (250 or 500 GB SATA) Raid 5 each cabinet.
Power consumption 2,402- 6,368 watts.
200 - 240 VAC, 30 AMP, 3-phase.
Throughput 5.2 TB/Hour.
I’ll have a dozen :)
http://www.copansys.com/index.shtml
Comment by Norm — August 25, 2006 #
Is there a reason you aren’t considering building a GoogleFS or MogileFS setup? If you’re saying your current cost, fully implemented, is ~$10/GB, then the upper bounds for your 256TB setup should be in the ballpark of $2.5M. I’m guessing it has to be possible to build a 256TB total capacity MogileFS setup for less than $2.5M.
Of course, your “it must speak NFS” requirement throws a twist in here. How hard could it be to write a userspace NFS server that could sit in front of MogileFS storage? I might even prefer a userfs driver that “mounts” the MogileFS storage … then if you need to export that over NFS, fine.
Good luck, Josh & the DreamHost team.
Comment by Dossy Shiobara — August 25, 2006 #
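The budget ceiling in the comment above is easy to check. A minimal Python sketch, assuming the $10 per usable GB from the post and showing both decimal (1000GB) and binary (1024GB) terabytes, since the post does not say which it means; `budget_ceiling` is a made-up name.

```python
COST_PER_USABLE_GB = 10   # dollars, from the original post


def budget_ceiling(capacity_tb, gb_per_tb):
    """What `capacity_tb` would cost at the current price per usable GB."""
    return capacity_tb * gb_per_tb * COST_PER_USABLE_GB


print(budget_ceiling(256, 1000))  # 2560000
print(budget_ceiling(256, 1024))  # 2621440
```

Either way the figure lands in the ballpark of $2.5M, so any alternative has to come in well under that to be worth the switch.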
Don’t forget power requirements on your list. I’d second Sun’s Thumper for that reason. Look into ZFS while you’re at it for some slick snapshot capabilities.
Comment by Mike — August 25, 2006 #
Go Panasas!
Comment by Perry Huang — August 25, 2006 #
I don’t see the point in using SCSI for file storage, if you’re just gonna have an array of discs then SATA is fine. SATA drives are also larger than SCSI drives, use less power, run cooler and are quieter (though sound levels probably aren’t a real big issue in web hosting).
With even just an 8 drive RAID-6 set up with ‘cuda 750GB SATA-2 drives, that’s 4,500 GB right there. And the minimum throughput will be at least double what a gigabit network connection can provide.
However, it seems that with 300 - 400 GB drives you get more GB / $. So it’d have to be a trade-off between price and data density.
Comment by David Harrison — August 25, 2006 #
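The 4,500 GB figure in the comment above follows from RAID-6 giving up two drives’ worth of capacity to parity. A small Python sketch of that arithmetic; the function name is hypothetical, and it ignores filesystem overhead and hot spares.

```python
def raid6_usable_gb(drive_count, drive_size_gb, parity_drives=2):
    """Usable capacity of one RAID-6 group: total drive capacity
    minus two drives' worth of parity."""
    if drive_count <= parity_drives:
        raise ValueError("need more drives than parity drives")
    return (drive_count - parity_drives) * drive_size_gb


print(raid6_usable_gb(8, 750))  # 4500
```

With eight drives, a quarter of the raw capacity goes to parity; larger groups lose a smaller share, but each group can still only survive two failed drives.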
There are two other enterprise storage vendors that you should be having a serious look at: Pillar and TerraScale. That said, my workplace has similar online storage needs (0.5 petabyte and growing), and we loves us some NetApp….
Comment by genehack — August 25, 2006 #
Have you guys looked at the “low-cost fiber channel” solutions offered by EMC? It sits in between FC and SATA in terms of performance and price, and you can get 500GB drives (if I’m not mistaken).
Comment by Wilson — August 25, 2006 #
You might want to look at http://www.emc.com/products/systems/symmetrix/DMX_series/pdf/C1304_Symmetrix_DMX3_SS_ldv.pdf
(Apparently the raw 1PB was about $4M at the start of the year and has most likely come down in price; the only things are space, and the cost of repairs if something goes belly up.)
Comment by Nigel — August 25, 2006 #
I don’t know what you’re talking about with your “automatic versioning backups of all files by the OS…in a hidden .snapshot directory in every folder” — I don’t have a single .snapshot dir in my entire filesystem on Dreamhost.
Comment by Anthony DiSante — August 26, 2006 #
“The poster (Josh) asked for suggestions on improvement, and I’m making one”
Josh asked for networked storage, you whined about support response time :-/
“I don’t have a single .snapshot dir”
have you tried?
[shish] on [dorito] Sat Aug 26 04:50:34 ~/shishnet.org
>ls -a
. .. .htaccess catjump.avi favicon.ico index.php index2.html robots.txt
[shish] on [dorito] Sat Aug 26 04:50:38 ~/shishnet.org
>cd .snapshot
[shish] on [dorito] Sat Aug 26 04:50:47 ~/shishnet.org/.snapshot
>ls -a
. .. hourly.0 hourly.1 nightly.0 nightly.1 weekly.0 weekly.1
Comment by Shish — August 26, 2006 #
The University I do IT work at was recently presented with the same challenges. We opted to go with a Compellent system. http://www.compellent.com/ because it gives us all the features being looked for. It replaces our old netapps and gives us a lot more flexibility with provisioning and future growth.
Comment by Laurion — August 26, 2006 #
Hm, no. I assumed that “hidden” meant “starting with a dot, and thus hidden from the ls output unless you use the -a switch.” (I.e. the standard Unix definition of the word “hidden”.) I did an “ls -a” and .snapshot did not show up, nor did it show up in a “find . -iname ‘*snapshot*’”. So it’s actually not “hidden”, it’s invisible. I never would have guessed to cd into a directory that wasn’t actually there in the directory listing.
Comment by Anthony DiSante — August 26, 2006 #
Maybe not the right place to post it, but there are plenty of knowledgeable people here. Is anyone else still having problems logging into the Control Panel with Firefox? I haven’t been able to do it for weeks, and Internet Explorer makes me feel ill.
Comment by Jon — August 26, 2006 #
“Maybe not the right place to post it,”
You’re right–it’s not. Don’t be like Loper.
http://discussion.dreamhost.com
People that can’t follow the simple directions on what to post about here are probably causing their own problems in the first place. Lack of reading comprehension & common sense isn’t exactly known for making things run smoothly.
There is a place for discussing things that don’t belong here: http://discussion.dreamhost.com
There, the forum URL is posted twice. Now let’s see how many imbeciles still post irrelevant trash here.
Comment by Common Sense — August 26, 2006 #
to jon:
try clearing out your cache and delete your cookies and then try again. i had the same problem after a firefox upgrade.
Comment by Aaron — August 26, 2006 #
Josh,
As you already know, BlueArc is #1 on SPECsfs for NFS Ops (per your URL link). Titan 2000 also solves the scalability problem, delivering file system capacities to 256 terabytes, or 512 terabytes on a single namespace.
BlueArc offers both high-speed Fibre Channel and high-density SATA drives in a single system, and a lot of our customers are in the Internet services space, storing customer e-mail and hosting Web sites. We would be happy to provide references and have them tell you how Titan has accelerated their business.
You can learn more here: http://www.bluearc.com/html/products/titan.shtml
Give us a call. 1 866 864-1040.
Comment by Louis Gray — August 26, 2006 #
As long as your servers don’t die and lose data, what’s the difference to us?
Comment by Adam — August 27, 2006 #
“As long as your servers don’t die and lose data, what’s the difference to us?”
More reliable network storage is less likely to die or lose data; faster storage will make sites more responsive; cheaper storage allows for more space per customer and / or more money to be spent elsewhere (eg support, where even with the help of nagios, it still takes them an hour to notice a dead server; and several hours to notice a server with a load level over 200…)
Do the snapshots take up any space, or are they just links to existing data? If they’re taking space, is that accounted for in “$10 per usable GB”?
Comment by Shish — August 27, 2006 #
Common Sense is a prick
Comment by Jon — August 27, 2006 #
$10 per usable gigabyte means “free space where you, our beloved customers, can place data.” Snapshots do eat physical space on the disks; a 100MB deleted file is stored once in snapshots.
When Josh says RAID6, he means “two parity disks per RAID group”, meaning we can lose two disks in a single RAID group and not lose any data. I don’t know if that’s what NetApps do, as they call it RAID4 (single parity) and RAID_DP (dual parity).
You crazy people and your “data.” Why, back in my day, the web ran at 300 baud. Both ways. Uphill. In the snow. Barefoot.
Comment by Kelly — August 27, 2006 #
google needs a lot of storage and seems to be handling it with cheap disks and gfs :-)
Comment by doc — August 27, 2006 #
Why do people keep on mentioning GoogleFS? Last time I checked, it’s proprietary and not for sale. Or is it different now??
Comment by harry — August 27, 2006 #
Man, i don’t know if it will fit your needs, but a bunch of these just look wonderful –> http://www.apple.com/xserve/raid/
Comment by Douglas — August 27, 2006 #
It’s already been mentioned once, but I figure it’s worth reiterating. When I next build a scalable storage solution, it’s almost certainly going to be with Sun’s Thumper, Solaris and, the most important ingredient, ZFS. That gives you all of your requirements, I think, plus ZFS just totally rocks.
Comment by mathie — August 28, 2006 #
We use iSCSI boxes from Rackable Systems. These boxes are cheap arrays of SATA discs and are fast, as they allow you to use SCSI over TCP. This allows you to have close to Fibre Channel speeds but also have fully redundant arrays, because the boxes are commodity hardware and do not cost a lot of money. There is iSCSI software from Wasabi Systems and a few open source iSCSI packages for Linux that have all sorts of features.
Comment by Peter Adams — August 28, 2006 #
Someone mentioned it already but take a look at Pillar Data. I can’t speak for its specifications for your environment but their customer service is _GREAT_.
We had some technical complications with an Oracle 10g RAC and they had three people on site for 4 days to assist us with the problem. Not to mention 3 very long days.
J
Comment by jeffx — August 28, 2006 #
With your requirements, you need to take a meeting to learn about BlueArc. I know we would be the best solution, and at 250TB our price/TB will be very aggressive. Call me at 408-576-6689 and I will help you set up a 45 minute tech overview.
Comment by Jeff Narduzzi — August 28, 2006 #
We’ve been using Coraids for a while… Great stuff… GREAT STUFF. :)
Comment by M@ — August 28, 2006 #
Have you checked out PolyServe or Isilon?
Comment by Raj Bala — August 28, 2006 #
You might want to have a read of this post, which turned up in my RSS feeds today:
http://cuddletech.com/blog/pivot/entry.php?id=741
Comment by mathie — August 29, 2006 #
Hey Guys, this just showed up via Digg:
“Capricorn releases a 120TB storage system”
http://www.theinquirer.net/default.aspx?article=33968
3TB/1U Node, includes an x86 controller - consumes about 80W (inc drives?)
Up to 4x 750GB drives per node.
19U Rack holds 120TB, consuming 3200W
Approx $200K for a rack.
Comment by Will Hughes — August 29, 2006 #
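The rack figures in the comment above translate into cost and power per unit of storage. A rough Python sketch, assuming 1TB = 1000GB and that the 120TB is raw capacity (usable space after RAID would be lower); both helper names are made up.

```python
def dollars_per_gb(price_dollars, capacity_tb, gb_per_tb=1000):
    """Purchase price divided by capacity, in dollars per GB."""
    return price_dollars / (capacity_tb * gb_per_tb)


def watts_per_tb(power_watts, capacity_tb):
    """Power draw divided by capacity, in watts per TB."""
    return power_watts / capacity_tb


print(round(dollars_per_gb(200_000, 120), 2))  # 1.67
print(round(watts_per_tb(3200, 120), 1))       # 26.7
```

That is raw disk against the $10 per usable GB in the post, so it is not a like-for-like comparison, but it shows how much headroom there is.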
For the record:
http://www.capricorn-tech.com/
Just in case you had any difficulty finding them.
Comment by Lorenzo — August 29, 2006 #
Have you thought about the IBRIX Fusion line of products? They’re claiming that you can scale up to 16 PB (petabytes) of data in a single namespace giving you plenty of future room for expansion and they do NFS and CIFS file shares.
Oh, plus they scale linearly in performance the more nodes you add.
Link
Comment by Nathan — August 29, 2006 #
Isilon is pretty hot these days. They use all SATA drives and allow copious throughput and scalability. Check ‘em out!
Comment by Jeff Hughes — August 30, 2006 #
I know the sun x4500 server was mentioned earlier, but here is a link:
http://www.sun.com/servers/x64/x4500/
They claim it starts at $2/GB.
Comment by Mark — August 30, 2006 #
You want Isilon. They make a GREAT cluster based file server which scales past petabytes and the performance is great too. Cost is hard to beat!
Not affiliated with them, at least not yet~
Comment by Jim — August 30, 2006 #
Josh,
Thanks for asking. Since I get two votes out of three choices, can I double-check “good” (or Reliable)?
The DreamHost nightmares of July and continuing into August have practically killed my admittedly “niche,” but formerly active, site. People couldn’t log on repeatedly, and many will never come back.
I vote for more robust router and generator systems. Current backup systems seem okay. At least, to DreamHost’s credit, it’s never lost my database. I hope the potential database loss isn’t as inevitable as an earthquake in L.A.
You guys scare me, not with your honesty, but with what I’ve seen happen to my site.
Josh, I love your sense of humor, the photos and your communicative blog. I know you guys work hard, and do your best to communicate during hard times.
I just really want to see my site not go down repeatedly, whatever that takes, even if it costs a few more dollars.
Comment by pamd — August 30, 2006 #
I would also strongly consider getting a Sun x4500 (thumper) demo box from your friendly Sun sales rep and trying it out with Solaris 10.
For the uninitiated, this box packs (48) SATA drive bays in a 4U enclosure, with dual opterons, boatloads of RAM, and quad gigabit nics. Very compelling platform from a real enterprise vendor.
Solaris is the _only_ NFS server I would consider using in production and ZFS will meet all your requirements.
If you jerk the wheel off the hardware support road, you could buy much cheaper and larger drives on the open market and drop them in yourselves.
# Sun X4500 SATA: $32,995.00 for 12TB: $2.69/gig
# Sun X4500 SATA: $69,995.00 for 24TB: $2.85/gig
Google is handy for these calculations:
http://google.com/search?q=%2433000+%2F+12+terabytes+in+dollars+per+gigabyte
Comment by stu — August 31, 2006 #
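The per-gig prices in the comment above match a calculation with binary units (1TB = 1024GB), which appears to be how the Google query converts them; that is an inference from the numbers, not something the comment states. A short Python sketch with a hypothetical helper, showing the decimal result alongside for comparison:

```python
def dollars_per_gig(price_dollars, capacity_tb, gb_per_tb=1024):
    """List price divided by capacity, in dollars per GB."""
    return price_dollars / (capacity_tb * gb_per_tb)


for price, tb in ((32_995, 12), (69_995, 24)):
    binary = round(dollars_per_gig(price, tb), 2)
    decimal = round(dollars_per_gig(price, tb, gb_per_tb=1000), 2)
    print(f"{tb}TB: ${binary}/gig binary, ${decimal}/gig decimal")
```

With binary units this gives $2.69 and $2.85 per gig, the same as quoted; with decimal units the same list prices come to about $2.75 and $2.92.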
Just throw a couple Xserve cabinets in there.
mmmmmm
Comment by Chris — September 1, 2006 #
Hey if the Amazon S3 is good enough for Smugmug’s 300TB of data then it should be good enough for your measly 250TB of data. Just saying.
Comment by Mike Lane — September 1, 2006 #