Poster: ihtoit Date: Oct 9, 2012 7:24pm
Forum: petabox Subject: Design update?

I have an idea to more than double data density while potentially *reducing* power requirements per rack. It involves ten 4U servers per rack, each with 45 storage drives, 1 system drive and two power supplies per case (one 500W and one 1000W; you'll see why), with the top 4U reserved for local switchgear etc. This gives a potential raw capacity of 1350TB per rack (using 3TB drives), or roughly 1PB per rack using RAID6.
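A quick sketch of the capacity arithmetic above, in Python. The post doesn't say how the 45 drives in each server would be grouped into RAID6 arrays, so the `raid6_group` sizes below are assumptions for illustration; the only fixed rule used is that RAID6 gives up two drives' worth of space per array to parity.

```python
def rack_capacity_tb(servers=10, drives_per_server=45, drive_tb=3, raid6_group=None):
    """Capacity of one rack in (decimal) TB.

    raid6_group is the number of drives per RAID6 array (an assumed layout,
    not something stated in the post). None means raw capacity, no RAID.
    """
    if raid6_group is None:
        data_drives = drives_per_server
    else:
        if raid6_group < 4 or drives_per_server % raid6_group != 0:
            raise ValueError("group size must be >= 4 and divide the drive count")
        groups = drives_per_server // raid6_group
        data_drives = groups * (raid6_group - 2)  # two parity drives per array
    return servers * data_drives * drive_tb


print(rack_capacity_tb())                 # 1350 (raw)
print(rack_capacity_tb(raid6_group=15))   # 1170 (three 15-drive arrays per server)
print(rack_capacity_tb(raid6_group=5))    # 810  (nine 5-drive arrays per server)
```

So the usable RAID6 figure moves quite a bit with the array layout, and filesystem overhead would come off the top of whichever number you pick.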

Power supplies: in each case, a 500W ATX PSU would power the motherboard, the system drive and all the fans. You'd probably need half a dozen 120mm fans per case, no more. The 1kW PSU would only run anywhere near full load at startup, when it spins up the 45 storage drives before the server boots. After that, the load would drop to something more sensible. Also bear in mind that most of the power drain on a hard drive is through the +5V rail.
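Whether the 1kW supply really covers the spin-up peak depends on the per-drive currents, which the post doesn't give. Here is a minimal Python sketch for checking it; the function names are made up for illustration, and the currents are meant to come from the drive's datasheet. It also treats the PSU rating as one pooled number, which is a simplification, since real supplies have separate per-rail limits.

```python
def spinup_load_watts(drives, amps_12v, amps_5v):
    """Per-rail and total load in watts if all drives spin up at the same time.

    amps_12v / amps_5v are the peak spin-up currents for ONE drive,
    taken from its datasheet (no values are assumed here).
    """
    rail_12v = drives * amps_12v * 12.0
    rail_5v = drives * amps_5v * 5.0
    return {"12V": rail_12v, "5V": rail_5v, "total": rail_12v + rail_5v}


def fits_psu(load, psu_watts=1000.0):
    """True if the simultaneous spin-up load stays within the PSU's total rating."""
    return load["total"] <= psu_watts
```

Call `spinup_load_watts` with 45 drives and the datasheet currents, then pass the result to `fits_psu`. If the total comes out over the rating, staggered spin-up is the usual way around it.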

The drives themselves would be connected via port multipliers to PCIe SATA cards (45/5 gives you the requirement: 9 cables and 9 ports, which can be done with two 4-port cards and one 2-port card), then configured as JBOD or RAIDx. I reckon each server could be built for less than US$7,000. Each PB rack could be done for change out of US$100k (not including switchgear). I've had quotes myself for such a project using "Enterprise grade" hardware and seen seven to eight digits. Amazon want nearly three million bucks per rack!
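The port and cost figures above work out as follows, in a short Python sketch. It uses only the numbers from the post (five drives behind each SATA port, the 4+4+2 card mix, US$7,000 per server, ten servers per rack); the function names are just for illustration.

```python
import math


def ports_needed(drives=45, drives_per_port=5):
    """SATA host ports required when each port fans out to drives_per_port drives."""
    return math.ceil(drives / drives_per_port)


def rack_hardware_cost(servers=10, cost_per_server=7000):
    """Upper-bound hardware cost per rack in US$, excluding switchgear."""
    return servers * cost_per_server


cards = [4, 4, 2]  # ports per PCIe SATA card, as suggested in the post

print(ports_needed())               # 9
print(sum(cards) - ports_needed())  # 1 port left spare
print(rack_hardware_cost())         # 70000
```

That leaves one port spare per server and about US$30k of headroom under the US$100k figure.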

Poster: Coderjo Date: Oct 10, 2012 11:10pm
Forum: petabox Subject: Re: Design update?

It sounds like you are suggesting the Backblaze storage pod. I'd be surprised if someone in charge of hardware at IA hasn't already heard of it. (I am not part of that group)

Poster: Coderjo Date: Oct 10, 2012 11:18pm
Forum: petabox Subject: Re: Design update?

BTW, I've heard the current design makes use of SuperMicro CSE-847E26-R1400LPB cases.

Poster: GridEngine Date: Dec 24, 2012 10:00am
Forum: petabox Subject: Re: Design update?

Keep in mind that while Amazon S3 is more expensive than any of the DIY storage projects, the S3 pricing already includes the cost of maintenance. Also, S3 keeps 3 copies of your data in different datacenters.

If pricing is a major concern, then Amazon RRS is cheaper, but it only keeps copies of your data in 2 different datacenters. There is also Amazon Glacier, which offers much cheaper pricing, but it is a bit more difficult to get your data out of Glacier.

BTW, I don't think Internet Archive should move to AWS.

Note: I don't work for Amazon, but we use AWS for our scalability testing:


Open Grid Scheduler - The Official Open Source Grid Engine