The cost of disks

When I’m providing quotes for servers and systems, I’m often asked why the disks are so expensive. This is frequently accompanied by a statement that they can go down to the local department store and buy a multi-terabyte disk for around one hundred dollars.

The question comes up so often that it’s worth explaining to non-technical people why the prices are so different.

Firstly, those USB hard disks you buy in the shop are much like the ones in your home computer. They’re domestic grade disks, designed for a single person sitting in front of a computer for a couple of hours a day, saving a document every 10 minutes or so, and turning off the computer when they’re not using it.

Server grade hard disks are very different beasts. They run 24 hours a day, seven days a week. They may have dozens, hundreds or thousands of users reading and writing to them at any given moment. So, these disks are built to stricter quality standards, typically spin faster, and therefore cost more.

Secondly, as I blogged previously, hard disks will fail. It’s inescapable. So, to ensure your server keeps running, we never put just a single disk in a server. We put two or three or more in, and build in levels of redundancy. For really critical systems, we also add “hot spare” disks: one or more drives which normally sit empty, but will automatically swing into action when one of the main disks fails. Statistically, if you have a group of disks and one fails, it’s more likely a second one will fail too, so it’s important to have a replacement on hand as soon as possible. So, where you buy one multi-terabyte disk at your local shop, we’re putting in three or more.

Thirdly, bigger disks fail more often: hard disks come in a standard (physical) size, and they operate by storing data on a metallic surface using magnetic pulses. When data is stored more densely (with more disk platters, or more tracks per inch on each disk platter), there is a higher risk that a magnetic pulse will inadvertently affect the tracks around it. So, we will usually put in more disks of a smaller capacity rather than fewer large disks, to reduce this risk.

Fourthly, it’s not just the disk: in larger environments, particularly ones with virtual servers, we don’t put the disks into the server. We use a system called Storage Area Networking (SAN) where the disks are placed in a central system, and shared out to the other servers. Taken as a whole, this eliminates some wastage as disk space can be re-allocated or extra space added quickly, but it requires additional cabling, switches and management. Those costs are built into the “cost per gigabyte” you may be quoted for your system.

Fifthly, backups: every gigabyte of extra disk you add needs to be backed up, typically several times. That means more removable drives, or backup tapes. It may mean higher capacity backup tape drives too. The price of that needs to be factored into the cost of buying new disk.
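
For anyone who likes to see the arithmetic, here is a rough sketch in Python of how the factors above might roll up into a single cost per gigabyte. Every number in it is made up purely for illustration (the disk prices, the SAN share, the backup media cost and the number of backup copies are all assumptions, not real quotes), and the function name is my own invention.

```python
# Illustrative only: all prices and quantities below are hypothetical.

def cost_per_usable_gb(disk_price, disk_size_gb, data_disks,
                       redundancy_disks, hot_spares,
                       shared_san_cost, backup_copies,
                       backup_media_cost_per_gb):
    """Roll disk, redundancy, SAN and backup costs into one figure."""
    # Every disk has to be paid for, even those that hold no user data.
    total_disks = data_disks + redundancy_disks + hot_spares
    # Only the data disks contribute space you can actually use.
    usable_gb = data_disks * disk_size_gb
    hardware = total_disks * disk_price + shared_san_cost
    # Each usable gigabyte is backed up several times.
    backups = usable_gb * backup_copies * backup_media_cost_per_gb
    return (hardware + backups) / usable_gb

# One 2 TB disk from the local shop for 100 dollars.
shop_cost = 100 / 2000

# Six smaller server grade disks: four for data, one for redundancy,
# one hot spare, plus a share of SAN costs and three backup copies.
server_cost = cost_per_usable_gb(
    disk_price=300, disk_size_gb=1000, data_disks=4,
    redundancy_disks=1, hot_spares=1, shared_san_cost=2000,
    backup_copies=3, backup_media_cost_per_gb=0.05,
)

print(f"Shop disk:     {shop_cost:.2f} dollars per gigabyte")
print(f"Server system: {server_cost:.2f} dollars per gigabyte")
```

With these invented figures the server system works out at roughly twenty times the per-gigabyte price of the shop disk. The exact ratio means nothing; the point is that the disk itself is only one of several terms in the sum.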

Hopefully, this post helps explain some of the reasons. Typically, all these costs get wrapped up in a simple figure of “cost per gigabyte”, so it’s easier to work out how much storage will be as a system grows. But a gigabyte is not just a gigabyte, in this case – and it’s far, far more than just a simple disk you buy down the street and plug in!