For all the hyperbole devoted to Athlons and Pentiums, the hard drive is still the single most important component in a computer system. A faster hard drive makes more difference to the usability of a system than any other component. (Except RAM, of course, but RAM is very boring — if you can remember the three words "more is better" then you know almost all you ever need to know.) The extraordinary thing is that there is no recognised standard single measure of hard drive performance.
Measuring and comparing performance is always problematic, but it's particularly so with hard drives. There are innumerable hard drive benchmark testing programs, but despite some valiant efforts, none of them are particularly well-respected. Because all a computer's components interact with the hard drive, and because different users and programs use the drive in different ways, and most of all because this crazy industry keeps moving the technological goalposts, a software test that's almost fair on 1997 hardware can be all but useless with current kit — and probably will not run at all on 1991 equipment! To make matters more difficult still, how can you be certain that a Windows-based test is an appropriate predictor for Macs or Unix systems, or vice-versa?
But all this is to ignore the real and fundamental underlying difficulty, one that very few researchers seem to appreciate, let alone attempt to deal with in a comprehensive way. It is this: the real effect of computer performance in general and of hard drive performance in particular is the effect on the human being.
Outside of certain specialised industrial and scientific applications, the only purpose in making hard drives faster is to please the human being sitting at the keyboard. Sure, faster drives make happier users, but how much faster? Faster in what way? And what type of user? Hard drive performance measurement, in short, is not purely technical. Nor, of course, is it purely psychological: it is a little of both.
The first difficulty we have to deal with, if we are to make drive performance measurement more relevant to the human being, is the way that most benchmark numbers scale. They can be quite useful for comparing two or three quite similar drives, within, say, 10 or 20% of one another, but give counter-intuitive and almost meaningless results for drives of different generations or different market segments.
Like most physical measurements of matters which are, in the final analysis, perceptual, benchmark scores tend to overstate the differences at the top end of the scale. As an example of this, consider using horsepower as an indication of the speed of your car. Sure, a 200 horsepower motor will make it go faster than a 100 horsepower motor, but not twice as fast.
Similarly, a computer with a 2000 MHz CPU is nothing like twice as fast as the same machine with a 1000 MHz chip. Partly this is to do with the fact that (in this example) we have only changed one part and the rest of the system — RAM, hard drive, video card and so on — is no different.
(Of course, this is exactly what you have to do when you are benchmarking. Though very dated now, some of our old CPU and motherboard performance tests explored this in more detail and Ace's Hardware developed quite a name for looking intelligently at the interrelationships between components.)
However, even if we double all the components (plug in a twice-as-fast hard drive, twice as much RAM, and so on), we still don't get a machine which is twice as fast. This is because our perception of computer speed, like our perception of most things, is not arithmetic but logarithmic.
An example: a noise which sounds only just noticeably louder than another noise actually has about twice as much energy in the air vibrations. Our ears can't tell the difference between a sound and a second sound which carries 20 percent more energy. (If this seems absurd, take a look at any introductory sound engineering or psychology textbook.) Sensibly, audio engineers don't usually measure sound pressure levels directly: they measure them on a curving, logarithmic scale which "seems straight" to the human ear. This has the very useful result that you can measure any two sounds using the audio decibel (dBA, or just dB for short), no matter how loud or how soft, and know how far apart they are: a 3dB difference (twice as much actual power) is only just noticeable if you concentrate hard, a 6dB difference (four times as much power) is noticeable under normal conditions, and so on. It doesn't matter if we are measuring the 20dB whisper of the breeze on a summer day or the 98dB roar of a rock band. The audio dB, in other words, like all good measurements, scales properly.
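A quick illustration of the arithmetic, in Python: converting a power ratio to decibels is just 10 × log10(ratio), which is why a doubling of power is such a small step on the dB scale.

```python
import math

def power_ratio_to_db(ratio):
    """Convert a ratio of acoustic (or electrical) power to decibels."""
    return 10 * math.log10(ratio)

# Doubling the power adds roughly 3dB: about the smallest change we notice.
print(power_ratio_to_db(2))    # ~3.01 dB
# Four times the power adds roughly 6dB: clearly noticeable.
print(power_ratio_to_db(4))    # ~6.02 dB
# A 20 percent increase in power is well under 1dB: inaudible.
print(power_ratio_to_db(1.2))  # ~0.79 dB
```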
By the way, it's common to use a log scale to measure all sorts of things, not just sound volume. Examples are as varied as the dBV (for measuring voltage), the Beaufort Scale (for measuring wind force), and the Richter Scale (for measuring earthquakes). Indeed, even the musical scale works rather like this — it is measuring frequency not volume, but each octave is twice as large as the one before.
In summary, we need a measurement that:
- Is consistent across a wide range of drives: past and present; fast, slow and middling
- Scales appropriately, so that a drive that "feels" 20% faster to the average user gets a 20% bigger number
- Is free of anomalies
With all this in mind, we can return to the problem of measuring hard drive performance. It really ought to be very simple! Ignoring two or three very minor factors, there are only three things that determine hard drive speed, and all three are very easy to measure — so much so that in practice we can usually accept the manufacturer's published claims for them. (But with care — some manufacturers cheat!) The drive has to:
- Move to the right part of the disc (seek time).
- Wait while the disc spins until the first byte of the desired data passes under the read head (latency).
- Suck the rest of the data off the disc as fast as possible (Data Transfer Rate or DTR).
That's all there is to it. (We are ignoring relatively trivial factors like caching, external data rate and head switching; these are discussed elsewhere.) Seek, latency and DTR are public figures and easily verifiable. And yet there is no commonly accepted single number to describe hard drive performance. In contrast, despite the masses of talk surrounding the issue, no-one seriously argued with the major public benchmarks for CPU performance until a certain CPU manufacturer bought shares in the benchmark publishers and put its thumb in the scales. (No names here, let's just say its initials were "Intel".) Up till then though, things like Business Winstone 98 were really fairly decent guides. (Note that we are talking about real work here, not games.)
Hard drive designers no doubt spend a lot of time and money investigating the theoretical relationship between the three main determinants of hard drive performance, but the intricacies of this are not very relevant to us. We just want a single-figure real-world guide. In any case, real-world drives have strongly cross-correlated key performance factors. In other words, drives with good DTR tend to have low seek times and latency, and so on. This is for both technical and commercial reasons: remember that like everything else in this industry, hard drive performance is as much a product of social and economic factors as it is of technical ones. The upshot is that understanding the theoretical relationship between DTR, seek time and latency in the development lab is not particularly useful, as it doesn't tell us much about the small subset of all technically possible products that actually gets released onto the market.
But giving a real-world single-figure indication of drive performance should not be difficult! Any experienced computer techie can estimate a drive's performance fairly accurately just by sitting at the keyboard for a minute or two. Reasonably keen but non-technical computer users soon detect a difference if you swap in a significantly faster or slower drive. In conversation with other computer people, it's commonplace to agree on the merits or shortcomings of a particular model — much more so than with, say, CPUs or video cards, about which even the experts disagree.
In the old days, the mid-eighties, let's say, it used to be common to just quote the seek time as a single measure. The habit came about because (back then) nearly all drives ran at 3600 RPM and thus had identical latency, and nearly all drives had exactly the same DTR: 5 Mbit/sec was an interface limitation of the old MFM controller. Even the handful of faster transfer drives only had 7.5 Mbit/sec DTRs, so seek time really was a pretty good descriptor. Not any more!
But the habit dies hard, of course. It's still quite common for non-technical people to ask about seek time, thinking it equals performance. It stopped being very meaningful around about 1990. Seek time still has some validity as a single measure, but only because it is strongly cross-correlated with latency and DTR in commercially successful drives, and because typical DTR has become so high that it is a lesser factor than it used to be. Usually, if the drive maker has spent all that money on giving a drive a fast seek time, they will have spent money on getting decent latency and a good DTR too.
Although drive latency is almost never quoted on its own, RPM figures are, and they carry exactly the same information. This is because latency is determined only by RPM: average latency is simply the time the platter takes to make half a revolution, so the two figures are just different ways of expressing the same thing. It is quite common to use RPM as a rough guide to performance. As a rough and ready measure, it's not bad at all. But it can't make fine distinctions, and like all single measures it can be quite misleading.
Latency for common spindle speeds

RPM    | Latency (ms)
3600   | 8.33
3811   | 7.87
4000   | 7.50
4400   | 6.82
4500   | 6.67
5200   | 5.77
5400   | 5.56
7200   | 4.17
10,000 | 3.00
15,000 | 2.00
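Because average latency is just the time for half a revolution, you can work it out directly from the spindle speed; a few lines of Python reproduce the table above:

```python
def average_latency_ms(rpm):
    """Average rotational latency: the time for half a revolution, in milliseconds."""
    ms_per_revolution = 60_000 / rpm   # 60,000 ms in a minute
    return ms_per_revolution / 2       # on average, the data is half a turn away

for rpm in (3600, 3811, 4000, 4400, 4500, 5200, 5400, 7200, 10_000, 15_000):
    print(f"{rpm:>6} RPM -> {average_latency_ms(rpm):.2f} ms")
```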
In lieu of anything better, some people use the internal data transfer rate (DTR) as the closest thing to a single-figure description — it does correlate pretty well with actual performance (about 0.9). And, of course, it cross-correlates reasonably well with seek time and latency too — there is no technical reason why it has to, but few competent drive manufacturers spend millions developing a hard drive with top class data transfer and very poor seek time.
But just taking the DTR is not always accurate: some drives with low DTRs go rather well, and some with high DTRs are not as good as you'd expect. There is an interaction.
Three equal speed drives with quite different performance characteristics

Model           | Data rate   | Seek   | Latency | Performance
Micropolis 2217 | 47 Mbit/sec | 10ms   | 5.56ms  | 0.85
Quantum Bigfoot | 93 Mbit/sec | 13ms   | 8.33ms  | 0.85
Seagate         | 68 Mbit/sec | 12.5ms | 6.66ms  | 0.84
The three drives above are from different eras and market segments but have roughly equal performance (something you can verify quite easily by trying them out in practice and comparing with a few much faster and much slower units). As you can see, they use three very different ways to kill the same cat. The 2217 has little more than half the data transfer rate of the Bigfoot, but holds its own because of its faster seek time and much better latency; the Quantum's excellent DTR makes up for its slow spin (i.e. high latency) and slowish seek time. The Seagate is average in all respects, and all three are about the same speed. (The alert reader will notice that we have slipped in a fast one here: we are using our own speed measurement in a kind of self-justification. If it bothers you enough, go get hold of some drives and run some other speed test on them — Winbench or whatever you like. You'll come up with broadly similar results.)
Although we have dismissed each of the three main single measures in turn, we have seen nothing to suggest that we need to introduce a fourth variable. Seek time, latency and DTR are clearly the key factors to consider. Is it possible to find some way of combining them to produce an accurate composite measure?
The first step is obvious: add the seek time and the latency together. If you think about how a drive works, you can see that it doesn't really matter if it has 15ms seek time and 5ms latency, or 5ms seek and 15ms latency: either way, the net average delay before the drive starts reading data is 20ms. (There is a complicating factor here to do with the non-random distribution of data, but we won't get into this just yet.) This yields access time. (Technically, "access time" is seek plus latency plus the various electronic delays involved, but these are so small that we can ignore them for now.)
That leaves us with just two variables. It should then be a simple mathematical process to discover the correct way to combine access time and DTR to produce a performance rating. The normal method is to take some sample measurements, find a formula for access time and DTR which produces the same answers as the actual measurements, and then predict some other measurements with it. If the predictions are close to the actual measured results, then the formula is correct and we can use it.
Unfortunately, we can't do this, because there is no standard, uncontroversial way to measure drive performance. We are right back with the problem we originally started with! We can make all the mathematical predictions we like, but there is no performance equivalent of the 12 inch ruler or the chemist's scales, and there is, therefore, no way to check the results. In the end, what we are measuring is as much psychological as it is technical: how much faster will the drive seem to you or me?
Now at last we are on firmer ground. There is no doubt, for example, that the Western Digital Caviar 140 was faster than the WD93044, or that there was very little difference between the Seagate Medalist 1720 and the IBM Deskstar 2. Similarly, just by sitting at the keyboard, we can easily tell that our old Seagate Cheetah 1 is still faster than any of the IDE drives made until quite late in the '90s.
There are many possible ways to combine DTR and delay to produce a single figure, of course. The easiest way to test them is to use a spreadsheet or statistical program to produce performance tables for some well-known drives. The majority of these possible combinations can be eliminated at a glance — the end figures obviously bear little relationship to reality. For the small number of transformations that make sense on first sight, closer inspection is required.
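By way of illustration (this is only a sketch of the kind of spreadsheet exercise we mean, with a handful of arbitrary candidate combinations, applied to the three roughly equal drives from the table above), you can tabulate the candidates side by side and throw out the ones that obviously misbehave:

```python
import math

# The three roughly equal-speed drives from the table above:
# data rate in Mbit/sec, seek and latency in milliseconds.
drives = {
    "Micropolis 2217": (47, 10.0, 5.56),
    "Quantum Bigfoot": (93, 13.0, 8.33),
    "Seagate":         (68, 12.5, 6.66),
}

# A few of the many possible ways of combining DTR with access time (seek + latency).
candidates = {
    "DTR / access":       lambda dtr, acc: dtr / acc,
    "DTR - access":       lambda dtr, acc: dtr - acc,
    "sqrt(DTR) / access": lambda dtr, acc: math.sqrt(dtr) / acc,
    "log(DTR) / access":  lambda dtr, acc: math.log10(dtr) / acc,
}

for name, formula in candidates.items():
    print(name)
    for model, (dtr, seek, latency) in drives.items():
        print(f"  {model:<16} {formula(dtr, seek + latency):.3f}")
```

Since these three drives feel about equally quick in practice, a plausible candidate should give them near-identical numbers: the first two candidates can be rejected at a glance, while the others need checking against much faster and much slower drives, and against drives with unusual mixes of characteristics.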
We need to pay particular attention to unusual cases: very fast or slow drives, and ones with an unusual mix of performance characteristics — these are the ones most likely to show up weaknesses in the formulae. It's particularly useful to do blind testing. (Or at least as close a thing to blind testing as possible, given the measurement difficulties we've outlined above!) By this we mean selecting a drive, estimating its performance rating from experience with it, then using the formula to calculate the actual performance rating. If the calculated result is close to the estimated result, then it's evidence in favour of the formula under test. If it is surprising, then either your estimate was out, or the formula could use revision.
Of the many dozens of formulae we tried, one clearly gives the best fit: 2 log(DTR) / √(access time). So far, it's the only one we've found which seems to work with consistent accuracy, though there may well be others which are as good or better. We make no claim for a theoretical basis behind it, and in fact suspect that it will need modification to cope properly with the ever-faster drives that will be released in years to come.
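A short sketch shows the formula at work, taking the logarithm as base 10 and the access time in milliseconds (which matches the figures quoted in the table above):

```python
import math

def performance_rating(dtr_mbit_per_sec, seek_ms, latency_ms):
    """2 log(DTR) / sqrt(access time), with log base 10 and access time in milliseconds."""
    access_ms = seek_ms + latency_ms
    return 2 * math.log10(dtr_mbit_per_sec) / math.sqrt(access_ms)

# The three roughly equal drives from the table above.
print(round(performance_rating(47, 10.0, 5.56), 2))   # 0.85  (Micropolis 2217)
print(round(performance_rating(93, 13.0, 8.33), 2))   # 0.85  (Quantum Bigfoot)
print(round(performance_rating(68, 12.5, 6.66), 2))   # 0.84  (Seagate)
```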
We'd also expect that a truly universal formula of this nature would be generalizable to other, non-drive storage devices: this formula clearly is not — if you are interested, try plugging in the data transfer rate, seek and latency of a floppy drive: the results seem meaningless. It does, however, work very well across the range we are interested in: hard drives from the ST-412 to the Cheetah X15-36LP. If and when we find a better way to express hard drive performance, we'll switch to it. It's not very meaningful to compare, say, a Cheetah 1 and an ST-225 directly, of course, but comparing either of them with drives of not too dissimilar a vintage works well and produces few surprises.
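To see that floppy drive breakdown for yourself, plug in some rough figures (these are our own approximations: something like 0.5 Mbit/sec transfer rate, a seek time in the region of 90ms, and 100ms average latency at 300 RPM) and the formula returns a negative number, which is clearly not a usable performance rating:

```python
import math

def performance_rating(dtr_mbit_per_sec, seek_ms, latency_ms):
    return 2 * math.log10(dtr_mbit_per_sec) / math.sqrt(seek_ms + latency_ms)

# Rough 3.5" floppy figures (approximate): ~0.5 Mbit/sec, ~90ms seek, 100ms latency (300 RPM).
print(performance_rating(0.5, 90, 100))   # about -0.04: below zero, i.e. meaningless
```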
Of course, no single figure can hope to describe a drive's performance in the same way as the DTR, seek, and RPM (i.e. latency) figures we try to provide for all the drives we've listed; it can give you a rough idea of the general performance of the drive, but provides little of the flavour that the detailed figures add. For example, returning to the three more or less equal speed drives in the table above, you can see that the Bigfoot is easily the quickest if you mostly play big A/V files, that the Micropolis would be much better for database work, and you'd prefer to have the Seagate for more general purpose tasks.