HARD DISKS

When the power to a PC is switched off, the contents of memory are lost. It is the PC's hard disk that serves as a non-volatile, bulk storage medium and as the repository for a user's documents, files and applications. It's astonishing to recall that back in 1956, when IBM introduced the first hard disk, capacity was a mere 5MB stored across fifty 24in platters. Some 24 years later Seagate Technology introduced the first hard disk drive for personal computers, boasting a capacity of up to 40MB and a data transfer rate of 625 KBps using the MFM encoding method. A later version of the company's ST506 interface increased both capacity and speed and switched to the RLL encoding method. It's equally hard to believe that as recently as the late 1980s 100MB of hard disk space was considered generous. Today, this would be totally inadequate, hardly enough to install the operating system, let alone a huge application such as Microsoft Office.

The PC's upgradeability has led software companies to believe that it doesn't matter how large their applications are. As a result, the average size of the hard disk rose from 100MB to 1.2GB in just a few years and by the start of the new millennium a typical desktop hard drive stored 18GB across three 3.5in platters. Thankfully, as capacity has gone up prices have come down, improved areal density levels being the dominant reason for the reduction in price per megabyte.

It's not just the size of hard disks that has increased. The performance of fixed disk media has also evolved considerably. When the Intel Triton chipset arrived, EIDE PIO mode 4 was born and hard disk performance soared to new heights, allowing users to experience high-performance and high-capacity data storage without having to pay a premium for a SCSI-based system.

Construction
Hard disks are rigid platters, composed of a substrate and a magnetic medium. The substrate - the platter's base material - must be non-magnetic and capable of being machined to a smooth finish. It is made either of aluminium alloy or a mixture of glass and ceramic. To allow data storage, both sides of each platter are coated with a magnetic medium - formerly magnetic oxide, but now, almost exclusively, a layer of metal called a thin-film medium. This stores data in magnetic patterns, with each platter capable of storing a billion or so bits per square inch (bpsi) of platter surface.

Platters vary in size and hard disk drives come in two form factors, 5.25in or 3.5in. The trend is towards glass technology since this has the better heat resistance properties and allows platters to be made thinner than aluminium ones. The inside of a hard disk drive must be kept as dust-free as the factory where it was built. To eliminate internal contamination, the platters are sealed in a case and air pressure is equalised via special filters. The interior is not a vacuum, since the heads rely on the air inside the case to fly above the platters. This sealed chamber is often referred to as the head disk assembly (HDA).

Typically two, three or more platters are stacked on top of each other with a common spindle that turns the whole assembly at several thousand revolutions per minute. There's a gap between the platters, making room for a magnetic read/write head, mounted on the end of an actuator arm. This is so close to the platters that it's only the rush of air pulled round by the rotation of the platters that keeps the head away from the surface of the disk - it flies a fraction of a millimetre above the disk. On early hard disk drives this distance was around 0.2mm. In modern-day drives this has been reduced to 0.07mm or less. A small particle of dirt could cause a head to "crash", touching the disk and scraping off the magnetic coating. On IDE and SCSI drives the disk controller is part of the drive itself.

There's a read/write head for each side of each platter, mounted on arms which can move them towards the central spindle or towards the edge. The arms are moved by the head actuator, which contains a voice-coil - an electromagnetic coil that can move a magnet very rapidly. Loudspeaker cones are vibrated using a similar mechanism.

The heads are designed to touch the platters when the disk stops spinning - that is, when the drive is powered off. During the spin-down period, the airflow diminishes until it stops completely, when the head lands gently on the platter surface - to a dedicated spot called the landing zone (LZ). The LZ is dedicated to providing a parking spot for the read/write heads, and never contains data.

When a disk undergoes a low-level format, it is divided into tracks and sectors. The tracks are concentric circles around the central spindle on either side of each platter. Tracks physically above each other on the platters are grouped together into cylinders which are then further subdivided into sectors of 512 bytes apiece. The concept of cylinders is important, since cross-platter information in the same cylinder can be accessed without having to move the heads. The sector is a disk's smallest accessible unit. Drives use a technique called zoned-bit recording in which tracks on the outside of the disk contain more sectors than those on the inside.

Operation
Data is recorded onto the magnetic surface of the disk in exactly the same way as it is on floppies or digital tapes. Essentially, the surface is treated as an array of dot positions, with each "domain" of magnetic polarisation being set to a binary "1" or "0". The position of each array element is not identifiable in an "absolute" sense, and so a scheme of guidance marks helps the read/write head find positions on the disk. The need for these guidance markings explains why disks must be formatted before they can be used.

When it comes to accessing data already stored, the disk spins round very fast so that any part of its circumference can be quickly identified. The drive translates a read request from the computer into reality. There was a time when the cylinder/head/sector location that the computer worked out really was the data's location, but today's drives are more complicated than the BIOS can handle, and they translate BIOS requests by using their own mapping.

In the past it was also the case that a disk's controller did not have sufficient processing capacity to be able to read physically adjacent sectors quickly enough, thus requiring that the platter complete another full revolution before the next logical sector could be read. To combat this problem, older drives would stagger the way in which sectors were physically arranged, so as to reduce this waiting time. With an interleave factor of 3, for instance, two sectors would be skipped after each sector read. An interleave factor was expressed as a ratio, "N:1", where "N" represented the distance between one logical sector and the next. The speed of a modern hard disk drive with an integrated controller and its own data buffer renders the technique obsolete.
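The staggering described above can be sketched in a few lines of Python. This is an illustrative model only, not any controller's actual firmware: the function name interleaved_layout and the nine-sector track are assumptions made for the example.

```python
def interleaved_layout(sectors_per_track: int, interleave: int) -> list:
    """Return the logical sector number held in each physical slot of a track.

    `interleave` is the N in "N:1": consecutive logical sectors are placed
    N physical slots apart, so N - 1 slots are skipped between them.
    """
    layout = [None] * sectors_per_track
    slot = 0
    for logical in range(sectors_per_track):
        # If the target slot is already occupied, move on to the next free one.
        while layout[slot] is not None:
            slot = (slot + 1) % sectors_per_track
        layout[slot] = logical
        slot = (slot + interleave) % sectors_per_track
    return layout


# Hypothetical nine-sector track with a 3:1 interleave.
print(interleaved_layout(9, 3))  # [0, 3, 6, 1, 4, 7, 2, 5, 8]
```

With a 3:1 interleave, logical sector 1 sits three slots after logical sector 0, giving the controller the time taken by two intervening sectors to get ready for the next read. An interleave of 1:1 reduces to the plain sequential layout used by modern drives.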

The rate at which hard disk capacities have increased over the years has given rise to a situation in which allocating and tracking individual data sectors on even a typical drive would require a huge amount of overhead, causing file handling efficiency to plummet. Therefore, to improve performance, data sectors have for some time been allocated in groups called clusters. The number of sectors in a cluster depends on the cluster size, which in turn depends on the partition size.

When the computer wants to read data, the operating system works out where the data is on the disk. To do this it first reads the FAT (File Allocation Table) at the beginning of the partition. This tells the operating system in which sector on which track to find the data. With this information, the head can then read the requested data. The disk controller controls the drive's servo-motors and translates the fluctuating voltages from the head into digital data for the CPU.

More often than not, the next set of data to be read is sequentially located on the disk. For this reason, hard drives contain between 256KB and 8MB of cache buffer in which to store all the information in a sector or cylinder in case it's needed. This is very effective in speeding up both throughput and access times. A hard drive also requires servo information, which provides a continuous update on the location of the heads. This can be stored on a separate platter, or it can be intermingled with the actual data on all the platters. A separate servo platter is more expensive, but it speeds up access times, since the data heads won't need to waste any time sending servo information.

However, the servo and data platters can get out of alignment due to changes in temperature. To prevent this, the drive constantly rechecks itself in a process called thermal recalibration. During multimedia playback this can cause sudden pauses in data transfer, resulting in stuttered audio and dropped video frames. Where the servo information is stored on the data platters, thermal recalibration isn't required. For this reason the majority of drives embed the servo information with the data.

File systems
The precise manner in which data is organized on a hard disk drive is determined by the file system used. File systems are generally operating system dependent. However, since Microsoft Windows is the most widely used PC operating system, most other operating systems are at least read-compatible with its file systems.

The FAT file system was first introduced in the days of MS-DOS way back in 1981. The purpose of the File Allocation Table is to provide the mapping between clusters - the basic unit of logical storage on a disk at the operating system level - and the physical location of data in terms of cylinders, tracks and sectors - the form of addressing used by the drive's hardware controller.

The FAT contains an entry for every file stored on the volume that contains the address of the file's starting cluster. Each cluster's entry in the table contains a pointer to the next cluster in the file, or an end-of-file indicator (0xFFFF), which indicates that this cluster is the end of the file. The diagram shows three files: File1.txt uses three clusters, File2.txt is a fragmented file that requires three clusters and File3.txt fits in one cluster. In each case, the file allocation table entry points to the first cluster of the file.
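Following a file's chain of clusters can be sketched as below. This is a simplified model, not an on-disk FAT parser: the table is a plain Python dictionary, the cluster numbers are hypothetical values chosen to mirror the three files described above, and only the single end-of-file marker 0xFFFF cited in the text is recognised.

```python
END_OF_FILE = 0xFFFF  # end-of-chain marker as cited in the text


def cluster_chain(fat: dict, start_cluster: int) -> list:
    """Follow the FAT from a file's starting cluster to its end-of-file marker."""
    chain = []
    seen = set()
    cluster = start_cluster
    while True:
        if cluster in seen:
            raise ValueError(f"loop detected at cluster {cluster}")
        seen.add(cluster)
        chain.append(cluster)
        next_cluster = fat[cluster]
        if next_cluster == END_OF_FILE:
            return chain
        cluster = next_cluster


# Hypothetical table: each key is a cluster, each value the next cluster in the file.
fat = {2: 3, 3: 4, 4: END_OF_FILE,   # File1.txt: three contiguous clusters
       5: 6, 6: 8, 8: END_OF_FILE,   # File2.txt: fragmented, skips cluster 7
       7: END_OF_FILE}               # File3.txt: a single cluster
starting_cluster = {"File1.txt": 2, "File2.txt": 5, "File3.txt": 7}

for name, start in starting_cluster.items():
    print(name, cluster_chain(fat, start))
```

The fragmented file is the interesting case: its chain jumps from cluster 6 to cluster 8, which on a real disk means extra head movement when the file is read.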

The first incarnation of FAT was known as FAT12, which supported a maximum partition size of 8MB. This was superseded in 1984 by FAT16, which increased the maximum partition size to 2GB. FAT16 has undergone a number of minor modifications over the years, for example, enabling it to handle file names longer than the original limitation of 8.3 characters. FAT16's principal limitation is that it imposes a fixed maximum number of clusters per partition, meaning that the bigger the hard disk, the bigger the cluster size and the more unusable space on the drive. The biggest advantage of FAT16 is that it is compatible across a wide variety of operating systems, including Windows 95/98/Me, OS/2, Linux and some versions of UNIX.

Dating from the Windows 95 OEM Service Release 2 (OSR2), Windows has supported both FAT16 and FAT32. The latter is little more than an extension of the original FAT16 file system that provides for a much larger number of clusters per partition. As such, it offers greatly improved disk utilization over FAT16. However, FAT32 shares all of the other limitations of FAT16 plus the additional one that many non-Windows operating systems that are FAT16-compatible will not work with FAT32. This makes FAT32 inappropriate for dual-boot environments. However, while other operating systems such as Windows NT can't directly read a FAT32 partition, they can read it across the network. It's no problem, therefore, to share information stored on a FAT32 partition with other computers on a network that are running older versions of Windows.

With the advent of Windows XP in October 2001, support was extended to include NTFS. NTFS is a completely different file system from FAT that was introduced with the first version of Windows NT in 1993. Designed to address many of FAT's deficiencies, it provides for greatly increased privacy and security. The Home edition of Windows XP allows users to keep their information private to themselves, while the Professional version supports access control and encryption of individual files and folders. The file system is inherently more resilient than FAT: it is less likely to suffer damage in the event of a system crash, and any damage that does occur is more likely to be recoverable via the chkdsk.exe utility. NTFS also journals all file changes, so as to allow the system to be rolled back to an earlier, working state in the event of some catastrophic problem rendering the system inoperable.

FAT16, FAT32 and NTFS each use different cluster sizes depending on the size of the volume, and each file system has a maximum number of clusters it can support. The smaller the cluster size, the more efficiently a disk stores information because unused space within a cluster cannot be used by other files; the more clusters supported, the larger the volumes or partitions that can be created.
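The cost of a large cluster size is easy to quantify. The following is a minimal sketch, assuming a file always occupies a whole number of clusters; the function names and the 1,000-byte example file are illustrative assumptions.

```python
def allocated_bytes(file_size: int, cluster_size: int) -> int:
    """Disk space consumed by a file that must occupy whole clusters."""
    clusters = -(-file_size // cluster_size)  # integer ceiling division
    return clusters * cluster_size


def slack_bytes(file_size: int, cluster_size: int) -> int:
    """Space inside the file's last cluster that no other file can use."""
    return allocated_bytes(file_size, cluster_size) - file_size


KB = 1024
# A hypothetical 1,000-byte file on volumes with different cluster sizes.
for cluster_size in (512, 4 * KB, 32 * KB):
    print(cluster_size, slack_bytes(1000, cluster_size))
```

On a volume with 32KB clusters the 1,000-byte file wastes 31,768 bytes, whereas with 4KB clusters the waste falls to 3,096 bytes. Multiplied across thousands of small files, this is the unusable space referred to above.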

The table below provides a comparison of volume and default cluster sizes for the different Windows file systems still commonly in use:

Volume Size	FAT16 Cluster Size	FAT32 Cluster Size	NTFS Cluster Size
7MB – 16MB	2KB	Not supported	512 bytes
17MB – 32MB	512 bytes	Not supported	512 bytes
33MB – 64MB	1KB	512 bytes	512 bytes
65MB – 128MB	2KB	1KB	512 bytes
129MB – 256MB	4KB	2KB	512 bytes
257MB – 512MB	8KB	4KB	512 bytes
513MB – 1GB	16KB	4KB	1KB
1GB – 2GB	32KB	4KB	2KB
2GB – 4GB	64KB	4KB	4KB
4GB – 8GB	Not supported	4KB	4KB
8GB – 16GB	Not supported	8KB	4KB
16GB – 32GB	Not supported	16KB	4KB
32GB – 2TB	Not supported	Not supported	4KB

Performance
The performance of a hard disk is very important to the overall speed of the system - a slow hard disk having the potential to hinder a fast processor like no other system component - and the effective speed of a hard disk is determined by a number of factors.

Chief among them is the rotational speed of the platters. Disk RPM is a critical component of hard drive performance because it directly impacts the latency and the disk transfer rate. The faster the disk spins, the more data passes under the magnetic heads that read the data; the slower the RPM, the higher the mechanical latencies. Hard drives only spin at one constant speed, and for some time most fast EIDE hard disks spun at 5,400rpm, while a fast SCSI drive was capable of 7,200rpm. In 1997 Seagate pushed spin speed to a staggering 10,033rpm with the launch of its UltraSCSI Cheetah drive and, in mid 1998, was also the first manufacturer to release an EIDE hard disk with a spin rate of 7,200rpm.

In 1999 Hitachi broke the 10,000rpm barrier with the introduction of its Pegasus II SCSI drive. This spins at an amazing 12,000rpm - which translates into an average latency of 2.49ms. Hitachi has used an ingenious design to reduce the excessive heat produced by such a high spin rate. In a standard 3.5in hard disk, the physical disk platters have a 3in diameter. However, in the Pegasus II, the platter size has been reduced to 2.5in. The smaller platters cause less air friction and therefore reduce the amount of heat generated by the drive. In addition, the actual drive chassis is one big heat fin, which also helps dissipate the heat. The downside is that since the platters are smaller and have less data capacity, there are more of them and consequently the height of the drive is increased.

Mechanical latencies, measured in milliseconds, include both seek time and rotational latency. "Seek Time" defines the amount of time it takes a hard drive's read/write head to find the physical location of a piece of data on the disk. "Latency" is the average time for the sector being accessed to rotate into position under a head, after a completed seek. It is easily calculated from the spindle speed, being the time for half a rotation. A drive's "average access time" is the interval between the time a request for data is made by the system and the time the data is available from the drive. Access time includes the actual seek time, rotational latency, and command processing overhead time.
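The half-rotation rule can be written out directly. The sketch below is illustrative: the function names are assumptions, and the access-time helper simply adds the three components named above.

```python
def average_rotational_latency_ms(rpm: float) -> float:
    """Average latency is the time taken for half a revolution."""
    ms_per_revolution = 60_000 / rpm
    return ms_per_revolution / 2


def average_access_time_ms(seek_ms: float, rpm: float, overhead_ms: float = 0.0) -> float:
    """Seek time plus rotational latency plus command processing overhead."""
    return seek_ms + average_rotational_latency_ms(rpm) + overhead_ms


for rpm in (5_400, 7_200, 10_033, 12_000):
    print(f"{rpm:>6} rpm: {average_rotational_latency_ms(rpm):.2f} ms")
```

A nominal 12,000rpm works out at 2.5ms, in line with the figure of roughly 2.49ms quoted for the Pegasus II; a 5,400rpm drive, by contrast, has an average latency of about 5.6ms.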

The "disk transfer rate" (sometimes called media rate) is the speed at which data is transferred to and from the disk media (actual disk platter) and is a function of the recording frequency. It is generally described in megabytes per second (MBps). Modern hard disks have an increasing range of disk transfer rates from the inner diameter to the outer diameter of the disk. This is called a "zoned" recording technique. The key media recording parameters relating to density per platter are Tracks Per Inch (TPI) and Bits Per Inch (BPI). A track is a circular ring around the disk. TPI is the number of these tracks that can fit in a given area (inch). BPI defines how many bits can be written onto one inch of a track on a disk surface.
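The reason the transfer rate varies from inner to outer diameter can be illustrated with a rough upper-bound estimate: at a constant spin speed, a track holding more sectors delivers more bytes per revolution. The sectors-per-track figures below are hypothetical, and the calculation ignores head switches, track-to-track seeks and formatting overhead.

```python
SECTOR_BYTES = 512


def media_rate_mbps(sectors_per_track: int, rpm: float) -> float:
    """Upper-bound media rate in MBps (1MB = 1,000,000 bytes) for one track."""
    revolutions_per_second = rpm / 60
    bytes_per_second = sectors_per_track * SECTOR_BYTES * revolutions_per_second
    return bytes_per_second / 1_000_000


# Hypothetical zones on a 7,200rpm drive.
outer_zone = media_rate_mbps(sectors_per_track=600, rpm=7_200)
inner_zone = media_rate_mbps(sectors_per_track=300, rpm=7_200)
print(f"outer zone: {outer_zone:.1f} MBps, inner zone: {inner_zone:.1f} MBps")
```

Under these assumed figures the outer zone sustains about 36.9MBps against 18.4MBps for the inner zone, which is why benchmark results tail off as a disk fills from the outside in.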

The "host transfer rate" is the speed at which the host computer can transfer data across the IDE/EIDE or SCSI interface to the CPU. It is more generally referred to as the data transfer rate, or DTR, and can be the source of some confusion. Some vendors list the internal transfer rate, the rate at which the disk moves data from the head to its internal buffers. Others cite the burst data transfer rate, the maximum transfer rate the disk can attain under ideal circumstances and for a short duration. More important for the real world is the external data transfer rate, or how fast the hard disk actually transfers data to a PC's main memory.

By late 2001 the fastest high-performance drives were capable of an average latency of less than 3ms, an average seek time of between 4 and 7ms and maximum data transfer rates in the region of 50 and 60MBps for EIDE and SCSI-based drives respectively. Note the degree to which these maximum DTRs are below the bandwidths of the current versions of the drive's interfaces - Ultra ATA/100 and UltraSCSI 160 - which are rated at 100MBps and 160MBps respectively.

AV capability
Audio-visual applications require different performance characteristics than are required of a hard disk drive used for regular, everyday computer use. Typical computer usage involves many requests for relatively small amounts of data. By contrast, AV applications - digital audio recording, video editing and streaming, CD writing, etc. - involve large block transfers of sequentially stored data. Their prime requirement is for a steady, uninterrupted stream of data, so that any "dropout" in the analogue output is avoided.

In the past this meant the need for specially designed, or at the very least suitably optimized, hard disk drives. However, with the progressive increase in the bandwidth of both the EIDE and SCSI interfaces over the years, the need for special AV rated drives has become less and less. Indeed, Micropolis - a company that specialized in AV drives - went out of business as long ago as 1997.

The principal characteristic of an "AV drive" centered on the way that it handled thermal recalibration. As a hard drive operates, the temperature inside the drive rises causing the disk platters to expand (as most materials do when they heat up). In order to compensate for this phenomenon, hard drives would periodically recalibrate themselves to ensure the read and write heads remained perfectly aligned over the data tracks. Thermal recalibration (also known as "T-cal") is a method of re-aligning the read/write heads, and whilst it is happening, no data can be read from or written to the drive.

In the past, non-AV drives entered a calibration cycle on a regular schedule regardless of what the computer and the drive happened to be doing. Drives rated as "AV" have employed a number of different techniques to address the problem. Many handled T-cal by rescheduling or postponing it until such time that the drive is not actively capturing data. Some additionally used particularly large cache buffers or caching schemes that were optimised specifically and exclusively for AV applications, incurring a significant performance loss in non-AV applications.

By the start of the new millennium the universal adoption of embedded servo technology by hard disk manufacturers meant that thermal recalibration was no longer an issue. This effectively weaves head-positioning information amongst the data on discs, enabling drive heads to continuously monitor and adjust their position relative to the embedded reference points. The disruptive need for a drive to briefly pause data transfer to correctly position its heads during thermal recalibration routines is thereby completely eliminated.

Capacity
Since its advent in 1955, the magnetic recording industry has constantly and dramatically increased the performance and capacity of hard disk drives to meet the computer industry's insatiable demand for more and better storage. The areal density of hard drives has increased at a historic rate of roughly 27% per year - peaking in the 1990s to as much as 60% per year - with the result that by the end of the millennium disk drives were capable of storing information in the 600-700 Mbits/in² range.

The read-write head technology that has sustained the hard disk drive industry through much of this period is based on the inductive voltage produced when a permanent magnet (the disk) moves past a wire-wrapped magnetic core (the head). Early recording heads were fabricated by wrapping wire around a laminated iron core analogous to the horseshoe-shaped electromagnets found in elementary school physics classes. Market acceptance of hard drives, coupled with increasing areal density requirements, fuelled a steady progression of inductive recording head advances. This progression culminated in advanced thin-film inductive (TFI) read-write heads capable of being fabricated in the necessary high volumes using semiconductor-style processes.

Although it was conceived in the 1960s, it was not until the late 1970s that TFI technology was actually deployed in commercially available products. The TFI read/write head - which essentially consists of wire-wrapped magnetic cores which produce a voltage when moved past a magnetic hard disk platter - went on to become the industry standard until the mid-1990s. By this time it became impractical to increase areal density in the conventional way - by increasing the sensitivity of the head to magnetic flux changes by adding turns to the TFI head's coil - because this increased the head's inductance to levels that limited its ability to write data.

The solution lay in the phenomenon discovered by Lord Kelvin in 1857 - that the resistance of ferromagnetic alloy changes as a function of an applied magnetic field - known as the anisotropic magnetoresistance (AMR) effect.

Capacity barriers
Whilst Bill Gates' assertion that "640KB ought to be enough for anyone" is the most famous example of lack of foresight when it comes to predicting capacity requirements, it is merely symptomatic of a trait that has afflicted the PC industry since its beginnings in the early 1980s. In the field of hard disk technology at least 10 different capacity barriers occurred in the space of 15 years. Several have been the result of BIOS or operating system issues, a consequence of either short-sighted design, restrictions imposed by file systems of the day or simply as a result of bugs in hardware or software implementations. Others have been caused by limitations in the associated hard disk drive standards themselves.

IDE hard drives identify themselves to the system BIOS by the number of cylinders, heads and sectors per track. This information is then stored in the CMOS. Sectors are always 512 bytes in size. Therefore, the capacity of a drive can be determined by multiplying the number of cylinders by the number of heads by the number of sectors per track by 512. The BIOS interface allows for a maximum of 1024 cylinders, 255 heads and 63 sectors, but the IDE interface itself supports no more than 16 heads. Taking the lower limit in each case gives 1024 cylinders, 16 heads and 63 sectors, which calculates out at 504MiB. The IEC's binary megabyte notation was intended to address the confusion caused by the fact that this capacity is referred to as 528MB by drive manufacturers, who consider a megabyte to be 1,000,000 bytes instead of the binary programming standard of 1,048,576 bytes.
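The arithmetic behind the two figures can be checked as follows. This is a small sketch, assuming the geometry of 1,024 cylinders, 16 heads and 63 sectors per track, which is the combination that produces the 504MiB figure; the function name is an assumption.

```python
SECTOR_BYTES = 512


def chs_capacity_bytes(cylinders: int, heads: int, sectors_per_track: int) -> int:
    """Capacity of a drive described by its cylinder/head/sector geometry."""
    return cylinders * heads * sectors_per_track * SECTOR_BYTES


capacity = chs_capacity_bytes(cylinders=1024, heads=16, sectors_per_track=63)
print(capacity)                       # 528482304 bytes
print(capacity / 1_048_576)           # 504.0 binary megabytes (MiB)
print(round(capacity / 1_000_000))    # 528 decimal megabytes (MB)
```

The same byte count therefore appears as 504MiB in binary units and as 528MB in the decimal units favoured by drive manufacturers.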

The 528MB barrier was the most infamous of all the hard disk capacity restrictions and primarily affected PCs with BIOSes created before mid-1994. It arose because of the restriction of the number of addressable cylinders to 1,024. Its removal - which led to the "E" (for Enhanced) being added to the IDE specification - was achieved by abandoning the cylinders, heads and sectors (CHS) addressing technique in favour of logical block addressing, or LBA. This is also referred to as the BIOS Int13h extensions. With this system the BIOS translates the cylinder, head and sector (CHS) information into a 28-bit logical block address, allowing operating systems and applications to access much larger drives.
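The translation between the two addressing schemes follows a conventional formula, sketched here in Python. The function names and the 16-head, 63-sector geometry are assumptions for illustration; real BIOSes applied a variety of translation schemes on top of this basic mapping.

```python
def chs_to_lba(cylinder: int, head: int, sector: int,
               heads_per_cylinder: int, sectors_per_track: int) -> int:
    """Map a CHS address to a logical block address.

    Cylinders and heads are numbered from 0, sectors from 1.
    """
    return (cylinder * heads_per_cylinder + head) * sectors_per_track + (sector - 1)


def lba_to_chs(lba: int, heads_per_cylinder: int, sectors_per_track: int) -> tuple:
    """Inverse mapping, returning (cylinder, head, sector)."""
    cylinder, remainder = divmod(lba, heads_per_cylinder * sectors_per_track)
    head, sector_index = divmod(remainder, sectors_per_track)
    return cylinder, head, sector_index + 1


# Assumed geometry: 16 heads, 63 sectors per track.
print(chs_to_lba(0, 0, 1, 16, 63))       # 0, the first sector on the drive
print(chs_to_lba(1023, 15, 63, 16, 63))  # 1032191, the last sector below the 528MB barrier
print(lba_to_chs(1008, 16, 63))          # (1, 0, 1)
```

Because an LBA is a single number, the drive is free to map it onto its real internal geometry in whatever way suits its zoned layout.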

Unfortunately, the designers of the system BIOS and the ATA interface did not set up the total bytes used for addressing in the same manner, nor did they define the same number of bytes for the cylinder, head, and sector addressing. The differences in the CHS configurations required that there be a translation of the address when data was sent between the system (using the system BIOS) and the ATA interface. The result was that the introduction of LBA did not immediately solve the problem of the 528MB barrier and also gave rise to a further restriction at 8.4GB.

The 8.4GB barrier involved the total addressing space that was defined for the system BIOS. Prior to 1997 most PC systems were limited to accessing drives with a capacity of 8.4GB or less. The reason for this was that although the ATA interface used 28-bit addressing which supported drive capacities up to 2**28 x 512 bytes or 137GB, the BIOS Int13h standard imposed a restriction of 24-bit addressing (10 bits for the cylinder, 8 for the head and 6 for the sector), thereby limiting access to a nominal maximum of 2**24 x 512 bytes. Since sector numbers start at 1, only 63 of the 64 possible sector values are usable, which gives the familiar figure of 8.4GB.

The solution to the 8.4GB barrier was an enhancement of the Int13h standard by what is referred to as Int13h extensions. This allows for a quad-word or 64 bits of addressing, which is equal to 2**64 x 512 bytes or 9.4 x 10**21 bytes. That is 9.4 Tera Gigabytes or over a trillion times as large as an 8.4GB drive. It was not until after mid-1998 that systems were being built that properly supported the BIOS Int13h extensions.

By the beginning of the new millennium, and much to the embarrassment of the drive and BIOS manufacturers, the 137GB limit imposed by the ATA interface's 28-bit addressing was itself beginning to look rather restrictive. However - better late than never - it appears as though the standards bodies may have finally learnt from their previous mistakes. The next version of the EIDE protocol (ATA-6) - being reviewed by the ANSI committee in the autumn of 2001 - allows for 48 bits of address space, giving a maximum addressable limit of 144PB (Petabytes). That's around a million times higher than the current barrier and, on previous form, sufficient for the next 20 years at least!
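The successive addressing limits discussed in this section can be reproduced with a few lines of Python. This is a sketch assuming 512-byte sectors throughout; the variable names are illustrative, and the Int13h figure assumes the usual split of 1,024 cylinders, 256 heads and 63 usable sector numbers.

```python
SECTOR_BYTES = 512


def addressable_bytes(address_bits: int) -> int:
    """Largest capacity reachable with a sector address of the given width."""
    return 2 ** address_bits * SECTOR_BYTES


int13h_chs = 1024 * 256 * 63 * SECTOR_BYTES   # 24-bit CHS, sectors numbered from 1
ata_28bit = addressable_bytes(28)             # original ATA logical block addressing
ata_48bit = addressable_bytes(48)             # ATA-6
int13h_ext = addressable_bytes(64)            # Int13h extensions

print(f"Int13h CHS: {int13h_chs / 1e9:.1f} GB")
print(f"28-bit LBA: {ata_28bit / 1e9:.1f} GB")
print(f"48-bit LBA: {ata_48bit / 1e15:.1f} PB")
print(f"64-bit LBA: {int13h_ext:.2e} bytes")
print(f"48-bit versus 28-bit: {ata_48bit // ata_28bit:,} times larger")
```

The extra 20 address bits in ATA-6 multiply the limit by 2**20, or 1,048,576, which takes the ceiling from about 137GB to about 144PB.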