To most computer users, the hard drive is a mysterious and little-understood device. Everyone knows that a vast array of information resides on a computer, but few know how that information is actually stored. The sections below define hard drive terms such as sectors, clusters, file slack, and unallocated space, and examine how they relate to one another.
Computer Hard Drive Operations
There are three basic layers involved in the process of writing and erasing information from a computer hard drive: (1) the hardware layer, (2) the Operating System layer, and (3) the application layer.
1. The Hardware Layer
The main storage medium utilized by computers is a device called a hard drive, also known as a hard disk, disk drive, or hard disk drive, and abbreviated as HD or HDD. Computer hard drives come in various physical sizes (1.8″, 2.5″, 3.5″, and the older 5.25″), capacities (how much information they can store), and interfaces (how they communicate with the computer). You may have heard terms like platter, track, cylinder, and sector used in reference to computer hard drives. These are all part of the hardware layer.
Most hard drives are composed of one or more platters specially coated to allow information to be stored magnetically. (This description does not cover the newer solid-state drives, which are rarely used for now.) The platters are stacked on a spindle and rotated at high speed, from 4,200 to 15,000 revolutions per minute. Each platter has two read/write heads dedicated to it, one on top and one on the bottom. Both surfaces of a platter can hold tens of billions of individual bits of information. A bit is a simple yes or no, true or false, 1 or 0, magnetically charged or not.
The platter itself is divided into tracks (tightly packed rings). As shown in Figure 1, these tracks are concentric, unlike the continuous spiral of a phonograph record. Each platter contains thousands of these tracks, and the number grows as technology improves.
Figure 1: Platter with Four Tracks
Most hard drives have multiple platters stacked on top of each other, and a cylinder consists of the identically positioned tracks from each platter (see Figure 2). For example, every “Track 0”, from both sides of all platters, collectively makes up one cylinder. The term is a relic from the days before LBA (Logical Block Addressing).
Figure 2: A Cylinder
Returning to tracks: a track holds far too much information to be suitable as the smallest unit of storage on a hard drive, so each track is further divided into sectors (see Figure 3). A sector holds 512 bytes of information. Today’s hard drives can have thousands of these sectors in a single track. Just like the track density on a platter, the sector density increases as technology improves.
Figure 3: A Sector
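To illustrate how cylinders, heads, and sectors relate to the Logical Block Addressing mentioned above, the sketch below converts a cylinder/head/sector position into a single running sector number using the conventional CHS-to-LBA formula. The function name and the geometry values (16 heads per cylinder, 63 sectors per track) are assumptions chosen for illustration, not properties of any particular drive.

```python
def chs_to_lba(cylinder, head, sector, heads_per_cylinder=16, sectors_per_track=63):
    """Convert a cylinder/head/sector address to a logical block address.

    The default geometry is a hypothetical example. Cylinders and heads
    are numbered from 0; sectors are numbered from 1 within a track.
    """
    if not 1 <= sector <= sectors_per_track:
        raise ValueError("sector must be between 1 and sectors_per_track")
    if not 0 <= head < heads_per_cylinder:
        raise ValueError("head must be between 0 and heads_per_cylinder - 1")
    return (cylinder * heads_per_cylinder + head) * sectors_per_track + (sector - 1)


first_sector = chs_to_lba(0, 0, 1)    # very first sector on the drive
next_cylinder = chs_to_lba(1, 0, 1)   # first sector of the second cylinder
print(first_sector, next_cylinder)
```

With this assumed geometry, one full cylinder spans 16 × 63 = 1,008 sectors, so the second cylinder begins at logical block 1008.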
2. The Operating System Layer
The Operating System (OS), such as the various Microsoft Windows products, Linux, Unix, and MacOS, shields the application layer (Section 3, below) from the complexities of the hardware layer (Section 1, above). The operating system utilizes “partitions” to create “clusters”, “volumes”, and “file systems” on a hard drive. Only then can it organize, store, retrieve, and delete information from the hard drive.
The terms partition and volume are often used interchangeably. Strictly speaking, this is not accurate. While they may occupy the same basic space on the hard drive, a volume is actually created within a partition.
A partition is an area of a hard drive reserved for use through an entry in the partition table of that hard drive. Barring a few specialized partitions, these partitions are accessible for use by virtually all operating systems. In contrast, a volume is a logical storage unit contained within a partition and formatted with a given file system.
For example, let’s consider a computer with the Windows Operating System and a single 250GB (gigabyte) hard drive arranged as in the table below. The Primary Partition contains a single volume assigned to drive letter “C”. The Extended Partition contains two volumes of 100GB each, assigned to drive letters “D” and “E” respectively.
Table 1: Partitions versus Volumes
Most computers shipped today have a single hard drive with two partitions, one of which is hidden or at least write-protected. The user-visible partition/volume is labeled “C”; it holds the operating system used to run the computer and stores the computer owner’s programs and files. The hidden partition is a recovery partition that holds the manufacturer-supplied operating system install files, device drivers, and default applications.
A computer’s file system (also written as filesystem) is a method of storing and retrieving data on a computer system that allows for a hierarchy of directories, subdirectories, and files. A file system creates and maintains structures allowing files to be created, moved, copied, deleted, located, truncated, and appended. A computer file system is comparable to a library’s Dewey Decimal System (DDS) in that it is portable across all computers with support for that file system. There are many file systems available, such as various versions of FAT (File Allocation Table), NTFS (New Technology File System), UFS (Unix File System), various versions of EXT (Extended File System), HPFS (High Performance File System), and more. All Operating Systems can support one or more file systems.
A cluster is a set of consecutive sectors. Depending on the operating system and the file system being utilized, the number of sectors in each cluster will vary. For example, most Microsoft Windows 2003, 2008, XP, Vista, and the new Windows 7 systems utilize the New Technology File System (NTFS) with eight sectors per cluster by default. Because the size of each sector is 512 bytes, each cluster is 4,096 bytes or 4KB in size (8 sectors x 512 bytes each). A single cluster can typically store about 680 words, such as the first few paragraphs of a legal pleading or forensic report. One cluster is the smallest amount of hard drive space that can be allocated to any given file, though most files require the use of more than one cluster.
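The cluster arithmetic above can be sketched in a few lines of Python. The sector size and sectors-per-cluster values come from the text; the figure of six bytes per word (five letters plus a space) is an assumption used only to show where an estimate of about 680 words could come from.

```python
SECTOR_SIZE = 512          # bytes per sector, as stated in the text
SECTORS_PER_CLUSTER = 8    # NTFS default cited in the text
BYTES_PER_WORD = 6         # assumption: five letters plus one space


def cluster_size(sectors_per_cluster=SECTORS_PER_CLUSTER, sector_size=SECTOR_SIZE):
    """Return the size of one cluster in bytes."""
    return sectors_per_cluster * sector_size


size = cluster_size()
approx_words = size // BYTES_PER_WORD
print(size, approx_words)
```

Under these assumptions the cluster is 4,096 bytes and holds roughly 682 plain-text words, in line with the estimate above.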
Any cluster currently assigned to a file is considered allocated. Figure 4 below represents an excerpt of an allocated cluster. The first column indicates the byte position within the cluster, the center section contains the data in hexadecimal format, and the last column shows the textual representation of the hexadecimal code. While Figure 4 is a common way for computer forensic examiners to view clusters of a hard drive, all information is actually stored in binary format (1’s and 0’s).
Figure 4: An Allocated 4096 Byte Cluster with 45 Bytes of Data
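The three-column view described above (byte position, hexadecimal values, and text) can be reproduced with a short Python function. This is a minimal sketch: the name hex_dump, the 16-bytes-per-line layout, and the sample text are assumptions for illustration and are not the contents of Figure 4.

```python
def hex_dump(data, width=16):
    """Return a list of lines showing offset, hex bytes, and printable text."""
    lines = []
    pad = width * 3 - 1  # width of a full row of hex pairs separated by spaces
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        # Non-printable bytes are shown as a dot, as forensic viewers commonly do
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08X}  {hex_part.ljust(pad)}  {text_part}")
    return lines


sample = b"Sample text stored in an allocated cluster."  # hypothetical data
for line in hex_dump(sample):
    print(line)
```

At an assumed 16 bytes per line, 45 bytes of data occupy three lines of such a view.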
Any cluster that is not currently assigned to a file is referred to as unallocated. Figure 4 could represent an empty cluster if the first three lines were all zeros. Because a cluster can, and often does, contain data even when not currently assigned to a file, Figure 4 could also represent an unallocated cluster that still contains data from a file to which the cluster was previously allocated. In many cases, complete files can be recovered from unallocated space.
Let’s examine the difference between logical file size and physical file size. Most operating systems, including Windows, keep track of the exact size of a file in bytes. This is the logical size of the file and is the number that you see in the directory listing. In Figure 4, the logical file size is 45 bytes. The physical size of a file is based on the number of clusters allocated to the file. Since one cluster is the smallest unit of space that can be allocated to any given file, the physical size of the file represented by Figure 4 is 4096 bytes (some special NTFS conditions are ignored for simplicity). If a file contained 4097 bytes of data, its logical size would be 4097 bytes, but its physical size would be 8192 bytes.
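The relationship between logical and physical size is a simple rounding-up to whole clusters. The sketch below assumes 4,096-byte clusters and, like the text, ignores the special NTFS conditions; the function name is hypothetical.

```python
def physical_size(logical_size, cluster_size=4096):
    """Round a logical file size (in bytes) up to a whole number of clusters."""
    clusters_needed = (logical_size + cluster_size - 1) // cluster_size
    return clusters_needed * cluster_size


for logical in (45, 4096, 4097):
    print(logical, physical_size(logical))
```

The 45-byte file occupies 4,096 bytes on disk, and a single extra byte beyond 4,096 pushes the physical size to 8,192 bytes.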
The area from the end of the logical file to the end of the cluster is called slack space. Slack space can be further divided into two pieces. When data is written to a hard drive, it is written in blocks of 512 bytes, or one sector. If only one byte needs to be written, the operating system must add another 511 bytes in order to write 512 bytes. Historically, these additional 511 bytes were simply taken from whatever happened to be in memory (RAM) at the time. Due to security concerns, this area is now filled with zeros. Because of this history, the portion of the slack space from the end of the logical file to the end of the sector (not the cluster) was called RAM slack. More recently, the term sector slack has been used; both refer to the same portion of the slack space. The remainder of the slack space, from the end of sector slack to the end of the cluster, is called file slack.
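The split between sector slack and file slack follows directly from the sector and cluster sizes. The following sketch assumes 512-byte sectors and eight sectors per cluster; the function name is hypothetical.

```python
def slack_breakdown(logical_size, sector_size=512, sectors_per_cluster=8):
    """Return (sector_slack, file_slack) in bytes for a file of logical_size bytes."""
    cluster_size = sector_size * sectors_per_cluster
    physical = ((logical_size + cluster_size - 1) // cluster_size) * cluster_size
    total_slack = physical - logical_size
    # Bytes from the end of the file to the end of its last sector
    sector_slack = (-logical_size) % sector_size
    # Whole sectors left over between that sector and the end of the cluster
    file_slack = total_slack - sector_slack
    return sector_slack, file_slack


print(slack_breakdown(1))    # the one-byte example above
print(slack_breakdown(281))  # a 281-byte file, such as the one in Figure 5
```

A one-byte file leaves 511 bytes of sector slack and 3,584 bytes of file slack; a 281-byte file leaves 231 bytes of sector slack and the same 3,584 bytes of file slack.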
Figure 5: A Single Cluster File
Once a file is deleted, its clusters are available for reallocation to another file. In Figure 5, we see that the new data, shown in blue, doesn’t reach the end of the cluster. The first 281 bytes contain the new file, the next 231 bytes are sector slack filled with zeros, and the file slack, shown in red, runs from byte 512 to byte 4095.
An everyday example using a VCR can also illustrate slack space. You put a new two-hour tape into your VCR and tape your favorite one-hour legal drama. The remaining hour of blank tape is analogous to slack space. After you’ve watched your one-hour show, you rewind the tape and then tape a thirty-minute sitcom. After watching the sitcom, you notice the last half of the first show is still there. In this case, the last 30 minutes of the legal drama and the hour of blank tape equate to slack space.
As you continue to reuse tapes, most of them will eventually contain parts of previous recordings. The same is true of hard drives. Tens of thousands of files on most hard drives do not fill their final cluster, and thus that cluster can contain parts of previous files. Because of file slack and previously allocated clusters, the potential exists for a very large amount of valuable information to be found.
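The tape analogy can be made concrete with a toy model of a single 4,096-byte cluster. The sketch below is a simplification under stated assumptions: the helper write_file, the 3,000-byte old file, and the one-cluster "disk" are hypothetical, and real file systems track allocation in their own structures.

```python
CLUSTER_SIZE = 4096
SECTOR_SIZE = 512


def write_file(cluster, data):
    """Write data at the start of the cluster, one whole sector at a time.

    The unused tail of the last sector written is zero-filled (sector slack).
    Sectors beyond it are left untouched.
    """
    sectors = (len(data) + SECTOR_SIZE - 1) // SECTOR_SIZE
    end = sectors * SECTOR_SIZE
    cluster[:end] = data + bytes(end - len(data))


cluster = bytearray(CLUSTER_SIZE)        # a brand-new, empty cluster
write_file(cluster, b"A" * 3000)         # hypothetical old file

# "Deleting" the old file only marks the cluster as available for reuse;
# the bytes themselves stay where they are.

write_file(cluster, b"B" * 281)          # new, shorter file reuses the cluster

sector_slack = bytes(cluster[281:512])
file_slack = bytes(cluster[512:])
leftover = file_slack.count(b"A")        # bytes of the old file still present
print(leftover)
```

In this model the sector slack is all zeros, while the file slack still holds 2,488 bytes of the old file (bytes 512 through 2,999).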
3. The Application Layer
Due to the sheer number of available software applications, it is not possible to cover them all. Generically speaking, one of the main purposes of an application is to add structure to the information stored on a hard drive. Applications enhance the capabilities of the file system by allowing for formatting, organization, insertions, and in-line modifications to the information stored within the clusters.
With few exceptions, applications will, at some point, ask the Operating System for hard drive space to use for the storage of information. Because applications vary widely in how they utilize the space allocated to them, each application brings with it the potential for a different method of recovering its information; Outlook email databases are one example of this. Additionally, many applications store additional, oftentimes hidden, data that can add valuable evidence to any case, such as “track changes” in Microsoft Word.
While information can be reviewed from the thousands of files contained within the computer hard drive, file slack and unallocated space are two of the most important sources of information in most computer forensic investigations. Consider these numbers obtained from an average laptop computer’s 80GB hard drive:
21,195,751 Total Clusters
13,762,379 Unallocated Clusters (All with the potential for deleted data.)
109,382 Files Saved on Disk (All with the potential for file slack.)
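A quick calculation puts these figures in proportion. The sketch below uses only the three numbers listed above; the variable names are hypothetical.

```python
TOTAL_CLUSTERS = 21_195_751
UNALLOCATED_CLUSTERS = 13_762_379
FILES_ON_DISK = 109_382

allocated_clusters = TOTAL_CLUSTERS - UNALLOCATED_CLUSTERS
unallocated_share = UNALLOCATED_CLUSTERS / TOTAL_CLUSTERS

print(f"Allocated clusters: {allocated_clusters:,}")
print(f"Unallocated share:  {unallocated_share:.1%}")
```

On this drive, roughly 65 percent of all clusters are unallocated, leaving 7,433,372 allocated clusters.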
As computers continue to become more prominent in our everyday lives and as computer hard drives continue to grow larger, there is an ever-growing potential for finding valuable evidence to support your case. Remember, what’s hidden in the depths of a computer will show the true nature of its owner.
Proper analysis of this overabundance of information requires more than just a basic understanding of computers and the simple ability to recover data. Make sure your Computer Forensic Expert is up to the challenge of recovering, correlating, and analyzing this plethora of information.