Peter Gutmann
Department of Computer Science
University of Auckland
pgut001@cs.auckland.ac.nz
This paper was first published in the Sixth USENIX Security Symposium
Proceedings, San Jose, California, July 22-25, 1996
Abstract
With the use of increasingly sophisticated encryption systems, an attacker
wishing to gain access to sensitive data is forced to look elsewhere for
information. One avenue of attack is the recovery of supposedly erased data
from magnetic media or random-access memory. This paper covers some of the
methods available to recover erased data and presents schemes to make this
recovery significantly more difficult.
1. Introduction
Much research has gone into the design of highly secure encryption systems
intended to protect sensitive information. However, work on methods of securing
(or at least safely deleting) the original plaintext form of the encrypted data
against sophisticated new analysis techniques seems difficult to find. In the
1980's some work was done on the recovery of erased data from magnetic media
[1] [2] [3], but to date the
main source of information is government standards covering the destruction of
data. There are two main problems with these official guidelines for
sanitizing media. The first is that they are often somewhat old and may
predate newer techniques for both recording data on the media and for
recovering the recorded data. For example most of the current guidelines on
sanitizing magnetic media predate the early-90's jump in recording densities,
the adoption of sophisticated channel coding techniques such as PRML, the use
of magnetic force microscopy for the analysis of magnetic media, and recent
studies of certain properties of magnetic media recording such as the behaviour
of erase bands. The second problem with official data destruction standards is
that the information in them may be partially inaccurate in an attempt to fool
opposing intelligence agencies (which is probably why a great many guidelines
on sanitizing media are classified). By deliberately under-stating the
requirements for media sanitization in publicly-available guides, intelligence
agencies can preserve their information-gathering capabilities while at the
same time protecting their own data using classified techniques.
This paper represents an attempt to analyse the problems inherent in trying to
erase data from magnetic disk media and random-access memory without access to
specialised equipment, and suggests methods for ensuring that the recovery of
data from these media can be made as difficult as possible for an attacker.
2. Methods of Recovery for Data stored on Magnetic Media
Magnetic force microscopy (MFM) is a recent technique for imaging magnetization
patterns with high resolution and minimal sample preparation. The technique is
derived from scanning probe microscopy (SPM) and uses a sharp magnetic tip
attached to a flexible cantilever placed close to the surface to be analysed,
where it interacts with the stray field emanating from the sample. An image of
the field at the surface is formed by moving the tip across the surface and
measuring the force (or force gradient) as a function of position. The
strength of the interaction is measured by monitoring the position of the
cantilever using an optical interferometer or tunnelling sensor.
Magnetic force scanning tunneling microscopy (STM) is a more recent variant of
this technique which uses a probe tip typically made by plating pure nickel
onto a prepatterned surface, peeling the resulting thin film from the substrate
it was plated onto and plating it with a thin layer of gold to minimise
corrosion, and mounting it in a probe where it is placed at some small bias
potential (typically a few tenths of a nanoamp at a few volts DC) so that
electrons from the surface under test can tunnel across the gap to the probe
tip (or vice versa). The probe is scanned across the surface to be analysed as
a feedback system continuously adjusts the vertical position to maintain a
constant current. The image is then generated in the same way as for MFM
[4] [5]. Other techniques which have been used in the past to analyse
magnetic media are the use of ferrofluid in combination with optical
microscopes (which, at gigabit/square inch recording densities, is no longer
feasible as the magnetic features are smaller than the wavelength of
visible light) and a number of exotic techniques which require significant
sample preparation and expensive equipment. In comparison, MFM can be
performed through the protective overcoat applied to magnetic media, requires
little or no sample preparation, and can produce results in a very short
time.
Even for a relatively inexperienced user the time to start getting images of
the data on a drive platter is about 5 minutes. To start getting useful images
of a particular track requires more than a passing knowledge of disk formats,
but these are well-documented, and once the correct location on the platter is
found a single image would take approximately 2-10 minutes depending on the
skill of the operator and the resolution required. With one of the more
expensive MFM's it is possible to automate a collection sequence and
theoretically possible to collect an image of the entire disk by changing the
MFM controller software.
There are, from manufacturers' sales figures, several thousand SPM's in use in
the field today, some of which have special features for analysing disk drive
platters, such as vacuum chucks for standard disk drive platters along with
specialised modes of operation for magnetic media analysis. These SPM's can be
used with sophisticated programmable controllers and analysis software to allow
automation of the data recovery process. If commercially-available SPM's are
considered too expensive, it is possible to build a reasonably capable SPM for
about US$1400, using a PC as a controller [6].
Faced with techniques such as MFM, truly deleting data from magnetic media is
very difficult. The problem lies in the fact that when data is written to the
medium, the write head sets the polarity of most, but not all, of the magnetic
domains. This is partially due to the inability of the writing device to write
in exactly the same location each time, and partially due to the variations in
media sensitivity and field strength over time and among devices.
In conventional terms, when a one is written to disk the media records a one,
and when a zero is written the media records a zero. However, the actual effect
is closer to obtaining a 0.95 when a zero is overwritten with a one, and a 1.05
when a one is overwritten with a one. Normal disk circuitry is set up so that
both these values are read as ones, but using specialised circuitry it is
possible to work out what previous "layers" contained. The recovery of at least
one or two layers of overwritten data isn't too hard to perform by reading the
signal from the analog head electronics with a high-quality digital sampling
oscilloscope, downloading the sampled waveform to a PC, and analysing it in
software to recover the previously recorded signal. What the software does is
generate an "ideal" read signal and subtract it from what was actually read,
leaving as the difference the remnant of the previous signal. Since the analog
circuitry in a commercial hard drive is nowhere near the quality of the
circuitry in the oscilloscope used to sample the signal, the ability exists to
recover a lot of extra information which isn't exploited by the hard drive
electronics (although with newer channel coding techniques such as PRML
(explained further on) which require extensive amounts of signal processing,
the use of simple tools such as an oscilloscope to directly recover the data is
no longer possible).
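As a rough illustration of the subtraction step, the following Python sketch
uses a toy model in which each read level is the newly written bit plus a small
signed remnant of the bit it replaced. The 0.95 and 1.05 levels are the figures
quoted above; the corresponding levels for overwritten zeroes, the function
names, and the use of one value per bit (rather than a sampled analog waveform)
are assumptions made purely for the example.

```python
def simulate_read(old_bits, new_bits, remnant=0.05):
    # Toy model (assumption): read level = new bit plus a small signed
    # contribution from the bit it overwrote, e.g. 0 -> 1 reads as 0.95
    # and 1 -> 1 reads as 1.05.
    return [new + remnant * (2 * old - 1)
            for old, new in zip(old_bits, new_bits)]


def recover_previous(read_levels, threshold=0.5):
    # What normal drive circuitry does: threshold each level to 0 or 1.
    current = [1 if level > threshold else 0 for level in read_levels]
    # Subtract the "ideal" signal for the current data from what was read.
    residual = [level - bit for level, bit in zip(read_levels, current)]
    # The sign of the residual indicates the previous "layer".
    previous = [1 if r > 0 else 0 for r in residual]
    return current, previous


old = [0, 1, 1, 0, 1, 0, 0, 1]
new = [1, 1, 0, 0, 1, 0, 1, 1]
current, previous = recover_previous(simulate_read(old, new))
```

In this idealised setting current reproduces the new data and previous
reproduces the overwritten data. Real signals add noise, intersymbol
interference and head positioning effects, so the residual would need
considerably more processing than a sign test.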
Using MFM, we can go even further than this. During normal readback, a
conventional head averages the signal over the track, and any remnant
magnetization at the track edges simply contributes a small percentage of noise
to the total signal. The sampling region is too broad to distinctly detect the
remnant magnetization at the track edges, so that the overwritten data which is
still present beside the new data cannot be recovered without the use of
specialised techniques such as MFM or STM (in fact one of the "official" uses
of MFM or STM is to evaluate the effectiveness of disk drive servo-positioning
mechanisms) [7]. Most drives are capable of microstepping the
heads for internal diagnostic and error recovery purposes (typical error
recovery strategies consist of rereading tracks with slightly changed data
threshold and window offsets and varying the head positioning by a few percent
to either side of the track), but writing to the media while the head is
off-track in order to erase the remnant signal carries too much risk of making
neighbouring tracks unreadable to be useful (for this reason the microstepping
capability is made very difficult to access by external means).
These specialised techniques also allow data to be recovered from magnetic
media long after the read/write head of the drive is incapable of reading
anything useful. For example one experiment in AC erasure involved driving the
write head with a 40 MHz square wave with an initial current of 12 mA which was
dropped in 2 mA steps to a final level of 2 mA in successive passes, an order
of magnitude more than the usual write current which ranges from high microamps
to low milliamps. Any remnant bit patterns left by this erasing process were
far too faint to be detected by the read head, but could still be observed
using MFM [8].
Even with a DC erasure process, traces of the previously recorded signal may
persist until the applied DC field is several times the media coercivity [9].
Deviations in the position of the drive head from the original track may leave
significant portions of the previous data along the track edge relatively
untouched. Newly written data, present as wide alternating light and dark bands
in MFM and STM images, are often superimposed over previously recorded data
which persists at the track edges. Regions where the old and new data coincide
create continuous magnetization between the two. However, if the new
transition is out of phase with the previous one, a few microns of erase band
with no definite magnetization are created at the juncture of the old and new
tracks. The write field in the erase band is above the coercivity of the media
and would change the magnetization in these areas, but its magnitude is not
high enough to create new well-defined transitions. One experiment involved
writing a fixed pattern of all 1's with a bit interval of 2.5 µm, moving the
write head off-track by approximately half a track width, and then writing the
pattern again with a frequency slightly higher than that of the previously
recorded track for a bit interval of 2.45 µm to create all possible phase
differences between the transitions in the old and new tracks. Using a 4.2 µm
wide head produced an erase band of approximately 1 µm in width when the old
and new tracks were 180° out of phase, dropping to almost nothing when the two
tracks were in-phase. Writing data at a higher frequency with the original
track's bit interval at 0.5 µm and the new track's bit interval at 0.49 µm allows
a single MFM image to contain all possible phase differences, showing a
dramatic increase in the width of the erase band as the two tracks move from
in-phase to 180° out of phase [10].
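The effect of the two slightly different bit intervals can be checked with a
short calculation. The sketch below computes the distance along the track over
which the new pattern gains exactly one transition on the old one, so that the
transitions pass through every relative offset once. The function name is
hypothetical, and treating the two tracks as ideal, perfectly periodic patterns
is an assumption.

```python
from fractions import Fraction


def slip_distance(old_interval, new_interval):
    """Distance (same units as the inputs) over which the new track gains
    one full transition on the old track: L/new - L/old = 1."""
    old = Fraction(old_interval)
    new = Fraction(new_interval)
    return float(old * new / (old - new))


low_density = slip_distance("2.5", "2.45")    # bit intervals in micrometres
high_density = slip_distance("0.5", "0.49")
```

Under these assumptions the slip distance is 122.5 µm for the 2.5/2.45 µm
patterns and 24.5 µm for the 0.5/0.49 µm patterns. The second figure is five
times shorter, which is consistent with the remark that a single image can then
contain all possible phase differences.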
In addition, the new track width can exhibit modulation which depends on the
phase relationship between the old and new patterns, allowing the previous data
to be recovered even if the old data patterns themselves are no longer
distinct. The overwrite performance also depends on the position of the write
head relative to the originally written track. If the head is directly aligned
with the track, overwrite performance is relatively good; as the head moves
offtrack, the performance drops markedly as the remnant components of the
original data are read back along with the newly-written signal. This effect is
less noticeable as the write frequency increases due to the greater attenuation
of the field with distance [11].
When all the above factors are combined it turns out that each track contains
an image of everything ever written to it, but that the contribution from each
"layer" gets progressively smaller the further back it was made. Intelligence
organisations have a lot of expertise in recovering these palimpsestuous
images.
3. Erasure of Data stored on Magnetic Media
The general concept behind an overwriting scheme is to flip each magnetic
domain on the disk back and forth as much as possible (this is the basic idea
behind degaussing) without writing the same pattern twice in a row. If the
data was encoded directly, we could simply choose the desired overwrite pattern
of ones and zeroes and write it repeatedly. However, disks generally use some
form of run-length limited (RLL) encoding, so that adjacent ones are never
written. This encoding is used to ensure that transitions aren't placed too
closely together, or too far apart, which would mean the drive would lose track
of where it was in the data.
To erase magnetic media, we need to overwrite it many times with alternating
patterns in order to expose it to a magnetic field oscillating fast enough that
it does the desired flipping of the magnetic domains in a reasonable amount of
time. Unfortunately, there is a complication in that we need to saturate the
disk surface to the greatest depth possible, and very high frequency signals
only "scratch the surface" of the magnetic medium. Disk drive manufacturers,
in trying to achieve ever-higher densities, use the highest possible
frequencies, whereas we really require the lowest frequency a disk drive can
produce. Even this is still rather high. The best we can do is to use the
lowest frequency possible for overwrites, to penetrate as deeply as possible
into the recording medium.
The write frequency also determines how effectively previous data can be
overwritten due to the dependence of the field needed to cause magnetic
switching on the length of time the field is applied. Tests on a number of
typical disk drive heads have shown a difference of up to 20 dB in overwrite
performance when data recorded at 40 kFCI (flux changes per inch), typical of
recent disk drives, is overwritten with a signal varying from 0 to 100 kFCI.
The best average performance for the various heads appears to be with an
overwrite signal of around 10 kFCI, with the worst performance being at 100
kFCI [12]. The track write width is also affected by the
write frequency - as the frequency increases, the write width decreases for
both MR and TFI heads. In [13] there was a decrease in write
width of around 20% as the write frequency was increased from 1 to 40 kFCI,
with the decrease being most marked at the high end of the frequency range.
However, the decrease in write width is balanced by a corresponding increase in
the two side-erase bands so that the sum of the two remains nearly constant
with frequency and equal to the DC erase width for the head. The media
coercivity also affects the width of the write and erase bands, with their
width dropping as the coercivity increases (this is one of the explanations for
the ever-increasing coercivity of newer, higher-density drives).
To try to write the lowest possible frequency we must determine what decoded
data to write to produce a low-frequency encoded signal.
In order to understand the theory behind the choice of data patterns to write,
it is necessary to take a brief look at the recording methods used in disk
drives. The main limit on recording density is that as the bit density is
increased, the peaks in the analog signal recorded on the media are read at a
rate which may cause them to appear to overlap, creating intersymbol
interference which leads to data errors. Traditional peak detector read
channels try to reduce the possibility of intersymbol interference by coding
data in such a way that the analog signal peaks are separated as far as
possible. The read circuitry can then accurately detect the peaks (actually
the head itself only detects transitions in magnetisation, so the simplest
recording code uses a transition to encode a 1 and the absence of a transition
to encode a 0. The transition causes a positive/negative peak in the head
output voltage (thus the name "peak detector read channel"). To recover the
data, we differentiate the output and look for the zero crossings). Since a
long string of 0's will make clocking difficult, we need to set a limit on the
maximum consecutive number of 0's. The separation of peaks is implemented as
some form of run-length-limited, or RLL, coding.
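The peak detection step described above (differentiate the head output and look
for zero crossings) can be sketched in a few lines of Python. The Gaussian
pulse shape, the amplitude gate used to ignore the flat regions between pulses,
and all function and parameter names are assumptions chosen for the example
rather than a description of any real read channel.

```python
import math


def read_signal(transitions, n_samples, width=2.0):
    # Each transition produces a pulse in the head output, with the
    # polarity alternating from one transition to the next.
    signal = [0.0] * n_samples
    polarity = 1.0
    for pos in transitions:
        for i in range(n_samples):
            signal[i] += polarity * math.exp(-((i - pos) / width) ** 2)
        polarity = -polarity
    return signal


def detect_peaks(signal, gate=0.5):
    # Discrete derivative of the head output.
    deriv = [signal[i + 1] - signal[i] for i in range(len(signal) - 1)]
    peaks = []
    for i in range(1, len(deriv)):
        crosses_zero = deriv[i - 1] * deriv[i] < 0
        # The gate rejects zero crossings where there is no real pulse.
        if crosses_zero and abs(signal[i]) > gate:
            peaks.append(i)
    return peaks


positions = [10, 30, 40, 70]
found = detect_peaks(read_signal(positions, 80))
```

With well-separated pulses the detected peaks coincide with the transition
positions. As the pulses are moved closer together they begin to overlap and
the detected positions shift, which is the intersymbol interference problem
that the coding constraints are intended to avoid.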
The RLL encoding used in most current drives is described by pairs of
run-length limits (d, k), where d is the minimum number of 0
symbols which must occur between each 1 symbol in the encoded data, and
k is the maximum. The parameters (d, k) are chosen to place
adjacent 1's far enough apart to avoid problems with intersymbol interference,
but not so far apart that we lose synchronisation.
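A (d, k) constraint is straightforward to check mechanically. The following
sketch measures the runs of 0 symbols between consecutive 1 symbols in an
encoded bit string and tests them against the limits. The function names are
hypothetical, and runs before the first 1 or after the last 1 are ignored for
simplicity.

```python
def run_lengths(encoded):
    """Lengths of the runs of 0s between consecutive 1s in a bit string."""
    ones = [i for i, bit in enumerate(encoded) if bit == "1"]
    return [b - a - 1 for a, b in zip(ones, ones[1:])]


def satisfies_rll(encoded, d, k):
    """True if every run of 0s between two 1s is at least d and at most k."""
    return all(d <= run <= k for run in run_lengths(encoded))


ok_27 = satisfies_rll("1001000100000001", 2, 7)   # runs of 2, 3 and 7
bad_27 = satisfies_rll("1010001", 2, 7)           # contains a run of 1
```

Here the first string meets the (2,7) constraint, while the second violates it
because two 1 symbols are separated by only a single 0.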
The grandfather of all RLL codes was FM, which wrote one user data bit followed
by one clock bit, so that a 1 bit was encoded as two transitions (1 wavelength)
while a 0 bit was encoded as one transition (½ wavelength). A different
approach was taken in modified FM (MFM), which suppresses the clock bit except
between adjacent 0's (the ambiguity in the use of the term MFM is unfortunate.
From here on it will be used to refer to modified FM rather than magnetic force
microscopy). Taking three example sequences 0000, 1111, and 1010, these will be
encoded as 0(1)0(1)0(1)0, 1(0)1(0)1(0)1, and 1(0)0(0)1(0)0 (where the ()s are
the clock bits inserted by the encoding process). The maximum time between 1
bits is now three 0 bits (so that the peaks are no more than four encoded time
periods apart), and there is always at least one 0 bit (so that the peaks in
the analog signal are at least two encoded time periods apart), resulting in a
(1,3) RLL code. (1,3) RLL/MFM is the oldest code still in general use today,
but is only really used in floppy drives which need to remain
backwards-compatible.
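The clock-bit rule for modified FM can be written down directly. This sketch
inserts a clock bit between each pair of data bits, setting it to 1 only when
both neighbouring data bits are 0, and marks the clock bits with parentheses as
in the examples above. The function name is hypothetical, and the clock bit
that would precede the first data bit is omitted because it depends on the
preceding data.

```python
def mfm_encode(data):
    """Encode a string of data bits as MFM, showing clock bits in ()s."""
    out = []
    for i, bit in enumerate(data):
        if i > 0:
            # The clock bit is 1 only between two adjacent 0 data bits.
            clock = "1" if data[i - 1] == "0" and bit == "0" else "0"
            out.append("(" + clock + ")")
        out.append(bit)
    return "".join(out)


examples = {bits: mfm_encode(bits) for bits in ("0000", "1111", "1010")}
```

The three entries in examples reproduce the encodings given in the text for
0000, 1111 and 1010.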
These constraints help avoid intersymbol interference, but the need to separate
the peaks reduces the recording density and therefore the amount of data which
can be stored on a disk. To increase the recording density, MFM was gradually
replaced by (2,7) RLL (the original "RLL" format), and that in turn by (1,7)
RLL, each of which placed fewer constraints on the recorded signal.
Using our knowledge of how the data is encoded, we can now choose which decoded
data patterns to write in order to obtain the desired encoded signal. The
three encoding methods described above cover the vast majority of magnetic disk
drives. However, each of these has several possible variants. With MFM, only
one is used with any frequency, but the newest (1,7) RLL code has at least half
a dozen variants in use. For MFM with at most four bit times between
transitions, the lowest write frequency possible is attained by writing the
repeating decoded data patterns 1010 and 0101. These have a 1 bit every other
"data" bit, and the intervening "clock" bits are all 0. We would also like
patterns with every other clock bit set to 1 and all others set to 0, but these
are not possible in the MFM encoding (such "violations" are used to generate
special marks on the disk to identify sector boundaries). The best we can do
here is three bit times between transitions, which is generated by repeating
the decoded patterns 100100, 010010 and 001001. We should use several passes
with these patterns, as MFM drives are the oldest, lowest-density drives around
(this is especially true for the very-low-density floppy drives). As such,
they are the easiest to recover data from with modern equipment and we need to
take the most care with them.
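The transition spacings claimed for these patterns can be verified by encoding
a few repetitions of each pattern and measuring the distance between
consecutive 1 bits in the encoded stream. The sketch below uses the MFM rule
described earlier (a clock bit of 1 only between two adjacent 0 data bits); the
function names and the choice of four repetitions are arbitrary assumptions.

```python
def mfm_encode_raw(data):
    """MFM-encode a string of data bits into a plain encoded bit string."""
    encoded = [data[0]]
    for prev, bit in zip(data, data[1:]):
        # Clock bit: 1 only between two adjacent 0 data bits.
        encoded.append("1" if prev == "0" and bit == "0" else "0")
        encoded.append(bit)
    return "".join(encoded)


def transition_spacings(pattern, repeats=4):
    """Distinct spacings (in encoded bit times) between transitions."""
    encoded = mfm_encode_raw(pattern * repeats)
    ones = [i for i, bit in enumerate(encoded) if bit == "1"]
    return sorted(set(b - a for a, b in zip(ones, ones[1:])))


four_bit_times = [transition_spacings(p) for p in ("1010", "0101")]
three_bit_times = [transition_spacings(p)
                   for p in ("100100", "010010", "001001")]
```

Each of the first two patterns yields a single spacing of four bit times, and
each of the last three yields a single spacing of three bit times, in agreement
with the figures above. For comparison, repeating 0000 or 1111 gives a spacing
of only two bit times, which is the highest frequency the code can produce.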
From MFM we jump to the next simplest case, which is (1,7) RLL. Although there
can be as many as 8 bit times between transitions, the lowest sustained
frequency we can have in practice is 6 bit times between transitions. This is a
desirable property from the point of view of the clock-recovery circuitry, and
all (1,7) RLL codes seem to have this property. We now need to find a way to
write the desired pattern without knowing the particular (1,7) RLL code used.
We can do this by looking at the way the drive's error-correction system works.
The error-correction is applied to the decoded data, even though errors
generally occur in the encoded data. In order to make this work well, the data
encoding should have limited error amplification, so that an erroneous encoded
bit should affect only a small, finite number of decoded bits.
Decoded bits therefore depend only on nearby encoded bits, so that a repeating
pattern of encoded bits will correspond to a repeating pattern of decoded bits.
The repeating pattern of encoded bits is 6 bits long. Since the rate of the
code is 2/3, this corresponds to a repeating pattern of 4 decoded bits. There
are only 16 possibilities for this pattern, making it feasible to write all of
them during the erase process. So to achieve good overwriting of (1,7) RLL
disks, we write the patterns 0000, 0001, 0010, 0011, 0100, 0101, 0110, 0111,
1000, 1001, 1010, 1011, 1100, 1101, 1110, and 1111. These patterns also
conveniently cover two of the ones needed for MFM overwrites, although we
should add a few more iterations of the MFM-specific patterns for the reasons
given above.
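Since each of the 16 patterns is four bits long, repeating a pattern twice
gives a single byte value which can be used to fill an overwrite buffer. The
sketch below generates these byte values and a buffer for one pass. The
function names and the buffer length are assumptions for the example, and the
code only builds the data in memory; it does not write to any device.

```python
def rll17_pass_bytes():
    """Byte values formed by repeating each 4-bit pattern 0000..1111 twice."""
    return [(nibble << 4) | nibble for nibble in range(16)]


def pass_buffer(byte_value, length):
    """Buffer of the given length filled with one repeating byte value."""
    return bytes([byte_value]) * length


# 6 encoded bits at a code rate of 2/3 correspond to 4 decoded bits.
decoded_bits = 6 * 2 // 3

pass_values = rll17_pass_bytes()
mfm_covered = [value for value in pass_values if value in (0x55, 0xAA)]
buffer_0x55 = pass_buffer(0x55, 512)
```

The resulting values run from 0x00, 0x11, 0x22 up to 0xFF. The values 0x55 and
0xAA are the repeating patterns 0101 and 1010, which are the two patterns
shared with the MFM case.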
Finally, we have (2,7) RLL drives. These are similar to MFM in that an
eight-bit-time signal can be written in some phases, but not all. A
six-bit-time signal will fill in the remaining cracks. Using a ½ encoding
rate, an eight-bit-time signal corresponds to a repeating pattern of 4 data
bits. The most common (2,7) RLL code is shown below: