Detecting and preventing file tampering
With each new legal case, office personnel must sort through a wide array of electronic evidence, including discovery from external sources such as opposing counsel. These documents are shared with the courts, experts and several other parties. How can you ensure that everyone is working with the exact same set of facts? How can you determine if any of these files were altered prior to arriving in your care?
Electronic files are typically shared by disc or download link using the honor system. Once these files have been dispersed, anyone can modify the contents to fit their narrative, and then distribute that version as an official exhibit.
Such changes can be difficult or impossible to trace to their source due
to the number of people with access to the files. To prevent this, you can identify each file by its unique Hash
value and then use that identifier to ensure file integrity once file
sharing has begun.
A Hash value is an electronic fingerprint constructed solely from the file’s contents and structure. The most common Hash is the 5th generation of the Message Digest
algorithm, commonly known as MD5. There are dozens of free programs to calculate MD5 values and,
regardless of the program used, the resulting MD5 value will always
match for exact copies of the same file.
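For readers who prefer to see the calculation itself, the following is a minimal Python sketch of what such a program does. It uses only the standard hashlib module; the function name md5_of_file and the chunk size are illustrative choices, not part of any tool named in this article.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the MD5 value of a file as a 32-character hex string."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large media files never need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Because the value depends only on the bytes in the file, an exact copy produces the same 32-character result no matter which program or computer calculates it.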
Two easy-to-use free MD5 programs are Digestit and Checksum. They support drag-and-drop ease of use and require less time than reading this paragraph. Other offerings, like the Microsoft™ File Checksum Integrity Verifier, are also free but can be cumbersome to use.
As soon as you receive or generate an original electronic item, use your preferred hash value program to calculate its MD5 value. You can then include a list of all the relevant MD5 values any time you share those files. This is often done as a read-only text file that you control. At any time, a file recipient can generate their own list of MD5 values and, if they match, be confident that their file versions are identical to the file versions under your control.
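The list-and-compare workflow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names build_manifest and compare_to_manifest, and the idea of keeping the list as a name-to-value mapping, are hypothetical and not taken from any particular product.

```python
import hashlib
from pathlib import Path


def md5_of_file(path):
    """Return the MD5 value of one file as a hex string."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's path (relative to folder) to its MD5 value."""
    folder = Path(folder)
    return {
        p.relative_to(folder).as_posix(): md5_of_file(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def compare_to_manifest(folder, manifest):
    """Report files that are missing, changed, or absent from the sender's list."""
    current = build_manifest(folder)
    return {
        "missing": sorted(set(manifest) - set(current)),
        "changed": sorted(
            name for name in manifest
            if name in current and current[name] != manifest[name]
        ),
        "unlisted": sorted(set(current) - set(manifest)),
    }
```

The sender would run build_manifest once and share the result; a recipient runs compare_to_manifest against their own copy of the folder, and three empty lists mean every file matches.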
Any changes, even the simple act of opening and resaving a file without any content changes, can alter the calculated MD5 value.
The chance of two different files sharing an MD5 value by accident is so small that you have significantly better odds of winning the Powerball lottery four times in a row. Best of all, MD5 values work on media files, presentations, documents, DVDs, ZIP files and anything else that can be shared electronically. Recording MD5 hash values is the easiest method to provide version control of all your electronic files.
NOTE: The MD5 algorithm is publicly documented, and researchers have deliberately constructed pairs of different files that share the same MD5 value (a “collision”). That is a different task from altering an existing file so that it still matches an MD5 value you have already recorded, for which no practical method is publicly known. Where the stakes justify extra assurance, record a newer hash such as SHA-256 alongside the MD5 value. MD5 remains useful because its support is built into common applications.
Once version control is in place, the next step is determining if your files were altered prior to receipt.
Although a full authentication analysis requires years of experience and sophisticated software,
some basic tests can be quickly performed using free software and minimal
guidance.
FILE AUTHENTICATION
It is well known that a computer’s operating system can display a
file's modified, accessed and created dates, collectively known as
the MAC dates. In Windows, this information is accessed by right
clicking on a file and then left clicking on the Properties option. However,
MAC information is saved by the computer's operating system and can
become inaccurate from several causes including user error, virus, file
transfer or free programs like FileDateCH. A more reliable date source is
reading the metadata hidden within the target file.
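To see why operating system dates deserve caution, here is a small Python sketch that reads the same three dates Windows shows under Properties. It relies only on the standard os and datetime modules; the function name mac_dates is a hypothetical label used for illustration.

```python
import os
from datetime import datetime, timezone


def mac_dates(path):
    """Return the operating system's dates for a file, in UTC."""
    info = os.stat(path)

    def to_utc(seconds):
        return datetime.fromtimestamp(seconds, tz=timezone.utc)

    return {
        "modified": to_utc(info.st_mtime),
        "accessed": to_utc(info.st_atime),
        # st_ctime has traditionally meant creation time on Windows, but
        # "last metadata change" on Unix-like systems, so label it with care.
        "created_or_changed": to_utc(info.st_ctime),
    }
```

The same standard library can also rewrite these dates: a single call such as os.utime(path, (accessed, modified)) resets the accessed and modified values to any moment the user chooses, which is why these dates alone prove very little.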
Some of this metadata can be read with the tools you already have. For example, with Windows you right click on the file, left click on Properties and then left click on the Details tab. You can often access metadata by simply choosing Properties under the File menu of whatever software you are using to open the electronic file. In the case of media files, MediaInfo lets you review every metadata field of most audio and video files. If these details deviate from the known case facts, they become a strong indication of file tampering.
It is increasingly common for file metadata to include the GPS coordinates of the recording location. You can enter these values directly into a Google™ search box to translate them into a street address. If the metadata includes a field labeled as “@xyz”, the third GPS coordinate is the altitude above sea level, denoted in meters. In many cases, these GPS coordinates are extremely accurate and can be precise enough to suggest a specific room of a high-rise building. This accuracy results from advances in GPS technology and the inclusion of signals from cell phone towers (aka Enhanced GPS). GPS data can make or break a case. For example, I recently had a case where a file’s GPS coordinates matched the address of someone with video editing skills, instead of the location depicted in the video.
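Location fields of this kind are often stored as one run-together string of signed numbers. The sketch below assumes that layout (latitude, longitude and an optional altitude, each with a leading plus or minus sign, ending in a slash); both the layout and the sample value in the usage note are assumptions made for illustration, so check them against what your metadata viewer actually displays.

```python
import re

# Assumed layout: +DD.DDDD-DDD.DDDD+AAA.AAA/  (altitude part optional)
_NUMBER = r"[+-]\d+(?:\.\d+)?"
_LOCATION = re.compile(rf"^({_NUMBER})({_NUMBER})({_NUMBER})?/?$")


def parse_location(text):
    """Split a combined location string into (latitude, longitude, altitude)."""
    match = _LOCATION.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized location string: {text!r}")
    latitude = float(match.group(1))
    longitude = float(match.group(2))
    altitude = float(match.group(3)) if match.group(3) else None
    return latitude, longitude, altitude


def search_text(latitude, longitude):
    """Format coordinates the way they can be typed into a map search box."""
    return f"{latitude}, {longitude}"
```

With the made-up value "+34.0522-118.2437+071.000/", parse_location returns a latitude of 34.0522, a longitude of -118.2437 and an altitude of 71.0 meters.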
The file's metadata often includes details about the software or equipment used to capture the recording, including the user settings in effect when the file was saved. If these details do not match how the file is claimed to have been created, then you may have strong evidence of after-the-fact file tampering. For example, if a video’s metadata lists the file as being in a Windows format, but the event was captured on an iPhone, then you can be confident that you are not looking at the original recording as saved by that iPhone. Metadata inconsistencies should always be examined. Copying, downloading and sharing a file will not alter a file's metadata or MD5 value.
AUDIO AUTHENTICATION
Audacity and Audition are extremely common audio editing programs. Each
program allows the user to isolate a given frequency for deeper
analysis. Editing may be detected by disruptions in the pulse of a given
frequency, shifts in the bit rate, or shifts in the DC portion of the
signal. All of these tests require specialized training to interpret the
significance of such anomalies.
Hex editors allow the user to examine raw file data. The information at
the beginning and end of a file can provide deep insight into the
programs that affected the underlying file data. Be.HexEditor
allows the user to view and modify any bit or byte of data.
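For a read-only look at the same raw data, the following Python sketch prints the first and last bytes of a file in the familiar hex editor layout. The function names hex_dump and head_and_tail and the 64-byte default are illustrative assumptions, not features of any program named here.

```python
def hex_dump(data, width=16):
    """Format bytes as offset, hex values and printable text, one row per line."""
    pad = width * 3 - 1  # room for "xx " per byte, minus the final space
    rows = []
    for offset in range(0, len(data), width):
        row = data[offset:offset + width]
        hex_part = " ".join(f"{byte:02x}" for byte in row).ljust(pad)
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in row)
        rows.append(f"{offset:08x}  {hex_part}  {text_part}")
    return "\n".join(rows)


def head_and_tail(path, count=64):
    """Return the first and last `count` bytes of a file without altering it."""
    with open(path, "rb") as handle:
        head = handle.read(count)
        handle.seek(0, 2)  # jump to the end to learn the file size
        size = handle.tell()
        handle.seek(max(size - count, 0))
        tail = handle.read(count)
    return head, tail
```

Because the file is opened in read-only mode, this kind of inspection does not change the file or its MD5 value.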
VIDEO AUTHENTICATION
When working with video, VLC is a free and indispensable program. Once you open a video with VLC, each press of the letter “e” on your keyboard advances the video to the next image and displays the text "Next frame". If pressing the “e” key makes the words appear, but the image does not change, then your video includes duplicate identical images. This is a strong indicator that your video is a later generation duplicate and not an exact copy of the original recording.
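The same check can be automated once the frames have been exported. The sketch below assumes another tool has already decoded the video into a sequence of frames, each supplied as raw bytes; that export step, and the function name find_repeated_frames, are assumptions for illustration and are not something VLC provides through this code.

```python
import hashlib


def find_repeated_frames(frames):
    """Return the positions of frames identical to the frame just before them."""
    repeats = []
    previous = None
    for index, frame in enumerate(frames):
        # Comparing fingerprints avoids keeping two full frames in memory.
        fingerprint = hashlib.md5(frame).digest()
        if fingerprint == previous:
            repeats.append(index)
        previous = fingerprint
    return repeats
```

An empty result means no frame exactly repeats its predecessor. A long list of positions is the automated equivalent of pressing “e” and seeing the picture stand still.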
Modern video compression methods slice videos into a mosaic of small
squares, typically eight pixels (screen dots) wide and eight pixels
tall. If some of the duplicate images show the people and objects in the exact same position, but a few
of these squares are slightly brighter (or darker) than seen in the
otherwise duplicate frame, then the file likely originated from remote viewing software instead of being an original recording. Remote
software captures are almost always an incomplete and lower quality representation of the original
video because they sacrifice quality rather than alter the playback
speed when dealing with limited internet speeds.
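The block-by-block comparison described above can be sketched as follows. The code assumes two frames of equal size that have already been converted to grayscale and supplied as rows of brightness numbers; that input format and the function name changed_blocks are hypothetical choices for illustration.

```python
def changed_blocks(frame_a, frame_b, block=8):
    """List the squares whose average brightness differs between two frames.

    Each result is (block_row, block_column, average_brightness_change).
    """
    height, width = len(frame_a), len(frame_a[0])
    changed = []
    for top in range(0, height, block):
        for left in range(0, width, block):
            bottom = min(top + block, height)
            right = min(left + block, width)
            total_a = sum(sum(frame_a[y][left:right]) for y in range(top, bottom))
            total_b = sum(sum(frame_b[y][left:right]) for y in range(top, bottom))
            if total_a != total_b:
                pixels = (bottom - top) * (right - left)
                changed.append(
                    (top // block, left // block, (total_b - total_a) / pixels)
                )
    return changed
```

If nearly every square matches and only a handful show a small brightness shift, the pair of frames fits the pattern described in the paragraph above.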
On-screen information is the last thing added to a video
before it is saved onto a recording device. If any video frame shows
two different video frames or time stamps blended together, then your video
is likely the creation of a screen
capture. This overlap results from the lag time of the viewing
computer's video not matching the lag time of the screen capture method being
used.
IMAGE AUTHENTICATION
Most image viewing programs include a “Properties” or "Info" menu option to display a picture’s metadata. Document and presentation programs typically include their own “Properties” or "Info" option to display metadata fields for their file types. If your image is in JPG format, JpegSnoop will display a long list of metadata fields and even indicate if there are signs of content tampering. This is especially important with JPG images since their structure makes image tampering nearly impossible to detect. It should also be noted that software (like analogexis) can alter every metadata field of JPG and TIFF images. Similar programs exist for other formats, including some raw and proprietary file types.
Often you can compare a file's metadata and/or header values to published tables (e.g. a Google search) to determine if that information matches the expected information. By comparing the file's header and/or metadata to known facts, you can quickly separate fact from fiction.
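A header comparison of this kind can be sketched in Python. The short table below lists a few widely published file signatures; it is deliberately incomplete, and the function name identify_header is a hypothetical label used only for this illustration.

```python
# A few widely published signatures found at the very start of a file.
SIGNATURES = [
    (b"\xff\xd8\xff", "JPEG image"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"GIF87a", "GIF image"),
    (b"GIF89a", "GIF image"),
    (b"%PDF-", "PDF document"),
    (b"PK\x03\x04", "ZIP container (also DOCX, XLSX, PPTX)"),
]


def identify_header(first_bytes):
    """Return the file type implied by the leading bytes, or None if unknown."""
    for magic, label in SIGNATURES:
        if first_bytes.startswith(magic):
            return label
    return None
```

If a file named exhibit.jpg begins with the PDF or ZIP signature instead of the JPEG one, its name and its contents disagree, and that is worth a follow-up question. A result of None only means the type is not in this short table.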
Steganography is the practice of hiding data within a host file, typically as text inside an image file. The hidden data cannot be detected using metadata or hex editor tools. OpenPuff makes it easy to extract hidden messages, passwords or even illegal information.
These are just a few simple tests to determine if further questions should be asked or a forensic expert consulted. Deeper authentication tests (spectrographic analysis, phase shift detection, changes in data depth, Principal Component Analysis, wavelet analysis, etc.) and visual analysis (reflections, clarity, shadows, object analysis, etc.) do require a far greater understanding of the science.
By incorporating MD5 values and basic authentication tests into your workflow, the
factual integrity of the evidence will be preserved.
By Douglas Carner