AUDIO VIDEO FORENSIC ANALYST - Career Preparation

As an audio video forensic analyst, the majority of your work will be to improve the perceived audio or visual clarity of a digital recording. However, your work will also include taking measurements, cross-referencing data, and handling everything in the most ethical manner in preparation for your expert testimony in the service of justice. You will be expected to understand industry best practices, stay atop innovative peer-reviewed technologies and methods, adapt existing knowledge to unexpected circumstances, and know the rules of evidence that apply to the case before you.

Let’s first discuss audio, where the frequency range of human hearing (measured in Hertz) exceeds that of speech by a factor of six. To reduce file size, digital recorders discard much of this unused headroom before saving the recording. This reduction is a balancing act because distortion will occur once the sampling rate falls below twice the highest frequency being recorded, or if too few bits represent each data point. The recording may be so loud (measured on the logarithmic decibel scale) that data has been clipped, or the recording may be saturated with reverberation. Additional losses in quality may occur due to the compression codec applied, regardless of the file’s container extension. Individually, such losses may not seem substantial, but collectively they can inhibit audio enhancement.
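The checks described above can be sketched in a few lines of Python. This is a minimal illustration, not a forensic tool: the function names are hypothetical, the samples are assumed to be signed-integer linear PCM, and the 3,400 Hz speech limit used in the usage note is an assumed telephony-style figure rather than a value taken from this article.

```python
import math


def nyquist_ok(sample_rate_hz, highest_freq_hz):
    """True when the sample rate is at least twice the highest frequency of interest."""
    return sample_rate_hz >= 2.0 * highest_freq_hz


def dynamic_range_db(bit_depth):
    """Theoretical dynamic range of linear PCM, about 6.02 dB per bit."""
    return 20.0 * math.log10(2 ** bit_depth)


def clipped_fraction(samples, bit_depth=16):
    """Fraction of signed-integer PCM samples sitting at positive or negative full scale."""
    if not samples:
        return 0.0
    full_scale_high = 2 ** (bit_depth - 1) - 1
    full_scale_low = -(2 ** (bit_depth - 1))
    hits = sum(1 for s in samples if s >= full_scale_high or s <= full_scale_low)
    return hits / len(samples)
```

For instance, nyquist_ok(8000, 3400) is true while nyquist_ok(6000, 3400) is not, and a high clipped_fraction is a hint that loud passages were flattened before you ever received the file.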

Such limitations may become obvious during critical listening, and they are the reason you should maintain a high-quality lossless format throughout your entire enhancement process to avoid introducing additional data defects. As you listen, you will gather clues regarding which processes and enhancement filters should be applied. For example, while you can attenuate a steady narrow-band parasitic sound with a notch filter, adaptive time-frequency filters would be the better choice to suppress unwanted dynamic sounds.
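To make the notch idea concrete, here is a minimal sketch of a fixed second-order notch filter using the widely published audio biquad formulas. The function names, the default Q of 5, and the 60 Hz hum in the usage note are all illustrative assumptions; a filter like this only suits a steady tone, which is exactly why dynamic sounds call for adaptive methods instead.

```python
import math


def notch_coefficients(notch_hz, sample_rate_hz, q=5.0):
    """Normalized biquad notch coefficients (b0, b1, b2), (a1, a2)."""
    w0 = 2.0 * math.pi * notch_hz / sample_rate_hz
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0)
    a = (-2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a


def apply_notch(samples, notch_hz, sample_rate_hz, q=5.0):
    """Filter a sequence of float samples and return a new list."""
    (b0, b1, b2), (a1, a2) = notch_coefficients(notch_hz, sample_rate_hz, q)
    x1 = x2 = y1 = y2 = 0.0  # previous inputs and outputs
    out = []
    for x0 in samples:
        y0 = b0 * x0 + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out
```

Calling apply_notch(samples, 60.0, 8000.0) on a working copy would suppress a steady 60 Hz hum while leaving content far from 60 Hz essentially untouched. A higher Q narrows the notch but lengthens the time the filter takes to settle.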

A similar logic applies to video recordings, but with a much more complex data set. For example, most surveillance video is made up of frames that each interlace two sequential moments in time. These moments (fields) must be separated, thus doubling the frame count and, because each field holds only half the lines, the aspect ratio. To achieve the greatest clarity you will want the proprietary recording, which preserves field integrity and minimizes compression losses. There are numerous peer-reviewed third-party programs that may be able to extract the video stream, which is typically compliant with the industry standard of using tokens to define motion and pixel blocks. Extraction creates an open format while retaining your path to the original metadata and ensuring the absence of post-production content tampering or steganography.
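Field separation itself is simple to sketch. The example below assumes each frame is a list of pixel rows and that the even-numbered rows belong to the earlier (top) field; real recordings may be bottom-field-first, so the field order is a parameter you would confirm for each recorder. The function names are hypothetical.

```python
def split_fields(frame):
    """Split one interlaced frame (a list of rows) into its top and bottom fields."""
    top = frame[0::2]      # rows 0, 2, 4, ...
    bottom = frame[1::2]   # rows 1, 3, 5, ...
    return top, bottom


def separate_fields(frames, top_field_first=True):
    """Return one half-height image per field, in temporal order."""
    fields = []
    for frame in frames:
        top, bottom = split_fields(frame)
        if top_field_first:
            fields.extend([top, bottom])
        else:
            fields.extend([bottom, top])
    return fields
```

Two interlaced frames of 480 rows become four fields of 240 rows each, which is the doubling of frame count described above.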

If extraction is not possible, and you must use some method to recapture the visual contents of a video, your frame count will almost certainly deviate from the one field/frame per moment methodology found in a native recording. This is a perfect example of Locard's exchange principle, where each action that affects something will leave some indication behind.

Computers allow analog recordings to become digital, and then digital files to be transformed into the frequency domain. For example, you could use an FFT domain filter to detect tampering or remove transitory specks that pollute a video. Alone, destructive processes can isolate unique data or artifacts, but collectively a forensic analysis can tell a more powerful story. For example, the ever-changing silent electrical network frequency (ENF) generated by our nation’s power grid has been documented for decades, and thus isolating that ENF from a recording can be used to determine the approximate when and where of a recording’s origination, and then those details can be compared to known case facts and the file’s metadata to determine evidentiary authenticity. Even the interfering noise embedded within a file can be used to identify the specific equipment or handling that produced the recording.
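As a rough illustration of isolating the ENF, the sketch below scans a narrow band around the nominal mains frequency and reports the frequency carrying the most energy in one block of audio. The 60 Hz nominal value, the search span, the step size, and the function names are assumptions made for this example (50 Hz grids exist elsewhere). A real ENF examination would repeat this over many short windows and compare the resulting trace against a logged grid reference.

```python
import math


def tone_power(samples, sample_rate_hz, freq_hz):
    """Energy of the samples at a single frequency (one bin of a discrete Fourier transform)."""
    w = 2.0 * math.pi * freq_hz / sample_rate_hz
    re = sum(s * math.cos(w * n) for n, s in enumerate(samples))
    im = sum(s * math.sin(w * n) for n, s in enumerate(samples))
    return re * re + im * im


def estimate_enf(samples, sample_rate_hz, nominal_hz=60.0, span_hz=0.5, step_hz=0.01):
    """Return the candidate frequency near nominal_hz with the highest energy."""
    steps = int(round(2.0 * span_hz / step_hz))
    candidates = [nominal_hz - span_hz + i * step_hz for i in range(steps + 1)]
    return max(candidates, key=lambda f: tone_power(samples, sample_rate_hz, f))
```

Longer windows give finer frequency discrimination but blur how the ENF drifts over time, so the window length is itself a choice you would document in your notes.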

Unlike analyzing ENF noise or measuring size with photogrammetry, the processes of enhancement and identification are considered an art form due to the absence of a measurable scientific error rate. This is not surprising since there is no magic bullet method applicable to tasks outside of data analysis, and the industry must provide its own oversight by objectively evaluating the expert’s qualifications and methods.

As an audio video forensic analyst, you are tasked with understanding the limitations of technology, and the opinions that can be formed. No enhancement can achieve the fantasy expectations depicted on television, which is why you want to apply a soft and realistic hand when attempting to enhance a recording. For example, while attempting to improve the clarity of subtle motion, avoid excessive sharpening of high energy details, or applying too much brightness, as you may actually destroy the details you were trying to improve. Even the simple task of opening a recording can alter its metadata, which is why you need to work with an exact copy.

Most audio or video content authentication tests are not conclusive in themselves, but instead provide clues that the analyst collectively uses when forming their final opinion. For example, Video Error Level Analysis (VELA) can potentially draw attention to a cropped video or a removed object, but it can also produce a false positive. An audio impulse present below the frequency range of the recording microphone could have originated from someone pausing the recording, or it could have been caused by an electrical issue. Such certainty limitations are why it is nearly impossible for opinions to be defined in absolutes.

As a forensic analyst, you are expected to understand the procedures of the rules of evidence applicable to the jurisdiction of your case. You will be expected to maintain data integrity through the use of hash values and/or chain-of-custody control, and to keep detailed notes of your activity on each case. Although you will communicate and work at the direction of whoever hires you, you work solely for the evidence and in accordance with the highest ethics. If you calculate the hash value for each file, then everyone can use this value to validate evidentiary integrity regardless of how those files are shared from that point forward.
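Calculating a hash value takes only a few lines with Python's standard hashlib module. This sketch assumes SHA-256 as the algorithm, which is an assumption for illustration since your lab or jurisdiction may specify a different one, and the function names are hypothetical.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks so large videos fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_exact_copy(original_path, copy_path):
    """True when both files produce the same digest."""
    return file_sha256(original_path) == file_sha256(copy_path)
```

Record the digest in your case notes when you receive the evidence and again before you begin work on your copy. Because the file is opened read-only and only its bytes are hashed, a renamed file still produces the same value.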

It is important that your report and testimony detail your tests and results, including those that may be in conflict with each other. You are given significant discretion as to which steps are performed and exhibits produced, just so long as you fully disclose and support your choices. If you want your expert work and opinions to survive a Frye challenge, they must be your own and, if you want your CV to survive a Daubert challenge, then it must support your qualifications to form those opinions. It is your job to draw opinions within your area of expertise, and it is the court that will determine if those opinions will be entered into the record.

Let’s say that you are hired to determine someone’s height and you only have a single camera view to work with. Within that video, you must find a reference object of a definable size (e.g. a doorway of known height that the subject walks through) and a video still depicting when the subject walks through that doorway. Multiply the height of that doorway by the pixel height of the person in your still, and then divide that result by the pixel height of the doorway to determine the actual height of the person in question.
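The proportion above can be written as a small function. The doorway and pixel measurements in the usage note are made-up values chosen only to show the arithmetic, and the function name is hypothetical.

```python
def estimate_height(reference_height, reference_pixels, subject_pixels):
    """Scale the subject's pixel height by the reference object's real-world size.

    The result is in the same unit as reference_height.
    """
    if reference_pixels <= 0 or subject_pixels <= 0:
        raise ValueError("pixel heights must be positive")
    return reference_height * subject_pixels / reference_pixels
```

With an 80 inch doorway that measures 120 pixels tall and a subject who measures 105 pixels, estimate_height(80, 120, 105) gives 70 inches, or 5 feet 10 inches. This only holds when the subject and the reference are at the same distance from the camera.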

The court will expect your work to be scientific, and for that you must include a margin-of-error. For the above example, this would mean dividing some constant by the calculated pixel height of the person. Thus, if you calculated the person as 5’10” with a ¼” margin-of-error, then you can be 68% confident (sigma 1, which is 1 standard deviation) that the person's height is between 5’9¾” and 5’10¼”, or 99.7% confident (sigma 3, which is 3 standard deviations) that it is between 5’9¼” and 5’10¾”.

This height example is a simplified scenario. You would need to apply corrective geometry if the reference object is not directly in line with the person (e.g. a nearby sign or wall). As for the constant, I use 24 to get an answer in inches. That number is based on over a hundred such cases, typical human heights, and a very conservative potential improvement in resolution from enhancement. As you advance in your career, your experience will lead to the development of new methods.
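The margin-of-error arithmetic can be sketched as follows. The sketch treats the result of dividing the constant by the pixel height as one standard deviation, which is an interpretation inferred from the 5’10” example rather than something stated outright, and the 96 pixel subject in the usage note is a hypothetical value chosen because 24 divided by 96 is exactly ¼ inch. The function names are also hypothetical.

```python
def margin_of_error_inches(subject_pixels, constant=24.0):
    """One-sigma margin in inches: the constant divided by the subject's pixel height."""
    if subject_pixels <= 0:
        raise ValueError("pixel height must be positive")
    return constant / subject_pixels


def height_range(height_inches, sigma_inches, n_sigma):
    """Lower and upper bounds at n_sigma standard deviations around the estimate."""
    spread = n_sigma * sigma_inches
    return (height_inches - spread, height_inches + spread)
```

For a subject measured at 96 pixels and 70 inches, margin_of_error_inches(96) is 0.25, height_range(70, 0.25, 1) is (69.75, 70.25), and height_range(70, 0.25, 3) is (69.25, 70.75). Note that fewer pixels on the subject means a wider range, which is one reason low-resolution footage supports weaker conclusions.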

Never forget that your opinions may deeply affect someone's life and your impartiality is critical. For this reason, you need to avoid forming a bias. If you are enhancing an audio file, do not read the transcript or get the expected wording until after your work is complete. If asked to clarify a face, use some other known object as your working reference. For example, when I was asked to enhance George Zimmerman's head, I instead enhanced the badge of the officer standing next to Zimmerman so as not to enhance to a preconception. Most importantly, if asked to support an indefensible position, consider walking away because your integrity is your most valuable asset and once it is gone, so is your credibility.

Link for the Audio Video Forensic Analyst:
Industry standard tamper - authentication testing
MAT  form
Additional reading (link)
Testing Certification   (coming soon)
Training (link)

Copyright © Forensic Protection