Introduction to APTO Processing

But First, a Word on Television Loudness

The introduction of compliance regulations and an overall increased awareness of television loudness control have generally been successful in taming the wild loudness shifts that startled and annoyed viewers in the earlier days of digital television.

Real-time processing is ubiquitous and effective, content creators now have a better understanding of how viewers watch (and listen to) their programs, and broadcasters have the necessary tools at their disposal to analyze and, if necessary, correct pre-recorded programs upon ingest in the file domain.

That said, some misconceptions about loudness control and regulations are still repeated as fact, and applying them is nearly always detrimental to the audio.

Speaking in the most general terms, most loudness regulations specify an average audio level for each program segment. Some are based on the loudness of the entire mix, while others are based on only an anchor element such as dialogue.

In either case, the regulations do not dictate that the audio be devoid of dynamic range nor do they mandate that levels constantly stay at the target value so that your LKFS/LUFS meter looks more like an FM modulation monitor at a major market radio station. Soft passages are allowed to stay soft and loud passages are permitted to pass through generally unaltered, so long as the average over the duration of the program segment is maintained and, where applicable, True Peak values are not exceeded.
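To make the point concrete, here is a minimal Python sketch of averaging loudness over a program segment. It is an illustration only: it assumes per-block loudness values are already available and simply energy-averages them, omitting the frequency weighting and gating that a real LKFS/LUFS meter applies. The function names and the -24 default target are choices made for this example, not part of any regulation or product.

```python
import math


def average_loudness(block_loudness_lkfs):
    """Energy-average per-block loudness values (simplified, no gating)."""
    if not block_loudness_lkfs:
        raise ValueError("need at least one block")
    # Convert each dB-scaled value to linear energy, average, convert back.
    energies = [10 ** (level / 10) for level in block_loudness_lkfs]
    return 10 * math.log10(sum(energies) / len(energies))


def offset_to_target(block_loudness_lkfs, target_lkfs=-24.0):
    """Static gain (in dB) that moves the segment average onto the target."""
    return target_lkfs - average_loudness(block_loudness_lkfs)


# A segment with a soft passage, a mid-level passage, and a loud passage.
segment = [-34.0, -24.0, -18.0]
offset = offset_to_target(segment)
shifted = [level + offset for level in segment]
```

Applying the single fixed offset places the segment average on the target, while the soft and loud passages keep the same 16 dB spread they had in the original.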

Just as audio can be over-processed, it can also be under-processed. Trying to watch a sporting event in a noisy bar or restaurant (or even the average living room) would be frustrating if presented with the amount of dynamic range appropriate for a cinematic presentation in a high-end, multi-channel home theater.

The proper balance is found in what is commonly called the “comfort zone,” that is, the range in which the viewer never strains to hear lower-level programming yet is never annoyed by excessively loud levels. This is sometimes described as the situation in which the viewer never feels the need to reach for the volume control on their television, mobile phone, or tablet.

What Is APTO?

The audio processing within ARC is performed by Linear Acoustic® APTO™, our latest and most advanced adaptive loudness control algorithm to date.

APTO ensures that ARC can deliver audio that is compliant with various television broadcast regulatory loudness standards including EBU R128, ATSC A/85, FreeTV OP59, ARIB TR-B32, and AGCOM 219/09/CSP as well as the AES-TD1006 streaming standard.

It also includes profiles for other non-broadcast delivery platforms and listening environments such as gaming, movies, earphones, and in-flight entertainment.

APTO enhances the listening experience by providing consistent audio levels within a user-defined “comfort zone,” thus eliminating listener annoyance due to sudden loudness shifts. It also improves dialogue intelligibility, addressing one of the most common complaints with television audio. Best of all, it does so without audibly affecting the sound quality and artistic intent of the original content.

What Makes APTO Different?

Traditional real-time television processing normally employs a series of wideband and/or multiband compressors or AGCs that react to changing input levels by increasing or decreasing gain in an effort to provide a more consistent output level. The threshold, ratio, attack, and release settings can be adjusted to determine the amount of dynamic range present at the output or, put another way, how closely the audio tracks the desired output level at any given time. A final look-ahead limiter is typically employed for peak control.
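The behavior of the threshold, ratio, attack, and release controls can be sketched with a textbook downward compressor. This is a generic illustration, not the design of any particular processor: the function names, default values, and the simple per-step smoothing coefficients are assumptions made for this example, and the upward gain of an AGC and the final limiter are left out.

```python
def static_gain_db(level_db, threshold_db=-24.0, ratio=3.0):
    """Gain change (dB) a downward compressor applies at a given input level."""
    over = level_db - threshold_db
    if over <= 0:
        return 0.0  # at or below threshold: no gain change
    # Above threshold, the output rises only 1 dB for every `ratio` dB of input.
    return over / ratio - over


def smooth_gain(target_gains_db, attack=0.5, release=0.1):
    """Follow the target gain quickly when reducing, slowly when recovering."""
    gain = 0.0
    smoothed = []
    for target in target_gains_db:
        # More gain reduction requested -> attack; otherwise -> release.
        coeff = attack if target < gain else release
        gain += coeff * (target - gain)
        smoothed.append(gain)
    return smoothed


# A loud burst (-12 dB) for two steps, then quiet (-30 dB) for two steps.
targets = [static_gain_db(level) for level in (-12.0, -12.0, -30.0, -30.0)]
gains = smooth_gain(targets)
```

A higher ratio or faster attack and release coefficients hold the output closer to the threshold, at the cost of the gain moving more often and more noticeably.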

One potential downside to this type of processing is that unless the audio falls below a specified gate threshold – the point at which low-level audio isn’t increased so as not to bring up background noise – the gain is always changing, whether it needs to or not. Furthermore, multiband processing by design re-balances the spectral mix of the program. This can be advantageous for helping to fix less-than-ideal mixes or if a spectrally consistent output is a priority, but it can also affect the artistic intent of well-mixed content.

In contrast, APTO focuses on achieving the following goals:

Ensuring that foreground sounds – particularly dialogue – remain intelligible at all times
Allowing the user to define a comfort zone within which no additional audio processing is applied, providing a more natural sense of dynamics
Maintaining the spectral balance of the original program material to preserve the artistic intent
Achieving and maintaining an output target level that is in compliance with global loudness regulations and is optimized for distribution platforms (such as streaming and on-demand services) and for specific devices and listening environments (including mobile phones and tablets)
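The comfort zone idea in the goals above can be pictured as a dead band: no gain change while the measured loudness sits inside the zone, and only as much correction as needed to reach the nearest edge when it falls outside. The sketch below is a hypothetical illustration of that concept, not APTO's actual algorithm; the function name and the zone limits are assumptions chosen for the example.

```python
def comfort_zone_gain_db(level_lkfs, zone_low=-30.0, zone_high=-20.0):
    """Gain (dB) needed to bring a level to the nearest comfort zone edge."""
    if level_lkfs < zone_low:
        return zone_low - level_lkfs   # too soft: raise to the lower edge
    if level_lkfs > zone_high:
        return zone_high - level_lkfs  # too loud: lower to the upper edge
    return 0.0                         # inside the zone: leave untouched


inside = comfort_zone_gain_db(-25.0)    # within the zone
too_soft = comfort_zone_gain_db(-40.0)  # below the zone
too_loud = comfort_zone_gain_db(-14.0)  # above the zone
```

Unlike a compressor whose gain is always moving, a dead band of this kind leaves the dynamics of in-zone material exactly as mixed.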

How Does APTO Work?

In very basic terms, APTO first measures and analyzes the loudness of the incoming audio. In its first processing stage – the “Dynamic Range” stage – it applies real-time loudness control to reduce the overall dynamic range and bring the levels within the user-defined comfort zone. This processed audio is then scaled in the second processing stage – the “Compliance” stage – to achieve an average output level that matches the desired target loudness value.
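The two-stage flow described above can be sketched offline on a list of per-block loudness values. This is a conceptual illustration under stated assumptions, not APTO's implementation: all function names, zone limits, and the target are hypothetical, the first stage is reduced to clamping levels to the zone edges, and the second stage computes its scaling from the whole list, which a real-time processor cannot do.

```python
import math


def energy_average(levels_lkfs):
    """Simplified average loudness: energy mean of per-block values."""
    energies = [10 ** (level / 10) for level in levels_lkfs]
    return 10 * math.log10(sum(energies) / len(energies))


def dynamic_range_stage(levels_lkfs, zone_low, zone_high):
    """Stage 1 (assumed): pull out-of-zone blocks to the nearest zone edge."""
    return [min(max(level, zone_low), zone_high) for level in levels_lkfs]


def compliance_stage(levels_lkfs, target_lkfs):
    """Stage 2 (assumed): one static offset so the average meets the target."""
    offset = target_lkfs - energy_average(levels_lkfs)
    return [level + offset for level in levels_lkfs]


def two_stage(levels_lkfs, zone_low=-30.0, zone_high=-20.0, target_lkfs=-24.0):
    reduced = dynamic_range_stage(levels_lkfs, zone_low, zone_high)
    return compliance_stage(reduced, target_lkfs)


program = [-45.0, -28.0, -22.0, -10.0]
output = two_stage(program)
```

In this sketch the first stage narrows a 35 dB spread to the 10 dB width of the zone, and the second stage shifts the result as a whole so its average lands on the target without changing that spread.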

As mentioned previously, ARC includes a host of factory profiles for various loudness standards and delivery platforms. Individual controls to fine-tune both processing stages are also available in the user interface and are described in greater detail in the later sections on Basic and Advanced APTO processing.