Managing audio loudness across multiple platforms

With the CALM Act now in effect, stations need strategies for managing audio loudness — preferably for a wide variety of broadcast platforms. This article details the loop spanning from production to multi-platform delivery, paving the way for high-quality audio across genres, platforms and borders.

As the number of listeners per stream goes down and the consumer environment becomes more dynamic and less predictable, broadcasters should consider five factors before committing to any change of station operating procedure: 1) Are we addressing listener concerns? 2) How well does a technique cater to the majority of the station’s programs? 3) Does it reduce content creation time? 4) How does the procedure facilitate cross-platform distribution? 5) Will the decision limit potential options, or will we retain the freedom to maneuver in the future?

This radar shows loudness history in broadcast production, compliant with ITU-R BS.1771 (upper screen shots), or “annoyance” in movie trailers as defined by TASA (lower screen shots). Bar-graph meters show true peak.

Because the same measurement may be applied in production, ingest, transmission and logging, a transparent loop can be established from production to multi-platform delivery. The loop may even be closed, with feedback from logging used to improve production step by step.

To support this transparent loop, Loudness Range (LRA) is a statistical tool for making objective mixing and processing decisions. It quantifies level variation in a time-varying loudness measurement. LRA is supplementary to the main audio measure, Program Loudness, of ITU-R BS.1770-3. LRA measures the variation of loudness on a macroscopic time scale, in LU (loudness units).
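As an illustration of how such a statistic can be derived, here is a minimal sketch in Python. It assumes the method described in EBU Tech 3342, which the text above does not spell out: short-term loudness values pass an absolute gate at -70LUFS and a relative gate 20LU below the gated mean, and LRA is the spread between the 10th and 95th percentiles. The function names and the simple nearest-rank percentile are illustrative choices, not part of any standard.

```python
import math

# Assumed parameters, taken from EBU Tech 3342 rather than from this article.
ABS_GATE_LUFS = -70.0        # absolute gate: discard near-silent windows
REL_GATE_LU = -20.0          # relative gate, measured from the gated mean
LOW_PCT, HIGH_PCT = 10, 95   # percentiles that bound the loudness range


def _percentile(sorted_vals, pct):
    """Nearest-rank style percentile on an already sorted list."""
    idx = round((len(sorted_vals) - 1) * pct / 100)
    return sorted_vals[idx]


def loudness_range(short_term_lufs):
    """Estimate LRA in LU from a series of short-term loudness values."""
    # Step 1: absolute gate.
    gated = [v for v in short_term_lufs if v >= ABS_GATE_LUFS]
    if not gated:
        return 0.0
    # Step 2: mean in the power domain, then the relative gate.
    mean_power = sum(10 ** (v / 10) for v in gated) / len(gated)
    rel_threshold = 10 * math.log10(mean_power) + REL_GATE_LU
    gated = sorted(v for v in gated if v >= rel_threshold)
    # Step 3: spread between the upper and lower percentiles.
    return _percentile(gated, HIGH_PCT) - _percentile(gated, LOW_PCT)
```

Fed with a program that sits half the time around -30 and half the time around -20 on the short-term scale, this sketch reports a range of about 10LU, while a program held at one constant loudness reports 0LU.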

Normally, broadcast should not be mixed like a cinema movie, nor like a pumped-up commercial, and LRA provides a simple value at which to aim. Figure 1 shows loudness changes in a clip from the movie “Pulp Fiction”: Relatively loud music plays until halfway through the clip, when the scene changes into dialogue. Both scenes sound even in loudness, but the first scene is noticeably louder than the second. The 3s time scale seems ideal for measuring the magnitude of that macro dynamic change; the 1s time scale shows the same tendency but more noisily, and the 10s time scale blurs the change unnecessarily. LRA catches this difference because it is tuned to time scales relevant to film, broadcast and music.

Figure 1. Shown here on different time scales is the absolute loudness of the movie “Pulp Fiction,” from 00:06:50 to 00:07:20.

Used during ingest or on a broadcast server, LRA is an objective measure for deciding when programs for delivery to certain platforms require range restriction. HD platforms may be set to tolerate any LRA value, though a limit such as 12LU, 15LU or 20LU, depending on genre, may be recommended in delivery guidelines. Downstream of production, LRA doesn’t change as long as only gain offsets are applied (normalization), so a change in the number reveals that significant range processing has taken place between two points in the broadcast chain. LRA may also serve as a logging tool, verifying that no range processing has happened during distribution or in a codec.
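Both uses can be sketched as simple comparisons. The function names and the 0.5LU tolerance below are hypothetical, chosen only for illustration; an actual delivery guideline would set its own limit and tolerance.

```python
def needs_range_restriction(lra_lu, platform_limit_lu):
    """True when a program's LRA exceeds the limit set for a platform."""
    return lra_lu > platform_limit_lu


def range_processing_detected(lra_upstream_lu, lra_downstream_lu,
                              tolerance_lu=0.5):
    """True when LRA differs between two points in the chain.

    A pure gain offset leaves LRA unchanged, so a difference larger
    than the (assumed) tolerance suggests range processing in between.
    """
    return abs(lra_upstream_lu - lra_downstream_lu) > tolerance_lu
```

For example, a drama measured at 18LU would be flagged against a 15LU platform limit, and a program logged at 14LU in production but 9LU at the transmitter would be flagged as having been range-processed along the way.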

For programs shorter than 30s, LRA is not suitable. Short-term or Momentary loudness is the metric to use for preventing such programs from becoming too loud.

iPod and mobile TV

Mobile and computer devices have a different gain structure and make use of different codecs than domestic AV devices such as television. Tests have been performed to determine the standard operating level on Apple devices. Based on 1250 music tracks and 210 broadcast programs, the Apple normalization number comes out as -16.2LKFS (Loudness, K-weighted, relative to Full Scale) on a BS.1770-3 scale. It is, therefore, suggested that podcasts and Mobile TV be distributed with a target level no lower than -16LKFS. The easiest and best-sounding way to accomplish this is to: 1) Normalize to target level (-24LKFS); 2) Limit peaks to -9dBTP (the unit for true peak audio level, relative to full scale); and 3) Apply a gain change of +8dB. Following this procedure, the distinction between foreground and background isn’t blurred, even on low-headroom platforms.
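The arithmetic behind the three steps can be sketched as follows. This is a level calculation only, not an audio processor: it assumes the limiter affects the peak value alone and leaves program loudness untouched, which is an idealization. The function and constant names are illustrative.

```python
NORMALIZE_TARGET_LKFS = -24.0   # step 1: normalization target
LIMIT_CEILING_DBTP = -9.0       # step 2: true-peak limiter ceiling
FINAL_GAIN_DB = 8.0             # step 3: gain applied after limiting


def mobile_delivery_levels(program_loudness_lkfs, true_peak_dbtp):
    """Return (loudness in LKFS, true peak in dBTP) after the three steps."""
    # Step 1: a gain offset moves the program to the normalization target.
    norm_gain_db = NORMALIZE_TARGET_LKFS - program_loudness_lkfs
    peak_after_norm = true_peak_dbtp + norm_gain_db
    # Step 2: the limiter caps peaks at the ceiling (loudness assumed unchanged).
    peak_after_limit = min(peak_after_norm, LIMIT_CEILING_DBTP)
    # Step 3: the final gain raises loudness and peak by the same amount.
    return (NORMALIZE_TARGET_LKFS + FINAL_GAIN_DB,
            peak_after_limit + FINAL_GAIN_DB)
```

A program measured at -20LKFS with peaks at -1dBTP ends up at -16LKFS with peaks at -1dBTP, so the +8dB gain change lands on the suggested mobile target while the earlier limiting keeps peaks just below full scale.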

Headroom in broadcast

Figure 2. The headroom requirement is genre-dependent, ranging from 20dB in cinema movies, to 6dB or even lower in commercials and pop/rock music.

The ratio between max peak level and average operating level is called headroom. Using BS.1770, headroom can be regarded as the ratio between true peak level and program loudness. The amount of headroom is genre-dependent, as is shown in Figure 2.

In commercials and pop/rock music, the headroom requirement can be 6dB or even lower, while a cinema movie may need more than 20dB. Furthermore, movies and classical music only need this headroom for a short period of time, while “beat music,” in general, requires the same amount of headroom from start to end.

When a signal path offers less headroom than required for conveying a program, limiting or clipping will result.

Unfortunately, every part of the signal path may constitute a headroom bottleneck, so broadcast doesn’t sound better than its weakest link. In analog broadcast, headroom is frequency-dependent, with less at high frequency because of transmission emphasis. Analog TV has a headroom of 10dB to 12dB, while FM radio is often operated with 8dB or less.

In digital broadcast, noise is generally lower, and emphasis is no longer part of the equation. Consequently, a lower average level in combination with a higher peak level is now a possibility. With target and peak level specified by ATSC A/85 (-24LKFS/-2dBTP), a generous 22dB of headroom is available. This is more headroom than broadcast has ever had.
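The headroom figures quoted in this section follow from a simple subtraction, sketched below with the article's own numbers. The function names are illustrative, and the comparison is only a first-order check on whether a program's headroom requirement fits a given signal path.

```python
def headroom_db(true_peak_dbtp, program_loudness_lkfs):
    """Headroom as the distance between true peak and program loudness."""
    return true_peak_dbtp - program_loudness_lkfs


def fits_without_limiting(required_headroom_db, available_headroom_db):
    """True when the path offers at least the headroom a program needs."""
    return required_headroom_db <= available_headroom_db


# ATSC A/85 target and peak levels as quoted above.
a85_headroom = headroom_db(-2.0, -24.0)  # 22.0
```

With these numbers, a cinema movie needing 20dB fits within the 22dB offered by A/85, whereas the same movie on an FM path operated with 8dB would be limited or clipped. A commercial needing 6dB fits either path.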

Used for stereo only, the AC-3 codec isn’t more sensitive than other codecs at a similar bitrate with regard to the max true peak level it handles without clipping. If a typical pop/rock track is encoded without attenuation, AC-3 clips frequently, but if the same track is attenuated so peaks don’t exceed -1dBTP, the problem is solved.

The real challenge with AC-3 and headroom is the way 5.1 is handled. The majority of consumers are listening in stereo, so if AC-3 is transmitted without an independent stereo stream, the decoder has to downmix every 5.1 program that comes along. This is where problems start. The decoder doesn’t include a transparent downmix limiter, so one option is to use conservative mix coefficients to avoid stereo overloads: L, R: -6dB; center: -9dB; SL, SR: -12dB. Now there will be no mix overloads, but systematic level jumps will occur when switching from native 5.1 to native stereo.
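To make the coefficients concrete, here is a minimal sketch of a stereo downmix that applies the gains listed above to one sample frame at a time, so the resulting peak can be inspected. It is not the AC-3 decoder's implementation: leaving out the LFE channel, working on sample peaks rather than true peaks, and the function names are all assumptions made for illustration. The peak actually reached depends on how correlated the channels are.

```python
def db_to_gain(db):
    """Convert a level in dB to a linear amplitude factor."""
    return 10 ** (db / 20)


# Conservative mix coefficients quoted in the text.
FRONT_DB, CENTER_DB, SURROUND_DB = -6.0, -9.0, -12.0


def downmix_frame(l, r, c, sl, sr):
    """Return (Lo, Ro) for one sample frame; LFE is omitted (assumption)."""
    front = db_to_gain(FRONT_DB)
    center = db_to_gain(CENTER_DB)
    surround = db_to_gain(SURROUND_DB)
    lo = front * l + center * c + surround * sl
    ro = front * r + center * c + surround * sr
    return lo, ro


def downmix_peak(frames):
    """Largest absolute sample value across the downmixed frames."""
    return max(max(abs(lo), abs(ro))
               for lo, ro in (downmix_frame(*f) for f in frames))
```

Running `downmix_peak` over the frames of a 5.1 program and comparing the result with the peak of a native stereo program gives a rough view of the level difference that the conservative coefficients introduce.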

The real peak level problem in AC-3 doesn’t come from the data reduction itself, but from the downmix section in the decoder. If broadcasters could keep peak level low, decoder mix coefficients wouldn’t have to be conservative. On the other hand, it would be a shame if a general restriction of headroom in broadcast were imposed because of first-generation codecs with technical design issues.

Recent experiments have pointed to a solution more tolerable than using a general limit threshold at -6dBTP. In 5.1 action movies, the center channel generally uses more of its headroom than the other channels. The AC-3 downmix solution is, therefore, simple: Use -6dBTP limiting for all the lateral channels, but -3dBTP for center.

The future of loudness

Even today, stations need to serve different platforms, and in the future, that number may well increase. Also, each platform may have a unique target level to aim at; this scenario calls for a flexible technical setup that’s able to handle this task.

In a perfect world, the average audio loudness would be even across a full day of broadcast and across all genres and channels, meaning that browsing dozens of TV channels wouldn’t provoke a constant need to adjust the volume. But, even better, uniform loudness would also be the case — regardless of whether you turn on the TV, put on a video podcast or stream a YouTube video.

We may not be there yet, but technological breakthroughs, definitions of broadcast standards and legislation on loudness indicate that this is the direction in which we are headed. Station engineers should keep these practices in mind, along with the ever-changing need to support a variety of broadcast platforms — each with its own audio performance requirements.

Following the guidelines given here makes the distribution of quality audio to multiple platforms easy and codec-agnostic.

Thomas Lund is HD development manager at TC Electronic.