Improving loudness control

2012 has been a CALM year. However, chances are that 2013 will be even CALMer, as the much-debated law on loudness in broadcast becomes effective in just a few months. But is it, in fact, already due for a revision?

No doubt, the CALM Act is most welcome and represents a huge step toward eliminating annoying jumps in loudness in television. But the broadcast standard that the CALM Act relies on could be improved, as technology has evolved since the Act was signed by the President. Therefore, the focal point of this article is ATSC’s A/85 standard as it applies to broadcasters in the U.S., and, not least, how and in which areas it could be updated to measure and control loudness in television even better.

However, two additional broadcast standards will also be discussed: the International Telecommunication Union’s (ITU) BS.1770, which is in fact the “mother standard” upon which all other broadcast standards are built, and the European Broadcasting Union’s (EBU) R 128, to examine whether applying specific R 128 tools would enhance ATSC’s A/85.


Loudness monitors such as the TM9 by TC Electronic owe their existence to the Communications Research Centre (CRC) in Canada, which designed a simple, yet effective, model for measuring “loudness.”

The starting point

In 2006, ITU introduced BS.1770, which was the first international loudness standard. It defined the core principle upon which virtually all other broadcast standards now rely. Until then, the audio industry at large had been struggling with various peak-level-based meters that let commercials and new pop/rock music come across as systematically loud. Thanks are owed to the Communications Research Centre (CRC) in Canada for designing a simple, yet effective, model for measuring “loudness.” After verification in numerous independent studies, the CRC model made it into the ITU BS.1770 standard, transparent and free for the world to use.

Without getting into all of the technical details, the measurement was extended from its original mono to also work with stereo and 5.1 programs. A so-called K-weighted filter curve (defined by the above-mentioned research results) is applied to each audio channel, which, in fact, builds a bridge between subjective impression and objective measurement.
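As a rough illustration of the multichannel extension, the sketch below sums per-channel power and converts it to a loudness figure. It assumes each channel has already been passed through the K-weighting filter and reduced to a mean-square value; the filter itself is not shown. The function and variable names are hypothetical. The channel weights and the -0.691 offset are those given in BS.1770, which leaves the LFE channel out of the measurement.

```python
import math

# Channel weights as given in BS.1770: front channels count 1.0,
# surround channels 1.41. The LFE channel is not measured at all.
CHANNEL_WEIGHTS = {"L": 1.0, "R": 1.0, "C": 1.0, "Ls": 1.41, "Rs": 1.41}


def loudness_lkfs(mean_squares):
    """Return loudness in LKFS from per-channel mean-square values.

    mean_squares maps a channel name to the mean square of that channel's
    K-weighted signal (assumed to be computed elsewhere). Channels without
    a weight, such as "LFE", are ignored.
    """
    total = sum(
        CHANNEL_WEIGHTS[name] * value
        for name, value in mean_squares.items()
        if name in CHANNEL_WEIGHTS
    )
    if total <= 0.0:
        return float("-inf")  # digital silence has no finite loudness
    return -0.691 + 10.0 * math.log10(total)
```

The same function covers mono, stereo and 5.1: a mono program is simply a dictionary with one entry, and doubling the summed power raises the reading by about 3 LU.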

Both A/85 and R 128 build on BS.1770, but in 2011, ITU updated its standard to version 2, also known as BS.1770-2, which integrates further improvements to loudness-measuring tools. These improvements had already been proven effective by Japanese, Brazilian and European broadcasters. However, ATSC decided that A/85 should continue to rely on the previous version (let’s refer to it as BS.1770-1), which is a bit unusual, as it is common scientific practice to refer to the latest revision of any standard. Normally, it would not be necessary to distinguish between 1770-1 and 1770-2, but simply refer to the standard as BS.1770 (in its latest version, whatever number that might be).

Different approaches

One specific part of the BS.1770-2 revision is essential: the gating scheme used when measuring program loudness. This method prevents, for example, long periods of silence or atmospheres in a movie from skewing the overall measurement. In short, this means that it is now possible to align different types of (television) programs in terms of loudness. News, sports, commercials, concerts, talk shows and movies can actually co-exist without viewers having to adjust the volume over and over again, which is exactly what caused the complaints that led to the CALM Act in the first place.

The gate, however, is not the only difference between A/85 and R 128. While R 128 measures the audio signal in full with the employed gating scheme, A/85 aims at detecting the speech part of the signal and using that as an anchor point for the measurements. Although this method can work for determining gain offsets between dialogue-based programs of a certain genre, it has proven to be ineffective when it comes to aligning many different types of programs.

For instance, one obvious, potential problem is that not all programs contain speech, or use it like it is used in movies. Furthermore, who determines what is “speech” and what is not? The proprietary and patent-protected dialogue-detection algorithm A/85 relies on can sometimes interpret a violin as being a human voice, and conversely, not recognize a Swedish dialect. Under the “speech ruling,” it would also be easy for commercials to become even louder by just keeping dialogue softer than other elements of a mix.

Faced with these challenges, ATSC recognized that the dialogue-based approach was not suitable for aligning interstitials on TV, and consequently new annexes (J and K) added in July 2011 stated that commercials were no longer to be measured using the speech-anchoring approach, but that all sources had to be taken into account. However, regular programs are still recommended to be measured using speech anchoring, and consequently, broadcasters will have to switch back and forth between different ways of measuring whenever a commercial appears.

To avoid this, it might be worth considering switching to one transparent measurement method, based on open standards, that works across all types of program material.

The gating scheme

Obviously, the one measurement suggested above is available already, namely in BS.1770-2. Its key feature is the gating scheme, so let’s have a closer look at how that actually works.

The gate takes effect when programs with a wide loudness range are being measured. In such situations, the measurement homes in on foreground elements and disregards the rest. In practice, the gate deals with long passages of silence or background audio by pausing the measurement for parts that drop more than 10 loudness units (LU) below a measurement of the same program material without the gate.

Note that the BS.1770-2 gate is not set at an absolute loudness level (such as -34 LKFS), which would be impractical and would necessitate a new measurement whenever a level offset was applied. Measurement gating also helps on the application side: With the previous, ungated technique (BS.1770-1), random stretches of silence before and after a program could influence the result, making it virtually impossible to obtain the same number twice.
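The gating logic can be sketched in a few lines. This is a simplified illustration, not a compliant meter: it assumes the program has already been cut into measurement blocks (BS.1770-2 uses overlapping 400 ms blocks) and that each block is represented by its channel-weighted, K-weighted power. The function names are hypothetical. Besides the relative gate at -10 LU, BS.1770-2 also specifies a very low absolute threshold at -70 LKFS, which removes digital silence before the relative threshold is computed; the sketch includes both steps.

```python
import math

ABS_GATE_LKFS = -70.0  # silence threshold from BS.1770-2
REL_GATE_LU = -10.0    # relative gate, in LU below the first-pass result


def power_for(lkfs):
    """Helper for experiments: block power that reads as the given LKFS."""
    return 10.0 ** ((lkfs + 0.691) / 10.0)


def to_lkfs(power):
    """Convert a block power to LKFS; zero power maps to -infinity."""
    return -0.691 + 10.0 * math.log10(power) if power > 0.0 else float("-inf")


def mean_loudness(powers):
    """Loudness of the average power over a list of blocks (no gating)."""
    return to_lkfs(sum(powers) / len(powers)) if powers else float("-inf")


def gated_loudness(block_powers):
    """Two-pass gated program loudness over a list of block powers."""
    # Pass 1: drop blocks that are effectively silent.
    audible = [p for p in block_powers if to_lkfs(p) > ABS_GATE_LKFS]
    # The relative threshold moves with the program's own level.
    threshold = mean_loudness(audible) + REL_GATE_LU
    # Pass 2: keep only the foreground blocks above that threshold.
    foreground = [p for p in audible if to_lkfs(p) > threshold]
    return mean_loudness(foreground)
```

Because the second threshold is derived from the material itself, attenuating the whole program by 6 dB shifts the result by exactly 6 LU, and padding the file with silence leaves the number unchanged.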

As a net result of BS.1770-2, a station is now able to gain-offset (normalize) programs based on their foreground loudness, which is what the audience prefers to hear. Furthermore, with optimum gain-offset taking place at the station, little or no dynamics processing is needed, and it becomes easy to cater to various broadcast platforms at a high audio quality. Finally, BS.1770-2-based normalization grants programs more transmission headroom than if normalization were based on BS.1770-1 or on speech.
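Once a program loudness figure is available, the gain offset itself is simple arithmetic. The sketch below assumes the A/85 target of -24 LKFS and uses hypothetical function names; the measured value is expected to come from a meter or a measurement pass elsewhere in the chain.

```python
TARGET_LKFS = -24.0  # target loudness assumed here, as recommended in A/85


def gain_offset_db(measured_lkfs, target_lkfs=TARGET_LKFS):
    """Static gain, in dB, that moves a program onto the target loudness."""
    return target_lkfs - measured_lkfs


def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def normalize(samples, measured_lkfs, target_lkfs=TARGET_LKFS):
    """Apply one constant gain to every sample; dynamics are untouched."""
    factor = db_to_linear(gain_offset_db(measured_lkfs, target_lkfs))
    return [sample * factor for sample in samples]
```

A program measured at -20 LKFS would be turned down by 4 dB, while one measured at -27.5 LKFS would be raised by 3.5 dB. Since the gain is constant over the whole program, the mix balance and loudness range stay exactly as produced.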

Conclusion

Compared to previous peak-based or speech-based attempts at controlling level in broadcast, the BS.1770-2 standard is a remarkable improvement. As shown in Figure 1, even if only digital television (like ATSC A/85) is considered, and mobile TV, podcast and analog distribution are disregarded, the new standard is a giant step forward for audio quality. A fully transparent loop, based entirely on open technology, may now be created between production, ingest, transmission and the home listener, as shown in Figure 2.

Figure 1. BS.1770-2, with its gating scheme, represents a giant step forward in audio quality — both in terrestrial broadcast and in mobile TV, podcast and analog distribution.

Consequently, “sausage processing” at the point of transmission should be considered a thing of the past. Instead, BS.1770-2-compliant metering in production, and at subsequent stages, allows transparent handling and normalization of audio in the chain.

For broadcast platforms based on AC3, the BS.1770-2 measurement also enables a more precise and cheaper setting of dialnorm metadata than ever before. Remember, AC3 decoders are not dialogue-specific, either with regard to normalization (dialnorm) or with regard to processing (DRC).
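Deriving a dialnorm value from a measured program loudness can be sketched as follows. The function names are hypothetical. The sketch relies on the usual description of the AC3 dialnorm field: integer values 1 to 31 stand for -1 to -31 LKFS, and the decoder attenuates by the difference between the signaled value and -31.

```python
import math


def dialnorm_from_loudness(measured_lkfs):
    """Map a measured program loudness to a dialnorm value in 1..31."""
    # Round half up rather than using Python's round(), which rounds
    # exact halves to the nearest even integer.
    rounded = int(math.floor(-measured_lkfs + 0.5))
    # Loudness outside the representable range is clamped to the field limits.
    return max(1, min(31, rounded))


def decoder_attenuation_db(dialnorm):
    """Attenuation a decoder applies so that playback lands at -31."""
    return 31 - dialnorm
```

A program measured at -24 LKFS would carry a dialnorm of 24, and the decoder would attenuate it by 7 dB. Nothing in this arithmetic refers to dialogue; the value is only as meaningful as the loudness measurement behind it.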

For the plethora of other platforms (mobile TV, IPTV, podcast and counting), BS.1770-2 also provides easy and audio-conscious answers, especially in combination with complementary measurements such as momentary loudness, short-term loudness and loudness range.

Figure 2. The new BS.1770-2 standard creates a fully transparent loop between production, ingest, transmission and the home listener.

Under the new order, programs remain untouched, as long as their loudness range isn’t excessive for the audience of a given platform. This is also an improvement over today, because AC3 processing (DRC) is routinely enabled during DTV transmission. Although this kind of domestic processing was supposed to be non-destructive, DRC has ironically become anything but: The home listener cannot turn off a primitive processor that is not even BS.1770-compliant.

In an ideal world, no matter what the program type, the perceived loudness level would stay about the same throughout a full day of broadcast, across channels, across platforms. We’re close to that goal, and it all starts with the BS.1770-2 standard. Further complementary tools should not be dismissed, but the good news is that progress is being made constantly as users everywhere gain experience with the audio revolution that is taking place before our ears. The mere fact that broadcasters, as well as legislative assemblies, across the globe are now focusing on loudness solutions is a positive and welcome development that is sure to make the viewing — and, not least, listening — experience more enjoyable for everyone.

—Thomas Lund is the HD development manager at TC Electronic.