Considering Video for Audio Engineers

Anyone who has been an audio engineer for a substantial period of time stays in the job because they love what they do, not because they can’t move on to something else. Being an audio engineer in the world of broadcast means that we also have to deal with all that pesky video equipment and the signals that accompany it.

A depth of video knowledge may actually be the broadcast audio engineer’s least-known skill since, just to get our jobs done, we often end up learning more about video than anyone realizes. Still, it can be tempting to stick to the basics of what we need to know and end up puzzled by some of the stuff we run across.

This month we’re going to take a look at some video terminology that we see but may not be familiar with. Let’s start out with a general overview of garden-variety video connections: plain old analog video in its composite and component forms, standard-definition video (SD) and high-definition video (HD).

Video starts out with individual red (R), green (G) and blue (B) channels, which, depending on the application, either remain individual full-bandwidth signals or get matrixed into a full-bandwidth luminance channel (Y) and two lower-bandwidth color-difference channels (B-Y, R-Y), which reduces the load on video processors.
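If you like to see the math behind that matrixing, here is a minimal sketch in Python. The coefficients are the Rec. 601 weights used for SD (HD’s Rec. 709 uses slightly different ones), so treat them as one example set rather than the only one.

```python
# A minimal sketch of the RGB-to-luminance/color-difference matrix.
# The weights shown are the Rec. 601 values used for SD; HD (Rec. 709)
# uses slightly different coefficients, so these are one example set.

def rgb_to_ydiff(r, g, b):
    """Convert full-bandwidth R, G, B (0.0-1.0) to Y, B-Y, R-Y."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance: weighted sum of R, G, B
    return y, b - y, r - y                   # the two color-difference channels

print(rgb_to_ydiff(1.0, 1.0, 1.0))  # pure white -> Y = 1.0, both differences 0
```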

An array of video connectors, with composite labeled CVBS.

COMPOSITE AND COMPONENT
Composite video combines the three channels onto one cable, while component leaves them separate. Composite connections are usually just called video, though I recently ran across a device that labeled it CVBS (color, video, blanking and sync). Component connections are most often designated Y/G, Pb/B, Pr/R and may or may not be labeled component. The simplest way to think of SD and HD video is that they are 8- or 10-bit digital signals that, like composite, each travel down a single Serial Digital Interface (SDI) cable typically terminated in BNCs, so even the cabling is similar to composite analog.

The superpower of SDI video is that it can carry 16 channels of embedded audio along with it. Native SD video is always in the 4:3 aspect ratio with an interlaced frame size of 720x480 at 270 Mbps. Native HD is in 16:9 widescreen video format and is most often seen in progressive frame sizes 1280x720 (720p) or 1920x1080 (1080p) at 1.485 Gbps.
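Those bit rates aren’t arbitrary; they fall straight out of the sampling arithmetic: the total sample rate (luminance plus the two chrominance channels) multiplied by the 10-bit word size. Here is a quick sketch using the standard 4:2:2 sample rates:

```python
# Where the 270 Mbps and 1.485 Gbps figures come from: total sample rate
# (luma plus the two chroma channels) times the 10-bit word size.
# Sample rates shown are the standard 4:2:2 values for SD and HD.

def sdi_rate_bps(luma_hz, chroma_hz, bits=10):
    return (luma_hz + 2 * chroma_hz) * bits

print(sdi_rate_bps(13.5e6, 6.75e6))     # SD: 270,000,000 bps = 270 Mbps
print(sdi_rate_bps(74.25e6, 37.125e6))  # HD: 1,485,000,000 bps = 1.485 Gbps
```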

Interlaced video places odd lines on the display first, followed immediately by the even lines, while progressive scan video displays all lines in order. As with audio, connecting digital and analog video signals together requires conversion.
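If a picture helps, here is a toy sketch of that split, assuming a frame that is nothing more than a list of numbered lines:

```python
# A toy sketch of interlacing: a progressive frame is split into two fields,
# one carrying the odd-numbered lines and one the even-numbered lines,
# which the display presents one after the other.

frame = [f"line {n}" for n in range(1, 9)]   # stand-in for 8 lines of video

odd_field  = frame[0::2]   # lines 1, 3, 5, 7 (displayed first)
even_field = frame[1::2]   # lines 2, 4, 6, 8 (displayed next)

print(odd_field)
print(even_field)
```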

Another video format that needs addressing here is the High-Definition Multimedia Interface (HDMI), as its ubiquity has allowed it to creep out of the consumer space into professional facilities. HDMI passes high-definition video and audio between devices on one cable, but quality can be hampered by poor-quality cables and transmission hindered by High-bandwidth Digital Content Protection (HDCP).

Content encrypted with HDCP will not display on devices without an HDCP license, or on those it considers unlicensed. Unfortunately, nonsensical failures are regular occurrences with this embedded copy protection, so HDMI should be used sparingly and always be tested.

Video has, of course, moved past HD to 4K and higher resolutions, with high dynamic range (HDR) the current flavor of the day. 4K, which is being marketed as “ultra-high definition” (UHD), offers a minimum of four times the pixels of HD. Actually, UHD displays reproduce 3840x2160 pixels to maintain a 16:9 aspect ratio, while video capture and image creation will likely be done at the true 4K resolution of 4096x2160. Higher resolutions and bit depths require infrastructure upgrades to beyond 3 Gbps and, depending on the codec, potentially more data storage space.

Uncompressed 1080p HD at 59.94 fps consumes disk space at approximately 1.53 TB per hour, whereas uncompressed 2160p 4K video would take up approximately 5 TB per hour. These storage requirements almost certainly mean that most 4K content will be shot and stored in some compressed format.
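The exact figures depend on the bit depth and chroma subsampling you assume, which the numbers above don’t spell out, so here is a rough sketch of the arithmetic using 10-bit 4:2:2 as one common assumption; it lands in the same neighborhood rather than on those exact values.

```python
# Back-of-the-envelope uncompressed storage estimate. The result depends on
# bit depth and chroma subsampling, which is why published numbers vary;
# this assumes 10-bit 4:2:2, so treat the output as a ballpark figure.

def tb_per_hour(width, height, fps, bits_per_pixel=20):
    """4:2:2 averages 2 samples per pixel; at 10 bits each that's 20 bits/pixel."""
    bits_per_second = width * height * bits_per_pixel * fps
    return bits_per_second * 3600 / 8 / 1e12   # bytes per hour, in decimal TB

print(f"1080p59.94: {tb_per_hour(1920, 1080, 59.94):.2f} TB/hour")
print(f"2160p59.94: {tb_per_hour(3840, 2160, 59.94):.2f} TB/hour")
```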

THE PROBLEM IS...
A bigger problem is that video carrying this much data cannot be moved around within most infrastructure as it currently exists. There was talk of a new multi-cable standard for 4K early on, but, as with most other technologies in the plant, IP distribution seems to be the way forward. HDR technology, which increases contrast and expands colors available in images, is being touted as a way to improve picture quality without increasing resolution. I’ve seen HDR in cinema demonstrations and it is very impressive, with incendiary whites and seemingly bottomless blacks and tons of detail where it was previously missing.

Every once in a while I run across a device with an asynchronous serial interface (ASI) connection on the back, also called a digital video broadcasting-asynchronous serial interface (DVB-ASI). This is a one-way data link for streaming compressed video and audio between digital devices. An ASI output on one device connects to the ASI input on another and, since it is a compressed data stream, it is not compatible with SDI connections.

Finally, there are the video sampling formats 4:4:4, 4:2:2, 4:1:1 and their seemingly ceaseless variations. This is actually related to our earlier coverage of luminance and RGB. The first number in each of these trios refers to luminance sampling, while the other two refer to the chrominance channels.

For instance, a 4:4:4 signal has the luminance (Y), red minus luminance (R–Y) and blue minus luminance (B–Y) channels each sampled four times, while 4:2:2 samples luminance four times and drops the sampling of each chrominance channel to two times. Following this pattern, it is easy to figure out the other sampling formats, though if you encounter one with a fourth digit (e.g. 4:4:4:4) then a key channel has been added.
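Here is a toy sketch of what those middle and last numbers do to the color-difference samples, assuming a row of just four samples per channel:

```python
# A toy sketch of horizontal chroma subsampling: in 4:2:2 the two
# color-difference channels keep only every other sample, and in 4:1:1
# only every fourth, while luminance keeps them all.

luma   = [10, 11, 12, 13]   # four luminance samples (the "4")
chroma = [50, 51, 52, 53]   # a full-rate color-difference channel

chroma_422 = chroma[0::2]   # keep every other sample -> the "2" in 4:2:2
chroma_411 = chroma[0::4]   # keep every fourth sample -> the "1" in 4:1:1

print(chroma_422)  # [50, 52]
print(chroma_411)  # [50]
```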

Luminance sampling must remain high because we are far more sensitive to detail in brightness than in color, so color sampling can be reduced with little concern. You may have noticed that matrixed RGB lacks a green channel; one is not necessary, since green can be derived from the luminance and color-difference channels.
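As a sketch of that derivation, here is the reverse of the earlier matrix example, again assuming the Rec. 601 weights: the color differences give back R and B, and solving the luminance equation gives back G.

```python
# A minimal sketch of recovering green from Y and the two color differences,
# using the same Rec. 601 weights as the earlier example.

def ydiff_to_rgb(y, b_minus_y, r_minus_y):
    r = r_minus_y + y
    b = b_minus_y + y
    g = (y - 0.299 * r - 0.114 * b) / 0.587   # solve the luminance equation for G
    return r, g, b

print(ydiff_to_rgb(0.587, -0.587, -0.587))  # pure green -> (0.0, 1.0, 0.0)
```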

Jay Yeary is a broadcast engineer and consultant who specializes in audio and now wonders if video did indeed kill the radio star. He is an AES Fellow and a member of SBE, SMPTE and TAB. He can be contacted through TV Technology or at transientaudiolabs.com.