IMAX: Measuring the Impact of HDR
IMAX’s Abdul Rehman explains how the company is working to help streamers reduce churn and improve profits with better HDR video streams
Faced with intense pressure to reduce losses, streaming companies have pivoted in the last two years to an obsessive focus on cost cutting, loss reduction and profitability.
At the moment, only Netflix is profitable. But these financial imperatives have helped several major streaming offerings—notably Warner Bros. Discovery’s Max, NBCU’s Peacock and Disney’s direct-to-consumer offerings—reduce losses and boost their stock prices.
Unfortunately, the war on red ink has also come with some costs, most notably price hikes, increased churn, and in some cases poorer quality video. Netflix, Amazon and Max, for example, have all made it significantly more expensive for consumers to access better quality 4K high dynamic range (HDR) video, a move that has pushed millions of subscribers into tiers with second-rate video quality.
In this interview with TV Tech, Dr. Abdul Rehman, the chief product officer at IMAX, explains that streamers don’t have to make “lose/lose” choices between cutting costs and maintaining high video quality, because 4K HDR can be delivered at scale in a cost-effective manner. This focus on better-quality video can, in turn, also drive subscriber growth and help stem the hefty marketing and other costs created by millions of subscribers churning in and out of their subscriptions each year.
All this is part of a wider effort by IMAX to bring the high-quality images it has long been associated with in the theatrical film industry into the home. In 2022, as part of that strategy, IMAX acquired streaming technology provider SSIMWAVE, where Rehman was the founder and CEO/CTO. Since then, IMAX has been investing heavily in technologies to improve streaming video, and at IBC 2023 launched its StreamSmart solution.
The following interview has been edited for clarity and length.
TVT: In a general way, what are some of the key things that you’ve been focusing on at IMAX to improve video quality and how does HDR fit into that?
REHMAN: In terms of consumer technology, an important goal is to help storytellers tell stories in the best possible way. In other words, to help preserve the creative intent of content as it is delivered to the home as audio, video and the other elements of a stream.
For that, we have developed technology that can measure human experience very accurately, at scale, in the way humans perceive content. We can assess how the content is delivered and served on a specific device, and we can give it a number from zero to 100 for how close it is to the creative intent. The higher the number, the closer the content is to the creative intent.
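As a simplified illustration of what a zero-to-100, full-reference score like this means (IMAX’s actual metric is proprietary; plain SSIM is used here purely as a stand-in), one can compare a delivered frame against the creative-intent master and rescale the similarity:

```python
import numpy as np
from skimage.metrics import structural_similarity

# Simplified stand-in for a 0-100 "closeness to creative intent" score.
# IMAX's metric is proprietary; SSIM is used here only to show the shape
# of the idea: compare the delivered frame to the master, map to 0-100.

def intent_score(master: np.ndarray, delivered: np.ndarray) -> float:
    ssim = structural_similarity(master, delivered, data_range=1.0)
    return 100.0 * max(0.0, ssim)  # 100 = indistinguishable from the master

rng = np.random.default_rng(0)
master = rng.uniform(0.0, 1.0, (128, 128))  # pretend luma plane of the master
delivered = np.clip(master + rng.normal(0, 0.03, master.shape), 0.0, 1.0)
print(f"{intent_score(master, delivered):.1f}")  # higher = closer to intent
```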
We use that measurement technology to optimize video experiences and to make sure our customers are happy with the streams they are delivering.
Then we help them by providing products and solutions to preserve quality and to shave off everything that's not needed.
Because it's driven by deep science and an understanding of how the human visual system operates, we are able to help them reduce costs, generally speaking in the tens of millions of dollars, while not impacting quality.
The other benefit relates to delivering a premium experience in the home. You can think of it as providing an IMAX-quality experience using a similar number of bits to typical bitrate budgets.
TVT: In terms of that visual experience, how do you see the adoption of HDR?
REHMAN: In terms of the libraries that are out there, especially in cloud-based content, the percentage of content that's available in high dynamic range [HDR] is fairly limited.
Generally speaking, it is less than 10% of the library, and it can be significantly less than that.
But the consumption, I believe, is much higher. This limited set of HDR assets is consumed much more broadly than the rest of the library.
Because you can deliver more detail, in both the highlights and the shadows, and a wider color spectrum that is organic and represents skin tones correctly, HDR gives creatives the ability to tell the story they want to tell in a much more powerful way. But the way it is implemented, from content creation all the way to playback, is very important.
When we look at human perception, dynamic range is an important element of that. We believe that if you cannot measure something, you cannot optimize that experience. If you can’t measure it, you're just shooting in the dark. The technology you use may work 80% of the time, or 60% or 90% of the time, based on your experience or subjective evaluation. But until you can measure it at scale, you cannot really make decisions that will allow content to be delivered and experienced as expected at scale.
We started working on measuring how humans experience content about 16 years or so ago. We studied the impact of processing and compression. We studied the impact of devices and how they do tone mapping, a common operation to fit the signal within the device's capabilities. Not all content is created for 1,000 nits, and not all devices can produce 1,000 nits of brightness. The device will [adapt] using standards like Dolby Vision, HDR10+, HDR10 and others.
If you do this operation with some solutions, you may reduce or remove some details from the content. One method may remove details from the highlights, so the brighter parts don't look so good because they are washed out. Or it may remove details from the darker areas.
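To make that concrete, here is a minimal sketch (not IMAX’s method, and deliberately simplistic) contrasting a naive hard clip, which collapses highlight detail, with a soft roll-off that compresses it instead:

```python
import numpy as np

# Hypothetical illustration: map HDR luminance (in nits) from a 4,000-nit
# master down to a 1,000-nit display. Neither operator is IMAX's method.

def clip_tonemap(nits, display_peak=1000.0):
    # Naive approach: hard-clip everything above the display peak.
    # All highlight detail between 1,000 and 4,000 nits collapses to one value.
    return np.minimum(nits, display_peak)

def rolloff_tonemap(nits, display_peak=1000.0, knee=800.0):
    # Softer approach: stay linear up to the knee, then compress (rather
    # than discard) highlights into the remaining headroom above it.
    headroom = display_peak - knee
    return np.where(
        nits <= knee,
        nits,
        knee + headroom * (1 - np.exp(-(nits - knee) / headroom)),
    )

highlights = np.array([900.0, 1500.0, 3000.0, 4000.0])
print(clip_tonemap(highlights))     # [ 900. 1000. 1000. 1000.] -> detail lost
print(rolloff_tonemap(highlights))  # distinct, compressed values preserved
```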
The question is, how does that impact the storytelling and human perception? That's the goal of our vision science and what we measure. We are able to do that very accurately. We introduced that capability in our tools in 2017.
Since then, it has been used by our customers not only to effectively measure what they’re delivering in terms of HDR, but also to measure the performance of different encoders and to inform the various decisions they have to make. It also allows them to dynamically optimize the bitrate required for HDR content on a segment-by-segment or scene-by-scene basis.
Obviously, you cannot do that unless you can really measure how humans experience the content. Our metric and vision science can replicate human subjective evaluation very accurately.
If you are not able to measure extremely accurately, then you may think that you're keeping the quality at the same level when you aren’t. If you have an automated approach that can’t really perceive the issues with content that seems fine at 14 Mbps, then you may tell the automated approach to keep going down, and you may end up at eight or six Mbps, where you end up with a lot of issues like color banding and so on.
We have a product, StreamSmart, that can accurately measure content in an automated way. It provides users with a way to make decisions on the basis of human perception and the principles we’ve been talking about that can improve quality and save money.
TVT: What about HDR in live streams and live production? How do you see the adoption of that in sports and other live streams?
REHMAN: The trend is obviously towards more and more adoption. But the growth of HDR content on the live side is significantly slower than the file-based side. That’s due to the challenges that come from live production and putting a linear channel together.
For example, if I talk specifically about a live production, rather than a whole linear channel, doing a specific game is, relatively speaking, easier than doing multiple events at the same time.
There are a number of challenges for doing a specific game. If you have a specific camera setup, you need all those cameras to do HDR. HDR obviously requires higher bandwidth. So you need to have links from that camera to the truck that are very high bandwidth. Then you need to deliver that HDR feed every step of the way and spend more bits.
Bandwidth is certainly a challenge when it comes to content production. The physical wires you put in place from the cameras all the way to the truck, or the IP infrastructure, need to support the tens of gigabits per second per feed coming in from the cameras so you can mix them together and produce the live sports.
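As a rough, back-of-the-envelope illustration (the exact numbers depend on frame rate, chroma subsampling and transport overhead), the raw data rate of a single uncompressed UHD HDR camera feed works out as follows:

```python
# Back-of-the-envelope estimate of one uncompressed UHD HDR camera feed.
# Figures are illustrative; real feeds vary with frame rate, chroma
# subsampling and transport overhead (e.g., SMPTE ST 2110 framing).

width, height = 3840, 2160     # UHD resolution
bit_depth = 10                 # 10-bit HDR
samples_per_pixel = 2          # 4:2:2 chroma subsampling

def feed_gbps(fps):
    return width * height * fps * bit_depth * samples_per_pixel / 1e9

print(f"50p:  {feed_gbps(50):.1f} Gbps")   # ~8.3 Gbps
print(f"100p: {feed_gbps(100):.1f} Gbps")  # ~16.6 Gbps for high-frame-rate sports
```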
But that’s not that big of a challenge.
Another challenge occurs when you are working with a much broader color gamut and much wider dynamic range from scene to scene. Then you have to make sure that they match when you are mixing them together. If the cameras are very different, if they're capturing a very different range of brightness, then consumers are not going to like that.
So you need uniformity. Consistency is critical. You don't want to go from camera to camera with the color shifting from one color gamut to another. That is challenging, but it is still easy to solve.
Adding ads into the mix creates further challenges. Now these ads may or may not be available in HDR. So you need to create some uniformity between the ads and the live stream.
That's when you use automated approaches to match the look and feel. But because many of these approaches are limited from a human-perception perspective, you end up creating a different look and feel as you go from the live stream to the ads and so on. So that's a challenge.
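For instance, one common automated step when inserting an SDR ad into an HDR program is converting the ad's BT.709 primaries into the BT.2020 container. A minimal linear-light sketch, using the standard ITU-R BT.2087 conversion matrix rather than any particular vendor's pipeline, looks like this:

```python
import numpy as np

# Minimal sketch of one automated uniformity step: converting an SDR ad's
# linear-light BT.709 RGB into the BT.2020 container used by the HDR stream.
# Matrix per ITU-R BT.2087; a real pipeline also handles transfer functions,
# inverse tone mapping and brightness matching, which are omitted here.

BT709_TO_BT2020 = np.array([
    [0.6274, 0.3293, 0.0433],
    [0.0691, 0.9195, 0.0114],
    [0.0164, 0.0880, 0.8956],
])

def convert_709_to_2020(rgb_linear):
    # rgb_linear: (..., 3) array of linear-light BT.709 values in [0, 1]
    return rgb_linear @ BT709_TO_BT2020.T

pure_709_red = np.array([1.0, 0.0, 0.0])
print(convert_709_to_2020(pure_709_red))  # [0.6274 0.0691 0.0164] in BT.2020
```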
Then, if you have a linear channel and you go from watching football to hockey, you have issues with how the colors match or don't match, how the camera setups were put in place, and the rest of it. So those are the challenges when it comes to live and linear content.
TVT: How does your StreamSmart solution address those challenges?
REHMAN: The goal is to help not only assess the look and feel of the content—whether it matches or not and whether we are covering the right brightness range and stuff like that. The goal is also to compress that much better.
There's a debate out there about whether it's better to do UHD SDR [ultra-high-definition standard dynamic range] or HD [high-definition] HDR.
With HD HDR, you have less detail in terms of resolution but more dynamic range in terms of brightness and color. With UHD SDR, you have more resolution but less dynamic range.
Based on our technology, you can do both. You can benefit and deliver the highest quality when you have all the pixels, UHD 4K, and high dynamic range within the same bandwidth. You can offer the full experience. Instead of choosing from a lose-lose situation, where you are either losing dynamic range or losing resolution, you can have both with our solution.
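To see why this has traditionally been framed as a lose-lose choice, compare the raw, pre-compression data of the two options (illustrative only, since encoded bitrates do not scale linearly with pixel count or bit depth):

```python
# Illustrative raw data per frame for the two "lose-lose" options.
# Real encoded bitrates do not scale linearly like this; the point is
# only that UHD multiplies pixels 4x while HDR adds bit depth.

def raw_bits_per_frame(width, height, bit_depth, samples_per_pixel=3):
    return width * height * bit_depth * samples_per_pixel

hd_hdr  = raw_bits_per_frame(1920, 1080, 10)  # HD resolution, 10-bit HDR
uhd_sdr = raw_bits_per_frame(3840, 2160,  8)  # UHD resolution, 8-bit SDR
uhd_hdr = raw_bits_per_frame(3840, 2160, 10)  # both at once

print(f"HD HDR : {hd_hdr / 8e6:.1f} MB/frame")   # ~7.8 MB
print(f"UHD SDR: {uhd_sdr / 8e6:.1f} MB/frame")  # ~24.9 MB
print(f"UHD HDR: {uhd_hdr / 8e6:.1f} MB/frame")  # ~31.1 MB
```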
TVT: There are still a variety of HDR formats. Where do you see the different ones being adopted and their development?
REHMAN: I think we will see these evolve over time as technology grows every step of the way.
At this point, if you look at the file-based side, the format that's most commonly used is Dolby Vision. That's the preferred approach for a number of reasons.
On the live side, it's HLG [Hybrid Log-Gamma] or HDR10. HLG has a certain simplicity in the way it handles metadata and things like that.
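Part of that simplicity is that HLG is defined by a single fixed transfer curve carried with no dynamic, per-scene metadata. For reference, a sketch of the HLG OETF as specified in ITU-R BT.2100:

```python
import math

# HLG opto-electrical transfer function (OETF) per ITU-R BT.2100.
# E is normalized linear scene light in [0, 1]; the curve is fixed,
# so no per-scene dynamic metadata needs to be carried or applied.

A = 0.17883277
B = 1 - 4 * A                  # 0.28466892
C = 0.5 - A * math.log(4 * A)  # 0.55991073

def hlg_oetf(e: float) -> float:
    if e <= 1 / 12:
        return math.sqrt(3 * e)
    return A * math.log(12 * e - B) + C

print(hlg_oetf(1 / 12))  # 0.5 (the two curve segments meet here)
print(hlg_oetf(1.0))     # 1.0 (peak scene light maps to full signal)
```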
That goes back to my file-based and live comparison and the complexities involved. Dolby Vision and HDR10+ allow you to change the look and feel of the content, from a color and brightness perspective, from scene to scene. That additional complexity is easier to handle on the file-based side than on the live side.
If there is a solution on the live side that will deliver the benefits of scene-by-scene adaptation and do it in a simple way from an implementation perspective, you might see a different approach in terms of the live side of things.
But I think on the live side, things are still up in the air. How do you take the next leap in terms of adjusting color and dynamic range on a scene-by-scene basis? On the file-based side, it's much easier to do.
So there are a number of challenges that exist when it comes to using HDR formats. And as these formats evolve and solve little problems, you will see that the composition, which formats are out there and which are preferred, is going to change over time, especially on the live side of things.
TVT: In terms of new high-profile content, big dramas, etc. is HDR now widely adopted? Or is there still some hesitancy to shoot with HDR?
REHMAN: More than anything else, I think it's a question of cost. If that premium quality is part of the storytelling, and creatives have the budget and time to spend on it, then I do see that people are creating content in HDR, and that's only going to increase.
I think HDR is going to become much more prevalent every step of the way. New devices and experiences are coming out, too, like the Apple Vision Pro. So you will see that it will become more and more commonplace for people to consume high dynamic range content, which will feed this loop of creating more content. It will just take some time, because a lot of content is not even available in UHD.
…[That's] on the streaming side. If we are purely talking about a master for premium content, that is going to be 4K HDR. That is table stakes right now. You have to have HDR today. If you’re not doing HDR, then I don't know what you are doing.
On the theatrical side, certain films are taking it even further. Oppenheimer was done using IMAX film cameras. That is pushing it significantly further in terms of resolution and aspect ratio. There's a lot of work in the look and feel that IMAX has actually been doing that is pushing it even further. But 4K and HDR are table stakes at this point.
TVT: How would you describe the difference between how IMAX’s StreamSmart is helping encoding and workflow versus per-title encoding or content-aware encoding using some of the other methods that are out there, such as VMAF? How is the IMAX vision science differentiated from those?
REHMAN: So I'll cover the encoding in the first part and then talk about other quality metrics separately in the second part.
On the encoding side, the principle is that if a specific encoder finds a specific piece of content easier, less demanding, to encode, then the encoder will spend fewer bits on it.
That's generally the principle. In other words, you are saying: let's measure the complexity of the content, and if the content is less complex to encode, then we'll spend fewer bits. And if it's more complex, then let's spend more bits. You can obviously apply that to selecting resolution as well. At a specific bitrate, what is the right resolution to pick?
In terms of IMAX, what we are talking about here is quality-driven bit allocation. It is additive to content-aware encoding. Yes, if the content is less complex, let's spend fewer bits. But how far do we go down from the more complex content? Is it eight Mbps? Is it six or four or three?
That's where we come in. I'll give you an example. The content-encoding approach available in FFmpeg-based encoders is called CRF, Constant Rate Factor. You can set the CRF value at 25, at 28, at 30, at 16. Where you set it, at the end of the day, is based on people's experience and things like that. Now, let's say that based on your experience and whatever measurement you have done, you want to set it for high quality, with the lower number meaning the higher the quality.
So let's say you set it to CRF 18. But how do you know that is right for that specific scene? How do you know that 19, 20 or 25 will not deliver the same quality level?
That is where we come in. We can say for this scene, you don't need 15. We can say that for this scene, CRF 22 will give you the same quality at three megabits per second as CRF 15 or 18 would.
So it's a quality-driven approach sitting on top of content-aware encoding. That is what we are using.
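As a rough sketch of how a quality-driven layer can sit on top of CRF encoding (this is not IMAX's actual algorithm, and the PSNR-based placeholder below merely stands in for a true perceptual metric such as theirs), one could probe CRF values per scene and keep the cheapest one that still clears a quality floor:

```python
import os
import subprocess

# Sketch of per-scene, quality-driven CRF selection on top of content-aware
# encoding. Not IMAX's algorithm: perceptual_score() is a crude placeholder
# (PSNR rescaled to a 0-100 feel); a real system would use a perceptual metric.

QUALITY_FLOOR = 90.0            # assumed "visually transparent" threshold
CRF_CANDIDATES = range(18, 29)  # libx264 CRF probes, best quality first

def encode_scene(scene_file: str, crf: int) -> str:
    out = f"{scene_file}.crf{crf}.mp4"
    subprocess.run(
        ["ffmpeg", "-y", "-i", scene_file,
         "-c:v", "libx264", "-crf", str(crf), "-an", out],
        check=True, capture_output=True,
    )
    return out

def perceptual_score(reference: str, encoded: str) -> float:
    # Placeholder metric: ffmpeg's psnr filter, crudely rescaled to ~0-100.
    result = subprocess.run(
        ["ffmpeg", "-i", encoded, "-i", reference,
         "-lavfi", "psnr", "-f", "null", "-"],
        check=True, capture_output=True, text=True,
    )
    avg = result.stderr.rsplit("average:", 1)[1].split()[0]  # "average:43.2"
    return min(100.0, float(avg) * 2.0)

def pick_crf(scene_file: str) -> int:
    chosen = min(CRF_CANDIDATES)        # safe default: highest-quality probe
    for crf in sorted(CRF_CANDIDATES):  # walk toward cheaper encodes
        encoded = encode_scene(scene_file, crf)
        ok = perceptual_score(scene_file, encoded) >= QUALITY_FLOOR
        os.remove(encoded)
        if not ok:
            break       # quality floor breached; stop probing
        chosen = crf    # still transparent at this CRF; keep the savings
    return chosen

# Example (hypothetical scene file): print(pick_crf("scene_0042.mp4"))
```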
Now, when it comes to VMAF [Video Multi-method Assessment Fusion], they [Netflix] in a recent blog talked about the use of VMAF for optimizing HDR content, and about extending the ability of VMAF [from SDR] to measure HDR content quality.
At this point, it's not publicly available. The only organization that knows what [HDR VMAF] can or cannot do is Netflix. At this point they are the only ones who know this, because it's internal and that's what they use for themselves. So I don't know if it's good enough or not. They obviously think the public version is not good enough, because it's been tweaked for internal use for HDR. But they have not made it publicly available.
I was surprised to see that some of the graphs they put in their blog don't even have numbers on them. They don't even feel comfortable enough to say that the score is 85 or the score is 90. So that's sort of the readiness level [for VMAF] at this point.
[In contrast], we started working on this in 2008, 2009 in our research lab, and our HDR work is very highly cited.
First of all, VMAF doesn't support HDR10. So it's not an applicable comparison. The second thing is that VMAF is built as a combination of metrics. They collected a bunch of metrics, which is fine. If somebody wants to use that approach, you collect four, five, six different metrics from different papers and then fuse them together.
That's what the "F" in VMAF stands for: fusion. You train some simple machine-learning model and hope that one of the metrics is going to do the right job at the end of the day. But what if none of them does?
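Mechanically, that kind of fusion is straightforward. A toy version (illustrative only, with made-up feature metrics and fabricated training data, not Netflix's actual features or model) could look like this:

```python
import numpy as np

# Toy illustration of metric "fusion": a few elementary quality features are
# combined by a simple learned model into one 0-100 score. Features, weights
# and training data here are made up; VMAF's real features and model differ.

def elementary_features(ref, dist):
    diff = ref - dist
    return np.array([
        np.mean(diff ** 2),                   # MSE-like fidelity term
        np.mean(np.abs(diff)),                # mean-absolute-difference term
        np.std(dist) / (np.std(ref) + 1e-8),  # crude contrast-preservation term
        1.0,                                  # bias term
    ])

rng = np.random.default_rng(0)

# Fabricated training set: feature vectors paired with synthetic "human
# opinion" scores. In reality these labels come from subjective studies.
X = rng.uniform(0, 1, size=(200, 4)); X[:, 3] = 1.0
y = 95 - 60 * X[:, 0] - 25 * X[:, 1] + 5 * X[:, 2] + rng.normal(0, 2, 200)

weights, *_ = np.linalg.lstsq(X, y, rcond=None)  # "train" the fusion model

ref  = rng.uniform(0, 1, (64, 64))               # pretend luma planes
dist = np.clip(ref + rng.normal(0, 0.02, (64, 64)), 0, 1)
print(elementary_features(ref, dist) @ weights)  # fused quality score
```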
Keep in mind that these metrics are fundamentally SDR metrics, not HDR. When VMAF comes out with HDR capabilities, I would need to see whether they have modified some of those metrics to make them HDR-capable or whether it is a totally brand-new approach. That remains to be seen.
My personal opinion is that when I go to see Netflix content, I see banding. I see a number of other issues.
So the question is: where's that coming from? Is it from the use of this metric to reduce the bitrate, or not? Can the estimate [based on VMAF] really be trusted to make sure and absolutely guarantee that the quality is not being compromised? If they think that VMAF can do that, then 100% they should use it. If they think it can't, then we're here to help them.
TVT: What has IMAX’s acquisition of SSIMWAVE meant for the development of the solutions you are talking about?
REHMAN: It’s meant a few things. One is resources. Obviously, as a young company starting out, we had a specific level of resources available. With IMAX, we have more resources available. IMAX is strategically keen on investing in bringing the IMAX offering outside of theaters as well, and on expanding the business opportunities beyond the theatrical environment.
Another is validation and their belief in the technology and the people. Having the investment and the IMAX brand means something to consumers. Those things have clearly helped us with additional resources and reach.
TVT: The streaming industry is at a pivotal point where companies are under enormous pressure to reduce their losses, which means that in addition to controlling costs, they have to attract subscribers and reduce churn—the number of subscribers who are dropping their subscriptions. How do you see yourself fitting into that pressure to reduce costs and churn?
REHMAN: At the end of the day, streaming companies and media companies are generally telling investors that profitability is the key objective. We can help them reach that goal by helping them keep their consumers with better-quality content and by reducing the cost of streaming content, so they can hit profitability faster.
TVT: I know you can’t be too specific about your roadmap going forward, but what are the major things you are focusing on to further improve your offering?
REHMAN: When it comes to StreamSmart and similar technologies, we continue to invest in multiple facets to make sure the creative intent is preserved.
On the measurement side, one of the challenges right now is, for example, that a lot of content has either a digital or a film look that is not preserved. It's kind of butchered by the time it gets to the home.
Our goal is to help preserve that creative intent and to measure how well it is preserved. So, in the near future, we will be launching a metric to gauge how well it is preserved. That is something a number of people in the industry are looking at. They can use a metric like that because the look, a specific kind of film look, is an important part of storytelling for them.
That's just one example on the perceptual measurement side of how we are working to make sure we can measure the preservation of creative intent, better and better every step of the way.
On the StreamSmart side, we will continue pushing the envelope on what users can save in deployments at scale.
On the content side with the same spirit of preserving creative intent, we are bringing in contextual elements [to focus] on what scenes are more important to storytelling. How can we incorporate AI technologies into some of that?
Some of what we are working on is still in its early days, but we’re really excited about what we’ve been able to achieve and bring to StreamSmart.
We are also working hard on the live and linear side as well. In the industry, the file-based side is ahead of the live, linear side. So, on the linear side, we’re working to deploy it much more broadly and to create a sort of cookie-cutter solution that can be deployed across different types of approaches to live channels.
George Winslow is the senior content producer for TV Tech. He has written about the television, media and technology industries for nearly 30 years for such publications as Broadcasting & Cable, Multichannel News and TV Tech. Over the years, he has edited a number of magazines, including Multichannel News International and World Screen, and moderated panels at such major industry events as NAB and MIP TV. He has published two books and dozens of encyclopedia articles on such subjects as the media, New York City history and economics.