Using AI to Learn What Viewers Want to See

As streaming platforms have turned to advanced computer science to help their customers more easily find movies and TV shows they’ll like, the 2001 sleeper hit Donnie Darko has provided an interesting example of a title that’s hard to put into a search bucket using traditional metadata tools.

Humans employed to classify Donnie Darko with metadata have traditionally just put the film in the horror-drama-fantasy bucket, explained Bel Lepe, chief technology officer for Ooyala, a video technology company specializing in advanced user content search and recommendation. But the movie about a high school student driven to violent crime by a man in a white rabbit suit is also rich in dark comedic overtones.

“Surfacing” a film like Donnie Darko on an advanced user interface requires the power of artificial intelligence and machine learning, an application of AI that gives systems the ability to learn and improve from experience without being explicitly programmed.

“AI provides a different view of content and a greater level of detail,” Lepe said. “Through sentiment analysis, AI can algorithmically go through the film second by second, frame by frame, and capture its true prevailing quality.”
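As a rough illustration of the idea Lepe describes, the sketch below rolls per-segment mood scores up into a film-level list of prevailing moods. It is a minimal sketch, not Ooyala's method: the function name `prevailing_moods`, the `min_share` threshold, and the segment scores are all hypothetical, and the scores are assumed to come from some upstream classifier that is not shown.

```python
from collections import defaultdict


def prevailing_moods(segments, min_share=0.2):
    """Return (mood, share) pairs whose runtime-weighted share meets min_share.

    segments: iterable of (duration_seconds, {mood_label: score}) pairs.
    The scores are assumed to come from a per-segment classifier (hypothetical).
    """
    totals = defaultdict(float)
    runtime = 0.0
    for duration, scores in segments:
        runtime += duration
        for mood, score in scores.items():
            # Weight each score by how long the segment lasts.
            totals[mood] += duration * score
    if runtime == 0:
        return []
    shares = {mood: total / runtime for mood, total in totals.items()}
    ranked = sorted(shares.items(), key=lambda kv: kv[1], reverse=True)
    return [(mood, round(share, 2)) for mood, share in ranked if share >= min_share]


# Made-up scores for three stretches of a film, for illustration only.
example_segments = [
    (600, {"horror": 0.7, "drama": 0.2, "dark comedy": 0.1}),
    (900, {"drama": 0.5, "dark comedy": 0.4, "fantasy": 0.1}),
    (300, {"fantasy": 0.6, "horror": 0.3, "dark comedy": 0.1}),
]

result = prevailing_moods(example_segments)
print(result)
```

With these invented numbers, a mood such as dark comedy that never dominates a single segment can still clear the threshold across the full runtime, which is the kind of detail a single hand-assigned genre bucket would miss.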

Indeed, as user interfaces have advanced to help TV viewers more easily find content amid an ever-expanding field of choices, the data science used to achieve true personalization has in many cases moved beyond the scope of human capabilities. The blending of voice recognition into the user search and recommendation process only increases the need for AI.

“The industry is moving to deep learning completely for metadata extraction, language translations, user intent mining and personalized discovery,” said Lijin Chungapalli, a senior software engineer in the metadata engineering group at TiVo.

“Manual operations are too expensive and do not scale,” he added. “Machine learning will soon achieve human level accuracy. Using machine learning, the industry will move towards scene-level metadata for cast, mood, tones and product placement by processing video samples. We will have deeper integration with merchandising with effective entity recognition of products in a scene. Content augmentation will be completely dominated by machine generated data and we will see the decline of human curation in the next five years.”

According to Lepe, the video industry is experiencing a democratization of AI tools, which once existed as proprietary technology for large companies such as Netflix and Comcast.

Through cloud resources including Microsoft’s Azure, metadata tools now exist that let video companies do their own training when they want to deploy advanced personalization features.

“You no longer need a team of 40 data scientists,” Lepe said.

While personalization features have become table stakes for video companies wishing to supply their subscribers with modern video interfaces, there’s a lot more work that needs to be done to make these systems truly intuitive and personalized, Chungapalli added.

“Even though much progress has been made in this space, we are still at very early stages of deeper semantic understanding of content,” he said. “There is a paradigm shift and the focus has switched to machine learning in this area. With huge datasets available and cheaper hardware, machine learning has been gaining ground over conventional search and discovery applications. The future lies in machine learning and it is maturing at a faster pace and finding diverse applications.”
