Unified Monitoring: How BBC Keeps Its Networks Running at Peak Efficiency

LONDON—Is your broadcasting organization capable of handling more than a million data points every five minutes across your IT infrastructure?

That’s how much data BBC Online generates, and BBC Online is just one service our Future Media Group (FM) has to monitor to keep our IT infrastructure up and running. We’re using unified network monitoring to keep tabs on a vast hybrid IT network serving a network operations center (NOC) and 60 product groups. That’s 3,000 on-premises devices and 2,000 in the cloud.

Every second of every day we’re dealing with 3,000 metrics that must be analyzed to make sure our infrastructure is working to deliver online digital services with the flawless performance audiences expect.

ONE MILLION DATA POINTS EVERY FIVE MINUTES

As most of you are probably aware, the BBC has a very wide range of broadcast services. We provide our broadcast and online audiences with nine national television channels, plus regional programming, 10 national radio stations, 40 local radio stations and an extensive website. Globally, BBC World Service broadcasts on television, radio and online, providing news in 27 languages. BBC Worldwide and other commercial ventures complement our vast media portfolio.

Across that spectrum of services, it’s the BBC Future Media Group’s job to ensure that all digital media services perform without a stumble. (Those digital services include BBC Online, BBC Red Button and BBC iPlayer, as well as digital coverage of important ad hoc and live events, such as the Olympics and elections.)

It’s a tall order. BBC Online generates approximately one million data points every five minutes. As I said at the outset, we monitor 5,000 devices in a hybrid IT infrastructure: 3,000 on premises, with the remaining 2,000 hosted in Amazon Web Services.

Keeping track of all our network activity is an essential part of a broadcast technologist’s job these days. People are consuming more and more of their news and entertainment online, and consumers have extremely high expectations for digital experience delivery. They have very little patience for issues with page loads, streaming media errors, or latency in general.

The BBC is committed to delivering an exceptional customer experience. To that end, the entire organization has been increasingly adopting continuous delivery and DevOps across its groups, as well as moving much of its infrastructure and services to the public cloud.

Our success in that effort, however, meant addressing some old ways of handling network monitoring that weren’t quite meeting modern infrastructure requirements.

THE OLD WAY: FIXED MONITORING BY LOCATION

Like that of many large media organizations, the BBC’s infrastructure is dispersed, and our monitoring systems were set up for specific locations in the infrastructure. That’s far from ideal. Some of these monitoring tools were sitting on individual PCs, which would frequently crash.

Our NOC team was forced to constantly monitor multiple dashboards from different consoles. In this kind of dispersed environment, it can be very difficult for NOC team members to identify action items and act on them quickly.

With more than 60 product teams, the NOC needed a solution that could monitor output across the distinct services, and also easily integrate into DevOps workflows, to ensure maximal monitoring coverage within a single pane of glass.

That was my vision: To have a single dashboard so our NOC team can pick up incidents quickly. Without it, the BBC was susceptible to delivering a degraded customer experience, which would in turn risk damaging our brand’s reputation for reliable service.

The solution was unified network monitoring. Several companies provide flavors of unified network monitoring, and we settled on a system provided by Zenoss. We have now improved and streamlined our monitoring through a single dashboard. As a result, we have contributed direct value to the organization, in terms of both customer satisfaction and operational efficiency. In addition, we have moved approximately 50 percent of our operations to the cloud, and network monitoring has definitely helped make the transition smoother and more orderly.

THE BENEFITS OF OPEN SOURCE UNIFIED MONITORING

There are three main reasons to look to a dedicated, open source unified network monitoring solution: scalability, flexibility, and value.

Scalability: The rapid changes in broadcast technology mean we are now monitoring considerably more data than ever before, with a 24/7 feed to our NOC. That requires software-defined IT operations that can scale as data points continue to grow. Unified network monitoring allows us to monitor 3,000 metrics and 20 separate events every second.
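To put those rates in perspective, here is a rough back-of-the-envelope calculation in Python. It uses only the figures quoted in this article, and it assumes that the per-second rates hold steady around the clock, which is a simplification made for illustration rather than something we measure this way.

```python
# Back-of-the-envelope check using only figures quoted in the article.
# Assumption (for illustration): the per-second rates are constant.
METRICS_PER_SECOND = 3_000
EVENTS_PER_SECOND = 20
WINDOW_SECONDS = 5 * 60          # a five-minute window
SECONDS_PER_DAY = 24 * 60 * 60


def total_over(rate_per_second: int, seconds: int) -> int:
    """Return how many items arrive in `seconds` at a constant rate."""
    return rate_per_second * seconds


metrics_per_window = total_over(METRICS_PER_SECOND, WINDOW_SECONDS)
events_per_window = total_over(EVENTS_PER_SECOND, WINDOW_SECONDS)
metrics_per_day = total_over(METRICS_PER_SECOND, SECONDS_PER_DAY)

print(f"Metrics per five minutes: {metrics_per_window:,}")
print(f"Events per five minutes:  {events_per_window:,}")
print(f"Metrics per day:          {metrics_per_day:,}")
```

At a steady 3,000 metrics per second, a five-minute window holds 900,000 metrics, which is in line with the roughly one million data points every five minutes mentioned earlier.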

Flexibility: At the BBC, we have an agile, dynamic business model. We work quickly, and we innovate frequently. Our centralized NOC team needed a solution that would intrinsically fit within the fast-paced DevOps workflows of each product group. We were able to create custom tools within the network monitoring environment to automate application monitoring and to add dashboarding functionality for specific internal stakeholders. Because of this flexibility, the system now enjoys wide adoption and use among our monitoring team, product groups, operations teams and senior management.

Value: The BBC is a public service broadcaster, supported by taxes paid by U.K. households. Financial accountability and consistent quality of service are absolutely essential to our way of doing business.

Unified network monitoring directly benefits the BBC because it makes it easy to capture and analyze network performance data. We can detect incidents in both infrastructure and applications, so we can move quickly to find a root cause and take action to bring the service back up. But we don’t have to pay for analyzing volumes of unnecessary data, because unified network monitoring allows us to “triage” problem situations, focusing on the things we need to fix. That’s an extremely cost-effective way of maintaining our network.
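The idea of triage can be sketched in a few lines of Python. This is a minimal illustration and not the BBC's or Zenoss's actual implementation: the `Severity` levels, the `Event` fields, the `triage` function and the device names are all hypothetical, chosen only to show how a stream of events can be narrowed down to the ones that need attention.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    """Hypothetical severity scale; higher means more urgent."""
    INFO = 1
    WARNING = 2
    ERROR = 3
    CRITICAL = 4


@dataclass(frozen=True)
class Event:
    """Hypothetical monitoring event; field names are illustrative only."""
    device: str
    summary: str
    severity: Severity
    production: bool


def triage(events, threshold=Severity.ERROR):
    """Keep production events at or above `threshold`, most severe first."""
    actionable = [
        e for e in events if e.production and e.severity >= threshold
    ]
    # sorted() is stable, so events of equal severity keep arrival order.
    return sorted(actionable, key=lambda e: e.severity, reverse=True)


# Made-up sample events for illustration.
events = [
    Event("web-01", "High page-load latency", Severity.WARNING, True),
    Event("stream-02", "Streaming errors rising", Severity.ERROR, True),
    Event("test-07", "Disk full", Severity.CRITICAL, False),
    Event("cdn-03", "Origin unreachable", Severity.CRITICAL, True),
]

queue = triage(events)
for event in queue:
    print(f"{event.severity.name:<8} {event.device}: {event.summary}")
```

In this sketch the warning and the non-production event are filtered out, so the team sees only the two items that need action, with the most severe at the top.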

As new technologies enter the broadcast environment, they put an added burden on your network infrastructure, and that burden cannot be allowed to affect the reliability of the network or the speed with which you conduct transactions with your audience. Unified network monitoring relieves you of that burden, creating the kind of flawless customer experience that’s essential to any broadcaster’s reputation.

Shani Mashhood is head of monitoring for the BBC. She can be reached at shani.mashhood@bbc.co.uk