Beneath the buzzwords: AI, video analytics, edge computing and more

This post is written by Albert Unterberger, owner at Unterberger Consult with extensive experience in the security industry. Read more about Albert at the bottom of this page.

As the dictionary definition puts it, a buzzword is: “A word or expression from a particular subject area that has become fashionable by being used a lot.”

While buzzwords can get a bad press for being overused (and often misused), they usually become part of our vocabulary because they relate to new, exciting innovations with huge potential.

The buzzwords I’ll delve into in this post – artificial intelligence (AI), machine learning, deep learning, video analytics, the internet of things and edge computing – all have enormous potential in safety and security (and more). Indeed, their application will be essential if we are to take full advantage of the continued innovations in video hardware and software.

The need for video analytics

A high-level definition of video analytics is fairly straightforward: software which automatically analyzes live and recorded video footage, providing alerts, automated responses and actionable insights.

Video analytics is not a new thing – we’ve been talking about it since the early 2000s and even before that. But, like many buzzwords, its value is only being truly realized well after its creation, as other technologies catch up with the original vision.

The increasing need for video analytics is obvious: more cameras, of greater quality, are being placed in an increasing variety of environments for an equally expanding number of reasons. At a fundamental level, this points to one thing: more information to review. A lot more information.

Humans alone simply cannot review the volume of video footage created by the world’s surveillance cameras. Human cognitive skills – which are excellent but not scalable to hundreds or thousands of cameras – have to be supported and augmented by technology. And that’s a good thing. Computers have the power to process huge amounts of data in real time, 24 hours a day, without fatigue.

But while processing data is one thing, being able to analyze effectively and deliver useful insights requires a level of cognition. Which brings us on to one of the biggest current buzzwords, artificial intelligence (AI).

Can computers replicate human intelligence?

The short answer is, “no”. The slightly longer answer is, “possibly, but not for a long while yet”.

In reality, “artificial intelligence” is something of a misnomer, or at least commonly misunderstood, as it suggests artificially creating human levels of intelligence in a computer. But to draw on a dictionary definition once again, intelligence is simply, “the ability to acquire and apply knowledge and skills”.

When put in those terms, it’s much easier to see how computers can become artificially intelligent. It’s also easier to start to appreciate what “knowledge and skills” will be required for AI to be applied effectively in video analytics (for instance, at a very basic level, the ability to know when an object is a human vs. an animal vs. an inanimate object).

But while AI is often the headline subject, the definition points to a more specific requirement for AI to become a reality: the ability of computers to learn.

Teaching the machines to learn, and learn deeply

Training computers to learn is what’s known (logically!) as machine learning.

We’re all exposed to machine learning on a regular basis. For those of us who shop or watch movies and TV online, the recommendations made – which hopefully become more accurate and useful over time – are based on machine learning. Learning about our preferences based on what we’ve bought or watched is one thing. Machine learning becomes really powerful, however, when our choices are analyzed alongside those of millions of other people making similar choices, and combined with further known demographic information about us (from social media sites, for example).
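
To make the idea of comparing one person’s choices with everyone else’s a little more concrete, here is a minimal sketch of a recommender. It is only an illustration, not how any particular shop or streaming service works: the customer names, the titles and the `recommend` function are all invented for this example, and real systems learn from far more data and far richer signals.

```python
from collections import Counter

# Hypothetical viewing histories: one set of titles per customer.
histories = {
    "ana":   {"Heat", "Ronin", "Drive"},
    "ben":   {"Heat", "Ronin", "Collateral"},
    "chloe": {"Heat", "Collateral", "Drive"},
    "dev":   {"Amelie", "Chocolat"},
}

def recommend(user, histories, top_n=2):
    """Suggest unseen titles watched by people whose history overlaps the user's."""
    seen = histories[user]
    scores = Counter()
    for other, titles in histories.items():
        if other == user:
            continue
        overlap = len(seen & titles)  # shared titles act as a similarity weight
        for title in titles - seen:
            scores[title] += overlap
    # Keep only titles backed by at least one similar viewer.
    return [title for title, score in scores.most_common(top_n) if score > 0]

print(recommend("ana", histories))  # prints ['Collateral']
```

The more histories the sketch is given, the more evidence each suggestion rests on, which is the point made above: the value comes from analyzing many people’s choices together.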

In the world of video surveillance, machine learning has enormous potential, exemplified in a tweet from leading AI expert Andrew Ng: “Pretty much everything a normal human can do in <1 second, we can now automate with AI.” This includes, of course, recognizing and classifying humans (male, female, race, age), animals, vehicles and objects.

In training computers to learn the attributes of every object, video analytics becomes ever more powerful. But understanding what something is is one thing; analyzing what it is doing and why is another challenge entirely.

Video analytics has always been dependent on the imagination, experience and skills of the human software developer. And it simply isn’t possible for a human to write code for video analytics software that addresses every possible object in every possible situation.

Which brings us to the most advanced level of AI today, deep learning.

Neural networks and object properties

Central to deep learning is the use of artificial neural networks: structures for computer processing that more closely replicate how the human mind works. It is essential to understand that deep learning is no longer primarily about hand-written algorithms, but very much about the training data describing the situations the computer needs to recognize. So, theoretically, the human developer is no longer the limiting factor.

In a very simple scenario, while machine learning will tell you that there are cars and people, deep learning will tell you the color and the brand of the cars, and will also recognize traffic signs, cyclists, and even people carrying handbags.

However, we can only learn if we have the material to learn from. Computers are no different. And while neural networks allow computers to become trained, if we truly want them to “learn the world” the amount of data needed to do that is simply enormous.

Why “Big Data” was an understatement…

For deep learning in video analytics, the ability to source that data is a significant challenge. Footage from video surveillance cameras is often highly sensitive and, quite rightly, covered under international data protection regulations such as GDPR. In addition, even when huge data sets can be found, the computing power required to process, analyze and learn is extremely significant in scale.

But if history tells us anything, these challenges will be overcome, and there are a couple of other buzzwords which point towards the potential contributors for AI: the internet of things and edge computing.

The internet of things (IoT) is central to the proliferation of data. Describing the totality of the world’s connected devices – from mobile phones to fridges, from industrial sensors to lampposts, from voice assistants to network video cameras – the IoT creates huge amounts of data every second of every day. Much of this data has potential value, but needs to be transferred, processed, handled, stored and analyzed.

The most common current model sees all data transferred from the connected device to a data center for storage and analysis. Of course, not all the data from the device will be useful or valuable, so transferring and storing all of it wastes significant bandwidth and memory, with the associated impacts on energy consumption and cost.

Enter edge computing. As the name suggests, edge computing puts greater processing power at the “edge” of the network or, in more tangible terms, within the connected device itself. This allows for a level of data analytics by the device itself – in our terms a network video camera – and therefore the transfer of only meaningful, useful data, or data that requires further analysis (for example, alerting officials of exceptions at border control requiring passport verification). The benefits in bandwidth and storage requirements are obvious, let alone increased efficiency in operations.
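
As a rough sketch of what “transfer only meaningful data” could look like on the device, the example below filters detections on the camera before anything is sent on. Everything in it is an assumption made for illustration: the `Detection` record, the labels of interest and the confidence threshold are hypothetical, and a real camera would apply whatever rules its analytics software and operators define.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    frame_id: int
    label: str         # e.g. "person", "vehicle", "animal"
    confidence: float  # 0.0 to 1.0, as reported by the on-camera model

# Hypothetical policy: which object classes matter, and how sure we must be.
LABELS_OF_INTEREST = {"person", "vehicle"}
MIN_CONFIDENCE = 0.8

def select_for_upload(detections):
    """Return only the detections worth sending to the data center."""
    return [
        d for d in detections
        if d.label in LABELS_OF_INTEREST and d.confidence >= MIN_CONFIDENCE
    ]

detections = [
    Detection(1, "animal", 0.95),
    Detection(2, "person", 0.91),
    Detection(3, "person", 0.42),
    Detection(4, "vehicle", 0.88),
    Detection(5, "leaf", 0.99),
]

to_upload = select_for_upload(detections)
print([d.frame_id for d in to_upload])  # prints [2, 4]
print(f"{len(to_upload)} of {len(detections)} detections sent")
```

In this toy run only two of the five detections leave the device. The low-confidence person in frame 3 is simply dropped here; a real deployment might instead forward such uncertain cases for further analysis, as described above.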

The vision for video analytics

We’ve covered a lot of ground, and it’s worth pulling the strands together. For the future of video analytics, the aim is to combine the power of computers and their increasing ability to learn and understand with the unique decision-making abilities of human beings. In many ways, “augmented intelligence” is a better term than “artificial intelligence”. Humans supported by computing power will be the winning combination. We will see smarter cameras, more able to effectively analyze situations through AI, delivering the most relevant information to operators upon which to make accurate, rapid and effective decisions regarding the appropriate response.

You can read more about what Axis believes the near future holds in this post on 2019’s key industry trends, and more about the vision for video analytics below.

Vision for video analytics

 

Albert Unterberger is the owner at Unterberger Consult. With extensive experience in the security industry working as a specialist for video surveillance, he is now driving strategies for the development of high-tech video products and solutions, as well as the positioning of products and solutions in vertical market segments. Visit http://www.unterberger-consult.com/aboutme/ for more information.