In addition to video streams, Axis devices can also deliver metadata streams. While the purpose of video streams is self-explanatory, the purpose of metadata might not be.

Metadata is the foundation for gathering intelligence from video. It assigns digital meaning to each video frame by describing the key details in the scene. Using metadata, you can quickly find, evaluate, and act on what is important in large amounts of video. This is why metadata has increasingly become an essential part of efficient security, safety, and business operations.


Visualization of analytics metadata generated by Axis cameras available on AXIS OS 11.11 or later.

Analytics metadata stream

The analytics metadata stream describes the events, content, and characteristics of a scene, including Event data and Scene metadata.

Car being detected within defined area for 1 minute

Event data

Logical rules based on what has occurred in the scene, for example, when an object crossing a line is counted or an object is dwelling in an area.

A yellow bus, grey car and white truck being detected on a street

Scene metadata

Describes which objects are in the scene (such as humans and vehicles) and their attributes.


AXIS Scene Metadata enhances scene understanding, providing critical details such as object classes (human or vehicle), clothing and vehicle colors, license plates, and speed data. This in turn enables rapid decision-making, automated actions, and simplified search. Seamlessly integrated with third-party solutions through standardized methods, and delivered directly from Axis cameras, AXIS Scene Metadata helps reduce system and operational costs while ensuring efficiency and precision.

AXIS Scene Metadata includes information about the object types as well as other specific attributes such as clothing and vehicle color, license plate information, speed, location, and timestamp. AI-based analytics, featuring object detection and classification, lets you filter the attained metadata, trigger events, and examine specific elements of the scene.

A complete list of supported object classes and attributes is available below, but please note that the detection and classification capabilities are camera dependent. For detailed information about which object classes and attributes a specific camera model supports, please refer to the camera's data sheet.

Object Classes

  • Human
  • Vehicle
    • Car
    • Bus
    • Truck
    • Bike (Motorcycle/Bicycle)
    • Other
  • Human Face
  • License Plate

Object Attributes

  • Clothing Color (Upper/Lower)
  • Vehicle Color

In addition to the listed object attributes, the analytics metadata stream may also provide information about other properties of the detected objects such as likelihood, duration, shape, images, speed*, geocoordinates* and position. 

*Requires radar or radar-video fusion camera integration.

It is also worth noting that AXIS Scene Metadata generates information such as shape and position about non-classified moving objects in the video scene as well. 

The value of analytics metadata

Analytics metadata not only provides details about objects in a scene. It also provides context to events and allows large amounts of footage to be quickly sorted and searched. This enables functions that can be broadly categorized into the areas of post-event forensic search, real-time use, and identification of trends, patterns, and insights.

illustration metadata consumers

The main consumers of analytics metadata can be summarized as follows:

1) Edge applications

2) Real-time alarm notification and post-event forensic searching within a Video Management System

3) Statistical analysis and reporting leveraging IoT and business intelligence platforms, such as data visualization dashboards

4) Advanced analytics requiring additional processing power 

Axis devices generate analytics metadata that conforms to ONVIF Profile M, streamed over RTSP, to support use-cases related to post-event forensic search. However, this metadata is also accessible through alternative communication protocols and file formats, enabling straightforward integration with a wide variety of systems covering a large array of use-cases. This guide provides important information such as suggested architecture, development examples, and recommended considerations when designing a solution that consumes this analytics metadata.

This guide also assumes that you are familiar with the concept of video metadata and, more specifically, with the capabilities that Axis cameras possess for detecting classified objects in a scene. To better understand the added value of video metadata and to get an introduction to the capabilities of Axis cameras in this context, consider visiting the AXIS Scene Metadata product page and reading our white paper if you haven’t done so already.

Plan your integration

Axis devices with AXIS OS 11.11 or later support two different methods of generating analytics metadata, each being designed and optimized to support specific use-cases, such as real-time alarm notifications or post-event forensic search and identification of trends, patterns, and insights. To retrieve information regarding the metadata analytics producers on the Axis camera and the kind of metadata they can produce, please refer to the Analytics Metadata Producer Configuration API available in the VAPIX Library.

Consider whether your application requires the device to deliver the information in real-time when selecting which method is most suitable to fulfill your specific use-case. In the following section, we will take a deeper dive into each of the available methods.

Frame-by-frame metadata

The Analytics Scene Description producer generates metadata on a frame-by-frame basis at a frequency of 10 times a second.

Each metadata frame contains information about the moving objects in the scene at that specific point in time, as exemplified below:

Consolidated metadata

The Analytics Object Description producer generates metadata based on the track of a detected object, meaning that it can deliver metadata frames when an object has entered and left the video scene. The information gathered during the lifetime of the object is combined and included in each metadata frame that is delivered by the camera.

Frame-by-frame metadata example

Frame 1 contains Object A and Object B as detected in the scene. Object A is classified as a Human wearing red clothing, while Object B is classified as a Human wearing blue clothing.

Consolidated metadata example

Frame 1 contains only information about Object A: when the object was first and last detected, a summary of its trajectory, and the attributes that the camera was able to detect during the course of the track. In this case, Object A had a 33% likelihood of wearing red clothing and a 67% likelihood of wearing blue clothing, as the camera detected both colors during the track.

Frame-by-frame metadata example

In Frame 2, the camera now determines that Object A is actually wearing blue clothing and that Object B is now wearing yellow clothing. They are the same objects as in Frame 1, but with different color attributes, which is also reflected in the metadata output.

Consolidated metadata example

Similarly, Frame 2 contains all known information about Object B.

Even if the data is being generated based on detected objects instead of on a frame-by-frame basis, it can contain the same information with regards to object trajectory and detected object classes and attributes.

Frame-by-frame metadata example

In Frame 3, Object B is no longer present in the scene and the camera can only track Object A in the scene, which is still a Human wearing blue clothing.
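The relationship between the two methods can be sketched in a few lines of Python. This is only an illustration under a simplifying assumption: likelihoods are computed as plain observation frequencies, which is not necessarily how the camera derives them. The `frames` structure and the `consolidate` helper are hypothetical and only mirror the three example frames above.

```python
from collections import Counter, defaultdict

# Hypothetical frame-by-frame observations mirroring the example above:
# each frame maps an object ID to the clothing color seen in that frame.
frames = [
    {"A": "red", "B": "blue"},     # Frame 1
    {"A": "blue", "B": "yellow"},  # Frame 2
    {"A": "blue"},                 # Frame 3 (Object B has left the scene)
]


def consolidate(frames):
    """Summarize per-object color likelihoods as observation frequencies."""
    counts = defaultdict(Counter)
    for frame in frames:
        for object_id, color in frame.items():
            counts[object_id][color] += 1
    summary = {}
    for object_id, colors in counts.items():
        total = sum(colors.values())
        summary[object_id] = {color: n / total for color, n in colors.items()}
    return summary


summary = consolidate(frames)
for object_id, colors in sorted(summary.items()):
    print(object_id, {color: f"{p:.0%}" for color, p in colors.items()})
```

For Object A, one red observation out of three gives roughly 33% red and 67% blue, which matches the consolidated example.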

When to use frame-by-frame metadata

This method could, for example, be suitable for an edge application running on the Axis camera to trigger real-time events based on the content of the metadata, e.g. when a yellow vehicle is detected and an access gate needs to be opened. 
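A rule like the yellow-vehicle example could look roughly as follows. This is a minimal sketch, not an ACAP implementation: the frame layout, the field names (`objects`, `class`, `color`, `likelihood`) and the threshold are assumptions made for illustration, and the real field names should be taken from the sample data for the chosen producer.

```python
# Hypothetical, simplified frame structure; real field names depend on the
# producer and output format and should be taken from the sample data.
def should_open_gate(frame, min_likelihood=0.7):
    """Return True if any object in the frame is a likely yellow vehicle."""
    for obj in frame.get("objects", []):
        is_vehicle = obj.get("class") == "Vehicle"
        is_yellow = obj.get("color") == "Yellow"
        if is_vehicle and is_yellow and obj.get("likelihood", 0.0) >= min_likelihood:
            return True
    return False


frame = {
    "objects": [
        {"class": "Human", "color": "Red", "likelihood": 0.9},
        {"class": "Vehicle", "color": "Yellow", "likelihood": 0.8},
    ]
}
print(should_open_gate(frame))
```

Because each frame is evaluated on its own, a rule of this kind can react as soon as the object appears, which is the main reason to prefer frame-by-frame metadata for real-time triggers.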

When to use consolidated metadata

This method of generating metadata is best suited for non-real-time applications, such as post-incident forensic search or statistical analysis, as it eliminates the need to process and store irrelevant information and greatly reduces the amount of logic necessary to develop powerful applications based on object classification metadata.

Best snapshot

The analytics metadata streams can also be configured to include cropped images of detected classified objects using the Best Snapshot feature.

The image is represented as a base-64 encoded string within the metadata output. For examples, please refer to sample data frames which include images in the Ways of accessing the metadata section. The Best Snapshot feature must be enabled manually by issuing the following request to the camera:

Enable the Best Snapshot feature






JSON Input parameters

    {
        "data": {
            "enabled": true
        }
    }

Parse the JSON Response

    {
        "status": "success"
    }

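The JSON handling around this feature can be sketched in Python. The request URL is not shown in this section, so the sketch deliberately leaves out the HTTP call and only covers building the request body, checking the response, and decoding a base-64 image string. The helper names are hypothetical, and the encoded bytes in the usage lines are a stand-in, not a real snapshot.

```python
import base64
import json


def build_enable_payload():
    """Build the JSON body that enables Best Snapshot, as shown above."""
    return json.dumps({"data": {"enabled": True}})


def is_success(response_text):
    """Return True if the JSON response reports a successful status."""
    return json.loads(response_text).get("status") == "success"


def decode_snapshot(encoded_image):
    """Decode a base-64 encoded image string into raw bytes."""
    return base64.b64decode(encoded_image)


print(build_enable_payload())
print(is_success('{"status": "success"}'))
# Stand-in bytes for illustration, not a real image from a camera.
encoded = base64.b64encode(b"example-bytes").decode("ascii")
print(decode_snapshot(encoded))
```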
Ways of accessing the metadata

In this section, we will provide examples of how different consumer types are capable of accessing and consuming AXIS Scene Metadata.

1) Edge applications

AXIS Camera Application Platform (ACAP) is an open application platform from Axis. It provides a development platform for software-based solutions and systems built around Axis devices. ACAP is available for various types of Axis products such as cameras, speakers and intercoms.

The ACAP Native SDK is targeted towards users that want to develop plug-in style, event generating applications that fit well into a Video Management System centric environment. From AXIS OS 11.9, ACAP applications can consume AXIS Scene Metadata leveraging the Message Broker to further apply logical filters and rules to the information about the object in the scene in order to, for example, trigger actions based on defined thresholds or specific behaviors.


This example showcases how an ACAP application can consume frame-by-frame or consolidated analytics metadata using the Message Broker API.

The available topics to subscribe to are:

Available | Output | Topic | Sample Data
AXIS OS 11.9 or later | Frame by frame | com.axis.analytics_scene_description.v0.beta | examples_json_frame_based.json
AXIS OS 11.11 or later | Consolidated | com.axis.consolidated_track.v1.beta | examples_json_consolidated.json

Please refer to the following resource for more detailed information on available topics and terminology.
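When a client needs to decide which topics it can expect on a given device, the version requirements listed above can be captured in a small lookup. This is a sketch only: the topic names and minimum versions are copied from this section, while `parse_version` and `available_topics` are hypothetical helpers and do not query the device.

```python
# Topics and minimum AXIS OS versions as listed in this section.
TOPICS = [
    ((11, 9), "Frame by frame", "com.axis.analytics_scene_description.v0.beta"),
    ((11, 11), "Consolidated", "com.axis.consolidated_track.v1.beta"),
]


def parse_version(text):
    """Turn '11.11' or '11.11.65' into (11, 11) for numeric comparison."""
    major, minor = text.split(".")[:2]
    return int(major), int(minor)


def available_topics(os_version):
    """Return the topics whose minimum AXIS OS version is satisfied."""
    version = parse_version(os_version)
    return [topic for minimum, _output, topic in TOPICS if version >= minimum]


print(available_topics("11.9"))
print(available_topics("11.11"))
```

Versions are compared as integer tuples because a plain string comparison would wrongly rank "11.9" above "11.11".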

2) Video management systems

Axis devices send both the analytics metadata and video streams to the video management system (VMS) to enable forensic search integrations. Two examples of these integrations are AXIS Optimizer forensic search for Milestone plugin and AXIS Forensic Search for Genetec.

The Analytics Scene Description metadata stream can be retrieved from an Axis device by opening an RTSP stream that uses the TCP transport protocol according to the following example:

RTSP request | Description | Sample frame
rtsp://ip-address/axis-media/media.amp?analytics=polygon | Video analytics metadata excluding video stream | detections.xm
metadata sample with explanations

1) Information about the type of metadata stream generated by the device.

2) Frame timestamp, which is crucial for syncing metadata with video (or audio) during playback or queries, and the source field.

3) Information for bounding boxes and polygons, represented in the ONVIF coordinate system, which ranges from -1 to 1 on the X and Y axes.

4) Bounding boxes and polygons are currently the same if you use Analytics Scene Description as a source.

5) Represents the color (of the car) and the probability value of the color classification. The color is presented before the object class due to the ONVIF Profile M format. The object class can be found in item 6.

6) Object class and probability value of the object classification, such as vehicle in this example.

7) In addition to the main category, it presents a sub-category, such as a car in this example.
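The following sketch shows how a consumer might read such a frame and convert the normalized coordinates to pixels. Several things here are assumptions: the XML is a hand-written, heavily simplified frame modeled on the ONVIF schema rather than output copied from a device, the Y axis is assumed to point upward (so 1 is the top of the image), and the 1920x1080 resolution is arbitrary. The helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"tt": "http://www.onvif.org/ver10/schema"}

# Hand-written, simplified frame for illustration; not real device output.
SAMPLE = """<tt:MetadataStream xmlns:tt="http://www.onvif.org/ver10/schema">
  <tt:VideoAnalytics>
    <tt:Frame UtcTime="2024-01-01T12:00:00.000000Z" Source="AnalyticsSceneDescription">
      <tt:Object ObjectId="1">
        <tt:Appearance>
          <tt:Shape>
            <tt:BoundingBox left="-0.5" top="0.5" right="0.5" bottom="-0.5"/>
          </tt:Shape>
          <tt:Class>
            <tt:Type Likelihood="0.8">Vehicle</tt:Type>
          </tt:Class>
        </tt:Appearance>
      </tt:Object>
    </tt:Frame>
  </tt:VideoAnalytics>
</tt:MetadataStream>"""


def to_pixels(x, y, width, height):
    """Map normalized (-1..1) coordinates to pixels, assuming Y points up."""
    return round((x + 1) / 2 * width), round((1 - y) / 2 * height)


def parse_frame(xml_text, width, height):
    """Extract object class, likelihood, and pixel bounding box per object."""
    root = ET.fromstring(xml_text)
    results = []
    for frame in root.iterfind(".//tt:Frame", NS):
        for obj in frame.iterfind("tt:Object", NS):
            box = obj.find(".//tt:BoundingBox", NS)
            if box is None:
                continue
            kind = obj.find(".//tt:Class/tt:Type", NS)  # None if unclassified
            left, top = to_pixels(float(box.get("left")), float(box.get("top")), width, height)
            right, bottom = to_pixels(float(box.get("right")), float(box.get("bottom")), width, height)
            results.append({
                "time": frame.get("UtcTime"),
                "object_id": obj.get("ObjectId"),
                "class": kind.text if kind is not None else None,
                "likelihood": float(kind.get("Likelihood")) if kind is not None else None,
                "box_px": (left, top, right, bottom),
            })
    return results


objects = parse_frame(SAMPLE, 1920, 1080)
print(objects)
```

Objects without a class element are kept with `class` set to `None`, since the stream can also describe non-classified moving objects.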

Additional parameters such as "camera=2" can be added to the above request, for example to receive metadata events from a different video channel. This is useful when the Axis device supports more than one video source.
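Building the request URL with optional parameters can be sketched as below. The path and the `analytics` and `camera` parameters come from this section. The `metadata_url` helper and the example address are hypothetical, and the sketch only constructs the URL; opening the RTSP session is left to an RTSP client library.

```python
from urllib.parse import urlencode


def metadata_url(host, camera=None):
    """Build the RTSP URL for the analytics metadata stream."""
    params = {"analytics": "polygon"}
    if camera is not None:
        params["camera"] = camera  # selects a different video channel
    return f"rtsp://{host}/axis-media/media.amp?{urlencode(params)}"


# 192.0.2.10 is a documentation-only example address.
print(metadata_url("192.0.2.10"))
print(metadata_url("192.0.2.10", camera=2))
```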

Please visit the AXIS OS knowledge base for additional information.

3) Second layer of analytics

Some applications require a combination of edge-based and server-based processing to perform more advanced analyses. Pre-processing can be performed on the camera and further processing on a server. Such a hybrid system can facilitate cost-efficient scaling of analytics by streaming only relevant video or images along with metadata to the server.

MQTT is a standard messaging protocol that facilitates efficient and reliable exchange of data between IoT devices and cloud applications. It allows devices (through their MQTT clients) to publish messages to a common MQTT broker (server) that mediates communication with other devices. The broker keeps track of who is publishing what and who wants to see the data, forwarding messages to only the clients that subscribe to the right topic.

In a typical VMS ecosystem, Axis event notifications from devices are traditionally streamed to a single destination via VAPIX/ONVIF API interface using the RTSP streaming protocol. But the same notifications can be distributed using the MQTT protocol via the device’s built-in MQTT client (applicable for devices running AXIS OS 9.80 or a later version). This is possible both within VMS ecosystems and outside of them, and is particularly useful over the internet.

This guide does not provide an overview of the MQTT protocol or specific configuration of the MQTT client on the camera. To find out more about these topics, please visit Device integration with MQTT and VAPIX MQTT Client API.
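As a small illustration of topic-based routing, the sketch below shows how a subscription filter with the standard MQTT wildcards (`+` for one level, `#` for all remaining levels) selects messages. It is a simplified stand-in for what a broker does, it assumes a well-formed filter, and it ignores special cases such as topics starting with `$`. The topic names are hypothetical and are not the topics an Axis device publishes.

```python
def topic_matches(topic_filter, topic):
    """Return True if an MQTT topic matches a subscription filter."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for index, level in enumerate(filter_levels):
        if level == "#":
            return True  # multi-level wildcard matches the rest of the topic
        if index >= len(topic_levels):
            return False  # filter is longer than the topic
        if level != "+" and level != topic_levels[index]:
            return False  # literal level differs
    return len(filter_levels) == len(topic_levels)


# Hypothetical topic names, used only to exercise the matcher.
print(topic_matches("site/+/analytics", "site/camera1/analytics"))
print(topic_matches("site/#", "site/camera1/analytics/scene"))
print(topic_matches("site/+/analytics", "site/camera1/events"))
```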

Analytics metadata can be obtained through the MQTT client on Axis cameras by enabling a feature flag through the Feature Flag Service API. Feature flags can be used to, among other things, toggle experimental features on and off and to enable prototyping of a solution early in the development process in order to gather feedback from developers.

Developers interested in exploring this capability can obtain detailed instructions on how to enable this feature by creating a support helpdesk ticket specifying AXIS Scene Metadata in the product category field. Please include information about your project and use-case in the description field.

Configure the metadata stream

From AXIS OS 10.11 and onwards, it's possible to choose dedicated analytics producers within the Axis camera or to have multiple analytics metadata producers enabled simultaneously. There are currently three methods available to configure the metadata producers within the camera. Additional information about each of these methods can be found below. 

Once the metadata stream has been configured according to your needs, you can access the output using one of the available methods as described in the previous section.

Start implementing

You now possess the necessary information to develop solutions based on the analytics metadata content generated by Axis cameras. The components making up a solution can vary drastically depending on the specific needs of your project.

Safety and security

Analytics metadata can be used in real time to help operators respond quickly to situational changes. It can also provide valuable input to support decision making or enable automated action.

Real-time edge analytics that work with high-quality metadata can help you secure people, sites, and buildings and protect them from intentional or accidental harm. You can rapidly detect, verify, and evaluate threats so they can be efficiently handled.

Operational efficiency

The analytics metadata is typically stored in a database-type component, which can be queried regularly to extract the data of interest and present it visually in a dashboard.

In the case of leveraging cloud services to perform advanced analysis on a detected object, however, the database component might not be required if there is a need to present the data in as close to real-time as possible. 
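The database-backed approach can be sketched with an in-memory SQLite database. The table layout, column names, and sample rows are hypothetical and chosen only for illustration; a production system would likely use a different schema and storage engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE detections (ts TEXT, object_class TEXT, color TEXT)")

# Hypothetical rows, as they might be written by a metadata consumer.
rows = [
    ("2024-01-01T12:00:00Z", "Vehicle", "Yellow"),
    ("2024-01-01T12:00:05Z", "Human", "Blue"),
    ("2024-01-01T12:01:10Z", "Vehicle", "White"),
]
conn.executemany("INSERT INTO detections VALUES (?, ?, ?)", rows)


def counts_per_class(connection):
    """Aggregate detections per object class, e.g. for a dashboard widget."""
    cursor = connection.execute(
        "SELECT object_class, COUNT(*) FROM detections "
        "GROUP BY object_class ORDER BY object_class"
    )
    return dict(cursor.fetchall())


counts = counts_per_class(conn)
print(counts)
```

A dashboard would typically run queries of this kind on a schedule, which is why this pattern suits trend analysis better than use-cases that need data in near real time.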

An example implementation is represented below.

Analytics schematic

1) An Axis camera with MLPU or DLPU, generating analytics metadata

2) The analytics metadata is transmitted to consumers through the available communication protocols

3) The analytics metadata is further processed and stored, then consumed by different applications

4a) Collected data is visualized in a graphical dashboard to analyze trends and gain insights

4b) Access to a sensitive area is restricted based on license plate information included in the analytics metadata
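Step 4b could be approached with a simple allowlist check, sketched below. The plate values and helper names are hypothetical, and a real access control system would also need to consider detection likelihood, logging, and how the allowlist is managed.

```python
def normalize_plate(text):
    """Uppercase the plate and drop spaces, dashes, and other separators."""
    return "".join(ch for ch in text.upper() if ch.isalnum())


def is_allowed(plate, allowlist):
    """Return True if the normalized plate is on the normalized allowlist."""
    allowed = {normalize_plate(entry) for entry in allowlist}
    return normalize_plate(plate) in allowed


# Hypothetical plates for illustration.
ALLOWLIST = ["ABC 123", "XYZ-789"]
print(is_allowed("abc-123", ALLOWLIST))
print(is_allowed("DEF 456", ALLOWLIST))
```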

Release updates

You can view upcoming changes in the AXIS OS Portal.

AXIS OS 12.0

Changes in the analytics metadata stream. Read more

AXIS OS 11.9

Introduced the Message Broker API (in beta). This API lets you build applications that can easily access analytics metadata of detected objects in the scene.

AXIS OS 11.5

Upper and lower clothing color has been added as an attribute to the human object class within the analytics scene description metadata stream. Read more

AXIS OS 11.1

Vehicle color is included in the Axis analytics metadata stream. Read more

AXIS OS 11.0

The source parameter in the Axis analytics metadata stream changed to "AnalyticsSceneDescription". Read more

AXIS OS 10.11

For new products, and after resetting devices to factory defaults, AXIS Object Analytics becomes the default metadata producer. Previously configured systems will continue to work as before after you upgrade to AXIS OS 10.11.

  • AXIS Object Analytics will also deliver objects that cannot be classified. See sample package.
  • Added support for HumanFace, LicensePlate, Bike, Bus, Car and Truck as new object detection classes to machine learning cameras. See affected products in the AXIS OS portal.
  • Added information about supported features for metadata producers in the web GUI (Axis Device Assistant). See how it looks.
  • Corrected an issue where changing rotation required restart of the device for the metadata in Axis Object Analytics. See affected products in the AXIS OS portal.
AXIS OS 10.10

Restructure of AXIS Object Analytics and Motion Object Tracking Engine (MOTE) metadata producers. AXIS Object Analytics ACAP is no longer the producer for object classification metadata.

The metadata from the Axis Object Analytics provider will add two new classifications and associated bounding boxes for: 

  • HumanFace
  • LicensePlate (location of the license plate on the image)

Note: The additional classifications will only be available on DL cameras.

For releases after AXIS OS 10.10, it is good to be aware that:

  • New metadata producers might be added
  • The factory default metadata producer might change
  • The contents of the metadata from the producers that already exist may evolve with new classifications and other additional data

For more information, see changes in Metadata Analytics stream AXIS OS 10.10.

AXIS OS 10.9

Motion Object Tracking Engine (MOTE) data got a source name. The main difference is Source=VideoMotionTracker, the rest is the same. See example package.

AXIS OS 10.6

With the release of AXIS OS 10.6, you can retrieve object classification data, e.g. humans and vehicles.

To be able to retrieve object classification data, you need a device with MLPU or DLPU. Please refer to the Product selector...


ONVIF Profile M