Identification and recognition

Required resolution

The traditional way of defining the resolution requirements for an analog CCTV system was to specify the height of the observed object as a percentage of the vertical image. Different surveillance objectives required different percentages.

For example, detecting the presence of a person in a scene might require that the person's height occupies 20% of the image height. Recognizing a person might require 40%, and identifying a person might require 140% or more (i.e. the person is taller than the image).

Figure 1: The height of the same person occupying 20%, 40% and 140% of the image.

Network video cameras today, however, offer a much wider range of resolutions, and using the percentage requirements from the analog world is no longer practical. Instead, we now use pixels when specifying resolution requirements. The table below shows how Axis defines these requirements.

Operational requirement                    Horizontal pixels/face   Px/cm         Px/inch
Identification (challenging conditions)    80 px/face               5 px/cm       12.5 px/in
Identification (good conditions)           40 px/face               2.5 px/cm     6.3 px/in
Recognition                                20 px/face               1.25 px/cm    3.2 px/in
Detection                                  4 px/face                0.25 px/cm    0.6 px/in


Table 1: The Axis definition of the requirements for detection, recognition and identification.
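The thresholds in Table 1 can be turned into a simple lookup. The sketch below is only an illustration: the function names are hypothetical, and the 16 cm face width is an assumption inferred from the ratio of the table's columns (80 px/face divided by 5 px/cm), not a value stated in the text.

```python
from typing import Optional

# Thresholds from Table 1, in pixels per centimetre, strictest first.
REQUIREMENTS_PX_PER_CM = [
    ("Identification (challenging conditions)", 5.0),
    ("Identification (good conditions)", 2.5),
    ("Recognition", 1.25),
    ("Detection", 0.25),
]

# Assumption: face width implied by Table 1 (80 px/face / 5 px/cm = 16 cm).
FACE_WIDTH_CM = 16.0


def px_per_face(px_per_cm: float) -> float:
    """Convert a pixel density in px/cm to horizontal pixels across a face."""
    return px_per_cm * FACE_WIDTH_CM


def highest_requirement_met(px_per_cm: float) -> Optional[str]:
    """Return the strictest Table 1 requirement satisfied, or None."""
    for name, threshold in REQUIREMENTS_PX_PER_CM:
        if px_per_cm >= threshold:
            return name
    return None


if __name__ == "__main__":
    print(px_per_face(2.5))
    print(highest_requirement_met(3.0))
```

For instance, a density of 3 px/cm clears the 2.5 px/cm threshold but not the 5 px/cm one, so it would qualify for identification only in good conditions.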

For a detailed discussion on the resolution requirements for identification, recognition and detection, see the Perfect pixel count tutorial.

Other criteria will be valid for other objects, such as license plates, where typical recommendations are that the height of the letters and digits should be 15 pixels (corresponding to approx. 200 pixels/m) to ensure legibility.

When determining the resolution needed in order to use camera footage as evidence in court, it is also important to take into account legal and regulatory requirements.

The resolution of a captured scene is determined by the camera resolution and the size of the scene. Select a camera and lens that will allow the field of view to match the scene width at the desired distance from the camera.

Camera’s horizontal resolution   Focal length    Maximum distance   Maximum scene width
2592 pixels                      2.8 – 8 mm      9 m                5.2 m
1280 pixels                      3.3 – 12 mm     6 m                2.6 m
1920 pixels                      5.1 – 51 mm     41 m               3.8 m
736 pixels                       3.3 – 119 mm    50 m               1.5 m
1280 pixels                      4.4 – 132 mm    67 m               2.6 m


Table 2: Examples of maximum distances for identification (500 px/m or 80 pixels/face).
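The scene widths in Table 2 are consistent with dividing the horizontal resolution by the required 500 px/m. The sketch below shows that arithmetic, plus a rough distance estimate using a simple pinhole-camera approximation. The function names are hypothetical, and the sensor width is an assumed input that the table does not list, so the distance figure is illustrative only and is not meant to reproduce the table's values.

```python
def max_scene_width_m(horizontal_pixels: int, px_per_m: float = 500.0) -> float:
    """Widest scene (in metres) that still meets the required pixel density."""
    return horizontal_pixels / px_per_m


def max_distance_m(scene_width_m: float, focal_length_mm: float,
                   sensor_width_mm: float) -> float:
    """Distance at which the field of view equals scene_width_m.

    Pinhole approximation: scene_width / distance = sensor_width / focal_length.
    sensor_width_mm is an assumption supplied by the caller.
    """
    return scene_width_m * focal_length_mm / sensor_width_mm


if __name__ == "__main__":
    width = max_scene_width_m(2592)
    print(round(width, 1))
    # Hypothetical 4.8 mm wide sensor at the longest focal length (8 mm).
    print(max_distance_m(width, 8.0, 4.8))
```

For real planning, the lens calculator tools mentioned in the text are the better source, since they account for the actual sensor and lens of each camera.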

Axis Lens Calculators and the Axis Product Selector are useful tools that help you find a suitable camera and focal length. For advanced users, a pixel and distance calculator spreadsheet is also available.

Because the maximum scene width at a given pixel density depends only on the camera's horizontal resolution, cameras with higher resolutions can cover larger areas. For example, if a scene 7 m wide requires five cameras at 4CIF resolution, these could be replaced by two cameras at HDTV 1080p resolution (1920 x 1080 pixels). A higher-resolution camera can also be used to give a better overview, by covering a larger scene while maintaining the required pixel density.
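The camera counts in this example can be checked with a short calculation. The sketch below assumes the identification density of 500 px/m from Table 2 and views that sit side by side without overlap; the function name is hypothetical.

```python
import math


def cameras_needed(scene_width_m: float, horizontal_pixels: int,
                   px_per_m: float = 500.0) -> int:
    """Number of side-by-side cameras needed to cover the scene width.

    Assumes no overlap between adjacent views.
    """
    width_per_camera_m = horizontal_pixels / px_per_m
    return math.ceil(scene_width_m / width_per_camera_m)


if __name__ == "__main__":
    print(cameras_needed(7.0, 704))   # 4CIF, 704 pixels wide
    print(cameras_needed(7.0, 1920))  # HDTV 1080p, 1920 pixels wide
```

In practice some overlap between adjacent views is usually wanted, which can raise the count.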

Figure 2: A comparison of various resolutions.
1: 4 CIF (704x576)
2: SVGA (800x600)
3: HDTV 720p (1280x720)
4: HDTV 1080p (1920x1080)
5: 3 MP (2048x1536)
6: 5 MP (2592x1944)
7: 4K (3840x2160) 

The greater the depth of field, the larger the area in which persons or objects are in focus. The chances of identification increase with a larger depth of field, which is determined by the iris opening, the focal length and the distance between the camera and the subject.

The depth of field increases as the iris opening gets smaller, which means that good lighting can help increase the depth of field. The P-Iris feature in some Axis cameras will adjust the iris to optimize the depth of field for various lighting conditions.

You can learn more about P-Iris in the white paper "P-Iris. New iris control improves image quality in megapixel and HDTV network cameras."

Using a shorter focal length will also increase the depth of field. Cameras with higher resolutions can capture scenes using shorter focal lengths, whilst still maintaining resolution requirements.
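The influence of iris opening, focal length and subject distance on depth of field can be illustrated with the standard thin-lens formulas based on the hyperfocal distance. This is a sketch and not an Axis tool: the function names are hypothetical, and the circle of confusion of 0.005 mm is an assumed value that in practice depends on the sensor's pixel size.

```python
import math


def hyperfocal_mm(focal_length_mm: float, f_number: float, coc_mm: float) -> float:
    """Hyperfocal distance H = f^2 / (N * c) + f, in millimetres."""
    return focal_length_mm ** 2 / (f_number * coc_mm) + focal_length_mm


def depth_of_field_m(focal_length_mm: float, f_number: float,
                     subject_distance_m: float, coc_mm: float = 0.005):
    """Return (near_limit_m, far_limit_m) of acceptable focus.

    coc_mm (circle of confusion) is an assumed value, not taken from the text.
    """
    s = subject_distance_m * 1000.0
    h = hyperfocal_mm(focal_length_mm, f_number, coc_mm)
    near = h * s / (h + (s - focal_length_mm))
    denom = h - (s - focal_length_mm)
    # At or beyond the hyperfocal distance, focus extends to infinity.
    far = math.inf if denom <= 0 else h * s / denom
    return near / 1000.0, far / 1000.0


if __name__ == "__main__":
    for f_mm, n in [(8.0, 2.0), (8.0, 4.0), (4.0, 2.0)]:
        near, far = depth_of_field_m(f_mm, n, subject_distance_m=2.0)
        print(f"f={f_mm} mm, f/{n}: in focus from {near:.2f} m to {far:.2f} m")
```

Running the loop shows the two effects described above: stopping down from f/2 to f/4 widens the in-focus range, and halving the focal length at the same f-number widens it further.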

Most lenses exhibit some degree of distortion, often in the form of barrel distortion. This is due to the lens magnification being smaller at the edges of the field of view than at the center of the image. The effect is that objects near the edge appear closer to the center than they would in an undistorted image. Objects of the same size will therefore cover fewer pixels when close to the edge than they would if they were closer to the center. This means that objects close to the edge of the field of view need to be closer to the camera in order to fulfill the requirements for minimum resolution.

The effect of barrel distortion is often much more pronounced at short focal lengths, making wide angle lenses less suitable for identification purposes.
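The loss of pixel density toward the edge can be illustrated with a one-parameter radial distortion model, r_d = r_u (1 + k1 r_u^2), where a negative k1 gives barrel distortion. This model is a common approximation chosen here for illustration, not something specified in the text. The coefficient k1 = -0.1 and the function names are hypothetical, and radii are normalized so that 1.0 is the edge of the field of view.

```python
def distorted_radius(r_u: float, k1: float) -> float:
    """Radius in the distorted image for an undistorted radius r_u."""
    return r_u * (1.0 + k1 * r_u ** 2)


def radial_magnification(r_u: float, k1: float) -> float:
    """Local radial magnification relative to the image center.

    This is the derivative of distorted_radius with respect to r_u.
    """
    return 1.0 + 3.0 * k1 * r_u ** 2


def effective_px_per_m(center_px_per_m: float, r_u: float, k1: float) -> float:
    """Approximate pixel density along the radial direction at radius r_u."""
    return center_px_per_m * radial_magnification(r_u, k1)


if __name__ == "__main__":
    k1 = -0.1  # hypothetical barrel distortion coefficient
    for r in (0.0, 0.5, 1.0):
        print(r, distorted_radius(r, k1), effective_px_per_m(500.0, r, k1))
```

Under these assumed numbers, a scene that delivers 500 px/m at the center delivers noticeably less at the edge, so an object there would need to be closer to the camera to meet the same requirement.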