Audio API

Description

Axis products with audio capabilities usually support two-way audio, that is, the Axis product can both transmit and receive audio. Using a built-in or external microphone, the Axis product captures audio and transmits it to the network. Using built-in or external speakers, the Axis product plays audio received from the network. Most products support full duplex and half duplex transmission modes, but they can also be configured to use simplex modes in which the product can only receive or only transmit audio. As audio surveillance is restricted in many countries, audio streaming can always be disabled.

Note
For information about audio clips, see Media clip API.

Audio from the Axis product can be streamed together with H.264/MJPEG video over RTP/RTSP (including RTP over RTSP over HTTP), together with MJPEG video over HTTP, or on its own. When streaming over RTP/RTSP, audio and video are synchronized. Supported audio compression standards are product-dependent but usually include G.711, G.726, AAC and Opus. For sample rates and bit rates, see Audio compression formats. AAC should be streamed over RTP/RTSP because streaming AAC over HTTP is not part of the standard. AAC and Opus support stereo audio in addition to mono.

Audio settings are defined by parameters in the Audio and AudioSource groups. Many audio settings are global, but the Audio.A# groups can be used to set up different audio configurations for use in different streams. When requesting an HTTP or RTSP stream, the audio argument determines whether audio is streamed. If the audio argument is omitted, the parameter settings determine whether audio is included in the stream.

To enable audio, parameters Audio.A#.Enabled and AudioSource.A#.AudioSupport must both be set to yes.

Audio modes

Axis network video products can support some or all of the following audio modes:

Full duplex

Simultaneous two-way audio. Multiple clients can receive audio, but only one client at a time can transmit audio.

Half duplex

Two-way audio, but only in one direction at a time.

Simplex – Speaker only

One-way audio where audio is transmitted from the client to the Axis product.

Simplex – Microphone only

One-way audio where audio is transmitted from the Axis product to the client. Multiple clients can receive audio at the same time.
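
The four modes can be summarised as a small lookup table. The sketch below is illustrative only: the keys full, half, post and get are the Audio.DuplexMode values described later in this document, while the ModeCapabilities type and the allowed_directions helper are hypothetical names introduced here.

```python
from typing import NamedTuple


class ModeCapabilities(NamedTuple):
    product_to_client: bool  # clients can receive audio from the product
    client_to_product: bool  # a client can send audio to the product
    simultaneous: bool       # both directions at the same time


# One entry per audio mode described above.
AUDIO_MODES = {
    "full": ModeCapabilities(True, True, True),     # Full duplex
    "half": ModeCapabilities(True, True, False),    # Half duplex
    "post": ModeCapabilities(False, True, False),   # Simplex, speaker only
    "get": ModeCapabilities(True, False, False),    # Simplex, microphone only
}


def allowed_directions(mode: str) -> ModeCapabilities:
    """Look up which audio directions a mode allows."""
    try:
        return AUDIO_MODES[mode]
    except KeyError:
        raise ValueError(f"unknown audio mode: {mode!r}") from None


for name in AUDIO_MODES:
    print(name, allowed_directions(name))
```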

Audio compression formats

Axis network video products can support all or some of the following audio compression formats:

Compression | MIME type | Bit rate (kbit/s) | Sample rate (kHz)
G.711 µ-law | audio/basic | 64 | 8
Axis µ-law 128 | audio/axis-mulaw-128 (1) | 128 | 16
G.726 | audio/G726-32 | 32 | 8
G.726 | audio/G726-24 | 24 | 8
AAC | audio/mpeg4-generic | 8, 12, 16, 24, 32 | 8
AAC | audio/mpeg4-generic | 12, 16, 24, 32, 48, 64 | 16
AAC | audio/mpeg4-generic | 16, 24, 32, 48, 64, 128 | 32
AAC | audio/mpeg4-generic | 32, 48, 64, 128 | 44.1
AAC | audio/mpeg4-generic | 32, 48, 64, 128 | 48
Opus | audio/opus | 8, 12, 16, 24, 32 | 8
Opus | audio/opus | 12, 16, 24, 32, 48, 64 | 16
Opus | audio/opus | 32, 48, 64, 128 | 48
LPCM | audio/L24 | 384, 768, 1058.4, 1152 | 16, 32, 44.1, 48
  1. Variant of G.711 µ-law with doubled sample rate and bit rate. Can be used for client-to-server communication.
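
For the sample-based formats, the bit rates in the table follow from the sample rate multiplied by the number of bits per sample. The sketch below checks that arithmetic. The bits-per-sample values are not stated in this document; they are the usual figures for these codecs (8 bits for µ-law, 4 and 3 bits for G.726 at 32 and 24 kbit/s, 24 bits for L24), and mono audio is assumed. AAC and Opus are left out because their bit rates are configurable rather than derived this way. The function name bit_rate_kbit is hypothetical.

```python
# Bits per sample, keyed by MIME type (assumed values, see the note above).
BITS_PER_SAMPLE = {
    "audio/basic": 8,            # G.711 µ-law
    "audio/axis-mulaw-128": 8,   # Axis µ-law 128
    "audio/G726-32": 4,          # G.726, 32 kbit/s
    "audio/G726-24": 3,          # G.726, 24 kbit/s
    "audio/L24": 24,             # LPCM
}


def bit_rate_kbit(mime_type: str, sample_rate_hz: int, channels: int = 1) -> float:
    """Return the bit rate in kbit/s for a sample-based format."""
    return BITS_PER_SAMPLE[mime_type] * sample_rate_hz * channels / 1000


print(bit_rate_kbit("audio/basic", 8000))            # G.711 µ-law at 8 kHz
print(bit_rate_kbit("audio/axis-mulaw-128", 16000))  # doubled sample rate
for rate in (16000, 32000, 44100, 48000):
    print(rate, bit_rate_kbit("audio/L24", rate))    # the four LPCM rows
```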

Prerequisites

Identification

The Audio API is supported if:

Property

Properties.API.HTTP.Version=3

Property

Properties.Audio.Audio=yes

Firmware

5.00 and later

The Properties.Audio parameters list the supported audio capabilities.

Properties.Audio
Parameter | Valid values | Description
Audio | yes, no | yes = Audio is supported. no = Audio is not supported.
Format | A string | Comma-separated list of supported audio encoding formats. g711 = G.711 µ-law is supported. g726 = G.726 is supported. aac = AAC is supported. opus = Opus is supported. lpcm = LPCM is supported.
DuplexMode | A string | Comma-separated list of supported duplex modes. full = Full duplex mode is supported. half = Half duplex mode is supported. post = Simplex post mode is supported. Audio can be sent to the Axis product. get = Simplex get mode is supported. Audio can be retrieved from the Axis product.
InputType | A string | Comma-separated list of supported input types. mic = Microphone input is supported. line = Line input is supported.
Decoder.Format | A string | Comma-separated list of supported audio decoding formats. g711 = G.711 µ-law is supported. axis-mulaw-128 = Axis µ-law 128 is supported. g726 = G.726 is supported. opus = Opus is supported.
Source.A#.Input | yes, no | yes = AudioSource.A# has audio input. no = AudioSource.A# does not have audio input.
Source.A#.Output | yes, no | yes = AudioSource.A# has audio output. no = AudioSource.A# does not have audio output.
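
A client could read these properties into a dictionary before deciding which formats and modes to use. The sketch below is a minimal parser under one assumption that this document does not spell out: that the parameter list is returned as plain text with one name=value pair per line, optionally prefixed with root. The sample text and the helper names parse_param_list and list_values are hypothetical.

```python
from typing import Dict, List


def parse_param_list(text: str) -> Dict[str, str]:
    """Parse name=value lines into a dictionary, dropping a leading 'root.'."""
    params: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # ignore blank or malformed lines
        name, _, value = line.partition("=")
        if name.startswith("root."):
            name = name[len("root."):]
        params[name] = value
    return params


def list_values(params: Dict[str, str], name: str) -> List[str]:
    """Split a comma-separated property such as Properties.Audio.Format."""
    value = params.get(name, "")
    return [item.strip() for item in value.split(",") if item.strip()]


# Hypothetical response text, for illustration only.
sample = (
    "root.Properties.Audio.Audio=yes\n"
    "root.Properties.Audio.Format=g711,g726,aac\n"
    "root.Properties.Audio.DuplexMode=full,half,get,post\n"
)
params = parse_param_list(sample)
print(params["Properties.Audio.Audio"])
print(list_values(params, "Properties.Audio.Format"))
```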

Common examples

Enable audio in the Axis product. AudioSource.A0.AudioSupport=yes enables audio from audio source 0. Audio.A0.Enabled=yes enables audio configuration 0.

//myserver/axis-cgi/param.cgi?action=update&AudioSource.A0.AudioSupport=yes
//myserver/axis-cgi/param.cgi?action=update&Audio.A0.Enabled=yes

Request an RTSP stream with video and audio.

rtsp://myserver/axis-media/media.amp?videocodec=h264&audio=1

Request an audio stream over HTTP.

//myserver/axis-cgi/audio/receive.cgi

Limit the maximum number of clients that can receive audio at the same time.

//myserver/axis-cgi/param.cgi?action=update&Audio.MaxListeners=5

Configure the audio source parameters.

//myserver/axis-cgi/param.cgi?action=update
&AudioSource.A0.Name=Dynamic%20Microphone
&AudioSource.A0.AudioEncoding=g726
&AudioSource.A0.InputType=mic
&AudioSource.A0.MicrophonePower=no
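
The multi-line request above is a single URL. A client might assemble it as sketched below, which percent-encodes the space in the name as %20. The param_update_url helper is a hypothetical name, and the http scheme is an assumption, since the examples in this document leave the scheme out.

```python
from typing import Dict
from urllib.parse import quote, urlencode


def param_update_url(host: str, settings: Dict[str, str], scheme: str = "http") -> str:
    """Build a param.cgi update URL from parameter names and values."""
    query = urlencode({"action": "update", **settings}, quote_via=quote)
    return f"{scheme}://{host}/axis-cgi/param.cgi?{query}"


url = param_update_url(
    "myserver",
    {
        "AudioSource.A0.Name": "Dynamic Microphone",
        "AudioSource.A0.AudioEncoding": "g726",
        "AudioSource.A0.InputType": "mic",
        "AudioSource.A0.MicrophonePower": "no",
    },
)
print(url)
```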

Parameters

Audio parameters

The Audio group contains audio parameters used for all audio configurations.

Audio
Parameter | Default value | Valid values | Access control | Description
DuplexMode | Product-dependent | full (1), half (1), get, post (1) | admin: read, write; operator: read, write; viewer: read | The audio mode. full = Full duplex. Simultaneous two-way audio. half = Half duplex. Two-way audio, but only in one direction at a time. get = Simplex. Retrieve audio from the Axis product. post = Simplex. Send audio to the Axis product.
MaxListeners | 10 (1) or 20 (1) | 0 … 20 (1) | admin: read, write; operator: read, write; viewer: read | Maximum number of simultaneous audio clients (does not affect multicast delivery).
ReceiverBuffer (1) | 120 | 0 … 9999 | admin: read, write; operator: read, write | The receiving audio buffer size in milliseconds.
ReceiverTimeout | 1000 | 0 … 9999 | admin: read, write; operator: read, write | The receiving audio timeout in milliseconds. When the Axis product is receiving audio data from a client, the session is terminated if no data is received within this time span.
NbrOfConfigs | Product-dependent | An unsigned integer | admin: read; operator: read; viewer: read | The number of audio configurations, that is, the number of Audio.A# subgroups.
DSCP | 0 | 0 … 63 | admin: read, write; operator: read; viewer: read | The Differentiated Services Codepoint for audio Quality of Service (QoS).
  1. Product/release-dependent. Check the product’s release notes.

Audio configuration parameters

The Audio.A# groups contain settings for different audio configurations. The audio configurations can be used when requesting audio streams.

Note
The # in Audio.A# is replaced by a group number starting from zero, e.g. Audio.A0.
Audio.A#
Parameter | Default value | Valid values | Access control | Description
Enabled | no | yes, no | admin: read, write; operator: read, write; viewer: read | Enable/disable the audio for the specific audio configuration.
HTTPMessageType | singlepart | singlepart, multipart | admin: read, write; operator: read, write; viewer: read | How audio should be streamed. Some proxies require multipart streaming.
Name |  | A string | admin: read, write; operator: read, write | Name of the configuration.
Source | 0 | An integer (1) | admin: read, write; operator: read, write | The audio source that the audio configuration is connected to.
NbrOfChannels (1) | 1 | 1, 2 | admin: read, write; operator: read, write | Number of channels in the audio configuration. 1 = mono audio. 2 = stereo audio.
AlarmLevel (2) | 50 | 0 ... 100 | admin: read; operator: read | Obsolete. Replaced by AudioSource.A#.AlarmLevel. Alarm level in percent of the maximum amplitude of the audio samples. The alarm level is used in event setup. Events can be configured to trigger when the sound level rises above or falls below the alarm level.
AlarmResolution (2) | 50 | 0 ... 100 | admin: read; operator: read | The length of the audio sample used for the audio alarm calculation. The parameter is expressed as a percentage of a block of 1024 samples, e.g. 50% corresponds to 512 samples. The actual sample time is the number of samples divided by the sample rate, e.g. 512 samples at 8 kHz correspond to 64 ms. An audio alarm is generated when the mean level for a sample exceeds the AlarmLevel. A shorter AlarmResolution makes the alarm calculation more sensitive.
AlarmLowLimit (2) | 50 | 0 ... 10000 | admin: read; operator: read | The lowest configurable alarm limit (AlarmLevel=0%) in basis points (1/10000) of the maximum amplitude value.
AlarmHighLimit (2) | 6500 | 0 ... 10000 | admin: read; operator: read | The highest configurable alarm limit (AlarmLevel=100%) in basis points (1/10000) of the maximum amplitude value.
  1. Product/release-dependent.
  2. Obsolete.
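
The AlarmResolution arithmetic described above can be written out as a short calculation. This is a sketch of the stated formula only (a percentage of a 1024-sample block, divided by the sample rate); the function name alarm_window is hypothetical.

```python
from typing import Tuple

BLOCK_SAMPLES = 1024  # block size that AlarmResolution is a percentage of


def alarm_window(resolution_percent: float, sample_rate_hz: int) -> Tuple[float, float]:
    """Return (number of samples, duration in milliseconds) of the alarm window."""
    samples = BLOCK_SAMPLES * resolution_percent / 100
    duration_ms = samples * 1000 / sample_rate_hz
    return samples, duration_ms


# The example from the text: 50% at 8 kHz.
print(alarm_window(50, 8000))  # (512.0, 64.0)
```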

Audio source parameters

The AudioSource group contains settings for the product’s audio sources.

AudioSource
Parameter | Default value | Valid values | Access control | Description
NbrOfSources | 1 (1) | An unsigned integer | admin: read; operator: read; viewer: read | The number of audio sources.
AudioSupport | yes | yes, no | admin: read, write; operator: read; viewer: read | Whether the audio sources should be enabled or not.
  1. Product/release-dependent. Check the product’s release notes.

The AudioSource.A# groups contain settings for the different audio sources. The # is replaced by an integer starting from zero, for example AudioSource.A0.

AudioSource.A#
Parameter | Default value | Valid values | Access control | Description
Name | Audio | A string | admin: read, write; operator: read, write | Name of the audio source.
AudioEncoding | aac (1) | g711 (2), g726 (2), aac (2), opus (2), lpcm (2) | admin: read, write; operator: read, write; viewer: read | The audio codec.
InputType | Hardware-dependent | internal (1), mic, line (1), digital (1) | admin: read, write; operator: read, write | The source from which the audio is captured.
MicrophonePower | yes (1) | yes, no | admin: read, write; operator: read, write | Enable/disable power on the audio input connector.
InputGain | 0 | mute, product-dependent numbers (decimals allowed) | admin: read, write; operator: read, write | Gain (in dB) applied to sound sent from the Axis product.
InputPreGain (1) | high (1) | low, high | admin: read, write; operator: read, write | Pre-amplifier gain.
OutputGain (1) | 0 | mute, product-dependent numbers (decimals allowed) | admin: read, write; operator: read, write | Gain (in dB) applied to sound sent to the Axis product.
SampleRate | Hardware-dependent | 8000 (1), 16000 (1), 32000 (1), 44100 (1), 48000 (1) | admin: read, write; operator: read, write | Clock rate (in Hz) for the audio sampling.
BitRate | Encoder-dependent | g711: 64000; g726: 24000, 32000; aac (8 kHz): 8000, 12000, 16000, 24000, 32000; aac (16 kHz): 12000, 16000, 24000, 32000, 48000, 64000; aac (32 kHz): 16000, 24000, 32000, 48000, 64000, 128000; opus (8 kHz): 8000, 12000, 16000, 24000, 32000; opus (16 kHz): 12000, 16000, 24000, 32000, 48000, 64000; opus (48 kHz): 32000, 48000, 64000, 128000 | admin: read, write; operator: read, write | The output bit rate (in bits per second).
AudioSupport | yes | yes, no | admin: read, write; operator: read; viewer: read | Enable/disable audio from this audio source. If the audio source is turned off with this parameter, no audio will be transmitted even if Audio.A#.Enabled=yes.
InputPort (1) | 1 | An integer | admin: read, write; operator: read, write | Selects which audio input port to use if the device has more than one.
MicrophoneBalanced | no (1) | yes, no | admin: read, write; operator: read, write | Enable/disable balanced audio source.
MicrophonePowerType | electret2_5v (1) | electret, electret3_0v, electret2_5v, electret2_0v, p12, p48, r12 | admin: read, write; operator: read, write | The power type to use for the microphone. Setting a value assumes that MicrophonePower is set to yes.
SpeakerAmp (1) | no | yes, no | admin: read, write; operator: read, write | Enable/disable speaker amplifier.
AlarmLevel | 100 | 0 ... 100 | admin: read, write; operator: read, write; viewer: read | Alarm level for the tns1:AudioSource/tnsaxis:TriggerLevel event. Replaces Audio.A#.AlarmLevel. The alarm level is the audio input level expressed in percent. 0% corresponds to the minimum audio level, which is -90 dBFS for 16-bit audio. 100% corresponds to the maximum audio level, which is 0 dBFS.
LevelIndicator (1) | no | yes, no | admin: read, write; operator: read, write | Enable/disable audio level indication.
PTZAlarmControl (1) | yes | yes, no | admin: read, write; operator: read, write; viewer: read | Enable/disable audio level alarm during PTZ movement. Camera movement could create noises that trigger alarms. If set to yes, no alarms will trigger during PTZ movement.
NbrOfChannels | Product-dependent | An unsigned integer | admin: read; operator: read; viewer: read | Number of supported audio channels within the AudioSource.A#. 1 = Mono audio. 2 = Stereo audio. Each channel has its own input and output gain settings. See AudioSource.A#.Channel.C# below.
  1. Product/release-dependent. Check the product’s release notes.
  2. Product-dependent. Check the corresponding Properties parameter.
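
The AlarmLevel description gives only the two endpoints of the scale: 0% is -90 dBFS (for 16-bit audio) and 100% is 0 dBFS. The sketch below assumes that the percentage maps linearly onto the decibel scale between those endpoints; that linearity is an assumption made here for illustration, not something this document states. The function name alarm_level_to_dbfs is hypothetical.

```python
MIN_LEVEL_DBFS = -90.0  # 0% for 16-bit audio, from the AlarmLevel description
MAX_LEVEL_DBFS = 0.0    # 100%


def alarm_level_to_dbfs(level_percent: float) -> float:
    """Map an AlarmLevel percentage to dBFS, assuming a linear dB scale."""
    if not 0 <= level_percent <= 100:
        raise ValueError("AlarmLevel must be between 0 and 100")
    span = MAX_LEVEL_DBFS - MIN_LEVEL_DBFS
    return MIN_LEVEL_DBFS + span * level_percent / 100


for level in (0, 50, 100):
    print(level, alarm_level_to_dbfs(level))
```
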
AudioSource.A#.Channel.C#
Parameter | Default value | Valid values | Access control | Description
InputGain | Inherited | mute, inherit, product-dependent numbers (decimals allowed) | admin: read, write; operator: read, write | Gain (in dB) applied to sound sent from the Axis product. mute = Audio is muted. inherit = Value is inherited from parameter AudioSource.A#.InputGain.
OutputGain | Inherited | mute, inherit, product-dependent numbers (decimals allowed) | admin: read, write; operator: read, write | Gain (in dB) applied to sound sent to the Axis product. mute = Audio is muted. inherit = Value is inherited from parameter AudioSource.A#.OutputGain.
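
The inherit value means that the effective gain of a channel depends on two parameters. The sketch below resolves a channel gain against the gain of its audio source, using the values mute and inherit from the table. The helper name effective_gain_db and the choice to represent a muted channel as None are illustrative assumptions.

```python
from typing import Optional


def effective_gain_db(channel_value: str, source_value: str) -> Optional[float]:
    """Resolve a channel gain setting to dB, or None if the audio is muted.

    channel_value is AudioSource.A#.Channel.C#.InputGain (or OutputGain);
    source_value is the matching AudioSource.A#.InputGain (or OutputGain).
    """
    value = source_value if channel_value == "inherit" else channel_value
    if value == "mute":
        return None
    return float(value)


print(effective_gain_db("inherit", "6.5"))  # taken from the audio source
print(effective_gain_db("-3", "6.5"))       # channel overrides the source
print(effective_gain_db("mute", "6.5"))     # muted channel
```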

HTTP API

The default audio source is AudioSource.A0.

Audio data request

Request and configure an audio stream.

Access control

viewer

Method

GET

Syntax:
//<servername>/axis-cgi/audio/receive.cgi[?<argument>=<value>[&<argument>=<value>...]]

With the following arguments and values:

Argument | Valid values | Description
audio=<int> | 0, 1 | Enable (1) or disable (0) audio.
camera=<int> | 1 ... (1) | Select the audio configuration in Audio.A#. Note: The argument has a different value than the corresponding parameter. For example, camera=1 selects the parameter group Audio.A0.
httptype=<string> | singlepart, multipart | Choose streaming method. Some proxies require multipart streaming. Default: As defined by the parameter Audio.A#.HTTPMessageType.
audiochannel=<int> | 1 ... (1) | Select the audio source in AudioSource.A#. Note: The argument has a different value than the corresponding parameter. For example, audiochannel=2 selects the parameter group AudioSource.A1.
audionbrofchannels=<int> | 1 ... | The number of audio channels.
  1. The number of audio configurations/audio sources may differ between different cameras and video servers. See the product's specification.
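
Because the camera and audiochannel arguments are one-based while the Audio.A# and AudioSource.A# groups are zero-based, it is easy to be off by one. The sketch below builds a receive.cgi URL from zero-based group numbers and adds one for each argument. The receive_url helper is a hypothetical name, and the http scheme is an assumption, since the examples in this document leave the scheme out.

```python
from typing import Dict, Optional
from urllib.parse import urlencode


def receive_url(
    host: str,
    config_index: Optional[int] = None,   # N in Audio.AN
    source_index: Optional[int] = None,   # N in AudioSource.AN
    httptype: Optional[str] = None,
    scheme: str = "http",
) -> str:
    """Build a receive.cgi URL from zero-based parameter group numbers."""
    args: Dict[str, object] = {}
    if config_index is not None:
        args["camera"] = config_index + 1        # camera=1 selects Audio.A0
    if source_index is not None:
        args["audiochannel"] = source_index + 1  # audiochannel=2 selects AudioSource.A1
    if httptype is not None:
        if httptype not in ("singlepart", "multipart"):
            raise ValueError("httptype must be singlepart or multipart")
        args["httptype"] = httptype
    url = f"{scheme}://{host}/axis-cgi/audio/receive.cgi"
    return f"{url}?{urlencode(args)}" if args else url


print(receive_url("myserver"))
print(receive_url("myserver", config_index=0, source_index=1, httptype="multipart"))
```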

Request an audio stream:

//myserver/axis-cgi/audio/receive.cgi

Singlepart audio data response

Request a singlepart audio stream using HTTP:

//<servername>/axis-cgi/audio/receive.cgi?httptype=singlepart

Successful request

If the request was successful, the server returns a continuous flow of audio packets. The content type is only set at the beginning of the connection. Once the connection is up and running, audio packets arrive one after another without any extra information between them.

Return
Successful response to an HTTP request. Here, singlepart audio data with G.711 µ-law compression is returned.

HTTP Code

200 OK

Content-Type

<Audio MIME>

Syntax:
<audio data>
<audio data>
<audio data>
...

Failed request

If the specified parameter value is invalid, the server returns 400 Bad Request.

Return

HTTP Code

400 Bad Request

Syntax:
<body>

Multipart audio data response

Request a multipart audio stream using HTTP:

//<servername>/axis-cgi/audio/receive.cgi?httptype=multipart

Successful request

If the request was successful, the server returns a continuous flow of audio packets. The content type is “multipart/x-mixed-replace” and each audio packet ends with a boundary string. The message body contains a block of binary data. The content length gives the size of each block of coded audio, which varies between codecs: 512 bytes for G.711, 256 bytes for G.726 32 kbit/s and 192 bytes for G.726 24 kbit/s. AAC is not supported.

Return
Successful response to an HTTP request. Here, multipart audio data with G.726 32 kbit/s compression is returned.

HTTP Code

200 OK

Content-Type

multipart/x-mixed-replace; boundary=<boundary>

Syntax:
--myboundary\r\n
Content-Type: audio/G726-32\r\n
Content-Length: 256\r\n
\r\n
<Audio data>\r\n
--myboundary\r\n
Content-Type: audio/G726-32\r\n
Content-Length: 256\r\n
\r\n
<Audio data>\r\n
--myboundary\r\n
Content-Type: audio/G726-32\r\n
Content-Length: 256\r\n
\r\n
<Audio data>\r\n
--myboundary\r\n
Content-Type: audio/G726-32\r\n
Content-Length: 256\r\n
\r\n
<Audio data>\r\n
--myboundary\r\n
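
A client could split such a response into audio blocks by reading the part headers and then exactly Content-Length bytes, as sketched below. This is an illustrative reader, not part of the API: iter_audio_parts is a hypothetical helper, and it assumes a binary file-like object (for example, a socket wrapped with makefile("rb")) and a boundary string already taken from the Content-Type header of the response.

```python
import io
from typing import BinaryIO, Dict, Iterator, Tuple


def iter_audio_parts(stream: BinaryIO, boundary: bytes) -> Iterator[Tuple[str, bytes]]:
    """Yield (content type, audio block) for each part of a multipart stream."""
    delimiter = b"--" + boundary
    while True:
        line = stream.readline()
        if not line:
            return  # end of stream
        if line.strip() != delimiter:
            continue  # skip the line break that follows each audio block
        # Read the part headers up to the empty line.
        headers: Dict[bytes, bytes] = {}
        while True:
            raw = stream.readline()
            if not raw:
                return  # stream ended after the last boundary
            raw = raw.strip()
            if not raw:
                break
            name, _, value = raw.partition(b":")
            headers[name.strip().lower()] = value.strip()
        # Content-Length gives the size of the coded audio block.
        length = int(headers[b"content-length"])
        block = stream.read(length)
        if len(block) < length:
            return  # truncated final block
        yield headers.get(b"content-type", b"").decode("ascii"), block


# Synthetic two-part stream, for illustration only.
block = bytes(range(256))
part = (
    b"--myboundary\r\n"
    b"Content-Type: audio/G726-32\r\n"
    b"Content-Length: 256\r\n"
    b"\r\n" + block + b"\r\n"
)
parts = list(iter_audio_parts(io.BytesIO(part * 2 + b"--myboundary\r\n"), b"myboundary"))
for content_type, data in parts:
    print(content_type, len(data))
```

Reading by Content-Length rather than searching for the boundary avoids misinterpreting audio bytes that happen to look like a line break or a boundary string.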

Failed request

If the specified parameter value is invalid, the server returns 400 Bad Request.

Return

HTTP Code

400 Bad Request

Syntax:
<body>

Transmit audio data

Transmit a singlepart audio data stream:

Check which audio formats your Axis product can decode, that is, which formats can be transmitted to it. For a complete list of audio formats supported by VAPIX®, see Audio compression formats.

//<servername>/axis-cgi/param.cgi?action=list&group=Properties.Audio.Decoder
Access control

viewer

Method

POST

Syntax:
//<servername>/axis-cgi/audio/transmit.cgi
Content-Type

<Audio MIME>

Content-Length

<Ignored if omitted or zero; otherwise it shall be set to the transfer length of the message body.>

Syntax:
<Audio data>

transmit.cgi takes no arguments.

When an audio stream is transmitted, the server receives a continuous flow of audio packets. The content type is only set at the beginning of the connection, together with the content length, which can have any value. Once the connection is up and running, the audio packets arrive one right after another without any extra information between them. The message body contains a block of binary data.

If the content length is set, it must be a valid size, and the server sends a response for every successful playback. If the playback fails, the connection is closed without any response.

Transmit singlepart audio using G.711 µ-law (authorization omitted):

POST /axis-cgi/audio/transmit.cgi HTTP/1.0\r\n
Content-Type: audio/basic\r\n
\r\n
<Audio data>
<Audio data>
<Audio data>
...
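
The request above can be assembled in two steps: the request head is sent once, and the audio data then follows as a plain sequence of blocks. The sketch below shows both steps without opening a connection. The helper names transmit_request_head and audio_blocks are hypothetical, authorization is omitted as in the example above, and the 512-byte block size is an arbitrary choice for illustration.

```python
import io
from typing import BinaryIO, Iterator


def transmit_request_head(content_type: str = "audio/basic") -> bytes:
    """Return the request line and headers that open a transmit.cgi stream."""
    lines = [
        "POST /axis-cgi/audio/transmit.cgi HTTP/1.0",
        f"Content-Type: {content_type}",
        "",  # empty line that ends the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def audio_blocks(source: BinaryIO, block_size: int = 512) -> Iterator[bytes]:
    """Yield encoded audio from a binary source in fixed-size blocks."""
    while True:
        block = source.read(block_size)
        if not block:
            return
        yield block


head = transmit_request_head()
print(head)
# 1100 bytes of placeholder data standing in for G.711 µ-law audio.
sizes = [len(b) for b in audio_blocks(io.BytesIO(b"\xff" * 1100))]
print(sizes)
```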

Error responses

This section describes the error responses that can occur when using the API.

Error code | Content-Type | Error message | Description
400 | text/plain | Bad request | The request had bad syntax or could not be implemented.
405 | text/plain | Method not allowed | The GET/POST is not allowed in the current mode.
415 | text/plain | Unsupported media type | The request is not in an acceptable format and can’t be processed.
503 | text/plain | Service unavailable | The maximum number of clients is already connected.
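
A client might turn these status codes into readable messages with a simple lookup, as sketched below. The texts are taken from the table above; the describe_status helper and its handling of codes outside the table are illustrative assumptions.

```python
# Error responses from the table above, keyed by HTTP status code.
ERROR_RESPONSES = {
    400: "Bad request: the request had bad syntax or could not be implemented.",
    405: "Method not allowed: the GET/POST is not allowed in the current mode.",
    415: "Unsupported media type: the request is not in an acceptable format.",
    503: "Service unavailable: the maximum number of clients is already connected.",
}


def describe_status(status: int) -> str:
    """Return a readable description of an Audio API response status."""
    if status == 200:
        return "OK"
    return ERROR_RESPONSES.get(status, f"Unexpected status {status}")


for code in (200, 405, 503):
    print(code, describe_status(code))
```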

Audio in the RTSP API

Media streams transmitted over RTSP include audio if the request contains audio=1 or if parameters Audio.A#.Enabled and AudioSource.A#.AudioSupport are enabled. If audio=0, the stream does not include audio even if the parameters are enabled.
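
The rule in the paragraph above can be read as a three-way decision on the audio argument. The sketch below follows that paragraph literally; stream_includes_audio is a hypothetical helper. Note that the description of AudioSource.A#.AudioSupport says no audio is transmitted when the source is turned off, so how audio=1 interacts with AudioSupport=no is not settled by this sketch.

```python
from typing import Optional


def stream_includes_audio(
    audio_arg: Optional[int],  # value of the audio argument, None if omitted
    enabled: bool,             # Audio.A#.Enabled=yes
    audio_support: bool,       # AudioSource.A#.AudioSupport=yes
) -> bool:
    """Decide whether a requested stream carries audio."""
    if audio_arg is not None:
        return audio_arg == 1  # an explicit argument takes precedence
    return enabled and audio_support  # otherwise the parameters decide


print(stream_includes_audio(1, False, False))
print(stream_includes_audio(0, True, True))
print(stream_includes_audio(None, True, True))
print(stream_includes_audio(None, True, False))
```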

When AudioSource.A#.AudioSupport is enabled, the camera and audiochannel arguments from audio/receive.cgi can be used when requesting RTSP streams. The camera argument specifies both the video source and the audio configuration.

Audio detection event

The Audio Detection event is true when the sound level rises above the audio alarm level defined by parameter AudioSource.A#.AlarmLevel.

Topic

Name

tns1:AudioSource/tnsaxis:TriggerLevel

Type

Stateful

Nice name

Audio detection

Source instance

Nice name

Channel

Type

integer

Name

channel

Value | Nice name
1
2
...
n = number of audio channels

Data instance

Nice name

Above alarm level

Type

boolean

Name

triggered

isPropertyState

true