The audio framework is a complete audio architecture that provides layered audio interfaces for applications. It consists of the Audio HAL and the Audio Framework.
Audio HAL: defines unified audio hardware interfaces. It interacts with the audio driver to perform audio streaming and audio settings.
Audio Framework: provides interfaces for audio streaming and other settings.
The audio framework provides a passthrough architecture that supports audio recording and audio playback. This architecture has no audio mixing, so only one audio playback is allowed at a time.
The audio interfaces and the overall implementation of the passthrough architecture are shown below.
This section describes the data formats that the audio framework and audio HAL support.
The parts common to the audio framework and audio HAL are described here; the parts that differ are described in their own sections.
Both the audio framework and audio HAL support interleaved streaming data.
The interleaved data layout is illustrated in the following figures.
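As a concrete illustration of the interleaved layout, each frame carries one sample per channel, stored back to back. A minimal sketch in C for 16-bit stereo data:

```c
#include <stdint.h>

/* Interleaved 16-bit stereo PCM: one frame = one sample per channel,
 * laid out as L0 R0 L1 R1 L2 R2 ...
 * frame_size = channel_count * bytes_per_sample = 2 * 2 = 4 bytes. */
static const int16_t interleaved[] = {
    100, -100,   /* frame 0: left, right */
    200, -200,   /* frame 1 */
    300, -300,   /* frame 2 */
};
```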
This section describes the formats that the Audio Framework supports. Before playback or capture, make sure your sound format is supported.
Audio Framework has the following types of bit depth:
RTAUDIO_FORMAT_INVALID - invalid bit depth of audio stream
RTAUDIO_FORMAT_PCM_8_BIT - audio stream has 8-bit depth
RTAUDIO_FORMAT_PCM_16_BIT - audio stream has 16-bit depth
RTAUDIO_FORMAT_PCM_32_BIT - audio stream has 32-bit depth
RTAUDIO_FORMAT_PCM_FLOAT - audio stream has 32-bit float format
RTAUDIO_FORMAT_PCM_24_BIT - audio stream has 24-bit depth
RTAUDIO_FORMAT_PCM_8_24_BIT - audio stream has 24-bit + 8-bit depth
The following table describes the supported formats for playback and recording. Y means the format is supported; N means the format is not supported.
| Bit depth                    | Playback | Capture |
|------------------------------|----------|---------|
| RTAUDIO_FORMAT_PCM_8_BIT     | Y        | Y       |
| RTAUDIO_FORMAT_PCM_16_BIT    | Y        | Y       |
| RTAUDIO_FORMAT_PCM_32_BIT    | Y        | Y       |
| RTAUDIO_FORMAT_PCM_FLOAT     | N        | N       |
| RTAUDIO_FORMAT_PCM_24_BIT    | Y        | Y       |
| RTAUDIO_FORMAT_PCM_8_24_BIT  | Y        | Y       |
The sample rate is another important parameter of audio streaming. For playback and recording, the audio framework supports the following sample rates. Y means the sample rate is supported; N means it is not.
| Sample rate (Hz) | Playback | Capture |
|------------------|----------|---------|
| 8000             | Y        | Y       |
| 11025            | Y        | Y       |
| 16000            | Y        | Y       |
| 22050            | Y        | Y       |
| 32000            | Y        | Y       |
| 44100            | Y        | Y       |
| 48000            | Y        | Y       |
| 88200            | Y        | Y       |
| 96000            | Y        | Y       |
| 192000           | Y        | Y       |
Setting the channel count is also necessary for audio streaming. For playback and recording, the audio framework supports the following channel counts. Y means the channel count is supported; N means it is not.
Audio HAL has the following types of bit depth:
AUDIO_HW_FORMAT_INVALID - invalid bit depth of audio stream
AUDIO_HW_FORMAT_PCM_8_BIT - audio stream has 8-bit depth
AUDIO_HW_FORMAT_PCM_16_BIT - audio stream has 16-bit depth
AUDIO_HW_FORMAT_PCM_32_BIT - audio stream has 32-bit depth
AUDIO_HW_FORMAT_PCM_FLOAT - audio stream has 32-bit float format
AUDIO_HW_FORMAT_PCM_24_BIT - audio stream has 24-bit depth
AUDIO_HW_FORMAT_PCM_8_24_BIT - audio stream has 24-bit + 8-bit depth
If using the Audio HAL interface, check the bit depths that the HAL supports for playback and capture. Y means the format is supported; N means it is not.
| Bit depth                     | Playback | Capture |
|-------------------------------|----------|---------|
| AUDIO_HW_FORMAT_PCM_8_BIT     | Y        | Y       |
| AUDIO_HW_FORMAT_PCM_16_BIT    | Y        | Y       |
| AUDIO_HW_FORMAT_PCM_32_BIT    | Y        | Y       |
| AUDIO_HW_FORMAT_PCM_FLOAT     | N        | N       |
| AUDIO_HW_FORMAT_PCM_24_BIT    | N        | N       |
| AUDIO_HW_FORMAT_PCM_8_24_BIT  | Y        | Y       |
The sample rate is another important parameter of HAL audio streaming. For playback and recording, the audio HAL supports the following sample rates. Y means the sample rate is supported; N means it is not.
| Sample rate (Hz) | Playback | Capture |
|------------------|----------|---------|
| 8000             | Y        | Y       |
| 11025            | Y        | Y       |
| 16000            | Y        | Y       |
| 22050            | Y        | Y       |
| 32000            | Y        | Y       |
| 44100            | Y        | Y       |
| 48000            | Y        | Y       |
| 88200            | Y        | Y       |
| 96000            | Y        | Y       |
| 192000           | Y        | Y       |
Setting the channel count is also necessary for audio streaming. For playback and recording, the audio HAL supports the following channel counts. Y means the channel count is supported; N means it is not.
Audio hardware configurations are located in {SDK}/component/soc/amebadplus/usrcfg/include/ameba_audio_hw_usrcfg.h.
Different boards require different configurations.
For example, some boards need an amplifier while others do not; different boards may use different pins to enable the amplifier; and the start-up time differs between amplifiers.
In addition, the pins used by each board's DMICs may differ, as may the time the DMICs need to stabilize.
All of this information needs to be set in the configuration file.
The ameba_audio_hw_usrcfg.h file describes each configuration item; set them according to the descriptions.
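For orientation, a board section of the file might look like the sketch below; every macro name and value here is a hypothetical placeholder, and the authoritative names and valid values are the ones documented in ameba_audio_hw_usrcfg.h itself:

```c
/* Hypothetical placeholders -- consult ameba_audio_hw_usrcfg.h for the
 * real macro names and the values valid for your board. */
#define AUDIO_HW_AMPLIFIER_EN_PIN    _PB_10   /* pin that enables the amplifier */
#define AUDIO_HW_AMPLIFIER_START_MS  90       /* amplifier start-up time (ms)   */
#define AUDIO_HW_DMIC_CLK_PIN        _PA_20   /* DMIC clock pin                 */
#define AUDIO_HW_DMIC_DATA0_PIN      _PA_21   /* DMIC data pin                  */
#define AUDIO_HW_DMIC_STEADY_MS      100      /* DMIC settling time (ms)        */
```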
There are two ways to provide the audio clock:
One is that the system clock is an integer multiple of 98.304 MHz or 45.1584 MHz. Such clocks are listed in the SocClk_Info array, so you can modify the index into the SocClk_Info array in bootloader_km4.c. Use this method when you need high-quality audio applications.
The other is that the system clock is not an integer multiple of 98.304 MHz or 45.1584 MHz; in this case, the 98.304 MHz or 45.1584 MHz clock is derived automatically.
Audio HAL provides the AudioHwStreamOut/AudioHwStreamIn/AudioHwControl interfaces to interact with audio hardware. The interfaces are located in {SDK}/component/audio/interfaces/hardware/audio.
The interfaces are documented in the headers; read them before use.
AudioHwStreamOut: receives PCM data from the upper layer, writes it via the audio driver to send it to hardware, and provides information about the audio output hardware driver.
AudioHwStreamIn: receives PCM data via the audio driver and sends it to the upper layer.
AudioHwControl: receives control calls from the upper layer and sets control information in the driver.
AudioHwStreamOut/AudioHwStreamIn are managed by the AudioHwCard interface, which is responsible for creating and destroying AudioHwStreamOut/AudioHwStreamIn instances.
An AudioHwCard is a physical or virtual piece of hardware that processes audio streams. It contains a set of ports and devices, as shown in the following figure.
Port: the stream output/input of the audio card.
Device: the device output/input of the audio card.
Choose a specific card to play (currently the audio manager only supports the primary audio card):

```c
struct AudioHwCardDescriptor *audio_card_desc;
for (int32_t index = 0; index < cards_size; index++) {
    struct AudioHwCardDescriptor *desc = &card_descs[index];
    for (uint32_t port = 0; (desc != NULL && port < desc->port_num); port++) {
        printf("check for audio port \n");
        if (desc->ports[port].role == AUDIO_HW_PORT_ROLE_OUT &&
            (audio_card = audio_manager->OpenCard(audio_manager, desc))) {
            audio_port = desc->ports[port];
            audio_card_desc = desc;
            break;
        }
    }
}
```
Create AudioHwConfig according to the sample rate, channel, format, and AudioHwPathDescriptor, then use CreateStreamOut() to create an AudioHwStreamOut based on the specific audio card:
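A minimal sketch, reusing audio_card and audio_port from the selection loop above (the AudioHwConfig field names are assumptions; check the HAL headers for the exact struct layout):

```c
struct AudioHwConfig config;
config.sample_rate   = 48000;                        /* output sample rate */
config.channel_count = 2;                            /* stereo             */
config.format        = AUDIO_HW_FORMAT_PCM_16_BIT;   /* 16-bit PCM         */

/* audio_port is the AudioHwPathDescriptor chosen above (role OUT). */
struct AudioHwStreamOut *stream_out =
    audio_card->CreateStreamOut(audio_card, &audio_port, &config);
```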
Construct AudioHwConfig according to the sample rate, channel, format, and AudioHwPathDescriptor, then use CreateStreamIn() to create an AudioHwStreamIn based on the specific audio card:
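The input side mirrors the output side. In this sketch, in_port stands for a port whose role is AUDIO_HW_PORT_ROLE_IN, found the same way as the output port above (again, field names are assumptions to verify against the headers):

```c
struct AudioHwConfig in_config;
in_config.sample_rate   = 16000;                      /* capture sample rate */
in_config.channel_count = 2;
in_config.format        = AUDIO_HW_FORMAT_PCM_16_BIT;

struct AudioHwStreamIn *stream_in =
    audio_card->CreateStreamIn(audio_card, &in_port, &in_config);
```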
Here is an example showing how to use the audio HAL interfaces to control the audio codec.
AudioHwControl is always thread-safe and convenient to call. To use AudioHwControl, the first parameter of each function call should always be the handle returned by GetAudioHwControl().
Take the PLL clock setting as an example:
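A hedged sketch of such a control call follows; GetAudioHwControl comes from the paragraph above, while the AdjustPLLClock member and its parameters are assumptions to be checked against the audio HAL control header:

```c
/* Obtain the control handle; it is always the first argument to its calls. */
struct AudioHwControl *control = GetAudioHwControl();

if (control != NULL) {
    /* Hypothetical member: nudge the audio PLL for a 48 kHz stream by
     * +0.5 ppm. Check the control header for the real name and signature. */
    control->AdjustPLLClock(control, 48000, 0.5f, 1 /* action */);
}
```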
The audio streaming interfaces include the RTAudioTrack and RTAudioRecord interfaces, which are located in {SDK}/component/audio/interfaces/audio.
The interfaces are documented in the headers; read them before use.
RTAudioTrack: initializes the format of the playback data stream in the framework, receives PCM data from the application, and writes the data to the Audio HAL.
RTAudioRecord: initializes the format of the record data stream in the framework, receives PCM data from the Audio HAL, and sends the data to applications.
RTAudioTrack supports playing a variety of common raw audio formats so that audio can be easily integrated into applications.
The Audio Framework has the following audio playback category types.
Applications can use these types to initialize RTAudioTrack; the framework reads the category type and performs volume mixing accordingly.
RTAUDIO_CATEGORY_MEDIA - if the application plays music, its type is RTAUDIO_CATEGORY_MEDIA, and it can use this type to initialize RTAudioTrack. The audio framework will then know its type and mix it with the media volume.
RTAUDIO_CATEGORY_COMMUNICATION - if the application makes a phone call and outputs the call's sound, the sound's type should be RTAUDIO_CATEGORY_COMMUNICATION.
RTAUDIO_CATEGORY_SPEECH - if the application does voice recognition and outputs the speech sound.
RTAUDIO_CATEGORY_BEEP - if the sound is a key tone or another beep, its type is RTAUDIO_CATEGORY_BEEP.
The test demo of RTAudioTrack is located in {SDK}/component/example/audio/audio_track.
Here is an example showing how to play raw audio data.
Before using RTAudioTrack, the audio service should be initialized:
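A minimal start-up sketch follows. The RTAudioService_Init and RTAudioTrack_Create names follow the style of the interfaces in {SDK}/component/audio/interfaces/audio, but verify the exact signatures in the headers and the audio_track demo:

```c
#include "audio/audio_service.h"   /* assumed include paths; see the        */
#include "audio/audio_track.h"     /* interfaces directory for the real ones */

/* Bring up the audio service once, before creating any track. */
RTAudioService_Init();

/* Allocate a track; it is configured and started in the next step. */
struct RTAudioTrack *audio_track = RTAudioTrack_Create();
```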
Apps can use the audio config API to provide detailed audio information about a specific playback source, including the stream category (type of playback source), format, number of channels, sample rate, and the RTAudioTrack ring-buffer size. The fields of RTAudioTrackConfig are as follows:
category_type: defines the stream category (RTAUDIO_CATEGORY_*) of the playback data source.
sample_rate: the sample rate of the playback source's raw data.
channel_count: the channel count of the playback source's raw data.
format: the bit depth of the playback source's raw data.
buffer_bytes: the ring-buffer size for RTAudioTrack, used to avoid xruns.
Note
The buffer_bytes in RTAudioTrackConfig is very important. The buffer size should always be larger than the minimum buffer size calculated by the audio framework; otherwise an xrun (buffer underrun) will occur.
Use the minimum-buffer-size interface to get the minimum RTAudioTrack buffer bytes, and use it as a reference when choosing the RTAudioTrack buffer size; for example, you can use 4x the minimum buffer size. A bigger buffer gives smoother playback but also more latency; the trade-off is yours to choose.
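Putting it together, here is a minimal playback sketch. It assumes the RTAudioTrackConfig fields described above and RTAudioTrack_* entry points in the style of this section; in particular, the minimum-buffer-size call and the Init/Write signatures should be verified against audio_track.h:

```c
/* Ask the framework for the minimum buffer size for this format, then use
 * 4x that value as the ring-buffer size (bigger = smoother, more latency). */
uint32_t min_bytes = RTAudioTrack_GetMinBufferBytes(audio_track,
        RTAUDIO_CATEGORY_MEDIA, 44100, RTAUDIO_FORMAT_PCM_16_BIT, 2);

RTAudioTrackConfig track_config;
track_config.category_type = RTAUDIO_CATEGORY_MEDIA;    /* music playback */
track_config.sample_rate   = 44100;
track_config.format        = RTAUDIO_FORMAT_PCM_16_BIT;
track_config.channel_count = 2;
track_config.buffer_bytes  = min_bytes * 4;

RTAudioTrack_Init(audio_track, &track_config, RTAUDIO_OUTPUT_FLAG_NONE);
RTAudioTrack_Start(audio_track);

/* pcm_buffer/pcm_bytes stand for the application's interleaved PCM data;
 * write sizes should be whole frames. */
RTAudioTrack_Write(audio_track, (uint8_t *)pcm_buffer, pcm_bytes, true);

/* When playback is done: */
RTAudioTrack_Stop(audio_track);
RTAudioTrack_Destroy(audio_track);
```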
Apps can use the audio config API to provide detailed audio information about a specific record source, including the record device source, format, number of channels, and sample rate. A sketch of the configuration follows.
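The field names below are assumed by analogy with RTAudioTrackConfig, and the device constant is a placeholder; see audio_record.h for the real struct and the valid device options:

```c
/* Allocate a record instance first (name assumed; see audio_record.h). */
struct RTAudioRecord *audio_record = RTAudioRecord_Create();

RTAudioRecordConfig record_config;
record_config.sample_rate   = 16000;                      /* capture rate        */
record_config.format        = RTAUDIO_FORMAT_PCM_16_BIT;  /* bit depth           */
record_config.channel_count = 2;                          /* number of channels  */
record_config.device        = RTDEVICE_IN_MIC;            /* record device source
                                                           * (assumed name)      */
```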
With the RTAudioRecordConfig object created, you can initialize RTAudioRecord. In this step, the Audio HAL's AudioHwCard is opened according to the audio input device source:

```c
RTAudioRecord_Init(audio_record, &record_config);
```

When all preparations are complete, start audio_record:

```c
RTAudioRecord_Start(audio_record);
```
Read the audio microphone data. The read size can be defined by the user; make sure size/frame_size is an integer, i.e., always read whole frames.
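A minimal read-loop sketch, under the same naming assumptions (verify RTAudioRecord_Read's exact signature in audio_record.h):

```c
/* With 2 channels of 16-bit samples, frame_size = 2 * 2 = 4 bytes, so a
 * 2048-byte read is exactly 512 frames. */
#define READ_BYTES 2048
static char pcm_buf[READ_BYTES];

for (;;) {
    int32_t bytes_read = RTAudioRecord_Read(audio_record, pcm_buf, READ_BYTES, true);
    if (bytes_read <= 0) {
        break;  /* stop on error */
    }
    /* hand pcm_buf to the application (encode, store, stream, ...) */
}
RTAudioRecord_Stop(audio_record);
RTAudioRecord_Destroy(audio_record);
```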
The audio control interfaces include the RTAudioControl interfaces, which interact with the audio control HAL.
RTAudioControl provides interfaces to set and get the hardware volume, set the output device, and so on.
The interfaces are located in {SDK}/component/audio/interfaces/audio/audio_control.h.
The interfaces are documented in the header; read them before use.
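For example, a hedged sketch of setting the hardware volume; the function name and the 0.0~1.0 range are assumptions, and audio_control.h documents the real entry points:

```c
#include "audio/audio_control.h"   /* assumed include path */

/* Hypothetical call: set the hardware playback volume for the left and
 * right channels. Verify the exact name and range in audio_control.h. */
RTAudioControl_SetHardwareVolume(0.6f, 0.6f);
```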