Introduction

The Global Direct Memory Access (GDMA) controller, also referred to as DMAC, is primarily designed for transferring data between memory and peripherals via the AXI/OCP bus without CPU intervention, thereby offloading computational overhead from the CPU.

The GDMA controller comprises 8 channels:

  • Channel 0 and Channel 1 have a 128-byte FIFO buffer.

  • Other channels have a 32-byte FIFO buffer.

Featuring a dual AXI/OCP master bus architecture, the GDMA incorporates a slave interface for configuration and programming. It supports hardware-based priority arbitration and programmable priority between DMA requests.

DMAC Performance

The data transmission efficiency of the GDMA is influenced by clock synchronization, channel FIFO depth, transfer type, handshake efficiency, GDMA slave interface configurations, and other factors. The experimental data below is derived from tests conducted under single-block transfer mode with channel 0.

Slave    Clock (Hz)    Writing 64 bytes                    Reading 64 bytes
SRAM     250M          (64*8)/(280 ns) = 1828.57 Mbps      (64*8)/(240 ns) = 2133.33 Mbps
PSRAM    250M          (64*8)/(350 ns) = 1462.86 Mbps      (64*8)/(360 ns) = 1422.22 Mbps
Audio    40M           (64*8)/(1050 ns) = 487.62 Mbps      (64*8)/(470 ns) = 1089.36 Mbps
SPI      100M          (64*8)/(710 ns) = 721.13 Mbps       (64*8)/(670 ns) = 764.18 Mbps

Note

GDMA turnaround time is not included.
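The throughput figures in the table are payload bits divided by elapsed time. The arithmetic can be sketched as follows; gdma_throughput_mbps is a hypothetical helper name used only for illustration and is not part of the SDK.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not an SDK API): throughput in Mbps for a transfer
 * of `bytes` bytes that took `elapsed_ns` nanoseconds.
 * bits / ns gives Gbps, so multiply by 1000 to obtain Mbps. */
static double gdma_throughput_mbps(uint32_t bytes, uint32_t elapsed_ns)
{
    assert(elapsed_ns != 0);
    return ((double)bytes * 8.0 * 1000.0) / (double)elapsed_ns;
}
```

For example, gdma_throughput_mbps(64, 280) evaluates to roughly 1828.57, which matches the SRAM write entry.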

DMAC Configuration

The DMA block size is illustrated below:

../../_images/dma_block_size_diagram.svg

Data Size

The figure above illustrates how the GDMA transfer data size is set. block_ts is the number of data items transferred in a single data block. It must be set to the total number of bytes divided by the source transfer width in bytes (SRC_TR_WIDTH), and its maximum value is 0xFFFF.
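The block_ts rule can be checked in software before a channel is configured. The following is a minimal sketch under the constraints stated above (supported widths of 1, 2, or 4 bytes, maximum 0xFFFF); gdma_calc_block_ts and GDMA_BLOCK_TS_MAX are hypothetical names, not SDK definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GDMA_BLOCK_TS_MAX 0xFFFFu  /* maximum block_ts stated in the text */

/* Hypothetical helper (not an SDK API). Stores block_ts and returns 0 on
 * success. Returns -1 if the width is unsupported, the length is not a
 * multiple of the source transfer width, or the result exceeds the maximum. */
static int gdma_calc_block_ts(uint32_t total_bytes, uint32_t src_width_bytes,
                              uint32_t *block_ts)
{
    assert(block_ts != NULL);

    if (src_width_bytes != 1 && src_width_bytes != 2 && src_width_bytes != 4)
        return -1;                      /* only 1, 2 and 4 bytes are supported */
    if (total_bytes % src_width_bytes != 0)
        return -1;                      /* block_ts would not be an integer */

    uint32_t ts = total_bytes / src_width_bytes;
    if (ts > GDMA_BLOCK_TS_MAX)
        return -1;                      /* too large for a single block */

    *block_ts = ts;
    return 0;
}
```

With a 4-byte transfer width, a 64-byte buffer gives block_ts = 16, which is the same result as the `DMA_CPY_LEN >> 2` expression used in the demos.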

Transfer Direction and Flow Controller

There are currently four transmission directions and two flow controller settings, with a total of eight available configurations.

  • When the peripheral acts as a flow controller, the DMA transfers data according to the single/burst requests issued by the peripheral.

  • When the DMAC acts as the flow controller, it handles all requests from the peripheral according to its own configured transfer settings.

TT_FC[2:0] field of CTLx register (x is channel)    Direction                   Flow controller
000                                                 Memory to Memory            DMAC
001                                                 Memory to Peripheral        DMAC
010                                                 Peripheral to Memory        DMAC
011                                                 Peripheral to Peripheral    DMAC
100                                                 Peripheral to Memory        Peripheral
101                                                 Peripheral to Peripheral    Source Peripheral
110                                                 Memory to Peripheral        Peripheral
111                                                 Peripheral to Peripheral    Destination Peripheral

Note

The block_ts parameter can only be set when the DMAC is used as a flow controller.
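The TT_FC encoding can be mirrored in a lookup table, which makes it easy to check whether block_ts applies to a given setting. This is only a sketch: the enum, struct, and function names below are illustrative and are not taken from the SDK headers.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative names only; the SDK may define its own constants. */
typedef enum { DIR_M2M, DIR_M2P, DIR_P2M, DIR_P2P } gdma_dir_t;
typedef enum {
    FC_DMAC, FC_PERIPHERAL, FC_SRC_PERIPHERAL, FC_DST_PERIPHERAL
} gdma_fc_t;

typedef struct {
    gdma_dir_t dir;   /* transfer direction */
    gdma_fc_t  fc;    /* flow controller */
} gdma_tt_fc_t;

/* Indexed by the 3-bit TT_FC value, mirroring the table above. */
static const gdma_tt_fc_t gdma_tt_fc_table[8] = {
    { DIR_M2M, FC_DMAC },            /* 000 */
    { DIR_M2P, FC_DMAC },            /* 001 */
    { DIR_P2M, FC_DMAC },            /* 010 */
    { DIR_P2P, FC_DMAC },            /* 011 */
    { DIR_P2M, FC_PERIPHERAL },      /* 100 */
    { DIR_P2P, FC_SRC_PERIPHERAL },  /* 101 */
    { DIR_M2P, FC_PERIPHERAL },      /* 110 */
    { DIR_P2P, FC_DST_PERIPHERAL },  /* 111 */
};

/* block_ts can only be set when the DMAC is the flow controller. */
static int gdma_block_ts_allowed(uint8_t tt_fc)
{
    assert(tt_fc < 8);
    return gdma_tt_fc_table[tt_fc].fc == FC_DMAC;
}
```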

Transfer msize

The length of each transaction can be configured.

  • msize > 1: burst transaction

  • msize = 1: single transaction

SRC_MSIZE[2:0]/DEST_MSIZE[2:0] field of CTLx register    Transfer msize
000                                                      1
001                                                      4
010                                                      8
011                                                      16
100 and above                                            Not supported

Transfer Width

The GDMA supports the following transfer widths.

SRC_TR_WIDTH[2:0]/DST_TR_WIDTH[2:0] field of CTLx register    Transfer width (byte)
000                                                           1
001                                                           2
010                                                           4
011 and above                                                 Not supported

Note

  • When reading from or writing to peripherals, SRC_TR_WIDTH/DST_TR_WIDTH is determined entirely by the data width of the peripheral.

  • When reading from or writing to memory:

    • If cache is disabled, the address does not need to be aligned. The total amount of data only needs to be divisible by SRC_TR_WIDTH so that block_ts is an integer.

    • If cache is enabled, the buffer boundary addresses must be aligned to the cache line.

  • If memory is the destination (P2M, M2M), the DST_TR_WIDTH parameter is ignored, and writes are always based on the bus width (typically 32 bits, 4 bytes).
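The msize and transfer width tables above map a small set of supported values to 3-bit field encodings. A minimal sketch of that mapping is shown below. The function names are hypothetical, and in practice the SDK constants used in the demos (such as MsizeEight and TrWidthFourBytes) would be used instead.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers (not SDK APIs). Each returns the 3-bit field
 * encoding from the tables above, or -1 for an unsupported value. */
static int gdma_encode_tr_width(uint32_t width_bytes)
{
    switch (width_bytes) {
    case 1:  return 0x0;
    case 2:  return 0x1;
    case 4:  return 0x2;
    default: return -1;   /* 011 and above are not supported */
    }
}

static int gdma_encode_msize(uint32_t items)
{
    switch (items) {
    case 1:  return 0x0;
    case 4:  return 0x1;
    case 8:  return 0x2;
    case 16: return 0x3;
    default: return -1;   /* 100 and above are not supported */
    }
}

/* msize = 1 is a single transaction; any larger msize is a burst. */
static int gdma_is_burst(int msize_field)
{
    assert(msize_field >= 0 && msize_field <= 3);
    return msize_field != 0;
}
```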

Transfer Types

Single Block

Single block DMA transfer – Consists of a single block.

Multi-block

Multi-block DMA transfer – DMA transfer may consist of multiple RTK_DMAC blocks. Multi-block transfer types include:

  • Auto-reloading mode

  • Linked list mode

Auto-reloading Mode

In auto-reloading mode, the source and destination can independently select which method to use.

Auto-reloading transfer types

  • Src auto reload

    Setting:

    PGDMA_InitTypeDef->GDMA_ReloadSrc = 1
    PGDMA_InitTypeDef->GDMA_ReloadDst = 0

    Introduction: For multi-block transfers, the SAR register can be auto-reloaded from its initial value at the end of each block, and the DST address is contiguous, as shown in Multi-block DMA transfer with source address auto-reloaded and contiguous destination address.

  • Dst auto reload

    Setting:

    PGDMA_InitTypeDef->GDMA_ReloadSrc = 0
    PGDMA_InitTypeDef->GDMA_ReloadDst = 1

    Introduction: For multi-block transfers, the DAR register can be auto-reloaded from its initial value at the end of each block, and the SRC address is contiguous.

  • Src & Dst auto reload

    Setting:

    PGDMA_InitTypeDef->GDMA_ReloadSrc = 1
    PGDMA_InitTypeDef->GDMA_ReloadDst = 1

    Introduction: For multi-block transfers, the SAR and DAR registers can be auto-reloaded from their initial values at the end of each block, as shown in Multi-block DMA transfer with source and destination address auto-reloaded.

../../_images/mbd_source_auto_dest_cont.png

Multi-block DMA transfer with source address auto-reloaded and contiguous destination address

../../_images/mbd_source_dest_auto.png

Multi-block DMA transfer with source and destination address auto-reloaded
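The address behaviour of source auto-reload with a contiguous destination can be modelled in plain C. This is a software model for illustration only. The real transfer is carried out by the GDMA hardware, and model_src_reload_dst_contiguous is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Software model only: mimics the address pattern of a multi-block
 * transfer with source auto-reload and a contiguous destination. */
static void model_src_reload_dst_contiguous(const uint8_t *src, uint8_t *dst,
                                            size_t block_bytes,
                                            size_t num_blocks)
{
    assert(src != NULL && dst != NULL);

    for (size_t blk = 0; blk < num_blocks; blk++) {
        /* The source restarts from its initial address for every block. */
        const uint8_t *sar = src;
        /* The destination continues where the previous block ended. */
        uint8_t *dar = dst + blk * block_bytes;
        memcpy(dar, sar, block_bytes);
    }
}
```

With a 4-byte source block and three blocks, the 12-byte destination ends up holding three consecutive copies of the source data.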

Linked list Mode

In linked list mode, the addresses between data blocks do not have to be consecutive.

Link list transfer types

  • Src: Contiguous address, Dst: Link list

    Setting:

    PGDMA_InitTypeDef->GDMA_SrcAddr = pSrc
    PGDMA_InitTypeDef->GDMA_LlpDstEn = 1

    Introduction: Source memory is a contiguous data block, while destination data blocks are organized in a linked list.

  • Src: Auto-reloading, Dst: Link list

    Setting:

    PGDMA_InitTypeDef->GDMA_ReloadSrc = 1
    PGDMA_InitTypeDef->GDMA_SrcAddr = pSrc
    PGDMA_InitTypeDef->GDMA_LlpDstEn = 1

    Introduction: At the source, the SAR register can be auto-reloaded from its initial value at the end of each block, as shown in Multi-block DMA transfer with source address auto-reloaded and linked list destination address.

  • Src: Link list, Dst: Contiguous address

    Setting:

    PGDMA_InitTypeDef->GDMA_LlpSrcEn = 1
    PGDMA_InitTypeDef->GDMA_DstAddr = pDst

    Introduction: Source memory is organized as a linked list, and destination memory is a contiguous data block, as shown in Multi-block DMA transfer with linked list source address and contiguous destination address.

  • Src: Link list, Dst: Auto-reloading

    Setting:

    PGDMA_InitTypeDef->GDMA_LlpSrcEn = 1
    PGDMA_InitTypeDef->GDMA_DstAddr = pDst
    PGDMA_InitTypeDef->GDMA_ReloadDst = 1

    Introduction: The source data blocks are organized in a linked list, and the destination address is auto-reloaded.

  • Src: Link list, Dst: Link list

    Setting:

    PGDMA_InitTypeDef->GDMA_LlpSrcEn = 1
    PGDMA_InitTypeDef->GDMA_LlpDstEn = 1

    Introduction: Both source and destination data blocks are organized in linked lists, as shown in Multi-block DMA transfer with linked address for source and destination.

If both the destination and the source are continuous data blocks, multi-block transmission should not be used, and single-block transmission is more appropriate.
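Linked list mode can be pictured as a chain of block descriptors, each pointing to the next, with a null pointer marking the last block. The sketch below is a software model of that idea. The model_lli_t layout and function name are hypothetical, and the descriptor format actually used by the hardware and SDK may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical descriptor for illustration; not the hardware layout. */
typedef struct model_lli {
    const uint8_t *src;            /* source address of this block */
    uint8_t *dst;                  /* destination address of this block */
    size_t bytes;                  /* block length in bytes */
    const struct model_lli *next;  /* next block, or NULL for the last one */
} model_lli_t;

/* Walks the list and copies each block. Returns the number of blocks
 * processed. The walk ends when a block's next pointer is NULL. */
static size_t model_lli_transfer(const model_lli_t *head)
{
    size_t blocks = 0;

    for (const model_lli_t *it = head; it != NULL; it = it->next) {
        assert(it->src != NULL && it->dst != NULL);
        memcpy(it->dst, it->src, it->bytes);
        blocks++;
    }
    return blocks;
}
```

In this model a contiguous 8-byte source can be scattered into two separate 4-byte destination buffers by chaining two descriptors, which corresponds to the contiguous source with link list destination case above.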

Address Increment Type

Source Address Increment

There are two modes:

  • Increment: The source address is incremented on every source transfer. Incrementing aligns the address to the next CTLx.SRC_TR_WIDTH boundary.

  • No change: If the device is fetching data from a source peripheral FIFO with a fixed address, then set this field to No change.

Destination Address Increment

There are two modes:

  • Increment: The destination address is incremented on every destination transfer. Incrementing aligns the address to the next CTLx.DST_TR_WIDTH boundary.

  • No change: If the device is writing data to a destination peripheral FIFO with a fixed address, then set this field to No change.

Real-time Status Acquisition

GDMA supports real-time acquisition of the current source address, the current destination address, and the amount of data that has been transferred. Call the corresponding APIs to read these values.

Note

To get the amount of data that has been transferred, block_ts must be greater than 768, and the value cannot be read inside an interrupt handler; otherwise, the value obtained is always 0.

Interrupt Type

There are several supported interrupt types, which can be used independently or in combination.

Interrupt type        Introduction
block interrupt       Triggered by the completion of a data block transfer
transfer interrupt    Occurs when all data blocks have been transferred
error interrupt       Indicates that a transfer error occurred

Note

  • In multi-block auto-reload mode, when a block interrupt occurs, data transmission continues only after the interrupt handler has finished.

  • In linked list mode, the transfer is complete when the next-block pointer of the last data block is null.

  • In linked list mode, data transmission continues when a block interrupt occurs.

Secure

To start secure transfer, users need to configure the security channel control bit in the register.

  • Accesses through the master interface and the slave interface are secure when the secure bit is set.

  • Secure channel can only be configured in secure world, and secure channel can access secure memory and non-secure memory.

  • Non-secure channel can only access non-secure memory.

PGDMA_InitTypeDef->SecureTransfer = 1;

Suspend and Abort

GDMA supports channel suspension, resumption, and termination.

  • To suspend a channel, set CFGx.CH_SUSP. This alone does not guarantee that the current data transaction has completed. Combined with CFGx.INACTIVE, the channel can be safely paused without losing data.

  • To resume data transmission after suspension, clear CFGx.CH_SUSP.

  • To terminate a data transfer, poll CFGx.INACTIVE until it is set to 1; the transfer can then be aborted.

Note

A channel becomes inactive in the following situations:

  • CFGx.INACTIVE is set only after the pending data has been written to memory; the transfer can then be canceled.

  • The peripheral data width is 4 bytes, but the DMAC FIFO holds only 2 bytes. No write takes place in this case, and CFGx.INACTIVE is set directly.
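The suspend, poll, and resume sequence described above can be sketched as follows. The bit positions of CH_SUSP and INACTIVE are assumptions made for illustration only, so consult the register description for the real CFGx layout. The function names are hypothetical as well. A bounded poll count is used so that the loop cannot hang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define MODEL_CFG_CH_SUSP   (1u << 8)
#define MODEL_CFG_INACTIVE  (1u << 9)

/* Suspends a channel and polls until it reports inactive.
 * `cfg` points at the channel's CFGx register (or a mock variable).
 * Returns 0 once INACTIVE is set, or -1 if `max_polls` reads elapse. */
static int model_suspend_and_wait(volatile uint32_t *cfg, uint32_t max_polls)
{
    assert(cfg != NULL);

    *cfg |= MODEL_CFG_CH_SUSP;
    for (uint32_t i = 0; i < max_polls; i++) {
        if (*cfg & MODEL_CFG_INACTIVE)
            return 0;   /* safe to keep the channel paused or abort it */
    }
    return -1;          /* channel still active; do not abort yet */
}

/* Resumes a suspended channel by clearing CH_SUSP. */
static void model_resume(volatile uint32_t *cfg)
{
    assert(cfg != NULL);
    *cfg &= ~MODEL_CFG_CH_SUSP;
}
```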

Priority

GDMA supports two kinds of channel priority:

  • Software: the priority of each channel can be configured in CFGx.CH_PRIOR. Valid values range from 0 to (DMAC_NUM_CHANNELS-1), where 0 is the highest priority and (DMAC_NUM_CHANNELS-1) is the lowest priority.

  • Hardware: if two channel requests have the same software priority level, or if no software priority is configured, the channel with the lower number takes priority over the channel with the higher number. For example, channel 2 takes priority over channel 4.
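The two-level arbitration rule can be expressed as a short software model: the lowest CH_PRIOR value wins, and ties go to the lower-numbered channel. The code below is only a model of the rule as described, not SDK or hardware code, and the names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_NUM_CHANNELS 8   /* the GDMA controller has 8 channels */

/* `requesting` is a bitmask of channels with a pending request and
 * `prio[ch]` is that channel's CFGx.CH_PRIOR value (0 = highest).
 * Returns the winning channel, or -1 if no channel is requesting. */
static int model_arbitrate(uint8_t requesting,
                           const uint8_t prio[MODEL_NUM_CHANNELS])
{
    int winner = -1;

    assert(prio != NULL);
    for (int ch = 0; ch < MODEL_NUM_CHANNELS; ch++) {
        if (!(requesting & (1u << ch)))
            continue;
        /* Strict '<' keeps the lower-numbered channel on a tie. */
        if (winner < 0 || prio[ch] < prio[winner])
            winner = ch;
    }
    return winner;
}
```

If channels 2 and 4 both request with the same priority value, the model selects channel 2, matching the example in the text.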

Cache

When the DMA slave type is memory, pay attention to cache maintenance. DCache_CleanInvalidate() should be called every time before a DMA transmission starts.

The following steps should be added when executing DMA Rx/Tx.

Operation

Step

DMA Rx

  1. Prepare DST buffer

  2. Call DCache_CleanInvalidate() to avoid a cache flush during DMA Rx

  3. Do DMA Rx configuration

  4. Trigger DMA Rx interrupt

  5. Call DCache_Invalidate() in the Rx Done Handler to discard stale cache data. This prevents the cache from being inconsistent with memory after Rx completes if the CPU reads or write-allocates the DST buffer during GDMA transmission.

Note

During GDMA transmission, writing to the DST buffer or flushing the cache for it is forbidden. (Taking {SDK}\component\example\peripheral\raw\uart\uart_dma_stream\src\main.c as an example, uart_recv_string_done is the DMA Rx Done Interrupt Handler.)

u32 uart_recv_string_done(void * data)
{
   UNUSED(data);
   // To solve the cache consistency problem, DMA mode needs it
   DCache_Invalidate((u32)rx_buf, SRX_BUF_SZ);
   dma_free();
   rx_done = 1;
   return 0;
}
  6. CPU reads DST buffer

DMA Tx

  1. CPU prepares SRC buffer data

  2. Call DCache_CleanInvalidate() on the SRC buffer to synchronize its data with memory

  3. Do DMA Tx configuration

  4. Trigger DMA Tx interrupt

Aligning buffer addresses to the cache line reduces problems caused by inconsistent cache and memory data. For details, refer to Section Cache Consistency When Using DMA.
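Whether a DMA buffer is cache-line aligned can be checked with simple arithmetic on its address and length. The sketch below assumes a 32-byte cache line, which is an assumption for illustration and must be replaced with the line size of the target core. The helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_CACHE_LINE_SIZE 32u  /* assumption: use the target's line size */

/* Returns nonzero if both the start address and the length are multiples
 * of the cache line size, so the buffer shares no cache line with
 * neighbouring data. */
static int model_is_cache_aligned(uintptr_t addr, size_t len)
{
    return (addr % MODEL_CACHE_LINE_SIZE == 0) &&
           (len % MODEL_CACHE_LINE_SIZE == 0);
}

/* Rounds a length up to the next multiple of the cache line size. */
static size_t model_cache_round_up(size_t len)
{
    assert(len <= SIZE_MAX - (MODEL_CACHE_LINE_SIZE - 1));
    return (len + MODEL_CACHE_LINE_SIZE - 1) &
           ~(size_t)(MODEL_CACHE_LINE_SIZE - 1);
}
```

Under this assumption a 60-byte receive buffer would be rounded up to 64 bytes, so that invalidating it cannot discard unrelated data sharing its last cache line.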

DMAC Demos

Single Block

  1. Allocate a free channel

    ch_num = GDMA_ChnlAlloc(gdma.index, (IRQ_FUN) Dma_memcpy_int, (u32)(&gdma), 3);
    

    This function also includes the following operation:

    • Register IRQ handler if using interrupt mode

    • Enable NVIC interrupt

    • Register the GDMA channel to use

  2. Configure the interrupt type

    PGDMA_InitTypeDef->GDMA_IsrType = (TransferType | ErrType);
    
  3. Configure interrupt handling function

    Clear the pending interrupt in the interrupt processing function.

    GDMA_ClearINT(0, PGDMA_InitTypeDef->GDMA_ChNum);
    
  4. Configure transfer settings

    PGDMA_InitTypeDef->GDMA_SrcMsize = MsizeEight;
    PGDMA_InitTypeDef->GDMA_SrcDataWidth = TrWidthFourBytes;
    PGDMA_InitTypeDef->GDMA_DstMsize = MsizeEight;
    PGDMA_InitTypeDef->GDMA_DstDataWidth = TrWidthFourBytes;
    PGDMA_InitTypeDef->GDMA_BlockSize = DMA_CPY_LEN >> 2;
    PGDMA_InitTypeDef->GDMA_DstInc = IncType; // If DST type is peripheral: no change
    PGDMA_InitTypeDef->GDMA_SrcInc = IncType; // If SRC type is peripheral: no change
    
  5. Configure hardware handshake interface if slave is peripheral

    GDMA_InitStruct->GDMA_SrcHandshakeInterface= GDMA_HANDSHAKE_INTERFACE_AUDIO_RX;
    

    or

    GDMA_InitStruct->GDMA_DstHandshakeInterface = GDMA_HANDSHAKE_INTERFACE_AUDIO_TX;
    
  6. Configure the transfer address

    PGDMA_InitTypeDef->GDMA_SrcAddr = (u32)BDSrcTest;
    PGDMA_InitTypeDef->GDMA_DstAddr = (u32)BDDstTest;
    
  7. Program GDMA index, GDMA channel, data width, msize, transfer direction, address increment mode, hardware handshake interface, reload control, interrupt type, block size, multi-block configuration and the source and destination address using the GDMA_Init() function.

    GDMA_Init(gdma.index, gdma.ch_num, PGDMA_InitTypeDef);
    
  8. Clean and invalidate Cache

    DCache_CleanInvalidate();
    
  9. Enable GDMA channel

    GDMA_Cmd(gdma.index, gdma.ch_num, ENABLE);
    

Multi-block

This example uses SRC auto-reload. Compared with single block, multi-block differs in Step 2 to Step 4.

  1. Allocate a free channel

    ch_num = GDMA_ChnlAlloc(gdma.index, (IRQ_FUN) Dma_memcpy_int, (u32)(&gdma), 3);
    

    This function also includes the following operation:

    • Register IRQ handler if using interrupt mode

    • Enable NVIC interrupt

    • Register the GDMA channel to use

  2. Configure the interrupt type

    PGDMA_InitTypeDef->GDMA_IsrType = (BlockType | TransferType | ErrType);

  3. Configure interrupt handling function

    1. Clear the interrupt.

      GDMA_ClearINT(0, GDMA_InitStruct->GDMA_ChNum);

    2. Clear the auto reload mode before the last block starts.

      GDMA_ChCleanAutoReload(0, GDMA_InitStruct->GDMA_ChNum, CLEAN_RELOAD_SRC);

  4. Configure transfer settings

    PGDMA_InitTypeDef->GDMA_SrcMsize = MsizeEight;
    PGDMA_InitTypeDef->GDMA_SrcDataWidth = TrWidthFourBytes;
    PGDMA_InitTypeDef->GDMA_DstMsize = MsizeEight;
    PGDMA_InitTypeDef->GDMA_DstDataWidth = TrWidthFourBytes;
    PGDMA_InitTypeDef->GDMA_BlockSize = DMA_CPY_LEN >> 2;
    PGDMA_InitTypeDef->GDMA_DstInc = IncType; // If DST type is peripheral: no change
    PGDMA_InitTypeDef->GDMA_SrcInc = IncType; // If SRC type is peripheral: no change
    PGDMA_InitTypeDef->GDMA_ReloadSrc = 1;
    PGDMA_InitTypeDef->GDMA_ReloadDst = 0;

  5. Configure hardware handshake interface if slave is peripheral.

    GDMA_InitStruct->GDMA_SrcHandshakeInterface= GDMA_HANDSHAKE_INTERFACE_AUDIO_RX;

    or

    GDMA_InitStruct->GDMA_DstHandshakeInterface = GDMA_HANDSHAKE_INTERFACE_AUDIO_TX;

  6. Configure the transfer address

    PGDMA_InitTypeDef->GDMA_SrcAddr = (u32)BDSrcTest;
    PGDMA_InitTypeDef->GDMA_DstAddr = (u32)BDDstTest;

  7. Program GDMA index, GDMA channel, data width, msize, transfer direction, address increment mode, hardware handshake interface, reload control, interrupt type, block size, multi-block configuration and the source and destination address using the GDMA_Init() function.

    GDMA_Init(gdma.index, gdma.ch_num, PGDMA_InitTypeDef);

  8. Clean and invalidate Cache

    DCache_CleanInvalidate();

  9. Enable GDMA channel

    GDMA_Cmd(gdma.index, gdma.ch_num, ENABLE);