TECHNICAL GUIDE · STM32L4R5 · DMA · 2026

DMA & DMAMUX
Request Routing on STM32L4+

Two 7-channel DMA controllers plus the DMAMUX1 request router — CCR/CNDTR/CPAR/CMAR, mem2periph & periph2mem, circular streaming, the full request-line table, and complete ADC+DMA and UART+DMA examples for the STM32L4R5 (RM0432).

01 DMA + DMAMUX architecture on STM32L4+

The STM32L4R5 (Cortex-M4F, RM0432) has two general-purpose DMA controllers, DMA1 and DMA2, each with 7 independent channels (14 channels total). On this L4+ device the channels are decoupled from the peripherals by a request router called DMAMUX1.

Why DMAMUX exists

On classic STM32F1/F4 parts every peripheral DMA request was hard-wired to one fixed DMA channel (or picked from a tiny CxS set), so two peripherals that wanted the same channel could not both use DMA. STM32L4+ inserts DMAMUX1 between the ~94 peripheral request lines and the 14 DMA channels: any request line can be routed to any channel. You no longer look up "which channel does USART1_TX live on" — you pick a free channel and program its DMAMUX request ID.

Peripheral (ADC1, USART2, SPI1 …)
        │  request line  (fixed ID, e.g. ADC1 = 5)
        ▼
   DMAMUX1  ──  one CxCR per DMA channel selects DMAREQ_ID
        │  routed request
        ▼
   DMA1 / DMA2 channel  ──  CCR/CNDTR/CPAR/CMAR move the data
        │  AHB master
        ▼
   SRAM  ⇄  Peripheral data register

Memory map (AHB1)

| Block | Base address | Notes |
|---|---|---|
| DMA1 | 0x4002 0000 | ISR @+0x00, IFCR @+0x04, Ch1 regs @+0x08 |
| DMA2 | 0x4002 0400 | same layout as DMA1 |
| DMAMUX1 | 0x4002 0800 | C0CR @+0x00 … C13CR @+0x34 |
| RCC | 0x4002 1000 | AHB1ENR gates DMA1/DMA2/DMAMUX1 clocks |
| DMA1_Channel1 | 0x4002 0008 | each channel block = 0x14 bytes |
| ADC1 | 0x5004 0000 | DR @+0x40 → 0x5004 0040 (DMA source) |
| USART2 | 0x4000 4400 | RDR @+0x24, TDR @+0x28 |

DMAMUX channel ↔ DMA channel mapping

DMAMUX1 has one control register (CxCR) per DMA channel. The index is fixed:

| DMAMUX channel | Drives | Register |
|---|---|---|
| 0 … 6 | DMA1 Channel 1 … 7 | DMAMUX1_Channel0 … 6 |
| 7 … 13 | DMA2 Channel 1 … 7 | DMAMUX1_Channel7 … 13 |

Formula: dmamux_index = (dma == DMA2 ? 7 : 0) + (channel - 1), with channel in 1…7.
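
As a quick host-side check of the formula above, the sketch below computes the DMAMUX1 channel index and the address of the matching CxCR register. The helper names and the `dma_no` convention (1 for DMA1, 2 for DMA2) are assumptions made for this illustration; they are not CMSIS or HAL identifiers.

```c
#include <assert.h>
#include <stdint.h>

#define DMAMUX1_BASE_ADDR 0x40020800u   /* DMAMUX1 base from the memory map */

/* dma_no: 1 for DMA1, 2 for DMA2 (hypothetical convention); ch: 1..7.
 * Returns the DMAMUX1 channel index 0..13. */
static inline uint32_t dmamux_index(uint32_t dma_no, uint32_t ch)
{
    assert(dma_no == 1u || dma_no == 2u);
    assert(ch >= 1u && ch <= 7u);
    return (dma_no == 2u ? 7u : 0u) + (ch - 1u);
}

/* Address of the CxCR register: one 32-bit word per DMAMUX channel. */
static inline uint32_t dmamux_cxcr_addr(uint32_t dma_no, uint32_t ch)
{
    return DMAMUX1_BASE_ADDR + 4u * dmamux_index(dma_no, ch);
}
```

For example, DMA2 Channel 7 maps to index 13, whose CxCR sits at 0x4002 0834 (C13CR @+0x34).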

Enable the clocks (register level)

Three clock bits live in RCC->AHB1ENR. Forgetting DMAMUX1EN is the single most common "my DMA does nothing" bug.

c — clock enable
#include "stm32l4r5xx.h"

/* RCC->AHB1ENR: DMA1EN=bit0, DMA2EN=bit1, DMAMUX1EN=bit2 */
RCC->AHB1ENR |= RCC_AHB1ENR_DMA1EN      /* controller  */
             |  RCC_AHB1ENR_DMAMUX1EN;  /* the router — do NOT forget */
(void)RCC->AHB1ENR;                     /* read-back: let the clock settle */

This section

  • DMA1 & DMA2, 7 channels each; DMAMUX1 routes any of ~94 request lines to any channel.
  • DMAMUX channel index = (DMA2?7:0) + (channel−1); DMA1.Ch1→mux0, DMA2.Ch7→mux13.
  • Enable DMA1/DMA2 and DMAMUX1 in RCC→AHB1ENR (bits 0/1/2).

02 DMA channel registers: CCR / CNDTR / CPAR / CMAR

Each channel is described by exactly four 32-bit registers. The controller adds two shared registers, ISR (status) and IFCR (flag clear). Channel n registers sit at DMAx_BASE + 0x08 + 0x14·(n−1).

| Register | Offset | Meaning |
|---|---|---|
| CCR | +0x00 | Channel configuration (direction, sizes, increment, circular, IRQ enables, EN) |
| CNDTR | +0x04 | Number of data items to transfer (0…65535). Counts down; read-only while EN=1 |
| CPAR | +0x08 | Peripheral address, usually &PERIPH->DR (source for P2M, dest for M2P) |
| CMAR | +0x0C | Memory address: your SRAM buffer |
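
The address rule DMAx_BASE + 0x08 + 0x14·(n−1) can be checked on a host with a small sketch like the one below. The macro, enum and function names are assumptions for this illustration; on the target you would normally use the CMSIS `DMA1_Channel1` … `DMA2_Channel7` pointers instead of computing addresses by hand.

```c
#include <assert.h>
#include <stdint.h>

#define DMA1_BASE_ADDR 0x40020000u
#define DMA2_BASE_ADDR 0x40020400u

/* Register offsets inside one channel block (from the table above). */
enum { CH_CCR = 0x00, CH_CNDTR = 0x04, CH_CPAR = 0x08, CH_CMAR = 0x0C };

/* Absolute address of one register of channel ch (1..7). */
static inline uint32_t dma_ch_reg_addr(uint32_t dma_base, uint32_t ch,
                                       uint32_t reg_off)
{
    assert(ch >= 1u && ch <= 7u);
    assert(reg_off <= 0x0Cu && (reg_off & 3u) == 0u);
    return dma_base + 0x08u + 0x14u * (ch - 1u) + reg_off;
}
```

With these definitions, DMA1 Channel 1 CCR resolves to 0x4002 0008, which agrees with the memory map.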

CCR bit fields

| Bits | Field | Function |
|---|---|---|
| 0 | EN | Channel enable. Set last. Most other fields are read-only while EN=1 |
| 1 | TCIE | Transfer-complete interrupt enable |
| 2 | HTIE | Half-transfer interrupt enable |
| 3 | TEIE | Transfer-error interrupt enable |
| 4 | DIR | Direction: 0 = read from peripheral (P2M), 1 = read from memory (M2P) |
| 5 | CIRC | Circular mode: CNDTR auto-reloads, transfer never stops |
| 6 | PINC | Peripheral address increment (normally 0 for a data register) |
| 7 | MINC | Memory address increment (normally 1 to walk a buffer) |
| 9:8 | PSIZE | Peripheral data size: 00=8-bit, 01=16-bit, 10=32-bit |
| 11:10 | MSIZE | Memory data size: 00=8-bit, 01=16-bit, 10=32-bit |
| 13:12 | PL | Priority: 00=low, 01=medium, 10=high, 11=very high |
| 14 | MEM2MEM | Memory-to-memory mode (channel runs freely, no hardware request) |
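
To make the bit layout concrete, here is a minimal host-side sketch that packs a CCR value from the fields in the table. The `CCR_*` macros and `ccr_build()` are local names invented for this example (the CMSIS header provides its own `DMA_CCR_*` masks), and the builder deliberately refuses the EN bit because EN is meant to be set last.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions taken from the CCR table (names local to this sketch). */
#define CCR_EN    (1u << 0)
#define CCR_TCIE  (1u << 1)
#define CCR_HTIE  (1u << 2)
#define CCR_TEIE  (1u << 3)
#define CCR_DIR   (1u << 4)
#define CCR_CIRC  (1u << 5)
#define CCR_PINC  (1u << 6)
#define CCR_MINC  (1u << 7)

/* psize/msize: 0 = 8-bit, 1 = 16-bit, 2 = 32-bit; pl: 0 = low .. 3 = very high. */
static inline uint32_t ccr_build(uint32_t flags, uint32_t psize,
                                 uint32_t msize, uint32_t pl)
{
    assert((flags & CCR_EN) == 0u);      /* EN is written separately, last */
    assert(psize <= 2u && msize <= 2u);  /* code 11 is not a valid size    */
    assert(pl <= 3u);
    return flags | (psize << 8) | (msize << 10) | (pl << 12);
}
```

A circular 16-bit peripheral-to-memory channel at high priority, `ccr_build(CCR_MINC | CCR_CIRC, 1, 1, 2)`, comes out as 0x25A0; a byte-wide memory-to-peripheral transfer with a completion interrupt, `ccr_build(CCR_DIR | CCR_MINC | CCR_TCIE, 0, 0, 0)`, is 0x92.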

Sizes, direction and CNDTR — the practical rules

| Rule | In practice |
|---|---|
| PSIZE = width of the peripheral register | ADC_DR and USART_RDR/TDR are accessed at the width you configure: ADC = 16-bit (halfword), UART = 8-bit (byte). A mismatch corrupts data or faults. |
| MSIZE = width of one buffer element | uint16_t buf[] → MSIZE=01; uint8_t buf[] → MSIZE=00. If PSIZE≠MSIZE the DMA zero-extends or truncates each item; it does not pack several narrow items into one wider word. |
| CNDTR = element count, not bytes | For a 3-sample ADC scan CNDTR=3; for a 64-byte UART frame CNDTR=64. It is decremented once per transfer and reloaded in circular mode. |
| DIR is about the READ source | P2M (RX / ADC in): DIR=0. M2P (TX / DAC out): DIR=1. Getting this backwards is a classic silent failure. |

Write order

CPAR, CMAR and CNDTR are writable only while the channel is disabled (EN=0). Writing them with EN=1 is ignored. Always clear EN and spin until it reads back 0 before reprogramming.
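
The size and count rules above lend themselves to two tiny helpers. The sketch below is only an illustration with hypothetical function names: one maps an element width in bytes to the PSIZE/MSIZE code, the other turns a buffer size in bytes into the element count that CNDTR expects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Element width in bytes -> PSIZE/MSIZE code (1 -> 00, 2 -> 01, 4 -> 10). */
static inline uint32_t dma_size_code(size_t elem_bytes)
{
    assert(elem_bytes == 1u || elem_bytes == 2u || elem_bytes == 4u);
    return (uint32_t)(elem_bytes >> 1);
}

/* Buffer size in bytes -> CNDTR value (a count of elements, not bytes). */
static inline uint16_t dma_item_count(size_t buf_bytes, size_t elem_bytes)
{
    assert(elem_bytes != 0u && buf_bytes % elem_bytes == 0u);
    size_t n = buf_bytes / elem_bytes;
    assert(n <= 65535u);                 /* CNDTR range is 0..65535 */
    return (uint16_t)n;
}
```

For a `uint16_t buf[32]`, `dma_size_code(sizeof buf[0])` gives 1 (16-bit) and `dma_item_count(sizeof buf, sizeof buf[0])` gives 32, not 64.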

03 DMAMUX1 request routing & request-line table

To connect a peripheral to a channel you write the peripheral's request ID into the DMAREQ_ID[7:0] field of the DMAMUX CxCR register that corresponds to that channel. That single write is the whole "routing" step.

DMAMUX CxCR bit fields

| Bits | Field | Function |
|---|---|---|
| 7:0 | DMAREQ_ID | Selected request line (see table below). This is all you need for plain peripheral DMA |
| 8 | SOIE | Synchronization overrun interrupt enable |
| 9 | EGE | Event generation enable (drive a request-generator trigger) |
| 16 | SE | Synchronization enable: gate requests on a sync input |
| 18:17 | SPOL | Sync edge polarity (00 none, 01 rising, 10 falling, 11 both) |
| 23:19 | NBREQ | Number of requests to forward per sync event (minus 1) |
| 28:24 | SYNC_ID | Sync input selection (used only when SE=1) |

For the vast majority of transfers you leave SE=0 and EGE=0 and write only DMAREQ_ID, i.e. DMAMUX1_ChannelN->CCR = request;.
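
As an illustration of how the CxCR fields pack together, the sketch below builds the plain value (request ID only) and a synchronized variant. The function names are made up for this example, and the sync input number used in the usage note is an arbitrary placeholder, not a documented assignment.

```c
#include <assert.h>
#include <stdint.h>

/* Plain routing: only DMAREQ_ID is written, SE and EGE stay 0. */
static inline uint32_t dmamux_cxcr_plain(uint32_t req_id)
{
    assert(req_id <= 93u);               /* request IDs run 0..93 on this part */
    return req_id;
}

/* Synchronized routing: forward `nreq` requests per sync edge.
 * spol: 1 = rising, 2 = falling, 3 = both. NBREQ stores nreq - 1. */
static inline uint32_t dmamux_cxcr_synced(uint32_t req_id, uint32_t sync_id,
                                          uint32_t spol, uint32_t nreq)
{
    assert(req_id <= 93u);
    assert(sync_id <= 31u);              /* 5-bit SYNC_ID field  */
    assert(spol >= 1u && spol <= 3u);    /* 00 means no edge     */
    assert(nreq >= 1u && nreq <= 32u);   /* 5-bit NBREQ field    */
    return req_id
         | (1u << 16)                    /* SE      */
         | (spol << 17)                  /* SPOL    */
         | ((nreq - 1u) << 19)           /* NBREQ   */
         | (sync_id << 24);              /* SYNC_ID */
}
```

`dmamux_cxcr_plain(5)` is simply 5 (ADC1). With a hypothetical sync input 2, rising edge, and four requests per event, `dmamux_cxcr_synced(5, 2, 1, 4)` evaluates to 0x021B0005.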

DMAMUX1 request-line numbers (STM32L4R5)

The L4R5 has a single ADC (ADC1 only), so the request IDs are not shifted by the optional ADC2 slot. IDs run 0…93. Request 0 is memory-to-memory (no hardware trigger); 1–4 are the DMAMUX request-generator outputs; 5 and up are peripherals.

| ID | Request line | ID | Request line |
|---|---|---|---|
| 0 | MEM2MEM (no request) | 25 | USART1_TX |
| 1–4 | DMAMUX req generator 0–3 | 26 | USART2_RX |
| 5 | ADC1 | 27 | USART2_TX |
| 6 | DAC1_CH1 | 28 | USART3_RX |
| 7 | DAC1_CH2 | 29 | USART3_TX |
| 8 | TIM6_UP | 30 | UART4_RX |
| 9 | TIM7_UP | 31 | UART4_TX |
| 10 | SPI1_RX | 32 | UART5_RX |
| 11 | SPI1_TX | 33 | UART5_TX |
| 12 | SPI2_RX | 34 | LPUART1_RX |
| 13 | SPI2_TX | 35 | LPUART1_TX |
| 14 | SPI3_RX | 36–37 | SAI1_A / SAI1_B |
| 15 | SPI3_TX | 38–39 | SAI2_A / SAI2_B |
| 16 | I2C1_RX | 40–41 | OCTOSPI1 / OCTOSPI2 |
| 17 | I2C1_TX | 42–48 | TIM1 CH1..4/UP/TRIG/COM |
| 18 | I2C2_RX | 49–55 | TIM8 CH1..4/UP/TRIG/COM |
| 19 | I2C2_TX | 56–60 | TIM2 CH1..4/UP |
| 20 | I2C3_RX | 61–66 | TIM3 CH1..4/UP/TRIG |
| 21 | I2C3_TX | 67–71 | TIM4 CH1..4/UP |
| 22 | I2C4_RX | 72–77 | TIM5 CH1..4/UP/TRIG |
| 23 | I2C4_TX | 78–85 | TIM15/16/17 |
| 24 | USART1_RX | 86–89 | DFSDM1_FLT0..3 |
| 90 | DCMI / PSSI | 91–93 | AES_IN / AES_OUT / HASH_IN |

These IDs match the HAL macros DMA_REQUEST_ADC1 (5), DMA_REQUEST_USART2_TX (27), etc. (defined in stm32l4xx_hal_dma.h), and the RM0432 DMAMUX assignment table. Always cross-check against the exact reference manual for your part number.
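
The table has a regular structure for the serial peripherals: each instance takes two consecutive IDs, RX first and then TX. The sketch below encodes that pattern for the L4R5 numbering shown above; the helper names are hypothetical, and the arithmetic is only valid for this table, so it should not be reused on parts whose IDs are shifted.

```c
#include <assert.h>
#include <stdint.h>

/* First RX request ID of each family, taken from the table above. */
enum { REQ_SPI1_RX = 10, REQ_I2C1_RX = 16, REQ_USART1_RX = 24 };

/* n = 1..5 covers USART1..USART3 and UART4..UART5. */
static inline uint32_t req_usart_rx(uint32_t n)
{
    assert(n >= 1u && n <= 5u);
    return (uint32_t)REQ_USART1_RX + 2u * (n - 1u);
}
static inline uint32_t req_usart_tx(uint32_t n) { return req_usart_rx(n) + 1u; }

/* n = 1..3 for SPI1..SPI3. */
static inline uint32_t req_spi_rx(uint32_t n)
{
    assert(n >= 1u && n <= 3u);
    return (uint32_t)REQ_SPI1_RX + 2u * (n - 1u);
}
static inline uint32_t req_spi_tx(uint32_t n) { return req_spi_rx(n) + 1u; }

/* n = 1..4 for I2C1..I2C4. */
static inline uint32_t req_i2c_rx(uint32_t n)
{
    assert(n >= 1u && n <= 4u);
    return (uint32_t)REQ_I2C1_RX + 2u * (n - 1u);
}
static inline uint32_t req_i2c_tx(uint32_t n) { return req_i2c_rx(n) + 1u; }
```

In production code the named macros from the vendor headers remain the safer choice; the helpers are mainly useful as a sanity check against the table.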

Sync & request generator (brief)

Synchronization (SE=1) holds requests back until an edge arrives on the input selected by SYNC_ID, then forwards a batch of NBREQ+1 requests. It is useful for aligning a DMA burst to a timer or external event, and it is configured in the channel's own CxCR (SE, SPOL, NBREQ, SYNC_ID). The request generator (request IDs 1–4) turns a trigger edge into DMA requests for peripherals that have no native DMA line; its RGxCR registers start at DMAMUX1 + 0x100. Leave both off for ordinary peripheral streaming.

04 Programming sequence: the transfer recipe

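Every DMA setup on this device follows the same ten steps, in this order. The failures are silent: if the channel is still enabled while you reprogram it (step 2 skipped) or the peripheral's own DMA-request enable is missing (step 10), nothing moves and no error is reported.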

| # | Step |
|---|---|
| 1 | Enable clocks: DMAx + DMAMUX1 + the peripheral |
| 2 | Disable the channel (CCR.EN=0) and spin until EN reads 0 |
| 3 | Clear this channel's flags in DMAx→IFCR |
| 4 | Write CPAR (peripheral data-register address) |
| 5 | Write CMAR (buffer address) |
| 6 | Write CNDTR (element count) |
| 7 | Route the request: DMAMUX1_ChannelN→CCR = request ID |
| 8 | Write CCR: DIR, PINC/MINC, PSIZE/MSIZE, PL, CIRC, IRQ enables (EN still 0) |
| 9 | Set CCR.EN = 1 |
| 10 | Enable the peripheral's DMA request bit (ADC DMAEN, USART DMAT/DMAR, SPI TXDMAEN/RXDMAEN) |

Reusable register-level helper

c — generic DMA channel setup
#include "stm32l4r5xx.h"
#include <stdint.h>

/* DMAMUX CxCR that drives a given DMA channel (ch = 1..7).
 *   DMA1 Ch1..7 -> DMAMUX1 Channel 0..6
 *   DMA2 Ch1..7 -> DMAMUX1 Channel 7..13                     */
static inline DMAMUX_Channel_TypeDef *dmamux_for(DMA_TypeDef *dma, uint8_t ch)
{
    uint32_t idx = (dma == DMA2 ? 7u : 0u) + (uint32_t)(ch - 1u);
    return DMAMUX1_Channel0 + idx;      /* 4 bytes per CxCR */
}

/* Register block for DMA channel ch (1..7). */
static inline DMA_Channel_TypeDef *dma_ch(DMA_TypeDef *dma, uint8_t ch)
{
    uint32_t base = (uint32_t)dma + 0x08u + 0x14u * (uint32_t)(ch - 1u);
    return (DMA_Channel_TypeDef *)base;
}

/* One-shot or circular transfer setup. `ccr_flags` carries DIR, MINC,
 * PSIZE/MSIZE, PL, CIRC and any *IE bits — but NOT the EN bit.        */
void dma_setup(DMA_TypeDef *dma, uint8_t ch, uint8_t request,
               volatile void *periph, void *mem, uint16_t count,
               uint32_t ccr_flags)
{
    DMA_Channel_TypeDef *c = dma_ch(dma, ch);

    c->CCR &= ~DMA_CCR_EN;                     /* (2) disable   */
    while (c->CCR & DMA_CCR_EN) { }            /*     wait off  */

    dma->IFCR = 0xFu << (4u * (ch - 1u));      /* (3) clear GIF/TCIF/HTIF/TEIF */

    c->CPAR  = (uint32_t)periph;               /* (4) periph addr */
    c->CMAR  = (uint32_t)mem;                  /* (5) buffer addr */
    c->CNDTR = count;                          /* (6) item count  */

    dmamux_for(dma, ch)->CCR = request;        /* (7) route request */

    c->CCR = ccr_flags;                        /* (8) config, EN=0   */
    c->CCR |= DMA_CCR_EN;                       /* (9) go             */
}

Note

Step 10 (the peripheral's own DMA-request enable) lives in the peripheral driver, not in dma_setup() — the two examples below show exactly where it goes.

05 Worked example: ADC1 + DMA (circular scan)

Continuously scan three ADC channels into an SRAM buffer with zero CPU involvement. ADC1 raises request ID 5, routed to DMA1 Channel 1 (DMAMUX channel 0). Circular DMA + continuous conversion means the buffer stays fresh forever.

c — ADC1 + DMA1 Ch1, circular (CMSIS register level)
#include "stm32l4r5xx.h"
#include <stdint.h>

#define NUM_CH   3u
static volatile uint16_t adc_buf[NUM_CH];   /* 12-bit samples, halfwords */

static void clocks_init(void)
{
    RCC->AHB1ENR |= RCC_AHB1ENR_DMA1EN | RCC_AHB1ENR_DMAMUX1EN;
    RCC->AHB2ENR |= RCC_AHB2ENR_ADCEN;       /* ADC is on AHB2 */
    (void)RCC->AHB2ENR;

    /* Clock the ADC synchronously from HCLK/4 (CKMODE=11) so no
       RCC ADCSEL kernel-clock selection is required.                */
    ADC1_COMMON->CCR |= ADC_CCR_CKMODE_0 | ADC_CCR_CKMODE_1;  /* 11 */
}

static void adc_enable(void)
{
    ADC1->CR &= ~ADC_CR_DEEPPWD;             /* leave deep-power-down   */
    ADC1->CR |= ADC_CR_ADVREGEN;             /* turn on ADC regulator   */
    for (volatile int i = 0; i < 4000; i++) { }  /* > t_ADCVREG_STUP (~20us) */

    ADC1->CR &= ~ADC_CR_ADCALDIF;            /* single-ended calibration */
    ADC1->CR |= ADC_CR_ADCAL;
    while (ADC1->CR & ADC_CR_ADCAL) { }      /* wait for cal to finish   */

    ADC1->ISR = ADC_ISR_ADRDY;              /* clear ADRDY (rc_w1)      */
    ADC1->CR |= ADC_CR_ADEN;                 /* enable the ADC           */
    while (!(ADC1->ISR & ADC_ISR_ADRDY)) { }
}

static void adc_dma_init(void)
{
    /* --- regular sequence: 3 conversions, channels 1,2,3 --- */
    ADC1->SQR1 = ((NUM_CH - 1u) << ADC_SQR1_L_Pos)  /* L = length-1 */
               | (1u << ADC_SQR1_SQ1_Pos)
               | (2u << ADC_SQR1_SQ2_Pos)
               | (3u << ADC_SQR1_SQ3_Pos);
    ADC1->SMPR1 = 0x3FFFFFFFu;               /* long sample time, ch0..9 */

    /* continuous + DMA + circular DMA (DMACFG=1) + overrun overwrite */
    ADC1->CFGR = ADC_CFGR_CONT | ADC_CFGR_DMAEN
               | ADC_CFGR_DMACFG | ADC_CFGR_OVRMOD;

    /* --- DMA1 Channel 1: ADC1->DR (16-bit) -> adc_buf, circular --- */
    DMA_Channel_TypeDef *c = DMA1_Channel1;
    c->CCR &= ~DMA_CCR_EN;
    while (c->CCR & DMA_CCR_EN) { }
    DMA1->IFCR = 0xFu;                        /* clear ch1 flags */

    c->CPAR  = (uint32_t)&ADC1->DR;          /* source: ADC data reg */
    c->CMAR  = (uint32_t)adc_buf;             /* dest: SRAM buffer    */
    c->CNDTR = NUM_CH;                        /* 3 halfword items     */

    DMAMUX1_Channel0->CCR = 5u;              /* DMA1 Ch1 = mux0, req = ADC1 */

    c->CCR = DMA_CCR_MINC                     /* walk the buffer        */
           | DMA_CCR_CIRC                     /* wrap forever           */
           | DMA_CCR_PSIZE_0                  /* PSIZE=01 -> 16-bit     */
           | DMA_CCR_MSIZE_0                  /* MSIZE=01 -> 16-bit     */
           | DMA_CCR_PL_1;                    /* priority = high (10)   */
    /* DIR left 0 = read from peripheral (P2M); PINC left 0 */
    c->CCR |= DMA_CCR_EN;                      /* enable channel         */
}

int main(void)
{
    clocks_init();
    adc_enable();
    adc_dma_init();

    ADC1->CR |= ADC_CR_ADSTART;               /* start regular conversions */

    for (;;) {
        /* adc_buf[0..2] refreshed continuously by DMA — read anytime. */
    }
}

Order matters

ADC_CFGR_DMAEN (step 10) is set before ADSTART, and the DMA channel is enabled before the ADC starts converting. If the ADC produces a result before the DMA is armed, the first sample is lost. OVRMOD=1, as set above, makes the ADC overwrite DR with the newest result rather than keep the old one.
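
The regular-sequence word written to ADC1->SQR1 in adc_dma_init() can be reproduced on a host to check the shifts. The sketch below assumes the SQR1 layout used in that listing (L in bits 3:0, SQ1 at bit 6, then one 6-bit slot per further rank); `adc_sqr1()` is a hypothetical helper, not a CMSIS or HAL function.

```c
#include <assert.h>
#include <stdint.h>

/* Build SQR1 for the first n ranks (n = 1..4) of a regular sequence.
 * ch[i] is the ADC input channel converted at rank i + 1. */
static inline uint32_t adc_sqr1(const uint8_t *ch, uint32_t n)
{
    assert(ch != 0);
    assert(n >= 1u && n <= 4u);          /* SQR1 only holds SQ1..SQ4 */

    uint32_t v = n - 1u;                 /* L field = sequence length - 1 */
    for (uint32_t i = 0u; i < n; i++) {
        assert(ch[i] < 32u);             /* each SQx field is 5 bits wide */
        v |= (uint32_t)ch[i] << (6u + 6u * i);
    }
    return v;
}
```

For the three-channel scan of inputs 1, 2 and 3 used in the example, the result is 0x000C2042.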

06 Worked example: USART/UART + DMA (TX & RX)

UART is the canonical mem2periph (TX) and periph2mem (RX) case. TX = memory→peripheral (DIR=1), RX = peripheral→memory (DIR=0). Bytes, so PSIZE=MSIZE=8-bit (the CCR reset value). Here RX runs circular into a ring buffer while TX is fired on demand.

c — USART2 TX (Ch7) + RX circular (Ch6), register level
#include "stm32l4r5xx.h"
#include <stdint.h>

#define RX_LEN 64u
static volatile uint8_t rx_buf[RX_LEN];   /* circular RX ring */

/* Assumes USART2 already has baud/framing set and UE/TE/RE enabled. */
void uart_dma_init(void)
{
    RCC->AHB1ENR |= RCC_AHB1ENR_DMA1EN | RCC_AHB1ENR_DMAMUX1EN;

    /* ---- RX: USART2->RDR -> rx_buf, circular ----
       DMA1 Ch6 == DMAMUX1 Channel 5, request 26 = USART2_RX */
    DMA1_Channel6->CCR &= ~DMA_CCR_EN;
    while (DMA1_Channel6->CCR & DMA_CCR_EN) { }
    DMA1->IFCR = 0xFu << (4u * (6u - 1u));    /* clear ch6 flags */

    DMA1_Channel6->CPAR  = (uint32_t)&USART2->RDR;
    DMA1_Channel6->CMAR  = (uint32_t)rx_buf;
    DMA1_Channel6->CNDTR = RX_LEN;
    DMAMUX1_Channel5->CCR = 26u;             /* USART2_RX */
    DMA1_Channel6->CCR = DMA_CCR_MINC | DMA_CCR_CIRC; /* 8-bit sizes = reset */
    DMA1_Channel6->CCR |= DMA_CCR_EN;

    /* ---- TX: only route the channel now (armed per message) ----
       DMA1 Ch7 == DMAMUX1 Channel 6, request 27 = USART2_TX */
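    DMAMUX1_Channel6->CCR = 27u;             /* USART2_TX */

    /* uart_send_dma() sets TCIE, so the channel IRQ must be enabled in the NVIC */
    NVIC_SetPriority(DMA1_Channel7_IRQn, 5);
    NVIC_EnableIRQ(DMA1_Channel7_IRQn);
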
    /* Step 10: let USART generate DMA requests in both directions */
    USART2->CR3 |= USART_CR3_DMAT | USART_CR3_DMAR;
}

/* Fire a memory-to-peripheral transfer of `len` bytes. */
void uart_send_dma(const uint8_t *data, uint16_t len)
{
    DMA_Channel_TypeDef *c = DMA1_Channel7;

    c->CCR &= ~DMA_CCR_EN;
    while (c->CCR & DMA_CCR_EN) { }
    DMA1->IFCR = 0xFu << (4u * (7u - 1u));    /* clear ch7 flags */

    c->CPAR  = (uint32_t)&USART2->TDR;
    c->CMAR  = (uint32_t)data;
    c->CNDTR = len;
    c->CCR   = DMA_CCR_DIR                     /* memory -> peripheral */
            | DMA_CCR_MINC                     /* walk the buffer      */
            | DMA_CCR_TCIE;                    /* IRQ when done        */
    c->CCR  |= DMA_CCR_EN;
}

/* TX-complete: DMA1 Channel 7 (IRQn = 17). */
void DMA1_Channel7_IRQHandler(void)
{
    if (DMA1->ISR & DMA_ISR_TCIF7) {
        DMA1->IFCR = DMA_IFCR_CTCIF7;         /* ack TC */
        DMA1_Channel7->CCR &= ~DMA_CCR_EN;    /* stop the channel */
        /* For half-duplex/RS-485 wait for USART2->ISR & USART_ISR_TC
           before releasing the driver-enable line. */
    }
}

/* Current DMA write index in the ring (0..RX_LEN-1) = RX_LEN - remaining.
   In circular mode it wraps, so it is a position, not a running total. */
uint16_t uart_rx_count(void)
{
    return (uint16_t)(RX_LEN - DMA1_Channel6->CNDTR);
}
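
Because the RX channel is circular, the value derived from CNDTR is a write position that wraps. A consumer therefore usually keeps its own read index and computes the unread span modulo the ring size. The sketch below shows that arithmetic on a host; the `tail` index and both function names are assumptions for this illustration, and the scheme cannot detect the DMA lapping the reader by a full ring.

```c
#include <assert.h>
#include <stdint.h>

#define RX_LEN 64u

/* Write index inside the ring, from the channel's remaining count. */
static inline uint32_t ring_head(uint32_t cndtr)
{
    assert(cndtr <= RX_LEN);
    return (RX_LEN - cndtr) % RX_LEN;    /* a full count maps to index 0 */
}

/* Unread bytes between the reader's index (tail) and the DMA's index (head). */
static inline uint32_t ring_available(uint32_t head, uint32_t tail)
{
    assert(head < RX_LEN && tail < RX_LEN);
    return (head + RX_LEN - tail) % RX_LEN;
}
```

With CNDTR reading 54 the write index is 10; if the reader last stopped at index 4, six bytes are waiting. After a wrap, head 3 and tail 60 give seven bytes.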

HAL variant (same USART2 TX via DMA)

The HAL hides the channel and DMAMUX writes behind a DMA_HandleTypeDef. You still choose the channel (Instance) and the request (Init.Request = DMA_REQUEST_USART2_TX, which is 27), then link it to the UART handle.

c — STM32L4xx HAL: HAL_UART_Transmit_DMA
#include "stm32l4xx_hal.h"

extern UART_HandleTypeDef huart2;   /* configured elsewhere (baud, pins) */
DMA_HandleTypeDef hdma_usart2_tx;

void uart2_tx_dma_init(void)
{
    __HAL_RCC_DMA1_CLK_ENABLE();
    __HAL_RCC_DMAMUX1_CLK_ENABLE();

    hdma_usart2_tx.Instance                 = DMA1_Channel7;
    hdma_usart2_tx.Init.Request             = DMA_REQUEST_USART2_TX; /* 27 */
    hdma_usart2_tx.Init.Direction           = DMA_MEMORY_TO_PERIPH;
    hdma_usart2_tx.Init.PeriphInc           = DMA_PINC_DISABLE;
    hdma_usart2_tx.Init.MemInc              = DMA_MINC_ENABLE;
    hdma_usart2_tx.Init.PeriphDataAlignment = DMA_PDATAALIGN_BYTE;
    hdma_usart2_tx.Init.MemDataAlignment    = DMA_MDATAALIGN_BYTE;
    hdma_usart2_tx.Init.Mode                = DMA_NORMAL;
    hdma_usart2_tx.Init.Priority            = DMA_PRIORITY_HIGH;
    HAL_DMA_Init(&hdma_usart2_tx);

    __HAL_LINKDMA(&huart2, hdmatx, hdma_usart2_tx);  /* wire handle to UART */

    HAL_NVIC_SetPriority(DMA1_Channel7_IRQn, 5, 0);
    HAL_NVIC_EnableIRQ(DMA1_Channel7_IRQn);
}

/* Route the channel IRQ into the HAL DMA state machine. */
void DMA1_Channel7_IRQHandler(void)
{
    HAL_DMA_IRQHandler(huart2.hdmatx);
}

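/* HAL_UART_Transmit_DMA() finishes the transfer from the USART2 interrupt
   (transmission complete), so USART2_IRQn must also be enabled and routed
   to HAL_UART_IRQHandler(&huart2) where huart2 is configured. */
void app_send(const uint8_t *buf, uint16_t n)
{
    HAL_UART_Transmit_DMA(&huart2, (uint8_t *)buf, n);
}

TX vs RX peripheral address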

RX reads USART2->RDR (offset 0x24); TX writes USART2->TDR (offset 0x28). They are different registers — pointing CPAR at the wrong one is a frequent copy-paste bug.

07 Interrupts: TC / HT / TE, flags & ping-pong

Each channel can raise three events — transfer complete (TC), half transfer (HT) and transfer error (TE) — enabled by TCIE/HTIE/TEIE in CCR. Status is read from DMAx->ISR and cleared by writing 1 to the matching bit in DMAx->IFCR.

ISR / IFCR flag layout

Each channel owns a nibble; the same bit positions are used in ISR (read) and IFCR (write-1-to-clear, prefixed C).

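| Channel | GIF (global) | TCIF | HTIF | TEIF |
|---|---|---|---|---|
| 1 | bit 0 | bit 1 | bit 2 | bit 3 |
| 2 | bit 4 | bit 5 | bit 6 | bit 7 |
| 3 | bit 8 | bit 9 | bit 10 | bit 11 |
| n | 4(n−1) | 4(n−1)+1 | 4(n−1)+2 | 4(n−1)+3 |
| 7 | bit 24 | bit 25 | bit 26 | bit 27 |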
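
The nibble layout translates directly into a mask computation. The following host-side sketch derives the ISR/IFCR mask for any channel and flag; the enum and function names are invented for this example, and on the target the CMSIS constants (DMA_ISR_TCIF1, DMA_IFCR_CTCIF7 and so on) express the same bits.

```c
#include <assert.h>
#include <stdint.h>

/* Flag position inside a channel's nibble (same in ISR and IFCR). */
enum { FLAG_GIF = 0, FLAG_TCIF = 1, FLAG_HTIF = 2, FLAG_TEIF = 3 };

/* Mask of one flag for channel ch (1..7). */
static inline uint32_t dma_flag_mask(uint32_t ch, uint32_t flag)
{
    assert(ch >= 1u && ch <= 7u);
    assert(flag <= 3u);
    return 1u << (4u * (ch - 1u) + flag);
}

/* Mask of all four flags of channel ch, e.g. to clear them in IFCR. */
static inline uint32_t dma_all_flags(uint32_t ch)
{
    assert(ch >= 1u && ch <= 7u);
    return 0xFu << (4u * (ch - 1u));
}
```

Channel 7's TCIF is bit 25, so `dma_flag_mask(7, FLAG_TCIF)` is 0x02000000, and `dma_all_flags(6)` is 0x00F00000.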

NVIC interrupt numbers (RM0432 / DS12023)

| IRQ | Position | IRQ | Position |
|---|---|---|---|
| DMA1_Channel1..7 | 11 … 17 | DMA2_Channel1..5 | 56 … 60 |
| DMA2_Channel6 | 68 | DMA2_Channel7 | 69 |
| ADC1 | 18 | DMAMUX1_OVR | 94 |

Note the DMA2 vector table is not contiguous: channels 1–5 are 56–60 but channels 6–7 jump to 68–69. Use the enum names (DMA2_Channel6_IRQn) from the CMSIS header, never a hand-computed number.

Ping-pong double buffering with HT + TC

In circular mode HT fires at the midpoint and TC at the end. Process the first half on HT and the second half on TC while the DMA keeps filling the other half — continuous streaming with no gaps and no data loss.

c — half/full-buffer streaming (DMA1 Ch1)
#include "stm32l4r5xx.h"
#include <stdint.h>

#define BLK 128u
static volatile uint16_t stream[2u * BLK];   /* two half-buffers */
extern void process(const volatile uint16_t *half, uint32_t n);

void stream_irq_enable(void)
{
    /* CNDTR must be 2*BLK and CIRC set when the channel was configured. */
    DMA1_Channel1->CCR |= DMA_CCR_HTIE | DMA_CCR_TCIE | DMA_CCR_TEIE;
    NVIC_SetPriority(DMA1_Channel1_IRQn, 5);
    NVIC_EnableIRQ(DMA1_Channel1_IRQn);
}

void DMA1_Channel1_IRQHandler(void)
{
    uint32_t isr = DMA1->ISR;

    if (isr & DMA_ISR_HTIF1) {               /* first half ready  */
        DMA1->IFCR = DMA_IFCR_CHTIF1;         /* ack BEFORE work    */
        process(&stream[0], BLK);
    }
    if (isr & DMA_ISR_TCIF1) {               /* second half ready */
        DMA1->IFCR = DMA_IFCR_CTCIF1;
        process(&stream[BLK], BLK);
    }
    if (isr & DMA_ISR_TEIF1) {               /* bus / config error */
        DMA1->IFCR = DMA_IFCR_CTEIF1;
        /* channel auto-disables on TE — clear cause and re-init here */
    }
}

Always clear the flag

Clear TCIF/HTIF/TEIF in IFCR at the top of the handler. If you forget, the pending bit stays set and the ISR re-fires immediately, hanging the CPU in the handler.
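
For the ping-pong scheme it can help to reason about which half the DMA is writing at a given moment. The sketch below derives that from the remaining count, assuming a circular channel whose CNDTR was loaded with the full buffer length; the function names are hypothetical and the code is a host-side model, not a driver.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the next element the DMA will write, from the remaining count. */
static inline uint32_t dma_write_index(uint32_t total, uint32_t cndtr)
{
    assert(total > 0u && (total % 2u) == 0u);   /* two equal halves */
    assert(cndtr <= total);
    return (total - cndtr) % total;
}

/* 0 while the DMA fills the first half, 1 while it fills the second.
 * The half that is NOT returned is the one that is safe to process. */
static inline uint32_t dma_active_half(uint32_t total, uint32_t cndtr)
{
    return dma_write_index(total, cndtr) >= total / 2u ? 1u : 0u;
}
```

With a 256-element buffer, a remaining count of 200 means the DMA is at index 56 in the first half, so the second half may be read; once the count drops to 128 or below, the roles swap.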

08 Gotchas & common mistakes

The DMA controller gives almost no feedback when it is misconfigured — it simply does nothing. These are the failure modes that cost the most debugging time on STM32L4+.

| Symptom | Cause & fix |
|---|---|
| DMA does absolutely nothing | DMAMUX1 clock not enabled. Set RCC_AHB1ENR_DMAMUX1EN (bit 2) as well as the DMA1/DMA2 clock. Without it the request never reaches the channel. |
| Channel ignores new CPAR/CMAR/CNDTR | Those registers are read-only while EN=1. Clear EN and spin until it reads 0 before rewriting them. |
| Transfer runs but no peripheral data moves | Step 10 missing: the peripheral's own DMA-request enable (ADC DMAEN, USART DMAT/DMAR, SPI TXDMAEN/RXDMAEN) is separate from the channel EN bit. Both are required. |
| Only ~half the bytes appear, or garbage | PSIZE/MSIZE mismatch. ADC_DR is 16-bit → PSIZE=01 into a uint16_t buffer; UART RDR/TDR is 8-bit → PSIZE=00 into a uint8_t buffer. A byte DMA into a halfword register truncates. |
| Wrong direction / no movement | DIR is about the read source. RX/ADC-in = read peripheral = DIR=0. TX/DAC-out = read memory = DIR=1. It is easy to invert. |
| CNDTR seems off by a factor | CNDTR is an element count, not a byte count. 32 halfwords → CNDTR=32, not 64. |
| ISR fires forever / CPU stuck in handler | You did not clear the flag. Write the matching CTCIFn/CHTIFn/CTEIFn bit to IFCR at the top of the handler. |
| First sample lost / ADC overrun (OVR) | DMA armed after the peripheral produced data. Enable the channel first, or set ADC OVRMOD=1 so it overwrites instead of stalling. |
| Data updates but a stale copy is used | The Cortex-M4 in the L4 has no data cache (no cache maintenance needed, unlike Cortex-M7), but the compiler can still cache the buffer in a register. Declare DMA buffers volatile. |
| Occasional corruption / hard fault | Buffer misaligned for 16/32-bit transfers, or placed in a region the DMA master cannot reach. Keep DMA buffers word-aligned in SRAM1/SRAM2. |
| Two channels, one starves the other | DMAMUX lets any channel serve any request, but bus arbitration is still per-channel: higher PL wins, and on a tie the lower channel number wins. Raise PL on the latency-critical stream. |
| Re-used channel behaves oddly | When you repurpose a channel for a different peripheral, rewrite DMAMUX1_ChannelN->CCR with the new request ID. A leftover ID from the previous use keeps routing the old peripheral. |
| Channel only runs in mem2mem | DMAMUX request ID 0 = MEM2MEM (no hardware trigger). If you leave CxCR at its reset value (0) the channel waits for a request that never comes unless CCR.MEM2MEM is set. |

Checklist before you flash

  • Clocks: DMA1/DMA2 and DMAMUX1 and the peripheral.
  • Channel disabled while writing CPAR/CMAR/CNDTR; EN set last.
  • DMAMUX CxCR = correct request ID for the target channel.
  • PSIZE/MSIZE match the register and buffer widths; DIR matches the direction.
  • Peripheral DMA-request enable bit set; buffer volatile, aligned, in SRAM.
  • IRQ handler clears its IFCR flag first; NVIC enabled with the right IRQn enum.