01 The SPI peripheral on the L4R5
The STM32L4R5 has three general-purpose SPI blocks. SPI1 lives on the fast APB2 bus; SPI2 and SPI3 are on APB1. All three share the same "SPI with FIFO" IP described in RM0432: 32-bit TX and RX FIFOs (four 8-bit entries each), programmable frame size from 4 to 16 bits, and Motorola or TI frame formats.
MASTER (STM32L4R5 SPI1)              SLAVE (e.g. flash, sensor, ADC)
┌────────────────────┐               ┌──────────────┐
│ TX FIFO ─► shift ──┼─ MOSI ───────►│ MOSI         │
│ RX FIFO ◄─ shift ◄─┼─ MISO ◄───────┤ MISO         │
│ BR prescaler ─────►┼─ SCK ────────►│ SCK          │
│ NSS (SW/HW) ───────┼─ CS ─────────►│ /CS          │
└────────────────────┘               └──────────────┘
Clock source: PCLK2 (SPI1) / PCLK1 (SPI2, SPI3)
| Block | Bus / kernel clock | Base address | CMSIS pointer | Notes |
|---|---|---|---|---|
| SPI1 | APB2 (PCLK2) | 0x4001 3000 | SPI1 | Fastest; up to PCLK2 = 120 MHz |
| SPI2 | APB1 (PCLK1) | 0x4000 3800 | SPI2 | SPI only (the L4+ SPI has no I2S mode; audio uses SAI) |
| SPI3 | APB1 (PCLK1) | 0x4000 3C00 | SPI3 | SPI only; uses AF6 |
Unlike the STM32H7, the L4/L4+ SPI has no independent kernel-clock mux — the serial clock is derived directly from the APB clock that feeds the block (PCLK2 for SPI1, PCLK1 for SPI2/SPI3) through the 3-bit BR prescaler. There is exactly one clock domain to reason about.
The 4-entry TX/RX FIFO lets you keep the bus 100% utilised: you can push a second byte before the first has finished shifting out. In polling code you wait on TXE/RXNE; in DMA mode the controller keeps the FIFO fed automatically. The FIFO is also why the disable sequence is non-trivial (see §08).
02 Signals, framing & the four CPOL/CPHA modes
SPI is a full-duplex synchronous bus: on every SCK cycle the master shifts one bit out on MOSI on one clock edge and latches one bit in from MISO on the opposite edge. There is no "read" or "write", only a transfer. To read N bytes you must clock out N (dummy) bytes.
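Because every received byte costs one transmitted byte, a read is usually staged as a TX buffer holding the command followed by filler. The following is a minimal sketch, assuming 0xFF as the dummy value (the slave datasheet has the final say); `spi_fill_read_frame` is a hypothetical helper name, not part of any ST library.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: stage a "command + n_read dummy bytes" frame.
 * Returns the total number of bytes to clock, or 0 if buf is too small. */
size_t spi_fill_read_frame(uint8_t *buf, size_t cap, uint8_t cmd, size_t n_read)
{
    if (buf == NULL || n_read >= cap) {
        return 0;                       /* need n_read + 1 bytes of room */
    }
    buf[0] = cmd;                       /* opcode is clocked out first */
    for (size_t i = 1; i <= n_read; i++) {
        buf[i] = 0xFFu;                 /* assumed dummy value */
    }
    return n_read + 1u;
}
```

The returned length is what gets clocked on the bus; the response bytes then sit at offsets 1..n_read of the receive buffer, because offset 0 is whatever the slave drove while the command was going out.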
CPOL and CPHA — the four modes
CPOL (CR1 bit 1) sets the idle level of SCK. CPHA (CR1 bit 0) selects which clock edge samples the data. The combination gives four standard modes; the slave datasheet dictates which one you must use.
| Mode | CPOL | CPHA | SCK idle | Data sampled on | Data shifted on |
|---|---|---|---|---|---|
| 0 | 0 | 0 | Low | 1st edge (rising) | 2nd edge (falling) |
| 1 | 0 | 1 | Low | 2nd edge (falling) | 1st edge (rising) |
| 2 | 1 | 0 | High | 1st edge (falling) | 2nd edge (rising) |
| 3 | 1 | 1 | High | 2nd edge (rising) | 1st edge (falling) |
CPOL, CPHA, BR, MSTR, LSBFIRST, DS and the NSS bits must be programmed with the SPI disabled (CR1.SPE = 0). Changing them on a running peripheral yields undefined behaviour. Set everything, then set SPE last.
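Since the mode number in the table is simply (CPOL << 1) | CPHA, the mapping to CR1 bits can be sketched as below. The `EX_` masks are local stand-ins using the bit positions quoted above (in a real build the CMSIS `SPI_CR1_CPOL` / `SPI_CR1_CPHA` masks would be used), and `spi_mode_to_cr1` is a hypothetical name.

```c
#include <stdint.h>

/* Bit positions from the text: CPHA = CR1 bit 0, CPOL = CR1 bit 1.
 * Local stand-ins for the CMSIS masks so the sketch compiles anywhere. */
#define EX_CR1_CPHA (1u << 0)
#define EX_CR1_CPOL (1u << 1)

/* Map SPI mode 0..3 to the CPOL/CPHA bits of CR1. */
uint32_t spi_mode_to_cr1(unsigned mode)
{
    uint32_t bits = 0;
    if (mode & 2u) bits |= EX_CR1_CPOL;   /* SCK idles high */
    if (mode & 1u) bits |= EX_CR1_CPHA;   /* sample on the 2nd edge */
    return bits;
}
```

With SPE = 0, the result would be merged into CR1 after clearing the two existing bits, for example `SPI1->CR1 = (SPI1->CR1 & ~(SPI_CR1_CPOL | SPI_CR1_CPHA)) | spi_mode_to_cr1(3);`.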
Frame size and bit order
DS[3:0] (CR2 bits 11:8) selects the frame length: 0b0011 = 4-bit … 0b0111 = 8-bit … 0b1111 = 16-bit. Values below 0b0011 are reserved. LSBFIRST (CR1 bit 7) chooses MSB-first (0, the default and most common) or LSB-first (1). For frames of 8 bits or fewer you must also set FRXTH (CR2 bit 12) so that RXNE is asserted per byte — see §04.
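The DS encoding above is just "frame length minus one", and FRXTH follows from the frame length. A small sketch of that rule, with local stand-in constants for the bit positions given in the text (`spi_cr2_framing` is a hypothetical helper):

```c
#include <stdint.h>

#define EX_CR2_DS_POS 8u           /* DS[3:0] occupies CR2 bits 11:8 */
#define EX_CR2_FRXTH  (1u << 12)   /* RX FIFO threshold bit */

/* Returns the DS and FRXTH bits for a frame of `bits` bits,
 * or 0 if `bits` is outside the supported 4..16 range. */
uint32_t spi_cr2_framing(unsigned bits)
{
    if (bits < 4u || bits > 16u) {
        return 0;
    }
    uint32_t cr2 = (uint32_t)(bits - 1u) << EX_CR2_DS_POS;  /* DS = bits - 1 */
    if (bits <= 8u) {
        cr2 |= EX_CR2_FRXTH;       /* RXNE per byte for frames <= 8 bits */
    }
    return cr2;
}
```

For 8-bit frames this yields 0x1700 (DS = 0111, FRXTH = 1); for 16-bit frames it yields 0x0F00 with FRXTH left at 0.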
03 Pin mapping, alternate functions & clocks
SPI1 and SPI2 pins use alternate function AF5; SPI3 uses AF6. The table below lists the mappings shared across the STM32L4 family (verify the exact set available on your package against DS12023). Not every pin exists on every package.
| Signal | AF | Pin options |
|---|---|---|
| SPI1_SCK | AF5 | PA5, PB3, PE13 |
| SPI1_MISO | AF5 | PA6, PB4, PE14 |
| SPI1_MOSI | AF5 | PA7, PB5, PE15 |
| SPI1_NSS | AF5 | PA4, PA15, PE12 |
| SPI2_SCK | AF5 | PB10, PB13, PD1 |
| SPI2_MISO | AF5 | PB14, PC2, PD3 |
| SPI2_MOSI | AF5 | PB15, PC3, PD4 |
| SPI2_NSS | AF5 | PB9, PB12, PD0 |
| SPI3_SCK | AF6 | PB3, PC10 |
| SPI3_MISO | AF6 | PB4, PC11 |
| SPI3_MOSI | AF6 | PB5, PC12 |
| SPI3_NSS | AF6 | PA4, PA15 |
Note that PA4/PA15/PB3/PB4/PB5 are shared between SPI1 (AF5) and SPI3 (AF6): the AF number selects which peripheral drives the pin.
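To make the "AF number selects the peripheral" point concrete, here is a sketch of a generic alternate-function writer that also covers pins 8..15. It operates on a two-word array shaped like the GPIO `AFR[2]` register pair; `gpio_set_af` is a hypothetical helper, not a CMSIS or HAL function.

```c
#include <stdint.h>

/* Write a 4-bit alternate-function number for `pin` (0..15) into a
 * two-word AFR image: afr[0] covers pins 0..7, afr[1] covers pins 8..15. */
void gpio_set_af(volatile uint32_t afr[2], unsigned pin, unsigned af)
{
    unsigned idx   = (pin >> 3) & 1u;     /* which AFR word */
    unsigned shift = (pin & 7u) * 4u;     /* nibble offset inside that word */
    afr[idx] = (afr[idx] & ~(0xFu << shift))
             | ((uint32_t)(af & 0xFu) << shift);
}
```

On target, `gpio_set_af(GPIOB->AFR, 3, 5)` would hand PB3 to SPI1_SCK, while passing 6 instead of 5 would hand the same pin to SPI3_SCK.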
Clock enables (RCC)
GPIO ports live on AHB2; DMA and DMAMUX on AHB1. SPI1 is enabled in APB2ENR; SPI2/SPI3 in APB1ENR1.
// GPIO port clocks (AHB2)
RCC->AHB2ENR |= RCC_AHB2ENR_GPIOAEN; // PA4..PA7 for SPI1
// SPI peripheral clocks
RCC->APB2ENR |= RCC_APB2ENR_SPI1EN; // SPI1 (bit 12, APB2)
RCC->APB1ENR1 |= RCC_APB1ENR1_SPI2EN; // SPI2 (bit 14, APB1)
RCC->APB1ENR1 |= RCC_APB1ENR1_SPI3EN; // SPI3 (bit 15, APB1)
// For DMA later (AHB1):
RCC->AHB1ENR |= RCC_AHB1ENR_DMA1EN | RCC_AHB1ENR_DMAMUX1EN;
// Erratum-safe: read back after enabling a clock before using the block
(void)RCC->APB2ENR;
GPIO alternate-function setup
static void spi1_gpio_init(void)
{
// MODER = 0b10 (alternate function) for PA5, PA6, PA7
GPIOA->MODER &= ~(GPIO_MODER_MODE5 | GPIO_MODER_MODE6 | GPIO_MODER_MODE7);
GPIOA->MODER |= (2u << GPIO_MODER_MODE5_Pos)
| (2u << GPIO_MODER_MODE6_Pos)
| (2u << GPIO_MODER_MODE7_Pos);
// Very-high-speed output (0b11) — needed above a few MHz
GPIOA->OSPEEDR |= (3u << GPIO_OSPEEDR_OSPEED5_Pos)
| (3u << GPIO_OSPEEDR_OSPEED6_Pos)
| (3u << GPIO_OSPEEDR_OSPEED7_Pos);
// Push-pull (default), no pull-up/down (MISO may want a pull if slave tristates)
GPIOA->PUPDR &= ~(GPIO_PUPDR_PUPD5 | GPIO_PUPDR_PUPD6 | GPIO_PUPDR_PUPD7);
// AFR[0] handles pins 0..7; write AF5 into the 4-bit nibble of each pin
GPIOA->AFR[0] &= ~((0xFu << (5*4)) | (0xFu << (6*4)) | (0xFu << (7*4)));
GPIOA->AFR[0] |= ((5u << (5*4)) | (5u << (6*4)) | (5u << (7*4)));
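// Chip-select on PA4 as a plain GPIO output, idle HIGH (software NSS).
// Preload the output latch high BEFORE switching to output mode, so the
// pin never glitches low (ODR resets to 0) and briefly selects the slave.
GPIOA->BSRR = GPIO_BSRR_BS4; // deselect (latch high)
GPIOA->MODER &= ~GPIO_MODER_MODE4;
GPIOA->MODER |= (1u << GPIO_MODER_MODE4_Pos); // general output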
}
04 Register map: CR1, CR2, SR, DR
Four registers do the real work: CR1 (mode/clock), CR2 (framing, FIFO threshold, DMA/IRQ enables), SR (status/FIFO levels) and DR (data), alongside the CRC registers. Offsets are from the SPI base address.
SPI_CR1 (offset 0x00)
| Bit | Field | Meaning |
|---|---|---|
| 15 | BIDIMODE | 0 = 2-line unidirectional (full-duplex); 1 = 1-line bidirectional |
| 14 | BIDIOE | Output enable in bidirectional mode |
| 13 | CRCEN | Hardware CRC calculation enable |
| 12 | CRCNEXT | Transmit CRC next |
| 11 | CRCL | CRC length: 0 = 8-bit, 1 = 16-bit |
| 10 | RXONLY | Receive-only (clock keeps running, MOSI idle) |
| 9 | SSM | Software NSS management |
| 8 | SSI | Internal NSS level when SSM = 1 |
| 7 | LSBFIRST | 0 = MSB first, 1 = LSB first |
| 6 | SPE | SPI enable — set this LAST |
| 5:3 | BR[2:0] | Baud rate prescaler (see table below) |
| 2 | MSTR | 1 = master, 0 = slave |
| 1 | CPOL | Clock polarity (idle level) |
| 0 | CPHA | Clock phase (sampling edge) |
SPI_CR2 (offset 0x04)
| Bit | Field | Meaning |
|---|---|---|
| 14 | LDMA_TX | Odd-byte handling for TX DMA (data ≤ 8-bit) |
| 13 | LDMA_RX | Odd-byte handling for RX DMA (data ≤ 8-bit) |
| 12 | FRXTH | RX FIFO threshold: 1 = RXNE at 8-bit, 0 = at 16-bit |
| 11:8 | DS[3:0] | Data size: 0111 = 8-bit, 1111 = 16-bit |
| 7 | TXEIE | TX-buffer-empty interrupt enable |
| 6 | RXNEIE | RX-buffer-not-empty interrupt enable |
| 5 | ERRIE | Error interrupt enable (OVR, MODF, FRE, CRCERR) |
| 4 | FRF | Frame format: 0 = Motorola, 1 = TI |
| 3 | NSSP | NSS pulse between frames (hardware NSS output) |
| 2 | SSOE | NSS output enable (hardware NSS) |
| 1 | TXDMAEN | TX DMA request enable |
| 0 | RXDMAEN | RX DMA request enable |
SPI_SR (offset 0x08) — status & FIFO levels
| Bit | Field | Meaning |
|---|---|---|
| 12:11 | FTLVL[1:0] | TX FIFO level: 00 = empty … 11 = full |
| 10:9 | FRLVL[1:0] | RX FIFO level: 00 = empty … 11 = full |
| 8 | FRE | TI-mode frame format error |
| 7 | BSY | Bus busy — a transfer is in progress |
| 6 | OVR | Overrun (an RX byte was lost) |
| 5 | MODF | Mode fault (hardware NSS pulled low in master) |
| 4 | CRCERR | CRC mismatch |
| 1 | TXE | TX buffer empty — OK to write DR |
| 0 | RXNE | RX buffer not empty — DR holds received data |
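The two FIFO-level fields are handy when debugging a stuck transfer. A sketch of how they could be pulled out of a raw SR value, using local stand-ins for the bit positions in the table (the function names are hypothetical). The intermediate codes 01 and 10 are taken here to mean 1/4 and 1/2 full, which is how RM0432 describes them.

```c
#include <stdint.h>

#define EX_SR_FRLVL_POS 9u    /* FRLVL[1:0] = SR bits 10:9  */
#define EX_SR_FTLVL_POS 11u   /* FTLVL[1:0] = SR bits 12:11 */

/* Level codes: 0 = empty, 1 = 1/4 full, 2 = 1/2 full, 3 = full. */
unsigned spi_rx_fifo_level(uint32_t sr)
{
    return (unsigned)((sr >> EX_SR_FRLVL_POS) & 3u);
}

unsigned spi_tx_fifo_level(uint32_t sr)
{
    return (unsigned)((sr >> EX_SR_FTLVL_POS) & 3u);
}
```

On target the argument would simply be `SPI1->SR`, read once so both levels come from the same snapshot.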
Baud-rate prescaler (CR1.BR)
fSCK = fPCLK / 2^(BR+1). The example column assumes SPI1 with PCLK2 = 120 MHz.
| BR[2:0] | Divisor | fSCK @ PCLK2 = 120 MHz |
|---|---|---|
| 000 | /2 | 60 MHz — exceeds spec, do not use for master |
| 001 | /4 | 30 MHz |
| 010 | /8 | 15 MHz |
| 011 | /16 | 7.5 MHz |
| 100 | /32 | 3.75 MHz |
| 101 | /64 | 1.875 MHz |
| 110 | /128 | 937.5 kHz |
| 111 | /256 | 468.75 kHz |
DS12023 caps the master SCK frequency (roughly 40 MHz in transmit-only, and lower — around 24 MHz depending on VDD — in full-duplex / master-receive). At PCLK2 = 120 MHz the /2 divisor (60 MHz) is out of spec; the fastest safe full-duplex divisor is typically /8. Always check the SPI timing table for your VDD range.
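Choosing the divisor can be automated from the formula fSCK = fPCLK / 2^(BR+1). The sketch below picks the fastest setting that stays at or under a caller-supplied ceiling; the ceiling itself (for instance the approximate 24 MHz full-duplex figure quoted above) is an input you would take from the datasheet, and `spi_pick_br` is a hypothetical name.

```c
#include <stdint.h>

/* Pick the smallest BR[2:0] (fastest SCK) such that
 * pclk_hz / 2^(BR+1) <= max_hz. Returns -1 if even /256 is too fast. */
int spi_pick_br(uint32_t pclk_hz, uint32_t max_hz)
{
    for (int br = 0; br <= 7; br++) {
        /* Compare without dividing, in 64 bits, to avoid rounding
         * and overflow: pclk <= max * 2^(br+1). */
        if ((uint64_t)pclk_hz <= ((uint64_t)max_hz << (br + 1))) {
            return br;
        }
    }
    return -1;
}
```

With PCLK2 = 120 MHz and a 24 MHz ceiling this returns 2 (the /8 divisor, 15 MHz), matching the table; the result would be written as `(uint32_t)br << SPI_CR1_BR_Pos` while SPE = 0.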
NSS management
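The NSS-related bits from the CR1 and CR2 tables (SSM, SSI, SSOE, NSSP) combine into a few common master configurations. The sketch below is one way to express them, under the assumption that the master either drives chip-select from a GPIO (software NSS, as in the driver in §05) or lets the peripheral drive the NSS pin. The enum, struct and function names are hypothetical, and the masks are local stand-ins for the CMSIS ones.

```c
#include <stdint.h>

/* Local stand-ins; bit positions from the CR1/CR2 tables. */
#define EX_CR1_SSI  (1u << 8)
#define EX_CR1_SSM  (1u << 9)
#define EX_CR2_SSOE (1u << 2)
#define EX_CR2_NSSP (1u << 3)

typedef enum { NSS_SOFT, NSS_HW_OUT, NSS_HW_OUT_PULSE } ex_nss_mode;

typedef struct {
    uint32_t cr1;   /* bits to OR into CR1 */
    uint32_t cr2;   /* bits to OR into CR2 */
} ex_nss_bits;

/* NSS bit selection for a master; apply only while SPE = 0. */
ex_nss_bits spi_master_nss_bits(ex_nss_mode mode)
{
    ex_nss_bits b = { 0u, 0u };
    switch (mode) {
    case NSS_SOFT:           /* CS is a plain GPIO; SSI = 1 avoids MODF */
        b.cr1 = EX_CR1_SSM | EX_CR1_SSI;
        break;
    case NSS_HW_OUT:         /* peripheral drives the NSS pin */
        b.cr2 = EX_CR2_SSOE;
        break;
    case NSS_HW_OUT_PULSE:   /* as above, with a pulse between frames */
        b.cr2 = EX_CR2_SSOE | EX_CR2_NSSP;
        break;
    }
    return b;
}
```

Software NSS is usually the simpler choice with several slaves on one bus, since each slave gets its own GPIO; the hardware options only manage the single NSS pin.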
05 Register-level master init + full-duplex transfer
A complete, compilable bare-metal driver for SPI1 as an 8-bit, mode-0, MSB-first master with software NSS at PCLK2/16. Uses only CMSIS device headers (stm32l4xx.h). Byte-wide DR access and FRXTH are the two details people miss.
#include "stm32l4xx.h"
// ---- forward decl from §03 ----
static void spi1_gpio_init(void);
void spi1_master_init(void)
{
// 1) Clocks
RCC->AHB2ENR |= RCC_AHB2ENR_GPIOAEN;
RCC->APB2ENR |= RCC_APB2ENR_SPI1EN;
(void)RCC->APB2ENR; // read-back barrier
spi1_gpio_init(); // PA5/6/7 = AF5, PA4 = CS output
// 2) Configure with SPE = 0
SPI1->CR1 = 0; // clear: CPOL=0, CPHA=0 (mode 0), MSB first
SPI1->CR1 |= SPI_CR1_MSTR // master
| SPI_CR1_SSM // software NSS management
| SPI_CR1_SSI // internal NSS high -> no MODF
| (3u << SPI_CR1_BR_Pos); // BR=011 -> PCLK2/16
// 3) CR2: 8-bit frames, RXNE per byte, no DMA/IRQ yet
SPI1->CR2 = (0x7u << SPI_CR2_DS_Pos) // DS=0111 -> 8-bit
| SPI_CR2_FRXTH; // FRXTH=1 -> RXNE asserts at 8 bits
// 4) Enable
SPI1->CR1 |= SPI_CR1_SPE;
}
// One full-duplex byte: send tx, return the byte clocked in on MISO.
uint8_t spi1_txrx(uint8_t tx)
{
while (!(SPI1->SR & SPI_SR_TXE)) { } // wait TX space
*(volatile uint8_t *)&SPI1->DR = tx; // 8-bit write (critical!)
while (!(SPI1->SR & SPI_SR_RXNE)) { } // wait for the echo
return *(volatile uint8_t *)&SPI1->DR; // 8-bit read
}
// Buffer transfer with manual chip-select (PA4).
void spi1_transfer(const uint8_t *tx, uint8_t *rx, uint32_t n)
{
GPIOA->BSRR = GPIO_BSRR_BR4; // CS low (select)
for (uint32_t i = 0; i < n; i++) {
uint8_t out = tx ? tx[i] : 0xFF; // 0xFF = dummy for reads
uint8_t in = spi1_txrx(out);
if (rx) rx[i] = in;
}
// Wait until the shift register has fully drained before raising CS
while (SPI1->SR & SPI_SR_BSY) { }
GPIOA->BSRR = GPIO_BSRR_BS4; // CS high (deselect)
}
With DS = 8-bit, a 32-bit or 16-bit store to SPI1->DR pushes two bytes into the TX FIFO. You must cast to volatile uint8_t* for both the write and the read. Likewise set FRXTH = 1 so RXNE triggers on a single byte instead of waiting for a half-word that never completes.
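The same FIFO behaviour can be used on purpose: with DS = 8-bit, one 16-bit write queues two frames in a single bus access. A sketch of the packing, assuming the byte order RM0432 describes for data packing (the low byte is the frame shifted out first); `spi_pack2` is a hypothetical helper.

```c
#include <stdint.h>

/* Pack two 8-bit frames into one halfword for a single 16-bit DR write.
 * `first` goes in the low byte, which is assumed to be transmitted first. */
uint16_t spi_pack2(uint8_t first, uint8_t second)
{
    return (uint16_t)((uint16_t)first | ((uint16_t)second << 8));
}
```

A deliberate packed write would then look like `*(volatile uint16_t *)&SPI1->DR = spi_pack2(a, b);`. If the receive side is read in halfwords as well, FRXTH would need to be 0 so that RXNE waits for two bytes.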
Switching to 16-bit frames
// While SPE = 0:
SPI1->CR2 = (0xFu << SPI_CR2_DS_Pos); // DS=1111 -> 16-bit; FRXTH=0 -> RXNE asserts at 16 bits
SPI1->CR1 |= SPI_CR1_SPE;
uint16_t spi1_txrx16(uint16_t tx)
{
while (!(SPI1->SR & SPI_SR_TXE)) { }
SPI1->DR = tx; // 16-bit access is correct here
while (!(SPI1->SR & SPI_SR_RXNE)) { }
return SPI1->DR;
}
06 Full-duplex DMA over DMAMUX1
On the L4R5 the DMA controllers are request-agnostic: any DMA1/DMA2 channel can serve any peripheral, and DMAMUX1 routes a peripheral request line onto a channel. You program the request-line ID into the DMAMUX channel's CCR.
| Request | DMAMUX1 request-line ID |
|---|---|
| SPI1_RX | 10 |
| SPI1_TX | 11 |
| SPI2_RX | 12 |
| SPI2_TX | 13 |
| SPI3_RX | 14 |
| SPI3_TX | 15 |
DMAMUX channels map one-to-one onto DMA channels: DMAMUX1_Channel0..6 drive DMA1_Channel1..7, and DMAMUX1_Channel7..13 drive DMA2_Channel1..7. In this example RX uses DMA1_Channel2 (→ DMAMUX1_Channel1) and TX uses DMA1_Channel3 (→ DMAMUX1_Channel2).
#include "stm32l4xx.h"
// Call once after spi1_master_init().
void spi1_dma_init(void)
{
RCC->AHB1ENR |= RCC_AHB1ENR_DMA1EN | RCC_AHB1ENR_DMAMUX1EN;
// Route request lines. DMAREQ_ID lives in bits [6:0]; other bits stay 0.
DMAMUX1_Channel1->CCR = 10u; // SPI1_RX -> DMA1_Channel2
DMAMUX1_Channel2->CCR = 11u; // SPI1_TX -> DMA1_Channel3
}
// Blocking full-duplex transfer of n bytes via DMA.
void spi1_dma_transfer(const uint8_t *tx, uint8_t *rx, uint16_t n)
{
// ---- RX channel: peripheral -> memory (DIR = 0) ----
DMA1_Channel2->CCR = 0; // disable while configuring
DMA1_Channel2->CPAR = (uint32_t)&SPI1->DR; // 8-bit peripheral reg
DMA1_Channel2->CMAR = (uint32_t)rx;
DMA1_Channel2->CNDTR = n;
DMA1_Channel2->CCR = DMA_CCR_MINC; // increment memory; PSIZE/MSIZE = 00 (8-bit), DIR=0
// No TCIE: TCIF2 is polled below and is set whether or not the interrupt is enabled.
// ---- TX channel: memory -> peripheral (DIR = 1) ----
DMA1_Channel3->CCR = 0;
DMA1_Channel3->CPAR = (uint32_t)&SPI1->DR;
DMA1_Channel3->CMAR = (uint32_t)tx;
DMA1_Channel3->CNDTR = n;
DMA1_Channel3->CCR = DMA_CCR_MINC | DMA_CCR_DIR; // DIR=1 mem->periph
GPIOA->BSRR = GPIO_BSRR_BR4; // CS low
// Enable channels: RX first so it can never miss the first byte.
DMA1_Channel2->CCR |= DMA_CCR_EN;
DMA1_Channel3->CCR |= DMA_CCR_EN;
// Enable the SPI DMA requests: RX before TX.
SPI1->CR2 |= SPI_CR2_RXDMAEN;
SPI1->CR2 |= SPI_CR2_TXDMAEN;
// Wait for RX completion (RX finishing guarantees the frame is done).
while (!(DMA1->ISR & DMA_ISR_TCIF2)) { }
DMA1->IFCR = DMA_IFCR_CGIF2 | DMA_IFCR_CGIF3; // clear all flags of both channels (TX finished earlier)
// ---- Tear down in the correct order ----
while (SPI1->SR & SPI_SR_BSY) { }
SPI1->CR2 &= ~(SPI_CR2_TXDMAEN | SPI_CR2_RXDMAEN);
DMA1_Channel2->CCR &= ~DMA_CCR_EN;
DMA1_Channel3->CCR &= ~DMA_CCR_EN;
GPIOA->BSRR = GPIO_BSRR_BS4; // CS high
}
Enable the RX DMA channel and RXDMAEN before the TX side. The RX channel drains the FIFO in lock-step with TX; if TX starts first, received bytes can pile up unread and raise OVR. On teardown, wait for RX transfer-complete, then BSY = 0, then clear the DMA-enable bits. For frames ≤ 8 bits, the peripheral and memory data sizes are both 8-bit (PSIZE = MSIZE = 00).
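The one-to-one DMAMUX mapping described earlier in this section (DMAMUX1 channels 0..6 for DMA1 channels 1..7, then 7..13 for DMA2 channels 1..7) reduces to a line of arithmetic. A sketch, with `dmamux1_index` as a hypothetical helper name:

```c
#include <stdint.h>

/* Map (DMA controller 1..2, channel 1..7) to the DMAMUX1 channel
 * index 0..13. Returns -1 for anything outside those ranges. */
int dmamux1_index(unsigned dma, unsigned channel)
{
    if (dma < 1u || dma > 2u || channel < 1u || channel > 7u) {
        return -1;
    }
    return (int)((dma - 1u) * 7u + (channel - 1u));
}
```

For the channels used in this example it gives 1 for DMA1_Channel2 and 2 for DMA1_Channel3, which is why the request IDs are written to DMAMUX1_Channel1 and DMAMUX1_Channel2.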
07 HAL variant (init, MSP, transfers, callbacks)
The same SPI1 master with STM32Cube HAL. HAL hides FRXTH/DS and the byte-wide DR access; you only choose the high-level options. The MSP callback wires up GPIO and DMA, using the symbolic DMA_REQUEST_SPI1_TX/_RX which expand to the DMAMUX IDs 11/10.
#include "stm32l4xx_hal.h"
SPI_HandleTypeDef hspi1;
DMA_HandleTypeDef hdma_spi1_tx;
DMA_HandleTypeDef hdma_spi1_rx;
void MX_SPI1_Init(void)
{
hspi1.Instance = SPI1;
hspi1.Init.Mode = SPI_MODE_MASTER;
hspi1.Init.Direction = SPI_DIRECTION_2LINES; // full-duplex
hspi1.Init.DataSize = SPI_DATASIZE_8BIT;
hspi1.Init.CLKPolarity = SPI_POLARITY_LOW; // CPOL = 0
hspi1.Init.CLKPhase = SPI_PHASE_1EDGE; // CPHA = 0 (mode 0)
hspi1.Init.NSS = SPI_NSS_SOFT; // software NSS
hspi1.Init.BaudRatePrescaler = SPI_BAUDRATEPRESCALER_16; // PCLK2/16
hspi1.Init.FirstBit = SPI_FIRSTBIT_MSB;
hspi1.Init.TIMode = SPI_TIMODE_DISABLE;
hspi1.Init.CRCCalculation = SPI_CRCCALCULATION_DISABLE;
hspi1.Init.CRCPolynomial = 7;
hspi1.Init.CRCLength = SPI_CRC_LENGTH_DATASIZE;
hspi1.Init.NSSPMode = SPI_NSS_PULSE_DISABLE;
if (HAL_SPI_Init(&hspi1) != HAL_OK) { Error_Handler(); }
}
// Called automatically by HAL_SPI_Init() — configure clocks, pins, DMA here.
void HAL_SPI_MspInit(SPI_HandleTypeDef *spi)
{
if (spi->Instance != SPI1) return;
__HAL_RCC_SPI1_CLK_ENABLE();
__HAL_RCC_GPIOA_CLK_ENABLE();
__HAL_RCC_DMA1_CLK_ENABLE();
__HAL_RCC_DMAMUX1_CLK_ENABLE();
GPIO_InitTypeDef g = {0};
g.Pin = GPIO_PIN_5 | GPIO_PIN_6 | GPIO_PIN_7; // SCK/MISO/MOSI
g.Mode = GPIO_MODE_AF_PP;
g.Pull = GPIO_NOPULL;
g.Speed = GPIO_SPEED_FREQ_VERY_HIGH;
g.Alternate = GPIO_AF5_SPI1;
HAL_GPIO_Init(GPIOA, &g);
// TX: DMA1_Channel3
hdma_spi1_tx.Instance = DMA1_Channel3;
hdma_spi1_tx.Init.Request = DMA_REQUEST_SPI1_TX; // ID 11
hdma_spi1_tx.Init.Direction = DMA_MEMORY_TO_PERIPH;
hdma_spi1_tx.Init.PeriphInc = DMA_PINC_DISABLE;
hdma_spi1_tx.Init.MemInc = DMA_MINC_ENABLE;
hdma_spi1_tx.Init.PeriphDataAlignment = DMA_PDATAALIGN_BYTE;
hdma_spi1_tx.Init.MemDataAlignment = DMA_MDATAALIGN_BYTE;
hdma_spi1_tx.Init.Mode = DMA_NORMAL;
hdma_spi1_tx.Init.Priority = DMA_PRIORITY_HIGH;
HAL_DMA_Init(&hdma_spi1_tx);
__HAL_LINKDMA(spi, hdmatx, hdma_spi1_tx);
// RX: DMA1_Channel2
hdma_spi1_rx.Instance = DMA1_Channel2;
hdma_spi1_rx.Init.Request = DMA_REQUEST_SPI1_RX; // ID 10
hdma_spi1_rx.Init.Direction = DMA_PERIPH_TO_MEMORY;
hdma_spi1_rx.Init.PeriphInc = DMA_PINC_DISABLE;
hdma_spi1_rx.Init.MemInc = DMA_MINC_ENABLE;
hdma_spi1_rx.Init.PeriphDataAlignment = DMA_PDATAALIGN_BYTE;
hdma_spi1_rx.Init.MemDataAlignment = DMA_MDATAALIGN_BYTE;
hdma_spi1_rx.Init.Mode = DMA_NORMAL;
hdma_spi1_rx.Init.Priority = DMA_PRIORITY_HIGH;
HAL_DMA_Init(&hdma_spi1_rx);
__HAL_LINKDMA(spi, hdmarx, hdma_spi1_rx);
HAL_NVIC_SetPriority(DMA1_Channel2_IRQn, 5, 0);
HAL_NVIC_EnableIRQ(DMA1_Channel2_IRQn);
HAL_NVIC_SetPriority(DMA1_Channel3_IRQn, 5, 0);
HAL_NVIC_EnableIRQ(DMA1_Channel3_IRQn);
}
// The DMA IRQs must forward into the HAL DMA handler:
void DMA1_Channel2_IRQHandler(void) { HAL_DMA_IRQHandler(&hdma_spi1_rx); }
void DMA1_Channel3_IRQHandler(void) { HAL_DMA_IRQHandler(&hdma_spi1_tx); }
Running transfers
uint8_t tx[4] = { 0x9F, 0xFF, 0xFF, 0xFF }; // e.g. flash "read ID"
uint8_t rx[4];
// Software NSS -> toggle your CS GPIO around the call.
void read_id(void)
{
HAL_GPIO_WritePin(GPIOA, GPIO_PIN_4, GPIO_PIN_RESET); // CS low
// Blocking full-duplex:
HAL_SPI_TransmitReceive(&hspi1, tx, rx, 4, HAL_MAX_DELAY);
HAL_GPIO_WritePin(GPIOA, GPIO_PIN_4, GPIO_PIN_SET); // CS high
}
// Non-blocking full-duplex over DMA:
void read_id_dma(void)
{
HAL_GPIO_WritePin(GPIOA, GPIO_PIN_4, GPIO_PIN_RESET);
HAL_SPI_TransmitReceive_DMA(&hspi1, tx, rx, 4);
// return; completion is signalled in the callback below.
}
// Fires when the DMA full-duplex transfer completes.
void HAL_SPI_TxRxCpltCallback(SPI_HandleTypeDef *spi)
{
if (spi->Instance == SPI1)
HAL_GPIO_WritePin(GPIOA, GPIO_PIN_4, GPIO_PIN_SET); // CS high
}
void HAL_SPI_ErrorCallback(SPI_HandleTypeDef *spi)
{
// Inspect spi->ErrorCode: HAL_SPI_ERROR_OVR / _MODF / _DMA ...
}
For a full-duplex transfer always use HAL_SPI_TransmitReceive[_DMA], even if you only care about RX: the master must clock TX to receive. HAL_SPI_Receive alone works in 2-line master mode, but internally it falls through to HAL_SPI_TransmitReceive with the receive buffer doubling as the TX data, so MOSI carries whatever that buffer held. The Request field in the DMA init is what programs the DMAMUX; get it wrong and the channel simply never triggers.
08 Gotchas & common mistakes
Nearly every "SPI doesn't work" bug on the L4R5 is one of the following. They are ordered roughly by how often they bite.
- Writing SPIx->DR as a 32/16-bit value with DS=8-bit pushes two bytes. Always cast to volatile uint8_t* and set FRXTH=1 so RXNE fires per byte.
- Wait for BSY=0 before deasserting chip-select, or the last bits get truncated.
- With DMAMUX, set the request ID (Init.Request in HAL), not just the channel.
Correct disable sequence (RM0432)
To stop the SPI without truncating the last frame, follow the reference-manual order rather than just clearing SPE:
static void spi1_disable(void)
{
while ((SPI1->SR & SPI_SR_FTLVL) != 0) { } // 1) TX FIFO drained
while (SPI1->SR & SPI_SR_BSY) { } // 2) last frame shifted out
SPI1->CR1 &= ~SPI_CR1_SPE; // 3) disable
// 4) flush any remaining RX bytes
while ((SPI1->SR & SPI_SR_FRLVL) != 0)
(void)*(volatile uint8_t *)&SPI1->DR;
}
SPI2/SPI3 run from PCLK1; if you gate APB1 in a low-power mode they stop. Also, after wake-up from Stop mode the PLL is off and the system clock (and therefore PCLK) falls back to MSI or HSI16, as selected by RCC_CFGR.STOPWUCK. Restore the clock tree, or re-check your BR prescaler, so fSCK stays in range for the slave.