A little while ago I got one of the fairly common “Nokia 5110” LCD modules, an 84×48 b/w graphic LCD screen, thinking it would be handy to have in current or future projects. One of the things I especially like about this module is that it uses a serial protocol (SPI) to send data and control messages. This reduces the number of pins required tremendously. When I finally got around to playing with it, I managed to get it to work with my STM32F4-Discovery board fairly quickly. I could have left it at that, but the code I was using during my initial tests was rather crude and inefficient, and I thought this module would be a good reason to finally get my hands dirty with the DMA controller on the STM32F4.
What is DMA?
If you don’t know what a DMA controller is or why it might be helpful, maybe a quick explanation will clarify this. DMA stands for “Direct Memory Access”. As the name suggests, it has something to do with accessing memory, or more precisely with moving data between different locations in “memory space”. A very typical example would be reading the value of a receive register of a communication module such as a UART and storing it in a variable or array in RAM. Of course you can easily do this with the CPU, so why do you need a special module for this transfer? The problem with using the CPU for this type of work is that it can tie up the CPU quite a bit. You would either have to constantly check whether new data is available in the peripheral by reading and checking a status bit (or bits), or you could use an interrupt (with the appropriate handler) to alert you to the arrival of new data and to perform the transfer. This might be perfectly fine if you’re only interested in transferring a small number of bytes, but what if you would like to transfer larger amounts of data? Continuously? In that case even the interrupt method can become quite inefficient, because there is some overhead associated with entering and exiting the interrupt handler for every single byte. So wouldn’t it be nice if there was a little helper module that you could tell to watch out for incoming bytes on the peripheral, to store them in some chunk of memory as they come in, and to alert the CPU (if at all) only once a certain number of bytes have been received? This is what a DMA controller is for: it allows you to transfer data without the involvement of the CPU (that’s where the “direct” comes from). Essentially it allows you to use the resources in your microcontroller/microprocessor system more efficiently. It frees up the CPU for other tasks while data are being shuffled back and forth between memories and peripherals (or between memories).
Setting up DMA transfers
So, what is required to set up a DMA transfer? If you think about the above explanation, it’s actually quite straightforward: you need a “source”, i.e. a memory address where you’re copying data from, a “destination”, i.e. where you’re copying to, some information about the size of the data chunks to be transferred, and some sort of signal telling the DMA controller when the next byte is ready to be transferred. In practice, it’s slightly more complicated as there are a few other (quite useful) features that need to be configured to complete the setup. So let’s look at the DMA controller in the STM32F4 device family.
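To make those ingredients a bit more tangible, here is a minimal sketch of a DMA stream modelled in plain C. It is only a software model for illustration, not the STM32F4 register layout: the type `dma_model_t` and the function `dma_model_on_request` are hypothetical names I made up, and the model only covers byte-sized transfers that stop after a fixed count.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of one DMA stream (NOT real hardware registers). */
typedef struct {
    const uint8_t *src;       /* "source": where data is copied from          */
    uint8_t       *dst;       /* "destination": where data is copied to       */
    size_t         remaining; /* number of bytes still to be transferred      */
    bool           inc_src;   /* advance the source pointer after each byte?  */
    bool           inc_dst;   /* advance the destination pointer?             */
    bool           enabled;   /* is the stream currently switched on?         */
} dma_model_t;

/* Called once per "request", i.e. each time the peripheral signals that the
 * next byte can be moved. Returns true if a byte was actually transferred. */
bool dma_model_on_request(dma_model_t *d)
{
    if (!d->enabled || d->remaining == 0) {
        return false;
    }
    *d->dst = *d->src;            /* the actual transfer, no CPU loop needed */
    if (d->inc_src) d->src++;
    if (d->inc_dst) d->dst++;
    if (--d->remaining == 0) {
        d->enabled = false;       /* all bytes moved: the stream stops */
    }
    return true;
}
```

For a memory-to-peripheral transfer you would set `inc_src` (step through the array) and leave `inc_dst` cleared (always write to the same data register), which is exactly the combination used further down for the LCD.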
When you read the data sheet or the reference manual for the STM32F4 microcontrollers regarding DMA, the first thing you will notice is that there are actually two DMA controllers on these devices. The two controllers are almost identical. The main difference between them is that only the DMA2 controller can perform memory-to-memory transfers. The other difference is that DMA1 is connected to the APB1 peripherals, whereas DMA2 is connected to the APB2 peripherals. This will become important soon. Each DMA controller has 8 separate streams. Each stream in turn needs to be associated with one of 8 channels. These channels carry the signal I mentioned above to notify the DMA controller that data are ready to be transferred. More formally, these signals are called “DMA requests”. And that’s the first slightly complicated thing in using DMA: which stream and channel do I need to use? Well, if your intended DMA transfer involves a peripheral it’s actually not complicated at all, as long as you know where to look. On pages 164 and 165 of the Reference Manual you will find the following two tables:
These tables show which of the DMA streams a particular peripheral is connected to and via which channel. A quick side note: the text above the tables in the manual states: “The 8 requests from the peripherals … are independently connected to each channel and their connection depends on the product implementation.” This seems to imply that these mappings can change from device to device. However, I have not yet seen any specific information about the DMA mapping in a controller-specific document, and apparently I’m not alone.
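Once you have picked a channel from the tables, it ends up as a 3-bit number in the stream’s configuration register. The following sketch shows that encoding in plain C. It assumes that the channel selection field (CHSEL) sits in bits 27:25 of the stream configuration register, which matches the values of the `DMA_Channel_x` constants in the Standard Peripheral Library as far as I can tell; the macro and function names are my own and purely illustrative.

```c
#include <stdint.h>

/* Assumption: CHSEL[2:0] occupies bits 27:25 of the stream configuration register. */
#define DMA_CHSEL_SHIFT 25u
#define DMA_CHSEL_MASK  ((uint32_t)0x7u << DMA_CHSEL_SHIFT)

/* Returns a copy of the register value 'cr' with the channel field replaced.
 * 'channel' is the channel number 0..7 taken from the request mapping table. */
uint32_t dma_cr_select_channel(uint32_t cr, unsigned channel)
{
    uint32_t field = ((uint32_t)channel & 0x7u) << DMA_CHSEL_SHIFT;
    return (cr & ~DMA_CHSEL_MASK) | field;
}
```

With 3 bits there are exactly 8 possible values, which is why each stream can listen to one of 8 channels at a time.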
An example using the standard peripheral library
As usual, the Standard Peripheral Library provides convenient structures and functions to get everything set up correctly. Here’s the initialization procedure I used to set up DMA transfers between an array in RAM and the SPI2 module.
DMA_InitTypeDef dma_is;
DMA_StructInit(&dma_is); // start from defaults so no field is left uninitialized
dma_is.DMA_Channel = DMA_Channel_0;
dma_is.DMA_Memory0BaseAddr = (uint32_t)screenBuffer;
dma_is.DMA_PeripheralBaseAddr = (uint32_t)(0x4000380C); // SPI2 DR
dma_is.DMA_MemoryDataSize = DMA_MemoryDataSize_Byte;
dma_is.DMA_PeripheralDataSize = DMA_PeripheralDataSize_Byte;
dma_is.DMA_DIR = DMA_DIR_MemoryToPeripheral;
dma_is.DMA_Mode = DMA_Mode_Normal;
dma_is.DMA_MemoryInc = DMA_MemoryInc_Enable;
dma_is.DMA_PeripheralInc = DMA_PeripheralInc_Disable;
dma_is.DMA_BufferSize = 6*84; // 504 bytes, one full screen
dma_is.DMA_Priority = DMA_Priority_High;
dma_is.DMA_MemoryBurst = DMA_MemoryBurst_Single;
dma_is.DMA_PeripheralBurst = DMA_PeripheralBurst_Single;
dma_is.DMA_FIFOMode = DMA_FIFOMode_Disable;
DMA_Init(DMA1_Stream4, &dma_is); // apply the configuration to DMA1, stream 4
DMA_Cmd(DMA1_Stream4, ENABLE); // turn the stream on
SPI_I2S_DMACmd(SPI2, SPI_I2S_DMAReq_Tx, ENABLE); // let SPI2 issue DMA requests
The first few lines should be fairly self-explanatory. The DMA request for the transmit buffer empty signal of the SPI2 module is located on channel 0 of DMA1, stream 4. Our memory base address is the pointer to an array (and since the name of an array decays to a pointer to its first element, we don’t need the ampersand). The peripheral base address is the pointer to the data register of the SPI2 module. We’re transmitting single bytes from memory to peripheral. For DMA mode there are two options: normal and circular. In normal mode, the DMA stops transferring bytes after the specified number of data units, whereas in circular mode it simply returns to the initial pointer and keeps going. “Initial pointer?” you might ask: that’s where the next two lines come into play. You can ask the DMA controller to automatically increment one or both of the pointers after each transfer. That’s what’s happening in my example: while the peripheral address doesn’t change (obviously we want to keep writing to the same register), the pointer into memory does get incremented, thereby stepping through my “screenBuffer” array. The size of this array is exactly 6*84 = 504 bytes (corresponding to the 84×48 pixels at one bit per pixel). Lastly, we specify a priority level for this DMA stream (since other streams may be active as well, and they may compete for access to the buses).
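In case the 6*84 looks odd, here is a rough sketch of how such a buffer can be laid out. It assumes the usual arrangement for this type of display, namely that every byte holds 8 vertically stacked pixels with the least significant bit at the top, and that the bytes are sent row by row (6 rows of 84 bytes each). The helper names `lcd_clear` and `lcd_set_pixel` are made up for this illustration; only `screenBuffer` is the array from my example.

```c
#include <stdint.h>
#include <string.h>

#define LCD_WIDTH   84
#define LCD_HEIGHT  48
#define LCD_ROWS    (LCD_HEIGHT / 8)        /* 6 rows of bytes, 8 pixels tall each */
#define LCD_BUFSIZE (LCD_ROWS * LCD_WIDTH)  /* 6*84 = 504 bytes                    */

uint8_t screenBuffer[LCD_BUFSIZE];

/* Hypothetical helper: switch all pixels off. */
void lcd_clear(void)
{
    memset(screenBuffer, 0, sizeof screenBuffer);
}

/* Hypothetical helper: switch on the pixel at column x (0..83), line y (0..47).
 * Assumes one byte = 8 vertical pixels, bit 0 being the topmost one. */
void lcd_set_pixel(unsigned x, unsigned y)
{
    if (x >= LCD_WIDTH || y >= LCD_HEIGHT) {
        return;                              /* silently ignore out-of-range pixels */
    }
    screenBuffer[(y / 8) * LCD_WIDTH + x] |= (uint8_t)(1u << (y % 8));
}
```

The nice thing about this arrangement is that the rest of the program only ever touches the array in RAM; getting its contents onto the display is left entirely to the DMA stream.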
The last couple of lines of the initialization structure are slightly less obvious. To understand what’s happening here you have to know that each DMA stream has its own small FIFO (first-in-first-out) buffer, which can temporarily store data from the source before transferring it to the destination. In addition, the transfer can happen in bursts instead of single transfers. Together these features can help to deal with bus contention issues and in other scenarios, but there are a few extra rules that need to be followed when using the FIFO and burst transfers. If you’re considering using them, make sure to carefully study pages 173ff of the Reference Manual. In my example I’m using “direct” mode (= no FIFO) with single transfers (= no bursts).
Once the initialization structure is defined, it’s time to initialize the DMA stream, and turn it on. It should then be ready to receive and react to DMA requests on the specified channel. Lastly, once we’re ready to start the transfer we need to start issuing DMA requests. In my example this is done via the SPI2 module, whenever the transmitter empty (TXE) flag is set. Of course the SPI module itself needs to be initialized first, which is not shown here. Similarly, I didn’t show that the clock signal to the DMA controller needs to be turned on first, just like for many (or all?) of the other peripherals.
End of transfer and double buffering
What happens next? The DMA controller should now transfer single bytes from the “screenBuffer” array to the SPI2 data register, whenever the transmitter empty flag gets set. It will do so 504 times (=6*84), and once completed will turn itself off (since it’s not in circular mode). If we wanted to re-transmit the data in screenBuffer (maybe after they have been altered), all we would need to do is re-enable the DMA stream (after clearing the stream’s transfer-complete flag from the previous run, which the Reference Manual asks for before a stream is enabled again). If we need a continuous data stream between memory and peripheral, it may actually be better to configure the DMA stream not just in circular mode, but in “double buffer” mode. In this mode there are two memory locations (hence double) and the DMA stream swaps to the respective other memory location each time it finishes dealing with one (i.e. reading from or writing to it). That allows the rest of the software to process the location/buffer that’s not currently used for the DMA transfers. To keep track of which of the two buffers is currently in use, a bit in the stream’s configuration register becomes a status indicator (i.e. read-only; you can write to this bit only if the stream is disabled) once the stream is enabled. Of course you could also use a DMA generated interrupt to keep track of the two buffers.
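The buffer swapping is perhaps easier to picture with a few lines of code. The following is again just a software model in plain C, not code that talks to the DMA hardware: `dbuf_model_t`, `dbuf_next_byte` and `dbuf_cpu_buffer` are hypothetical names, `BUF_LEN` is an arbitrary small size, and the `current` field merely plays the role of the status bit mentioned above.

```c
#include <stddef.h>
#include <stdint.h>

#define BUF_LEN 4   /* arbitrary small buffer size for the illustration */

/* Hypothetical model of a stream in double buffer mode (memory to peripheral). */
typedef struct {
    uint8_t *mem[2];   /* the two memory locations                           */
    unsigned current;  /* buffer currently used by the stream (0 or 1)       */
    size_t   pos;      /* position of the next byte within that buffer       */
} dbuf_model_t;

/* Models one transfer: returns the byte that would go to the peripheral and
 * swaps to the other buffer once the current one has been fully sent. */
uint8_t dbuf_next_byte(dbuf_model_t *d)
{
    uint8_t value = d->mem[d->current][d->pos++];
    if (d->pos == BUF_LEN) {
        d->pos = 0;
        d->current ^= 1u;   /* end of buffer reached: switch to the other one */
    }
    return value;
}

/* The buffer the rest of the software may modify: the one NOT in use. */
uint8_t *dbuf_cpu_buffer(const dbuf_model_t *d)
{
    return d->mem[d->current ^ 1u];
}
```

In a real application the CPU would prepare the next chunk of data in the buffer returned by something like `dbuf_cpu_buffer` while the stream is busy sending the other one, so the data stream never has to pause.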
So much for a quick overview of DMA with the STM32F4. I’m sure you’ll agree with me that it can be an incredibly powerful feature that allows you to make the most of the available computing power of the microcontroller. How useful it is will of course depend on the precise nature of the task to be accomplished. If there’s nothing else for the CPU to do until the data have been transferred, then DMA isn’t really of much help. However, if you would rather not wait for a slow peripheral to transfer data to or from the microcontroller before doing other things, DMA can be an excellent choice.