The i.MX8QM applications processor is a feature- and performance-scalable multicore platform that includes two Cortex-M4 cores. These secondary cores typically run an RTOS optimized for microcontrollers or a bare-metal application. Toradex provides FreeRTOS™, a free professional-grade real-time operating system for microcontrollers, along with drivers and several examples that can be used on our Apalis iMX8QM platform. The FreeRTOS™ port is based on NXP's MCUXpresso SDK for i.MX8QM.
The build system supported to build the firmware and examples is:
The two Cortex-M4 CPU cores live side by side with the Cortex-A53/A72-based primary CPU cores. Both CPU complexes have access to the same interconnect and hence have equal access to all peripherals (shared bus topology). The graphic below is an incomplete and simplified drawing of the architecture, with emphasis on the subsystems relevant to understanding the heterogeneous asymmetric multicore architecture.
There are several types of memory available. The Cortex-M4 provides local memory (Tightly Coupled Memory, TCM), which is relatively small but can be accessed by the CPU without any latency.
For applications requiring more memory, the system DRAM is accessible by the M4 cores. From a performance perspective, the TCM should be used whenever possible.
A traditional microcontroller typically has internal NOR flash in which the firmware is stored and from which it is executed. This is not the case on the Apalis iMX8QM: there is no NOR flash onto which the firmware can be flashed. Instead, the firmware needs to be stored on a mass storage device such as an SD card or the internal eMMC flash. The available mass storage devices are not "memory mapped", and hence applications cannot be executed directly from them by any of the cores (no eXecute-In-Place, XIP). Instead, code needs to be loaded into one of the available memory sections before the CPU can start executing it.
The M4 firmware can be placed in the common boot container, so that it is loaded and started by the boot ROM, or it can be placed on a mass storage device. In the latter case, U-Boot needs to be configured to load and execute the M4 firmware.
The two CPU platforms use different memory maps to access the individual subsystems. This table lists some important areas and their memory locations for each of the cores side by side. The full list can be found in the i.MX8QM reference manual.
Region | Size | Cortex-A53/A72 | M4-0 (Code Bus) | M4-0 (System Bus) | M4-1 (Code Bus) | M4-1 (System Bus) |
---|---|---|---|---|---|---|
DDR Address | 2GB(*1) | 0x80000000-0xFFFFFFFF | 0x00100000-0x1BFFFFFF | 0x80000000-0xDFFFFFFF | 0x00100000-0x1BFFFFFF | 0x80000000-0xDFFFFFFF |
TCML for M4-0 | 128KB | 0x34FE0000-0x34FFFFFF | 0x1FFE0000-0x1FFFFFFF | N/A | N/A | |
TCMU for M4-0 | 128KB | 0x35000000-0x3501FFFF | 0x20000000-0x2001FFFF | N/A | N/A | |
TCML for M4-1 | 128KB | 0x38FE0000-0x38FFFFFF | N/A | N/A | 0x1FFE0000-0x1FFFFFFF | |
TCMU for M4-1 | 128KB | 0x39000000-0x3901FFFF | N/A | N/A | 0x20000000-0x2001FFFF | |
(*1): The full DRAM range is 0x8_00000000 - 0xB_FFFFFFFF. Only a part of the DRAM is accessible by the M4 cores.
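The address aliasing in the table can be sketched in a few lines of Python. This is only an illustration: the helper name `m4_to_a_core` and the `TCM_WINDOWS` dictionary are made up for this example, and the base addresses are copied from the table above.

```python
# Sketch: translate an M4-local TCM address into the address under which the
# Cortex-A cores (and therefore U-Boot) see the same memory.
# All base addresses are taken from the memory map table; the names used here
# are hypothetical and not part of any SDK.

TCM_SIZE = 128 * 1024  # TCML and TCMU are 128 KB each

# (core, region) -> (M4-local base address, Cortex-A base address)
TCM_WINDOWS = {
    (0, "TCML"): (0x1FFE0000, 0x34FE0000),
    (0, "TCMU"): (0x20000000, 0x35000000),
    (1, "TCML"): (0x1FFE0000, 0x38FE0000),
    (1, "TCMU"): (0x20000000, 0x39000000),
}


def m4_to_a_core(core: int, m4_addr: int) -> int:
    """Return the Cortex-A address that aliases an M4-local TCM address."""
    for (owner, _region), (m4_base, a_base) in TCM_WINDOWS.items():
        if owner == core and m4_base <= m4_addr < m4_base + TCM_SIZE:
            return a_base + (m4_addr - m4_base)
    raise ValueError(f"0x{m4_addr:08X} is not a TCM address of M4 core {core}")


if __name__ == "__main__":
    # Start of TCML as seen by M4 core 0, translated to the Cortex-A view.
    print(hex(m4_to_a_core(0, 0x1FFE0000)))
```

For M4 core 0, the start of TCML translates to 0x34fe0000 in the Cortex-A view.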
The Cortex-M4 CPU has two buses connected to the main interconnect (modified Harvard architecture). One bus is meant to fetch data (system bus) whereas the other bus is meant to fetch instructions (code bus).
To get optimal performance, the program code should be located and linked for a region which is going to be fetched through the code bus, while the data area (e.g. bss or data section) should be located in a region which is fetched through the system bus.
The TCML and TCMU regions can be accessed with zero wait states and thus provide massively better performance than DRAM, even when the DRAM is cached. Therefore, it is advisable to place all code and data in the TCM whenever possible.
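As a rough illustration of the placement rule, the following Python sketch classifies M4 addresses by the bus that serves them and flags sections linked for the "wrong" bus. It assumes the standard Cortex-M4 address split, in which addresses below 0x20000000 are fetched through the code bus and addresses from 0x20000000 upward through the system bus. The function names and the section dictionary are hypothetical and only serve this example.

```python
# Sketch: check whether sections are linked for the bus that suits them.
# Assumption: standard Cortex-M4 split at 0x20000000 (code bus below,
# system bus at and above). Names are illustrative, not from any SDK.

CODE_BUS_LIMIT = 0x20000000


def bus_for_address(m4_addr: int) -> str:
    """Return 'code' or 'system' for an address in the M4 memory map."""
    return "code" if m4_addr < CODE_BUS_LIMIT else "system"


def check_placement(sections: dict) -> list:
    """Return a warning for every section that sits on the less suitable bus.

    sections maps a section name to (start address, kind), where kind is
    either 'code' or 'data'.
    """
    warnings = []
    for name, (addr, kind) in sections.items():
        wanted = "code" if kind == "code" else "system"
        actual = bus_for_address(addr)
        if actual != wanted:
            warnings.append(
                f"{name} at 0x{addr:08X} is fetched through the {actual} bus, "
                f"but the {wanted} bus is preferable for {kind}"
            )
    return warnings


if __name__ == "__main__":
    layout = {
        ".text": (0x1FFE0000, "code"),  # start of TCML
        ".data": (0x20000000, "data"),  # start of TCMU
    }
    for line in check_placement(layout) or ["placement looks fine"]:
        print(line)
```

In a real project this decision is made in the linker script; the sketch only shows the reasoning behind it.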
The FreeRTOS source code is currently only available on NXP's MCUXpresso web page:
Here are the steps to download the resources (as of 2019-10-04)
The standard FreeRTOS and bare-metal examples provided by NXP use the M4's tightly coupled UARTs to communicate with the user. This section describes how to configure the Apalis Evaluation Board V1.1A in order to make the required UART ports accessible from your development PC, and how to load and run binary examples.
Warning: The setup supports running samples either on M4 core 0 or on M4 core 1, but not on both cores at the same time.
Rotate the Apalis Evaluation Board (EvB) so that the SD card sockets and audio connectors are pointing towards you.
In order to run the regular examples on M4 core 0, the following jumpers on the Apalis Evaluation Board (EvB) need to be removed:
Remove Jumper | Apalis Standard function |
---|---|
X6-35 | UART1_DTR (to PC) |
X6-32 | UART2_TXD (to PC) |
X6-29 | UART2_RXD (from PC) |
X6-40 | UART1_DSR (from PC) |
Then add two patch wires to route the M4_0 tightly coupled UART to the DSub-9 connector X28:
Patch from | Patch to | Comment |
---|---|---|
X5-35 | X7-32 | iMX8 -> PC |
X5-40 | X7-29 | PC -> iMX8 |
In order to run the regular examples on M4 core 1, the following jumpers on the Apalis Evaluation Board (EvB) need to be removed:
Remove Jumper | Apalis Standard function |
---|---|
X3-12 | PWM4 (to PC) |
X6-32 | UART2_TXD (to PC) |
X6-29 | UART2_RXD (from PC) |
X3-13 | PWM3 (from PC) |
Then add two patch wires to route the M4_1 tightly coupled UART to the DSub-9 connector X28:
Patch from | Patch to | Comment |
---|---|---|
X2-12 | X7-32 | iMX8 -> PC |
X2-13 | X7-29 | PC -> iMX8 |
At the time of writing this article, the tested U-Boot version was:
U-Boot 2020.04-5.1.0-devel+git.0a26a04408ca (Dec 28 2020 - 14:02:46 +0000)
Currently, U-Boot does not support loading .elf files. Therefore, we need to use .bin files, which carry no load address or entry point information and thus leave more room for usage errors.
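Because a raw .bin file has no header, one usage error that is easy to catch on the development PC is copying the .elf file to the SD card by mistake. Every ELF file starts with the four magic bytes 0x7F 'E' 'L' 'F', so a minimal check could look like the sketch below. The function names are hypothetical and chosen for this example only.

```python
# Sketch: detect an ELF file that was mistaken for a raw binary.
# The ELF magic number (0x7F 'E' 'L' 'F') is defined by the ELF format;
# the function names are made up for this illustration.

ELF_MAGIC = b"\x7fELF"


def looks_like_elf(data: bytes) -> bool:
    """Return True if the data starts with the ELF magic number."""
    return data[:4] == ELF_MAGIC


def check_firmware_file(path: str) -> None:
    """Raise ValueError if the file at path is an ELF image, not a raw binary."""
    with open(path, "rb") as f:
        header = f.read(4)
    if looks_like_elf(header):
        raise ValueError(f"{path} is an ELF file; use the raw .bin file instead")
```

Calling `check_firmware_file("hello_world_m40.bin")` before copying the file to the SD card would reject an ELF image but cannot verify that a raw binary was linked for the right memory region.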
Prepare your environment as follows:
Then, each time you want to run the example, put your keyboard focus into the U-Boot/Linux terminal and follow the steps below:
Turn on the EvB
Press Space in the terminal to enter U-Boot command line.
Optional: To verify that your binary file is accessible on the SD card, enter:
> ls mmc 2
To load the binary from the SD card and start it on M4 core 0, enter:
> fatload mmc 2 ${loadaddr} hello_world_m40.bin && dcache flush && bootaux ${loadaddr} 0
18536 bytes read in 15 ms (1.2 MiB/s)
## Starting auxiliary core at 0x80280000 ...
Power on M4 and MU
Copy M4 image from 0x80280000 to TCML 0x34fe0000
Start M4 0
bootaux complete
hello_world_m40.bin is the name of your application and the trailing 0 stands for core 0.
To run the example on core 1 instead, simply replace the trailing "0" with a "1" in the command above.
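When switching between binaries and cores frequently, the U-Boot command line can be generated on the development PC and pasted into the terminal. The following Python sketch only assembles the command shown above; the helper name `bootaux_command` is hypothetical, and the default MMC device number 2 is taken from the example in this article.

```python
# Sketch: build the U-Boot command line that loads a .bin file from the
# SD card and starts it on the selected M4 core. The command itself is the
# one shown in this article; the helper name is made up for illustration.

def bootaux_command(filename: str, core: int, mmc_dev: int = 2) -> str:
    """Return the U-Boot command for loading filename and starting an M4 core."""
    if core not in (0, 1):
        raise ValueError("the i.MX8QM has two M4 cores: 0 and 1")
    # Doubled braces produce a literal ${loadaddr} in the f-string.
    return (
        f"fatload mmc {mmc_dev} ${{loadaddr}} {filename} "
        f"&& dcache flush && bootaux ${{loadaddr}} {core}"
    )


if __name__ == "__main__":
    print(bootaux_command("hello_world_m40.bin", 0))
```

The file must of course be built for the core it is started on; the helper does not check this.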