
TorizonCore Boot Time Optimization

 

Article updated on 23 Aug 2021

If you don't know the version of the OS you are using, run the command cat /etc/os-release or cat /etc/issue on the board.



Remember that you can always refer to the Torizon Documentation, where you can find many relevant articles that might help you with application development.

This article applies to Torizon 5.4.0.

Introduction

For an operating system, a short boot time is always desirable, with applications starting as soon as possible.

The leeway one has in optimizing a general-purpose distribution such as TorizonCore is quite limited compared to a system designed for a narrow set of tasks.

A general-purpose distribution defines many features at the distribution level, which in turn determines:

  • What device drivers are available
  • What filesystems are supported
  • Which bootloader/kernel features are enabled, and so on

To support the wide range of products our customers build with TorizonCore, Toradex enables all the features that most customers are going to need. This in turn can increase the size of the resulting boot artifacts (bootloader, kernel, rootfs) compared to a special-purpose Linux system that a customer may ultimately deploy.

Toradex performs the distribution-level optimizations, and customers are free to use additional tools, such as the TorizonCore Builder Tool, to further optimize the system for their specific needs.

This article focuses mostly on distribution-level optimizations.

Toradex also provides technical resources to help customers optimize the boot time further for their final application(s). Please see the blog post linked at the end of this article.

This article complies with the Typographic Conventions for the Toradex Documentation.

Prerequisites

  • A Toradex SoM with Torizon installed.

How to measure TorizonCore boot time

There are many ways and tools to determine the boot performance of a Linux system. As TorizonCore uses systemd as its init system, systemd-analyze can be used to determine and analyze system boot-up performance statistics.

  • systemd-analyze time

This command gives the time spent in the kernel before userspace is reached and the time userspace took to spawn all services.

# systemd-analyze time
Startup finished in 3.629s (kernel) + 6.978s (userspace) = 10.607s 
multi-user.target reached after 6.947s in userspace

  • systemd-analyze blame

If the time measurements given by systemd-analyze time seem too long, we can use the blame option of systemd-analyze to see where most of the time is spent in userspace initialization.

# systemd-analyze blame | head
3.019s plymouth-quit-wait.service          
2.078s systemd-logind.service              
1.836s dev-disk-by\x2dlabel-otaroot.device 
1.720s docker.service                      
1.318s plymouth-quit.service               
1.115s NetworkManager-wait-online.service  
1.001s udisks2.service                     
 920ms mwifiex-delay.service               
 816ms systemd-networkd.service            
 765ms systemd-resolved.service            

  • systemd-analyze critical-chain [UNIT]

The critical-chain option prints a tree of the time-critical chain of units. By default, it reports this for the default target; in the case of TorizonCore, the default target is multi-user.target. The default boot target can be retrieved with systemctl get-default.
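
For reference, on TorizonCore this returns:

# systemctl get-default
multi-user.target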

To print the critical chain for the default target:

# systemd-analyze critical-chain 
The time when the unit became active or started is printed after the "@" character.
The time the unit took to start is printed after the "+" character.

multi-user.target @6.947s
└─getty.target @6.947s
  └─serial-getty@ttymxc0.service @6.945s
    └─systemd-user-sessions.service @3.881s +26ms
      └─network.target @3.858s
        └─wpa_supplicant.service @5.466s +73ms
          └─basic.target @2.046s
            └─sockets.target @2.046s
              └─sshd.socket @2.020s +24ms
                └─sysinit.target @2.005s
                  └─systemd-timesyncd.service @1.285s +718ms
                    └─systemd-tmpfiles-setup.service @1.247s +31ms
                      └─systemd-journal-flush.service @1.225s +17ms
                        └─systemd-journald.service @581ms +639ms
                          └─systemd-journald.socket @550ms
                            └─-.mount @525ms
                              └─system.slice @525ms
                                └─-.slice @525ms

To print the critical chain for a particular unit, say docker.service:

# systemd-analyze critical-chain docker.service           
The time when the unit became active or started is printed after the "@" character.
The time the unit took to start is printed after the "+" character.

docker.service +1.720s
└─network-online.target @3.858s
  └─network.target @3.858s
    └─wpa_supplicant.service @5.466s +73ms
      └─basic.target @2.046s
        └─sockets.target @2.046s
          └─sshd.socket @2.020s +24ms
            └─sysinit.target @2.005s
              └─systemd-timesyncd.service @1.285s +718ms
                └─systemd-tmpfiles-setup.service @1.247s +31ms
                  └─systemd-journal-flush.service @1.225s +17ms
                    └─systemd-journald.service @581ms +639ms
                      └─systemd-journald.socket @550ms
                        └─-.mount @525ms
                          └─system.slice @525ms
                            └─-.slice @525ms

There are some important points to consider when interpreting the measurements reported by these commands. Please take a look at man systemd-analyze and keep its caveats in mind. Also, note that systemd-analyze does not report the time spent in the bootloader, so the reported time is certainly not measured from the moment the power button is pressed on the device.

In practice, there is a variance of around 3 seconds in the overall boot time across successive boots.

Blacklisting WiFi drivers to optimize boot time

Some of the Toradex SoMs use NXP (formerly Marvell) chipsets for WiFi. The mwifiex drivers for these devices are built as loadable kernel modules by default and are loaded at boot time. It was observed that the boot time varied depending on when the WiFi drivers were loaded during the boot process: the earlier the drivers were loaded, the longer the boot took. To consistently achieve a shorter boot time, we blacklist these drivers by default and force them to load at a later stage of the boot.

The drivers (modules) are blacklisted by creating a conf file in /etc/modprobe.d/ (see man modprobe.conf) and are then loaded later in the boot by a systemd service. The dependencies of this service determine when the drivers are loaded.

See this commit as an example of how we did this in TorizonCore. The same approach may also be applied to any other WiFi driver that exhibits the same issue.
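
As a rough sketch of the approach (the file contents below are illustrative; the linked commit shows the actual TorizonCore implementation), the blacklist file could look like this:

# /etc/modprobe.d/blacklist-mwifiex.conf
blacklist mwifiex
blacklist mwifiex_sdio

and a minimal service that loads the driver late in the boot could look like this (blacklisting only prevents automatic, alias-based loading; an explicit modprobe by module name still works):

# /etc/systemd/system/mwifiex-delay.service
[Unit]
Description=Delayed loading of the mwifiex WiFi driver
After=multi-user.target

[Service]
Type=oneshot
ExecStart=/usr/sbin/modprobe mwifiex_sdio

[Install]
WantedBy=multi-user.target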

Unneeded modules can also be blacklisted to optimize the boot time further. Modules currently loaded into the system can be listed with lsmod.
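
For example, to check whether the mwifiex modules are currently loaded:

# lsmod | grep mwifiex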

Running container applications earlier in the boot process

One of the main challenges in running container applications earlier in the boot process is initializing the container engine/daemon earlier.

The container daemon in the case of TorizonCore is dockerd, which is launched by the systemd service docker.service.

Let us take a look at how we can optimize the boot time for a containerized application that does not depend on the network. We will use a sample Qt application for demonstration purposes.

If you are running a Torizon image with evaluation containers, please disable the docker-compose service.

# sudo systemctl disable docker-compose

Next, launch a Qt5 OpenGL cube example using docker run --restart unless-stopped so that it is automatically run at boot. For example, on an i.MX 8-based module:

# docker run -d --restart unless-stopped --net=host --name=qt5-wayland-examples \
    -e QT_QPA_PLATFORM=eglfs \
    --cap-add CAP_SYS_TTY_CONFIG -v /dev:/dev -v /tmp:/tmp -v /run/udev/:/run/udev/ \
    --device-cgroup-rule='c 4:* rmw'  --device-cgroup-rule='c 13:* rmw' \
    --device-cgroup-rule='c 199:* rmw' --device-cgroup-rule='c 226:* rmw' \
    torizon/qt5-wayland-examples-vivante:$CT_TAG_QT5_WAYLAND_EXAMPLES_VIVANTE \
    /usr/bin/kms-setup.sh /usr/lib/aarch64-linux-gnu/qt5/examples/opengl/cube/cube

You will see a dice on the connected display. Reboot the board and the application will load automatically. Use journalctl to find the time it took to launch the application:

# journalctl | head -n1
-- Logs begin at Tue 2021-08-31 08:31:54 UTC, end at Tue 2021-08-31 08:32:07 UTC. --
# journalctl | grep --color  "Now running.*cube" | tail -n1
Aug 31 08:32:05 verdin-imx8mm-06827785 9aa3037ebe11[688]: Now running "/usr/lib/aarch64-linux-gnu/qt5/examples/opengl/cube/cube"

The above log shows that it took 11s (08:32:05 - 08:31:54) to launch the application.

Also, take a systemd-analyze time reading for reference:

# systemd-analyze time
Startup finished in 3.518s (kernel) + 11.009s (userspace) = 14.528s 
multi-user.target reached after 10.981s in userspace

Now let us look at the dependencies of docker.service:

# systemctl cat docker.service | grep "^After\|^Wants\|^Requires"
After=network-online.target docker.socket firewalld.service usermount.service
Wants=network-online.target
Requires=docker.socket usermount.service

Say our application does not require the network, firewall, and usermount services, and we remove them so that the final dependencies of docker.service look like this:

# systemctl cat docker.service | grep "^After\|^Wants\|^Requires"
After=docker.socket
Wants=network-online.target
Requires=docker.socket

Note that network-online.target is still listed in Wants=, but since it was removed from After=, docker.service no longer waits for it before starting.

As we changed a systemd unit file, run the following to reload the systemd units:

# sudo systemctl daemon-reload

and reboot the target.

Now journalctl shows 9s (08:45:45 - 08:45:36), an improvement of two seconds over the previous measurement.

# journalctl | head -n1
-- Logs begin at Tue 2021-08-31 08:45:36 UTC, end at Tue 2021-08-31 08:46:03 UTC. --
# journalctl | grep --color  "Now running.*cube" | tail -n1
Aug 31 08:45:45 verdin-imx8mm-06827785 9aa3037ebe11[574]: Now running "/usr/lib/aarch64-linux-gnu/qt5/examples/opengl/cube/cube"

systemd-analyze time also confirms the improvement:

# systemd-analyze time
Startup finished in 3.571s (kernel) + 9.017s (userspace) = 12.588s 
multi-user.target reached after 8.989s in userspace

Editing Systemd Services

In the example above, we modified a systemd unit directly on the board to make the boot faster. But how can we make the change persistent?

There are good resources explaining how to override a systemd service, either by using systemctl edit <unit> or by manually creating the required files.

According to the systemd.unit documentation, system units created by the administrator must be placed inside /etc/systemd/system.

Since the overrides you make will be inside /etc, you can capture changes with TorizonCore Builder. This is also why you must not edit the original files under /usr directly.
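
Note that drop-in snippets can only add dependencies such as After= or Requires=, not remove them, so the dependency removal shown above needs a full copy of the unit. A minimal sketch, assuming docker.service is shipped under /usr/lib/systemd/system (run systemctl cat docker.service to confirm the path on your image):

# cp /usr/lib/systemd/system/docker.service /etc/systemd/system/docker.service
# vi /etc/systemd/system/docker.service
# systemctl daemon-reload

Edit the copy to remove the unneeded entries from the After= and Requires= lines. The copy in /etc/systemd/system takes precedence over the original under /usr and, since it lives in /etc, the change can be captured with TorizonCore Builder.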

Final Considerations

If your product requires a boot time faster than Torizon can meet, even after you try the optimizations described in this article, then you may have to consider an alternative solution, such as creating your own distribution starting from our BSP Layers and Reference Images for Yocto Project Software.

You can read our blog post on Boot Time Optimization from 2015. While it may not apply to our latest software without adaptations, the concepts are still valid and it's a valuable read.