Select the version of your OS from the tabs below. If you don't know the version you are using, run the command cat /etc/os-release or cat /etc/issue on the board.
Remember that you can always refer to the Torizon Documentation, where you can find many relevant articles that might help you with application development.
TorizonCore is built with OSTree and Aktualizr-Torizon. The former is a shared library and suite of command-line tools that combines a "git-like" model for committing and downloading bootable filesystem trees with a layer for deploying them and managing the bootloader configuration. The latter is a fork of Aktualizr, a "daemon-like" open-source implementation of the Uptane SOTA standard that secures updates end to end.
OSTree and Aktualizr-Torizon are complementary and together they form the foundation for OTA (over-the-air) and offline update capabilities on the device.
On the server-side, Toradex is working on a cloud-based hosted option as well as an on-premise option to provide a complete OTA and offline solution that works with TorizonCore.
This article complies with the Typographic Conventions for Torizon Documentation.
OSTree has its own article. Please refer to OSTree for a brief overview and a demonstration of how to use it.
Our remote and offline updates implementation allows us to update the following components:
- /usr is updated
- /var and /home are ignored; they keep their content over updates
- /etc is updated via a 3-way merge
The following components cannot be updated at the moment, though we plan to support them in future releases of TorizonCore:
Uptane is a de facto automotive SOTA standard, held by a non-profit consortium named Uptane Alliance under the IEEE/ISTO Federation. Its focus is to enable secure and resilient over-the-air software updates. It relies on multiple servers to provide security by validating data before a download starts and by ensuring that even an offline attack that compromises a single server is not enough to compromise the security of the system. Uptane is an enhancement to TUF (The Update Framework), a widely used security framework for securing software and package updates on computers and smartphones. The motivations for extending TUF are described in detail on the Uptane Design page, and a helpful explanation of TUF is on the docs page Understand the Notary service architecture.
Aktualizr-Torizon is TorizonCore's client implementation of Uptane (forked from Aktualizr's default client). It is written in C++ and is responsible for communicating with an Uptane-compatible server. It checks whether new updates are available, installs those updates on the system, and reports status to the server while guaranteeing the integrity and confidentiality of OTA and offline updates. Aktualizr handles Docker image updates seamlessly by using Docker Compose YAML files.
Aktualizr - Modifying the Settings of Torizon OTA Client is a dedicated article that covers the practical aspects, including its usage.
There can be cases where the system fails to boot, or the boot process is considered unsuccessful, because of a kernel panic or a failure to start a critical user-space application. Developers can handle these issues during development, but they become a serious problem when the solution is already deployed and a bad update causes them.
TorizonCore and Aktualizr-Torizon are fully capable of recovering from bad updates by doing the following:
In TorizonCore, the Linux kernel is configured to panic and reboot in case of freezes or crashes. This helps to recover from bad kernel updates.
At the user-space level, systemd hardware watchdog integration is enabled by default in TorizonCore. That means systemd will regularly ping the watchdog hardware, and if systemd (or the kernel) hangs, this ping will not happen anymore and the hardware will automatically reboot the device. This helps to recover from bad updates when the kernel or the initialization daemon (systemd) is not able to run.
Lastly, TorizonCore considers a boot successful if the boot-complete systemd target is reached successfully. This is because the main operating system services required for proper operation, including the Docker daemon, are part of boot-complete.target. If boot-complete.target fails during an update, TorizonCore will automatically reboot. This helps TorizonCore recover from bad updates when critical processes from the base operating system don't run as expected.
A TorizonCore user may also define their own "rules" to validate an update using the Greenboot framework. Greenboot (Generic Health Check Framework) is a Fedora project that helps manage the health of systemd services, and TorizonCore uses Greenboot as a framework to make update checks more flexible and manageable for the user. With Greenboot, you can define a shell script that performs additional checks on the system and forces a reboot if needed. For more information about how to use Greenboot, have a look at Update Checks and Rollbacks.
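As a rough illustration of the kind of shell check mentioned above, here is a minimal sketch. The config path and function names are hypothetical, and the sketch assumes that a non-zero exit status is what marks a check as failed. Where such a script has to be installed is covered in Update Checks and Rollbacks and is not shown here.

```shell
#!/bin/sh
# Sketch of a health check script; not taken from TorizonCore itself.
# Assumption: a non-zero exit status marks the check as failed.

# Return 0 if the given file exists and is not empty, non-zero otherwise.
check_config_present() {
    [ -s "$1" ]
}

# Run the check against CONFIG_FILE (hypothetical default path) and report.
run_check() {
    config_file=${CONFIG_FILE:-/etc/myapp/config.json}
    if check_config_present "$config_file"; then
        echo "health check passed: $config_file"
        return 0
    fi
    echo "health check failed: $config_file is missing or empty" >&2
    return 1
}

# When used as a standalone check script, end the file with:
#   run_check || exit 1
```

A real check would usually test something closer to your application, for example that a required service or device node is available.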
In addition to general OS updates, you can also separately update the containers (your application) on a TorizonCore device. These updates use the same update framework as OS updates but differ from them in some ways.
Most important among these differences are the conditions for a successful container update. Unlike an OS update, a container update does not require a reboot. This eliminates the possibility of checks at boot. Furthermore, the use cases for containers are far more varied, making it difficult to have update checks that account for all possible cases.
Therefore we have opted for basic general checks upon update. The checks performed are as follows:
1. docker-compose pull --no-parallel on the new docker-compose.yml, to pull the new container images.
2. docker-compose -p torizon down on the old docker-compose.yml, to stop and remove any container associated with the old docker-compose.yml.
3. docker-compose -p torizon up --detach --remove-orphans on the new docker-compose.yml, to bring up the new containers as defined by the parameters of the compose file.
4. docker system prune -a --force. This will clean up any unused containers, networks, and images from the device.
If any of the above commands fails, the entire update is considered failed. Failure is defined by the exit code returned from the command: 0 is considered a success, while all other exit codes are failures.
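The sequence above can be sketched as a small shell function. This is only an illustration of the described order and exit-code rule, not the actual Aktualizr-Torizon implementation. The function name, the COMPOSE_CMD and DOCKER_CMD overrides, passing the compose file with -f, and stopping at the first failing command are all assumptions made for the sketch.

```shell
#!/bin/sh
# Illustrative sketch only; not the real update client code.
# Assumptions: compose files are passed with -f, and the sequence stops
# at the first command that returns a non-zero exit code.

# Usage: apply_compose_update <old-compose-file> <new-compose-file>
# Returns 0 only if every step exits with 0.
apply_compose_update() {
    old_yml=$1
    new_yml=$2
    compose=${COMPOSE_CMD:-docker-compose}   # overridable for dry runs
    docker=${DOCKER_CMD:-docker}

    # 1. Pull the new container images.
    "$compose" -f "$new_yml" pull --no-parallel || return 1
    # 2. Stop and remove the containers of the old compose file.
    "$compose" -p torizon -f "$old_yml" down || return 1
    # 3. Bring up the containers of the new compose file.
    "$compose" -p torizon -f "$new_yml" up --detach --remove-orphans || return 1
    # 4. Clean up unused containers, networks, and images.
    "$docker" system prune -a --force || return 1
}
```

Calling apply_compose_update old/docker-compose.yml new/docker-compose.yml (hypothetical paths) returns a non-zero status as soon as one step fails, which matches the rule that any non-zero exit code marks the whole update as failed.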
An important thing to note here is that no further checks are made on the state of a container after it has been started. This can lead to instances where a container starts successfully but exits soon after due to an error. By the above checks, this would still be considered a successful update.
This means it is important that you verify the status of your containers after they have started. Remote and offline updates are not equal to health monitoring, which would allow containers to be automatically restarted if they stop meeting user-defined health criteria.
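One way to verify container status after an update is to query Docker for the running state of each container you expect. The sketch below uses docker inspect for this. The function names and the DOCKER_CMD override are hypothetical, and the container names you pass in depend on your own compose file.

```shell
#!/bin/sh
# Sketch of a post-update status check; the names here are illustrative.

# Return 0 if the named container exists and reports State.Running = true.
container_is_running() {
    docker=${DOCKER_CMD:-docker}   # overridable for dry runs
    state=$("$docker" inspect --format '{{.State.Running}}' "$1" 2>/dev/null) || return 1
    [ "$state" = "true" ]
}

# Return 0 only if every container passed as an argument is running.
all_containers_running() {
    for name in "$@"; do
        if ! container_is_running "$name"; then
            echo "container not running: $name" >&2
            return 1
        fi
    done
    return 0
}
```

Running such a check some time after the update, rather than immediately, gives a container that crashes shortly after start a chance to show up as stopped. It is still a one-off check and not a replacement for health monitoring.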
As mentioned above, TorizonCore will automatically roll back after 3 unsuccessful reboots. The automatic rollback feature relies on Aktualizr-Torizon's rollback support and U-Boot's bootcount feature.
TorizonCore uses Aktualizr-Torizon with rollback_mode set to uboot_masked. This enables Aktualizr-Torizon's U-Boot bootcount integration:
- [...] upgrade_available to 1 and bootlimit to 3.
- [...] bootcount environment variable. After three times (when bootcount is greater than bootlimit), the system will roll back to the previously installed OS version.
- [...] upgrade_available and bootcount are set back to 0.
TorizonCore's OTA and offline updates allow it to roll back to the last installed update thanks to its OSTree-based root file system. It also allows multiple deployments (kernel/initramfs/device-tree and the rootfs) to be kept on a system and remain bootable. The initial (factory) image has only a single deployment available and is assumed to be a working deployment (no rollback can be done at this point). After the first update has been rolled out, there will be two deployments on the system at all times. If a new deployment fails, the system will automatically roll back to the previous deployment.
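The bootcount rule described above can be illustrated with a toy model. This is not U-Boot or Aktualizr-Torizon code. The function names are hypothetical, and the model assumes that bootcount starts at 0 and is incremented once per boot attempt while upgrade_available is 1.

```shell
#!/bin/sh
# Toy model of the bootcount rule; not U-Boot or Aktualizr-Torizon code.

# Usage: should_rollback <upgrade_available> <bootcount> <bootlimit>
# Returns 0 (true) when the previous deployment should be booted instead.
should_rollback() {
    [ "$1" -eq 1 ] && [ "$2" -gt "$3" ]
}

# Usage: failed_attempts_before_rollback <bootlimit>
# Prints how many boots of the new deployment are attempted before the
# rollback, assuming every attempt fails.
failed_attempts_before_rollback() {
    limit=$1
    count=0
    attempts=0
    while :; do
        count=$((count + 1))          # incremented on every boot
        if should_rollback 1 "$count" "$limit"; then
            break                     # count exceeded the limit: roll back
        fi
        attempts=$((attempts + 1))    # the new deployment is tried again
    done
    echo "$attempts"
}

failed_attempts_before_rollback 3   # prints 3
```

With bootlimit set to 3, the model tries the new deployment three times and rolls back on the fourth boot, when bootcount (4) is greater than bootlimit (3).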
Note: When installing an update without Aktualizr-Torizon (e.g. using ostree admin directly), automatic rollback will not work. To use automatic rollback in a pure OSTree system, those steps need to be executed manually as described in OSTree.
Starting with TorizonCore 5.4.0, a new feature "synchronous updates" was added. In the context of Torizon OTA and offline updates, a "synchronous update" refers to the simultaneous update of the OS and application packages on a TorizonCore device. However, such an update is more than just updating 2 components at once. It’s truly synchronous, in the sense that both OS and application must update successfully or they will fail together as if they were a single component.
The main motivation behind synchronous updates is for cases where the OS and application are intertwined. For example, imagine you have a new application that relies on a new driver. This new driver is only available in a new version of the OS. Given this dependency, it would be ideal to update both OS and application at the same time. Also, due to the design of synchronous updates, it is guaranteed that you will end up with a system that either failed to update or successfully updated both components. This way the system prevents awkward scenarios where only 1 component is updated successfully, leaving you with a halfway updated system.
However, the above example is not the only situation that would warrant synchronous updates. In general, if you want to tie the failure and success states of both an OS and application update to one another, then synchronous updates are the solution. Otherwise, if it is acceptable for the OS and application update to fail or succeed independently of one another, then 2 non-synchronous updates are sufficient.
Given the requirement to tie the success and failure states of both OS and application updates, the process of a synchronous update differs quite a bit from a standard non-synchronous update. As a user, in order to debug possible update failures, it's important to understand the general process of a synchronous update.
- [...] docker-compose.yml
- [...] docker-compose.yml and prune the new container images from the system.
- [...] docker-compose.yml and bring up the new docker-compose.yml.
- [...] docker-compose.yml fails to be brought up successfully, then we remove it and prune the new container images from the system. A flag is then set telling the system to roll back to the previous OS. A reboot is then triggered to perform the OS rollback. The old docker-compose.yml and containers are still in the system, so the rollback sets the system to use them.
- [...] docker-compose.yml has been brought up successfully, we then remove the old docker-compose.yml, replacing it with the new one. Finally, we prune the system to clean up old containers and images left over by the now previous docker-compose.yml.
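The all-or-nothing behaviour of a synchronous update can be summarised with a very small toy model. It only captures the outcome rule (both components update, or both stay on the previous version) and none of the real mechanics such as reboots or image pruning. The function name and the two command arguments are hypothetical.

```shell
#!/bin/sh
# Toy model of the all-or-nothing outcome; not Torizon code.

# Usage: synchronous_update <os_update_cmd> <app_update_cmd>
# Each argument is a command whose exit code says whether that part of
# the update succeeded. Prints the resulting state of the system.
synchronous_update() {
    os_cmd=$1
    app_cmd=$2
    if "$os_cmd" && "$app_cmd"; then
        echo "both updated"
        return 0
    fi
    # If either part fails, the system ends up on the previous OS and
    # the previous application.
    echo "both rolled back"
    return 1
}
```

For example, synchronous_update true false models a successful OS update followed by a failing application update, and reports that both components are rolled back.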