Select the version of your OS from the tabs below. If you don't know the version you are using, run the command cat /etc/os-release or cat /etc/issue on the board.
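If you would rather read the version from a script than by eye, a minimal sketch follows. It assumes the image provides the standard VERSION_ID and PRETTY_NAME fields in /etc/os-release; the helper name os_release_field is hypothetical and not part of TorizonCore.

```shell
#!/bin/sh
# Print the value of one KEY=value field from an os-release style file.
# Usage: os_release_field KEY [FILE]   (FILE defaults to /etc/os-release)
os_release_field() {
    key="$1"
    file="${2:-/etc/os-release}"
    # Keep only the matching line, drop the "KEY=" prefix and any double quotes.
    sed -n "s/^${key}=//p" "$file" | head -n 1 | tr -d '"'
}
```

On the board, os_release_field VERSION_ID prints just the version string, which is convenient when a script needs to pick between the version-specific steps in this article.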
Remember that you can always refer to the Torizon Documentation, where you can find many relevant articles that might help you with application development.
Device monitoring with Torizon encompasses several different areas of functionality related to understanding the health, status, and performance of your devices.
When we think about device monitoring, we break it down into three types of data: metrics, logs, and alerts. A metric is a numerical value that we can measure and report on a regular interval over time, like memory usage, CPU temperature, or custom data from our users' applications. Logs are just the log output that various parts of the system produce, including docker container logs, kernel logs, and application logs from journald. Alerts are special events or errors that the device wants to raise in real-time because they require attention or remediation, like when a critical application fails to launch, or the device is running out of storage space.
Starting in TorizonCore 5.3, a monitoring agent is available that can collect all three types of data--metrics, logs, and alerts--and send it either to the Torizon Platform Services Web Interface, or independently to other external services. Today, the Torizon Platform supports metrics of all types. Out of the box, TorizonCore will report some basic system info, but you can also create and send your own custom metrics to build dashboards showing whatever data is most important to you. Log forwarding and real-time alerting are not available yet, but are planned for the future.
When investigating the best option for the monitoring agent in TorizonCore, we looked for an option that would be modular, event-driven, based on known and widely adopted standards, open-source, and with acceptable performance and resource usage on resource-constrained devices. Additionally, we wanted it to be flexible enough to handle all three types of device monitoring data. After considering all the options, we chose Fluent Bit as our monitoring agent.
Fluent Bit is an open-source log processor and forwarder. It lets you collect data such as metrics and logs from different sources (hardware and software), enrich them with filters, and send them to multiple destinations.
In Fluent Bit, information is processed in a pipeline with a highly pluggable architecture. Data is collected with input plugins, filtered with filter plugins, and sent to remote servers with output plugins.
Fluent Bit is written in C and designed with performance in mind (high throughput with low CPU and memory usage).
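To make the input, filter, and output stages concrete, here is a minimal sketch of a three-stage pipeline that samples CPU usage, nests the fields under one key, and prints the result to the terminal. The helper name write_demo_pipeline and the 5-second interval are assumptions chosen for illustration; the plugin names and keys are the ones used elsewhere in this article.

```shell
#!/bin/sh
# Write a small demonstration pipeline to the file given as $1.
write_demo_pipeline() {
    cat > "$1" <<'EOF'
[SERVICE]
    flush        1
    daemon       Off
    log_level    info

# Input plugin: sample CPU usage every 5 seconds
[INPUT]
    name         cpu
    tag          cpu
    interval_sec 5

# Filter plugin: wrap all CPU fields under a single "cpu" key
[FILTER]
    Name         nest
    Match        cpu
    Operation    nest
    Wildcard     *
    Nest_under   cpu

# Output plugin: print records to the terminal instead of sending them anywhere
[OUTPUT]
    name         stdout
    match        *
    format       json_lines
EOF
}
```

After write_demo_pipeline /tmp/demo.conf, running fluent-bit -c /tmp/demo.conf in a terminal should show one JSON line per sample, which is a low-risk way to experiment without touching the configuration that reports to the Torizon Platform.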
For more information about the project, see the Fluent Bit official documentation.
This article complies with the Typographic Conventions for the Toradex Documentation.
Tip: In this article, you will need to execute the commands as root. You can log in as root or, preferably, run them with the sudo command when logged in as a regular user.
Fluent Bit is integrated and enabled in TorizonCore 5.4.0 and later versions. By default, it's configured to monitor CPU, memory, temperature, and the docker daemon, and send the information to the Torizon Platform.
When you first boot TorizonCore, the Fluent Bit service will not start due to the absence of the /etc/fluent-bit/enabled file:
# systemctl status fluent-bit
* fluent-bit.service - Fluent Bit
Loaded: loaded (/usr/lib/systemd/system/fluent-bit.service; enabled; vendor preset: enabled)
Active: inactive (dead)
Condition: start condition failed at Tue 2021-09-14 07:42:15 UTC; 5h 19min ago
`- ConditionPathExists=/etc/fluent-bit/enabled was not met
Sep 14 07:42:15 colibri-imx6-10492785 systemd[1]: Condition check resulted in Fluent Bit being skipped.
As soon as the device is provisioned to the Torizon Platform, the provisioning script will enable the Fluent Bit service by creating the /etc/fluent-bit/enabled file (unless you choose to disable it). After that, Fluent Bit will start collecting data and sending it to the Torizon Platform. The data transport is secured by passing it through Aktualizr-Torizon, so it takes advantage of the same mutual TLS connection that software updates use.
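A quick way to see which state a device is in is to test for the marker file directly. The sketch below does only that; the function name monitoring_marker_state is hypothetical, and the default path is the one shown in the ConditionPathExists line of the service status output above.

```shell
#!/bin/sh
# Report whether Fluent Bit's start condition (the marker file) is met.
# Usage: monitoring_marker_state [MARKER_PATH]
monitoring_marker_state() {
    marker="${1:-/etc/fluent-bit/enabled}"
    if [ -e "$marker" ]; then
        echo "enabled"
    else
        echo "disabled"
    fi
}
```

Called with no argument on the device, it checks the real marker file. Note that it reports only the start condition; systemctl is-active fluent-bit tells you whether the service is actually running.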
Device monitoring is already enabled by default in TorizonCore 5.4.0 and later versions. The only thing you need to do is provision the device to the Torizon Platform Services and make sure the "Enable device metrics" box is checked. No extra steps are required.
If your device was provisioned before TorizonCore 5.4.0 and you have since updated it using the Torizon Platform, you may need to create the file /etc/fluent-bit/enabled manually and restart the service:
# sudo touch /etc/fluent-bit/enabled
# sudo systemctl restart fluent-bit
TorizonCore 5.3.0 has all the required infrastructure for device monitoring, but it is not enabled by default.
To enable device monitoring in TorizonCore 5.3.0, the first step is to provision the device to the Torizon Platform.
Then you have to create a Fluent Bit configuration file:
/etc/fluent-bit/fluent-bit.conf
[SERVICE]
    flush        1
    daemon       Off
    log_level    info
    parsers_file parsers.conf
    plugins_file plugins.conf

[INPUT]
    name         cpu
    tag          cpu
    interval_sec 300

[FILTER]
    Name       nest
    Match      cpu
    Operation  nest
    Wildcard   *
    Nest_under cpu

[INPUT]
    name         mem
    tag          memory
    interval_sec 300

[FILTER]
    Name       nest
    Match      memory
    Operation  nest
    Wildcard   *
    Nest_under memory

[INPUT]
    name         thermal
    tag          temperature
    name_regex   thermal_zone0
    interval_sec 300

[FILTER]
    Name       nest
    Match      temperature
    Operation  nest
    Wildcard   *
    Nest_under temperature

[INPUT]
    name         proc
    proc_name    dockerd
    tag          proc_docker
    fd           false
    mem          false
    interval_sec 300

[FILTER]
    Name       nest
    Match      proc_docker
    Operation  nest
    Wildcard   *
    Nest_under docker

[OUTPUT]
    name   tcp
    port   8850
    format json_lines
    match  *
Enable the Fluent Bit service:
# systemctl enable fluent-bit
# systemctl restart fluent-bit
And configure the data proxy in Aktualizr-Torizon:
# mkdir -p /etc/systemd/system/aktualizr-torizon.service.d/
# sh -c 'echo "[Service]" > /etc/systemd/system/aktualizr-torizon.service.d/override.conf'
# sh -c 'echo "Environment=\"AKTUALIZR_CMDLINE_PARAMETERS=--enable-data-proxy\"" >> /etc/systemd/system/aktualizr-torizon.service.d/override.conf'
# systemctl daemon-reload
# systemctl restart aktualizr-torizon
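The two echo commands above are easy to mistype because of the nested quoting. As an alternative sketch, the same drop-in file can be written with a here-document. The function name write_data_proxy_dropin is hypothetical; the file contents are exactly the two lines produced by the commands above.

```shell
#!/bin/sh
# Write the Aktualizr-Torizon drop-in that enables the data proxy.
# Usage: write_data_proxy_dropin [DROPIN_DIR]
write_data_proxy_dropin() {
    dropin_dir="${1:-/etc/systemd/system/aktualizr-torizon.service.d}"
    mkdir -p "$dropin_dir"
    # Quoted 'EOF' keeps the inner double quotes literal.
    cat > "$dropin_dir/override.conf" <<'EOF'
[Service]
Environment="AKTUALIZR_CMDLINE_PARAMETERS=--enable-data-proxy"
EOF
}
```

You still need systemctl daemon-reload and systemctl restart aktualizr-torizon afterwards. Running systemctl cat aktualizr-torizon is a convenient way to confirm that systemd has picked up the override.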
Now the device should be ready to send monitoring data to the Torizon Platform.
Due to the missing infrastructure to support device monitoring, it's not possible to enable it in TorizonCore 5.2.0 and earlier versions.
To disable device monitoring, stop and disable the Fluent Bit service:
# systemctl stop fluent-bit
# systemctl disable fluent-bit
TorizonCore ships with a default configuration file for Fluent Bit that allows it to collect four basic metrics: CPU, memory, temperature, and the Docker daemon process.
However, you can modify this default configuration to send customized metrics, such as metrics from your own applications or from sensors connected to your board.
The Torizon Platform can accept metrics of any kind, as long as they are formatted properly. In this section, you'll learn how to add a new input plugin to Fluent Bit, add a filter plugin that formats the data for the Torizon Platform, and start sending data. If your use case isn't covered here, you can always consult the official documentation of Fluent Bit to learn how to do more.
The default Fluent Bit configuration (shown above and available on your board in /etc/fluent-bit/fluent-bit.conf) has four key parts:
- [SERVICE] section containing basic Fluent Bit options
- [INPUT] sections enabling input plugins
- [FILTER] sections formatting the data that those input plugins produce
- [OUTPUT] section that tells Fluent Bit how to send the data to Aktualizr-Torizon

To add custom metrics, we don't need to change [SERVICE] or [OUTPUT]; we just need to add a new input plugin and filter. We need to set up our config file so that we get JSON-formatted output that looks like this:
{
"custom": {
"my_metric_1": 123.4,
"my_metric_2": 567.8
}
}
The specific requirements are:
- The metrics must be nested under a top-level key named custom, as in the example above.
The simplest way to do this is with an input plugin that accepts raw JSON as an input, like the HTTP input plugin, and nest it under the custom key with the Nest filter plugin.
Try adding the following to your /etc/fluent-bit/fluent-bit.conf, just above the [OUTPUT] block, then restart Fluent Bit (systemctl restart fluent-bit):
[INPUT]
    name http
    host localhost
    port 9999

[FILTER]
    Name       nest
    Match      custom
    Operation  nest
    Wildcard   *
    Nest_under custom
This will make Fluent Bit listen on port 9999 for HTTP POST requests. The HTTP input uses the request path as the record tag, so any custom metrics you send to http://localhost:9999/custom will match the filter and be nested inside the custom object, which is exactly what we need.
You can try it out by manually sending some JSON data to that port using curl. For example, the following curl command will send exactly the metrics given in the example JSON above:
# curl -X POST http://localhost:9999/custom \
-H 'Content-Type: application/json' \
-d '{"my_metric_1":123.4,"my_metric_2":567.8}'
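In an application you will usually want to send values that change over time rather than a fixed payload. The following is a minimal sketch of a helper that builds the flat JSON object from name=value pairs and posts it to the HTTP input configured above. The function names metrics_json and send_custom_metrics are hypothetical, and the sketch assumes every value is already a valid JSON number.

```shell
#!/bin/sh
# Build a flat JSON object from name=value arguments.
# Values are emitted unquoted, so they must already be valid JSON numbers.
metrics_json() {
    out=""
    for pair in "$@"; do
        name="${pair%%=*}"     # text before the first "="
        value="${pair#*=}"     # text after the first "="
        out="${out}${out:+,}\"${name}\":${value}"
    done
    printf '{%s}\n' "$out"
}

# POST the metrics to the Fluent Bit HTTP input (port 9999, path /custom).
send_custom_metrics() {
    metrics_json "$@" | curl -s -X POST http://localhost:9999/custom \
        -H 'Content-Type: application/json' -d @-
}
```

For example, send_custom_metrics my_metric_1=123.4 my_metric_2=567.8 sends the same payload as the curl command above. Keeping the JSON construction in its own function makes it possible to check the payload without a running Fluent Bit instance.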
This is just one basic example. As long as you have an input plugin that gives you the data you want, and a filter (or series of filters) to put the data into the format required, you can build in all kinds of metric reporting. A few other inputs that might be of interest:
Each of these will require some kind of filtering to get the data into the required format; for more information consult the Fluent Bit official documentation.
One common metric to monitor is disk usage. Fluent Bit doesn't offer an input plugin for this, but it's easy enough to get using the df command, the Exec input plugin, and a simple filter.
df -k gives us the total, used, and available space in kilobytes:
# df -k
Filesystem 1K-blocks Used Available Use% Mounted on
tmpfs 1898992 25824 1873168 2% /run
devtmpfs 1384624 0 1384624 0% /dev
/dev/disk/by-label/otaroot 15226800 1687972 12745640 12% /sysroot
Adding a | grep otaroot gives us just the disk we're looking for, and either awk or jq can give us the rest of what we need to parse that into JSON:
# jq example (tonumber emits the values as JSON numbers rather than strings)
df -k | grep otaroot | jq -R -c -s 'gsub(" +"; " ") | split(" ") | { "otaroot_total": (.[1] | tonumber), "otaroot_used": (.[2] | tonumber), "otaroot_avail": (.[3] | tonumber)}'
# awk example
df -k | grep otaroot | awk '{print "{\"otaroot_total\":" $2 ",\"otaroot_used\":" $3 ",\"otaroot_avail\":" $4 "}"}'
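Before wiring a command into Fluent Bit, it can help to check the parsing against captured df output on your development machine. The sketch below wraps the awk approach in a function that reads df -k output from standard input, so it can be fed either live or saved output. The function name df_otaroot_json is hypothetical.

```shell
#!/bin/sh
# Read "df -k" output on stdin and print the otaroot sizes as one JSON object.
df_otaroot_json() {
    awk '/otaroot/ {
        # Columns: 1=filesystem 2=1K-blocks 3=used 4=available
        printf "{\"otaroot_total\":%s,\"otaroot_used\":%s,\"otaroot_avail\":%s}\n", $2, $3, $4
        exit   # only the first matching line is reported
    }'
}
```

On the device, df -k | df_otaroot_json produces the same object as the awk one-liner above, with the values emitted as JSON numbers.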
Now add one of these commands to /etc/fluent-bit/fluent-bit.conf using the Exec input plugin and the Nest filter plugin. Disk usage doesn't change that frequently, so we'll only report these metrics once per hour by setting Interval_Sec to 3600.
[INPUT]
    Name         exec
    Tag          disksize
    Command      df -k | grep otaroot | jq -R -c -s 'gsub(" +"; " ") | split(" ") | { "otaroot_total": (.[1] | tonumber), "otaroot_used": (.[2] | tonumber), "otaroot_avail": (.[3] | tonumber)}'
    Parser       json
    Interval_Sec 3600

[FILTER]
    Name       nest
    Match      disksize
    Operation  nest
    Wildcard   *
    Nest_under custom
Restart the fluent-bit service:
# systemctl restart fluent-bit
Your device will now be reporting metrics named otaroot_total, otaroot_used, and otaroot_avail to the Torizon Platform, and you can begin creating custom charts with those metrics.
Fluent Bit is a flexible tool that can send data to various other platforms. It is compatible with the most popular cloud providers and protocols (AWS, Microsoft Azure, Google Cloud, Datadog, Elasticsearch, etc.). For more information about how to configure Fluent Bit, see the official documentation. Note that Fluent Bit can output to multiple destinations at once, so you can, for example, send device metrics to the Torizon Platform for monitoring and use another platform for log storage.
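As a rough sketch of sending the same records to more than one destination, the helper below appends a second [OUTPUT] section to a configuration file while leaving the existing one in place. The function name append_stdout_output is hypothetical, and the stdout plugin is used here only as a stand-in for whatever second destination you actually need.

```shell
#!/bin/sh
# Append an extra [OUTPUT] section to the Fluent Bit config file given as $1.
append_stdout_output() {
    cat >> "$1" <<'EOF'

# Second destination: also print every record to Fluent Bit's standard output
[OUTPUT]
    name   stdout
    match  *
    format json_lines
EOF
}
```

Each [OUTPUT] section has its own match rule, so you can route only some tags to the extra destination by replacing * with a specific tag. Restart the fluent-bit service after changing the file.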
As mentioned above, logs are another important type of device monitoring. Although Torizon Platform does not yet support log aggregation, you can still use Fluent Bit to process and forward various types of logs, using another data sink for analyzing the data.
One common use case for TorizonCore is the monitoring of containers running user applications. Combined with the flexibility of Fluent Bit, this enables use cases like data aggregation, diagnostics, or basic monitoring of applications running on Torizon. The only requirement for this setup is that your applications must be running within a Docker container.
Warning: Application log monitoring is not yet available on the Torizon Platform Services Web Interface. Receiving and processing this data is therefore up to the customer's own implementation.
To monitor Docker containers, we utilize the built-in Fluentd logging driver in Docker. This allows the logs for Docker Containers to be sent to the Fluentd collector as structured log data. Fluentd is an open-source data collector for unified logging, similar to Fluent Bit.
In order to enable Fluentd logging for Docker containers, there are two methods:
- For all containers, through the Docker daemon configuration file /etc/docker/daemon.json. You may need to create this file if it does not already exist. For the correct syntax and the available options, consult the Docker documentation.
- For individual containers, through docker run or docker-compose. For docker run, you just need to add --log-driver=fluentd. For docker-compose, consult the Docker documentation. This method is useful if you only want to monitor specific containers.

Now Fluent Bit must be configured to accept the data coming from the Fluentd logging driver in Docker. Fortunately, Fluentd and Fluent Bit are compatible and complementary tools. All that's needed is the "Forward" input plugin for Fluent Bit, which is designed to accept data from a Fluentd stream. The only thing to be careful of is that both Docker and Fluent Bit are configured to use the same TCP port.
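For the daemon-wide method, a minimal sketch of the configuration file is shown below, wrapped in a helper so it can be written to any path. The function name write_fluentd_daemon_json is hypothetical. The address localhost:24224 is an assumption that matches the default port of both Docker's Fluentd driver and the Forward input; adjust it if you changed either side. The helper overwrites the target file, so merge by hand if you already have a daemon.json.

```shell
#!/bin/sh
# Write a Docker daemon config that sends all container logs to a
# Fluentd-compatible collector. WARNING: overwrites the file given as $1.
write_fluentd_daemon_json() {
    cat > "$1" <<'EOF'
{
  "log-driver": "fluentd",
  "log-opts": {
    "fluentd-address": "localhost:24224"
  }
}
EOF
}
```

After writing the file to /etc/docker/daemon.json, restart the Docker daemon (systemctl restart docker) for the setting to apply to newly created containers. For a single container, the equivalent is docker run --log-driver=fluentd --log-opt fluentd-address=localhost:24224.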
As for output streams, as mentioned above our Torizon Platform does not yet display or accept log data, though this feature is planned for the future. Therefore how the data is received and processed is up to user discretion.
Open 2 terminals to your TorizonCore device. On the first terminal start up Fluent Bit:
# fluent-bit -i forward -o stdout -p format=json_lines -f 1
For demonstration, the output will just be set to stdout on the device.
Next, on the second terminal run a container like so:
# docker run --log-driver=fluentd debian echo "Testing a log message"
You should now see the following output in the first terminal running Fluent Bit:
{"date":1636585969.0,"source":"stdout","log":"Testing a log message","container_id":"eaf3a3e80ba3dd778ffd4c1057481cd889cbde75ed7b94ce3dddd5cc462a7c98","container_name":"/gracious_driscoll"}
Tip: the Fluentd Docker logging driver only captures data from the container's stdout or stderr. Keep this in mind for containers that you would like to monitor with this method.
Toradex has presented webinars about Device Monitoring and you can watch them on demand.