
Watchdog (Linux)

 

Article updated at 29 Oct 2021




Introduction

A watchdog on Linux is usually exported through a character device under /dev/watchdog. A simple API allows opening the device to enable the watchdog. Writing to it triggers the watchdog, and if the device is not cleanly closed, the watchdog will reboot the system.
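
As a rough illustration of the simple API described above, the sketch below opens the device, feeds the watchdog by writing to it, and then closes it cleanly by sending the magic character 'V' first. The function name watchdog_simple_session and its parameters are made up for this example, and the magic close is assumed to be supported by the driver and not overridden by nowayout.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: open the watchdog, feed it `feeds` times with
 * `interval_s` seconds between feeds, then close it cleanly.
 * Returns 0 on success, -1 if the device cannot be opened or written.
 */
int watchdog_simple_session(const char *device, int feeds, unsigned int interval_s)
{
    int fd = open(device, O_WRONLY);    /* opening the device enables the watchdog */
    if (fd < 0)
        return -1;

    for (int i = 0; i < feeds; i++) {
        if (write(fd, "\0", 1) != 1) {  /* writing to the device triggers the watchdog */
            close(fd);                  /* no magic character sent: not a clean close */
            return -1;
        }
        sleep(interval_s);
    }

    /* Assumption: the driver accepts 'V' as the magic character for a clean close. */
    int ok = (write(fd, "V", 1) == 1);
    close(fd);
    return ok ? 0 : -1;
}
```

Calling watchdog_simple_session("/dev/watchdog", 10, 1) would keep the watchdog fed for about ten seconds and then stop it again, provided the driver honors the magic close.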

However, a newer, more feature-rich API using ioctl is also available. We provide a small sample application called watchdog-test. For more information, see: http://git.toradex.com/cgit/linux-toradex.git/tree/Documentation/watchdog/watchdog-api.txt

The kernel configuration option WATCHDOG_NOWAYOUT ("Disable watchdog shutdown on close") gives the userspace application no way to disable the watchdog. Once the device is opened, the application has to trigger the watchdog forever. If the application closes the device (even with a "clean close" after sending the magic character), the watchdog is not disabled. Nor can it be disabled using the "WDIOC_SETOPTIONS" ioctl. However, even if this option is compiled into the kernel, it can be disabled again using the "nowayout" parameter. In our BSP, this option is not set by default.
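
One way to observe the effect of nowayout from userspace is to ask the driver to stop the watchdog through WDIOC_SETOPTIONS with the WDIOS_DISABLECARD flag. The following is only a minimal sketch: the helper name watchdog_try_disable is hypothetical, and whether the request is honored depends on the driver and on the nowayout setting.

```c
#include <sys/ioctl.h>
#include <linux/watchdog.h>

/*
 * Hypothetical helper: ask the driver to stop the watchdog behind `fd`.
 * Returns the ioctl result: 0 if the request was accepted, -1 otherwise
 * (errno then holds the reason).
 */
int watchdog_try_disable(int fd)
{
    int options = WDIOS_DISABLECARD;    /* request: turn the watchdog timer off */
    return ioctl(fd, WDIOC_SETOPTIONS, &options);
}
```

Here fd is the descriptor returned by opening the watchdog device. With nowayout in effect, the call is expected to fail and the watchdog keeps running.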

Hardware Support

See information about specific Toradex modules in this section.

i.MX 6, i.MX 6ULL, i.MX 8M Mini and Vybrid based Modules

The watchdog in the NXP/Freescale i.MX 6 and Vybrid SoCs is the same hardware as in the i.MX2. Because the i.MX2 was released earlier, the driver is called imx2-wdt. The watchdog driver creates one device under /dev/watchdog. By default, the watchdog resets the system after 60 seconds.

Parameter Description
nowayout Watchdog cannot be stopped once started
timeout Watchdog timeout in seconds (default=60)

Those parameters can be set in the U-Boot environment variable bootargs, which is used to pass command-line arguments to the Linux kernel, with the driver name prepended (imx2-wdt):

imx2-wdt.timeout=$seconds

Note: The respective boot environment needs to be configured on U-Boot since it passes some arguments to the Linux kernel using the bootargs environment variable.
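
To check from userspace which timeout is actually in effect after booting with such a parameter, the ioctl API offers WDIOC_GETTIMEOUT. The sketch below assumes the driver implements this ioctl; the helper name watchdog_get_timeout is made up for illustration.

```c
#include <sys/ioctl.h>
#include <linux/watchdog.h>

/*
 * Hypothetical helper: return the timeout (in seconds) currently used by
 * the watchdog behind `fd`, or -1 if it cannot be queried.
 */
int watchdog_get_timeout(int fd)
{
    int timeout = 0;
    if (ioctl(fd, WDIOC_GETTIMEOUT, &timeout) < 0)
        return -1;
    return timeout;
}
```

Keep in mind that opening /dev/watchdog to obtain fd already enables the watchdog, so the calling application has to keep feeding it or close the device cleanly afterwards.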

i.MX 7 based Modules

On the NXP i.MX 7 based modules we make use of the PMIC watchdog. The driver is called rn5t618-wdt. The watchdog driver creates one device under /dev/watchdog. By default, the watchdog resets the system after 128 seconds.

Parameter Description
nowayout Watchdog cannot be stopped once started
timeout Watchdog timeout in seconds (default=128, possible values=1, 8, 32, 128)

Those parameters can be set in the U-Boot environment variable bootargs, which is used to pass command-line arguments to the Linux kernel, with the driver name prepended (rn5t618-wdt):

rn5t618-wdt.timeout=8

Note: The respective boot environment needs to be configured on U-Boot since it passes some arguments to the Linux kernel using the bootargs environment variable.
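
Because only the timeout values 1, 8, 32 and 128 are listed as possible, an application that wants "at least N seconds" has to pick one of them. The following sketch shows one possible selection policy (the smallest listed value that is not below the request, capped at 128). The function name rn5t618_pick_timeout and the policy itself are assumptions made for this example, not a description of what the driver does internally.

```c
#include <stddef.h>

/* Timeout values (seconds) listed as possible for the PMIC watchdog. */
static const unsigned int rn5t618_timeouts[] = { 1, 8, 32, 128 };

/*
 * Hypothetical helper: return the smallest listed timeout that is
 * greater than or equal to `requested_s`, or the largest listed value
 * if the request exceeds all of them.
 */
unsigned int rn5t618_pick_timeout(unsigned int requested_s)
{
    size_t n = sizeof(rn5t618_timeouts) / sizeof(rn5t618_timeouts[0]);

    for (size_t i = 0; i < n; i++) {
        if (rn5t618_timeouts[i] >= requested_s)
            return rn5t618_timeouts[i];
    }
    return rn5t618_timeouts[n - 1];
}
```

With this policy, a request for 5 seconds would be mapped to 8, and a request for 300 seconds to 128.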

Readout Reset Reason

Usually the reset reason of the module is shown in either the U-Boot or the Linux kernel console messages, so the user can identify whether the SoM was reset by the watchdog or not. However, the Colibri iMX7 SoM is a special case due to its design and an erratum (e10574) in the SoC.

To tell whether the SoM was reset by a regular power cycle or by the watchdog, read out PMIC register 0x0A with the following commands in U-Boot:

Colibri iMX7 # i2c dev 0
Colibri iMX7 # i2c md 33 0a 1

The value of this register indicates how the module was last reset:

000a: 00 >> Power ON / Cold Start / Power Cycle
000a: 10 >> Soft Reset/Reboot
000a: 20 >> Reset by Watchdog
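
If the register value is collected by a tool or test script, the mapping above can be expressed as a small lookup. This is a sketch only: the function name pmic_reset_reason is hypothetical, and any value other than the three listed above is simply reported as unknown.

```c
/*
 * Hypothetical helper: translate the value read from PMIC register 0x0A
 * into the reset reason listed above.
 */
const char *pmic_reset_reason(unsigned char reg)
{
    switch (reg) {
    case 0x00: return "Power ON / Cold Start / Power Cycle";
    case 0x10: return "Soft Reset/Reboot";
    case 0x20: return "Reset by Watchdog";
    default:   return "Unknown";    /* value not covered by the list above */
    }
}
```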

The description of the register is also explained in the following image.

NVIDIA Tegra based Modules

NVIDIA Tegra T20/T30/TK1 based modules have a built-in hardware watchdog. When the watchdog isn't triggered within the defined period of time (by default 60 seconds on T20/T30 and 80 seconds on TK1) the system will reset itself completely (all cores).

Images 20130305 (T20) and 20130820 (T30) and older contain the official NVIDIA watchdog driver. This driver behaves differently from other Linux watchdog drivers (e.g. the PXA, aka SA1100, one). The driver resets the watchdog by itself; neither writing to /dev/watchdog nor issuing a WDIOC_KEEPALIVE ioctl on it changes anything, so the userspace device (/dev/watchdog) was not really useful.

By default, our updated driver now behaves as described in the official Watchdog API.

The NVIDIA Tegra T20 based modules support one watchdog available under /dev/watchdog. If this watchdog is used by the kernel level heartbeat (TEGRA_WATCHDOG_ENABLE_HEARTBEAT), it cannot be used from userspace.

The NVIDIA Tegra T30 based modules support four watchdogs on the hardware side. However, one watchdog is used internally by the kernel for suspend mode. Three watchdogs are exported to userspace (/dev/watchdog[0-2]). If the kernel level heartbeat (TEGRA_WATCHDOG_ENABLE_HEARTBEAT) is enabled, the first watchdog cannot be used from userspace.

The NVIDIA Tegra TK1 based modules support one watchdog available under /dev/watchdog0.

The driver has two parameters:

Parameter Description
nowayout Watchdog cannot be stopped once started
heartbeat Watchdog timeout in seconds (default=60 on T20/T30, default=80 on TK1)

Those parameters can be set in the U-Boot environment variable bootargs, which is used to pass command-line arguments to the Linux kernel, with the driver name prepended (tegra_wdt):

tegra_wdt.heartbeat=$seconds

Note: The respective boot environment needs to be configured on U-Boot since it passes some arguments to the Linux kernel using the bootargs environment variable.

If the reboot was caused by a watchdog timeout, you will find the following message on the next boot:

[ 4.447307] tegra_wdt tegra_wdt: last reset due to watchdog timeout
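
An application that wants to react to this condition could scan the kernel log for the message. The sketch below only shows the matching step on a single log line; the function name is_watchdog_reset_line is hypothetical, and how the log lines are obtained is left to the application.

```c
#include <string.h>

/*
 * Hypothetical helper: return 1 if `line` contains the message printed
 * by the driver after a watchdog reset, 0 otherwise.
 */
int is_watchdog_reset_line(const char *line)
{
    return strstr(line, "tegra_wdt: last reset due to watchdog timeout") != NULL;
}
```

Matching on the message text rather than on the whole line keeps the check independent of the timestamp at the start of the line.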

Note: On T20/T30 the option CONFIG_TEGRA_WATCHDOG_ENABLE_ON_PROBE was renamed to TEGRA_WATCHDOG_ENABLE_HEARTBEAT, which re-enables the watchdog reset by the driver itself (kernel level heartbeat).

Note: When using the kernel level heartbeat, the kernel will not necessarily reboot when a kernel panic occurs, since interrupts might still be handled. In order to reboot on kernel panic, use the command line option panic=<seconds> or the sysctl.conf option kernel.panic = <seconds>.

Software support

This section has information about generic watchdog support on Linux.

Systemd

Systemd supports hardware watchdogs through the system configuration options RuntimeWatchdogSec and ShutdownWatchdogSec (in /etc/systemd/system.conf). For more information see this blog article about Watchdog support for systemd.

C example

A simple C example which keeps the watchdog fed. After this program gets closed or killed, the system will reboot after the watchdog timeout has expired.

#include <unistd.h>
#include <stdio.h>
#include <fcntl.h>
#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/watchdog.h>

static int api_watchdog_fd = -1;

int api_watchdog_open(const char * watchdog_device)
{
    int ret = -1;
    if (api_watchdog_fd >= 0){
        fprintf(stderr, "Watchdog already opened\n");
        return ret;
    }
    api_watchdog_fd = open(watchdog_device, O_RDWR);
    if (api_watchdog_fd < 0){
        fprintf(stderr, "Could not open %s: %s\n", watchdog_device, strerror(errno));
        return api_watchdog_fd;
    }
    return api_watchdog_fd;
}

int api_watchdog_hwfeed(void)
{
    int ret = -1;
    if (api_watchdog_fd < 0){
        fprintf(stderr, "Watchdog must be opened first!\n");
        return ret;
    }
    ret = ioctl(api_watchdog_fd, WDIOC_KEEPALIVE, NULL);
    if (ret < 0){
        fprintf(stderr, "Could not pat watchdog: %s\n", strerror(errno));
    }
    return ret;
}

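int api_watchdog_init(const char *pcDevice)
{
    printf("Open WatchDog\n");
    int ret = 0;
    /* Open the device passed in by the caller instead of a fixed path. */
    ret = api_watchdog_open(pcDevice);
    if(ret < 0){
        return ret;
    }
    /* Feed once right away so the full timeout is available. */
    ret = api_watchdog_hwfeed();
    return ret;
}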

int main(int argc, char **argv)
{
    int ret = 0;
    (void)argc;
    (void)argv;
    ret = api_watchdog_init("/dev/watchdog");
    if(ret < 0) {
        /* The failing call has already printed the cause on stderr. */
        fprintf(stderr, "Could not init watchdog\n");
        return 1;
    }
    while(1)
    {
        /* Feed the watchdog well within its timeout. */
        ret = api_watchdog_hwfeed();
        if(ret < 0) {
            return 1;
        }
        sleep(1);
    }
}

Refer to the kernel documentation file for more information: watchdog-api.txt