My Omnia with TurrisOS 5.3.2 crashed/hanged how to debug and why didn't watchdog trip?

Hi, over the holiday my Omnia running TurrisOS 5.3.2 crashed/hanged. The LEDs were frozen and nothing appeared over WiFi. Hard power-cycling resolved the issue. My two questions are:

  1. How do I debug? I don’t see logs retained anywhere.
  2. Why didn’t the hardware watchdog trip and reboot the router?

Hi @lucasrangit

By default, logs will be lost after reboot, but you could enable “System logs retention” in reForis and then look what went wrong after reboot.

  1. Insert external USB drive (flash drive would be fine for logs) and set it up as storage.
  2. Enable “System logs retention” in Administration->Maintenance->System logs retention page in reForis.

@mmtj hey I had that on already and have message files in /srv/log/! Thanks for the tip!

root@turris:/srv/log# ls -lh
-rw-------    1 root     root        3.5M Jan 13 17:54 messages
-rw-------    1 root     root      906.7K Jan  9 00:12 messages.1
-rw-------    1 root     root      208.7K Dec 26 00:12 messages.2.gz
-rw-------    1 root     root      384.9K Dec 19 00:12 messages.3.gz

I see shortly before the crash/hang a package update issue:

Dec 26 00:36:01 turris crond[27902]: (root) CMD (/usr/bin/rainbow_button_sync.sh)
Dec 26 00:36:01 turris crond[27901]: (root) CMDEND (/usr/bin/rainbow_button_sync.sh)
Dec 26 00:36:51 turris updater-supervisor: Running pkgupdate
Dec 26 00:36:53 turris updater[27960]: repository.lua.lua:58 (Globals): Target Turris OS: 5.3.3
Dec 26 00:37:00 turris updater[27960]: planner.lua:356 (pkg_plan): Requested package luci-i18n-rainbow-en that is missing, ignoring as requested.
Dec 26 00:37:00 turris updater-supervisor: pkgupdate reported no errors
Dec 26 00:37:01 turris crond[28008]: (root) CMD (/usr/bin/rainbow_button_sync.sh)
Dec 26 00:37:01 turris crond[28007]: (root) CMDEND (/usr/bin/rainbow_button_sync.sh)

Then just usual DHCP and crond output and then the crash:

Dec 26 01:03:01 turris crond[29838]: (root) CMDEND (/usr/bin/rainbow_button_sync.sh)
Dec 26 01:04:01 turris crond[29904]: (root) CMD (/usr/bin/rainbow_button_sync.sh)
Dec 26 01:04:01 turris crond[29903]: (root) CMDEND (/usr/bin/rainbow_button_sync.sh)
Jan  7 16:32:57 turris syslog-ng[4610]: syslog-ng starting up; version='3.35.1'
Jan  7 16:32:57 turris kernel: [    0.000000] Booting Linux on physical CPU 0x0
Jan  7 16:32:57 turris kernel: [    0.000000] Linux version 4.14.254 (packaging@turris.cz) (gcc version 7.5.0 (OpenWrt GCC 7.5.0 r11387+90-8fb714edd6)) #0 SMP Tue Nov 23 09:52:21 2021

Any ideas what happened? Also, why didn’t the hardware watchdog reset the device and recover automatically?

Did that happen just once or did you encounter this freeze more often?
Was there any power outage during holidays?
If that happens more frequently, it would be best to reach out to our Tech Support.

Perhaps @hagrid would know more?

No this was the first time and there were no power outages that I am aware of. I will keep log retention enabled and monitor for this happening again.