This weekend I rebooted my router, and during start-up it kept rebooting itself. The LAN and wireless are brought up and I can ssh in, but around 5-10 seconds later the router reboots. On the next boot the same thing happens. This is the exact problem that has been reported here:
By the way, the router is fully up-to-date: TurrisOS 3.5.3 with Kernel 4.4.39-80079e1c1e5f9ca7ad734044462a761a-4.
I connected via serial to inspect the boot process. I see nothing unusual in the output until the point at which the router reboots without warning. No message is ever logged. I had to do a full factory reset to recover.
Once I reconfigured the router and rebooted, it started happening again, and I had to do another factory reset. Given the symptoms and the reboot without warning, I naturally suspect the watchdog. From what I can see everything is fine:
- The `orion_wdt` module is enabled (via `lsmod`) and I can see `orion_wdt: Initial timeout 171 sec` in `dmesg`
- `/dev/watchdog` exists and is open by `procd` (checked via `lsof`)
- `/etc/init.d/watchdog_adjust` is started at boot time (when running it again I can see the same settings were already applied: 2 second frequency and 12 second timeout)
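For anyone who wants to repeat the checks above, here is a minimal sketch. `wdt_initial_timeout` and `check_watchdog` are hypothetical helper names, and the `/etc/rc.d` check assumes the usual OpenWrt convention that enabled init scripts are symlinked there.

```shell
# Extract N from an "orion_wdt: Initial timeout N sec" line read on stdin.
wdt_initial_timeout() {
    sed -n 's/.*orion_wdt: Initial timeout \([0-9][0-9]*\) sec.*/\1/p' | head -n 1
}

# Run the checks on the router (hypothetical helper, not called automatically).
check_watchdog() {
    if lsmod | grep -q '^orion_wdt'; then
        echo "module: loaded"
    else
        echo "module: NOT loaded"
    fi
    echo "initial timeout: $(dmesg | wdt_initial_timeout) sec"
    # Show which process holds the watchdog device open (expected: procd).
    lsof | grep /dev/watchdog
    # Assumption: an enabled init script has a symlink in /etc/rc.d.
    ls /etc/rc.d/ | grep watchdog_adjust
}
```

On the router, running `check_watchdog` over ssh or the serial console prints the results of all the checks in one go.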
Once more, I configured the router and the reboot loop came back. At this point I started experimenting and found that if I boot the router with the cable disconnected from the WAN port, the router starts up successfully and does not reboot! If I connect the cable later, everything works fine, with no spontaneous reboots.
With this new knowledge I tried something else: I set `network.wan.auto='0'` to prevent the WAN from being brought up at boot, left the cable connected to the WAN port, and rebooted. The router starts fine, no spontaneous reboots. I rebooted many times to confirm and it's 100% reproducible: always successful, no reboots.
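A minimal sketch of toggling that option with UCI, for anyone who wants to reproduce the test. The `run` and `set_wan_autostart` helpers and the `DRY_RUN` variable are hypothetical names introduced here; the `uci set` and `uci commit` commands are standard UCI.

```shell
# Print the command instead of executing it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = 0: do not bring up wan at boot; $1 = 1: bring it up at boot.
set_wan_autostart() {
    run uci set "network.wan.auto=$1" &&
    run uci commit network
}
```

On the router, `set_wan_autostart 0` followed by a reboot gives the "WAN disabled at boot" case, and `set_wan_autostart 1` restores normal behaviour.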
Next I tried bringing up the WAN at the end of the start-up: I put `ifup wan` in `/etc/rc.local`. The router still starts up successfully. No reboots. Again, I repeated this many times and it's 100% reproducible.
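The workaround can be scripted roughly as follows. This is only a sketch: `add_ifup_to_rc_local` is a hypothetical helper, and it assumes `/etc/rc.local` ends with the usual `exit 0` line, so `ifup wan` has to go before it.

```shell
# Insert "ifup wan" before the first "exit 0" line of rc.local (default path
# /etc/rc.local), or append it if there is no such line. Safe to run twice.
add_ifup_to_rc_local() {
    rc="${1:-/etc/rc.local}"
    [ -f "$rc" ] || return 1
    # Nothing to do if the line is already present.
    grep -qx 'ifup wan' "$rc" && return 0
    tmp="$rc.tmp"
    awk '/^exit 0/ && !done { print "ifup wan"; done = 1 }
         { print }
         END { if (!done) print "ifup wan" }' "$rc" > "$tmp" &&
    cat "$tmp" > "$rc" &&   # overwrite in place to keep the file permissions
    rm -f "$tmp"
}
```

This only has the intended effect together with `network.wan.auto='0'`, so that the WAN is not also brought up earlier in the boot.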
Just as a sanity check, I once more configured the WAN interface to start at boot. The reboot loop came back: a few seconds after networking is brought up, the router reboots. Now that I know how to stop it, I disconnected the cable from the WAN port and the reboot loop stopped.
Still, the only explanation I can think of is the hardware watchdog: somehow `procd` is not pinging it early enough at startup, so the router reboots. Moving the WAN initialization to the end of the startup process somehow works around it.
So, as an experiment, I edited `/etc/init.d/watchdog_adjust` and bumped the timeout to 120 seconds. Unfortunately that did not solve the problem: when the WAN is enabled at boot and the cable is connected, the reboot loop happens again.
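The timeout can also be changed at runtime without editing the init script, through procd's `system watchdog` ubus method. The sketch below assumes `watchdog_adjust` ends up applying its settings through the same method; I have not confirmed what the script actually contains. `wdt_payload` is a hypothetical helper name.

```shell
# Build the JSON argument for procd's "system watchdog" ubus method.
# $1 = ping frequency in seconds, $2 = timeout in seconds.
wdt_payload() {
    printf '{"frequency": %d, "timeout": %d}' "$1" "$2"
}

# On the router:
#   ubus call system watchdog "$(wdt_payload 2 120)"   # apply new settings
#   ubus call system watchdog                          # show current status
```

Calling the method with no arguments reports the settings currently in effect, which is a quick way to confirm whether the 120 second value was really applied before the reboot happens.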
I'm running out of ideas here. I still believe it must be some timing problem and some weird interaction between `procd` and the watchdog. As I understand it, U-Boot enables the watchdog with a 120 second timeout before booting Linux. Is it possible to disable the watchdog on the U-Boot command line before booting? I would like to try that next.
Is there somebody at Omnia (hopefully an engineer) who could help me troubleshoot this? I can survive for now with the hack of starting the WAN in `/etc/rc.local`, but I would really like to get to the bottom of this. It should not be happening. I wonder if I'm the only one experiencing this.