Surviving cable downtime?

Hi,

I have noticed on a number of occasions that when my ISP (Telstra HFC cable internet) has some sort of problem (rarely), the internet drops off. When the ISP service comes back online, the Turris router seems to need a reboot to pick up the internet again, even though the modem itself is back online.

Is that a Turris bug?

Dave

I run the Watchdog for such rare cases. It checks every half hour whether the connection is up. If it is not, and the allowed time without a connection has been exceeded, it restarts the router. This option is located under Services in the LuCI interface. Whether it’s a bug is hard to say. The few times I had problems with the network, the router recovered immediately when the connection was back.
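To illustrate the logic described above (periodic check, restart only after the connection has been down for longer than a set time), here is a minimal Python sketch. It is not how the LuCI service is implemented; the class name `OutageTracker` and the one-hour threshold are made up for the example.

```python
class OutageTracker:
    """Tracks how long the connection has been down (hypothetical helper)."""

    def __init__(self, max_outage_s):
        self.max_outage_s = max_outage_s  # allowed time without a connection
        self.down_since = None            # timestamp of the first failed check

    def update(self, now, online):
        """Record one check result; return True when a restart is due."""
        if online:
            # Connection is back: forget the outage.
            self.down_since = None
            return False
        if self.down_since is None:
            # First failed check of this outage.
            self.down_since = now
        return now - self.down_since >= self.max_outage_s


# Checks every 30 minutes, restart after one hour without a connection.
tracker = OutageTracker(max_outage_s=3600)
results = [tracker.update(t, online) for t, online in
           [(0, False), (1800, False), (3600, False), (5400, True), (7200, False)]]
```

With these sample checks, only the third one (one hour after the first failure) asks for a restart; the successful check afterwards resets the timer.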

Best regards

Have you tested whether the problem is at the DNS layer or lower? There is currently a caching approach that might not be optimal with a flaky internet connection, but by default a cached failure shouldn’t last for more than one minute.
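One way to tell the two cases apart from a machine behind the router is to try a TCP connection to a literal IP address and, separately, a name lookup. A rough Python sketch follows; the address 1.1.1.1, port 53, and the name example.com are placeholders I picked, and the helper names are made up for illustration.

```python
import socket


def can_connect(ip, port=53, timeout=3.0):
    """True if a TCP connection to a literal IP succeeds (no DNS involved)."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def can_resolve(name):
    """True if the system resolver returns at least one address for name."""
    try:
        return len(socket.getaddrinfo(name, None)) > 0
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def diagnose(ip_ok, dns_ok):
    """Classify the failure from the two probe results."""
    if ip_ok and dns_ok:
        return "ok"
    if ip_ok:
        return "dns"        # packets flow, but names do not resolve
    return "below dns"      # no IP connectivity; DNS result is not conclusive


def check(ip="1.1.1.1", name="example.com"):
    """Run both probes (assumed targets) and return the classification."""
    return diagnose(can_connect(ip), can_resolve(name))
```

Calling `check()` while the internet appears dead would return "dns" if only name resolution is failing, and "below dns" if even the connection to the IP address fails.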

I believe this is a side effect of how DOCSIS provisioning works. After the modem reboots, it will learn new MAC-to-IP-address associations for a short period of time (and only a limited number; on many home links just a single one). These associations seem to be triggered by DHCP traffic. If your Turris still believes it has a valid DHCP lease, it might simply not exchange DHCP traffic with the CMTS/modem while the modem is still willing to learn. You could test this hypothesis by unplugging your modem to cause an internet drop, then checking whether releasing the DHCP lease and getting a new one shortly after the modem has rebooted works, while doing the same a few minutes after the reboot does not.

@dcam Does it always happen when the internet drops, or only in some cases? I might have the same problem. For me it does not happen every time, only very rarely. But when it happens, the Ethernet link to the modem is dead (LED indicators off on both sides) and only a restart of the Omnia helps. There are some kernel error messages in the logs suggesting a software bug, so I sent an e-mail with details to Turris tech support. Hopefully they can find out something from those logs.
The next time it happens to you, try to grab the logs from the router as described here: https://doc.turris.cz/doc/en/howto/error_reporting. That should reveal whether it is the same problem as the one I have, or something else.

@freshdax There’s no watchdog option in the LuCI menu. Do you mean watchcat instead? The packages watchcat & luci-app-watchcat are not installed by default, so you probably added them yourself… but still, it might not be the right mitigation for me, because:

  1. Restarting the whole router is a pretty invasive action. All the LXC containers go down, for example. Wi-Fi devices will lose their connection too.
  2. I live in China. The internet connection is often messed up because of government censorship, and even without it… the Chinese “quality” is infamous. As a result, any site (even those located in China) can become unavailable for some period of time, so simple monitoring like watchcat would lead to undesired restarts. I would need something more complex than checking a single IP address, probably checking a set of several IP addresses and restarting only when none of them is reachable.
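The check from point 2 could look roughly like this in Python. This is only a sketch: the function names are made up, it assumes a Linux-style `ping` that accepts `-c` (count) and `-W` (timeout in seconds), and the host list would have to be chosen by whoever runs it.

```python
import subprocess


def ping(host, timeout_s=2):
    """Send one ICMP echo with the system ping; True if a reply came back."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def should_restart(hosts, probe=ping):
    """True only when none of the hosts answers.

    The probe is a parameter so the decision can be tried out without
    touching the network.
    """
    if not hosts:
        raise ValueError("need at least one host to check")
    # any() stops at the first reachable host, so a healthy link costs one ping.
    return not any(probe(host) for host in hosts)
```

A single unreachable site then no longer triggers anything; only the case where every address in the list fails would count as "internet down".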

So I still prefer to have the bug found and corrected…