Turris OS 5.3.4 is released!

pwgen · February 1, 2022, 3:04pm

Yeah, “great”. One of our locations went offline because of that. I now need to take a plane to reboot the router. We’re getting rid of our Turris Omnias because of the latest upgrade issues, where locations are randomly cut off from the internet. That’s not what we expect from professional network devices.

Pepe · February 1, 2022, 3:16pm

That’s your decision, and it is up to you. I see that you posted a similar comment one month ago. It is possible that you edited a file which you should not have. Also, it is worth mentioning that you are using Bird, which is not included in our default configuration.

As I don’t know the exact configuration of your router, it is hard to do some troubleshooting or help you.

Turris Omnia does not suffer from the reboot issue described here. Did you check the tools which are available on the router to see what is going to be overwritten by the upcoming update? Did you reboot your router after the major update (migration process) and once you had configured it the way you wanted?

As you can see, there were no breaking changes in this update based on the changelog and you can list all the changes and do the diffs in repositories which have been changed.

Since you posted it here, I’d expect that you want to get some help. Please, visit our documentation.

//Edit: What I am thinking is: since you are using Bird, isn’t it possible that you are using something different for DNS/DHCP as well?

peci1 · February 1, 2022, 4:11pm

@Pepe maybe one nice feature for these far-away setups would be another update mode - delayed with confirmation of success. So you’d basically install a cron job to revert the update before executing it, and the user would need to manually connect to the router and click a button in Reforis to confirm the update went well. If not, the router would automatically revert the update and reboot after some time. Of course, that’s not the best idea with MOX, but on Omnia, it could be pretty useful (I’m myself doing something similar manually).
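The confirm-or-revert idea above could be sketched roughly as follows. This is only an illustration under stated assumptions, not an existing updater feature: the marker file path, the function names, and the use of `schnapps rollback` followed by `reboot` as the revert step are all hypothetical choices made for the example.

```sh
#!/bin/sh
# Hypothetical confirm-or-revert helper; all names and paths are illustrative.

# Assumed marker file that the admin creates to confirm the update went well.
CONFIRM_FILE="${CONFIRM_FILE:-/tmp/update-confirmed}"
# Assumption: reverting means rolling back a snapshot and rebooting.
ROLLBACK_CMD="${ROLLBACK_CMD:-schnapps rollback && reboot}"

# Print what should happen now: "confirmed", "wait", or "rollback".
# $1 = deadline (epoch seconds), $2 = current time (epoch seconds)
decide_action() {
    deadline="$1"
    now="$2"
    if [ -e "$CONFIRM_FILE" ]; then
        echo confirmed
    elif [ "$now" -lt "$deadline" ]; then
        echo wait
    else
        echo rollback
    fi
}

# Meant to be run periodically (for example from cron) after an update.
# $1 = deadline (epoch seconds) by which the update must be confirmed.
check_update() {
    case "$(decide_action "$1" "$(date +%s)")" in
        rollback) sh -c "$ROLLBACK_CMD" ;;
        *) : ;;  # confirmed or still waiting: nothing to do
    esac
}
```

Keeping the decision in `decide_action` separate from the destructive step makes the logic easy to dry-run before trusting it on a remote router.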

viktor · February 1, 2022, 4:24pm

And nobody else can use the two-LED restart? And what about a recovery USB with an older configured TOS? What about remote access?

The problems I describe are related to the original MOX A module, certainly not Omnia.

pwgen · February 1, 2022, 4:48pm

Hi Pepe,

I posted it here not to ask for help, but just to give you feedback, so maybe you can consider better testing of new releases. Indeed, we had a problem in December with one router, and after this release, with a second router.

We’re using Bird, but it’s not a critical component, i.e. it should not prevent the router from restarting and getting a public IP address. It is used to advertise private networks for IPsec tunnels (we use IPsec with VTI to connect between locations and public cloud providers like AWS). For DHCP/DNS, there weren’t any significant changes; everything was configured using LuCI.

Regarding bird being “not included in default configuration” - I’ve seen BGP advertised in official product data sheet and this was a key decision point when we considered platform to buy, so (only in my humble opinion) it should not be treated as “non-default”.

The router was last rebooted a few weeks ago (beginning of December, if I recall correctly) and it came up successfully.

Best!

Pepe · February 1, 2022, 6:42pm

Sorry, but I don’t take a feedback as “we get rid of routers, because they suck”. If you would like to help or give us feedback, then there is a better way of doing it.

I am thinking out loud: wasn’t Turris OS 5.3.4 in the Testing branch for almost 2 weeks?

In LuCI, you can configure anything you want, since you have root access, even invalid configurations.

moeller0 · February 1, 2022, 8:00pm

What are the symptoms you are observing exactly with the omnia, if I might ask?

viktor · February 2, 2022, 3:53pm

Could you answer my questions above, @pwgen?

That would also interest me.

pwgen · February 2, 2022, 4:53pm

I can access this host only from outside (internet).

it’s not updating its dynamic dns record
its ipsec peers don’t show any connection attempts for ipsec tunnels
hosts from inside (ie. alarm, which uses outgoing TCP session to security service in internet; also SIP-based phones) don’t connect to servers (no connection attempts are visible in the logfiles of respective servers)

pete · February 2, 2022, 6:15pm

This sure sounds like you’re using the Omnia for an enterprise use case, with a specific config that the router is not tested for.

As @Pepe alluded to in his post, I think this one is on you, not the Turris team.

If you need complete reliability, either:

Don’t update the router
Use staged rollout of updates and test every update in a lab/simulated environment - this is a standard approach even for Windows updates in mission critical enterprise scenarios

We live and we learn…

moeller0 · February 3, 2022, 10:42am

OK, that seems to match something seen on a few TOS as well as OpenWrt installations. In my case I need to issue /etc/init.d/firewall restart for internal hosts to be able to reach the internet. Here is what I added to my /etc/rc.local:

root@turris:~# cat /etc/rc.local
# Put your custom commands here that should be executed once
# the system init finished. By default this file does nothing.




# opkg update ; opkg install ethtool tcpdump netperf mtr luci-app-sqm luci-app-nlbwmon iftop nano luci-app-statistics collectd collectd-mod-cpu collectd-mod-ping kmod-wireguard wireguard-tools luci-app-wireguard iputils-ping coreutils-date coreutils-sleep nftables luci-app-nft-qos nft-qos coreutils-stat bcp38 luci-app-bcp38 kmod-sched-ctinfo flent-tools collectd-mod-sensor


# to deal with the fact that routing does not seem to work out of the box after a reboot
# the next one seems to work
#ifup lan 
# the ifup lan replaces the following three lines
/etc/init.d/firewall restart
/etc/init.d/sqm stop
/etc/init.d/sqm start

# ddns fails to come up automatically
# is DDNS still/already running?
#DDNS_PID=$( pgrep -f -a dynamic_dns_updater.sh )
# pgrep returns the PID as first word, get rid of the rest
DDNS_PID=$( pgrep -f -a dynamic_dns_updater.sh | sed 's/ *\([^ ]*\).*/\1/' )
# if yes kill it
[ -n "$DDNS_PID" ] && kill ${DDNS_PID}
sleep 5

# the stop/start dance will not update a running instance...
/etc/init.d/ddns stop
/etc/init.d/ddns start


exit 0

Yes, that is far from ideal, but since the same phenotype has been reported on stock OpenWrt, see e.g.: https://forum.openwrt.org/t/manual-restart-of-br-lan-required-for-internet-access-on-clients/118224 my unsubstantiated hunch is some sort of race condition in which something is too fast (or maybe something else too slow) and hence the system coming up half-baked…

As you can see DDNS was especially affected and required desperate measures… and I am not 100% sure that this is 100% robust and reliable yet (I did only test a few enforced reboots, by no means enough to consider this “worked-around”)
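One way to make such a workaround less dependent on timing is to repeat the recovery step until a connectivity probe succeeds, instead of running it once. The following is a minimal sketch of that idea, not a tested fix for the underlying issue; the function name, the probe target `1.1.1.1`, and the retry limits are assumptions chosen for illustration.

```sh
#!/bin/sh
# Sketch: repeat a recovery action until a connectivity probe succeeds.

# Assumptions, not taken from the post: probe target and retry limits.
PROBE_CMD="${PROBE_CMD:-ping -c 1 -W 2 1.1.1.1}"
# The recovery step mirrors the firewall restart used in rc.local above.
RECOVER_CMD="${RECOVER_CMD:-/etc/init.d/firewall restart}"
MAX_TRIES="${MAX_TRIES:-5}"
RETRY_DELAY="${RETRY_DELAY:-10}"

# Returns 0 once the probe succeeds, 1 if it still fails after MAX_TRIES.
ensure_connectivity() {
    try=1
    while [ "$try" -le "$MAX_TRIES" ]; do
        if sh -c "$PROBE_CMD" >/dev/null 2>&1; then
            return 0
        fi
        echo "probe failed (try $try/$MAX_TRIES), running recovery" >&2
        sh -c "$RECOVER_CMD" >/dev/null 2>&1
        sleep "$RETRY_DELAY"
        try=$((try + 1))
    done
    return 1
}
```

Called near the end of /etc/rc.local, `ensure_connectivity` would only restart the firewall when the probe actually fails, and the bounded retry count keeps a permanently broken uplink from looping forever.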

moeller0 · February 3, 2022, 10:48am

I respectfully disagree. This seems to be an issue inherited from OpenWrt that affects a few configurations, but not others. The effects are really, really obvious and not subtle at all, and in no way “enterprise”… I do agree that this does not appear to be something caused by the Turris team, just as so many other bugs they helped squash over time were also not “their fault” ;).

As long as I assumed it was just my router, I mostly ignored it (I have the luxury of being able to baby-sit most reboots anyway). Only recently, after not being around for a day, did I try to automate the work-around. But what really seems to be required is to figure out the root cause instead.

Yeah, but this is part of the claim to fame for TOS, automated updates that simply work… Again, not implying in any way team Turris bears responsibility for the issue, but expecting to be able to reboot a router successfully does seem like a very low bar. (And in fairness, the router does reboot, it only does not come up with full functionality).

peci1 · February 3, 2022, 12:02pm

I’ve created an issue for the possible new updater mode “approvals with good update confirmation”. I know it’s not best for a large-scale installation, but if you manage just a few routers, it could be doable. If you have a large-scale deployment, then you can set just a few typical clients to this mode and leave the others with classical approvals which you would trigger once you verify the update.

Using automated updates with client devices you have no physical control over seems like suicide anyway, so at least approvals seem to be required for this kind of deployment.

moeller0 · February 3, 2022, 12:12pm

The thing is, for my issue it is sufficient to reboot the router, upgrades are only involved as they often require a reboot as well. So +1 for your issue seems like an interesting addition; but it will not help in my case, as updates are not the critical step in triggering the issue…