Turris 1.x - kernel 5.15: router does not start properly

Hi, I am using Turris OS 6.0 (HBL) on the Turris 1.x, and this morning my router was not able to boot correctly. DHCP was also broken, so I could only connect to it with a static IP, and there was no internet access. After connecting to the router I found that it had failed to start properly, with a huge CPU load and a load average above 8. Commands like ifconfig or ip addr were hanging. A reboot did not help. Eventually I found that the router had been upgraded during the night and that the problem started after the following reboot, so I rolled back to the latest pre-upgrade snapshot and disabled automatic upgrades for now.
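In case it helps others check for the same symptoms, here is a rough Python sketch of the two checks described above: a high load average and a command that never returns. It assumes python3 is available on the router. The function names, the 5-second timeout and the threshold of 8 are only illustrative choices, not anything taken from Turris OS.

```python
import os
import subprocess


def command_hangs(argv, timeout_s=5.0):
    """Return True if the command does not finish within timeout_s seconds."""
    proc = subprocess.Popen(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    try:
        proc.wait(timeout=timeout_s)
        return False
    except subprocess.TimeoutExpired:
        # Best effort only: a process blocked inside the kernel may ignore
        # the kill, so we deliberately do not wait for it to exit here.
        proc.kill()
        return True


def load_is_high(threshold=8.0):
    """Return True if the 1-minute load average is above the threshold."""
    one_minute, _five, _fifteen = os.getloadavg()
    return one_minute > threshold
```

For example, `command_hangs(["ip", "addr"])` together with `load_is_high()` would roughly correspond to what I saw by hand.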

Below is an os-release diff between the working and the failing snapshot. I hope this helps:

--- @287/etc/os-release 2022-07-17 18:45:26.000000000 +0200
+++ @288/etc/os-release 2022-08-03 19:07:00.000000000 +0200
@@ -7,7 +7,7 @@
 HOME_URL="https://www.turris.cz/"
 BUG_URL="https://gitlab.nic.cz/groups/turris/-/issues/"
 SUPPORT_URL="https://www.turris.cz/support/"
-BUILD_ID="r16616+81-6f89233c41"
+BUILD_ID="r16625+110-f94b30d83c"
 OPENWRT_BOARD="mpc85xx/p2020"
 OPENWRT_ARCH="powerpc_8540"
 OPENWRT_TAINTS="busybox"
@@ -15,4 +15,4 @@
 OPENWRT_DEVICE_MANUFACTURER_URL="https://www.turris.cz/"
 OPENWRT_DEVICE_PRODUCT="Turris 1.x"
 OPENWRT_DEVICE_REVISION="v0"
-OPENWRT_RELEASE="TurrisOS 6.0 6f89233c41ae8af4537e64684fa274ab4a21e9d4"
+OPENWRT_RELEASE="TurrisOS 6.0 f94b30d83c2aa401b3ff9c4914bb2ef6386df8aa"
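If you would rather list the changed keys than read a diff, something like the following Python sketch could be used. The helper names are hypothetical, and the sample strings are only a few lines copied from the diff above, not the complete files.

```python
def parse_os_release(text):
    """Parse KEY="value" lines into a dict, skipping blanks and comments."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        result[key] = value.strip().strip("\"'")
    return result


def changed_keys(old_text, new_text):
    """Map each key whose value differs to an (old, new) tuple."""
    old = parse_os_release(old_text)
    new = parse_os_release(new_text)
    return {
        key: (old.get(key), new.get(key))
        for key in sorted(set(old) | set(new))
        if old.get(key) != new.get(key)
    }


# Excerpts taken from the diff above (not the full os-release files).
OLD = '''BUILD_ID="r16616+81-6f89233c41"
OPENWRT_BOARD="mpc85xx/p2020"
OPENWRT_RELEASE="TurrisOS 6.0 6f89233c41ae8af4537e64684fa274ab4a21e9d4"
'''
NEW = '''BUILD_ID="r16625+110-f94b30d83c"
OPENWRT_BOARD="mpc85xx/p2020"
OPENWRT_RELEASE="TurrisOS 6.0 f94b30d83c2aa401b3ff9c4914bb2ef6386df8aa"
'''

for key, (before, after) in changed_keys(OLD, NEW).items():
    print(f"{key}: {before} -> {after}")
```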

I also see that the kernel was upgraded from 5.4.203 to 5.15.58, which is likely the root cause.

-Version: 5.4.203+5.10.110-1-1-f70e9c9745643e220f2338b431a1b5ff
-Depends: kernel (=5.4.203-1-f70e9c9745643e220f2338b431a1b5ff), kmod-mac80211
+Version: 5.15.58+5.15.33-1-1-9a2687e627ae31779792480683f13d02
+Depends: kernel (=5.15.58-1-9a2687e627ae31779792480683f13d02), kmod-mac80211
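To make the kernel jump explicit, here is a small Python sketch that pulls the kernel version out of the two Version fields above and checks whether the major.minor series changed. It assumes the part before the "+" is the kernel version, which matches the Depends lines. The function names are made up for illustration.

```python
import re


def kernel_version(version_field):
    """Extract the leading x.y.z kernel version as a tuple of ints."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version_field.strip())
    if match is None:
        raise ValueError(f"no kernel version found in {version_field!r}")
    return tuple(int(part) for part in match.groups())


def series_changed(old_field, new_field):
    """Return True if the major.minor kernel series differs."""
    return kernel_version(old_field)[:2] != kernel_version(new_field)[:2]


# Version fields copied from the package metadata above.
OLD_VERSION = "5.4.203+5.10.110-1-1-f70e9c9745643e220f2338b431a1b5ff"
NEW_VERSION = "5.15.58+5.15.33-1-1-9a2687e627ae31779792480683f13d02"

print(kernel_version(OLD_VERSION), kernel_version(NEW_VERSION))
print("series changed:", series_changed(OLD_VERSION, NEW_VERSION))
```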

cc: @Pepe

Thanks for reporting, I’ll look at it.


Commands like ifconfig or ip addr were hanging.

I can reproduce this, but only because the network stack takes too long to load fully. In my case, it took almost 9 minutes for everything to settle down. After that, it works as it should.

We are investigating that.

Thank you for checking. In my case the internet connection was completely broken, which is how I noticed the issue. I also have syslog forwarding set up to another host (an RPi), but I think syslog-ng was never able to start, because all logs after the upgrade are missing. I will be on vacation for the next 2 weeks, but I am happy to test whatever is needed when I return. I also hope that Turris OS 6 will be released eventually, so that I can switch back to stable.

Yes, I experience the same issues on both the Turris 1 and the Omnia.
After a reboot I have a WAN connection for a while, then it disappears, and sometimes it comes back. I did not know that it could take up to 9 minutes, but I saw the same behaviour.
Not knowing that this is a new issue, I got quite mad fiddling with both devices for the past two weeks.
I suspected that my old Turris OS 6 HBL images from March on the Omnia had somehow become broken over time by all the daily HBL updates, so I decided to roll back to a fresh factory medkit and configure everything again. Without this important information, you can imagine how much time I spent reflashing, experimenting and trying to figure out what was wrong. I hope this will not be marked as another off-topic story.

No, this happens only on Turris 1.x routers. We don’t know the culprit yet, but we are working on it. If you experience it on a Turris Omnia router, then I am pretty confident that it is related to a misconfiguration on your end.

I see a number of commits in the branch. Do you think the issue is fixed, or not yet? I am happy to test whatever needs to be tested :)

No, it is not. I will let you know.


Sorry for bumping the thread. Is there any progress here? Do you need any help?

No progress so far. We are currently busy with other things.