Great @davidhaluska, thanks for your efforts! I have not yet received a reply from Turris on my report of the same issue (and a few others). Good to know that someone is home over there.
I have the same switch and the same problem… I have had to disable automatic updates because I’m often away from home and need remote access to my servers here. The automatic reboot caused my systems to become unavailable at random intervals. I’d really like to turn automatic updates back on for security reasons. Please let us know when the update with the fix is out!
Many thanks for the effort you all put into narrowing down and resolving this problem. Even though the fix is as simple as resetting each PHY in the switch chip, it took us quite a long time to reproduce the problem and finally debug it, and it would have taken even longer without your help. So thank you again!
brill–if I flash using the medkit, can I still roll back to one of my prior snapshots after I’m done testing the fix? I have a working configuration (that uses fix-switch) that I’d like to return to after testing your fix.
Hi! Unfortunately no… Flashing the router from USB completely wipes the eMMC and replaces the factory defaults snapshot and all other snapshots as well.
Well, maybe you can try adding the dev-tms branch (which contains the testing kernel) to your updater sources. That might be less intrusive, and you would be able to return to the snapshot afterwards. I’ll try it here and write down a short how-to.
Theoretically you could use “btrfs send” to back up the root device to a USB stick and then restore it with “btrfs receive” after testing. It is a high-risk operation.
If you only want to save the configuration, then backing up /etc/config and /etc/updater is enough. The Foris UI in 3.4 has backup and restore functionality to help back up those directories, but I understand there are some bugs in the restore part of it.
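For the configuration-only route, here is a minimal sketch in Python of packing those two directories into one archive. It is only an illustration: the function name `backup_dirs` and the archive location are my own assumptions, not anything Turris OS or Foris provides, and it does not replace a snapshot.

```python
import os
import tarfile


def backup_dirs(dirs, archive_path):
    """Pack the given directories into one gzip-compressed tar archive.

    Hypothetical helper for illustration; not part of Turris OS or Foris.
    """
    with tarfile.open(archive_path, "w:gz") as tar:
        for d in dirs:
            if os.path.isdir(d):
                # Store member names without the leading "/" so the archive
                # can later be unpacked relative to any chosen root.
                tar.add(d, arcname=d.lstrip("/"))
            else:
                # A missing directory is reported and skipped, not fatal.
                print("skipping missing directory: " + d)
    return archive_path
```

On the router this might be called as `backup_dirs(["/etc/config", "/etc/updater"], "/mnt/usb/omnia-config.tar.gz")`, where `/mnt/usb` is an assumed mount point for a USB stick. Keep the archive off the eMMC, since a medkit reflash wipes it.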
Tomas/Turris team, did this fix make it into the upcoming 3.5 release (which is coming out Thursday)? While I can use the fix-switch binary to get around it, it would be great to have the fix built in.
I have more to back up than just configuration, so a snapshot is easier for me. Tomas, I will keep an eye out for this to hit nightly; once it does, I will switch to that branch, test thoroughly, and roll back via snapshot. As I understand it, the GitLab activity will show when this moves to the test branch; is that correct?
However, I did pull the January 13th nightly onto my Omnia (which contains the fix for this) and tested it. It does fix the problem (I rebooted and power cycled the router multiple times and all ports came up, which wasn’t the case for me before without the fix-switch binary), so it looks likely that the final fix will be in the 3.6 release.
The fix for this is now incorporated into the 3.5.2 release, which was pushed today. I confirmed that it resolved the issue for me (after I removed the fix-switch workaround).
I am not sure if this is related, but when there is a power loss, one of my devices has no link after boot. I have to physically disconnect the cable, usually on both ends. The problem appeared when I moved from Turris 1.0 to Omnia, and it is consistent across the Ethernet ports on the Omnia. No other device has a problem.