After a failed upgrade, system no longer manageable

My Turris Omnia crashed and, for unknown reasons, did not reboot (power on, but zero response on the network). I first tried to return to the latest snapshot, but got the same issue. Then I returned to factory settings. Everything went fine and I reconfigured the box.

Then I had two bad ideas: upgrading the software, and doing it too quickly, without thinking or recording what I did.

Now, I can no longer manage the Omnia:

  • Foris crashes “Unhandled Exception” with a long Python stack trace in the log,
  • Luci crashes “500 Internal Server Error”
  • I can log in with SSH but:
    • pkgupdate crashes with many SSL errors such as “Error relocating /usr/lib/libunbound.so.8: SSL_ctrl: symbol not found”
    • opkg crashes “Error loading shared library libssl.so.1.0.0: No such file or directory (needed by /usr/bin/curl)”

Indeed, I can see that /usr/lib has only libssl.so.1.1 while some apps are built with 1.0.0.
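A quick way to confirm which libraries a binary is missing is to filter the loader's error messages. This is only a sketch: the function name missing_libs is made up here, and it assumes the errors have the same format as the opkg message quoted above.

```shell
#!/bin/sh
# missing_libs: read loader/ldd output on stdin and print the name of every
# shared library reported as missing, one per line, without duplicates.
# Assumed input format (taken from the error quoted above):
#   Error loading shared library NAME: No such file or directory (needed by ...)
missing_libs() {
    sed -n 's/^Error loading shared library \([^:]*\):.*/\1/p' | sort -u
}

# Possible use on the router, assuming ldd is installed there:
#   ldd /usr/bin/curl 2>&1 | missing_libs
```

Comparing the printed names with the contents of /usr/lib shows which packages were built against a library version that is no longer installed.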

So, two questions:

  1. Is there a way to repair the current system? Such as statically built versions of software somewhere?
  2. If I return again to factory settings, is there a way to upgrade properly an old Omnia (2015) to a current system or is it hopeless?

I’d try to update the factory image and then reset to it. Update with schnapps -f

I’ve never used schnapps before, but -f does not seem to be a recognized option:

root@turris:~# schnapps -f
Unknown command -f!

Usage: schnapps [-d root] command [options]

Also:

root@turris:~# schnapps update-factory
Unknown command update-factory!

Try rescue mode 4 to flash a newer Turris OS. Then use schnapps update-factory to get the newest factory image. Then you can reset the router using the latest factory image (or you may just use the system flashed in the first step).

If you’re more adventurous, this is the code that performs update-factory using raw btrfs CLI commands:

    # Create a fresh subvolume and unpack the medkit tarball into it
    if btrfs subvolume create "$TMP_MNT_DIR"/@$NUMBER > /dev/null; then
        if tar -C "$TMP_MNT_DIR"/@$NUMBER --numeric-owner -xpzvf "$TAR"; then
            # Move the current factory image aside, if there is one
            if [ -d "$TMP_MNT_DIR"/@factory ]; then
                mv "$TMP_MNT_DIR"/@factory "$TMP_MNT_DIR"/@factory-old
            else
                echo "No factory image present, this will be the first one"
            fi
            # Promote the new subvolume to @factory, then drop the old image
            mv "$TMP_MNT_DIR"/@$NUMBER "$TMP_MNT_DIR"/@factory
            [ \! -d "$TMP_MNT_DIR"/@factory-old ] || btrfs subvolume delete -c "$TMP_MNT_DIR"/@factory-old
            echo "Your factory image was updated!"
        else
            # Extraction failed: remove the half-filled subvolume
            btrfs subvolume delete "$TMP_MNT_DIR"/@$NUMBER
            die "Tarball seems to be corrupted"
        fi
    fi

How about upgrading to TOS 7.0?

Hi,

have you tried schnapps rollback factory && reboot
to get back to the default state?

I can do it from the reset button. But after that, is it possible to upgrade from this old factory image? My first attempt was a failure.

Then maybe try using the medkit and a USB flash drive:

Part:
4 LEDs: Re-flash router from flash drive

If that succeeds and your Omnia boots properly, then after you go through the initial setup guide, log in via SSH and use the command @peci1 mentioned to update your old factory image to a newer one:

schnapps update-factory

Please let me know if it works for you. If not, I have another option in mind. :slight_smile:

It’s a shot in the dark, but I reset to factory recently, and I think I had an issue where the updater tried to roll back a 7.0.0 medkit to 9.x.x (which broke a lot of things, including curl and updates). So when you set up from scratch, I would advise disabling automatic updates, switching to “manual approval” mode, and double-checking which versions it wants to install.
I might be wrong though, and I didn’t bother taking extensive logs, so YMMV.

OK, the Omnia is back on track. I had never used the schnapps command before but, in this case, it was the solution.

  1. Run schnapps list to get the list of snapshots. Look for ones taken before the crash, but not too old.
  2. Try them with schnapps rollback THENUMBER && reboot until you find one that works.
  3. Continue as usual and pray that the next automatic upgrade does not break things again.
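The rollback step can be wrapped in a small helper that refuses anything but a snapshot number. This is a minimal sketch: rollback_to and the DRY_RUN switch are names invented here, and only the schnapps rollback and reboot commands come from the steps above.

```shell
#!/bin/sh
# rollback_to NUMBER: roll back to snapshot NUMBER and reboot.
# With DRY_RUN=1 (a hypothetical switch of this sketch) the command is only
# printed, which is handy for checking the number first.
rollback_to() {
    case "$1" in
        ''|*[!0-9]*)
            echo "usage: rollback_to SNAPSHOT_NUMBER" >&2
            return 1
            ;;
    esac
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "schnapps rollback $1 && reboot"
    else
        schnapps rollback "$1" && reboot
    fi
}
```

Take the number from the output of schnapps list, and work backwards from the most recent snapshot taken before the crash.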

Thanks to those who searched and helped.

You don’t have to wait: if you run pkgupdate over SSH, the upgrade should start. The staging phase (the rollout of the upgrade to routers in waves) is over now.

Actually, the next upgrade did break things again. Same problem, same solution. I had to go back two snapshots since the “Automatic post-update snapshot (TurrisOS 7.0.0 - hbs)” was broken as well. “Automatic pre-update snapshot (TurrisOS 6.5.2 - hbs)” worked.
I have of course disabled automatic updates but now I have a question: why does my Omnia fail to go to TurrisOS 7, and how can I debug that?

One of the problems could be when you’re low on RAM during the update. Do you have many services or LXC containers configured? Did you configure swap space?
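To check both points before starting an update, something like the following reads the standard MemAvailable and SwapTotal fields from /proc/meminfo. It is a rough sketch: the function names and the 256 MiB threshold are assumptions made here, not values recommended by Turris.

```shell
#!/bin/sh
# mem_kb FIELD [FILE]: print the value in kB of one /proc/meminfo field.
mem_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# report_memory [FILE]: print available RAM and total swap; return 1 when
# available RAM is below MIN_AVAIL_KB and no swap is configured.
report_memory() {
    file="${1:-/proc/meminfo}"
    avail=$(mem_kb MemAvailable "$file")
    swap=$(mem_kb SwapTotal "$file")
    min_kb="${MIN_AVAIL_KB:-262144}"   # assumed threshold: 256 MiB
    echo "MemAvailable: ${avail:-0} kB, SwapTotal: ${swap:-0} kB"
    if [ "${avail:-0}" -lt "$min_kb" ] && [ "${swap:-0}" -eq 0 ]; then
        echo "Low on RAM and no swap configured"
        return 1
    fi
    return 0
}
```

Running report_memory while the containers are up, and again after stopping them, gives an idea of how much memory they hold during an update.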

Two LXC containers. I’m investigating.

On my Omnia, I have pre-update and post-update hooks to stop and start these memory eaters. Since then, my updates have been much smoother.

Sounds like an excellent idea. May I ask how you do that?

May I ask what you use LXC containers for on a router?
I use full VMs in my laptops, but have never been smart enough to think of a use case for LXC containers. :upside_down_face:

root@turris:~# cat /etc/updater/hook_preupdate/01_tor.sh
#!/bin/sh

/etc/init.d/tor stop
root@turris:~# cat /etc/updater/hook_postupdate/96_tor.sh
#!/bin/sh

/etc/init.d/tor start

Mostly to run software that is not packaged for OpenWrt, but is available in e.g. Debian. I use it to run Node-RED + InfluxDB + Grafana to log environmental data from sensors in my rooms.
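The tor hooks shown above could be adapted to LXC containers along these lines. This is only a sketch: the container names are placeholders, and containers_do and the RUN prefix are hypothetical helpers. lxc-stop -n and lxc-start -n are the standard LXC commands.

```shell
#!/bin/sh
# Placeholder container names; replace them with the ones lxc-ls reports.
CONTAINERS="${CONTAINERS:-nodered grafana}"

# containers_do stop|start: stop or start every container in $CONTAINERS.
# RUN is a hypothetical prefix of this sketch: RUN=echo prints the commands
# instead of running them.
containers_do() {
    for name in $CONTAINERS; do
        case "$1" in
            stop)  ${RUN:-} lxc-stop -n "$name" ;;
            start) ${RUN:-} lxc-start -n "$name" ;;
            *)
                echo "usage: containers_do stop|start" >&2
                return 1
                ;;
        esac
    done
}
```

A script in /etc/updater/hook_preupdate/ would end with containers_do stop, and its counterpart in /etc/updater/hook_postupdate/ with containers_do start, mirroring the two tor scripts.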