Turris Omnia not accessible

Hi,
during the day I found a lot of entries in syslog like “…BTRFS warning (device mmcblk0p1)…”, and there was no WLAN at home.
I switched off the power and, after waiting a few minutes, restarted the device. It took some reboots to switch to the last known good configuration. Now the device is up, internet is working, and the LXC containers are available, but I cannot access the admin interface or reach the router using ssh.
I want to avoid a factory reset because of some LXC images I would like to save and also some tricky configs that I do not want to lose…
It looks like the webserver and ssh are not running (checked using nmap from an LXC container). How can I start them without ssh, or what else should I do?
Any help is highly appreciated!
Thanks, Thomas

You can always try rollback instead of factory reset. The latest state will be made into a snapshot, so you can return to it later or e.g. mount it somewhere and copy the containers out to a working config.

Hi,
thanks, but the device is not accessible, so I have no chance of doing anything.
In the meantime the serial console arrived and showed that I have a serious problem with the eMMC (mmcblk) storage. Even a factory reset is not successful at the moment.
What can I do using U-Boot or rescue mode, or by booting something else over tftp, in order to provide a valid device for the factory reset?
Or is it just a hardware failure, so that I have to ask for a replacement?
Please help, because I am lost.
Thanks…

Not accessible over network? You can roll back just by pressing the button, BTW.

No, the device is currently offline, without power.
Believe me, I tried to roll back to the latest snapshot, but it did not work and I have no idea why.
In the meantime I found out that I do have problems with the internal storage. And I really need help to proceed!
Thanks

OK, I didn’t understand “not accessible” as not being able to power-up.

To be exact, I can power up; at least some LEDs are lit :slight_smile: but nothing else. Only the serial console is working, and I guess I should do something to bring the internal storage back to working order. Maybe a special fdisk, format, or some magic mmc command?

The LXC containers apparently ran on the internal storage? There are numerous advisories against doing so, as it could eventually wear out the internal storage, and maybe that is just what happened.

How did you reach the conclusion about the problems with the internal storage? Was there any printout when accessing the console?

No, LXC runs on a separate mSATA SSD. The problems were found when using medkit with the serial console. The device can boot to rescue and does a lot until it reboots. Then the mmcblk0 device is unavailable for mounting. When I arrive home in approx. 1 hour I can try to send a log…
Thomas

The impression of lxc on the internal storage was caused by

A factory reset and/or flashing a new firmware image should/would not touch containers on the msata ssd.

If flashing the router with medkit does not work, you may want to contact TO support and see whether the device is suffering from faulty hardware and is covered under warranty.

I have similar issues - I now have an unresponsive router. None of the LAN lights operate, and I have tried to reflash the medkit but to no avail (I tried both a FAT USB drive and an EXT3 USB drive - both showed some activity initially, then nothing…).

I managed to capture some of my dmesg before it went - it was reporting BTRFS warning csum failed. All my logging and LXC was on an external 1TB drive.
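For anyone who saved a copy of dmesg or syslog before the router went down, here is a minimal Python sketch for tallying the BTRFS messages per device, which can help tell whether the internal storage (mmcblk0p1 in the first post) or an external drive is the one reporting problems. The regular expression only assumes the `BTRFS warning (device …)` prefix quoted in this thread; the function name and the `error` level are my own assumptions.

```python
import re
from collections import Counter

# Matches e.g. "BTRFS warning (device mmcblk0p1)" and captures level and device.
# Only the "BTRFS <level> (device <name>)" prefix is assumed here.
BTRFS_RE = re.compile(r"BTRFS (warning|error) \(device ([^)]+)\)")


def count_btrfs_messages(lines):
    """Return a Counter keyed by (device, level) for BTRFS warnings/errors."""
    counts = Counter()
    for line in lines:
        match = BTRFS_RE.search(line)
        if match:
            level, device = match.groups()
            counts[(device, level)] += 1
    return counts
```

Feeding it the lines of a saved log file (for example `count_btrfs_messages(open("dmesg.txt"))`, where the file name is hypothetical) gives one count per device and level.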

I now have no internet for the household which as you can imagine is unpleasant for me to say the least!

Some quick pointers would be most welcome! I need to be back online for work on Monday and at the moment have no way other than tethered to my mobile phone!

edit to add: to be clear - there is nothing responding at 192.168.1.1, no web, ssh or ping…

…oh no! This is bad news for me, since my device suffers from the exact same behaviour.
The first thing that happened was a “borked foris interface”, which I was not able to resolve by (re-)installing the foris package or by a schnapps rollback…
Today I wanted to try “Rollback to factory reset” and “Re-flash router”, both with no success at all. Things got even worse. :frowning: Before these attempts I was able to use my network, but the LXC containers (on an external USB drive) were gone, and I therefore tried to reset my TO.
At the moment it boots into some state with four LEDs lit: power, lan2, lan4 and wan. The device is not responsive anymore: no DHCP, no web interface, no ping (I tried its old network address and the default 192.168.1.1).
~$ nmap -Pn -p- 192.168.1.1

Starting Nmap 7.60 ( https://nmap.org ) at 2018-09-15 15:23 CEST
Nmap scan report for 192.168.1.1
Host is up (0.029s latency).
All 65535 scanned ports on 192.168.1.1 are filtered

Nmap done: 1 IP address (1 host up) scanned in 83.37 seconds
~$
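If nmap is not at hand (for example inside a minimal LXC container), a rough check of just the ssh and web ports can be sketched in Python with the standard `socket` module. This is only an approximation of what nmap reports: "open" means the TCP connection was accepted, "closed" means it was actively refused, and "filtered" here lumps together timeouts and other network errors. The function names are my own.

```python
import socket


def probe_port(host, port, timeout=2.0):
    """Classify one TCP port as 'open', 'closed' or 'filtered'."""
    try:
        # A completed TCP handshake means something is listening.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered with a reset: reachable, but no listener.
        return "closed"
    except OSError:
        # Timeout or another network error: no usable answer at all.
        return "filtered"


def probe_ports(host, ports, timeout=2.0):
    """Probe several ports and return a {port: state} dictionary."""
    return {port: probe_port(host, port, timeout) for port in ports}
```

Calling `probe_ports("192.168.1.1", [22, 80, 443])` would then show whether ssh and the web interface answer at all, or whether everything is dropped as in the scan above.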

[edit]
I managed to get a boot log from my TO via serial link: https://pastebin.com/5mYw7vy9

As far as I understand there is a problem reading from some 8MB SPI flash IC, which fails on CRC calculation.
Where does U-Boot save its environment? Is it that flash IC?
Why is the data mixed up there? (Note: I never power cycle my TO, it runs 24/7)
[/edit]

[edit2]
…and here’s a boot log showing the failure of “Re-flash router” procedure. Btw: I’m pretty sure the USB thumb drive is prepared well and connected to the correct front USB port of the TO.
The log: https://pastebin.com/nhqucErE
[/edit2]

Does anyone have any idea/instructions on how to reenable my Turris Omnia?
How can I unbrick the device? I cannot find appropriate instructions…

kind regards
Chris

Since you can access the router via serial, have you tried schnapps and a rollback to a previous snapshot that is likely to work?

There are RAID and GSM modem log entries. Perhaps it does not make a difference, but did you try to flash medkit with any extra devices (hd / gsm modem) removed?

From my experience with medkit, I had to reformat the USB drive again for each re-flash attempt (and copy the firmware again).

If nothing works, however, it might be worth opening a support ticket with TO.

You are right, schnapps could at least undo the reset/reflash attempts. I’ll try that right now.
[edit]
Bingo! I restored a snapshot made earlier, so I have the rudimentary LuCI interface again, just like the day before.
[/edit]

I don’t have any extra devices attached to my Turris Omnia, so I already tried without any extra device.

Which format did you use to get the USB thumb drive ready? I tried two different USB thumb drives, each formatted as ext4. With the second one I gave it another try using FAT32.

I already opened a support ticket about the issue linked above and included a link to this forum thread too.

Thanks for your thoughts though!

kind regards
Chris

FAT


The second log you posted states:

No medkit image found. Exit to shell.


A bit confusing/strange then seeing these RAID6 entries in the log:

[ 0.520207] raid6: int32x1 gen() 155 MB/s
[ 0.689957] raid6: int32x1 xor() 213 MB/s
[ 0.860084] raid6: int32x2 gen() 255 MB/s
[ 1.029926] raid6: int32x2 xor() 229 MB/s
[ 1.199953] raid6: int32x4 gen() 284 MB/s
[ 1.369956] raid6: int32x4 xor() 190 MB/s
[ 1.539924] raid6: int32x8 gen() 369 MB/s
[ 1.709910] raid6: int32x8 xor() 214 MB/s
[ 1.709913] raid6: using algorithm int32x8 gen() 369 MB/s
[ 1.709916] raid6: … xor() 214 MB/s, rmw enabled
[ 1.709919] raid6: using intx1 recovery algorithm

I experimented further and had some progress:
When using one of my USB 2.0 thumb drives I watched the kernel log output and found this:
usb 1-3: device descriptor read/64, error -110
which, according to this, points to a bad USB device, since it should never consume more than 500 mA…
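As a side note on reading such kernel messages: the number after "error" is a negated Linux errno value, and 110 is ETIMEDOUT, i.e. the device did not answer the descriptor request in time. A minimal Python sketch for decoding it follows; the small table is hard-coded with Linux values, covers only a handful of codes I picked as examples, and the function name is my own.

```python
import re

# Linux errno values; only a few example codes are listed here.
LINUX_ERRNO = {
    19: ("ENODEV", "No such device"),
    32: ("EPIPE", "Broken pipe"),
    71: ("EPROTO", "Protocol error"),
    110: ("ETIMEDOUT", "Connection timed out"),
}

ERROR_RE = re.compile(r"error (-\d+)")


def decode_kernel_error(line):
    """Return (code, name, description) for an 'error -N' message, or None."""
    match = ERROR_RE.search(line)
    if not match:
        return None
    code = int(match.group(1))
    # The kernel prints the negated errno, so look up the positive value.
    name, text = LINUX_ERRNO.get(-code, ("UNKNOWN", "not in this table"))
    return code, name, text


print(decode_kernel_error("usb 1-3: device descriptor read/64, error -110"))
# prints (-110, 'ETIMEDOUT', 'Connection timed out')
```

The timeout alone does not say why the device failed to answer; power, cabling or the stick itself are all possible causes.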
I just gave it another try with my external USB 3.0 hard disk drive and – Eureka! – it works!
The interesting difference from my USB 3.0 thumb drive is the partition table, which the thumb drive does not have at all. I guess the rescue.sh script (@ line 118) is not able to find the right ext/vfat/btrfs partition…
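To check whether a given drive has a classic MBR partition table or carries a filesystem directly on the raw device, the first 512-byte sector can be inspected. The Python sketch below is a heuristic only, written under my own assumptions: it looks for the 0x55AA signature and for partition entries whose boot flag is 0x00 or 0x80, because a FAT boot sector written directly to the device carries the same signature. It says nothing about what rescue.sh actually does, and it does not parse GPT beyond seeing the protective MBR entry.

```python
import struct

SECTOR_SIZE = 512


def mbr_partitions(sector0):
    """Return [(index, type, start_lba, sector_count), ...] for sector 0.

    An empty list means sector 0 does not look like an MBR partition table.
    """
    if len(sector0) < SECTOR_SIZE or sector0[510:512] != b"\x55\xaa":
        return []
    entries = []
    for i in range(4):
        # Four 16-byte partition entries start at offset 446.
        raw = sector0[446 + 16 * i: 446 + 16 * (i + 1)]
        status, ptype, start_lba, count = struct.unpack("<B3xB3xII", raw)
        if status not in (0x00, 0x80):
            # Invalid boot flag: probably a filesystem boot sector instead.
            return []
        if ptype != 0 and count > 0:
            entries.append((i + 1, ptype, start_lba, count))
    return entries
```

On a Linux machine the sector could be read with something like `open("/dev/sdX", "rb").read(512)`, where `/dev/sdX` is a placeholder for the actual device and root privileges are usually needed.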
Anyway, my TO is back again. [Solved] :slight_smile:
Thanks for your input though!

Best regards
Chris