Optional migration from Turris OS 3.x for advanced users

Fortunately, I can send you logs from all migration attempts, because I relocated the log file(s) to flash memory, where they survive reboots :wink:

logs

-rw------- 1 root root 980251 Jun 17 14:57 messages
-rw------- 1 root root 1581988 Jun 17 04:35 messages.1
-rw------- 1 root root 129444 Jun 16 04:35 messages.2.gz
-rw------- 1 root root 120809 Jun 14 04:35 messages.3.gz
-rw------- 1 root root 241616 Jun 12 04:35 messages.4.gz
-rw------- 1 root root 103376 Jun 10 17:12 messages.5.gz
-rw------- 1 root root 82906 Jun 10 04:35 messages.6.gz
-rw------- 1 root root 83882 Jun 8 04:35 messages.7.gz

If I understand correctly, the log file from the last migration attempt is messages.1 (dated Jun 17 04:35); the times of interest might be from 14:25 (shutdown - reboot) or even later, 14:33 (pkgupdate).

Since I don’t know how to send it to you directly from the Omnia, I can unplug the flash drive and hopefully send the file using Windows :slight_smile:

Edit: better yet, I generated diagnostics with Foris (after renaming messages.1 to messages…). Mailing it to support.

I’ve completed the upgrade following the migration docs at https://docs.turris.cz/geek/tos3_migration/ - I upgraded from 3.11.17 to 5.0.2 and encountered a few issues:

  1. tos3to4: the postrm process for tos3to4 errored like so:

    INFO:Running postrm of tos3to4
    /usr/lib/opkg/info//tos3to4.postrm: line 1: luci-app-squid: not found
    /usr/lib/opkg/info//tos3to4.postrm: line 20: syntax error: unexpected "}"
    INFO:Running postrm of cznic-repo-keys-test
    INFO:Cleaning up control files
    ERROR:Failed operations:
    tos3to4/postrm: /usr/lib/opkg/info//tos3to4.postrm: line 1: luci-app-squid: not found
    /usr/lib/opkg/info//tos3to4.postrm: line 20: syntax error: unexpected "}"
    

    Thankfully, this postrm is only sending the upgrade success notification and nothing else happens - phew. I’ve submitted an issue report and patch at https://gitlab.labs.nic.cz/turris/turris-os-packages/issues/617

  2. LXC containers failed to load after migration. The issue is because they had static IP addresses assigned and the LXC migration script failed to take this into account:

    turris# lxc-ls -l DEBUG
    lxc-ls: parse.c: lxc_file_for_each_line_mmap: 142 Failed to parse config file "/srv/lxc/con/config" at line "lxc.net.0.ipv4.ipv4 = 192.168.1.2"
    Failed to load config for con
    

    The issue was that lines like lxc.net.0.ipv4.ipv4 should actually be lxc.net.0.ipv4.address (and same for ipv6 lines). I’ve created an issue and patch at https://gitlab.labs.nic.cz/turris/turris-os-packages/issues/618

  3. The Wireguard interface was broken, with a DEVICE_CLAIM_FAILED error displayed in the LuCI UI. I fixed the issue by removing the following line from /etc/config/network:

        option ifname 'wg0'
    

    and restarting the interface. New wg interfaces don’t add this line but my other interfaces seem to have it just fine. I don’t know what the issue here was.

  4. DDNS: luci-app-ddns, ddns-scripts and luci-i18n-ddns-en were removed during the upgrade. I don’t know why, so I had to install luci-app-ddns again after the upgrade, which pulled the other packages back in as dependencies. Running opkg install luci-app-ddns required me to approve the install from Foris, and after I did, the result was an error notification:

    ##### Error notifications #####
    Updater failed: Failed operations:
    
    ddns-scripts/preinst: DIE:Failed to exec /usr/lib/opkg/info//ddns-scripts.preinst: Exec format error
    
    ##### Update notifications #####
     • Installed version 2.7.8-13.0 of package ddns-scripts
     • Installed version 2.4.9-7.2 of package luci-i18n-ddns-en
    

    ddns-scripts seemingly installed fine though; the preinst script only stops the ddns service in specific circumstances, so this was no issue since the package wasn’t installed at that point. I then had to restore my /etc/config/ddns file from my old installation, set the ddns service to autostart again, and reconfigure ddns to use a different network interface because the numbering had changed (e.g. eth0 to eth1). After all that, it’s working.

  5. Docs: It’d be helpful if the docs explained that I’d end up on HBK after starting from stable, and how to switch back to HBS. I’ve read all the comments in this thread and the 5.0 thread so I knew to expect this, but it’d be better if the docs mentioned it. I’ve created an issue and patch at https://gitlab.labs.nic.cz/turris/user-docs/issues/67
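
For item 2, the key rename can be sketched as a small shell helper. This is only a sketch under assumptions: fix_lxc_net_keys is a hypothetical name, the pattern covers only the two mis-migrated keys quoted above, and it is not the patch attached to the linked issue.

```shell
#!/bin/sh
# Sketch only: fix_lxc_net_keys is a hypothetical helper, not part of Turris OS.
# It rewrites the mis-migrated keys quoted above, e.g.
#   lxc.net.0.ipv4.ipv4 = ...  ->  lxc.net.0.ipv4.address = ...
fix_lxc_net_keys() {
    config="$1"
    # Keep a backup next to the original before editing in place.
    cp "$config" "$config.bak" || return 1
    # Match "lxc.net.<index>.ipv4.ipv4" or "lxc.net.<index>.ipv6.ipv6" at the
    # start of a line and replace the duplicated last component with "address".
    sed -i \
        -e 's/^\(lxc\.net\.[0-9][0-9]*\.ipv4\)\.ipv4\( *=\)/\1.address\2/' \
        -e 's/^\(lxc\.net\.[0-9][0-9]*\.ipv6\)\.ipv6\( *=\)/\1.address\2/' \
        "$config"
}
```

For the container from the error message this would be called as fix_lxc_net_keys /srv/lxc/con/config; running lxc-ls afterwards should show whether the config now parses.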

The last issue, which I haven’t sorted yet, is an old one of mine that has cropped up again: being able to configure /etc/resolv.conf to use a custom DNS server. I asked about this a long while ago in the topic “Set persistent "nameserver" entries in /etc/resolv.conf” and still had that config set up in 3.x. The config is still there in 5.x but it’s not working; /etc/resolv.conf defaults to using localhost as the resolver. Any thoughts on a fix for that?

@jada4p I have your logs. I am going to look into them when I have some free time. Thank you very much.

@davidjb
I answered most of the issues in the referenced merge requests (through the referenced issues). Just a few notes:

  1. If you find out why that is, we can add a fix script to the tos3to4 package to migrate it.
  2. luci-app-ddns was part of the netutils package list. This package list was dropped in favor of less generic package lists. More info can be seen here: https://gitlab.labs.nic.cz/turris/turris-os-packages/issues/344#note_103709
  3. You should not have landed in HBK. Turris OS 3.x deploy maps to HBS, rc to HBT and anything else to HBK. The cause was a bug in the migration script. (https://gitlab.labs.nic.cz/turris/turris-os-packages/-/merge_requests/392)
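
The intended branch mapping from the last note can be written down as a tiny shell function. It only restates the rule given in this reply; the function name and the lowercase branch names are assumptions for illustration, and the real migration script may look quite different.

```shell
#!/bin/sh
# Sketch only: map_tos3_branch is a hypothetical name.
# Rule from the reply above: deploy -> HBS, rc -> HBT, anything else -> HBK.
map_tos3_branch() {
    case "$1" in
        deploy) echo hbs ;;
        rc)     echo hbt ;;
        *)      echo hbk ;;
    esac
}
```

With this rule, a router that was on the 3.x stable (deploy) branch should end up on HBS, which is what the fix in the linked merge request is meant to restore.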

As for the last issue: the problem is a generic one in upstream. This file is part of base-files and is not marked as a configuration file. Feel free to contact upstream about this. In the meantime, running a resolver with forwarding configured should give the same result (except, of course, that you have to run the resolver on the router).

Ah, I understand now, thanks. I see the docs mention this change and I’d read that but I didn’t associate luci-app-ddns with the Netutils package list (or know what this was). Could the migration docs link to a full list of affected packages?

Eh, yeah, possibly. That list would have to be created. The only link I have is to the Lua code: https://repo.turris.cz/omnia/lists/netutils.lua. It would probably be better to create some diff list, but I am not sure how to generate that.

After reading through /etc/init.d/dnsmasq, I did find a solution to ensuring my LAN’s DNS server is in /etc/resolv.conf:

# Set an interface's DNS server (can be lan, wan, wan6 etc)
uci add_list network.lan.dns=192.168.1.253

# Restores /tmp/resolv.conf to a symlink to /tmp/resolv.conf.auto
/etc/init.d/dnsmasq stop
# Prevent dnsmasq replacing /tmp/resolv.conf again
# (localuse is a single-value option, so set it rather than appending to a list)
uci set dhcp.@dnsmasq[-1].localuse=0
uci commit
/etc/init.d/dnsmasq start

The result is that /etc/resolv.conf symlinks to /tmp/resolv.conf, which links to /tmp/resolv.conf.auto, which in turn is generated from the network interfaces’ custom DNS server(s). In the previous setup, dnsmasq was writing its own /tmp/resolv.conf that used localhost as its sole resolver.
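
To check that the symlink chain really looks like this, each hop can be printed with a short helper. This is a sketch, not something from the init script: print_symlink_chain is a hypothetical name and the hop limit of 10 is an arbitrary guard against loops.

```shell
#!/bin/sh
# Sketch only: print_symlink_chain is a hypothetical helper.
# Prints the given path and then every symlink target it leads to, one per line.
print_symlink_chain() {
    path="$1"
    hops=0
    echo "$path"
    while [ -L "$path" ] && [ "$hops" -lt 10 ]; do
        target=$(readlink "$path")
        case "$target" in
            /*) path="$target" ;;                     # absolute target
            *)  path="$(dirname "$path")/$target" ;;  # relative to the link's directory
        esac
        hops=$((hops + 1))
        echo "$path"
    done
}
```

If the setup matches the description above, print_symlink_chain /etc/resolv.conf on the router should list /etc/resolv.conf, /tmp/resolv.conf and /tmp/resolv.conf.auto in that order; if the chain stops at /tmp/resolv.conf, dnsmasq has probably replaced the symlink with its own file again.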

Well, after trying everything I could think of to fix my Turris 1.1 and failing, I’m trying this thread.

After 5.0.1 came out I tried to upgrade. I’m not sure whether I also needed to upgrade my 3.x to a higher version first, and I’m no longer sure which version I had installed before the upgrade. The upgrade ended with my router throwing errors when I tried to access Foris. I sent the error log to Turris support. After a few hours the router became unresponsive: I could not connect to it via Ethernet, and Wi-Fi was also unavailable. I didn’t have time to fix it and used a different router instead.

A reset didn’t help. I tried removing the SD card, with no effect. With the SD card removed I tried to reset it one last time, also with zero effect. So now what? What is the best way to fix this mess? I can connect to it via a serial connection, but I’m not sure what the best way to fix a bricked device would be.

Serial output:
Without sd

ible
in: 56
out: 0
io_config: 255
rom_loc: nor upper bank
SD/MMC : 4-bit Mode
eSPI : Enabled
I2C:   Error, wrong i2c adapter 2 max 2 possible
ready
SPI:   ready
DRAM:  Error, wrong i2c adapter 2 max 2 possible
Detected UDIMM 9905594-003.A00G
2 GiB (DDR3, 64-bit, CL=6, ECC off)
[    0.556043] pci 0002:04:00.0:   bridge window [mem 0x80000000-0x9fffffff]
[    0.562824] pci_bus 0002:04: Some PCI device resources are unassigned, try booting with pci=realloc
[    0.572151] mpic-msgr ffe41400.message: Found 0 message registers
[    0.578169] mpic-msgr ffe41400.message: Of-device full name /soc@ffe00000/message@41400
[    0.586172] mpic-msgr ffe41400.message: Failed to find message register block alias
[    0.616789] raid6: int32x1  gen()   171 MB/s
[    0.638175] raid6: int32x1  xor()   220 MB/s
[    0.659489] raid6: int32x2  gen()   300 MB/s
[    0.680867] raid6: int32x2  xor()   261 MB/s
[    0.723472] raid6: int32x4  xor()   283 MB/s
[    0.744833] raid6: int32x8  gen()   378 MB/s
[    0.766169] raid6: int32x8  xor()   253 MB/s
[    0.770354] raid6: using algorithm int32x4 gen() 398 MB/s
[    0.775743] raid6: .... xor() 283 MB/s, rmw enabled
[    0.780611] raid6: using intx1 recovery algorithm
[    0.786611] clocksource: Switched to clocksource timebase
[    0.792743] NET: Registered protocol family 2
[    0.797439] TCP established hash table entries: 4096 (order: 2, 16384 bytes)
[    0.804452] TCP bind hash table entries: 4096 (order: 3, 32768 bytes)
[    0.810887] TCP: Hash tables configured (established 4096 bind 4096)
[    0.817254] UDP hash table entries: 256 (order: 1, 8192 bytes)
[    0.823032] UDP-Lite hash table entries: 256 (order: 1, 8192 bytes)
[    0.829424] NET: Registered protocol family 1
[    0.842343] squashfs: version 4.0 (2009/01/31) Phillip Lougher
[    0.848282] jffs2: version 2.2 (NAND) (SUMMARY) (ZLIB) (LZO) (LZMA) (RTIME) (RUBIN) (CMODE_PRIORITY) (c) 2001-2006 Red Hat, Inc.
[    0.861082] bounce: pool size: 64 pages
[    0.864910] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 252)
[    0.872238] io scheduler noop registered
[    0.876144] io scheduler deadline registered (default)
[    0.881565] pcieport 0001:02:00.0: enabling device (0106 -> 0107)
[    0.887728] pcieport 0002:04:00.0: enabling device (0106 -> 0107)
[    0.894071] pcieport 0000:00:00.0: Signaling PME through PCIe PME interrupt
[    0.900963] pci 0000:01:00.0: Signaling PME through PCIe PME interrupt
[    0.907502] pcieport 0001:02:00.0: Signaling PME through PCIe PME interrupt
[    0.914434] pci 0001:03:00.0: Signaling PME through PCIe PME interrupt
[    0.920981] pcieport 0002:04:00.0: Signaling PME through PCIe PME interrupt
[    0.954233] Serial: [    1.276313] fsl-gianfar ffe24000.ethernet: enabled errata workarounds, flags: 0x4
[    1.294252] fsl-gianfar ffe24000.ethernet eth0: mac: 00:00:00:00:00:00
[    1.300796] fsl-gianfar ffe24000.ethernet eth0: Running with NAPI enabled
[    1.307596] fsl-gianfar ffe24000.ethernet eth0: RX BD ring size for Q[0]: 256
[    1.314742] fsl-gianfar ffe24000.ethernet eth0: TX BD ring size for Q[0]: 256
[    1.322174] fsl-gianfar ffe25000.ethernet: enabled errata workarounds, flags: 0x4
[    1.340047] fsl-gianfar ffe25000.ethernet eth1: mac: 00:00:00:00:00:00
[    1.346593] fsl-gianfar ffe25000.ethernet eth1: Running with NAPI enabled
[    1.353393] fsl-gianfar ffe25000.ethernet eth1: RX BD ring size for Q[0]: 256
[    1.360539] fsl-gianfar ffe25000.ethernet eth1: TX BD ring size for Q[0]: 256
[    1.367764] fsl-gianfar ffe26000.ethernet: enabled errata workarounds, flags: 0x4
[    1.385623] fsl-gianfar ffe26000.ethernet eth2: mac: 00:00:00:00:00:00
[    1.392173] fsl-gianfar ffe26000.ethernet eth2: Running with NAPI enabled
[    1.398973] fsl-gianfar ffe26000.ethernet eth2: RX BD ring size for Q[0]: 256
[    1.406119] fsl-gianfar ffe26000.ethernet eth2: TX BD ring size for Q[0]: 256
[    1.413356] ucc_geth_driver: QE UCC Gigabit Ethernet Controller
[    1.419482] i2c /dev entries driver
[    1.423129] mpc-i2c ffe03000.i2c: timeout 1000000 us
[    1.430572] rtc-ds1307 0-006f: rtc core: registered mcp7940x as rtc0
[    1.436956] rtc-ds1307 0-006f: 64 bytes nvram
[    1.441514] mpc-i2c ffe03100.i2c: timeout 1000000 us
[    1.446762] booke_wdt: powerpc book-e watchdog driver loaded
[    1.452618] sdhci: Secure Digital Host Controller Interface driver
[    1.458812] sdhci: Copyright(c) Pierre Ossman
[    1.463177] sdhci-pltfm: SDHCI platform and OF driver helper
[    1.469018] sdhci-esdhc ffe2e000.sdhc: No vmmc regulator found
[    1.474865] sdhci-esdhc ffe2e000.sdhc: No vqmmc regulator found
[    1.503111] mmc0: SDHCI controller on ffe2e000.sdhc/etc/preinit: exec: line 5: /sbin/init: not found
[    3.546168] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00007f00
[    3.546168]
[    3.555312] CPU: 0 PID: 1 Comm: sh Not tainted 4.4.199-f90a52a6230ecb072f657fce5aebd444-0 #1
[    3.563751] Call Trace:
[    3.566200] [da843e60] [c04ed1e0] dump_stack+0x84/0xb0 (unreliable)
[    3.572472] [da843e70] [c04e8d7c] panic+0xe0/0x21c
[    3.577267] [da843ed0] [c002e778] do_exit+0x430/0x84c
[    3.582318] [da843f10] [c002ec18] do_group_exit+0x48/0xac
[    3.587718] [da843f30] [c002ec90] __wake_up_parent+0x0/0x18
[    3.593294] [da843f40] [c000e18c] ret_from_syscall+0x0/0x3c
[    3.598867] --- interrupt: c01 at 0xb7b92214
[    3.598867]     LR = 0xb7be9648
[    3.606271] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00007f00
[    3.606271]

SD card:

ible
in: 56
out: 0
io_config: 255
rom_loc: nor upper bank
SD/MMC : 4-bit Mode
eSPI : Enabled
I2C:   Error, wrong i2c adapter 2 max 2 possible
ready
SPI:   ready
DRAM:  Error, wrong i2c adapter 2 max 2 possible
 = 59)
[    1.210273] libphy: Fixed MDIO Bus: probed
[    1.214580] libphy: Freescale PowerQUICC MII Bus: probed
[    1.222470] switch0: Atheros AR8337 rev. 2 switch registered on mdio@ffe24520
has been deprecated. Update your scripts to load br_netfilter if you need this.
[    1.531271] Bridge firewalling registered
[    1.535301] 8021q: 802.1Q VLAN Support v1.8
[    1.540338] BTRFS: using crc32c-generic for crc32c
[    1.546463] Btrst is w4/r0), UUID 2BB8079F-4C36-4826-89FF-B335CD044083, small LPT model
[    3.391235] VFS: Mounted root (ubifs filesystem) on device 0:13.
[    3.397475] Freeing unused kernel memory: 200K
/etc/preinit: exec: line 5: /sbin/init: not found
[

I must say that now, 3 weeks later, the trouble of moving from 3.17 (Kickstarter Omnia) to 5.0.x was worth it. I moved from HBT, or something even more wonky, to HBS (on both Omnia and Mox), so all is quiet and running. Thanks, pepe and gang!

I agree, though I still don’t have a working LXC. Is it worth an uninstall and reinstall of the LXC bit in LuCI, I wonder?

Edit: Just tried it: I removed the LXC packages and reinstalled them, but I still have the same issue with “A new login is required since the authentication session expired.” on the LXC page of LuCI.

Today’s attempt to migrate from 3.11.17 ended badly: an inaccessible Omnia and a division by zero in the kernel right after boot (which, however, did not stop the boot).

[    0.000000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.14.180 #0
[    0.000000] Hardware name: Marvell Armada 380/385 (Device Tree)
[    0.000000] [<c010feec>] (unwind_backtrace) from [<c010b35c>] (show_stack+0x10/0x14)
[    0.000000] [<c010b35c>] (show_stack) from [<c07bfc40>] (dump_stack+0x94/0xa8)
[    0.000000] [<c07bfc40>] (dump_stack) from [<c07be2cc>] (Ldiv0+0x8/0x10)
[    0.000000] [<c07be2cc>] (Ldiv0) from [<c0515ea0>] (clk_cpu_recalc_rate+0x28/0x2c)
[    0.000000] [<c0515ea0>] (clk_cpu_recalc_rate) from [<c0512298>] (clk_register+0x3f4/0x67c)
[    0.000000] [<c0512298>] (clk_register) from [<c0a1a4a8>] (of_cpu_clk_setup+0x16c/0x310)
[    0.000000] [<c0a1a4a8>] (of_cpu_clk_setup) from [<c0a19d08>] (of_clk_init+0x16c/0x214)
[    0.000000] [<c0a19d08>] (of_clk_init) from [<c0a0399c>] (time_init+0x24/0x2c)
[    0.000000] [<c0a0399c>] (time_init) from [<c0a00c48>] (start_kernel+0x35c/0x4c4)
[    0.000000] [<c0a00c48>] (start_kernel) from [<0000807c>] (0x807c)
[    0.000000] Division by zero in kernel.
[    0.000000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.14.180 #0
[    0.000000] Hardware name: Marvell Armada 380/385 (Device Tree)
[    0.000000] [<c010feec>] (unwind_backtrace) from [<c010b35c>] (show_stack+0x10/0x14)
[    0.000000] [<c010b35c>] (show_stack) from [<c07bfc40>] (dump_stack+0x94/0xa8)
[    0.000000] [<c07bfc40>] (dump_stack) from [<c07be2cc>] (Ldiv0+0x8/0x10)
[    0.000000] [<c07be2cc>] (Ldiv0) from [<c0515ea0>] (clk_cpu_recalc_rate+0x28/0x2c)
[    0.000000] [<c0515ea0>] (clk_cpu_recalc_rate) from [<c0512298>] (clk_register+0x3f4/0x67c)
[    0.000000] [<c0512298>] (clk_register) from [<c0a1a4a8>] (of_cpu_clk_setup+0x16c/0x310)
[    0.000000] [<c0a1a4a8>] (of_cpu_clk_setup) from [<c0a19d08>] (of_clk_init+0x16c/0x214)
[    0.000000] [<c0a19d08>] (of_clk_init) from [<c0a0399c>] (time_init+0x24/0x2c)
[    0.000000] [<c0a0399c>] (time_init) from [<c0a00c48>] (start_kernel+0x35c/0x4c4)
[    0.000000] [<c0a00c48>] (start_kernel) from [<0000807c>] (0x807c)
[    0.000000] Switching to timer-based delay loop, resolution 40ns

and then lots of

00:56:17 err kernel[]: [   25.407164] mvneta f1030000.ethernet eth0: bad rx status 0da10000 (crc error), size=99

It might be related to:

00:54:22 crit kernel[]: [2936830.031598] BTRFS critical (device mmcblk0p1): corrupt leaf, non-root leaf's nritems is 0: block=1947410432, root=409, slot=0
00:54:22 info kernel[]: [2936830.043127] BTRFS info (device mmcblk0p1): leaf 1947410432 total ptrs 0 free space 3995

right before reboot, but btrfs scrub shows no errors at all:

root@turris:~# btrfs scrub status /dev/mmcblk0p1
UUID:             2d640e77-8547-488d-9997-a30dba289df3
Scrub started:    Thu Jul  9 03:43:15 2020
Status:           finished
Duration:         0:00:50
Total to scrub:   1.67GiB
Rate:             34.27MiB/s
Error summary:    no errors found

I’m used to seeing these BTRFS errors from time to time, but I never noticed anything broken and there are discussions saying that these might be false alarms or kernel bugs…

Clean install of TOS 5 (via schnapps import) works well on this particular device.

I’m sending logs to support.

Just upgraded to 5.1.1 from 3.x.
The “Session expired” error on the LXC page was still shown because of a wrong line in my Pi-hole LXC config:
lxc.network.ipv4 = 192.168.2.2/24

After correcting this to the right format, the session bug is gone.
I thought I had read that the problem with static addresses was fixed.
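
Since a single leftover line is enough to break the LXC page, it may help to scan all container configs for keys of this kind before opening LuCI. The following is a sketch under assumptions: list_suspect_lxc_keys is a hypothetical name, the default /srv/lxc location is taken from the error message earlier in this thread, and the pattern only covers the two forms reported here (legacy lxc.network.* keys and the doubled .ipv4.ipv4 / .ipv6.ipv6 suffix).

```shell
#!/bin/sh
# Sketch only: list_suspect_lxc_keys is a hypothetical helper.
# Prints file:line:content for every config line that uses either
#   - a legacy "lxc.network.*" key, or
#   - a mis-migrated "lxc.net.<index>.ipvX.ipvX" key.
list_suspect_lxc_keys() {
    lxc_root="${1:-/srv/lxc}"
    grep -H -n -E \
        '^[[:space:]]*lxc\.(network\.|net\.[0-9]+\.ipv[46]\.ipv[46])' \
        "$lxc_root"/*/config 2>/dev/null
}
```

An empty result means no config matched these two patterns; it does not prove that every config is valid.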

I’ve updated my Omnia semi-successfully…

  • Reforis doesn’t work (see the topic “Can not use Reforis on TOS 5+ - ControllerMissing”). After installing all the reforis plugins, it works.
  • opkg doesn’t work. Fixed by running /usr/bin/update_alternatives.sh via ssh, as seen in the topic “Opkg can't update”.
  • openvpn and ddns weren’t running. I needed to install their packages because it seems the upgrade process gets rid of them… the previous configuration was still there though, so I didn’t need to reconfigure anything.
  • The /srv mountpoint was missing. I modified the /etc/config/fstab to fix it.

Edit to add the workarounds.

Is there anything specific about migrating to TOS 5 if the system is installed on an mSATA SSD?

No idea. I would say no but I have not tested it so I can’t guarantee that.

I think more people moved to the SSD only because they were forced to when the internal memory failed. I don’t want to be the first tester. The transition to an SSD isn’t easy anymore.

I was one of those forced onto an SSD due to internal memory failure. I used this script. It pretty much went to plan, except that my LXC containers were not migrated and I had to work out how to get them working again using the CLI (the LuCI LXC page would break and send me into a repeated login loop).

The Luci page now works but I don’t know if that was something that was fixed elsewhere or as a result of me now having a working LXC config…

Hope that helps.

Hello.
I’ve just updated my Turris Omnia from 3.11.20 to 5.1.3.

It looks like everything is fine except my wireguard connection: all configs are in place (double-checked it)

config interface 'VPS'
    option proto 'wireguard'
    option private_key '<mysupersecretkey>'
    option listen_port '443'
    list addresses '192.168.255.2'
    option delegate '0'
    option ifname 'VPS'

config wireguard_VPS
    list allowed_ips '0.0.0.0/1'
    list allowed_ips '128.0.0.0/1'
    option route_allowed_ips '1'
    option endpoint_port '443'
    option persistent_keepalive '25'
    option public_key 'hHSO4rSb6Bt6dgf5hQ8RytEoavyadnZaS3sqpFNAA2U='
    option endpoint_host '<mysupersecretip>'

but I get

Protocol:  WireGuard VPN
RX:  0 B (0 Pkts.)
TX:  0 B (0 Pkts.)
Error:  Unknown error (DEVICE_CLAIM_FAILED)

error on interface

How can it be fixed?

Hello @horseinthesky,

Thank you for testing the optional migration! We appreciate your feedback. Unfortunately, wireguard is not supported in the migration and we have not tested it yet. But we received the same report on our GitLab.
You might want to take a look as there is a solution for it:

We would appreciate it if you can confirm there that it works for you.

Thank you very much.
I checked my notes regarding configuring this feature and indeed this option wasn’t in my config before.
Removing it solved the issue.
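
For anyone who prefers not to edit /etc/config/network by hand, the same removal can be sketched with uci. This is a sketch under assumptions: drop_wg_ifname is a hypothetical helper name, it only touches interfaces whose proto is wireguard, and it assumes the standard OpenWrt uci and ifup commands are available on the router.

```shell
#!/bin/sh
# Sketch only: drop_wg_ifname is a hypothetical helper.
# Removes a leftover "option ifname" from a wireguard interface section
# and restarts that interface.
drop_wg_ifname() {
    iface="$1"
    # Refuse to touch sections that are not wireguard interfaces.
    [ "$(uci -q get "network.$iface.proto")" = "wireguard" ] || return 1
    # If the option is absent there is nothing to commit or restart.
    uci -q delete "network.$iface.ifname" || return 0
    uci commit network
    ifup "$iface"
}
```

For the config posted above this would be drop_wg_ifname VPS; the error shown in LuCI should then clear once the interface has come back up.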

4 posts were split to a new topic: Discussions about state of optional migration from Turris 3.x for advanced users