Omnia does not boot (data abort)

Hi!

Last night my Omnia restarted for a software update to TurrisOS 5.1.5 but didn’t come back up. It keeps rebooting and I can’t figure out why.

High speed PHY - Version: 2.0
SERDES0 card detect: SATA

Initialize Turris board topology
Detected Device ID 6820
board SerDes lanes topology details:
 | Lane #  | Speed |  Type       |
 --------------------------------
 |   0    |  6   |  SATA0       |
 |   1    |  5   |  USB3 HOST0  |
 |   2    |  5   |  PCIe1       |
 |   3    |  5   |  USB3 HOST1  |
 |   4    |  5   |  PCIe2       |
 |   5    |  0   |  SGMII2      |
 --------------------------------
poll_op_execute: TIMEOUT
:** Link is Gen1, check the EP capability
PCIe, Idx 1: remains Gen1
:** Link is Gen1, check the EP capability
PCIe, Idx 2: remains Gen1
High speed PHY - Ended Successfully
DDR3 Training Sequence - Ver TIP-1.29.0
Memory config in EEPROM: 0x02
DDR3 Training Sequence - Switching XBAR Window to FastPath Window
DDR3 Training Sequence - Ended Successfully


U-Boot 2015.10-rc2 (Aug 18 2016 - 20:43:35 +0200), Build: jenkins-omnia-master-23

SoC:   MV88F6820-A0
       Watchdog enabled
I2C:   ready
SPI:   ready
DRAM:  2 GiB (ECC not enabled)
Enabling Armada 385 watchdog.
Disabling MCU startup watchdog.
Regdomain set to **
MMC:   mv_sdh: 0
SF: Detected S25FL164K with page size 256 Bytes, erase size 64 KiB, total 8 MiB
PCI:
  00:01.0     - 168c:003c - Network controller
PCI:
  01:00.0     - 11ab:6820 - Memory controller
  01:01.0     - 168c:002e - Network controller
Model: Marvell Armada 385 GP
Board: Turris Omnia SN: 0000000B000128C9
Regdomain set to **
SCSI:  MVEBU SATA INIT
Target spinup took 0 ms.
SATA link 1 timeout.
AHCI 0001.0000 32 slots 2 ports 6 Gbps 0x3 impl SATA mode
flags: 64bit ncq led only pmp fbss pio slum part sxs
Net:   neta2
Hit any key to stop autoboot:  0
Setting bus to 1
BOOT eMMC FS
btrfs probe failed
** Unrecognized filesystem type **
switch to partitions #0, OK
mmc0(part 0) is current device
** No partition table - mmc 0 **
scanning bus for devices...
  Device 0: (0:0) Vendor: ATA Prod.: Samsung SSD 860 Rev: RVT4
            Type: Hard Disk
            Capacity: 238475.1 MB = 232.8 GB (488397168 x 512)
Found 1 device(s).

SCSI device 0:
    Device 0: (0:0) Vendor: ATA Prod.: Samsung SSD 860 Rev: RVT4
            Type: Hard Disk
            Capacity: 238475.1 MB = 232.8 GB (488397168 x 512)
... is now current device
Failed to mount ext2 filesystem...
BtrFS init: Max node or leaf size 16384
Scanning scsi 0:1...
Failed to mount ext2 filesystem...
BtrFS init: Max node or leaf size 16384
data abort
pc : [<00000408>]          lr : [<7ff8b6f0>]
reloc pc : [<808a3408>]    lr : [<0082e6f0>]
sp : 7fb46f10  ip : 00000004     fp : 7ff9f9c8
r10: 00000004  r9 : 7fb4ced8     r8 : ffffffff
r7 : 00000004  r6 : 00000000     r5 : 00000000  r4 : 7fbf9088
r3 : 00000000  r2 : 600001d3     r1 : 7ffb177c  r0 : 600001d3
Flags: nZCv  IRQs off  FIQs off  Mode SVC_32
Resetting CPU ...

resetting ...

My eMMC has been broken for a long time, so the router is expected to boot from the SSD.

I can boot into the rescue shell (the 7-LED reboot) and access my SSD partition from there. It seems fine: my files are in place and btrfsck succeeds.

The software update itself looks innocent at first, but it includes kernel and uboot-envtools. What do they do in their post-upgrade steps?
I tried rolling back to an old btrfs snapshot, but that didn’t solve the issue.

One more thing to note: I might have had a full /tmp partition during the upgrade (I’m not sure). Could that be a problem?
What else can I check apart from btrfsck?
What is “data abort”?
By the way, the message is sometimes “undefined instruction” instead:

    Device 0: (0:0) Vendor: ATA Prod.: Samsung SSD 860 Rev: RVT4
            Type: Hard Disk
            Capacity: 238475.1 MB = 232.8 GB (488397168 x 512)
... is now current device
Failed to mount ext2 filesystem...
BtrFS init: Max node or leaf size 16384
Scanning scsi 0:1...
Failed to mount ext2 filesystem...
BtrFS init: Max node or leaf size 16384
undefined instruction
pc : [<0000006c>]          lr : [<7ff8b6f0>]
reloc pc : [<808a306c>]    lr : [<0082e6f0>]
sp : 7fb46f10  ip : 00000004     fp : 7ff9f9c8
r10: 00000004  r9 : 7fb4ced8     r8 : ffffffff
r7 : 52ffff94  r6 : 00000000     r5 : 600001d3  r4 : ffffe002
r3 : 600001d3  r2 : 600001d3     r1 : 7ffb177c  r0 : 600001d3
Flags: nzCv  IRQs off  FIQs off  Mode SVC_32
Resetting CPU ...

resetting ...
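
To compare the two dumps side by side, here is a small Python sketch that pulls the exception name, pc and lr out of a pasted U-Boot crash dump. The function name `parse_crash` and the list of exception strings are my own choices for illustration; the list only covers the two messages seen above, and the "reloc" line is skipped on purpose.

```python
import re

# Matches "pc : [<hex>]" and "lr : [<hex>]" as printed in the register dump.
REG_RE = re.compile(r"\b(pc|lr)\s*:\s*\[<([0-9a-fA-F]{8})>\]")

# Only the two exception messages that appear in this thread (assumption:
# other exception names are not handled by this sketch).
EXCEPTIONS = ("data abort", "undefined instruction")


def parse_crash(log: str) -> dict:
    """Return the exception name plus pc and lr (as ints) from a crash dump."""
    result = {"exception": None, "pc": None, "lr": None}
    for line in log.splitlines():
        text = line.strip()
        if text in EXCEPTIONS:
            result["exception"] = text
        elif text.startswith("pc :"):
            # Lines starting with "reloc pc" never reach this branch.
            for name, value in REG_RE.findall(text):
                result[name] = int(value, 16)
    return result


if __name__ == "__main__":
    sample = (
        "data abort\n"
        "pc : [<00000408>]          lr : [<7ff8b6f0>]\n"
        "reloc pc : [<808a3408>]    lr : [<0082e6f0>]\n"
    )
    info = parse_crash(sample)
    print(info["exception"], hex(info["pc"]), hex(info["lr"]))
```

Run on the two dumps in this post, it gives the same lr (7ff8b6f0) both times, with different pc values (00000408 and 0000006c).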

Does anyone have any ideas?

One more thing to note: my SSD has two partitions, /dev/sda1 with the BtrFS rootfs and /dev/sda2 with LUKS. Could U-Boot be trying to boot from the LUKS partition instead?

Thanks in advance!

Unfortunately, in the HBK and HBT branches the U-Boot environment was erased on routers that boot from something other than eMMC. This happened due to upstream changes. :frowning:
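
If you want to confirm that the environment really is gone, a rough Python sketch like the one below can classify a raw dump of the environment area. It rests on several assumptions that are not stated in this thread: that you already have the environment as a file or bytes object (the storage location and size are not given here), that it uses the plain non-redundant U-Boot layout (a 4-byte little-endian CRC32 followed by the data), and that erased flash reads back as 0xFF. The names `classify_env` and `make_env` are hypothetical helpers, not part of any tool.

```python
import struct
import zlib


def classify_env(blob: bytes) -> str:
    """Classify a raw env dump as 'erased', 'valid' or 'corrupt'.

    Assumes the non-redundant layout: CRC32 (little-endian) + data.
    """
    if len(blob) <= 4:
        raise ValueError("blob too short to hold a CRC and data")
    # Assumption: erased flash reads back as all 0xFF bytes.
    if all(b == 0xFF for b in blob):
        return "erased"
    (stored_crc,) = struct.unpack_from("<I", blob, 0)
    computed_crc = zlib.crc32(blob[4:]) & 0xFFFFFFFF
    return "valid" if stored_crc == computed_crc else "corrupt"


def make_env(pairs: dict, size: int) -> bytes:
    """Build a test env image of `size` bytes (helper for trying this out)."""
    data = b"".join(f"{k}={v}".encode() + b"\0" for k, v in pairs.items())
    data = (data + b"\0").ljust(size - 4, b"\0")
    return struct.pack("<I", zlib.crc32(data) & 0xFFFFFFFF) + data


if __name__ == "__main__":
    print(classify_env(make_env({"bootdelay": "3"}, 64)))
    print(classify_env(b"\xff" * 64))
```

A redundant environment has an extra flags byte after the CRC, so this sketch would report it as corrupt; treat the result as a hint only.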

Will check that, thanks!

Works great after the U-Boot environment update, thanks a lot!