DDR3 Training Sequence - FAILED

I rebooted my Omnia with a serial console connected and saw this:

root@turris:/# reboot
root@turris:/# Stopping router Turris.
[ 287.391819] reboot: Restarting system

U-Boot SPL 2015.10-rc2 (Aug 18 2016 - 20:43:35)
High speed PHY - Version: 2.0
SERDES0 card detect: PEX

Initialize Turris board topology
Detected Device ID 6820
board SerDes lanes topology details:
| Lane # | Speed | Type |

| 0 | 5 | PCIe0 |
| 1 | 5 | USB3 HOST0 |
| 2 | 5 | PCIe1 |
| 3 | 5 | USB3 HOST1 |
| 4 | 5 | PCIe2 |
| 5 | 0 | SGMII2 |

:** Link is Gen1, check the EP capability
PCIe, Idx 0: remains Gen1
:** Link is Gen1, check the EP capability
PCIe, Idx 1: remains Gen1
PCIe, Idx 2: detected no link
High speed PHY - Ended Successfully
DDR3 Training Sequence - Ver TIP-1.29.0
Memory config in EEPROM: 0x01
ddr3_tip_pbs_rx failure CS #0
Title: I/F# , Tj, Calibration_n0, Calibration_p0, Calibration_n1, Calibration_p1, Calibration_n2, Calibration_p2,CS0 ,
VWTx, VWRx, WL_tot, WL_ADLL, WL_PH, RL_Tot, RL_ADLL, RL_PH, RL_Smp, Cen_tx, Cen_rx, Vref, DQVref, PBSTx-Pad0,PBSTx-Pad1,PBSTx,
Data: 0,63,19,14,19,15,20,20,CS0 ,
0,0,10,10,0,422,6,1,6,23,10,4,0, 63,63,63,63,31,31,63,63,63,63,63, 0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,
0,0,6,6,0,422,6,1,6,19,10,4,0, 63,63,63,63,31,31,63,63,63,63,63, 0,0,0,0,0,0,0,0,0,0,0, 0,2,2,9,0,0,3,7,9,7,
0,0,7,7,0,422,6,1,6,20,10,4,0, 63,63,63,63,31,31,63,63,63,63,63, 0,0,0,0,0,0,0,0,0,0,0, 3,0,0,6,0,0,2,2,6,5,
0,0,10,10,0,422,6,1,6,23,10,4,0, 63,63,63,63,31,31,63,63,63,63,63, 0,0,0,0,0,0,0,0,0,0,0, 2,2,0,2,0,0,

Run_alg: tuning failed 0
DDR3 run algorithm - FAILED 0x1
DDR3 Training Sequence - FAILED

After that, the boot halted and I had to power-cycle the router to recover. Luckily I had the serial console connected; otherwise the unresponsive hang would have been a lot more confusing.

Has anyone else seen this, or is it a known issue that the devs have run into? Or should I be concerned about my hardware itself?

If I understand correctly, the system was trying to perform DDR3 leveling to compensate for varying delays in signal propagation caused by the different paths to each RAM bank/chip, and this operation failed at the step the log reports as ddr3_tip_pbs_rx on CS #0.
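To illustrate what I mean by training, here is a toy Python sketch of a delay sweep. It is only my mental model, not the Marvell TIP code that U-Boot SPL actually runs: the function name `train_lane`, the 64-tap range and the `passes` callback are all made up for illustration. The idea is to try every delay setting, find the widest run of settings where a read/write test passes, and pick the middle of that window. If no setting passes, training fails.

```python
def train_lane(passes, taps=64):
    """Toy delay sweep (hypothetical, not the Marvell TIP algorithm).

    passes(tap) -> bool says whether a test pattern survives at that delay tap.
    Returns the centre of the widest passing window, or None if nothing passes.
    """
    best_start, best_len = None, 0
    start = None
    # Iterate one step past the last tap so an open window is always closed.
    for tap in range(taps + 1):
        ok = tap < taps and passes(tap)
        if ok and start is None:
            start = tap                      # a passing window opens here
        elif not ok and start is not None:
            if tap - start > best_len:       # keep the widest window seen so far
                best_start, best_len = start, tap - start
            start = None                     # the window is closed
    if best_start is None:
        return None                          # no working delay: training failed
    return best_start + best_len // 2


# A lane that only works for taps 20..30 is centred at tap 25.
print(train_lane(lambda tap: 20 <= tap <= 30))   # 25
# A lane with no working tap at all cannot be trained.
print(train_lane(lambda tap: False))             # None
```

Picking the centre of the window rather than its first passing tap leaves margin on both sides, which is my guess as to why a marginal board could pass on some boots and fail on others.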

I’ve had time to play with my Omnia a bit more over the holidays and have now seen this a couple more times. Am I still the only one seeing this? I left it alone for longer this time and noticed that what I assume is the watchdog on the MCU kicked in and reset the board after a minute or two.

DDR3 Training Sequence - Ver TIP-1.29.0
Memory config in EEPROM: 0x01
ddr3_tip_pbs_rx failure CS #0
Title: I/F# , Tj, Calibration_n0, Calibration_p0, Calibration_n1, Calibration_p1, Calibration_n2, Calibration_p2,CS0 ,
VWTx, VWRx, WL_tot, WL_ADLL, WL_PH, RL_Tot, RL_ADLL, RL_PH, RL_Smp, Cen_tx, Cen_rx, Vref, DQVref, PBSTx-Pad0,PBSTx-Pad1,PBSTx-Pad2,PBSTx-Pad3,PBSTx-Pad4,PBSTx-Pad5,PBSTx-Pad6,PBSTx-Pad7,PBSTx-Pad8,PBSTx-Pad9,PBSTx-Pad10, PBSRx-Pad0,PBSRx-Pad1,PBSRx-Pad2,PBSRx-Pad3,PBSRx-Pad4,PBSRx-Pad5,PBSRx-Pad6,PBSRx-Pad7,PBSRx-Pad8,PBSRx-Pad9,PBSRx-Pad10,
Data: 0,69,18,14,18,15,20,20,CS0 ,
0,0,10,10,0,422,6,1,6,23,10,4,0, 63,63,63,63,31,31,63,63,63,63,63, 0,0,0,0,0,0,0,0,0,0,0, 1,1,0,0,0,0,0,0,0,0,0,
0,0,6,6,0,422,6,1,6,19,10,4,0, 63,63,63,63,31,31,63,63,63,63,63, 0,0,0,0,0,0,0,0,0,0,0, 0,0,0,9,0,0,1,6,8,6,0,
0,0,7,7,0,422,6,1,6,20,10,4,0, 63,63,63,63,31,31,63,63,63,63,63, 0,0,0,0,0,0,0,0,0,0,0, 3,0,1,9,0,0,1,3,9,6,0,
0,0,10,10,0,422,6,1,6,23,10,4,0, 63,63,63,63,31,31,63,63,63,63,63, 0,0,0,0,0,0,0,0,0,0,0, 4,4,0,4,0,0,4,2,5,4,0,

Run_alg: tuning failed 0
DDR3 run algorithm - FAILED 0x1
DDR3 Training Sequence - FAILED
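
In case it helps anyone compare notes, here is a minimal Python sketch for counting how often this happens in a saved serial capture. It relies only on the marker strings that appear in the output above; the function name `summarize_training` and the file name `console.log` are my own placeholders. I have not captured what a successful run prints, so the script only flags attempts that end with the FAILED line. An attempt without that line is not proof of success, since the capture may simply have been cut short.

```python
import re

# Marker strings copied from the serial output in this thread.
START = "DDR3 Training Sequence - Ver"
FAILED = "DDR3 Training Sequence - FAILED"
# Matches lines such as "ddr3_tip_pbs_rx failure CS #0".
STEP = re.compile(r"^(ddr3_tip_\w+) failure CS #(\d+)")


def summarize_training(log_text):
    """Return one dict per DDR3 training attempt found in a console capture."""
    attempts = []
    current = None
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith(START):
            # A new training attempt begins with the version banner.
            current = {"failed": False, "step": None, "cs": None}
            attempts.append(current)
        elif current is not None:
            match = STEP.match(line)
            if match:
                # Remember which step reported the failure and on which chip select.
                current["step"] = match.group(1)
                current["cs"] = int(match.group(2))
            elif line.startswith(FAILED):
                current["failed"] = True
    return attempts


if __name__ == "__main__":
    # "console.log" is a placeholder for wherever the serial output was saved.
    with open("console.log", encoding="utf-8", errors="replace") as handle:
        results = summarize_training(handle.read())
    failures = [attempt for attempt in results if attempt["failed"]]
    print(f"{len(failures)} of {len(results)} training attempts failed")
    for attempt in failures:
        print(f"  step={attempt['step']} cs={attempt['cs']}")
```

Leaving the console logged across many reboots and running this over the capture would give a rough failure rate to include in a support request.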

You could try opening a support case by emailing tech.support@turris.cz