WAN (eth2) issue on turris v1 with 5.3.3

After upgrading to Turris OS 5, I am having issues with the eth2 (WAN) port. A few times per day my internet connection is interrupted, and I see messages like these in the kernel log:

[247392.032542] fsl-gianfar ffe26000.ethernet eth2: Link is Down
[247392.119413] IPv6: ADDRCONF(NETDEV_UP): eth2: link is not ready
[247401.312447] fsl-gianfar ffe26000.ethernet eth2: Link is Up - 1Gbps/Full - flow control off
[247401.312469] IPv6: ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready

and in the system log:

Jan 20 23:13:23 turris netifd: Network device 'eth2' link is down
Jan 21 00:13:23 turris kernel: [247392.032542] fsl-gianfar ffe26000.ethernet eth2: Link is Down
Jan 21 00:13:23 turris kernel: [247392.119413] IPv6: ADDRCONF(NETDEV_UP): eth2: link is not ready
Jan 20 23:13:32 turris netifd: Network device 'eth2' link is up
Jan 21 00:13:32 turris kernel: [247401.312447] fsl-gianfar ffe26000.ethernet eth2: Link is Up - 1Gbps/Full - flow control off
Jan 21 00:13:32 turris kernel: [247401.312469] IPv6: ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
Jan 20 23:13:32 turris netifd: Network alias 'eth2' link is up
Jan 20 23:13:32 turris hotplug-iface: USER=root ACTION=ifup SHLVL=2 HOME=/ HOTPLUG_TYPE=iface LOGNAME=root DEVICENAME= TERM=linux PATH=/usr/sbin:/usr/bin:/sbin:/bin INTERFACE=wan PWD=/ DEVICE=eth2
Jan 20 23:13:33 turris firewall: Reloading firewall due to ifup of wan (eth2)

This causes service interruptions that are visible on Zoom calls, etc. To rule out the ISP, I connected the ISP link through another 1 Gb switch placed between the ISP and the router's WAN port. On this switch I can see that the Turris link still fails a few times per day while the ISP link is always up. I also replaced the patch cords to make sure this is not a cabling issue; it did not help. The last thing I tried was switching off autonegotiation and forcing 1 Gb/full duplex. It did not help; moreover, at some point the interface went down and did not recover until I re-enabled autonegotiation. The interface was in DHCP mode; I have now temporarily switched it to static to see if that helps.

I am out of ideas about what to do next. Put the router in the trash can? Connect WAN to the LAN switch? Any other ideas? I would like to avoid moving back to 3.x, as that OS will be out of support very soon.


Similar thread (but with a 100 Mb link): Turris 1.x eth2 driver problem

Now upgraded to TOS 5.3.4 (kernel 4.14.262); I will wait a few days to see if the issue still occurs.
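
While waiting, link flaps can be counted without reading through the logs. Below is a minimal sketch, assuming the kernel exposes the standard sysfs counter /sys/class/net/eth2/carrier_changes, which increments on every up or down transition. The function names are hypothetical and not from this thread.

```python
from pathlib import Path


def read_carrier_changes(iface="eth2", sysfs_root="/sys/class/net"):
    """Return the kernel's count of carrier (link) state changes for iface."""
    counter = Path(sysfs_root) / iface / "carrier_changes"
    return int(counter.read_text().strip())


def flaps_since(baseline, current):
    """Estimate full down/up cycles from two counter readings.

    Each flap is one Down plus one Up transition, so two counter steps.
    """
    return (current - baseline) // 2
```

Take one reading after boot as the baseline and compare it with a later reading; any positive result from flaps_since() means the link dropped in between.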


I look forward to your feedback!


So far it looks good, 0 events on eth2. I will wait a few more days, as in the past it happened mostly during times of active internet usage, and will update this thread. But keeping my fingers crossed that it was a kernel issue and that it is already fixed in TOS 5.3.4 :slight_smile:

Update: nope, same symptoms :frowning:

It's not fixed :frowning:

[   18.937038] IPv6: ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
[83388.775386] fsl-gianfar ffe26000.ethernet eth2: Link is Down
[83394.919397] fsl-gianfar ffe26000.ethernet eth2: Link is Up - 1Gbps/Full - flow control off
[83396.967337] fsl-gianfar ffe26000.ethernet eth2: Link is Down
[83397.993248] fsl-gianfar ffe26000.ethernet eth2: Link is Up - 1Gbps/Full - flow control off

So already 2 up/down events :frowning: I will also try upgrading to TOS 6 to see if the 5.x kernel changes anything here.
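
To make such events easier to count and compare across releases, the kernel log can be parsed for Down/Up pairs. This is only a sketch: it assumes the gianfar message format shown in this thread, and the helper name link_outages is made up for illustration.

```python
import re

# Matches kernel log lines such as:
# [83388.775386] fsl-gianfar ffe26000.ethernet eth2: Link is Down
LINK_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s.*\b(\w+): Link is (Up|Down)")


def link_outages(lines, iface="eth2"):
    """Return (down_timestamp, duration_seconds) for each Down -> Up pair."""
    outages = []
    down_at = None
    for line in lines:
        m = LINK_RE.search(line)
        if not m or m.group(2) != iface:
            continue
        ts, state = float(m.group(1)), m.group(3)
        if state == "Down":
            down_at = ts
        elif down_at is not None:
            # An Up that follows a Down closes the outage.
            outages.append((down_at, round(ts - down_at, 6)))
            down_at = None
    return outages
```

Fed the four "Link is" lines from the log above, this reports two outages of about 6.1 s and 1.0 s. The initial link-up at boot is ignored because no Down precedes it.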

Be sure to try TOS 6.
In TOS 5 it helped me to use port lan1 for WAN and remap it in /etc/config/network.
However, performance dropped.


Okay, upgraded to TOS 6 (Linux turris 5.4.171); at least the device boots fine and seems to work. I saw many patches to the gianfar driver in recent kernels, and some of them seem relevant (handling of buffer overrun). I hope that helps. Will report in a few days.

It seems that Turris OS 6.x fixed the issue!

[    1.637086] fsl-gianfar ffe26000.ethernet eth2: mac: 00:00:00:00:00:00
[    1.643628] fsl-gianfar ffe26000.ethernet eth2: Running with NAPI enabled
[    1.650423] fsl-gianfar ffe26000.ethernet eth2: RX BD ring size for Q[0]: 256
[    1.657567] fsl-gianfar ffe26000.ethernet eth2: TX BD ring size for Q[0]: 256
[   20.080955] fsl-gianfar ffe26000.ethernet eth2: Link is Up - 1Gbps/Full - flow control off
[   20.081910] IPv6: ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready

No issues for a few days already! @Pepe, do you have any plans for the Turris OS 6 release? Or at least to backport the gianfar fixes to TOS 5? There are not that many of them. Right now Turris 1.1 + TOS 5 is a very problematic combo…

That’s interesting. We will need to take a look at what is going on in your case and be able to reproduce it on our end. Then the question is whether those fixes can be easily backported or not.

I think I do, but nothing promising, yet. :slight_smile: Still I am not good at giving some ETAs, etc.


I assume it could be the buffer underrun patch in the recent commits. I have a combination of 1 Gb and a few 100 Mb links, so this condition is expected to happen. I am happy to do testing for the Turris project, as on my side the issue reproduces within 1-2 days. So ping me if I can help :slight_smile:

@Pepe, the user from the Turris 1.x eth2 driver problem thread also had the same issue and actually the same solution, so I assume it affects every 1.x router. But since the issue heals itself within a few seconds, you typically don't pay attention to it (or you blame the ISP). I only started to debug it once I was in quarantine and had a lot of Zoom meetings from home :slight_smile:

Anyway, thank you for all your work; marking the problem as solved. As a bonus, with TOS 6 DSA VLANs are now working as expected.

Gitlab report: WAN (eth2) issue on turris v1 with 5.x (#323) · Issues · Turris / Turris OS / Turris Build · GitLab
