Turris unreachable (ping, SSH, routing) sporadically

Hi there,

I’m a fairly new user of a Turris Omnia. It works great except for one regularly occurring hiccup.

It sporadically becomes unreachable for a couple of minutes. During the outage it doesn’t respond to ping and doesn’t allow login via SSH or the web interface (DNS, DHCP, routing etc. are down as well).

When it happens, it always comes back up after a couple of minutes and everything is back to normal.

Once I log in to the system again, I can see the following in dmesg:
[188828.523505] br-lan: received packet on lan3 with own address as source address (addr:<my_mac>, vlan:0)

(sometimes multiple entries)
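To see whether the warning always names the same port, the kernel log can be tallied per port. The following is only a sketch: the helper name `own_mac_ports` is made up for illustration, and it assumes the message wording shown above.

```shell
#!/bin/sh
# Hypothetical helper: reads kernel log text on stdin and prints
# "<port> <count>" for every bridge port named in an
# "own address as source address" warning.
own_mac_ports() {
    awk '/with own address as source address/ {
             # The port name is the word following "on".
             for (i = 1; i < NF; i++)
                 if ($i == "on") { n[$(i + 1)]++; break }
         }
         END { for (p in n) print p, n[p] }' | sort
}
```

On the router it would be fed from the kernel log, e.g. `dmesg | own_mac_ports`. If all hits are on one port, the device or cable behind that port is the first place to look.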

I’ve tried to diagnose it with tcpdump, capturing all packets that have the Turris MAC as the source MAC but a source IP other than the Turris IP, i.e. some other IP from the local network. I even tried to be extra clever and ran it in tmux, but … nothing! (which makes it extra strange)
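One possible reason such a capture stays empty is the IP condition: a frame that comes back with the router's MAC may be ARP, IPv6 or some other non-IPv4 traffic. A sketch of a capture that matches on the source MAC alone and only looks at received frames is below. The function names are made up for illustration, the MAC and port are placeholders, and it assumes the installed tcpdump supports the `-Q` direction option on that port.

```shell
#!/bin/sh
# Build a pcap filter matching every frame whose Ethernet source is the
# given MAC, with no IP condition, so non-IPv4 frames match as well.
own_mac_filter() {
    printf 'ether src %s' "$1"
}

# Print (not run) a tcpdump command line for one port.
#   -e     print link-level headers, so the MAC addresses are visible
#   -n     do not resolve names
#   -Q in  only frames received on the interface, not the router's own output
# $1 = interface (placeholder, e.g. lan3), $2 = the router's MAC address
capture_cmd() {
    printf 'tcpdump -e -n -i %s -Q in %s\n' "$1" "$(own_mac_filter "$2")"
}
```

Calling `capture_cmd lan3 <my_mac>` prints the command to run inside tmux. Capturing on the individual port named in dmesg rather than on br-lan keeps the router's own outgoing frames out of the picture.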

Any ideas on how to get closer to the root of the problem?

Thanks a ton!

Edit: Turris OS 5.2.2, but the problem was there with 4.x

This message says that you have a loop in your network. Check your cables and make sure you didn’t connect one switch to another with additional cables.

Well, there is no loop. But regardless of the cause, the question is how to debug the origin of the Ethernet frames coming in with the Turris MAC address:

The kernel log shows exactly one message (if there were multiple looped frames, they should have been logged by the kernel as well) and the Turris is in “lockdown” after that. But log-wise there is no indication as to what’s happening.

So, after looking through possible causes I’m still wondering what’s triggering this: Any ideas on what to look into here, i.e. what Turris or OpenWRT might specifically do here to cause this?

You could try disabling or enabling STP (Spanning Tree Protocol) on the bridge to see whether the problem persists. In general, when a packet with the bridge’s own MAC arrives on a port, that port may be put into the disabled state because of STP, and that might be the case in your example. That would result in the router not responding until the port returns to the forwarding state. Please post your network config.
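To check whether a port really leaves the forwarding state during an outage, the bridge port states can be read from sysfs and logged. This is a rough sketch: the function names are invented for illustration, the bridge name `br-lan` is taken from the dmesg line above, and the `uci` lines in the comments assume the LAN bridge section in /etc/config/network is called `lan`.

```shell
#!/bin/sh
# Map the numeric bridge port state found in sysfs to its name.
port_state_name() {
    case "$1" in
        0) echo disabled ;;
        1) echo listening ;;
        2) echo learning ;;
        3) echo forwarding ;;
        4) echo blocking ;;
        *) echo unknown ;;
    esac
}

# Print "<port> <state>" for every port of a bridge.
# $1 = bridge name, $2 = sysfs root (optional, defaults to /sys/class/net)
bridge_port_states() {
    br=$1
    root=${2:-/sys/class/net}
    for f in "$root/$br"/brif/*/state; do
        [ -r "$f" ] || continue
        port=$(basename "$(dirname "$f")")
        echo "$port $(port_state_name "$(cat "$f")")"
    done
}

# Toggling STP (assumption: the bridge section is named "lan"):
#   uci set network.lan.stp='1'    # or '0' to disable
#   uci commit network
#   /etc/init.d/network restart
```

Something like `while true; do date; bridge_port_states br-lan; sleep 5; done >> /tmp/stp.log`, left running in tmux, would show after the next outage whether a port went to blocking or disabled at that time.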