Ports went offline overnight (entered disabled state)

Hi,

I’ve got a Turris Mox A with 2 Mox E attached.
VERSION="4.0.5"

It has been running for some weeks without problems.
This morning some ports were offline, i.e. a PC connected to one port had internet access, while a PC on another port did not.

dmesg shows a lot of errors like:

[179958.049705] br-guest: port 1(lan7) entered disabled state
[179978.260303] mv88e6085 d0032004.mdio-mii:10 lan7: Link is Up - 100Mbps/Full - flow control rx/tx
[179978.269261] br-guest: port 1(lan7) entered blocking state
[179978.274840] br-guest: port 1(lan7) entered forwarding state
[180437.069785] mv88e6085 d0032004.mdio-mii:10 lan7: Link is Down
[180437.077177] br-guest: port 1(lan7) entered disabled state
[180457.498164] mv88e6085 d0032004.mdio-mii:10 lan7: Link is Up - 100Mbps/Full - flow control rx/tx
[180457.508073] br-guest: port 1(lan7) entered blocking state
[180457.513929] br-guest: port 1(lan7) entered forwarding state
[191132.365846] mv88e6085 d0032004.mdio-mii:10 lan7: Link is Down
[191132.372840] br-guest: port 1(lan7) entered disabled state
[191152.626765] mv88e6085 d0032004.mdio-mii:10 lan7: Link is Up - 100Mbps/Full - flow control rx/tx
[191152.635946] br-guest: port 1(lan7) entered blocking state
[191152.641705] br-guest: port 1(lan7) entered forwarding state
[192560.000986] mv88e6085 d0032004.mdio-mii:10 lan7: Link is Down
[192560.010721] br-guest: port 1(lan7) entered disabled state

logread shows

May 26 22:12:02 turris syslog-ng[21778]: syslog-ng starting up; version='3.21.1'
May 26 22:14:15 turris dnsmasq-dhcp[10780]: DHCPDISCOVER(br-wlan) xx:xx:xx:xx:xx:xx
May 26 22:14:15 turris dnsmasq-dhcp[10780]: DHCPOFFER(br-wlan) 192.168.80.28 xx:xx:xx:xx:xx:xx
May 26 22:14:15 turris dnsmasq-dhcp[10780]: DHCPREQUEST(br-wlan) 192.168.80.28 xx:xx:xx:xx:xx:xx
May 26 22:14:15 turris dnsmasq-dhcp[10780]: DHCPACK(br-wlan) 192.168.80.28 xx:xx:xx:xx:xx:xx Galaxy
May 26 20:14:15 turris 99-dhcp_host_domain_ng.py: Add_lease, hostname check failed
May 26 20:14:15 turris 99-dhcp_host_domain_ng.py: DHCP unknown update operation [update,LottesGalaxy]
May 26 20:14:16 turris kresd[30734]: > hints.del('printer.lan')
May 26 20:14:16 turris kresd[30734]: [result] => true
May 26 20:14:16 turris kresd[30734]:
May 26 20:14:16 turris kresd[30734]: > hints.del('tpswitch.lan')
May 26 20:14:16 turris kresd[30734]: [result] => true
May 26 20:14:16 turris kresd[30734]:
May 26 20:14:16 turris kresd[30734]: > hints.del('sg200.lan')
May 26 20:14:16 turris kresd[30734]: [result] => true

Since I need internet for work I had to reboot the router.
After the reboot everything seems to work normally.

Does anybody know what is happening?

Thanks.

Peter

Just a wild guess: I think some process caused the services related to iptables/firewall/network to reload or restart. If you participate in the ludus/haas/sentinel projects, it may be that the firewall rules were updated, which restarted the network interfaces, followed by a new lease from DHCP.

I don’t know what the ludus/haas/sentinel projects are, so I assume that I don’t participate in them :)
The point was not that the interfaces restarted, but that they went off-line.

Peter

Some firewall rules are updated on a regular basis via /etc/cron.d/fw-rules. I would check the events before the block you posted; maybe there is a clue. I grepped my logs and did not find any entry with a disabled/blocking state (3 days back). I will check the older ones, but you are right that blocking/disabled is not a usual state.

I don’t have a file /etc/cron.d/fw-rules

BTW: how do you keep logs for 3 days? My /temp/log/messages has 15000 lines, with 70 entries from cron and the rest dnsmasq and kresd messages. Even this morning’s restart is no longer included, since the log seems to be shortened automatically.

Peter

I’ve edited /etc/logrotate.conf (to handle messages) and the /etc/logrotate.d/* files (to handle iptables, nikola, pcap, ucollect, lxc) a bit. There you can say when to rotate a log and what to do before/after rotation.
EDIT: I keep almost all logs in /srv/logs so I do not lose them during a reboot.

You can take inspiration from my logging config, which filters out unimportant stuff and saves the rest to an HDD under /mnt/nas/data/omnia-logs: https://gist.github.com/peci1/979fd510d82a99a784d5996d6d93c93a . If nothing’s wrong, it spins up the drive for writing approximately every five to ten minutes. The config is for Turris OS 3.x, but it should work for 4.x and 5.x, too.
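
If the goal is only to read an existing log without the dnsmasq and kresd chatter, a small filter can be enough. This is not what the gist does (that one filters in the logging config itself); it is just a rough Python sketch, assuming the syslog line layout shown in the logread excerpt earlier in the thread. The function names and the list of "noisy" programs are my own choices, not anything shipped with Turris OS:

```python
import re

# Program names taken from the logread excerpt in this thread; adjust to taste.
NOISY_PROGRAMS = {"dnsmasq-dhcp", "kresd"}

# Assumed layout: "May 26 22:14:15 turris dnsmasq-dhcp[10780]: message"
SYSLOG_RE = re.compile(r"^\w{3}\s+\d+\s[\d:]{8}\s\S+\s([^\s\[:]+)(?:\[\d+\])?:")


def is_noise(line, noisy=NOISY_PROGRAMS):
    """True if the line comes from one of the programs considered noisy."""
    m = SYSLOG_RE.match(line)
    return bool(m) and m.group(1) in noisy


def keep_interesting(lines, noisy=NOISY_PROGRAMS):
    """Return the lines that are not noise; unparseable lines are kept."""
    return [line for line in lines if not is_noise(line, noisy)]
```

For example, keep_interesting(open(path)) with the path of your messages file returns only the remaining lines, which makes link or firewall events easier to spot.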

I have been seeing the same for quite a long time on HBK (TOS 5.0) in the kernel log:

[165011.862952] mv88e6085 f1072004.mdio-mii:10 lan0: Link is Up - 100Mbps/Full - flow control off
[165011.871623] br-lan: port 1(lan0) entered blocking state
[165011.876960] br-lan: port 1(lan0) entered forwarding state
[165020.012432] mv88e6085 f1072004.mdio-mii:10 lan0: Link is Down
[165020.018505] br-lan: port 1(lan0) entered disabled state
[165023.323267] mv88e6085 f1072004.mdio-mii:10 lan0: Link is Up - 100Mbps/Full - flow control rx/tx
[165023.332120] br-lan: port 1(lan0) entered blocking state
[165023.337460] br-lan: port 1(lan0) entered forwarding state
[165060.088409] ath10k_pci 0000:02:00.0: NIC rx-max-rate: 0 calculated-max: 0 rxnss_override: 0x80000000  nss160: 1  spatial-streams: 2
[165114.486842] mv88e6085 f1072004.mdio-mii:10 lan0: Link is Down
[165114.492771] br-lan: port 1(lan0) entered disabled state
[165286.320873] mv88e6085 f1072004.mdio-mii:10 lan0: Link is Up - 100Mbps/Full - flow control rx/tx
[165286.329724] br-lan: port 1(lan0) entered blocking state
[165286.335057] br-lan: port 1(lan0) entered forwarding state
[165291.514347] mv88e6085 f1072004.mdio-mii:10 lan0: Link is Down
[165291.520396] br-lan: port 1(lan0) entered disabled state
[165351.726683] mv88e6085 f1072004.mdio-mii:10 lan0: Link is Up - 100Mbps/Full - flow control rx/tx
[165351.735642] br-lan: port 1(lan0) entered blocking state
[165351.740977] br-lan: port 1(lan0) entered forwarding state
[165423.933729] mv88e6085 f1072004.mdio-mii:10 lan0: Link is Down
[165423.939648] br-lan: port 1(lan0) entered disabled state

But when some device is connected to the LAN cable, it works without any problem.

A TV (Android TV) is connected to that port; maybe it checks something over the network while in standby mode…
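
One way to test the standby guess would be to measure how long each outage lasts and how far apart the outages are. A minimal Python sketch, assuming the kernel log format shown above (a [seconds] timestamp followed by the driver's "Link is Up" / "Link is Down" line); the function names are made up for illustration:

```python
import re
from collections import defaultdict

# Matches lines such as:
# [180437.069785] mv88e6085 d0032004.mdio-mii:10 lan7: Link is Down
LINK_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s.*?\s(\S+): Link is (Up|Down)")


def summarize_flaps(lines):
    """Return {iface: [(down_ts, up_ts, duration_s), ...]} for completed outages."""
    down_since = {}
    outages = defaultdict(list)
    for line in lines:
        m = LINK_RE.match(line)
        if not m:
            continue  # bridge state changes and unrelated messages are skipped
        ts, iface, state = float(m.group(1)), m.group(2), m.group(3)
        if state == "Down":
            down_since[iface] = ts
        elif iface in down_since:
            start = down_since.pop(iface)
            outages[iface].append((start, ts, ts - start))
    return dict(outages)


def format_report(outages):
    """Turn the summary into printable lines, one per outage."""
    return [
        f"{iface}: down at {start:.0f}s, back after {duration:.1f}s"
        for iface, events in outages.items()
        for start, _end, duration in events
    ]
```

Feed it the lines of saved dmesg output, e.g. format_report(summarize_flaps(open("dmesg.txt"))). Short outages at regular intervals might hint at the attached device renegotiating the link, but that is only a heuristic.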

Thanks a lot @peci1 & @Maxmilian_Picmaus

I’m still looking for this “OpenWrt/TOS for experienced Linux user” page on the internet to learn how things work where they differ from my “normal” Fedora system.

Peter

I suspect it is not the bridge that disables the port; rather, the interface goes down, and the bridge follows up with the disabled state. Please check the cable, and check for physical errors with ‘ethtool -S eth0’ or ‘ethtool -S eth1’.
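
The ethtool -S output can be long, so here is a rough Python sketch that keeps only the non-zero counters that look error-related. It assumes the usual "name: value" lines that ethtool -S prints. Counter names differ from driver to driver, so the keyword list is a guess and the function names are hypothetical:

```python
import re
import subprocess

# One statistic per line, e.g. "     rx_packets: 1200"
STAT_RE = re.compile(r"^\s*(\S.*?):\s*(\d+)\s*$")


def parse_ethtool_stats(text):
    """Parse 'ethtool -S' output into {counter_name: int}."""
    stats = {}
    for line in text.splitlines():
        m = STAT_RE.match(line)
        if m:  # the "NIC statistics:" header has no number and is skipped
            stats[m.group(1)] = int(m.group(2))
    return stats


def nonzero_error_counters(stats, keywords=("err", "drop", "crc", "collision")):
    """Keep counters that are non-zero and whose name contains a keyword."""
    return {
        name: value
        for name, value in stats.items()
        if value and any(k in name.lower() for k in keywords)
    }


def read_stats(iface):
    """Run 'ethtool -S <iface>' and parse it (needs ethtool installed)."""
    out = subprocess.run(
        ["ethtool", "-S", iface], capture_output=True, text=True, check=True
    ).stdout
    return parse_ethtool_stats(out)
```

Calling nonzero_error_counters(read_stats("eth0")) twice, some minutes apart, shows whether any of these counters is still growing, which is usually more telling than the absolute values.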