Same here. When I reboot my Omnia via the reboot command, the reboot button or power cycling, sometimes just a few switch ports don’t work, sometimes no switch port works and sometimes all ports work. The WAN port always works (I’m using SFP). I’ve tried connecting 2 different devices with different cables, with the same result. Sometimes it helps to unplug and replug the cable, but usually not, and I have to plug the cable into a different port. The switch config is default, no tagged VLANs. Once the link is established on a port, everything works rock solid for days until the next reboot.
This is a crucial issue for me, because my Omnia sits in a hard-to-reach spot in the attic and it’s very unpleasant to go up there just to plug and unplug cables whenever I reboot the router. Because of this I’ve set up remote access to the Omnia serial console so I can figure out what is happening. Any suggestions for what I can try when this happens? I’d like to help debug this issue.
EDIT: And now I did a SW reboot and it happened again. I have two cables connected to the Omnia switch, on Port2 and Port4. Port4 has link and is working. Both SGMII CPU ports have link too. Port2 has no link (LED is off, swconfig reports link down, the connection is not working). There is nothing suspicious in either dmesg or /var/log/messages. So far I’ve tried:
swconfig dev switch0 load network
swconfig dev switch0 set reset
/etc/init.d/network restart
None of the above helped, so I did another SW reboot and now it’s working again. Weird.
2davidhaluska: Did you receive any response from the Turris team yet?
I asked whether the package was delivered without issues. The response was that the package arrived undamaged and that they will look into it, but there is no response yet on the actual issue.
My guess is that the issue is somewhere around link/MDI-X negotiation (Turris can’t detect the other side’s auto-neg? Possibly the cable? I’ve tested 3, but all are straight; I don’t have a crossover cable at hand right now).
after reboot with: [root@turris:~]# reboot
on turris:
root@turris:~# ethtool eth0
Settings for eth0:
    Supported ports: [ TP MII ]
    Supported link modes: 1000baseT/Half 1000baseT/Full
    Supported pause frame use: No
    Supports auto-negotiation: Yes
    Advertised link modes: 1000baseT/Half 1000baseT/Full
    Advertised pause frame use: No
    Advertised auto-negotiation: Yes <===================================
    Link partner advertised link modes: 1000baseT/Full
    Link partner advertised pause frame use: No
    Link partner advertised auto-negotiation: No <=======================
    Speed: 1000Mb/s
    Duplex: Full
    Port: MII
    PHYAD: 0
    Transceiver: external
    Auto-negotiation: on <==============================================
    Link detected: yes <================================================
on laptop (tg3 driver):
root@pve:~# ethtool eth0
Settings for eth0:
    Supported ports: [ TP ]
    Supported link modes: 10baseT/Half 10baseT/Full
                          100baseT/Half 100baseT/Full
                          1000baseT/Half 1000baseT/Full
    Supported pause frame use: No
    Supports auto-negotiation: Yes
    Advertised link modes: 10baseT/Half 10baseT/Full
                           100baseT/Half 100baseT/Full
                           1000baseT/Half 1000baseT/Full
    Advertised pause frame use: Symmetric
    Advertised auto-negotiation: Yes <=================================
    Speed: Unknown!
    Duplex: Unknown! (255)
    Port: Twisted Pair
    PHYAD: 1
    Transceiver: internal
    Auto-negotiation: on <==================================================
    MDI-X: Unknown
    Supports Wake-on: g
    Wake-on: d
    Current message level: 0x00000020 (32)
                           ifup
    Link detected: no <=====================================================
when it works (after “ethtool -s eth0 autoneg on” or ifconfig eth0 down && ifconfig eth0 up on laptop):
on turris:
root@turris:~# ethtool eth0
Settings for eth0:
    Supported ports: [ TP MII ]
    Supported link modes: 1000baseT/Half 1000baseT/Full
    Supported pause frame use: No
    Supports auto-negotiation: Yes
    Advertised link modes: 1000baseT/Half 1000baseT/Full
    Advertised pause frame use: No
    Advertised auto-negotiation: Yes
    Link partner advertised link modes: 1000baseT/Full
    Link partner advertised pause frame use: No
    Link partner advertised auto-negotiation: No <===============================
    Speed: 1000Mb/s
    Duplex: Full
    Port: MII
    PHYAD: 0
    Transceiver: external
    Auto-negotiation: on
    Link detected: yes
on laptop:
root@pve:~# ethtool eth0
Settings for eth0:
    Supported ports: [ TP ]
    Supported link modes: 10baseT/Half 10baseT/Full
                          100baseT/Half 100baseT/Full
                          1000baseT/Half 1000baseT/Full
    Supported pause frame use: No
    Supports auto-negotiation: Yes
    Advertised link modes: 10baseT/Half 10baseT/Full
                           100baseT/Half 100baseT/Full
                           1000baseT/Half 1000baseT/Full
    Advertised pause frame use: Symmetric
    Advertised auto-negotiation: Yes
    Link partner advertised link modes: 10baseT/Half 10baseT/Full
                                        100baseT/Half 100baseT/Full
                                        1000baseT/Full
    Link partner advertised pause frame use: Symmetric
    Link partner advertised auto-negotiation: Yes
    Speed: 1000Mb/s
    Duplex: Full
    Port: Twisted Pair
    PHYAD: 1
    Transceiver: internal
    Auto-negotiation: on
    MDI-X: off
    Supports Wake-on: g
    Wake-on: d
    Current message level: 0x00000020 (32)
                           ifup
    Link detected: yes <==============================
Running ethtool -r eth0 on the laptop doesn’t help, but running “ethtool -s eth0 autoneg on” does, even though autoneg is already on.
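The working and non-working laptop dumps differ in only a handful of fields, which is easier to see mechanically. Below is a minimal Python sketch, assuming the ethtool output has been saved as text. The helper names parse_ethtool and diff_fields are hypothetical, and the two sample strings are abbreviated from the laptop dumps above.

```python
def parse_ethtool(text):
    """Parse `ethtool <iface>` output into a dict of field name -> value."""
    fields = {}
    last_key = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("Settings for"):
            continue
        if ":" in line:
            key, _, value = line.partition(":")
            last_key = key.strip()
            fields[last_key] = value.strip()
        elif last_key is not None:
            # Continuation line, e.g. additional link modes on their own row.
            fields[last_key] = (fields[last_key] + " " + line).strip()
    return fields


def diff_fields(before, after):
    """Return {field: (before_value, after_value)} for fields that differ."""
    keys = set(before) | set(after)
    return {k: (before.get(k), after.get(k))
            for k in keys if before.get(k) != after.get(k)}


# Abbreviated samples taken from the laptop-side dumps in this thread.
BROKEN = """Settings for eth0:
    Advertised link modes: 10baseT/Half 10baseT/Full
                           100baseT/Half 100baseT/Full
    Speed: Unknown!
    Duplex: Unknown! (255)
    MDI-X: Unknown
    Link detected: no
"""

WORKING = """Settings for eth0:
    Advertised link modes: 10baseT/Half 10baseT/Full
                           100baseT/Half 100baseT/Full
    Speed: 1000Mb/s
    Duplex: Full
    MDI-X: off
    Link detected: yes
"""

changes = diff_fields(parse_ethtool(BROKEN), parse_ethtool(WORKING))
for name in sorted(changes):
    old, new = changes[name]
    print(f"{name}: {old!r} -> {new!r}")
```

Only the laptop-side fields are worth diffing this way; the Turris-side output describes a different link and looks the same in both states.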
For now, as a workaround, I’m going to add a cron entry that runs “ethtool -s eth0 autoneg on” every minute on my laptop.
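A slightly gentler variant of that workaround is to re-issue the command only when the link is actually down, so a working link is not renegotiated every minute. This is a minimal Python sketch under assumptions: the interface name eth0 comes from the dumps above, the function names are hypothetical, and the `run` parameter exists only so the logic can be exercised without real hardware.

```python
import subprocess


def link_is_up(ethtool_output):
    """Return True if the output contains 'Link detected: yes'."""
    for line in ethtool_output.splitlines():
        key, _, value = line.strip().partition(":")
        if key == "Link detected":
            return value.strip().startswith("yes")
    # No 'Link detected' line at all: treat as down.
    return False


def kick_autoneg_if_down(iface="eth0", run=subprocess.run):
    """Re-apply 'autoneg on' only if the link is down.

    Returns True if the command was issued, False if the link was already up.
    """
    result = run(["ethtool", iface], capture_output=True, text=True, check=False)
    if link_is_up(result.stdout):
        return False
    run(["ethtool", "-s", iface, "autoneg", "on"], check=False)
    return True

# Intended use: call kick_autoneg_if_down("eth0") once per cron run, as root.
```

Whether skipping the command on a healthy link matters in practice is untested here; it simply avoids touching the PHY when nothing is wrong.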
Hi Peter, I agree it can be an auto MDI/MDI-X or autonegotiation related issue. Unfortunately the ethtool output on the Turris side won’t tell you much, because it relates only to the SGMII link between the SoC interface and the switch chip interface. I think some useful information could be obtained from the switch chip registers accessible through MDIO, but the full 88E6176 datasheet, which contains its register descriptions, is available only under NDA with Marvell. I’ve tried to get some insight by looking at the 88E6176 Linux driver, but it lacks definitions for the PHY HW-related registers, which I believe would be the most helpful.
dandys: drivers/net/phy/marvell.c should have the PHY register definitions. The DSA driver only manages the switch, not the PHY.
I am already working on this part, although without documentation. The DSA drivers export nearly all data of this switch and its PHYs. OpenWRT’s swconfig driver is the bare minimum needed to get the VLANs configured. I am currently working on getting access to the PHYs working. Reading registers from the switch itself already works somewhat (I can read its ID and version). Accessing the PHY registers is rather crazy, as it is a double indirect access.
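To make "double indirect" concrete, here is a toy Python model of the idea: the CPU sees only a command/data register pair, which reaches the switch's internal registers, and two of those internal registers form a second command/data pair that reaches the PHYs. Every register number, device address and command bit layout below is a made-up placeholder, not taken from the 88E6176 datasheet or from any driver, and real hardware would also need busy-bit polling and locking, which are omitted.

```python
OP_READ, OP_WRITE = 0, 1
# Hypothetical location of the inner (PHY) window inside the switch.
PHY_DEV, PHY_CMD, PHY_DATA = 0x1C, 0x18, 0x19


def encode(op, dev, reg):
    """Pack an operation into one command word (hypothetical layout)."""
    return (op << 10) | ((dev & 0x1F) << 5) | (reg & 0x1F)


def decode(cmd):
    return (cmd >> 10) & 1, (cmd >> 5) & 0x1F, cmd & 0x1F


class Window:
    """A command/data register pair in front of some register space."""

    def __init__(self, backend_read, backend_write):
        self._read, self._write = backend_read, backend_write
        self.data = 0

    def write_data(self, value):
        self.data = value & 0xFFFF

    def write_cmd(self, cmd):
        op, dev, reg = decode(cmd)
        if op == OP_READ:
            self.data = self._read(dev, reg)
        else:
            self._write(dev, reg, self.data)


phy_regs = {}     # (phy address, register) -> value, innermost level
switch_regs = {}  # (device, register) -> value, switch-internal level

inner = Window(lambda d, r: phy_regs.get((d, r), 0),
               lambda d, r, v: phy_regs.__setitem__((d, r), v))


def _switch_read(dev, reg):
    if (dev, reg) == (PHY_DEV, PHY_DATA):
        return inner.data
    return switch_regs.get((dev, reg), 0)


def _switch_write(dev, reg, value):
    if (dev, reg) == (PHY_DEV, PHY_DATA):
        inner.write_data(value)
    elif (dev, reg) == (PHY_DEV, PHY_CMD):
        inner.write_cmd(value)
    else:
        switch_regs[(dev, reg)] = value


outer = Window(_switch_read, _switch_write)  # all the CPU can touch


def sw_read(dev, reg):
    """First indirection: switch register via the outer window."""
    outer.write_cmd(encode(OP_READ, dev, reg))
    return outer.data


def sw_write(dev, reg, value):
    outer.write_data(value)
    outer.write_cmd(encode(OP_WRITE, dev, reg))


def phy_read(phy, reg):
    """Second indirection: PHY register via the window inside the switch."""
    sw_write(PHY_DEV, PHY_CMD, encode(OP_READ, phy, reg))
    return sw_read(PHY_DEV, PHY_DATA)


def phy_write(phy, reg, value):
    sw_write(PHY_DEV, PHY_DATA, value)
    sw_write(PHY_DEV, PHY_CMD, encode(OP_WRITE, phy, reg))
```

In this model a single PHY access turns into several accesses through the outer window, and nothing prevents another caller from interleaving its own accesses, which illustrates why locking around the whole sequence matters.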
This is not my main priority at the moment, as I am currently working on some wifi and DHCP related problems with my setup.
You’re right, I’ve looked only at the swconfig driver and, as you said, it’s the bare minimum. My idea was to hack the swconfig driver (as it already contains functions for indirect register addressing and proper locking) by exporting custom netlink attributes for raw register access. It seems this part can be done quite easily and doesn’t even require touching the swconfig userspace tool.
Thanks for the DSA tip, it indeed looks promising! Nevertheless, I think I will be able to obtain the Marvell datasheet in a somewhat official way next week.
It simply unconditionally resets the 5 LAN ports, sets them to 10/100/1000 half/full and no pause, and then starts auto-negotiation. It is a bit unsafe, as there are no locks or anything like that in place. Do not call swconfig or ethtool while this runs.
It should be called after swconfig has set up the VLANs, because swconfig will reset the switch.
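The code being described is not reproduced in the thread. Purely as an illustration of the sequence outlined (reset, advertise 10/100/1000 half/full without pause, restart auto-negotiation), here is a Python sketch against the standard IEEE 802.3 Clause 22 PHY registers. The read/write callables, the PHY addresses 0 to 4 and the function names are assumptions; the sketch ignores Marvell-specific register pages, does not wait for the reset bit to self-clear, and does no locking.

```python
# Standard Clause 22 register numbers and bits.
BMCR, ANAR, CTRL1000 = 0, 4, 9
BMCR_RESET, BMCR_ANENABLE, BMCR_ANRESTART = 0x8000, 0x1000, 0x0200
ADV_CSMA = 0x0001                      # selector field: IEEE 802.3
ADV_10_100 = 0x0020 | 0x0040 | 0x0080 | 0x0100   # 10/100, half and full
ADV_PAUSE, ADV_PAUSE_ASYM = 0x0400, 0x0800
ADV_1000_HALF, ADV_1000_FULL = 0x0100, 0x0200


def reinit_phy(read, write, phy):
    """Reset one PHY, set its advertisement and restart auto-negotiation."""
    write(phy, BMCR, BMCR_RESET)
    # Real hardware: poll BMCR here until the reset bit reads back as 0.

    adv = read(phy, ANAR)
    adv &= ~(ADV_PAUSE | ADV_PAUSE_ASYM)   # no pause frames
    adv |= ADV_10_100 | ADV_CSMA
    write(phy, ANAR, adv & 0xFFFF)

    gig = read(phy, CTRL1000) | ADV_1000_HALF | ADV_1000_FULL
    write(phy, CTRL1000, gig & 0xFFFF)

    write(phy, BMCR, BMCR_ANENABLE | BMCR_ANRESTART)


def reinit_lan_ports(read, write, phys=range(5)):
    """Apply reinit_phy to each LAN PHY (addresses are an assumption)."""
    for phy in phys:
        reinit_phy(read, write, phy)


# Dry run against an in-memory register map instead of real hardware.
fake_regs = {}
reinit_lan_ports(lambda p, r: fake_regs.get((p, r), 0),
                 lambda p, r, v: fake_regs.__setitem__((p, r), v))
```

On real hardware the read and write callables would have to go through the switch's indirect PHY access, and the ordering constraint above still applies: run it after swconfig has finished.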
@adminX I’ve tried your code and it helped. I’m now also able to read (and probably write) the switch EEPROM via MDIO, and it seems it’s empty. Maybe some initialization is done in U-Boot. Do you have any suggestions for which registers should be initialized to which values? I’ve looked at your code and, besides accessing undocumented registers and bits (phy reg16_3 and phy reg16_2 bit 5), it’s more or less just a PHY reset.
Your non-working dump says no link and no announcement from the other side. Energy detect mode (sense and transmit NLP each second) should wake every switch on the other side. So this all seems correct.
My code does about the same things the other (unused in *WRT) kernel driver does. There seems to be some errata about how to reset the PHYs.
The main cause could be a reset pulse that is too short: it gets detected by the switch but not by the PHYs. As only 2 bits in the PHY get changed in the mvsw61xx source, this may leave the PHYs in an undefined state.
phy page 0 register 16 gets bit 4 cleared (energy detect mode?)
phy page 0 register 0 gets bit 11 cleared (power-down gets disabled)
mvsw_get_reg(16,3) will give you the switch hardware ID. If it is different from 1761, it could mean we have different revisions and they use different internal reset procedures.
In the end I am glad my crystal ball worked this time.
Do you mean the HW reset pulse generated by the board on the reset pin, or the internal reset triggered by bit 15 of the switch global control register? I ask because the only thing reset by this bit is the MAC state machine; it’s not propagated to the PHYs.
In phy page 0 register 16, bit 4 is marked as reserved; energy detect mode is configured in bits 9:8.
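Since the discussion hinges on which bits of a 16-bit register are touched, a few small helpers make it easier to check a raw register value by hand. This is a generic Python sketch; the helper names and the sample register values are made up for illustration, and no meaning is assigned to the individual values of the bits 9:8 field because the thread does not give one.

```python
def clear_bit(value, bit):
    """Return the 16-bit value with the given bit cleared."""
    return value & ~(1 << bit) & 0xFFFF


def get_field(value, msb, lsb):
    """Extract bits msb..lsb (inclusive) as an integer."""
    width = msb - lsb + 1
    return (value >> lsb) & ((1 << width) - 1)


def set_field(value, msb, lsb, new):
    """Return the 16-bit value with bits msb..lsb replaced by `new`."""
    width = msb - lsb + 1
    mask = ((1 << width) - 1) << lsb
    return ((value & ~mask) | ((new << lsb) & mask)) & 0xFFFF


# Register 0, bit 11 (power-down) cleared, using an arbitrary sample value.
reg0 = clear_bit(0x1940, 11)
print(hex(reg0))

# Register 16, read the two-bit field at bits 9:8 from an arbitrary sample.
print(get_field(0x0300, 9, 8))
```

Reading the field with get_field before and after a reset would show whether the mvsw61xx code path actually changes bits 9:8, or only the bit 4 that is marked as reserved.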