User experience - ALLNET ALL4781-VDSL2-SFP / Switch Modul (Mini-GBIC), VDSL2

Whilst the module generally performs stably on my ISP’s subscriber line, after the switch from TOS3.x to TOS4.x I noticed intermittent hiccups that show up in the logs as

sfp: module transmit fault indicated
sfp: module transmit fault recovered

and sometimes also

sfp: module persistently indicates fault, disabling

Those messages are generated by SFP.C [1] (not available in TOS3.x with kernel 4.9.x) and stem from the check routines implemented for its state machine:

  • checks signal status (asserted / deasserted) for RX_LOS and TX_FAULT

with

sfp: module transmit fault indicated

relating to the signal status of TX_FAULT (tx-fault in hi IRQ)

SFP.C tries to clear (recover) the fault five times in total, pausing one second between attempts, and if successful (tx-fault in lo IRQ) prints

sfp: module transmit fault recovered

If, however, all five attempts are exhausted, it prints

sfp: module persistently indicates fault, disabling

and as a result there is no WAN connectivity. Signal status detection is attempted again only when the interface is restarted; otherwise the link remains in a down state.
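
To make the sequence above easier to follow, here is a rough Python model of the retry behaviour as I understand it from the log messages. It is not the kernel code: the function name handle_tx_fault, the callback tx_fault_asserted and the constants are my own illustrative assumptions, and the real state machine in SFP.C is timer-driven rather than a simple loop.

```python
import time
from typing import Callable, List

# Assumed values, taken from the behaviour described in this post.
MAX_ATTEMPTS = 5      # recovery attempts before giving up
RETRY_DELAY_S = 1.0   # pause between attempts, in seconds


def handle_tx_fault(tx_fault_asserted: Callable[[], bool],
                    sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Simplified model of the TX_FAULT recovery sequence.

    tx_fault_asserted is a hypothetical callback returning True while the
    module still signals TX_FAULT. Returns the log lines that would be printed.
    """
    log = ["sfp: module transmit fault indicated"]
    for _ in range(MAX_ATTEMPTS):
        sleep(RETRY_DELAY_S)          # wait before re-checking the signal
        if not tx_fault_asserted():   # signal deasserted: fault cleared
            log.append("sfp: module transmit fault recovered")
            return log
    # All attempts used up: the link stays down until the interface restarts.
    log.append("sfp: module persistently indicates fault, disabling")
    return log
```

Calling handle_tx_fault with a callback that reports the fault as cleared after a few checks ends with the "recovered" line, while a callback that always returns True ends with the "disabling" line, which corresponds to the loss of WAN connectivity described above.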


It is not clear why the module passes the check most of the time but intermittently fails at other times. Potential reasons could be:

  • module hardware defect that was not exposed in TOS3.x (lacking the presence of SFP.C and related checks)
  • something choking the I2C bus communication with the module and preventing a timely response (within 300 ms) on the TX_FAULT signal status from the module
  • some bug in the SFP.C code, though its developer is adamant that this is not the case and believes that the module misbehaves instead

[1] linux/drivers/net/phy/sfp.c at master · torvalds/linux · GitHub