Modules G+F interoperability

This is gonna be a continuation of my previous post from general discussion. I went ahead and ordered an F module for my existing AG (G etched on the PCB, just to be certain that I didn’t get B by accident).
After a whole day of debugging many problems, I’ve reached a state where either my NAS works or I have Wi-Fi (the G one), but not both. This is a log from when I disabled the Wi-Fi, rebooted, checked that my drives were mounted and working, and then enabled Wi-Fi again. The moment I tried to run a speed check on my phone, my drives disconnected and basically the whole F module disconnected, as evidenced by the log. Is there something obvious that I’m missing, or is AGF not really viable?

Dec 15 17:45:46 turris hostapd: Configuration file: /var/run/hostapd-phy0.conf
Dec 15 18:45:47 turris kernel: [ 116.418490] ath10k_pci 0000:02:00.0: 10.1 wmi init: vdevs: 16 peers: 127 tid: 256
Dec 15 18:45:47 turris kernel: [ 116.436394] ath10k_pci 0000:02:00.0: wmi print 'P 128 V 8 T 410'
Dec 15 18:45:47 turris kernel: [ 116.442706] ath10k_pci 0000:02:00.0: wmi print 'msdu-desc: 1424 sw-crypt: 0'
Dec 15 18:45:47 turris kernel: [ 116.449818] ath10k_pci 0000:02:00.0: wmi print 'alloc rem: 26384 iram: 26852'
Dec 15 18:45:47 turris kernel: [ 116.531563] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
Dec 15 18:45:47 turris kernel: [ 116.541983] br-lan: port 3(wlan0) entered blocking state
Dec 15 18:45:47 turris kernel: [ 116.547244] br-lan: port 3(wlan0) entered disabled state
Dec 15 18:45:47 turris kernel: [ 116.553397] device wlan0 entered promiscuous mode
Dec 15 17:45:47 turris hostapd: wlan0: interface state UNINITIALIZED->COUNTRY_UPDATE
Dec 15 17:45:47 turris hostapd: wlan0: interface state COUNTRY_UPDATE->HT_SCAN
Dec 15 17:45:47 turris hostapd: Using interface wlan0 with hwaddr 04:f0:21:45:cd:5d and ssid "Turris"
Dec 15 18:45:47 turris kernel: [ 116.792546] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
Dec 15 18:45:47 turris kernel: [ 116.798943] br-lan: port 3(wlan0) entered blocking state
Dec 15 18:45:47 turris kernel: [ 116.804492] br-lan: port 3(wlan0) entered forwarding state
Dec 15 17:45:47 turris hostapd: wlan0: interface state HT_SCAN->ENABLED
Dec 15 17:45:47 turris hostapd: wlan0: AP-ENABLED
Dec 15 17:45:47 turris netifd: Network device 'wlan0' link is up
Dec 15 17:45:47 turris netifd: lan (3905): udhcpc: performing DHCP renew
Dec 15 17:45:47 turris netifd: lan (3905): udhcpc: sending renew to 0.0.0.0
Dec 15 17:45:47 turris netifd: lan (3905): udhcpc: lease of 192.168.0.10 obtained, lease time 3600
Dec 15 18:45:48 turris kernel: [ 117.416550] ath10k_pci 0000:02:00.0: NOTE: Firmware DBGLOG output disabled in debug_mask: 0x10000000
Dec 15 17:46:01 turris hostapd: wlan0: STA 08:c5:e1:a9:5f:13 IEEE 802.11: authenticated
Dec 15 17:46:01 turris hostapd: wlan0: STA 08:c5:e1:a9:5f:13 IEEE 802.11: associated (aid 1)
Dec 15 17:46:01 turris hostapd: wlan0: AP-STA-CONNECTED 08:c5:e1:a9:5f:13
Dec 15 17:46:01 turris hostapd: wlan0: STA 08:c5:e1:a9:5f:13 RADIUS: starting accounting session F6FA699A10DF938F
Dec 15 17:46:01 turris hostapd: wlan0: STA 08:c5:e1:a9:5f:13 WPA: pairwise key handshake completed (RSN)
Dec 15 18:46:54 turris kernel: [ 183.290528] xhci_hcd 0000:03:00.0: xHCI host not responding to stop endpoint command.
Dec 15 18:46:54 turris kernel: [ 183.298975] xhci_hcd 0000:03:00.0: xHCI host controller not responding, assume dead
Dec 15 18:46:54 turris kernel: [ 183.307165] xhci_hcd 0000:03:00.0: HC died; cleaning up
Dec 15 18:46:54 turris kernel: [ 183.307288] usb 2-1-port4: cannot reset (err = -22)
Dec 15 18:46:54 turris kernel: [ 183.315695] usb 2-1: USB disconnect, device number 2
Dec 15 18:46:54 turris kernel: [ 183.318165] usb 2-1.4: USB disconnect, device number 3
Dec 15 18:46:54 turris kernel: [ 183.360418] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00
Dec 15 18:46:54 turris kernel: [ 183.368559] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 09 40 00 00 08 00
Dec 15 18:46:54 turris kernel: [ 183.376521] print_req_error: I/O error, dev sda, sector 2368
Dec 15 17:46:54 turris ntfs-3g[4501]: ntfs_attr_pread_i: ntfs_pread failed: I/O error
Dec 15 17:46:54 turris ntfs-3g[4501]: ntfs_attr_pread_i: ntfs_pread failed: I/O error
Dec 15 17:46:54 turris ntfs-3g[4501]: Failed to read index block: I/O error
Dec 15 17:46:54 turris ntfs-3g[4501]: ntfs_attr_pread_i: ntfs_pread failed: I/O error
Dec 15 17:46:54 turris ntfs-3g[4501]: Failed to read of MFT, mft=17 count=1 br=-1: I/O error
Dec 15 17:46:54 turris ntfs-3g[4501]: Could not decode the type of inode 17
Dec 15 17:46:54 turris ntfs-3g[4501]: ntfs_attr_pread_i: ntfs_pread failed: I/O error
Dec 15 17:46:54 turris ntfs-3g[4501]: Failed to read of MFT, mft=3826 count=1 br=-1: I/O error
Dec 15 17:46:54 turris ntfs-3g[4501]: Could not decode the type of inode 3826
Dec 15 18:46:54 turris kernel: [ 183.391976] blk_partition_remap: fail for partition 1
Dec 15 18:46:54 turris kernel: [ 183.397359] Buffer I/O error on dev sda1, logical block 786436, async page read
Dec 15 17:46:54 turris ntfs-3g[4501]: ntfs_attr_pread_i: ntfs_pread failed: I/O error
Dec 15 17:46:54 turris ntfs-3g[4501]: Failed to read of MFT, mft=17 count=1 br=-1: I/O error
Dec 15 18:46:54 turris kernel: [ 183.409624] blk_partition_remap: fail for partition 1
Dec 15 18:46:54 turris kernel: [ 183.414892] Buffer I/O error on dev sda1, logical block 788526, async page read
Dec 15 17:46:54 turris ntfs-3g[4501]: ntfs_attr_pread_i: ntfs_pread failed: I/O error
Dec 15 17:46:54 turris ntfs-3g[4501]: Failed to read of MFT, mft=8376 count=1 br=-1: I/O error
Dec 15 18:46:54 turris kernel: [ 183.427400] blk_partition_remap: fail for partition 1
Dec 15 18:46:54 turris kernel: [ 183.432404] Buffer I/O error on dev sda1, logical block 786441, async page read
Dec 15 17:46:54 turris ntfs-3g[4501]: ntfs_attr_pread_i: ntfs_pread failed: I/O error
Dec 15 17:46:54 turris ntfs-3g[4501]: Failed to read of MFT, mft=36 count=1 br=-1: I/O error
Dec 15 18:46:54 turris kernel: [ 183.442959] blk_partition_remap: fail for partition 1
Dec 15 18:46:54 turris kernel: [ 183.448038] Buffer I/O error on dev sda1, logical block 787388, async page read
Dec 15 17:46:54 turris ntfs-3g[4501]: ntfs_attr_pread_i: ntfs_pread failed: I/O error
Dec 15 17:46:54 turris ntfs-3g[4501]: Failed to read of MFT, mft=3826 count=1 br=-1: I/O error
Dec 15 17:46:54 turris block: hotplug-call call failed
Dec 15 17:46:54 turris ntfs-3g[4501]: ntfs_attr_pread_i: ntfs_pread failed: I/O error
Dec 15 17:46:54 turris ntfs-3g[4501]: Failed to read of MFT, mft=3827 count=1 br=-1: I/O error
Dec 15 18:46:54 turris kernel: [ 183.462236] blk_partition_remap: fail for partition 1
Dec 15 18:46:54 turris kernel: [ 183.467139] Buffer I/O error on dev sda1, logical block 787388, async page read
Dec 15 18:46:54 turris kernel: [ 183.478339] blk_partition_remap: fail for partition 1
Dec 15 18:46:54 turris kernel: [ 183.483581] Buffer I/O error on dev sda1, logical block 4258931, async page read
Dec 15 17:46:54 turris ntfs-3g[4501]: ntfs_attr_pread_i: ntfs_pread failed: I/O error
Dec 15 17:46:54 turris ntfs-3g[4501]: Failed to read vcn 0x28: I/O error
Dec 15 17:46:54 turris ntfs-3g[4501]: ntfs_attr_pread_i: ntfs_pread failed: I/O error
Dec 15 17:46:54 turris ntfs-3g[4501]: Failed to read vcn 0x28: I/O error
Dec 15 18:46:54 turris kernel: [ 183.491967] blk_partition_remap: fail for partition 1
Dec 15 18:46:54 turris kernel: [ 183.497046] Buffer I/O error on dev sda1, logical block 4258931, async page read
Dec 15 18:46:54 turris kernel: [ 183.505613] blk_partition_remap: fail for partition 1
Dec 15 17:46:54 turris ntfs-3g[4501]: ntfs_attr_pread_i: ntfs_pread failed: I/O error
Dec 15 17:46:54 turris ntfs-3g[4501]: Failed to read index block: I/O error
Dec 15 18:46:54 turris kernel: [ 183.510676] Buffer I/O error on dev sda1, logical block 4258926, async page read
Dec 15 18:46:54 turris kernel: [ 183.519393] blk_partition_remap: fail for partition 1
Dec 15 18:46:54 turris kernel: [ 183.524670] Buffer I/O error on dev sda1, logical block 4258931, async page read
Dec 15 17:46:54 turris ntfs-3g[4501]: ntfs_attr_pread_i: ntfs_pread failed: I/O error
Dec 15 17:46:54 turris ntfs-3g[4501]: Failed to read vcn 0x28: I/O error
Dec 15 18:46:54 turris kernel: [ 183.583274] blk_partition_remap: fail for partition 1
Dec 15 18:46:54 turris kernel: [ 183.588482] Buffer I/O error on dev sda1, logical block 4258931, async page read
Dec 15 18:46:54 turris kernel: [ 183.597354] blk_partition_remap: fail for partition 1
Dec 15 17:46:54 turris ntfs-3g[4501]: ntfs_attr_pread_i: ntfs_pread failed: I/O error
Dec 15 17:46:54 turris ntfs-3g[4501]: Failed to read vcn 0x28: I/O error
Dec 15 17:46:54 turris ntfs-3g[4501]: ntfs_attr_pread_i: ntfs_pread failed: I/O error
Dec 15 17:46:54 turris ntfs-3g[4501]: Failed to read vcn 0x28: I/O error
Dec 15 17:46:54 turris ntfs-3g[4501]: ntfs_attr_pread_i: ntfs_pread failed: I/O error
Dec 15 17:46:54 turris ntfs-3g[4501]: Failed to read index block: I/O error
Dec 15 18:46:54 turris kernel: [ 183.603599] blk_partition_remap: fail for partition 1
Dec 15 18:46:54 turris kernel: [ 183.609498] blk_partition_remap: fail for partition 1
Dec 15 17:46:54 turris ntfs-3g[4501]: ntfs_attr_pread_i: ntfs_pread failed: I/O error
Dec 15 17:46:54 turris ntfs-3g[4501]: Failed to read vcn 0x28: I/O error
Dec 15 17:47:09 turris ntfs-3g[4501]: Unmounting /dev/sda1 (RZR-MBK)
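
Most of the excerpt above is the filesystem reacting to the controller disappearing. As a minimal sketch, a saved copy of the log can be cut down to the handful of events that matter: the phone associating, the xHCI controller dying, the USB disconnects and the unmount. The function name `key_events` and the choice of patterns are assumptions made for this example, picked from strings that appear in the log.

```shell
#!/bin/sh
# Reduce a syslog excerpt (read from stdin) to the key events in this thread.
# The patterns are strings taken from the log above; adjust them as needed.
key_events() {
    grep -E 'AP-STA-CONNECTED|xhci_hcd|USB disconnect|Unmounting'
}
```

Run against the log above (for example `key_events < saved.log`, where saved.log is a hypothetical file name), this leaves the association at 17:46:01 and the first xhci_hcd failure at 18:46:54. The kernel lines are stamped one hour ahead of the userspace ones, so the controller died roughly 53 seconds after the phone connected.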

I’d like to bump this thread to at least get some discussion going, because I don’t consider my MOX to be working as intended.

Current state:

The device itself is stable, no weird reboots, HDDs are working fine, Qualcomm Wi-Fi still unusable.

What I’ve tried so far:

  1. Rescue mode 6 and installing only the minimum packages apart from LXC and SMB
  2. Making sure that I have all the necessary kmod drivers (that alone is an issue separate from the G module, see “How to force USB 3 device to use mass-storage driver using quirks?”)
  3. Using an additional power supply to make sure the MOX has enough juice
  4. Using only 1 known good USB drive that does not require USB power
  5. Checking that the Qualcomm mPCIe card, which is known to have worked flawlessly before, is properly seated
  6. Checking G module is properly seated
  7. Trying HBT branch with OS 5.0

Without knowing the specifics of MOX, I suspect that G and F share the same PCIe lanes, and that when I put the Wi-Fi under load (e.g. a speed test), the G module gets priority and traffic to the USB hub on the F module is effectively cut off. That would render the F module completely useless.
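
One way to look into the shared-bus suspicion, sketched under assumptions: the log above shows the Wi-Fi card as ath10k_pci at 0000:02:00.0 and the USB controller as xhci_hcd at 0000:03:00.0, so both sit in the same PCI domain. `lspci -tv` (from pciutils, if it is installed) prints the bus tree, and /proc/interrupts shows how each driver’s interrupts are delivered. The helper name `pci_irq_lines` is made up for this example.

```shell
#!/bin/sh
# Print the interrupt lines that belong to the Wi-Fi and USB host drivers.
# Reads /proc/interrupts by default; a file path can be passed instead.
pci_irq_lines() {
    grep -E 'ath10k|xhci' "${1:-/proc/interrupts}"
}
```

Calling `pci_irq_lines` before and after a speed test shows whether the counters for both drivers keep moving. That would only be a hint about where things stall, not proof of the theory.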

I can dig up any info necessary for debugging but everything seems normal during boot and I already posted the log of what happens when I try to use Qualcomm Wi-Fi with HDDs connected.

Could anyone please share any information or reasoning about what I’m experiencing with my MOX?

Hi,
I have a similar configuration, AGFC, and I have experienced the same problems.
Everything we connect to the USB ports (even to the USB 3.0 port on the A module) gets disconnected after a while. The noticeable difference between the A and F USB 3.0 ports is that the USB 3.0 port on the A module is sometimes able to recover from the failure state (tested with a USB 3.0 hub). The F USB 3.0 ports are unable to detect anything being connected until a reboot.

The devices I tried are: an external USB Sandisk SSD, an FTDI serial adapter, a Behringer USB 2.0 sound card, and a USBI2C01B converter.

Could you confirm that if you disable the Wi-Fi card in the G module (Qualcomm), your F module gets stable? I’m still fairly sure that the culprit is somewhere along the PCI bus.

Yes, I can confirm that. I did not disable the Qualcomm Atheros QCA9880 802.11bgnac module itself. I only disabled all Wi-Fi connections in the LuCI interface, and the USB FTDI serial device connected to the F module became stable, without noticeable failures.

Is there some software option to disable the USB hub in the F module, so that I can do the reciprocal test with Wi-Fi?
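
One possible software route, offered as an untested sketch rather than a confirmed MOX procedure: Linux can detach a PCI device from its driver through sysfs, which takes the USB controller out of play without touching the hardware. The address 0000:03:00.0 is the xhci_hcd device from the log earlier in the thread. The function name `unbind_pci_device` and its optional second argument (an alternate sysfs root, there only so the function can be tried against a dummy directory) are assumptions made for this example.

```shell
#!/bin/sh
# Detach a PCI device from whatever driver is bound to it (needs root).
# $1: PCI address, e.g. 0000:03:00.0
# $2: sysfs root, defaults to /sys
unbind_pci_device() {
    dev="$1"
    drv="${2:-/sys}/bus/pci/devices/$dev/driver"
    if [ ! -e "$drv/unbind" ]; then
        echo "no driver bound to $dev" >&2
        return 1
    fi
    # Writing the address to the driver's unbind file releases the device.
    echo "$dev" > "$drv/unbind"
}
```

With `unbind_pci_device 0000:03:00.0` run as root, the Wi-Fi could then be loaded on its own. A reboot should bring the controller back.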

I’ve been trying unsuccessfully to get any help from the Turris team for over a month now. There’s been no official response to my thread or to my support ticket. I have a sneaking suspicion that it’s a hardware flaw that cannot be fixed by a software change.
Now I’m stuck with a network device that actually does no real networking and only provides access to my drives. And that could be done by a simpler (and cheaper) device.

I’m still hoping I’m wrong, but until somebody actually shows up, I declare my project an expensive failure.

I wrote to Turris tech support yesterday and sent the “diagnostics data”, but there has been no response so far.

I suppose there is something wrong with PCIe data transmission. Therefore I tried to test for possible Wi-Fi interference on the PCIe bus. The bus probably runs at 2.5 Gbps, which is pretty close to the 2.4 GHz and 5 GHz Wi-Fi carrier frequencies. Unfortunately, I noticed the F module malfunction even when the Wi-Fi power was set to 1 mW (no matter whether it was 2.4 GHz or 5 GHz), but I do not know whether the power setting is applied to every transmitted Wi-Fi packet.

Tech support finally responded, and the issue seems to be solved by running the following command in a root SSH session on an up-to-date Turris OS:

fw_setenv quirks pci=nomsi

Then restart the MOX.
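
For context, `pci=nomsi` is a standard kernel parameter that disables Message Signaled Interrupts for PCI devices. To check that the setting took effect, here is a minimal sketch. It assumes (this is not confirmed in the thread) that the boot script appends the `quirks` U-Boot variable to the kernel command line, so that `pci=nomsi` shows up in /proc/cmdline after the reboot. The function name `cmdline_has_nomsi` is invented for this example.

```shell
#!/bin/sh
# Succeeds if pci=nomsi is one of the kernel command-line parameters.
# Reads /proc/cmdline by default; pass a file path to check a saved copy.
cmdline_has_nomsi() {
    tr ' ' '\n' < "${1:-/proc/cmdline}" | grep -qx 'pci=nomsi'
}
```

Before rebooting, `fw_printenv quirks` should print the stored value. After rebooting, `cmdline_has_nomsi && echo "MSI disabled"` confirms that the kernel actually received the parameter.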

It has been SO LONG! I stopped caring and relegated the MOX to the role of a dumb NAS.

While I’ll mark this as the solution, because I’ve been able to turn the access point on with my NAS still working, I’ve run into a new problem: one of my USB HDDs stopped working when plugged into the bottom port on the front side of module F, right after I applied the quirks fix. No amount of replugging or hard restarts got me the expected lsusb result. My quick workaround was to plug it straight into module A, where the HDD was recognized immediately.

I’ll poke it with a dumb flash drive to see whether the ports are completely dead or whether that particular drive was just not happy with the quirks fix. I might be lucky that I only have 3 USB drives connected, if 2 out of 5 ports are out.

Thanks for letting me know, because tech support still hasn’t gotten in touch with me.