Some devices cannot access WAN from LAN based WAPs

OK, I’ve got my new Omnia up. It has TOS 7.1.3 and almost everything works perfectly. But I have a deep mystery that I am struggling to solve, and I seek guidance to diagnose and ultimately fix it.

The Omnia is a gateway between the WAN and my LAN. On that LAN are numerous WAPs besides the Omnia’s two WLAN interfaces. These WAPs include two OpenWRT WAPs, two Netgear WAPs, and the Omnia as a WAP too.

The symptoms are as follows:

  1. All wired LAN devices have access to the WAN
  2. I am serving numerous websites from the LAN using lighttpd as a reverse proxy.
  3. I have numerous IoT things on the WLAN that are seeing the WAN.
  4. I can connect to any WAP with a Linux laptop and I can access the WAN.
  5. I have tested three phones, a Samsung, a Google Pixel and an Air Ultra, all Android, and have problems. One, the Pixel, works fine. The other two refuse to stay connected to any of the WAPs, complaining that they can connect but not access the internet, so they cycle between WAPs trying again and again.
  6. I can even ssh to the WAPs (the OpenWrt ones) and see the WAN from there. But when some phones connect to that WAP, they cannot. And on the previous Omnia (that this one replaces), they could.

In summary, a select few devices (with nothing much in common bar Android of different versions) cannot connect to the WAPs because they complain about lacking access to the Internet.

To add to the conundrum, I have two Omnias. The one I am commissioning is my spare, which I am glad of because my other one has developed a serious filesystem problem: nothing can be saved anymore, every reboot resets to the last snapshot, I can’t take any new snapshots, a btrfs scrub finds a load of unrecoverable errors, and more. So it’s pulled out of service for now for deeper diagnosis and a rebuild, but it is available for comparing configs; in fact I had to manually transfer a lot of configs across. The old one is TOS4 and the one I’m working with now is TOS7.

The TOS4 one allowed both these errant phones to access the internet on any of these WAPs. The TOS7 Omnia does not.

It is not a WLAN configuration issue, as evidenced by the fact that the same issue exists on all WAPs, the Turris ones and the four other WAPs on the LAN. It is not a general firewall issue, as laptops and some phones have no problem. But there is something at play, likely in the routing/firewall zone, that is denying some phones access to the WAN when they connect to one of my WAPs.

When I look at the firewall configs in LuCI they are many and complicated. It is much more useful to work in the CLI (using ssh) and be able to dump configs into text files for comparison.
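For the comparison itself, here is a minimal sketch in Python. It assumes each router's config has been dumped to text with something like `uci show firewall` and copied to one machine; the labels `tos4` and `tos7` and the sample strings are only illustrative. Lines are sorted before diffing so that a different ordering of options does not hide the real differences. Anonymous sections show up as indexed names such as `firewall.@rule[0]`, so a reordered rule list can still produce noise.

```python
import difflib


def diff_uci_dumps(old_dump: str, new_dump: str,
                   old_name: str = "tos4", new_name: str = "tos7") -> list[str]:
    """Return unified-diff lines between two `uci show` style dumps."""
    # Drop blank lines and sort, so only changed values show up in the diff.
    old_lines = sorted(l.strip() for l in old_dump.splitlines() if l.strip())
    new_lines = sorted(l.strip() for l in new_dump.splitlines() if l.strip())
    return list(difflib.unified_diff(old_lines, new_lines,
                                     fromfile=old_name, tofile=new_name,
                                     lineterm="", n=0))


if __name__ == "__main__":
    # Illustrative input only; real dumps would be read from files.
    old = "dhcp.lan.start='0'\ndhcp.lan.limit='100'\n"
    new = "dhcp.lan.limit='100'\ndhcp.lan.start='301'\n"
    for line in diff_uci_dumps(old, new):
        print(line)
```

The same function works for any pair of dumps (`dhcp`, `network`, `firewall`), which makes it easy to check one subsystem at a time.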

What I am looking for here is pointers. What kind of issue could be at play here, and how do we diagnose it and ultimately fix it?

In short: the WAN access issue affects some phones on all WAPs, but no laptops on any WAP, no wired LAN devices, and none of the many IoT devices on the WLAN (which are connected to different WAPs for reasons of signal strength across a broad property).

There is something at play here with Android, not least its annoying feature of refusing to stay connected to a WAP if it can’t get internet. If only I knew how to turn that off! Any laptop connects fine to the WAP, internet or not, and can navigate LAN devices. Why not these phones, for crying out loud? And why did these phones have no problem on my old flaky TOS4 Omnia, and what files/configs can I dump from that and the new one to compare?

Could it be the new Omnia’s Threat Detection or Active Firewall?

I’ve experienced similar issues with the Omnia with TOS 7.1.x releases. I attributed it to VLAN quirks which the Omnia (or rather its mvebu platform) has suffered from repeatedly over time. Not sure whether you’re using any VLANs and I could be wrong with my suspicion. I didn’t get to the bottom of it; ultimately, upgrading to TOS 8 (which builds on OpenWrt 23.05) did away with these issues.

A major clue has arrived. Experimenting with things on the phone, I discovered that if I set a static IP, all is good. Conclusions are two:

  1. It’s a DHCP issue, or a DHCP security issue. To delve into.
  2. Android devs need a kick in the butt for not providing sensible feedback on that issue. “Can’t connect to internet” might be OK for a summary, but I should be able to get the details: “DHCP server not found” or “DHCP server refuses to issue IP address” or …

I have enabled DHCP logging in dnsmasq now and tried to connect, but I see no request arriving from the phone.
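To make "no request arriving" easier to check against a long log, here is a rough Python sketch that pulls out the DHCP events logged for one MAC address. It assumes log lines shaped like dnsmasq's usual messages, such as `DHCPDISCOVER(br-lan) aa:bb:cc:dd:ee:01`. The sample lines, MAC addresses and hostname below are made up for illustration, and the exact prefix on a given system may differ.

```python
import re

# Event name followed by the interface in parentheses, e.g. DHCPOFFER(br-lan).
DHCP_EVENT = re.compile(
    r"\b(DHCP(?:DISCOVER|OFFER|REQUEST|ACK|NAK|DECLINE|RELEASE|INFORM))\(([^)]+)\)")
# A colon-separated MAC address anywhere on the line.
MAC = re.compile(r"\b([0-9a-fA-F]{2}(?::[0-9a-fA-F]{2}){5})\b")


def dhcp_events_for(log_text: str, mac: str) -> list[str]:
    """List the DHCP event names logged for `mac`, in log order."""
    wanted = mac.lower()
    events = []
    for line in log_text.splitlines():
        event = DHCP_EVENT.search(line)
        found = MAC.search(line)
        if event and found and found.group(1).lower() == wanted:
            events.append(event.group(1))
    return events


if __name__ == "__main__":
    # Made-up sample lines in the assumed format.
    sample = "\n".join([
        "dnsmasq-dhcp[2345]: DHCPDISCOVER(br-lan) aa:bb:cc:dd:ee:01",
        "dnsmasq-dhcp[2345]: DHCPOFFER(br-lan) 192.168.1.50 aa:bb:cc:dd:ee:01",
        "dnsmasq-dhcp[2345]: DHCPREQUEST(br-lan) 192.168.1.50 aa:bb:cc:dd:ee:01",
        "dnsmasq-dhcp[2345]: DHCPACK(br-lan) 192.168.1.50 aa:bb:cc:dd:ee:01 laptop",
    ])
    print(dhcp_events_for(sample, "AA:BB:CC:DD:EE:01"))
    print(dhcp_events_for(sample, "aa:bb:cc:dd:ee:02"))
```

An empty list for the phone's MAC corresponds to what is described here: nothing from that device shows up in the DHCP log. Note that many Android versions use a randomized MAC per network, so the address to search for is the one shown in the phone's Wi-Fi details for that network.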

I have charged three old Android phones. Two connect fine (a Motorola and an Honor) and one has the same problem (a Nokia).

I’m not actively using any VLANs, no, but my ISP does provide a VLAN ID, and I use that in the WAN configuration. But none are configured on my LAN.

What I need now is a DHCP diagnostic tool on Android which is my next port of call. I have managed to convince the phone to keep the link open and up and stop switching around by forgetting all my networks and turning Autoconnect off on the connection. The phone at least calms down and stays in the stable state of “Connected to device. Can’t provide internet”.

And it is fixed. The lack of clear diagnostics here is a PITA. Android sure sucks in that space. The fix was to ensure in ReForis that the DHCP Start and Limit are set so that the dynamic pool does not overlap with the static leases. I infer then that the problem related to dnsmasq being confused in some way, or offering IP addresses that were already in use, which either failed some check (as it should) or, worse, put two devices with the same IP on the network, causing each of them to get very confused. Who knows?
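A quick way to check for this kind of overlap is sketched below in Python. It takes the pool in the form ReForis asks for (a start address and a maximum number of leases) and reports which static leases fall inside it. The static addresses in the example are hypothetical; the start address and lease count are the ones quoted later in this post.

```python
import ipaddress


def leases_inside_pool(pool_start: str, max_leases: int,
                       static_ips: list[str]) -> list[str]:
    """Return the static lease addresses that fall inside the dynamic pool."""
    first = ipaddress.ip_address(pool_start)
    # The pool holds max_leases consecutive addresses starting at pool_start.
    last = first + max_leases - 1
    return [ip for ip in static_ips
            if first <= ipaddress.ip_address(ip) <= last]


if __name__ == "__main__":
    # Hypothetical static leases, for illustration only.
    static = ["192.168.0.10", "192.168.0.20", "192.168.1.50"]
    print(leases_inside_pool("192.168.1.45", 100, static))
    print(leases_inside_pool("192.168.0.10", 100, static))
```

An empty result means the pool and the static leases are disjoint, which is the state the fix aims for.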

How those two settings map into uci is a mystery. I have set:

DHCP Start: 192.168.1.45
DHCP Max Leases: 100

in ReForis, and in uci I see:

uci show dhcp.lan

dhcp.lan=dhcp
dhcp.lan.interface='lan'
dhcp.lan.dhcpv4='server'
dhcp.lan.dhcpv6='server'
dhcp.lan.ra='server'
dhcp.lan.ra_flags='managed-config' 'other-config'
dhcp.lan.ignore='0'
dhcp.lan.start='301'
dhcp.lan.limit='100'
dhcp.lan.leasetime='43200'
dhcp.lan.dhcp_option='6,192.168.0.1'

uci show network.lan

network.lan=interface
network.lan.device='br-lan'
network.lan.proto='static'
network.lan.ip6assign='60'
network.lan.ip6ifaceid='eui64'
network.lan._turris_mode='managed'
network.lan.ipaddr='192.168.0.1/22'

The notes online suggest the start value is an offset from the network address, so 192.168.0.0 + 301 is 192.168.1.45 (301 - 256 = 45), which matches what I set in ReForis.
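The arithmetic can be checked with a few lines of Python. This is a sketch under the assumption that `dhcp.lan.start` is counted from the network address of the LAN subnet (192.168.0.0 for 192.168.0.1/22) and that `dhcp.lan.limit` is the number of addresses in the pool; the function name is only for illustration.

```python
import ipaddress


def dhcp_pool(interface_cidr: str, start: int, limit: int):
    """Translate uci-style start/limit offsets into first and last pool addresses."""
    # ip_interface keeps the host bits; .network gives the enclosing subnet.
    network = ipaddress.ip_interface(interface_cidr).network
    first = network.network_address + start
    last = first + limit - 1
    return first, last


if __name__ == "__main__":
    first, last = dhcp_pool("192.168.0.1/22", start=301, limit=100)
    print(first, last)  # 192.168.1.45 192.168.1.144
```

Under those assumptions, start 301 with limit 100 gives a pool from 192.168.1.45 to 192.168.1.144, which agrees with the DHCP Start entered in ReForis.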

Before the fix, dhcp.lan.start was 0, and I have a pile of static leases on 192.168.0.*, so the dynamic pool presumably overlapped them.

I’m glad you were able to narrow it down! In my case it wasn’t DHCP as devices were correctly configured but forwarding would stall after a couple of packets.
