IPv4 DHCP fails regularly

I regularly have issues with IPv4 DHCP not renewing leases. In the last day, I have battled this issue twice and am currently in a situation where no new IPv4 leases are being given out. I have found many older posts that reference this, but none seem to have an actual fix. I am hoping I missed something in the mess of posts, but running the standard command below does not resolve the issue immediately. Even when it does eventually resolve the issue, the problem returns after a reboot basically every time. Over the last day, it has also recurred without a reboot, and it is making my life very difficult, to put it nicely.

Does anyone have any input on what to do to mitigate this or, at the least, a fix that works every time this occurs?

Command that ‘eventually’ seems to work after multiple reboots and/or schnapps rollbacks:
/etc/init.d/dnsmasq restart

Do you use dnsmasq for DNS as well? Port 53 is the DNS port. If yes, it doesn’t work well alongside Kresd (resolver), because the two race for the port and that causes dnsmasq to crash intermittently. If not, set the dnsmasq port to 0. Using dnsmasq for DNS is another story; the easiest way may be to move Kresd to some port other than 53.

That workaround has been reliable in my case. So far I’ve only noticed the issue after router reboot (sometimes).

EDIT: the linked posts are about a collision on port 53; I don’t think those are related to what you described, but if they were, you’d see the collision in the logs.

I’m at my wits’ end on this… The last few times this happened, I have had to revert to a previous software version (via schnapps) and then manually run “dnsmasq restart” after the reversion to get IPv4 DHCP working. Happy to give logs/info on anything anyone thinks may be relevant, as this issue is just a nightmare scenario for someone who works remotely and relies on their home network being stable. The fact that it has happened 3 times in less than 2 days is making me rethink using the Omnia as my router at this point.

What info would help to troubleshoot this further?

You could try setting the lease time to “infinite” for the particular laptop you work with.

I’ve set static IPs on my important machines/devices, but this doesn’t help when the family gets home and the “media” devices are not working. Thanks for the suggestion though; I may need to set an infinite lease time in the DHCP settings to keep the fallout minimal when I do have an issue.

Some progress in finding the issue!
I found the post below on the site the other day. Today, I awoke to things offline again. When I checked the “dnsmasq.conf” file, I saw that it no longer had the “dhcp-range”, “dhcp-option”, and “no-dhcp-interface” entries. I deleted the “dnsmasq.conf” file and ran “/etc/init.d/dnsmasq restart”; afterwards the file had those entries again and DHCP began to give out addresses immediately…
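
A quick way to confirm this failure mode the next time it happens is to check whether the generated config still has a dhcp-range line. This is only a minimal sketch: the function name has_dhcp_range is hypothetical, and the glob in the usage comment assumes the generated file lives under /tmp/etc with a per-system suffix.

```shell
#!/bin/sh
# has_dhcp_range FILE: succeed if FILE contains at least one dhcp-range line.
# Hypothetical helper name; not part of dnsmasq or TurrisOS.
has_dhcp_range() {
    grep -q '^dhcp-range=' "$1"
}

# Example use on the router (the path pattern is an assumption):
# for f in /tmp/etc/dnsmasq.conf.*; do
#     has_dhcp_range "$f" || echo "no dhcp-range in $f"
# done
```

If the check reports a file without dhcp-range, dnsmasq is running but was started with no DHCP pool, which matches the symptom of no leases being handed out.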

DHCP not working after upgrade to TurrisOS 5.3.3 - #2 by hagrid - SW help - Turris forum


I had a similar problem, and adding the line below to /etc/config/dhcp for all subnets solved it. It happens because of a race condition with netifd.

option force '1'
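
For anyone unsure where that line goes, here is a minimal sketch of a DHCP pool section with the option added. The section name 'lan' and the interface value are assumptions based on a default setup; keep whatever your file already has, add only the force line, and repeat it in each subnet's section.

```
config dhcp 'lan'
	# 'lan' names here are assumed defaults; yours may differ
	option interface 'lan'
	option force '1'
```

Restart dnsmasq afterwards with /etc/init.d/dnsmasq restart so the generated config is rebuilt.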


I had previously set a startup script in /etc/rc.local to remove the config file and restart the service to try and mitigate the issue:

rm /tmp/etc/dnsmasq.conf.xxxxxxxx
/etc/init.d/dnsmasq restart

I have now commented those out and added the config you outlined per the linked documentation. The first change I see is that I no longer get an error when restarting the service, so this seems like a great sign!

BEFORE:
root@turris:~# /etc/init.d/dnsmasq restart
root@turris:~# udhcpc: started, v1.25.1
root@turris:~# udhcpc: sending discover
root@turris:~# udhcpc: no lease, failing

AFTER:
root@turris:~# /etc/init.d/dnsmasq restart
root@turris:~#

I also rebooted the router to see if I had any issues and saw that IPv4 addresses were issued successfully. The only change I see in the config file now is that there is no longer a “no-dhcp-interface=” entry; before, it had one for “=eth2”. (Not a concern, just an observation I have made.)

The dhcp-range missing after boot bug was fixed in “dnsmasq: abort dhcp_check on interface state · openwrt/openwrt@aa403a4 · GitHub” by changing carrier to up in /etc/init.d/dnsmasq.

If you are using the hbl branch, which has dnsmasq_2.85, this fix is already included. In hbk, with dnsmasq_2.80, the fix is not present yet.

I fixed it for myself by changing the relevant line:

sed -i 's/\(jsonfilter -e @[.]\)carrier/\1up/' /etc/init.d/dnsmasq
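
Since that command edits the init script in place, it may be worth previewing the result first. A minimal sketch, using the same expression without -i so nothing is modified; the function name preview_carrier_fix is hypothetical.

```shell
#!/bin/sh
# preview_carrier_fix FILE: print FILE with the carrier -> up substitution
# applied, leaving FILE itself untouched. Hypothetical helper name.
preview_carrier_fix() {
    sed 's/\(jsonfilter -e @[.]\)carrier/\1up/' "$1"
}

# Example use on the router: show only the line(s) that would end up changed.
# preview_carrier_fix /etc/init.d/dnsmasq | grep 'jsonfilter -e @[.]up'
```

If the grep prints nothing, the pattern did not match and the script may already be patched or may differ from the version discussed here.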

That said, having option force '1' for the lan interface is probably not a bad idea anyway, because force also skips another check: the check for another DHCP server on the network. I want my router to provide DHCP even if there is a rogue DHCP server somewhere on the network.

Thanks! I have made the edit as well and appreciate the additional input on this issue. Glad to also hear this was a known issue that was resolved in the newer releases.