DHCP stopped working

Fl0w · June 22, 2019, 12:22pm

Hi,

Today the DHCP server stopped working properly, although I made no changes to my configuration. Clients can’t get an IP address from DHCP. I checked the logs and saw this:
2019-06-22 14:18:10 crit dnsmasq[12327]: failed to create listening socket for port 53: Address in use
2019-06-22 14:18:10 crit dnsmasq[12327]: FAILED to start up
I checked which process is using port 53, and it seems to be kresd.
Here is my config:

config dnsmasq
    option domainneeded '1'
    option boguspriv '1'
    option localise_queries '1'
    option expandhosts '1'
    option authoritative '1'
    option readethers '1'
    option leasefile '/tmp/dhcp.leases'
    option resolvfile '/tmp/resolv.conf.auto'
    option localservice '1'
    option port '53'
    option domain 'toto.lan'
    list server '192.168.0.18'  # My Pi-hole server, running inside an LXC container
    option nonwildcard '0'
    option rebind_protection '0'
    option logqueries '1'

config dhcp 'lan'
    option interface 'lan'
    option start '100'
    option limit '150'
    option leasetime '12h'
    option dhcpv6 'server'
    option ra 'server'
    list dhcp_option '6,192.168.0.1'
    option ra_management '1'

config dhcp 'wan'
    option interface 'wan'
    option ignore '1'
    list dhcp_option '6,192.168.1.2'

config odhcpd 'odhcpd'
    option maindhcp '0'
    option leasefile '/tmp/hosts/odhcpd'
    option leasetrigger '/usr/sbin/odhcpd-update'

Any idea of what’s wrong?

Thanks!

anon50890781 · June 22, 2019, 12:58pm

Only one process can listen on a given port; kresd and dnsmasq cannot both claim port 53 at the same time.

Did this happen completely out of the blue, with no OS update or reboot (wittingly or unwittingly) or anything?

Fl0w · June 22, 2019, 1:12pm

Hi,
Thanks for your reply.
I know that only one process can listen. I tried killing kresd, which was already listening on 53/udp, and starting dnsmasq manually, but the init script seems to start kresd, because it was running again after I called the script.
There has been no update for a week or two; I only rebooted the router after noticing the issue.

anon50890781 · June 22, 2019, 1:18pm

The vanilla medkit sets the dnsmasq port to

option port '0'

That seems to have been changed on your node. I suggest reverting to the above setting and then restarting dnsmasq and kresd.

Maybe the Pi-hole integration was the reason for the change, but that is only speculation, of course.
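The revert suggested above can be sketched as a small shell helper. This is only a sketch under assumptions: the `run` and `fix_dnsmasq_port` function names and the `DRY_RUN` variable are made up for illustration, the dnsmasq section is assumed to be the first one in /etc/config/dhcp, and the resolver init script name is an assumption that may differ on your image. By default it only prints the commands it would run.

```shell
# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Set the dnsmasq DNS port to 0 (DNS function off, DHCP only),
# then restart the affected services.
fix_dnsmasq_port() {
    # Assumption: the dnsmasq section is the first one in /etc/config/dhcp.
    run uci set 'dhcp.@dnsmasq[0].port=0'
    run uci commit dhcp
    run /etc/init.d/dnsmasq restart
    # Assumption: the resolver (kresd) is managed by an init script of this name.
    run /etc/init.d/resolver restart
}
```

Calling `fix_dnsmasq_port` prints the four commands for review; `DRY_RUN=0 fix_dnsmasq_port` would actually execute them on the router.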

Fl0w · June 22, 2019, 1:23pm

That worked, thanks n8v8r!

uvesten · June 22, 2019, 8:07pm

I had the same problem: the DHCP service just stopped working suddenly today. I rebooted a couple of times, no dice. After reading this I checked my logs, and I also had failures:

root@turris:~# cat /var/log/messages | grep FAILED
2019-06-22 16:59:49 crit dnsmasq[3913]: FAILED to start up
2019-06-22 16:59:53 crit dnsmasq[5022]: FAILED to start up
2019-06-22 16:59:58 crit dnsmasq[6192]: FAILED to start up
2019-06-22 17:00:00 crit dnsmasq[6341]: FAILED to start up
2019-06-22 17:00:04 crit dnsmasq[6608]: FAILED to start up

I also had option port '53' for dnsmasq in /etc/config/dhcp. Changing it to option port '0' seems to have solved it, but I’d really like to know what update was pushed today to break it after it had been working flawlessly for months.

Do you know how I can see the most recent updates installed, to figure out what broke dhcp on the turris?

mdv · June 23, 2019, 7:28am

Hi, I have the same problem with DHCP, but in /etc/config/dhcp I already have option port '0'.

/etc/config/dhcp:

config dnsmasq
    option domainneeded '1'
    option boguspriv '1'
    option localise_queries '1'
    option rebind_protection '1'
    option rebind_localhost '1'
    option local '/lan/'
    option domain 'lan'
    option expandhosts '1'
    option authoritative '1'
    option readethers '1'
    option leasefile '/tmp/dhcp.leases'
    option resolvfile '/tmp/resolv.conf.auto'
    option port '0'
    option logqueries '1'
    option localservice '0'
    option nonwildcard '0'
    option dhcpscript '/etc/resolver/dhcp_host_domain_ng.py'

config dhcp 'lan'
    option interface 'lan'
    option start '100'
    option limit '150'
    option ignore '0'
    option leasetime '86400'
    list dhcp_option '6,192.168.1.1'

root@turris:~# dnsmasq
dnsmasq: failed to create listening socket for port 53: Address already in use

root@turris:~# cat /var/log/messages | grep FAILED
2019-06-23 09:22:00 crit dnsmasq[7752]: FAILED to start up
2019-06-23 09:22:05 crit dnsmasq[8411]: FAILED to start up
2019-06-23 09:22:10 crit dnsmasq[8683]: FAILED to start up
2019-06-23 09:22:13 crit dnsmasq[8760]: FAILED to start up
2019-06-23 09:22:18 crit dnsmasq[8900]: FAILED to start up
2019-06-23 09:22:23 crit dnsmasq[9989]: FAILED to start up
2019-06-23 09:22:23 crit dnsmasq[10006]: FAILED to start up
2019-06-23 09:22:34 crit dnsmasq[11270]: FAILED to start up
2019-06-23 09:22:45 crit dnsmasq[12628]: FAILED to start up
2019-06-23 09:22:52 crit dnsmasq[13011]: FAILED to start up

What should I do? I believe it started happening after an update in the past two or three days.

Thank you

anon50890781 · June 23, 2019, 7:36am

Apparently dnsmasq still tries to listen on the already occupied port 53, although with that config it should not. Does this happen even after rebooting the router?

mdv · June 23, 2019, 9:07am

Yes, I have rebooted the router many times.

anon50890781 · June 23, 2019, 9:10am

That is odd indeed.

Could you run /etc/init.d/dnsmasq stop from the CLI and then ps | grep dnsmasq ?

mdv · June 23, 2019, 9:16am

Still the same.

Port 53 is already in use.

root@turris:~# ps | dnsmasq

dnsmasq: failed to create listening socket for port 53: Address already in use
root@turris:~#

anon50890781 · June 23, 2019, 9:47am

what is the output of cat /tmp/etc/dnsmasq.conf.* | grep port ?

mdv · June 23, 2019, 9:51am

It seems dnsmasq is not working at all.

root@turris:~# cat /tmp/etc/dnsmasq.conf.* | grep port
cat: can't open '/tmp/etc/dnsmasq.conf.*': No such file or directory

But if I run cat /tmp/etc/dnsmasq.conf | grep port, the result is port=0.
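Since the generated file may or may not carry a suffix (dnsmasq.conf here, dnsmasq.conf.cfg0b411c on another node), a glob without the trailing dot catches both. A minimal sketch, assuming the generated files live in /tmp/etc as in this thread; the function name `show_dnsmasq_ports` is made up for illustration:

```shell
# Print the port= line from every generated dnsmasq config in a directory.
# The directory defaults to /tmp/etc, the location mentioned in the thread.
show_dnsmasq_ports() {
    dir="${1:-/tmp/etc}"
    for f in "$dir"/dnsmasq.conf*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -f "$f" ] || continue
        # dnsmasq listens on port 53 when no port= line is present.
        grep -H '^port=' "$f" || echo "$f: no port= line (dnsmasq default is 53)"
    done
}
```

Running `show_dnsmasq_ports` with no argument checks /tmp/etc; a different directory can be passed as the first argument.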

anon50890781 · June 23, 2019, 9:57am

If I am not mistaken, that file is generated on the fly when dnsmasq starts. Since dnsmasq fails to start, it is probably logical that the file does not exist.

I missed that. I am not sure whether it is due to a different TOS version on my node or is the actual cause of the issue. On my node the file /tmp/etc/dnsmasq.conf.cfg0b411c is present.

I am afraid I do not know how else to get to the bottom of it.

As a temporary workaround, you could try running dnsmasq -p 0 manually from the CLI.

ftmx · June 23, 2019, 9:57am

What does this show: netstat -nap | grep LISTEN | grep 53 ?

anon50890781 · June 23, 2019, 10:02am

what is the output of cat /etc/config/ucitrack | grep dns ?

ftmx · June 23, 2019, 10:16am

I think there is an original copy of the dnsmasq config under /rom/etc/config/. Do you want to compare them to see what changed?
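Such a comparison could look like the sketch below. It assumes the original copy is at /rom/etc/config/dhcp (the post above only names the directory, so the file name is a guess based on /etc/config/dhcp) and that a diff binary is installed; the function name `compare_dhcp_config` is made up for illustration.

```shell
# Show how the live config differs from the original copy.
# Assumption: the original copy lives at /rom/etc/config/dhcp.
compare_dhcp_config() {
    orig="${1:-/rom/etc/config/dhcp}"
    live="${2:-/etc/config/dhcp}"
    if [ ! -f "$orig" ] || [ ! -f "$live" ]; then
        echo "missing file: $orig or $live" >&2
        return 2
    fi
    # diff exits with 1 when the files differ; that is not an error here.
    diff -u "$orig" "$live" || [ $? -eq 1 ]
}
```

Lines starting with `-` come from the original copy and lines starting with `+` from the live config, so a changed port option shows up as one pair.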

mdv · June 23, 2019, 10:23am

ftmx:

root@turris:~# netstat -nap | grep LISTEN | grep 53
tcp 0 0 0.0.0.0:53 0.0.0.0:* LISTEN 12014/unbound
tcp 0 0 127.0.0.1:8953 0.0.0.0:* LISTEN 12014/unbound

n8v8r:

root@turris:~# cat /etc/config/ucitrack | grep dns
option init 'dnsmasq'
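Note that grep 53 in the netstat output above also matches port 8953. A stricter filter can match only local addresses that end in :53. This is a minimal sketch that reads netstat -nap output on stdin; the function name `port53_listeners` is made up for illustration, and the column layout is assumed to be the one shown in the output above.

```shell
# Read `netstat -nap` output on stdin and print protocol, local address
# and PID/program for sockets bound to port 53 only.
port53_listeners() {
    awk '
        # TCP rows carry a State column; keep only listening sockets.
        $1 ~ /^tcp/ && $6 == "LISTEN" && $4 ~ /:53$/ { print $1, $4, $NF }
        # UDP rows have no state for bound sockets, so match on address alone.
        $1 ~ /^udp/ && $4 ~ /:53$/ { print $1, $4, $NF }
    '
}
```

On the router it would be used as `netstat -nap | port53_listeners`, which here would leave only the unbound line for 0.0.0.0:53.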

mdv · June 23, 2019, 10:25am

All files that mention dnsmasq say that port = 0.

anon50890781 · June 23, 2019, 10:28am

I suspect that a hotplug script is detecting the start of dnsmasq and rewriting the dnsmasq port