DNS broken, again

My Turris Omnia rebooted and DNS is broken again, sigh. Non-local lookups work fine; it’s the local hosts that don’t resolve. That includes the Turris itself.

I want to use kresd for non-local lookups, to benefit from DNSSEC, and dnsmasq for local lookups, so that DHCP clients are added automatically with my local domain name, static leases, etc. I tried a number of ‘tricks’ documented elsewhere on the forum, eventually settling on adding this to /etc/kresd/kresd.custom.conf:

root@turris:~# cat /etc/kresd/kresd.custom.conf 
local lan_rule = policy.add(policy.suffix(policy.STUB('127.0.0.1@54'), policy.todnames({'mylocal.lan','42.168.192.in-addr.arpa'})))
policy.del(lan_rule.id)
table.insert(policy.rules, 1, lan_rule)

I noticed some weird things, but I really have neither the time nor the motivation to reverse-engineer all the scripts and configs involved, again.

Two instances of dnsmasq are running. They’re both controlled by /etc/init.d/dnsmasq

root@turris:~# ps w |grep dnsmasq 
15226 nobody    1184 S    /usr/sbin/dnsmasq -C /var/etc/dnsmasq.conf -k -x /var/run/dnsmasq/dnsmasq.pid
15229 root      1112 S    /usr/sbin/dnsmasq -C /var/etc/dnsmasq.conf -k -x /var/run/dnsmasq/dnsmasq.pid

nslookup can’t resolve ‘(null)’ issue

root@turris:~# nslookup turris
nslookup: can't resolve '(null)': Name does not resolve

nslookup: can't resolve 'turris': Name does not resolve
root@turris:~# ping turris
ping: bad address 'turris'

dnsmasq correctly listening on port 54, kresd on 53

root@turris:~# netstat -ap |grep dnsmasq
tcp        0      0 0.0.0.0:54              0.0.0.0:*               LISTEN      15226/dnsmasq
tcp        0      0 :::54                   :::*                    LISTEN      15226/dnsmasq
udp        0      0 0.0.0.0:54              0.0.0.0:*                           15226/dnsmasq
udp        0      0 0.0.0.0:bootps          0.0.0.0:*                           15226/dnsmasq
udp        0      0 :::54                   :::*                                15226/dnsmasq
unix  2      [ ]         DGRAM                     23784 15226/dnsmasq
root@turris:~# netstat -ap |grep kresd
tcp        0      0 0.0.0.0:domain          0.0.0.0:*               LISTEN      15455/kresd
tcp        0      0 :::domain               :::*                    LISTEN      15455/kresd
udp        0      0 0.0.0.0:domain          0.0.0.0:*                           15455/kresd
udp        0      0 :::domain               :::*                                15455/kresd
unix  2      [ ACC ]     STREAM     LISTENING      23346 15455/kresd         tty/15455
  • I don’t really care about foris. But here’s a screenshot of the DNS settings:

  • I’ve got 17 instances of python /usr/bin/foris-controller -b openwrt -C /var/run/foris-controller-client.sock ubus --path /var/run/ubus.sock running. That seems excessive.

  • I’ve got a couple of -opkg files in /etc/config, including resolver-opkg and dhcp-opkg. Who knows what settings I need to merge… auto-updater ftw…

Any help is appreciated!

I’m doing the same thing.

I’ve got multiple instances of dnsmasq running. I think that’s normal. I think something is broken though if you have 17 running. I think the last time that happened, my internet connection was down.

My setup is a bit different. I’m not sure what

is for.

This setup has been working for me through all my updates:

policy.add(policy.suffix(policy.STUB('127.0.0.1@5353'), policy.todnames({'mydomain.home','43.168.192.in-addr.arpa'})))
policy.add(policy.suffix(policy.DENY, policy.todnames({'168.192.in-addr.arpa'})))
policy.add(policy.suffix(policy.PASS, { todname('43.168.192.in-addr.arpa') }))

# nslookup mygames
nslookup: can't resolve '(null)': Name does not resolve

Name:      mygames
Address 1: 192.168.43.50 mygames.mydomain.home

I do not have “Enable DHCP clients in DNS” checked.

That nslookup error isn’t normal. It means the libc gethostbyname() doesn’t work properly, which can cause all sorts of issues… It has something to do with the host not being able to look up its own name.

root@turris:/etc/config# ping turris
ping: bad address 'turris'

I’ve noticed the second dnsmasq instance is related to Foris’ ‘Enable DHCP clients in DNS’ checkbox. When it is checked, I get two instances when running /etc/init.d/dnsmasq start. When unchecked, I only get one (running as the ‘nobody’ user).

What does your resolv.conf look like? I have

search mydomain.home
nameserver 127.0.0.1

If you don’t have a “search” statement, short names won’t work.
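To make the point about the “search” statement concrete, here is a minimal sketch of how a short name gets expanded. The function names search_domains and qualify are made up for this illustration, and the logic is a simplification of what the libc resolver does (it only uses the first search domain and ignores options such as ndots).

```shell
# Print the search domains configured in a resolv.conf-style file.
# Usage: search_domains [FILE]   (FILE defaults to /etc/resolv.conf)
search_domains() {
    file="${1:-/etc/resolv.conf}"
    # Keep only "search" lines and print every domain after the keyword.
    awk '$1 == "search" { for (i = 2; i <= NF; i++) print $i }' "$file"
}

# Expand a short name using the first search domain (simplified).
# Names that already contain a dot are printed unchanged.
qualify() {
    name="$1"
    file="${2:-/etc/resolv.conf}"
    case "$name" in
        *.*) echo "$name" ;;
        *)
            domain="$(search_domains "$file" | head -n 1)"
            if [ -n "$domain" ]; then
                echo "$name.$domain"
            else
                # No search statement: the short name stays unqualified.
                echo "$name"
            fi
            ;;
    esac
}
```

With the resolv.conf shown above, qualify gateway would print gateway.mydomain.home; without a search line it prints the bare name, which is the case where short names fail.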

When dnsmasq is running, my resolv.conf looks similar to yours. When I stop dnsmasq, it gets rewritten to use my ISP’s DNS server, which is incorrect since kresd is the preferred resolver and should be in charge of /etc/resolv.conf :frowning:

What happens when you do lookups against your dnsmasq instance?

That nslookup is just the busybox applet. You’ll want to use dig. Note that dig needs fully-qualified names:

# dig -p 5353 gateway.mydomain.home

; <<>> DiG 9.11.2-P1 <<>> -p 5353 gateway.mydomain.home
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 5589
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;gateway.mydomain.home.      IN      A

;; ANSWER SECTION:
gateway.mydomain.home. 0     IN      A       192.168.43.1

;; Query time: 0 msec
;; SERVER: 127.0.0.1#5353(127.0.0.1)
;; WHEN: Wed Jun 20 15:15:19 EDT 2018
;; MSG SIZE  rcvd: 69
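If it helps, the same query can be sent to both resolvers side by side. This is only a sketch: compare_resolvers is a made-up helper name, and it assumes kresd listens on 127.0.0.1 port 53 and dnsmasq on port 5353 (pass 54 as the second argument for the setup in the first post).

```shell
# Ask kresd and dnsmasq for the same fully-qualified name and print both answers.
# Usage: compare_resolvers NAME [DNSMASQ_PORT]
# Assumptions: both daemons listen on 127.0.0.1; kresd on port 53,
# dnsmasq on port 5353 unless another port is given.
compare_resolvers() {
    name="$1"
    dnsmasq_port="${2:-5353}"
    # +short prints only the answer data, which makes the two lines easy to compare.
    echo "kresd:   $(dig +short @127.0.0.1 -p 53 "$name" A)"
    echo "dnsmasq: $(dig +short @127.0.0.1 -p "$dnsmasq_port" "$name" A)"
}
```

For example, compare_resolvers gateway.mydomain.home should show the same address on both lines if kresd really forwards the local zone to dnsmasq; an empty kresd line would suggest the forwarding policy is not applied.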

Why does it matter if it’s part of busybox? It should work, as it did before. Local lookups against dnsmasq work fine, including the router itself. For some reason I can now also look up some other local hosts against kresd, but not all of them :frowning: Those that work are listed in /tmp/kresd/hints.tmp

Edit: only static leases are listed in /tmp/kresd/hints.tmp. Perhaps my other DHCP clients need to renew their leases to get added?

With busybox nslookup, you can’t specify a port.

Any utility munged into busybox is going to be pretty severely limited. You won’t be able to talk to just dnsmasq, running on port 54 (in your case). I believe dig is installed by default, and it is much more powerful than nslookup anyway.

I know what busybox is. The point is, it used to work and now it doesn’t.

Nslookup is basically the simplest way to resolve a name: an executable wrapped around the well-known gethostbyname() function of the standard library. It works if the system is configured correctly (using the regular port 53, or the one specified in /etc/services).

Dig is way more powerful and meant for troubleshooting.

kresd log of an unsuccessful lookup of the router hostname:

[    0][plan] plan 'turris.' type 'A'
[57943][iter]   'turris.' type 'A' id was assigned, parent id 0
[57943][cach]   => trying zone: .
[57943][cach]   => NSEC sname: covered by: tunes. -> tushu., new TTL 85800
[57943][cach]   => NSEC wildcard: covered by: . -> aaa., new TTL 85800
[57943][cach]   => writing RRsets: +++
[57943][iter]   <= rcode: NXDOMAIN
[    0][resl] AD: secure (start)
[    0][resl] AD: secure (between ANS and AUTH)
[    0][resl] AD: secure (1)
[57943][resl]   finished: 0, queries: 1, mempool: 16392 B
[    0][plan] plan 'turris.' type 'AAAA'
[47247][iter]   'turris.' type 'AAAA' id was assigned, parent id 0
[47247][cach]   => trying zone: .
[47247][cach]   => NSEC sname: covered by: tunes. -> tushu., new TTL 85800
[47247][cach]   => NSEC wildcard: covered by: . -> aaa., new TTL 85800
[47247][cach]   => writing RRsets: +++
[47247][iter]   <= rcode: NXDOMAIN
[    0][resl] AD: secure (start)
[    0][resl] AD: secure (between ANS and AUTH)
[    0][resl] AD: secure (1)
[47247][resl]   finished: 0, queries: 1, mempool: 32784 B

It seems to completely ignore the custom config. No mention of it during startup either, though it is appended to /tmp/kresd.conf

DHCP and static lease resolution is supported by kresd in newer Turris OS; there is no need for a custom config for these use cases.

/etc/config/resolver:

config resolver 'common'
    [...]
    option static_domains '1'
    option dynamic_domains '1'
    
config resolver 'kresd'
    [...]
    list hostname_config '/etc/hosts'

Static hostnames can then be set via “Hostnames” in LuCI.

Thanks! I had dynamic_domains ‘0’; changing it to ‘1’ makes kresd resolve my local hosts. It still won’t resolve the router hostname, but I’ll just put it in /etc/hosts and be done with it…
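For anyone wanting to check the same setting, a rough sketch follows. uci_option is a made-up helper that just scans the config file as text and ignores section names; on the router itself, uci get resolver.common.dynamic_domains should be the more usual way to read it.

```shell
# Print the value of the first matching option in a UCI-style config file.
# Usage: uci_option OPTION [FILE]   (FILE defaults to /etc/config/resolver)
# Note: this is a plain-text scan that does not distinguish between sections.
uci_option() {
    opt="$1"
    file="${2:-/etc/config/resolver}"
    # Match lines of the form: option <name> '<value>' and strip the quotes.
    awk -v opt="$opt" '$1 == "option" && $2 == opt { print $3; exit }' "$file" \
        | tr -d "'\""
}

# To change the value (assumed commands for Turris OS, not run by this sketch):
#   uci set resolver.common.dynamic_domains='1'
#   uci commit resolver
#   /etc/init.d/resolver restart
```

Running uci_option dynamic_domains printed 0 in my case before the change; an empty result would mean the option is not set at all.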

Still curious why knot ignores the custom policy (@vcunat?)

Or add it under Hostnames in LuCI; then it will resolve to yourrouter.lan, which never worked for me when setting it in /etc/hosts :wink:

You can specify as many aliases as you like in a hosts file, e.g. 192.168.42.1 turris.mylocal.lan turris.lan turris (see http://man7.org/linux/man-pages/man5/hosts.5.html).

It’s unfortunate that we still have to use this archaic file in 2018.
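As a quick way to verify that every alias on such a line maps to the router, something like the following could be used. hosts_lookup is a made-up helper for illustration; it does an exact, case-sensitive match, which is stricter than real hostname matching.

```shell
# Print the address that a hosts(5)-style file maps NAME to (first match wins).
# Usage: hosts_lookup NAME [FILE]   (FILE defaults to /etc/hosts)
hosts_lookup() {
    name="$1"
    file="${2:-/etc/hosts}"
    awk -v name="$name" '
        { sub(/#.*/, "") }   # drop comments; awk re-splits the fields afterwards
        {
            # Field 1 is the address, every following field is a name or alias.
            for (i = 2; i <= NF; i++)
                if ($i == name) { print $1; exit }
        }
    ' "$file"
}
```

With the example line above in /etc/hosts, hosts_lookup turris, hosts_lookup turris.lan and hosts_lookup turris.mylocal.lan should all print 192.168.42.1.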

Seems I spoke too soon: I can resolve everything on the router itself, but not on other hosts :frowning:

dig works, but only with fully-qualified names. I’m lost…

Hello.

It is unclear what currently works, what does not work, and how you were doing your tests. Please provide the exact commands (preferably dig, but feel free to add nslookup if you wish) along with their outputs from the router and from some other host in the LAN; we can debug it then.

If you want to check that your custom config snippet is being loaded, you can add print('---test-mark---') to the snippet and check the logs for ---test-mark--- (after a resolver restart).
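A small sketch of the log check, assuming the resolver's output ends up in the system log that logread prints (that part is an assumption about the setup, and log_has_mark is a made-up helper name):

```shell
# Succeed if the marker string appears in the log text read from stdin.
# Usage: <log source> | log_has_mark [MARKER]
log_has_mark() {
    mark="${1:-}"
    [ -n "$mark" ] || mark='---test-mark---'
    # -F: treat the marker as a fixed string, not a regular expression.
    # -e: required because the marker starts with dashes and would
    #     otherwise be parsed as a grep option.
    grep -qF -e "$mark"
}
```

Typical use would be logread | log_has_mark && echo "custom snippet was executed". Note that this only shows the snippet ran; it does not prove that the policy rules in it are actually applied to queries.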

The main issue of kresd not forwarding local queries to dnsmasq has been worked around by setting option dynamic_domains '1', which enables local lookups on kresd. I marked that comment as the solution.

Some clients (those that depend on systemd-resolved) have issues with local lookups. Restarting the connection (i.e. restarting NetworkManager) solves the problem; I’ve not spent enough time on it to debug further. It may be related to suspend/resume (though it’s not consistently broken after each resume). Dig works when I specify the router IP as the DNS server directly, so it seems to be unrelated to kresd/dnsmasq on the Turris and more like some kind of caching issue with the local systemd-resolved.
One weird thing I did notice (but unrelated to local lookups) is that systemd-resolve --status reports DNSSEC is not supported.

patrickm@azazel:~$ systemd-resolve --status
Global
         DNS Servers: 127.0.0.1
          DNS Domain: mylocal.lan
          DNSSEC NTA: 10.in-addr.arpa
                      16.172.in-addr.arpa
                      168.192.in-addr.arpa
                      17.172.in-addr.arpa
                      18.172.in-addr.arpa
                      19.172.in-addr.arpa
                      20.172.in-addr.arpa
                      21.172.in-addr.arpa
                      22.172.in-addr.arpa
                      23.172.in-addr.arpa
                      24.172.in-addr.arpa
                      25.172.in-addr.arpa
                      26.172.in-addr.arpa
                      27.172.in-addr.arpa
                      28.172.in-addr.arpa
                      29.172.in-addr.arpa
                      30.172.in-addr.arpa
                      31.172.in-addr.arpa
                      corp
                      d.f.ip6.arpa
                      home
                      internal
                      intranet
                      lan
                      local
                      private
                      test

Link 2 (enp3s0)
      Current Scopes: DNS
       LLMNR setting: yes
MulticastDNS setting: no
      DNSSEC setting: no
    DNSSEC supported: no
         DNS Servers: 192.168.42.1
          DNS Domain: mylocal.lan

Running nslookup on the router still shows the weird ‘(null)’ issue but at least lookups now work:

root@turris:~# nslookup azazel
nslookup: can't resolve '(null)': Name does not resolve

Name:      azazel
Address 1: 192.168.42.2

I do prefer the ‘old’ solution that doesn’t rely on dhcp_host_domain_ng.py. The print statement in kresd.custom.conf is executed and appears in the logs, but kresd just doesn’t seem to use the custom policies.

That ‘(null)’ is a bug in the busybox implementation of nslookup (and specifying whom to ask doesn’t work either).

DNSSEC in systemd-resolved: I believe it’s all about DNSSEC setting: no, as apparently it defaults to “off”.