Why use knot-resolver?

When I set up my Omnia last year, it was not resolving some sites I wanted to see. So, I removed knot-resolver and installed dnsmasq-full, and configured the updater to keep it that way in /etc/updater/conf.d/user.lua.

I set up another Omnia a couple months ago without changing the resolver. And then last week, kresd started crashing. The symptom was that requests for DNS resolution would fail, and then I would SSH into the router and find that kresd was not running. I tried restarting kresd, but it would crash again after a couple days. I tried rebooting the router, but then kresd crashed again. When I tried to restart it, it froze the entire router. So, I removed knot-resolver and installed dnsmasq-full.

Why bother with knot-resolver anymore? The main reason when the Omnia was first created was DNSSEC, but now dnsmasq does DNSSEC, too. And dnsmasq has way more eyeballs on it and is much less likely to fail mysteriously. Maybe the project should just switch to dnsmasq.

I keep on reading complaints that knot does x and y, or crashes, or doesn’t do z…
But I myself haven’t had any problems on either of the two TOs I own, one since the very beginning and the other since the beginning of this year. With the first ISP, DNSSEC worked seamlessly; with the second it did not, but by then DNS over HTTPS was already available, so I don’t care anymore.
I have been running OpenWrt routers for about 10 years now and can’t figure out why it is necessary to make heavy adjustments to a resolver. Simple things like setting up DNS over HTTPS or configuring static routes or hosts, OK, but anything further…
Sorry, just my 2c, as I’m simply annoyed by all the knot-bashing.

To cut a long story short: why use knot-resolver? Because it simply works!

OpenWrt has used dnsmasq as the default resolver for many years now. Most Linux router makers use dnsmasq. Only CZ.NIC, the developer of knot-resolver, uses knot-resolver in a consumer router.

Since the resolver has been crashing after a couple days, I suspect the latest update has some mistake in the manual memory management that is exposed because my second Omnia is in a network with dozens of simultaneous users carrying many devices. A typical home network doesn’t have that many users.

Would you provide logs from the crashes, etc.? Ideally verbose logs, following https://doc.turris.cz/doc/en/howto/dnsdebug. By any chance, could some (other) process have filled /tmp or RAM?

BTW, knot-resolver is the resolver used by 1.1.1.1, so I wouldn’t say it has few eyeballs, given how new the project is. The main question: I certainly see advantages, but obviously I can’t be completely impartial. Turris switching away from knot-resolver (by default) seems very unlikely to me, but ultimately it wouldn’t be my decision anyway.

Yes, I was aware that Cloudflare uses Knot Resolver. I also know that they heavily modify and test the software that they use; and they use a modern cloud setup with anycast and load balancers, so a single server crashing wouldn’t disrupt their service.

I’m not sure if the device ran out of RAM or /tmp. All I can say is that it has been running dnsmasq-full for 4 days, and it has plenty of available memory.

I think debugging a crashing service on a device used by dozens of simultaneous users (who are at work providing a service to hundreds of clients) is not a prudent maneuver, but I guess I can reinstall knot-resolver on my first Omnia with merely 8 users and hope that it crashes.

On the Omnia, the kresd service does get restarted as expected; I just tried both kill and kill -9. I’m surprised that it apparently takes up to five seconds (roughly) for the process to reappear, so some queries would be lost, but that shouldn’t matter in practice. (Under systemd this seems to work better.) The cache is persistent across restarts by default, so there won’t be a slow startup period afterwards.

Well, I’m pretty sure that the implementation of DNSSEC in dnsmasq is buggy and not as robust as in other resolvers. Almost nobody uses it, because it is off by default. If you encounter temporary problems with Knot Resolver, the easiest solution is to switch to unbound, which is already installed; all you have to do is edit the configuration file /etc/config/resolver.

I see. After 1½ weeks, dnsmasq stopped responding to DNS requests on the network. So, I’m going to try knot-resolver with debugging turned on, and see how long it takes to fail. And hopefully find what it’s doing.
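
For anyone who wants a timestamp for the moment resolution stops working, here is a minimal sketch of a probe that could run alongside the resolver's own debug logging. It assumes Python 3 is available on the router or on a machine in the LAN; the names `probe` and `watch_dns`, the test hostname and the 30-second interval are all illustrative and not part of any Turris tooling.

```python
import socket
import time
from datetime import datetime


def probe(hostname, resolve=socket.getaddrinfo):
    """Do one lookup through the system resolver and return (ok, detail)."""
    try:
        results = resolve(hostname, None)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return False, str(exc)
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the resolved address.
    addresses = sorted({entry[4][0] for entry in results})
    return True, ", ".join(addresses)


def watch_dns(hostname="example.com", interval=30, rounds=None,
              log=print, resolve=socket.getaddrinfo):
    """Probe repeatedly; log a timestamped line only when the state changes."""
    last_ok = None
    done = 0
    while rounds is None or done < rounds:
        ok, detail = probe(hostname, resolve)
        if ok != last_ok:
            stamp = datetime.now().isoformat(timespec="seconds")
            log(f"{stamp} {'OK' if ok else 'FAIL'} {hostname}: {detail}")
            last_ok = ok
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval)
```

Calling `watch_dns()` with no arguments runs until interrupted. Because only state changes are logged, the output stays short, and the first FAIL line gives a time to look for in the resolver and kernel logs.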

Alright, something really is filling the RAM and triggering the oom-killer. The vexing part is that it runs occasionally, so most of the time I’m seeing the router with over 1 GB of available memory, and then in a flash some important daemon has been killed. I suspect the culprit is ucollect.
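
Since the memory spike is brief, sampling once a second and recording the largest processes whenever available memory dips is one way to catch the culprit in the act. The following is only a sketch, assuming Python 3 on the router and a kernel that reports `MemAvailable` in /proc/meminfo (it falls back to `MemFree` otherwise); the function names and the 200000 kB threshold are made up for illustration.

```python
import os
import time
from datetime import datetime


def parse_fields(text):
    """Split 'Key:  value' lines (/proc/meminfo, /proc/<pid>/status) into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def kilobytes(value):
    """Convert a value such as '5120 kB' to an int; return 0 if not numeric."""
    parts = value.split()
    return int(parts[0]) if parts and parts[0].isdigit() else 0


def available_kb(meminfo_text):
    """Return MemAvailable in kB, falling back to MemFree if it is missing."""
    fields = parse_fields(meminfo_text)
    return kilobytes(fields.get("MemAvailable", fields.get("MemFree", "0")))


def top_rss(limit=5, proc="/proc"):
    """Return the largest processes as (rss_kb, pid, name), biggest first."""
    found = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "status")) as handle:
                fields = parse_fields(handle.read())
        except OSError:
            continue  # the process exited while we were scanning
        rss = kilobytes(fields.get("VmRSS", "0"))  # kernel threads have no VmRSS
        found.append((rss, int(entry), fields.get("Name", "?")))
    return sorted(found, reverse=True)[:limit]


def watch_memory(threshold_kb=200000, interval=1, log=print):
    """Log the biggest processes whenever available memory drops below the threshold."""
    while True:
        with open("/proc/meminfo") as handle:
            free = available_kb(handle.read())
        if free < threshold_kb:
            stamp = datetime.now().isoformat(timespec="seconds")
            log(f"{stamp} available={free} kB top={top_rss()}")
        time.sleep(interval)
```

If the spike is faster than the sampling interval, this will miss it, so the kernel log entry written by the oom-killer remains the more reliable record of what was killed and when.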

Also, unbound is not installed by default.