Intermittent DNS resolving issues with Turris Omnia running 6.0.1

blinkenlightsenjoyer · October 30, 2022, 7:00pm

I’m having DNS-resolution issues, and possibly other connectivity issues. Usually, it manifests in DNS failures in my web browser (“could not find www.duckduckgo.com”). After waiting a bit (say, 30–90 seconds), DNS resolves and I can get to the website.

I had a look at these other pages and they seemed somewhat helpful:

- Not connecting to applications like Discord (although I’m not sure if the issue is “didn’t connect to the website in the past 10 minutes or so”)
- Not connecting to applications like Discord - #8 by vcunat (I disabled DNS over IPv6 since Frontier doesn’t support IPv6 out here at all)

I’ve changed my DNS resolver to be CZ.NIC. After about a day of all this, all of my DNS problems seem to be fixed.

What changed between 5.x and 6.x that might cause intermittent DNS-resolution failures with my ISP? It’s Frontier, if it matters. I considered disabling DNSSEC temporarily, but the big scary “you might be vulnerable to DNS spoofing attacks” warning convinced me I shouldn’t turn DNSSEC off for this, at least not without a better idea of the risks involved (are the risks lower if I’m using my provider’s DNS instead of a DNS provider like Cloudflare or CZ.NIC?)

vcunat · October 30, 2022, 7:31pm

There’s nothing like that, I believe, assuming it was a relatively recent 5.x and that you don’t have a Turris 1.x.

Probably. If the provider does verify DNSSEC for you, there’s a much shorter path to attack between you and your provider than between you and e.g. Cloudflare.

blinkenlightsenjoyer · October 30, 2022, 7:41pm

It was probably a recent 5.x, since I have an Omnia and I didn’t pause upgrades on it ever.

Would it be a decent idea to disable DNSSEC and try my ISP’s resolver again? If it starts messing up, that would be something that they ought to be told about, right?

All the DNSSEC that an Omnia does is done via OpenWRT, right? If I debug this for/with my ISP, I’d like to be able to tell them “your DNS service is messing up with recent builds of OpenWRT”, not “your DNS service is messing up with an indie $400 router”.

vcunat · October 30, 2022, 8:16pm

No, the DNSSEC part is actually added value in comparison to usual OpenWrt. The SW doing DNS(SEC) is written by cz.nic (almost exclusively), with me being one of the devs.

Actually my preferred option is to turn off forwarding, i.e. not involve any other resolver at all (ISP’s or anyone else’s). But it’s your choice, of course.

ssdnvv · October 30, 2022, 9:10pm

Same issue for me - resolving this forum takes ages via Wi-Fi and milliseconds via mobile internet.
I am using Cloudflare as my DNSSEC provider.

Personal gut feeling: I wouldn’t dare claim my ISP (1&1 / Germany) is more trustworthy than Cloudflare. They all sell anything and everything. If there is anything sensitive, make sure you are using at least Tor…

Pepe · October 30, 2022, 9:25pm

Well, wait a second. In your case, does this happen only with this forum, or do you notice it elsewhere too? I have noticed that some threads here take some time to load, so I would not say this is the same issue OP has, since it only affects this forum.

MiKe · October 30, 2022, 9:34pm

Sorry for the off-topic, but I’ve also encountered the same issue with the Turris forum as you. I’ll try to monitor whether it’s also happening on other sites.

ssdnvv · October 30, 2022, 9:54pm

Maybe.
I’ll keep an eye on it.

vcunat · October 31, 2022, 6:55am

I do encounter long loads on forum.turris.cz sometimes, but not on other websites.

blinkenlightsenjoyer · October 31, 2022, 5:25pm

That sounds nice in theory for privacy, but what happens if I visit a totally new website for the first time and my router doesn’t know what IP address to use to actually send data to it?

blinkenlightsenjoyer · October 31, 2022, 5:35pm

I’d consider using a different resolver than my ISP’s (and liking it, as opposed to thinking of it as a temporary measure), but neither of the domestic options I’m familiar with (Google and Cloudflare) is one I trust, mainly because of their size and how they like to throw their weight around in the cultural sphere. I try to avoid using their products wherever possible.

vcunat · October 31, 2022, 5:44pm

The DNS service on Omnia replaces the one provided by your ISP or Cloudflare or whoever.

dhopfm · October 31, 2022, 6:51pm

Your router will still use DNS to resolve the domain name. But instead of forwarding this request to another DNS server, it will take care of the resolution itself. It starts by querying root DNS servers which then point it to relevant subordinate DNS servers. The list of root DNS server addresses is built into Knot (and any other DNS resolver).
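To make the "start at the root and follow the pointers" idea concrete, here is a toy sketch in Python. It is not how Knot Resolver is implemented: the zone table, the server names (`root`, `tld-com`, `ns-example`) and the address are all made up for illustration, and nothing is sent over the network.

```python
# Toy model of iterative resolution. All zone data is hypothetical
# (documentation-range address); no real DNS queries are made.

# Each "server" either refers a name suffix to another server or answers directly.
ZONES = {
    "root": {"delegations": {"com.": "tld-com", "cz.": "tld-cz"}, "answers": {}},
    "tld-com": {"delegations": {"example.com.": "ns-example"}, "answers": {}},
    "tld-cz": {"delegations": {}, "answers": {}},
    "ns-example": {"delegations": {}, "answers": {"www.example.com.": "192.0.2.10"}},
}


def resolve(name, start="root"):
    """Follow referrals from the root until some server answers.

    Returns (address, path), where path lists the servers that were asked.
    """
    server = start
    path = []
    while True:
        path.append(server)
        zone = ZONES[server]
        if name in zone["answers"]:
            return zone["answers"][name], path
        # Pick the most specific delegation that covers the name.
        matches = [
            suffix for suffix in zone["delegations"]
            if name == suffix or name.endswith("." + suffix)
        ]
        if not matches:
            raise LookupError(f"{server} has no answer or referral for {name}")
        server = zone["delegations"][max(matches, key=len)]


if __name__ == "__main__":
    address, path = resolve("www.example.com.")
    print(address, "via", " -> ".join(path))
```

In this model the root is only asked who handles `com.`; the actual address comes from the server at the end of the chain.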

blinkenlightsenjoyer · October 31, 2022, 7:16pm

Thanks for the explanation. I was about to ask.

Shouldn’t I feel at least a little bad for using root DNS servers like this, just like I should feel at least a little bad for using a Stratum 1 NTP server directly instead of time.windows.com or time.apple.com?

vcunat · October 31, 2022, 7:41pm

No, not at all. You use root servers only very little – to find the TLD servers (top-level domain, e.g. .com or .cz). That’s cached for 24h.
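A minimal sketch of why the cache keeps root traffic low, assuming the 24h figure from the post above. `TtlCache`, `tld_servers` and the `ask_root` callback are hypothetical names for illustration, not part of any real resolver.

```python
import time


class TtlCache:
    """Tiny time-to-live cache; the clock is injectable so it can be tested."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # key -> (value, expiry time)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Entry is stale: drop it so the caller asks again.
            del self._entries[key]
            return None
        return value

    def put(self, key, value, ttl_seconds):
        self._entries[key] = (value, self._clock() + ttl_seconds)


ROOT_TTL = 24 * 60 * 60  # 24h, as stated in the post above


def tld_servers(tld, cache, ask_root):
    """Return the servers for a TLD, asking the root only on a cache miss."""
    cached = cache.get(tld)
    if cached is not None:
        return cached
    servers = ask_root(tld)  # hypothetical callback standing in for a root query
    cache.put(tld, servers, ROOT_TTL)
    return servers
```

With this shape, every lookup under `.com` within the TTL reuses the cached entry, so the root is consulted once per TLD per day rather than once per website.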

iron-maiden · October 31, 2022, 10:20pm

I once tried pure Dnsmasq with Stubby and Cloudflare; it worked very well.

WayOutWest · November 7, 2022, 2:01am

FYI, I did have DNS resolution issues too after installation of 6.0.1 (I’m using Cloudflare 1.1.1.1) but these were resolved after I performed another router reboot and computer restart.

All good now.

sgusa · January 10, 2023, 6:07pm

I’ve had somewhat (?maybe?) similar issues for some time now.
I’ve only ever noticed this on one single client (my work laptop) - but it’s possible I’ve only noticed it here because I’m on that machine 8-10 hours a day.

Intermittently, I seem to lose DNS on this device.
It only seems to happen maybe 1 or 2 days a month - other days seem to be just fine.
On days that the issue occurs, it seems to happen repeatedly, and the next day is fine again.

I suspect that it is somehow related to the DHCP lease (my lease length is 12 hours),
and that the “it works fine the next day” is a result of some process that finally occurs after another 12 hours…

Anyway. Just happened again, and I noticed something in the logs.

A lease renewal by another device had just occurred before the device I was using started having issues.
That other device’s renewal (and a third device’s renewal as well) did not result in a log entry for:
turris dhcp_host_domain_ng.py: Refresh kresd leases

I’m likely grasping at straws, but maybe this would give some insight to someone more technical than myself.


Jan 10 11:22:33 turris dnsmasq-dhcp[5713]: DHCPREQUEST(br-guest-turris) 10.111.222.208 <MAC1> 
Jan 10 11:22:33 turris dnsmasq-dhcp[5713]: DHCPACK(br-guest-turris) 10.111.222.208  <MAC1> Roku3

** A device renews its lease. Note the lack of a “Refresh kresd leases” call.

** I’ve suddenly lost routing on my work laptop. I manually recycle my client (disconnect, connect).
** I believe that only DNS functionality / DHCP is affected, but I need to confirm.


Jan 10 17:26:53 turris hostapd: wlan1: AP-STA-DISCONNECTED  <MAC2> 
Jan 10 17:26:53 turris hostapd: wlan1: STA <MAC2> IEEE 802.11: disassociated
Jan 10 17:26:54 turris hostapd: wlan1: STA <MAC2> IEEE 802.11: deauthenticated due to inactivity (timer DEAUTH/REMOVE)
Jan 10 17:26:57 turris hostapd: wlan1: STA <MAC2> IEEE 802.11: associated (aid 2)
Jan 10 17:26:57 turris hostapd: wlan1: AP-STA-CONNECTED <MAC2>
Jan 10 17:26:57 turris hostapd: wlan1: STA <MAC2> RADIUS: starting accounting session 7BA1B9C4C5D694B6
Jan 10 17:26:57 turris hostapd: wlan1: STA <MAC2> WPA: pairwise key handshake completed (RSN)
Jan 10 11:26:57 turris dnsmasq-dhcp[5713]: DHCPREQUEST(br-lan) 192.168.6.137 <MAC2>
Jan 10 11:26:57 turris dnsmasq-dhcp[5713]: DHCPACK(br-lan) 192.168.6.137 <MAC2> PM-11709
Jan 10 17:26:57 turris dhcp_host_domain_ng.py: DHCP update hostname [PM-11709,192.168.6.137]
Jan 10 17:26:57 turris dhcp_host_domain_ng.py: Refresh kresd leases

^^ Note the call to “Refresh kresd leases” after I manually recycle the connection.


** A few minutes later, yet another device on my network (a printer this time) renews its lease.

Jan 10 11:33:10 turris dnsmasq-dhcp[5713]: DHCPREQUEST(br-lan) 192.168.6.165 <MAC3>
Jan 10 11:33:10 turris dnsmasq-dhcp[5713]: DHCPACK(br-lan) 192.168.6.165 <MAC3> HP81DD5F
Jan 10 17:35:01 turris crond[11532]: (root) CMD (/usr/bin/notifier)
Jan 10 17:35:01 turris crond[11531]: (root) CMDOUT (There is no message to send.)
Jan 10 17:35:01 turris crond[11531]: (root) CMDEND (/usr/bin/notifier)

^^ Again, in this case, the “Refresh kresd leases” call does not occur.


** I’ve lost routing on my work laptop again.  I manually recycle my client (disconnect, connect)


Jan 10 17:35:07 turris hostapd: wlan1: AP-STA-DISCONNECTED <MAC2>
Jan 10 17:35:07 turris hostapd: wlan1: STA <MAC2> IEEE 802.11: disassociated
Jan 10 17:35:08 turris hostapd: wlan1: STA <MAC2> IEEE 802.11: deauthenticated due to inactivity (timer DEAUTH/REMOVE)
Jan 10 17:35:10 turris hostapd: wlan1: STA <MAC2> IEEE 802.11: authenticated
Jan 10 17:35:10 turris hostapd: wlan1: STA <MAC2>IEEE 802.11: associated (aid 2)
Jan 10 17:35:10 turris hostapd: wlan1: AP-STA-CONNECTED <MAC2>
Jan 10 17:35:10 turris hostapd: wlan1: STA <MAC2> RADIUS: starting accounting session F1395D653B9B982C
Jan 10 17:35:10 turris hostapd: wlan1: STA <MAC2> WPA: pairwise key handshake completed (RSN)
Jan 10 11:35:10 turris dnsmasq-dhcp[5713]: DHCPREQUEST(br-lan) 192.168.6.137 <MAC2>
Jan 10 11:35:10 turris dnsmasq-dhcp[5713]: DHCPACK(br-lan) 192.168.6.137 <MAC2> PM-11709
Jan 10 17:35:11 turris dhcp_host_domain_ng.py: DHCP update hostname [PM-11709,192.168.6.137]
Jan 10 17:35:11 turris dhcp_host_domain_ng.py: Refresh kresd leases

^^ again, a manual recycle of the lease results in a call to “refresh kresd leases” after DHCPACK
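The manual comparison above could be automated with a short script. This is only a sketch: the regular expression is derived from the log lines quoted in this post, and the `window` of three following lines is an arbitrary choice. Lines are compared in file order rather than by timestamp, because the dnsmasq entries in the excerpt appear to be stamped six hours apart from the other daemons.

```python
import re

# Pattern based on the dnsmasq-dhcp lines quoted above.
DHCPACK = re.compile(
    r"dnsmasq-dhcp\[\d+\]: DHCPACK\((?P<iface>[^)]+)\)\s+(?P<ip>\S+)"
)
REFRESH = "Refresh kresd leases"


def acks_without_refresh(lines, window=3):
    """Return (iface, ip) for each DHCPACK not followed by a refresh line.

    `window` is how many subsequent non-empty lines are checked for the
    "Refresh kresd leases" message (an arbitrary value, adjust as needed).
    """
    lines = [ln for ln in lines if ln.strip()]
    missing = []
    for i, line in enumerate(lines):
        match = DHCPACK.search(line)
        if not match:
            continue
        following = lines[i + 1 : i + 1 + window]
        if not any(REFRESH in ln for ln in following):
            missing.append((match.group("iface"), match.group("ip")))
    return missing


if __name__ == "__main__":
    import sys

    # Usage: python script.py < saved_log.txt
    for iface, ip in acks_without_refresh(sys.stdin):
        print(f"DHCPACK on {iface} for {ip} had no refresh nearby")
```

Run against the excerpts above, this would flag the Roku and printer renewals and pass the two laptop reconnects, which matches what was spotted by eye.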

vcunat · January 10, 2023, 6:41pm

dhcp_host_domain_ng.py just provides resolving of foo.lan names. That’s most likely not what you’re interested in.

sgusa · January 10, 2023, 7:17pm

Thanks. I figured it was a long shot, but wanted to post just in case.

I think the actual issue is on the client, and not in the Turris router.
(This thing is full of security management software controlled by company IT, including network, VPN, etc… and no other machine in my network has issues…)