DNSSEC temporarily failing

Hello,
I’m getting complaints from my family that “the internet does not work”…

Short:
Today I was able to catch a name resolution failure on a laptop connected via Ethernet to my Turris 1.0, while ping to an internet address still works:

root@:~# nslookup us-east-2.console.aws.amazon.com
;; connection timed out; no servers could be reached

dig:

root:~# dig us-east-2.console.aws.amazon.com

; <<>> DiG 9.11.19 <<>> us-east-2.console.aws.amazon.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 6768
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 8192
;; QUESTION SECTION:
;us-east-2.console.aws.amazon.com. IN	A

;; Query time: 2 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Mon Jan 31 20:03:39 CET 2022
;; MSG SIZE  rcvd: 61

ping:

root:~# ping 217.31.205.50
PING 217.31.205.50 (217.31.205.50): 56 data bytes
64 bytes from 217.31.205.50: seq=0 ttl=56 time=39.398 ms
64 bytes from 217.31.205.50: seq=1 ttl=56 time=21.014 ms
64 bytes from 217.31.205.50: seq=2 ttl=56 time=19.195 ms
^C
--- 217.31.205.50 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 19.195/26.535/39.398 ms

repeat the dig:

root:~# dig us-east-2.console.aws.amazon.com

; <<>> DiG 9.11.19 <<>> us-east-2.console.aws.amazon.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 61140
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 8192
;; QUESTION SECTION:
;us-east-2.console.aws.amazon.com. IN	A

;; Query time: 2 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Mon Jan 31 20:05:32 CET 2022
;; MSG SIZE  rcvd: 61

and old school nslookup:

;; connection timed out; no servers could be reached

So I logged in to Foris and disabled DNSSEC, and name resolution works:

root:~# nslookup us-east-2.console.aws.amazon.com
Server:		127.0.0.1
Address:	127.0.0.1#53

Name:      us-east-2.console.aws.amazon.com
us-east-2.console.aws.amazon.com	canonical name = gr.console-geo.us-east-2.amazonaws.com
Name:      gr.console-geo.us-east-2.amazonaws.com
gr.console-geo.us-east-2.amazonaws.com	canonical name = af2049b9c08c62706.awsglobalaccelerator.com
Name:      af2049b9c08c62706.awsglobalaccelerator.com
Address 1: 76.223.79.155
Address 2: 13.248.199.77
us-east-2.console.aws.amazon.com	canonical name = gr.console-geo.us-east-2.amazonaws.com
gr.console-geo.us-east-2.amazonaws.com	canonical name = af2049b9c08c62706.awsglobalaccelerator.com
root@molekula:~# dig us-east-2.console.aws.amazon.com

; <<>> DiG 9.11.19 <<>> us-east-2.console.aws.amazon.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 1786
;; flags: qr rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 8192
;; QUESTION SECTION:
;us-east-2.console.aws.amazon.com. IN	A

;; ANSWER SECTION:
us-east-2.console.aws.amazon.com. 59 IN	CNAME	gr.console-geo.us-east-2.amazonaws.com.
gr.console-geo.us-east-2.amazonaws.com.	60 IN CNAME af2049b9c08c62706.awsglobalaccelerator.com.
af2049b9c08c62706.awsglobalaccelerator.com. 300	IN A 76.223.79.155
af2049b9c08c62706.awsglobalaccelerator.com. 300	IN A 13.248.199.77

;; Query time: 752 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Mon Jan 31 20:06:12 CET 2022
;; MSG SIZE  rcvd: 195

root:~# cat /etc/turris-version 
3.11.23

Some time later, having found options like +trace in the forums, I re-enabled DNSSEC and, surprisingly, it worked.

One more thing I did: while name resolution with DNSSEC was not working on my Turris 1.0, I tested the same name resolution on the Omnia (TOS5, DNSSEC enabled), and it worked properly.

All the posts about DNSSEC that I found were about bad answers for a particular domain (BNP Paribas…), but this looks different to me, as one Turris was able to resolve the name with DNSSEC while the other was not.

Honestly, it looks to me like the same issue also happens on the Omnia with TOS5 (and previously TOS3), as the user-visible symptoms are the same, but this time I was able to connect to the Turris and run some tests.

The question is how to configure some diagnostics to capture log files that can be analyzed. I can hardly believe that two independent internet providers would suffer from the same (or a very similar) issue.

An intermittent issue is unlikely to be captured and analyzed by hand…

From a configuration point of view, I only ever used the Foris interface for DNS: initially to enable DNSSEC and forwarding to the CZ.NIC servers, later to disable forwarding, and for this test to disable DNSSEC and finally re-enable it. So there were no UCI commands and no manual edits of configuration files.

Long story:
What I was able to observe: switching WiFi off and on on an Android phone resolves it (bloody Android?… well, it seems not). The smart TV sometimes freezes (usually when ads interrupt the movie…).
The bad part is when that happens while using mobile banking…

With a laptop connected to the same WiFi, ping to a known internet address works, but we are rarely above 10 Mbps on the Turris 1.0, whose provider claims a 100 Mbps downlink (and according to Netmetr that is true; the provider uses a 5G modem). A second provider is connected to the Turris Omnia (recently upgraded to TOS5) with symmetric 20 Mbps connectivity (according to Netmetr, it works as well).

The only suspect in my mind (meaning I’ve ignored all more probable options…) was DNS resolving.

Neither of my providers supports DNSSEC, so I had been forwarding to the CZ.NIC validating resolvers. However, one day I received an email from my router saying that forwarding did not seem to work properly and suggesting to switch it off, so I’m not using forwarding now.

Thanks for any help, Ales

How did you set up DNS forwarding? Try changing …

ISPs should not matter unless you forward to the ISP’s DNS servers (or they actively intercept your DNS, but I’d consider that very bad).

There is the “Debugging DNS problems on Turris routers” page on the Turris wiki, but I suspect it won’t work well on Turris 1.x, as that model uses Unbound as the DNS daemon.

And yes, simple practical workaround attempts often include trying various forwarding targets and the mode without forwarding.
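
A minimal sketch of such a comparison, assuming `dig` is installed on the router. The helper names (`dig_status`, `query_status`, `compare_resolvers`) are made up for illustration, and the two public addresses are the CZ.NIC ODVR servers mentioned in this thread. The `+cd` (checking disabled) flag asks a validating resolver to skip DNSSEC validation, so a name that gives SERVFAIL normally but NOERROR with `+cd` points at validation rather than at connectivity.

```shell
#!/bin/sh
# Extract the "status:" value (NOERROR, SERVFAIL, ...) from dig output on stdin.
dig_status() {
    sed -n 's/.*status: \([A-Z]*\),.*/\1/p' | head -n 1
}

# Query one name against one resolver; extra dig flags may follow.
# Prints TIMEOUT when dig returned no reply header at all.
query_status() {
    name=$1; server=$2; shift 2
    status=$(dig +time=3 +tries=1 "$@" "@$server" "$name" 2>/dev/null | dig_status)
    echo "${status:-TIMEOUT}"
}

# Compare the local resolver with and without validation, then the ODVR servers.
compare_resolvers() {
    name=$1
    echo "local:      $(query_status "$name" 127.0.0.1)"
    echo "local +cd:  $(query_status "$name" 127.0.0.1 +cd)"
    for server in 193.17.47.1 185.43.135.1; do
        echo "$server: $(query_status "$name" "$server")"
    done
}

# Usage: compare_resolvers us-east-2.console.aws.amazon.com
```

If only the plain local query fails, the local validator is the likely suspect; if every line times out, the problem is more likely upstream connectivity.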

Thanks for the hints.
Forwarding has been switched off since 2020-08-31, when I received this notification:

Error notification

Your internet provider’s DNS servers are not working quite well; they probably do not support DNSSEC. Forwarding has therefore been turned off and your router will now resolve DNS queries on its own.

This surprises me, since I had configured the CZ.NIC ODVR servers 193.17.47.1 and 185.43.135.1. It happens with both ISPs.

Since I also have an Omnia with TOS5, I hope some diagnostics are possible. The problem for me is that the issue is intermittent, so without additional logging it would be hard for me to know what to search for in the logs.
Is there a tool I can install and use the way I use Netmetr for internet speed checks and ping in the collectd graphs, one that periodically checks that DNS resolving works and, if it detects a failure, collects log files for support?

Thanks Ales

/etc/resolver/resolver-debug.sh start

/etc/resolver/resolver-debug.sh stop

/etc/resolver/resolver-debug.sh print-logs
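
For the periodic check asked about above, a rough sketch could look like the following. It assumes `dig` is available and that debugging was already enabled with the `start` command; the check name, the log directory and the function names are placeholders for illustration, not Turris defaults.

```shell
#!/bin/sh
# Hypothetical watchdog: the name and paths below are placeholders.
CHECK_NAME="${CHECK_NAME:-www.nic.cz}"
LOG_DIR="${LOG_DIR:-/tmp/dns-watch}"
DEBUG_SCRIPT="${DEBUG_SCRIPT:-/etc/resolver/resolver-debug.sh}"

# Succeeds only if the local resolver answers with status NOERROR.
resolves() {
    dig +time=3 +tries=1 @127.0.0.1 "$1" 2>/dev/null | grep -q 'status: NOERROR'
}

# Run one check; on failure, record a timestamp and save the resolver debug log.
# The check command is a parameter so the logic can be exercised offline.
check_once() {
    check_cmd=${1:-resolves}
    if "$check_cmd" "$CHECK_NAME"; then
        return 0
    fi
    mkdir -p "$LOG_DIR"
    stamp=$(date +%Y%m%d-%H%M%S)
    echo "$stamp resolution of $CHECK_NAME failed" >> "$LOG_DIR/failures.log"
    if [ -x "$DEBUG_SCRIPT" ]; then
        "$DEBUG_SCRIPT" print-logs > "$LOG_DIR/resolver-$stamp.log" 2>&1
    fi
    return 1
}
```

To use it, save the functions to a file, add a final line that calls `check_once`, and run that file from cron every few minutes. Note that `/tmp` is usually cleared on reboot, so a persistent `LOG_DIR` may be preferable.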

Thanks. When I experience such an issue again, I’ll try the debug collection commands mentioned.

LuCI - Scheduled Tasks

Preventive DNS debugging: if an error occurs, the trace is already saved.

debug DNS error
31 23 * * * /etc/resolver/resolver-debug.sh stop 
33 23 * * * /etc/resolver/resolver-debug.sh start