TOS 7 unbound troubles

I have been using unbound as my primary resolver since TOS 5 or 6. It is configured to forward everything via DoT to a private recursor (which also runs unbound, currently 1.19.3). Everything worked like a charm…

Until my update to TOS 7. At first everything seemed to work, but after a short while I got unexpected SERVFAIL results with the available unbound 1.17.1. Raising the verbosity showed that the TCP connections were being reset (RST) from time to time.

I then prepared a Turris build environment and built my own unbound 1.19.3 based on the Turris package. And it got worse: the TLS connection comes up (and is verified), and then the receiving unbound complains about “short packages” and other oddities and sends an RST right away.

And now the point where I got completely confused… I tried to roll back to the official unbound 1.17.1 by uninstalling and reinstalling it… and now this unbound shows exactly the same behaviour. No success with DoT at all.

Then I tried switching from DoT to classic DNS over UDP/TCP port 53… UDP seems to work, but every TCP lookup fails with odd results as well. Because of DNSSEC, TCP is necessary for a lot of lookups. And yes, both UDP and TCP lookups work with or without DNSSEC when using “dig”. So no basic setup problem.
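For anyone who wants to repeat the UDP vs. TCP comparison, here is a rough sketch of how the “dig” checks could be scripted. The helper name `query_status` and the example address/name are made up for illustration; only standard dig options are used, and the response code is read from the header line dig prints.

```shell
# query_status RESOLVER NAME TRANSPORT
# TRANSPORT is "+tcp" or "+notcp". Prints the DNS response code
# (NOERROR, SERVFAIL, ...) or nothing if no answer came back.
# Hypothetical helper, not part of unbound or Turris OS.
query_status() {
    dig @"$1" "$2" A +dnssec "$3" +noall +comments +time=3 +tries=1 2>/dev/null |
        sed -n 's/.*status: \([A-Z]*\),.*/\1/p' |
        head -n 1
}

# compare_transports RESOLVER NAME
# Prints one line per transport so a TCP-only failure stands out.
compare_transports() {
    echo "udp: $(query_status "$1" "$2" +notcp)"
    echo "tcp: $(query_status "$1" "$2" +tcp)"
}
```

Called as `compare_transports 192.0.2.53 example.com` (placeholder address and name), an empty or SERVFAIL result on the tcp line only would match what I saw with the local unbound.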

Switching to kresd immediately fixes all issues, even with DoT. So my outside recursor works perfectly fine, but the local unbound is a complete mess and I can’t figure out why.

I also tried to build unbound without TFO support, but same result.

I’m using (custom-built) unbound in a lot of setups (mostly RHEL) and have never had troubles like that.

Has anybody seen similar behaviour? Any clues or fixes I could look into?

Wow. It seems I found the problem and the fix…

opkg install --force-reinstall libevent2

The “verbosity 4” logs of unbound, the fact that libevent was the “only untouched component” used only by unbound, and the “iftop” oddities reported here told me to give it a try.

And to answer the obvious question… my eMMC says it’s not dead yet:

eMMC Life Time Estimation A [EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_A]: 0x01
eMMC Life Time Estimation B [EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_B]: 0x04
eMMC Pre EOL information [EXT_CSD_PRE_EOL_INFO]: 0x01

And “btrfs scrub” didn’t report errors either.

unbound 1.19.3 is up and running now. It would be nice if the Turris people could provide it upstream as well, since there are CVEs for lower versions.

Now that’s interesting…

root@turris:/mnt/snapshot-@370# ls -l ./usr/lib/*
-rw-r--r--    1 root     root        205007 Feb 14  2021 ./usr/lib/
root@turris:/mnt/snapshot-@370# ls -l /usr/lib/*
-rw-r--r--    1 root     root        213203 Feb 14  2021 /usr/lib/

… comparing the “reinstalled” version to the one in place after the update to TOS7 (still found in snapshot 370) shows quite some difference.

stat shows the exact same modification date for both:
Modify: 2021-02-14 19:38:15.000000000 +0100

Comparing it to even older snapshots, it looks like it was not updated at all during the TOS7 upgrade, and the TOS6 version is responsible for the oddities when used on TOS7.
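Since the modification time was identical and only the size gave it away, a checksum comparison is probably the safer test. A minimal sketch, assuming the snapshot is mounted as above; the function name `compare_file` is made up, and the library path is left as an argument.

```shell
# compare_file SNAPSHOT_ROOT RELATIVE_PATH [LIVE_ROOT]
# Compares the SHA-256 of one file under a mounted snapshot with the
# same path on the live system (LIVE_ROOT defaults to "/").
# Prints "same", "differs" or "missing". Hypothetical helper.
compare_file() {
    snap="$1/$2"
    live="${3:-}/$2"
    [ -f "$snap" ] && [ -f "$live" ] || { echo "missing"; return 2; }
    a=$(sha256sum < "$snap" | cut -d' ' -f1)
    b=$(sha256sum < "$live" | cut -d' ' -f1)
    if [ "$a" = "$b" ]; then
        echo "same"
    else
        echo "differs"
        return 1
    fi
}
```

For the case above the call would be `compare_file /mnt/snapshot-@370 usr/lib/<file>`, with the file in question filled in.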

Migration in TOS from release to newer release is magic.

Normally openwrt uses sysupgrade and doesn’t support migrations.

I wonder if there is anyone who hasn’t reflashed since TOS 3.0 and survived all migrations so far.

I think I used a medkit for TOS5 and updates since then.
I searched for packages with same version numbers in openwrt 19 and TOS7 base and packages repos and found some more candidates for troubles IMVHO.
At least most of them installed files with different sizes after “reinstalling” them.
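In case someone wants to repeat that search, this is roughly how it could be scripted. It is only a sketch: it assumes both repositories' indexes were downloaded as plain-text `Packages` files with `Package:`, `Version:` and `SHA256sum:` fields, and the function names are made up.

```shell
# index_to_table FILE
# Turns an opkg-style Packages index into "name version sha256" lines.
index_to_table() {
    awk '
        /^Package: /   { pkg = $2 }
        /^Version: /   { ver = $2 }
        /^SHA256sum: / { sum = $2 }
        /^$/ { if (pkg != "") print pkg, ver, sum; pkg = ver = sum = "" }
        END  { if (pkg != "") print pkg, ver, sum }
    ' "$1"
}

# same_version_different_build OLD_INDEX NEW_INDEX
# Lists packages whose version string is identical in both indexes
# although the package checksum differs, i.e. candidates that an
# upgrade might skip as "already installed".
same_version_different_build() {
    {
        index_to_table "$1" | sed 's/^/old /'
        index_to_table "$2" | sed 's/^/new /'
    } | awk '
        $1 == "old" { ver[$2] = $3; sum[$2] = $4; next }
        ($2 in ver) && ver[$2] == $3 && sum[$2] != $4 { print $2, $3 }
    '
}
```

A call like `same_version_different_build Packages.old Packages.new` (placeholder file names) prints one "name version" line per candidate. It only flags candidates; whether the installed files really differ still has to be checked on the router.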


I’m wondering where the old libevent 2.1.12-1 came from since openwrt 19 contained 2.1.11-7. But I did not search all repositories yet.

OT: That would be me =) However, I admit to a few hot moments with a serial cable…