Turris OS 3.11 is out!

Pepe · December 14, 2018, 1:10pm

Turris OS 3.11: miniupnpd no longer starts at router boot

For you and others who have this issue (e.g. @Jack_Konings, @Sammy_cda): we made some changes in the upcoming version of Turris OS. We’re preparing to include an updated version of miniupnpd. We’re looking into it.

If I missed any other issues, I’d appreciate it if you or somebody else could let me know so we can get them sorted.

Pepe · December 14, 2018, 1:22pm

You can apply the changes from the commit to /etc/init.d/kresd. But I’m more curious why your router doesn’t resolve. Did you set up DNS over TLS following the community documentation? If so, you need to make the required changes, which are described in the documentation: https://doc.turris.cz/doc/en/public/dns_knot_misc#using_dns_over_tls

winkler · December 14, 2018, 2:13pm

I did

btrfs dev stat /dev/mmcblk0p2

on both boxes. And both have the same output:

root@turris1:~# btrfs dev stat /dev/mmcblk0p2
[/dev/mmcblk0p2].write_io_errs    0
[/dev/mmcblk0p2].read_io_errs     0
[/dev/mmcblk0p2].flush_io_errs    0
[/dev/mmcblk0p2].corruption_errs  0
[/dev/mmcblk0p2].generation_errs  0

On one box I no longer see those parent transid lines. And if I understand you correctly, if the numbers match, it is a standard transaction in progress and nothing harmful, right?
But the second one prints this at the beginning of the output of “schnapps list”, followed by a correct list of snapshots.

root@turris1:~# schnapps list
Warning, could not drop caches
Warning, could not drop caches
parent transid verify failed on 84262912 wanted 517972 found 514186
parent transid verify failed on 84262912 wanted 517972 found 514186
Ignoring transid failure
ERROR: child eb corrupted: parent bytenr=123633664 item=334 parent level=1 child level=2
Warning, could not drop caches

After a while it looks like:

root@turris1:~# schnapps list
Warning, could not drop caches
Warning, could not drop caches
parent transid verify failed on 123961344 wanted 518196 found 518131
parent transid verify failed on 123961344 wanted 518196 found 518131
Ignoring transid failure
Couldn't setup extent tree
Warning, could not drop caches

For btrfs scrub, I assume it is a good idea to first have a good snapshot safely exported in case something goes wrong. So I have to do some homework first. And maybe I will wait for 3.11.1 and see whether it happens to solve anything.

And no, I have not had a power outage for at least two months.
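
The `btrfs dev stat` counters shown above can also be totalled in a script, for example to run the check on both boxes. This is a minimal sketch; `sum_btrfs_errs` is a hypothetical helper name, and it only assumes the output format printed above (one counter per line, with the value in the last column).

```shell
#!/bin/sh
# Sum the error counters from `btrfs dev stat` output read on stdin.
# Each input line is expected to look like:
#   [/dev/mmcblk0p2].write_io_errs    0
sum_btrfs_errs() {
    # Add up the last field of every counter line; print 0 for empty input.
    awk '/_errs/ { total += $NF } END { print total + 0 }'
}

# On the router (device path taken from the post above):
#   btrfs dev stat /dev/mmcblk0p2 | sum_btrfs_errs
```

Note that in the output above all counters are zero while `schnapps list` still reports transid failures, so a total of zero does not by itself rule out problems.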

ejb · December 14, 2018, 2:22pm

Thanks, @Pepe. Yes, I set up DNS over TLS (using Cloudflare), and that seems to work now after another reboot.

In /etc/init.d/kresd I see:

USERNAME=kresd
GROUP=kresd

But “kresd” is not present in /etc/passwd or in /etc/group. Hence, kresd is running as root. Sounds like I should change that so it runs as kresd.

AreYouLoco · December 14, 2018, 4:26pm

@ejb:
groupadd kresd
useradd -g kresd kresd

That should solve the issue…
EDIT:
I don’t know what the correct GID number is, though.
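
Before and after adding the account, it may help to check what is actually present in /etc/passwd and /etc/group. This is a minimal sketch; `has_entry` and `check_kresd` are hypothetical helper names, not part of Turris OS.

```shell
#!/bin/sh
# Succeed if a passwd- or group-style file ($2) has an entry named $1.
has_entry() {
    grep -q "^$1:" "$2"
}

# Report whether the kresd user and group exist.
# The file paths default to the system files and can be overridden.
check_kresd() {
    passwd_file="${1:-/etc/passwd}"
    group_file="${2:-/etc/group}"

    if has_entry kresd "$passwd_file"; then
        echo "user kresd: present"
    else
        echo "user kresd: missing"
    fi

    if has_entry kresd "$group_file"; then
        echo "group kresd: present"
    else
        echo "group kresd: missing"
    fi
}

# Usage on the router:
#   check_kresd
```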

jiri.kunc · December 14, 2018, 4:49pm

To finish installing Turris 3.11, I stopped the ddns service. The reboot was successful. Now I have Turris 3.11. Can I start the ddns service again now?

renne · December 14, 2018, 4:53pm

Obviously the Nextcloud update didn’t go through. Both packages - nextcloud-install and nextcloud - are at version 14.0.4-0, while config.php says 'version' => '13.0.7.2'.

If http://192.168.1.1/nextcloud/updater/ is opened, it shows the message

## Authentication

To login you need to provide the unhashed value of "updater.secret" in your config file.

If you don't know that value, you can access this updater directly via the Nextcloud admin screen or generate your own secret:

`php -r '$password = trim(shell_exec("openssl rand -base64 48"));if(strlen($password) === 64) {$hash = password_hash($password, PASSWORD_DEFAULT) . "\n"; echo "Insert as \"updater.secret\": ".$hash; echo "The plaintext value is: ".$password."\n";}else{echo "Could not execute OpenSSL.\n";};'`

.

root@turris:~# pkgupdate
WARN:Script file:///usr/share/updater/localrepo/localrepo.lua not found, but ignoring its absence as requested
WARN:Requested package luci-i18n-ddns-en that is missing, ignoring as requested.

It’s a Turris Omnia 2GB WLAN.

P.S.: I need my contacts and calendar working, again …

sammy_cda · December 14, 2018, 5:51pm

Thanks for the update.

Jack_Konings · December 14, 2018, 8:13pm

Hi Pepe,

Thanks for the update.

This afternoon the TO experienced a total DNS failure (cannot find DNS server); no computer on my network was able to connect to the internet. At the time DNS was done via Cloudflare (TLS). As I was not home at the time, I could not investigate this further. My son rebooted the TO and everything was fine again.

As a precaution I disabled Cloudflare (TLS) and forwarding. As a byproduct, this seemed to resolve my other issue, the slowness of LuCI. Both seem related.

tonyquan · December 14, 2018, 8:31pm

Yes, you can re-enable DDNS afterwards.

vcunat · December 14, 2018, 9:31pm

It sounds really weird for this pair to be related (DNS over TLS and LuCI speed).

anon50890781 · December 14, 2018, 10:22pm

@pepe I just discovered that TO has forcibly disabled DNS over TLS for unbound users via the daemon’s init script.

I already voiced concerns about that when feedback on the RC was invited.

Despite https://gitlab.labs.nic.cz/turris/turris-os-packages/issues/213, unbound performs DoT opportunistically, as opposed to verified, which is still preferable to no DoT at all.

ssdnvv · December 14, 2018, 11:32pm

After some back and forth it finally worked, after:

deleting the old DoT (Cloudflare) custom config
disabling ddns, adblock and openvpn
creating the kresd user + group (thanks to @AreYouLoco)

What I’m curious about: where do I find the updater log when running pkgupdate? /usr/share/updater/updater-log doesn’t show it. Did you change the file/path where it is stored?

Jack_Konings · December 15, 2018, 9:39am

Hi Vcunat,

I’m just observing.
There seems to be a connection between the two.

ssdnvv · December 15, 2018, 12:44pm

Unfortunately, 3.11 is as bad as 3.10.x when it comes to DNS resolving.
When I tried to read the news this morning on my smartphone (mac2 in the log below), I got no connection, and neither did any other client in my network, whether connected via cable or WiFi.

History:
I restarted the TO at 00:24 CET after updating to 3.11.
I followed your new guide (for 3.11) for DNS over TLS with Cloudflare. It worked. I went to bed.

At 01:50 CET the automatic forced disconnection (which happens once a day, imposed by my provider) occurred on my router (Fritz!Box 7412), and with that things started going funny.
In syslog I get the following spam (a sample from this morning) with every DHCP request from a client:

2018-12-15 11:56:44 info dnsmasq-dhcp[3966]: DHCPREQUEST(br-iot)  <ip for mac1> <mac1> 
2018-12-15 11:56:44 info dnsmasq-dhcp[3966]: DHCPACK(br-iot) <ip for mac1> <mac1> <static lease hostname for mac1>
2018-12-15 11:56:44 info dhcp_host_domain_ng.py[]: DHCPv4 new lease
2018-12-15 11:56:44 warning dhcp_host_domain_ng.py[]: Add_lease, hostname check failed
2018-12-15 11:56:44 warning dhcp_host_domain_ng.py[2565]: Last message 'Add_lease, hostname ' repeated 11 times, suppressed by syslog-ng on Router
2018-12-15 11:56:44 info dhcp_host_domain_ng.py[]: DHCP update hostname [old,<static lease hostname for mac1>,<ip for mac1>]
2018-12-15 11:56:44 debug dnsmasq-script[3966]: uci: Entry not found
2018-12-15 11:56:44 err dhcp_host_domain_ng.py[]: Kresd socket failed:<class 'socket.error'>,[Errno 104] Connection reset by peer
2018-12-15 11:56:44 err dhcp_host_domain_ng.py[]: Wrong host format '/tmp/kresd/hints.tmp' in host file 127.0.0.1 localhost 
2018-12-15 11:56:44 err dhcp_host_domain_ng.py[]: Kresd socket failed:<class 'socket.error'>,[Errno 104] Connection reset by peer
2018-12-15 11:56:44 err dhcp_host_domain_ng.py[]: Wrong host format '/tmp/kresd/hints.tmp' in host file <ip for static host - derived from /etc/hosts>.<my TLD>

followed by several more errors with Errno 104

2018-12-15 11:56:44 err dhcp_host_domain_ng.py[]: Kresd socket failed:<class 'socket.error'>,[Errno 32] Broken pipe
2018-12-15 11:56:44 err dhcp_host_domain_ng.py[]: Wrong host format '/tmp/kresd/hints.tmp' in host file <ip for mac2> <static lease hostname for mac2>.<my TLD>

followed by several more errors with Errno 32

I’ve got about 7,000 entries like the ones above in 10 hours. At some point in between (or maybe directly after the automatic forced disconnection) the resolver must have stopped working…
The first thing I learned over the last releases is to run /etc/init.d/resolver restart when there is no internet access, and after that I had internet access again.

So here are my questions:

What was changed in the /tmp/kresd/hints.tmp implementation in kresd that causes these errors? (It apparently combines the < static lease hostname for macx >.< my TLD > entries derived from /etc/config/dhcp with the static hostnames from the hints file set in the kresd configuration; in my case I use /etc/hosts.) Looking into /tmp/kresd/hints.tmp, it apparently doesn’t contain all the static leases I defined in /etc/config/dhcp. Those derived from /etc/hosts are displayed just fine.
Why do I STILL need to restart the resolver every now and then?
Why does my configuration break on every 3rd or 4th of your releases? I do NOT have anything special, only static leases (entered via the LuCI GUI), DDNS and OpenVPN. (This summer I had the time to get an OpenVPN site-to-site connection between 2 Omnias working on 3.10.3. As it is used only every now and then, I cannot tell exactly which release broke this configuration, but it is not working anymore. And right now I do not have the time to do a deep root cause analysis or to start building it again from scratch.)

I AM disappointed. This router is not a toy; it runs in a household where disconnections are NOT accepted.
I can do updates/upgrades at night, when the children and my wife are sleeping. When I do these updates/upgrades and test them successfully, I need to be sure that I can rely on them. It seems I can’t.

Please help me at least get the DNS thing working. These dropouts are the least acceptable…

edit: It seems it also happens with every router restart. Maybe with every restart of WAN (eth1) interface?
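
To see how the “Kresd socket failed” errors above are distributed, the log can be summarised per errno. This is a minimal sketch that only assumes the line format shown in the excerpt; `count_kresd_errnos` is a hypothetical helper name, and the log file path in the usage comment is an assumption (adjust it to wherever syslog is written).

```shell
#!/bin/sh
# Count "Kresd socket failed" lines per errno in syslog text read on stdin.
# Prints one line per errno, e.g. "Errno 104: <count>".
count_kresd_errnos() {
    # Keep only the "Errno N" part of matching lines, then count duplicates.
    sed -n 's/.*Kresd socket failed:.*\[\(Errno [0-9]*\)\].*/\1/p' \
        | sort \
        | uniq -c \
        | awk '{ print $2, $3 ": " $1 }'
}

# Usage (log path is an assumption):
#   count_kresd_errnos < /var/log/messages
```

Comparing the counts before and after a restart of the WAN interface might show whether the errors really start at that point.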

hadc · December 16, 2018, 9:11am

Try this crontab entry:
0 5 * * * /sbin/reboot

It fixes almost all issues of unstable router behaviour.

icingaj · December 16, 2018, 10:15am

I updated my Omnia to 3.11. I wanted to test Samba 4 and other programs. After installing Samba 4, the updater sends these emails every 4 hours:

#####Error notifications#####
Updater failed:
[string "transaction"]:323: [string "transaction"]:149: Collisions:
• /usr/sbin/smbd: samba4-server (existing-file), samba36-server (new-file)
• /etc/samba/smb.conf.template: samba4-server (existing-file), samba36-server (new-file)

How can I stop these notifications, please? … They are annoying!

The results of the entire test are written on the line:

Thanks.

radekpribyl · December 16, 2018, 12:36pm

Hi @vcunat,
Even though it sounds strange, I’m observing it too. I used the Cloudflare TLS configuration and experienced issues with DNS resolution time and with LuCI speed too. Now I have changed DNS back to my provider’s DNS and LuCI is quick again. I will see whether the DNS issues are gone; it is too early to judge.

The issues were:

messages from updater: unreachable: https://repo.turris.cz/omnia/lists/base.lua: Couldn’t resolve host ‘repo.turris.cz’
messages from cloud backup:
_(Creating an automatic Cloud Backup from your router failed.)
_(Failed to connect to the ssbackup server.)
messages from ddns script - later the ddns was able to obtain the IP address
warning ddns-scripts[14844]: dnsomatic: Get registered/public IP for xxxxxxx.dynu.net failed - retry 37/0 in 60 seconds

and of course, most importantly, the connected devices couldn’t resolve any names in the browser. For these I cannot see anything in the logs.

ssdnvv · December 16, 2018, 1:12pm

Seriously? That doesn’t fix anything, it’s only a sad workaround.
TO is an expensive piece of hardware that only works continuously if you, from a configuration point of view, stay close to what the TO team supports.

But even your “fix” would not work for me, as:

a warm reboot kills the functionality of my 2.4 GHz NIC (hardware design flaw, see the other thread on this issue)
I have encrypted external storage attached, to which logs are written in order to reduce wear on the internal storage. I need to enter a password to unlock this storage, which is why an unattended reboot is not an option.
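
A narrower workaround than a full reboot would be to restart only the resolver when a test lookup fails, which is the manual step mentioned earlier in this thread (/etc/init.d/resolver restart). This is a rough sketch, not a fix for the underlying problem. `check_resolver` is a hypothetical helper name, and the test hostname and the use of nslookup against 127.0.0.1 are assumptions.

```shell
#!/bin/sh
# Restart the resolver only when a test lookup against the local resolver fails.
# Both commands are overridable so the logic can be dry-run safely.
RESOLVE_CMD="${RESOLVE_CMD:-nslookup repo.turris.cz 127.0.0.1}"   # assumed test lookup
RESTART_CMD="${RESTART_CMD:-/etc/init.d/resolver restart}"        # from this thread

check_resolver() {
    # The commands are expanded unquoted on purpose so their arguments split.
    if $RESOLVE_CMD >/dev/null 2>&1; then
        echo "resolver ok"
    else
        echo "resolver failed, restarting"
        $RESTART_CMD
    fi
}

# To run it from cron, add a call to check_resolver at the end of the script.
```

A dry run such as `RESOLVE_CMD=false RESTART_CMD="echo would restart"` followed by `check_resolver` exercises the failure branch without touching the running resolver.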

vcunat · December 16, 2018, 1:20pm

In case DNS (in clients) doesn’t work at all, it doesn’t seem so strange. I can imagine luci (unintentionally) resolving some names. (“unintentionally” because I believe luci is supposed to work perfectly even without working internet.)

Your case sounds like DNS doesn’t work at all. @radekpribyl: could you have some edits of DNS config files remaining from before? (e.g. “hand-configured” TLS forwarding)