Mysterious Updater failed message, followed by bad connection

Two problems, the second probably caused by the first.

First problem: Mysterious broken update.
Yesterday evening, at exactly 0:00, all devices connected to the Turris Omnia lost connection to the internet.
At about 0:10 I logged into the Turris in an attempt to debug the problem, and was greeted by this message:

My Turris updates usually kick in at 3:00 at night, after a notification email that an update is available. There was no advance warning this time!
The internet connection did not come back, but since it was late, I decided to look at it in the morning, so the Turris got the night to itself.
At 01:10, an Error Notification email was sent, containing the same info I had seen before. [Edit: Added mail info]

This morning I logged into the Turris. The warning message was gone, and the internet connection appeared to be back (a reboot had apparently taken place, but evidently no update, as I had received no notification about one), so I resumed my regular schedule, with a one-hour Skype meeting at 6:30. However, this quickly became impossible due to network interruptions, and I ended up having to use Skype over my mobile phone’s 4G network. :frowning: This led to the second, apparently related problem:

Second problem: Bad network quality
After my meeting, I attempted to debug, first by going to the DSLReports speed test (Speed test - how fast is your internet? | DSLReports, ISP Information) to see the quality of the connection.
My connection’s results are usually reasonably good, but today they were drastically different:

Down      55.91 --> 18.31
Up        10.24 -->  9.01
Ping         19 -->    20
BufferBloat  A+ -->     - (Absolute overload)
Quality       B -->     D
Speed         A -->     D
Streams    24/6 -->  24/1

Things didn’t look good, so I decided to trigger a reboot from the Turris Maintenance menu. However, whenever I tried to access the Turris from my browser, a “Foris” link was shown in the top left corner of the screen for a split second before I was told that the page was not available: “This page isn’t working” (ERR_EMPTY_RESPONSE). [Edit: Added response]
After several attempts, such as switching back and forth between the 2.4 GHz and 5 GHz networks, and even connecting through the OpenVPN connection into the network (using my iPhone), which all gave the same result, I ended up performing a power cycle.

I’m now able to access the Turris administration interface and the internet again, but still with lots of interruptions, and the BufferBloat test stays at 135 ms or more, usually higher.

Connecting directly to my ISP’s fiber-box removes the problems, so it’s evident that this is a Turris-related problem.

How can I best debug all this??
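
As a starting point, a small probe that logs when connections fail might help tell short outages from a steady degradation. The following is only a minimal sketch: the function names are made up for illustration, it measures TCP connect time rather than real ICMP ping, and which hosts and ports to probe is an assumption you would need to adapt.

```python
import socket
import statistics
import time
from typing import List, Optional


def tcp_connect_ms(host: str, port: int, timeout: float = 2.0) -> Optional[float]:
    """Return the TCP connect time in milliseconds, or None if it failed."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.monotonic() - start) * 1000.0
    except OSError:  # covers timeouts, refused connections and DNS failures
        return None


def summarize(samples: List[Optional[float]]) -> dict:
    """Summarize probe results: failure percentage and latency statistics."""
    ok = [s for s in samples if s is not None]
    failed = len(samples) - len(ok)
    return {
        "probes": len(samples),
        "failed_pct": 100.0 * failed / len(samples) if samples else 0.0,
        "median_ms": statistics.median(ok) if ok else None,
        "max_ms": max(ok) if ok else None,
    }


def run_probe(host: str, port: int, count: int = 60, interval: float = 1.0) -> dict:
    """Probe host:port `count` times, print each result with a timestamp."""
    samples = []
    for _ in range(count):
        rtt = tcp_connect_ms(host, port)
        print(time.strftime("%H:%M:%S"), "FAIL" if rtt is None else f"{rtt:.1f} ms")
        samples.append(rtt)
        time.sleep(interval)
    return summarize(samples)
```

Running `run_probe("turris.lan", 80)` against the router and then the same call against some outside web server could show where the failures start: if only the outside host fails, the LAN side is probably fine and the trouble sits between the router and the internet.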


Foris version 94.2
Turris OS version 3.7.2
Kernel version 4.4.77-967673b9d511e4292e3bcb76c9e064bc-0

I experienced a problem with the same message last week. My devices lost their connection to the internet, so I first turned the Turris off and on, with no luck; then I restarted the UPC CATV modem and restarted the Turris again, and got the Missing CRL error as well. It seems that the Turris expects internet connectivity on restart, and if it does not have it, it somehow gets stuck with no possible recovery. After another restart it was back to normal, but I did not test the connection speed.

I saw the same issue almost exactly a week ago. However, I can’t tell much more beyond confirming the problem, since the router (Turris v1.1) was at a remote location at that time. The internet connection is up and running, i.e. if there was an outage at all, it was short and did not require manual intervention. I did not perform any speed or latency measurements either.

I’ve tried rebooting my ISP’s fiberbox, followed by my Turris, but no change. :frowning:

I’ve attempted a few more dslreport speedtests, but get a number of strange errors, such as:

1. Your proxy does not support byte-range headers error:9
Your connection is going via a proxy or firewall which is not capable of satisfying byte Range requests. Byte-range requests are used by the speed test. If you can disable the proxy then this error will not occur.

2. During upload the measured speed went to zero and stayed there error:1
Your connection is very poor. So poor that packet loss is causing many halts.

It’s pretty annoying not to have a stable and working internet!!

My Skype meeting this morning went surprisingly fine, and yet another dslreport test revealed a somewhat better situation than yesterday:

Down      56.80
Up        10.05
Ping         20 
BufferBloat   C
Quality       C
Speed         C
Streams    24/6

It’s still far from what it used to be, and I have no idea what caused the improvement - just as I have no idea what caused the deterioration in the first place - so it may become bad again. :frowning:



Update: Yep! Fluctuating like the wind!!

Is SQM cake missing in action? [Edit: No, it’s an extra module that must be installed. See answer from @moeller0 below. ]

As part of attempting to understand my quality problem, I looked into SQM (http://turris.lan/cgi-bin/luci/admin/network/sqm).
I’ve tried the various Queue setup scripts available, but have noticed no difference whatsoever. BufferBloat is still through the roof. :frowning:
However, I could not try two of the provided scripts, piece_of_cake.qos and layer_cake.qos; both indicate that they require the qdisc to be configured as cake instead of the default fq_codel.

According to this https://github.com/CZ-NIC/turris-os/issues/33, the “Latest stable released turris includes cake module without tc patches.”, but it appears not to be available on my system.

Am I missing something here?

Did you follow the instructions in https://forum.test.turris.cz/t/how-to-use-the-cake-queue-management-system-on-the-turris-omnia/3103 ? If not, please try that next…

Best Regards

BTW, I am not sure whether “hiding” a cake support request in an updater-related thread is the most efficient way to get help :wink:

Thank you @moeller0,

You are right, it’s not the most direct way :slight_smile: but as it was related to my quest of getting my internet working again after the strange updater problem, I thought it was better to keep the question here.

I had not read the page you linked, and it did help me to install the packages. I had not assumed it was necessary to install extra packages, after reading the “Latest stable released turris includes cake module without tc patches.” note on https://github.com/CZ-NIC/turris-os/issues/33

Testing with piece_of_cake.qos was better, but still not up to the quality I had before the mysterious failed update:

Down      50.90
Up         9.25
Ping         20 
BufferBloat   B
Quality       A
Speed         A
Streams    24/6

BufferBloat remained between 84 and 125ms during download - much higher than the 1-2ms it used to be.
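
For comparing runs, it may help to reduce each test to a single number: the extra latency under load compared to idle. This is a minimal sketch of that arithmetic, assuming you have noted a few idle and loaded latency samples yourself; the function name is made up for illustration and the sample values in the usage note are hypothetical, not measurements from this thread.

```python
import statistics
from typing import Sequence


def bufferbloat_ms(idle_ms: Sequence[float], loaded_ms: Sequence[float]) -> float:
    """Median latency under load minus median idle latency, in milliseconds."""
    if not idle_ms or not loaded_ms:
        raise ValueError("need at least one sample in each list")
    return statistics.median(loaded_ms) - statistics.median(idle_ms)
```

For example, `bufferbloat_ms([19, 20, 21], [100, 120, 140])` gives the difference between the two medians. Using the median rather than the mean keeps a single stray spike from dominating the result.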

Regarding SQM configuration (after installation), have a look at https://lede-project.org/docs/howto/sqm
To discuss de-bloating quality it helps to share some data; I believe https://forum.lede-project.org/t/sqm-qos-recommended-settings-for-the-dslreports-speedtest-bufferbloat-testing/2803 gives decent instructions on how to use the dslreports speedtest for that purpose.
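
When setting the SQM download and upload rates, a commonly suggested starting point is a fraction of the measured speed, somewhere around 80 to 95 percent, tuned from there. The sketch below just does that conversion. It assumes the speed test figures are in Mbit/s and that SQM expects kbit/s; the function name and the 0.90 default are illustrative choices, so check the linked howto for the actual recommendation.

```python
def sqm_rates_kbit(down_mbit: float, up_mbit: float, fraction: float = 0.90) -> dict:
    """Convert measured speeds (Mbit/s) into candidate shaper rates (kbit/s)."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return {
        "download": round(down_mbit * 1000 * fraction),
        "upload": round(up_mbit * 1000 * fraction),
    }
```

With the figures from the earlier test (56.80 down, 10.05 up) and the default fraction, this suggests 51120 and 9045 kbit/s as values to try first.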

It might be best to move this part of the discussion into a new thread, no?

Best Regards

Thanks @moeller0. You are absolutely right. I’ll get back to things related to SQM (and report my test results) once I have my system properly up and running again. Contrary to what I originally thought, SQM appears not to be related to the problems I’m observing.

As for now, my system is behaving really strangely.
Lots of websites have started reporting errors, such as “Website send no data” or “Empty address”, “Address unreachable” etc. Some streaming services appear to work as they should, but YouTube is hardly usable.
Skype is sometimes very usable, and other times impossible to use. WeChat appears to work better when Skype doesn’t work at all.
Even the dslreport speedtest can’t complete most of the time, due to missing data. :frowning:

I’m really wondering how to debug this.

Anybody, any advice?
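
Errors such as “Empty address” might point at name resolution rather than raw throughput; that is a guess, not something confirmed in this thread. One way to check could be to time a few lookups through the system resolver and see whether they fail or stall. This is a minimal sketch, with a made-up function name and a `resolver` parameter added only so the function can be exercised without a network.

```python
import socket
import time
from typing import Callable, List, Tuple


def check_dns(host: str, resolver: Callable = socket.getaddrinfo) -> Tuple[bool, float, List[str]]:
    """Resolve host; return (ok, elapsed_ms, sorted unique addresses)."""
    start = time.monotonic()
    try:
        infos = resolver(host, None)
    except OSError:  # socket.gaierror is a subclass of OSError
        return False, (time.monotonic() - start) * 1000.0, []
    elapsed = (time.monotonic() - start) * 1000.0
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    addresses = sorted({info[4][0] for info in infos})
    return bool(addresses), elapsed, addresses
```

Calling `check_dns` on a handful of the affected site names, once through the Turris and once while connected directly to the ISP box, could show whether lookups behave differently on the two paths.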