I see from various threads on this forum that sqm (sch_cake) has been shown to work at rates of up to 600mbit bidirectional. I hope that the next generation of turris’s work can scale to a full gbit here, but it would be VERY interesting to know a couple of things about current scaling problems, as I think further development work is going to be required in the core “cake” code. We tend to use the flent.org tests to get detailed reports on how and where things are going wrong. Web tests are far too weak to drive gbit networks reliably enough to see what’s going wrong.
Is there anyone out there who can run one or more strings of flent vs SQM tests on bufferbloat.net’s behalf? flent is commonly available in linux repositories and can be made to work on OSX, and we maintain a fleet of flent servers around the globe. Our most recent one was built for starlink testing (it has nothing to do with starlink itself), but can be leveraged for this:
What I’m mostly looking for at the moment is “rtt_fair” results at various rates - both at rates that can actually be achieved via turris, and above the point where it begins to fail, with and without sqm enabled.
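For anyone willing to try, a minimal sketch of such a run (only london.starlink.taht.net is named later in this thread; the other hostnames here are placeholders you would swap for real flent servers):

```shell
# Hypothetical invocation; substitute real flent server hostnames.
# rtt_fair runs one flow per host, so per-flow fairness vs path RTT is visible.
flent rtt_fair -l 300 -s 0.2 \
    -H london.starlink.taht.net \
    -H flent-server-2.example.net \
    -H flent-server-3.example.net \
    -H flent-server-4.example.net \
    -t "turris-sqm-on" -D ./results
# Repeat with sqm disabled; then compare the runs with:
#   flent --gui results/*.flent.gz
```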
I don’t know whether turris supports the “flent” package for openwrt or not(?). It has additional tools we can use to collect data more directly, via flent as well.
It is also very possible to set up a flent server on your local network, enable sqm
on it, and just test that. It would be great if someone had that as well, as the rtt_fair test is also a really good test of the wifi version of fq_codel.
Anyway, you can learn a lot more about the behaviors from the data in the *.flent.gz files, plot them a zillion different ways, and (especially) do comparison plots, but there doesn’t seem to be a way to upload them to the forum…
It is called flent-tools, is installable, and seems to work (I get cpu stats from the router):

opkg update; opkg install flent-tools
I can only run internet tests up to ~110/37 Mbps, while local testing indicates that 55/550 with some tweaking should be possible, so unless I get faster internet access there is not much I can offer. But please holler for any specific test you want.
Glad to hear turris also has that package. The sampling routine was rewritten in C, however, and may not work properly with a modern tc…
I would appreciate you posting a reference to the good results you get at the speed you run at. I actually forget the command-line options required to also capture cpu_stats and qdisc_stats…
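For what it’s worth, a sketch of how those stats are usually collected via flent’s test parameters (the router hostname root@turris and interface eth1 are placeholders for your setup, and flent-tools must be installed on the router for the samplers to work):

```shell
# Hypothetical hosts/interface; cpu_stats and qdisc_stats are gathered over
# ssh from the router while the test runs, alongside the usual flows.
flent rtt_fair4be -l 300 -s 0.2 \
    --test-parameter cpu_stats_hosts=root@turris \
    --test-parameter qdisc_stats_hosts=root@turris \
    --test-parameter qdisc_stats_interfaces=eth1 \
    -H london.starlink.taht.net -H de.example.net \
    -H singapore.example.net -H fremont.example.net \
    -t "turris-cake-stats" -D ./results
```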
Since rtt_fair is actually using CS0 for 2 hosts and CS1 for the other two hosts and I use layer_cake, I switched to rtt_fair4be to just compare between the different host locations:
I note that london.starlink.taht.net seems to limit each flow to around 0.8 Mbps throughput… at least from my ISP; when I do an 8-flow rrul_var test I can reach ~6 Mbps with 8 flows
RTTs look okay, but throughput is decidedly not independent of RTT; sure, London seems rate-limited, but singapore is also not behaving as expected… mind you, in the upload direction things look okayish, I think.
This is from an otherwise live network with loads of gunk happening (including running PAKON).
The convergence on the upload is absolutely marvelous, a goal tcp and aqm designers have pursued for decades.
On the download there are two problems - adjusting for the bandwidth in the wrong place (after the ISP’s router), and… what I think I found in the mikrotik series of tests was that we were hitting cake’s memory limit a bit early in the case of really long rtts. Since the turris has plenty of memory, quintuple that memlimit for another test?
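Quintupling the 4MB default would mean something like the following sketch (the ifb4eth1 interface name and the diffserv keyword are placeholders; keep whatever options your sqm config already generates and just add memlimit):

```shell
# Hypothetical: raise cake's buffer memory cap from the ~4MB default to 20MB
# on the download (ingress) shaper.
tc qdisc replace dev ifb4eth1 root cake bandwidth 100mbit \
    diffserv3 ingress memlimit 20mb
# With sqm-scripts, the same can be done persistently in /etc/config/sqm:
#   option iqdisc_opts 'memlimit 20mb'
```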
I’m a bit puzzled by your latency spike (and I wish there was a way to post the flent.gz files here) but y’know, this is so massively better than 5g, and we’re used to it.
Mmh, that helped singapore in the download, but now fremont took a hit. Upload looks a bit worse, but still pretty acceptable except for london. I guess singapore and fremont suffer from my default interval of 100ms…
As I said non-quiescent network with too much going on…
Good question, I need to check on my Linux host; I believe it should be on…
I agree, but the orange line is also crap^W underwhelming in your examples; sure, from your location london is far away, but so is de, and still de looks much more believable…
Now with my ingress shaper reduced to 100Mbps (from 105 before) and the nodes rebooted. Upload is still excellent, and download looks better for some stretches as well… London has recovered and is now doing similarly well to de, which is expected; after all, both are close by.
Fremont is the busiest dc I’m in. I just disabled ecn and switched to sch_fq. Not trusting my cluster much at the moment… I saw some signs dctcp-style ecn was enabled on my dallas server a while back…
Quick thought: I was using the ingress keyword, which equalises ingress rates, not egress rates of the shaper, and is known to result in unequal egress rates as it essentially distributes capacity by responsiveness…
Will repeat the test with egress-mode instead…
I clearly have some cyclic gunk going on at an ~30 second interval, which I should look at later, but the rtt fairness is much better. With RTT ratios close to 1:10, rate differences smaller than 1:2 seem pretty OK, given that this is at the default interval of 100ms, which acts against both singapore and fremont from my location.
P.S.: There are still odd rate spikes for fremont which I do not know what to make of.
Based on that last result… your ISP is actually doing a pretty good job at defaults. Most end-users (not gamers) would be perfectly content with only the 60ms latency and jitter you are showing, as it’s kilometers better than most dsl and cable. But since we both have OCD and can do much better for all users and all workloads, we sally forth, with cake.
Thx VERY much for testing this on the turris for me. I’m still hoping other turris users running at > 500Mbit can chip in, which was my original question!
In terms of moving variables around, could you re-test with cake at 100mbit with the default memlimit? I’m pretty sure at this point that it’s too low, from various other tests, but…
Yes, they have come a long way… this is a VDSL2 link with bidirectional vectoring and G.INP retransmissions. But I am using my own modem and router, so I cannot fully appreciate the experience of the bulk of my ISP’s customers, who likely use the all-in-one modem/router offered by the ISP.
Again, yes. Since it is not that hard to overload/saturate my link I really appreciate the IP and flow isolation cake’s scheduler offers…
+1; over here the plan now is to convert the bulk of the links to FTTH by 2030, though that is more aspirational than reliable. So I guess if nobody beats me to it, I might be able to post such a result within the next 8 years.
My pleasure, thanks for sharing.
I can and will do so tonight, but when I had a look at cake’s reported memory usage (by manually calling tc -s qdisc in a second terminal during the test), it did not seem to get close to the default 4MB.
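For anyone watching along, cake reports its buffer occupancy in its stats output; a small sketch of pulling the number out (the sample line below stands in for real `tc -s qdisc show dev eth1` output so the snippet runs anywhere, and the figure is purely illustrative):

```shell
# sch_cake's stats include a line of the form "memory used: Nb of Mb".
# On a router you would pipe in:  tc -s qdisc show dev eth1
sample='memory used: 2816816b of 4Mb'
used_bytes=$(printf '%s\n' "$sample" | sed -n 's/.*memory used: \([0-9]*\)b of.*/\1/p')
echo "$used_bytes"
```

If that number sits well below the limit throughout a test, the memlimit is not the bottleneck.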