[poll] OpenWrt (subsequent TOS) curtailing CPU performance for Turris Omnia

fantomas · May 22, 2020, 10:02am

How hard is it to provide TOS without the slowdown patch?
Is the patch currently implemented in HBT, so I could compare performance on 4.0.5 and 5.0.0?
(and if so, how could I measure it?)

lapton · May 22, 2020, 10:03am

As moeller0 tried to explain, it is confusing why anyone would try to fix this when there is no proven issue yet; you are acting on a feeling alone and have not done anything yourself. You are not forced to update to an undesirable version of open source code, are you?
Also, if I understand the commit message correctly, it was already 16-bit before, so you should be able to show the performance increase since the previous version?

anon82920800 · May 22, 2020, 10:11am

The patch covers an issue with the Armada 370 series; there was no issue with the Armada 385. But since OpenWrt does not differentiate between the two CPU classes, it affects code compiled for the Armada 385 as well.

Entirely beside the point of the poll.

Is your perspective

?

peci1 · May 22, 2020, 9:00pm

Time for some data:

$ ./arm-openwrt-linux-objdump -d libgcrypt.so.20.0.1 | grep -i vadd
   3d7f8:       f27688a6        vadd.i64        d24, d22, d22
   44c78:       f2366848        vadd.i64        q3, q3, q4
   44ce4:       f2366848        vadd.i64        q3, q3, q4
   44d3c:       f2366848        vadd.i64        q3, q3, q4
   44d58:       f22488e0        vadd.i32        q4, q10, q8
   44d5c:       f26a28e6        vadd.i32        q9, q13, q11
   44db0:       f220e8c8        vadd.i32        q7, q8, q4
   44db4:       f26628ca        vadd.i32        q9, q11, q5
   44df8:       f228e844        vadd.i32        q7, q4, q2
   44e00:       f26a284c        vadd.i32        q9, q5, q6
   44e60:       f264484e        vadd.i32        q10, q2, q7
   44e68:       f26c8862        vadd.i32        q12, q6, q9
...

$ ./arm-openwrt-linux-objdump -d libgcrypt.so.20.0.1 | grep -i vadd | wc -l
325

So, at least some libraries make actual use of NEON instructions.

The kernel, on the other hand, seems to not use any:

$ ./arm-openwrt-linux-objdump -d ip_tables.ko | grep -i vadd
$ ./arm-openwrt-linux-objdump -d ip_tables.ko | grep -i vmul
$ ./arm-openwrt-linux-objdump -d ip_tables.ko | grep -i vabs
$ ./arm-openwrt-linux-objdump -d ip_tables.ko | grep -i vand
$ ./arm-openwrt-linux-objdump -d ip_tables.ko | grep -i i64
$ ./arm-openwrt-linux-objdump -d ip_tables.ko | grep -i i32
$

So, if my little experiment is generalizable, the change could leave at least the routing performance of the router unchanged. Running userspace apps could get less efficient, though.
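The greps above can be wrapped into one reusable check. This is only a sketch: `count_neon` is a made-up helper name, and the pattern is a heuristic (a `v`-prefixed mnemonic with an integer element type such as `.i32` or `.u8`, which plain VFP code does not use), not a full NEON decoder. It will miss floating-point NEON operations, so treat the result as a lower bound.

```shell
# Count disassembly lines that look like NEON (Advanced SIMD) instructions.
# Reads objdump -d output on stdin and prints the number of matching lines.
# Heuristic only: matches "v<op>.<i|s|u|p><8|16|32|64>" followed by whitespace.
count_neon() {
    grep -Eic '[[:space:]]v[a-z0-9]+\.(i|s|u|p)(8|16|32|64)[[:space:]]'
}

# Usage (same objdump binary as in the experiment above):
#   ./arm-openwrt-linux-objdump -d libgcrypt.so.20.0.1 | count_neon
#   ./arm-openwrt-linux-objdump -d ip_tables.ko | count_neon
```

Running it over every library and kernel module of an image would show whether the libgcrypt and ip_tables results hold more generally.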

anon82920800 · May 22, 2020, 9:57pm

probably most those involving mathematical operations with

, e.g. randomness, checksums, cryptography

fantomas · May 23, 2020, 3:20pm

anyone? I plan to move to HBT anyway, so I’d like to help with any benchmarks, if this can tell us anything…

anon82920800 · May 23, 2020, 3:26pm

The baseline should be the same OS version, with same settings/userland, otherwise the benchmark gets skewed.

That requires compiling the kernel and userland with the NEON instruction set enabled, without the introduced patch.

fantomas · May 23, 2020, 3:38pm

iiuc you don’t have time to test, so it’s the least we can compare for now.

doesn’t it just require skipping the patch (a one-line change)?

and, btw I didn’t get the answer to:

is it? or how can I find out?
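One way to find out how a given binary was compiled, offered as a sketch: ARM ELF files carry build attributes that `readelf -A` prints, and a binary built with NEON allowed normally has a `Tag_Advanced_SIMD_arch` line, while a `vfpv3-d16` build shows `Tag_FP_arch: VFPv3-D16` and no such line. The helper name `has_neon_attr` is made up, and the `arm-openwrt-linux-readelf` binary name is an assumption based on the objdump name used earlier in the thread.

```shell
# Reads `readelf -A` output on stdin.
# Prints "neon" if the Advanced SIMD build attribute is present, else "no-neon".
# Note: this reports what the compiler was allowed to use, not whether
# NEON instructions actually occur in the code.
has_neon_attr() {
    if grep -q 'Tag_Advanced_SIMD_arch'; then
        echo neon
    else
        echo no-neon
    fi
}

# Usage (toolchain binary name is an assumption):
#   ./arm-openwrt-linux-readelf -A libgcrypt.so.20.0.1 | has_neon_attr
```

Comparing the same library from a 4.0.5 and a 5.0.0 image this way should indicate whether the two were built with different FPU settings.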

fantomas · May 27, 2020, 5:39pm

is it possible for us to check the differences in performance somehow?
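A rough way to compare, as a sketch only: time the same CPU-bound userspace workload on both builds, keeping OS version and settings identical as suggested earlier in the thread. The names `bench` and `hash_workload` are made up, and hashing zeros with `sha256sum` is just an example workload; whether it is sensitive to the NEON change at all is an assumption that would need checking.

```shell
# Run a command N times and print the total elapsed wall-clock time.
# Resolution is whole seconds, so use enough runs to get a meaningful number.
bench() {
    runs=$1
    shift
    start=$(date +%s)
    i=0
    while [ "$i" -lt "$runs" ]; do
        "$@" >/dev/null 2>&1
        i=$((i + 1))
    done
    end=$(date +%s)
    echo $((end - start))
}

# Hypothetical CPU-bound workload: hash 64 MiB of zeros.
hash_workload() {
    dd if=/dev/zero bs=1M count=64 2>/dev/null | sha256sum
}

# Usage: run on each build with the router otherwise idle, then compare.
#   bench 20 hash_workload
```

Repeating the measurement a few times on each build helps separate a real difference from run-to-run noise.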

anon82920800 · June 4, 2020, 6:01pm

Maybe there is hope after all: it looks like a major contributor to the OpenWrt repo is taking another stab at differentiating the CPU classes, see https://github.com/openwrt/openwrt/pull/3079

anon82920800 · June 5, 2020, 4:09pm

Closed the poll now, since it has run its course and, with the aforementioned commit in OpenWrt master, has become superfluous: the Omnia now benefits from NEON instructions (and all implied instructions) compiled into the code built from OpenWrt.

Thanks to those who elected to participate with a vote, as well as those who contributed to the discussion.