[poll] OpenWrt (subsequent TOS) curtailing CPU performance for Turris Omnia

With the impending OpenWrt release 19.07.3, probably within the next few weeks, the CPU performance for Turris Omnia devices will be curtailed by this patch in their compiling toolchain.

It basically discards the Neon technology that is provided by the CPU and castrates it instead to vfpv3-d16, which is far inferior to Neon and therefore will impede the performance of the Turris Omnia.

Hence this poll: do Turris Omnia users find this diminution of paid hardware through software:

  • Unacceptable
  • Acceptable
  • Do not care either way

Currently the patch code is already introduced in TOS’s HBL and HBK branches and it may take some time until it reaches HBS, whatever the release schedule for TOS5.x might be.

Basic info on Neon:

Neon registers are considered as vectors of elements of the same data type, with Neon instructions operating on multiple elements simultaneously. Multiple data types are supported by the technology, including floating-point and integer operations.
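To make the quoted description a bit more tangible, here is a rough conceptual sketch in Python. It is not real Neon code: plain lists stand in for a 128-bit register, and the names `lanes` and `vector_add` are made up for this illustration. It only shows the idea of one instruction acting on several same-typed elements at once.

```python
# Conceptual model only: a Python list stands in for one 128-bit Neon register.
REGISTER_BITS = 128  # width of a Neon quad-word (Q) register


def lanes(element_bits):
    """Number of same-typed elements that fit into one 128-bit register."""
    return REGISTER_BITS // element_bits


def vector_add(a, b):
    """Model of a single lane-wise add: every element pair is added 'at once'."""
    if len(a) != len(b):
        raise ValueError("both operands must have the same number of lanes")
    return [x + y for x, y in zip(a, b)]


# Four 32-bit floats fit into one register, so one add handles four values.
assert lanes(32) == 4
print(vector_add([1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0]))
# prints [11.0, 22.0, 33.0, 44.0]
```

Code compiled without Neon would have to perform those four additions one after another.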

I have been asked by a TOS developer to benchmark the potential performance degradation caused by the introduced code change, to avoid scaremongering. However, I cannot provide such benchmarks for the following reasons:

  • unavailability of test device
  • impossibility to test a broad scope of use cases that leverage floating-point and/or integer operations
  • time constraints

Moreover, I am of the humble opinion that:

  • since the code change is introduced by the manufacturer, it is their purview to demonstrate transparently the performance impact of the code change, and to show that diminution of paid hardware through software does not leave the user with the short end of the bargain

Erm, the commit in question was not introduced by team turris, as far as I can see, but to fix an actual show-stopper bug on armada 370 CPUs. A proposed solution seems to be to move the more capable armada CPUs to their own “arch” (see FS#867 - mvebu: Should be split different arches, current (Armada 370, XP and other "legacy") and 385+ · Issue #5801 · openwrt/openwrt · GitHub). Now, are you proposing that team turris tackle this, or are you proposing that they diverge from upstream and keep their own patch to keep using neon? I am confused about what your poll is actually intended for.

The O features a Marvell Armada 385 88F6820 CPU, which apparently has a different CPU feature set and as such supports Neon; a different kettle of fish, so to speak.

That has long been neglected; the bug has been stale for the past three years. And in the meantime there is another (more recent) bug (in the tracker) where OpenWrt stated

It has been decided for now to stick to vfpv3-d16 as lowest common denominator and not split the target into subtargets.

Which indicates the inability of OpenWrt to diversify target classes, notwithstanding their disregard for device owners.

It just provides a platform for O users to voice their opinion about the code change.

I do not see how this poll constitutes a proposal of any sort. But since you ask, I would expect CZ.NIC to protect their users’ investment in the hardware and not short-change users with diminution through software. I, at least, prefer to reap the benefits of the hardware features and not be deprived of them instead, else I could have invested in less capable hardware to begin with.

Well, not for upstream OpenWrt, as the change you object to was introduced to make armada 370 work again.

Sorry, how are the OpenWrt core developers responsible for the hardware you bought? And why should they put in immediate work and cause future maintenance work if nobody can be arsed to actually try to measure the performance cost on d32 capable hardware this change brings?

How is that subjective opinion poll supposed to be acted upon and by whom? Or is this just to let off a bit of steam?

Sorry, that is a rather abstract argument that seems premature before even having assessed the actual performance cost this code simplification (fewer targets → less maintenance work) carries. I would be amazed if the OpenWrt core developers could not be convinced by cold, hard numbers demonstrating how d16 cripples armada 385+ CPUs while doing standard duty with just OpenWrt’s core software.

Again, unless you can actually put a number to the hardware benefit you believe is lost through this change, this argument might be too abstract to gain much traction with those that can actually change things, no?

Do you have a link to that bug?

Why do you keep harping on the 370 series when that hardware is not even incorporated in the O?
Notwithstanding that OpenWrt is well aware of the CPU type incorporated in the O.

Exactly, they put in immediate work to cover (from the commit message):

Armada 370 processors have only 16 double-precision registers.

without providing any sort of performance cost analysis for the more capable Armada 385 series.

I did not ask for that patch, neither can I see that any owner of an Armada 385 device did.

If you do not care about the code change, or find it acceptable, there is an option for that purpose in the poll, should you elect to participate. Else I do not really see the point of discoursing on whether the proof of the performance impact should be offloaded to the device owner, who did not ask for the code change in the first place.

Because the change in upstream OpenWrt was justified by unbreaking OpenWrt for armada 370 CPUs, which is indeed independent of the Omnia.

Yes, the point seems to be that OpenWrt, if I understand correctly, tries to minimize platform proliferation, and having multiple targets/archs for very similar mvebu SoCs is not something they are keen on having (after all they do the maintenance work and they have not received money for doing so). OpenWrt is in no way obliged to support any SoC at all, let alone at a certain feature level, independent of whether we like that or not.

Yes, but not running at all is the ultimate performance price, so having 370 CPUs not work at all certainly is a worse situation than having 385+ CPUs operating at a so far unquantified reduced performance, no? And that was the choice, if I understand correctly.

That is a bit selfish, no? Think about users of 370 CPU whose devices stopped working completely before that patch was introduced.

No, I find that poll in bad taste and of questionable objective, I do NOT want to participate in it. IMHO this is not a productive way of dealing with the unfortunate situation.

Again, maybe think a bit broader than just about your device…

It is apparently not independent because it impacts the toolchain that the O software is subjected to (causality).

CZ.NIC is however obliged to fully support the hardware (and its available features) they are selling, or do you reckon neither?
Would you mind expressing your own viewpoint instead of “we”, or explaining who “we” encompasses, as otherwise it remains unclear whose spokesperson you are?

You are in the Turris forum here and the poll is about the O, why do you care so vehemently what happens to the 370 series?

Absolutely not, why should an O owner ever think about the Armada 370 series if one does not own a device with such hardware? I am afraid your reasoning escapes me. You somehow leave the impression that the O owner should be expressing solidarity with the owner of a 370 series device at the expense of the O owner.

What do you reckon is unfortunate then since you appear unperturbed about a performance impact by the code change?

I am sorry, I do not get your argument here. You seem pretty combative, and I see no reason for that. If I have offended you, please accept my apology, I really do not want this to escalate. Point being, a fix for a real show-stopper bug, had some side-effects, not nice, but certainly no catastrophe IMHO. I have a feeling that improving that situation, by ameliorating the side-effects is better achieved by not alienating those whose help would be required to do so.

I do not believe that the omnia was sold on the promise to guarantee the availability of any specific feature for any specified period, let alone in an upstream project. But I also believe that team turris might be convincible to address this issue, as they do have an interest in keeping the turris devices working well. (But they also receive severe criticism when and where they diverge from upstream OpenWrt, which means short of fixing this situation upstream one fraction of omnia users will be unhappy).

That’s what I mean with “combativeness”. I wanted to indicate that I would be happier with 385+ devices operating at higher performance, and from your post I assumed you would too, so hence I used “we” to describe us as members of the set of people wanting a solution to the issue that would not sacrifice the omnia’s performance. It seems I misinterpreted your position, sorry.

Because I understand how TOS is based on and coupled with OpenWrt, and that divergence between the two is a pretty problematic issue. It took time and some pains by team turris to get TOS better aligned with OpenWrt proper. Here the issue is that, to maintain alignment, TOS will need to find an agreement with upstream that everybody can live with. But as long as the performance sacrifice has not been quantified all of this discussion is rather theoretical… Also, just because I own an omnia does not mean that I need to stop caring about the OpenWrt ecosystem in general.

Please, re-read what you just wrote and just think how you would behave if you owned a 370 device.

Sorry, by being unnecessarily combative you missed that I am not arguing against re-instating the more performant compiler settings for the omnia, I just believe this needs to be done with the turris and OpenWrt developers, not against them (and I also accept that the side-effects of that might be unacceptable to either team; whoever maintains code gets a say in decisions that affect maintainability IMHO, and the number of targets/archs/platforms certainly carries a cost).
But then again without an actual assessment of the lost performance our argument is a bit academic. If you would/could show that omnia performance in normal use-cases would drop, by sat 50% I am sure you would get the attention of relevant developers, just arguing in the abstract about some loss in performance is far less effective IMHO.
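For what it is worth, the kind of number I mean could be produced with something as simple as the following Python sketch. It is only a rough outline under stated assumptions: `time_workload`, `slowdown_percent` and `float_workload` are names invented here, the workload is an arbitrary floating-point loop rather than a representative router load, and the idea would be to run the same script once on an image built with the old compiler settings and once on an image built with vfpv3-d16, then compare the two medians.

```python
import statistics
import time


def time_workload(workload, repeats=5):
    """Run the workload several times and return the median wall-clock time in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def slowdown_percent(baseline_s, candidate_s):
    """Relative slowdown of the candidate build versus the baseline build, in percent."""
    return (candidate_s - baseline_s) / baseline_s * 100.0


def float_workload(n=200_000):
    """Arbitrary floating-point loop; a stand-in, not a representative use case."""
    acc = 0.0
    for i in range(1, n):
        acc += (i * 1.000001) / (i + 0.5)
    return acc


if __name__ == "__main__":
    # Run this on each image and note the median, then feed both into slowdown_percent().
    print(f"median runtime: {time_workload(float_workload):.4f} s")
```

A single loop like this proves little on its own; the point is merely that a median runtime per image, turned into a percentage, is the sort of data that developers can act upon.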

agree, this is irrelevant without benchmark numbers


Summary, you:

  • qualify the poll as bad taste and of questionable objective
  • qualify others in the course of the discourse as combatants
  • qualify the matter as theoretical, academic, unfortunate, not nice but non-catastrophic
  • qualify 50% (!) in performance drop as some legitimate threshold to get anyone’s attention
  • prefer to offload the benchmarking to the user
  • reckon that code maintenance burden outweighs code performance
  • assert that CZ.NIC is under no obligation towards their users to prevent diminution of paid hardware through software
  • assert that an O device owner should think about some hardware that is not even incorporated in the device

That sounds very well, legitimate and amicable. What has been your actual and constructive action however, aside from discoursing here?

What others have done:

  • reached out to CZ.NIC that yielded

For now, we are using what OpenWrt uses and there are no plans for changing that.

  • commented on the OpenWrt commit which yielded zero response from the developers
  • opened a thread in the OpenWrt forum which yielded zero response from the developers
  • opened a report in OpenWrt’s bug tracker that has been dismissed as won’t fix

Do you intend to:

  • carry out the benchmarking, being an O owner?
  • issue a PR with OpenWrt or CZ.NIC and convince either otherwise?

Are you going to provide some? There is probably a reason (performance?) for CPUs supporting Neon/vfpd32 as opposed to just vfpv3-d16.

I stand by that.

Read what I wrote, I called you combative, that is not the same as calling you a combatant, and not others either, just you in the singular.

“sat 50%” which should have been “say 50%” was not intended as an exact threshold but just as an exemplary number for which I would think that developers would care.

No, not “the user”, specifically to you. You are making a big fuss and behaving belligerently without actually being sure whether there is any objective rationale to complain in the first place.

Again, no, I reckon that the developers are the arbiters of such decisions and I mentioned that maintenance cost is an important factor. I am not the developer here, so it is not my call to make.

Yes, unless you have a specific contract with CZ.NIC your demands will not be enforceable by law. But more importantly, by phrasing this as a demand you are not winning the hearts and minds of those that can/will/might help on the code side of the problem…

The point is the change got introduced in OpenWrt because it fixes a massive problem for some devices; whether you as omnia owner care for those devices or not is of no importance to the OpenWrt developers. So your argument, “undo that fix, because it potentially costs some performance on my omnia, who cares about armada 370”, is unlikely to gain traction over there. Might want to rethink/rephrase your proposition?

This is another example of what I call a combative attitude…

See, maintenance cost is important to developers after all.

If I see unexplained poor performance on my omnia I might start looking into that, but I realistically do not expect that. I have enough on my plate that I do not need to go looking for issues.

Sure, once you demonstrate that this is a real quantifiable problem with sufficiently high performance cost, I will talk to folks, not that my voice carries weight around here. But I am not going to spend time on your pet peeve unless you demonstrate this to be more than a theoretical consideration by giving data that measures the performance cost.

See, open source development works by scratching one’s own itches (or gently convincing people that they share an itch), so far I see only a theoretical issue. Feel free to convince me otherwise with real data (and maybe cool off for a bit and come back with less of an entitled attitude).

How does that relate to open source? You mean Neon/vfpd32 is an itch?

Done by others - and yet you qualify all of those persons just the same.

It will be demonstrating itself soon enough.

And a litigator on top of all.

Just keep on going with your qualifications, you seem really big on those.

You might want to google for that phrase. This is about how free open source development works. None of the OpenWrt developers owe you anything and most develop this in their own time and hence on their own money. The presumption in open source development is that if something causes you unhappiness you go and fix it and then share that fix with others. That initial cause for unhappiness is the “itch” and fixing it yourself is the analog to scratching; at that point the metaphor fails, as the share-with-others part does not work well with either itches or scratches. So create a fix and post it on the OpenWrt/Turris developer lists. If that fix is actually good, you might see it committed even when presented with an attitude, but if the fix is far from perfect, or maybe just an idea you expect someone else to actually implement, a more polite and civil approach seems to be better IMHO.
And on that note I will end my part of our conversation, as I have said all I have to say, and unfortunately did not manage to increase the level of discourse (my fault).

You keep on going about attitude, unhappiness and other such objections like a behavioural analyst, nothing that either relates to the technicality of the code introduction or the poll.

You have expressed disagreement on the poll and contributed expertly to the community.

Let the poll roll, shall we?

This is only about toolchain compilation which implies 0 to none performance impact. You obviously have no idea what it is about so the other guy was right calling you entitled.

just shows that you have no clue about CPU instructions being compiled into the code.

In this particular instance the compiled code does forgo half of the CPU’s available FPU registers as well as NEON optimisations.
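The "half of the registers" part is simple arithmetic, sketched below in Python. The only facts assumed are the architectural ones: vfpv3-d16 exposes 16 double-precision (64-bit) registers, the d32 variant exposes 32. The dictionary and function names are made up for this illustration, and the sketch says nothing about how much real-world performance depends on those registers.

```python
D_REGISTER_BITS = 64  # one double-precision (D) register

# Number of D registers visible to compiled code for each FPU variant.
D_REGISTERS = {
    "vfpv3-d16": 16,
    "vfpv3-d32": 32,
}


def register_file_bits(variant):
    """Total size in bits of the floating-point register file for a variant."""
    return D_REGISTERS[variant] * D_REGISTER_BITS


small = register_file_bits("vfpv3-d16")
large = register_file_bits("vfpv3-d32")
print(small, large, small / large)  # prints 1024 2048 0.5
```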

In broad strokes it would be like compiling code with a 32-bit instruction set for a 64-bit capable CPU, forgoing the benefits of 64-bit code performance.

Why do you think there are CPUs with more advanced instruction sets (Armada 385) than others (Armada 370) - just some sort of fancy marketing by CPU manufacturers?

then you have actual numbers on the performance drop as you compared the differently compiled code and you are just teasing everybody? Or you don’t and it is a non issue until proven otherwise?

With curtailed CPU instructions compiled into the code the full CPU capabilities are logically diminished, unless you reckon that more advanced CPU instruction sets are just some sort of fancy marketing and do not actually contribute to the performance, do you?

Your prerogative of course to believe that there is no performance impact. Else, please see

We may disagree on whether the manufacturer or the user has to demonstrate the performance impact. But since you are opposing the notion of a performance impact you could always demonstrate it otherwise.