I take it that by bufferbloat you mean the RTT increase under load? So during the downloading test the latency probes reach base round-trip time (RTT) + 35 milliseconds and during the uploading test base RTT + 5 milliseconds?
Okay, so layer_cake currently uses 3 priority tiers and sorts packets into those based on their DSCP markings. As long as all packets are marked CS0 there should be no behavioral difference between piece_of_cake and layer_cake. BUT looking at the DSCP fields comes with a computational cost, and in case of CPU shortage cake will tend to increase latency under load while sticking close to the configured shaper bandwidth (while HTB+fq_codel will tend to decrease the bandwidth but stick to the “configured” latency).
So the difference might be related to incoming packets having weird DSCP markings; you could use tcpdump to capture the data from a speedtest run and look at the packets' markings. Wireshark is great for looking at the captured packets… Some ISPs in the past mislabeled packets as CS1, which cake treats as a background class with high latency tolerance, so if your ISP does that, layer_cake is not ideal for your link.
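If you would rather check the markings by hand than click through Wireshark, here is a minimal sketch of how the DSCP value is derived from the TOS byte (the DSCP is the upper six bits of that byte, and the class selectors CS0 to CS7 are multiples of 8). The function names are made up for illustration; the TOS values you feed in would come from your own capture, for example the "tos 0x.." field that tcpdump shows in verbose mode.

```python
# Class selector codepoints are multiples of 8: CS0 = 0, CS1 = 8, ... CS7 = 56.
CLASS_SELECTORS = {8 * n: f"CS{n}" for n in range(8)}


def dscp_from_tos(tos_byte):
    """Return the 6-bit DSCP value from an IPv4 TOS / IPv6 traffic class byte."""
    # The lower two bits are ECN, so shift them away.
    return (tos_byte & 0xFF) >> 2


def dscp_name(tos_byte):
    """Return 'CSn' for class selector codepoints, otherwise 'DSCP <value>'."""
    dscp = dscp_from_tos(tos_byte)
    return CLASS_SELECTORS.get(dscp, f"DSCP {dscp}")


if __name__ == "__main__":
    # 0x00 is the default marking (CS0); 0x20 is CS1, the background class.
    for tos in (0x00, 0x20):
        print(hex(tos), dscp_name(tos))
```

If most of the download packets in your capture come out as CS1 rather than CS0, that would fit the mislabeling theory.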
Your router might be close enough to its maximum CPU capacity that the less demanding piece_of_cake still runs well, while layer_cake might already be affected by CPU-cycle shortage. You can log into your router during a speedtest and issue “top -d 1” to get a snapshot of the system load every second; do this during a speedtest with both piece_of_cake and layer_cake and look at the first three lines of the output:
Mem: 45008K used, 15400K free, 620K shrd, 5424K buff, 9336K cached
CPU: 1% usr 2% sys 0% nic 95% idle 0% io 0% irq 0% sirq
Load average: 0.00 0.00 0.00 1/69 23918
If the idle value is equal to or close to zero, you are running out of CPU cycles (typically the sirq value will be rather high).
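If you save the top output from both runs, a small script can pull out the idle and sirq numbers for comparison. This is only a sketch: it assumes the CPU line looks like the one quoted above, and the 5% idle threshold is an arbitrary illustration of "close to zero", not a recommended value.

```python
import re


def parse_cpu_line(line):
    """Turn a 'CPU: 1% usr 2% sys ...' line into a dict such as {'usr': 1, 'sys': 2}."""
    return {name: int(pct) for pct, name in re.findall(r"(\d+)%\s+(\w+)", line)}


def looks_cpu_starved(line, idle_threshold=5):
    """Return True if the idle percentage is at or below idle_threshold (assumed cutoff)."""
    # If no idle field is found, assume the CPU is not starved.
    return parse_cpu_line(line).get("idle", 100) <= idle_threshold


if __name__ == "__main__":
    sample = "CPU: 1% usr 2% sys 0% nic 95% idle 0% io 0% irq 0% sirq"
    fields = parse_cpu_line(sample)
    print("idle:", fields["idle"], "sirq:", fields["sirq"])
    print("starved:", looks_cpu_starved(sample))
```

Run it over the CPU lines captured with piece_of_cake and with layer_cake; if only the layer_cake run drops to near-zero idle, the CPU explanation becomes more likely.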
Could you quantify this a bit, please? Maybe you could post links to the detailed results pages of two dslreports speedtests, one with piece_of_cake and one with layer_cake (see https://forum.lede-project.org/t/sqm-qos-recommended-settings-for-the-dslreports-speedtest-bufferbloat-testing/2803 for how to configure the dslreports speedtest).
Then “they” are wrong: the difference is mainly in the use of priority tiers by layer_cake, and whether this is good or bad, or leads to better or worse performance, really depends on your needs; so in my eyes this is not a question of quality but of policy. So if you are happier with piece_of_cake, just use it… (That said, there might always be a bug in cake that leads to differences in performance, so if you think you have isolated a bug, please keep reporting it; bugs should be fixed.)
No, the bandwidth settings for piece_of_cake and layer_cake do not need to be different (you still might want to set them differently).
Hope that helps