HW crypto - Marvell CESA working?

Has anyone managed to get marvell_cesa working via cryptodev?
/dev/crypto exists, the marvell_cesa module is loaded, and /proc/crypto shows marvell_cesa as its option instead of the kernel implementation.
But openssl doesn’t load any cryptodev engine:

root@turris:~# openssl engine
(dynamic) Dynamic engine loading support

Found it!
The openssl sources in turris-os don’t include the cryptodev.h header.
The problem is that I can’t seem to build packages…
The libc I build is a different version from the one installed.
Installed: libc - 1.1.11-4
Built: libc - 1.1.11-3

Maybe you need this branch. https://api.turris.cz/openwrt-repo/turris-stable/
If you use the medikit from the same folder all package versions should match.

That’s for the older Turris (Freescale CPU based), not the Marvell-based Omnia.

Use this one: https://api.turris.cz/openwrt-repo/omnia-stable/

It wasn’t so hard to remove “turris-stable” from the URL to see what is inside openwrt-repo :stuck_out_tongue:

omnia-stable has packages that are too old.

I’ve solved the issue simply by installing:

opkg install kmod-crypto-ocf

and rebooting

After the reboot the cryptodev engine works.
But performance is not that great…
~50mbps 1k packets aes-128-cbc
~47mbps 1k packets aes-256-cbc

I get better performance inside the Debian LXC container: ~59mbps 1k packets aes-128-cbc
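
On units: `openssl speed` reports thousands of bytes per second, so if the figures above were read from its output, "mbps" would mean MB/s rather than Mbit/s. That reading is an assumption; the posts do not say. A minimal sketch of the conversion, with `kbytes_to_mbit` as a made-up helper name:

```python
def kbytes_to_mbit(k_per_s: float) -> float:
    """Convert an `openssl speed` figure (1000s of bytes/s) to Mbit/s."""
    return k_per_s * 1000 * 8 / 1_000_000


# ~50 MB/s, i.e. 50000k in openssl speed notation (assumed reading of "~50mbps")
print(f"{kbytes_to_mbit(50_000):.0f} Mbit/s")
```

Under that assumption, ~50 MB/s works out to about 400 Mbit/s.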

What is the performance without the cesa module loaded?

without:

type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256 cbc     30844.58k    32352.51k    33027.24k    33210.71k    33243.14k

with:

type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-cbc      1166.66k     4620.93k    17200.47k    49008.30k    93656.41k
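
To make the comparison easier to read, here is a small sketch that divides the "with" row by the "without" row per block size. The numbers are copied from the two tables above and taken at face value; the variable and function names are only illustrative.

```python
# Throughput rows copied from the two tables above (1000s of bytes per second).
BLOCK_SIZES = [16, 64, 256, 1024, 8192]
WITHOUT_CESA = [30844.58, 32352.51, 33027.24, 33210.71, 33243.14]
WITH_CESA = [1166.66, 4620.93, 17200.47, 49008.30, 93656.41]


def speedup(with_hw, without_hw):
    """Return the with/without ratio per block size; >1 means the offload wins."""
    return [w / s for w, s in zip(with_hw, without_hw)]


for size, ratio in zip(BLOCK_SIZES, speedup(WITH_CESA, WITHOUT_CESA)):
    verdict = "faster" if ratio > 1 else "slower"
    print(f"{size:>5} bytes: {ratio:5.2f}x ({verdict} with cesa)")
```

On these figures the offload only pays off from 1024-byte blocks upwards (roughly 1.5x) and reaches roughly 2.8x at 8192 bytes, while small blocks are much slower than the software path.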

What command do you use to test these speeds, with and without crypto?
I have a box running pfSense right now and would like to have a comparison.

without cryptodev:

openssl speed aes-256-cbc

with cryptodev:

openssl speed -evp aes-256-cbc

2.4GHz C2558 with pfSense loaded

[2.3.2-RELEASE][: openssl speed aes-256-cbc
Doing aes-256 cbc for 3s on 16 size blocks: 5386446 aes-256 cbc’s in 3.00s
Doing aes-256 cbc for 3s on 64 size blocks: 1552874 aes-256 cbc’s in 3.00s
Doing aes-256 cbc for 3s on 256 size blocks: 401610 aes-256 cbc’s in 3.00s
Doing aes-256 cbc for 3s on 1024 size blocks: 260036 aes-256 cbc’s in 3.00s
Doing aes-256 cbc for 3s on 8192 size blocks: 32978 aes-256 cbc’s in 3.00s
OpenSSL 1.0.1s-freebsd 1 Mar 2016
built on: date not available
options:bn(64,64) rc4(16x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: clang
The ‘numbers’ are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256 cbc     28727.71k    33127.98k    34270.72k    88758.95k    90051.93k
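
For anyone wondering how the summary row relates to the "Doing ..." lines: each figure is blocks processed times block size, divided by the measured seconds, in thousands of bytes. A short sketch using the counts from the output above (`throughput_k` is just an illustrative name):

```python
def throughput_k(blocks_done: int, block_size: int, seconds: float) -> float:
    """Throughput in 1000s of bytes per second, as in the openssl speed summary row."""
    return blocks_done * block_size / seconds / 1000


# (block size, blocks processed, seconds) from the "Doing aes-256 cbc ..." lines above
runs = [
    (16, 5386446, 3.00),
    (64, 1552874, 3.00),
    (256, 401610, 3.00),
    (1024, 260036, 3.00),
    (8192, 32978, 3.00),
]

for size, count, secs in runs:
    print(f"{size:>5} bytes: {throughput_k(count, size, secs):.2f}k")
```

This reproduces the row above, for example 28727.71k for 16-byte blocks. Note that the divisor is whatever time openssl measured, which matters once work is handed off to hardware.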

If you rerun the test with the -evp switch it should use the AES-NI instructions and perform a lot faster.

Supermicro C2758 motherboard running pfSense 2.3.2.

Here is my output with the above commands:
[2.3.2-RELEASE][admin@pfSense.localdomain]/root: openssl speed aes-256-cbc
Doing aes-256 cbc for 3s on 16 size blocks: 5624461 aes-256 cbc’s in 3.00s
Doing aes-256 cbc for 3s on 64 size blocks: 1546310 aes-256 cbc’s in 3.01s
Doing aes-256 cbc for 3s on 256 size blocks: 399500 aes-256 cbc’s in 3.00s
Doing aes-256 cbc for 3s on 1024 size blocks: 258261 aes-256 cbc’s in 2.99s
Doing aes-256 cbc for 3s on 8192 size blocks: 32763 aes-256 cbc’s in 3.00s
OpenSSL 1.0.1s-freebsd 1 Mar 2016
The ‘numbers’ are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256 cbc     29997.13k    32902.26k    34090.67k    88383.25k    89464.83k
[2.3.2-RELEASE][admin@pfSense.localdomain]/root: openssl speed -elapsed -evp aes-256-cbc
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-cbc for 3s on 16 size blocks: 940842 aes-256-cbc’s in 3.00s
Doing aes-256-cbc for 3s on 64 size blocks: 887360 aes-256-cbc’s in 3.00s
Doing aes-256-cbc for 3s on 256 size blocks: 736515 aes-256-cbc’s in 3.00s
Doing aes-256-cbc for 3s on 1024 size blocks: 445157 aes-256-cbc’s in 3.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 92036 aes-256-cbc’s in 3.00s
OpenSSL 1.0.1s-freebsd 1 Mar 2016
The ‘numbers’ are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-cbc      5017.82k    18930.35k    62849.28k   151946.92k   251319.64k

The same supermicro C2758 board result with aes-256-gcm:

[2.3.2-RELEASE][admin@pfSense.localdomain]/root: env OPENSSL_ia32cap=0 openssl speed -elapsed -evp aes-256-gcm
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-gcm for 3s on 16 size blocks: 4136182 aes-256-gcm’s in 3.00s
Doing aes-256-gcm for 3s on 64 size blocks: 1240438 aes-256-gcm’s in 3.00s
Doing aes-256-gcm for 3s on 256 size blocks: 324611 aes-256-gcm’s in 3.00s
Doing aes-256-gcm for 3s on 1024 size blocks: 82946 aes-256-gcm’s in 3.03s
Doing aes-256-gcm for 3s on 8192 size blocks: 10350 aes-256-gcm’s in 3.02s
OpenSSL 1.0.1s-freebsd 1 Mar 2016
The ‘numbers’ are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-gcm     22059.64k    26462.68k    27700.14k    28020.36k    28115.96k
[2.3.2-RELEASE][admin@pfSense.localdomain]/root: openssl speed -elapsed -evp aes-256-gcm
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-gcm for 3s on 16 size blocks: 20013605 aes-256-gcm’s in 3.00s
Doing aes-256-gcm for 3s on 64 size blocks: 8781653 aes-256-gcm’s in 3.00s
Doing aes-256-gcm for 3s on 256 size blocks: 2787748 aes-256-gcm’s in 3.01s
Doing aes-256-gcm for 3s on 1024 size blocks: 751487 aes-256-gcm’s in 3.01s
Doing aes-256-gcm for 3s on 8192 size blocks: 95512 aes-256-gcm’s in 3.00s
OpenSSL 1.0.1s-freebsd 1 Mar 2016
The ‘numbers’ are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-gcm    106739.23k   187341.93k   237269.94k   255841.31k   260811.43k

For your reference, an ASRock N3150 motherboard running XPEnology DSM 6.0.2:

admin@DiskStation:~$ openssl speed aes-256-cbc
Doing aes-256 cbc for 3s on 16 size blocks: 4375351 aes-256 cbc’s in 2.98s
Doing aes-256 cbc for 3s on 64 size blocks: 1248789 aes-256 cbc’s in 2.99s
Doing aes-256 cbc for 3s on 256 size blocks: 322656 aes-256 cbc’s in 2.98s
Doing aes-256 cbc for 3s on 1024 size blocks: 220279 aes-256 cbc’s in 3.00s
Doing aes-256 cbc for 3s on 8192 size blocks: 27925 aes-256 cbc’s in 3.00s
OpenSSL 1.0.2j-fips 26 Sep 2016
The ‘numbers’ are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256 cbc     23491.82k    26729.93k    27718.10k    75188.57k    76253.87k
admin@DiskStation:~$ openssl speed -elapsed -evp aes-256-cbc
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-cbc for 3s on 16 size blocks: 25970494 aes-256-cbc’s in 3.00s
Doing aes-256-cbc for 3s on 64 size blocks: 9717522 aes-256-cbc’s in 3.00s
Doing aes-256-cbc for 3s on 256 size blocks: 2840404 aes-256-cbc’s in 3.00s
Doing aes-256-cbc for 3s on 1024 size blocks: 742252 aes-256-cbc’s in 3.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 93931 aes-256-cbc’s in 3.00s
OpenSSL 1.0.2j-fips 26 Sep 2016
The ‘numbers’ are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-cbc    138509.30k   207307.14k   242381.14k   253355.35k   256494.25k

ASRock N3150 running XPEnology DSM 6.0.2, aes-256-gcm output:

admin@DiskStation:~$ OPENSSL_ia32cap=0 openssl speed -elapsed -evp aes-256-gcm
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-gcm for 3s on 16 size blocks: 3485519 aes-256-gcm’s in 3.00s
Doing aes-256-gcm for 3s on 64 size blocks: 1047222 aes-256-gcm’s in 3.00s
Doing aes-256-gcm for 3s on 256 size blocks: 272000 aes-256-gcm’s in 3.00s
Doing aes-256-gcm for 3s on 1024 size blocks: 68721 aes-256-gcm’s in 3.00s
Doing aes-256-gcm for 3s on 8192 size blocks: 8634 aes-256-gcm’s in 3.00s
OpenSSL 1.0.2j-fips 26 Sep 2016
The ‘numbers’ are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-gcm     18589.43k    22340.74k    23210.67k    23456.77k    23576.58k
admin@DiskStation:~$ openssl speed -elapsed -evp aes-256-gcm
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-gcm for 3s on 16 size blocks: 16866142 aes-256-gcm’s in 3.00s
Doing aes-256-gcm for 3s on 64 size blocks: 8257333 aes-256-gcm’s in 3.00s
Doing aes-256-gcm for 3s on 256 size blocks: 2757682 aes-256-gcm’s in 3.00s
Doing aes-256-gcm for 3s on 1024 size blocks: 759749 aes-256-gcm’s in 3.00s
Doing aes-256-gcm for 3s on 8192 size blocks: 97330 aes-256-gcm’s in 3.00s
OpenSSL 1.0.2j-fips 26 Sep 2016
The ‘numbers’ are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-gcm     89952.76k   176156.44k   235322.20k   259327.66k   265775.79k

pfSense 2.3.2 uses OpenSSL 1.0.1.
Synology DSM 6.0.2 uses OpenSSL 1.0.2.

Are you sure this is right? IIRC, you have to use -elapsed to get openssl to measure actual time, and not CPU time in this case.
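
The reason this matters: by default `openssl speed` divides by user CPU time, and a process that hands its work to a device mostly waits instead of burning CPU, so the CPU-time figure can come out far too high. The sketch below is not openssl itself, only a plain Python illustration of the two clocks, with `time.sleep` standing in for waiting on the crypto engine and a made-up byte count.

```python
import time


def measure(work):
    """Return (wall_seconds, cpu_seconds) spent in work()."""
    wall0, cpu0 = time.perf_counter(), time.process_time()
    work()
    return time.perf_counter() - wall0, time.process_time() - cpu0


# Stand-in for a request handed to a crypto engine: the process just waits.
wall, cpu = measure(lambda: time.sleep(0.5))

nbytes = 1_000_000  # hypothetical amount of data "processed" during the wait
print(f"wall clock: {wall:.3f}s -> {nbytes / wall / 1000:.0f}k")
print(f"cpu time:   {cpu:.6f}s (dividing by this would inflate the rate)")
```

The wall-clock time includes the wait, the CPU time barely moves, which is why `-elapsed` is the safer switch when an engine such as cryptodev is involved.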

This is my Linksys WRT1200AC router running OpenWrt Chaos Calmer without hardware encryption.

openssl speed aes-128-cbc aes-256-cbc
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128 cbc     22115.49k    23533.72k    23918.11k    23759.20k    23392.41k
aes-256 cbc     17021.01k    17849.96k    18528.97k    18411.86k    18751.19k

Results with -evp
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc     57870.11k   166301.54k   500421.82k  2999330.13k  5462835.20k
aes-256-cbc     34949.69k   203289.60k   588880.00k  1769523.20k  8192819.20k

Results with -elapsed
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128 cbc     22028.01k    23443.88k    23716.69k    23905.96k    23344.47k
aes-256 cbc     16927.14k    18014.42k    18365.10k    18414.25k    18614.95k

Results with -elapsed -evp
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc      2164.35k     7273.90k    18415.27k    30216.19k    36088.49k
aes-256-cbc      2054.61k     6656.23k    15593.65k    23533.91k    27211.09k

Build Info:

OpenSSL 1.0.2g  1 Mar 2016
built on: reproducible build, date unspecified
options:bn(64,32) rc4(ptr,char) des(idx,cisc,2,long) aes(partial) blowfish(ptr) 
compiler: ccache_cc -I. -I.. -I../include  -fPIC -DOPENSSL_PIC -DZLIB_SHARED -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -I/home/jow/relman-cc.1/.cache/sdk/mvebu/generic/staging_dir/target-arm_cortex-a9+vfpv3_uClibc-0.9.33.2_eabi/usr/include -I/home/jow/relman-cc.1/.cache/sdk/mvebu/generic/staging_dir/target-arm_cortex-a9+vfpv3_uClibc-0.9.33.2_eabi/include -I/home/jow/relman-cc.1/.cache/sdk/mvebu/generic/staging_dir/toolchain-arm_cortex-a9+vfpv3_gcc-4.8-linaro_uClibc-0.9.33.2_eabi/usr/include -I/home/jow/relman-cc.1/.cache/sdk/mvebu/generic/staging_dir/toolchain-arm_cortex-a9+vfpv3_gcc-4.8-linaro_uClibc-0.9.33.2_eabi/include -DOPENSSL_SMALL_FOOTPRINT -DHAVE_CRYPTODEV -DOPENSSL_NO_ERR -DTERMIOS -Os -pipe -march=armv7-a -mtune=cortex-a9 -mfpu=vfpv3-d16 -fno-caller-saves -fhonour-copts -Wno-error=unused-but-set-variable -Wno-error=unused-result -mfloat-abi=soft -fpic -fomit-frame-pointer -Wal

For reference, my desktop system (Debian jessie-backports, kernel 4.7) with an i3-6100T:

Results with -elapsed
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128 cbc    126771.10k   141282.43k   144572.76k   146331.65k   146792.45k
aes-256 cbc     93610.62k   101680.19k   102504.11k   101647.02k    95764.48k

Results with -elapsed -evp
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc   1041366.59k  1159505.11k  1202731.69k  1211473.24k  1214163.63k
aes-256 cbc     91820.65k   101493.80k   102946.30k   103063.89k   101029.21k

Build Info:

OpenSSL 1.0.2j  26 Sep 2016
built on: reproducible build, date unspecified
options:bn(64,64) rc4(16x,int) des(idx,cisc,16,int) aes(partial) blowfish(idx) 
compiler: gcc -I. -I.. -I../include  -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -m64 -DL_ENDIAN -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -Wl,-z,relro -Wa,--noexecstack -Wall -DMD32_REG_T=int -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DRC4_ASM -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM -DECP_NISTZ256_ASM

Thanks for the pfSense comparison. It seems these appliances can do everything I used OpenWrt for out of the box and are a lot more powerful, but also more expensive. I’m giving the SG-2220 appliance a try.

Does ssh run stably with marvell_cesa? With strongSwan the router sometimes reboots when IPsec starts, but it is stable afterwards.
The reboot takes only a few seconds but can be seen on the console.

There is not much under my control. I just copied and pasted the command.