kernel-ark/arch/arm/crypto
Eric Biggers ede9622162 crypto: arm/speck - add NEON-accelerated implementation of Speck-XTS
Add an ARM NEON-accelerated implementation of Speck-XTS.  It processes
data in 128-byte chunks, i.e. 8 blocks for Speck128 or 16 blocks for
Speck64.  Each 128-byte chunk goes through XTS preprocessing, is then
encrypted/decrypted (one cipher round is applied to all the blocks, then
the next round, etc.), and finally goes through XTS postprocessing.
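
The round ordering described above can be sketched in portable C.  This is
only an illustration of the idea, not the NEON assembly from this commit:
the struct layout, the function names, and the caller-supplied tweak and
round-key arrays are all hypothetical, and the key schedule and tweak
generation are omitted.  A block-major reference is included to show that
both orderings give the same result.

```c
#include <stdint.h>

#define BLOCKS_PER_CHUNK 8	/* 128 bytes / 16-byte Speck128 blocks */

/* Hypothetical in-memory form of one Speck128 block (two 64-bit words). */
struct blk128 { uint64_t x, y; };

static inline uint64_t ror64(uint64_t v, unsigned int n)
{
	return (v >> n) | (v << (64 - n));	/* valid for 0 < n < 64 */
}

static inline uint64_t rol64(uint64_t v, unsigned int n)
{
	return (v << n) | (v >> (64 - n));	/* valid for 0 < n < 64 */
}

/* One Speck128 encryption round applied to a single block. */
static inline void speck128_round(struct blk128 *b, uint64_t k)
{
	b->x = ror64(b->x, 8);
	b->x += b->y;
	b->x ^= k;
	b->y = rol64(b->y, 3);
	b->y ^= b->x;
}

/* XOR a per-block tweak into each block (XTS pre- and postprocessing). */
static void xor_tweaks(struct blk128 *c, const struct blk128 *t)
{
	for (int i = 0; i < BLOCKS_PER_CHUNK; i++) {
		c[i].x ^= t[i].x;
		c[i].y ^= t[i].y;
	}
}

/* Round-major order: round r is applied to all blocks before round r + 1. */
static void chunk_encrypt_round_major(struct blk128 *c, const struct blk128 *t,
				      const uint64_t *rk, int nrounds)
{
	xor_tweaks(c, t);
	for (int r = 0; r < nrounds; r++)
		for (int i = 0; i < BLOCKS_PER_CHUNK; i++)
			speck128_round(&c[i], rk[r]);
	xor_tweaks(c, t);
}

/* Block-major reference: each block runs through all rounds in turn. */
static void chunk_encrypt_block_major(struct blk128 *c, const struct blk128 *t,
				      const uint64_t *rk, int nrounds)
{
	xor_tweaks(c, t);
	for (int i = 0; i < BLOCKS_PER_CHUNK; i++)
		for (int r = 0; r < nrounds; r++)
			speck128_round(&c[i], rk[r]);
	xor_tweaks(c, t);
}
```

Because the blocks within a chunk are independent of each other, the
round-major loop maps naturally onto SIMD: the inner loop over blocks is
the part a vector unit can execute in parallel with a single round key.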

The performance depends on the processor but can be about 3 times faster
than the generic code.  For example, on an ARMv7 processor we observe
the following performance with Speck128/256-XTS:

    xts-speck128-neon:     Encryption 107.9 MB/s, Decryption 108.1 MB/s
    xts(speck128-generic): Encryption  32.1 MB/s, Decryption  36.6 MB/s

For comparison, AES-256-XTS without the Cryptography Extensions:

    xts-aes-neonbs:        Encryption  41.2 MB/s, Decryption  36.7 MB/s
    xts(aes-asm):          Encryption  31.7 MB/s, Decryption  30.8 MB/s
    xts(aes-generic):      Encryption  21.2 MB/s, Decryption  20.9 MB/s

Speck64/128-XTS is even faster:

    xts-speck64-neon:      Encryption 138.6 MB/s, Decryption 139.1 MB/s

Note that as with the generic code, only the Speck128 and Speck64
variants are supported.  Also, for now only the XTS mode of operation is
supported, to target the disk and file encryption use cases.  The NEON
code also only handles the portion of the data that is evenly divisible
into 128-byte chunks, with any remainder handled by a C fallback.  Of
course, other modes of operation could be added later if needed, and/or
the NEON code could be updated to handle other buffer sizes.
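
As a rough sketch of the split described above, the following C helper
divides a request into the part that is a whole number of 128-byte chunks
and the remainder left for a fallback.  The struct, the function name and
the macro are hypothetical and are not taken from the glue code in this
commit.

```c
#include <stddef.h>

#define NEON_CHUNK_SIZE 128	/* bytes handled per NEON chunk */

/* Hypothetical result type: how many bytes go to each code path. */
struct xts_split {
	size_t neon_bytes;	/* multiple of NEON_CHUNK_SIZE */
	size_t tail_bytes;	/* remainder, always < NEON_CHUNK_SIZE */
};

/* Split nbytes into a chunk-aligned prefix and a remainder. */
static struct xts_split split_for_neon(size_t nbytes)
{
	struct xts_split s;

	s.neon_bytes = nbytes - (nbytes % NEON_CHUNK_SIZE);
	s.tail_bytes = nbytes - s.neon_bytes;
	return s;
}
```

For example, a 1000-byte request would yield 896 bytes (7 chunks) for the
NEON path and 104 bytes for the fallback, while a 4096-byte request would
be handled entirely by the NEON path.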

The XTS specification is only defined for AES, which has a 128-bit block
size, so for the GF(2^64) math needed for Speck64-XTS we use the
reducing polynomial 'x^64 + x^4 + x^3 + x + 1' given by the original XEX
paper.  Of course, when possible users should use Speck128-XTS, but even
that may be too slow on some processors; Speck64-XTS can be faster.
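
To illustrate the GF(2^64) arithmetic, here is a minimal C sketch of
multiplying a 64-bit tweak by x modulo 'x^64 + x^4 + x^3 + x + 1'.  The
function name is hypothetical, and the sketch assumes the tweak has
already been loaded into a uint64_t with bit 63 holding the coefficient
of x^63; byte-order handling is left out.

```c
#include <stdint.h>

/*
 * Multiply t by x in GF(2^64).  Shifting left multiplies by x; if the
 * x^63 coefficient was set, the result overflows into x^64, which is
 * reduced by XORing in x^4 + x^3 + x + 1 (0x1b).
 */
static inline uint64_t gf64_mul_x(uint64_t t)
{
	uint64_t carry = t >> 63;

	return (t << 1) ^ (carry ? 0x1b : 0);
}
```

In XTS the tweak for each successive block is the previous tweak
multiplied by x, so a routine like this would be applied once per block.
The 128-bit case used with AES follows the same pattern with the
polynomial 'x^128 + x^7 + x^2 + x + 1', i.e. the constant 0x87.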

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-02-22 22:16:55 +08:00
.gitignore crypto: arm - ignore generated SHA2 assembly files 2015-07-06 16:32:03 +08:00
Kconfig crypto: arm/speck - add NEON-accelerated implementation of Speck-XTS 2018-02-22 22:16:55 +08:00
Makefile crypto: arm/speck - add NEON-accelerated implementation of Speck-XTS 2018-02-22 22:16:55 +08:00
aes-ce-core.S crypto: arm/aes-ce - remove cra_alignmask 2017-02-03 18:16:16 +08:00
aes-ce-glue.c crypto: algapi - make crypto_xor() take separate dst and src arguments 2017-08-04 09:27:15 +08:00
aes-cipher-core.S crypto: arm/aes-cipher - move S-box to .rodata section 2018-02-22 22:16:19 +08:00
aes-cipher-glue.c crypto: arm/aes - replace scalar AES cipher 2017-01-13 00:26:50 +08:00
aes-neonbs-core.S crypto: arm/aes - don't use IV buffer to return final keystream block 2017-02-03 18:16:21 +08:00
aes-neonbs-glue.c crypto: arm/aes-neonbs - Use PTR_ERR_OR_ZERO() 2017-12-11 22:36:56 +11:00
chacha20-neon-core.S crypto: arm/chacha20 - implement NEON version based on SSE3 code 2017-01-13 00:26:48 +08:00
chacha20-neon-glue.c crypto: arm/chacha20 - remove cra_alignmask 2017-02-03 18:16:19 +08:00
crc32-ce-core.S crypto: arm/crc32 - fix build error with outdated binutils 2017-03-01 19:47:51 +08:00
crc32-ce-glue.c crypto: hash - annotate algorithms taking optional key 2018-01-12 23:03:35 +11:00
crct10dif-ce-core.S crypto: arm/crct10dif - port x86 SSE implementation to ARM 2016-12-07 20:01:21 +08:00
crct10dif-ce-glue.c crypto: arm/crct10dif - port x86 SSE implementation to ARM 2016-12-07 20:01:21 +08:00
ghash-ce-core.S crypto: arm/ghash - add NEON accelerated fallback for vmull.p64 2017-08-04 09:27:24 +08:00
ghash-ce-glue.c crypto: arm/ghash - add NEON accelerated fallback for vmull.p64 2017-08-04 09:27:24 +08:00
sha1-armv4-large.S ARM: 7723/1: crypto: sha1-armv4-large.S: fix SP handling 2013-05-22 22:01:35 +01:00
sha1-armv7-neon.S crypto: arm/sha1-neon - add support for building in Thumb2 mode 2016-09-07 21:08:29 +08:00
sha1-ce-core.S crypto: arm/sha1-ce - move SHA-1 ARMv8 implementation to base layer 2015-04-10 21:39:44 +08:00
sha1-ce-glue.c crypto: arm/sha1-ce - enable module autoloading based on CPU feature bits 2017-06-01 12:55:40 +08:00
sha1.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
sha1_glue.c crypto: arm/sha1 - move SHA-1 ARM asm implementation to base layer 2015-04-10 21:39:42 +08:00
sha1_neon_glue.c crypto: arm/sha1_neon - move SHA-1 NEON implementation to base layer 2015-04-10 21:39:43 +08:00
sha2-ce-core.S crypto: arm/sha2-ce - move SHA-224/256 ARMv8 implementation to base layer 2015-04-10 21:39:45 +08:00
sha2-ce-glue.c crypto: arm/sha2-ce - enable module autoloading based on CPU feature bits 2017-06-01 12:55:41 +08:00
sha256-armv4.pl crypto: arm/sha256 - Add optimized SHA-256/224 2015-04-03 18:03:40 +08:00
sha256-core.S_shipped crypto: arm/sha256 - Add optimized SHA-256/224 2015-04-03 18:03:40 +08:00
sha256_glue.c crypto: arm/sha256 - move SHA-224/256 ASM/NEON implementation to base layer 2015-04-10 21:39:44 +08:00
sha256_glue.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
sha256_neon_glue.c crypto: arm/sha256 - move SHA-224/256 ASM/NEON implementation to base layer 2015-04-10 21:39:44 +08:00
sha512-armv4.pl crypto: arm/sha512 - accelerated SHA-512 using ARM generic ASM and NEON 2015-05-11 15:08:01 +08:00
sha512-core.S_shipped crypto: arm/sha512 - accelerated SHA-512 using ARM generic ASM and NEON 2015-05-11 15:08:01 +08:00
sha512-glue.c crypto: arm/sha512 - accelerated SHA-512 using ARM generic ASM and NEON 2015-05-11 15:08:01 +08:00
sha512-neon-glue.c crypto: arm/sha512 - accelerated SHA-512 using ARM generic ASM and NEON 2015-05-11 15:08:01 +08:00
sha512.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
speck-neon-core.S crypto: arm/speck - add NEON-accelerated implementation of Speck-XTS 2018-02-22 22:16:55 +08:00
speck-neon-glue.c crypto: arm/speck - add NEON-accelerated implementation of Speck-XTS 2018-02-22 22:16:55 +08:00