kernel-ark

Mathias Krause 66be895158 crypto: sha1 - SSSE3 based SHA1 implementation for x86-64

This is an assembler implementation of the SHA1 algorithm using the Supplemental SSE3 (SSSE3) instructions or, when available, the Advanced Vector Extensions (AVX).

Testing with the tcrypt module shows the raw hash performance is up to 2.3 times faster than the C implementation, using 8k data blocks on a Core 2 Duo T5500. For the smallest data set (16 byte) it is still 25% faster.

Since this implementation uses SSE/YMM registers it cannot safely be used in every situation, e.g. while an IRQ interrupts a kernel thread. The implementation falls back to the generic SHA1 variant if using the SSE/YMM registers is not possible.

With this algorithm I was able to increase the throughput of a single IPsec link from 344 Mbit/s to 464 Mbit/s on a Core 2 Quad CPU using the SSSE3 variant -- a speedup of +34.8%.

Saving and restoring SSE/YMM state might make the actual throughput fluctuate when there are FPU-intensive userland applications running. For example, measuring the performance using iperf2 directly on the machine under test gives wobbling numbers because iperf2 uses the FPU for each packet to check if the reporting interval has expired (in the above test I got min/max/avg: 402/484/464 Mbit/s).

Using this algorithm on an IPsec gateway gives much more reasonable and stable numbers, albeit not as high as in the directly connected case. Here is the result from an RFC 2544 test run with an EXFO Packet Blazer FTB-8510:

  frame size    sha1-generic     sha1-ssse3    delta
    64 byte      37.5 Mbit/s     37.5 Mbit/s     0.0%
   128 byte      56.3 Mbit/s     62.5 Mbit/s   +11.0%
   256 byte      87.5 Mbit/s    100.0 Mbit/s   +14.3%
   512 byte     131.3 Mbit/s    150.0 Mbit/s   +14.2%
  1024 byte     162.5 Mbit/s    193.8 Mbit/s   +19.3%
  1280 byte     175.0 Mbit/s    212.5 Mbit/s   +21.4%
  1420 byte     175.0 Mbit/s    218.7 Mbit/s   +25.0%
  1518 byte     150.0 Mbit/s    181.2 Mbit/s   +20.8%

The throughput for the largest frame size is lower than for the previous size because the IP packets need to be fragmented in this case to make their way through the IPsec tunnel.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Maxim Locktyukhin <maxim.locktyukhin@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

2011-08-10 19:00:29 +08:00
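
The fallback described in the commit message (use the SIMD path only when the SSE/YMM registers may be touched, otherwise hash with the generic C code) could be sketched in user space roughly as follows. This is only an illustration of the dispatch pattern, not the code added by this commit: every name here (`sha1_ctx_sketch`, `simd_usable`, `sha1_update_dispatch`, and the two stub update functions) is hypothetical, and the stubs only record which path ran instead of computing a hash. In the x86 kernel, helpers such as `irq_fpu_usable()` and `kernel_fpu_begin()`/`kernel_fpu_end()` exist for this purpose; that the commit uses exactly those is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the kernel's API. */
enum sha1_path { SHA1_PATH_GENERIC, SHA1_PATH_SSSE3 };

struct sha1_ctx_sketch {
    uint64_t bytes_hashed;     /* total input length seen so far */
    enum sha1_path last_path;  /* which implementation handled the last update */
};

/* Models "SSE/YMM registers may be used in the current context". */
static bool simd_usable_flag = true;

static bool simd_usable(void)
{
    return simd_usable_flag;
}

/* Stub for the portable C implementation: only does bookkeeping. */
static void sha1_update_generic(struct sha1_ctx_sketch *ctx,
                                const uint8_t *data, size_t len)
{
    (void)data;
    ctx->bytes_hashed += len;
    ctx->last_path = SHA1_PATH_GENERIC;
}

/* Stub for the SSSE3/AVX assembler implementation: only does bookkeeping. */
static void sha1_update_ssse3(struct sha1_ctx_sketch *ctx,
                              const uint8_t *data, size_t len)
{
    (void)data;
    ctx->bytes_hashed += len;
    ctx->last_path = SHA1_PATH_SSSE3;
}

/* Pick the SIMD path only when the vector registers are safe to use. */
static void sha1_update_dispatch(struct sha1_ctx_sketch *ctx,
                                 const uint8_t *data, size_t len)
{
    assert(ctx != NULL);

    if (!simd_usable()) {
        /* e.g. an IRQ interrupted a kernel thread: stay off SSE/YMM. */
        sha1_update_generic(ctx, data, len);
        return;
    }

    /* A kernel version would save FPU state here and restore it afterwards. */
    sha1_update_ssse3(ctx, data, len);
}
```

A caller would invoke `sha1_update_dispatch()` for every chunk of input; both paths update the same context, so the choice can differ from one call to the next without affecting the result.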
..
async_tx	net: remove mm.h inclusion from netdevice.h	2011-06-21 19:17:20 -07:00
ablkcipher.c	crypto: skcipher - remove redundant NULL check	2011-01-29 15:09:43 +11:00
aead.c
aes_generic.c
af_alg.c	atomic: use <linux/atomic.h>	2011-07-26 16:49:47 -07:00
ahash.c	crypto: hash - Fix handling of small unaligned buffers	2010-08-06 09:26:38 +08:00
algapi.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6	2010-05-03 11:28:58 +08:00
algboss.c	crypto: testmgr - Fix test disabling option	2010-08-06 09:40:28 +08:00
algif_hash.c	crypto: algif_hash - Handle initial af_alg_make_sg error correctly	2011-06-30 07:44:06 +08:00
algif_skcipher.c	crypto: algif_skcipher - Handle unaligned receive buffer	2010-11-30 17:04:31 +08:00
ansi_cprng.c	Fix common misspellings	2011-03-31 11:26:23 -03:00
anubis.c
api.c
arc4.c	crypto: arc4 - Fixed coding style issues	2011-06-30 07:44:05 +08:00
authenc.c	crypto: Use scatterwalk_crypto_chain	2010-12-02 14:47:16 +08:00
authencesn.c	crypto: authencesn - Add algorithm to handle IPsec extended sequence numbers	2011-03-13 20:22:27 -07:00
blkcipher.c	mm: strictly nested kmap_atomic()	2010-10-26 16:52:08 -07:00
blowfish.c
camellia.c
cast5.c	crypto: cast5 - simplify if-statements	2010-11-13 21:47:55 +09:00
cast6.c
cbc.c
ccm.c
chainiv.c
cipher.c
compress.c
crc32c.c	crypto: crc32c - Fixed coding style issue	2011-06-30 07:44:05 +08:00
cryptd.c	crypto: cryptd - Adding the AEAD interface type support to cryptd	2010-09-20 16:05:12 +08:00
crypto_null.c
crypto_wq.c	crypto: mark crypto workqueues CPU_INTENSIVE	2011-01-04 23:34:08 +11:00
ctr.c	crypto: Use ERR_CAST	2010-05-26 10:36:51 +10:00
cts.c
deflate.c	net+crypto: Use vmalloc for zlib inflate buffers.	2011-06-29 05:48:41 -07:00
des_generic.c	Blackfin: Rename DES PC2() symbol to avoid collision	2010-10-07 14:08:50 +01:00
ecb.c
eseqiv.c	crypto: Use scatterwalk_crypto_chain	2010-12-02 14:47:16 +08:00
fcrypt.c
fips.c
gcm.c	crypto: Use scatterwalk_crypto_chain	2010-12-02 14:47:16 +08:00
gf128mul.c	crypto: gf128mul - fix call to memset()	2011-07-08 17:21:21 +08:00
ghash-generic.c
hmac.c
internal.h
Kconfig	crypto: sha1 - SSSE3 based SHA1 implementation for x86-64	2011-08-10 19:00:29 +08:00
khazad.c
krng.c
lrw.c
lzo.c
Makefile	crypto: authencesn - Add algorithm to handle IPsec extended sequence numbers	2011-03-13 20:22:27 -07:00
md4.c
md5.c
michael_mic.c
pcbc.c
pcompress.c
pcrypt.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6	2011-01-13 10:25:58 -08:00
proc.c	atomic: use <linux/atomic.h>	2011-07-26 16:49:47 -07:00
ripemd.h
rmd128.c	crypto: ripemd - Set module author and update email address	2011-01-04 23:34:03 +11:00
rmd160.c	crypto: ripemd - Set module author and update email address	2011-01-04 23:34:03 +11:00
rmd256.c	crypto: ripemd - Set module author and update email address	2011-01-04 23:34:03 +11:00
rmd320.c	crypto: ripemd - Set module author and update email address	2011-01-04 23:34:03 +11:00
rng.c	atomic: use <linux/atomic.h>	2011-07-26 16:49:47 -07:00
salsa20_generic.c
scatterwalk.c	crypto: scatterwalk - Fix scatterwalk_done() test	2010-05-19 14:06:29 +10:00
seed.c
seqiv.c
serpent.c
sha1_generic.c	crypto: sha1 - export sha1_update for reuse	2011-08-10 19:00:28 +08:00
sha256_generic.c
sha512_generic.c
shash.c	crypto: hash - Fix async import on shash algorithm	2010-11-04 14:48:37 -04:00
tcrypt.c	crypto: tcrypt - CTR mode speed test for AES	2011-05-04 15:06:37 +10:00
tcrypt.h
tea.c
testmgr.c	crypto: testmgr - add support for aes ofb mode	2011-05-04 15:04:10 +10:00
testmgr.h	crypto: testmgr - add xts-aes-256 self-test	2011-06-30 07:44:00 +08:00
tgr192.c
twofish_common.c
twofish_generic.c	crypto: twofish: Rename twofish to twofish_generic and add an alias	2010-06-03 21:02:51 +10:00
vmac.c	Fix common misspellings	2011-03-31 11:26:23 -03:00
wp512.c
xcbc.c
xor.c
xts.c	Fix common misspellings	2011-03-31 11:26:23 -03:00
zlib.c	net+crypto: Use vmalloc for zlib inflate buffers.	2011-06-29 05:48:41 -07:00