aha/crypto
Huang Ying 254eff7714 crypto: cryptd - Per-CPU thread implementation based on kcrypto_wq
Original cryptd thread implementation has scalability issue, this
patch solve the issue with a per-CPU thread implementation.

struct cryptd_queue is defined to be a per-CPU queue, which holds one
struct cryptd_cpu_queue for each CPU. In struct cryptd_cpu_queue, a
struct crypto_queue holds all requests for the CPU, a struct
work_struct is used to run all requests for the CPU.

Testing based on dm-crypt on an Intel Core 2 E6400 (two cores) machine
shows 19.2% performance gain. The testing script is as follow:

-------------------- script begin ---------------------------
#!/bin/sh

dmc_create()
{
        # Create a crypt device using dmsetup
        dmsetup create $2 --table "0 `blockdev --getsize $1` crypt cbc(aes-asm)?cryptd?plain:plain babebabebabebabebabebabebabebabe 0 $1 0"
}

dmsetup remove crypt0
dmsetup remove crypt1

dd if=/dev/zero of=/dev/ram0 bs=1M count=4 >& /dev/null
dd if=/dev/zero of=/dev/ram1 bs=1M count=4 >& /dev/null

dmc_create /dev/ram0 crypt0
dmc_create /dev/ram1 crypt1

cat >tr.sh <<EOF
#!/bin/sh

for n in \$(seq 10); do
        dd if=/dev/dm-0 of=/dev/null >& /dev/null &
        dd if=/dev/dm-1 of=/dev/null >& /dev/null &
done
wait
EOF

for n in $(seq 10); do
        /usr/bin/time sh tr.sh
done
rm tr.sh
-------------------- script end   ---------------------------

The separator of dm-crypt parameter is changed from "-" to "?", because
"-" is used in some cipher driver name too, and cryptds need to specify
cipher driver name instead of cipher name.

The test result on an Intel Core2 E6400 (two cores) is as follow:

without patch:
-----------------wo begin --------------------------
0.04user 0.38system 0:00.39elapsed 107%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6566minor)pagefaults 0swaps
0.07user 0.35system 0:00.35elapsed 121%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6567minor)pagefaults 0swaps
0.06user 0.34system 0:00.30elapsed 135%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6562minor)pagefaults 0swaps
0.05user 0.37system 0:00.36elapsed 119%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6607minor)pagefaults 0swaps
0.06user 0.36system 0:00.35elapsed 120%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6562minor)pagefaults 0swaps
0.05user 0.37system 0:00.31elapsed 136%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6594minor)pagefaults 0swaps
0.04user 0.34system 0:00.30elapsed 126%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6597minor)pagefaults 0swaps
0.06user 0.32system 0:00.31elapsed 125%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6571minor)pagefaults 0swaps
0.06user 0.34system 0:00.31elapsed 134%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6581minor)pagefaults 0swaps
0.05user 0.38system 0:00.31elapsed 138%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6600minor)pagefaults 0swaps
-----------------wo end   --------------------------


with patch:
------------------w begin --------------------------
0.02user 0.31system 0:00.24elapsed 141%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6554minor)pagefaults 0swaps
0.05user 0.34system 0:00.31elapsed 127%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6606minor)pagefaults 0swaps
0.07user 0.33system 0:00.26elapsed 155%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6559minor)pagefaults 0swaps
0.07user 0.32system 0:00.26elapsed 151%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6562minor)pagefaults 0swaps
0.05user 0.34system 0:00.26elapsed 150%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6603minor)pagefaults 0swaps
0.03user 0.36system 0:00.31elapsed 124%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6562minor)pagefaults 0swaps
0.04user 0.35system 0:00.26elapsed 147%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6586minor)pagefaults 0swaps
0.03user 0.37system 0:00.27elapsed 146%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6562minor)pagefaults 0swaps
0.04user 0.36system 0:00.26elapsed 154%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6594minor)pagefaults 0swaps
0.04user 0.35system 0:00.26elapsed 154%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6557minor)pagefaults 0swaps
------------------w end   --------------------------

The middle value of elapsed time is:
wo cryptwq: 0.31
w  cryptwq: 0.26

The performance gain is about (0.31-0.26)/0.26 = 0.192.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2009-02-19 14:42:19 +08:00
..
async_tx dmaengine: replace dma_async_client_register with dmaengine_get 2009-01-06 11:38:17 -07:00
ablkcipher.c crypto: skcipher - Avoid infinite loop when cipher fails selftest 2009-02-18 21:20:06 +08:00
aead.c crypto: aead - Avoid infinite loop when nivaead fails selftest 2009-02-18 21:21:24 +08:00
aes_generic.c crypto: aes - Precompute tables 2008-12-25 11:05:13 +11:00
ahash.c crypto: hash - Make setkey optional 2008-12-25 11:02:06 +11:00
algapi.c crypto: api - Fix algorithm test race that broke aead initialisation 2009-01-28 14:09:59 +11:00
algboss.c crypto: testmgr - Test skciphers with no IVs 2009-02-18 21:41:29 +08:00
ansi_cprng.c crypto: ansi_cprng - Panic on CPRNG test failure when in FIPS mode 2009-02-18 16:48:07 +08:00
anubis.c [CRYPTO] all: Clean up init()/fini() 2008-04-21 10:19:34 +08:00
api.c crypto: api - Fix crypto_alloc_tfm/create_create_tfm return convention 2009-02-18 16:56:59 +08:00
arc4.c [CRYPTO] api: Get rid of flags argument to setkey 2006-09-21 11:41:02 +10:00
authenc.c crypto: authenc - Fix zero-length IV crash 2009-01-15 15:33:49 +11:00
blkcipher.c crypto: skcipher - Avoid infinite loop when cipher fails selftest 2009-02-18 21:20:06 +08:00
blowfish.c [CRYPTO] all: Clean up init()/fini() 2008-04-21 10:19:34 +08:00
camellia.c crypto: camellia - use kernel-provided bitops, unaligned access 2008-12-25 11:01:15 +11:00
cast5.c [CRYPTO] all: Clean up init()/fini() 2008-04-21 10:19:34 +08:00
cast6.c [CRYPTO] all: Clean up init()/fini() 2008-04-21 10:19:34 +08:00
cbc.c Convert ERR_PTR(PTR_ERR(p)) instances to ERR_CAST(p) 2008-02-07 08:42:26 -08:00
ccm.c crypto: ccm - Fix handling of null assoc data 2009-01-27 17:11:15 +11:00
chainiv.c crypto: skcipher - Use RNG interface instead of get_random_bytes 2008-08-29 15:50:06 +10:00
cipher.c [CRYPTO] api: Add missing headers for setkey_unaligned 2007-10-10 16:55:40 -07:00
compress.c cleanup asm/scatterlist.h includes 2007-11-02 08:47:06 +01:00
crc32c.c libcrc32c: Move implementation to crypto crc32c 2008-12-25 11:01:40 +11:00
cryptd.c crypto: cryptd - Per-CPU thread implementation based on kcrypto_wq 2009-02-19 14:42:19 +08:00
crypto_null.c crypto: null - Switch to shash 2008-12-25 11:02:07 +11:00
crypto_wq.c crypto: api - Use dedicated workqueue for crypto subsystem 2009-02-19 14:33:40 +08:00
ctr.c [CRYPTO] seqiv: Add Sequence Number IV Generator 2008-01-11 08:16:48 +11:00
cts.c [CRYPTO] cts: Init SG tables 2008-06-02 15:46:51 +10:00
deflate.c [CRYPTO] all: Clean up init()/fini() 2008-04-21 10:19:34 +08:00
des_generic.c crypto: des3_ede - permit weak keys unless REQ_WEAK_KEY set 2008-12-25 11:02:28 +11:00
digest.c crypto: hash - Fix digest size check for digest type 2008-08-13 20:08:38 +10:00
ecb.c Convert ERR_PTR(PTR_ERR(p)) instances to ERR_CAST(p) 2008-02-07 08:42:26 -08:00
eseqiv.c crypto: skcipher - Use RNG interface instead of get_random_bytes 2008-08-29 15:50:06 +10:00
fcrypt.c crypto: remove uses of __constant_{endian} helpers 2008-12-25 11:02:03 +11:00
fips.c crypto: api - Add fips_enable flag 2008-08-29 15:50:02 +10:00
gcm.c [CRYPTO] gcm: Introduce rfc4106 2008-01-11 08:16:56 +11:00
gf128mul.c [CRYPTO] xts: XTS blockcipher mode implementation without partial blocks 2007-10-10 16:55:45 -07:00
hash.c crypto: hash - Move ahash functions into crypto/hash.h 2008-07-10 20:35:18 +08:00
hmac.c crypto: hash - Export shash through hash 2008-12-25 11:01:33 +11:00
internal.h crypto: api - Fix crypto_alloc_tfm/create_create_tfm return convention 2009-02-18 16:56:59 +08:00
Kconfig crypto: cryptd - Per-CPU thread implementation based on kcrypto_wq 2009-02-19 14:42:19 +08:00
khazad.c [CRYPTO] all: Clean up init()/fini() 2008-04-21 10:19:34 +08:00
krng.c crypto: rng - RNG interface and implementation 2008-08-29 15:50:04 +10:00
lrw.c crypto: lrw - Fix big endian support 2009-02-17 20:00:11 +08:00
lzo.c [CRYPTO] all: Clean up init()/fini() 2008-04-21 10:19:34 +08:00
Makefile crypto: api - Use dedicated workqueue for crypto subsystem 2009-02-19 14:33:40 +08:00
md4.c crypto: md4 - Switch to shash 2008-12-25 11:02:16 +11:00
md5.c crypto: md5 - Switch to shash 2008-12-25 11:02:18 +11:00
michael_mic.c crypto: michael_mic - Switch to shash 2008-12-25 11:02:24 +11:00
pcbc.c Convert ERR_PTR(PTR_ERR(p)) instances to ERR_CAST(p) 2008-02-07 08:42:26 -08:00
proc.c crypto: api - Call type show function before legacy for proc 2008-12-25 11:01:32 +11:00
ripemd.h [CRYPTO] ripemd: Put all common RIPEMD values in header file 2008-07-10 20:35:12 +08:00
rmd128.c crypto: rmd128 - Switch to shash 2008-12-25 11:02:09 +11:00
rmd160.c crypto: rmd160 - Switch to shash 2008-12-25 11:02:10 +11:00
rmd256.c crypto: rmd256 - Switch to shash 2008-12-25 11:02:12 +11:00
rmd320.c crypto: rmd320 - Switch to shash 2008-12-25 11:02:13 +11:00
rng.c crypto: rng - RNG interface and implementation 2008-08-29 15:50:04 +10:00
salsa20_generic.c crypto: salsa20 - Remove private wrappers around various operations 2008-12-25 11:02:30 +11:00
scatterwalk.c crypto: scatterwalk - Avoid flush_dcache_page on slab pages 2009-02-09 14:30:25 +11:00
seed.c [CRYPTO] seed: New cipher algorithm 2007-10-10 16:55:38 -07:00
seqiv.c crypto: skcipher - Use RNG interface instead of get_random_bytes 2008-08-29 15:50:06 +10:00
serpent.c [CRYPTO] all: Clean up init()/fini() 2008-04-21 10:19:34 +08:00
sha1_generic.c crypto: sha1 - Switch to shash 2008-12-25 11:02:15 +11:00
sha256_generic.c crypto: sha256 - Switch to shash 2008-12-25 11:02:19 +11:00
sha512_generic.c crypto: sha512 - Switch to shash 2008-12-25 11:02:27 +11:00
shash.c crypto: api - Fix crypto_alloc_tfm/create_create_tfm return convention 2009-02-18 16:56:59 +08:00
tcrypt.c crypto: cryptomgr - Add test infrastructure 2008-08-29 15:49:55 +10:00
tcrypt.h crypto: cryptomgr - Add test infrastructure 2008-08-29 15:49:55 +10:00
tea.c [CRYPTO] all: Clean up init()/fini() 2008-04-21 10:19:34 +08:00
testmgr.c crypto: testmgr - Validate output length in (de)compression tests 2008-12-25 11:02:04 +11:00
testmgr.h crypto: testmgr - Correct comment about deflate parameters 2008-12-25 11:02:32 +11:00
tgr192.c crypto: tgr192 - Switch to shash 2008-12-25 11:02:21 +11:00
twofish.c [CRYPTO] all: Clean up init()/fini() 2008-04-21 10:19:34 +08:00
twofish_common.c [CRYPTO] twofish: Do not unroll big stuff in twofish key setup 2008-01-11 08:16:06 +11:00
wp512.c crypto: wp512 - Switch to shash 2008-12-25 11:02:22 +11:00
xcbc.c [CRYPTO] xcbc: Fix crash when ipsec uses xcbc-mac with big data chunk 2008-04-02 14:36:09 +08:00
xor.c async_tx: add the async_tx api 2007-07-13 08:06:14 -07:00
xts.c [CRYPTO] xts: Use proper alignment 2008-03-06 18:56:19 +08:00