go/src/crypto
Michael Munday 41402b59bd crypto/rc4: optimize generic implementation slightly
The compiler can't currently figure out that it can eliminate both c.s
loads (using store to load forwarding) in the second line of the
following code:

	...
	c.s[i], c.s[j] = c.s[j], c.s[i]
	x := c.s[j] + c.s[i]
	...

The compiler eliminates the second load of c.s[j] (using the original
value of c.s[i]). However, the load of c.s[i] remains because the
compiler doesn't know that c.s[i] and c.s[j] either overlap completely
or not at all.

Introducing temporaries to make this explicit improves the performance
of the generic code slightly. The goal is to remove the assembly in
this package in the future. This change also hoists a bounds check out
of the main loop, which gives a slight performance boost and makes the
behaviour identical to the assembly implementation when len(dst) <
len(src).
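
The two techniques described above can be sketched as follows. This is
a minimal illustration, not the code from the change itself: the names
sketchCipher, newSketchCipher and xorKeyStream are hypothetical, the
state is assumed to be a [256]uint8 array, and the key is assumed to be
non-empty. The temporaries x and y hold the two loaded values so that
nothing needs to be reloaded after the swap, and the single index
expression before the loop performs the bounds check once.

```go
package main

import "fmt"

// sketchCipher is a hypothetical stand-in for an RC4 state. The real
// package's type and field layout may differ.
type sketchCipher struct {
	s    [256]uint8
	i, j uint8
}

// newSketchCipher runs the standard RC4 key schedule.
// Assumption: key is non-empty.
func newSketchCipher(key []byte) *sketchCipher {
	c := &sketchCipher{}
	for i := range c.s {
		c.s[i] = uint8(i)
	}
	var j uint8
	for i := range c.s {
		j += c.s[i] + key[i%len(key)]
		c.s[i], c.s[j] = c.s[j], c.s[i]
	}
	return c
}

// xorKeyStream XORs src with the key stream and writes the result to dst.
func (c *sketchCipher) xorKeyStream(dst, src []byte) {
	if len(src) == 0 {
		return
	}
	i, j := c.i, c.j
	// Hoisted bounds check: panics up front if len(dst) < len(src),
	// before any output byte has been written.
	_ = dst[len(src)-1]
	dst = dst[:len(src)]
	for k, v := range src {
		i++
		x := c.s[i] // temporary: original value of c.s[i]
		j += x
		y := c.s[j] // temporary: original value of c.s[j]
		// Swap using the temporaries, so the sum below needs no reload.
		c.s[i], c.s[j] = y, x
		dst[k] = v ^ c.s[x+y] // uint8 addition wraps modulo 256
	}
	c.i, c.j = i, j
}

func main() {
	c := newSketchCipher([]byte("Key"))
	src := []byte("Plaintext")
	dst := make([]byte, len(src))
	c.xorKeyStream(dst, src)
	fmt.Printf("%x\n", dst) // prints bbf316e8d940af0ad3
}
```

Because x and y are plain local values, the case i == j needs no
special handling: both temporaries then hold the same byte and the swap
is a no-op.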

name       old speed     new speed     delta
RC4_128-4  491MB/s ± 3%  596MB/s ± 5%  +21.51%  (p=0.000 n=9+9)
RC4_1K-4   504MB/s ± 2%  616MB/s ± 1%  +22.33%  (p=0.000 n=10+10)
RC4_8K-4   509MB/s ± 1%  630MB/s ± 2%  +23.85%  (p=0.000 n=8+9)

Change-Id: I27adc775713b2e74a1a94e0c1de0909fb4379463
Reviewed-on: https://go-review.googlesource.com/102335
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-23 15:47:48 +00:00
aes crypto/aes: optimize arm64 AES implementation 2018-03-06 00:44:29 +00:00
cipher crypto/cipher: add NewGCMWithNonceAndTagSize for custom tag sizes. 2018-02-14 15:32:26 +00:00
des
dsa all: fix article typos 2017-09-15 02:39:16 +00:00
ecdsa crypto/elliptic: reduce allocations on amd64 2017-11-30 21:01:10 +00:00
elliptic crypto: remove hand encoded amd64 instructions 2018-03-01 19:20:53 +00:00
hmac crypto, hash: document marshal/unmarshal implementation 2017-11-15 00:06:24 +00:00
internal/cipherhw
md5 all: fix non-standard "DO NOT EDIT" comments for generated files 2018-03-10 17:50:11 +00:00
rand
rc4 crypto/rc4: optimize generic implementation slightly 2018-03-23 15:47:48 +00:00
rsa crypto/rsa: improve error message for keys too short for PSS 2018-02-14 15:31:22 +00:00
sha1 hash: add MarshalBinary/UnmarshalBinary round trip + golden test for all implementations 2017-12-06 07:45:46 +00:00
sha256 crypto/sha256: speed-up for very small blocks 2018-02-20 23:39:10 +00:00
sha512 crypto/sha512: speed-up for very small blocks 2018-02-20 23:44:12 +00:00
subtle crypto/subtle: simplify and speed up constant-time primitives 2017-11-10 03:47:57 +00:00
tls crypto/tls: support keying material export 2018-03-22 18:48:49 +00:00
x509 crypto/x509: follow OpenSSL and emit Extension structures directly in CSRs. 2018-03-22 18:58:11 +00:00
crypto.go
issue21104_test.go