The existing implementation did not use the CLMUL instructions for fast,
constant-time binary-field multiplication. With this change, amd64
CPUs that support both the AES and CLMUL instructions use an optimised
assembly implementation.

benchmark                 old ns/op     new ns/op     delta
BenchmarkAESGCMSeal8K     91723         3200          -96.51%
BenchmarkAESGCMOpen8K     91487         3324          -96.37%
BenchmarkAESGCMSeal1K     11873         546           -95.40%
BenchmarkAESGCMOpen1K     11833         594           -94.98%

benchmark                 old MB/s     new MB/s     speedup
BenchmarkAESGCMSeal8K     89.31        2559.62      28.66x
BenchmarkAESGCMOpen8K     89.54        2463.78      27.52x
BenchmarkAESGCMSeal1K     86.24        1872.49      21.71x
BenchmarkAESGCMOpen1K     86.53        1721.78      19.90x

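For orientation, the Seal and Open operations measured above can be
reached through the public crypto/aes and crypto/cipher APIs. The
following is a minimal sketch of one round trip, not the benchmark code
itself. The helper name sealOpen, the all-zero key and nonce, and the
8 KiB buffer are assumptions chosen for illustration. Whether a given
machine takes the assembly path depends on its CPU features and cannot
be observed from this code.

```go
package main

import (
	"bytes"
	"crypto/aes"
	"crypto/cipher"
	"fmt"
)

// sealOpen is a hypothetical helper: it encrypts plaintext with AES-GCM
// and immediately decrypts the result, returning both buffers.
func sealOpen(key, nonce, plaintext, aad []byte) (ciphertext, recovered []byte, err error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, nil, err
	}
	aead, err := cipher.NewGCM(block)
	if err != nil {
		return nil, nil, err
	}
	// Seal panics on a wrong nonce length, so check it up front.
	if len(nonce) != aead.NonceSize() {
		return nil, nil, fmt.Errorf("nonce must be %d bytes, got %d", aead.NonceSize(), len(nonce))
	}
	ciphertext = aead.Seal(nil, nonce, plaintext, aad)
	recovered, err = aead.Open(nil, nonce, ciphertext, aad)
	return ciphertext, recovered, err
}

func main() {
	// All-zero key and nonce are for illustration only; never reuse a
	// nonce with the same key in real code.
	key := make([]byte, 16)
	nonce := make([]byte, 12)
	plaintext := make([]byte, 8192) // same size as the 8K benchmarks

	ct, pt, err := sealOpen(key, nonce, plaintext, nil)
	if err != nil {
		panic(err)
	}
	fmt.Println("tag overhead:", len(ct)-len(plaintext))
	fmt.Println("round trip ok:", bytes.Equal(pt, plaintext))
}
```

The sketch goes through the cipher.AEAD interface only, so the same
calling code runs whichever implementation is in use.
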
Change-Id: Idd63233098356d8b353d16624747b74d0c3f193e
Reviewed-on: https://go-review.googlesource.com/10484
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Adam Langley <agl@golang.org>