go/src/math
Brian Kessler 1643d4f33a math/big: implement Lehmer's GCD algorithm
Updates #15833

Lehmer's GCD algorithm uses single-precision calculations
to simulate several steps of the multiple-precision calculations
in Euclid's GCD algorithm, which leads to a considerable
speedup. This implementation uses Collins' simplified
testing condition on the single-digit cosequences, which
requires only one quotient and avoids any possibility of
overflow.

name                          old time/op  new time/op  delta
GCD10x10/WithoutXY-4          1.82µs ±24%  0.28µs ± 6%  -84.40%  (p=0.008 n=5+5)
GCD10x10/WithXY-4             1.69µs ± 6%  1.71µs ± 6%     ~     (p=0.595 n=5+5)
GCD10x100/WithoutXY-4         1.87µs ± 2%  0.56µs ± 4%  -70.13%  (p=0.008 n=5+5)
GCD10x100/WithXY-4            2.61µs ± 2%  2.65µs ± 4%     ~     (p=0.635 n=5+5)
GCD10x1000/WithoutXY-4        2.75µs ± 2%  1.48µs ± 1%  -46.06%  (p=0.008 n=5+5)
GCD10x1000/WithXY-4           5.29µs ± 2%  5.25µs ± 2%     ~     (p=0.548 n=5+5)
GCD10x10000/WithoutXY-4       10.7µs ± 2%  10.3µs ± 0%   -4.38%  (p=0.008 n=5+5)
GCD10x10000/WithXY-4          22.3µs ± 6%  22.1µs ± 1%     ~     (p=1.000 n=5+5)
GCD10x100000/WithoutXY-4      93.7µs ± 2%  99.4µs ± 2%   +6.09%  (p=0.008 n=5+5)
GCD10x100000/WithXY-4          196µs ± 2%   199µs ± 2%     ~     (p=0.222 n=5+5)
GCD100x100/WithoutXY-4        10.1µs ± 2%   2.5µs ± 2%  -74.84%  (p=0.008 n=5+5)
GCD100x100/WithXY-4           21.4µs ± 2%  21.3µs ± 7%     ~     (p=0.548 n=5+5)
GCD100x1000/WithoutXY-4       11.3µs ± 2%   4.4µs ± 4%  -60.87%  (p=0.008 n=5+5)
GCD100x1000/WithXY-4          24.7µs ± 3%  23.9µs ± 1%     ~     (p=0.056 n=5+5)
GCD100x10000/WithoutXY-4      26.6µs ± 1%  20.0µs ± 2%  -24.82%  (p=0.008 n=5+5)
GCD100x10000/WithXY-4         78.7µs ± 2%  78.2µs ± 2%     ~     (p=0.690 n=5+5)
GCD100x100000/WithoutXY-4      174µs ± 2%   171µs ± 1%     ~     (p=0.056 n=5+5)
GCD100x100000/WithXY-4         563µs ± 4%   561µs ± 2%     ~     (p=1.000 n=5+5)
GCD1000x1000/WithoutXY-4       120µs ± 5%    29µs ± 3%  -75.71%  (p=0.008 n=5+5)
GCD1000x1000/WithXY-4          355µs ± 4%   358µs ± 2%     ~     (p=0.841 n=5+5)
GCD1000x10000/WithoutXY-4      140µs ± 2%    49µs ± 2%  -65.07%  (p=0.008 n=5+5)
GCD1000x10000/WithXY-4         626µs ± 3%   628µs ± 9%     ~     (p=0.690 n=5+5)
GCD1000x100000/WithoutXY-4     340µs ± 4%   259µs ± 6%  -23.79%  (p=0.008 n=5+5)
GCD1000x100000/WithXY-4       3.76ms ± 4%  3.82ms ± 5%     ~     (p=0.310 n=5+5)
GCD10000x10000/WithoutXY-4    3.11ms ± 3%  0.54ms ± 2%  -82.74%  (p=0.008 n=5+5)
GCD10000x10000/WithXY-4       7.96ms ± 3%  7.69ms ± 3%     ~     (p=0.151 n=5+5)
GCD10000x100000/WithoutXY-4   3.88ms ± 1%  1.27ms ± 2%  -67.21%  (p=0.008 n=5+5)
GCD10000x100000/WithXY-4      38.1ms ± 2%  38.8ms ± 1%     ~     (p=0.095 n=5+5)
GCD100000x100000/WithoutXY-4   208ms ± 1%    25ms ± 4%  -88.07%  (p=0.008 n=5+5)
GCD100000x100000/WithXY-4      533ms ± 5%   525ms ± 4%     ~     (p=0.548 n=5+5)

Change-Id: Ic1e007eb807b93e75f4752e968e98c1f0cb90e43
Reviewed-on: https://go-review.googlesource.com/59450
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2017-10-24 22:42:43 +00:00
big math/big: implement Lehmer's GCD algorithm 2017-10-24 22:42:43 +00:00
bits math/bits: complete examples 2017-10-06 16:58:03 +00:00
cmplx math/cmplx: prevent infinite loop in tanSeries 2016-10-25 18:32:22 +00:00
rand math/rand: fix comment for Shuffle 2017-09-14 03:41:35 +00:00
abs.go cmd/compile,math: improve code generation for math.Abs 2017-08-25 19:15:01 +00:00
acos_s390x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
acosh.go math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
acosh_s390x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
all_test.go math: add RoundToEven function 2017-10-24 22:33:09 +00:00
arith_s390x.go math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
arith_s390x_test.go math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
asin.go
asin_386.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
asin_amd64.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
asin_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
asin_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
asin_s390x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
asinh.go math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
asinh_s390x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
asinh_stub.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
atan.go
atan2.go
atan2_386.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
atan2_amd64.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
atan2_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
atan2_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
atan2_s390x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
atan_386.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
atan_amd64.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
atan_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
atan_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
atan_s390x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
atanh.go math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
atanh_s390x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
bits.go math: add RoundToEven function 2017-10-24 22:33:09 +00:00
cbrt.go math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
cbrt_s390x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
cbrt_stub.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
const.go math: change oeis.org urls to https 2017-08-08 08:56:40 +00:00
copysign.go
cosh_s390x.s cmd/asm, cmd/internal/obj/s390x, math: add LGDR and LDGR instructions 2017-04-17 16:33:51 +00:00
dim.go
dim_386.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
dim_amd64.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
dim_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
dim_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
dim_arm64.s math: add some assembly implementations on ARM64 2016-09-27 23:52:12 +00:00
dim_s390x.s math: add functions and stubs for s390x 2016-04-06 23:35:56 +00:00
erf.go math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
erf_s390x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
erf_stub.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
erfc_s390x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
erfinv.go math: implement the erfcinv function 2017-08-22 13:13:20 +00:00
example_test.go math: add examples for trig functions 2017-08-25 20:26:19 +00:00
exp.go math: fix inaccurate result of Exp(1) 2017-08-17 09:01:27 +00:00
exp2_386.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
exp2_amd64.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
exp2_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
exp2_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
exp_386.s math: use portable Exp instead of 387 instructions on 386 2016-10-05 03:53:11 +00:00
exp_amd64.s math: implement fast path for Exp 2017-09-20 21:43:00 +00:00
exp_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
exp_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
exp_asm.go math: implement fast path for Exp 2017-09-20 21:43:00 +00:00
exp_s390x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
expm1.go all: unindent some big chunks of code 2017-08-18 06:59:48 +00:00
expm1_386.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
expm1_amd64.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
expm1_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
expm1_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
expm1_s390x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
export_s390x_test.go math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
export_test.go all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
floor.go math: add RoundToEven function 2017-10-24 22:33:09 +00:00
floor_386.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
floor_amd64.s internal/cpu: new package to detect cpu features 2017-05-10 17:02:21 +00:00
floor_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
floor_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
floor_arm64.s math: add some assembly implementations on ARM64 2016-09-27 23:52:12 +00:00
floor_asm.go internal/cpu: new package to detect cpu features 2017-05-10 17:02:21 +00:00
floor_ppc64x.s math, cmd/internal/obj/ppc64: improve floor, ceil, trunc with asm 2016-09-23 13:03:08 +00:00
floor_s390x.s math: optimize Ceil, Floor and Trunc on s390x 2016-08-26 17:27:13 +00:00
frexp.go
frexp_386.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
frexp_amd64.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
frexp_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
frexp_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
gamma.go math: speed up Gamma(+Inf) 2016-10-18 22:12:03 +00:00
hypot.go
hypot_386.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
hypot_amd64.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
hypot_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
hypot_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
j0.go math: speed up bessel functions on AMD64 2016-08-31 14:45:29 +00:00
j1.go math: speed up bessel functions on AMD64 2016-08-31 14:45:29 +00:00
jn.go math: fix typos in Bessel function docs 2017-02-16 22:41:34 +00:00
ldexp.go
ldexp_386.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
ldexp_amd64.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
ldexp_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
ldexp_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
lgamma.go all: single space after period. 2016-03-02 00:13:47 +00:00
log.go all: single space after period. 2016-03-02 00:13:47 +00:00
log1p.go math,math/cmplx: fix linter issues 2016-10-24 23:25:46 +00:00
log1p_386.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
log1p_amd64.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
log1p_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
log1p_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
log1p_s390x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
log10.go math: fix Log2 test failures on ppc64 (and s390) 2015-07-15 05:35:22 +00:00
log10_386.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
log10_amd64.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
log10_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
log10_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
log10_s390x.s cmd/asm, cmd/internal/obj/s390x, math: add LGDR and LDGR instructions 2017-04-17 16:33:51 +00:00
log_386.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
log_amd64.s math: speed up Log on amd64 2017-03-29 20:36:29 +00:00
log_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
log_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
log_s390x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
logb.go
mod.go
mod_386.s
mod_amd64.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
mod_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
mod_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
modf.go all: single space after period. 2016-03-02 00:13:47 +00:00
modf_386.s all: fix assembly vet issues 2016-08-25 18:52:31 +00:00
modf_amd64.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
modf_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
modf_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
modf_arm64.s all: minor vet fixes 2016-10-24 17:27:37 +00:00
nextafter.go
pow.go math: eliminate overflow in Pow(x,y) for large y 2017-08-16 09:10:10 +00:00
pow10.go math: speed up and improve accuracy of Pow10 2017-02-22 19:17:04 +00:00
pow_s390x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
pow_stub.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
remainder.go all: single space after period. 2016-03-02 00:13:47 +00:00
remainder_386.s
remainder_amd64.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
remainder_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
remainder_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
signbit.go
sin.go math,math/cmplx: fix linter issues 2016-10-24 23:25:46 +00:00
sin_386.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
sin_amd64.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
sin_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
sin_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
sin_s390x.s cmd/asm, cmd/internal/obj/s390x, math: add LGDR and LDGR instructions 2017-04-17 16:33:51 +00:00
sincos.go math: remove asm version of sincos everywhere, except 386 2017-04-24 15:09:18 +00:00
sincos_386.go math: remove asm version of sincos everywhere, except 386 2017-04-24 15:09:18 +00:00
sincos_386.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
sinh.go math: use SIMD to accelerate some scalar math functions on s390x 2016-11-11 20:20:23 +00:00
sinh_s390x.s cmd/asm, cmd/internal/obj/s390x, math: add LGDR and LDGR instructions 2017-04-17 16:33:51 +00:00
sinh_stub.s math: use SIMD to accelerate some scalar math functions on s390x 2016-11-11 20:20:23 +00:00
sqrt.go math: delete unused function sqrtC 2016-03-03 02:29:09 +00:00
sqrt_386.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
sqrt_amd64.s math: make sqrt smaller on AMD64 2016-09-29 15:56:52 +00:00
sqrt_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
sqrt_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
sqrt_arm64.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
sqrt_mipsx.s math, math/big: add support for GOARCH=mips{,le} 2016-11-03 22:55:06 +00:00
sqrt_ppc64x.s all: make copyright headers consistent with one space after period 2016-05-02 13:43:18 +00:00
sqrt_s390x.s math: add functions and stubs for s390x 2016-04-06 23:35:56 +00:00
stubs_arm64.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
stubs_mips64x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
stubs_mipsx.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
stubs_ppc64x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
stubs_s390x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
tan.go math,math/cmplx: fix linter issues 2016-10-24 23:25:46 +00:00
tan_386.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
tan_amd64.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
tan_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
tan_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
tan_s390x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
tanh.go math: use SIMD to accelerate some scalar math functions on s390x 2016-11-11 20:20:23 +00:00
tanh_s390x.s cmd/asm, cmd/internal/obj/s390x, math: add LGDR and LDGR instructions 2017-04-17 16:33:51 +00:00
unsafe.go