go/src/math
Carlos Eduardo Seo 3cb41be817 math/big: improve performance for AddMulVVW and mulAddVWW for ppc64x
This change adds a better implementation in asm for AddMulVVW and
mulAddVWW for ppc64x, with speedups up to 1.54x.

benchmark                       old ns/op     new ns/op     delta
BenchmarkAddMulVVW/1-8          6.58          6.29          -4.41%
BenchmarkAddMulVVW/2-8          7.43          7.25          -2.42%
BenchmarkAddMulVVW/3-8          8.95          8.15          -8.94%
BenchmarkAddMulVVW/4-8          10.1          9.37          -7.23%
BenchmarkAddMulVVW/5-8          12.0          10.7          -10.83%
BenchmarkAddMulVVW/10-8         22.1          20.1          -9.05%
BenchmarkAddMulVVW/100-8        211           154           -27.01%
BenchmarkAddMulVVW/1000-8       2046          1450          -29.13%
BenchmarkAddMulVVW/10000-8      20407         14793         -27.51%
BenchmarkAddMulVVW/100000-8     223857        145548        -34.98%

benchmark                       old MB/s     new MB/s     speedup
BenchmarkAddMulVVW/1-8          9719.88      10175.79     1.05x
BenchmarkAddMulVVW/2-8          17233.97     17657.54     1.02x
BenchmarkAddMulVVW/3-8          21446.05     23550.49     1.10x
BenchmarkAddMulVVW/4-8          25375.70     27334.33     1.08x
BenchmarkAddMulVVW/5-8          26650.52     30029.34     1.13x
BenchmarkAddMulVVW/10-8         28984.29     31833.68     1.10x
BenchmarkAddMulVVW/100-8        30249.41     41531.69     1.37x
BenchmarkAddMulVVW/1000-8       31273.35     44108.54     1.41x
BenchmarkAddMulVVW/10000-8      31360.47     43263.54     1.38x
BenchmarkAddMulVVW/100000-8     28589.58     43971.66     1.54x

Change-Id: I8a8105d4da3592afdef3125757a99f378a0254bb
Reviewed-on: https://go-review.googlesource.com/53931
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
2017-08-11 13:59:52 +00:00
..
big math/big: improve performance for AddMulVVW and mulAddVWW for ppc64x 2017-08-11 13:59:52 +00:00
bits math/bits: examples generator 2017-08-11 11:05:01 +00:00
cmplx math/cmplx: prevent infinite loop in tanSeries 2016-10-25 18:32:22 +00:00
rand math/rand: use t.Helper in tests 2017-08-08 23:49:31 +00:00
abs.go
acos_s390x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
acosh.go math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
acosh_s390x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
all_test.go math: additional tests for Ldexp 2017-08-09 15:33:37 +00:00
arith_s390x.go math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
arith_s390x_test.go math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
asin.go
asin_386.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
asin_amd64.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
asin_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
asin_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
asin_s390x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
asinh.go math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
asinh_s390x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
asinh_stub.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
atan.go
atan2.go
atan2_386.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
atan2_amd64.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
atan2_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
atan2_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
atan2_s390x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
atan_386.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
atan_amd64.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
atan_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
atan_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
atan_s390x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
atanh.go math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
atanh_s390x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
bits.go
cbrt.go math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
cbrt_s390x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
cbrt_stub.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
const.go math: change oeis.org urls to https 2017-08-08 08:56:40 +00:00
copysign.go
cosh_s390x.s cmd/asm, cmd/internal/obj/s390x, math: add LGDR and LDGR instructions 2017-04-17 16:33:51 +00:00
dim.go
dim_386.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
dim_amd64.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
dim_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
dim_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
dim_arm64.s math: add some assembly implementations on ARM64 2016-09-27 23:52:12 +00:00
dim_s390x.s math: add functions and stubs for s390x 2016-04-06 23:35:56 +00:00
erf.go math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
erf_s390x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
erf_stub.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
erfc_s390x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
example_test.go math: add a Sqrt example 2017-07-15 20:12:22 +00:00
exp.go all: single space after period. 2016-03-02 00:13:47 +00:00
exp2_386.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
exp2_amd64.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
exp2_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
exp2_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
exp_386.s math: use portable Exp instead of 387 instructions on 386 2016-10-05 03:53:11 +00:00
exp_amd64.s math: check overflow in amd64 Exp implementation 2017-02-10 13:40:08 +00:00
exp_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
exp_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
exp_s390x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
expm1.go math,math/cmplx: fix linter issues 2016-10-24 23:25:46 +00:00
expm1_386.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
expm1_amd64.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
expm1_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
expm1_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
expm1_s390x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
export_s390x_test.go math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
export_test.go all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
floor.go
floor_386.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
floor_amd64.s internal/cpu: new package to detect cpu features 2017-05-10 17:02:21 +00:00
floor_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
floor_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
floor_arm64.s math: add some assembly implementations on ARM64 2016-09-27 23:52:12 +00:00
floor_asm.go internal/cpu: new package to detect cpu features 2017-05-10 17:02:21 +00:00
floor_ppc64x.s math, cmd/internal/obj/ppc64: improve floor, ceil, trunc with asm 2016-09-23 13:03:08 +00:00
floor_s390x.s math: optimize Ceil, Floor and Trunc on s390x 2016-08-26 17:27:13 +00:00
frexp.go
frexp_386.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
frexp_amd64.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
frexp_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
frexp_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
gamma.go math: speed up Gamma(+Inf) 2016-10-18 22:12:03 +00:00
hypot.go
hypot_386.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
hypot_amd64.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
hypot_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
hypot_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
j0.go math: speed up bessel functions on AMD64 2016-08-31 14:45:29 +00:00
j1.go math: speed up bessel functions on AMD64 2016-08-31 14:45:29 +00:00
jn.go math: fix typos in Bessel function docs 2017-02-16 22:41:34 +00:00
ldexp.go
ldexp_386.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
ldexp_amd64.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
ldexp_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
ldexp_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
lgamma.go all: single space after period. 2016-03-02 00:13:47 +00:00
log.go all: single space after period. 2016-03-02 00:13:47 +00:00
log1p.go math,math/cmplx: fix linter issues 2016-10-24 23:25:46 +00:00
log1p_386.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
log1p_amd64.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
log1p_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
log1p_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
log1p_s390x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
log10.go
log10_386.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
log10_amd64.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
log10_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
log10_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
log10_s390x.s cmd/asm, cmd/internal/obj/s390x, math: add LGDR and LDGR instructions 2017-04-17 16:33:51 +00:00
log_386.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
log_amd64.s math: speed up Log on amd64 2017-03-29 20:36:29 +00:00
log_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
log_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
log_s390x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
logb.go
mod.go
mod_386.s
mod_amd64.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
mod_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
mod_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
modf.go all: single space after period. 2016-03-02 00:13:47 +00:00
modf_386.s all: fix assembly vet issues 2016-08-25 18:52:31 +00:00
modf_amd64.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
modf_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
modf_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
modf_arm64.s all: minor vet fixes 2016-10-24 17:27:37 +00:00
nextafter.go
pow.go math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
pow10.go math: speed up and improve accuracy of Pow10 2017-02-22 19:17:04 +00:00
pow_s390x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
pow_stub.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
remainder.go all: single space after period. 2016-03-02 00:13:47 +00:00
remainder_386.s
remainder_amd64.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
remainder_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
remainder_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
signbit.go
sin.go math,math/cmplx: fix linter issues 2016-10-24 23:25:46 +00:00
sin_386.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
sin_amd64.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
sin_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
sin_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
sin_s390x.s cmd/asm, cmd/internal/obj/s390x, math: add LGDR and LDGR instructions 2017-04-17 16:33:51 +00:00
sincos.go math: remove asm version of sincos everywhere, except 386 2017-04-24 15:09:18 +00:00
sincos_386.go math: remove asm version of sincos everywhere, except 386 2017-04-24 15:09:18 +00:00
sincos_386.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
sinh.go math: use SIMD to accelerate some scalar math functions on s390x 2016-11-11 20:20:23 +00:00
sinh_s390x.s cmd/asm, cmd/internal/obj/s390x, math: add LGDR and LDGR instructions 2017-04-17 16:33:51 +00:00
sinh_stub.s math: use SIMD to accelerate some scalar math functions on s390x 2016-11-11 20:20:23 +00:00
sqrt.go math: delete unused function sqrtC 2016-03-03 02:29:09 +00:00
sqrt_386.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
sqrt_amd64.s math: make sqrt smaller on AMD64 2016-09-29 15:56:52 +00:00
sqrt_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
sqrt_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
sqrt_arm64.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
sqrt_mipsx.s math, math/big: add support for GOARCH=mips{,le} 2016-11-03 22:55:06 +00:00
sqrt_ppc64x.s all: make copyright headers consistent with one space after period 2016-05-02 13:43:18 +00:00
sqrt_s390x.s math: add functions and stubs for s390x 2016-04-06 23:35:56 +00:00
stubs_arm64.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
stubs_mips64x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
stubs_mipsx.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
stubs_ppc64x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
stubs_s390x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
tan.go math,math/cmplx: fix linter issues 2016-10-24 23:25:46 +00:00
tan_386.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
tan_amd64.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
tan_amd64p32.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
tan_arm.s all: make copyright headers consistent with one space after period 2016-03-01 23:34:33 +00:00
tan_s390x.s math: use SIMD to accelerate additional scalar math functions on s390x 2017-05-08 19:52:30 +00:00
tanh.go math: use SIMD to accelerate some scalar math functions on s390x 2016-11-11 20:20:23 +00:00
tanh_s390x.s cmd/asm, cmd/internal/obj/s390x, math: add LGDR and LDGR instructions 2017-04-17 16:33:51 +00:00
unsafe.go