mirror/go - go - Git Fam. Sieh

Commit Graph

Author	SHA1	Message	Date
Josh Bleecher Snyder	b0df92703c	math/big: add shrVU and shlVU benchmarks Change-Id: Id67d6ac856bd9271de99c3381bde910aa0c166e0 Reviewed-on: https://go-review.googlesource.com/c/go/+/296011 Trust: Josh Bleecher Snyder <josharian@gmail.com> Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2021-03-07 23:02:35 +00:00
Josh Bleecher Snyder	60b500dc6c	math/big: remove bounds checks for shrVU_g inner loop Make explicit a shrVU_g precondition. Replace i with i+1 throughout the loop. The resulting loop is functionally identical, but the compiler can do better BCE without the i-1 slice offset. Benchmarks results on amd64 with -tags=math_big_pure_go. name old time/op new time/op delta NonZeroShifts/1/shrVU-8 4.55ns ± 2% 4.45ns ± 3% -2.27% (p=0.000 n=28+30) NonZeroShifts/1/shlVU-8 4.07ns ± 1% 4.13ns ± 4% +1.55% (p=0.000 n=26+29) NonZeroShifts/2/shrVU-8 6.12ns ± 1% 5.55ns ± 1% -9.30% (p=0.000 n=28+28) NonZeroShifts/2/shlVU-8 5.65ns ± 3% 5.70ns ± 2% +0.92% (p=0.008 n=30+29) NonZeroShifts/3/shrVU-8 7.58ns ± 2% 6.79ns ± 2% -10.46% (p=0.000 n=28+28) NonZeroShifts/3/shlVU-8 6.62ns ± 2% 6.69ns ± 1% +1.07% (p=0.000 n=29+28) NonZeroShifts/4/shrVU-8 9.02ns ± 1% 7.79ns ± 2% -13.59% (p=0.000 n=27+30) NonZeroShifts/4/shlVU-8 7.74ns ± 1% 7.82ns ± 1% +0.92% (p=0.000 n=26+28) NonZeroShifts/5/shrVU-8 10.6ns ± 1% 8.9ns ± 3% -16.31% (p=0.000 n=25+29) NonZeroShifts/5/shlVU-8 8.59ns ± 1% 8.68ns ± 1% +1.13% (p=0.000 n=27+29) NonZeroShifts/10/shrVU-8 18.2ns ± 2% 14.4ns ± 1% -20.96% (p=0.000 n=27+28) NonZeroShifts/10/shlVU-8 14.1ns ± 1% 14.1ns ± 1% +0.46% (p=0.001 n=26+28) NonZeroShifts/100/shrVU-8 161ns ± 2% 118ns ± 1% -26.83% (p=0.000 n=29+30) NonZeroShifts/100/shlVU-8 119ns ± 2% 120ns ± 2% +0.92% (p=0.000 n=29+29) NonZeroShifts/1000/shrVU-8 1.54µs ± 1% 1.10µs ± 1% -28.63% (p=0.000 n=29+29) NonZeroShifts/1000/shlVU-8 1.10µs ± 1% 1.10µs ± 2% ~ (p=0.701 n=28+29) NonZeroShifts/10000/shrVU-8 15.3µs ± 2% 10.9µs ± 1% -28.68% (p=0.000 n=28+28) NonZeroShifts/10000/shlVU-8 10.9µs ± 2% 10.9µs ± 2% -0.57% (p=0.003 n=26+29) NonZeroShifts/100000/shrVU-8 154µs ± 1% 111µs ± 2% -28.04% (p=0.000 n=27+28) NonZeroShifts/100000/shlVU-8 113µs ± 2% 113µs ± 2% ~ (p=0.790 n=30+30) Change-Id: Ib6a621ee7c88b27f0f18121fb2cba3606c40c9b0 Reviewed-on: https://go-review.googlesource.com/c/go/+/297049 Trust: Josh Bleecher Snyder <josharian@gmail.com> Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2021-03-05 06:15:22 +00:00
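The loop shape described in commit 60b500dc6c can be sketched roughly as follows. This is a hypothetical illustration, not the math/big source: the name `shrWords`, the use of plain `uint` words, and the reslice used to state the length precondition are all assumptions. The body indexes with `i` and `i+1` rather than `i-1` and `i`, which is the form the commit message says is friendlier to bounds-check elimination.

```go
package main

import (
	"fmt"
	"math/bits"
)

// shrWords shifts the little-endian word vector x right by s bits into z
// and returns the bits shifted out of x[0], left-aligned in a word.
// Hypothetical sketch; assumes len(x) >= len(z) and 0 <= s < word size.
func shrWords(z, x []uint, s uint) (c uint) {
	if len(z) == 0 {
		return 0
	}
	x = x[:len(z)] // make the length precondition explicit
	if s == 0 {
		copy(z, x)
		return 0
	}
	s &= bits.UintSize - 1 // keep the shift amount in range
	sc := bits.UintSize - s
	c = x[0] << sc
	// The body uses indices i and i+1 only; no i-1 offset.
	for i := 0; i+1 < len(z); i++ {
		z[i] = x[i]>>s | x[i+1]<<sc
	}
	z[len(z)-1] = x[len(z)-1] >> s
	return c
}

func main() {
	x := []uint{5, 2}
	z := make([]uint, len(x))
	c := shrWords(z, x, 1)
	fmt.Println(z, c != 0)
}
```

Whether a particular compiler version actually drops the bounds checks here is not something this sketch establishes; building with `-gcflags=-d=ssa/check_bce/debug=1` is one way to look.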
fanzha02	2b50ab2aee	cmd/compile: optimize single-precision floating point square root Add generic rule to rewrite the single-precision square root expression with one single-precision instruction. The optimization removes two precision conversions between double precision and single precision. On the arm64 platform, previous: FCVTSD F0, F0 FSQRTD F0, F0 FCVTDS F0, F0 optimized: FSQRTS S0, S0 This patch also adds a test case to check correctness. This patch refers to CL 241877, contributed by Alice Xu (dianhong.xu@arm.com) Change-Id: I6de5d02281c693017ac4bd4c10963dd55989bd7e Reviewed-on: https://go-review.googlesource.com/c/go/+/276873 Trust: fannie zhang <Fannie.Zhang@arm.com> Run-TryBot: fannie zhang <Fannie.Zhang@arm.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2021-03-02 06:38:07 +00:00
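For reference, the source-level expression that such a rewrite rule would presumably match looks like the sketch below. The helper name `sqrt32` is hypothetical, and whether a given toolchain and architecture lowers it to a single single-precision instruction is an assumption to be checked in the assembly output (for example with `go build -gcflags=-S`).

```go
package main

import (
	"fmt"
	"math"
)

// sqrt32 computes a float32 square root by widening to float64,
// calling math.Sqrt, and narrowing the result again.
func sqrt32(x float32) float32 {
	return float32(math.Sqrt(float64(x)))
}

func main() {
	fmt.Println(sqrt32(9), sqrt32(0.25))
}
```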
Russ Cox	d4b2638234	all: go fmt std cmd (but revert vendor) Make all our package sources use Go 1.17 gofmt format (adding //go:build lines). Part of //go:build change (#41184). See https://golang.org/design/draft-gobuild Change-Id: Ia0534360e4957e58cd9a18429c39d0e32a6addb4 Reviewed-on: https://go-review.googlesource.com/c/go/+/294430 Trust: Russ Cox <rsc@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2021-02-20 03:54:50 +00:00
Russ Cox	474d5f4f4d	math: remove most 387 implementations The Surface Pro X's 386 simulator is not completely faithful to a real 387. The most egregious problem is that it computes Log2(8) as 2.9999999999999996, but it has some other subtler problems as well. All the problems occur in routines that we don't even bother with assembly for on amd64. If the speed of Go code is OK on amd64 it should be OK on 386 too. Just remove all the 386-only assembly functions. This leaves Ceil, Floor, Trunc, Hypot, and Sqrt in 386 assembly, all of which are also in assembly on amd64 and all of which pass their tests on Surface Pro X. Compared to amd64, the 386 port omits assembly for Min, Max, and Log. It never had Min and Max, and this CL deletes Log because Log2 wasn't even correct. (None of the other architectures have assembly Log either.) Change-Id: I5eb6c61084467035269d4098a36001447b7a0601 Reviewed-on: https://go-review.googlesource.com/c/go/+/291229 Trust: Russ Cox <rsc@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2021-02-19 00:41:09 +00:00
Katie Hockman	e491c6eea9	math/big: fix comment in divRecursiveStep There appears to be a typo in the description of the recursive division algorithm. Two things seem suspicious with the original comment: 1. It is talking about choosing s, but s doesn't appear anywhere in the equation. 2. The math in the equation is incorrect. Where B = len(v)/2 s = B - 1 Proof that it is incorrect: len(v) - B >= B + 1 len(v) - len(v)/2 >= len(v)/2 + 1 This doesn't hold if len(v) is even, e.g. 10: 10 - 10/2 >= 10/2 + 1 10 - 5 >= 5 + 1 5 >= 6 // this is false The new equation will be the following, which will be mathematically correct: len(v) - s >= B + 1 len(v) - (len(v)/2 - 1) >= len(v)/2 + 1 len(v) - len(v)/2 + 1 >= len(v)/2 + 1 len(v) - len(v)/2 >= len(v)/2 This holds if len(v) is even or odd. e.g. 10 10 - 10/2 >= 10/2 10 - 5 >= 5 5 >= 5 e.g. 11 11 - 11/2 >= 11/2 11 - 5 >= 5 6 >= 5 Change-Id: If77ce09286cf7038637b5dfd0fb7d4f828023f56 Reviewed-on: https://go-review.googlesource.com/c/go/+/287372 Run-TryBot: Katie Hockman <katie@golang.org> Reviewed-by: Filippo Valsorda <filippo@golang.org> Trust: Katie Hockman <katie@golang.org>	2021-02-03 15:04:02 +00:00
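The arithmetic in commit e491c6eea9 can be checked mechanically. The sketch below uses hypothetical helper names (`oldHolds`, `newHolds`) and takes `n` to stand for `len(v)`, with `B = n/2` and `s = B - 1` as in the message. It evaluates both inequalities with Go's truncating integer division.

```go
package main

import "fmt"

// oldHolds evaluates the original comment's claim: len(v) - B >= B + 1.
func oldHolds(n int) bool {
	B := n / 2
	return n-B >= B+1
}

// newHolds evaluates the corrected claim: len(v) - s >= B + 1, s = B - 1.
func newHolds(n int) bool {
	B := n / 2
	s := B - 1
	return n-s >= B+1
}

func main() {
	for _, n := range []int{10, 11} {
		fmt.Println(n, oldHolds(n), newHolds(n))
	}
}
```

For even lengths such as 10 the old form fails while the new form holds, matching the worked examples in the message.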
Paul Davis	0f797f168d	math: fix typo in sqrt.go code comment "it does not necessary" -> "it is not necessary" Change-Id: I66f9cf2670d76b3686badb4a537b3ec084447d62 GitHub-Last-Rev: `52a0f9993a` GitHub-Pull-Request: golang/go#43935 Reviewed-on: https://go-review.googlesource.com/c/go/+/287052 Reviewed-by: Robert Griesemer <gri@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> Trust: Robert Griesemer <gri@golang.org>	2021-01-27 00:15:06 +00:00
Toasa	9eef49cfa6	math/rand: fix typo in comment Change-Id: I57fbabf272bdfd61918db155ee6f7091f18e5979 GitHub-Last-Rev: `e138804b1a` GitHub-Pull-Request: golang/go#43495 Reviewed-on: https://go-review.googlesource.com/c/go/+/281373 Reviewed-by: Ian Lance Taylor <iant@golang.org> Trust: Alberto Donizetti <alb.donizetti@gmail.com>	2021-01-04 17:59:30 +00:00
Katie Hockman	dea6d94a44	math/big: add test for recursive division panic The vulnerability that allowed this panic is CVE-2020-28362 and has been fixed in a security release, per #42552. Change-Id: I774bcda2cc83cdd5a273d21c8d9f4b53fa17c88f Reviewed-on: https://go-review.googlesource.com/c/go/+/277959 Run-TryBot: Katie Hockman <katie@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Trust: Katie Hockman <katie@golang.org> Reviewed-by: Filippo Valsorda <filippo@golang.org>	2020-12-14 20:56:03 +00:00
Russ Cox	4f1b0a44cb	all: update to use os.ReadFile, os.WriteFile, os.CreateTemp, os.MkdirTemp As part of #42026, these helpers from io/ioutil were moved to os. (ioutil.TempFile and TempDir became os.CreateTemp and MkdirTemp.) Update the Go tree to use the preferred names. As usual, code compiled with the Go 1.4 bootstrap toolchain and code vendored from other sources is excluded. ReadDir changes are in a separate CL, because they are not a simple search and replace. For #42026. Change-Id: If318df0216d57e95ea0c4093b89f65e5b0ababb3 Reviewed-on: https://go-review.googlesource.com/c/go/+/266365 Trust: Russ Cox <rsc@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2020-12-09 19:12:23 +00:00
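A minimal sketch of the renamed helpers in use, assuming Go 1.16 or later. The function name `roundTrip`, the file names, and the name patterns are made up for illustration.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// roundTrip writes data to a file in a fresh temporary directory and
// reads it back, using the os helpers that replaced their io/ioutil
// counterparts.
func roundTrip(data []byte) ([]byte, error) {
	dir, err := os.MkdirTemp("", "example-*") // was ioutil.TempDir
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)

	f, err := os.CreateTemp(dir, "scratch-*.tmp") // was ioutil.TempFile
	if err != nil {
		return nil, err
	}
	f.Close()

	path := filepath.Join(dir, "data.txt")
	if err := os.WriteFile(path, data, 0o644); err != nil { // was ioutil.WriteFile
		return nil, err
	}
	return os.ReadFile(path) // was ioutil.ReadFile
}

func main() {
	got, err := roundTrip([]byte("hello"))
	fmt.Println(string(got), err)
}
```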
Jonathan Albrecht	b1369d5862	math/big: remove the s390x assembly for shlVU and shrVU The s390x assembly for shlVU does a forward copy when the shift amount s is 0. This causes corruption of the result z when z is aliased to the input x. This fix removes the s390x assembly for both shlVU and shrVU so the pure go implementations will be used. Test cases have been added to the existing TestShiftOverlap test to cover shift values of 0, 1 and (_W - 1). Fixes #42838 Change-Id: I75ca0e98f3acfaa6366a26355dcd9dd82499a48b Reviewed-on: https://go-review.googlesource.com/c/go/+/274442 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com> Trust: Robert Griesemer <gri@golang.org>	2020-12-03 18:43:06 +00:00
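The general hazard behind commit b1369d5862 can be reproduced in plain Go. This sketch is not the removed s390x assembly; `forwardCopy` and `safeCopy` are hypothetical names. It shows that a word-by-word forward copy corrupts the data when the destination overlaps the source at a higher offset, while the built-in `copy` is defined to handle overlapping slices.

```go
package main

import "fmt"

// forwardCopy copies x into z one word at a time, lowest index first.
// It is wrong when z overlaps x and starts at a higher address.
func forwardCopy(z, x []uint) {
	for i := range x {
		z[i] = x[i]
	}
}

// safeCopy relies on the built-in copy, which permits overlap.
func safeCopy(z, x []uint) {
	copy(z, x)
}

// shiftByZero applies a zero-bit "shift" from buf[0:4] into buf[1:5]
// using the supplied copy routine and returns the resulting buffer.
func shiftByZero(cp func(z, x []uint)) []uint {
	buf := []uint{1, 2, 3, 4, 0}
	cp(buf[1:5], buf[0:4])
	return buf
}

func main() {
	fmt.Println(shiftByZero(forwardCopy)) // source words are overwritten before being read
	fmt.Println(shiftByZero(safeCopy))
}
```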
Katie Hockman	1e1fa5903b	math/big: fix shift for recursive division The previous s value could cause a crash for certain inputs. Will check in tests and documentation improvements later. Thanks to the Go Ethereum team and the OSS-Fuzz project for reporting this. Thanks to Rémy Oudompheng and Robert Griesemer for their help developing and validating the fix. Fixes CVE-2020-28362 Change-Id: Ibbf455c4436bcdb07c84a34fa6551fb3422356d3 Reviewed-on: https://team-review.git.corp.google.com/c/golang/go-private/+/899974 Reviewed-by: Roland Shoemaker <bracewell@google.com> Reviewed-by: Filippo Valsorda <valsorda@google.com> Reviewed-on: https://go-review.googlesource.com/c/go/+/269657 Trust: Katie Hockman <katie@golang.org> Trust: Roland Shoemaker <roland@golang.org> Run-TryBot: Katie Hockman <katie@golang.org> Reviewed-by: Roland Shoemaker <roland@golang.org> TryBot-Result: Go Bot <gobot@golang.org>	2020-11-12 20:42:40 +00:00
surechen	f588974a52	math/big: reduce allocations for building decimal strings Append operations in the decimal String function may cause several allocations. Use make to pre allocate slices in String that have enough capacity to avoid additional allocations in append operations. name old time/op new time/op delta DecimalConversion-8 139µs ± 7% 109µs ± 2% -21.06% (p=0.000 n=10+10) Change-Id: Id0284d204918a179a0421c51c35d86a3408e1bd9 Reviewed-on: https://go-review.googlesource.com/c/go/+/233980 Run-TryBot: Emmanuel Odeke <emmanuel@orijtech.com> Run-TryBot: Giovanni Bajo <rasky@develer.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Giovanni Bajo <rasky@develer.com> Reviewed-by: Martin Möhrmann <moehrmann@google.com> Reviewed-by: Robert Griesemer <gri@golang.org> Trust: Giovanni Bajo <rasky@develer.com> Trust: Martin Möhrmann <moehrmann@google.com>	2020-10-29 22:45:29 +00:00
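The preallocation idea from commit f588974a52 is easy to show in isolation. The sketch below is generic, not the decimal `String` code; both function names are hypothetical. The first version lets `append` grow the slice as needed, and the second reserves the final capacity up front with `make` so the appends never reallocate.

```go
package main

import "fmt"

// digitsGrow builds n digit characters, letting append grow the slice.
func digitsGrow(n int) []byte {
	var buf []byte
	for i := 0; i < n; i++ {
		buf = append(buf, '0'+byte(i%10))
	}
	return buf
}

// digitsPrealloc builds the same bytes but reserves capacity n first,
// so the backing array is allocated once.
func digitsPrealloc(n int) []byte {
	buf := make([]byte, 0, n)
	for i := 0; i < n; i++ {
		buf = append(buf, '0'+byte(i%10))
	}
	return buf
}

func main() {
	fmt.Println(string(digitsGrow(12)), string(digitsPrealloc(12)))
}
```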
SparrowLii	d54a9a9c42	math/big: replace division with multiplication by reciprocal word Division is much slower than multiplication. And the method of using multiplication by multiplying reciprocal and replacing division with it can increase the speed of divWVW algorithm by three times,and at the same time increase the speed of nats division. The benchmark test on arm64 is as follows: name old time/op new time/op delta DivWVW/1-4 13.1ns ± 4% 13.3ns ± 4% ~ (p=0.444 n=5+5) DivWVW/2-4 48.6ns ± 1% 51.2ns ± 2% +5.39% (p=0.008 n=5+5) DivWVW/3-4 82.0ns ± 1% 69.7ns ± 1% -15.03% (p=0.008 n=5+5) DivWVW/4-4 116ns ± 1% 71ns ± 2% -38.88% (p=0.008 n=5+5) DivWVW/5-4 152ns ± 1% 84ns ± 4% -44.70% (p=0.008 n=5+5) DivWVW/10-4 319ns ± 1% 155ns ± 4% -51.50% (p=0.008 n=5+5) DivWVW/100-4 3.44µs ± 3% 1.30µs ± 8% -62.30% (p=0.008 n=5+5) DivWVW/1000-4 33.8µs ± 0% 10.9µs ± 1% -67.74% (p=0.008 n=5+5) DivWVW/10000-4 343µs ± 4% 111µs ± 5% -67.63% (p=0.008 n=5+5) DivWVW/100000-4 3.35ms ± 1% 1.25ms ± 3% -62.79% (p=0.008 n=5+5) QuoRem-4 3.08µs ± 2% 2.21µs ± 4% -28.40% (p=0.008 n=5+5) ModSqrt225_Tonelli-4 444µs ± 2% 457µs ± 3% ~ (p=0.095 n=5+5) ModSqrt225_3Mod4-4 136µs ± 1% 138µs ± 3% ~ (p=0.151 n=5+5) ModSqrt231_Tonelli-4 473µs ± 3% 483µs ± 4% ~ (p=0.548 n=5+5) ModSqrt231_5Mod8-4 164µs ± 9% 169µs ±12% ~ (p=0.421 n=5+5) Sqrt-4 36.8µs ± 1% 28.6µs ± 0% -22.17% (p=0.016 n=5+4) Div/20/10-4 50.0ns ± 3% 51.3ns ± 6% ~ (p=0.238 n=5+5) Div/40/20-4 49.8ns ± 2% 51.3ns ± 6% ~ (p=0.222 n=5+5) Div/100/50-4 85.8ns ± 4% 86.5ns ± 5% ~ (p=0.246 n=5+5) Div/200/100-4 335ns ± 3% 296ns ± 2% -11.60% (p=0.008 n=5+5) Div/400/200-4 442ns ± 2% 359ns ± 5% -18.81% (p=0.008 n=5+5) Div/1000/500-4 858ns ± 3% 643ns ± 6% -25.06% (p=0.008 n=5+5) Div/2000/1000-4 1.70µs ± 3% 1.28µs ± 4% -24.80% (p=0.008 n=5+5) Div/20000/10000-4 45.0µs ± 5% 41.8µs ± 4% -7.17% (p=0.016 n=5+5) Div/200000/100000-4 1.51ms ± 7% 1.43ms ± 3% -5.42% (p=0.016 n=5+5) Div/2000000/1000000-4 57.6ms ± 4% 57.5ms ± 3% ~ (p=1.000 n=5+5) Div/20000000/10000000-4 2.08s ± 
3% 2.04s ± 1% ~ (p=0.095 n=5+5) name old speed new speed delta DivWVW/1-4 4.87GB/s ± 4% 4.80GB/s ± 4% ~ (p=0.310 n=5+5) DivWVW/2-4 2.63GB/s ± 1% 2.50GB/s ± 2% -5.07% (p=0.008 n=5+5) DivWVW/3-4 2.34GB/s ± 1% 2.76GB/s ± 1% +17.70% (p=0.008 n=5+5) DivWVW/4-4 2.21GB/s ± 1% 3.61GB/s ± 2% +63.42% (p=0.008 n=5+5) DivWVW/5-4 2.10GB/s ± 2% 3.81GB/s ± 4% +80.89% (p=0.008 n=5+5) DivWVW/10-4 2.01GB/s ± 0% 4.13GB/s ± 4% +105.91% (p=0.008 n=5+5) DivWVW/100-4 1.86GB/s ± 2% 4.95GB/s ± 7% +165.63% (p=0.008 n=5+5) DivWVW/1000-4 1.89GB/s ± 0% 5.86GB/s ± 1% +209.96% (p=0.008 n=5+5) DivWVW/10000-4 1.87GB/s ± 4% 5.76GB/s ± 5% +208.96% (p=0.008 n=5+5) DivWVW/100000-4 1.91GB/s ± 1% 5.14GB/s ± 3% +168.85% (p=0.008 n=5+5) Change-Id: I049f1196562b20800e6ef8a6493fd147f93ad830 Reviewed-on: https://go-review.googlesource.com/c/go/+/250417 Trust: Giovanni Bajo <rasky@develer.com> Trust: Keith Randall <khr@golang.org> Run-TryBot: Giovanni Bajo <rasky@develer.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2020-09-23 21:55:55 +00:00
surechen	73eb24ccb6	math: Remove redundant local variable Ln2 Use the const variable Ln2 in math/const.go for function acosh. Change-Id: I5381d03dd3142c227ae5773ece9be6c8f377615e Reviewed-on: https://go-review.googlesource.com/c/go/+/232517 Reviewed-by: Robert Griesemer <gri@golang.org> Trust: Robert Griesemer <gri@golang.org> Trust: Giovanni Bajo <rasky@develer.com>	2020-09-19 09:09:52 +00:00
Xiangdong Ji	ae7b6a3b77	math/big: tune addVW/subVW performance on arm64 Add an optimization for addVW and subVW over large-sized vectors, it switches from add/sub with carry to copy the rest of the vector when we are done with carries. Consistent performance improvement are observed on various arm64 machines. Add additional tests and benchmarks to increase the test coverage. TestFunVWExt: Testing with various types of input vector, using the result from go-version addVW/subVW as golden reference. BenchmarkAddVWext and BenchmarkSubVWext: Benchmarking using input vector having all 1s or all 0s, for evaluating the overhead of worst case. 1. Perf. comparison over randomly generated input vectors: Server 1: name old time/op new time/op delta AddVW/1 12.3ns ± 3% 12.0ns ± 0% -2.60% (p=0.001 n=10+8) AddVW/2 12.5ns ± 2% 12.3ns ± 0% -1.84% (p=0.001 n=10+8) AddVW/3 12.6ns ± 2% 12.3ns ± 0% -1.91% (p=0.009 n=10+10) AddVW/4 13.1ns ± 3% 12.7ns ± 0% -2.98% (p=0.006 n=10+8) AddVW/5 14.4ns ± 1% 13.9ns ± 0% -3.81% (p=0.000 n=10+10) AddVW/10 11.7ns ± 0% 11.7ns ± 0% ~ (all equal) AddVW/100 47.8ns ± 0% 29.9ns ± 2% -37.38% (p=0.000 n=10+9) AddVW/1000 446ns ± 0% 207ns ± 0% -53.59% (p=0.000 n=10+10) AddVW/10000 4.35µs ± 1% 2.92µs ± 0% -32.85% (p=0.000 n=10+10) AddVW/100000 43.6µs ± 0% 29.7µs ± 0% -31.92% (p=0.000 n=8+10) SubVW/1 12.6ns ± 0% 12.3ns ± 2% -2.22% (p=0.000 n=7+10) SubVW/2 12.7ns ± 0% 12.6ns ± 1% -0.39% (p=0.046 n=8+10) SubVW/3 12.7ns ± 1% 12.6ns ± 1% ~ (p=0.410 n=10+10) SubVW/4 13.3ns ± 3% 13.1ns ± 3% ~ (p=0.072 n=10+10) SubVW/5 14.2ns ± 0% 14.1ns ± 1% -0.63% (p=0.046 n=8+10) SubVW/10 11.7ns ± 0% 11.7ns ± 0% ~ (all equal) SubVW/100 47.8ns ± 0% 33.1ns ±19% -30.71% (p=0.000 n=10+10) SubVW/1000 446ns ± 0% 207ns ± 0% -53.59% (p=0.000 n=10+10) SubVW/10000 4.33µs ± 1% 2.92µs ± 0% -32.66% (p=0.000 n=10+6) SubVW/100000 43.4µs ± 0% 29.6µs ± 0% -31.90% (p=0.000 n=10+9) Server 2: name old time/op new time/op delta AddVW/1 5.49ns ± 0% 5.53ns ± 2% ~ (p=1.000 n=9+10) AddVW/2 5.96ns ± 2% 
5.92ns ± 1% -0.69% (p=0.039 n=10+10) AddVW/3 6.72ns ± 0% 6.73ns ± 0% ~ (p=0.078 n=10+10) AddVW/4 7.07ns ± 0% 6.75ns ± 2% -4.55% (p=0.000 n=10+10) AddVW/5 8.14ns ± 0% 8.17ns ± 0% +0.46% (p=0.003 n=8+8) AddVW/10 10.0ns ± 0% 10.1ns ± 1% +0.70% (p=0.003 n=10+10) AddVW/100 43.0ns ± 0% 33.5ns ± 0% -22.09% (p=0.000 n=9+9) AddVW/1000 394ns ± 0% 278ns ± 0% -29.44% (p=0.000 n=10+10) AddVW/10000 4.18µs ± 0% 3.14µs ± 0% -24.81% (p=0.000 n=8+8) AddVW/100000 68.3µs ± 3% 62.1µs ± 5% -9.13% (p=0.000 n=10+10) SubVW/1 5.37ns ± 2% 5.42ns ± 1% ~ (p=0.990 n=10+10) SubVW/2 5.89ns ± 0% 5.92ns ± 1% +0.58% (p=0.000 n=8+10) SubVW/3 6.64ns ± 1% 6.82ns ± 3% +2.63% (p=0.000 n=9+10) SubVW/4 7.17ns ± 0% 6.69ns ± 2% -6.74% (p=0.000 n=10+9) SubVW/5 8.22ns ± 0% 8.18ns ± 0% -0.46% (p=0.001 n=8+9) SubVW/10 10.0ns ± 1% 10.1ns ± 1% ~ (p=0.341 n=10+10) SubVW/100 43.0ns ± 0% 33.5ns ± 0% -22.09% (p=0.000 n=7+10) SubVW/1000 394ns ± 0% 278ns ± 0% -29.44% (p=0.000 n=10+10) SubVW/10000 4.18µs ± 0% 3.15µs ± 0% -24.62% (p=0.000 n=9+9) SubVW/100000 67.7µs ± 4% 62.4µs ± 2% -7.92% (p=0.000 n=10+10) 2. Perf. 
comparison over input vectors of all 1s or all 0s Server 1: name old time/op new time/op delta AddVWext/1 12.6ns ± 0% 12.0ns ± 0% -4.76% (p=0.000 n=6+10) AddVWext/2 12.7ns ± 0% 12.4ns ± 1% -2.52% (p=0.000 n=10+10) AddVWext/3 12.7ns ± 0% 12.4ns ± 0% -2.36% (p=0.000 n=9+7) AddVWext/4 13.2ns ± 4% 12.7ns ± 0% -3.71% (p=0.001 n=10+9) AddVWext/5 14.6ns ± 0% 13.9ns ± 0% -4.79% (p=0.000 n=10+8) AddVWext/10 11.7ns ± 0% 11.7ns ± 0% ~ (all equal) AddVWext/100 47.8ns ± 0% 47.4ns ± 0% -0.84% (p=0.000 n=10+10) AddVWext/1000 446ns ± 0% 399ns ± 0% -10.54% (p=0.000 n=10+10) AddVWext/10000 4.34µs ± 1% 3.90µs ± 0% -10.12% (p=0.000 n=10+10) AddVWext/100000 43.9µs ± 1% 39.4µs ± 0% -10.18% (p=0.000 n=10+10) SubVWext/1 12.6ns ± 0% 12.3ns ± 2% -2.70% (p=0.000 n=7+10) SubVWext/2 12.6ns ± 1% 12.6ns ± 2% ~ (p=0.234 n=10+10) SubVWext/3 12.7ns ± 0% 12.6ns ± 2% -0.71% (p=0.033 n=10+10) SubVWext/4 13.4ns ± 0% 13.1ns ± 3% -2.01% (p=0.006 n=8+10) SubVWext/5 14.2ns ± 0% 14.1ns ± 1% -0.85% (p=0.003 n=10+10) SubVWext/10 11.7ns ± 0% 11.7ns ± 0% ~ (all equal) SubVWext/100 47.8ns ± 0% 47.4ns ± 0% -0.84% (p=0.000 n=10+10) SubVWext/1000 446ns ± 0% 399ns ± 0% -10.54% (p=0.000 n=10+10) SubVWext/10000 4.33µs ± 1% 3.90µs ± 0% -10.02% (p=0.000 n=10+10) SubVWext/100000 43.5µs ± 0% 39.5µs ± 1% -9.16% (p=0.000 n=7+10) Server 2: name old time/op new time/op delta AddVWext/1 5.48ns ± 0% 5.43ns ± 1% -0.97% (p=0.000 n=9+9) AddVWext/2 5.99ns ± 2% 5.93ns ± 1% ~ (p=0.054 n=10+10) AddVWext/3 6.74ns ± 0% 6.79ns ± 1% +0.80% (p=0.000 n=9+10) AddVWext/4 7.18ns ± 0% 7.21ns ± 1% +0.36% (p=0.034 n=9+10) AddVWext/5 7.93ns ± 3% 8.18ns ± 0% +3.18% (p=0.000 n=10+8) AddVWext/10 10.0ns ± 0% 10.1ns ± 1% +0.60% (p=0.011 n=10+10) AddVWext/100 43.0ns ± 0% 47.7ns ± 0% +10.93% (p=0.000 n=9+10) AddVWext/1000 394ns ± 0% 399ns ± 0% +1.27% (p=0.000 n=10+10) AddVWext/10000 4.18µs ± 0% 4.50µs ± 0% +7.73% (p=0.000 n=9+10) AddVWext/100000 67.6µs ± 2% 68.4µs ± 3% ~ (p=0.139 n=9+8) SubVWext/1 5.46ns ± 1% 5.43ns ± 0% -0.55% (p=0.002 n=9+9) SubVWext/2 
5.89ns ± 0% 5.93ns ± 1% +0.68% (p=0.000 n=8+10) SubVWext/3 6.72ns ± 1% 6.79ns ± 1% +1.07% (p=0.000 n=10+10) SubVWext/4 6.98ns ± 1% 7.21ns ± 0% +3.25% (p=0.000 n=10+10) SubVWext/5 8.22ns ± 0% 7.99ns ± 3% -2.83% (p=0.000 n=8+10) SubVWext/10 10.0ns ± 1% 10.1ns ± 1% ~ (p=0.239 n=10+10) SubVWext/100 43.0ns ± 0% 47.7ns ± 0% +10.93% (p=0.000 n=8+10) SubVWext/1000 394ns ± 0% 399ns ± 0% +1.27% (p=0.000 n=10+10) SubVWext/10000 4.18µs ± 0% 4.51µs ± 0% +7.86% (p=0.000 n=8+8) SubVWext/100000 68.3µs ± 2% 68.0µs ± 3% ~ (p=0.515 n=10+8) Change-Id: I134a5194b8a2deaaebbaa2b771baf72846971d58 Reviewed-on: https://go-review.googlesource.com/c/go/+/229739 Reviewed-by: Cherry Zhang <cherryyz@google.com> Reviewed-by: Robert Griesemer <gri@golang.org> Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2020-08-28 16:40:41 +00:00
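The fast path described in commit ae7b6a3b77 (stop adding once the carry is zero and copy the remaining words) can be sketched in portable Go. This is an illustration under assumptions, not the arm64 assembly: the name `addVWFast` is hypothetical, words are plain `uint`, and the sketch assumes `len(x) == len(z)`.

```go
package main

import (
	"fmt"
	"math/bits"
)

// addVWFast computes z = x + y for a little-endian word vector x and a
// single word y, returning the final carry. Once the running carry is
// zero the remaining words cannot change, so they are copied directly.
func addVWFast(z, x []uint, y uint) (carry uint) {
	carry = y
	for i := 0; i < len(z) && i < len(x); i++ {
		if carry == 0 {
			copy(z[i:], x[i:]) // fast path: nothing left to add
			return 0
		}
		z[i], carry = bits.Add(x[i], carry, 0)
	}
	return carry
}

func main() {
	max := ^uint(0)
	x := []uint{max, max, 5, 7}
	z := make([]uint, len(x))
	fmt.Println(addVWFast(z, x, 1), z)
}
```

The all-ones inputs used by the `AddVWext` benchmarks are the worst case for this scheme, since the carry never clears and the copy path is never taken.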
surechen	bd6dfe9a3e	math/big: add a comment for SetMantExp Change-Id: I9ff5d1767cf70648c2251268e5e815944a7cb371 Reviewed-on: https://go-review.googlesource.com/c/go/+/233737 Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2020-08-28 16:25:32 +00:00
zhouzhongyuan	63828096f6	math/big: add function example While reading the source code of the math/big package, I found the SetString function example of float type missing. Change-Id: Id8c16a58e2e24f9463e8ff38adbc98f8c418ab26 Reviewed-on: https://go-review.googlesource.com/c/go/+/232804 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2020-08-26 16:15:32 +00:00
SparrowLii	41bc0a1713	math/big: fix TestShiftOverlap for test -count arguments > 1 Don't overwrite incoming test data. The change uses copy instead of assigning statement to avoid this. Change-Id: Ib907101822d811de5c45145cb9d7961907e212c3 Reviewed-on: https://go-review.googlesource.com/c/go/+/250137 Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2020-08-25 17:04:40 +00:00
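The difference between assigning a slice and copying it, which is what commit 41bc0a1713 turns on, can be seen in a few lines. The helper name `cloneWords` is hypothetical and the sketch is not the test code itself.

```go
package main

import "fmt"

// cloneWords returns an independent copy of src, so that later writes
// to the result do not modify the caller's data.
func cloneWords(src []uint) []uint {
	dst := make([]uint, len(src))
	copy(dst, src)
	return dst
}

func main() {
	input := []uint{1, 2, 3}

	alias := input // assignment shares the backing array
	alias[0] = 9
	fmt.Println(input) // the shared element changed

	input[0] = 1
	clone := cloneWords(input)
	clone[0] = 9
	fmt.Println(input) // the original is untouched
}
```

With an aliasing assignment, a second run of the same test (as with `-count` greater than 1) would start from data already modified by the first run.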
Lynn Boger	216714e44f	math/big: improve performance of mulAddVWW on ppc64x This changes the assembly implementation on ppc64x to improve performance by reordering some instructions. It also eliminates an unnecessary move by changing an ADDZE to use the correct target register. Improvement on power9: MulAddVWW/1 6.89ns ± 0% 7.30ns ± 0% +5.95% (p=1.000 n=1+1) MulAddVWW/2 8.04ns ± 0% 8.06ns ± 0% +0.25% (p=1.000 n=1+1) MulAddVWW/3 9.39ns ± 0% 9.39ns ± 0% ~ (all equal) MulAddVWW/4 9.76ns ± 0% 9.48ns ± 0% -2.87% (p=1.000 n=1+1) MulAddVWW/5 10.5ns ± 0% 10.3ns ± 0% -1.90% (p=1.000 n=1+1) MulAddVWW/10 15.4ns ± 0% 14.9ns ± 0% -3.25% (p=1.000 n=1+1) MulAddVWW/100 149ns ± 0% 125ns ± 0% -16.11% (p=1.000 n=1+1) MulAddVWW/1000 1.42µs ± 0% 1.28µs ± 0% -9.74% (p=1.000 n=1+1) MulAddVWW/10000 14.2µs ± 0% 12.8µs ± 0% -9.73% (p=1.000 n=1+1) MulAddVWW/100000 144µs ± 0% 129µs ± 0% -10.10% (p=1.000 n=1+1) Change-Id: I0ae7002a69783ca19d7a4e3e42042ae75dc60069 Reviewed-on: https://go-review.googlesource.com/c/go/+/248721 Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com> Reviewed-by: Paul Murphy <murp@ibm.com>	2020-08-18 20:25:26 +00:00
kakulisen	441b52f566	math: simplify the code Simplifying some code without compromising performance. My CPU is Intel Xeon Gold 6161, 2.20GHz, 64-bit operating system. The memory is 8GB. This is my test environment, I hope to help you judge. Benchmark: name old time/op new time/op delta Log1p-4 21.8ns ± 5% 21.8ns ± 4% ~ (p=0.973 n=20+20) Change-Id: Icd8f96f1325b00007602d114300b92d4c57de409 Reviewed-on: https://go-review.googlesource.com/c/go/+/233940 Reviewed-by: Robert Griesemer <gri@golang.org>	2020-08-15 02:20:42 +00:00
Alberto Donizetti	65f514edfb	math: fix dead link to springerlink (now link.springer) Change-Id: Ie5fd026af45d2e7bc371a38d15dbb52a1b4958cd Reviewed-on: https://go-review.googlesource.com/c/go/+/235717 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2020-05-29 14:33:50 +00:00
Bryan C. Mills	13617380ca	testing: clean up remaining TempDir issues from CL 231958 Updates #38850 Change-Id: I33f48762f5520eb0c0a841d8ca1ccdd65ecc20c8 Reviewed-on: https://go-review.googlesource.com/c/go/+/234583 Run-TryBot: Bryan C. Mills <bcmills@google.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2020-05-19 19:55:14 +00:00
Filippo Valsorda	c9d5f60eaa	math/big: add (Int).FillBytes Replaced almost every use of Bytes with FillBytes. Note that the approved proposal was for func (Int) FillBytes(buf []byte) while this implements func (*Int) FillBytes(buf []byte) []byte because the latter was far nicer to use in all callsites. Fixes #35833 Change-Id: Ia912df123e5d79b763845312ea3d9a8051343c0a Reviewed-on: https://go-review.googlesource.com/c/go/+/230397 Reviewed-by: Robert Griesemer <gri@golang.org>	2020-05-05 00:36:44 +00:00
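A short usage sketch of `(*big.Int).FillBytes`, assuming Go 1.15 or later. The wrapper `fixedWidth` is a hypothetical helper. It returns the big-endian absolute value of the integer, zero-padded to the length of the supplied buffer.

```go
package main

import (
	"fmt"
	"math/big"
)

// fixedWidth encodes v as an n-byte big-endian value using FillBytes.
// FillBytes panics if the value does not fit in the buffer.
func fixedWidth(v int64, n int) []byte {
	return big.NewInt(v).FillBytes(make([]byte, n))
}

func main() {
	fmt.Println(fixedWidth(258, 4))
}
```

Returning the buffer is what makes the one-line form above possible, which is the call-site convenience the commit message refers to.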
Joel Sing	03e6073b13	math: implement Min/Max in riscv64 assembly Change-Id: If34422859d47bc8f44974a00c6b7908e7655ff41 Reviewed-on: https://go-review.googlesource.com/c/go/+/223561 Reviewed-by: Cherry Zhang <cherryyz@google.com>	2020-05-04 17:29:13 +00:00
kakulisen	e90b0ce68b	math: add function examples. The function Modf lacks corresponding examples. Change-Id: Id93423500e87d35b0b6870882be1698b304797ae Reviewed-on: https://go-review.googlesource.com/c/go/+/231097 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2020-05-02 20:22:19 +00:00
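For readers unfamiliar with `math.Modf`, a minimal sketch follows. It is not the example added by the commit, and `split` is a hypothetical wrapper. `Modf` returns the integer and fractional parts of its argument, both carrying the argument's sign.

```go
package main

import (
	"fmt"
	"math"
)

// split returns the integer and fractional parts of f via math.Modf.
func split(f float64) (whole, frac float64) {
	return math.Modf(f)
}

func main() {
	whole, frac := split(-2.5)
	fmt.Println(whole, frac)
}
```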
Brian Kessler	4209a9f65a	math/cmplx: handle special cases Implement special case handling and testing to ensure conformance with the C99 standard annex G.6 Complex arithmetic. Fixes #29320 Change-Id: Id72eb4c5a35d5a54b4b8690d2f7176ab11028f1b Reviewed-on: https://go-review.googlesource.com/c/go/+/220689 Reviewed-by: Robert Griesemer <gri@golang.org>	2020-05-01 03:16:37 +00:00
kakulisen	df2862cf54	math: Add a function example When I browsed the source code, I saw that there is no corresponding example of this function. I am not sure if there is a need for an increase, this is my first time to submit CL. Change-Id: Idbf4e1e1ed2995176a76959d561e152263a2fd26 Reviewed-on: https://go-review.googlesource.com/c/go/+/230741 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2020-04-30 00:33:38 +00:00
Ruixin(Peter) Bao	a7e9e84716	math/big: simplify hasVX checking on s390x Originally, we use an assembly function that returns a boolean result to tell whether the machine has vector facility or not. It is now no longer needed when we can directly use cpu.S390X.HasVX variable. Change-Id: Ic1dae851982532bcfd9a9453416c112347f21d87 Reviewed-on: https://go-review.googlesource.com/c/go/+/230318 Reviewed-by: Michael Munday <mike.munday@ibm.com> Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2020-04-27 20:20:53 +00:00
Ruixin Bao	d2f5e4e38c	math: simplify hasVX checking on s390x Originally, we use an assembly function that returns a boolean result to tell whether the machine has vector facility or not. It is now no longer needed when we can directly use cpu.S390X.HasVX variable. Change-Id: Ic3ffeb9e63238ef41406d97cdc42502145ddb454 Reviewed-on: https://go-review.googlesource.com/c/go/+/230319 Reviewed-by: Michael Munday <mike.munday@ibm.com> Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2020-04-27 20:06:57 +00:00
Tyson Andre	19648622d2	math/cmplx: fix typo in code comment Everywhere else is using "cancellation" as of 2019 The reasoning is mentioned in 170060. > Though there is variation in the spelling of canceled, > cancellation is always spelled with a double l. > > Reference: https://www.grammarly.com/blog/canceled-vs-cancelled/ Change-Id: I933ea68d7251986ce582b92c33b7cb13cee1d207 GitHub-Last-Rev: `fc3d5ada2b` GitHub-Pull-Request: golang/go#38661 Reviewed-on: https://go-review.googlesource.com/c/go/+/230199 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2020-04-25 21:06:35 +00:00
Ruixin(Peter) Bao	3a37fd4010	math/big: rewrite subVW to use fast path on s390x This CL replaces the original subVW implementation with a implementation that uses a similar idea as CL 164968. When we know the borrow bit is zero, we can copy the rest of words as they will not be updated. Also, since we are copying vector of a words, a faster implementation of copy is written in this CL to copy a word or multiple words at a time. Benchmarks: name old time/op new time/op delta SubVW/1-18 4.43ns ± 0% 3.82ns ± 0% -13.85% (p=0.000 n=20+20) SubVW/2-18 5.39ns ± 0% 4.25ns ± 0% -21.23% (p=0.000 n=20+20) SubVW/3-18 6.29ns ± 0% 4.65ns ± 0% -26.07% (p=0.000 n=16+19) SubVW/4-18 6.08ns ± 2% 4.84ns ± 0% -20.43% (p=0.000 n=20+20) SubVW/5-18 7.06ns ± 1% 4.93ns ± 0% -30.18% (p=0.000 n=20+20) SubVW/10-18 10.3ns ± 2% 7.2ns ± 0% -30.35% (p=0.000 n=20+19) SubVW/100-18 48.0ns ± 4% 17.6ns ± 0% -63.32% (p=0.000 n=18+19) SubVW/1000-18 448ns ±10% 236ns ± 1% -47.24% (p=0.000 n=20+20) SubVW/10000-18 4.83µs ± 5% 2.96µs ± 0% -38.73% (p=0.000 n=20+19) SubVW/100000-18 46.6µs ± 3% 30.6µs ± 1% -34.30% (p=0.000 n=20+20) [Geo mean] 56.3ns 37.0ns -34.24% name old speed new speed delta SubVW/1-18 1.80GB/s ± 0% 2.10GB/s ± 0% +16.16% (p=0.000 n=20+20) SubVW/2-18 2.97GB/s ± 0% 3.77GB/s ± 0% +26.95% (p=0.000 n=20+20) SubVW/3-18 3.82GB/s ± 0% 5.16GB/s ± 0% +35.26% (p=0.000 n=20+19) SubVW/4-18 5.26GB/s ± 1% 6.61GB/s ± 0% +25.59% (p=0.000 n=20+20) SubVW/5-18 5.67GB/s ± 1% 8.11GB/s ± 0% +43.12% (p=0.000 n=20+20) SubVW/10-18 7.79GB/s ± 2% 11.17GB/s ± 0% +43.52% (p=0.000 n=20+19) SubVW/100-18 16.7GB/s ± 4% 45.5GB/s ± 0% +172.61% (p=0.000 n=18+20) SubVW/1000-18 17.9GB/s ± 9% 33.9GB/s ± 1% +89.25% (p=0.000 n=20+20) SubVW/10000-18 16.6GB/s ± 5% 27.0GB/s ± 0% +63.08% (p=0.000 n=20+19) SubVW/100000-18 17.2GB/s ± 2% 26.1GB/s ± 1% +52.18% (p=0.000 n=20+20) [Geo mean] 7.25GB/s 11.03GB/s +52.01% Change-Id: I32e99cbab3260054a96231d02b87049c833ab77e Reviewed-on: https://go-review.googlesource.com/c/go/+/227297 Reviewed-by: 
Michael Munday <mike.munday@ibm.com> Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2020-04-24 14:50:59 +00:00
Ruixin(Peter) Bao	ee8972cd12	math/big: rewrite addVW to use fast path on s390x Rewrite addVW to use a fast path and remove the original vector and non vector implementation of addVW in assembly. This CL uses a similar idea as CL 164968, where we copy the rest of words when we know carry bit is zero. In addition, since we are copying vector of words, a faster implementation of copy is written in this CL to copy a word or multiple words at a time. Benchmarks: name old time/op new time/op delta AddVW/1-18 4.56ns ± 0% 4.01ns ± 6% -12.14% (p=0.000 n=18+20) AddVW/2-18 5.54ns ± 0% 4.42ns ± 5% -20.20% (p=0.000 n=18+20) AddVW/3-18 6.55ns ± 0% 4.61ns ± 0% -29.62% (p=0.000 n=16+18) AddVW/4-18 6.11ns ± 2% 5.12ns ± 6% -16.19% (p=0.000 n=20+20) AddVW/5-18 7.32ns ± 4% 5.14ns ± 0% -29.77% (p=0.000 n=20+19) AddVW/10-18 10.6ns ± 2% 7.2ns ± 1% -31.47% (p=0.000 n=20+20) AddVW/100-18 49.6ns ± 2% 18.0ns ± 0% -63.63% (p=0.000 n=20+20) AddVW/1000-18 465ns ± 3% 244ns ± 0% -47.54% (p=0.000 n=20+20) AddVW/10000-18 4.99µs ± 4% 2.97µs ± 0% -40.54% (p=0.000 n=20+20) AddVW/100000-18 48.3µs ± 3% 30.8µs ± 1% -36.29% (p=0.000 n=20+20) [Geo mean] 58.1ns 38.0ns -34.57% name old speed new speed delta AddVW/1-18 1.76GB/s ± 0% 2.00GB/s ± 6% +14.04% (p=0.000 n=20+20) AddVW/2-18 2.89GB/s ± 0% 3.63GB/s ± 5% +25.55% (p=0.000 n=18+20) AddVW/3-18 3.66GB/s ± 0% 5.21GB/s ± 0% +42.25% (p=0.000 n=18+19) AddVW/4-18 5.24GB/s ± 2% 6.27GB/s ± 6% +19.61% (p=0.000 n=20+20) AddVW/5-18 5.47GB/s ± 4% 7.78GB/s ± 0% +42.28% (p=0.000 n=20+18) AddVW/10-18 7.55GB/s ± 2% 11.04GB/s ± 1% +46.09% (p=0.000 n=20+20) AddVW/100-18 16.1GB/s ± 2% 44.3GB/s ± 0% +174.77% (p=0.000 n=20+20) AddVW/1000-18 17.2GB/s ± 3% 32.8GB/s ± 1% +90.58% (p=0.000 n=20+20) AddVW/10000-18 16.0GB/s ± 4% 26.9GB/s ± 0% +68.11% (p=0.000 n=20+20) AddVW/100000-18 16.6GB/s ± 3% 26.0GB/s ± 1% +56.94% (p=0.000 n=20+20) [Geo mean] 7.03GB/s 10.75GB/s +52.93% Change-Id: Idbb73f3178311bd2b18a93bdc1e48f26869d2f6a Reviewed-on: 
https://go-review.googlesource.com/c/go/+/209679 Reviewed-by: Michael Munday <mike.munday@ibm.com> Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2020-04-24 13:26:34 +00:00
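The fast-path idea in this commit (once the carry is zero, copy the remaining words) can be sketched in portable Go. This is only an illustration, not the s390x assembly from the CL; `addVWSketch` is a hypothetical name, and the sketch assumes `len(z) >= len(x)` with the least significant word first.

```go
package main

import (
	"fmt"
	"math/bits"
)

// addVWSketch computes z = x + y, where x is a vector of words (least
// significant first) and y is a single word. It returns the final carry.
// Hypothetical helper for illustration; assumes len(z) >= len(x).
func addVWSketch(z, x []uint, y uint) (c uint) {
	c = y
	for i := range x {
		if c == 0 {
			// Fast path: nothing left to add, so the remaining
			// words of x are copied to z unchanged.
			copy(z[i:], x[i:])
			return 0
		}
		z[i], c = bits.Add(x[i], c, 0)
	}
	return c
}

func main() {
	x := []uint{^uint(0), ^uint(0), 7}
	z := make([]uint, len(x))
	c := addVWSketch(z, x, 1)
	fmt.Println(z, c) // the carry ripples through two words, then stops
}
```

For long vectors the carry usually dies out after the first word or two, so most of the work becomes a plain copy, which is consistent with the larger gains the benchmark reports for the bigger sizes.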
Michael Munday	ab7a65f283	cmd/compile: clean up codegen for branch-on-carry on s390x This CL optimizes code that uses a carry from a function such as bits.Add64 as the condition in an if statement. For example: x, c := bits.Add64(a, b, 0) if c != 0 { panic("overflow") } Rather than converting the carry into a 0 or a 1 value and using that as an input to a comparison instruction the carry flag is now used as the input to a conditional branch directly. This typically removes an ADD LOGICAL WITH CARRY instruction when user code is doing overflow detection and is closer to the code that a user would expect to generate. Change-Id: I950431270955ab72f1b5c6db873b6abe769be0da Reviewed-on: https://go-review.googlesource.com/c/go/+/219757 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2020-04-22 20:11:06 +00:00
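A runnable version of the overflow-detection pattern quoted in this commit might look like the following. `addChecked` is a hypothetical wrapper, and it returns a boolean instead of panicking so the result is easy to inspect; which instructions the compiler emits for it depends on the target and Go version.

```go
package main

import (
	"fmt"
	"math/bits"
)

// addChecked returns a+b and reports whether the sum fit in 64 bits.
// The carry from bits.Add64 is used directly as the branch condition,
// which is the shape of code the commit message describes.
// Hypothetical helper for illustration.
func addChecked(a, b uint64) (sum uint64, ok bool) {
	x, c := bits.Add64(a, b, 0)
	if c != 0 {
		return 0, false // overflow
	}
	return x, true
}

func main() {
	fmt.Println(addChecked(1, 2))
	fmt.Println(addChecked(^uint64(0), 1))
}
```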
Ruixin(Peter) Bao	0329c915a0	math/big: clean up whitespace in arith_s390x.s file This CL looks big but it only does formatting changes to arith_s390x.s. The file was formatted using asmfmt (https://github.com/klauspost/asmfmt), so there should not be any functional impact. I verified that the generated assembly of the big.test file is identical. Change-Id: I8b4035ef082a4d0357881869327e25253f2d8be1 Reviewed-on: https://go-review.googlesource.com/c/go/+/229302 Reviewed-by: Michael Munday <mike.munday@ibm.com> Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2020-04-22 15:40:55 +00:00
Alberto Donizetti	813f8eae27	math/big: remove Direct Sqrt computation The Float.Sqrt method switches (for performance reasons) between direct (uses Quo) and inverse (doesn't) computation, depending on the precision, with threshold 128. Unfortunately the implementation of recursive division in CL 172018 made Quo slightly slower exactly in the range around and below the threshold Sqrt is using, so this strategy is no longer profitable. The new division algorithm allocates more, and this has increased the amount of allocations performed by Sqrt when using the direct method; on low precisions the computation is fast, so additional allocations have a negative impact on performance. Interestingly, only using the inverse method doesn't just reverse the effects of the Quo algorithm change, but it seems to make performances better overall for small precisions: name old time/op new time/op delta FloatSqrt/64-4 643ns ± 1% 635ns ± 1% -1.24% (p=0.000 n=10+10) FloatSqrt/128-4 1.44µs ± 1% 1.02µs ± 1% -29.25% (p=0.000 n=10+10) FloatSqrt/256-4 1.49µs ± 1% 1.49µs ± 1% ~ (p=0.752 n=10+10) FloatSqrt/1000-4 3.71µs ± 1% 3.74µs ± 1% +0.87% (p=0.001 n=10+10) FloatSqrt/10000-4 35.3µs ± 1% 35.6µs ± 1% +0.82% (p=0.002 n=10+9) FloatSqrt/100000-4 844µs ± 1% 844µs ± 0% ~ (p=0.549 n=10+9) FloatSqrt/1000000-4 69.5ms ± 0% 69.6ms ± 0% ~ (p=0.222 n=9+9) name old alloc/op new alloc/op delta FloatSqrt/64-4 280B ± 0% 200B ± 0% -28.57% (p=0.000 n=10+10) FloatSqrt/128-4 504B ± 0% 248B ± 0% -50.79% (p=0.000 n=10+10) FloatSqrt/256-4 344B ± 0% 344B ± 0% ~ (all equal) FloatSqrt/1000-4 1.30kB ± 0% 1.30kB ± 0% ~ (all equal) FloatSqrt/10000-4 13.5kB ± 0% 13.5kB ± 0% ~ (p=0.237 n=10+10) FloatSqrt/100000-4 123kB ± 0% 123kB ± 0% ~ (p=0.247 n=10+10) FloatSqrt/1000000-4 1.83MB ± 1% 1.83MB ± 3% ~ (p=0.779 n=8+10) name old allocs/op new allocs/op delta FloatSqrt/64-4 8.00 ± 0% 5.00 ± 0% -37.50% (p=0.000 n=10+10) FloatSqrt/128-4 11.0 ± 0% 5.0 ± 0% -54.55% (p=0.000 n=10+10) FloatSqrt/256-4 5.00 ± 0% 5.00 ± 0% ~ (all 
equal) FloatSqrt/1000-4 6.00 ± 0% 6.00 ± 0% ~ (all equal) FloatSqrt/10000-4 6.00 ± 0% 6.00 ± 0% ~ (all equal) FloatSqrt/100000-4 6.00 ± 0% 6.00 ± 0% ~ (all equal) FloatSqrt/1000000-4 10.3 ±13% 10.3 ±13% ~ (p=1.000 n=10+10) For example, 1.02µs for FloatSqrt/128 is actually better than what I was getting on the same machine before the Quo changes. The .8% slowdown on /1000 and /10000 appears to be real and it is quite baffling (that codepath was not touched at all); it may be caused by code alignment changes. Change-Id: Ib03761cdc1055674bc7526d4f3a23d7a25094029 Reviewed-on: https://go-review.googlesource.com/c/go/+/228062 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2020-04-15 16:37:53 +00:00
Brad Fitzpatrick	8f53fad035	math/big: add test that linker is able to remove unused code (Follow-up to CL 228108.) Change-Id: Ia6d119ee19c7aa923cdeead06d3cee87a1751105 Reviewed-on: https://go-review.googlesource.com/c/go/+/228109 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2020-04-15 03:25:21 +00:00
Hanjun Kim	5a447c0ae9	math/big: fix typo in documentation for Int.Exp Fixes #38304 Also change `If m > 0, y < 0, ...` to `If m != 0, y < 0, ...` since `Exp` will return `nil` whatever `m`'s sign is. Change-Id: I17d7337ccd1404318cea5d42a8de904ad185fd00 GitHub-Last-Rev: `2399510300` GitHub-Pull-Request: golang/go#38390 Reviewed-on: https://go-review.googlesource.com/c/go/+/228000 Reviewed-by: Robert Griesemer <gri@golang.org>	2020-04-15 00:32:18 +00:00
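The documented behaviour this commit corrects can be checked with a small program. `expOrNil` is a hypothetical wrapper around `(*big.Int).Exp`; the sketch assumes a Go release that includes this documentation fix, where a negative exponent with a nonzero modulus yields a modular inverse, or `nil` when `x` and `m` are not relatively prime, regardless of the sign of `m`.

```go
package main

import (
	"fmt"
	"math/big"
)

// expOrNil computes x**y mod |m| with math/big and returns the result,
// which is nil when y < 0, m != 0, and x and m share a common factor.
// Hypothetical helper for illustration.
func expOrNil(x, y, m int64) *big.Int {
	return new(big.Int).Exp(big.NewInt(x), big.NewInt(y), big.NewInt(m))
}

func main() {
	fmt.Println(expOrNil(3, -1, 7))  // inverse of 3 modulo 7
	fmt.Println(expOrNil(2, -1, 4))  // 2 and 4 are not relatively prime
	fmt.Println(expOrNil(2, -1, -4)) // same outcome for a negative modulus
}
```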
Brad Fitzpatrick	a55645fa34	math/big: don't use Float in init to help linker discard 162 KiB Removes 162 KiB from binaries that don't use math/big.Float: -rwxr-xr-x 1 bradfitz bradfitz 1916590 Apr 14 12:21 x.after -rwxr-xr-x 1 bradfitz bradfitz 2082575 Apr 14 12:21 x.before No change in deps (this package already used sync). No change in benchmarks: name old time/op new time/op delta FloatSqrt/64-8 1.06µs ±10% 1.03µs ± 6% ~ (p=0.133 n=10+9) FloatSqrt/128-8 2.26µs ± 9% 2.28µs ± 9% ~ (p=0.460 n=10+8) FloatSqrt/256-8 2.29µs ± 5% 2.31µs ± 3% ~ (p=0.214 n=9+9) FloatSqrt/1000-8 5.82µs ± 3% 5.87µs ± 7% ~ (p=0.666 n=9+9) FloatSqrt/10000-8 56.4µs ± 5% 57.0µs ± 6% ~ (p=0.436 n=10+10) FloatSqrt/100000-8 1.34ms ± 8% 1.31ms ± 3% ~ (p=0.447 n=10+9) FloatSqrt/1000000-8 106ms ± 5% 107ms ± 7% ~ (p=0.315 n=10+10) name old alloc/op new alloc/op delta FloatSqrt/64-8 280B ± 0% 280B ± 0% ~ (all equal) FloatSqrt/128-8 504B ± 0% 504B ± 0% ~ (all equal) FloatSqrt/256-8 344B ± 0% 344B ± 0% ~ (all equal) FloatSqrt/1000-8 1.30kB ± 0% 1.30kB ± 0% ~ (all equal) FloatSqrt/10000-8 13.5kB ± 0% 13.5kB ± 0% ~ (p=0.403 n=10+10) FloatSqrt/100000-8 123kB ± 0% 123kB ± 0% ~ (p=0.393 n=10+10) FloatSqrt/1000000-8 1.84MB ± 7% 1.84MB ± 5% ~ (p=0.739 n=10+10) name old allocs/op new allocs/op delta FloatSqrt/64-8 8.00 ± 0% 8.00 ± 0% ~ (all equal) FloatSqrt/128-8 11.0 ± 0% 11.0 ± 0% ~ (all equal) FloatSqrt/256-8 5.00 ± 0% 5.00 ± 0% ~ (all equal) FloatSqrt/1000-8 6.00 ± 0% 6.00 ± 0% ~ (all equal) FloatSqrt/10000-8 6.00 ± 0% 6.00 ± 0% ~ (all equal) FloatSqrt/100000-8 6.00 ± 0% 6.00 ± 0% ~ (all equal) FloatSqrt/1000000-8 10.9 ±10% 10.8 ±17% ~ (p=0.974 n=10+10) Change-Id: I3337f1f531bf7b4fae192b9d90cd24ff2be14fea Reviewed-on: https://go-review.googlesource.com/c/go/+/228108 Reviewed-by: Robert Griesemer <gri@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2020-04-14 20:50:19 +00:00
Rémy Oudompheng	ac1fd419b6	math/big: correct off-by-one access in divBasic The divBasic function computes the quotient of big nats u/v word by word. It estimates each word qhat by performing a long division (top 2 words of u divided by top word of v), looks at the next word to correct the estimate, then perform a full multiplication (qhat*v) to catch any inaccuracy in the estimate. In the latter case, "negative" values appear temporarily and carries must be carefully managed, and the recursive division refactoring introduced a case where qhat*v has the same length as v, triggering an out-of-bounds write in the case it happens when computing the top word of the quotient. Fixes #37499 Change-Id: I15089da4a4027beda43af497bf6de261eb792f94 Reviewed-on: https://go-review.googlesource.com/c/go/+/221980 Reviewed-by: Robert Griesemer <gri@golang.org>	2020-04-08 20:53:58 +00:00
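The first two steps described in this commit (divide the top two words of u by the top word of v, then use the next word to correct the estimate) can be sketched with 64-bit words as below. `estimateQhat` is a hypothetical helper in the style of schoolbook long division, not the divBasic code itself; it assumes v is normalized (top bit of `v1` set) and omits the final full multiply-and-subtract step where the out-of-bounds write occurred.

```go
package main

import (
	"fmt"
	"math/bits"
)

// estimateQhat estimates one quotient word. u2, u1, u0 are the top three
// words of the current dividend window and v1, v0 the top two words of the
// divisor, most significant first. Hypothetical helper for illustration.
func estimateQhat(u2, u1, u0, v1, v0 uint64) uint64 {
	if u2 >= v1 {
		// The two-word-by-one-word quotient would not fit in a word,
		// so start from the largest possible word value.
		return ^uint64(0)
	}
	// Long division of the top two words of u by the top word of v.
	qhat, rhat := bits.Div64(u2, u1, v1)
	// Correct the estimate using the next word of v and of u:
	// decrease qhat while qhat*v0 > rhat*2^64 + u0.
	hi, lo := bits.Mul64(qhat, v0)
	for hi > rhat || (hi == rhat && lo > u0) {
		qhat--
		prev := rhat
		rhat += v1
		if rhat < prev {
			break // rhat no longer fits in one word, so the test cannot hold
		}
		hi, lo = bits.Mul64(qhat, v0)
	}
	return qhat
}

func main() {
	// u = 2^128 and v = 2^127, so the quotient word is 2.
	fmt.Println(estimateQhat(1, 0, 0, 1<<63, 0))
}
```

After this correction the estimate can still be one too large, which is why a full multiplication by v follows in the real algorithm.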
Brian Kessler	6b6414cab4	math: correct Atan2(±y,+∞) = ±0 on s390x The s390x assembly implementation was previously only handling this case correctly for x = -Pi. Update the special case handling for any y. Fixes #35446 Change-Id: I355575e9ec8c7ce8bd9db10d74f42a22f39a2f38 Reviewed-on: https://go-review.googlesource.com/c/go/+/223420 Run-TryBot: Brian Kessler <brian.m.kessler@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Munday <mike.munday@ibm.com> Reviewed-by: Robert Griesemer <gri@golang.org>	2020-03-25 04:06:34 +00:00
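The special case named in the commit title can be observed directly. `atan2AtPosInf` is a hypothetical helper; the sketch relies only on the documented `math.Atan2` special cases, under which a finite positive or negative y with x = +Inf gives a zero carrying the sign of y.

```go
package main

import (
	"fmt"
	"math"
)

// atan2AtPosInf evaluates Atan2(y, +Inf) and reports whether the result
// has its sign bit set, which distinguishes -0 from +0.
// Hypothetical helper for illustration.
func atan2AtPosInf(y float64) (value float64, negative bool) {
	value = math.Atan2(y, math.Inf(1))
	return value, math.Signbit(value)
}

func main() {
	fmt.Println(atan2AtPosInf(1.5))
	fmt.Println(atan2AtPosInf(-1.5))
}
```

Comparing the result with `== 0` is not enough here because -0 and +0 compare equal, so the sign bit has to be inspected separately.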
Alberto Donizetti	c5058652fd	math/big: document that Sqrt doesn't set Accuracy Document that the Float.Sqrt method does not set the receiver's Accuracy field. Updates #37915 Change-Id: Ief1dcac07eacc0ef02f86bfac9044501477bca1c Reviewed-on: https://go-review.googlesource.com/c/go/+/224497 Reviewed-by: Robert Griesemer <gri@golang.org>	2020-03-20 19:24:37 +00:00
Brian Kessler	d774d979dd	math/cmplx: disable TanHuge test on s390x s390x has inaccurate range reduction for the assembly routines in math so these tests are disabled until these are corrected. Updates #37854 Change-Id: I1e26acd6d09ae3e592a3dd90aec73a6844f5c6fe Reviewed-on: https://go-review.googlesource.com/c/go/+/223457 Run-TryBot: Brian Kessler <brian.m.kessler@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rob Pike <r@golang.org>	2020-03-14 07:03:15 +00:00
Brian Kessler	70dc28f766	math/cmplx: implement Payne-Hanek range reduction Tan has poles along the real axis. In order to accurately calculate the value near these poles, a range reduction by Pi is performed and the result calculated via a Taylor series. The prior implementation of range reduction used Cody-Waite range reduction in three parts. This fails when x is too large to accurately calculate the partial products in the summation accurately. Above this threshold, Payne-Hanek range reduction using a multiple precision value of 1/Pi is required. Additionally, the threshold used in math/trig_reduce.go for Payne-Hanek range reduction was not set conservatively enough. The prior threshold ensured that catastrophic failure did not occur where the argument x would not actually be reduced below Pi/4. However, errors in reduction begin to occur at values much lower when z = ((x - y*PI4A) - y*PI4B) - y*PI4C is not exact because y*PI4A cannot be exactly represented as a float64. reduceThreshold is lowered to the proper value. Fixes #31566 Change-Id: I0f39a4171a5be44f64305f18dc57f6c29f19dba7 Reviewed-on: https://go-review.googlesource.com/c/go/+/172838 Reviewed-by: Rob Pike <r@golang.org>	2020-03-14 04:12:41 +00:00
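The three-part Cody-Waite subtraction quoted in this commit can be sketched as follows. `reduceCodyWaite` is a hypothetical helper for small non-negative x; the three constants are the Cephes-style split of Pi/4 as recalled here and should be treated as an assumption, not as copied from the Go source tree. The sketch deliberately has no large-argument path, which is the gap Payne-Hanek reduction fills.

```go
package main

import "fmt"

// Three-part split of Pi/4 (assumed values; the first two parts have
// trailing zero bits so that y*pi4A and y*pi4B are exact for small y).
const (
	pi4A = 7.85398125648498535156e-1
	pi4B = 3.77489470793079817668e-8
	pi4C = 2.69515142907905952645e-15
)

// reduceCodyWaite maps a small x >= 0 to an octant index j (mod 8) and a
// remainder z near [-Pi/4, Pi/4]. Hypothetical helper for illustration;
// it loses accuracy once y*pi4A is no longer exactly representable.
func reduceCodyWaite(x float64) (j uint64, z float64) {
	const fourOverPi = 1.2732395447351628 // 4/Pi rounded to float64
	j = uint64(x * fourOverPi)            // integer part of x/(Pi/4)
	y := float64(j)
	if j&1 == 1 { // round odd octants up so z is centred on zero
		j++
		y++
	}
	j &= 7
	// Subtract y*Pi/4 in three pieces to keep extra precision.
	z = ((x - y*pi4A) - y*pi4B) - y*pi4C
	return j, z
}

func main() {
	j, z := reduceCodyWaite(1.0)
	fmt.Println(j, z) // z is approximately 1 - Pi/2
}
```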
Joel Sing	fe70838598	math/big: initial vector arithmetic in riscv64 assembly Provide an assembly implementation of mulWW - for now all others run the Go code. Change-Id: Icb594c31048255f131bdea8d64f56784fc9db4d1 Reviewed-on: https://go-review.googlesource.com/c/go/+/220919 Reviewed-by: Robert Griesemer <gri@golang.org>	2020-02-25 16:47:02 +00:00
Joel Sing	89f249a40d	math: implement Sqrt in assembly for riscv64 Change-Id: I9a5dc33271434e58335f5562a30cc131c6a8332c Reviewed-on: https://go-review.googlesource.com/c/go/+/220918 Reviewed-by: Robert Griesemer <gri@golang.org>	2020-02-25 16:43:26 +00:00
Filippo Valsorda	88ae4ccefb	math/big: reintroduce pre-Go 1.14 mention in GCD docs It was removed in CL 217302 but was intentionally added in CL 217104. Change-Id: I1a478d80ad1ec4f0a0184bfebf8f1a5e352cfe8c Reviewed-on: https://go-review.googlesource.com/c/go/+/217941 Reviewed-by: Robert Griesemer <gri@golang.org>	2020-02-05 20:54:27 +00:00
Filippo Valsorda	cdb7fd6b06	math/big: simplify GCD docs We don't usually document past behavior (like "As of Go 1.14 ...") and in isolation the current docs made it sound like a and b could only be negative or zero. Change-Id: I0d3c2b8579a9c01159ce528a3128b1478e99042a Reviewed-on: https://go-review.googlesource.com/c/go/+/217302 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2020-01-31 23:32:37 +00:00
Robert Griesemer	9bb40ed8ec	math/big: update comment on Int.GCD Per the suggestion https://golang.org/cl/216200/2/doc/go1.14.html#423. Updates #28878. Change-Id: I654d2d114409624219a0041916f0a4030efc7573 Reviewed-on: https://go-review.googlesource.com/c/go/+/217104 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2020-01-30 20:37:01 +00:00
Joel Sing	5a3a5d3525	math, math/big: add support for riscv64 Based on riscv-go port. Updates #27532 Change-Id: Id8ae7d851c393ec3702e4176c363accb0a42587f Reviewed-on: https://go-review.googlesource.com/c/go/+/204633 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2020-01-15 18:49:52 +00:00

570 Commits