mirror/go - go - Git Fam. Sieh

Commit Graph

Author	SHA1	Message	Date
Katie Hockman	84150d0af1	[release-branch.go1.15-security] math/big: fix shift for recursive division The previous s value could cause a crash for certain inputs. Will check in tests and documentation improvements later. Thanks to the Go Ethereum team and the OSS-Fuzz project for reporting this. Thanks to Rémy Oudompheng and Robert Griesemer for their help developing and validating the fix. Fixes CVE-2020-28362 Change-Id: Ibbf455c4436bcdb07c84a34fa6551fb3422356d3 Reviewed-on: https://team-review.git.corp.google.com/c/golang/go-private/+/899974 Reviewed-by: Roland Shoemaker <bracewell@google.com> Reviewed-by: Filippo Valsorda <valsorda@google.com> (cherry picked from commit 28015462c2a83239543dc2bef651e9a5f234b633) Reviewed-on: https://team-review.git.corp.google.com/c/golang/go-private/+/901065	2020-11-11 23:35:42 +00:00
Bryan C. Mills	13617380ca	testing: clean up remaining TempDir issues from CL 231958 Updates #38850 Change-Id: I33f48762f5520eb0c0a841d8ca1ccdd65ecc20c8 Reviewed-on: https://go-review.googlesource.com/c/go/+/234583 Run-TryBot: Bryan C. Mills <bcmills@google.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2020-05-19 19:55:14 +00:00
Filippo Valsorda	c9d5f60eaa	math/big: add (Int).FillBytes Replaced almost every use of Bytes with FillBytes. Note that the approved proposal was for func (Int) FillBytes(buf []byte) while this implements func (*Int) FillBytes(buf []byte) []byte because the latter was far nicer to use in all callsites. Fixes #35833 Change-Id: Ia912df123e5d79b763845312ea3d9a8051343c0a Reviewed-on: https://go-review.googlesource.com/c/go/+/230397 Reviewed-by: Robert Griesemer <gri@golang.org>	2020-05-05 00:36:44 +00:00
Ruixin(Peter) Bao	a7e9e84716	math/big: simplify hasVX checking on s390x Originally, we use an assembly function that returns a boolean result to tell whether the machine has vector facility or not. It is now no longer needed when we can directly use cpu.S390X.HasVX variable. Change-Id: Ic1dae851982532bcfd9a9453416c112347f21d87 Reviewed-on: https://go-review.googlesource.com/c/go/+/230318 Reviewed-by: Michael Munday <mike.munday@ibm.com> Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2020-04-27 20:20:53 +00:00
Ruixin(Peter) Bao	3a37fd4010	math/big: rewrite subVW to use fast path on s390x This CL replaces the original subVW implementation with a implementation that uses a similar idea as CL 164968. When we know the borrow bit is zero, we can copy the rest of words as they will not be updated. Also, since we are copying vector of a words, a faster implementation of copy is written in this CL to copy a word or multiple words at a time. Benchmarks: name old time/op new time/op delta SubVW/1-18 4.43ns ± 0% 3.82ns ± 0% -13.85% (p=0.000 n=20+20) SubVW/2-18 5.39ns ± 0% 4.25ns ± 0% -21.23% (p=0.000 n=20+20) SubVW/3-18 6.29ns ± 0% 4.65ns ± 0% -26.07% (p=0.000 n=16+19) SubVW/4-18 6.08ns ± 2% 4.84ns ± 0% -20.43% (p=0.000 n=20+20) SubVW/5-18 7.06ns ± 1% 4.93ns ± 0% -30.18% (p=0.000 n=20+20) SubVW/10-18 10.3ns ± 2% 7.2ns ± 0% -30.35% (p=0.000 n=20+19) SubVW/100-18 48.0ns ± 4% 17.6ns ± 0% -63.32% (p=0.000 n=18+19) SubVW/1000-18 448ns ±10% 236ns ± 1% -47.24% (p=0.000 n=20+20) SubVW/10000-18 4.83µs ± 5% 2.96µs ± 0% -38.73% (p=0.000 n=20+19) SubVW/100000-18 46.6µs ± 3% 30.6µs ± 1% -34.30% (p=0.000 n=20+20) [Geo mean] 56.3ns 37.0ns -34.24% name old speed new speed delta SubVW/1-18 1.80GB/s ± 0% 2.10GB/s ± 0% +16.16% (p=0.000 n=20+20) SubVW/2-18 2.97GB/s ± 0% 3.77GB/s ± 0% +26.95% (p=0.000 n=20+20) SubVW/3-18 3.82GB/s ± 0% 5.16GB/s ± 0% +35.26% (p=0.000 n=20+19) SubVW/4-18 5.26GB/s ± 1% 6.61GB/s ± 0% +25.59% (p=0.000 n=20+20) SubVW/5-18 5.67GB/s ± 1% 8.11GB/s ± 0% +43.12% (p=0.000 n=20+20) SubVW/10-18 7.79GB/s ± 2% 11.17GB/s ± 0% +43.52% (p=0.000 n=20+19) SubVW/100-18 16.7GB/s ± 4% 45.5GB/s ± 0% +172.61% (p=0.000 n=18+20) SubVW/1000-18 17.9GB/s ± 9% 33.9GB/s ± 1% +89.25% (p=0.000 n=20+20) SubVW/10000-18 16.6GB/s ± 5% 27.0GB/s ± 0% +63.08% (p=0.000 n=20+19) SubVW/100000-18 17.2GB/s ± 2% 26.1GB/s ± 1% +52.18% (p=0.000 n=20+20) [Geo mean] 7.25GB/s 11.03GB/s +52.01% Change-Id: I32e99cbab3260054a96231d02b87049c833ab77e Reviewed-on: https://go-review.googlesource.com/c/go/+/227297 Reviewed-by: Michael Munday <mike.munday@ibm.com> Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2020-04-24 14:50:59 +00:00
Ruixin(Peter) Bao	ee8972cd12	math/big: rewrite addVW to use fast path on s390x Rewrite addVW to use a fast path and remove the original vector and non vector implementation of addVW in assembly. This CL uses a similar idea as CL 164968, where we copy the rest of words when we know carry bit is zero. In addition, since we are copying vector of words, a faster implementation of copy is written in this CL to copy a word or multiple words at a time. Benchmarks: name old time/op new time/op delta AddVW/1-18 4.56ns ± 0% 4.01ns ± 6% -12.14% (p=0.000 n=18+20) AddVW/2-18 5.54ns ± 0% 4.42ns ± 5% -20.20% (p=0.000 n=18+20) AddVW/3-18 6.55ns ± 0% 4.61ns ± 0% -29.62% (p=0.000 n=16+18) AddVW/4-18 6.11ns ± 2% 5.12ns ± 6% -16.19% (p=0.000 n=20+20) AddVW/5-18 7.32ns ± 4% 5.14ns ± 0% -29.77% (p=0.000 n=20+19) AddVW/10-18 10.6ns ± 2% 7.2ns ± 1% -31.47% (p=0.000 n=20+20) AddVW/100-18 49.6ns ± 2% 18.0ns ± 0% -63.63% (p=0.000 n=20+20) AddVW/1000-18 465ns ± 3% 244ns ± 0% -47.54% (p=0.000 n=20+20) AddVW/10000-18 4.99µs ± 4% 2.97µs ± 0% -40.54% (p=0.000 n=20+20) AddVW/100000-18 48.3µs ± 3% 30.8µs ± 1% -36.29% (p=0.000 n=20+20) [Geo mean] 58.1ns 38.0ns -34.57% name old speed new speed delta AddVW/1-18 1.76GB/s ± 0% 2.00GB/s ± 6% +14.04% (p=0.000 n=20+20) AddVW/2-18 2.89GB/s ± 0% 3.63GB/s ± 5% +25.55% (p=0.000 n=18+20) AddVW/3-18 3.66GB/s ± 0% 5.21GB/s ± 0% +42.25% (p=0.000 n=18+19) AddVW/4-18 5.24GB/s ± 2% 6.27GB/s ± 6% +19.61% (p=0.000 n=20+20) AddVW/5-18 5.47GB/s ± 4% 7.78GB/s ± 0% +42.28% (p=0.000 n=20+18) AddVW/10-18 7.55GB/s ± 2% 11.04GB/s ± 1% +46.09% (p=0.000 n=20+20) AddVW/100-18 16.1GB/s ± 2% 44.3GB/s ± 0% +174.77% (p=0.000 n=20+20) AddVW/1000-18 17.2GB/s ± 3% 32.8GB/s ± 1% +90.58% (p=0.000 n=20+20) AddVW/10000-18 16.0GB/s ± 4% 26.9GB/s ± 0% +68.11% (p=0.000 n=20+20) AddVW/100000-18 16.6GB/s ± 3% 26.0GB/s ± 1% +56.94% (p=0.000 n=20+20) [Geo mean] 7.03GB/s 10.75GB/s +52.93% Change-Id: Idbb73f3178311bd2b18a93bdc1e48f26869d2f6a Reviewed-on: https://go-review.googlesource.com/c/go/+/209679 Reviewed-by: Michael Munday <mike.munday@ibm.com> Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2020-04-24 13:26:34 +00:00
Ruixin(Peter) Bao	0329c915a0	math/big: clean up whitespace in arith_s390x.s file This CL looks big but it only does formatting changes to arith_s390x.s. The file was formatted using asmfmt(https://github.com/klauspost/asmfmt) , so there should not be any functional impact. I verified that the generated assembly of big.test file is identical. Change-Id: I8b4035ef082a4d0357881869327e25253f2d8be1 Reviewed-on: https://go-review.googlesource.com/c/go/+/229302 Reviewed-by: Michael Munday <mike.munday@ibm.com> Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2020-04-22 15:40:55 +00:00
Alberto Donizetti	813f8eae27	math/big: remove Direct Sqrt computation The Float.Sqrt method switches (for performance reasons) between direct (uses Quo) and inverse (doesn't) computation, depending on the precision, with threshold 128. Unfortunately the implementation of recursive division in CL 172018 made Quo slightly slower exactly in the range around and below the threshold Sqrt is using, so this strategy is no longer profitable. The new division algorithm allocates more, and this has increased the amount of allocations performed by Sqrt when using the direct method; on low precisions the computation is fast, so additional allocations have an negative impact on performance. Interestingly, only using the inverse method doesn't just reverse the effects of the Quo algorithm change, but it seems to make performances better overall for small precisions: name old time/op new time/op delta FloatSqrt/64-4 643ns ± 1% 635ns ± 1% -1.24% (p=0.000 n=10+10) FloatSqrt/128-4 1.44µs ± 1% 1.02µs ± 1% -29.25% (p=0.000 n=10+10) FloatSqrt/256-4 1.49µs ± 1% 1.49µs ± 1% ~ (p=0.752 n=10+10) FloatSqrt/1000-4 3.71µs ± 1% 3.74µs ± 1% +0.87% (p=0.001 n=10+10) FloatSqrt/10000-4 35.3µs ± 1% 35.6µs ± 1% +0.82% (p=0.002 n=10+9) FloatSqrt/100000-4 844µs ± 1% 844µs ± 0% ~ (p=0.549 n=10+9) FloatSqrt/1000000-4 69.5ms ± 0% 69.6ms ± 0% ~ (p=0.222 n=9+9) name old alloc/op new alloc/op delta FloatSqrt/64-4 280B ± 0% 200B ± 0% -28.57% (p=0.000 n=10+10) FloatSqrt/128-4 504B ± 0% 248B ± 0% -50.79% (p=0.000 n=10+10) FloatSqrt/256-4 344B ± 0% 344B ± 0% ~ (all equal) FloatSqrt/1000-4 1.30kB ± 0% 1.30kB ± 0% ~ (all equal) FloatSqrt/10000-4 13.5kB ± 0% 13.5kB ± 0% ~ (p=0.237 n=10+10) FloatSqrt/100000-4 123kB ± 0% 123kB ± 0% ~ (p=0.247 n=10+10) FloatSqrt/1000000-4 1.83MB ± 1% 1.83MB ± 3% ~ (p=0.779 n=8+10) name old allocs/op new allocs/op delta FloatSqrt/64-4 8.00 ± 0% 5.00 ± 0% -37.50% (p=0.000 n=10+10) FloatSqrt/128-4 11.0 ± 0% 5.0 ± 0% -54.55% (p=0.000 n=10+10) FloatSqrt/256-4 5.00 ± 0% 5.00 ± 0% ~ (all equal) FloatSqrt/1000-4 6.00 ± 0% 6.00 ± 0% ~ (all equal) FloatSqrt/10000-4 6.00 ± 0% 6.00 ± 0% ~ (all equal) FloatSqrt/100000-4 6.00 ± 0% 6.00 ± 0% ~ (all equal) FloatSqrt/1000000-4 10.3 ±13% 10.3 ±13% ~ (p=1.000 n=10+10) For example, 1.02µs for FloatSqrt/128 is actually better than what I was getting on the same machine before the Quo changes. The .8% slowdown on /1000 and /10000 appears to be real and it is quite baffling (that codepath was not touched at all); it may be caused by code alignment changes. Change-Id: Ib03761cdc1055674bc7526d4f3a23d7a25094029 Reviewed-on: https://go-review.googlesource.com/c/go/+/228062 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2020-04-15 16:37:53 +00:00
Brad Fitzpatrick	8f53fad035	math/big: add test that linker is able to remove unused code (Follow-up to CL 228108.) Change-Id: Ia6d119ee19c7aa923cdeead06d3cee87a1751105 Reviewed-on: https://go-review.googlesource.com/c/go/+/228109 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2020-04-15 03:25:21 +00:00
Hanjun Kim	5a447c0ae9	math/big: fix typo in documentation for Int.Exp Fixes #38304 Also change `If m > 0, y < 0, ...` to `If m != 0, y < 0, ...` since `Exp` will return `nil` whatever `m`'s sign is. Change-Id: I17d7337ccd1404318cea5d42a8de904ad185fd00 GitHub-Last-Rev: `2399510300` GitHub-Pull-Request: golang/go#38390 Reviewed-on: https://go-review.googlesource.com/c/go/+/228000 Reviewed-by: Robert Griesemer <gri@golang.org>	2020-04-15 00:32:18 +00:00
Brad Fitzpatrick	a55645fa34	math/big: don't use Float in init to help linker discard 162 KiB Removes 162 KiB from binaries that don't use math/big.Float: -rwxr-xr-x 1 bradfitz bradfitz 1916590 Apr 14 12:21 x.after -rwxr-xr-x 1 bradfitz bradfitz 2082575 Apr 14 12:21 x.before No change in deps (this package already used sync). No change in benchmarks: name old time/op new time/op delta FloatSqrt/64-8 1.06µs ±10% 1.03µs ± 6% ~ (p=0.133 n=10+9) FloatSqrt/128-8 2.26µs ± 9% 2.28µs ± 9% ~ (p=0.460 n=10+8) FloatSqrt/256-8 2.29µs ± 5% 2.31µs ± 3% ~ (p=0.214 n=9+9) FloatSqrt/1000-8 5.82µs ± 3% 5.87µs ± 7% ~ (p=0.666 n=9+9) FloatSqrt/10000-8 56.4µs ± 5% 57.0µs ± 6% ~ (p=0.436 n=10+10) FloatSqrt/100000-8 1.34ms ± 8% 1.31ms ± 3% ~ (p=0.447 n=10+9) FloatSqrt/1000000-8 106ms ± 5% 107ms ± 7% ~ (p=0.315 n=10+10) name old alloc/op new alloc/op delta FloatSqrt/64-8 280B ± 0% 280B ± 0% ~ (all equal) FloatSqrt/128-8 504B ± 0% 504B ± 0% ~ (all equal) FloatSqrt/256-8 344B ± 0% 344B ± 0% ~ (all equal) FloatSqrt/1000-8 1.30kB ± 0% 1.30kB ± 0% ~ (all equal) FloatSqrt/10000-8 13.5kB ± 0% 13.5kB ± 0% ~ (p=0.403 n=10+10) FloatSqrt/100000-8 123kB ± 0% 123kB ± 0% ~ (p=0.393 n=10+10) FloatSqrt/1000000-8 1.84MB ± 7% 1.84MB ± 5% ~ (p=0.739 n=10+10) name old allocs/op new allocs/op delta FloatSqrt/64-8 8.00 ± 0% 8.00 ± 0% ~ (all equal) FloatSqrt/128-8 11.0 ± 0% 11.0 ± 0% ~ (all equal) FloatSqrt/256-8 5.00 ± 0% 5.00 ± 0% ~ (all equal) FloatSqrt/1000-8 6.00 ± 0% 6.00 ± 0% ~ (all equal) FloatSqrt/10000-8 6.00 ± 0% 6.00 ± 0% ~ (all equal) FloatSqrt/100000-8 6.00 ± 0% 6.00 ± 0% ~ (all equal) FloatSqrt/1000000-8 10.9 ±10% 10.8 ±17% ~ (p=0.974 n=10+10) Change-Id: I3337f1f531bf7b4fae192b9d90cd24ff2be14fea Reviewed-on: https://go-review.googlesource.com/c/go/+/228108 Reviewed-by: Robert Griesemer <gri@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2020-04-14 20:50:19 +00:00
Rémy Oudompheng	ac1fd419b6	math/big: correct off-by-one access in divBasic The divBasic function computes the quotient of big nats u/v word by word. It estimates each word qhat by performing a long division (top 2 words of u divided by top word of v), looks at the next word to correct the estimate, then perform a full multiplication (qhatv) to catch any inaccuracy in the estimate. In the latter case, "negative" values appear temporarily and carries must be carefully managed, and the recursive division refactoring introduced a case where qhatv has the same length as v, triggering an out-of-bounds write in the case it happens when computing the top word of the quotient. Fixes #37499 Change-Id: I15089da4a4027beda43af497bf6de261eb792f94 Reviewed-on: https://go-review.googlesource.com/c/go/+/221980 Reviewed-by: Robert Griesemer <gri@golang.org>	2020-04-08 20:53:58 +00:00
Alberto Donizetti	c5058652fd	math/big: document that Sqrt doesn't set Accuracy Document that the Float.Sqrt method does not set the receiver's Accuracy field. Updates #37915 Change-Id: Ief1dcac07eacc0ef02f86bfac9044501477bca1c Reviewed-on: https://go-review.googlesource.com/c/go/+/224497 Reviewed-by: Robert Griesemer <gri@golang.org>	2020-03-20 19:24:37 +00:00
Joel Sing	fe70838598	math/big: initial vector arithmetic in riscv64 assembly Provide an assembly implementation of mulWW - for now all others run the Go code. Change-Id: Icb594c31048255f131bdea8d64f56784fc9db4d1 Reviewed-on: https://go-review.googlesource.com/c/go/+/220919 Reviewed-by: Robert Griesemer <gri@golang.org>	2020-02-25 16:47:02 +00:00
Filippo Valsorda	88ae4ccefb	math/big: reintroduce pre-Go 1.14 mention in GCD docs It was removed in CL 217302 but was intentionally added in CL 217104. Change-Id: I1a478d80ad1ec4f0a0184bfebf8f1a5e352cfe8c Reviewed-on: https://go-review.googlesource.com/c/go/+/217941 Reviewed-by: Robert Griesemer <gri@golang.org>	2020-02-05 20:54:27 +00:00
Filippo Valsorda	cdb7fd6b06	math/big: simplify GCD docs We don't usually document past behavior (like "As of Go 1.14 ...") and in isolation the current docs made it sound like a and b could only be negative or zero. Change-Id: I0d3c2b8579a9c01159ce528a3128b1478e99042a Reviewed-on: https://go-review.googlesource.com/c/go/+/217302 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2020-01-31 23:32:37 +00:00
Robert Griesemer	9bb40ed8ec	math/big: update comment on Int.GCD Per the suggestion https://golang.org/cl/216200/2/doc/go1.14.html#423. Updates #28878. Change-Id: I654d2d114409624219a0041916f0a4030efc7573 Reviewed-on: https://go-review.googlesource.com/c/go/+/217104 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2020-01-30 20:37:01 +00:00
Joel Sing	5a3a5d3525	math, math/big: add support for riscv64 Based on riscv-go port. Updates #27532 Change-Id: Id8ae7d851c393ec3702e4176c363accb0a42587f Reviewed-on: https://go-review.googlesource.com/c/go/+/204633 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2020-01-15 18:49:52 +00:00
Ville Skyttä	440f7d6404	all: fix a bunch of misspellings Change-Id: I5b909df0fd048cd66c5a27fca1b06466d3bcaac7 GitHub-Last-Rev: `778c5d2131` GitHub-Pull-Request: golang/go#35624 Reviewed-on: https://go-review.googlesource.com/c/go/+/207421 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2019-11-15 21:04:43 +00:00
Rémy Oudompheng	7ad27481f8	math/big: fix out-of-bounds panic in divRecursive The bounds in the last carry branch were wrong as there is no reason for len(u) >= n+n/2 to always hold true. We also adjust test to avoid using a remainder of 1 (in which case, the last step of the algorithm computes (qhatv+1) - qhatv which rarely produces a carry). Change-Id: I69fbab9c5e19d0db1c087fbfcd5b89352c2d26fb Reviewed-on: https://go-review.googlesource.com/c/go/+/206839 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2019-11-13 19:15:27 +00:00
Robert Griesemer	1fe33e3cb2	math/big: ensure correct test input There is a (theoretical, but possible) chance that the random number values a, b used for TestDiv are 0 or 1, in which case the test would fail. This CL makes sure that a >= 1 and b >= 2 at all times. Fixes #35523. Change-Id: I6451feb94241249516a821cd0066e95a0c65b0ed Reviewed-on: https://go-review.googlesource.com/c/go/+/206818 Run-TryBot: Robert Griesemer <gri@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2019-11-12 18:52:52 +00:00
Rémy Oudompheng	194ae3236d	math/big: implement recursive algorithm for division The current division algorithm produces one word of result at a time, using 2-word division to compute the top word and mulAddVWW to compute the remainder. The top word may need to be adjusted by 1 or 2 units. The recursive version, based on Burnikel, Ziegler, "Fast Recursive Division", uses the same principles, but in a multi-word setting, so that multiplication benefits from the Karatsuba algorithm (and possibly later improvements). benchmark old ns/op new ns/op delta BenchmarkDiv/20/10-4 38.2 38.3 +0.26% BenchmarkDiv/40/20-4 38.7 38.5 -0.52% BenchmarkDiv/100/50-4 62.5 62.6 +0.16% BenchmarkDiv/200/100-4 238 259 +8.82% BenchmarkDiv/400/200-4 311 338 +8.68% BenchmarkDiv/1000/500-4 604 649 +7.45% BenchmarkDiv/2000/1000-4 1214 1278 +5.27% BenchmarkDiv/20000/10000-4 38279 36510 -4.62% BenchmarkDiv/200000/100000-4 3022057 1359615 -55.01% BenchmarkDiv/2000000/1000000-4 310827664 54012939 -82.62% BenchmarkDiv/20000000/10000000-4 33272829421 1965401359 -94.09% BenchmarkString/10/Base10-4 158 156 -1.27% BenchmarkString/100/Base10-4 797 792 -0.63% BenchmarkString/1000/Base10-4 3677 3814 +3.73% BenchmarkString/10000/Base10-4 16633 17116 +2.90% BenchmarkString/100000/Base10-4 5779029 1793808 -68.96% BenchmarkString/1000000/Base10-4 889840820 85524031 -90.39% BenchmarkString/10000000/Base10-4 134338236860 4935657026 -96.33% Fixes #21960 Updates #30943 Change-Id: I134c6f81a47870c688ca95b6081eb9211def15a2 Reviewed-on: https://go-review.googlesource.com/c/go/+/172018 Reviewed-by: Robert Griesemer <gri@golang.org> Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2019-11-12 05:18:25 +00:00
Brian Kessler	f5949b6067	math/big: allow all values for GCD Allow the inputs a and b to be zero or negative to GCD with the following definitions. If x or y are not nil, GCD sets their value such that z = ax + by. Regardless of the signs of a and b, z is always >= 0. If a == b == 0, GCD sets z = x = y = 0. If a == 0 and b != 0, GCD sets z = |b|, x = 0, y = sign(b) * 1. If a != 0 and b == 0, GCD sets z = |a|, x = sign(a) * 1, y = 0. Fixes #28878 Change-Id: Ia83fce66912a96545c95cd8df0549bfd852652f3 Reviewed-on: https://go-review.googlesource.com/c/go/+/164972 Run-TryBot: Brian Kessler <brian.m.kessler@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2019-11-07 06:58:44 +00:00
Rémy Oudompheng	8f30d25168	math/big: use nat pool to reduce allocations in mul and sqr This notably allows to reuse temporaries across the karatsubaSqr recursion. benchmark old ns/op new ns/op delta BenchmarkNatMul/10-4 227 228 +0.44% BenchmarkNatMul/100-4 8339 8589 +3.00% BenchmarkNatMul/1000-4 313796 312272 -0.49% BenchmarkNatMul/10000-4 11924720 11873589 -0.43% BenchmarkNatMul/100000-4 503813354 503839058 +0.01% BenchmarkNatSqr/20-4 549 513 -6.56% BenchmarkNatSqr/30-4 945 874 -7.51% BenchmarkNatSqr/50-4 1993 1832 -8.08% BenchmarkNatSqr/80-4 4096 3874 -5.42% BenchmarkNatSqr/100-4 6192 5712 -7.75% BenchmarkNatSqr/200-4 20388 19543 -4.14% BenchmarkNatSqr/300-4 38735 36715 -5.21% BenchmarkNatSqr/500-4 99562 93542 -6.05% BenchmarkNatSqr/800-4 195554 184907 -5.44% BenchmarkNatSqr/1000-4 286302 275053 -3.93% BenchmarkNatSqr/10000-4 9817057 9441641 -3.82% BenchmarkNatSqr/100000-4 390713416 379696789 -2.82% benchmark old allocs new allocs delta BenchmarkNatMul/10-4 1 1 +0.00% BenchmarkNatMul/100-4 1 1 +0.00% BenchmarkNatMul/1000-4 2 1 -50.00% BenchmarkNatMul/10000-4 2 1 -50.00% BenchmarkNatMul/100000-4 9 11 +22.22% BenchmarkNatSqr/20-4 2 1 -50.00% BenchmarkNatSqr/30-4 2 1 -50.00% BenchmarkNatSqr/50-4 2 1 -50.00% BenchmarkNatSqr/80-4 2 1 -50.00% BenchmarkNatSqr/100-4 2 1 -50.00% BenchmarkNatSqr/200-4 2 1 -50.00% BenchmarkNatSqr/300-4 4 1 -75.00% BenchmarkNatSqr/500-4 4 1 -75.00% BenchmarkNatSqr/800-4 10 1 -90.00% BenchmarkNatSqr/1000-4 10 1 -90.00% BenchmarkNatSqr/10000-4 731 1 -99.86% BenchmarkNatSqr/100000-4 19687 6 -99.97% benchmark old bytes new bytes delta BenchmarkNatMul/10-4 192 192 +0.00% BenchmarkNatMul/100-4 4864 4864 +0.00% BenchmarkNatMul/1000-4 57344 49224 -14.16% BenchmarkNatMul/10000-4 565248 498772 -11.76% BenchmarkNatMul/100000-4 5749504 7263720 +26.34% BenchmarkNatSqr/20-4 672 352 -47.62% BenchmarkNatSqr/30-4 992 512 -48.39% BenchmarkNatSqr/50-4 1792 896 -50.00% BenchmarkNatSqr/80-4 2688 1408 -47.62% BenchmarkNatSqr/100-4 3584 1792 -50.00% BenchmarkNatSqr/200-4 6656 3456 -48.08% BenchmarkNatSqr/300-4 24448 16387 -32.97% BenchmarkNatSqr/500-4 36864 24591 -33.29% BenchmarkNatSqr/800-4 69760 40981 -41.25% BenchmarkNatSqr/1000-4 86016 49180 -42.82% BenchmarkNatSqr/10000-4 2524800 487368 -80.70% BenchmarkNatSqr/100000-4 68599808 5876581 -91.43% Change-Id: I8e6e409ae1cb48be9d5aa9b5f428d6cbe487673a Reviewed-on: https://go-review.googlesource.com/c/go/+/172017 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2019-10-25 03:14:39 +00:00
Robert Griesemer	a52c0a1992	math/big: make Rat.Denom side-effect free A Rat is represented via a quotient a/b where a and b are Int values. To make it possible to use an uninitialized Rat value (with a and b uninitialized and thus == 0), the implementation treats a 0 denominator as 1. Rat.Num and Rat.Denom return pointers to these values a and b. Because b may be 0, Rat.Denom used to first initialize it to 1 and thus produce an undesirable side-effect (by changing the Rat's denominator). This CL changes Denom to return a new (not shared) *Int with value 1 in the rare case where the Rat was not initialized. This eliminates the side effect and returns the correct denominator value. While this is changing behavior of the API, the impact should now be minor because together with (prior) CL https://golang.org/cl/202997, which initializes Rats ASAP, Denom is unlikely used to access the denominator of an uninitialized (and thus 0) Rat. Any operation that will somehow set a Rat value will ensure that the denominator is not 0. Fixes #33792. Updates #3521. Change-Id: I0bf15ac60513cf52162bfb62440817ba36f0c3fc Reviewed-on: https://go-review.googlesource.com/c/go/+/203059 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2019-10-24 03:34:24 +00:00
Robert Griesemer	4412181e7c	math/big: normalize unitialized denominators ASAP A Rat is represented via a quotient a/b where a and b are Int values. To make it possible to use an uninitialized Rat value (with a and b uninitialized and thus == 0), the implementation treats a 0 denominator as 1. For each operation we check if the denominator is 0, and then treat it as 1 (if necessary). Operations that create a new Rat result, normalize that value such that a result denominator 1 is represened as 0 again. This CL changes this behavior slightly: 0 denominators are still interpreted as 1, but whenever we (safely) can, we set an uninitialized 0 denominator to 1. This simplifies the code overall. Also: Improved some doc strings. Preparation for addressing issue #33792. Updates #33792. Change-Id: I3040587c8d0dad2e840022f96ca027d8470878a0 Reviewed-on: https://go-review.googlesource.com/c/go/+/202997 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2019-10-24 03:34:11 +00:00
Robert Griesemer	898f9db81f	math/big: make Rat accessors safe for concurrent use Do not modify the underlying Rat denominator when calling one of the accessors Float32, Float64; verify that we don't modify the Rat denominator when calling Inv, Sign, IsInt, Num. Fixes #34919. Reopens #33792. Change-Id: Ife6d1252373f493a597398ee51e7b5695b708df5 Reviewed-on: https://go-review.googlesource.com/c/go/+/201205 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2019-10-15 23:31:58 +00:00
Brad Fitzpatrick	07b4abd62e	all: remove the nacl port (part 2, amd64p32 + toolchain) This is part two if the nacl removal. Part 1 was CL 199499. This CL removes amd64p32 support, which might be useful in the future if we implement the x32 ABI. It also removes the nacl bits in the toolchain, and some remaining nacl bits. Updates #30439 Change-Id: I2475d5bb066d1b474e00e40d95b520e7c2e286e1 Reviewed-on: https://go-review.googlesource.com/c/go/+/200077 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2019-10-09 22:34:34 +00:00
Robert Griesemer	770fac4586	math/big: avoid MinExp exponent wrap-around in 'x' Text format Fixes #34343. Change-Id: I74240c8f431f6596338633a86a7a5ee1fce70a65 Reviewed-on: https://go-review.googlesource.com/c/go/+/196057 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2019-09-18 04:48:34 +00:00
Ainar Garipov	0efbd10157	all: fix typos Use the following (suboptimal) script to obtain a list of possible typos: #!/usr/bin/env sh set -x git ls-files |\ grep -e '\.\(c\|cc\|go\)$' |\ xargs -n 1\ awk\ '/\/\// { gsub(/.*\/\//, ""); print; } /\/\*/, /\*\// { gsub(/.*\/\*/, ""); gsub(/\*\/.*/, ""); }' |\ hunspell -d en_US -l |\ grep '^[[:upper:]]\{0,1\}[[:lower:]]\{1,\}$' |\ grep -v -e '^.\{1,4\}$' -e '^.\{16,\}$' |\ sort -f |\ uniq -c |\ awk '$1 == 1 { print $2; }' Then, go through the results manually and fix the most obvious typos in the non-vendored code. Change-Id: I3cb5830a176850e1a0584b8a40b47bde7b260eae Reviewed-on: https://go-review.googlesource.com/c/go/+/193848 Reviewed-by: Robert Griesemer <gri@golang.org>	2019-09-08 17:28:20 +00:00
peter zhang	d5fe73393c	math/big: fix a duplicate "the" in a comment Change-Id: Ib637381ab8a12aeb798576b781e1b3c458ba812d GitHub-Last-Rev: `12994496b6` GitHub-Pull-Request: golang/go#34017 Reviewed-on: https://go-review.googlesource.com/c/go/+/192877 Reviewed-by: Daniel Martí <mvdan@mvdan.cc>	2019-09-02 11:42:47 +00:00
Eric Lagergren	9dfa4cb026	math/big: document that Rat.Denom might modify the receiver Fixes #33792 Change-Id: I306a95883c3db2d674d3294a6feb50adc50ee5d6 Reviewed-on: https://go-review.googlesource.com/c/go/+/192017 Reviewed-by: Robert Griesemer <gri@golang.org>	2019-08-28 16:20:00 +00:00
Illya Yalovyy	be452cea42	math/big: fast path for Cmp if same math/big.Int Cmp method does not have a fast path for the case if x and y are the same. Fixes #30856 Change-Id: Ia9a5b5f72db9d73af1b13ed6ac39ecff87d10393 Reviewed-on: https://go-review.googlesource.com/c/go/+/178957 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2019-08-28 15:08:00 +00:00
Russ Cox	06b0babf31	all: shorten some tests Shorten some of the longest tests that run during all.bash. Removes 7r 50u 21s from all.bash. After this change, all.bash is under 5 minutes again on my laptop. For #26473. Change-Id: Ie0460aa935808d65460408feaed210fbaa1d5d79 Reviewed-on: https://go-review.googlesource.com/c/go/+/177559 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2019-05-22 12:54:00 +00:00
JT Olio	5a2da5624a	math/big: stack allocate scaleDenom return value benchmark old ns/op new ns/op delta BenchmarkRatCmp-4 154 77.9 -49.42% Change-Id: I932710ad8b6905879e232168b1777927f86ba22a Reviewed-on: https://go-review.googlesource.com/c/go/+/175460 Run-TryBot: Robert Griesemer <gri@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2019-05-08 17:11:57 +00:00
erifan01	503e6ccd74	math/big: fix the bug in assembly implementation of shlVU on arm64 For the case where the addresses of parameter z and x of the function shlVU overlap and the address of z is greater than x, x (input value) can be polluted during the calculation when the high words of x are overlapped with the low words of z (output value). Fixes #31084 Change-Id: I9bb0266a1d7856b8faa9a9b1975d6f57dece0479 Reviewed-on: https://go-review.googlesource.com/c/go/+/169780 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2019-05-08 01:29:00 +00:00
Brian Kessler	689ee112df	math/big: document Int.String Int.String had no documentation and the documentation for Int.Text did not mention the handling of the nil pointer case. Change-Id: I9f21921e431c948545b7cabc7829e4b4e574bbe9 Reviewed-on: https://go-review.googlesource.com/c/go/+/175118 Reviewed-by: Robert Griesemer <gri@golang.org>	2019-05-03 03:25:26 +00:00
erifan01	d17d41e58d	math/big: optimize mulAddVWW on arm64 for better performance Unroll the cycle 4 times to reduce load overhead. Benchmarks: name old time/op new time/op delta MulAddVWW/1-8 15.9ns ± 0% 11.9ns ± 0% -24.92% (p=0.000 n=8+8) MulAddVWW/2-8 16.1ns ± 0% 13.9ns ± 1% -13.82% (p=0.000 n=8+8) MulAddVWW/3-8 18.9ns ± 0% 17.3ns ± 0% -8.47% (p=0.000 n=8+8) MulAddVWW/4-8 21.7ns ± 0% 19.5ns ± 0% -10.14% (p=0.000 n=8+8) MulAddVWW/5-8 25.1ns ± 0% 22.5ns ± 0% -10.27% (p=0.000 n=8+8) MulAddVWW/10-8 41.6ns ± 0% 40.0ns ± 0% -3.79% (p=0.000 n=8+8) MulAddVWW/100-8 368ns ± 0% 363ns ± 0% -1.36% (p=0.000 n=8+8) MulAddVWW/1000-8 3.52µs ± 0% 3.52µs ± 0% -0.14% (p=0.000 n=8+8) MulAddVWW/10000-8 35.1µs ± 0% 35.1µs ± 0% -0.01% (p=0.000 n=7+6) MulAddVWW/100000-8 351µs ± 0% 351µs ± 0% +0.15% (p=0.038 n=8+8) Change-Id: I052a4db286ac6e4f3293289c7e9a82027da0405e Reviewed-on: https://go-review.googlesource.com/c/go/+/155780 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2019-04-22 14:45:16 +00:00
Robert Griesemer	f8c6f986fd	math/big: don't clobber shared underlying array in pow5 computation Rearranged code slightly to make lifetime of underlying array of pow5 more explicit in code. Fixes #31184. Change-Id: I063081f0e54097c499988d268a23813746592654 Reviewed-on: https://go-review.googlesource.com/c/go/+/170641 Reviewed-by: Filippo Valsorda <filippo@golang.org>	2019-04-15 17:48:21 +00:00
Neven Sajko	7756a72b35	all: change the old assembly style AX:CX to CX, AX Assembly files with "/vendor/" or "testdata" in their paths were ignored. Change-Id: I3882ff07eb4426abb9f8ee96f82dff73c81cd61f GitHub-Last-Rev: `51ae8c324d` GitHub-Pull-Request: golang/go#31166 Reviewed-on: https://go-review.googlesource.com/c/go/+/170197 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2019-04-09 00:22:03 +00:00
Filippo Valsorda	ead895688d	math/big: do not panic in Exp when y < 0 and x doesn't have an inverse If x does not have an inverse modulo m, and a negative exponent is used, return nil just like ModInverse does now. Change-Id: I8fa72f7a851e8cf77c5fab529ede88408740626f Reviewed-on: https://go-review.googlesource.com/c/go/+/170757 Run-TryBot: Filippo Valsorda <filippo@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2019-04-04 23:02:09 +00:00
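The behaviour described in the commit above can be sketched as follows. This is a minimal illustration, assuming Go 1.13 or later; the helper name `modExp` is hypothetical and not part of math/big. `(*big.Int).Exp` returns nil when the exponent is negative and x has no inverse modulo m, so callers should check the result before using it.

```go
package main

import (
	"fmt"
	"math/big"
)

// modExp is a hypothetical helper: it computes x**y mod m with math/big
// and reports ok=false when Exp returns nil, which happens when y < 0
// and x has no inverse modulo m.
func modExp(x, y, m int64) (result int64, ok bool) {
	z := new(big.Int).Exp(big.NewInt(x), big.NewInt(y), big.NewInt(m))
	if z == nil {
		return 0, false
	}
	return z.Int64(), true
}

func main() {
	// 3 is invertible mod 7 (3*5 = 15 = 1 mod 7), so 3**-1 mod 7 is defined.
	fmt.Println(modExp(3, -1, 7))
	// gcd(2, 4) = 2, so 2 has no inverse mod 4 and Exp returns nil.
	fmt.Println(modExp(2, -1, 4))
}
```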
Neven Sajko	964fe4b80f	math/big: simplify shlVU_g and shrVU_g Rewrote a few lines to be more idiomatic/less assembly-ish. Benchmarked with `go test -bench Float -tags math_big_pure_go`: name old time/op new time/op delta FloatString/100-8 751ns ± 0% 746ns ± 1% -0.71% (p=0.000 n=10+10) FloatString/1000-8 22.9µs ± 0% 22.9µs ± 0% ~ (p=0.271 n=10+10) FloatString/10000-8 1.89ms ± 0% 1.89ms ± 0% ~ (p=0.481 n=10+10) FloatString/100000-8 184ms ± 0% 184ms ± 0% ~ (p=0.094 n=9+9) FloatAdd/10-8 56.4ns ± 1% 56.5ns ± 0% ~ (p=0.170 n=9+9) FloatAdd/100-8 59.7ns ± 0% 59.3ns ± 0% -0.70% (p=0.000 n=8+9) FloatAdd/1000-8 101ns ± 0% 99ns ± 0% -1.89% (p=0.000 n=8+8) FloatAdd/10000-8 553ns ± 0% 536ns ± 0% -3.00% (p=0.000 n=9+10) FloatAdd/100000-8 4.94µs ± 0% 4.74µs ± 0% -3.94% (p=0.000 n=9+10) FloatSub/10-8 50.3ns ± 0% 50.5ns ± 0% +0.52% (p=0.000 n=8+8) FloatSub/100-8 52.0ns ± 0% 52.2ns ± 1% +0.46% (p=0.012 n=8+10) FloatSub/1000-8 77.9ns ± 0% 77.3ns ± 0% -0.80% (p=0.000 n=7+8) FloatSub/10000-8 371ns ± 0% 362ns ± 0% -2.67% (p=0.000 n=10+10) FloatSub/100000-8 3.20µs ± 0% 3.10µs ± 0% -3.16% (p=0.000 n=10+10) ParseFloatSmallExp-8 7.84µs ± 0% 7.82µs ± 0% -0.17% (p=0.037 n=9+9) ParseFloatLargeExp-8 29.3µs ± 1% 29.5µs ± 0% ~ (p=0.059 n=9+8) FloatSqrt/64-8 516ns ± 0% 519ns ± 0% +0.54% (p=0.000 n=9+9) FloatSqrt/128-8 1.07µs ± 0% 1.07µs ± 0% ~ (p=0.109 n=8+9) FloatSqrt/256-8 1.23µs ± 0% 1.23µs ± 0% +0.50% (p=0.000 n=9+9) FloatSqrt/1000-8 3.43µs ± 0% 3.44µs ± 0% +0.53% (p=0.000 n=9+8) FloatSqrt/10000-8 40.9µs ± 0% 40.7µs ± 0% -0.39% (p=0.000 n=9+8) FloatSqrt/100000-8 1.07ms ± 0% 1.07ms ± 0% -0.10% (p=0.017 n=10+9) FloatSqrt/1000000-8 89.3ms ± 0% 89.2ms ± 0% -0.07% (p=0.015 n=9+8) Change-Id: Ibf07c6142719d11bc7f329246957d87a9f3ba3d2 GitHub-Last-Rev: `870a041ab7` GitHub-Pull-Request: golang/go#31220 Reviewed-on: https://go-review.googlesource.com/c/go/+/170449 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2019-04-04 00:26:24 +00:00
Robert Griesemer	4dce6dbb1e	math/big: temporarily disable buggy shlVU assembly for arm64 This addresses the failures we have seen in #31084. The correct fix is to find the actual bug in the assembly code. Updates #31084. Change-Id: I437780c53d0c4423d742e2e3b650b899ce845372 Reviewed-on: https://go-review.googlesource.com/c/go/+/169721 Run-TryBot: Robert Griesemer <gri@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2019-03-27 23:40:12 +00:00
Brian Kessler	5ee2290420	math/big: implement Rat.SetUint64 Implemented via the underlying Int.SetUint64. Added tests for Rat.SetInt64 and Rat.SetUint64. Fixes #29579 Change-Id: I03faaffc93e36873b202b58ae72b139dea5c40f9 Reviewed-on: https://go-review.googlesource.com/c/go/+/160682 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2019-03-27 15:20:28 +00:00
Robert Griesemer	e4ba40030f	math/big: accept non-decimal floats with Rat.SetString This fixes an old oversight. Rat.SetString already permitted fractions a/b where both a and b could independently specify a base prefix. With this CL, it now also accepts non-decimal floating-point numbers. Fixes #29799. Change-Id: I9cc65666a5cebb00f0202da2e4fc5654a02e3234 Reviewed-on: https://go-review.googlesource.com/c/go/+/168237 Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>	2019-03-25 22:29:26 +00:00
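A short sketch of what the commit above enables, assuming Go 1.13 or later: `Rat.SetString` accepts a hexadecimal floating-point string with a `p` (power of two) exponent. The helper name `parseRat` is hypothetical and only wraps the real `SetString` call.

```go
package main

import (
	"fmt"
	"math/big"
)

// parseRat is a hypothetical helper: it parses s with Rat.SetString and
// returns the value in "a/b" form, or ok=false if s is not accepted.
func parseRat(s string) (string, bool) {
	r, ok := new(big.Rat).SetString(s)
	if !ok {
		return "", false
	}
	return r.String(), true
}

func main() {
	// Hexadecimal mantissa 1 with binary exponent -2, i.e. 1 * 2**-2.
	fmt.Println(parseRat("0x1p-2"))
	// Decimal floats were already accepted before this change.
	fmt.Println(parseRat("1e-2"))
}
```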
Robert Griesemer	cfa93ba51f	math/big: add support for underscores '_' in numbers The primary change is in nat.scan which now accepts underscores for base 0. While at it, streamlined error handling in that function as well. Also, improved the corresponding test significantly by checking the expected result values also in case of scan errors. The second major change is in scanExponent which now accepts underscores when the new sepOk argument is set. While at it, essentially rewrote that function to match error and underscore handling of nat.scan more closely. Added a new test for scanExponent which until now was only tested indirectly. Finally, updated the documentation for several functions and added many new test cases to clients of nat.scan. A major portion of this CL is due to much better test coverage. Updates #28493. Change-Id: I7f17b361b633fbe6c798619d891bd5e0a045b5c5 Reviewed-on: https://go-review.googlesource.com/c/go/+/166157 Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>	2019-03-12 22:58:58 +00:00
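As a rough illustration of the underscore support described above (assuming Go 1.13 or later), `Int.SetString` accepts `_` separators only when the base argument is 0. The helper name `parseInt` is hypothetical.

```go
package main

import (
	"fmt"
	"math/big"
)

// parseInt is a hypothetical helper around Int.SetString. It returns
// ok=false when the string is not a valid number in the given base.
func parseInt(s string, base int) (int64, bool) {
	z, ok := new(big.Int).SetString(s, base)
	if !ok {
		return 0, false
	}
	return z.Int64(), true
}

func main() {
	// Base 0: the base is taken from the prefix and underscores are allowed.
	fmt.Println(parseInt("1_000_000", 0))
	// Explicit base 10: underscores are not accepted.
	fmt.Println(parseInt("1_000", 10))
}
```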
Brian Kessler	ef891e1c83	math/big: implement Int.TrailingZeroBits Implemented via the underlying nat.trailingZeroBits. Fixes #29578 Change-Id: If9876c5a74b107cbabceb7547bef4e44501f6745 Reviewed-on: https://go-review.googlesource.com/c/go/+/160681 Reviewed-by: Robert Griesemer <gri@golang.org> Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2019-03-12 13:18:27 +00:00
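A minimal usage sketch for the method added above, assuming Go 1.13 or later. The wrapper `trailingZeros` is a hypothetical name; it only forwards to `Int.TrailingZeroBits`, which returns the number of consecutive least-significant zero bits of |x|.

```go
package main

import (
	"fmt"
	"math/big"
)

// trailingZeros is a hypothetical wrapper that reports the number of
// trailing zero bits of v using Int.TrailingZeroBits.
func trailingZeros(v int64) uint {
	return big.NewInt(v).TrailingZeroBits()
}

func main() {
	// 40 is 101000 in binary, so the three lowest bits are zero.
	fmt.Println(trailingZeros(40))
	// An odd number has no trailing zero bits.
	fmt.Println(trailingZeros(1))
}
```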
Josh Bleecher Snyder	4d10aba35e	math/big: add fast path for amd64 addVW for large z This matches the pure Go fast path added in the previous commit. I will leave other architectures to those with ready access to hardware. name old time/op new time/op delta AddVW/1-8 3.60ns ± 3% 3.59ns ± 1% ~ (p=0.147 n=91+86) AddVW/2-8 3.92ns ± 1% 3.91ns ± 2% -0.36% (p=0.000 n=86+92) AddVW/3-8 4.33ns ± 5% 4.46ns ± 5% +2.94% (p=0.000 n=96+97) AddVW/4-8 4.76ns ± 5% 4.82ns ± 5% +1.28% (p=0.000 n=95+92) AddVW/5-8 5.40ns ± 1% 5.42ns ± 0% +0.47% (p=0.000 n=76+71) AddVW/10-8 8.03ns ± 1% 7.80ns ± 5% -2.90% (p=0.000 n=73+96) AddVW/100-8 43.8ns ± 5% 17.9ns ± 1% -59.12% (p=0.000 n=94+81) AddVW/1000-8 428ns ± 4% 85ns ± 6% -80.20% (p=0.000 n=96+99) AddVW/10000-8 4.22µs ± 2% 1.80µs ± 3% -57.32% (p=0.000 n=69+92) AddVW/100000-8 44.8µs ± 8% 31.5µs ± 3% -29.76% (p=0.000 n=99+90) name old time/op new time/op delta SubVW/1-8 3.53ns ± 2% 3.63ns ± 5% +2.97% (p=0.000 n=94+93) SubVW/2-8 4.33ns ± 5% 4.01ns ± 2% -7.36% (p=0.000 n=90+85) SubVW/3-8 4.32ns ± 2% 4.32ns ± 5% ~ (p=0.084 n=87+97) SubVW/4-8 4.70ns ± 2% 4.83ns ± 6% +2.77% (p=0.000 n=85+96) SubVW/5-8 5.84ns ± 1% 5.35ns ± 1% -8.35% (p=0.000 n=87+87) SubVW/10-8 8.01ns ± 4% 7.54ns ± 4% -5.84% (p=0.000 n=98+97) SubVW/100-8 43.9ns ± 5% 17.9ns ± 1% -59.20% (p=0.000 n=98+76) SubVW/1000-8 426ns ± 2% 85ns ± 3% -80.13% (p=0.000 n=90+98) SubVW/10000-8 4.24µs ± 2% 1.81µs ± 3% -57.28% (p=0.000 n=74+91) SubVW/100000-8 44.5µs ± 4% 31.5µs ± 2% -29.33% (p=0.000 n=84+91) Change-Id: I10dd361cbaca22197c27e7734c0f50065292afbb Reviewed-on: https://go-review.googlesource.com/c/go/+/164969 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2019-03-09 20:34:40 +00:00
Josh Bleecher Snyder	fe24837c4d	math/big: add fast path for pure Go addVW for large z In the normal case, only a few words have to be updated when adding a word to a vector. When that happens, we can simply copy the rest of the words, which is much faster. However, the overhead of that makes it prohibitive for small vectors, so we check the size at the beginning. The implementation is a bit weird to allow addVW to continued to be inlined; see #30548. The AddVW benchmarks are surprising, but fully repeatable. The SubVW benchmarks are more or less as expected. I expect that removing the indirect function call will help both and make them a bit more normal. name old time/op new time/op delta AddVW/1-8 4.27ns ± 2% 3.81ns ± 3% -10.83% (p=0.000 n=89+90) AddVW/2-8 4.91ns ± 2% 4.34ns ± 1% -11.60% (p=0.000 n=83+90) AddVW/3-8 5.77ns ± 4% 5.76ns ± 2% ~ (p=0.365 n=91+87) AddVW/4-8 6.03ns ± 1% 6.03ns ± 1% ~ (p=0.392 n=80+76) AddVW/5-8 6.48ns ± 2% 6.63ns ± 1% +2.27% (p=0.000 n=76+74) AddVW/10-8 9.56ns ± 2% 9.56ns ± 1% -0.02% (p=0.002 n=69+76) AddVW/100-8 90.6ns ± 0% 18.1ns ± 4% -79.99% (p=0.000 n=72+94) AddVW/1000-8 865ns ± 0% 85ns ± 6% -90.14% (p=0.000 n=66+96) AddVW/10000-8 8.57µs ± 2% 1.82µs ± 3% -78.73% (p=0.000 n=99+94) AddVW/100000-8 84.4µs ± 2% 31.8µs ± 4% -62.29% (p=0.000 n=93+98) name old time/op new time/op delta SubVW/1-8 3.90ns ± 2% 4.13ns ± 4% +6.02% (p=0.000 n=92+95) SubVW/2-8 4.15ns ± 1% 5.20ns ± 1% +25.22% (p=0.000 n=83+85) SubVW/3-8 5.50ns ± 2% 6.22ns ± 6% +13.21% (p=0.000 n=91+97) SubVW/4-8 5.99ns ± 1% 6.63ns ± 1% +10.63% (p=0.000 n=79+61) SubVW/5-8 6.75ns ± 4% 6.88ns ± 2% +1.82% (p=0.000 n=98+73) SubVW/10-8 9.57ns ± 1% 9.56ns ± 1% -0.13% (p=0.000 n=77+64) SubVW/100-8 90.3ns ± 1% 18.1ns ± 2% -80.00% (p=0.000 n=75+94) SubVW/1000-8 860ns ± 4% 85ns ± 7% -90.14% (p=0.000 n=97+99) SubVW/10000-8 8.51µs ± 3% 1.77µs ± 6% -79.21% (p=0.000 n=100+97) SubVW/100000-8 84.4µs ± 3% 31.5µs ± 3% -62.66% (p=0.000 n=92+92) Change-Id: I721d7031d40f245b4a284f5bdd93e7bb85e7e937 Reviewed-on: https://go-review.googlesource.com/c/go/+/164968 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2019-03-09 20:33:46 +00:00
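The fast-path idea in the commit above (once the carry is gone, copy the remaining words instead of adding) can be sketched like this. It is a simplified illustration, not the code from the CL: it uses `uint` words and `math/bits.Add`, and it omits the size threshold and the inlining workaround the commit message mentions. The name `addVW` mirrors the routine being discussed, but this body is an assumption.

```go
package main

import (
	"fmt"
	"math/bits"
)

// addVW sets z = x + y, where x is a little-endian word vector and y is a
// single word, and returns the final carry. Simplified sketch: as soon as
// the carry becomes zero, the remaining words are copied unchanged.
func addVW(z, x []uint, y uint) (c uint) {
	c = y
	for i := 0; i < len(z) && i < len(x); i++ {
		if c == 0 {
			// No carry left: the rest of x is unaffected by the addition.
			copy(z[i:], x[i:])
			return 0
		}
		z[i], c = bits.Add(x[i], c, 0)
	}
	return c
}

func main() {
	x := []uint{^uint(0), ^uint(0), 5, 7}
	z := make([]uint, len(x))
	c := addVW(z, x, 1)
	fmt.Println(z, c)
}
```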
Josh Bleecher Snyder	4c227a091e	math/big: remove bounds checks in pure Go implementations These routines are quite sensitive to BCE. This change eliminates bounds checks from loops. It does so at the cost of a bit of safety: malformed input will now return incorrect answers instead of panicking. This isn't as bad as it sounds: math/big has very good test coverage, and the alternative implementations are in assembly, which could do much worse things with malformed input. If the compiler's BCE improves, so could these routines. Notable BCE improvements for these routines would be: * Allowing and propagating more cross-slice length hints. Then hints like _ = y[:len(z)] would eliminate bounds checks for y[i]. * Propagating enough information so that we could do n := len(x) if len(z) < n { n = len(z) } and then have i < n eliminate the same bounds checks as i < len(x) && i < len(z) currently does. * Providing some way to do BCE for unrolled loops. Now that we have math/bits implementations, it is possible to write things like ADC chains in pure Go, if you can reasonably unroll loops. Benchmarks below are for amd64, using -tags=math_big_pure_go. name old time/op new time/op delta AddVV/1-8 5.15ns ± 3% 4.65ns ± 4% -9.81% (p=0.000 n=93+86) AddVV/2-8 6.40ns ± 2% 5.58ns ± 4% -12.78% (p=0.000 n=90+95) AddVV/3-8 7.07ns ± 2% 6.66ns ± 2% -5.88% (p=0.000 n=87+83) AddVV/4-8 7.94ns ± 5% 7.41ns ± 4% -6.65% (p=0.000 n=94+98) AddVV/5-8 8.55ns ± 1% 8.80ns ± 0% +2.92% (p=0.000 n=87+92) AddVV/10-8 12.7ns ± 1% 12.3ns ± 1% -3.12% (p=0.000 n=83+71) AddVV/100-8 119ns ± 5% 117ns ± 4% -1.60% (p=0.000 n=93+90) AddVV/1000-8 1.14µs ± 4% 1.14µs ± 5% ~ (p=0.812 n=95+91) AddVV/10000-8 11.4µs ± 5% 11.3µs ± 5% ~ (p=0.503 n=97+96) AddVV/100000-8 114µs ± 4% 113µs ± 5% -0.98% (p=0.002 n=97+90) name old time/op new time/op delta SubVV/1-8 5.23ns ± 5% 4.65ns ± 3% -11.18% (p=0.000 n=89+91) SubVV/2-8 6.49ns ± 5% 5.58ns ± 3% -14.04% (p=0.000 n=92+94) SubVV/3-8 7.10ns ± 3% 6.65ns ± 2% -6.28% (p=0.000 n=87+80) SubVV/4-8 8.04ns ± 1% 7.44ns ± 5% -7.49% (p=0.000 n=83+98) SubVV/5-8 8.55ns ± 2% 8.32ns ± 1% -2.75% (p=0.000 n=84+92) SubVV/10-8 12.7ns ± 1% 12.3ns ± 1% -3.09% (p=0.000 n=80+75) SubVV/100-8 119ns ± 0% 116ns ± 3% -1.83% (p=0.000 n=87+98) SubVV/1000-8 1.13µs ± 5% 1.13µs ± 3% ~ (p=0.082 n=96+98) SubVV/10000-8 11.2µs ± 1% 11.3µs ± 3% +0.76% (p=0.000 n=87+97) SubVV/100000-8 112µs ± 2% 113µs ± 3% +0.55% (p=0.000 n=76+88) name old time/op new time/op delta AddVW/1-8 4.30ns ± 4% 3.96ns ± 6% -8.02% (p=0.000 n=89+97) AddVW/2-8 5.15ns ± 2% 4.91ns ± 1% -4.56% (p=0.000 n=87+80) AddVW/3-8 5.59ns ± 3% 5.75ns ± 2% +2.91% (p=0.000 n=91+88) AddVW/4-8 6.20ns ± 1% 6.03ns ± 1% -2.71% (p=0.000 n=75+90) AddVW/5-8 6.93ns ± 3% 6.49ns ± 2% -6.35% (p=0.000 n=100+82) AddVW/10-8 10.0ns ± 7% 9.6ns ± 0% -4.02% (p=0.000 n=98+74) AddVW/100-8 91.1ns ± 1% 90.6ns ± 1% -0.55% (p=0.000 n=84+80) AddVW/1000-8 866ns ± 1% 856ns ± 4% -1.06% (p=0.000 n=69+96) AddVW/10000-8 8.64µs ± 1% 8.53µs ± 4% -1.25% (p=0.000 n=67+99) AddVW/100000-8 84.3µs ± 2% 85.4µs ± 4% +1.22% (p=0.000 n=89+99) name old time/op new time/op delta SubVW/1-8 4.28ns ± 2% 3.82ns ± 3% -10.63% (p=0.000 n=91+89) SubVW/2-8 4.61ns ± 1% 4.48ns ± 3% -2.67% (p=0.000 n=94+96) SubVW/3-8 5.54ns ± 1% 5.81ns ± 4% +4.87% (p=0.000 n=92+97) SubVW/4-8 6.20ns ± 1% 6.08ns ± 2% -1.99% (p=0.000 n=71+88) SubVW/5-8 6.91ns ± 3% 6.64ns ± 1% -3.90% (p=0.000 n=97+70) SubVW/10-8 9.85ns ± 2% 9.62ns ± 0% -2.31% (p=0.000 n=82+62) SubVW/100-8 91.1ns ± 1% 90.9ns ± 3% -0.14% (p=0.010 n=71+93) SubVW/1000-8 859ns ± 3% 867ns ± 1% +0.98% (p=0.000 n=99+78) SubVW/10000-8 8.54µs ± 5% 8.57µs ± 2% +0.38% (p=0.007 n=98+92) SubVW/100000-8 84.5µs ± 3% 84.6µs ± 3% ~ (p=0.334 n=95+94) name old time/op new time/op delta AddMulVVW/1-8 5.43ns ± 3% 4.36ns ± 2% -19.67% (p=0.000 n=95+94) AddMulVVW/2-8 6.56ns ± 4% 6.11ns ± 1% -6.90% (p=0.000 n=91+91) AddMulVVW/3-8 8.00ns ± 1% 7.80ns ± 4% -2.52% (p=0.000 n=83+95) AddMulVVW/4-8 9.81ns ± 2% 9.53ns ± 1% -2.86% (p=0.000 n=77+64) AddMulVVW/5-8 11.4ns ± 3% 11.3ns ± 5% -0.89% (p=0.000 n=95+97) AddMulVVW/10-8 18.9ns ± 5% 19.1ns ± 5% +0.89% (p=0.000 n=91+94) AddMulVVW/100-8 165ns ± 5% 165ns ± 4% ~ (p=0.427 n=97+98) AddMulVVW/1000-8 1.56µs ± 3% 1.56µs ± 4% ~ (p=0.167 n=98+96) AddMulVVW/10000-8 15.7µs ± 5% 15.6µs ± 5% -0.31% (p=0.044 n=95+97) AddMulVVW/100000-8 156µs ± 3% 157µs ± 8% ~ (p=0.373 n=72+99) Change-Id: Ibc720785d5b95f6a797103b1363843205f4d56bf Reviewed-on: https://go-review.googlesource.com/c/go/+/164966 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Robert Griesemer <gri@golang.org>	2019-03-09 20:33:13 +00:00
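The kind of length hint discussed in the commit above can be illustrated with a small pure Go vector add. This is a sketch under stated assumptions, not the code from the CL: it reslices `x` and `y` to `len(z)` before the loop, which is a common way to let the compiler drop per-iteration bounds checks, and unlike the change described above it panics on operands shorter than `z` rather than returning an incorrect answer. It uses `uint` words and `math/bits.Add`.

```go
package main

import (
	"fmt"
	"math/bits"
)

// addVV sets z = x + y word by word and returns the final carry.
// Reslicing x and y to len(z) up front tells the compiler that x[i] and
// y[i] are in range for every i in range z. Note: this sketch panics if
// x or y is shorter than z.
func addVV(z, x, y []uint) (c uint) {
	x = x[:len(z)]
	y = y[:len(z)]
	for i := range z {
		z[i], c = bits.Add(x[i], y[i], c)
	}
	return c
}

func main() {
	x := []uint{^uint(0), 1}
	y := []uint{1, 2}
	z := make([]uint, 2)
	c := addVV(z, x, y)
	fmt.Println(z, c)
}
```

Whether the checks are actually eliminated depends on the compiler version; building with `-gcflags=-d=ssa/check_bce/debug=1` is one way to inspect the remaining bounds checks.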

346 Commits