Commit Graph

346 Commits

Author SHA1 Message Date
Katie Hockman 84150d0af1 [release-branch.go1.15-security] math/big: fix shift for recursive division
The previous s value could cause a crash
for certain inputs.

Will check in tests and documentation improvements later.

Thanks to the Go Ethereum team and the OSS-Fuzz project for reporting this.
Thanks to Rémy Oudompheng and Robert Griesemer for their help
developing and validating the fix.

Fixes CVE-2020-28362

Change-Id: Ibbf455c4436bcdb07c84a34fa6551fb3422356d3
Reviewed-on: https://team-review.git.corp.google.com/c/golang/go-private/+/899974
Reviewed-by: Roland Shoemaker <bracewell@google.com>
Reviewed-by: Filippo Valsorda <valsorda@google.com>
(cherry picked from commit 28015462c2a83239543dc2bef651e9a5f234b633)
Reviewed-on: https://team-review.git.corp.google.com/c/golang/go-private/+/901065
2020-11-11 23:35:42 +00:00
Bryan C. Mills 13617380ca testing: clean up remaining TempDir issues from CL 231958
Updates #38850

Change-Id: I33f48762f5520eb0c0a841d8ca1ccdd65ecc20c8
Reviewed-on: https://go-review.googlesource.com/c/go/+/234583
Run-TryBot: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-05-19 19:55:14 +00:00
Filippo Valsorda c9d5f60eaa math/big: add (*Int).FillBytes
Replaced almost every use of Bytes with FillBytes.

Note that the approved proposal was for

    func (*Int) FillBytes(buf []byte)

while this implements

    func (*Int) FillBytes(buf []byte) []byte

because the latter was far nicer to use in all callsites.

Fixes #35833

Change-Id: Ia912df123e5d79b763845312ea3d9a8051343c0a
Reviewed-on: https://go-review.googlesource.com/c/go/+/230397
Reviewed-by: Robert Griesemer <gri@golang.org>
2020-05-05 00:36:44 +00:00
Ruixin(Peter) Bao a7e9e84716 math/big: simplify hasVX checking on s390x
Originally, we use an assembly function that returns a boolean result to
tell whether the machine has vector facility or not. It is now no longer
needed when we can directly use cpu.S390X.HasVX variable.

Change-Id: Ic1dae851982532bcfd9a9453416c112347f21d87
Reviewed-on: https://go-review.googlesource.com/c/go/+/230318
Reviewed-by: Michael Munday <mike.munday@ibm.com>
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-04-27 20:20:53 +00:00
Ruixin(Peter) Bao 3a37fd4010 math/big: rewrite subVW to use fast path on s390x
This CL replaces the original subVW implementation with a implementation
that uses a similar idea as CL 164968.

When we know the borrow bit is zero, we can copy the rest of words as
they will not be updated. Also, since we are copying vector of a words,
a faster implementation of copy is written in this CL to copy a word or
multiple words at a time.

Benchmarks:
name             old time/op    new time/op     delta
SubVW/1-18         4.43ns ± 0%     3.82ns ± 0%   -13.85%  (p=0.000 n=20+20)
SubVW/2-18         5.39ns ± 0%     4.25ns ± 0%   -21.23%  (p=0.000 n=20+20)
SubVW/3-18         6.29ns ± 0%     4.65ns ± 0%   -26.07%  (p=0.000 n=16+19)
SubVW/4-18         6.08ns ± 2%     4.84ns ± 0%   -20.43%  (p=0.000 n=20+20)
SubVW/5-18         7.06ns ± 1%     4.93ns ± 0%   -30.18%  (p=0.000 n=20+20)
SubVW/10-18        10.3ns ± 2%      7.2ns ± 0%   -30.35%  (p=0.000 n=20+19)
SubVW/100-18       48.0ns ± 4%     17.6ns ± 0%   -63.32%  (p=0.000 n=18+19)
SubVW/1000-18       448ns ±10%      236ns ± 1%   -47.24%  (p=0.000 n=20+20)
SubVW/10000-18     4.83µs ± 5%     2.96µs ± 0%   -38.73%  (p=0.000 n=20+19)
SubVW/100000-18    46.6µs ± 3%     30.6µs ± 1%   -34.30%  (p=0.000 n=20+20)
[Geo mean]         56.3ns          37.0ns        -34.24%

name             old speed      new speed       delta
SubVW/1-18       1.80GB/s ± 0%   2.10GB/s ± 0%   +16.16%  (p=0.000 n=20+20)
SubVW/2-18       2.97GB/s ± 0%   3.77GB/s ± 0%   +26.95%  (p=0.000 n=20+20)
SubVW/3-18       3.82GB/s ± 0%   5.16GB/s ± 0%   +35.26%  (p=0.000 n=20+19)
SubVW/4-18       5.26GB/s ± 1%   6.61GB/s ± 0%   +25.59%  (p=0.000 n=20+20)
SubVW/5-18       5.67GB/s ± 1%   8.11GB/s ± 0%   +43.12%  (p=0.000 n=20+20)
SubVW/10-18      7.79GB/s ± 2%  11.17GB/s ± 0%   +43.52%  (p=0.000 n=20+19)
SubVW/100-18     16.7GB/s ± 4%   45.5GB/s ± 0%  +172.61%  (p=0.000 n=18+20)
SubVW/1000-18    17.9GB/s ± 9%   33.9GB/s ± 1%   +89.25%  (p=0.000 n=20+20)
SubVW/10000-18   16.6GB/s ± 5%   27.0GB/s ± 0%   +63.08%  (p=0.000 n=20+19)
SubVW/100000-18  17.2GB/s ± 2%   26.1GB/s ± 1%   +52.18%  (p=0.000 n=20+20)
[Geo mean]       7.25GB/s       11.03GB/s        +52.01%

Change-Id: I32e99cbab3260054a96231d02b87049c833ab77e
Reviewed-on: https://go-review.googlesource.com/c/go/+/227297
Reviewed-by: Michael Munday <mike.munday@ibm.com>
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-04-24 14:50:59 +00:00
Ruixin(Peter) Bao ee8972cd12 math/big: rewrite addVW to use fast path on s390x
Rewrite addVW to use a fast path and remove the original
vector and non vector implementation of addVW in assembly. This CL uses
a similar idea as CL 164968, where we copy the rest of words when we
know carry bit is zero.

In addition, since we are copying vector of words, a faster
implementation of copy is written in this CL to copy a word or multiple
words at a time.

Benchmarks:
name             old time/op    new time/op     delta
AddVW/1-18         4.56ns ± 0%     4.01ns ± 6%   -12.14%  (p=0.000 n=18+20)
AddVW/2-18         5.54ns ± 0%     4.42ns ± 5%   -20.20%  (p=0.000 n=18+20)
AddVW/3-18         6.55ns ± 0%     4.61ns ± 0%   -29.62%  (p=0.000 n=16+18)
AddVW/4-18         6.11ns ± 2%     5.12ns ± 6%   -16.19%  (p=0.000 n=20+20)
AddVW/5-18         7.32ns ± 4%     5.14ns ± 0%   -29.77%  (p=0.000 n=20+19)
AddVW/10-18        10.6ns ± 2%      7.2ns ± 1%   -31.47%  (p=0.000 n=20+20)
AddVW/100-18       49.6ns ± 2%     18.0ns ± 0%   -63.63%  (p=0.000 n=20+20)
AddVW/1000-18       465ns ± 3%      244ns ± 0%   -47.54%  (p=0.000 n=20+20)
AddVW/10000-18     4.99µs ± 4%     2.97µs ± 0%   -40.54%  (p=0.000 n=20+20)
AddVW/100000-18    48.3µs ± 3%     30.8µs ± 1%   -36.29%  (p=0.000 n=20+20)
[Geo mean]         58.1ns          38.0ns        -34.57%

name             old speed      new speed       delta
AddVW/1-18       1.76GB/s ± 0%   2.00GB/s ± 6%   +14.04%  (p=0.000 n=20+20)
AddVW/2-18       2.89GB/s ± 0%   3.63GB/s ± 5%   +25.55%  (p=0.000 n=18+20)
AddVW/3-18       3.66GB/s ± 0%   5.21GB/s ± 0%   +42.25%  (p=0.000 n=18+19)
AddVW/4-18       5.24GB/s ± 2%   6.27GB/s ± 6%   +19.61%  (p=0.000 n=20+20)
AddVW/5-18       5.47GB/s ± 4%   7.78GB/s ± 0%   +42.28%  (p=0.000 n=20+18)
AddVW/10-18      7.55GB/s ± 2%  11.04GB/s ± 1%   +46.09%  (p=0.000 n=20+20)
AddVW/100-18     16.1GB/s ± 2%   44.3GB/s ± 0%  +174.77%  (p=0.000 n=20+20)
AddVW/1000-18    17.2GB/s ± 3%   32.8GB/s ± 1%   +90.58%  (p=0.000 n=20+20)
AddVW/10000-18   16.0GB/s ± 4%   26.9GB/s ± 0%   +68.11%  (p=0.000 n=20+20)
AddVW/100000-18  16.6GB/s ± 3%   26.0GB/s ± 1%   +56.94%  (p=0.000 n=20+20)
[Geo mean]       7.03GB/s       10.75GB/s        +52.93%

Change-Id: Idbb73f3178311bd2b18a93bdc1e48f26869d2f6a
Reviewed-on: https://go-review.googlesource.com/c/go/+/209679
Reviewed-by: Michael Munday <mike.munday@ibm.com>
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-04-24 13:26:34 +00:00
Ruixin(Peter) Bao 0329c915a0 math/big: clean up whitespace in arith_s390x.s file
This CL looks big but it only does formatting changes to arith_s390x.s.
The file was formatted using asmfmt(https://github.com/klauspost/asmfmt)
, so there should not be any functional impact. I verified that the
generated assembly of big.test file is identical.

Change-Id: I8b4035ef082a4d0357881869327e25253f2d8be1
Reviewed-on: https://go-review.googlesource.com/c/go/+/229302
Reviewed-by: Michael Munday <mike.munday@ibm.com>
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-04-22 15:40:55 +00:00
Alberto Donizetti 813f8eae27 math/big: remove Direct Sqrt computation
The Float.Sqrt method switches (for performance reasons) between
direct (uses Quo) and inverse (doesn't) computation, depending on the
precision, with threshold 128.

Unfortunately the implementation of recursive division in CL 172018
made Quo slightly slower exactly in the range around and below the
threshold Sqrt is using, so this strategy is no longer profitable.

The new division algorithm allocates more, and this has increased the
amount of allocations performed by Sqrt when using the direct method;
on low precisions the computation is fast, so additional allocations
have an negative impact on performance.

Interestingly, only using the inverse method doesn't just reverse the
effects of the Quo algorithm change, but it seems to make performances
better overall for small precisions:

name                 old time/op    new time/op    delta
FloatSqrt/64-4          643ns ± 1%     635ns ± 1%   -1.24%  (p=0.000 n=10+10)
FloatSqrt/128-4        1.44µs ± 1%    1.02µs ± 1%  -29.25%  (p=0.000 n=10+10)
FloatSqrt/256-4        1.49µs ± 1%    1.49µs ± 1%     ~     (p=0.752 n=10+10)
FloatSqrt/1000-4       3.71µs ± 1%    3.74µs ± 1%   +0.87%  (p=0.001 n=10+10)
FloatSqrt/10000-4      35.3µs ± 1%    35.6µs ± 1%   +0.82%  (p=0.002 n=10+9)
FloatSqrt/100000-4      844µs ± 1%     844µs ± 0%     ~     (p=0.549 n=10+9)
FloatSqrt/1000000-4    69.5ms ± 0%    69.6ms ± 0%     ~     (p=0.222 n=9+9)

name                 old alloc/op   new alloc/op   delta
FloatSqrt/64-4           280B ± 0%      200B ± 0%  -28.57%  (p=0.000 n=10+10)
FloatSqrt/128-4          504B ± 0%      248B ± 0%  -50.79%  (p=0.000 n=10+10)
FloatSqrt/256-4          344B ± 0%      344B ± 0%     ~     (all equal)
FloatSqrt/1000-4       1.30kB ± 0%    1.30kB ± 0%     ~     (all equal)
FloatSqrt/10000-4      13.5kB ± 0%    13.5kB ± 0%     ~     (p=0.237 n=10+10)
FloatSqrt/100000-4      123kB ± 0%     123kB ± 0%     ~     (p=0.247 n=10+10)
FloatSqrt/1000000-4    1.83MB ± 1%    1.83MB ± 3%     ~     (p=0.779 n=8+10)

name                 old allocs/op  new allocs/op  delta
FloatSqrt/64-4           8.00 ± 0%      5.00 ± 0%  -37.50%  (p=0.000 n=10+10)
FloatSqrt/128-4          11.0 ± 0%       5.0 ± 0%  -54.55%  (p=0.000 n=10+10)
FloatSqrt/256-4          5.00 ± 0%      5.00 ± 0%     ~     (all equal)
FloatSqrt/1000-4         6.00 ± 0%      6.00 ± 0%     ~     (all equal)
FloatSqrt/10000-4        6.00 ± 0%      6.00 ± 0%     ~     (all equal)
FloatSqrt/100000-4       6.00 ± 0%      6.00 ± 0%     ~     (all equal)
FloatSqrt/1000000-4      10.3 ±13%      10.3 ±13%     ~     (p=1.000 n=10+10)

For example, 1.02µs for FloatSqrt/128 is actually better than what I
was getting on the same machine before the Quo changes.

The .8% slowdown on /1000 and /10000 appears to be real and it is
quite baffling (that codepath was not touched at all); it may be
caused by code alignment changes.

Change-Id: Ib03761cdc1055674bc7526d4f3a23d7a25094029
Reviewed-on: https://go-review.googlesource.com/c/go/+/228062
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2020-04-15 16:37:53 +00:00
Brad Fitzpatrick 8f53fad035 math/big: add test that linker is able to remove unused code
(Follow-up to CL 228108.)

Change-Id: Ia6d119ee19c7aa923cdeead06d3cee87a1751105
Reviewed-on: https://go-review.googlesource.com/c/go/+/228109
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2020-04-15 03:25:21 +00:00
Hanjun Kim 5a447c0ae9 math/big: fix typo in documentation for Int.Exp
Fixes #38304

Also change `If m > 0, y < 0, ...` to `If m != 0, y < 0, ...` since `Exp` will return `nil`
whatever `m`'s sign is.

Change-Id: I17d7337ccd1404318cea5d42a8de904ad185fd00
GitHub-Last-Rev: 2399510300
GitHub-Pull-Request: golang/go#38390
Reviewed-on: https://go-review.googlesource.com/c/go/+/228000
Reviewed-by: Robert Griesemer <gri@golang.org>
2020-04-15 00:32:18 +00:00
Brad Fitzpatrick a55645fa34 math/big: don't use Float in init to help linker discard 162 KiB
Removes 162 KiB from binaries that don't use math/big.Float:

-rwxr-xr-x 1 bradfitz bradfitz 1916590 Apr 14 12:21 x.after
-rwxr-xr-x 1 bradfitz bradfitz 2082575 Apr 14 12:21 x.before

No change in deps (this package already used sync).

No change in benchmarks:

name                 old time/op    new time/op    delta
FloatSqrt/64-8         1.06µs ±10%    1.03µs ± 6%   ~     (p=0.133 n=10+9)
FloatSqrt/128-8        2.26µs ± 9%    2.28µs ± 9%   ~     (p=0.460 n=10+8)
FloatSqrt/256-8        2.29µs ± 5%    2.31µs ± 3%   ~     (p=0.214 n=9+9)
FloatSqrt/1000-8       5.82µs ± 3%    5.87µs ± 7%   ~     (p=0.666 n=9+9)
FloatSqrt/10000-8      56.4µs ± 5%    57.0µs ± 6%   ~     (p=0.436 n=10+10)
FloatSqrt/100000-8     1.34ms ± 8%    1.31ms ± 3%   ~     (p=0.447 n=10+9)
FloatSqrt/1000000-8     106ms ± 5%     107ms ± 7%   ~     (p=0.315 n=10+10)

name                 old alloc/op   new alloc/op   delta
FloatSqrt/64-8           280B ± 0%      280B ± 0%   ~     (all equal)
FloatSqrt/128-8          504B ± 0%      504B ± 0%   ~     (all equal)
FloatSqrt/256-8          344B ± 0%      344B ± 0%   ~     (all equal)
FloatSqrt/1000-8       1.30kB ± 0%    1.30kB ± 0%   ~     (all equal)
FloatSqrt/10000-8      13.5kB ± 0%    13.5kB ± 0%   ~     (p=0.403 n=10+10)
FloatSqrt/100000-8      123kB ± 0%     123kB ± 0%   ~     (p=0.393 n=10+10)
FloatSqrt/1000000-8    1.84MB ± 7%    1.84MB ± 5%   ~     (p=0.739 n=10+10)

name                 old allocs/op  new allocs/op  delta
FloatSqrt/64-8           8.00 ± 0%      8.00 ± 0%   ~     (all equal)
FloatSqrt/128-8          11.0 ± 0%      11.0 ± 0%   ~     (all equal)
FloatSqrt/256-8          5.00 ± 0%      5.00 ± 0%   ~     (all equal)
FloatSqrt/1000-8         6.00 ± 0%      6.00 ± 0%   ~     (all equal)
FloatSqrt/10000-8        6.00 ± 0%      6.00 ± 0%   ~     (all equal)
FloatSqrt/100000-8       6.00 ± 0%      6.00 ± 0%   ~     (all equal)
FloatSqrt/1000000-8      10.9 ±10%      10.8 ±17%   ~     (p=0.974 n=10+10)

Change-Id: I3337f1f531bf7b4fae192b9d90cd24ff2be14fea
Reviewed-on: https://go-review.googlesource.com/c/go/+/228108
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-04-14 20:50:19 +00:00
Rémy Oudompheng ac1fd419b6 math/big: correct off-by-one access in divBasic
The divBasic function computes the quotient of big nats u/v word by word.
It estimates each word qhat by performing a long division (top 2 words of u
divided by top word of v), looks at the next word to correct the estimate,
then perform a full multiplication (qhat*v) to catch any inaccuracy in the
estimate.

In the latter case, "negative" values appear temporarily and carries
must be carefully managed, and the recursive division refactoring
introduced a case where qhat*v has the same length as v, triggering an
out-of-bounds write in the case it happens when computing the top word
of the quotient.

Fixes #37499

Change-Id: I15089da4a4027beda43af497bf6de261eb792f94
Reviewed-on: https://go-review.googlesource.com/c/go/+/221980
Reviewed-by: Robert Griesemer <gri@golang.org>
2020-04-08 20:53:58 +00:00
Alberto Donizetti c5058652fd math/big: document that Sqrt doesn't set Accuracy
Document that the Float.Sqrt method does not set the receiver's
Accuracy field.

Updates #37915

Change-Id: Ief1dcac07eacc0ef02f86bfac9044501477bca1c
Reviewed-on: https://go-review.googlesource.com/c/go/+/224497
Reviewed-by: Robert Griesemer <gri@golang.org>
2020-03-20 19:24:37 +00:00
Joel Sing fe70838598 math/big: initial vector arithmetic in riscv64 assembly
Provide an assembly implementation of mulWW - for now all others run the
Go code.

Change-Id: Icb594c31048255f131bdea8d64f56784fc9db4d1
Reviewed-on: https://go-review.googlesource.com/c/go/+/220919
Reviewed-by: Robert Griesemer <gri@golang.org>
2020-02-25 16:47:02 +00:00
Filippo Valsorda 88ae4ccefb math/big: reintroduce pre-Go 1.14 mention in GCD docs
It was removed in CL 217302 but was intentionally added in CL 217104.

Change-Id: I1a478d80ad1ec4f0a0184bfebf8f1a5e352cfe8c
Reviewed-on: https://go-review.googlesource.com/c/go/+/217941
Reviewed-by: Robert Griesemer <gri@golang.org>
2020-02-05 20:54:27 +00:00
Filippo Valsorda cdb7fd6b06 math/big: simplify GCD docs
We don't usually document past behavior (like "As of Go 1.14 ...") and
in isolation the current docs made it sound like a and b could only be
negative or zero.

Change-Id: I0d3c2b8579a9c01159ce528a3128b1478e99042a
Reviewed-on: https://go-review.googlesource.com/c/go/+/217302
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2020-01-31 23:32:37 +00:00
Robert Griesemer 9bb40ed8ec math/big: update comment on Int.GCD
Per the suggestion https://golang.org/cl/216200/2/doc/go1.14.html#423.

Updates #28878.

Change-Id: I654d2d114409624219a0041916f0a4030efc7573
Reviewed-on: https://go-review.googlesource.com/c/go/+/217104
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2020-01-30 20:37:01 +00:00
Joel Sing 5a3a5d3525 math, math/big: add support for riscv64
Based on riscv-go port.

Updates #27532

Change-Id: Id8ae7d851c393ec3702e4176c363accb0a42587f
Reviewed-on: https://go-review.googlesource.com/c/go/+/204633
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2020-01-15 18:49:52 +00:00
Ville Skyttä 440f7d6404 all: fix a bunch of misspellings
Change-Id: I5b909df0fd048cd66c5a27fca1b06466d3bcaac7
GitHub-Last-Rev: 778c5d2131
GitHub-Pull-Request: golang/go#35624
Reviewed-on: https://go-review.googlesource.com/c/go/+/207421
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2019-11-15 21:04:43 +00:00
Rémy Oudompheng 7ad27481f8 math/big: fix out-of-bounds panic in divRecursive
The bounds in the last carry branch were wrong as there
is no reason for len(u) >= n+n/2 to always hold true.

We also adjust test to avoid using a remainder of 1
(in which case, the last step of the algorithm computes
(qhatv+1) - qhatv which rarely produces a carry).

Change-Id: I69fbab9c5e19d0db1c087fbfcd5b89352c2d26fb
Reviewed-on: https://go-review.googlesource.com/c/go/+/206839
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-11-13 19:15:27 +00:00
Robert Griesemer 1fe33e3cb2 math/big: ensure correct test input
There is a (theoretical, but possible) chance that the
random number values a, b used for TestDiv are 0 or 1,
in which case the test would fail.

This CL makes sure that a >= 1 and b >= 2 at all times.

Fixes #35523.

Change-Id: I6451feb94241249516a821cd0066e95a0c65b0ed
Reviewed-on: https://go-review.googlesource.com/c/go/+/206818
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-11-12 18:52:52 +00:00
Rémy Oudompheng 194ae3236d math/big: implement recursive algorithm for division
The current division algorithm produces one word of result at a time,
using 2-word division to compute the top word and mulAddVWW to compute
the remainder. The top word may need to be adjusted by 1 or 2 units.

The recursive version, based on Burnikel, Ziegler, "Fast Recursive Division",
uses the same principles, but in a multi-word setting, so that
multiplication benefits from the Karatsuba algorithm (and possibly later
improvements).

benchmark                             old ns/op        new ns/op      delta
BenchmarkDiv/20/10-4                  38.2             38.3           +0.26%
BenchmarkDiv/40/20-4                  38.7             38.5           -0.52%
BenchmarkDiv/100/50-4                 62.5             62.6           +0.16%
BenchmarkDiv/200/100-4                238              259            +8.82%
BenchmarkDiv/400/200-4                311              338            +8.68%
BenchmarkDiv/1000/500-4               604              649            +7.45%
BenchmarkDiv/2000/1000-4              1214             1278           +5.27%
BenchmarkDiv/20000/10000-4            38279            36510          -4.62%
BenchmarkDiv/200000/100000-4          3022057          1359615        -55.01%
BenchmarkDiv/2000000/1000000-4        310827664        54012939       -82.62%
BenchmarkDiv/20000000/10000000-4      33272829421      1965401359     -94.09%
BenchmarkString/10/Base10-4           158              156            -1.27%
BenchmarkString/100/Base10-4          797              792            -0.63%
BenchmarkString/1000/Base10-4         3677             3814           +3.73%
BenchmarkString/10000/Base10-4        16633            17116          +2.90%
BenchmarkString/100000/Base10-4       5779029          1793808        -68.96%
BenchmarkString/1000000/Base10-4      889840820        85524031       -90.39%
BenchmarkString/10000000/Base10-4     134338236860     4935657026     -96.33%

Fixes #21960
Updates #30943

Change-Id: I134c6f81a47870c688ca95b6081eb9211def15a2
Reviewed-on: https://go-review.googlesource.com/c/go/+/172018
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-11-12 05:18:25 +00:00
Brian Kessler f5949b6067 math/big: allow all values for GCD
Allow the inputs a and b to be zero or negative to GCD
with the following definitions.

If x or y are not nil, GCD sets their value such that z = a*x + b*y.
Regardless of the signs of a and b, z is always >= 0.
If a == b == 0, GCD sets z = x = y = 0.
If a == 0 and b != 0, GCD sets z = |b|, x = 0, y = sign(b) * 1.
If a != 0 and b == 0, GCD sets z = |a|, x = sign(a) * 1, y = 0.

Fixes #28878

Change-Id: Ia83fce66912a96545c95cd8df0549bfd852652f3
Reviewed-on: https://go-review.googlesource.com/c/go/+/164972
Run-TryBot: Brian Kessler <brian.m.kessler@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-11-07 06:58:44 +00:00
Rémy Oudompheng 8f30d25168 math/big: use nat pool to reduce allocations in mul and sqr
This notably allows to reuse temporaries across
the karatsubaSqr recursion.

benchmark                    old ns/op     new ns/op     delta
BenchmarkNatMul/10-4         227           228           +0.44%
BenchmarkNatMul/100-4        8339          8589          +3.00%
BenchmarkNatMul/1000-4       313796        312272        -0.49%
BenchmarkNatMul/10000-4      11924720      11873589      -0.43%
BenchmarkNatMul/100000-4     503813354     503839058     +0.01%
BenchmarkNatSqr/20-4         549           513           -6.56%
BenchmarkNatSqr/30-4         945           874           -7.51%
BenchmarkNatSqr/50-4         1993          1832          -8.08%
BenchmarkNatSqr/80-4         4096          3874          -5.42%
BenchmarkNatSqr/100-4        6192          5712          -7.75%
BenchmarkNatSqr/200-4        20388         19543         -4.14%
BenchmarkNatSqr/300-4        38735         36715         -5.21%
BenchmarkNatSqr/500-4        99562         93542         -6.05%
BenchmarkNatSqr/800-4        195554        184907        -5.44%
BenchmarkNatSqr/1000-4       286302        275053        -3.93%
BenchmarkNatSqr/10000-4      9817057       9441641       -3.82%
BenchmarkNatSqr/100000-4     390713416     379696789     -2.82%

benchmark                    old allocs     new allocs     delta
BenchmarkNatMul/10-4         1              1              +0.00%
BenchmarkNatMul/100-4        1              1              +0.00%
BenchmarkNatMul/1000-4       2              1              -50.00%
BenchmarkNatMul/10000-4      2              1              -50.00%
BenchmarkNatMul/100000-4     9              11             +22.22%
BenchmarkNatSqr/20-4         2              1              -50.00%
BenchmarkNatSqr/30-4         2              1              -50.00%
BenchmarkNatSqr/50-4         2              1              -50.00%
BenchmarkNatSqr/80-4         2              1              -50.00%
BenchmarkNatSqr/100-4        2              1              -50.00%
BenchmarkNatSqr/200-4        2              1              -50.00%
BenchmarkNatSqr/300-4        4              1              -75.00%
BenchmarkNatSqr/500-4        4              1              -75.00%
BenchmarkNatSqr/800-4        10             1              -90.00%
BenchmarkNatSqr/1000-4       10             1              -90.00%
BenchmarkNatSqr/10000-4      731            1              -99.86%
BenchmarkNatSqr/100000-4     19687          6              -99.97%

benchmark                    old bytes     new bytes     delta
BenchmarkNatMul/10-4         192           192           +0.00%
BenchmarkNatMul/100-4        4864          4864          +0.00%
BenchmarkNatMul/1000-4       57344         49224         -14.16%
BenchmarkNatMul/10000-4      565248        498772        -11.76%
BenchmarkNatMul/100000-4     5749504       7263720       +26.34%
BenchmarkNatSqr/20-4         672           352           -47.62%
BenchmarkNatSqr/30-4         992           512           -48.39%
BenchmarkNatSqr/50-4         1792          896           -50.00%
BenchmarkNatSqr/80-4         2688          1408          -47.62%
BenchmarkNatSqr/100-4        3584          1792          -50.00%
BenchmarkNatSqr/200-4        6656          3456          -48.08%
BenchmarkNatSqr/300-4        24448         16387         -32.97%
BenchmarkNatSqr/500-4        36864         24591         -33.29%
BenchmarkNatSqr/800-4        69760         40981         -41.25%
BenchmarkNatSqr/1000-4       86016         49180         -42.82%
BenchmarkNatSqr/10000-4      2524800       487368        -80.70%
BenchmarkNatSqr/100000-4     68599808      5876581       -91.43%

Change-Id: I8e6e409ae1cb48be9d5aa9b5f428d6cbe487673a
Reviewed-on: https://go-review.googlesource.com/c/go/+/172017
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-10-25 03:14:39 +00:00
Robert Griesemer a52c0a1992 math/big: make Rat.Denom side-effect free
A Rat is represented via a quotient a/b where a and b are Int values.
To make it possible to use an uninitialized Rat value (with a and b
uninitialized and thus == 0), the implementation treats a 0 denominator
as 1.

Rat.Num and Rat.Denom return pointers to these values a and b. Because
b may be 0, Rat.Denom used to first initialize it to 1 and thus produce
an undesirable side-effect (by changing the Rat's denominator).

This CL changes Denom to return a new (not shared) *Int with value 1
in the rare case where the Rat was not initialized. This eliminates
the side effect and returns the correct denominator value.

While this is changing behavior of the API, the impact should now be
minor because together with (prior) CL https://golang.org/cl/202997,
which initializes Rats ASAP, Denom is unlikely used to access the
denominator of an uninitialized (and thus 0) Rat. Any operation that
will somehow set a Rat value will ensure that the denominator is not 0.

Fixes #33792.
Updates #3521.

Change-Id: I0bf15ac60513cf52162bfb62440817ba36f0c3fc
Reviewed-on: https://go-review.googlesource.com/c/go/+/203059
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-10-24 03:34:24 +00:00
Robert Griesemer 4412181e7c math/big: normalize unitialized denominators ASAP
A Rat is represented via a quotient a/b where a and b are Int values.
To make it possible to use an uninitialized Rat value (with a and b
uninitialized and thus == 0), the implementation treats a 0 denominator
as 1.

For each operation we check if the denominator is 0, and then treat
it as 1 (if necessary). Operations that create a new Rat result,
normalize that value such that a result denominator 1 is represened
as 0 again.

This CL changes this behavior slightly: 0 denominators are still
interpreted as 1, but whenever we (safely) can, we set an uninitialized
0 denominator to 1. This simplifies the code overall.

Also: Improved some doc strings.

Preparation for addressing issue #33792.

Updates #33792.

Change-Id: I3040587c8d0dad2e840022f96ca027d8470878a0
Reviewed-on: https://go-review.googlesource.com/c/go/+/202997
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-10-24 03:34:11 +00:00
Robert Griesemer 898f9db81f math/big: make Rat accessors safe for concurrent use
Do not modify the underlying Rat denominator when calling
one of the accessors Float32, Float64; verify that we don't
modify the Rat denominator when calling Inv, Sign, IsInt, Num.

Fixes #34919.
Reopens #33792.

Change-Id: Ife6d1252373f493a597398ee51e7b5695b708df5
Reviewed-on: https://go-review.googlesource.com/c/go/+/201205
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2019-10-15 23:31:58 +00:00
Brad Fitzpatrick 07b4abd62e all: remove the nacl port (part 2, amd64p32 + toolchain)
This is part two if the nacl removal. Part 1 was CL 199499.

This CL removes amd64p32 support, which might be useful in the future
if we implement the x32 ABI. It also removes the nacl bits in the
toolchain, and some remaining nacl bits.

Updates #30439

Change-Id: I2475d5bb066d1b474e00e40d95b520e7c2e286e1
Reviewed-on: https://go-review.googlesource.com/c/go/+/200077
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2019-10-09 22:34:34 +00:00
Robert Griesemer 770fac4586 math/big: avoid MinExp exponent wrap-around in 'x' Text format
Fixes #34343.

Change-Id: I74240c8f431f6596338633a86a7a5ee1fce70a65
Reviewed-on: https://go-review.googlesource.com/c/go/+/196057
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2019-09-18 04:48:34 +00:00
Ainar Garipov 0efbd10157 all: fix typos
Use the following (suboptimal) script to obtain a list of possible
typos:

  #!/usr/bin/env sh

  set -x

  git ls-files |\
    grep -e '\.\(c\|cc\|go\)$' |\
    xargs -n 1\
    awk\
    '/\/\// { gsub(/.*\/\//, ""); print; } /\/\*/, /\*\// { gsub(/.*\/\*/, ""); gsub(/\*\/.*/, ""); }' |\
    hunspell -d en_US -l |\
    grep '^[[:upper:]]\{0,1\}[[:lower:]]\{1,\}$' |\
    grep -v -e '^.\{1,4\}$' -e '^.\{16,\}$' |\
    sort -f |\
    uniq -c |\
    awk '$1 == 1 { print $2; }'

Then, go through the results manually and fix the most obvious typos in
the non-vendored code.

Change-Id: I3cb5830a176850e1a0584b8a40b47bde7b260eae
Reviewed-on: https://go-review.googlesource.com/c/go/+/193848
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-09-08 17:28:20 +00:00
peter zhang d5fe73393c math/big: fix a duplicate "the" in a comment
Change-Id: Ib637381ab8a12aeb798576b781e1b3c458ba812d
GitHub-Last-Rev: 12994496b6
GitHub-Pull-Request: golang/go#34017
Reviewed-on: https://go-review.googlesource.com/c/go/+/192877
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
2019-09-02 11:42:47 +00:00
Eric Lagergren 9dfa4cb026 math/big: document that Rat.Denom might modify the receiver
Fixes #33792

Change-Id: I306a95883c3db2d674d3294a6feb50adc50ee5d6
Reviewed-on: https://go-review.googlesource.com/c/go/+/192017
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-08-28 16:20:00 +00:00
Illya Yalovyy be452cea42 math/big: fast path for Cmp if same
math/big.Int Cmp method does not have a fast path for the case if x and y are the same.

Fixes #30856

Change-Id: Ia9a5b5f72db9d73af1b13ed6ac39ecff87d10393
Reviewed-on: https://go-review.googlesource.com/c/go/+/178957
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-08-28 15:08:00 +00:00
Russ Cox 06b0babf31 all: shorten some tests
Shorten some of the longest tests that run during all.bash.
Removes 7r 50u 21s from all.bash.

After this change, all.bash is under 5 minutes again on my laptop.

For #26473.

Change-Id: Ie0460aa935808d65460408feaed210fbaa1d5d79
Reviewed-on: https://go-review.googlesource.com/c/go/+/177559
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2019-05-22 12:54:00 +00:00
JT Olio 5a2da5624a math/big: stack allocate scaleDenom return value
benchmark             old ns/op     new ns/op     delta
BenchmarkRatCmp-4     154           77.9          -49.42%

Change-Id: I932710ad8b6905879e232168b1777927f86ba22a
Reviewed-on: https://go-review.googlesource.com/c/go/+/175460
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-05-08 17:11:57 +00:00
erifan01 503e6ccd74 math/big: fix the bug in assembly implementation of shlVU on arm64
For the case where the addresses of parameter z and x of the function
shlVU overlap and the address of z is greater than x, x (input value)
can be polluted during the calculation when the high words of x are
overlapped with the low words of z (output value).

Fixes #31084

Change-Id: I9bb0266a1d7856b8faa9a9b1975d6f57dece0479
Reviewed-on: https://go-review.googlesource.com/c/go/+/169780
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2019-05-08 01:29:00 +00:00
Brian Kessler 689ee112df math/big: document Int.String
Int.String had no documentation and the documentation for Int.Text
did not mention the handling of the nil pointer case.

Change-Id: I9f21921e431c948545b7cabc7829e4b4e574bbe9
Reviewed-on: https://go-review.googlesource.com/c/go/+/175118
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-05-03 03:25:26 +00:00
erifan01 d17d41e58d math/big: optimize mulAddVWW on arm64 for better performance
Unroll the cycle 4 times to reduce load overhead.

Benchmarks:
name                old time/op    new time/op    delta
MulAddVWW/1-8         15.9ns ± 0%    11.9ns ± 0%  -24.92%  (p=0.000 n=8+8)
MulAddVWW/2-8         16.1ns ± 0%    13.9ns ± 1%  -13.82%  (p=0.000 n=8+8)
MulAddVWW/3-8         18.9ns ± 0%    17.3ns ± 0%   -8.47%  (p=0.000 n=8+8)
MulAddVWW/4-8         21.7ns ± 0%    19.5ns ± 0%  -10.14%  (p=0.000 n=8+8)
MulAddVWW/5-8         25.1ns ± 0%    22.5ns ± 0%  -10.27%  (p=0.000 n=8+8)
MulAddVWW/10-8        41.6ns ± 0%    40.0ns ± 0%   -3.79%  (p=0.000 n=8+8)
MulAddVWW/100-8        368ns ± 0%     363ns ± 0%   -1.36%  (p=0.000 n=8+8)
MulAddVWW/1000-8      3.52µs ± 0%    3.52µs ± 0%   -0.14%  (p=0.000 n=8+8)
MulAddVWW/10000-8     35.1µs ± 0%    35.1µs ± 0%   -0.01%  (p=0.000 n=7+6)
MulAddVWW/100000-8     351µs ± 0%     351µs ± 0%   +0.15%  (p=0.038 n=8+8)

Change-Id: I052a4db286ac6e4f3293289c7e9a82027da0405e
Reviewed-on: https://go-review.googlesource.com/c/go/+/155780
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2019-04-22 14:45:16 +00:00
Robert Griesemer f8c6f986fd math/big: don't clobber shared underlying array in pow5 computation
Rearranged code slightly to make lifetime of underlying array of
pow5 more explicit in code.

Fixes #31184.

Change-Id: I063081f0e54097c499988d268a23813746592654
Reviewed-on: https://go-review.googlesource.com/c/go/+/170641
Reviewed-by: Filippo Valsorda <filippo@golang.org>
2019-04-15 17:48:21 +00:00
Neven Sajko 7756a72b35 all: change the old assembly style AX:CX to CX, AX
Assembly files with "/vendor/" or "testdata" in their paths were ignored.

Change-Id: I3882ff07eb4426abb9f8ee96f82dff73c81cd61f
GitHub-Last-Rev: 51ae8c324d
GitHub-Pull-Request: golang/go#31166
Reviewed-on: https://go-review.googlesource.com/c/go/+/170197
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-04-09 00:22:03 +00:00
Filippo Valsorda ead895688d math/big: do not panic in Exp when y < 0 and x doesn't have an inverse
If x does not have an inverse modulo m, and a negative exponent is used,
return nil just like ModInverse does now.

Change-Id: I8fa72f7a851e8cf77c5fab529ede88408740626f
Reviewed-on: https://go-review.googlesource.com/c/go/+/170757
Run-TryBot: Filippo Valsorda <filippo@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-04-04 23:02:09 +00:00
Neven Sajko 964fe4b80f math/big: simplify shlVU_g and shrVU_g
Rewrote a few lines to be more idiomatic/less assembly-ish.

Benchmarked with `go test -bench Float -tags math_big_pure_go`:

name                  old time/op    new time/op    delta
FloatString/100-8        751ns ± 0%     746ns ± 1%  -0.71%  (p=0.000 n=10+10)
FloatString/1000-8      22.9µs ± 0%    22.9µs ± 0%    ~     (p=0.271 n=10+10)
FloatString/10000-8     1.89ms ± 0%    1.89ms ± 0%    ~     (p=0.481 n=10+10)
FloatString/100000-8     184ms ± 0%     184ms ± 0%    ~     (p=0.094 n=9+9)
FloatAdd/10-8           56.4ns ± 1%    56.5ns ± 0%    ~     (p=0.170 n=9+9)
FloatAdd/100-8          59.7ns ± 0%    59.3ns ± 0%  -0.70%  (p=0.000 n=8+9)
FloatAdd/1000-8          101ns ± 0%      99ns ± 0%  -1.89%  (p=0.000 n=8+8)
FloatAdd/10000-8         553ns ± 0%     536ns ± 0%  -3.00%  (p=0.000 n=9+10)
FloatAdd/100000-8       4.94µs ± 0%    4.74µs ± 0%  -3.94%  (p=0.000 n=9+10)
FloatSub/10-8           50.3ns ± 0%    50.5ns ± 0%  +0.52%  (p=0.000 n=8+8)
FloatSub/100-8          52.0ns ± 0%    52.2ns ± 1%  +0.46%  (p=0.012 n=8+10)
FloatSub/1000-8         77.9ns ± 0%    77.3ns ± 0%  -0.80%  (p=0.000 n=7+8)
FloatSub/10000-8         371ns ± 0%     362ns ± 0%  -2.67%  (p=0.000 n=10+10)
FloatSub/100000-8       3.20µs ± 0%    3.10µs ± 0%  -3.16%  (p=0.000 n=10+10)
ParseFloatSmallExp-8    7.84µs ± 0%    7.82µs ± 0%  -0.17%  (p=0.037 n=9+9)
ParseFloatLargeExp-8    29.3µs ± 1%    29.5µs ± 0%    ~     (p=0.059 n=9+8)
FloatSqrt/64-8           516ns ± 0%     519ns ± 0%  +0.54%  (p=0.000 n=9+9)
FloatSqrt/128-8         1.07µs ± 0%    1.07µs ± 0%    ~     (p=0.109 n=8+9)
FloatSqrt/256-8         1.23µs ± 0%    1.23µs ± 0%  +0.50%  (p=0.000 n=9+9)
FloatSqrt/1000-8        3.43µs ± 0%    3.44µs ± 0%  +0.53%  (p=0.000 n=9+8)
FloatSqrt/10000-8       40.9µs ± 0%    40.7µs ± 0%  -0.39%  (p=0.000 n=9+8)
FloatSqrt/100000-8      1.07ms ± 0%    1.07ms ± 0%  -0.10%  (p=0.017 n=10+9)
FloatSqrt/1000000-8     89.3ms ± 0%    89.2ms ± 0%  -0.07%  (p=0.015 n=9+8)

Change-Id: Ibf07c6142719d11bc7f329246957d87a9f3ba3d2
GitHub-Last-Rev: 870a041ab7
GitHub-Pull-Request: golang/go#31220
Reviewed-on: https://go-review.googlesource.com/c/go/+/170449
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-04-04 00:26:24 +00:00
Robert Griesemer 4dce6dbb1e math/big: temporarily disable buggy shlVU assembly for arm64
This addresses the failures we have seen in #31084. The correct
fix is to find the actual bug in the assembly code.

Updates #31084.

Change-Id: I437780c53d0c4423d742e2e3b650b899ce845372
Reviewed-on: https://go-review.googlesource.com/c/go/+/169721
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-03-27 23:40:12 +00:00
Brian Kessler 5ee2290420 math/big: implement Rat.SetUint64
Implemented via the underlying Int.SetUint64.
Added tests for Rat.SetInt64 and Rat.SetUint64.

Fixes #29579

Change-Id: I03faaffc93e36873b202b58ae72b139dea5c40f9
Reviewed-on: https://go-review.googlesource.com/c/go/+/160682
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-03-27 15:20:28 +00:00
Robert Griesemer e4ba40030f math/big: accept non-decimal floats with Rat.SetString
This fixes an old oversight. Rat.SetString already permitted
fractions a/b where both a and b could independently specify
a base prefix. With this CL, it now also accepts non-decimal
floating-point numbers.

Fixes #29799.

Change-Id: I9cc65666a5cebb00f0202da2e4fc5654a02e3234
Reviewed-on: https://go-review.googlesource.com/c/go/+/168237
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
2019-03-25 22:29:26 +00:00
Robert Griesemer cfa93ba51f math/big: add support for underscores '_' in numbers
The primary change is in nat.scan which now accepts underscores for base 0.
While at it, streamlined error handling in that function as well.
Also, improved the corresponding test significantly by checking the
expected result values also in case of scan errors.

The second major change is in scanExponent which now accepts underscores when
the new sepOk argument is set. While at it, essentially rewrote that
function to match error and underscore handling of nat.scan more closely.
Added a new test for scanExponent which until now was only tested
indirectly.

Finally, updated the documentation for several functions and added many
new test cases to clients of nat.scan.

A major portion of this CL is due to much better test coverage.

Updates #28493.

Change-Id: I7f17b361b633fbe6c798619d891bd5e0a045b5c5
Reviewed-on: https://go-review.googlesource.com/c/go/+/166157
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
2019-03-12 22:58:58 +00:00
Brian Kessler ef891e1c83 math/big: implement Int.TrailingZeroBits
Implemented via the underlying nat.trailingZeroBits.

Fixes #29578

Change-Id: If9876c5a74b107cbabceb7547bef4e44501f6745
Reviewed-on: https://go-review.googlesource.com/c/go/+/160681
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-03-12 13:18:27 +00:00
Josh Bleecher Snyder 4d10aba35e math/big: add fast path for amd64 addVW for large z
This matches the pure Go fast path added in the previous commit.

I will leave other architectures to those with ready access to hardware.

name            old time/op    new time/op    delta
AddVW/1-8         3.60ns ± 3%    3.59ns ± 1%      ~     (p=0.147 n=91+86)
AddVW/2-8         3.92ns ± 1%    3.91ns ± 2%    -0.36%  (p=0.000 n=86+92)
AddVW/3-8         4.33ns ± 5%    4.46ns ± 5%    +2.94%  (p=0.000 n=96+97)
AddVW/4-8         4.76ns ± 5%    4.82ns ± 5%    +1.28%  (p=0.000 n=95+92)
AddVW/5-8         5.40ns ± 1%    5.42ns ± 0%    +0.47%  (p=0.000 n=76+71)
AddVW/10-8        8.03ns ± 1%    7.80ns ± 5%    -2.90%  (p=0.000 n=73+96)
AddVW/100-8       43.8ns ± 5%    17.9ns ± 1%   -59.12%  (p=0.000 n=94+81)
AddVW/1000-8       428ns ± 4%      85ns ± 6%   -80.20%  (p=0.000 n=96+99)
AddVW/10000-8     4.22µs ± 2%    1.80µs ± 3%   -57.32%  (p=0.000 n=69+92)
AddVW/100000-8    44.8µs ± 8%    31.5µs ± 3%   -29.76%  (p=0.000 n=99+90)

name            old time/op    new time/op    delta
SubVW/1-8         3.53ns ± 2%    3.63ns ± 5%    +2.97%  (p=0.000 n=94+93)
SubVW/2-8         4.33ns ± 5%    4.01ns ± 2%    -7.36%  (p=0.000 n=90+85)
SubVW/3-8         4.32ns ± 2%    4.32ns ± 5%      ~     (p=0.084 n=87+97)
SubVW/4-8         4.70ns ± 2%    4.83ns ± 6%    +2.77%  (p=0.000 n=85+96)
SubVW/5-8         5.84ns ± 1%    5.35ns ± 1%    -8.35%  (p=0.000 n=87+87)
SubVW/10-8        8.01ns ± 4%    7.54ns ± 4%    -5.84%  (p=0.000 n=98+97)
SubVW/100-8       43.9ns ± 5%    17.9ns ± 1%   -59.20%  (p=0.000 n=98+76)
SubVW/1000-8       426ns ± 2%      85ns ± 3%   -80.13%  (p=0.000 n=90+98)
SubVW/10000-8     4.24µs ± 2%    1.81µs ± 3%   -57.28%  (p=0.000 n=74+91)
SubVW/100000-8    44.5µs ± 4%    31.5µs ± 2%   -29.33%  (p=0.000 n=84+91)

Change-Id: I10dd361cbaca22197c27e7734c0f50065292afbb
Reviewed-on: https://go-review.googlesource.com/c/go/+/164969
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-03-09 20:34:40 +00:00
Josh Bleecher Snyder fe24837c4d math/big: add fast path for pure Go addVW for large z
In the normal case, only a few words have to be updated when adding a word to a vector.
When that happens, we can simply copy the rest of the words, which is much faster.
However, the overhead of that makes it prohibitive for small vectors,
so we check the size at the beginning.

The implementation is a bit weird to allow addVW to continued to be inlined; see #30548.

The AddVW benchmarks are surprising, but fully repeatable.
The SubVW benchmarks are more or less as expected.
I expect that removing the indirect function call will
help both and make them a bit more normal.

name            old time/op    new time/op     delta
AddVW/1-8         4.27ns ± 2%     3.81ns ± 3%   -10.83%  (p=0.000 n=89+90)
AddVW/2-8         4.91ns ± 2%     4.34ns ± 1%   -11.60%  (p=0.000 n=83+90)
AddVW/3-8         5.77ns ± 4%     5.76ns ± 2%      ~     (p=0.365 n=91+87)
AddVW/4-8         6.03ns ± 1%     6.03ns ± 1%      ~     (p=0.392 n=80+76)
AddVW/5-8         6.48ns ± 2%     6.63ns ± 1%    +2.27%  (p=0.000 n=76+74)
AddVW/10-8        9.56ns ± 2%     9.56ns ± 1%    -0.02%  (p=0.002 n=69+76)
AddVW/100-8       90.6ns ± 0%     18.1ns ± 4%   -79.99%  (p=0.000 n=72+94)
AddVW/1000-8       865ns ± 0%       85ns ± 6%   -90.14%  (p=0.000 n=66+96)
AddVW/10000-8     8.57µs ± 2%     1.82µs ± 3%   -78.73%  (p=0.000 n=99+94)
AddVW/100000-8    84.4µs ± 2%     31.8µs ± 4%   -62.29%  (p=0.000 n=93+98)

name            old time/op    new time/op     delta
SubVW/1-8         3.90ns ± 2%     4.13ns ± 4%    +6.02%  (p=0.000 n=92+95)
SubVW/2-8         4.15ns ± 1%     5.20ns ± 1%   +25.22%  (p=0.000 n=83+85)
SubVW/3-8         5.50ns ± 2%     6.22ns ± 6%   +13.21%  (p=0.000 n=91+97)
SubVW/4-8         5.99ns ± 1%     6.63ns ± 1%   +10.63%  (p=0.000 n=79+61)
SubVW/5-8         6.75ns ± 4%     6.88ns ± 2%    +1.82%  (p=0.000 n=98+73)
SubVW/10-8        9.57ns ± 1%     9.56ns ± 1%    -0.13%  (p=0.000 n=77+64)
SubVW/100-8       90.3ns ± 1%     18.1ns ± 2%   -80.00%  (p=0.000 n=75+94)
SubVW/1000-8       860ns ± 4%       85ns ± 7%   -90.14%  (p=0.000 n=97+99)
SubVW/10000-8     8.51µs ± 3%     1.77µs ± 6%   -79.21%  (p=0.000 n=100+97)
SubVW/100000-8    84.4µs ± 3%     31.5µs ± 3%   -62.66%  (p=0.000 n=92+92)

Change-Id: I721d7031d40f245b4a284f5bdd93e7bb85e7e937
Reviewed-on: https://go-review.googlesource.com/c/go/+/164968
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-03-09 20:33:46 +00:00
Josh Bleecher Snyder 4c227a091e math/big: remove bounds checks in pure Go implementations
These routines are quite sensitive to BCE.

This change eliminates bounds checks from loops.
It does so at the cost of a bit of safety:
malformed input will now return incorrect answers
instead of panicking.

This isn't as bad as it sounds: math/big has very good
test coverage, and the alternative implementations are in
assembly, which could do much worse things with malformed input.

If the compiler's BCE improves, so could these routines.

Notable BCE improvements for these routines would be:

* Allowing and propagating more cross-slice length hints.
  Then hints like _ = y[:len(z)] would eliminate bounds checks for y[i].

* Propagating enough information so that we could do
  n := len(x)
  if len(z) < n {
    n = len(z)
  }
  and then have i < n eliminate the same bounds checks as
  i < len(x) && i < len(z) currently does.

* Providing some way to do BCE for unrolled loops.
  Now that we have math/bits implementations,
  it is possible to write things like ADC chains in
  pure Go, if you can reasonably unroll loops.

Benchmarks below are for amd64, using -tags=math_big_pure_go.

name            old time/op    new time/op    delta
AddVV/1-8         5.15ns ± 3%    4.65ns ± 4%   -9.81%  (p=0.000 n=93+86)
AddVV/2-8         6.40ns ± 2%    5.58ns ± 4%  -12.78%  (p=0.000 n=90+95)
AddVV/3-8         7.07ns ± 2%    6.66ns ± 2%   -5.88%  (p=0.000 n=87+83)
AddVV/4-8         7.94ns ± 5%    7.41ns ± 4%   -6.65%  (p=0.000 n=94+98)
AddVV/5-8         8.55ns ± 1%    8.80ns ± 0%   +2.92%  (p=0.000 n=87+92)
AddVV/10-8        12.7ns ± 1%    12.3ns ± 1%   -3.12%  (p=0.000 n=83+71)
AddVV/100-8        119ns ± 5%     117ns ± 4%   -1.60%  (p=0.000 n=93+90)
AddVV/1000-8      1.14µs ± 4%    1.14µs ± 5%     ~     (p=0.812 n=95+91)
AddVV/10000-8     11.4µs ± 5%    11.3µs ± 5%     ~     (p=0.503 n=97+96)
AddVV/100000-8     114µs ± 4%     113µs ± 5%   -0.98%  (p=0.002 n=97+90)

name            old time/op    new time/op    delta
SubVV/1-8         5.23ns ± 5%    4.65ns ± 3%  -11.18%  (p=0.000 n=89+91)
SubVV/2-8         6.49ns ± 5%    5.58ns ± 3%  -14.04%  (p=0.000 n=92+94)
SubVV/3-8         7.10ns ± 3%    6.65ns ± 2%   -6.28%  (p=0.000 n=87+80)
SubVV/4-8         8.04ns ± 1%    7.44ns ± 5%   -7.49%  (p=0.000 n=83+98)
SubVV/5-8         8.55ns ± 2%    8.32ns ± 1%   -2.75%  (p=0.000 n=84+92)
SubVV/10-8        12.7ns ± 1%    12.3ns ± 1%   -3.09%  (p=0.000 n=80+75)
SubVV/100-8        119ns ± 0%     116ns ± 3%   -1.83%  (p=0.000 n=87+98)
SubVV/1000-8      1.13µs ± 5%    1.13µs ± 3%     ~     (p=0.082 n=96+98)
SubVV/10000-8     11.2µs ± 1%    11.3µs ± 3%   +0.76%  (p=0.000 n=87+97)
SubVV/100000-8     112µs ± 2%     113µs ± 3%   +0.55%  (p=0.000 n=76+88)

name            old time/op    new time/op    delta
AddVW/1-8         4.30ns ± 4%    3.96ns ± 6%  -8.02%  (p=0.000 n=89+97)
AddVW/2-8         5.15ns ± 2%    4.91ns ± 1%  -4.56%  (p=0.000 n=87+80)
AddVW/3-8         5.59ns ± 3%    5.75ns ± 2%  +2.91%  (p=0.000 n=91+88)
AddVW/4-8         6.20ns ± 1%    6.03ns ± 1%  -2.71%  (p=0.000 n=75+90)
AddVW/5-8         6.93ns ± 3%    6.49ns ± 2%  -6.35%  (p=0.000 n=100+82)
AddVW/10-8        10.0ns ± 7%     9.6ns ± 0%  -4.02%  (p=0.000 n=98+74)
AddVW/100-8       91.1ns ± 1%    90.6ns ± 1%  -0.55%  (p=0.000 n=84+80)
AddVW/1000-8       866ns ± 1%     856ns ± 4%  -1.06%  (p=0.000 n=69+96)
AddVW/10000-8     8.64µs ± 1%    8.53µs ± 4%  -1.25%  (p=0.000 n=67+99)
AddVW/100000-8    84.3µs ± 2%    85.4µs ± 4%  +1.22%  (p=0.000 n=89+99)

name            old time/op    new time/op    delta
SubVW/1-8         4.28ns ± 2%    3.82ns ± 3%  -10.63%  (p=0.000 n=91+89)
SubVW/2-8         4.61ns ± 1%    4.48ns ± 3%   -2.67%  (p=0.000 n=94+96)
SubVW/3-8         5.54ns ± 1%    5.81ns ± 4%   +4.87%  (p=0.000 n=92+97)
SubVW/4-8         6.20ns ± 1%    6.08ns ± 2%   -1.99%  (p=0.000 n=71+88)
SubVW/5-8         6.91ns ± 3%    6.64ns ± 1%   -3.90%  (p=0.000 n=97+70)
SubVW/10-8        9.85ns ± 2%    9.62ns ± 0%   -2.31%  (p=0.000 n=82+62)
SubVW/100-8       91.1ns ± 1%    90.9ns ± 3%   -0.14%  (p=0.010 n=71+93)
SubVW/1000-8       859ns ± 3%     867ns ± 1%   +0.98%  (p=0.000 n=99+78)
SubVW/10000-8     8.54µs ± 5%    8.57µs ± 2%   +0.38%  (p=0.007 n=98+92)
SubVW/100000-8    84.5µs ± 3%    84.6µs ± 3%     ~     (p=0.334 n=95+94)

name                old time/op    new time/op    delta
AddMulVVW/1-8         5.43ns ± 3%    4.36ns ± 2%  -19.67%  (p=0.000 n=95+94)
AddMulVVW/2-8         6.56ns ± 4%    6.11ns ± 1%   -6.90%  (p=0.000 n=91+91)
AddMulVVW/3-8         8.00ns ± 1%    7.80ns ± 4%   -2.52%  (p=0.000 n=83+95)
AddMulVVW/4-8         9.81ns ± 2%    9.53ns ± 1%   -2.86%  (p=0.000 n=77+64)
AddMulVVW/5-8         11.4ns ± 3%    11.3ns ± 5%   -0.89%  (p=0.000 n=95+97)
AddMulVVW/10-8        18.9ns ± 5%    19.1ns ± 5%   +0.89%  (p=0.000 n=91+94)
AddMulVVW/100-8        165ns ± 5%     165ns ± 4%     ~     (p=0.427 n=97+98)
AddMulVVW/1000-8      1.56µs ± 3%    1.56µs ± 4%     ~     (p=0.167 n=98+96)
AddMulVVW/10000-8     15.7µs ± 5%    15.6µs ± 5%   -0.31%  (p=0.044 n=95+97)
AddMulVVW/100000-8     156µs ± 3%     157µs ± 8%     ~     (p=0.373 n=72+99)

Change-Id: Ibc720785d5b95f6a797103b1363843205f4d56bf
Reviewed-on: https://go-review.googlesource.com/c/go/+/164966
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-03-09 20:33:13 +00:00