Commit Graph

470 Commits

Author SHA1 Message Date
erifan01 5714c91b53 cmd/compile: intrinsify math/bits.Add64 for arm64
This CL instrinsifies Add64 with arm64 instruction sequence ADDS, ADCS
and ADC, and optimzes the case of carry chains.The CL also changes the
test code so that the intrinsic implementation can be tested.

Benchmarks:
name               old time/op       new time/op       delta
Add-224            2.500000ns +- 0%  2.090000ns +- 4%  -16.40%  (p=0.000 n=9+10)
Add32-224          2.500000ns +- 0%  2.500000ns +- 0%     ~     (all equal)
Add64-224          2.500000ns +- 0%  1.577778ns +- 2%  -36.89%  (p=0.000 n=10+9)
Add64multiple-224  6.000000ns +- 0%  2.000000ns +- 0%  -66.67%  (p=0.000 n=10+10)

Change-Id: I6ee91c9a85c16cc72ade5fd94868c579f16c7615
Reviewed-on: https://go-review.googlesource.com/c/go/+/159017
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2019-03-20 05:39:49 +00:00
David Chase 6535791385 math: fix math.Remainder(-x,x) (for Inf > x > 0)
Modify the |x| == |y| case to return -0 when x < 0.

Fixes #30814.

Change-Id: Ic4cd48001e0e894a12b5b813c6a1ddc3a055610b
Reviewed-on: https://go-review.googlesource.com/c/go/+/167479
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-03-15 14:52:51 +00:00
Robert Griesemer cfa93ba51f math/big: add support for underscores '_' in numbers
The primary change is in nat.scan which now accepts underscores for base 0.
While at it, streamlined error handling in that function as well.
Also, improved the corresponding test significantly by checking the
expected result values also in case of scan errors.

The second major change is in scanExponent which now accepts underscores when
the new sepOk argument is set. While at it, essentially rewrote that
function to match error and underscore handling of nat.scan more closely.
Added a new test for scanExponent which until now was only tested
indirectly.

Finally, updated the documentation for several functions and added many
new test cases to clients of nat.scan.

A major portion of this CL is due to much better test coverage.

Updates #28493.

Change-Id: I7f17b361b633fbe6c798619d891bd5e0a045b5c5
Reviewed-on: https://go-review.googlesource.com/c/go/+/166157
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
2019-03-12 22:58:58 +00:00
Brian Kessler ef891e1c83 math/big: implement Int.TrailingZeroBits
Implemented via the underlying nat.trailingZeroBits.

Fixes #29578

Change-Id: If9876c5a74b107cbabceb7547bef4e44501f6745
Reviewed-on: https://go-review.googlesource.com/c/go/+/160681
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-03-12 13:18:27 +00:00
Josh Bleecher Snyder 4d10aba35e math/big: add fast path for amd64 addVW for large z
This matches the pure Go fast path added in the previous commit.

I will leave other architectures to those with ready access to hardware.

name            old time/op    new time/op    delta
AddVW/1-8         3.60ns ± 3%    3.59ns ± 1%      ~     (p=0.147 n=91+86)
AddVW/2-8         3.92ns ± 1%    3.91ns ± 2%    -0.36%  (p=0.000 n=86+92)
AddVW/3-8         4.33ns ± 5%    4.46ns ± 5%    +2.94%  (p=0.000 n=96+97)
AddVW/4-8         4.76ns ± 5%    4.82ns ± 5%    +1.28%  (p=0.000 n=95+92)
AddVW/5-8         5.40ns ± 1%    5.42ns ± 0%    +0.47%  (p=0.000 n=76+71)
AddVW/10-8        8.03ns ± 1%    7.80ns ± 5%    -2.90%  (p=0.000 n=73+96)
AddVW/100-8       43.8ns ± 5%    17.9ns ± 1%   -59.12%  (p=0.000 n=94+81)
AddVW/1000-8       428ns ± 4%      85ns ± 6%   -80.20%  (p=0.000 n=96+99)
AddVW/10000-8     4.22µs ± 2%    1.80µs ± 3%   -57.32%  (p=0.000 n=69+92)
AddVW/100000-8    44.8µs ± 8%    31.5µs ± 3%   -29.76%  (p=0.000 n=99+90)

name            old time/op    new time/op    delta
SubVW/1-8         3.53ns ± 2%    3.63ns ± 5%    +2.97%  (p=0.000 n=94+93)
SubVW/2-8         4.33ns ± 5%    4.01ns ± 2%    -7.36%  (p=0.000 n=90+85)
SubVW/3-8         4.32ns ± 2%    4.32ns ± 5%      ~     (p=0.084 n=87+97)
SubVW/4-8         4.70ns ± 2%    4.83ns ± 6%    +2.77%  (p=0.000 n=85+96)
SubVW/5-8         5.84ns ± 1%    5.35ns ± 1%    -8.35%  (p=0.000 n=87+87)
SubVW/10-8        8.01ns ± 4%    7.54ns ± 4%    -5.84%  (p=0.000 n=98+97)
SubVW/100-8       43.9ns ± 5%    17.9ns ± 1%   -59.20%  (p=0.000 n=98+76)
SubVW/1000-8       426ns ± 2%      85ns ± 3%   -80.13%  (p=0.000 n=90+98)
SubVW/10000-8     4.24µs ± 2%    1.81µs ± 3%   -57.28%  (p=0.000 n=74+91)
SubVW/100000-8    44.5µs ± 4%    31.5µs ± 2%   -29.33%  (p=0.000 n=84+91)

Change-Id: I10dd361cbaca22197c27e7734c0f50065292afbb
Reviewed-on: https://go-review.googlesource.com/c/go/+/164969
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-03-09 20:34:40 +00:00
Josh Bleecher Snyder fe24837c4d math/big: add fast path for pure Go addVW for large z
In the normal case, only a few words have to be updated when adding a word to a vector.
When that happens, we can simply copy the rest of the words, which is much faster.
However, the overhead of that makes it prohibitive for small vectors,
so we check the size at the beginning.

The implementation is a bit weird to allow addVW to continued to be inlined; see #30548.

The AddVW benchmarks are surprising, but fully repeatable.
The SubVW benchmarks are more or less as expected.
I expect that removing the indirect function call will
help both and make them a bit more normal.

name            old time/op    new time/op     delta
AddVW/1-8         4.27ns ± 2%     3.81ns ± 3%   -10.83%  (p=0.000 n=89+90)
AddVW/2-8         4.91ns ± 2%     4.34ns ± 1%   -11.60%  (p=0.000 n=83+90)
AddVW/3-8         5.77ns ± 4%     5.76ns ± 2%      ~     (p=0.365 n=91+87)
AddVW/4-8         6.03ns ± 1%     6.03ns ± 1%      ~     (p=0.392 n=80+76)
AddVW/5-8         6.48ns ± 2%     6.63ns ± 1%    +2.27%  (p=0.000 n=76+74)
AddVW/10-8        9.56ns ± 2%     9.56ns ± 1%    -0.02%  (p=0.002 n=69+76)
AddVW/100-8       90.6ns ± 0%     18.1ns ± 4%   -79.99%  (p=0.000 n=72+94)
AddVW/1000-8       865ns ± 0%       85ns ± 6%   -90.14%  (p=0.000 n=66+96)
AddVW/10000-8     8.57µs ± 2%     1.82µs ± 3%   -78.73%  (p=0.000 n=99+94)
AddVW/100000-8    84.4µs ± 2%     31.8µs ± 4%   -62.29%  (p=0.000 n=93+98)

name            old time/op    new time/op     delta
SubVW/1-8         3.90ns ± 2%     4.13ns ± 4%    +6.02%  (p=0.000 n=92+95)
SubVW/2-8         4.15ns ± 1%     5.20ns ± 1%   +25.22%  (p=0.000 n=83+85)
SubVW/3-8         5.50ns ± 2%     6.22ns ± 6%   +13.21%  (p=0.000 n=91+97)
SubVW/4-8         5.99ns ± 1%     6.63ns ± 1%   +10.63%  (p=0.000 n=79+61)
SubVW/5-8         6.75ns ± 4%     6.88ns ± 2%    +1.82%  (p=0.000 n=98+73)
SubVW/10-8        9.57ns ± 1%     9.56ns ± 1%    -0.13%  (p=0.000 n=77+64)
SubVW/100-8       90.3ns ± 1%     18.1ns ± 2%   -80.00%  (p=0.000 n=75+94)
SubVW/1000-8       860ns ± 4%       85ns ± 7%   -90.14%  (p=0.000 n=97+99)
SubVW/10000-8     8.51µs ± 3%     1.77µs ± 6%   -79.21%  (p=0.000 n=100+97)
SubVW/100000-8    84.4µs ± 3%     31.5µs ± 3%   -62.66%  (p=0.000 n=92+92)

Change-Id: I721d7031d40f245b4a284f5bdd93e7bb85e7e937
Reviewed-on: https://go-review.googlesource.com/c/go/+/164968
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-03-09 20:33:46 +00:00
Josh Bleecher Snyder 4c227a091e math/big: remove bounds checks in pure Go implementations
These routines are quite sensitive to BCE.

This change eliminates bounds checks from loops.
It does so at the cost of a bit of safety:
malformed input will now return incorrect answers
instead of panicking.

This isn't as bad as it sounds: math/big has very good
test coverage, and the alternative implementations are in
assembly, which could do much worse things with malformed input.

If the compiler's BCE improves, so could these routines.

Notable BCE improvements for these routines would be:

* Allowing and propagating more cross-slice length hints.
  Then hints like _ = y[:len(z)] would eliminate bounds checks for y[i].

* Propagating enough information so that we could do
  n := len(x)
  if len(z) < n {
    n = len(z)
  }
  and then have i < n eliminate the same bounds checks as
  i < len(x) && i < len(z) currently does.

* Providing some way to do BCE for unrolled loops.
  Now that we have math/bits implementations,
  it is possible to write things like ADC chains in
  pure Go, if you can reasonably unroll loops.

Benchmarks below are for amd64, using -tags=math_big_pure_go.

name            old time/op    new time/op    delta
AddVV/1-8         5.15ns ± 3%    4.65ns ± 4%   -9.81%  (p=0.000 n=93+86)
AddVV/2-8         6.40ns ± 2%    5.58ns ± 4%  -12.78%  (p=0.000 n=90+95)
AddVV/3-8         7.07ns ± 2%    6.66ns ± 2%   -5.88%  (p=0.000 n=87+83)
AddVV/4-8         7.94ns ± 5%    7.41ns ± 4%   -6.65%  (p=0.000 n=94+98)
AddVV/5-8         8.55ns ± 1%    8.80ns ± 0%   +2.92%  (p=0.000 n=87+92)
AddVV/10-8        12.7ns ± 1%    12.3ns ± 1%   -3.12%  (p=0.000 n=83+71)
AddVV/100-8        119ns ± 5%     117ns ± 4%   -1.60%  (p=0.000 n=93+90)
AddVV/1000-8      1.14µs ± 4%    1.14µs ± 5%     ~     (p=0.812 n=95+91)
AddVV/10000-8     11.4µs ± 5%    11.3µs ± 5%     ~     (p=0.503 n=97+96)
AddVV/100000-8     114µs ± 4%     113µs ± 5%   -0.98%  (p=0.002 n=97+90)

name            old time/op    new time/op    delta
SubVV/1-8         5.23ns ± 5%    4.65ns ± 3%  -11.18%  (p=0.000 n=89+91)
SubVV/2-8         6.49ns ± 5%    5.58ns ± 3%  -14.04%  (p=0.000 n=92+94)
SubVV/3-8         7.10ns ± 3%    6.65ns ± 2%   -6.28%  (p=0.000 n=87+80)
SubVV/4-8         8.04ns ± 1%    7.44ns ± 5%   -7.49%  (p=0.000 n=83+98)
SubVV/5-8         8.55ns ± 2%    8.32ns ± 1%   -2.75%  (p=0.000 n=84+92)
SubVV/10-8        12.7ns ± 1%    12.3ns ± 1%   -3.09%  (p=0.000 n=80+75)
SubVV/100-8        119ns ± 0%     116ns ± 3%   -1.83%  (p=0.000 n=87+98)
SubVV/1000-8      1.13µs ± 5%    1.13µs ± 3%     ~     (p=0.082 n=96+98)
SubVV/10000-8     11.2µs ± 1%    11.3µs ± 3%   +0.76%  (p=0.000 n=87+97)
SubVV/100000-8     112µs ± 2%     113µs ± 3%   +0.55%  (p=0.000 n=76+88)

name            old time/op    new time/op    delta
AddVW/1-8         4.30ns ± 4%    3.96ns ± 6%  -8.02%  (p=0.000 n=89+97)
AddVW/2-8         5.15ns ± 2%    4.91ns ± 1%  -4.56%  (p=0.000 n=87+80)
AddVW/3-8         5.59ns ± 3%    5.75ns ± 2%  +2.91%  (p=0.000 n=91+88)
AddVW/4-8         6.20ns ± 1%    6.03ns ± 1%  -2.71%  (p=0.000 n=75+90)
AddVW/5-8         6.93ns ± 3%    6.49ns ± 2%  -6.35%  (p=0.000 n=100+82)
AddVW/10-8        10.0ns ± 7%     9.6ns ± 0%  -4.02%  (p=0.000 n=98+74)
AddVW/100-8       91.1ns ± 1%    90.6ns ± 1%  -0.55%  (p=0.000 n=84+80)
AddVW/1000-8       866ns ± 1%     856ns ± 4%  -1.06%  (p=0.000 n=69+96)
AddVW/10000-8     8.64µs ± 1%    8.53µs ± 4%  -1.25%  (p=0.000 n=67+99)
AddVW/100000-8    84.3µs ± 2%    85.4µs ± 4%  +1.22%  (p=0.000 n=89+99)

name            old time/op    new time/op    delta
SubVW/1-8         4.28ns ± 2%    3.82ns ± 3%  -10.63%  (p=0.000 n=91+89)
SubVW/2-8         4.61ns ± 1%    4.48ns ± 3%   -2.67%  (p=0.000 n=94+96)
SubVW/3-8         5.54ns ± 1%    5.81ns ± 4%   +4.87%  (p=0.000 n=92+97)
SubVW/4-8         6.20ns ± 1%    6.08ns ± 2%   -1.99%  (p=0.000 n=71+88)
SubVW/5-8         6.91ns ± 3%    6.64ns ± 1%   -3.90%  (p=0.000 n=97+70)
SubVW/10-8        9.85ns ± 2%    9.62ns ± 0%   -2.31%  (p=0.000 n=82+62)
SubVW/100-8       91.1ns ± 1%    90.9ns ± 3%   -0.14%  (p=0.010 n=71+93)
SubVW/1000-8       859ns ± 3%     867ns ± 1%   +0.98%  (p=0.000 n=99+78)
SubVW/10000-8     8.54µs ± 5%    8.57µs ± 2%   +0.38%  (p=0.007 n=98+92)
SubVW/100000-8    84.5µs ± 3%    84.6µs ± 3%     ~     (p=0.334 n=95+94)

name                old time/op    new time/op    delta
AddMulVVW/1-8         5.43ns ± 3%    4.36ns ± 2%  -19.67%  (p=0.000 n=95+94)
AddMulVVW/2-8         6.56ns ± 4%    6.11ns ± 1%   -6.90%  (p=0.000 n=91+91)
AddMulVVW/3-8         8.00ns ± 1%    7.80ns ± 4%   -2.52%  (p=0.000 n=83+95)
AddMulVVW/4-8         9.81ns ± 2%    9.53ns ± 1%   -2.86%  (p=0.000 n=77+64)
AddMulVVW/5-8         11.4ns ± 3%    11.3ns ± 5%   -0.89%  (p=0.000 n=95+97)
AddMulVVW/10-8        18.9ns ± 5%    19.1ns ± 5%   +0.89%  (p=0.000 n=91+94)
AddMulVVW/100-8        165ns ± 5%     165ns ± 4%     ~     (p=0.427 n=97+98)
AddMulVVW/1000-8      1.56µs ± 3%    1.56µs ± 4%     ~     (p=0.167 n=98+96)
AddMulVVW/10000-8     15.7µs ± 5%    15.6µs ± 5%   -0.31%  (p=0.044 n=95+97)
AddMulVVW/100000-8     156µs ± 3%     157µs ± 8%     ~     (p=0.373 n=72+99)

Change-Id: Ibc720785d5b95f6a797103b1363843205f4d56bf
Reviewed-on: https://go-review.googlesource.com/c/go/+/164966
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-03-09 20:33:13 +00:00
Robert Griesemer 129c6e4496 math/big: support new octal prefixes 0o and 0O
This CL extends the various SetString and Parse methods for
Ints, Rats, and Floats to accept the new octal prefixes.

The main change is in natconv.go, all other changes are
documentation and test updates.

Finally, this CL also fixes TestRatSetString which silently
dropped certain failures.

Updates #12711.

Change-Id: I5ee5879e25013ba1e6eda93ff280915f25ab5d55
Reviewed-on: https://go-review.googlesource.com/c/go/+/165898
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
2019-03-07 21:13:57 +00:00
Josh Bleecher Snyder d5edbcac98 math/big: rewrite pure Go implementations to use math/bits
While we're here, delete addWW_g and subWW_g, per the TODO.
They are now obsolete.

Benchmarks on amd64 with -tags=math_big_pure_go.

name                old time/op    new time/op     delta
AddVV/1-8             5.24ns ± 2%     5.12ns ± 1%    -2.11%  (p=0.000 n=82+87)
AddVV/2-8             6.44ns ± 1%     6.33ns ± 2%    -1.82%  (p=0.000 n=77+82)
AddVV/3-8             7.89ns ± 8%     6.97ns ± 4%   -11.71%  (p=0.000 n=100+96)
AddVV/4-8             8.60ns ± 0%     7.72ns ± 4%   -10.24%  (p=0.000 n=90+96)
AddVV/5-8             10.3ns ± 4%      8.5ns ± 1%   -17.02%  (p=0.000 n=96+91)
AddVV/10-8            16.2ns ± 5%     12.8ns ± 1%   -21.11%  (p=0.000 n=97+86)
AddVV/100-8            148ns ± 1%      117ns ± 5%   -21.07%  (p=0.000 n=66+98)
AddVV/1000-8          1.41µs ± 4%     1.13µs ± 3%   -19.90%  (p=0.000 n=97+97)
AddVV/10000-8         14.2µs ± 5%     11.2µs ± 1%   -20.82%  (p=0.000 n=99+84)
AddVV/100000-8         142µs ± 4%      113µs ± 4%   -20.40%  (p=0.000 n=91+92)
SubVV/1-8             5.29ns ± 1%     5.11ns ± 0%    -3.30%  (p=0.000 n=87+88)
SubVV/2-8             6.36ns ± 4%     6.33ns ± 2%    -0.56%  (p=0.002 n=98+73)
SubVV/3-8             7.58ns ± 5%     6.98ns ± 4%    -8.01%  (p=0.000 n=97+91)
SubVV/4-8             8.61ns ± 3%     7.98ns ± 2%    -7.31%  (p=0.000 n=95+83)
SubVV/5-8             10.6ns ± 2%      8.5ns ± 1%   -19.56%  (p=0.000 n=79+89)
SubVV/10-8            16.3ns ± 4%     12.7ns ± 1%   -21.97%  (p=0.000 n=98+82)
SubVV/100-8            124ns ± 1%      118ns ± 1%    -4.83%  (p=0.000 n=85+81)
SubVV/1000-8          1.14µs ± 5%     1.12µs ± 2%    -1.17%  (p=0.000 n=97+81)
SubVV/10000-8         11.6µs ±10%     11.2µs ± 1%    -3.39%  (p=0.000 n=100+84)
SubVV/100000-8         114µs ± 6%      114µs ± 5%      ~     (p=0.396 n=83+94)
AddVW/1-8             4.04ns ± 4%     4.34ns ± 4%    +7.57%  (p=0.000 n=96+98)
AddVW/2-8             4.34ns ± 5%     4.40ns ± 5%    +1.40%  (p=0.000 n=99+98)
AddVW/3-8             5.43ns ± 0%     5.54ns ± 2%    +1.97%  (p=0.000 n=85+94)
AddVW/4-8             6.23ns ± 1%     6.18ns ± 2%    -0.66%  (p=0.000 n=77+78)
AddVW/5-8             6.78ns ± 2%     6.90ns ± 4%    +1.77%  (p=0.000 n=80+99)
AddVW/10-8            10.5ns ± 4%      9.9ns ± 1%    -5.77%  (p=0.000 n=97+69)
AddVW/100-8            114ns ± 3%       91ns ± 0%   -20.38%  (p=0.000 n=98+77)
AddVW/1000-8          1.12µs ± 1%     0.87µs ± 1%   -22.80%  (p=0.000 n=82+68)
AddVW/10000-8         11.2µs ± 2%      8.5µs ± 5%   -23.85%  (p=0.000 n=85+100)
AddVW/100000-8         112µs ± 2%       85µs ± 5%   -24.22%  (p=0.000 n=71+96)
SubVW/1-8             4.09ns ± 2%     4.18ns ± 4%    +2.32%  (p=0.000 n=78+96)
SubVW/2-8             4.59ns ± 5%     4.52ns ± 7%    -1.54%  (p=0.000 n=98+94)
SubVW/3-8             5.41ns ±10%     5.55ns ± 1%    +2.48%  (p=0.000 n=100+89)
SubVW/4-8             6.51ns ± 2%     6.19ns ± 0%    -4.85%  (p=0.000 n=97+81)
SubVW/5-8             7.25ns ± 3%     6.90ns ± 4%    -4.93%  (p=0.000 n=97+96)
SubVW/10-8            10.6ns ± 4%      9.8ns ± 2%    -7.32%  (p=0.000 n=95+96)
SubVW/100-8           90.4ns ± 0%     90.8ns ± 0%    +0.43%  (p=0.000 n=83+78)
SubVW/1000-8           853ns ± 4%      857ns ± 2%    +0.42%  (p=0.000 n=100+98)
SubVW/10000-8         8.52µs ± 4%     8.53µs ± 2%      ~     (p=0.061 n=99+97)
SubVW/100000-8        84.8µs ± 5%     84.2µs ± 2%    -0.78%  (p=0.000 n=99+93)
AddMulVVW/1-8         8.73ns ± 0%     5.33ns ± 3%   -38.91%  (p=0.000 n=91+96)
AddMulVVW/2-8         14.8ns ± 3%      6.5ns ± 2%   -56.33%  (p=0.000 n=100+79)
AddMulVVW/3-8         18.6ns ± 2%      7.8ns ± 5%   -57.84%  (p=0.000 n=89+96)
AddMulVVW/4-8         24.0ns ± 2%      9.8ns ± 0%   -59.09%  (p=0.000 n=95+67)
AddMulVVW/5-8         29.0ns ± 2%     11.5ns ± 5%   -60.44%  (p=0.000 n=90+97)
AddMulVVW/10-8        54.1ns ± 0%     18.8ns ± 1%   -65.37%  (p=0.000 n=82+84)
AddMulVVW/100-8        508ns ± 2%      165ns ± 4%   -67.62%  (p=0.000 n=72+98)
AddMulVVW/1000-8      4.96µs ± 3%     1.55µs ± 1%   -68.86%  (p=0.000 n=99+91)
AddMulVVW/10000-8     50.0µs ± 4%     15.5µs ± 4%   -68.95%  (p=0.000 n=97+97)
AddMulVVW/100000-8     491µs ± 1%      156µs ± 8%   -68.22%  (p=0.000 n=79+95)

Change-Id: I4c6ae0b4065f371aea8103f6a85d9e9274bf01d0
Reviewed-on: https://go-review.googlesource.com/c/go/+/164965
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-03-04 20:49:12 +00:00
Josh Bleecher Snyder 87cc56718a math/big: optimize shlVU_g and shrVU_g
Special case shifts by zero.
Provide hints to the compiler that shifts are bounded.

There are no existing benchmarks for shifts,
but the Float implementation uses shifts,
so we can use those.

Benchmarks on amd64 with -tags=math_big_pure_go.

name                  old time/op    new time/op    delta
FloatString/100-8        869ns ± 3%     872ns ± 4%   +0.40%  (p=0.001 n=94+83)
FloatString/1000-8      26.5µs ± 1%    26.4µs ± 1%   -0.46%  (p=0.000 n=87+96)
FloatString/10000-8     2.18ms ± 2%    2.18ms ± 2%     ~     (p=0.687 n=90+89)
FloatString/100000-8     200ms ± 7%     197ms ± 5%   -1.47%  (p=0.000 n=100+90)
FloatAdd/10-8           65.9ns ± 4%    64.0ns ± 4%   -2.94%  (p=0.000 n=92+93)
FloatAdd/100-8          71.3ns ± 4%    67.4ns ± 4%   -5.51%  (p=0.000 n=96+93)
FloatAdd/1000-8          128ns ± 1%     121ns ± 0%   -5.69%  (p=0.000 n=91+80)
FloatAdd/10000-8         718ns ± 4%     626ns ± 4%  -12.83%  (p=0.000 n=99+99)
FloatAdd/100000-8       6.43µs ± 3%    5.50µs ± 1%  -14.50%  (p=0.000 n=98+83)
FloatSub/10-8           57.7ns ± 2%    57.0ns ± 4%   -1.20%  (p=0.000 n=89+96)
FloatSub/100-8          59.9ns ± 3%    58.7ns ± 4%   -2.10%  (p=0.000 n=100+98)
FloatSub/1000-8         94.5ns ± 1%    88.6ns ± 0%   -6.16%  (p=0.000 n=74+70)
FloatSub/10000-8         456ns ± 1%     416ns ± 5%   -8.83%  (p=0.000 n=87+95)
FloatSub/100000-8       4.00µs ± 1%    3.57µs ± 1%  -10.87%  (p=0.000 n=68+85)
FloatSqrt/64-8           585ns ± 1%     579ns ± 1%   -0.99%  (p=0.000 n=92+90)
FloatSqrt/128-8         1.26µs ± 1%    1.23µs ± 2%   -2.42%  (p=0.000 n=91+81)
FloatSqrt/256-8         1.45µs ± 3%    1.40µs ± 1%   -3.61%  (p=0.000 n=96+90)
FloatSqrt/1000-8        4.03µs ± 1%    3.91µs ± 1%   -3.05%  (p=0.000 n=90+93)
FloatSqrt/10000-8       48.0µs ± 0%    47.3µs ± 1%   -1.55%  (p=0.000 n=90+90)
FloatSqrt/100000-8      1.23ms ± 3%    1.22ms ± 4%   -1.00%  (p=0.000 n=99+99)
FloatSqrt/1000000-8     96.7ms ± 4%    98.0ms ±10%     ~     (p=0.322 n=89+99)

Change-Id: I0f941c05b7c324256d7f0674559b6ba906e92ba8
Reviewed-on: https://go-review.googlesource.com/c/go/+/164967
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-03-04 19:30:57 +00:00
Juraj Sukop 1d992f2e36 math/big: better initial guess for nat.sqrt
The proposed change introduces a better initial guess which is closer to the final value and therefore converges in fewer steps. Consider for example sqrt(8): previously the guess was 8, whereas now it is 4 (and the result is 2). All this change does is it computes the division by two more accurately while it keeps the guess ≥ √x.

Change-Id: I917248d734a7b0488d14a647a063f674e56c4e30
GitHub-Last-Rev: c06d9d4876
GitHub-Pull-Request: golang/go#28981
Reviewed-on: https://go-review.googlesource.com/c/163866
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-02-27 18:48:56 +00:00
Bryan C. Mills bd98628676 math/cmplx: avoid panic in Pow(x, NaN())
Fixes #30088

Change-Id: I08cec17feddc86bd08532e6b135807e3c8f4c1b2
Reviewed-on: https://go-review.googlesource.com/c/161197
Run-TryBot: Bryan C. Mills <bcmills@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-02-27 14:01:03 +00:00
Brian Kessler a73abca37b math/big: handle alias of cofactor inputs in GCD
If the variables passed in to the cofactor arguments of GCD (x, y)
aliased the input arguments (a, b), the previous implementation would
result in incorrect results for y.  This change reorganizes the calculation
so that the only case that need to be handled is when y aliases b, which
can be handled with a simple check.

Tests were added for all of the alias cases for input arguments and and
and irrelevant test case for a previous binary GCD calculation was dropped.

Fixes #30217

Change-Id: Ibe6137f09b3e1ae3c29e3c97aba85b67f33dc169
Reviewed-on: https://go-review.googlesource.com/c/162517
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-02-27 00:11:17 +00:00
Russ Cox d6311ff1e4 math/big: add %#b and %O integer formats
Matching fmt, %#b now prints an 0b prefix,
and %O prints octal with an 0o prefix.

See golang.org/design/19308-number-literals for background.

For #19308.
For #12711.

Change-Id: I139c5a9a1dfae15415621601edfa13c6a5f19cfc
Reviewed-on: https://go-review.googlesource.com/c/160250
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-02-26 19:39:19 +00:00
Russ Cox 675503c507 math/big: add %x float format
big.Float already had %p for printing hex format,
but that format normalizes differently from fmt's %x
and ignores precision entirely.

This CL adds %x to big.Float, matching fmt's behavior:
the verb is spelled 'x' not 'p', the mantissa is normalized
to [1, 2), and precision is respected.

See golang.org/design/19308-number-literals for background.

For #29008.

Change-Id: I9c1b9612107094856797e5b0b584c556c1914895
Reviewed-on: https://go-review.googlesource.com/c/160249
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-02-26 19:39:11 +00:00
Michael Munday 42a82ce1a7 math/bits: optimize Reverse32 and Reverse64
Use ReverseBytes32 and ReverseBytes64 to speed up these functions.
The byte reversal functions are intrinsics on most platforms and
generally compile to a single instruction.

name       old time/op  new time/op  delta
Reverse32  2.41ns ± 1%  1.94ns ± 3%  -19.60%  (p=0.000 n=20+19)
Reverse64  3.85ns ± 1%  2.56ns ± 1%  -33.32%  (p=0.000 n=17+19)

Change-Id: I160bf59a0c7bd5db94114803ec5a59fae448f096
Reviewed-on: https://go-review.googlesource.com/c/159358
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-02-26 17:52:08 +00:00
Robert Griesemer fae44a2be3 src, misc: apply gofmt
This applies the new gofmt literal normalizations to the library.

Change-Id: I8c1e8ef62eb556fc568872c9f77a31ef236348e7
Reviewed-on: https://go-review.googlesource.com/c/162539
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2019-02-19 20:38:28 +00:00
Russ Cox e2d87f2ca5 strconv: format hex floats
This CL updates FormatFloat to format
standard hexadecimal floating-point constants,
using the 'x' and 'X' verbs.

See golang.org/design/19308-number-literals for background.

For #29008.

Change-Id: I540b8f71d492cfdb7c58af533d357a564591f28b
Reviewed-on: https://go-review.googlesource.com/c/160242
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-02-12 14:48:22 +00:00
Robert Griesemer 7bc2aa670f math/big: permit upper-case 'P' binary exponent (not just 'p')
The current implementation accepted binary exponents but restricted
them to 'p'. This change permits both 'p' and 'P'.

R=Go1.13

Updates #29008.

Change-Id: I7a89ccb86af4438f17b0422be7cb630ffcf43272
Reviewed-on: https://go-review.googlesource.com/c/159297
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
2019-02-11 23:22:35 +00:00
Robert Griesemer 33caf3be83 math/big: document that Rat.SetString accepts _decimal_ float representations
Updates #29799.

Change-Id: I267c2c3ba3964e96903954affc248d0c52c4916c
Reviewed-on: https://go-review.googlesource.com/c/158397
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-01-17 23:04:06 +00:00
Brian Kessler 649294d0a5 math: fix ternary correction statement in Log1p
The original port of Log1p incorrectly translated a ternary statement
so that a correction was only applied to one of the branches.

Fixes #29488

Change-Id: I035b2fc741f76fe7c0154c63da6e298b575e08a4
Reviewed-on: https://go-review.googlesource.com/c/156120
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Katie Hockman <katie@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-01-07 18:57:45 +00:00
Will Beason bfaf11c158 math/big: fix incorrect comment variable reference
Fix comment as w&1 is the parity of 'x', not of 'n'.

Change-Id: Ia0e448f7e5896412ff9b164459ce15561ab624cc
GitHub-Last-Rev: 54ba08ab10
GitHub-Pull-Request: golang/go#29419
Reviewed-on: https://go-review.googlesource.com/c/155743
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-12-26 05:21:41 +00:00
Robert Griesemer 9ce38f570f math: don't run huge argument tests on s390x
The s390x implementations for Sin/Cos/SinCos/Tan use assembly
routines which don't reduce arguments accurately enough for
huge inputs.

Fixes #29221.

Change-Id: I340f576899d67bb52a553c3ab22e6464172c936d
Reviewed-on: https://go-review.googlesource.com/c/154119
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-12-13 22:13:57 +00:00
Brian Kessler 02ad841dd8 math: correct mPi4 comment
The previous comment mis-stated the number of bits in mPi4.
The correct value is 19*64 + 1 == 1217 bits.

Change-Id: Ife971ff6936ce2d5b81ce663ce48044749d592a0
Reviewed-on: https://go-review.googlesource.com/c/154017
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-12-13 20:35:45 +00:00
Robert Griesemer 944a9c7a4f math: use constant rather than variable for exported test threshold
This is a minor follow-up on https://golang.org/cl/153059.

TBR=iant

Updates #6794.

Change-Id: I03657dafc572959d46a03f86bbeb280825bc969d
Reviewed-on: https://go-review.googlesource.com/c/153845
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-12-13 06:33:18 +00:00
Brian Kessler 98521a5a8f math: implement trignometric range reduction for huge arguments
This change implements Payne-Hanek range reduction by Pi/4
to properly calculate trigonometric functions of huge arguments.

The implementation is based on:

"ARGUMENT REDUCTION FOR HUGE ARGUMENTS: Good to the Last Bit"
K. C. Ng et al, March 24, 1992

The major difference with the reference is that the simulated
multi-precision calculation of x*B is implemented using 64-bit
integer arithmetic rather than floating point to ease extraction
of the relevant bits of 4/Pi.

The assembly implementations for 386 were removed since the trigonometric
instructions only use a 66-bit representation of Pi internally for
reduction.  It is not possible to use these instructions and maintain
accuracy without a prior accurate reduction in software as recommended
by Intel.

Fixes #6794

Change-Id: I31bf1369e0578891d738c5473447fe9b10560196
Reviewed-on: https://go-review.googlesource.com/c/153059
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-12-13 06:01:42 +00:00
Alberto Donizetti 11ce6eabd6 math/bits: remove named return in TrailingZeros16
TrailingZeros16 is the only one of the TrailingZeros functions with a
named return value in the signature. This creates a sligthly
unpleasant effect in the godoc listing:

  func TrailingZeros(x uint) int
  func TrailingZeros16(x uint16) (n int)
  func TrailingZeros32(x uint32) int
  func TrailingZeros64(x uint64) int
  func TrailingZeros8(x uint8) int

Since the named return value is not even used, remove it.

Change-Id: I15c5aedb6157003911b6e0685c357ce56e466c0e
Reviewed-on: https://go-review.googlesource.com/c/153340
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-12-09 14:27:56 +00:00
Robert Griesemer 276870d6e0 math: document sign bit correspondence for floating-point/bits conversions
Fixes #27736.

Change-Id: Ibda7da7ec6e731626fc43abf3e8c1190117f7885
Reviewed-on: https://go-review.googlesource.com/c/153057
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-12-06 22:27:54 +00:00
Josh Bleecher Snyder bfc54bb6f3 math/big: allocate less for single-Word nats
For many uses of math/big, most numbers are small in practice.
Prior to this change, big.NewInt allocated a minimum of five Words:
one to hold the value, and four as extra capacity.
In most cases, this extra capacity is waste.
Worse, allocating a single Word uses a fast malloc path for tiny allocs;
allocating five Words is more expensive in CPU as well as memory.

This change is a simple fix: Treat a request for one Word at its word.
I experimented with more complicated fixes and did not find anything
that outperformed this easy fix.

On some real world programs, this is a clear win.

The compiler:

name        old alloc/op      new alloc/op      delta
Template         37.1MB ± 0%       37.0MB ± 0%  -0.23%  (p=0.008 n=5+5)
Unicode          29.2MB ± 0%       28.5MB ± 0%  -2.48%  (p=0.008 n=5+5)
GoTypes           133MB ± 0%        133MB ± 0%  -0.05%  (p=0.008 n=5+5)
Compiler          628MB ± 0%        628MB ± 0%  -0.06%  (p=0.008 n=5+5)
SSA              2.04GB ± 0%       2.03GB ± 0%  -0.14%  (p=0.008 n=5+5)
Flate            24.7MB ± 0%       24.6MB ± 0%  -0.23%  (p=0.008 n=5+5)
GoParser         29.6MB ± 0%       29.6MB ± 0%  -0.07%  (p=0.008 n=5+5)
Reflect          82.3MB ± 0%       82.2MB ± 0%  -0.05%  (p=0.008 n=5+5)
Tar              36.2MB ± 0%       36.2MB ± 0%  -0.12%  (p=0.008 n=5+5)
XML              49.5MB ± 0%       49.4MB ± 0%  -0.23%  (p=0.008 n=5+5)
[Geo mean]       85.1MB            84.8MB       -0.37%

name        old allocs/op     new allocs/op     delta
Template           364k ± 0%         364k ± 0%    ~     (p=0.476 n=5+5)
Unicode            341k ± 0%         341k ± 0%    ~     (p=0.690 n=5+5)
GoTypes           1.37M ± 0%        1.37M ± 0%    ~     (p=0.444 n=5+5)
Compiler          5.50M ± 0%        5.50M ± 0%  +0.02%  (p=0.008 n=5+5)
SSA               16.0M ± 0%        16.0M ± 0%  +0.01%  (p=0.008 n=5+5)
Flate              238k ± 0%         238k ± 0%    ~     (p=0.222 n=5+5)
GoParser           305k ± 0%         305k ± 0%    ~     (p=0.841 n=5+5)
Reflect            976k ± 0%         976k ± 0%    ~     (p=0.222 n=5+5)
Tar                354k ± 0%         354k ± 0%    ~     (p=0.103 n=5+5)
XML                450k ± 0%         450k ± 0%    ~     (p=0.151 n=5+5)
[Geo mean]         837k              837k       +0.01%

go.skylark.net (at ea6d2813de75ded8d157b9540bc3d3ad0b688623):

name                     old alloc/op   new alloc/op   delta
Hashtable-8                 456kB ± 0%     299kB ± 0%  -34.33%  (p=0.000 n=9+9)
/bench_builtin_method-8     220kB ± 0%     190kB ± 0%  -13.55%  (p=0.000 n=9+10)

name                     old allocs/op  new allocs/op  delta
Hashtable-8                 7.84k ± 0%     7.84k ± 0%     ~     (all equal)
/bench_builtin_method-8     7.49k ± 0%     7.49k ± 0%     ~     (all equal)

The math/big benchmarks are messy, which is predictable, since they
naturally exercise the bigger-than-one-word code more.
Also worth noting is that many of the benchmarks have very high variance.
I've omitted the opVV and opVW benchmarks, as they are unrelated.

name                              old time/op    new time/op    delta
DecimalConversion-8                 92.5µs ± 1%    90.6µs ± 0%   -2.12%  (p=0.000 n=17+19)
FloatString/100-8                    867ns ± 0%     871ns ± 0%   +0.50%  (p=0.000 n=18+18)
FloatString/1000-8                  26.4µs ± 1%    26.5µs ± 1%     ~     (p=0.396 n=20+19)
FloatString/10000-8                 2.15ms ± 2%    2.16ms ± 2%     ~     (p=0.089 n=19+20)
FloatString/100000-8                 209ms ± 1%     209ms ± 1%     ~     (p=0.583 n=19+19)
FloatAdd/10-8                       63.5ns ± 2%    64.1ns ± 6%     ~     (p=0.389 n=19+19)
FloatAdd/100-8                      66.0ns ± 2%    65.8ns ± 2%     ~     (p=0.825 n=20+20)
FloatAdd/1000-8                     93.9ns ± 1%    94.3ns ± 1%     ~     (p=0.273 n=19+20)
FloatAdd/10000-8                     347ns ± 2%     342ns ± 1%   -1.50%  (p=0.000 n=18+18)
FloatAdd/100000-8                   2.78µs ± 1%    2.78µs ± 2%     ~     (p=0.961 n=20+19)
FloatSub/10-8                       56.9ns ± 2%    57.8ns ± 3%   +1.59%  (p=0.001 n=19+19)
FloatSub/100-8                      58.2ns ± 2%    58.9ns ± 2%   +1.25%  (p=0.004 n=20+20)
FloatSub/1000-8                     74.9ns ± 1%    74.4ns ± 1%   -0.76%  (p=0.000 n=19+20)
FloatSub/10000-8                     223ns ± 1%     220ns ± 2%   -1.29%  (p=0.000 n=16+20)
FloatSub/100000-8                   1.66µs ± 1%    1.66µs ± 2%     ~     (p=0.147 n=20+20)
ParseFloatSmallExp-8                8.38µs ± 0%    8.59µs ± 0%   +2.48%  (p=0.000 n=19+19)
ParseFloatLargeExp-8                31.1µs ± 0%    32.0µs ± 0%   +3.04%  (p=0.000 n=16+17)
GCD10x10/WithoutXY-8                 115ns ± 1%      99ns ± 3%  -14.07%  (p=0.000 n=20+20)
GCD10x10/WithXY-8                    322ns ± 0%     312ns ± 0%   -3.11%  (p=0.000 n=18+13)
GCD10x100/WithoutXY-8                233ns ± 1%     219ns ± 1%   -5.73%  (p=0.000 n=19+17)
GCD10x100/WithXY-8                   709ns ± 0%     759ns ± 0%   +7.04%  (p=0.000 n=19+19)
GCD10x1000/WithoutXY-8               653ns ± 1%     642ns ± 1%   -1.69%  (p=0.000 n=17+20)
GCD10x1000/WithXY-8                 1.35µs ± 0%    1.35µs ± 1%     ~     (p=0.255 n=20+16)
GCD10x10000/WithoutXY-8             4.57µs ± 1%    4.61µs ± 1%   +0.95%  (p=0.000 n=18+17)
GCD10x10000/WithXY-8                6.82µs ± 0%    6.84µs ± 0%   +0.27%  (p=0.000 n=16+17)
GCD10x100000/WithoutXY-8            43.9µs ± 1%    44.0µs ± 1%   +0.28%  (p=0.000 n=18+17)
GCD10x100000/WithXY-8               60.6µs ± 0%    60.6µs ± 0%     ~     (p=0.907 n=18+18)
GCD100x100/WithoutXY-8              1.13µs ± 0%    1.21µs ± 0%   +6.39%  (p=0.000 n=19+19)
GCD100x100/WithXY-8                 1.82µs ± 0%    1.92µs ± 0%   +5.24%  (p=0.000 n=19+17)
GCD100x1000/WithoutXY-8             2.00µs ± 0%    2.03µs ± 1%   +1.61%  (p=0.000 n=18+16)
GCD100x1000/WithXY-8                3.22µs ± 0%    3.20µs ± 1%   -0.83%  (p=0.000 n=19+19)
GCD100x10000/WithoutXY-8            9.28µs ± 1%    9.17µs ± 1%   -1.25%  (p=0.000 n=18+19)
GCD100x10000/WithXY-8               13.5µs ± 0%    13.3µs ± 0%   -1.12%  (p=0.000 n=18+19)
GCD100x100000/WithoutXY-8           80.4µs ± 0%    78.6µs ± 0%   -2.25%  (p=0.000 n=19+19)
GCD100x100000/WithXY-8               114µs ± 0%     112µs ± 0%   -1.46%  (p=0.000 n=19+17)
GCD1000x1000/WithoutXY-8            12.9µs ± 1%    12.9µs ± 2%   -0.50%  (p=0.014 n=20+19)
GCD1000x1000/WithXY-8               19.6µs ± 1%    19.6µs ± 2%   -0.28%  (p=0.040 n=17+18)
GCD1000x10000/WithoutXY-8           22.4µs ± 0%    22.4µs ± 2%     ~     (p=0.220 n=19+19)
GCD1000x10000/WithXY-8              57.0µs ± 0%    56.5µs ± 0%   -0.87%  (p=0.000 n=20+20)
GCD1000x100000/WithoutXY-8           116µs ± 0%     115µs ± 0%   -0.49%  (p=0.000 n=18+19)
GCD1000x100000/WithXY-8              410µs ± 0%     411µs ± 0%     ~     (p=0.052 n=19+19)
GCD10000x10000/WithoutXY-8           247µs ± 1%     244µs ± 1%   -0.92%  (p=0.000 n=19+19)
GCD10000x10000/WithXY-8              476µs ± 1%     473µs ± 1%   -0.48%  (p=0.009 n=19+19)
GCD10000x100000/WithoutXY-8          573µs ± 1%     571µs ± 1%   -0.45%  (p=0.012 n=20+20)
GCD10000x100000/WithXY-8            3.35ms ± 1%    3.35ms ± 1%     ~     (p=0.444 n=20+19)
GCD100000x100000/WithoutXY-8        12.0ms ± 2%    11.9ms ± 2%     ~     (p=0.276 n=18+20)
GCD100000x100000/WithXY-8           27.3ms ± 1%    27.3ms ± 1%     ~     (p=0.792 n=20+19)
Hilbert-8                            672µs ± 0%     611µs ± 0%   -9.02%  (p=0.000 n=19+19)
Binomial-8                          1.40µs ± 0%    1.18µs ± 0%  -15.69%  (p=0.000 n=16+14)
QuoRem-8                            2.20µs ± 1%    2.17µs ± 1%   -1.13%  (p=0.000 n=19+19)
Exp-8                               4.10ms ± 1%    4.11ms ± 1%     ~     (p=0.296 n=20+19)
Exp2-8                              4.11ms ± 1%    4.12ms ± 1%     ~     (p=0.429 n=20+20)
Bitset-8                            8.67ns ± 6%    8.74ns ± 4%     ~     (p=0.139 n=19+17)
BitsetNeg-8                         43.6ns ± 1%    43.8ns ± 2%   +0.61%  (p=0.036 n=20+20)
BitsetOrig-8                        77.5ns ± 1%    68.4ns ± 1%  -11.77%  (p=0.000 n=19+20)
BitsetNegOrig-8                      145ns ± 1%     141ns ± 1%   -2.87%  (p=0.000 n=19+20)
ModSqrt225_Tonelli-8                 324µs ± 1%     324µs ± 1%     ~     (p=0.409 n=18+20)
ModSqrt225_3Mod4-8                  98.9µs ± 1%    99.1µs ± 1%     ~     (p=0.298 n=19+18)
ModSqrt231_Tonelli-8                 337µs ± 1%     337µs ± 1%     ~     (p=0.718 n=20+18)
ModSqrt231_5Mod8-8                   115µs ± 1%     114µs ± 1%   -0.22%  (p=0.050 n=20+20)
ModInverse-8                         895ns ± 0%     869ns ± 1%   -2.83%  (p=0.000 n=17+17)
Sqrt-8                              28.1µs ± 1%    28.1µs ± 0%   -0.28%  (p=0.000 n=16+20)
IntSqr/1-8                          10.8ns ± 3%    10.5ns ± 3%   -2.51%  (p=0.000 n=19+17)
IntSqr/2-8                          30.5ns ± 2%    30.3ns ± 4%   -0.71%  (p=0.035 n=18+18)
IntSqr/3-8                          40.1ns ± 1%    40.1ns ± 1%     ~     (p=0.710 n=20+17)
IntSqr/5-8                          65.3ns ± 1%    65.4ns ± 2%     ~     (p=0.744 n=19+19)
IntSqr/8-8                           101ns ± 1%     102ns ± 0%     ~     (p=0.234 n=19+20)
IntSqr/10-8                          138ns ± 0%     138ns ± 2%     ~     (p=0.827 n=18+18)
IntSqr/20-8                          378ns ± 1%     378ns ± 1%     ~     (p=0.479 n=18+18)
IntSqr/30-8                          637ns ± 0%     638ns ± 1%     ~     (p=0.051 n=18+20)
IntSqr/50-8                         1.34µs ± 2%    1.34µs ± 1%     ~     (p=0.970 n=18+19)
IntSqr/80-8                         2.78µs ± 0%    2.78µs ± 1%   -0.18%  (p=0.006 n=19+17)
IntSqr/100-8                        3.98µs ± 0%    3.98µs ± 0%     ~     (p=0.057 n=17+19)
IntSqr/200-8                        13.5µs ± 0%    13.5µs ± 1%   -0.33%  (p=0.000 n=19+17)
IntSqr/300-8                        25.3µs ± 1%    25.3µs ± 1%     ~     (p=0.361 n=19+20)
IntSqr/500-8                        62.9µs ± 0%    62.9µs ± 1%     ~     (p=0.899 n=17+17)
IntSqr/800-8                         128µs ± 1%     127µs ± 1%   -0.32%  (p=0.016 n=18+20)
IntSqr/1000-8                        192µs ± 0%     192µs ± 1%     ~     (p=0.916 n=17+18)
Div/20/10-8                         34.9ns ± 2%    35.6ns ± 1%   +2.01%  (p=0.000 n=20+20)
Div/200/100-8                        218ns ± 1%     215ns ± 2%   -1.43%  (p=0.000 n=18+18)
Div/2000/1000-8                     1.16µs ± 1%    1.15µs ± 1%   -1.04%  (p=0.000 n=19+20)
Div/20000/10000-8                   35.7µs ± 1%    35.4µs ± 1%   -0.69%  (p=0.000 n=19+18)
Div/200000/100000-8                 2.89ms ± 1%    2.88ms ± 1%   -0.62%  (p=0.007 n=19+20)
Mul-8                               9.28ms ± 1%    9.27ms ± 1%     ~     (p=0.563 n=18+18)
ZeroShifts/Shl-8                     712ns ± 6%     716ns ± 7%     ~     (p=0.597 n=20+20)
ZeroShifts/ShlSame-8                4.00ns ± 1%    4.06ns ± 5%     ~     (p=0.162 n=18+20)
ZeroShifts/Shr-8                     714ns ±10%   1285ns ±156%     ~     (p=0.250 n=20+20)
ZeroShifts/ShrSame-8                4.00ns ± 1%    4.09ns ±10%   +2.34%  (p=0.048 n=16+19)
Exp3Power/0x10-8                     154ns ± 0%     159ns ±13%     ~     (p=0.197 n=14+20)
Exp3Power/0x40-8                     171ns ± 1%     175ns ± 8%     ~     (p=0.058 n=16+19)
Exp3Power/0x100-8                    287ns ± 0%     316ns ± 4%  +10.03%  (p=0.000 n=17+19)
Exp3Power/0x400-8                    698ns ± 1%     801ns ± 6%  +14.75%  (p=0.000 n=19+20)
Exp3Power/0x1000-8                  2.87µs ± 0%    3.65µs ± 6%  +27.24%  (p=0.000 n=18+18)
Exp3Power/0x4000-8                  21.9µs ± 1%    28.7µs ± 8%  +31.09%  (p=0.000 n=18+20)
Exp3Power/0x10000-8                  204µs ± 0%     267µs ± 9%  +30.81%  (p=0.000 n=20+20)
Exp3Power/0x40000-8                 1.86ms ± 0%    2.26ms ± 5%  +21.68%  (p=0.000 n=18+19)
Exp3Power/0x100000-8                17.5ms ± 1%    20.7ms ± 7%  +18.39%  (p=0.000 n=19+20)
Exp3Power/0x400000-8                 156ms ± 0%     172ms ± 6%  +10.54%  (p=0.000 n=19+20)
Fibo-8                              26.9ms ± 1%    27.5ms ± 3%   +2.32%  (p=0.000 n=19+19)
NatSqr/1-8                          31.0ns ± 4%    39.5ns ±29%  +27.25%  (p=0.000 n=20+19)
NatSqr/2-8                          54.1ns ± 1%    69.0ns ±28%  +27.52%  (p=0.000 n=20+20)
NatSqr/3-8                          66.6ns ± 1%    83.0ns ±25%  +24.59%  (p=0.000 n=20+20)
NatSqr/5-8                          97.1ns ± 1%   119.9ns ±12%  +23.50%  (p=0.000 n=16+20)
NatSqr/8-8                           138ns ± 1%     171ns ± 9%  +24.20%  (p=0.000 n=19+20)
NatSqr/10-8                          182ns ± 0%     225ns ± 9%  +23.50%  (p=0.000 n=16+20)
NatSqr/20-8                          447ns ± 1%     624ns ± 6%  +39.64%  (p=0.000 n=19+19)
NatSqr/30-8                          736ns ± 2%     986ns ± 9%  +33.94%  (p=0.000 n=19+20)
NatSqr/50-8                         1.51µs ± 2%    1.97µs ± 9%  +30.42%  (p=0.000 n=20+20)
NatSqr/80-8                         3.03µs ± 1%    3.67µs ± 7%  +21.08%  (p=0.000 n=20+20)
NatSqr/100-8                        4.31µs ± 1%    5.20µs ± 7%  +20.52%  (p=0.000 n=19+20)
NatSqr/200-8                        14.2µs ± 0%    16.3µs ± 4%  +14.92%  (p=0.000 n=19+20)
NatSqr/300-8                        27.8µs ± 1%    33.2µs ± 7%  +19.28%  (p=0.000 n=20+18)
NatSqr/500-8                        66.6µs ± 1%    74.5µs ± 3%  +11.87%  (p=0.000 n=18+18)
NatSqr/800-8                         135µs ± 1%     165µs ± 7%  +22.33%  (p=0.000 n=20+20)
NatSqr/1000-8                        200µs ± 0%     228µs ± 3%  +14.39%  (p=0.000 n=19+20)
NatSetBytes/8-8                     8.87ns ± 4%    8.77ns ± 2%   -1.17%  (p=0.020 n=20+16)
NatSetBytes/24-8                    38.6ns ± 3%    49.5ns ±29%  +28.32%  (p=0.000 n=18+19)
NatSetBytes/128-8                   75.2ns ± 1%   120.7ns ±29%  +60.60%  (p=0.000 n=17+20)
NatSetBytes/7-8                     16.2ns ± 2%    16.5ns ± 2%   +1.76%  (p=0.000 n=20+20)
NatSetBytes/23-8                    46.5ns ± 1%    60.2ns ±24%  +29.59%  (p=0.000 n=20+20)
NatSetBytes/127-8                   83.1ns ± 1%   118.2ns ±20%  +42.33%  (p=0.000 n=18+20)
ScanPi-8                            89.1µs ± 1%   117.4µs ±12%  +31.75%  (p=0.000 n=18+20)
StringPiParallel-8                  35.1µs ± 9%    40.2µs ±12%  +14.53%  (p=0.000 n=20+20)
Scan/10/Base2-8                      410ns ±14%     429ns ±10%   +4.47%  (p=0.018 n=19+20)
Scan/100/Base2-8                    3.05µs ±20%    2.97µs ±14%     ~     (p=0.449 n=20+20)
Scan/1000/Base2-8                   29.3µs ± 8%    30.1µs ±23%     ~     (p=0.355 n=20+20)
Scan/10000/Base2-8                   402µs ±13%     395µs ±14%     ~     (p=0.355 n=20+20)
Scan/100000/Base2-8                 11.8ms ±10%    11.6ms ± 1%     ~     (p=0.245 n=17+18)
Scan/10/Base8-8                      194ns ± 6%     196ns ±12%     ~     (p=0.829 n=20+19)
Scan/100/Base8-8                    1.11µs ±15%    1.11µs ±12%     ~     (p=0.743 n=20+20)
Scan/1000/Base8-8                   11.7µs ±10%    11.7µs ±12%     ~     (p=0.904 n=20+20)
Scan/10000/Base8-8                   209µs ± 7%     210µs ± 8%     ~     (p=0.478 n=20+20)
Scan/100000/Base8-8                 10.6ms ± 7%    10.4ms ± 6%     ~     (p=0.112 n=20+18)
Scan/10/Base10-8                     182ns ±12%     188ns ±11%   +3.52%  (p=0.044 n=20+20)
Scan/100/Base10-8                   1.01µs ± 8%    1.00µs ±13%     ~     (p=0.588 n=20+20)
Scan/1000/Base10-8                  10.7µs ±20%    10.6µs ±14%     ~     (p=0.560 n=20+20)
Scan/10000/Base10-8                  195µs ±10%     194µs ± 9%     ~     (p=0.883 n=20+20)
Scan/100000/Base10-8                10.6ms ± 2%    10.6ms ± 2%     ~     (p=0.495 n=20+20)
Scan/10/Base16-8                     166ns ±10%     174ns ±17%     ~     (p=0.072 n=20+20)
Scan/100/Base16-8                    836ns ±10%     826ns ±12%     ~     (p=0.562 n=20+17)
Scan/1000/Base16-8                  8.96µs ±13%    8.65µs ± 9%     ~     (p=0.203 n=20+18)
Scan/10000/Base16-8                  198µs ± 3%     198µs ± 5%     ~     (p=0.718 n=20+20)
Scan/100000/Base16-8                11.1ms ± 3%    11.0ms ± 4%     ~     (p=0.512 n=20+20)
String/10/Base2-8                   88.1ns ± 7%    94.1ns ±11%   +6.80%  (p=0.000 n=19+20)
String/100/Base2-8                   577ns ± 4%     598ns ± 5%   +3.72%  (p=0.000 n=20+20)
String/1000/Base2-8                 5.25µs ± 2%    5.62µs ± 5%   +7.04%  (p=0.000 n=19+20)
String/10000/Base2-8                55.6µs ± 1%    60.1µs ± 2%   +8.12%  (p=0.000 n=19+19)
String/100000/Base2-8                519µs ± 2%     560µs ± 2%   +7.91%  (p=0.000 n=18+17)
String/10/Base8-8                   52.2ns ± 8%    53.3ns ±12%     ~     (p=0.188 n=20+18)
String/100/Base8-8                   218ns ± 3%     232ns ±10%   +6.66%  (p=0.000 n=20+20)
String/1000/Base8-8                 1.84µs ± 3%    1.94µs ± 4%   +5.07%  (p=0.000 n=20+18)
String/10000/Base8-8                18.1µs ± 2%    19.1µs ± 3%   +5.84%  (p=0.000 n=20+19)
String/100000/Base8-8                184µs ± 2%     197µs ± 1%   +7.15%  (p=0.000 n=19+19)
String/10/Base10-8                   158ns ± 7%     146ns ± 6%   -7.65%  (p=0.000 n=20+19)
String/100/Base10-8                  807ns ± 2%     845ns ± 4%   +4.79%  (p=0.000 n=20+19)
String/1000/Base10-8                3.99µs ± 3%    3.99µs ± 7%     ~     (p=0.920 n=20+20)
String/10000/Base10-8               20.8µs ± 6%    22.1µs ±10%   +6.11%  (p=0.000 n=19+20)
String/100000/Base10-8              5.60ms ± 2%    5.59ms ± 2%     ~     (p=0.749 n=20+19)
String/10/Base16-8                  49.0ns ±13%    49.3ns ±16%     ~     (p=0.581 n=19+20)
String/100/Base16-8                  173ns ± 5%     185ns ± 6%   +6.63%  (p=0.000 n=20+18)
String/1000/Base16-8                1.38µs ± 3%    1.49µs ±10%   +8.27%  (p=0.000 n=19+20)
String/10000/Base16-8               13.5µs ± 2%    14.5µs ± 3%   +7.08%  (p=0.000 n=20+20)
String/100000/Base16-8               138µs ± 4%     148µs ± 4%   +7.57%  (p=0.000 n=19+20)
LeafSize/0-8                        2.74ms ± 1%    2.79ms ± 2%   +2.00%  (p=0.000 n=19+19)
LeafSize/1-8                        24.8µs ± 4%    26.1µs ± 8%   +5.33%  (p=0.000 n=18+19)
LeafSize/2-8                        24.9µs ± 7%    25.0µs ± 8%     ~     (p=0.989 n=20+19)
LeafSize/3-8                        97.6µs ± 3%   100.2µs ± 5%   +2.66%  (p=0.001 n=20+19)
LeafSize/4-8                        25.2µs ± 5%    25.4µs ± 5%     ~     (p=0.173 n=19+20)
LeafSize/5-8                         118µs ± 2%     119µs ± 5%     ~     (p=0.478 n=20+20)
LeafSize/6-8                        97.6µs ± 3%   100.1µs ± 8%   +2.65%  (p=0.021 n=20+19)
LeafSize/7-8                        65.6µs ± 5%    67.5µs ± 6%   +2.92%  (p=0.003 n=20+19)
LeafSize/8-8                        25.5µs ± 5%    25.6µs ± 6%     ~     (p=0.461 n=19+20)
LeafSize/9-8                         134µs ± 4%     136µs ± 5%     ~     (p=0.194 n=19+20)
LeafSize/10-8                        119µs ± 3%     122µs ± 3%   +2.52%  (p=0.000 n=20+19)
LeafSize/11-8                        115µs ± 5%     116µs ± 5%     ~     (p=0.158 n=20+19)
LeafSize/12-8                       97.4µs ± 4%   100.3µs ± 5%   +2.91%  (p=0.003 n=19+20)
LeafSize/13-8                       93.1µs ± 4%    93.0µs ± 6%     ~     (p=0.698 n=20+20)
LeafSize/14-8                       67.0µs ± 3%    69.7µs ± 6%   +4.10%  (p=0.000 n=20+20)
LeafSize/15-8                       48.3µs ± 2%    49.3µs ± 6%   +1.91%  (p=0.014 n=19+20)
LeafSize/16-8                       25.6µs ± 5%    25.6µs ± 6%     ~     (p=0.947 n=20+20)
LeafSize/32-8                       30.1µs ± 4%    30.3µs ± 5%     ~     (p=0.685 n=18+19)
LeafSize/64-8                       53.4µs ± 2%    54.0µs ± 3%     ~     (p=0.053 n=19+19)
ProbablyPrime/n=0-8                 3.59ms ± 1%    3.55ms ± 1%   -1.12%  (p=0.000 n=20+18)
ProbablyPrime/n=1-8                 4.21ms ± 2%    4.17ms ± 2%   -0.73%  (p=0.018 n=20+19)
ProbablyPrime/n=5-8                 6.74ms ± 1%    6.72ms ± 1%     ~     (p=0.102 n=20+20)
ProbablyPrime/n=10-8                9.91ms ± 1%    9.89ms ± 2%     ~     (p=0.322 n=19+20)
ProbablyPrime/n=20-8                16.2ms ± 1%    16.1ms ± 2%   -0.52%  (p=0.006 n=19+19)
ProbablyPrime/Lucas-8               2.94ms ± 1%    2.95ms ± 1%   +0.52%  (p=0.002 n=18+19)
ProbablyPrime/MillerRabinBase2-8     641µs ± 2%     640µs ± 2%     ~     (p=0.607 n=19+20)
FloatSqrt/64-8                       653ns ± 5%     704ns ± 5%   +7.82%  (p=0.000 n=19+20)
FloatSqrt/128-8                     1.32µs ± 3%    1.42µs ± 5%   +7.29%  (p=0.000 n=18+20)
FloatSqrt/256-8                     1.44µs ± 2%    1.45µs ± 4%     ~     (p=0.089 n=19+19)
FloatSqrt/1000-8                    3.36µs ± 3%    3.42µs ± 5%   +1.82%  (p=0.012 n=20+20)
FloatSqrt/10000-8                   25.5µs ± 2%    27.5µs ± 7%   +7.91%  (p=0.000 n=18+19)
FloatSqrt/100000-8                   629µs ± 6%     663µs ± 9%   +5.32%  (p=0.000 n=18+20)
FloatSqrt/1000000-8                 46.4ms ± 2%    46.6ms ± 5%     ~     (p=0.351 n=20+19)
[Geo mean]                          9.60µs        10.01µs        +4.28%

name                              old alloc/op   new alloc/op   delta
DecimalConversion-8                 54.0kB ± 0%    43.6kB ± 0%  -19.40%  (p=0.000 n=20+20)
FloatString/100-8                     400B ± 0%      400B ± 0%     ~     (all equal)
FloatString/1000-8                  3.10kB ± 0%    3.10kB ± 0%     ~     (all equal)
FloatString/10000-8                 52.1kB ± 0%    52.1kB ± 0%     ~     (p=0.153 n=20+20)
FloatString/100000-8                 582kB ± 0%     582kB ± 0%     ~     (all equal)
FloatAdd/10-8                        0.00B          0.00B          ~     (all equal)
FloatAdd/100-8                       0.00B          0.00B          ~     (all equal)
FloatAdd/1000-8                      0.00B          0.00B          ~     (all equal)
FloatAdd/10000-8                     0.00B          0.00B          ~     (all equal)
FloatAdd/100000-8                    0.00B          0.00B          ~     (all equal)
FloatSub/10-8                        0.00B          0.00B          ~     (all equal)
FloatSub/100-8                       0.00B          0.00B          ~     (all equal)
FloatSub/1000-8                      0.00B          0.00B          ~     (all equal)
FloatSub/10000-8                     0.00B          0.00B          ~     (all equal)
FloatSub/100000-8                    0.00B          0.00B          ~     (all equal)
ParseFloatSmallExp-8                4.18kB ± 0%    3.60kB ± 0%  -13.79%  (p=0.000 n=20+20)
ParseFloatLargeExp-8                18.9kB ± 0%    19.3kB ± 0%   +2.25%  (p=0.000 n=20+20)
GCD10x10/WithoutXY-8                 96.0B ± 0%     16.0B ± 0%  -83.33%  (p=0.000 n=20+20)
GCD10x10/WithXY-8                     240B ± 0%       88B ± 0%  -63.33%  (p=0.000 n=20+20)
GCD10x100/WithoutXY-8                 192B ± 0%      112B ± 0%  -41.67%  (p=0.000 n=20+20)
GCD10x100/WithXY-8                    464B ± 0%      424B ± 0%   -8.62%  (p=0.000 n=20+20)
GCD10x1000/WithoutXY-8                416B ± 0%      336B ± 0%  -19.23%  (p=0.000 n=20+20)
GCD10x1000/WithXY-8                 1.25kB ± 0%    1.10kB ± 0%  -12.18%  (p=0.000 n=20+20)
GCD10x10000/WithoutXY-8             2.91kB ± 0%    2.83kB ± 0%   -2.75%  (p=0.000 n=20+20)
GCD10x10000/WithXY-8                8.70kB ± 0%    8.55kB ± 0%   -1.76%  (p=0.000 n=16+16)
GCD10x100000/WithoutXY-8            27.2kB ± 0%    27.2kB ± 0%   -0.29%  (p=0.000 n=20+20)
GCD10x100000/WithXY-8               82.4kB ± 0%    82.3kB ± 0%   -0.17%  (p=0.000 n=20+19)
GCD100x100/WithoutXY-8                288B ± 0%      384B ± 0%  +33.33%  (p=0.000 n=20+20)
GCD100x100/WithXY-8                   464B ± 0%      576B ± 0%  +24.14%  (p=0.000 n=20+20)
GCD100x1000/WithoutXY-8               640B ± 0%      688B ± 0%   +7.50%  (p=0.000 n=20+20)
GCD100x1000/WithXY-8                1.52kB ± 0%    1.46kB ± 0%   -3.68%  (p=0.000 n=20+20)
GCD100x10000/WithoutXY-8            4.24kB ± 0%    4.29kB ± 0%   +1.13%  (p=0.000 n=20+20)
GCD100x10000/WithXY-8               11.1kB ± 0%    11.0kB ± 0%   -0.51%  (p=0.000 n=15+20)
GCD100x100000/WithoutXY-8           40.9kB ± 0%    40.9kB ± 0%   +0.12%  (p=0.000 n=20+19)
GCD100x100000/WithXY-8               110kB ± 0%     109kB ± 0%   -0.08%  (p=0.000 n=20+20)
GCD1000x1000/WithoutXY-8            1.22kB ± 0%    1.06kB ± 0%  -13.16%  (p=0.000 n=20+20)
GCD1000x1000/WithXY-8               2.37kB ± 0%    2.11kB ± 0%  -10.83%  (p=0.000 n=20+20)
GCD1000x10000/WithoutXY-8           4.71kB ± 0%    4.63kB ± 0%   -1.70%  (p=0.000 n=20+19)
GCD1000x10000/WithXY-8              28.2kB ± 0%    28.0kB ± 0%   -0.43%  (p=0.000 n=20+15)
GCD1000x100000/WithoutXY-8          41.3kB ± 0%    41.2kB ± 0%   -0.20%  (p=0.000 n=20+16)
GCD1000x100000/WithXY-8              301kB ± 0%     301kB ± 0%   -0.13%  (p=0.000 n=20+20)
GCD10000x10000/WithoutXY-8          8.64kB ± 0%    8.48kB ± 0%   -1.85%  (p=0.000 n=20+20)
GCD10000x10000/WithXY-8             57.2kB ± 0%    57.7kB ± 0%   +0.80%  (p=0.000 n=20+20)
GCD10000x100000/WithoutXY-8         43.8kB ± 0%    43.7kB ± 0%   -0.19%  (p=0.000 n=20+18)
GCD10000x100000/WithXY-8            2.08MB ± 0%    2.08MB ± 0%   -0.02%  (p=0.000 n=15+19)
GCD100000x100000/WithoutXY-8        81.6kB ± 0%    81.4kB ± 0%   -0.20%  (p=0.000 n=20+20)
GCD100000x100000/WithXY-8           4.32MB ± 0%    4.33MB ± 0%   +0.12%  (p=0.000 n=20+20)
Hilbert-8                            653kB ± 0%     313kB ± 0%  -52.13%  (p=0.000 n=19+20)
Binomial-8                          1.82kB ± 0%    1.02kB ± 0%  -43.86%  (p=0.000 n=20+20)
QuoRem-8                             0.00B          0.00B          ~     (all equal)
Exp-8                               11.1kB ± 0%    11.0kB ± 0%   -0.34%  (p=0.000 n=19+20)
Exp2-8                              11.3kB ± 0%    11.3kB ± 0%   -0.35%  (p=0.000 n=19+20)
Bitset-8                             0.00B          0.00B          ~     (all equal)
BitsetNeg-8                          0.00B          0.00B          ~     (all equal)
BitsetOrig-8                          103B ± 0%       63B ± 0%  -38.83%  (p=0.000 n=20+20)
BitsetNegOrig-8                       215B ± 0%      175B ± 0%  -18.60%  (p=0.000 n=20+20)
ModSqrt225_Tonelli-8                11.3kB ± 0%    11.0kB ± 0%   -2.76%  (p=0.000 n=20+17)
ModSqrt225_3Mod4-8                  3.57kB ± 0%    3.53kB ± 0%   -1.12%  (p=0.000 n=20+20)
ModSqrt231_Tonelli-8                11.0kB ± 0%    10.7kB ± 0%   -2.55%  (p=0.000 n=20+20)
ModSqrt231_5Mod8-8                  4.21kB ± 0%    4.09kB ± 0%   -2.85%  (p=0.000 n=16+20)
ModInverse-8                        1.44kB ± 0%    1.28kB ± 0%  -11.11%  (p=0.000 n=20+20)
Sqrt-8                              6.00kB ± 0%    6.00kB ± 0%     ~     (all equal)
IntSqr/1-8                           0.00B          0.00B          ~     (all equal)
IntSqr/2-8                           0.00B          0.00B          ~     (all equal)
IntSqr/3-8                           0.00B          0.00B          ~     (all equal)
IntSqr/5-8                           0.00B          0.00B          ~     (all equal)
IntSqr/8-8                           0.00B          0.00B          ~     (all equal)
IntSqr/10-8                          0.00B          0.00B          ~     (all equal)
IntSqr/20-8                           320B ± 0%      320B ± 0%     ~     (all equal)
IntSqr/30-8                           480B ± 0%      480B ± 0%     ~     (all equal)
IntSqr/50-8                           896B ± 0%      896B ± 0%     ~     (all equal)
IntSqr/80-8                         1.28kB ± 0%    1.28kB ± 0%     ~     (all equal)
IntSqr/100-8                        1.79kB ± 0%    1.79kB ± 0%     ~     (all equal)
IntSqr/200-8                        3.20kB ± 0%    3.20kB ± 0%     ~     (all equal)
IntSqr/300-8                        8.06kB ± 0%    8.06kB ± 0%     ~     (all equal)
IntSqr/500-8                        12.3kB ± 0%    12.3kB ± 0%     ~     (all equal)
IntSqr/800-8                        28.8kB ± 0%    28.8kB ± 0%     ~     (all equal)
IntSqr/1000-8                       36.9kB ± 0%    36.9kB ± 0%     ~     (all equal)
Div/20/10-8                          0.00B          0.00B          ~     (all equal)
Div/200/100-8                        0.00B          0.00B          ~     (all equal)
Div/2000/1000-8                      0.00B          0.00B          ~     (all equal)
Div/20000/10000-8                    0.00B          0.00B          ~     (all equal)
Div/200000/100000-8                   690B ± 0%      690B ± 0%     ~     (all equal)
Mul-8                                565kB ± 0%     565kB ± 0%     ~     (all equal)
ZeroShifts/Shl-8                    6.53kB ± 0%    6.53kB ± 0%     ~     (all equal)
ZeroShifts/ShlSame-8                 0.00B          0.00B          ~     (all equal)
ZeroShifts/Shr-8                    6.53kB ± 0%    6.53kB ± 0%     ~     (all equal)
ZeroShifts/ShrSame-8                 0.00B          0.00B          ~     (all equal)
Exp3Power/0x10-8                      192B ± 0%      112B ± 0%  -41.67%  (p=0.000 n=20+20)
Exp3Power/0x40-8                      192B ± 0%      112B ± 0%  -41.67%  (p=0.000 n=20+20)
Exp3Power/0x100-8                     288B ± 0%      208B ± 0%  -27.78%  (p=0.000 n=20+20)
Exp3Power/0x400-8                     672B ± 0%      592B ± 0%  -11.90%  (p=0.000 n=20+20)
Exp3Power/0x1000-8                  3.33kB ± 0%    3.25kB ± 0%   -2.40%  (p=0.000 n=20+20)
Exp3Power/0x4000-8                  13.8kB ± 0%    13.7kB ± 0%   -0.58%  (p=0.000 n=20+20)
Exp3Power/0x10000-8                  117kB ± 0%     117kB ± 0%   -0.07%  (p=0.000 n=20+20)
Exp3Power/0x40000-8                  755kB ± 0%     755kB ± 0%   -0.01%  (p=0.000 n=19+20)
Exp3Power/0x100000-8                5.22MB ± 0%    5.22MB ± 0%   -0.00%  (p=0.000 n=20+20)
Exp3Power/0x400000-8                39.8MB ± 0%    39.8MB ± 0%   -0.00%  (p=0.000 n=20+19)
Fibo-8                              3.09MB ± 0%    3.08MB ± 0%   -0.28%  (p=0.000 n=20+16)
NatSqr/1-8                           48.0B ± 0%     48.0B ± 0%     ~     (all equal)
NatSqr/2-8                           64.0B ± 0%     64.0B ± 0%     ~     (all equal)
NatSqr/3-8                           80.0B ± 0%     80.0B ± 0%     ~     (all equal)
NatSqr/5-8                            112B ± 0%      112B ± 0%     ~     (all equal)
NatSqr/8-8                            160B ± 0%      160B ± 0%     ~     (all equal)
NatSqr/10-8                           192B ± 0%      192B ± 0%     ~     (all equal)
NatSqr/20-8                           672B ± 0%      672B ± 0%     ~     (all equal)
NatSqr/30-8                           992B ± 0%      992B ± 0%     ~     (all equal)
NatSqr/50-8                         1.79kB ± 0%    1.79kB ± 0%     ~     (all equal)
NatSqr/80-8                         2.69kB ± 0%    2.69kB ± 0%     ~     (all equal)
NatSqr/100-8                        3.58kB ± 0%    3.58kB ± 0%     ~     (all equal)
NatSqr/200-8                        6.66kB ± 0%    6.66kB ± 0%     ~     (all equal)
NatSqr/300-8                        24.4kB ± 0%    24.4kB ± 0%     ~     (all equal)
NatSqr/500-8                        36.9kB ± 0%    36.9kB ± 0%     ~     (all equal)
NatSqr/800-8                        69.8kB ± 0%    69.8kB ± 0%     ~     (all equal)
NatSqr/1000-8                       86.0kB ± 0%    86.0kB ± 0%     ~     (all equal)
NatSetBytes/8-8                      0.00B          0.00B          ~     (all equal)
NatSetBytes/24-8                     64.0B ± 0%     64.0B ± 0%     ~     (all equal)
NatSetBytes/128-8                     160B ± 0%      160B ± 0%     ~     (all equal)
NatSetBytes/7-8                      0.00B          0.00B          ~     (all equal)
NatSetBytes/23-8                     64.0B ± 0%     64.0B ± 0%     ~     (all equal)
NatSetBytes/127-8                     160B ± 0%      160B ± 0%     ~     (all equal)
ScanPi-8                            75.4kB ± 0%    75.7kB ± 0%   +0.41%  (p=0.000 n=20+20)
StringPiParallel-8                  20.4kB ± 0%    20.4kB ± 0%     ~     (p=0.223 n=20+20)
Scan/10/Base2-8                      48.0B ± 0%     48.0B ± 0%     ~     (all equal)
Scan/100/Base2-8                     48.0B ± 0%     48.0B ± 0%     ~     (all equal)
Scan/1000/Base2-8                    48.0B ± 0%     48.0B ± 0%     ~     (all equal)
Scan/10000/Base2-8                   48.0B ± 0%     48.0B ± 0%     ~     (all equal)
Scan/100000/Base2-8                  48.0B ± 0%     48.0B ± 0%     ~     (all equal)
Scan/10/Base8-8                      48.0B ± 0%     48.0B ± 0%     ~     (all equal)
Scan/100/Base8-8                     48.0B ± 0%     48.0B ± 0%     ~     (all equal)
Scan/1000/Base8-8                    48.0B ± 0%     48.0B ± 0%     ~     (all equal)
Scan/10000/Base8-8                   48.0B ± 0%     48.0B ± 0%     ~     (all equal)
Scan/100000/Base8-8                  48.0B ± 0%     48.0B ± 0%     ~     (all equal)
Scan/10/Base10-8                     48.0B ± 0%     48.0B ± 0%     ~     (all equal)
Scan/100/Base10-8                    48.0B ± 0%     48.0B ± 0%     ~     (all equal)
Scan/1000/Base10-8                   48.0B ± 0%     48.0B ± 0%     ~     (all equal)
Scan/10000/Base10-8                  48.0B ± 0%     48.0B ± 0%     ~     (all equal)
Scan/100000/Base10-8                 48.0B ± 0%     48.0B ± 0%     ~     (all equal)
Scan/10/Base16-8                     48.0B ± 0%     48.0B ± 0%     ~     (all equal)
Scan/100/Base16-8                    48.0B ± 0%     48.0B ± 0%     ~     (all equal)
Scan/1000/Base16-8                   48.0B ± 0%     48.0B ± 0%     ~     (all equal)
Scan/10000/Base16-8                  48.0B ± 0%     48.0B ± 0%     ~     (all equal)
Scan/100000/Base16-8                 48.0B ± 0%     48.0B ± 0%     ~     (all equal)
String/10/Base2-8                    48.0B ± 0%     48.0B ± 0%     ~     (all equal)
String/100/Base2-8                    352B ± 0%      352B ± 0%     ~     (all equal)
String/1000/Base2-8                 3.46kB ± 0%    3.46kB ± 0%     ~     (all equal)
String/10000/Base2-8                41.0kB ± 0%    41.0kB ± 0%     ~     (all equal)
String/100000/Base2-8                336kB ± 0%     336kB ± 0%     ~     (all equal)
String/10/Base8-8                    16.0B ± 0%     16.0B ± 0%     ~     (all equal)
String/100/Base8-8                    112B ± 0%      112B ± 0%     ~     (all equal)
String/1000/Base8-8                 1.15kB ± 0%    1.15kB ± 0%     ~     (all equal)
String/10000/Base8-8                12.3kB ± 0%    12.3kB ± 0%     ~     (all equal)
String/100000/Base8-8                115kB ± 0%     115kB ± 0%     ~     (all equal)
String/10/Base10-8                   64.0B ± 0%     24.0B ± 0%  -62.50%  (p=0.000 n=20+20)
String/100/Base10-8                   192B ± 0%      192B ± 0%     ~     (all equal)
String/1000/Base10-8                1.95kB ± 0%    1.95kB ± 0%     ~     (all equal)
String/10000/Base10-8               20.0kB ± 0%    20.0kB ± 0%     ~     (p=0.983 n=19+20)
String/100000/Base10-8               210kB ± 1%     211kB ± 1%   +0.82%  (p=0.000 n=19+20)
String/10/Base16-8                   16.0B ± 0%     16.0B ± 0%     ~     (all equal)
String/100/Base16-8                  96.0B ± 0%     96.0B ± 0%     ~     (all equal)
String/1000/Base16-8                  896B ± 0%      896B ± 0%     ~     (all equal)
String/10000/Base16-8               9.47kB ± 0%    9.47kB ± 0%     ~     (all equal)
String/100000/Base16-8              90.1kB ± 0%    90.1kB ± 0%     ~     (all equal)
LeafSize/0-8                        16.9kB ± 0%    16.8kB ± 0%   -0.44%  (p=0.000 n=20+20)
LeafSize/1-8                        22.4kB ± 0%    22.3kB ± 0%   -0.34%  (p=0.000 n=20+19)
LeafSize/2-8                        22.4kB ± 0%    22.3kB ± 0%   -0.34%  (p=0.000 n=20+19)
LeafSize/3-8                        22.4kB ± 0%    22.3kB ± 0%   -0.34%  (p=0.000 n=20+17)
LeafSize/4-8                        22.4kB ± 0%    22.3kB ± 0%   -0.34%  (p=0.000 n=20+19)
LeafSize/5-8                        22.4kB ± 0%    22.3kB ± 0%   -0.33%  (p=0.000 n=20+20)
LeafSize/6-8                        22.3kB ± 0%    22.2kB ± 0%   -0.34%  (p=0.000 n=20+20)
LeafSize/7-8                        22.3kB ± 0%    22.2kB ± 0%   -0.35%  (p=0.000 n=20+20)
LeafSize/8-8                        22.3kB ± 0%    22.2kB ± 0%   -0.34%  (p=0.000 n=16+20)
LeafSize/9-8                        22.3kB ± 0%    22.2kB ± 0%   -0.33%  (p=0.000 n=20+20)
LeafSize/10-8                       22.3kB ± 0%    22.2kB ± 0%   -0.33%  (p=0.000 n=20+20)
LeafSize/11-8                       22.3kB ± 0%    22.2kB ± 0%   -0.33%  (p=0.000 n=20+20)
LeafSize/12-8                       22.3kB ± 0%    22.2kB ± 0%   -0.33%  (p=0.000 n=20+20)
LeafSize/13-8                       22.3kB ± 0%    22.2kB ± 0%   -0.34%  (p=0.000 n=20+15)
LeafSize/14-8                       22.3kB ± 0%    22.2kB ± 0%   -0.33%  (p=0.000 n=20+20)
LeafSize/15-8                       22.3kB ± 0%    22.2kB ± 0%   -0.33%  (p=0.000 n=20+20)
LeafSize/16-8                       22.3kB ± 0%    22.2kB ± 0%   -0.33%  (p=0.000 n=19+20)
LeafSize/32-8                       22.3kB ± 0%    22.2kB ± 0%   -0.32%  (p=0.000 n=20+20)
LeafSize/64-8                       21.8kB ± 0%    21.7kB ± 0%   -0.33%  (p=0.000 n=18+19)
ProbablyPrime/n=0-8                 15.3kB ± 0%    14.9kB ± 0%   -2.35%  (p=0.000 n=20+20)
ProbablyPrime/n=1-8                 21.0kB ± 0%    20.7kB ± 0%   -1.71%  (p=0.000 n=20+20)
ProbablyPrime/n=5-8                 43.4kB ± 0%    42.9kB ± 0%   -1.20%  (p=0.000 n=20+20)
ProbablyPrime/n=10-8                71.5kB ± 0%    70.7kB ± 0%   -1.01%  (p=0.000 n=19+20)
ProbablyPrime/n=20-8                 127kB ± 0%     126kB ± 0%   -0.88%  (p=0.000 n=20+20)
ProbablyPrime/Lucas-8               3.07kB ± 0%    2.79kB ± 0%   -9.12%  (p=0.000 n=20+20)
ProbablyPrime/MillerRabinBase2-8    12.1kB ± 0%    12.0kB ± 0%   -0.66%  (p=0.000 n=20+20)
FloatSqrt/64-8                        416B ± 0%      360B ± 0%  -13.46%  (p=0.000 n=20+20)
FloatSqrt/128-8                       640B ± 0%      584B ± 0%   -8.75%  (p=0.000 n=20+20)
FloatSqrt/256-8                       512B ± 0%      472B ± 0%   -7.81%  (p=0.000 n=20+20)
FloatSqrt/1000-8                    1.47kB ± 0%    1.43kB ± 0%   -2.72%  (p=0.000 n=20+20)
FloatSqrt/10000-8                   18.2kB ± 0%    18.1kB ± 0%   -0.22%  (p=0.000 n=20+20)
FloatSqrt/100000-8                   204kB ± 0%     204kB ± 0%   -0.02%  (p=0.000 n=20+20)
FloatSqrt/1000000-8                 6.37MB ± 0%    6.37MB ± 0%   -0.00%  (p=0.000 n=19+20)
[Geo mean]                          3.42kB         3.24kB        -5.33%

name                              old allocs/op  new allocs/op  delta
DecimalConversion-8                  1.65k ± 0%     1.65k ± 0%     ~     (all equal)
FloatString/100-8                     8.00 ± 0%      8.00 ± 0%     ~     (all equal)
FloatString/1000-8                    9.00 ± 0%      9.00 ± 0%     ~     (all equal)
FloatString/10000-8                   22.0 ± 0%      22.0 ± 0%     ~     (all equal)
FloatString/100000-8                   136 ± 0%       136 ± 0%     ~     (all equal)
FloatAdd/10-8                         0.00           0.00          ~     (all equal)
FloatAdd/100-8                        0.00           0.00          ~     (all equal)
FloatAdd/1000-8                       0.00           0.00          ~     (all equal)
FloatAdd/10000-8                      0.00           0.00          ~     (all equal)
FloatAdd/100000-8                     0.00           0.00          ~     (all equal)
FloatSub/10-8                         0.00           0.00          ~     (all equal)
FloatSub/100-8                        0.00           0.00          ~     (all equal)
FloatSub/1000-8                       0.00           0.00          ~     (all equal)
FloatSub/10000-8                      0.00           0.00          ~     (all equal)
FloatSub/100000-8                     0.00           0.00          ~     (all equal)
ParseFloatSmallExp-8                   110 ± 0%       130 ± 0%  +18.18%  (p=0.000 n=20+20)
ParseFloatLargeExp-8                   319 ± 0%       371 ± 0%  +16.30%  (p=0.000 n=20+20)
GCD10x10/WithoutXY-8                  2.00 ± 0%      2.00 ± 0%     ~     (all equal)
GCD10x10/WithXY-8                     5.00 ± 0%      6.00 ± 0%  +20.00%  (p=0.000 n=20+20)
GCD10x100/WithoutXY-8                 4.00 ± 0%      4.00 ± 0%     ~     (all equal)
GCD10x100/WithXY-8                    9.00 ± 0%     12.00 ± 0%  +33.33%  (p=0.000 n=20+20)
GCD10x1000/WithoutXY-8                4.00 ± 0%      4.00 ± 0%     ~     (all equal)
GCD10x1000/WithXY-8                   11.0 ± 0%      12.0 ± 0%   +9.09%  (p=0.000 n=20+20)
GCD10x10000/WithoutXY-8               4.00 ± 0%      4.00 ± 0%     ~     (all equal)
GCD10x10000/WithXY-8                  11.0 ± 0%      12.0 ± 0%   +9.09%  (p=0.000 n=20+20)
GCD10x100000/WithoutXY-8              4.00 ± 0%      4.00 ± 0%     ~     (all equal)
GCD10x100000/WithXY-8                 11.0 ± 0%      12.0 ± 0%   +9.09%  (p=0.000 n=20+20)
GCD100x100/WithoutXY-8                6.00 ± 0%     10.00 ± 0%  +66.67%  (p=0.000 n=20+20)
GCD100x100/WithXY-8                   9.00 ± 0%     15.00 ± 0%  +66.67%  (p=0.000 n=20+20)
GCD100x1000/WithoutXY-8               6.00 ± 0%      8.00 ± 0%  +33.33%  (p=0.000 n=20+20)
GCD100x1000/WithXY-8                  12.0 ± 0%      13.0 ± 0%   +8.33%  (p=0.000 n=20+20)
GCD100x10000/WithoutXY-8              6.00 ± 0%      8.00 ± 0%  +33.33%  (p=0.000 n=20+20)
GCD100x10000/WithXY-8                 12.0 ± 0%      13.0 ± 0%   +8.33%  (p=0.000 n=20+20)
GCD100x100000/WithoutXY-8             6.00 ± 0%      8.00 ± 0%  +33.33%  (p=0.000 n=20+20)
GCD100x100000/WithXY-8                12.0 ± 0%      13.0 ± 0%   +8.33%  (p=0.000 n=20+20)
GCD1000x1000/WithoutXY-8              10.0 ± 0%      10.0 ± 0%     ~     (all equal)
GCD1000x1000/WithXY-8                 19.0 ± 0%      20.0 ± 0%   +5.26%  (p=0.000 n=20+20)
GCD1000x10000/WithoutXY-8             8.00 ± 0%      8.00 ± 0%     ~     (all equal)
GCD1000x10000/WithXY-8                26.0 ± 0%      26.0 ± 0%     ~     (all equal)
GCD1000x100000/WithoutXY-8            8.00 ± 0%      8.00 ± 0%     ~     (all equal)
GCD1000x100000/WithXY-8               27.0 ± 0%      27.0 ± 0%     ~     (all equal)
GCD10000x10000/WithoutXY-8            10.0 ± 0%      10.0 ± 0%     ~     (all equal)
GCD10000x10000/WithXY-8               76.0 ± 0%      78.0 ± 0%   +2.63%  (p=0.000 n=20+20)
GCD10000x100000/WithoutXY-8           8.00 ± 0%      8.00 ± 0%     ~     (all equal)
GCD10000x100000/WithXY-8               174 ± 0%       174 ± 0%     ~     (all equal)
GCD100000x100000/WithoutXY-8          10.0 ± 0%      10.0 ± 0%     ~     (all equal)
GCD100000x100000/WithXY-8              645 ± 0%       647 ± 0%   +0.31%  (p=0.000 n=20+20)
Hilbert-8                            14.1k ± 0%     14.3k ± 0%   +0.92%  (p=0.000 n=20+20)
Binomial-8                            38.0 ± 0%      38.0 ± 0%     ~     (all equal)
QuoRem-8                              0.00           0.00          ~     (all equal)
Exp-8                                 21.0 ± 0%      21.0 ± 0%     ~     (all equal)
Exp2-8                                22.0 ± 0%      22.0 ± 0%     ~     (all equal)
Bitset-8                              0.00           0.00          ~     (all equal)
BitsetNeg-8                           0.00           0.00          ~     (all equal)
BitsetOrig-8                          1.00 ± 0%      1.00 ± 0%     ~     (all equal)
BitsetNegOrig-8                       2.00 ± 0%      2.00 ± 0%     ~     (all equal)
ModSqrt225_Tonelli-8                  85.0 ± 0%      86.0 ± 0%   +1.18%  (p=0.000 n=20+20)
ModSqrt225_3Mod4-8                    25.0 ± 0%      25.0 ± 0%     ~     (all equal)
ModSqrt231_Tonelli-8                  80.0 ± 0%      80.0 ± 0%     ~     (all equal)
ModSqrt231_5Mod8-8                    32.0 ± 0%      32.0 ± 0%     ~     (all equal)
ModInverse-8                          11.0 ± 0%      11.0 ± 0%     ~     (all equal)
Sqrt-8                                13.0 ± 0%      13.0 ± 0%     ~     (all equal)
IntSqr/1-8                            0.00           0.00          ~     (all equal)
IntSqr/2-8                            0.00           0.00          ~     (all equal)
IntSqr/3-8                            0.00           0.00          ~     (all equal)
IntSqr/5-8                            0.00           0.00          ~     (all equal)
IntSqr/8-8                            0.00           0.00          ~     (all equal)
IntSqr/10-8                           0.00           0.00          ~     (all equal)
IntSqr/20-8                           1.00 ± 0%      1.00 ± 0%     ~     (all equal)
IntSqr/30-8                           1.00 ± 0%      1.00 ± 0%     ~     (all equal)
IntSqr/50-8                           1.00 ± 0%      1.00 ± 0%     ~     (all equal)
IntSqr/80-8                           1.00 ± 0%      1.00 ± 0%     ~     (all equal)
IntSqr/100-8                          1.00 ± 0%      1.00 ± 0%     ~     (all equal)
IntSqr/200-8                          1.00 ± 0%      1.00 ± 0%     ~     (all equal)
IntSqr/300-8                          3.00 ± 0%      3.00 ± 0%     ~     (all equal)
IntSqr/500-8                          3.00 ± 0%      3.00 ± 0%     ~     (all equal)
IntSqr/800-8                          9.00 ± 0%      9.00 ± 0%     ~     (all equal)
IntSqr/1000-8                         9.00 ± 0%      9.00 ± 0%     ~     (all equal)
Div/20/10-8                           0.00           0.00          ~     (all equal)
Div/200/100-8                         0.00           0.00          ~     (all equal)
Div/2000/1000-8                       0.00           0.00          ~     (all equal)
Div/20000/10000-8                     0.00           0.00          ~     (all equal)
Div/200000/100000-8                   0.00           0.00          ~     (all equal)
Mul-8                                 2.00 ± 0%      2.00 ± 0%     ~     (all equal)
ZeroShifts/Shl-8                      1.00 ± 0%      1.00 ± 0%     ~     (all equal)
ZeroShifts/ShlSame-8                  0.00           0.00          ~     (all equal)
ZeroShifts/Shr-8                      1.00 ± 0%      1.00 ± 0%     ~     (all equal)
ZeroShifts/ShrSame-8                  0.00           0.00          ~     (all equal)
Exp3Power/0x10-8                      4.00 ± 0%      4.00 ± 0%     ~     (all equal)
Exp3Power/0x40-8                      4.00 ± 0%      4.00 ± 0%     ~     (all equal)
Exp3Power/0x100-8                     5.00 ± 0%      5.00 ± 0%     ~     (all equal)
Exp3Power/0x400-8                     7.00 ± 0%      7.00 ± 0%     ~     (all equal)
Exp3Power/0x1000-8                    11.0 ± 0%      11.0 ± 0%     ~     (all equal)
Exp3Power/0x4000-8                    15.0 ± 0%      15.0 ± 0%     ~     (all equal)
Exp3Power/0x10000-8                   29.0 ± 0%      29.0 ± 0%     ~     (all equal)
Exp3Power/0x40000-8                    140 ± 0%       140 ± 0%     ~     (all equal)
Exp3Power/0x100000-8                 1.12k ± 0%     1.12k ± 0%     ~     (all equal)
Exp3Power/0x400000-8                 9.88k ± 0%     9.88k ± 0%     ~     (p=0.747 n=17+19)
Fibo-8                                 739 ± 0%       743 ± 0%   +0.54%  (p=0.000 n=20+20)
NatSqr/1-8                            1.00 ± 0%      1.00 ± 0%     ~     (all equal)
NatSqr/2-8                            1.00 ± 0%      1.00 ± 0%     ~     (all equal)
NatSqr/3-8                            1.00 ± 0%      1.00 ± 0%     ~     (all equal)
NatSqr/5-8                            1.00 ± 0%      1.00 ± 0%     ~     (all equal)
NatSqr/8-8                            1.00 ± 0%      1.00 ± 0%     ~     (all equal)
NatSqr/10-8                           1.00 ± 0%      1.00 ± 0%     ~     (all equal)
NatSqr/20-8                           2.00 ± 0%      2.00 ± 0%     ~     (all equal)
NatSqr/30-8                           2.00 ± 0%      2.00 ± 0%     ~     (all equal)
NatSqr/50-8                           2.00 ± 0%      2.00 ± 0%     ~     (all equal)
NatSqr/80-8                           2.00 ± 0%      2.00 ± 0%     ~     (all equal)
NatSqr/100-8                          2.00 ± 0%      2.00 ± 0%     ~     (all equal)
NatSqr/200-8                          2.00 ± 0%      2.00 ± 0%     ~     (all equal)
NatSqr/300-8                          4.00 ± 0%      4.00 ± 0%     ~     (all equal)
NatSqr/500-8                          4.00 ± 0%      4.00 ± 0%     ~     (all equal)
NatSqr/800-8                          10.0 ± 0%      10.0 ± 0%     ~     (all equal)
NatSqr/1000-8                         10.0 ± 0%      10.0 ± 0%     ~     (all equal)
NatSetBytes/8-8                       0.00           0.00          ~     (all equal)
NatSetBytes/24-8                      1.00 ± 0%      1.00 ± 0%     ~     (all equal)
NatSetBytes/128-8                     1.00 ± 0%      1.00 ± 0%     ~     (all equal)
NatSetBytes/7-8                       0.00           0.00          ~     (all equal)
NatSetBytes/23-8                      1.00 ± 0%      1.00 ± 0%     ~     (all equal)
NatSetBytes/127-8                     1.00 ± 0%      1.00 ± 0%     ~     (all equal)
ScanPi-8                              60.0 ± 0%      61.0 ± 0%   +1.67%  (p=0.000 n=20+20)
StringPiParallel-8                    24.0 ± 0%      24.0 ± 0%     ~     (all equal)
Scan/10/Base2-8                       1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Scan/100/Base2-8                      1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Scan/1000/Base2-8                     1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Scan/10000/Base2-8                    1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Scan/100000/Base2-8                   1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Scan/10/Base8-8                       1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Scan/100/Base8-8                      1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Scan/1000/Base8-8                     1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Scan/10000/Base8-8                    1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Scan/100000/Base8-8                   1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Scan/10/Base10-8                      1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Scan/100/Base10-8                     1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Scan/1000/Base10-8                    1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Scan/10000/Base10-8                   1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Scan/100000/Base10-8                  1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Scan/10/Base16-8                      1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Scan/100/Base16-8                     1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Scan/1000/Base16-8                    1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Scan/10000/Base16-8                   1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Scan/100000/Base16-8                  1.00 ± 0%      1.00 ± 0%     ~     (all equal)
String/10/Base2-8                     1.00 ± 0%      1.00 ± 0%     ~     (all equal)
String/100/Base2-8                    1.00 ± 0%      1.00 ± 0%     ~     (all equal)
String/1000/Base2-8                   1.00 ± 0%      1.00 ± 0%     ~     (all equal)
String/10000/Base2-8                  1.00 ± 0%      1.00 ± 0%     ~     (all equal)
String/100000/Base2-8                 1.00 ± 0%      1.00 ± 0%     ~     (all equal)
String/10/Base8-8                     1.00 ± 0%      1.00 ± 0%     ~     (all equal)
String/100/Base8-8                    1.00 ± 0%      1.00 ± 0%     ~     (all equal)
String/1000/Base8-8                   1.00 ± 0%      1.00 ± 0%     ~     (all equal)
String/10000/Base8-8                  1.00 ± 0%      1.00 ± 0%     ~     (all equal)
String/100000/Base8-8                 1.00 ± 0%      1.00 ± 0%     ~     (all equal)
String/10/Base10-8                    2.00 ± 0%      2.00 ± 0%     ~     (all equal)
String/100/Base10-8                   2.00 ± 0%      2.00 ± 0%     ~     (all equal)
String/1000/Base10-8                  3.00 ± 0%      3.00 ± 0%     ~     (all equal)
String/10000/Base10-8                 3.00 ± 0%      3.00 ± 0%     ~     (all equal)
String/100000/Base10-8                3.00 ± 0%      3.00 ± 0%     ~     (all equal)
String/10/Base16-8                    1.00 ± 0%      1.00 ± 0%     ~     (all equal)
String/100/Base16-8                   1.00 ± 0%      1.00 ± 0%     ~     (all equal)
String/1000/Base16-8                  1.00 ± 0%      1.00 ± 0%     ~     (all equal)
String/10000/Base16-8                 1.00 ± 0%      1.00 ± 0%     ~     (all equal)
String/100000/Base16-8                1.00 ± 0%      1.00 ± 0%     ~     (all equal)
LeafSize/0-8                          10.0 ± 0%      10.0 ± 0%     ~     (all equal)
LeafSize/1-8                          13.0 ± 0%      13.0 ± 0%     ~     (all equal)
LeafSize/2-8                          13.0 ± 0%      13.0 ± 0%     ~     (all equal)
LeafSize/3-8                          13.0 ± 0%      13.0 ± 0%     ~     (all equal)
LeafSize/4-8                          13.0 ± 0%      13.0 ± 0%     ~     (all equal)
LeafSize/5-8                          13.0 ± 0%      13.0 ± 0%     ~     (all equal)
LeafSize/6-8                          12.0 ± 0%      12.0 ± 0%     ~     (all equal)
LeafSize/7-8                          12.0 ± 0%      12.0 ± 0%     ~     (all equal)
LeafSize/8-8                          12.0 ± 0%      12.0 ± 0%     ~     (all equal)
LeafSize/9-8                          12.0 ± 0%      12.0 ± 0%     ~     (all equal)
LeafSize/10-8                         12.0 ± 0%      12.0 ± 0%     ~     (all equal)
LeafSize/11-8                         12.0 ± 0%      12.0 ± 0%     ~     (all equal)
LeafSize/12-8                         12.0 ± 0%      12.0 ± 0%     ~     (all equal)
LeafSize/13-8                         12.0 ± 0%      12.0 ± 0%     ~     (all equal)
LeafSize/14-8                         12.0 ± 0%      12.0 ± 0%     ~     (all equal)
LeafSize/15-8                         12.0 ± 0%      12.0 ± 0%     ~     (all equal)
LeafSize/16-8                         12.0 ± 0%      12.0 ± 0%     ~     (all equal)
LeafSize/32-8                         12.0 ± 0%      12.0 ± 0%     ~     (all equal)
LeafSize/64-8                         11.0 ± 0%      11.0 ± 0%     ~     (all equal)
ProbablyPrime/n=0-8                   52.0 ± 0%      52.0 ± 0%     ~     (all equal)
ProbablyPrime/n=1-8                   73.0 ± 0%      73.0 ± 0%     ~     (all equal)
ProbablyPrime/n=5-8                    157 ± 0%       157 ± 0%     ~     (all equal)
ProbablyPrime/n=10-8                   262 ± 0%       262 ± 0%     ~     (all equal)
ProbablyPrime/n=20-8                   472 ± 0%       472 ± 0%     ~     (all equal)
ProbablyPrime/Lucas-8                 22.0 ± 0%      22.0 ± 0%     ~     (all equal)
ProbablyPrime/MillerRabinBase2-8      29.0 ± 0%      29.0 ± 0%     ~     (all equal)
FloatSqrt/64-8                        9.00 ± 0%     10.00 ± 0%  +11.11%  (p=0.000 n=20+20)
FloatSqrt/128-8                       12.0 ± 0%      13.0 ± 0%   +8.33%  (p=0.000 n=20+20)
FloatSqrt/256-8                       8.00 ± 0%      8.00 ± 0%     ~     (all equal)
FloatSqrt/1000-8                      9.00 ± 0%      9.00 ± 0%     ~     (all equal)
FloatSqrt/10000-8                     14.0 ± 0%      14.0 ± 0%     ~     (all equal)
FloatSqrt/100000-8                    33.0 ± 0%      33.0 ± 0%     ~     (all equal)
FloatSqrt/1000000-8                  1.16k ± 0%     1.16k ± 0%     ~     (all equal)
[Geo mean]                            6.62           6.76        +2.09%

Change-Id: Id9df4157cac1e07721e35cff7fcdefe60703873a
Reviewed-on: https://go-review.googlesource.com/c/150999
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-11-28 17:38:46 +00:00
Brian Kessler 979d9027ae math/bits: define Div to panic when y<=hi
Div panics when y<=hi because either the quotient overflows
the size of the output or division by zero occurs when y==0.
This provides a uniform behavior for all implementations.

Fixes #28316

Change-Id: If23aeb10e0709ee1a60b7d614afc9103d674a980
Reviewed-on: https://go-review.googlesource.com/c/149517
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-11-27 05:04:33 +00:00
Brian Kessler ead5d1e316 math/bits: panic when y<=hi in Div
Explicitly check for divide-by-zero/overflow and panic with the appropriate
runtime error.  The additional checks have basically no effect on performance
since the branch is easily predicted.

name     old time/op  new time/op  delta
Div-4    53.9ns ± 1%  53.0ns ± 1%  -1.59%  (p=0.016 n=4+5)
Div32-4  17.9ns ± 0%  18.4ns ± 0%  +2.56%  (p=0.008 n=5+5)
Div64-4  53.5ns ± 0%  53.3ns ± 0%    ~     (p=0.095 n=5+5)

Updates #28316

Change-Id: I36297ee9946cbbc57fefb44d1730283b049ecf57
Reviewed-on: https://go-review.googlesource.com/c/144377
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-11-27 05:04:14 +00:00
Brad Fitzpatrick 3813edf26e all: use "reports whether" consistently in the few places that didn't
Go documentation style for boolean funcs is to say:

    // Foo reports whether ...
    func Foo() bool

(rather than "returns true if")

This CL also replaces 4 uses of "iff" with the same "reports whether"
wording, which doesn't lose any meaning, and will prevent people from
sending typo fixes when they don't realize it's "if and only if". In
the past I think we've had the typo CLs updated to just say "reports
whether". So do them all at once.

(Inspired by the addition of another "returns true if" in CL 146938
in fd_plan9.go)

Created with:

$ perl -i -npe 's/returns true if/reports whether/' $(git grep -l "returns true iff" | grep -v vendor)
$ perl -i -npe 's/returns true if/reports whether/' $(git grep -l "returns true if" | grep -v vendor)

Change-Id: Ided502237f5ab0d25cb625dbab12529c361a8b9f
Reviewed-on: https://go-review.googlesource.com/c/147037
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-11-02 22:47:58 +00:00
Robert Griesemer c86d464734 math/big: shallow copies of Int/Rat/Float are not supported (documentation)
Fixes #28423.

Change-Id: Ie57ade565d0407a4bffaa86fb4475ff083168e79
Reviewed-on: https://go-review.googlesource.com/c/145537
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-10-29 18:23:31 +00:00
hearot f28191340e math/big: fix a formula used as documentation
The function documentation was wrong, it was using a wrong parameter. This change
replaces it with the right parameter.

The wrong formula was: q = (u1<<_W + u0 - r)/y
The function has got a parameter "v" (of type Word), not a parameter "y".
So, the right formula is: q = (u1<<_W + u0 - r)/v

Fixes #28444

Change-Id: I82e57ba014735a9fdb6262874ddf498754d30d33
Reviewed-on: https://go-review.googlesource.com/c/145280
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-10-28 16:58:20 +00:00
Keith Randall 899f3a2892 cmd/compile: intrinsify math/bits.Add on amd64
name             old time/op  new time/op  delta
Add-8            1.11ns ± 0%  1.18ns ± 0%   +6.31%  (p=0.029 n=4+4)
Add32-8          1.02ns ± 0%  1.02ns ± 1%     ~     (p=0.333 n=4+5)
Add64-8          1.11ns ± 1%  1.17ns ± 0%   +5.79%  (p=0.008 n=5+5)
Add64multiple-8  4.35ns ± 1%  0.86ns ± 0%  -80.22%  (p=0.000 n=5+4)

The individual ops are a bit slower (but still very fast).
Using the ops in carry chains is very fast.

Update #28273

Change-Id: Id975f76df2b930abf0e412911d327b6c5b1befe5
Reviewed-on: https://go-review.googlesource.com/c/144257
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-10-25 19:47:00 +00:00
Brian Kessler 127c51e48c math/bits: correct BenchmarkSub64
Previously, the benchmark was measuring Add64 instead of Sub64.

Change-Id: I0cf30935c8a4728bead9868834377aae0b34f008
Reviewed-on: https://go-review.googlesource.com/c/144380
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-10-24 14:53:19 +00:00
Igor Zhilianin f90e89e675 all: fix a bunch of misspellings
Change-Id: If2954bdfc551515403706b2cd0dde94e45936e08
GitHub-Last-Rev: d4cfc41a55
GitHub-Pull-Request: golang/go#28049
Reviewed-on: https://go-review.googlesource.com/c/140299
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-10-06 15:40:03 +00:00
Plekhanov Maxim 47e71f3b69 math: use Abs in Pow rather than if x < 0 { x = -x }
name     old time/op  new time/op  delta
PowInt   55.7ns ± 1%  53.4ns ± 2%  -4.15%  (p=0.000 n=9+9)
PowFrac   133ns ± 1%   133ns ± 2%    ~     (p=0.587 n=8+9)

Change-Id: Ica0f4c2cbd554f2195c6d1762ed26742ff8e3924
Reviewed-on: https://go-review.googlesource.com/c/85375
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-10-04 17:33:04 +00:00
Plekhanov Maxim 497d24178f math: use Abs in Mod rather than if x < 0 { x = -x}
goos: linux
goarch: amd64
pkg: math

name  old time/op  new time/op  delta
Mod   64.7ns ± 2%  63.7ns ± 2%  -1.52%  (p=0.003 n=8+10)

Change-Id: I851bec0fd6c223dab73e4a680b7393d49e81a0e8
Reviewed-on: https://go-review.googlesource.com/c/85095
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-10-04 17:32:44 +00:00
Zhou Peng b8ac64a581 all: this big patch remove whitespace from assembly files
Don't worry, this patch just remove trailing whitespace from
assembly files, and does not touch any logical changes.

Change-Id: Ia724ac0b1abf8bc1e41454bdc79289ef317c165d
Reviewed-on: https://go-review.googlesource.com/c/113595
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-10-03 15:28:51 +00:00
fanzha02 a19a83c8ef cmd/compile: optimize math.Float64(32)bits and math.Float64(32)frombits on arm64
Use float <-> int register moves without conversion instead of stores
and loads to move float <-> int values.

Math package benchmark results.
name                 old time/op  new time/op  delta
Acosh                 153ns ± 0%   147ns ± 0%   -3.92%  (p=0.000 n=10+10)
Asinh                 183ns ± 0%   177ns ± 0%   -3.28%  (p=0.000 n=10+10)
Atanh                 157ns ± 0%   155ns ± 0%   -1.27%  (p=0.000 n=10+10)
Atan2                 118ns ± 0%   117ns ± 1%   -0.59%  (p=0.003 n=10+10)
Cbrt                  119ns ± 0%   114ns ± 0%   -4.20%  (p=0.000 n=10+10)
Copysign             7.51ns ± 0%  6.51ns ± 0%  -13.32%  (p=0.000 n=9+10)
Cos                  73.1ns ± 0%  70.6ns ± 0%   -3.42%  (p=0.000 n=10+10)
Cosh                  119ns ± 0%   121ns ± 0%   +1.68%  (p=0.000 n=10+9)
ExpGo                 154ns ± 0%   149ns ± 0%   -3.05%  (p=0.000 n=9+10)
Expm1                 101ns ± 0%    99ns ± 0%   -1.88%  (p=0.000 n=10+10)
Exp2Go                150ns ± 0%   146ns ± 0%   -2.67%  (p=0.000 n=10+10)
Abs                  7.01ns ± 0%  6.01ns ± 0%  -14.27%  (p=0.000 n=10+9)
Mod                   234ns ± 0%   212ns ± 0%   -9.40%  (p=0.000 n=9+10)
Frexp                34.5ns ± 0%  30.0ns ± 0%  -13.04%  (p=0.000 n=10+10)
Gamma                 112ns ± 0%   111ns ± 0%   -0.89%  (p=0.000 n=10+10)
Hypot                73.6ns ± 0%  68.6ns ± 0%   -6.79%  (p=0.000 n=10+10)
HypotGo              77.1ns ± 0%  72.1ns ± 0%   -6.49%  (p=0.000 n=10+10)
Ilogb                31.0ns ± 0%  28.0ns ± 0%   -9.68%  (p=0.000 n=10+10)
J0                    437ns ± 0%   434ns ± 0%   -0.62%  (p=0.000 n=10+10)
J1                    433ns ± 0%   431ns ± 0%   -0.46%  (p=0.000 n=10+10)
Jn                    927ns ± 0%   922ns ± 0%   -0.54%  (p=0.000 n=10+10)
Ldexp                41.5ns ± 0%  37.0ns ± 0%  -10.84%  (p=0.000 n=9+10)
Log                   124ns ± 0%   118ns ± 0%   -4.84%  (p=0.000 n=10+9)
Logb                 34.0ns ± 0%  32.0ns ± 0%   -5.88%  (p=0.000 n=10+10)
Log1p                 110ns ± 0%   108ns ± 0%   -1.82%  (p=0.000 n=10+10)
Log10                 136ns ± 0%   132ns ± 0%   -2.94%  (p=0.000 n=10+10)
Log2                 51.6ns ± 0%  47.1ns ± 0%   -8.72%  (p=0.000 n=10+10)
Nextafter32          33.0ns ± 0%  30.5ns ± 0%   -7.58%  (p=0.000 n=10+10)
Nextafter64          29.0ns ± 0%  26.5ns ± 0%   -8.62%  (p=0.000 n=10+10)
PowInt                169ns ± 0%   160ns ± 0%   -5.33%  (p=0.000 n=10+10)
PowFrac               375ns ± 0%   361ns ± 0%   -3.73%  (p=0.000 n=10+10)
RoundToEven          14.0ns ± 0%  12.5ns ± 0%  -10.71%  (p=0.000 n=10+10)
Remainder             206ns ± 0%   192ns ± 0%   -6.80%  (p=0.000 n=10+9)
Signbit              6.01ns ± 0%  5.51ns ± 0%   -8.32%  (p=0.000 n=10+9)
Sin                  70.1ns ± 0%  69.6ns ± 0%   -0.71%  (p=0.000 n=10+10)
Sincos               99.1ns ± 0%  99.6ns ± 0%   +0.50%  (p=0.000 n=9+10)
SqrtGoLatency         178ns ± 0%   146ns ± 0%  -17.70%  (p=0.000 n=8+10)
SqrtPrime            9.19µs ± 0%  9.20µs ± 0%   +0.01%  (p=0.000 n=9+9)
Tanh                  125ns ± 1%   127ns ± 0%   +1.36%  (p=0.000 n=10+10)
Y0                    428ns ± 0%   426ns ± 0%   -0.47%  (p=0.000 n=10+10)
Y1                    431ns ± 0%   429ns ± 0%   -0.46%  (p=0.000 n=10+9)
Yn                    906ns ± 0%   901ns ± 0%   -0.55%  (p=0.000 n=10+10)
Float64bits          4.50ns ± 0%  3.50ns ± 0%  -22.22%  (p=0.000 n=10+10)
Float64frombits      4.00ns ± 0%  3.50ns ± 0%  -12.50%  (p=0.000 n=10+9)
Float32bits          4.50ns ± 0%  3.50ns ± 0%  -22.22%  (p=0.002 n=8+10)
Float32frombits      4.00ns ± 0%  3.50ns ± 0%  -12.50%  (p=0.000 n=10+10)

Change-Id: Iba829e15d5624962fe0c699139ea783efeefabc2
Reviewed-on: https://go-review.googlesource.com/129715
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-09-17 20:49:04 +00:00
Brian Kessler 13de5e7f7f math/bits: add extended precision Add, Sub, Mul, Div
Port math/big pure go versions of add-with-carry, subtract-with-borrow,
full-width multiply, and full-width divide.

Updates #24813

Change-Id: Ifae5d2f6ee4237137c9dcba931f69c91b80a4b1c
Reviewed-on: https://go-review.googlesource.com/123157
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-09-11 17:54:20 +00:00
Eric Ponce ded9411580 math: add Round and RoundToEven examples
Change-Id: Ibef5f96ea588d17eac1c96ee3992e01943ba0fef
Reviewed-on: https://go-review.googlesource.com/131496
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-08-28 05:22:41 +00:00
Brian Kessler a3381faf81 math/big: streamline divLarge initialization
The divLarge code contained "todo"s about avoiding alias
and clear calls in the initialization of variables.  By
rearranging the order of initialization and always using
an auxiliary variable for the shifted divisor, all of these
calls can be safely avoided.  On average, normalizing
the divisor (shift>0) is required 31/32 or 63/64 of the
time.  If one always performs the shift into an auxiliary
variable first, this avoids the need to check for aliasing of
vIn in the output variables u and z.  The remainder u is
initialized via a left shift of uIn and thus needs no
alias check against uIn.  Since uIn and vIn were both used,
z needs no alias checks except against u which is used for
storage of the remainder. This change has a minimal impact
on performance (see below), but cleans up the initialization
code and eliminates the "todo"s.

name                 old time/op  new time/op  delta
Div/20/10-4          86.7ns ± 6%  85.7ns ± 5%    ~     (p=0.841 n=5+5)
Div/200/100-4         523ns ± 5%   502ns ± 3%  -4.13%  (p=0.024 n=5+5)
Div/2000/1000-4      2.55µs ± 3%  2.59µs ± 5%    ~     (p=0.548 n=5+5)
Div/20000/10000-4    80.4µs ± 4%  80.0µs ± 2%    ~     (p=1.000 n=5+5)
Div/200000/100000-4  6.43ms ± 6%  6.35ms ± 4%    ~     (p=0.548 n=5+5)

Fixes #22928

Change-Id: I30d8498ef1cf8b69b0f827165c517bc25a5c32d7
Reviewed-on: https://go-review.googlesource.com/130775
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-08-22 22:54:01 +00:00
Brian Kessler 3fd62ce910 math/big: optimize multiplication by 2 and 1/2 in float Sqrt
The Sqrt code previously used explicit constants for 2 and 1/2.  This change
replaces multiplication by these constants with increment and decrement of
the floating point exponent directly.  This improves performance by ~7-10%
for small inputs and minimal improvement for large inputs.

name                 old time/op    new time/op    delta
FloatSqrt/64-4         1.39µs ± 0%    1.29µs ± 3%   -7.01%  (p=0.016 n=4+5)
FloatSqrt/128-4        2.84µs ± 0%    2.60µs ± 1%   -8.33%  (p=0.008 n=5+5)
FloatSqrt/256-4        3.24µs ± 1%    2.91µs ± 2%  -10.00%  (p=0.008 n=5+5)
FloatSqrt/1000-4       7.42µs ± 1%    6.74µs ± 0%   -9.16%  (p=0.008 n=5+5)
FloatSqrt/10000-4      65.9µs ± 1%    65.3µs ± 4%     ~     (p=0.310 n=5+5)
FloatSqrt/100000-4     1.57ms ± 8%    1.52ms ± 1%     ~     (p=0.111 n=5+4)
FloatSqrt/1000000-4     127ms ± 1%     126ms ± 1%     ~     (p=0.690 n=5+5)

Change-Id: Id81ac842a9d64981e001c4ca3ff129eebd227593
Reviewed-on: https://go-review.googlesource.com/130835
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-08-22 21:02:21 +00:00
Ian Lance Taylor 1ae2eed0b2 math: test for pos/neg zero return of Ceil/Floor/Trunc
Ceil and Trunc of -0.2 return -0, not +0, but we didn't test that.

Updates #23647

Change-Id: Idbd4699376abfb4ca93f16c73c114d610d86a9f2
Reviewed-on: https://go-review.googlesource.com/91335
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-08-21 22:31:37 +00:00
Iskander Sharipov 0fbaf6ca8b math,net: omit explicit true tag expr in switch
Performed `switch true {}` => `switch {}` replacement.

Found using https://go-critic.github.io/overview.html#switchTrue-ref

Change-Id: Ib39ea98531651966a5a56b7bd729b46e4eeb7f7c
Reviewed-on: https://go-review.googlesource.com/123378
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-08-20 22:15:59 +00:00
Michael Munday edae0ff8c1 math: use s390x mnemonics rather than binary encodings
TMLL, LGDR and LDGR have all been added to the Go assembler
previously, so we don't need to encode them using WORD and BYTE
directives anymore. This is purely a cosmetic change, it does not
change the contents of any object files.

Change-Id: I93f815b91be310858297d8a0dc9e6d8e3f09dd65
Reviewed-on: https://go-review.googlesource.com/129895
Run-TryBot: Michael Munday <mike.munday@ibm.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-08-20 17:42:08 +00:00
Benjamin Cable 669ac1228a math/rand: improve package documentation
Notify readers that interval notation is used.
Fixes: #26765

Change-Id: Id02a7fcffbf41699e85631badeee083f5d4b2201
Reviewed-on: https://go-review.googlesource.com/127549
Reviewed-by: Rob Pike <r@golang.org>
2018-08-03 23:08:42 +00:00
Keith Randall 51ddeb9965 math: add tests for erf and erfc
Test large but not infinite arguments.

This CL adds a test which breaks s390x.  Don't submit until
a fix for that is figured out.

Update #26477

Change-Id: Ic86739fe3554e87d7f8e15482875c198fcf1d59c
Reviewed-on: https://go-review.googlesource.com/125641
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-08-03 03:38:52 +00:00