Go documentation style for boolean funcs is to say:
// Foo reports whether ...
func Foo() bool
(rather than "returns true if")
This CL also replaces 4 uses of "iff" with the same "reports whether"
wording, which doesn't lose any meaning, and will prevent people from
sending typo fixes when they don't realize it's "if and only if". In
the past I think we've had the typo CLs updated to just say "reports
whether". So do them all at once.
(Inspired by the addition of another "returns true if" in CL 146938
in fd_plan9.go)
Created with:
$ perl -i -npe 's/returns true if/reports whether/' $(git grep -l "returns true iff" | grep -v vendor)
$ perl -i -npe 's/returns true if/reports whether/' $(git grep -l "returns true if" | grep -v vendor)
Change-Id: Ided502237f5ab0d25cb625dbab12529c361a8b9f
Reviewed-on: https://go-review.googlesource.com/c/147037
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The function documentation was wrong, it was using a wrong parameter. This change
replaces it with the right parameter.
The wrong formula was: q = (u1<<_W + u0 - r)/y
The function has got a parameter "v" (of type Word), not a parameter "y".
So, the right formula is: q = (u1<<_W + u0 - r)/v
Fixes#28444
Change-Id: I82e57ba014735a9fdb6262874ddf498754d30d33
Reviewed-on: https://go-review.googlesource.com/c/145280
Reviewed-by: Robert Griesemer <gri@golang.org>
goos: linux
goarch: amd64
pkg: math
name old time/op new time/op delta
Mod 64.7ns ± 2% 63.7ns ± 2% -1.52% (p=0.003 n=8+10)
Change-Id: I851bec0fd6c223dab73e4a680b7393d49e81a0e8
Reviewed-on: https://go-review.googlesource.com/c/85095
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Don't worry, this patch just remove trailing whitespace from
assembly files, and does not touch any logical changes.
Change-Id: Ia724ac0b1abf8bc1e41454bdc79289ef317c165d
Reviewed-on: https://go-review.googlesource.com/c/113595
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Port math/big pure go versions of add-with-carry, subtract-with-borrow,
full-width multiply, and full-width divide.
Updates #24813
Change-Id: Ifae5d2f6ee4237137c9dcba931f69c91b80a4b1c
Reviewed-on: https://go-review.googlesource.com/123157
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Change-Id: Ibef5f96ea588d17eac1c96ee3992e01943ba0fef
Reviewed-on: https://go-review.googlesource.com/131496
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The divLarge code contained "todo"s about avoiding alias
and clear calls in the initialization of variables. By
rearranging the order of initialization and always using
an auxiliary variable for the shifted divisor, all of these
calls can be safely avoided. On average, normalizing
the divisor (shift>0) is required 31/32 or 63/64 of the
time. If one always performs the shift into an auxiliary
variable first, this avoids the need to check for aliasing of
vIn in the output variables u and z. The remainder u is
initialized via a left shift of uIn and thus needs no
alias check against uIn. Since uIn and vIn were both used,
z needs no alias checks except against u which is used for
storage of the remainder. This change has a minimal impact
on performance (see below), but cleans up the initialization
code and eliminates the "todo"s.
name old time/op new time/op delta
Div/20/10-4 86.7ns ± 6% 85.7ns ± 5% ~ (p=0.841 n=5+5)
Div/200/100-4 523ns ± 5% 502ns ± 3% -4.13% (p=0.024 n=5+5)
Div/2000/1000-4 2.55µs ± 3% 2.59µs ± 5% ~ (p=0.548 n=5+5)
Div/20000/10000-4 80.4µs ± 4% 80.0µs ± 2% ~ (p=1.000 n=5+5)
Div/200000/100000-4 6.43ms ± 6% 6.35ms ± 4% ~ (p=0.548 n=5+5)
Fixes#22928
Change-Id: I30d8498ef1cf8b69b0f827165c517bc25a5c32d7
Reviewed-on: https://go-review.googlesource.com/130775
Reviewed-by: Robert Griesemer <gri@golang.org>
Ceil and Trunc of -0.2 return -0, not +0, but we didn't test that.
Updates #23647
Change-Id: Idbd4699376abfb4ca93f16c73c114d610d86a9f2
Reviewed-on: https://go-review.googlesource.com/91335
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
TMLL, LGDR and LDGR have all been added to the Go assembler
previously, so we don't need to encode them using WORD and BYTE
directives anymore. This is purely a cosmetic change, it does not
change the contents of any object files.
Change-Id: I93f815b91be310858297d8a0dc9e6d8e3f09dd65
Reviewed-on: https://go-review.googlesource.com/129895
Run-TryBot: Michael Munday <mike.munday@ibm.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Test large but not infinite arguments.
This CL adds a test which breaks s390x. Don't submit until
a fix for that is figured out.
Update #26477
Change-Id: Ic86739fe3554e87d7f8e15482875c198fcf1d59c
Reviewed-on: https://go-review.googlesource.com/125641
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
The existing implementation produces correct results with a wide range of inputs,
but invalid results asymptotically. With this change we ensure correct asymptotic results
on s390x
Fixes#26477
Change-Id: I760c1f8177f7cab2d7622ab9a926dfb1f8113b49
Reviewed-on: https://go-review.googlesource.com/127119
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
For modular exponentiation, negative exponents can be handled using
the following relation.
for y < 0: x**y mod m == (x**(-1))**|y| mod m
First compute ModInverse(x, m) and then compute the exponentiation
with the absolute value of the exponent. Non-modular exponentiation
with a negative exponent still returns 1.
Fixes#25865
Change-Id: I2a35986a24794b48e549c8de935ac662d217d8a0
Reviewed-on: https://go-review.googlesource.com/118562
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
Handling of sign bit as defined by IEEE 754-2008, section 6.3:
When the sum of two operands with opposite signs (or the difference of
two operands with like signs) is exactly zero, the sign of that sum (or
difference) shall be +0 in all rounding-direction attributes except
roundTowardNegative; under that attribute, the sign of an exact zero
sum (or difference) shall be −0. However, x+x = x−(−x) retains the same
sign as x even when x is zero.
This change handles the special case of Add/Sub resulting in exactly zero
when the rounding mode is ToNegativeInf setting the sign bit accordingly.
Fixes#25798
Change-Id: I4d0715fa3c3e4a3d8a4d7861dc1d6423c8b1c68c
Reviewed-on: https://go-review.googlesource.com/117495
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
Each URL was manually verified to ensure it did not serve up incorrect
content.
Change-Id: I4dc846227af95a73ee9a3074d0c379ff0fa955df
Reviewed-on: https://go-review.googlesource.com/115798
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
For primes congruent to 5 mod 8 there is a simple deterministic
method for calculating the modular square root due to Atkin,
using one exponentiation and 4 multiplications.
A. Atkin. Probabilistic primality testing, summary by F. Morain.
Research Report 1779, INRIA, pages 159–163, 1992.
This increases the speed of modular square roots for these primes
considerably.
name old time/op new time/op delta
ModSqrt231_5Mod8-4 1.03ms ± 2% 0.36ms ± 5% -65.06% (p=0.008 n=5+5)
Change-Id: I024f6e514bbca8d634218983117db2afffe615fe
Reviewed-on: https://go-review.googlesource.com/99615
Reviewed-by: Robert Griesemer <gri@golang.org>
This commit adds the wasm architecture to the math package.
Updates #18892
Change-Id: I5cc38552a31b193d35fb81ae87600a76b8b9e9b5
Reviewed-on: https://go-review.googlesource.com/106996
Reviewed-by: Cherry Zhang <cherryyz@google.com>
This makes math/bits not have any explicit imports even
when compiling tests and thereby avoids import cycles when
dependencies of testing want to import math/bits.
Change-Id: I95eccae2f5c4310e9b18124abfa85212dfbd9daa
Reviewed-on: https://go-review.googlesource.com/110479
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Currently, there is no check for a negative modulus in ModInverse.
Negative moduli are passed internally to GCD, which returns 0 for
negative arguments. Mod is symmetric with respect to negative moduli,
so the calculation can be done by just negating the modulus before
passing the arguments to GCD.
Fixes#24949
Change-Id: Ifd1e64c9b2343f0489c04ab65504e73a623378c7
Reviewed-on: https://go-review.googlesource.com/108115
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org>
Currently, the behavior of z.ModInverse(g, n) is undefined
when g and n are not relatively prime. In that case, no
ModInverse exists which can be easily checked during the
computation of the ModInverse. Because the ModInverse does
not indicate whether the inverse exists, there are reimplementations
of a "checked" ModInverse in crypto/rsa. This change removes the
undefined behavior. If the ModInverse does not exist, the receiver z
is unchanged and the return value is nil. This matches the behavior of
ModSqrt for the case where the square root does not exist.
name old time/op new time/op delta
ModInverse-4 2.40µs ± 4% 2.22µs ± 0% -7.74% (p=0.016 n=5+4)
name old alloc/op new alloc/op delta
ModInverse-4 1.36kB ± 0% 1.17kB ± 0% -14.12% (p=0.008 n=5+5)
name old allocs/op new allocs/op delta
ModInverse-4 10.0 ± 0% 9.0 ± 0% -10.00% (p=0.008 n=5+5)
Fixes#24922
Change-Id: If7f9d491858450bdb00f1e317152f02493c9c8a8
Reviewed-on: https://go-review.googlesource.com/108996
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
One might try to implement the Mod or Remainder function with the expression
x - TRUNC(x/y + 0.5)*y, but in fact this method is wrong, because the rounding
of (x/y + 0.5) to initialize the argument of TRUNC may lose too much precision.
However, the current test cases can not detect this error. This CL adds two
test cases to prevent people from continuing to do such attempts.
Change-Id: I6690f5cffb21bf8ae06a314b7a45cafff8bcee13
Reviewed-on: https://go-review.googlesource.com/84275
Reviewed-by: Robert Griesemer <gri@golang.org>
Made constant names more idiomatic,
moved some constants to function seedrand,
and found better name for _M.
Change-Id: I192172f398378bef486a5bbceb6ba86af48ebcc9
Reviewed-on: https://go-review.googlesource.com/107135
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Updates #22830
Due to not checking if the output slices alias in divLarge,
calls of the form z.div(z, x, y) caused the slice z
to attempt to be used to store both the quotient and the
remainder of the division. CL 78995 applies an alias
check to correct that error. This CL cleans up the
additional div calls that attempt to supply the same slice
to hold both the quotient and remainder.
Note that the call in expNN was responsible for the reported
error in r.Exp(x, 1, m) when r was initialized to a non-zero value.
The second instance in expNNMontgomery did not result in an error
due to the size of the arguments.
// RR = 2**(2*_W*len(m)) mod m
RR := nat(nil).setWord(1)
zz := nat(nil).shl(RR, uint(2*numWords*_W))
_, RR = RR.div(RR, zz, m)
Specifically,
cap(RR) == 5 after setWord(1) due to const e = 4 in z.make(1)
len(zz) == 2*len(m) + 1 after shifting left, numWords = len(m)
Reusing the backing array for z and z2 in div was only triggered if
cap(RR) >= len(zz) + 1 and len(m) > 1 so that divLarge was called.
But, 5 < 2*len(m) + 2 if len(m) > 1, so new arrays were allocated
and the error was never triggered in this case.
Change-Id: Iedac80dbbde13216c94659e84d28f6f4be3aaf24
Reviewed-on: https://go-review.googlesource.com/81055
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
The go/printer (and thus gofmt) uses a heuristic to determine
whether to break alignment between elements of an expression
list which is spread across multiple lines. The heuristic only
kicked in if the entry sizes (character length) was above a
certain threshold (20) and the ratio between the previous and
current entry size was above a certain value (4).
This heuristic worked reasonably most of the time, but also
led to unfortunate breaks in many cases where a single entry
was suddenly much smaller (or larger) then the previous one.
The behavior of gofmt was sufficiently mysterious in some of
these situations that many issues were filed against it.
The simplest solution to address this problem is to remove
the heuristic altogether and have a programmer introduce
empty lines to force different alignments if it improves
readability. The problem with that approach is that the
places where it really matters, very long tables with many
(hundreds, or more) entries, may be machine-generated and
not "post-processed" by a human (e.g., unicode/utf8/tables.go).
If a single one of those entries is overlong, the result
would be that the alignment would force all comments or
values in key:value pairs to be adjusted to that overlong
value, making the table hard to read (e.g., that entry may
not even be visible on screen and all other entries seem
spaced out too wide).
Instead, we opted for a slightly improved heuristic that
behaves much better for "normal", human-written code.
1) The threshold is increased from 20 to 40. This disables
the heuristic for many common cases yet even if the alignment
is not "ideal", 40 is not that many characters per line with
todays screens, making it very likely that the entire line
remains "visible" in an editor.
2) Changed the heuristic to not simply look at the size ratio
between current and previous line, but instead considering the
geometric mean of the sizes of the previous (aligned) lines.
This emphasizes the "overall picture" of the previous lines,
rather than a single one (which might be an outlier).
3) Changed the ratio from 4 to 2.5. Now that we ignore sizes
below 40, a ratio of 4 would mean that a new entry would have
to be 4 times bigger (160) or smaller (10) before alignment
would be broken. A ratio of 2.5 seems more sensible.
Applied updated gofmt to all of src and misc. Also tested
against several former issues that complained about this
and verified that the output for the given examples is
satisfactory (added respective test cases).
Some of the files changed because they were not gofmt-ed
in the first place.
For #644.
For #7335.
For #10392.
(and probably more related issues)
Fixes#22852.
Change-Id: I5e48b3d3b157a5cf2d649833b7297b33f43a6f6e
That "else" was needed due to gc DCE limitations.
Now it's not the case and we can avoid go lint complaints.
(See #23521 and https://golang.org/cl/91056.)
There is inlining test for bigEndianWord, so if test
is passing, no performance regression should occur.
Change-Id: Id84d63f361e5e51a52293904ff042966c83c16e9
Reviewed-on: https://go-review.googlesource.com/104555
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Floating point test instructions allow special cases (NaN, ±∞ and
a few other useful properties) to be checked directly.
This CL adds the following instructions to the assembler:
* LTEBR - load and test (float32)
* LTDBR - load and test (float64)
* TCEB - test data class (float32)
* TCDB - test data class (float64)
Note that I have only added immediate versions of the 'test data
class' instructions for now as that's the only case I think the
compiler will use.
Change-Id: I3398aab2b3a758bf909bd158042234030c8af582
Reviewed-on: https://go-review.googlesource.com/104457
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Before this change, the smallest result Ldexp can handle was
ldexp(2, -1075), which is SmallestNonzeroFloat64.
There are some numbers below it should also be rounded to
SmallestNonzeroFloat64. The change fixes this.
Fixes#23407
Change-Id: I76f4cb005a6e9ccdd95b5e5c734079fd5d29e4aa
Reviewed-on: https://go-review.googlesource.com/87338
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
One could expect calls like
z.mant.shl(z.mant, shiftAmount)
(or higher-level-functions calls that use lhs/rhs) to be almost free
when shiftAmount = 0; and expect calls like
z.mant.shl(x.mant, 0)
to have the same cost of a x.mant -> z.mant copy. Neither of this
things are currently true.
For an 800 words nat, the first kind of calls cost ~800ns for rigth
shifts and ~3.5µs for left shift; while the second kind of calls are
doing more work than necessary by calling shlVU/shrVU.
This change makes the first kind of calls ({Shl,Shr}Same) almost free,
and the second kind of calls ({Shl,Shr}) about 30% faster.
name old time/op new time/op delta
ZeroShifts/Shl-4 3.64µs ± 3% 2.49µs ± 1% -31.55% (p=0.000 n=10+10)
ZeroShifts/ShlSame-4 3.65µs ± 1% 0.01µs ± 1% -99.85% (p=0.000 n=9+9)
ZeroShifts/Shr-4 3.65µs ± 1% 2.49µs ± 1% -31.91% (p=0.000 n=10+10)
ZeroShifts/ShrSame-4 825ns ± 0% 6ns ± 1% -99.33% (p=0.000 n=9+10)
During go test math/big, the shl zeroshift fastpath is triggered 1380
times; while the shr fastpath is triggered 153334 times(!).
Change-Id: I5f92b304a40638bd8453a86c87c58e54b337bcdf
Reviewed-on: https://go-review.googlesource.com/87660
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
The Newton sqrtInverse procedure we use to compute Float.Sqrt should
not allocate a number of times proportional to the number of Newton
iterations we need to reach the desired precision.
At the beginning the function the target precision is known, so even
if we do want to perform the early steps at low precisions (to save
time), it's still possible to pre-allocate larger backing arrays, both
for the temp variables in the loop and the variable that'll hold the
final result.
There's one complication. At the following line:
u.Sub(three, u)
the Sub method will allocate, because the receiver aliases one of the
arguments, and the large backing array we initially allocated for u
will be replaced by a smaller one allocated by Sub. We can work around
this by introducing a second temp variable u2 that we use to hold the
Sub call result.
Overall, the sqrtInverse procedure still allocates a number of times
proportional to the number of Newton steps, because unfortunately a
few of the Mul calls in the Newton function allocate; but at least we
allocate less in the function itself.
FloatSqrt/256-4 1.97µs ± 1% 1.84µs ± 1% -6.61% (p=0.000 n=8+8)
FloatSqrt/1000-4 4.80µs ± 3% 4.28µs ± 1% -10.78% (p=0.000 n=8+8)
FloatSqrt/10000-4 40.0µs ± 1% 38.3µs ± 1% -4.15% (p=0.000 n=8+8)
FloatSqrt/100000-4 955µs ± 1% 932µs ± 0% -2.49% (p=0.000 n=8+7)
FloatSqrt/1000000-4 79.8ms ± 1% 79.4ms ± 1% ~ (p=0.105 n=8+8)
name old alloc/op new alloc/op delta
FloatSqrt/256-4 816B ± 0% 512B ± 0% -37.25% (p=0.000 n=8+8)
FloatSqrt/1000-4 2.50kB ± 0% 1.47kB ± 0% -41.03% (p=0.000 n=8+8)
FloatSqrt/10000-4 23.5kB ± 0% 18.2kB ± 0% -22.62% (p=0.000 n=8+8)
FloatSqrt/100000-4 251kB ± 0% 173kB ± 0% -31.26% (p=0.000 n=8+8)
FloatSqrt/1000000-4 4.61MB ± 0% 2.86MB ± 0% -37.90% (p=0.000 n=8+8)
name old allocs/op new allocs/op delta
FloatSqrt/256-4 12.0 ± 0% 8.0 ± 0% -33.33% (p=0.000 n=8+8)
FloatSqrt/1000-4 19.0 ± 0% 9.0 ± 0% -52.63% (p=0.000 n=8+8)
FloatSqrt/10000-4 35.0 ± 0% 14.0 ± 0% -60.00% (p=0.000 n=8+8)
FloatSqrt/100000-4 55.0 ± 0% 23.0 ± 0% -58.18% (p=0.000 n=8+8)
FloatSqrt/1000000-4 122 ± 0% 75 ± 0% -38.52% (p=0.000 n=8+8)
Change-Id: I950dbf61a40267a6cca82ae72524c3024bcb149c
Reviewed-on: https://go-review.googlesource.com/87659
Reviewed-by: Robert Griesemer <gri@golang.org>
useSSE41 was used inside asm implementation of floor to select between base and ss4 code path.
We intrinsified floor and left asm functions as a backup for non-sse4 systems.
This made variable unused, so remove it.
Change-Id: Ia2633de7c7cb1ef1d5b15a2366b523e481b722d9
Reviewed-on: https://go-review.googlesource.com/97935
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Fatalf calls in two Float tests use the %s verb with Floats values,
which is not allowed and results in failure messages that look like
this:
float_test.go:1385: i = 0, prec = 1, ToZero:
%!s(*big.Float=1) [0]
/ %!s(*big.Float=1) [0]
= %!s(*big.Float=0.0625)
want %!s(*big.Float=1)
Switch to %v.
Change-Id: Ifdc80bf19c91ca1b190f6551a6d0a51b42ed5919
Reviewed-on: https://go-review.googlesource.com/87199
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
In the comment of seedPost, the word: condiiton was changed to: condition
Change-Id: I8967cc0e9f5d37776bada96cc1443c8bf46e1117
Reviewed-on: https://go-review.googlesource.com/86156
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Fixes#23224
The previous Pow code had an optimization for
powers equal to ±0.5 that used Sqrt for
increased accuracy/speed. This caused special
cases involving powers of ±0.5 to disagree with
the Pow spec. This change places the Sqrt optimization
after all of the special case handling.
Change-Id: I6bf757f6248256b29cc21725a84e27705d855369
Reviewed-on: https://go-review.googlesource.com/85660
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Dim performance has regressed by 14% vs 1.9 on amd64.
Current pure go version of Dim is faster and,
what is even more important for performance, is inlinable, so
instead of tweaking asm implementation, just remove it.
I had to update BenchmarkDim, because it was simply reloading
constant(answer) in a loop.
Perf data below:
name old time/op new time/op delta
Dim-6 6.79ns ± 0% 1.60ns ± 1% -76.39% (p=0.000 n=7+10)
If I modify benchmark to be the same as in this CL results are even better:
name old time/op new time/op delta
Dim-6 10.2ns ± 0% 1.6ns ± 1% -84.27% (p=0.000 n=8+10)
Updates #21913
Change-Id: I00e23c8affc293531e1d9f0e0e49f3a525634f53
Reviewed-on: https://go-review.googlesource.com/80695
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
In nat.divLarge (having signature (z nat).divLarge(u, uIn, v nat)),
we check whether z aliases uIn or v, but aliasing is currently not
checked for the u parameter.
Unfortunately, z and u aliasing each other can in some cases cause
errors in the computation.
The q return parameter (which will hold the result's quotient), is
unconditionally initialized as
q = z.make(m + 1)
When cap(z) ≥ m+1, z.make() will reuse z's backing array, causing q
and z to share the same backing array. If then z aliases u, setting q
during the quotient computation will then corrupt u, which at that
point already holds computation state.
To fix this, we add an alias(z, u) check at the beginning of the
function, taking care of aliasing the same way we already do for uIn
and v.
Fixes#22830
Change-Id: I3ab81120d5af6db7772a062bb1dfc011de91f7ad
Reviewed-on: https://go-review.googlesource.com/78995
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
Branch cuts for the elementary complex functions along real or imaginary axes
should be resolved in floating point calculations by one-sided continuity with
signed zero as described in:
"Branch Cuts for Complex Elementary Functions or Much Ado About Nothing's Sign Bit"
W. Kahan
Available at: https://people.freebsd.org/~das/kahan86branch.pdf
And as described in the C99 standard which is claimed as the original cephes source.
Sqrt did not return the correct branch when imag(x) == 0. The branch is now
determined by sign(imag(x)). This incorrect branch choice was affecting the behavior
of the Trigonometric/Hyperbolic functions that use Sqrt in intermediate calculations.
Asin, Asinh and Atan had spurious domain checks, whereas the functions should be valid
over the whole complex plane with appropriate branch cuts.
Fixes#6888
Change-Id: I9b1278af54f54bfb4208276ae345bbd3ddf3ec83
Reviewed-on: https://go-review.googlesource.com/46492
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
This reverts CL 55972.
Reason for revert: this changes Perm's behavior unnecessarily.
I asked for this change originally but I now regret it.
Reverting so that I don't have to justify it in Go 1.10 release notes.
Edited to keep the change to rand_test.go, which seems to have
been mostly unrelated.
Fixes#22744.
Change-Id: If8bb1bcde3ced0db2fdcd0aa65ab128613686c66
Reviewed-on: https://go-review.googlesource.com/78195
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
If a composite literal contains any comments on their own lines without
any elements, the printer would unindent the comments.
The comments in this edge case are written when the closing '}' is
written. Indent and outdent first so that the indentation is
interspersed before the comment is written.
Also note that the go/printer golden tests don't show the exact same
behaviour that gofmt does. Added a TODO to figure this out in a separate
CL.
While at it, ensure that the tree conforms to gofmt. The changes are
unrelated to this indentation fix, however.
Fixes#22355.
Change-Id: I5ac25ac6de95a236f1e123479127cc4dd71e93fe
Reviewed-on: https://go-review.googlesource.com/74232
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Robert Griesemer <gri@golang.org>
A clarifying comment was added to indicate that overflow of a
single Word is not possible in the single digit calculation.
Lehmer's paper includes a proof of the bounds on the size of the
cosequences (u0, u1, u2, v0, v1, v2).
Change-Id: I98127a07aa8f8fe44814b74b2bc6ff720805194b
Reviewed-on: https://go-review.googlesource.com/77451
Reviewed-by: Robert Griesemer <gri@golang.org>
Right rotation is achieved using negative k in RotateLeft*(x, k). Add
examples demonstrating that functionality.
Change-Id: I15dab159accd2937cb18d3fa8ca32da8501567d3
Reviewed-on: https://go-review.googlesource.com/75371
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
An initial draft of the Newton code for Float.Sqrt was structured like
this:
for condition
// do Newton iteration..
prec *= 2
since prec, at the end of the loop, was double the precision used in
the last Newton iteration, the termination condition was set to
2*limit. The code was later rewritten in the form
for condition
prec *= 2
// do Newton iteration..
but condition was not updated, and it's still 2*limit, which is about
double what we actually need, and is triggering the execution of an
additional, and unnecessary, Newton iteration.
This change adjusts the Newton termination condition to the (correct)
value of z.prec, plus 32 guard bits as a safety margin.
name old time/op new time/op delta
FloatSqrt/64-4 798ns ± 3% 802ns ± 3% ~ (p=0.458 n=8+8)
FloatSqrt/128-4 1.65µs ± 1% 1.65µs ± 1% ~ (p=0.290 n=8+8)
FloatSqrt/256-4 3.10µs ± 1% 2.10µs ± 0% -32.32% (p=0.000 n=8+7)
FloatSqrt/1000-4 8.83µs ± 1% 4.91µs ± 2% -44.39% (p=0.000 n=8+8)
FloatSqrt/10000-4 107µs ± 1% 40µs ± 1% -62.68% (p=0.000 n=8+8)
FloatSqrt/100000-4 2.91ms ± 1% 0.96ms ± 1% -67.13% (p=0.000 n=8+8)
FloatSqrt/1000000-4 240ms ± 1% 80ms ± 1% -66.66% (p=0.000 n=8+8)
name old alloc/op new alloc/op delta
FloatSqrt/64-4 416B ± 0% 416B ± 0% ~ (all equal)
FloatSqrt/128-4 720B ± 0% 720B ± 0% ~ (all equal)
FloatSqrt/256-4 1.34kB ± 0% 0.82kB ± 0% -39.29% (p=0.000 n=8+8)
FloatSqrt/1000-4 5.09kB ± 0% 2.50kB ± 0% -50.94% (p=0.000 n=8+8)
FloatSqrt/10000-4 45.9kB ± 0% 23.5kB ± 0% -48.81% (p=0.000 n=8+8)
FloatSqrt/100000-4 533kB ± 0% 251kB ± 0% -52.90% (p=0.000 n=8+8)
FloatSqrt/1000000-4 9.21MB ± 0% 4.61MB ± 0% -49.98% (p=0.000 n=8+8)
name old allocs/op new allocs/op delta
FloatSqrt/64-4 9.00 ± 0% 9.00 ± 0% ~ (all equal)
FloatSqrt/128-4 13.0 ± 0% 13.0 ± 0% ~ (all equal)
FloatSqrt/256-4 15.0 ± 0% 12.0 ± 0% -20.00% (p=0.000 n=8+8)
FloatSqrt/1000-4 24.0 ± 0% 19.0 ± 0% -20.83% (p=0.000 n=8+8)
FloatSqrt/10000-4 40.0 ± 0% 35.0 ± 0% -12.50% (p=0.000 n=8+8)
FloatSqrt/100000-4 66.0 ± 0% 55.0 ± 0% -16.67% (p=0.000 n=8+8)
FloatSqrt/1000000-4 143 ± 0% 122 ± 0% -14.69% (p=0.000 n=8+8)
Change-Id: I4868adb7f8960f2ca20e7792734c2e6211669fc0
Reviewed-on: https://go-review.googlesource.com/75010
Reviewed-by: Robert Griesemer <gri@golang.org>
Adds the following s390x test under mask (immediate) instructions:
TMHH
TMHL
TMLH
TMLL
These are useful for testing bits and are already used in the math package.
Change-Id: Idffb3f83b238dba76ac1e42ac6b0bf7f1d11bea2
Reviewed-on: https://go-review.googlesource.com/41092
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
By calculating dim directly, rather than calling max, we can simplify
the generated code significantly. The compiler now reports that dim
is easily inlineable, but it can't be inlined because there is still
an assembly stub for Dim.
Since dim is now very simple I no longer think it is worth having
assembly implementations of it. I have therefore removed the s390x
assembly. Removing the other assembly for Dim is #21913.
name old time/op new time/op delta
Dim 4.29ns ± 0% 3.53ns ± 0% -17.62% (p=0.000 n=9+8)
Change-Id: Ic38a6b51603cbc661dcdb868ecf2b1947e9f399e
Reviewed-on: https://go-review.googlesource.com/64194
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
This change adds a Square root method to the big.Float type, with
signature
(z *Float) Sqrt(x *Float) *Float
Fixes#20460
Change-Id: I050aaed0615fe0894e11c800744600648343c223
Reviewed-on: https://go-review.googlesource.com/67830
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
Rounding ties to even is statistically useful for some applications.
This implementation completes IEEE float64 rounding mode support (in
addition to Round, Ceil, Floor, Trunc).
This function avoids subtle faults found in ad-hoc implementations, and
is simple enough to be inlined by the compiler.
Fixes#21748
Change-Id: I09415df2e42435f9e7dabe3bdc0148e9b9ebd609
Reviewed-on: https://go-review.googlesource.com/61211
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Increase MaxBase from 36 to 62 and extend the conversion
alphabet with the upper-case letters 'A' to 'Z'. For int
conversions with bases <= 36, the letters 'A' to 'Z' have
the same values (10 to 35) as the corresponding lower-case
letters. For conversion bases > 36 up to 62, the upper-case
letters have the values 36 to 61.
Added MaxBase to api/except.txt: Clients should not make
assumptions about the value of MaxBase being constant.
The core of the change is in natconv.go. The remaining
changes are adjusted tests and documentation.
Fixes#21558.
Change-Id: I5f74da633caafca03993e13f32ac9546c572cc84
Reviewed-on: https://go-review.googlesource.com/65970
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
This removes some of the []byte/string conversions currently
existing in the (un)marshaling methods of Int and Rat.
For Int we introduce a new function (*Int).setFromScanner() essentially
implementing the SetString method being given an io.ByteScanner instead
of a string. So we can handle the string case in (*Int).SetString with
a *strings.Reader and the []byte case in (*Int).UnmarshalText() with a
*bytes.Reader now avoiding the []byte/string conversion here.
For Rat we introduce a new function (*Rat).marshal() essentially
implementing the String method outputting []byte instead of string.
Using this new function and the same formatting rules as in
(*Rat).RatString we can implement (*Rat).MarshalText() without
the []byte/string conversion it used to have.
Change-Id: Ic5ef246c1582c428a40f214b95a16671ef0a06d9
Reviewed-on: https://go-review.googlesource.com/65950
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
strings.IndexByte was introduced in go1.2 and it can be used
effectively wherever the second argument to strings.Index is
exactly one byte long.
This avoids generating unnecessary string symbols and saves
a few calls to strings.Index.
Change-Id: I1ab5edb7c4ee9058084cfa57cbcc267c2597e793
Reviewed-on: https://go-review.googlesource.com/65930
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
- using FMA and AVX instructions if available to speed-up
Exp calculation on amd64
- using a data table instead of #define'ed constants because
these instructions do not support loading floating point immediates.
One has to use a memory operand / register.
- Benchmark results on Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz:
Original vs New (non-FMA path)
name old time/op new time/op delta
Exp 16.0ns ± 1% 16.1ns ± 3% ~ (p=0.308 n=9+10)
Original vs New (FMA path)
name old time/op new time/op delta
Exp 16.0ns ± 1% 13.7ns ± 2% -14.80% (p=0.000 n=9+10)
Change-Id: I3d8986925d82b39b95ee979ae06f59d7e591d02e
Reviewed-on: https://go-review.googlesource.com/62590
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Shuffle panics if n < 0, not n <= 0. The comment for the (*Rand).Shuffle
function is already accurate.
Change-Id: I073049310bca9632e50e9ca3ff79eec402122793
Reviewed-on: https://go-review.googlesource.com/63750
Reviewed-by: Ian Lance Taylor <iant@golang.org>
CL 62250 makes constant folding a bit more aggressive and these
benchmarks were optimized away. This CL adds some indirection to
the function arguments to stop them being folded.
The Copysign benchmark is a bit faster because I've left one
argument as a constant and it can be partially folded.
old CL 62250 this CL
Copysign 1.24ns ± 0% 0.34ns ± 2% 1.02ns ± 2%
Abs 0.67ns ± 0% 0.35ns ± 3% 0.67ns ± 0%
Signbit 0.87ns ± 0% 0.35ns ± 2% 0.87ns ± 1%
Change-Id: I9604465a87d7aa29f4bd6009839c8ee354be3cd7
Reviewed-on: https://go-review.googlesource.com/62450
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Perm and Shuffle are fundamentally doing the same work.
This change makes Perm's algorithm match Shuffle's.
In addition to allowing developers to switch more
easily between the two methods, it affords a nice speed-up:
name old time/op new time/op delta
Perm3-8 75.7ns ± 1% 51.8ns ± 1% -31.59% (p=0.000 n=9+8)
Perm30-8 610ns ± 1% 405ns ± 1% -33.67% (p=0.000 n=9+9)
This change alters the output from Perm,
given the same Source and seed.
This is a change from Go 1.0 behavior.
This necessitates updating the regression test.
This also changes the number of calls made to the Source
during Perm, which changes the output of the math/rand examples.
This also slightly perturbs the output of Perm,
nudging it out of the range currently accepted by TestUniformFactorial.
However, it is complete unclear that the helpers relied on
by TestUniformFactorial are correct. That is #21211.
This change updates checkSimilarDistribution to respect
closeEnough for standard deviations, which makes the test pass.
The whole situation is muddy; see #21211 for details.
There is an alternative implementation of Perm
that avoids initializing m, which is more similar
to the existing implementation, plus some optimizations:
func (r *Rand) Perm(n int) []int {
m := make([]int, n)
max31 := n
if n > 1<<31-1-1 {
max31 = 1<<31 - 1 - 1
}
i := 1
for ; i < max31; i++ {
j := r.int31n(int32(i + 1))
m[i] = m[j]
m[j] = i
}
for ; i < n; i++ {
j := r.Int63n(int64(i + 1))
m[i] = m[j]
m[j] = i
}
return m
}
This is a tiny bit faster than the implementation
actually used in this change:
name old time/op new time/op delta
Perm3-8 51.8ns ± 1% 50.3ns ± 1% -2.83% (p=0.000 n=8+9)
Perm30-8 405ns ± 1% 394ns ± 1% -2.66% (p=0.000 n=9+8)
However, 3% in performance doesn't seem worth
having the two algorithms diverge,
nor the reduced readability of this alternative.
Updates #16213.
Change-Id: I11a7441ff8837ee9c241b4c88f7aa905348be781
Reviewed-on: https://go-review.googlesource.com/55972
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
Shuffle uses the Fisher-Yates algorithm.
Since this is new API, it affords us the opportunity
to use a much faster Int31n implementation that mostly avoids division.
As a result, BenchmarkPerm30ViaShuffle is
about 30% faster than BenchmarkPerm30,
despite requiring a separate initialization loop
and using function calls to swap elements.
Fixes#20480
Updates #16213
Updates #21211
Change-Id: Ib8956c4bebed9d84f193eb98282ec16ee7c2b2d5
Reviewed-on: https://go-review.googlesource.com/51891
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This function avoids subtle faults found in many ad-hoc implementations,
and is simple enough to be inlined by the compiler.
Fixes#20100
Change-Id: Ib320254e9b1f1f798c6ef906b116f63bc29e8d08
Reviewed-on: https://go-review.googlesource.com/43652
Reviewed-by: Robert Griesemer <gri@golang.org>
Change-Id: Ic3ce2f3c055f2636ec8fc9cec8592e596b18dc05
Reviewed-on: https://go-review.googlesource.com/54771
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Implement int reg <-> fp reg moves on amd64.
If we see a load to int reg followed by an int->fp move, then we can just
load to the fp reg instead. Same for stores.
math.Abs is now:
MOVQ "".x+8(SP), AX
SHLQ $1, AX
SHRQ $1, AX
MOVQ AX, "".~r1+16(SP)
math.Copysign is now:
MOVQ "".x+8(SP), AX
SHLQ $1, AX
SHRQ $1, AX
MOVQ "".y+16(SP), CX
SHRQ $63, CX
SHLQ $63, CX
ORQ CX, AX
MOVQ AX, "".~r2+24(SP)
math.Float64bits is now:
MOVSD "".x+8(SP), X0
MOVSD X0, "".~r1+16(SP)
(it would be nicer to use a non-SSE reg for this, nothing is perfect)
And due to the fix for #21440, the inlined version of these improve as well.
name old time/op new time/op delta
Abs 1.38ns ± 5% 0.89ns ±10% -35.54% (p=0.000 n=10+10)
Copysign 1.56ns ± 7% 1.35ns ± 6% -13.77% (p=0.000 n=9+10)
Fixes#13095
Change-Id: Ibd7f2792412a6668608780b0688a77062e1f1499
Reviewed-on: https://go-review.googlesource.com/58732
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
This commit defines the inverse of error function (erfinv) in the
math package. The function is based on the rational approximation
of percentage points of normal distribution available at
https://www.jstor.org/stable/pdf/2347330.pdf.
Fixes#6359
Change-Id: Icfe4508f623e0574c7fffdbf7aa929540fd4c944
Reviewed-on: https://go-review.googlesource.com/46990
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
Updates #13745
Recognize z.Mul(x, x) as squaring for Floats and use
the internal z.sqr(x) method for nat on the mantissa.
Change-Id: I0f792157bad93a13cae1aecc4c10bd20c6397693
Reviewed-on: https://go-review.googlesource.com/56774
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
updates #13745
A squared rational is always positive and can not
be reduced since the numerator and denominator had
no previous common factors. The nat multiplication
can be performed using the internal sqr method.
Change-Id: I558f5b38e379bfd26ff163c9489006d7e5a9cfaa
Reviewed-on: https://go-review.googlesource.com/56776
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Found with mvdan.cc/unindent. Prioritized the ones with the biggest wins
for now.
Change-Id: I2b032e45cdd559fc9ed5b1ee4c4de42c4c92e07b
Reviewed-on: https://go-review.googlesource.com/56470
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The existing implementation is translated from C, which uses a
polynomial coefficient very close to 1/6. If the function uses
1/6 as this coeffient, the result of Exp(1) will be more accurate.
And this change doesn't introduce more error to Exp function.
Fixes#20319
Change-Id: I94c236a18cf95570ebb69f7fb99884b0d7cf5f6e
Reviewed-on: https://go-review.googlesource.com/49294
Reviewed-by: Robert Griesemer <gri@golang.org>
updates #13745
Multiprecision squaring can be done in a straightforward manner
with about half the multiplications of a basic multiplication
due to the symmetry of the operands. This change implements
basic squaring for nat types and uses it for Int multiplication
when the same variable is supplied to both arguments of
z.Mul(x, x). This has some overhead to allocate a temporary
variable to hold the cross products, shift them to double and
add them to the diagonal terms. There is a speed benefit in
the intermediate range when the overhead is neglible and the
asymptotic performance of karatsuba multiplication has not been
reached.
basicSqrThreshold = 20
karatsubaSqrThreshold = 400
Were set by running calibrate_test.go to measure timing differences
between the algorithms. Benchmarks for squaring:
name old time/op new time/op delta
IntSqr/1-4 51.5ns ±25% 25.1ns ± 7% -51.38% (p=0.008 n=5+5)
IntSqr/2-4 79.1ns ± 4% 72.4ns ± 2% -8.47% (p=0.008 n=5+5)
IntSqr/3-4 102ns ± 4% 97ns ± 5% ~ (p=0.056 n=5+5)
IntSqr/5-4 161ns ± 4% 163ns ± 7% ~ (p=0.952 n=5+5)
IntSqr/8-4 277ns ± 5% 267ns ± 6% ~ (p=0.087 n=5+5)
IntSqr/10-4 358ns ± 3% 360ns ± 4% ~ (p=0.730 n=5+5)
IntSqr/20-4 1.07µs ± 3% 1.01µs ± 6% ~ (p=0.056 n=5+5)
IntSqr/30-4 2.36µs ± 4% 1.72µs ± 2% -27.03% (p=0.008 n=5+5)
IntSqr/50-4 5.19µs ± 3% 3.88µs ± 4% -25.37% (p=0.008 n=5+5)
IntSqr/80-4 11.3µs ± 4% 8.6µs ± 3% -23.78% (p=0.008 n=5+5)
IntSqr/100-4 16.2µs ± 4% 12.8µs ± 3% -21.49% (p=0.008 n=5+5)
IntSqr/200-4 50.1µs ± 5% 44.7µs ± 3% -10.65% (p=0.008 n=5+5)
IntSqr/300-4 105µs ±11% 95µs ± 3% -9.50% (p=0.008 n=5+5)
IntSqr/500-4 231µs ± 5% 227µs ± 2% ~ (p=0.310 n=5+5)
IntSqr/800-4 496µs ± 9% 459µs ± 3% -7.40% (p=0.016 n=5+5)
IntSqr/1000-4 700µs ± 3% 710µs ± 5% ~ (p=0.841 n=5+5)
Show a speed up of 10-25% in the range where basicSqr is optimal,
improved single word squaring and no significant difference when
the fallback to standard multiplication is used.
Change-Id: Iae2c82ca91cf890823f91e5c83bbe9a2c534b72b
Reviewed-on: https://go-review.googlesource.com/53638
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The current implementation of the extended Euclidean GCD algorithm
calculates both cosequences x and y inside the division loop. This
is unneccessary since the second Bezout coefficient can be obtained
at the end of calculation via a multiplication, subtraction and a
division. In case only one coefficient is needed, e.g. ModInverse
this calculation can be skipped entirely. This is a standard
optimization, see e.g.
"Handbook of Elliptic and Hyperelliptic Curve Cryptography"
Cohen et al pp 191
Available at:
http://cs.ucsb.edu/~koc/ccs130h/2013/EllipticHyperelliptic-CohenFrey.pdf
Updates #15833
Change-Id: I1e0d2e63567cfed97fd955048fe6373d36f22757
Reviewed-on: https://go-review.googlesource.com/50530
Reviewed-by: Robert Griesemer <gri@golang.org>
The current implementation uses a shift and add
loop to compute the product of x's exponent xe and
the integer part of y (yi) for yi up to 1<<63.
Since xe is an 11-bit exponent, this product can be
up to 74-bits and overflow both 32 and 64-bit int.
This change checks whether the accumulated exponent
will fit in the 11-bit float exponent of the output
and breaks out of the loop early if overflow is detected.
The current handling of yi >= 1<<63 uses Exp(y * Log(x))
which incorrectly returns Nan for x<0. In addition,
for y this large, Exp(y * Log(x)) can be enumerated
to only overflow except when x == -1 since the
boundary cases computed exactly:
Pow(NextAfter(1.0, Inf(1)), 1<<63) == 2.72332... * 10^889
Pow(NextAfter(1.0, Inf(-1)), 1<<63) == 1.91624... * 10^-445
exceed the range of float64. So, the call can be
replaced with a simple case statement analgous to
y == Inf that correctly handles x < 0 as well.
Fixes#7394
Change-Id: I6f50dc951f3693697f9669697599860604323102
Reviewed-on: https://go-review.googlesource.com/48290
Reviewed-by: Robert Griesemer <gri@golang.org>
As noted in the TODO comment, the sticky bit is only used
when the rounding bit is zero or the rounding mode is
ToNearestEven. This change makes that check explicit and
will eliminate half the sticky bit calculations on average
when rounding mode is not ToNearestEven.
Change-Id: Ia4709f08f46e682bf97dabe5eb2a10e8e3d7af43
Reviewed-on: https://go-review.googlesource.com/54111
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Add test cases to verify behavior for Ldexp with exponents outside the
range of Minint32/Maxint32, for a gccgo bug.
Test for issue #21323.
Change-Id: Iea67bc6fcfafdfddf515cf7075bdac59360c277a
Reviewed-on: https://go-review.googlesource.com/54230
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The standard deviation of a uniform distribution is size / √12.
The size of the interval [0, 255] is 256, not 255.
While we're here, simplify the expression.
The tests previously passed only because the error margin was large enough.
Sample observed standard deviations while running tests:
73.7893634666819
73.9221651548294
73.8077961697150
73.9084236069471
73.8968446814785
73.8684209136244
73.9774618960282
73.9523483202549
255 / √12 == 73.6121593216772
256 / √12 == 73.9008344562721
Change-Id: I7bc6cdc11e5d098951f2f2133036f62489275979
Reviewed-on: https://go-review.googlesource.com/51310
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Regular HTTP is insecure, oeis.org supports HTTPS and it is actually
used in some other places in the codebase. This changes these final urls
to use HTTPS.
Change-Id: Ia46410a9c7ce67238a10cb6bfffaceca46112f58
Reviewed-on: https://go-review.googlesource.com/52072
Reviewed-by: Alberto Donizetti <alb.donizetti@gmail.com>
Manual hyphenation doesn't work well when text gets reflown,
for example by godoc.
There are a few other manual hyphenations in the tree,
but they are in local comments or comments for unexported functions.
Change-Id: I17c9b1fee1def650da48903b3aae2fa1e1119a65
Reviewed-on: https://go-review.googlesource.com/53510
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Erroneously called OnesCount instead of OnesCount64
Change-Id: Ie877e43f213253e45d31f64931c4a15915849586
Reviewed-on: https://go-review.googlesource.com/53410
Reviewed-by: Chris Broadfoot <cbro@golang.org>
Go doesn't guarantee that the result of floating point operations will
be the same on different architectures. It was not stated in the
documentation, that can lead to confusion.
Fixes#18354
Change-Id: Idb1b4c256fb9a7158a74256136eca3b8ce44476f
Reviewed-on: https://go-review.googlesource.com/34938
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Implements detection of x86 cpu features that
are used in the go standard library.
Changes all standard library packages to use the new cpu package
instead of using runtime internal variables to check x86 cpu features.
Updates: #15403
Change-Id: I2999a10cb4d9ec4863ffbed72f4e021a1dbc4bb9
Reviewed-on: https://go-review.googlesource.com/41476
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This adds math/bits intrinsics for OnesCount, Len, TrailingZeros on
ppc64x.
benchmark old ns/op new ns/op delta
BenchmarkLeadingZeros-16 4.26 1.71 -59.86%
BenchmarkLeadingZeros16-16 3.04 1.83 -39.80%
BenchmarkLeadingZeros32-16 3.31 1.82 -45.02%
BenchmarkLeadingZeros64-16 3.69 1.71 -53.66%
BenchmarkTrailingZeros-16 2.55 1.62 -36.47%
BenchmarkTrailingZeros32-16 2.55 1.77 -30.59%
BenchmarkTrailingZeros64-16 2.78 1.62 -41.73%
BenchmarkOnesCount-16 3.19 0.93 -70.85%
BenchmarkOnesCount32-16 2.55 1.18 -53.73%
BenchmarkOnesCount64-16 3.22 0.93 -71.12%
Update #18616
I also made a change to bits_test.go because when debugging some failures
the output was not quite providing the right argument information.
Change-Id: Ia58d31d1777cf4582a4505f85b11a1202ca07d3e
Reviewed-on: https://go-review.googlesource.com/41630
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
Reviewed-by: Keith Randall <khr@golang.org>
As necessary, math functions were structured to use stubs, so that they can
be accelerated with assembly on any platform.
Technique used was minimax polynomial approximation using tables of
polynomial coefficients, with argument range reduction.
Benchmark New Old Speedup
BenchmarkAcos 12.2 47.5 3.89
BenchmarkAcosh 18.5 56.2 3.04
BenchmarkAsin 13.1 40.6 3.10
BenchmarkAsinh 19.4 62.8 3.24
BenchmarkAtan 10.1 23 2.28
BenchmarkAtanh 19.1 53.2 2.79
BenchmarkAtan2 16.5 33.9 2.05
BenchmarkCbrt 14.8 58 3.92
BenchmarkErf 10.8 20.1 1.86
BenchmarkErfc 11.2 23.5 2.10
BenchmarkExp 8.77 53.8 6.13
BenchmarkExpm1 10.1 38.3 3.79
BenchmarkLog 13.1 40.1 3.06
BenchmarkLog1p 12.7 38.3 3.02
BenchmarkPowInt 31.7 40.5 1.28
BenchmarkPowFrac 33.1 141 4.26
BenchmarkTan 11.5 30 2.61
Accuracy was tested against a high precision
reference function to determine maximum error.
Note: ulperr is error in "units in the last place"
max
ulperr
Acos 1.15
Acosh 1.07
Asin 2.22
Asinh 1.72
Atan 1.41
Atanh 3.00
Atan2 1.45
Cbrt 1.18
Erf 1.29
Erfc 4.82
Exp 1.00
Expm1 2.26
Log 0.94
Log1p 2.39
Tan 3.14
Pow will have 99.99% correctly rounded results with reasonable inputs
producing numeric (non Inf or NaN) results
Change-Id: I850e8cf7b70426e8b54ec49d74acd4cddc8c6cb2
Reviewed-on: https://go-review.googlesource.com/38585
Reviewed-by: Michael Munday <munday@ca.ibm.com>
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
We have dedicated asm implementation of sincos only on 386 and amd64,
on everything else we are just jumping to generic version.
However amd64 version is actually slower than generic one:
Sincos-6 34.4ns ± 0% 24.8ns ± 0% -27.79% (p=0.000 n=8+10)
So remove all sincos*.s and keep only generic and 386.
Updates #19819
Change-Id: I7eefab35743729578264f52f6d23ee2c227c92a5
Reviewed-on: https://go-review.googlesource.com/41200
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
The instructions allow moves between floating point and general
purpose registers without any conversion taking place.
Change-Id: I82c6f3ad9c841a83783b5be80dcf5cd538ff49e6
Reviewed-on: https://go-review.googlesource.com/38777
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Starting in go1.9, the minimum processor requirement for ppc64 is POWER8. So it
may now use the same divWW implementation as ppc64le.
Updates #19074
Change-Id: If1a85f175cda89eee06a1024ccd468da6124c844
Reviewed-on: https://go-review.googlesource.com/39010
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
After https://golang.org/cl/31490 we break false
output dependency for CVTS.. in compiler generated code.
I've looked through asm code, which uses CVTS..
and added XOR to the only case where it affected performance.
Log-6 21.6ns ± 0% 19.9ns ± 0% -7.87% (p=0.000 n=10+10)
Change-Id: I25d9b405e3041a3839b40f9f9a52e708034bb347
Reviewed-on: https://go-review.googlesource.com/38771
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Verified that BenchmarkBitLen time went down from 2.25 ns/op to 0.65 ns/op
an a 2.3 GHz Intel Core i7, before removing that benchmark (now covered by
math/bits benchmarks).
Change-Id: I3890bb7d1889e95b9a94bd68f0bdf06f1885adeb
Reviewed-on: https://go-review.googlesource.com/38464
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
A -0 constant is the same as 0. Use explicit negative zero
for float64 -0.0. Also, fix two test cases that were wrong.
Fixes#19673.
Change-Id: Ic09775f29d9bc2ee7814172e59c4a693441ea730
Reviewed-on: https://go-review.googlesource.com/38463
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
nat.setUint64 is nicely generic.
By assuming 32- or 64-bit words, however,
we can write simpler code,
and eliminate some shifts
in dead code that vet complains about.
Generated code for 64 bit systems is unaltered.
Generated code for 32 bit systems is much better.
For 386, the routine length drops from 325
bytes of code to 271 bytes of code, with fewer loops.
Change-Id: I1bc14c06272dee37a7fcb48d33dd1e621eba945d
Reviewed-on: https://go-review.googlesource.com/38070
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
Change-Id: I6343c162e27e2e492547c96f1fc504909b1c03c0
Reviewed-on: https://go-review.googlesource.com/37793
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Removes an extra function call for TrailingZeroes and thus may
increase chances for inlining.
Change-Id: Iefd8d4402dc89b64baf4e5c865eb3dadade623af
Reviewed-on: https://go-review.googlesource.com/37613
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This change adds math/bits as a new dependency of math/big.
- use bits.LeadingZeroes instead of local implementation
(they are identical, so there's no performance loss here)
- leave other functionality local (ntz, bitLen) since there's
faster implementations in math/big at the moment
Change-Id: I1218aa8a1df0cc9783583b090a4bb5a8a145c4a2
Reviewed-on: https://go-review.googlesource.com/37141
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Removes init function from the math package.
Allows stripping of arrays with pre-computed values
used for Pow10 from binaries if Pow10 is not used.
cmd/go shrinks by 128 bytes.
Fixed small values like 10**-323 being 0 instead of 1e-323.
Overall precision is increased but still not as good as
predefined constants for some inputs.
Samples:
Pow10(208)
before: 1.0000000000000006662e+208
after: 1.0000000000000000959e+208
Pow10(202)
before 1.0000000000000009895e+202
after 1.0000000000000001193e+202
Pow10(60)
before 1.0000000000000001278e+60
after 0.9999999999999999494e+60
Pow10(-100)
before 0.99999999999999938551e-100
after 0.99999999999999989309e-100
Pow10(-200)
before 0.9999999999999988218e-200
after 1.0000000000000001271e-200
name old time/op new time/op delta
Pow10Pos-4 44.6ns ± 2% 1.2ns ± 1% -97.39% (p=0.000 n=19+17)
Pow10Neg-4 50.8ns ± 1% 4.1ns ± 2% -92.02% (p=0.000 n=17+19)
Change-Id: If094034286b8ac64be3a95fd9e8ffa3d4ad39b31
Reviewed-on: https://go-review.googlesource.com/36331
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Test finite negative x with Y0(-1), Y1(-1), Yn(2,-1), Yn(-3,-1).
Also test the special case Yn(0,0).
Fixes#19130.
Change-Id: I95f05a72e1c455ed8ddf202c56f4266f03f370fd
Reviewed-on: https://go-review.googlesource.com/37310
Reviewed-by: Robert Griesemer <gri@golang.org>
For compatibility with math/bits uint operations.
When math/big was written originally, the Go compiler used 32bit
int/uint values even on a 64bit machine. uintptr was the type that
represented the machine register size. Now, the int/uint types are
sized to the native machine register size, so they are the natural
machine Word type.
On most machines, the size of int/uint correspond to the size of
uintptr. On platforms where uint and uintptr have different sizes,
this change may lead to performance differences (e.g., amd64p32).
Change-Id: Ief249c160b707b6441848f20041e32e9e9d8d8ca
Reviewed-on: https://go-review.googlesource.com/37372
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Add exported global variables and store the results of benchmarked
functions in them. This prevents the current compiler optimizations
from removing the instructions that are needed to compute the return
values of the benchmarked functions.
Change-Id: If8b08424e85f3796bb6dd73e761c653abbabcc5e
Reviewed-on: https://go-review.googlesource.com/37195
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
- moved from: x&m>>k | x&^m<<k to: x&m>>k | x<<k&m
This permits use of the same constant m twice (*) which may be
better for machines that can't use large immediate constants
directly with an AND instruction and have to load them explicitly.
*) CPUs don't usually have a &^ instruction, so x&^m becomes x&(^m)
- simplified returns
This improves the generated code because the compiler recognizes
x>>k | x<<k as ROT when k is the bitsize of x.
The 8-bit versions of these instructions can be significantly faster
still if they are replaced with table lookups, as long as the table
is in cache. If the table is not in cache, table-lookup is probably
slower, hence the choice of an explicit register-only implementation
for now.
BenchmarkReverse-8 8.50 6.86 -19.29%
BenchmarkReverse8-8 2.17 1.74 -19.82%
BenchmarkReverse16-8 2.89 2.34 -19.03%
BenchmarkReverse32-8 3.55 2.95 -16.90%
BenchmarkReverse64-8 6.81 5.57 -18.21%
BenchmarkReverseBytes-8 3.49 2.48 -28.94%
BenchmarkReverseBytes16-8 0.93 0.62 -33.33%
BenchmarkReverseBytes32-8 1.55 1.13 -27.10%
BenchmarkReverseBytes64-8 2.47 2.47 +0.00%
Reverse-8 8.50ns ± 0% 6.86ns ± 0% ~ (p=1.000 n=1+1)
Reverse8-8 2.17ns ± 0% 1.74ns ± 0% ~ (p=1.000 n=1+1)
Reverse16-8 2.89ns ± 0% 2.34ns ± 0% ~ (p=1.000 n=1+1)
Reverse32-8 3.55ns ± 0% 2.95ns ± 0% ~ (p=1.000 n=1+1)
Reverse64-8 6.81ns ± 0% 5.57ns ± 0% ~ (p=1.000 n=1+1)
ReverseBytes-8 3.49ns ± 0% 2.48ns ± 0% ~ (p=1.000 n=1+1)
ReverseBytes16-8 0.93ns ± 0% 0.62ns ± 0% ~ (p=1.000 n=1+1)
ReverseBytes32-8 1.55ns ± 0% 1.13ns ± 0% ~ (p=1.000 n=1+1)
ReverseBytes64-8 2.47ns ± 0% 2.47ns ± 0% ~ (all samples are equal)
Change-Id: I0064de8c7e0e568ca7885d6f7064344bef91a06d
Reviewed-on: https://go-review.googlesource.com/37215
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Change-Id: I280c53be455f2fe0474ad577c0f7b7908a4eccb2
Reviewed-on: https://go-review.googlesource.com/36993
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The tests failed to compile when using the math_big_pure_go tag on
s390x.
Change-Id: I2a09f53ff6562ab9bc9b886cffc0f6205bbfcfbb
Reviewed-on: https://go-review.googlesource.com/36956
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
The s390x port was based on the ppc64 port and, because of the way the
port was done, inherited some instructions from it. ppc64 supports
3-operand (4-operand for FMADD etc.) floating point instructions
but s390x doesn't (the destination register is always an input) and
so these were emulated.
There is a bug in the emulation of FMADD whereby if the destination
register is also a source for the multiplication it will be
clobbered. This doesn't break any assembly code in the std lib but
could affect future work.
To fix this I have gone through the floating point instructions and
removed all unnecessary 3-/4-operand emulation. The compiler doesn't
need it and assembly writers don't need it, it's just a source of
bugs.
I've also deleted the FNMADD family of emulated instructions. They
aren't used anywhere.
Change-Id: Ic07cedcf141a6a3b43a0c84895460f6cfbf56c04
Reviewed-on: https://go-review.googlesource.com/33350
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Unlike the pure go implementation used by every other architecture,
the amd64 asm implementation of Exp does not fail early if the
argument is known to overflow. Make it fail early.
Cost of the check is < 1ns (on an old Sandy Bridge machine):
name old time/op new time/op delta
Exp-4 18.3ns ± 1% 18.7ns ± 1% +2.08% (p=0.000 n=18+20)
Fixes#14932Fixes#18912
Change-Id: I04b3f9b4ee853822cbdc97feade726fbe2907289
Reviewed-on: https://go-review.googlesource.com/36271
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
Reviewed-by: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
There is some code value too: types intending to implement
Source64 can write a conversion confirming that.
For #4254 and the Go 1.8 release notes.
Change-Id: I7fc350a84f3a963e4dab317ad228fa340dda5c66
Reviewed-on: https://go-review.googlesource.com/33456
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
After x.ProbablyPrime(n) passes the n Miller-Rabin rounds,
add a Baillie-PSW test before declaring x probably prime.
Although the provable error bounds are unchanged, the empirical
error bounds drop dramatically: there are no known inputs
for which Baillie-PSW gives the wrong answer. For example,
before this CL, big.NewInt(443*1327).ProbablyPrime(1) == true.
Now it is (correctly) false.
The new Baillie-PSW test is two pieces: an added Miller-Rabin
round with base 2, and a so-called extra strong Lucas test.
(See the references listed in prime.go for more details.)
The Lucas test takes about 3.5x as long as the Miller-Rabin round,
which is close to theoretical expectations.
name time/op
ProbablyPrime/Lucas 2.91ms ± 2%
ProbablyPrime/MillerRabinBase2 850µs ± 1%
ProbablyPrime/n=0 3.75ms ± 3%
The speed of prime testing for a prime input does get slower:
name old time/op new time/op delta
ProbablyPrime/n=1 849µs ± 1% 4521µs ± 1% +432.31% (p=0.000 n=10+9)
ProbablyPrime/n=5 4.31ms ± 3% 7.87ms ± 1% +82.70% (p=0.000 n=10+10)
ProbablyPrime/n=10 8.52ms ± 3% 12.28ms ± 1% +44.11% (p=0.000 n=10+10)
ProbablyPrime/n=20 16.9ms ± 2% 21.4ms ± 2% +26.35% (p=0.000 n=9+10)
However, because the Baillie-PSW test is only added when the old
ProbablyPrime(n) would return true, testing composites runs at
the same speed as before, except in the case where the result
would have been incorrect and is now correct.
In particular, the most important use of this code is for
generating random primes in crypto/rand. That use spends
essentially all its time testing composites, so it is not
slowed down by the new Baillie-PSW check:
name old time/op new time/op delta
Prime 104ms ±22% 111ms ±16% ~ (p=0.165 n=10+10)
Thanks to Serhat Şevki Dinçer for CL 20170, which this CL builds on.
Fixes#13229.
Change-Id: Id26dde9b012c7637c85f2e96355d029b6382812a
Reviewed-on: https://go-review.googlesource.com/30770
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
The tree is inconsistent about single l vs double l in those
words in documentation, test messages, and one error value text.
$ git grep -E '[Mm]arshall(|s|er|ers|ed|ing)' | wc -l
42
$ git grep -E '[Mm]arshal(|s|er|ers|ed|ing)' | wc -l
1694
Make it consistently a single l, per earlier decisions. This means
contributors won't be confused by misleading precedence, and it helps
consistency.
Change the spelling in one error value text in newRawAttributes of
crypto/x509 package to be consistent.
This change was generated with:
perl -i -npe 's,([Mm]arshal)l(|s|er|ers|ed|ing),$1$2,' $(git grep -l -E '[Mm]arshall' | grep -v AUTHORS | grep -v CONTRIBUTORS)
Updates #12431.
Follows https://golang.org/cl/14150.
Change-Id: I85d28a2d7692862ccb02d6a09f5d18538b6049a2
Reviewed-on: https://go-review.googlesource.com/33017
Run-TryBot: Minux Ma <minux@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Note, most math functions are structured to use stubs, so that they can
be accelerated with assembly on any platform.
Sinh, cosh, and tanh were not structued with stubs, so this CL does
that. This set of routines was chosen as likely to produce good speedups
with assembly on any platform.
Technique used was minimax polynomial approximation using tables of
polynomial coefficients, with argument range reduction.
A table of scaling factors was also used for cosh and log10.
before after speedup
BenchmarkCos 22.1 ns/op 6.79 ns/op 3.25x
BenchmarkCosh 125 ns/op 11.7 ns/op 10.68x
BenchmarkLog10 48.4 ns/op 12.5 ns/op 3.87x
BenchmarkSin 22.2 ns/op 6.55 ns/op 3.39x
BenchmarkSinh 125 ns/op 14.2 ns/op 8.80x
BenchmarkTanh 65.0 ns/op 15.1 ns/op 4.30x
Accuracy was tested against a high precision
reference function to determine maximum error.
Approximately 4,000,000 points were tested for each function,
producing the following result.
Note: ulperr is error in "units in the last place"
max
ulperr
sin 1.43 (returns NaN beyond +-2^50)
cos 1.79 (returns NaN beyond +-2^50)
cosh 1.05
sinh 3.02
tanh 3.69
log10 1.75
Also includes a set of tests to test non-vector functions even
when SIMD is enabled
Change-Id: Icb45f14d00864ee19ed973d209c3af21e4df4edc
Reviewed-on: https://go-review.googlesource.com/32352
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <munday@ca.ibm.com>
The condition to determine if any further iterations are needed is
evaluated to false in case it encounters a NaN. Instead, flip the
condition to keep looping until the factor is greater than the machine
roundoff error.
Updates #17577
Change-Id: I058abe73fcd49d3ae4e2f7b33020437cc8f290c3
Reviewed-on: https://go-review.googlesource.com/31952
Reviewed-by: Robert Griesemer <gri@golang.org>
Implements Float.Scan which satisfies fmt.Scanner interface.
Also enforces docs' interface implementation claims with compile time
type assertions, that is:
+ Float always implements fmt.Formatter and fmt.Scanner
+ Int always implements fmt.Formatter and fmt.Scanner
+ Rat always implements fmt.Formatter
which will ensure that the API claims are strictly matched.
Also note that Float.Scan doesn't handle ±Inf.
Fixes#17391
Change-Id: I3d3dfbe7f602066975c7a7794fe25b4c645440ce
Reviewed-on: https://go-review.googlesource.com/30723
Reviewed-by: Robert Griesemer <gri@golang.org>
Add special case for Gamma(+∞) which speeds it up:
benchmark old ns/op new ns/op delta
BenchmarkGamma-4 14.5 7.44 -48.69%
The documentation for math.Gamma already specifies it as a special
case:
Gamma(+Inf) = +Inf
The original C code that has been used as the reference implementation
(as mentioned in the comments in gamma.go) also treats Gamma(+∞) as a
special case:
if( x == INFINITY )
return(x);
Change-Id: Idac36e19192b440475aec0796faa2d2c7f8abe0b
Reviewed-on: https://go-review.googlesource.com/31370
Reviewed-by: Robert Griesemer <gri@golang.org>
In addition to the DecimalConversion benchmark, that exercises the
String method of the internal decimal type on a range of small shifts,
add a few benchmarks for the big.Float String method. They can be used
to obtain more realistic data on the real-world performance of
big.Float printing.
Change-Id: I7ada324e7603cb1ce7492ccaf3382db0096223ba
Reviewed-on: https://go-review.googlesource.com/31275
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This is needed for some of the more complex primality tests
(to filter out exact squares), and while the code is simple the
boundary conditions are not obvious, so it seems worth having
in the library.
Change-Id: Ica994a6b6c1e412a6f6d9c3cf823f9b653c6bcbd
Reviewed-on: https://go-review.googlesource.com/30706
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
Inspired by Alberto Donizetti's observations in
https://go-review.googlesource.com/#/c/30099/.
name old time/op new time/op delta
DecimalConversion-8 138µs ± 1% 136µs ± 2% -1.85% (p=0.000 n=10+10)
10 runs each, measured on a Mac Mini, 2.3 GHz Intel Core i7.
Performance improvements varied between -1.25% to -4.4%; -1.85% is
about in the middle of the observed improvement. The generated code
is slightly shorter in the inner loops of the conversion code.
Change-Id: I10fb3b2843da527691c39ad5e5e5bd37ed63e2fa
Reviewed-on: https://go-review.googlesource.com/31250
Reviewed-by: Alan Donovan <adonovan@google.com>
1. Define behavior for Unmarshal of JSON null into Unmarshaler and
TextUnmarshaler. Specifically, an Unmarshaler will be given the
literal null and can decide what to do (because otherwise
json.RawMessage is impossible to implement), and a TextUnmarshaler
will be skipped over (because there is no text to unmarshal), like
most other inappropriate types. Document this in Unmarshal, with a
reminder in UnmarshalJSON about handling null.
2. Test all this.
3. Fix the TextUnmarshaler case, which was returning an unmarshalling
error, to match the definition.
4. Fix the error that had been used for the TextUnmarshaler, since it
was claiming that there was a JSON string when in fact the problem was
NOT having a string.
5. Adjust time.Time and big.Int's UnmarshalJSON to ignore null, as is
conventional.
Fixes#9037.
Change-Id: If78350414eb8dda712867dc8f4ca35a9db041b0c
Reviewed-on: https://go-review.googlesource.com/30944
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
A later CL will be adding more code here.
It will help to keep it separate from the other code.
Change-Id: I971ba53de819cd10991b51fdec665984939a5f9b
Reviewed-on: https://go-review.googlesource.com/30709
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
The Montgomery multiply code is applicable to this case
but was being bypassed. Don't do that.
The old test len(x) > 1 was really just a bad approximation to x > 1.
name old time/op new time/op delta
Exp-8 5.56ms ± 4% 5.73ms ± 3% ~ (p=0.095 n=5+5)
Exp2-8 7.59ms ± 1% 5.66ms ± 1% -25.40% (p=0.008 n=5+5)
This comes up especially when doing Fermat (Miller-Rabin)
primality tests with base 2.
Change-Id: I4cc02978db6dfa93f7f3c8f32718e25eedb4f5ed
Reviewed-on: https://go-review.googlesource.com/30708
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This way you can still run 'go test' or 'go bench -run Foo'
without wondering why it is taking so very long.
Change-Id: Icfa097a6deb1d6682acb7be9f34729215c29eabb
Reviewed-on: https://go-review.googlesource.com/30707
Reviewed-by: Robert Griesemer <gri@golang.org>
- Add new BenchmarkQuoRem.
- Eliminate allocation in divLarge nat pool
- Unroll mulAddVWW body 4x
- Remove some redundant slice loads in divLarge
name old time/op new time/op delta
QuoRem-8 2.18µs ± 1% 1.93µs ± 1% -11.38% (p=0.000 n=19+18)
The starting point in the comparison here is Cherry's
pending CL to turn mulWW and divWW into intrinsics.
The optimizations in divLarge work best because all
the function calls are gone. The effect of this CL is not
as large if you don't assume Cherry's CL.
Change-Id: Ia6138907489c5b9168497912e43705634e163b35
Reviewed-on: https://go-review.googlesource.com/30613
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Using 387 mode was computing it without underflow to zero,
apparently due to an 80-bit intermediate. Avoid underflow even
with 64-bit floats.
This eliminates the TODOs in the test suite.
Fixes linux-386-387 build and fixes#11441.
Change-Id: I8abaa63bfdf040438a95625d1cb61042f0302473
Reviewed-on: https://go-review.googlesource.com/30540
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
The 387 implementation is less accurate and slower.
name old time/op new time/op delta
Exp-8 29.7ns ± 2% 24.0ns ± 2% -19.08% (p=0.000 n=10+10)
This makes Gamma more accurate too.
Change-Id: Iad33b9cce0b087ccbce3e08ba7a6d285c4999d02
Reviewed-on: https://go-review.googlesource.com/30230
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Quentin Smith <quentin@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>