Add generic rule to rewrite the single-precision square root expression
with one single-precision instruction. The optimization will reduce two
times of precision converting between double-precision and single-precision.
On arm64 flatform.
previous:
FCVTSD F0, F0
FSQRTD F0, F0
FCVTDS F0, F0
optimized:
FSQRTS S0, S0
And this patch adds the test case to check the correctness.
This patch refers to CL 241877, contributed by Alice Xu
(dianhong.xu@arm.com)
Change-Id: I6de5d02281c693017ac4bd4c10963dd55989bd7e
Reviewed-on: https://go-review.googlesource.com/c/go/+/276873
Trust: fannie zhang <Fannie.Zhang@arm.com>
Run-TryBot: fannie zhang <Fannie.Zhang@arm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Make all our package sources use Go 1.17 gofmt format
(adding //go:build lines).
Part of //go:build change (#41184).
See https://golang.org/design/draft-gobuild
Change-Id: Ia0534360e4957e58cd9a18429c39d0e32a6addb4
Reviewed-on: https://go-review.googlesource.com/c/go/+/294430
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The Surface Pro X's 386 simulator is not completely faithful to a real 387.
The most egregious problem is that it computes Log2(8) as 2.9999999999999996,
but it has some other subtler problems as well. All the problems occur in
routines that we don't even bother with assembly for on amd64.
If the speed of Go code is OK on amd64 it should be OK on 386 too.
Just remove all the 386-only assembly functions.
This leaves Ceil, Floor, Trunc, Hypot, and Sqrt in 386 assembly,
all of which are also in assembly on amd64 and all of which pass
their tests on Surface Pro X.
Compared to amd64, the 386 port omits assembly for Min, Max, and Log.
It never had Min and Max, and this CL deletes Log because Log2 wasn't
even correct. (None of the other architectures have assembly Log either.)
Change-Id: I5eb6c61084467035269d4098a36001447b7a0601
Reviewed-on: https://go-review.googlesource.com/c/go/+/291229
Trust: Russ Cox <rsc@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
There appears to be a typo in the description of
the recursive division algorithm.
Two things seem suspicious with the original comment:
1. It is talking about choosing s, but s doesn't
appear anywhere in the equation.
2. The math in the equation is incorrect.
Where
B = len(v)/2
s = B - 1
Proof that it is incorrect:
len(v) - B >= B + 1
len(v) - len(v)/2 >= len(v)/2 + 1
This doesn't hold if len(v) is even, e.g. 10:
10 - 10/2 >= 10/2 + 1
10 - 5 >= 5 + 1
5 >= 6 // this is false
The new equation will be the following,
which will be mathematically correct:
len(v) - s >= B + 1
len(v) - (len(v)/2 - 1) >= len(v)/2 + 1
len(v) - len(v)/2 + 1 >= len(v)/2 + 1
len(v) - len(v)/2 >= len(v)/2
This holds if len(v) is even or odd.
e.g. 10
10 - 10/2 >= 10/2
10 - 5 >= 5
5 >= 5
e.g. 11
11 - 11/2 >= 11/2
11 - 5 >= 5
6 >= 5
Change-Id: If77ce09286cf7038637b5dfd0fb7d4f828023f56
Reviewed-on: https://go-review.googlesource.com/c/go/+/287372
Run-TryBot: Katie Hockman <katie@golang.org>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
Trust: Katie Hockman <katie@golang.org>
"it does not necessary" -> "it is not necessary"
Change-Id: I66f9cf2670d76b3686badb4a537b3ec084447d62
GitHub-Last-Rev: 52a0f9993a
GitHub-Pull-Request: golang/go#43935
Reviewed-on: https://go-review.googlesource.com/c/go/+/287052
Reviewed-by: Robert Griesemer <gri@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Trust: Robert Griesemer <gri@golang.org>
The vulnerability that allowed this panic is
CVE-2020-28362 and has been fixed in a security
release, per #42552.
Change-Id: I774bcda2cc83cdd5a273d21c8d9f4b53fa17c88f
Reviewed-on: https://go-review.googlesource.com/c/go/+/277959
Run-TryBot: Katie Hockman <katie@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Katie Hockman <katie@golang.org>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
As part of #42026, these helpers from io/ioutil were moved to os.
(ioutil.TempFile and TempDir became os.CreateTemp and MkdirTemp.)
Update the Go tree to use the preferred names.
As usual, code compiled with the Go 1.4 bootstrap toolchain
and code vendored from other sources is excluded.
ReadDir changes are in a separate CL, because they are not a
simple search and replace.
For #42026.
Change-Id: If318df0216d57e95ea0c4093b89f65e5b0ababb3
Reviewed-on: https://go-review.googlesource.com/c/go/+/266365
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The s390x assembly for shlVU does a forward copy when the shift amount s
is 0. This causes corruption of the result z when z is aliased to the
input x.
This fix removes the s390x assembly for both shlVU and shrVU so the pure
go implementations will be used.
Test cases have been added to the existing TestShiftOverlap test to
cover shift values of 0, 1 and (_W - 1).
Fixes#42838
Change-Id: I75ca0e98f3acfaa6366a26355dcd9dd82499a48b
Reviewed-on: https://go-review.googlesource.com/c/go/+/274442
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Trust: Robert Griesemer <gri@golang.org>
The previous s value could cause a crash
for certain inputs.
Will check in tests and documentation improvements later.
Thanks to the Go Ethereum team and the OSS-Fuzz project for reporting this.
Thanks to Rémy Oudompheng and Robert Griesemer for their help
developing and validating the fix.
Fixes CVE-2020-28362
Change-Id: Ibbf455c4436bcdb07c84a34fa6551fb3422356d3
Reviewed-on: https://team-review.git.corp.google.com/c/golang/go-private/+/899974
Reviewed-by: Roland Shoemaker <bracewell@google.com>
Reviewed-by: Filippo Valsorda <valsorda@google.com>
Reviewed-on: https://go-review.googlesource.com/c/go/+/269657
Trust: Katie Hockman <katie@golang.org>
Trust: Roland Shoemaker <roland@golang.org>
Run-TryBot: Katie Hockman <katie@golang.org>
Reviewed-by: Roland Shoemaker <roland@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Append operations in the decimal String function may cause several allocations.
Use make to pre allocate slices in String that have enough capacity to avoid additional allocations in append operations.
name old time/op new time/op delta
DecimalConversion-8 139µs ± 7% 109µs ± 2% -21.06% (p=0.000 n=10+10)
Change-Id: Id0284d204918a179a0421c51c35d86a3408e1bd9
Reviewed-on: https://go-review.googlesource.com/c/go/+/233980
Run-TryBot: Emmanuel Odeke <emmanuel@orijtech.com>
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
Trust: Giovanni Bajo <rasky@develer.com>
Trust: Martin Möhrmann <moehrmann@google.com>
Use the const variable Ln2 in math/const.go for function acosh.
Change-Id: I5381d03dd3142c227ae5773ece9be6c8f377615e
Reviewed-on: https://go-review.googlesource.com/c/go/+/232517
Reviewed-by: Robert Griesemer <gri@golang.org>
Trust: Robert Griesemer <gri@golang.org>
Trust: Giovanni Bajo <rasky@develer.com>
While reading the source code of the math/big package, I found the SetString function example of float type missing.
Change-Id: Id8c16a58e2e24f9463e8ff38adbc98f8c418ab26
Reviewed-on: https://go-review.googlesource.com/c/go/+/232804
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
Simplifying some code without compromising performance.
My CPU is Intel Xeon Gold 6161, 2.20GHz, 64-bit operating system.
The memory is 8GB. This is my test environment, I hope to help you judge.
Benchmark:
name old time/op new time/op delta
Log1p-4 21.8ns ± 5% 21.8ns ± 4% ~ (p=0.973 n=20+20)
Change-Id: Icd8f96f1325b00007602d114300b92d4c57de409
Reviewed-on: https://go-review.googlesource.com/c/go/+/233940
Reviewed-by: Robert Griesemer <gri@golang.org>
Replaced almost every use of Bytes with FillBytes.
Note that the approved proposal was for
func (*Int) FillBytes(buf []byte)
while this implements
func (*Int) FillBytes(buf []byte) []byte
because the latter was far nicer to use in all callsites.
Fixes#35833
Change-Id: Ia912df123e5d79b763845312ea3d9a8051343c0a
Reviewed-on: https://go-review.googlesource.com/c/go/+/230397
Reviewed-by: Robert Griesemer <gri@golang.org>
Implement special case handling and testing to ensure
conformance with the C99 standard annex G.6 Complex arithmetic.
Fixes#29320
Change-Id: Id72eb4c5a35d5a54b4b8690d2f7176ab11028f1b
Reviewed-on: https://go-review.googlesource.com/c/go/+/220689
Reviewed-by: Robert Griesemer <gri@golang.org>
When I browsed the source code, I saw that there is no corresponding example of this function. I am not sure if there is a need for an increase, this is my first time to submit CL.
Change-Id: Idbf4e1e1ed2995176a76959d561e152263a2fd26
Reviewed-on: https://go-review.googlesource.com/c/go/+/230741
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
Originally, we use an assembly function that returns a boolean result to
tell whether the machine has vector facility or not. It is now no longer
needed when we can directly use cpu.S390X.HasVX variable.
Change-Id: Ic1dae851982532bcfd9a9453416c112347f21d87
Reviewed-on: https://go-review.googlesource.com/c/go/+/230318
Reviewed-by: Michael Munday <mike.munday@ibm.com>
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Originally, we use an assembly function that returns a boolean result to
tell whether the machine has vector facility or not. It is now no longer
needed when we can directly use cpu.S390X.HasVX variable.
Change-Id: Ic3ffeb9e63238ef41406d97cdc42502145ddb454
Reviewed-on: https://go-review.googlesource.com/c/go/+/230319
Reviewed-by: Michael Munday <mike.munday@ibm.com>
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This CL optimizes code that uses a carry from a function such as
bits.Add64 as the condition in an if statement. For example:
x, c := bits.Add64(a, b, 0)
if c != 0 {
panic("overflow")
}
Rather than converting the carry into a 0 or a 1 value and using
that as an input to a comparison instruction the carry flag is now
used as the input to a conditional branch directly. This typically
removes an ADD LOGICAL WITH CARRY instruction when user code is
doing overflow detection and is closer to the code that a user
would expect to generate.
Change-Id: I950431270955ab72f1b5c6db873b6abe769be0da
Reviewed-on: https://go-review.googlesource.com/c/go/+/219757
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This CL looks big but it only does formatting changes to arith_s390x.s.
The file was formatted using asmfmt(https://github.com/klauspost/asmfmt)
, so there should not be any functional impact. I verified that the
generated assembly of big.test file is identical.
Change-Id: I8b4035ef082a4d0357881869327e25253f2d8be1
Reviewed-on: https://go-review.googlesource.com/c/go/+/229302
Reviewed-by: Michael Munday <mike.munday@ibm.com>
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The Float.Sqrt method switches (for performance reasons) between
direct (uses Quo) and inverse (doesn't) computation, depending on the
precision, with threshold 128.
Unfortunately the implementation of recursive division in CL 172018
made Quo slightly slower exactly in the range around and below the
threshold Sqrt is using, so this strategy is no longer profitable.
The new division algorithm allocates more, and this has increased the
amount of allocations performed by Sqrt when using the direct method;
on low precisions the computation is fast, so additional allocations
have an negative impact on performance.
Interestingly, only using the inverse method doesn't just reverse the
effects of the Quo algorithm change, but it seems to make performances
better overall for small precisions:
name old time/op new time/op delta
FloatSqrt/64-4 643ns ± 1% 635ns ± 1% -1.24% (p=0.000 n=10+10)
FloatSqrt/128-4 1.44µs ± 1% 1.02µs ± 1% -29.25% (p=0.000 n=10+10)
FloatSqrt/256-4 1.49µs ± 1% 1.49µs ± 1% ~ (p=0.752 n=10+10)
FloatSqrt/1000-4 3.71µs ± 1% 3.74µs ± 1% +0.87% (p=0.001 n=10+10)
FloatSqrt/10000-4 35.3µs ± 1% 35.6µs ± 1% +0.82% (p=0.002 n=10+9)
FloatSqrt/100000-4 844µs ± 1% 844µs ± 0% ~ (p=0.549 n=10+9)
FloatSqrt/1000000-4 69.5ms ± 0% 69.6ms ± 0% ~ (p=0.222 n=9+9)
name old alloc/op new alloc/op delta
FloatSqrt/64-4 280B ± 0% 200B ± 0% -28.57% (p=0.000 n=10+10)
FloatSqrt/128-4 504B ± 0% 248B ± 0% -50.79% (p=0.000 n=10+10)
FloatSqrt/256-4 344B ± 0% 344B ± 0% ~ (all equal)
FloatSqrt/1000-4 1.30kB ± 0% 1.30kB ± 0% ~ (all equal)
FloatSqrt/10000-4 13.5kB ± 0% 13.5kB ± 0% ~ (p=0.237 n=10+10)
FloatSqrt/100000-4 123kB ± 0% 123kB ± 0% ~ (p=0.247 n=10+10)
FloatSqrt/1000000-4 1.83MB ± 1% 1.83MB ± 3% ~ (p=0.779 n=8+10)
name old allocs/op new allocs/op delta
FloatSqrt/64-4 8.00 ± 0% 5.00 ± 0% -37.50% (p=0.000 n=10+10)
FloatSqrt/128-4 11.0 ± 0% 5.0 ± 0% -54.55% (p=0.000 n=10+10)
FloatSqrt/256-4 5.00 ± 0% 5.00 ± 0% ~ (all equal)
FloatSqrt/1000-4 6.00 ± 0% 6.00 ± 0% ~ (all equal)
FloatSqrt/10000-4 6.00 ± 0% 6.00 ± 0% ~ (all equal)
FloatSqrt/100000-4 6.00 ± 0% 6.00 ± 0% ~ (all equal)
FloatSqrt/1000000-4 10.3 ±13% 10.3 ±13% ~ (p=1.000 n=10+10)
For example, 1.02µs for FloatSqrt/128 is actually better than what I
was getting on the same machine before the Quo changes.
The .8% slowdown on /1000 and /10000 appears to be real and it is
quite baffling (that codepath was not touched at all); it may be
caused by code alignment changes.
Change-Id: Ib03761cdc1055674bc7526d4f3a23d7a25094029
Reviewed-on: https://go-review.googlesource.com/c/go/+/228062
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
Fixes#38304
Also change `If m > 0, y < 0, ...` to `If m != 0, y < 0, ...` since `Exp` will return `nil`
whatever `m`'s sign is.
Change-Id: I17d7337ccd1404318cea5d42a8de904ad185fd00
GitHub-Last-Rev: 2399510300
GitHub-Pull-Request: golang/go#38390
Reviewed-on: https://go-review.googlesource.com/c/go/+/228000
Reviewed-by: Robert Griesemer <gri@golang.org>
The divBasic function computes the quotient of big nats u/v word by word.
It estimates each word qhat by performing a long division (top 2 words of u
divided by top word of v), looks at the next word to correct the estimate,
then perform a full multiplication (qhat*v) to catch any inaccuracy in the
estimate.
In the latter case, "negative" values appear temporarily and carries
must be carefully managed, and the recursive division refactoring
introduced a case where qhat*v has the same length as v, triggering an
out-of-bounds write in the case it happens when computing the top word
of the quotient.
Fixes#37499
Change-Id: I15089da4a4027beda43af497bf6de261eb792f94
Reviewed-on: https://go-review.googlesource.com/c/go/+/221980
Reviewed-by: Robert Griesemer <gri@golang.org>
The s390x assembly implementation was previously only handling this
case correctly for x = -Pi. Update the special case handling for
any y.
Fixes#35446
Change-Id: I355575e9ec8c7ce8bd9db10d74f42a22f39a2f38
Reviewed-on: https://go-review.googlesource.com/c/go/+/223420
Run-TryBot: Brian Kessler <brian.m.kessler@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <mike.munday@ibm.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
Document that the Float.Sqrt method does not set the receiver's
Accuracy field.
Updates #37915
Change-Id: Ief1dcac07eacc0ef02f86bfac9044501477bca1c
Reviewed-on: https://go-review.googlesource.com/c/go/+/224497
Reviewed-by: Robert Griesemer <gri@golang.org>
s390x has inaccurate range reduction for the assembly routines
in math so these tests are diabled until these are corrected.
Updates #37854
Change-Id: I1e26acd6d09ae3e592a3dd90aec73a6844f5c6fe
Reviewed-on: https://go-review.googlesource.com/c/go/+/223457
Run-TryBot: Brian Kessler <brian.m.kessler@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
Tan has poles along the real axis. In order to accurately calculate
the value near these poles, a range reduction by Pi is performed and
the result calculated via a Taylor series. The prior implementation
of range reduction used Cody-Waite range reduction in three parts.
This fails when x is too large to accurately calculate the partial
products in the summation accurately. Above this threshold, Payne-Hanek
range reduction using a multiple precision value of 1/Pi is required.
Additionally, the threshold used in math/trig_reduce.go for Payne-Hanek
range reduction was not set conservatively enough. The prior threshold
ensured that catastrophic failure did not occur where the argument x
would not actually be reduced below Pi/4. However, errors in reduction
begin to occur at values much lower when z = ((x - y*PI4A) - y*PI4B) - y*PI4C
is not exact because y*PI4A cannot be exactly represented as a float64.
reduceThreshold is lowered to the proper value.
Fixes#31566
Change-Id: I0f39a4171a5be44f64305f18dc57f6c29f19dba7
Reviewed-on: https://go-review.googlesource.com/c/go/+/172838
Reviewed-by: Rob Pike <r@golang.org>
Provide an assembly implementation of mulWW - for now all others run the
Go code.
Change-Id: Icb594c31048255f131bdea8d64f56784fc9db4d1
Reviewed-on: https://go-review.googlesource.com/c/go/+/220919
Reviewed-by: Robert Griesemer <gri@golang.org>
It was removed in CL 217302 but was intentionally added in CL 217104.
Change-Id: I1a478d80ad1ec4f0a0184bfebf8f1a5e352cfe8c
Reviewed-on: https://go-review.googlesource.com/c/go/+/217941
Reviewed-by: Robert Griesemer <gri@golang.org>
We don't usually document past behavior (like "As of Go 1.14 ...") and
in isolation the current docs made it sound like a and b could only be
negative or zero.
Change-Id: I0d3c2b8579a9c01159ce528a3128b1478e99042a
Reviewed-on: https://go-review.googlesource.com/c/go/+/217302
Reviewed-by: Ian Lance Taylor <iant@golang.org>