mirror/go - go - Git Fam. Sieh

Commit Graph

Author	SHA1	Message	Date
Joel Sing	efb0ac4ce6	cmd/compile: provide Add/Cas/Exchange atomic intrinsics on riscv64 Provide Add32, Add64, Cas32, Cas64, Exchange32 and Exchange64 atomic intrinsics on riscv64. Updates #36765 Change-Id: I9a3b7d2ce3d49f699171fd76a0fed891d149a6bb Reviewed-on: https://go-review.googlesource.com/c/go/+/223559 Run-TryBot: Joel Sing <joel@sing.id.au> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com> Reviewed-by: Keith Randall <khr@golang.org>	2020-03-25 00:06:40 +00:00
Joel Sing	ade988623e	cmd/compile: provide Load32/Load64/Store32/Store64 atomic intrinsics on riscv64 Updates #36765 Change-Id: Id5ce5c5f60112e4f4cf9eec1b1ec120994934950 Reviewed-on: https://go-review.googlesource.com/c/go/+/223558 Reviewed-by: Cherry Zhang <cherryyz@google.com> Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2020-03-24 14:21:50 +00:00
Michael Anthony Knyszek	8c30971da6	cmd/compile: panic if trying to alias an intrinsic with no definitions Currently if we try to alias an intrinsic which hasn't been defined for any architecture (such as by accidentally creating the alias before the intrinsic is created with addF), then we'll just silently not apply any intrinsics to those aliases. Catch this particular case by panicking in alias if we try to apply the alias and it did nothing. Change-Id: I98e75fc3f7206b08fc9267cedb8db3e109ec4f5d Reviewed-on: https://go-review.googlesource.com/c/go/+/224637 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2020-03-23 16:01:20 +00:00
Michael Anthony Knyszek	830ee4792e	cmd/compile: declare runtime bit func aliases after math/bits intrinsics Currently runtime/internal/sys bit-manipulation functions are aliased to math/bits functions, which are intrinsified. Unfortunately these aliases are declared before the intrinsified versions are generated, resulting in the generic version of the code being copied over. This change moves the aliases for bit operations in runtime/internal/sys after the addF calls to generate those intrinsics in SSA, so that the intrinsified SSA representation of those functions actually get copied over. This should improve the overall performance of the runtime (especially the page allocator) since these bit operations will actually be intrinsified now. Change-Id: I4377da13f9a7bb6aee608e50df0297148bf8f806 Reviewed-on: https://go-review.googlesource.com/c/go/+/224437 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2020-03-23 15:52:57 +00:00
Joel Sing	2e918c3aab	cmd/compile: provide Load8/Store8 atomic intrinsics on riscv64 Updates #36765 Change-Id: Ieeb6bbc54e4841a1348ad50e80342ec4bc675e07 Reviewed-on: https://go-review.googlesource.com/c/go/+/223557 Reviewed-by: Cherry Zhang <cherryyz@google.com> Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2020-03-17 06:38:32 +00:00
Russ Cox	877ef86bec	cmd/compile: add spectre mitigation mode enabled by -spectre This commit adds a new cmd/compile flag -spectre, which accepts a comma-separated list of possible Spectre mitigations to apply, or the empty string (none), or "all". The only known mitigation right now is "index", which uses conditional moves to ensure that x86-64 CPUs do not speculate past index bounds checks. Speculating past index bounds checks may be problematic on systems running privileged servers that accept requests from untrusted users who can execute their own programs on the same machine. (And some more constraints that make it even more unlikely in practice.) The cases this protects against are analogous to the ones Microsoft explains in the "Array out of bounds load/store feeding ..." sections here: https://docs.microsoft.com/en-us/cpp/security/developer-guidance-speculative-execution?view=vs-2019#array-out-of-bounds-load-feeding-an-indirect-branch Change-Id: Ib7532d7e12466b17e04c4e2075c2a456dc98f610 Reviewed-on: https://go-review.googlesource.com/c/go/+/222660 Reviewed-by: Keith Randall <khr@golang.org>	2020-03-13 19:05:46 +00:00
Cuong Manh Le	7e1028a9ff	cmd/compile: avoid range over copy of array Passes toostash-check. Slightly reduce compiler binary size: file before after Δ % compile 21087288 21070776 -16512 -0.078% total 131847020 131830508 -16512 -0.013% file before after Δ % cmd/compile/internal/gc.a 9007472 8999640 -7832 -0.087% total 127117794 127109962 -7832 -0.006% Change-Id: I4aadd68d0a7545770598bed9d3a4d05899b67b52 Reviewed-on: https://go-review.googlesource.com/c/go/+/205777 Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2020-03-06 23:24:28 +00:00
Josh Bleecher Snyder	7b0b6c2f7e	cmd/compile: simplify converted SSA form for 'if false' The goal here is to make it easier for a human to examine the SSA when a function contains lots of dead code. No significant compiler metric or generated code differences. Change-Id: I81915fa4639bc8820cc9a5e45e526687d0d1f57a Reviewed-on: https://go-review.googlesource.com/c/go/+/221791 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2020-03-03 18:42:30 +00:00
Meng Zhuo	ab7ecea0c8	cmd/compile: add intrinsics for runtime/internal/math on MIPS64x name old time/op new time/op delta MulUintptr/small 8.42ns ± 0% 5.93ns ± 0% -29.66% (p=0.000 n=9+10) MulUintptr/large 11.1ns ± 0% 7.4ns ± 0% -33.17% (p=0.000 n=10+9) Change-Id: I6659a886389660461fc2c90bd248243f6e7c29d5 Reviewed-on: https://go-review.googlesource.com/c/go/+/210897 Run-TryBot: Meng Zhuo <mengzhuo1203@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2020-03-02 16:12:20 +00:00
Ruixin(Peter) Bao	2962c96c9f	cmd/compile: lower float to uint conversions on s390x Add rules for lowering float <-> unsigned int on s390x. During compilation, Cvt64Uto64F rule triggers around 80 times, Cvt64Fto64U rule triggers around 20 times, Cvt64Uto32F rule triggers around 5 times. Change-Id: If4c9d128b9132fce8c0bea9abc09cb43a5df7989 Reviewed-on: https://go-review.googlesource.com/c/go/+/209177 Reviewed-by: Michael Munday <mike.munday@ibm.com> Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2020-02-29 21:37:47 +00:00
Michael Munday	cb74dcc172	cmd/compile: remove Greater* and Geq* generic integer ops The generic Greater and Geq ops can always be replaced with the Less and Leq ops. This CL therefore removes them. This simplifies the compiler since it reduces the number of operations that need handling in both code and in rewrite rules. This will be especially true when adding control flow optimizations such as the integer-in-range optimizations in CL 165998. Change-Id: If0648b2b19998ac1bddccbf251283f3be4ec3040 Reviewed-on: https://go-review.googlesource.com/c/go/+/220417 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2020-02-26 13:11:53 +00:00
Joel Sing	2e4f490b31	cmd/compile,cmd/link: fix and re-enable open-coded defers on riscv64 The R_CALLRISCV relocation marker is on the JALR instruction, however the actual relocation is currently two instructions previous for the AUIPC+ADDI sequence. Adjust the platform dependent offset accordingly and re-enable open-coded defers. Fixes #36786. Change-Id: I71597c193c447930fbe94ce44b7355e89ae877bb Reviewed-on: https://go-review.googlesource.com/c/go/+/216797 Run-TryBot: Joel Sing <joel@sing.id.au> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2020-01-29 16:34:44 +00:00
Joel Sing	a858d15f11	cmd/compile: disable open-coded defers on riscv64 Open-coded defers are currently broken on riscv64 - disable them for the time being. All of the standard package tests now pass on linux/riscv64. Updates issue #27532 and #36786 Change-Id: I20fc25ce91dfad48be32409ba5c64ca9a6acef1d Reviewed-on: https://go-review.googlesource.com/c/go/+/216517 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Dan Scales <danscales@google.com> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2020-01-28 02:40:44 +00:00
Joel Sing	98d2717499	cmd/compile: implement compiler for riscv64 Based on riscv-go port. Updates #27532 Change-Id: Ia329daa243db63ff334053b8807ea96b97ce3acf Reviewed-on: https://go-review.googlesource.com/c/go/+/204631 Run-TryBot: Joel Sing <joel@sing.id.au> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2020-01-18 14:41:40 +00:00
Cherry Zhang	a037582eff	cmd/compile: mark empty block preemptible Currently, a block's control instruction gets the liveness info of the last Value in the block. However, for an empty block, the control instruction gets the invalid liveness info and therefore not preemptible. One example is empty infinite loop, which has only a control instruction. The control instruction being non- preemptible makes the whole loop non-preemptible. Fix this by using a different, preemptible liveness info for empty block's control. We can choose an arbitrary preemptible liveness info, as at run time we don't really use the liveness map at that instruction. As before, if the last Value in the block is non-preemptible, so is the block control. For example, the conditional branch in the write barrier test block is still non-preemptible. Also, only update liveness info if we are actually emitting instructions. So zero-width Values' liveness info (which are always invalid) won't affect the block control's liveness info. For example, if the last Values in a block is a tuple-generating operation and a Select, the block control instruction is still preemptible. Fixes #35923. Change-Id: Ic5225f3254b07e4955f7905329b544515907642b Reviewed-on: https://go-review.googlesource.com/c/go/+/209659 Run-TryBot: Cherry Zhang <cherryyz@google.com> Reviewed-by: David Chase <drchase@google.com>	2019-12-06 01:11:02 +00:00
David Chase	0e02cfb369	cmd/compile: try harder to not use an empty src.XPos for a bogus line The fix for #35652 did not guarantee that it was using a non-empty src position to replace an empty one. The new code checks again and falls back to a more certain position. (The input in question compiles to a single empty infinite loop, and none of the actual instructions had any source position at all. That is a bug, but given the pathology of this input, not one worth dealing with this late in the release cycle, if ever.) Literally: 00000 (5) TEXT "".f(SB), ABIInternal 00001 (5) PCDATA $0, $-2 00002 (5) PCDATA $1, $-2 00003 (5) FUNCDATA $0, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB) 00004 (5) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB) 00005 (5) FUNCDATA $2, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB) b2 00006 (?) XCHGL AX, AX b6 00007 (+1048575) JMP 6 00008 (?) END TODO: Add runtime.InfiniteLoop(), replace infinite loops with a call to that, and use an eco-friendly runtime.gopark instead. (This was Cherry's excellent idea.) Updates #35652 Fixes #35695 Change-Id: I4b9a841142ee4df0f6b10863cfa0721a7e13b437 Reviewed-on: https://go-review.googlesource.com/c/go/+/207964 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2019-11-22 03:06:22 +00:00
David Chase	9bba63bbbe	cmd/compile: make a better bogus line for empty infinite loops The old recipe for making an infinite loop not be infinite in the debugger could create an instruction (Prog) with a line number not tied to any file (index == 0). This caused downstream failures in DWARF processing. So don't do that. Also adds a test, also adds a check+panic to ensure that the next time this happens the error is less mystifying. Fixes #35652 Change-Id: I04f30bc94fdc4aef20dd9130561303ff84fd945e Reviewed-on: https://go-review.googlesource.com/c/go/+/207613 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2019-11-19 00:38:53 +00:00
Michael Munday	b3885dbc93	cmd/compile, runtime: intrinsify atomic And8 and Or8 on s390x Intrinsify these functions to match other platforms. Update the sequence of instructions used in the assembly implementations to match the intrinsics. Also, add a micro benchmark so we can more easily measure the performance of these two functions: name old time/op new time/op delta And8-8 5.33ns ± 7% 2.55ns ± 8% -52.12% (p=0.000 n=20+20) And8Parallel-8 7.39ns ± 5% 3.74ns ± 4% -49.34% (p=0.000 n=20+20) Or8-8 4.84ns ±15% 2.64ns ±11% -45.50% (p=0.000 n=20+20) Or8Parallel-8 7.27ns ± 3% 3.84ns ± 4% -47.10% (p=0.000 n=19+20) By using a 'rotate then xor selected bits' instruction combined with either a 'load and and' or a 'load and or' instruction we can implement And8 and Or8 with far fewer instructions. Replacing 'compare and swap' with atomic instructions may also improve performance when there is contention. Change-Id: I28bb8032052b73ae8ccdf6e4c612d2877085fa01 Reviewed-on: https://go-review.googlesource.com/c/go/+/204277 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2019-11-11 15:23:59 +00:00
DQNEO	f07059d949	cmd/compile: rename sizeof_Array and array_* to slice_* Renames variables sizeof_Array and other array_* variables that were actually intended for slices and not arrays. Change-Id: I391b95880cc77cabb8472efe694b7dd19545f31a Reviewed-on: https://go-review.googlesource.com/c/go/+/180919 Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com> Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2019-11-11 12:40:04 +00:00
David Chase	cd53fddabb	cmd/compile: add framework for logging optimizer (non)actions to LSP This is intended to allow IDEs to note where the optimizer was not able to improve users' code. There may be other applications for this, for example in studying effectiveness of optimizer changes more quickly than running benchmarks, or in verifying that code changes did not accidentally disable optimizations in performance-critical code. Logging of nilcheck (bad) for amd64 is implemented as proof-of-concept. In general, the intent is that optimizations that didn't happen are what will be logged, because that is believed to be what IDE users want. Added flag -json=version,dest Check that version=0. (Future compilers will support a few recent versions, I hope that version is always <=3.) Dest is expected to be one of: /path (or \path in Windows) will create directory /path and fill it w/ json files file://path will create directory path, intended either for I:\dont\know\enough\about\windows\paths trustme_I_know_what_I_am_doing_probably_testing Not passing an absolute path name usually leads to json splattered all over source directories, or failure when those directories are not writeable. If you want a foot-gun, you have to ask for it. The JSON output is directed to subdirectories of dest, where each subdirectory is net/url.PathEscape of the package name, and each for each foo.go in the package, net/url.PathEscape(foo).json is created. The first line of foo.json contains version and context information, and subsequent lines contains LSP-conforming JSON describing the missing optimizations. Change-Id: Ib83176a53a8c177ee9081aefc5ae05604ccad8a0 Reviewed-on: https://go-review.googlesource.com/c/go/+/204338 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2019-11-10 17:11:34 +00:00
David Chase	a0262b201f	cmd/compile: intrinsify functions added to runtime/internal/sys This restores intrinsic status to functions copied from math/bits into runtime/internal/sys, as an aid to runtime performance. Updates #35112. Change-Id: I41a7d87cf00f1e64d82aa95c5b1000bc128de820 Reviewed-on: https://go-review.googlesource.com/c/go/+/206200 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2019-11-09 05:51:04 +00:00
Russ Cox	543c6d2e0d	math, cmd/compile: rename Fma to FMA This API was added for #25819, where it was discussed as math.FMA. The commit adding it used math.Fma, presumably for consistency with the rest of the unusual names in package math (Sincos, Acosh, Erfcinv, Float32bits, etc). I believe that using an idiomatic Go name is more important here than consistency with these other names, most of which are historical baggage from C's standard library. Early additions like Float32frombits happened before "uppercase for export" (so they were originally like "float32frombits") and they were not properly reconsidered when we uppercased the symbols to export them. That's a mistake we live with. The names of functions we have added since then, and even a few that were legacy, are more properly Go-cased, such as IsNaN, IsInf, and RoundToEven, rather than Isnan, Isinf, and Roundtoeven. And also constants like MaxFloat32. For new API, we should keep using proper Go-cased symbols instead of minimally-upper-cased-C symbols. So math.FMA, not math.Fma. This API has not yet been released, so this change does not break the compatibility promise. This CL also modifies cmd/compile, since the compiler knows the name of the function. I could have stopped at changing the string constants, but it seemed to make more sense to use a consistent casing everywhere. Change-Id: I0f6f3407f41e99bfa8239467345c33945088896e Reviewed-on: https://go-review.googlesource.com/c/go/+/205317 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2019-11-07 14:51:06 +00:00
Dan Scales	cc47b0d2cd	cmd/compile: handle some missing cases of non-SSAable values for args of open-coded defers In my experimentation, I had found that most non-SSAable expressions were converted to autotmp variables during AST evaluation. However, this was not true generally, as witnessed by issue #35213, which has a non-SSAable field reference of a struct that is not converted to an autotmp. So, I fixed openDeferSave() to handle non-SSAable nodes more generally, and make sure that these non-SSAable expressions are not evaluated more than once (which could incorrectly repeat side effects). Fixes #35213 Change-Id: I8043d5576b455e94163599e930ca0275e550d594 Reviewed-on: https://go-review.googlesource.com/c/go/+/203888 Run-TryBot: Dan Scales <danscales@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2019-10-29 19:58:24 +00:00
Austin Clements	97592b3c14	cmd/compile: intrinsics for runtime/internal/atomic.Store8 For #10958, #24543, but makes sense on its own. Change-Id: I2a87dab66b82a1863e4b6512b1f8def51463ce2a Reviewed-on: https://go-review.googlesource.com/c/go/+/203284 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2019-10-29 03:18:55 +00:00
Dan Scales	be64a19d99	cmd/compile, cmd/link, runtime: make defers low-cost through inline code and extra funcdata Generate inline code at defer time to save the args of defer calls to unique (autotmp) stack slots, and generate inline code at exit time to check which defer calls were made and make the associated function/method/interface calls. We remember that a particular defer statement was reached by storing in the deferBits variable (always stored on the stack). At exit time, we check the bits of the deferBits variable to determine which defer function calls to make (in reverse order). These low-cost defers are only used for functions where no defers appear in loops. In addition, we don't do these low-cost defers if there are too many defer statements or too many exits in a function (to limit code increase). When a function uses open-coded defers, we produce extra FUNCDATA_OpenCodedDeferInfo information that specifies the number of defers, and for each defer, the stack slots where the closure and associated args have been stored. The funcdata also includes the location of the deferBits variable. Therefore, for panics, we can use this funcdata to determine exactly which defers are active, and call the appropriate functions/methods/closures with the correct arguments for each active defer. In order to unwind the stack correctly after a recover(), we need to add an extra code segment to functions with open-coded defers that simply calls deferreturn() and returns. This segment is not reachable by the normal function, but is returned to by the runtime during recovery. We set the liveness information of this deferreturn() to be the same as the liveness at the first function call during the last defer exit code (so all return values and all stack slots needed by the defer calls will be live). I needed to increase the stackguard constant from 880 to 896, because of a small amount of new code in deferreturn(). The -N flag disables open-coded defers. '-d defer' prints out the kind of defer being used at each defer statement (heap-allocated, stack-allocated, or open-coded). Cost of defer statement [ go test -run NONE -bench BenchmarkDefer$ runtime ] With normal (stack-allocated) defers only: 35.4 ns/op With open-coded defers: 5.6 ns/op Cost of function call alone (remove defer keyword): 4.4 ns/op Text size increase (including funcdata) for go binary without/with open-coded defers: 0.09% The average size increase (including funcdata) for only the functions that use open-coded defers is 1.1%. The cost of a panic followed by a recover got noticeably slower, since panic processing now requires a scan of the stack for open-coded defer frames. This scan is required, even if no frames are using open-coded defers: Cost of panic and recover [ go test -run NONE -bench BenchmarkPanicRecover runtime ] Without open-coded defers: 62.0 ns/op With open-coded defers: 255 ns/op A CGO Go-to-C-to-Go benchmark got noticeably faster because of open-coded defers: CGO Go-to-C-to-Go benchmark [cd misc/cgo/test; go test -run NONE -bench BenchmarkCGoCallback ] Without open-coded defers: 443 ns/op With open-coded defers: 347 ns/op Updates #14939 (defer performance) Updates #34481 (design doc) Change-Id: I63b1a60d1ebf28126f55ee9fd7ecffe9cb23d1ff Reviewed-on: https://go-review.googlesource.com/c/go/+/202340 Reviewed-by: Austin Clements <austin@google.com>	2019-10-24 13:54:11 +00:00
smasher164	03fb1f607b	cmd/compile: don't use FMA on plan9 CL 137156 introduces an intrinsic on AMD64 that executes vfmadd231sd when feature detection is successful. However, because floating-point isn't allowed in note handler, the builder disables SSE instructions, and fails when attempting to execute this instruction. This change disables FMA on plan9 to immediately use the software fallback. Fixes #35063. Change-Id: I87d8f0995bd2f15013d203e618938f5079c9eed2 Reviewed-on: https://go-review.googlesource.com/c/go/+/202617 Reviewed-by: Keith Randall <khr@golang.org>	2019-10-22 19:36:42 +00:00
smasher164	58b031949b	cmd/compile: add fma intrinsic for arm This change introduces an arm intrinsic that generates the FMULAD instruction for the fused-multiply-add operation on systems that support it. System support is detected via cpu.ARM.HasVFPv4. A rewrite rule translates the generic intrinsic to FMULAD. Updates #25819. Change-Id: I8459e5dd1cdbdca35f88a78dbeb7d387f1e20efa Reviewed-on: https://go-review.googlesource.com/c/go/+/142117 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2019-10-21 17:42:47 +00:00
smasher164	7a6da218b1	cmd/compile: add fma intrinsic for amd64 To permit ssa-level optimization, this change introduces an amd64 intrinsic that generates the VFMADD231SD instruction for the fused-multiply-add operation on systems that support it. System support is detected via cpu.X86.HasFMA. A rewrite rule can then translate the generic ssa intrinsic ("Fma") to VFMADD231SD. The benchmark compares the software implementation (old) with the intrinsic (new). name old time/op new time/op delta Fma-4 27.2ns ± 1% 1.0ns ± 9% -96.48% (p=0.008 n=5+5) Updates #25819. Change-Id: I966655e5f96817a5d06dff5942418a3915b09584 Reviewed-on: https://go-review.googlesource.com/c/go/+/137156 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2019-10-21 16:42:10 +00:00
smasher164	33425ab8db	cmd/compile: introduce generic ssa intrinsic for fused-multiply-add In order to make math.FMA a compiler intrinsic for ISAs like ARM64, PPC64[le], and S390X, a generic 3-argument opcode "Fma" is provided and rewritten as ARM64: (Fma x y z) -> (FMADDD z x y) PPC64: (Fma x y z) -> (FMADD x y z) S390X: (Fma x y z) -> (FMADD z x y) Updates #25819. Change-Id: Ie5bc628311e6feeb28ddf9adaa6e702c8c291efa Reviewed-on: https://go-review.googlesource.com/c/go/+/131959 Run-TryBot: Akhil Indurti <aindurti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2019-10-21 16:24:15 +00:00
Bryan C. Mills	b76e6f8825	Revert "cmd/compile, cmd/link, runtime: make defers low-cost through inline code and extra funcdata" This reverts CL 190098. Reason for revert: broke several builders. Change-Id: I69161352f9ded02537d8815f259c4d391edd9220 Reviewed-on: https://go-review.googlesource.com/c/go/+/201519 Run-TryBot: Bryan C. Mills <bcmills@google.com> Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Dan Scales <danscales@google.com>	2019-10-16 20:59:53 +00:00
Dan Scales	dad616375f	cmd/compile, cmd/link, runtime: make defers low-cost through inline code and extra funcdata Generate inline code at defer time to save the args of defer calls to unique (autotmp) stack slots, and generate inline code at exit time to check which defer calls were made and make the associated function/method/interface calls. We remember that a particular defer statement was reached by storing in the deferBits variable (always stored on the stack). At exit time, we check the bits of the deferBits variable to determine which defer function calls to make (in reverse order). These low-cost defers are only used for functions where no defers appear in loops. In addition, we don't do these low-cost defers if there are too many defer statements or too many exits in a function (to limit code increase). When a function uses open-coded defers, we produce extra FUNCDATA_OpenCodedDeferInfo information that specifies the number of defers, and for each defer, the stack slots where the closure and associated args have been stored. The funcdata also includes the location of the deferBits variable. Therefore, for panics, we can use this funcdata to determine exactly which defers are active, and call the appropriate functions/methods/closures with the correct arguments for each active defer. In order to unwind the stack correctly after a recover(), we need to add an extra code segment to functions with open-coded defers that simply calls deferreturn() and returns. This segment is not reachable by the normal function, but is returned to by the runtime during recovery. We set the liveness information of this deferreturn() to be the same as the liveness at the first function call during the last defer exit code (so all return values and all stack slots needed by the defer calls will be live). I needed to increase the stackguard constant from 880 to 896, because of a small amount of new code in deferreturn(). The -N flag disables open-coded defers. '-d defer' prints out the kind of defer being used at each defer statement (heap-allocated, stack-allocated, or open-coded). Cost of defer statement [ go test -run NONE -bench BenchmarkDefer$ runtime ] With normal (stack-allocated) defers only: 35.4 ns/op With open-coded defers: 5.6 ns/op Cost of function call alone (remove defer keyword): 4.4 ns/op Text size increase (including funcdata) for go cmd without/with open-coded defers: 0.09% The average size increase (including funcdata) for only the functions that use open-coded defers is 1.1%. The cost of a panic followed by a recover got noticeably slower, since panic processing now requires a scan of the stack for open-coded defer frames. This scan is required, even if no frames are using open-coded defers: Cost of panic and recover [ go test -run NONE -bench BenchmarkPanicRecover runtime ] Without open-coded defers: 62.0 ns/op With open-coded defers: 255 ns/op A CGO Go-to-C-to-Go benchmark got noticeably faster because of open-coded defers: CGO Go-to-C-to-Go benchmark [cd misc/cgo/test; go test -run NONE -bench BenchmarkCGoCallback ] Without open-coded defers: 443 ns/op With open-coded defers: 347 ns/op Updates #14939 (defer performance) Updates #34481 (design doc) Change-Id: I51a389860b9676cfa1b84722f5fb84d3c4ee9e28 Reviewed-on: https://go-review.googlesource.com/c/go/+/190098 Reviewed-by: Austin Clements <austin@google.com>	2019-10-16 18:27:16 +00:00
Cherry Zhang	c4817f5d4f	cmd/compile: on Wasm and AIX, let deferred nil function panic at invocation The Go spec requires If a deferred function value evaluates to nil, execution panics when the function is invoked, not when the "defer" statement is executed. On Wasm and AIX, currently we actually emit a nil check at the point of defer statement, which will make it panic too early. This CL fixes this. Also, on Wasm, now the nil function will be passed through deferreturn to jmpdefer, which does an explicit nil check and calls sigpanic if it is nil. This sigpanic, being called from assembly, is ABI0. So change the assembler backend to also handle sigpanic in ABI0. Fixes #34926. Updates #8047. Change-Id: I28489a571cee36d2aef041f917b8cfdc31d557d4 Reviewed-on: https://go-review.googlesource.com/c/go/+/201297 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2019-10-16 00:05:37 +00:00
Meng Zhuo	50f1157760	cmd/compile: add math/bits.Mul64 intrinsic on mips64x Benchmark: name old time/op new time/op delta Mul 36.0ns ± 1% 2.8ns ± 0% -92.31% (p=0.000 n=10+10) Mul32 4.37ns ± 0% 4.37ns ± 0% ~ (p=0.429 n=6+10) Mul64 36.4ns ± 0% 2.8ns ± 0% -92.37% (p=0.000 n=10+9) Change-Id: Ic4f4e5958adbf24999abcee721d0180b5413fca7 Reviewed-on: https://go-review.googlesource.com/c/go/+/200582 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com> Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2019-10-14 21:23:34 +00:00
Matthew Dempsky	06b12e660c	cmd/compile: move some ONAME-specific flags from Node to Name The IsClosureVar, IsOutputParamHeapAddr, Assigned, Addrtaken, InlFormal, and InlLocal flags are only interesting for ONAME nodes, so it's better to set these flags on Name.flags instead of Node.flags. Two caveats though: 1. Previously, we would set Assigned and Addrtaken on the entire expression tree involved in an assignment or addressing operation. However, the rest of the compiler only actually cares about knowing whether the underlying ONAME (if any) was assigned/addressed. 2. This actually requires bumping Name.flags from bitset8 to bitset16, whereas it doesn't allow shrinking Node.flags any. However, Name has some trailing padding bytes, so expanding Name.flags doesn't cost any memory. Passes toolstash-check. Change-Id: I7775d713566a38d5b9723360b1659b79391744c2 Reviewed-on: https://go-review.googlesource.com/c/go/+/200898 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2019-10-14 18:57:11 +00:00
Matthew Dempsky	46be01f4e0	cmd/compile: remove Addable flag This flag is supposed to indicate whether the expression is "addressable"; but in practice, we infer this from other attributes about the expression (e.g., n.Op and n.Class()). Passes toolstash-check. Change-Id: I19352ca07ab5646e232d98e8a7c1c9aec822ddd0 Reviewed-on: https://go-review.googlesource.com/c/go/+/200897 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2019-10-13 01:48:30 +00:00
David Chase	c450ace12c	cmd/compile: remove statement marks from secondary calls Calls are code-generated in an alternate path that inherits its positions from values, not from SSAGenState. The default position on SSAGenState was marked as not-a-statement, but this was not applied to the value itself, leading to spurious "is statement" marks in the output (convention: after code generation in the compiler, everything is either definitely a statement or definitely not a statement, nothing is in the undetermined state). This CL causes a 35 statement regression in ssa/stmtlines_test. This is down from the earlier 150 because of all the other CLs preceding this one that deal with the root causes of the missing lines (repeated lines on nested calls hid missing lines). This also removes some line repeats from ssa/debug_test. Change-Id: Ie9a507bd5447e906b35bbd098e3295211df2ae01 Reviewed-on: https://go-review.googlesource.com/c/go/+/188018 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Jeremy Faller <jeremy@golang.org>	2019-10-04 20:41:52 +00:00
Ruixin(Peter) Bao	ac2ceba01a	cmd/compile/internal/gc: intrinsify mulWW on s390x SSA rule have already been added previously to intrisinfy Mul/Mul64 on s390x. In this CL, we want to let mulWW use that SSA rule as well. Also removed an extra line for formatting. Benchmarks: QuoRem-18 3.59µs ±15% 2.94µs ± 3% -18.06% (p=0.000 n=8+8) ModSqrt225_Tonelli-18 806µs ± 0% 800µs ± 0% -0.85% (p=0.000 n=7+8) ModSqrt225_3Mod4-18 245µs ± 1% 243µs ± 0% -0.81% (p=0.001 n=8+8) ModSqrt231_Tonelli-18 837µs ± 0% 834µs ± 1% -0.36% (p=0.028 n=8+8) ModSqrt231_5Mod8-18 282µs ± 0% 280µs ± 0% -0.76% (p=0.000 n=8+8) Sqrt-18 45.8µs ± 2% 38.6µs ± 0% -15.63% (p=0.000 n=8+8) IntSqr/1-18 19.1ns ± 0% 13.1ns ± 0% -31.41% (p=0.000 n=8+8) IntSqr/2-18 48.3ns ± 2% 48.2ns ± 0% ~ (p=0.094 n=8+8) IntSqr/3-18 70.5ns ± 1% 70.7ns ± 0% ~ (p=0.428 n=8+8) IntSqr/5-18 119ns ± 1% 118ns ± 0% -1.02% (p=0.000 n=7+8) IntSqr/8-18 215ns ± 1% 215ns ± 0% ~ (p=0.320 n=8+7) IntSqr/10-18 302ns ± 1% 301ns ± 0% ~ (p=0.148 n=8+7) IntSqr/20-18 952ns ± 1% 807ns ± 0% -15.28% (p=0.000 n=8+8) IntSqr/30-18 1.74µs ± 0% 1.53µs ± 0% -11.93% (p=0.000 n=8+8) IntSqr/50-18 3.91µs ± 0% 3.57µs ± 0% -8.64% (p=0.000 n=7+8) IntSqr/80-18 8.66µs ± 1% 8.11µs ± 0% -6.39% (p=0.000 n=8+8) IntSqr/100-18 12.8µs ± 0% 12.2µs ± 0% -5.19% (p=0.000 n=8+8) IntSqr/200-18 46.0µs ± 0% 44.5µs ± 0% -3.06% (p=0.000 n=8+8) IntSqr/300-18 81.4µs ± 0% 78.4µs ± 0% -3.71% (p=0.000 n=7+8) IntSqr/500-18 212µs ± 1% 206µs ± 0% -2.66% (p=0.000 n=8+8) IntSqr/800-18 419µs ± 1% 406µs ± 0% -3.07% (p=0.000 n=8+8) IntSqr/1000-18 635µs ± 0% 621µs ± 0% -2.13% (p=0.000 n=8+8) Change-Id: Ib097857186932b902601ab087cbeff3fc9555c3e Reviewed-on: https://go-review.googlesource.com/c/go/+/197639 Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2019-10-02 11:08:43 +00:00
Mohit Verma	c729116332	cmd/compile: use Node.Right for OAS2* nodes (cleanup) This CL changes cmd/compile to use Node.Right instead of Node.Rlist for OAS2FUNC/OAS2RECV/OAS2MAPR/OAS2DOTTYPE nodes. Fixes #32293 Change-Id: I4c9d9100be2d98d15e016797f934f64d385f5faa Reviewed-on: https://go-review.googlesource.com/c/go/+/197817 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2019-09-28 05:04:49 +00:00
Cuong Manh Le	75da700d0a	cmd/compile: consistently use strlit to access constants string values Passes toolstash-check. Change-Id: Ieaef20b7649787727b69469f93ffc942022bc079 Reviewed-on: https://go-review.googlesource.com/c/go/+/195198 Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2019-09-16 11:41:20 +00:00
Ruixin Bao	98aa97806b	cmd/compile: add math/bits.Mul64 intrinsic on s390x This change adds an intrinsic for Mul64 on s390x. To achieve that, a new assembly instruction, MLGR, is introduced in s390x/asmz.go. This assembly instruction directly uses an existing instruction on Z and supports multiplication of two 64 bit unsigned integer and stores the result in two separate registers. In this case, we require the multiplcand to be stored in register R3 and the output result (the high and low 64 bit of the product) to be stored in R2 and R3 respectively. A test case is also added. Benchmark: name old time/op new time/op delta Mul-18 11.1ns ± 0% 1.4ns ± 0% -87.39% (p=0.002 n=8+10) Mul32-18 2.07ns ± 0% 2.07ns ± 0% ~ (all equal) Mul64-18 11.1ns ± 1% 1.4ns ± 0% -87.42% (p=0.000 n=10+10) Change-Id: Ieca6ad1f61fff9a48a31d50bbd3f3c6d9e6675c1 Reviewed-on: https://go-review.googlesource.com/c/go/+/194572 Reviewed-by: Michael Munday <mike.munday@ibm.com> Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2019-09-13 09:04:48 +00:00
Ainar Garipov	0efbd10157	all: fix typos Use the following (suboptimal) script to obtain a list of possible typos: #!/usr/bin/env sh set -x git ls-files \|\ grep -e '\.$c\\|cc\\|go$$' \|\ xargs -n 1\ awk\ '/\/\// { gsub(/.\/\//, ""); print; } /\/\/, /\\// { gsub(/.\/\/, ""); gsub(/\\/.*/, ""); }' \|\ hunspell -d en_US -l \|\ grep '^[[:upper:]]\{0,1\}[[:lower:]]\{1,\}$' \|\ grep -v -e '^.\{1,4\}$' -e '^.\{16,\}$' \|\ sort -f \|\ uniq -c \|\ awk '$1 == 1 { print $2; }' Then, go through the results manually and fix the most obvious typos in the non-vendored code. Change-Id: I3cb5830a176850e1a0584b8a40b47bde7b260eae Reviewed-on: https://go-review.googlesource.com/c/go/+/193848 Reviewed-by: Robert Griesemer <gri@golang.org>	2019-09-08 17:28:20 +00:00
Matthew Dempsky	581526ce96	cmd/compile: rewrite untyped constant conversion logic This CL detangles the hairy mess that was convlit+defaultlit. In particular, it makes the following changes: 1. convlit1 now follows the standard typecheck behavior of setting "n.Type = nil" if there's an error. Notably, this means for a lot of test cases, we now avoid reporting useless follow-on error messages. For example, after reporting that "1 << s + 1.0" has an invalid shift, we no longer also report that it can't be assigned to string. 2. Previously, assignconvfn had some extra logic for trying to suppress errors from convlit/defaultlit so that it could provide its own errors with better context information. Instead, this extra context information is now passed down into convlit1 directly. 3. Relatedly, this CL also removes redundant calls to defaultlit prior to assignconv. As a consequence, when an expression doesn't make sense for a particular assignment (e.g., assigning an untyped string to an integer), the error messages now say "untyped string" instead of just "string". This is more consistent with go/types behavior. 4. defaultlit2 is now smarter about only trying to convert pairs of untyped constants when it's likely to succeed. This allows us to report better error messages for things like 3+"x"; instead of "cannot convert 3 to string" we now report "mismatched types untyped number and untyped string". Passes toolstash-check. Change-Id: I26822a02dc35855bd0ac774907b1cf5737e91882 Reviewed-on: https://go-review.googlesource.com/c/go/+/187657 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2019-09-06 23:15:48 +00:00
Cuong Manh Le	d2f958d8d1	cmd/compile: extend ssa.go to handle 1-element array and 1-field struct Assinging to 1-element array/1-field struct variable is considered clobbering the whole variable. By emitting OpVarDef in this case, liveness analysis can now know the variable is redefined. Also, the isfat is not necessary anymore, and will be removed in follow up CL. Fixes #33916 Change-Id: Iece0d90b05273f333d59d6ee5b12ee7dc71908c2 Reviewed-on: https://go-review.googlesource.com/c/go/+/192979 Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2019-09-03 19:33:04 +00:00
Brian Kessler	b003afe4fe	cmd/compile: intrinsify RotateLeft32 on wasm wasm has 32-bit versions of all integer operations. This change lowers RotateLeft32 to i32.rotl on wasm and intrinsifies the math/bits call. Benchmarking on amd64 under node.js this is ~25% faster. node v10.15.3/amd64 name old time/op new time/op delta RotateLeft 8.37ns ± 1% 8.28ns ± 0% -1.05% (p=0.029 n=4+4) RotateLeft8 11.9ns ± 1% 11.8ns ± 0% ~ (p=0.167 n=5+5) RotateLeft16 11.8ns ± 0% 11.8ns ± 0% ~ (all equal) RotateLeft32 11.9ns ± 1% 8.7ns ± 0% -26.32% (p=0.008 n=5+5) RotateLeft64 8.31ns ± 1% 8.43ns ± 2% ~ (p=0.063 n=5+5) Updates #31265 Change-Id: I5b8e155978faeea536c4f6427ac9564d2f096a46 Reviewed-on: https://go-review.googlesource.com/c/go/+/182359 Run-TryBot: Brian Kessler <brian.m.kessler@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Richard Musiol <neelance@gmail.com>	2019-08-31 17:03:04 +00:00
Ben Shi	8d5197d818	cmd/compile: optimize 386's math.bits.TrailingZeros16 This CL reverts CL 192097 and fixes the issue in CL 189277. Change-Id: Icd271262e1f5019a8e01c91f91c12c1261eeb02b Reviewed-on: https://go-review.googlesource.com/c/go/+/192519 Run-TryBot: Ben Shi <powerman1st@163.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2019-08-30 17:37:00 +00:00
Ben Shi	3cfd003a8a	cmd/compile: optimize ARM's math.bits.RotateLeft32 This CL optimizes math.bits.RotateLeft32 to inline "MOVW Rx@>Ry, Rd" on ARM. The benchmark results of math/bits show some improvements. name old time/op new time/op delta RotateLeft-4 9.42ns ± 0% 6.91ns ± 0% -26.66% (p=0.000 n=40+33) RotateLeft8-4 8.79ns ± 0% 8.79ns ± 0% -0.04% (p=0.000 n=40+31) RotateLeft16-4 8.79ns ± 0% 8.79ns ± 0% -0.04% (p=0.000 n=40+32) RotateLeft32-4 8.16ns ± 0% 7.54ns ± 0% -7.68% (p=0.000 n=40+40) RotateLeft64-4 15.7ns ± 0% 15.7ns ± 0% ~ (all equal) updates #31265 Change-Id: I77bc1c2c702d5323fc7cad5264a8e2d5666bf712 Reviewed-on: https://go-review.googlesource.com/c/go/+/188697 Run-TryBot: Ben Shi <powerman1st@163.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2019-08-28 15:41:58 +00:00
Ben Shi	c683ab8128	cmd/compile: optimize ARM's math.Abs This CL optimizes math.Abs to an inline ABSD instruction on ARM. The benchmark results of src/math/ show big improvements. name old time/op new time/op delta Acos-4 181ns ± 0% 182ns ± 0% +0.30% (p=0.000 n=40+40) Acosh-4 202ns ± 0% 202ns ± 0% ~ (all equal) Asin-4 163ns ± 0% 163ns ± 0% ~ (all equal) Asinh-4 242ns ± 0% 242ns ± 0% ~ (all equal) Atan-4 120ns ± 0% 121ns ± 0% +0.83% (p=0.000 n=40+40) Atanh-4 202ns ± 0% 202ns ± 0% ~ (all equal) Atan2-4 173ns ± 0% 173ns ± 0% ~ (all equal) Cbrt-4 1.06µs ± 0% 1.06µs ± 0% +0.09% (p=0.000 n=39+37) Ceil-4 72.9ns ± 0% 72.8ns ± 0% ~ (p=0.237 n=40+40) Copysign-4 13.2ns ± 0% 13.2ns ± 0% ~ (all equal) Cos-4 193ns ± 0% 183ns ± 0% -5.18% (p=0.000 n=40+40) Cosh-4 254ns ± 0% 239ns ± 0% -5.91% (p=0.000 n=40+40) Erf-4 112ns ± 0% 112ns ± 0% ~ (all equal) Erfc-4 117ns ± 0% 117ns ± 0% ~ (all equal) Erfinv-4 127ns ± 0% 127ns ± 1% ~ (p=0.492 n=40+40) Erfcinv-4 128ns ± 0% 128ns ± 0% ~ (all equal) Exp-4 212ns ± 0% 206ns ± 0% -3.05% (p=0.000 n=40+40) ExpGo-4 216ns ± 0% 209ns ± 0% -3.24% (p=0.000 n=40+40) Expm1-4 142ns ± 0% 142ns ± 0% ~ (all equal) Exp2-4 191ns ± 0% 184ns ± 0% -3.45% (p=0.000 n=40+40) Exp2Go-4 194ns ± 0% 187ns ± 0% -3.61% (p=0.000 n=40+40) Abs-4 14.4ns ± 0% 6.3ns ± 0% -56.39% (p=0.000 n=38+39) Dim-4 12.6ns ± 0% 12.6ns ± 0% ~ (all equal) Floor-4 49.6ns ± 0% 49.6ns ± 0% ~ (all equal) Max-4 27.6ns ± 0% 27.6ns ± 0% ~ (all equal) Min-4 27.0ns ± 0% 27.0ns ± 0% ~ (all equal) Mod-4 349ns ± 0% 305ns ± 1% -12.55% (p=0.000 n=33+40) Frexp-4 54.0ns ± 0% 47.1ns ± 0% -12.78% (p=0.000 n=38+38) Gamma-4 242ns ± 0% 234ns ± 0% -3.16% (p=0.000 n=36+40) Hypot-4 84.8ns ± 0% 67.8ns ± 0% -20.05% (p=0.000 n=31+35) HypotGo-4 88.5ns ± 0% 71.6ns ± 0% -19.12% (p=0.000 n=40+38) Ilogb-4 45.8ns ± 0% 38.9ns ± 0% -15.12% (p=0.000 n=40+32) J0-4 821ns ± 0% 802ns ± 0% -2.33% (p=0.000 n=33+40) J1-4 816ns ± 0% 807ns ± 0% -1.05% (p=0.000 n=40+29) Jn-4 1.67µs ± 0% 1.65µs ± 0% -1.45% (p=0.000 n=40+39) Ldexp-4 61.5ns ± 0% 54.6ns ± 0% -11.27% (p=0.000 n=40+32) Lgamma-4 188ns ± 0% 188ns ± 0% ~ (all equal) Log-4 154ns ± 0% 147ns ± 0% -4.78% (p=0.000 n=40+40) Logb-4 50.9ns ± 0% 42.7ns ± 0% -16.11% (p=0.000 n=34+39) Log1p-4 160ns ± 0% 159ns ± 0% ~ (p=0.828 n=40+40) Log10-4 173ns ± 0% 166ns ± 0% -4.05% (p=0.000 n=40+40) Log2-4 65.3ns ± 0% 58.4ns ± 0% -10.57% (p=0.000 n=37+37) Modf-4 36.4ns ± 0% 36.4ns ± 0% ~ (all equal) Nextafter32-4 36.4ns ± 0% 36.4ns ± 0% ~ (all equal) Nextafter64-4 32.7ns ± 0% 32.6ns ± 0% ~ (p=0.375 n=40+40) PowInt-4 300ns ± 0% 277ns ± 0% -7.78% (p=0.000 n=40+40) PowFrac-4 676ns ± 0% 635ns ± 0% -6.00% (p=0.000 n=40+35) Pow10Pos-4 17.6ns ± 0% 17.6ns ± 0% ~ (all equal) Pow10Neg-4 22.0ns ± 0% 22.0ns ± 0% ~ (all equal) Round-4 30.1ns ± 0% 30.1ns ± 0% ~ (all equal) RoundToEven-4 38.9ns ± 0% 38.9ns ± 0% ~ (all equal) Remainder-4 291ns ± 0% 263ns ± 0% -9.62% (p=0.000 n=40+40) Signbit-4 11.3ns ± 0% 11.3ns ± 0% ~ (all equal) Sin-4 185ns ± 0% 185ns ± 0% ~ (all equal) Sincos-4 230ns ± 0% 230ns ± 0% ~ (all equal) Sinh-4 253ns ± 0% 246ns ± 0% -2.77% (p=0.000 n=39+39) SqrtIndirect-4 41.4ns ± 0% 41.4ns ± 0% ~ (all equal) SqrtLatency-4 13.8ns ± 0% 13.8ns ± 0% ~ (all equal) SqrtIndirectLatency-4 37.0ns ± 0% 37.0ns ± 0% ~ (p=0.632 n=40+40) SqrtGoLatency-4 911ns ± 0% 911ns ± 0% +0.08% (p=0.000 n=40+40) SqrtPrime-4 13.2µs ± 0% 13.2µs ± 0% +0.01% (p=0.038 n=38+40) Tan-4 205ns ± 0% 205ns ± 0% ~ (all equal) Tanh-4 264ns ± 0% 247ns ± 0% -6.44% (p=0.000 n=39+32) Trunc-4 45.2ns ± 0% 45.2ns ± 0% ~ (all equal) Y0-4 796ns ± 0% 792ns ± 0% -0.55% (p=0.000 n=35+40) Y1-4 804ns ± 0% 797ns ± 0% -0.82% (p=0.000 n=24+40) Yn-4 1.64µs ± 0% 1.62µs ± 0% -1.27% (p=0.000 n=40+39) Float64bits-4 8.16ns ± 0% 8.16ns ± 0% +0.04% (p=0.000 n=35+40) Float64frombits-4 10.7ns ± 0% 10.7ns ± 0% ~ (all equal) Float32bits-4 7.53ns ± 0% 7.53ns ± 0% ~ (p=0.760 n=40+40) Float32frombits-4 6.91ns ± 0% 6.91ns ± 0% -0.04% (p=0.002 n=32+38) [Geo mean] 111ns 106ns -3.98% Change-Id: I54f4fd7f5160db020b430b556bde59cc0fdb996d Reviewed-on: https://go-review.googlesource.com/c/go/+/188678 Run-TryBot: Ben Shi <powerman1st@163.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2019-08-28 15:41:28 +00:00
Bryan C. Mills	372b0eed17	Revert "cmd/compile: optimize 386's math.bits.TrailingZeros16" This reverts CL 189277. Reason for revert: broke 32-bit builders. Updates #33902 Change-Id: Ie5f180d0371a90e5057ed578c334372e5fc3a286 Reviewed-on: https://go-review.googlesource.com/c/go/+/192097 Run-TryBot: Bryan C. Mills <bcmills@google.com> Reviewed-by: Daniel Martí <mvdan@mvdan.cc>	2019-08-28 12:57:59 +00:00
Ben Shi	22355d6cd2	cmd/compile: optimize 386's math.bits.TrailingZeros16 This CL optimizes math.bits.TrailingZeros16 on 386 with a pair of BSFL and ORL instrcutions. The case TrailingZeros16-4 of the benchmark test in math/bits shows big improvement. name old time/op new time/op delta TrailingZeros16-4 1.55ns ± 1% 0.87ns ± 1% -43.87% (p=0.000 n=50+49) Change-Id: Ia899975b0e46f45dcd20223b713ed632bc32740b Reviewed-on: https://go-review.googlesource.com/c/go/+/189277 Run-TryBot: Ben Shi <powerman1st@163.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2019-08-28 02:29:54 +00:00
LE Manh Cuong	1a432f27d5	cmd/compile: eliminate usage of global Fatalf in ssa.go state and ssafn both have their own Fatalf, so use them instead of global Fatalf. Updates #19683 Change-Id: Ie02a961d4285ab0a3f3b8d889a5b498d926ed567 Reviewed-on: https://go-review.googlesource.com/c/go/+/188539 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2019-08-27 17:05:15 +00:00

1 2 3 4 5 ...

785 Commits