Commit Graph

11738 Commits

Author SHA1 Message Date
isharipo 67cc1af133 cmd/internal/obj/x86: remove redundant code (fix TODO)
Special cases described in TODO comments are
fixed in https://golang.org/cl/6901.

One of those blocks was needed until old 6a assembler was removed.
This happened in https://golang.org/cl/12784.

Fixes #24734 (Also contains more details and reasoning)

Change-Id: If1f2f155d36ab236b16ae6f309a0716e00aa6371
Reviewed-on: https://go-review.googlesource.com/105156
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-04-09 19:10:00 +00:00
Matthew Dempsky 17df5ed910 cmd/compile: insert instrumentation during SSA building
Insert appropriate race/msan calls before each memory operation during
SSA construction.

This is conceptually simple, but subtle because we need to be careful
that inserted instrumentation calls don't clobber arguments that are
currently being prepared for a user function call.

reorder1 already handles introducing temporary variables for arguments
in some cases. This CL changes it to use them for all arguments when
instrumenting.

Also, we can't SSA struct types with more than one field while
instrumenting. Otherwise, concurrent uses of disjoint fields within an
SSA-able struct can introduce false races.

This is both somewhat better and somewhat worse than the old racewalk
instrumentation pass. We're now able to easily recognize cases like
constructing non-escaping closures on the stack or accessing closure
variables don't need instrumentation calls. On the other hand,
spilling escaping parameters to the heap now results in an
instrumentation call.

Overall, this CL results in a small net reduction in the number of
instrumentation calls, but a small net increase in binary size for
instrumented executables. cmd/go ends up with 5.6% fewer calls, but a
2.4% larger binary.

Fixes #19054.

Change-Id: I70d1dd32ad6340e6fdb691e6d5a01452f58e97f3
Reviewed-on: https://go-review.googlesource.com/102817
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-04-09 18:40:55 +00:00
fanzha02 31700b83b5 cmd/internal/obj/arm64: add assembler support for load with register offset
The patch adds support for LDR(register offset) instruction.
And add the test cases and negative tests.

Change-Id: I5b32c6a5065afc4571116d4896f7ebec3c0416d3
Reviewed-on: https://go-review.googlesource.com/87955
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-04-09 18:20:41 +00:00
Daniel Martí 14393c5cd4 cmd: remove a few more unused parameters
ssa's pos parameter on the Const* funcs is unused, so remove it.

ld's alloc parameter on elfnote is always true, so remove the arguments
and simplify the code.

Finally, arm's addpltreloc never has its return parameter used, so
remove it.

Change-Id: I63387ecf6ab7b5f7c20df36be823322bb98427b8
Reviewed-on: https://go-review.googlesource.com/104456
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-04-09 17:10:25 +00:00
Alberto Donizetti 54c3f56ee0 test/codegen: port various mem-combining tests
And delete them from asm_test.

Change-Id: I0e33d58274951ab5acb67b0117b60ef617ea887a
Reviewed-on: https://go-review.googlesource.com/105735
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
2018-04-09 12:00:06 +00:00
Josh Bleecher Snyder bbfae469a1 cmd/compile: handle blank struct fields in NumComponents
NumComponents is used by racewalk to decide whether reads and writes
might occur to subobjects of an address. For that purpose,
blank fields matter.

It is also used to decide whether to inline == and != for a type.
For that purpose, blank fields may be ignored.

Add a parameter to NumComponents to support this distinction.
While we're here, document NumComponents, as requested in CL 59334.

Change-Id: I8c2021b172edadd6184848a32a74774dde1805c8
Reviewed-on: https://go-review.googlesource.com/103755
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-04-07 21:01:07 +00:00
Alekseev Artem d8b417e00b cmd/internal/obj/x86: use raw string literals in regexp
Found with megacheck (S1007).

Change-Id: Icb15fd5bfefa8e0b39a1bfa9ec3e9af3eff6b390
Reviewed-on: https://go-review.googlesource.com/105415
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-04-07 14:21:43 +00:00
Alberto Donizetti 3e31eb6b84 test/codegen: port arm64 slice zeroing tests
Finish porting arm64 slice zeroing codegen tests; delete them from
asm_test.

Change-Id: Id2532df8ba1c340fa662a6b5238daa3de30548be
Reviewed-on: https://go-review.googlesource.com/105136
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
2018-04-07 09:55:51 +00:00
Daniel Martí 31db81d329 cmd/vendor/.../pprof: update to 0f7d9ba1
In particular, to pull a few fixes that were causing some tests to be
flaky on our build dashboard.

Fixes #24405.
Fixes #24508.
Fixes #24611.

Change-Id: I713156ad11c924e4a4b603144d10395523d526ed
Reviewed-on: https://go-review.googlesource.com/105275
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-04-06 23:03:41 +00:00
Matthew Dempsky 27493187ed cmd/compile: change cmplx{mpy,div} into Mpcplx methods
Passes toolstash-check.

Change-Id: Icae55fe4fa1bb8e4f2f83b7c69e08d30a5559d9e
Reviewed-on: https://go-review.googlesource.com/105047
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-04-06 19:25:08 +00:00
Ilya Tocar b0d437f866 cmd/compile/internal/gc: use bool in racewalk
Replace int variables with 0/1 as only possible values with bools,
where possible.

Change-Id: I958c082e703bbc1540309da3e17612fc8e247932
Reviewed-on: https://go-review.googlesource.com/105036
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-04-06 16:14:42 +00:00
David Chase eda22a06fb cmd/compile: ensure first instruction of function is stmt
In gdb, "b f" gets confused if the first instruction of "f"
is not marked as a statement in the DWARF line table.

To ensure gdb is not confused, move the first statement
marker in "f" to its first instruction.

The screwy-looking conditional for "what's the first
instruction with a statement marker" will become simpler in
the future.

Fixes #24695.

Change-Id: I2eef81676b64d1bd9bff5da03b89b9dc0c18f44f
Reviewed-on: https://go-review.googlesource.com/104955
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-04-06 16:05:42 +00:00
Matthew Dempsky 33eaf75a03 cmd/compile: cleanup method expression type checking
Passes toolstash-check.

Change-Id: I804e73447b6fdbb75af6235c193c4ee7cbcf8d3a
Reviewed-on: https://go-review.googlesource.com/105045
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-04-06 15:40:13 +00:00
Matthew Dempsky 950a56899a cmd/compile: fix method expressions with anonymous receivers
Method expressions with anonymous receiver types like "struct { T }.m"
require wrapper functions, which we weren't always creating. This in
turn resulted in linker errors.

This CL ensures that we generate wrapper functions for any anonymous
receiver types used in a method expression.

Fixes #22444.

Change-Id: Ia8ac27f238c2898965e57b82a91d959792d2ddd4
Reviewed-on: https://go-review.googlesource.com/105044
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-04-06 15:39:11 +00:00
Tobias Klauser 58e3f2ac87 cmd/asm/internal/arch: unexport ParseARM64Suffix
ParseARM64Suffix is not used outside cmd/asm/internal/arch.

Change-Id: I8e7782dce11cf8cd2fd08dd17e555ced8d87ba24
Reviewed-on: https://go-review.googlesource.com/105115
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-04-06 14:09:04 +00:00
Daniel Martí 284c53498f cmd: some semi-automated cleanups
* Remove some redundant returns
* Replace HasPrefix with TrimPrefix
* Remove some obviously dead code

Passes toolstash -cmp on std cmd.

Change-Id: Ifb0d70a45cbb8a8553758a8c4878598b7fe932bc
Reviewed-on: https://go-review.googlesource.com/105017
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-04-06 13:59:29 +00:00
Matthew Dempsky fd9d2898bd cmd/compile: eliminate unused Sig.offset field
Change-Id: If498d1fc6e8c0c4e8cf7ed38c4997adf05e003a6
Reviewed-on: https://go-review.googlesource.com/105043
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dave Cheney <dave@cheney.net>
2018-04-06 06:03:38 +00:00
Josh Bleecher Snyder 3f483e6520 cmd/compile: rewrite a & 1 != 1 into a & 1 == 0 on amd64
These rules trigger 190 times during make.bash.

Change-Id: I20d1688db5d8c904a7237c08635c6c9d8bd58b1c
Reviewed-on: https://go-review.googlesource.com/105037
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
2018-04-05 23:14:14 +00:00
Matthew Dempsky 638f112d69 cmd/compile: cleanup method symbol creation
There were multiple ad hoc ways to create method symbols, with subtle
and confusing differences between them. This CL unifies them into a
single well-documented encoding and implementation.

This introduces some inconsequential changes to symbol format for the
sake of simplicity and consistency. Two notable changes:

1) Symbol construction is now insensitive to the package currently
being compiled. Previously, non-exported methods on anonymous types
received different method symbols depending on whether the method was
local or imported.

2) Symbols for method values parenthesized non-pointer receiver types
and non-exported method names, and also always package-qualified
non-exported method names. Now they use the same rules as normal
method symbols.

The methodSym function is also now stricter about rejecting
non-sensical method/receiver combinations. Notably, this means that
typecheckfunc needs to call addmethod to validate the method before
calling declare, which also means we no longer emit errors about
redeclaring bogus methods.

Change-Id: I9501c7a53dd70ef60e5c74603974e5ecc06e2003
Reviewed-on: https://go-review.googlesource.com/104876
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-04-05 22:01:17 +00:00
Richard Musiol 533fdfd00e cmd/compile/internal/gc: factor out beginning of SSAGenState.Call
This commit does not change the semantics of the Call method. Its
purpose is to avoid duplication of code by making PrepareCall available
for separate use by the wasm backend.

Updates #18892

Change-Id: I04a3098f56ebf0d995791c5375dd4c03b6a202a3
Reviewed-on: https://go-review.googlesource.com/103275
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2018-04-05 21:17:53 +00:00
Hana Kim 8e351ae304 cmd/trace: include taskless spans in /usertasks.
Change-Id: Id4e3407ba497a018d5ace92813ba8e9653d0ac7d
Reviewed-on: https://go-review.googlesource.com/104976
Reviewed-by: Heschi Kreinick <heschi@google.com>
Run-TryBot: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-04-05 20:51:32 +00:00
Ilya Tocar d3026dd30a cmd/compile/internal/ssa: fix GO386=387 build
Don't generate FP ops with 1 operand in memory for 387.

Change-Id: I23b49dfa2a1e60c8778c920230e64785a3ddfbd1
Reviewed-on: https://go-review.googlesource.com/105035
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-04-05 18:54:31 +00:00
Matthew Dempsky 6703addeee cmd/compile: drop legacy code for generating iface wrappers
Originally, scalar values were directly stored within interface values
as long as they fit into a pointer-sized slot of memory. And since
interface method calls always pass the full pointer-sized value as the
receiver argument, value-narrowing wrappers were necessary to adapt to
the calling convention for methods with smaller receiver types.

However, for precise garbage collection, we now only store actual
pointers within interface values, so these wrappers are no longer
necessary.

Passes toolstash-check.

Change-Id: I5303bfeb8d0f11db619b5a5d06b37ac898588670
Reviewed-on: https://go-review.googlesource.com/104875
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-04-05 18:01:55 +00:00
Daniel Martí e8aa9a533d cmd/internal/obj: various code cleanups
Mostly replacing C-Style loops with range expressions, but also other
simplifications like the introduction of writeBool and unindenting some
code.

Passes toolstash -cmp on std cmd.

Change-Id: I799bccd4e5d411428dcf122b8588a564a9217e7c
Reviewed-on: https://go-review.googlesource.com/104936
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Marvin Stenger <marvin.stenger94@gmail.com>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
2018-04-05 16:22:39 +00:00
isharipo 47427d67d6 cmd/internal/obj/x86: cleanup comments and consts
- Unexport MaxLoopPad and LoopAlign; associated comments updated
- Remove commented-out C code
- Replace C-style /**/ code comments with single-line comments

Change-Id: I51bd92a05b4d3823757b12efd798951c9f252bd4
Reviewed-on: https://go-review.googlesource.com/104795
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-04-05 16:22:26 +00:00
isharipo cefbf302de cmd/internal/obj/x86: remove unused VEX constants
VEX constants were used when instructions were added by hand.
Now all VEX-encoded instructions are auto-generated by x86avxgen,
so there is no need for those anymore.

Change-Id: Ida63e5e23a8b819b15f61ac98980dec45a21617c
Reviewed-on: https://go-review.googlesource.com/104775
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
2018-04-05 16:22:03 +00:00
Ben Shi 20cf5c4962 cmd/compile: optimize 386 binary operations with a memory operand
Some integer/float binary operations of 386 can take a direct memory
operand, which is more efficient than loading to a register.

These CL does this optimization by copying the similar solution
of amd64. And the go1 benchmark shows some inprovements, especially
the test case Template. (excluding noise)

name                     old time/op    new time/op    delta
BinaryTree17-4              3.42s ± 2%     3.40s ± 2%    ~     (p=0.069 n=38+39)
Fannkuch11-4                3.48s ± 1%     3.53s ± 1%  +1.59%  (p=0.000 n=40+40)
FmtFprintfEmpty-4          46.7ns ± 4%    46.3ns ± 3%  -1.03%  (p=0.001 n=40+40)
FmtFprintfString-4         80.1ns ± 3%    80.6ns ± 3%  +0.56%  (p=0.029 n=40+40)
FmtFprintfInt-4            92.4ns ± 2%    92.3ns ± 3%    ~     (p=0.847 n=40+40)
FmtFprintfIntInt-4          147ns ± 3%     144ns ± 3%  -1.87%  (p=0.000 n=40+40)
FmtFprintfPrefixedInt-4     182ns ± 2%     184ns ± 3%  +0.99%  (p=0.002 n=40+40)
FmtFprintfFloat-4           387ns ± 3%     384ns ± 3%    ~     (p=0.069 n=40+40)
FmtManyArgs-4               619ns ± 3%     616ns ± 3%    ~     (p=0.320 n=40+40)
GobDecode-4                7.28ms ± 6%    7.27ms ± 5%    ~     (p=0.897 n=40+40)
GobEncode-4                7.33ms ± 6%    7.21ms ± 6%  -1.56%  (p=0.022 n=38+40)
Gzip-4                      357ms ± 4%     357ms ± 4%    ~     (p=0.071 n=40+40)
Gunzip-4                   45.3ms ± 3%    45.4ms ± 3%    ~     (p=0.452 n=40+40)
HTTPClientServer-4         63.0µs ± 2%    62.9µs ± 3%    ~     (p=0.760 n=38+39)
JSONEncode-4               22.0ms ± 4%    21.7ms ± 4%  -1.49%  (p=0.000 n=40+40)
JSONDecode-4               67.7ms ± 4%    68.3ms ± 3%  +0.86%  (p=0.039 n=40+40)
Mandelbrot200-4            5.16ms ± 3%    5.17ms ± 3%    ~     (p=0.418 n=40+40)
GoParse-4                  3.30ms ± 2%    3.32ms ± 3%  +0.55%  (p=0.017 n=40+40)
RegexpMatchEasy0_32-4       104ns ± 3%     104ns ± 4%    ~     (p=0.992 n=40+40)
RegexpMatchEasy0_1K-4       852ns ± 3%     851ns ± 2%    ~     (p=0.344 n=40+40)
RegexpMatchEasy1_32-4       113ns ± 4%     113ns ± 5%    ~     (p=0.937 n=40+40)
RegexpMatchEasy1_1K-4      1.03µs ± 5%    1.04µs ± 4%    ~     (p=0.430 n=40+40)
RegexpMatchMedium_32-4      132ns ± 4%     131ns ± 3%  -1.06%  (p=0.027 n=40+40)
RegexpMatchMedium_1K-4     43.0µs ± 3%    43.2µs ± 3%    ~     (p=0.122 n=40+40)
RegexpMatchHard_32-4       2.21µs ± 4%    2.20µs ± 4%    ~     (p=0.146 n=40+40)
RegexpMatchHard_1K-4       67.1µs ± 4%    67.2µs ± 3%    ~     (p=0.859 n=40+40)
Revcomp-4                   1.85s ± 2%     1.85s ± 3%    ~     (p=0.184 n=40+40)
Template-4                 70.1ms ± 4%    67.5ms ± 3%  -3.65%  (p=0.000 n=40+40)
TimeParse-4                 457ns ±16%     439ns ± 4%    ~     (p=0.683 n=40+34)
TimeFormat-4                413ns ± 3%     414ns ± 3%    ~     (p=0.850 n=40+40)
[Geo mean]                 67.5µs         67.3µs       -0.38%

name                     old speed      new speed      delta
GobDecode-4               105MB/s ± 6%   106MB/s ± 5%    ~     (p=0.893 n=40+40)
GobEncode-4               105MB/s ± 6%   107MB/s ± 7%  +1.60%  (p=0.023 n=38+40)
Gzip-4                   54.4MB/s ± 4%  54.5MB/s ± 4%    ~     (p=0.073 n=40+40)
Gunzip-4                  429MB/s ± 3%   428MB/s ± 3%    ~     (p=0.453 n=40+40)
JSONEncode-4             88.3MB/s ± 5%  89.6MB/s ± 4%  +1.51%  (p=0.000 n=40+40)
JSONDecode-4             28.7MB/s ± 4%  28.4MB/s ± 3%  -0.87%  (p=0.039 n=40+40)
GoParse-4                17.6MB/s ± 3%  17.5MB/s ± 3%  -0.55%  (p=0.020 n=40+40)
RegexpMatchEasy0_32-4     308MB/s ± 4%   308MB/s ± 5%    ~     (p=0.988 n=40+40)
RegexpMatchEasy0_1K-4    1.20GB/s ± 3%  1.20GB/s ± 2%    ~     (p=0.329 n=40+40)
RegexpMatchEasy1_32-4     283MB/s ± 4%   283MB/s ± 4%    ~     (p=0.507 n=40+40)
RegexpMatchEasy1_1K-4     991MB/s ± 5%   987MB/s ± 4%    ~     (p=0.446 n=40+40)
RegexpMatchMedium_32-4   7.54MB/s ± 4%  7.63MB/s ± 3%  +1.26%  (p=0.004 n=40+40)
RegexpMatchMedium_1K-4   23.8MB/s ± 3%  23.7MB/s ± 4%    ~     (p=0.121 n=40+40)
RegexpMatchHard_32-4     14.5MB/s ± 4%  14.6MB/s ± 4%    ~     (p=0.145 n=40+40)
RegexpMatchHard_1K-4     15.3MB/s ± 4%  15.2MB/s ± 3%    ~     (p=0.874 n=40+40)
Revcomp-4                 137MB/s ± 2%   137MB/s ± 3%    ~     (p=0.179 n=40+40)
Template-4               27.7MB/s ± 4%  28.7MB/s ± 3%  +3.78%  (p=0.000 n=40+40)
[Geo mean]               78.9MB/s       79.2MB/s       +0.38%

Change-Id: I3ba688c253b665485c1ebdf5a75f4ce82cc3def3
Reviewed-on: https://go-review.googlesource.com/102036
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Keith Randall <khr@golang.org>
2018-04-05 16:09:32 +00:00
isharipo eef42f928d cmd/internal/obj/x86: make AsmBuf receiver name consistent
Fixes golint receiver name complaints.

We can't go with "a" name as it sometimes is used for obj.Addr args.

Change-Id: I66556f4e3dc42cfaaa4db3ed7772fa6756ea9a9b
Reviewed-on: https://go-review.googlesource.com/104796
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
2018-04-05 15:31:16 +00:00
Daniel Martí fcfea24742 cmd/compile: early return/continue to unindent some code
While at it, also simplify a couple of switches.

Doesn't pass toolstash -cmp on std cmd, because orderBlock(&n2.Nbody) is
moved further down to the n3 loop.

Change-Id: I20a2a6c21eb9a183a59572e0fca401a5041fc40a
Reviewed-on: https://go-review.googlesource.com/104416
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-04-05 11:12:39 +00:00
Alberto Donizetti 9357bb9eba cmd/compile: stack-allocate worklist in ReachableBlocks
Stack-allocate a local worklist in the deadcode pass. A size of 64 for
the pre-allocation is enough for >99% of the ReachableBlocks call in
a typical package.

name      old time/op       new time/op       delta
Template        281ms ± 3%        278ms ± 2%  -1.03%  (p=0.049 n=20+20)
Unicode         135ms ± 6%        134ms ± 6%    ~     (p=0.273 n=18+17)
GoTypes         882ms ± 3%        880ms ± 2%    ~     (p=0.925 n=20+20)
Compiler        4.01s ± 1%        4.02s ± 2%    ~     (p=0.640 n=20+20)
SSA             9.61s ± 1%        9.75s ± 1%  +1.39%  (p=0.000 n=20+19)
Flate           186ms ± 5%        185ms ± 7%    ~     (p=0.758 n=20+20)
GoParser        219ms ± 5%        218ms ± 4%    ~     (p=0.149 n=20+20)
Reflect         568ms ± 4%        562ms ± 1%    ~     (p=0.154 n=19+19)
Tar             258ms ± 2%        257ms ± 3%    ~     (p=0.428 n=19+20)
XML             316ms ± 2%        317ms ± 3%    ~     (p=0.901 n=20+19)

name      old user-time/op  new user-time/op  delta
Template        398ms ± 6%        388ms ± 6%  -2.55%  (p=0.007 n=20+20)
Unicode         217ms ± 5%        213ms ± 6%  -1.90%  (p=0.036 n=17+20)
GoTypes         1.21s ± 3%        1.20s ± 3%  -0.89%  (p=0.022 n=19+20)
Compiler        5.56s ± 3%        5.53s ± 5%    ~     (p=0.779 n=20+20)
SSA             13.9s ± 5%        14.0s ± 4%    ~     (p=0.529 n=20+20)
Flate           248ms ±10%        252ms ± 4%    ~     (p=0.409 n=20+18)
GoParser        305ms ± 4%        299ms ± 5%  -1.87%  (p=0.007 n=19+20)
Reflect         754ms ± 2%        747ms ± 3%    ~     (p=0.107 n=20+19)
Tar             360ms ± 5%        362ms ± 3%    ~     (p=0.534 n=20+18)
XML             425ms ± 6%        429ms ± 4%    ~     (p=0.496 n=20+19)

name      old alloc/op      new alloc/op      delta
Template       38.8MB ± 0%       38.7MB ± 0%  -0.15%  (p=0.000 n=20+20)
Unicode        29.1MB ± 0%       29.1MB ± 0%  -0.03%  (p=0.000 n=20+20)
GoTypes         115MB ± 0%        115MB ± 0%  -0.13%  (p=0.000 n=20+20)
Compiler        491MB ± 0%        490MB ± 0%  -0.15%  (p=0.000 n=18+19)
SSA            1.40GB ± 0%       1.40GB ± 0%  -0.16%  (p=0.000 n=20+20)
Flate          24.9MB ± 0%       24.8MB ± 0%  -0.17%  (p=0.000 n=20+20)
GoParser       30.7MB ± 0%       30.6MB ± 0%  -0.16%  (p=0.000 n=20+20)
Reflect        77.1MB ± 0%       77.0MB ± 0%  -0.11%  (p=0.000 n=19+20)
Tar            39.0MB ± 0%       39.0MB ± 0%  -0.14%  (p=0.000 n=20+20)
XML            44.6MB ± 0%       44.5MB ± 0%  -0.13%  (p=0.000 n=17+19)

name      old allocs/op     new allocs/op     delta
Template         379k ± 0%         378k ± 0%  -0.45%  (p=0.000 n=20+17)
Unicode          336k ± 0%         336k ± 0%  -0.08%  (p=0.000 n=20+20)
GoTypes         1.18M ± 0%        1.17M ± 0%  -0.37%  (p=0.000 n=20+20)
Compiler        4.58M ± 0%        4.56M ± 0%  -0.38%  (p=0.000 n=20+20)
SSA             11.4M ± 0%        11.4M ± 0%  -0.39%  (p=0.000 n=20+20)
Flate            233k ± 0%         232k ± 0%  -0.51%  (p=0.000 n=20+20)
GoParser         313k ± 0%         312k ± 0%  -0.48%  (p=0.000 n=19+20)
Reflect          946k ± 0%         943k ± 0%  -0.31%  (p=0.000 n=20+20)
Tar              388k ± 0%         387k ± 0%  -0.40%  (p=0.000 n=20+20)
XML              411k ± 0%         409k ± 0%  -0.35%  (p=0.000 n=17+20)

Change-Id: Iaec0b9471ded61be5eb3c9d1074e804672307644
Reviewed-on: https://go-review.googlesource.com/104675
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
2018-04-05 07:41:56 +00:00
Matthew Dempsky 562a199961 cmd/compile: extract inline related fields into separate Inline type
Inl, Inldcl, and InlCost are only applicable to functions with bodies
that can be inlined, so pull them out into a separate Inline type to
make understanding them easier.

A side benefit is that we can check if a function can be inlined by
just checking if n.Func.Inl is non-nil, which simplifies handling of
empty function bodies.

While here, remove some unnecessary Curfn twiddling, and make imported
functions use Inl.Dcl instead of Func.Dcl for consistency for local
functions.

Passes toolstash-check.

Change-Id: Ifd4a80349d85d9e8e4484952b38ec4a63182e81f
Reviewed-on: https://go-review.googlesource.com/104756
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-04-05 05:12:36 +00:00
David Chase b9a365681f cmd/compile: adjust is-statement on Pos's to improve debugging
Stores to auto tmp variables can be hoisted to places
where the line numbers make debugging look "jumpy".
Turning those instructions into ones with is_stmt = 0 in
the DWARF (accomplished by marking ssa nodes with NotStmt)
makes debugging look better while still attributing the
instructions with the correct line number.

The same is true for certain register allocator spills and
reloads.

Change-Id: I97a394eb522d4911cc40b4bf5bf76d3d7221f6c0
Reviewed-on: https://go-review.googlesource.com/98415
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2018-04-04 22:14:59 +00:00
David Chase dead03b794 cmd/link: process is_stmt data into dwarf line tables
To improve debugging, instructions should be annotated with
DWARF is_stmt.  The DWARF default before was is_stmt=1, and
to remove "jumpy" stepping the optimizer was tagging
instructions with a no-position position, which interferes
with the accuracy of profiling information.  This allows
that to be corrected, and also allows more "jumpy" positions
to be annotated with is_stmt=0 (these changes were not made
for 1.10 because of worries about further messing up
profiling).

The is_stmt values are placed in a pc-encoded table and
passed through a symbol derived from the name of the
function and processed in the linker alongside its
processing of each function's pc/line tables.

The only change in binary size is in the .debug_line tables
measured with "objdump -h --section=.debug_line go1.test"
For go1.test, these are 2614 bytes larger,
or 0.72% of the size of .debug_line,
or 0.025% of the file size.

This will increase in proportion to how much the is_stmt
flag is used (toggled).

Change-Id: Ic1f1aeccff44591ad0494d29e1a0202a3c506a7a
Reviewed-on: https://go-review.googlesource.com/93664
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
2018-04-04 22:14:29 +00:00
David Chase 619679a397 cmd/compile: add IsStmt breakpoint info to src.lico
Add IsStmt information to src.lico so that suitable lines
for breakpoints (or not) can be noted, eventually for
communication to the debugger via the linker and DWARF.

The expectation is that the front end will apply statement
boundary marks because it has best information about the
input, and the optimizer will attempt to preserve these.
The exact method for placing these marks is still TBD;
ideally stopping "at" line N in unoptimized code will occur
at a point where none of the side effects of N have occurred
and all of the inputs for line N can still be observed.
The optimizer will work with the same markings supplied
for unoptimized code.

It is a goal that non-optimizing compilation should conserve
statement marks.

The optimizer will also use the not-a-statement annotation
to indicate instructions that have a line number (for
profiling purposes) but should not be the target of
debugger step, next, or breakpoints.  Because instructions
marked as statements are sometimes removed, a third value
indicating that a position (instruction) can serve as a
statement if the optimizer removes the current instruction
marked as a statement for the same line.  The optimizer
should attempt to conserve statement marks, but it is not
a bug if some are lost.

Includes changes to html output for GOSSAFUNC to indicate
not-default is-a-statement with bold and not-a-statement
with strikethrough.

Change-Id: Ia22c9a682f276e2ca2a4ef7a85d4b6ebf9c62b7f
Reviewed-on: https://go-review.googlesource.com/93663
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-04-04 22:11:34 +00:00
Robert Griesemer 542ea5ad91 go/printer, gofmt: tuned table alignment for better results
The go/printer (and thus gofmt) uses a heuristic to determine
whether to break alignment between elements of an expression
list which is spread across multiple lines. The heuristic only
kicked in if the entry sizes (character length) was above a
certain threshold (20) and the ratio between the previous and
current entry size was above a certain value (4).

This heuristic worked reasonably most of the time, but also
led to unfortunate breaks in many cases where a single entry
was suddenly much smaller (or larger) then the previous one.

The behavior of gofmt was sufficiently mysterious in some of
these situations that many issues were filed against it.

The simplest solution to address this problem is to remove
the heuristic altogether and have a programmer introduce
empty lines to force different alignments if it improves
readability. The problem with that approach is that the
places where it really matters, very long tables with many
(hundreds, or more) entries, may be machine-generated and
not "post-processed" by a human (e.g., unicode/utf8/tables.go).

If a single one of those entries is overlong, the result
would be that the alignment would force all comments or
values in key:value pairs to be adjusted to that overlong
value, making the table hard to read (e.g., that entry may
not even be visible on screen and all other entries seem
spaced out too wide).

Instead, we opted for a slightly improved heuristic that
behaves much better for "normal", human-written code.

1) The threshold is increased from 20 to 40. This disables
the heuristic for many common cases yet even if the alignment
is not "ideal", 40 is not that many characters per line with
todays screens, making it very likely that the entire line
remains "visible" in an editor.

2) Changed the heuristic to not simply look at the size ratio
between current and previous line, but instead considering the
geometric mean of the sizes of the previous (aligned) lines.
This emphasizes the "overall picture" of the previous lines,
rather than a single one (which might be an outlier).

3) Changed the ratio from 4 to 2.5. Now that we ignore sizes
below 40, a ratio of 4 would mean that a new entry would have
to be 4 times bigger (160) or smaller (10) before alignment
would be broken. A ratio of 2.5 seems more sensible.

Applied updated gofmt to all of src and misc. Also tested
against several former issues that complained about this
and verified that the output for the given examples is
satisfactory (added respective test cases).

Some of the files changed because they were not gofmt-ed
in the first place.

For #644.
For #7335.
For #10392.
(and probably more related issues)

Fixes #22852.

Change-Id: I5e48b3d3b157a5cf2d649833b7297b33f43a6f6e
2018-04-04 13:39:34 -07:00
Balaram Makam d7c7d88b2c cmd/compile: intrinsify math/big.mulWW on ARM64
Performance numbers on amberwing:

pkg: math/big
name                            old time/op    new time/op    delta
QuoRem                            3.08µs ± 0%    2.93µs ± 1%   -4.89%  (p=0.008 n=5+5)
ModSqrt225_Tonelli                 721µs ± 0%     718µs ± 0%   -0.46%  (p=0.008 n=5+5)
ModSqrt224_3Mod4                   218µs ± 0%     217µs ± 0%   -0.27%  (p=0.008 n=5+5)
ModSqrt5430_Tonelli                2.91s ± 0%     2.91s ± 0%     ~     (p=0.222 n=5+5)
ModSqrt5430_3Mod4                  970ms ± 0%     970ms ± 0%     ~     (p=0.151 n=5+5)
Sqrt                              45.9µs ± 0%    43.8µs ± 0%   -4.63%  (p=0.008 n=5+5)
IntSqr/1                          19.9ns ± 0%    17.3ns ± 0%  -13.07%  (p=0.008 n=5+5)
IntSqr/2                          52.6ns ± 0%    50.8ns ± 0%   -3.35%  (p=0.008 n=5+5)
IntSqr/3                          70.4ns ± 0%    69.4ns ± 0%     ~     (p=0.079 n=4+5)
IntSqr/5                           103ns ± 0%      99ns ± 0%   -3.98%  (p=0.008 n=5+5)
IntSqr/8                           179ns ± 0%     178ns ± 0%   -0.56%  (p=0.008 n=5+5)
IntSqr/10                          272ns ± 0%     272ns ± 0%     ~     (all equal)
IntSqr/20                          763ns ± 0%     787ns ± 0%   +3.15%  (p=0.016 n=5+4)
IntSqr/30                         1.25µs ± 1%    1.29µs ± 1%   +3.27%  (p=0.008 n=5+5)
IntSqr/50                         2.64µs ± 0%    2.71µs ± 0%   +2.61%  (p=0.008 n=5+5)
IntSqr/80                         5.67µs ± 0%    5.72µs ± 0%   +0.88%  (p=0.008 n=5+5)
IntSqr/100                        8.05µs ± 0%    8.09µs ± 0%   +0.45%  (p=0.008 n=5+5)
IntSqr/200                        28.0µs ± 0%    28.1µs ± 0%     ~     (p=0.151 n=5+5)
IntSqr/300                        59.4µs ± 0%    59.6µs ± 0%   +0.36%  (p=0.008 n=5+5)
IntSqr/500                         141µs ± 0%     141µs ± 0%   +0.08%  (p=0.008 n=5+5)
IntSqr/800                         280µs ± 0%     280µs ± 0%   -0.12%  (p=0.008 n=5+5)
IntSqr/1000                        429µs ± 0%     428µs ± 0%   -0.27%  (p=0.008 n=5+5)

pkg: crypto-ecdsa
name      old time/op    new time/op    delta
SignP384    7.85ms ± 1%    7.61ms ± 1%  -3.12%  (p=0.008 n=5+5)

Change-Id: I1ab30856cc0e570f6312f0bd8914779b55adbc16
Reviewed-on: https://go-review.googlesource.com/104135
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-04-04 18:37:24 +00:00
Hana Kim e6ab614fda cmd/trace: avoid emitting traceview slice with 0 duration
The trace viewer interprets the slice as a non-terminating
time interval which is quite opposit to what trace records indicate
(i.e., almostly immediately terminating time interval).
As observed in the issue #24663 this can result in quite misleading
visualization of the trace.

Work around the trace viewer's issue by setting a small value
(0.0001usec) as the duration if the time interval is not positive.

Change-Id: I1c2aac135c194d0717f5c01a98ca60ffb14ef45c
Reviewed-on: https://go-review.googlesource.com/104716
Reviewed-by: Heschi Kreinick <heschi@google.com>
2018-04-04 18:15:20 +00:00
Hana Kim c4874aa2a2 cmd/trace: implement /trace?focustask=<taskid> mode
This mode is similar to the default traceview mode where the execution
trace is presented in P-oriented way. Each row represents a P, and each
slice represents the time interval of a goroutine's execution on the P.

The difference is that, in this mode, only the execution of goroutines
involved in the specified task is highlighted, and other goroutine
execution or events are greyed out. So, users can focus on how a task is
executed while considering other affecting conditions such as other
goroutines, network events, or process scheduling.

Example: https://user-images.githubusercontent.com/4999471/38116793-a6f995f0-337f-11e8-8de9-88eec2f2c497.png

Here, for a while the program remained idle after the first burst of
activity related to the task because all other goroutines were also
being blocked or waiting for events, or no incoming network traffic
(indicated by the lack of any network activity). This is a bit hard to
discover when the usual task-oriented view (/trace?taskid=<taskid>)
mode.

Also, it simplifies the traceview generation mode logic.
  /trace ---> 0
  /trace?goid ---> modeGoroutineOriented
  /trace?taskid ---> modeGoroutineOriented|modeTaskOriented
  /trace?focustask ---> modeTaskOriented

Change-Id: Idcc0ae31b708ddfd19766f4e26ee7efdafecd3a5
Reviewed-on: https://go-review.googlesource.com/103555
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
2018-04-04 18:14:40 +00:00
Cherry Zhang 08304e8867 cmd/link: put runtime.framepointer_enabled in DATA instead of RODATA
On darwin, only writable symbol is exported
(cmd/link/internal/ld/macho.go:/machoShouldExport).
For plugin to work correctly, global variables, including
runtime.framepointer_enabled which is set by the linker, need
to be exported when dynamic linking. Put it in DATA so it is
exported. Also in Go it is defined as a var, which is not
read-only.

While here, do the same for runtime.goarm.

Fixes #24653.

Change-Id: I9d1b7d5a648be17103d20b97be65a901cb69f5a2
Reviewed-on: https://go-review.googlesource.com/104715
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-04-04 16:37:20 +00:00
Nikhil Benesch 804d03281c cmd/go: rebuild as needed when vetting test packages
If A's external test package imports B, which imports A, and A's
internal test code adds something to A that invalidates anything in A's
export data, then we need to build B against the test-augmented version
of A before using it to build A's external test package.

https://golang.org/cl/92215 taught 'go test' to do this rebuilding
properly, but 'go vet' was not taught the same trick when it learned to
vet test packages in https://golang.org/cl/87636. This commit moves the
necessary logic into the load.TestPackagesFor function so it can be
shared by 'go test' and 'go vet'.

Fixes #23701.

Change-Id: I1086d447eca02933af53de693384eac99a08d9bd
Reviewed-on: https://go-review.googlesource.com/104315
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2018-04-04 14:39:23 +00:00
Alberto Donizetti f2abca90a2 test/codegen: port arm64 byte slice zeroing tests
And delete them from asm_test.

Change-Id: Id533130470da9176a401cb94972f626f43a62148
Reviewed-on: https://go-review.googlesource.com/103656
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
2018-04-04 13:18:15 +00:00
Alberto Donizetti 4ed94ef1a8 cmd/compile: stack-allocate 2 worklists in order, dom passes
Allocate two more ssa local worklists on the stack. The initial sizes
are chosen to cover >99% of the calls.

name      old time/op       new time/op       delta
Template        281ms ± 2%        283ms ± 5%    ~     (p=0.443 n=18+19)
Unicode         136ms ± 4%        135ms ± 7%    ~     (p=0.277 n=20+20)
GoTypes         886ms ± 2%        885ms ± 2%    ~     (p=0.862 n=20+20)
Compiler        4.03s ± 2%        4.02s ± 1%    ~     (p=0.270 n=19+20)
SSA             9.66s ± 1%        9.64s ± 2%    ~     (p=0.253 n=20+20)
Flate           186ms ± 5%        183ms ± 6%    ~     (p=0.174 n=20+20)
GoParser        222ms ± 4%        219ms ± 4%    ~     (p=0.081 n=20+20)
Reflect         569ms ± 2%        568ms ± 2%    ~     (p=0.686 n=19+19)
Tar             258ms ± 4%        256ms ± 3%    ~     (p=0.211 n=20+20)
XML             319ms ± 2%        317ms ± 3%    ~     (p=0.158 n=18+20)

name      old user-time/op  new user-time/op  delta
Template        396ms ± 6%        392ms ± 6%    ~     (p=0.211 n=20+20)
Unicode         212ms ±10%        211ms ± 9%    ~     (p=0.904 n=20+20)
GoTypes         1.21s ± 3%        1.21s ± 2%    ~     (p=0.183 n=20+20)
Compiler        5.60s ± 2%        5.62s ± 2%    ~     (p=0.355 n=18+18)
SSA             14.0s ± 6%        13.9s ± 5%    ~     (p=0.678 n=20+20)
Flate           250ms ± 8%        245ms ± 6%    ~     (p=0.166 n=19+20)
GoParser        305ms ± 6%        304ms ± 5%    ~     (p=0.659 n=20+20)
Reflect         760ms ± 3%        758ms ± 4%    ~     (p=0.758 n=20+20)
Tar             362ms ± 6%        357ms ± 5%    ~     (p=0.108 n=20+20)
XML             429ms ± 4%        429ms ± 4%    ~     (p=0.799 n=20+20)

name      old alloc/op      new alloc/op      delta
Template       39.0MB ± 0%       38.8MB ± 0%  -0.55%  (p=0.000 n=20+20)
Unicode        29.1MB ± 0%       29.1MB ± 0%  -0.06%  (p=0.000 n=20+20)
GoTypes         116MB ± 0%        115MB ± 0%  -0.50%  (p=0.000 n=20+20)
Compiler        493MB ± 0%        491MB ± 0%  -0.46%  (p=0.000 n=19+20)
SSA            1.40GB ± 0%       1.40GB ± 0%  -0.31%  (p=0.000 n=19+20)
Flate          25.0MB ± 0%       24.9MB ± 0%  -0.60%  (p=0.000 n=19+19)
GoParser       30.9MB ± 0%       30.7MB ± 0%  -0.66%  (p=0.000 n=20+20)
Reflect        77.5MB ± 0%       77.1MB ± 0%  -0.52%  (p=0.000 n=20+20)
Tar            39.2MB ± 0%       39.0MB ± 0%  -0.47%  (p=0.000 n=20+20)
XML            44.8MB ± 0%       44.6MB ± 0%  -0.45%  (p=0.000 n=20+19)

name      old allocs/op     new allocs/op     delta
Template         382k ± 0%         379k ± 0%  -0.69%  (p=0.000 n=20+19)
Unicode          337k ± 0%         336k ± 0%  -0.09%  (p=0.000 n=20+20)
GoTypes         1.19M ± 0%        1.18M ± 0%  -0.64%  (p=0.000 n=20+20)
Compiler        4.60M ± 0%        4.58M ± 0%  -0.57%  (p=0.000 n=20+20)
SSA             11.5M ± 0%        11.4M ± 0%  -0.42%  (p=0.000 n=19+20)
Flate            235k ± 0%         233k ± 0%  -0.74%  (p=0.000 n=20+19)
GoParser         316k ± 0%         313k ± 0%  -0.69%  (p=0.000 n=20+20)
Reflect          953k ± 0%         946k ± 0%  -0.81%  (p=0.000 n=20+20)
Tar              391k ± 0%         388k ± 0%  -0.61%  (p=0.000 n=20+19)
XML              413k ± 0%         411k ± 0%  -0.56%  (p=0.000 n=20+20)

Change-Id: I7378174e3550b47df4368b24cf24c8ce1b85c906
Reviewed-on: https://go-review.googlesource.com/104656
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
2018-04-04 11:12:32 +00:00
Robert Griesemer 4637699e92 cmd/compile/internal/syntax: better error message for incorrect if/switch header
Fixes #23664.

Change-Id: Ic0637e9f896b2fc6502dfbab2d1c4de3c62c0bd2
Reviewed-on: https://go-review.googlesource.com/104616
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Robert Griesemer <gri@golang.org>
2018-04-03 21:57:37 +00:00
Robert Griesemer a818ddd972 cmd/compile/internal/syntax: update a couple of comments
Change-Id: Ie84d0e61697922c1e808d815fb7d9aec694ee8e9
Reviewed-on: https://go-review.googlesource.com/104615
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-04-03 21:57:14 +00:00
Giovanni Bajo ac43de3ae5 cmd/compile: in prove, complete support for OpIsInBounds/OpIsSliceInBounds
The logic in addBranchRestrictions didn't allow to correctly
model OpIs(Slice)Bound for signed domain, and it was also partly
implemented within addRestrictions.

Thanks to the previous changes, it is now possible to handle
the negative conditions correctly, so that we can learn
both signed/LT + unsigned/LT on the positive side, and
signed/GE + unsigned/GE on the negative side (but only if
the index can be proved to be non-negative).

This is able to prove ~50 more slice accesses in std+cmd.

Change-Id: I9858080dc03b16f85993a55983dbc4b00f8491b0
Reviewed-on: https://go-review.googlesource.com/104037
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-04-03 20:25:34 +00:00
Giovanni Bajo b846edfd59 cmd/compile: in prove, make addRestrictions more generic
addRestrictions was taking a branch parameter, binding its logic
to that of addBranchRestrictions. Since we will need to use it
for updating the facts table for induction variables, refactor it
to remove the branch parameter.

Passes toolstash -cmp.

Change-Id: Iaaec350a8becd1919d03d8574ffd1bbbd906d068
Reviewed-on: https://go-review.googlesource.com/104036
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-04-03 20:25:17 +00:00
Alberto Donizetti 00c8e149b6 cmd/compile: stack-allocate values worklist in schedule
Compiler instrumentation shows that the cap of the stores slice in the
storeOrder function is almost always 64 or less. Since the slice does
not escape, pre-allocating on the stack a 64-elements one greatly
reduces the number of allocations performed by the function.

name      old time/op       new time/op       delta
Template        289ms ± 5%        283ms ± 3%  -1.99%  (p=0.000 n=19+20)
Unicode         140ms ± 6%        136ms ± 6%  -2.61%  (p=0.021 n=19+20)
GoTypes         915ms ± 2%        895ms ± 2%  -2.24%  (p=0.000 n=19+20)
Compiler        4.15s ± 1%        4.04s ± 2%  -2.73%  (p=0.000 n=20+20)
SSA             10.0s ± 1%         9.8s ± 2%  -2.13%  (p=0.000 n=20+20)
Flate           189ms ± 6%        186ms ± 4%  -1.75%  (p=0.028 n=19+20)
GoParser        229ms ± 5%        224ms ± 4%  -2.25%  (p=0.001 n=20+19)
Reflect         584ms ± 2%        573ms ± 3%  -1.83%  (p=0.000 n=18+20)
Tar             265ms ± 3%        261ms ± 3%  -1.33%  (p=0.021 n=20+20)
XML             328ms ± 2%        321ms ± 2%  -2.11%  (p=0.000 n=20+20)

name      old user-time/op  new user-time/op  delta
Template        408ms ± 4%        400ms ± 4%  -1.98%  (p=0.006 n=19+20)
Unicode         216ms ± 9%        216ms ± 7%    ~     (p=0.883 n=20+20)
GoTypes         1.25s ± 1%        1.23s ± 3%  -1.32%  (p=0.002 n=19+20)
Compiler        5.77s ± 1%        5.69s ± 2%  -1.47%  (p=0.000 n=18+19)
SSA             14.6s ± 5%        14.1s ± 4%  -3.45%  (p=0.000 n=20+20)
Flate           252ms ± 7%        251ms ± 7%    ~     (p=0.659 n=20+20)
GoParser        314ms ± 5%        310ms ± 5%    ~     (p=0.165 n=20+20)
Reflect         780ms ± 2%        769ms ± 3%  -1.34%  (p=0.004 n=19+18)
Tar             365ms ± 7%        367ms ± 5%    ~     (p=0.841 n=20+20)
XML             439ms ± 4%        432ms ± 4%  -1.45%  (p=0.043 n=20+20)

name      old alloc/op      new alloc/op      delta
Template       38.9MB ± 0%       38.8MB ± 0%  -0.26%  (p=0.000 n=19+20)
Unicode        29.0MB ± 0%       29.0MB ± 0%  -0.02%  (p=0.001 n=20+19)
GoTypes         115MB ± 0%        115MB ± 0%  -0.31%  (p=0.000 n=20+20)
Compiler        492MB ± 0%        490MB ± 0%  -0.41%  (p=0.000 n=20+19)
SSA            1.40GB ± 0%       1.39GB ± 0%  -0.48%  (p=0.000 n=20+20)
Flate          24.9MB ± 0%       24.9MB ± 0%  -0.24%  (p=0.000 n=20+20)
GoParser       30.9MB ± 0%       30.8MB ± 0%  -0.39%  (p=0.000 n=20+20)
Reflect        77.1MB ± 0%       76.8MB ± 0%  -0.32%  (p=0.000 n=17+20)
Tar            39.1MB ± 0%       39.0MB ± 0%  -0.23%  (p=0.000 n=20+20)
XML            44.7MB ± 0%       44.6MB ± 0%  -0.30%  (p=0.000 n=20+18)

name      old allocs/op     new allocs/op     delta
Template         385k ± 0%         382k ± 0%  -0.99%  (p=0.000 n=20+19)
Unicode          336k ± 0%         336k ± 0%  -0.08%  (p=0.000 n=19+17)
GoTypes         1.20M ± 0%        1.18M ± 0%  -1.11%  (p=0.000 n=20+18)
Compiler        4.66M ± 0%        4.59M ± 0%  -1.42%  (p=0.000 n=19+20)
SSA             11.6M ± 0%        11.5M ± 0%  -1.49%  (p=0.000 n=20+20)
Flate            237k ± 0%         235k ± 0%  -1.00%  (p=0.000 n=20+19)
GoParser         319k ± 0%         315k ± 0%  -1.12%  (p=0.000 n=20+20)
Reflect          960k ± 0%         952k ± 0%  -0.92%  (p=0.000 n=18+20)
Tar              394k ± 0%         390k ± 0%  -0.87%  (p=0.000 n=20+20)
XML              418k ± 0%         413k ± 0%  -1.18%  (p=0.000 n=20+20)

Change-Id: I01b9f45b161379967d7a52e23f39ac30dd90edb0
Reviewed-on: https://go-review.googlesource.com/104415
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-04-03 17:45:28 +00:00
Michael Munday 32e6461dc6 cmd/asm, math: add s390x floating point test instructions
Floating point test instructions allow special cases (NaN, ±∞ and
a few other useful properties) to be checked directly.

This CL adds the following instructions to the assembler:
 * LTEBR - load and test (float32)
 * LTDBR - load and test (float64)
 * TCEB  - test data class (float32)
 * TCDB  - test data class (float64)

Note that I have only added immediate versions of the 'test data
class' instructions for now as that's the only case I think the
compiler will use.

Change-Id: I3398aab2b3a758bf909bd158042234030c8af582
Reviewed-on: https://go-review.googlesource.com/104457
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-04-03 16:08:04 +00:00
Fangming.Fang ef9bdd11e8 cmd/asm: add essential instructions for AES-GCM on ARM64
This change adds VLD1, VST1, VPMULL{2}, VEXT, VRBIT, VUSHR and VSHL instructions
for supporting AES-GCM implementation later.

Fixes #24400

Change-Id: I556feb88067f195cbe25629ec2b7a817acc58709
Reviewed-on: https://go-review.googlesource.com/101095
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-04-03 15:36:31 +00:00
isharipo dcaf3fb134 cmd/compile: make DCE remove nodes after terminating if
This change makes compiler frontend dead code elimination of const expr if
statements introduced in https://golang.org/cl/38773 treat both
	if constCondTrue { ...; returnStmt } toBeRemoved...
	if constCondFalse { ...; } else { returnStmt } toBeRemoved...
identically to:
	if constCondTrue { ...; returnStmt } else { toBeRemoved... }

Where "constCondTrue" is a an expression that can be evaluated
to "true" during compile time.

The additional checks are only triggered for const expr
if conditions that evaluate to true.

name       old time/op       new time/op       delta
Template         431ms ± 2%        429ms ± 1%    ~     (p=0.491 n=8+6)
Unicode          198ms ± 4%        201ms ± 2%    ~     (p=0.234 n=7+6)
GoTypes          1.40s ± 1%        1.41s ± 2%    ~     (p=0.053 n=7+7)
Compiler         6.72s ± 2%        6.81s ± 1%  +1.35%  (p=0.011 n=7+7)
SSA              17.3s ± 1%        17.3s ± 2%    ~     (p=0.731 n=6+7)
Flate            275ms ± 2%        275ms ± 2%    ~     (p=0.902 n=7+7)
GoParser         340ms ± 2%        339ms ± 2%    ~     (p=0.902 n=7+7)
Reflect          910ms ± 2%        905ms ± 1%    ~     (p=0.310 n=6+6)
Tar              403ms ± 1%        403ms ± 2%    ~     (p=0.366 n=7+6)
XML              486ms ± 1%        490ms ± 1%    ~     (p=0.065 n=6+6)
StdCmd           56.2s ± 1%        56.6s ± 2%    ~     (p=0.620 n=7+7)

name       old user-time/op  new user-time/op  delta
Template         559ms ± 8%        557ms ± 7%    ~     (p=0.713 n=8+7)
Unicode          266ms ±13%        277ms ± 9%    ~     (p=0.157 n=8+7)
GoTypes          1.83s ± 2%        1.84s ± 1%    ~     (p=0.522 n=8+7)
Compiler         8.67s ± 4%        8.89s ± 4%    ~     (p=0.077 n=7+7)
SSA              23.9s ± 1%        24.2s ± 1%  +1.31%  (p=0.005 n=7+7)
Flate            351ms ± 4%        342ms ± 5%    ~     (p=0.105 n=7+7)
GoParser         437ms ± 2%        423ms ± 5%  -3.14%  (p=0.016 n=7+7)
Reflect          1.16s ± 3%        1.15s ± 2%    ~     (p=0.362 n=7+7)
Tar              517ms ± 4%        511ms ± 3%    ~     (p=0.538 n=7+7)
XML              619ms ± 3%        617ms ± 4%    ~     (p=0.483 n=7+7)

Fixes #23521

Change-Id: I165a7827d869aeb93ce6047d026ff873d039a4f3
Reviewed-on: https://go-review.googlesource.com/91056
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2018-04-03 15:19:41 +00:00