Commit Graph

606 Commits

Author SHA1 Message Date
Lynn Boger 6910e1085b cmd/internal/obj/ppc64: use MOVDU to update stack reg for leaf functions where possible
When the stack register is decremented to acquire stack space at
the beginning of a function, a MOVDU should be used so it is done
atomically, unless the size of the stack frame is too large for
that instruction.  The code to determine whether to use MOVDU
or MOVD was checking if the function was a leaf and always generating MOVD
when it was.  The choice of MOVD vs. MOVDU should only depend on the stack
frame size.  This fixes that problem.

Change-Id: I0e49c79036f1e8f7584179e1442b938fc6da085f
Reviewed-on: https://go-review.googlesource.com/41813
Reviewed-by: Michael Munday <munday@ca.ibm.com>
2017-04-26 17:39:33 +00:00
Fangming.Fang aecf73fc31 cmd/internal: fix bug getting wrong indicator in DRconv()
Change-Id: I251ae497b0ab237d4b3fe98e397052394142d437
Reviewed-on: https://go-review.googlesource.com/41653
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-26 14:09:18 +00:00
Damien Lespiau 4e5593ddaa cmd/internal/obj/x86: port the doasm comment to go
This comment is very useful but still refers to the C implementation.
Adapting it for Go is fairly straightforward though.

Change-Id: Ib6dde25f3a18acbce76bb3cffdc29f5ccf43c1f7
Reviewed-on: https://go-review.googlesource.com/41696
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-04-25 15:54:19 +00:00
Michael Munday db6f3bbc9a cmd: fix the order that s390x operands are printed in
The assembler reordered the operands of some instructions to put the
first operand into From3. Unfortunately this meant that when the
instructions were printed the operands were in a different order than
the assembler would expect as input. For example, 'MVC $8, (R1), (R2)'
would be printed as 'MVC (R1), $8, (R2)'.

Originally this was done to ensure that From contained the source
memory operand. The current compiler no longer requires this and so
this CL simply makes all instructions use the standard order for
operands: From, Reg, From3 and finally To.

Fixes #18295

Change-Id: Ib2b5ec29c647ca7a995eb03dc78f82d99618b092
Reviewed-on: https://go-review.googlesource.com/40299
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-04-25 15:16:56 +00:00
Ben Shi a041806335 cmd/internal/obj/arm: use new form of MOVW introduced in ARMv7
As discussion in issue #18293, "MOVW $Imm-16, Reg" was introduced in
ARMv7. It directly encoded the 16-bit immediate into the instruction
instead of put it in the constant pool.

This patch makes the arm assembler choose this form of MOVW if available.

Besides 4 bytes are saved in the constant pool, the go1 benchmark test
also shows a slight improvement.

name                     old time/op    new time/op    delta
BinaryTree17-4              42.7s ± 1%     42.7s ± 1%    ~     (p=0.304 n=50+50)
Fannkuch11-4                24.8s ± 1%     24.8s ± 0%    ~     (p=0.757 n=50+49)
FmtFprintfEmpty-4           875ns ± 1%     873ns ± 2%    ~     (p=0.066 n=44+46)
FmtFprintfString-4         1.43µs ± 1%    1.45µs ± 1%  +1.68%  (p=0.000 n=44+44)
FmtFprintfInt-4            1.52µs ± 1%    1.52µs ± 1%  +0.26%  (p=0.009 n=41+45)
FmtFprintfIntInt-4         2.19µs ± 1%    2.20µs ± 1%  +0.76%  (p=0.000 n=43+46)
FmtFprintfPrefixedInt-4    2.56µs ± 2%    2.53µs ± 1%  -1.03%  (p=0.000 n=45+44)
FmtFprintfFloat-4          4.41µs ± 1%    4.39µs ± 1%  -0.52%  (p=0.000 n=44+44)
FmtManyArgs-4              9.02µs ± 2%    9.04µs ± 1%  +0.27%  (p=0.000 n=46+44)
GobDecode-4                 106ms ± 1%     106ms ± 1%    ~     (p=0.310 n=45+43)
GobEncode-4                88.1ms ± 2%    88.0ms ± 2%    ~     (p=0.648 n=49+50)
Gzip-4                      4.31s ± 1%     4.27s ± 1%  -1.01%  (p=0.000 n=50+50)
Gunzip-4                    618ms ± 1%     608ms ± 1%  -1.65%  (p=0.000 n=45+47)
HTTPClientServer-4          689µs ± 6%     692µs ± 4%  +0.52%  (p=0.038 n=50+47)
JSONEncode-4                282ms ± 2%     280ms ± 1%  -0.75%  (p=0.000 n=46+43)
JSONDecode-4                945ms ± 2%     940ms ± 1%  -0.47%  (p=0.000 n=47+47)
Mandelbrot200-4            49.4ms ± 1%    49.3ms ± 1%    ~     (p=0.163 n=45+45)
GoParse-4                  46.0ms ± 3%    45.5ms ± 2%  -0.95%  (p=0.000 n=49+40)
RegexpMatchEasy0_32-4      1.29µs ± 1%    1.28µs ± 1%  -0.14%  (p=0.005 n=38+45)
RegexpMatchEasy0_1K-4      7.92µs ± 8%    7.75µs ± 6%  -2.12%  (p=0.000 n=47+50)
RegexpMatchEasy1_32-4      1.31µs ± 1%    1.31µs ± 0%    ~     (p=0.282 n=45+48)
RegexpMatchEasy1_1K-4      10.4µs ± 5%    10.4µs ± 3%    ~     (p=0.771 n=50+49)
RegexpMatchMedium_32-4     2.06µs ± 1%    2.07µs ± 1%  +0.35%  (p=0.001 n=44+49)
RegexpMatchMedium_1K-4      533µs ± 1%     532µs ± 1%    ~     (p=0.710 n=43+47)
RegexpMatchHard_32-4       29.7µs ± 1%    29.6µs ± 1%  -0.34%  (p=0.002 n=43+46)
RegexpMatchHard_1K-4        893µs ± 2%     885µs ± 1%  -0.85%  (p=0.000 n=50+45)
Revcomp-4                  85.6ms ± 4%    85.5ms ± 2%    ~     (p=0.683 n=50+50)
Template-4                  1.05s ± 3%     1.04s ± 1%  -1.06%  (p=0.000 n=50+44)
TimeParse-4                7.19µs ± 2%    7.11µs ± 2%  -1.10%  (p=0.000 n=48+46)
TimeFormat-4               13.4µs ± 1%    13.5µs ± 1%    ~     (p=0.056 n=46+49)
[Geo mean]                  747µs          745µs       -0.28%

name                     old speed      new speed      delta
GobDecode-4              7.23MB/s ± 1%  7.22MB/s ± 1%    ~     (p=0.062 n=45+39)
GobEncode-4              8.71MB/s ± 2%  8.72MB/s ± 2%    ~     (p=0.656 n=49+50)
Gzip-4                   4.50MB/s ± 1%  4.55MB/s ± 1%  +1.03%  (p=0.000 n=50+50)
Gunzip-4                 31.4MB/s ± 1%  31.9MB/s ± 1%  +1.67%  (p=0.000 n=45+47)
JSONEncode-4             6.89MB/s ± 2%  6.94MB/s ± 1%  +0.76%  (p=0.000 n=46+43)
JSONDecode-4             2.05MB/s ± 2%  2.06MB/s ± 2%  +0.32%  (p=0.017 n=47+50)
GoParse-4                1.26MB/s ± 3%  1.27MB/s ± 1%  +0.68%  (p=0.000 n=50+48)
RegexpMatchEasy0_32-4    24.9MB/s ± 1%  24.9MB/s ± 1%  +0.13%  (p=0.004 n=38+45)
RegexpMatchEasy0_1K-4     129MB/s ± 7%   132MB/s ± 6%  +2.34%  (p=0.000 n=46+50)
RegexpMatchEasy1_32-4    24.5MB/s ± 1%  24.4MB/s ± 1%    ~     (p=0.252 n=45+48)
RegexpMatchEasy1_1K-4    98.8MB/s ± 4%  98.7MB/s ± 3%    ~     (p=0.771 n=50+49)
RegexpMatchMedium_32-4    485kB/s ± 3%   480kB/s ± 0%  -0.95%  (p=0.000 n=50+38)
RegexpMatchMedium_1K-4   1.92MB/s ± 1%  1.92MB/s ± 1%    ~     (p=0.129 n=43+47)
RegexpMatchHard_32-4     1.08MB/s ± 2%  1.08MB/s ± 1%  +0.38%  (p=0.017 n=46+46)
RegexpMatchHard_1K-4     1.15MB/s ± 2%  1.16MB/s ± 1%  +0.67%  (p=0.001 n=50+49)
Revcomp-4                29.7MB/s ± 4%  29.7MB/s ± 2%    ~     (p=0.682 n=50+50)
Template-4               1.85MB/s ± 3%  1.87MB/s ± 1%  +1.04%  (p=0.000 n=50+44)
[Geo mean]               6.56MB/s       6.60MB/s       +0.47%


Change-Id: Ic2cca90133c27a08d9f1a23c65b0eed5fbd02684
Reviewed-on: https://go-review.googlesource.com/41190
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-04-25 12:02:09 +00:00
Josh Bleecher Snyder 405a280d01 cmd/internal/obj: eliminate LSym.Version
There were only two versions, 0 and 1,
and the only user of version 1 was the assembler,
to indicate that a symbol was static.

Rename LSym.Version to Static,
and add it to LSym.Attributes.
Simplify call-sites.

Passes toolstash-check.

Change-Id: Iabd39918f5019cce78f381d13f0481ae09f3871f
Reviewed-on: https://go-review.googlesource.com/41201
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-04-20 21:56:23 +00:00
Ilya Tocar 7f9832254c cmd/internal/obj/x86: fix relocation offset for VEX encoded instructions
VEX encoded instructions don't have a REX byte, so for PC relative
addressing we don't need to recalculate relocation offset.

Fixes #19518

Change-Id: Icf5414962de4350d76fd220817498337f90614fc
Reviewed-on: https://go-review.googlesource.com/38138
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-04-20 18:23:31 +00:00
Lynn Boger 9248ff46a8 cmd/compile: add rotates to PPC64.rules
This updates PPC64.rules to include rules to generate rotates
for ADD, OR, XOR operators that combine two opposite shifts
that sum to 32 or 64.

To support this change opcodes for ROTL and ROTLW were added to
be used like the rotldi and rotlwi extended mnemonics.

This provides the following improvement in sha3:

BenchmarkPermutationFunction-8     302.83       376.40       1.24x
BenchmarkSha3_512_MTU-8            98.64        121.92       1.24x
BenchmarkSha3_384_MTU-8            136.80       168.30       1.23x
BenchmarkSha3_256_MTU-8            169.21       211.29       1.25x
BenchmarkSha3_224_MTU-8            179.76       221.19       1.23x
BenchmarkShake128_MTU-8            212.87       263.23       1.24x
BenchmarkShake256_MTU-8            196.62       245.60       1.25x
BenchmarkShake256_16x-8            163.57       194.37       1.19x
BenchmarkShake256_1MiB-8           199.02       248.74       1.25x
BenchmarkSha3_512_1MiB-8           106.55       133.13       1.25x

Fixes #20030

Change-Id: I484c56f48395d32f53ff3ecb3ac6cb8191cfee44
Reviewed-on: https://go-review.googlesource.com/40992
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-20 18:05:22 +00:00
Josh Bleecher Snyder 6e97c71cb7 cmd/internal/obj: split Link.hash into version 0 and 1
Though LSym.Version is an int, it can only have the value 0 or 1.
Using that, split Link.hash into two maps, one for version 0
(which is far more common) and one for version 1.
This lets use just the name for lookups,
which is both faster and more compact.
This matters because Link.hash map lookups are frequent,
and will be contended once the backend is concurrent.

name        old time/op       new time/op       delta
Template          194ms ± 3%        192ms ± 5%  -1.46%  (p=0.000 n=47+49)
Unicode          84.5ms ± 3%       83.8ms ± 3%  -0.81%  (p=0.011 n=50+49)
GoTypes           543ms ± 2%        545ms ± 4%    ~     (p=0.566 n=46+49)
Compiler          2.48s ± 2%        2.48s ± 3%    ~     (p=0.706 n=47+50)
SSA               5.94s ± 3%        5.98s ± 2%  +0.55%  (p=0.040 n=49+50)
Flate             119ms ± 6%        119ms ± 4%    ~     (p=0.681 n=48+47)
GoParser          145ms ± 4%        145ms ± 3%    ~     (p=0.662 n=47+49)
Reflect           348ms ± 3%        344ms ± 3%  -1.17%  (p=0.000 n=47+47)
Tar               105ms ± 4%        104ms ± 3%    ~     (p=0.155 n=50+47)
XML               197ms ± 2%        197ms ± 3%    ~     (p=0.666 n=49+49)
[Geo mean]        332ms             331ms       -0.37%

name        old user-time/op  new user-time/op  delta
Template          230ms ±10%        226ms ±10%  -1.85%  (p=0.041 n=50+50)
Unicode           104ms ± 6%        103ms ± 5%    ~     (p=0.076 n=49+49)
GoTypes           707ms ± 4%        705ms ± 5%    ~     (p=0.521 n=50+50)
Compiler          3.30s ± 3%        3.33s ± 4%  +0.76%  (p=0.003 n=50+49)
SSA               8.17s ± 4%        8.23s ± 3%  +0.66%  (p=0.030 n=50+49)
Flate             139ms ± 6%        138ms ± 8%    ~     (p=0.184 n=49+48)
GoParser          174ms ± 5%        172ms ± 6%    ~     (p=0.107 n=48+49)
Reflect           431ms ± 8%        420ms ± 5%  -2.57%  (p=0.000 n=50+46)
Tar               119ms ± 6%        118ms ± 7%  -0.95%  (p=0.033 n=50+49)
XML               236ms ± 4%        236ms ± 4%    ~     (p=0.935 n=50+48)
[Geo mean]        410ms             407ms       -0.67%

name        old alloc/op      new alloc/op      delta
Template         38.7MB ± 0%       38.6MB ± 0%  -0.29%  (p=0.008 n=5+5)
Unicode          29.8MB ± 0%       29.7MB ± 0%  -0.24%  (p=0.008 n=5+5)
GoTypes           113MB ± 0%        113MB ± 0%  -0.29%  (p=0.008 n=5+5)
Compiler          462MB ± 0%        462MB ± 0%  -0.12%  (p=0.008 n=5+5)
SSA              1.27GB ± 0%       1.27GB ± 0%  -0.05%  (p=0.008 n=5+5)
Flate            25.2MB ± 0%       25.1MB ± 0%  -0.37%  (p=0.008 n=5+5)
GoParser         31.7MB ± 0%       31.6MB ± 0%    ~     (p=0.056 n=5+5)
Reflect          77.5MB ± 0%       77.2MB ± 0%  -0.38%  (p=0.008 n=5+5)
Tar              26.4MB ± 0%       26.3MB ± 0%    ~     (p=0.151 n=5+5)
XML              41.9MB ± 0%       41.9MB ± 0%  -0.20%  (p=0.032 n=5+5)
[Geo mean]       74.5MB            74.3MB       -0.23%

name        old allocs/op     new allocs/op     delta
Template           378k ± 1%         377k ± 1%    ~     (p=0.690 n=5+5)
Unicode            321k ± 0%         322k ± 0%    ~     (p=0.595 n=5+5)
GoTypes           1.14M ± 0%        1.14M ± 0%    ~     (p=0.310 n=5+5)
Compiler          4.25M ± 0%        4.25M ± 0%    ~     (p=0.151 n=5+5)
SSA               9.84M ± 0%        9.84M ± 0%    ~     (p=0.841 n=5+5)
Flate              232k ± 1%         232k ± 0%    ~     (p=0.690 n=5+5)
GoParser           315k ± 1%         315k ± 1%    ~     (p=0.841 n=5+5)
Reflect            970k ± 0%         970k ± 0%    ~     (p=0.841 n=5+5)
Tar                248k ± 0%         248k ± 1%    ~     (p=0.841 n=5+5)
XML                389k ± 0%         389k ± 0%    ~     (p=1.000 n=5+5)
[Geo mean]         724k              724k       +0.01%

Updates #15756

Change-Id: I2646332e89f0444ca9d5a41d7172537d904ed636
Reviewed-on: https://go-review.googlesource.com/41050
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-04-20 13:07:00 +00:00
Dave Cheney 1db0aae370 cmd/internal/obj: remove unused obj.Linksymfmt
obj.Linksymfmt is no longer referenced by any packages in cmd/...

Change-Id: Id4d9213d1577e13580b60755dbf7da313b17cb0e
Reviewed-on: https://go-review.googlesource.com/41171
Run-TryBot: Dave Cheney <dave@cheney.net>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-19 23:51:31 +00:00
Michael Munday 6b0bd51c1c cmd/internal/obj/s390x: delete unused REGZERO constant
When we switched to SSA R0 was made allocatable and no longer holds
zero on s390x.

Change-Id: I1c752bb02da35462a535492379345fa9f4e12cb0
Reviewed-on: https://go-review.googlesource.com/41079
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-19 19:27:39 +00:00
Matthew Dempsky 1e3570ac86 cmd/internal/objabi: extract shared functionality from obj
Now only cmd/asm and cmd/compile depend on cmd/internal/obj. Changing
the assembler backends no longer requires reinstalling cmd/link or
cmd/addr2line.

There's also now one canonical definition of the object file format in
cmd/internal/objabi/doc.go, with a warning to update all three
implementations.

objabi is still something of a grab bag of unrelated code (e.g., flag
and environment variable handling probably belong in a separate "tool"
package), but this is still progress.

Fixes #15165.
Fixes #20026.

Change-Id: Ic4b92fac7d0d35438e0d20c9579aad4085c5534c
Reviewed-on: https://go-review.googlesource.com/40972
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-04-19 00:00:09 +00:00
Matthew Dempsky 1747078695 cmd/internal/obj: un-embed FuncInfo field in LSym
Automated refactoring using github.com/mdempsky/unbed (to rewrite
s.Foo to s.FuncInfo.Foo) and then gorename (to rename the FuncInfo
field to just Func).

Passes toolstash-check -all.

Change-Id: I802c07a1239e0efea058a91a87c5efe12170083a
Reviewed-on: https://go-review.googlesource.com/40670
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-04-18 17:29:50 +00:00
Josh Bleecher Snyder 9d4a84677b cmd/internal/obj: unexport Link.Hash
A prior CL eliminated the last reference to Ctxt.Hash
from the compiler.

Change-Id: Ic97ff84ed1a14e0c93fb0e8ec0b2617c3397c0e8
Reviewed-on: https://go-review.googlesource.com/40699
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-04-18 02:14:02 +00:00
Josh Bleecher Snyder da15fe6870 cmd/internal/obj: rework gclocals handling
The compiler handled gcargs and gclocals LSyms unusually.
It generated placeholder symbols (makefuncdatasym),
filled them in, and then renamed them for content-addressability.
This is an important binary size optimization;
the same locals information occurs over and over.

This CL continues to treat these LSyms unusually,
but in a slightly more explicit way,
and importantly for concurrent compilation,
in a way that does not require concurrent
modification of Ctxt.Hash.

Instead of creating gcargs and gclocals in the usual way,
by creating a types.Sym and then an obj.LSym,
we add them directly to obj.FuncInfo,
initialize them in obj.InitTextSym,
and deduplicate and add them to ctxt.Data at the end.
Then the backend's job is simply to fill them in
and rename them appropriately.

Updates #15756

name       old alloc/op      new alloc/op      delta
Template        38.8MB ± 0%       38.7MB ± 0%  -0.22%  (p=0.016 n=5+5)
Unicode         29.8MB ± 0%       29.8MB ± 0%    ~     (p=0.690 n=5+5)
GoTypes          113MB ± 0%        113MB ± 0%  -0.24%  (p=0.008 n=5+5)
SSA             1.25GB ± 0%       1.24GB ± 0%  -0.39%  (p=0.008 n=5+5)
Flate           25.3MB ± 0%       25.2MB ± 0%  -0.43%  (p=0.008 n=5+5)
GoParser        31.7MB ± 0%       31.7MB ± 0%  -0.22%  (p=0.008 n=5+5)
Reflect         78.2MB ± 0%       77.6MB ± 0%  -0.80%  (p=0.008 n=5+5)
Tar             26.6MB ± 0%       26.3MB ± 0%  -0.85%  (p=0.008 n=5+5)
XML             42.4MB ± 0%       41.9MB ± 0%  -1.04%  (p=0.008 n=5+5)

name       old allocs/op     new allocs/op     delta
Template          378k ± 0%         377k ± 1%    ~     (p=0.151 n=5+5)
Unicode           321k ± 1%         321k ± 0%    ~     (p=0.841 n=5+5)
GoTypes          1.14M ± 0%        1.14M ± 0%  -0.47%  (p=0.016 n=5+5)
SSA              9.71M ± 0%        9.67M ± 0%  -0.33%  (p=0.008 n=5+5)
Flate             233k ± 1%         232k ± 1%    ~     (p=0.151 n=5+5)
GoParser          316k ± 0%         315k ± 0%  -0.49%  (p=0.016 n=5+5)
Reflect           979k ± 0%         972k ± 0%  -0.75%  (p=0.008 n=5+5)
Tar               250k ± 0%         247k ± 1%  -0.92%  (p=0.008 n=5+5)
XML               392k ± 1%         389k ± 0%  -0.67%  (p=0.008 n=5+5)

Change-Id: Idc36186ca9d2f8214b5f7720bbc27b6bb22fdc48
Reviewed-on: https://go-review.googlesource.com/40697
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-04-18 02:13:32 +00:00
Lynn Boger 7d4cca07d2 cmd/asm: detect invalid DS form offsets for ppc64x
While debugging a recent regression it was discovered that
the assembler for ppc64x was not always generating the correct
instruction for DS form loads and stores.  When an instruction
is DS form then the offset must be a multiple of 4, and if it
isn't then bits outside the offset field were being incorrectly
set resulting in unexpected and incorrect instructions.

This change adds a check to determine when the opcode is DS form
and then verifies that the offset is a multiple of 4 before
generating the instruction, otherwise logs an error.

This also changes a few asm files that were using unaligned offsets
for DS form loads and stores.  In the runtime package these were
instructions intended to cause a crash so using aligned or unaligned
offsets doesn't change that behavior.

Change-Id: Ie3a7e1e65dcc9933b54de7a46a054da8459cb56f
Reviewed-on: https://go-review.googlesource.com/40476
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
2017-04-17 21:24:35 +00:00
Michael Munday eed6938cbb cmd/asm, cmd/internal/obj/s390x, math: add LGDR and LDGR instructions
The instructions allow moves between floating point and general
purpose registers without any conversion taking place.

Change-Id: I82c6f3ad9c841a83783b5be80dcf5cd538ff49e6
Reviewed-on: https://go-review.googlesource.com/38777
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-04-17 16:33:51 +00:00
Josh Bleecher Snyder 89a355c930 cmd/internal/obj: unify Setuintxx and WriteInt
They do basically the same work.

Setuintxx was only used in a single place,
so eliminate it in favor of WriteInt.

duintxxLSym's alignment rounding was not used in practice;
change it into alignment assertion.

Passes toolstash-check. No compiler performance changes.

Change-Id: I0f7410cf2ccffbdc02ad796eaf973ee6a83074f8
Reviewed-on: https://go-review.googlesource.com/40863
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-17 03:15:02 +00:00
Josh Bleecher Snyder bb215de8a6 cmd/internal/obj: pretty-print LSym.Type when debugging
We have a stringer for LSym.Type. Use it.

Before:

	"".algarray t=31 size=224

After:

	"".algarray SBSS size=224

Change-Id: Ib4c7d2bc1dbe9943cf2a5dfa5d9f2d7fbd50b7f2
Reviewed-on: https://go-review.googlesource.com/40862
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-04-17 03:12:49 +00:00
Josh Bleecher Snyder 7b5f94e76c cmd/internal/obj: cache dwarfSym
Follow-up to review feedback from
mdempsky on CL 40507.

Reduces mutex contention by about 1%.

Change-Id: I540ea6772925f4a59e58f55a3458eff15880c328
Reviewed-on: https://go-review.googlesource.com/40575
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-04-13 15:18:54 +00:00
Josh Bleecher Snyder 4e4e51c5c5 cmd/internal/obj: generate function DWARF symbols early
This removes a concurrent access of ctxt.Data.

Updates #15756

Change-Id: Id017e90e47e093cd8825907f3853bb3d3bf8280d
Reviewed-on: https://go-review.googlesource.com/40507
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-04-13 14:38:01 +00:00
Wei Xiao ab636b899c hash/crc32: optimize arm64 crc32 implementation
ARMv8 defines crc32 instruction.

Comparing to the original crc32 calculation, this patch makes use of
crc32 instructions to do crc32 calculation instead of the multiple
lookup table algorithms.

ARMv8 provides IEEE and Castagnoli polynomials for crc32 calculation
so that the perfomance of these two types of crc32 get significant
improved.

name                                        old time/op   new time/op    delta
CRC32/poly=IEEE/size=15/align=0-32            117ns ± 0%      38ns ± 0%   -67.44%
CRC32/poly=IEEE/size=15/align=1-32            117ns ± 0%      38ns ± 0%   -67.52%
CRC32/poly=IEEE/size=40/align=0-32            129ns ± 0%      41ns ± 0%   -68.37%
CRC32/poly=IEEE/size=40/align=1-32            129ns ± 0%      41ns ± 0%   -68.29%
CRC32/poly=IEEE/size=512/align=0-32           828ns ± 0%     246ns ± 0%   -70.29%
CRC32/poly=IEEE/size=512/align=1-32           828ns ± 0%     132ns ± 0%   -84.06%
CRC32/poly=IEEE/size=1kB/align=0-32          1.58µs ± 0%    0.46µs ± 0%   -70.98%
CRC32/poly=IEEE/size=1kB/align=1-32          1.58µs ± 0%    0.46µs ± 0%   -70.92%
CRC32/poly=IEEE/size=4kB/align=0-32          6.06µs ± 0%    1.74µs ± 0%   -71.27%
CRC32/poly=IEEE/size=4kB/align=1-32          6.10µs ± 0%    1.74µs ± 0%   -71.44%
CRC32/poly=IEEE/size=32kB/align=0-32         48.3µs ± 0%    13.7µs ± 0%   -71.61%
CRC32/poly=IEEE/size=32kB/align=1-32         48.3µs ± 0%    13.7µs ± 0%   -71.60%
CRC32/poly=Castagnoli/size=15/align=0-32      116ns ± 0%      38ns ± 0%   -67.07%
CRC32/poly=Castagnoli/size=15/align=1-32      116ns ± 0%      38ns ± 0%   -66.90%
CRC32/poly=Castagnoli/size=40/align=0-32      127ns ± 0%      40ns ± 0%   -68.11%
CRC32/poly=Castagnoli/size=40/align=1-32      127ns ± 0%      40ns ± 0%   -68.11%
CRC32/poly=Castagnoli/size=512/align=0-32     828ns ± 0%     132ns ± 0%   -84.06%
CRC32/poly=Castagnoli/size=512/align=1-32     827ns ± 0%     132ns ± 0%   -84.04%
CRC32/poly=Castagnoli/size=1kB/align=0-32    1.59µs ± 0%    0.22µs ± 0%   -85.89%
CRC32/poly=Castagnoli/size=1kB/align=1-32    1.58µs ± 0%    0.22µs ± 0%   -85.79%
CRC32/poly=Castagnoli/size=4kB/align=0-32    6.14µs ± 0%    0.77µs ± 0%   -87.40%
CRC32/poly=Castagnoli/size=4kB/align=1-32    6.06µs ± 0%    0.77µs ± 0%   -87.25%
CRC32/poly=Castagnoli/size=32kB/align=0-32   48.3µs ± 0%     5.9µs ± 0%   -87.71%
CRC32/poly=Castagnoli/size=32kB/align=1-32   48.4µs ± 0%     6.0µs ± 0%   -87.69%
CRC32/poly=Koopman/size=15/align=0-32         104ns ± 0%     104ns ± 0%    +0.00%
CRC32/poly=Koopman/size=15/align=1-32         104ns ± 0%     104ns ± 0%    +0.00%
CRC32/poly=Koopman/size=40/align=0-32         235ns ± 0%     235ns ± 0%    +0.00%
CRC32/poly=Koopman/size=40/align=1-32         235ns ± 0%     235ns ± 0%    +0.00%
CRC32/poly=Koopman/size=512/align=0-32       2.71µs ± 0%    2.71µs ± 0%    -0.07%
CRC32/poly=Koopman/size=512/align=1-32       2.71µs ± 0%    2.71µs ± 0%    -0.04%
CRC32/poly=Koopman/size=1kB/align=0-32       5.40µs ± 0%    5.39µs ± 0%    -0.06%
CRC32/poly=Koopman/size=1kB/align=1-32       5.40µs ± 0%    5.40µs ± 0%    +0.02%
CRC32/poly=Koopman/size=4kB/align=0-32       21.5µs ± 0%    21.5µs ± 0%    -0.16%
CRC32/poly=Koopman/size=4kB/align=1-32       21.5µs ± 0%    21.5µs ± 0%    -0.05%
CRC32/poly=Koopman/size=32kB/align=0-32       172µs ± 0%     172µs ± 0%    -0.07%
CRC32/poly=Koopman/size=32kB/align=1-32       172µs ± 0%     172µs ± 0%    -0.01%

name                                        old speed     new speed      delta
CRC32/poly=IEEE/size=15/align=0-32          128MB/s ± 0%   394MB/s ± 0%  +207.95%
CRC32/poly=IEEE/size=15/align=1-32          128MB/s ± 0%   394MB/s ± 0%  +208.09%
CRC32/poly=IEEE/size=40/align=0-32          310MB/s ± 0%   979MB/s ± 0%  +216.07%
CRC32/poly=IEEE/size=40/align=1-32          310MB/s ± 0%   979MB/s ± 0%  +216.16%
CRC32/poly=IEEE/size=512/align=0-32         618MB/s ± 0%  2074MB/s ± 0%  +235.72%
CRC32/poly=IEEE/size=512/align=1-32         618MB/s ± 0%  3852MB/s ± 0%  +523.55%
CRC32/poly=IEEE/size=1kB/align=0-32         646MB/s ± 0%  2225MB/s ± 0%  +244.57%
CRC32/poly=IEEE/size=1kB/align=1-32         647MB/s ± 0%  2225MB/s ± 0%  +243.87%
CRC32/poly=IEEE/size=4kB/align=0-32         676MB/s ± 0%  2352MB/s ± 0%  +248.02%
CRC32/poly=IEEE/size=4kB/align=1-32         672MB/s ± 0%  2352MB/s ± 0%  +250.15%
CRC32/poly=IEEE/size=32kB/align=0-32        678MB/s ± 0%  2387MB/s ± 0%  +252.17%
CRC32/poly=IEEE/size=32kB/align=1-32        678MB/s ± 0%  2388MB/s ± 0%  +252.11%
CRC32/poly=Castagnoli/size=15/align=0-32    129MB/s ± 0%   393MB/s ± 0%  +205.51%
CRC32/poly=Castagnoli/size=15/align=1-32    129MB/s ± 0%   390MB/s ± 0%  +203.41%
CRC32/poly=Castagnoli/size=40/align=0-32    314MB/s ± 0%   988MB/s ± 0%  +215.04%
CRC32/poly=Castagnoli/size=40/align=1-32    314MB/s ± 0%   987MB/s ± 0%  +214.68%
CRC32/poly=Castagnoli/size=512/align=0-32   618MB/s ± 0%  3860MB/s ± 0%  +524.32%
CRC32/poly=Castagnoli/size=512/align=1-32   619MB/s ± 0%  3859MB/s ± 0%  +523.66%
CRC32/poly=Castagnoli/size=1kB/align=0-32   645MB/s ± 0%  4568MB/s ± 0%  +608.56%
CRC32/poly=Castagnoli/size=1kB/align=1-32   650MB/s ± 0%  4567MB/s ± 0%  +602.94%
CRC32/poly=Castagnoli/size=4kB/align=0-32   667MB/s ± 0%  5297MB/s ± 0%  +693.81%
CRC32/poly=Castagnoli/size=4kB/align=1-32   676MB/s ± 0%  5297MB/s ± 0%  +684.00%
CRC32/poly=Castagnoli/size=32kB/align=0-32  678MB/s ± 0%  5519MB/s ± 0%  +713.83%
CRC32/poly=Castagnoli/size=32kB/align=1-32  677MB/s ± 0%  5497MB/s ± 0%  +712.04%
CRC32/poly=Koopman/size=15/align=0-32       143MB/s ± 0%   144MB/s ± 0%    +0.27%
CRC32/poly=Koopman/size=15/align=1-32       143MB/s ± 0%   144MB/s ± 0%    +0.33%
CRC32/poly=Koopman/size=40/align=0-32       169MB/s ± 0%   170MB/s ± 0%    +0.12%
CRC32/poly=Koopman/size=40/align=1-32       170MB/s ± 0%   170MB/s ± 0%    +0.08%
CRC32/poly=Koopman/size=512/align=0-32      189MB/s ± 0%   189MB/s ± 0%    +0.07%
CRC32/poly=Koopman/size=512/align=1-32      189MB/s ± 0%   189MB/s ± 0%    +0.04%
CRC32/poly=Koopman/size=1kB/align=0-32      190MB/s ± 0%   190MB/s ± 0%    +0.05%
CRC32/poly=Koopman/size=1kB/align=1-32      190MB/s ± 0%   190MB/s ± 0%    -0.01%
CRC32/poly=Koopman/size=4kB/align=0-32      190MB/s ± 0%   190MB/s ± 0%    +0.15%
CRC32/poly=Koopman/size=4kB/align=1-32      190MB/s ± 0%   191MB/s ± 0%    +0.05%
CRC32/poly=Koopman/size=32kB/align=0-32     191MB/s ± 0%   191MB/s ± 0%    +0.06%
CRC32/poly=Koopman/size=32kB/align=1-32     191MB/s ± 0%   191MB/s ± 0%    +0.02%

Also fix a bug of arm64 assembler

The optimization is mainly contributed by Fangming.Fang <fangming.fang@arm.com>

Change-Id: I900678c2e445d7e8ad9e2a9ab3305d649230905f
Reviewed-on: https://go-review.googlesource.com/40074
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-13 12:44:10 +00:00
Josh Bleecher Snyder c18fd09840 cmd/internal/obj: build ctxt.Text during Sym init
Instead of constructing ctxt.Text in Flushplist,
which will be called concurrently,
do it in InitTextSym, which must be called serially.
This allows us to avoid a mutex for ctxt.Text,
and preserves the existing ordering of functions
for debug output.

Passes toolstash-check.

Updates #15756

Change-Id: I6322b4da24f9f0db7ba25e5b1b50e8d3be2deb37
Reviewed-on: https://go-review.googlesource.com/40502
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-04-13 02:29:05 +00:00
Josh Bleecher Snyder 3692925c5e cmd/compile: move Text.From.Sym initialization earlier
The initialization of an ATEXT Prog's From.Sym
can race with the assemblers in a concurrent compiler.
CL 40254 contains an initial, failed attempt to
fix that race.

This CL takes a different approach: Rather than
expose an API to initialize the Prog,
expose an API to initialize the Sym.

The initialization of the Sym can then be
moved earlier in the compiler, avoiding the race.

The growth of gc.Func has negligible
performance impact; see below.

Passes toolstash -cmp.

Updates #15756

name       old alloc/op      new alloc/op      delta
Template        38.8MB ± 0%       38.8MB ± 0%    ~     (p=0.968 n=9+10)
Unicode         29.8MB ± 0%       29.8MB ± 0%    ~     (p=0.684 n=10+10)
GoTypes          113MB ± 0%        113MB ± 0%    ~     (p=0.912 n=10+10)
SSA             1.25GB ± 0%       1.25GB ± 0%    ~     (p=0.481 n=10+10)
Flate           25.3MB ± 0%       25.3MB ± 0%    ~     (p=0.105 n=10+10)
GoParser        31.7MB ± 0%       31.8MB ± 0%  +0.09%  (p=0.016 n=8+10)
Reflect         78.3MB ± 0%       78.2MB ± 0%    ~     (p=0.190 n=10+10)
Tar             26.5MB ± 0%       26.6MB ± 0%  +0.13%  (p=0.011 n=10+10)
XML             42.4MB ± 0%       42.4MB ± 0%    ~     (p=0.971 n=10+10)

name       old allocs/op     new allocs/op     delta
Template          378k ± 1%         378k ± 0%    ~     (p=0.315 n=10+9)
Unicode           321k ± 1%         321k ± 0%    ~     (p=0.436 n=10+10)
GoTypes          1.14M ± 0%        1.14M ± 0%    ~     (p=0.079 n=10+9)
SSA              9.70M ± 0%        9.70M ± 0%  -0.04%  (p=0.035 n=10+10)
Flate             233k ± 1%         234k ± 1%    ~     (p=0.529 n=10+10)
GoParser          315k ± 0%         316k ± 0%    ~     (p=0.095 n=9+10)
Reflect           980k ± 0%         980k ± 0%    ~     (p=0.436 n=10+10)
Tar               249k ± 1%         250k ± 0%    ~     (p=0.280 n=10+10)
XML               391k ± 1%         391k ± 1%    ~     (p=0.481 n=10+10)

Change-Id: I3c93033dddd2e1df8cc54a106a6e615d27859e71
Reviewed-on: https://go-review.googlesource.com/40496
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-04-12 22:49:06 +00:00
Josh Bleecher Snyder ce3ee7cdae cmd/internal/obj: stop storing Text flags in From3
Prior to this CL, flags such as NOSPLIT
on ATEXT Progs were stored in From3.Offset.
Some but not all of those flags were also
duplicated into From.Sym.Attribute.

This CL migrates all of those flags into
From.Sym.Attribute and stops creating a From3.

A side-effect of this is that printing an
ATEXT Prog can no longer simply dump From3.Offset.
That's kind of good, since the raw flag value
wasn't very informative anyway, but it did
necessitate a bunch of updates to the cmd/asm tests.

The reason I'm doing this work now is that
avoiding storing flags in both From.Sym and From3.Offset
simplifies some other changes to fix the data
race first described in CL 40254.

This CL almost passes toolstash-check -all.
The only changes are in cases where the assembler
has decided that a function's flags may be altered,
e.g. to make a function with no calls in it NOSPLIT.
Prior to this CL, that information was not printed.

Sample before:

"".Ctz64 t=1 size=63 args=0x10 locals=0x0
	0x0000 00000 (/Users/josh/go/tip/src/runtime/internal/sys/intrinsics.go:35)	TEXT	"".Ctz64(SB), $0-16
	0x0000 00000 (/Users/josh/go/tip/src/runtime/internal/sys/intrinsics.go:35)	FUNCDATA	$0, gclocals·f207267fbf96a0178e8758c6e3e0ce28(SB)

Sample after:

"".Ctz64 t=1 nosplit size=63 args=0x10 locals=0x0
	0x0000 00000 (/Users/josh/go/tip/src/runtime/internal/sys/intrinsics.go:35)	TEXT	"".Ctz64(SB), NOSPLIT, $0-16
	0x0000 00000 (/Users/josh/go/tip/src/runtime/internal/sys/intrinsics.go:35)	FUNCDATA	$0, gclocals·f207267fbf96a0178e8758c6e3e0ce28(SB)

Observe the additional "nosplit" in the first line
and the additional "NOSPLIT" in the second line.

Updates #15756

Change-Id: I5c59bd8f3bdc7c780361f801d94a261f0aef3d13
Reviewed-on: https://go-review.googlesource.com/40495
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-12 21:53:39 +00:00
Josh Bleecher Snyder 49f4b5a4f5 cmd/internal/obj: remove Link.Plan9privates
Move it to the x86 package, matching our handling
of deferreturn in x86 and arm.
While we're here, improve the concurrency safety
of both Plan9privates and deferreturn
by eagerly initializing them in instinit.

Updates #15756

Change-Id: If3b1995c1e4ec816a5443a18f8d715631967a8b1
Reviewed-on: https://go-review.googlesource.com/40408
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-04-12 15:56:14 +00:00
Josh Bleecher Snyder f30de83d79 cmd/internal/obj: remove Link.Version
It is zeroed pointlessly and never read.

Change-Id: I65390501a878f545122ec558cb621b91e394a538
Reviewed-on: https://go-review.googlesource.com/40406
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-04-12 15:54:39 +00:00
Josh Bleecher Snyder 30ddffadd5 cmd/internal/obj: remove Link.Debugdivmod
It is only used once and never written to.
Switch to a local constant instead.

Change-Id: Icdd84e47b81f0de44ad9ed56ab5f4f91df22e6b6
Reviewed-on: https://go-review.googlesource.com/40405
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-04-12 14:38:15 +00:00
Josh Bleecher Snyder eacfa59220 cmd/internal/obj: remove dead Link fields
These are unused after CLs 39922, 40252, 40370, 40371, and 40372.

Change-Id: I76f9276c581067a8cb555de761550d960f6e39b8
Reviewed-on: https://go-review.googlesource.com/40404
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-04-12 14:38:06 +00:00
Josh Bleecher Snyder 826a09cd65 cmd/internal/obj: add SortSlice
sort.Slice was added in Go 1.8.
It's nice to use, and faster than sort.Sort,
so it'd be nice to be able to use it in the toolchain.
This CL adds obj.SortSlice, which is sort.Slice,
but with a slower fallback version for bootstrapping.

This CL also includes a single demo+test use.

Change-Id: I2accc60b61f8e48c8ab4f1a63473e3b87af9b691
Reviewed-on: https://go-review.googlesource.com/40114
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-04-11 20:29:04 +00:00
Josh Bleecher Snyder 2122fc6358 cmd/internal/obj/arm64: don't immediate dereference new prog
Noticed by Cherry while reviewing CL 40252.

The alternative to this is to place t on the stack, like

t := obj.Prog{Ctxt: ctxt}

However, there are only a couple of places where we
manually construct Progs, which is useful.

This isn't hot enough code to warrant
breaking abstraction layers to avoid an allocation.

Passes toolstash-check.

Change-Id: I46c79090b60641c90ee977b750ba5c708aca8ecf
Reviewed-on: https://go-review.googlesource.com/40373
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-11 15:02:39 +00:00
Josh Bleecher Snyder 1e69245418 cmd/internal/obj/s390x: make assembler almost concurrency-safe
CL 39922 made the arm assembler concurrency-safe.
This CL does the same, but for s390x.
The approach is similar: introduce ctxtz to hold
function-local state and thread it through
the assembler as necessary.

One race remains after this CL, similar to CL 40252.

That race is conceptually unrelated to this refactoring,
and will be addressed in a separate CL.

Passes toolstash-check -all.

Updates #15756

Change-Id: Iabf17aa242b70c0b078c2e85dae3d93a5e512372
Reviewed-on: https://go-review.googlesource.com/40371
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <munday@ca.ibm.com>
2017-04-11 15:02:28 +00:00
Josh Bleecher Snyder f95de5c679 cmd/internal/obj/ppc64: make assembler almost concurrency-safe
CL 39922 made the arm assembler concurrency-safe.
This CL does the same, but for ppc64.
The approach is similar: introduce ctxt9 to hold
function-local state and thread it through
the assembler as necessary.

One race remains after this CL, similar to CL 40252.

That race is conceptually unrelated to this refactoring,
and will be addressed in a separate CL.

Passes toolstash-check -all.

Updates #15756

Change-Id: Icc37d9a971bed2184c8e66b1a64f4f2e556dc207
Reviewed-on: https://go-review.googlesource.com/40372
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-04-11 14:53:08 +00:00
Josh Bleecher Snyder b5931020b7 cmd/internal/obj/arm64: make assembler almost concurrency-safe
CL 39922 made the arm assembler concurrency-safe.
This CL does the same, but for arm64.
The approach is similar: introduce ctxt7 to hold
function-local state and thread it through
the assembler as necessary.

One race remains after this CL, deep in aclass,
in the check that a Prog does not take the address
of a TLS variable.

That race is conceptually unrelated to this refactoring,
and will be addressed in a separate CL.

Passes toolstash-check -all.

Updates #15756

Change-Id: Icab1ef70008468f9a5b8bf728a77c4520bbcb67d
Reviewed-on: https://go-review.googlesource.com/40252
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-04-11 14:34:00 +00:00
Josh Bleecher Snyder c4135d61bb cmd/internal/obj/mips: make assembler almost concurrency-safe
CL 39922 made the arm assembler concurrency-safe.
This CL does the same, but for mips.
The approach is similar: introduce ctxt0 to hold
function-local state and thread it through
the assembler as necessary.

One race remains after this CL, similar to CL 40252.

That race is conceptually unrelated to this refactoring,
and will be addressed in a separate CL.

Passes toolstash-check -all.

Updates #15756

Change-Id: I2c54a889aa448a4476c9a75da4dd94ef69657b16
Reviewed-on: https://go-review.googlesource.com/40370
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-04-11 14:31:16 +00:00
Ben Shi 69261ecad6 runtime: use hardware divider to improve performance
The hardware divider is an optional component of ARMv7. This patch
detects whether it is available in runtime and use it or not.

1. The hardware divider is detected at startup and a flag is set/clear
   according to a perticular bit of runtime.hwcap.
2. Each call of runtime.udiv will check this flag and decide if
   use the hardware division instruction.

A rough test shows the performance improves 40-50% for ARMv7. And
the compatibility of ARMv5/v6 is not broken.

fixes #19118

Change-Id: Ic586bc9659ebc169553ca2004d2bdb721df823ac
Reviewed-on: https://go-review.googlesource.com/37496
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-04-11 12:25:55 +00:00
Josh Bleecher Snyder e367ba9eae cmd/internal/obj: refactor ATEXT symbol initialization
This makes the core Flushplist loop clearer.

We may also want to move the Sym initialization
much earlier in the compiler (see discussion on
CL 40254), for which this paves the way.

While we're here, eliminate package log in favor of ctxt.Diag.

Passes toolstash-check -all.

Updates #15756

Change-Id: Ieaf848d196764a5aa82578b689af7bc6638c385a
Reviewed-on: https://go-review.googlesource.com/40313
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
2017-04-11 02:56:56 +00:00
Josh Bleecher Snyder 52b33965fd cmd/internal/obj: rename some local variables not c
I plan to use c as a consistent local variable
in this packages. Rename most variables named c,
excepting only some simple functions in asm9.go.

Changes prepared with gorename.

Passes toolstash-check -all.

Updates #15756

Change-Id: If79baac43fca68fad1076e1ff23ae87c2ba638e4
Reviewed-on: https://go-review.googlesource.com/40172
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-04-10 16:19:21 +00:00
Josh Bleecher Snyder dcf643f1fd cmd/internal/obj/arm: make assembler concurrency-safe
Move global state from obj.Link
to a new function-local state struct arm.ctxt5.

This ends up being cleaner than threading
all the state through as parameters; there's a lot of it.
While we're here, move newprog from a parameter to ctxt5.

We reserve the variable name c for ctxt5,
so a few local variables named c have been renamed.

Instead of lazily initializing deferreturn
and Sym_div and friends, initialize them up front.

Passes toolstash-check -all.

Updates #15756

Change-Id: Ifb4e4b9879e4e1f25e6168d8b7b2a25a3390dc11
Reviewed-on: https://go-review.googlesource.com/39922
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-04-10 16:18:50 +00:00
Josh Bleecher Snyder c9446398e8 cmd/compile: allow composite literal structs with _ fields
Given code such as

type T struct {
  _ string
}

func f() {
  var x = T{"space"}
  // ...
}

the compiler rewrote the 'var x' line as

var x T
x._ = "space"

The compiler then rejected the assignment to
a blank field, thus rejecting valid code.

It also failed to catch a number of invalid assignments.
And there were insufficient checks for validity
when emitting static data, leading to ICEs.

To fix, check earlier for explicit blanks field names,
explicitly handle legit blanks in sinit,
and don't try to emit static data for nodes
for which typechecking has failed.

Fixes #19482

Change-Id: I594476171d15e6e8ecc6a1749e3859157fe2c929
Reviewed-on: https://go-review.googlesource.com/38006
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-07 22:01:18 +00:00
Josh Bleecher Snyder 735fe51a4b cmd/internal/obj: add LookupInit
There are some LSyms that are lazily initialized,
and which cannot be made eagerly initialized,
such as elements of a constant pool.

To avoid needing a mutex to protect the internals of
those LSyms, this CL introduces LookupInit,
which allows an LSym to be initialized only once.

By itself this is not fully concurrency-safe,
but Ctxt.Hash will need mutex protection anyway,
and that will be enough to support one-time LSym initialization.

Passes toolstash-check -all.

Updates #15756

Change-Id: Id7248dfdc4dfbdfe425fa31d0c0045018eeea1fa
Reviewed-on: https://go-review.googlesource.com/39990
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-04-07 20:30:45 +00:00
Josh Bleecher Snyder 3cf1ce40bd Revert "cmd/compile: output DWARF lexical blocks for local variables"
This reverts commit c8b889cc48.

Reason for revert: broke noopt build, compiler performance regression, new Curfn uses

Let's fix those and then try this again.

Change-Id: Icc3cad1365d04cac8fd09da9dbb0bbf55c13ef44
Reviewed-on: https://go-review.googlesource.com/39991
Reviewed-by: Robert Griesemer <gri@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-04-07 19:52:26 +00:00
Alessandro Arzilli c8b889cc48 cmd/compile: output DWARF lexical blocks for local variables
Change compiler and linker to emit DWARF lexical blocks in debug_info.
Version of debug_info is updated from DWARF v.2 to DWARF v.3 since version 2
does not allow lexical blocks with discontinuous ranges.

Second attempt at https://go-review.googlesource.com/#/c/29591/

Remaining open problems:
- scope information is removed from inlined functions
- variables in debug_info do not have DW_AT_start_scope attributes so a
variable will shadow other variables with the same name as soon as its
containing scope begins, before its declaration.

Updates golang/go#12899, golang/go#6913

Change-Id: I0e260a45b564d14a87b88974eb16c5387cb410a5
Reviewed-on: https://go-review.googlesource.com/36879
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-04-07 18:15:06 +00:00
Josh Bleecher Snyder 7165bcc6ba cmd/internal/obj: remove timing prints from assemblers
Updates #19865

Change-Id: I24fbf5d79b5e4cac09c14cfff678a8215397b670
Reviewed-on: https://go-review.googlesource.com/39914
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-04-07 17:13:26 +00:00
Josh Bleecher Snyder 63c1aff60b cmd/internal/obj: eagerly initialize assemblers
CL 38662 changed the x86 assembler to be eagerly
initialized, for a concurrent backend.

This CL puts in place a proper mechanism for doing so,
and switches all architectures to use it.

Passes toolstash-check -all.

Updates #15756

Change-Id: Id2aa527d3a8259c95797d63a2f0d1123e3ca2a1c
Reviewed-on: https://go-review.googlesource.com/39917
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-04-07 16:57:03 +00:00
Josh Bleecher Snyder 99683483d6 cmd/internal/obj: unify creation of numeric literal syms
This is a straightforward refactoring,
to reduce the scope of upcoming changes.

The symbol size and AttrLocal=true was not
set universally, but it appears not to matter,
since toolstash -cmp is happy.

Passes toolstash-check -all.

Change-Id: I7f8392f939592d3a1bc6f61dec992f5661f42fca
Reviewed-on: https://go-review.googlesource.com/39791
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-04-06 19:01:50 +00:00
Josh Bleecher Snyder c311488283 cmd/internal/obj: remove Linklookup
It was simply a wrapper around Link.Lookup.
Unwrap everything.

CL prepared using eg with template:

package p

import "cmd/internal/obj"

func before(ctxt *obj.Link, name string, version int) *obj.LSym {
	return obj.Linklookup(ctxt, name, version)
}

func after(ctxt *obj.Link, name string, version int) *obj.LSym {
	return ctxt.Lookup(name, version)
}

Then one comment in cmd/asm/internal/asm/parse.go
was manually updated (and gofmt'ed!),
and func Linklookup deleted.

Passes toolstash-check (as a sanity measure).

Change-Id: Icc4d56b0b2b5c8888d3184c1898c48359ea1e638
Reviewed-on: https://go-review.googlesource.com/39715
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-04-06 19:01:41 +00:00
Cherry Zhang 0bae9b083b cmd/internal/obj/arm64: fix encoding of AND MBCON
When a constant is both MOVCON (can fit into a MOV instruction)
and BITCON (can fit into a logical instruction), the assembler
chooses to use the MOVCON encoding, which is actually longer for
logical instructions. We add MBCON rules explicitly to make sure
it uses the BITCON encoding.

Updates #19857.

Change-Id: Ib9881be363cbc491ac2a0792b36b87e74eff34a8
Reviewed-on: https://go-review.googlesource.com/39652
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2017-04-06 17:59:39 +00:00
Josh Bleecher Snyder 5c359d8083 cmd/compile: add Prog cache to Progs
The existing bulk/cached Prog allocator, Ctxt.NewProg, is not concurrency-safe.
This CL moves Prog allocation to its clients, the compiler and the assembler.

The assembler is so fast and generates so few Progs that it does not need
optimization of Prog allocation. I could not generate measureable changes.
And even if I could, the assembly is a miniscule portion of build times.

The compiler already has a natural place to manage Prog allocation;
this CL migrates the Prog cache there.
It will be made concurrency-safe in a later CL by
partitioning the Prog cache into chunks and assigning each chunk
to a different goroutine to manage.

This CL does cause a performance degradation when the compiler
is invoked with the -S flag (to dump assembly).
However, such usage is rare and almost always done manually.
The one instance I know of in a test is TestAssembly
in cmd/compile/internal/gc, and I did not detect
a measurable performance impact there.

Passes toolstash-check -all.
Minor compiler performance impact.

Updates #15756

Performance impact from just this CL:

name        old time/op     new time/op     delta
Template        213ms ± 4%      213ms ± 4%    ~     (p=0.571 n=49+49)
Unicode        89.1ms ± 3%     89.4ms ± 3%    ~     (p=0.388 n=47+48)
GoTypes         581ms ± 2%      584ms ± 3%  +0.56%  (p=0.019 n=47+48)
SSA             6.48s ± 2%      6.53s ± 2%  +0.84%  (p=0.000 n=47+49)
Flate           128ms ± 4%      128ms ± 4%    ~     (p=0.832 n=49+49)
GoParser        152ms ± 3%      152ms ± 3%    ~     (p=0.815 n=48+47)
Reflect         371ms ± 4%      371ms ± 3%    ~     (p=0.617 n=50+47)
Tar             112ms ± 4%      112ms ± 3%    ~     (p=0.724 n=49+49)
XML             208ms ± 3%      208ms ± 4%    ~     (p=0.678 n=49+50)
[Geo mean]      284ms           285ms       +0.18%

name        old user-ns/op  new user-ns/op  delta
Template         251M ± 7%       252M ±11%    ~     (p=0.704 n=49+50)
Unicode          107M ± 7%       108M ± 5%  +1.25%  (p=0.036 n=50+49)
GoTypes          738M ± 3%       740M ± 3%    ~     (p=0.305 n=49+48)
SSA             8.83G ± 2%      8.86G ± 4%    ~     (p=0.098 n=47+50)
Flate            146M ± 6%       147M ± 3%    ~     (p=0.584 n=48+41)
GoParser         178M ± 6%       179M ± 5%  +0.93%  (p=0.036 n=49+48)
Reflect          441M ± 4%       446M ± 7%    ~     (p=0.218 n=44+49)
Tar              126M ± 5%       126M ± 5%    ~     (p=0.766 n=48+49)
XML              245M ± 5%       244M ± 4%    ~     (p=0.359 n=50+50)
[Geo mean]       341M            342M       +0.51%

Performance impact from this CL combined with its parent:

name        old time/op     new time/op     delta
Template        213ms ± 3%      214ms ± 4%    ~     (p=0.685 n=47+50)
Unicode        89.8ms ± 6%     90.5ms ± 6%    ~     (p=0.055 n=50+50)
GoTypes         584ms ± 3%      585ms ± 2%    ~     (p=0.710 n=49+47)
SSA             6.50s ± 2%      6.53s ± 2%  +0.39%  (p=0.011 n=46+50)
Flate           128ms ± 3%      128ms ± 4%    ~     (p=0.855 n=47+49)
GoParser        152ms ± 3%      152ms ± 3%    ~     (p=0.666 n=49+49)
Reflect         371ms ± 3%      372ms ± 3%    ~     (p=0.298 n=48+48)
Tar             112ms ± 5%      113ms ± 3%    ~     (p=0.107 n=49+49)
XML             208ms ± 3%      208ms ± 2%    ~     (p=0.881 n=50+49)
[Geo mean]      285ms           285ms       +0.26%

name        old user-ns/op  new user-ns/op  delta
Template         254M ± 9%       252M ± 8%    ~     (p=0.290 n=49+50)
Unicode          106M ± 6%       108M ± 7%  +1.44%  (p=0.034 n=50+50)
GoTypes          741M ± 4%       743M ± 4%    ~     (p=0.992 n=50+49)
SSA             8.86G ± 2%      8.83G ± 3%    ~     (p=0.158 n=47+49)
Flate            147M ± 4%       148M ± 5%    ~     (p=0.832 n=50+49)
GoParser         179M ± 5%       178M ± 5%    ~     (p=0.370 n=48+50)
Reflect          441M ± 6%       445M ± 7%    ~     (p=0.246 n=45+47)
Tar              126M ± 6%       126M ± 6%    ~     (p=0.815 n=49+50)
XML              244M ± 3%       245M ± 4%    ~     (p=0.190 n=50+50)
[Geo mean]       342M            342M       +0.17%

Change-Id: I020f1c079d495fbe2e15ccb51e1ea2cc1b5a1855
Reviewed-on: https://go-review.googlesource.com/39634
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-04-06 04:53:50 +00:00
Josh Bleecher Snyder 5b59b32c97 cmd/compile: teach assemblers to accept a Prog allocator
The existing bulk Prog allocator is not concurrency-safe.
To allow for concurrency-safe bulk allocation of Progs,
I want to move Prog allocation and caching upstream,
to the clients of cmd/internal/obj.

This is a preliminary enabling refactoring.
After this CL, instead of calling Ctxt.NewProg
throughout the assemblers, we thread through
a newprog function that returns a new Prog.

That function is set up to be Ctxt.NewProg,
so there are no real changes in this CL;
this CL only establishes the plumbing.

Passes toolstash-check -all.
Negligible compiler performance impact.

Updates #15756

name        old time/op     new time/op     delta
Template        213ms ± 3%      214ms ± 4%    ~     (p=0.574 n=49+47)
Unicode        90.1ms ± 5%     89.9ms ± 4%    ~     (p=0.417 n=50+49)
GoTypes         585ms ± 4%      584ms ± 3%    ~     (p=0.466 n=49+49)
SSA             6.50s ± 3%      6.52s ± 2%    ~     (p=0.251 n=49+49)
Flate           128ms ± 4%      128ms ± 4%    ~     (p=0.673 n=49+50)
GoParser        152ms ± 3%      152ms ± 3%    ~     (p=0.810 n=48+49)
Reflect         372ms ± 4%      372ms ± 5%    ~     (p=0.778 n=49+50)
Tar             113ms ± 5%      111ms ± 4%  -0.98%  (p=0.016 n=50+49)
XML             208ms ± 3%      208ms ± 2%    ~     (p=0.483 n=47+49)
[Geo mean]      285ms           285ms       -0.17%

name        old user-ns/op  new user-ns/op  delta
Template         253M ± 8%       254M ± 9%    ~     (p=0.899 n=50+50)
Unicode          106M ± 9%       106M ±11%    ~     (p=0.642 n=50+50)
GoTypes          736M ± 4%       740M ± 4%    ~     (p=0.121 n=50+49)
SSA             8.82G ± 3%      8.88G ± 2%  +0.65%  (p=0.006 n=49+48)
Flate            147M ± 4%       147M ± 5%    ~     (p=0.844 n=47+48)
GoParser         179M ± 4%       178M ± 6%    ~     (p=0.785 n=50+50)
Reflect          443M ± 6%       441M ± 5%    ~     (p=0.850 n=48+47)
Tar              126M ± 5%       126M ± 5%    ~     (p=0.734 n=50+50)
XML              244M ± 5%       244M ± 5%    ~     (p=0.594 n=49+50)
[Geo mean]       341M            341M       +0.11%

Change-Id: Ice962f61eb3a524c2db00a166cb582c22caa7d68
Reviewed-on: https://go-review.googlesource.com/39633
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-04-06 02:07:21 +00:00
Josh Bleecher Snyder 92cffa1391 cmd/internal/obj: remove dead func Copyp
Change-Id: Iaeb7bcbcdbc46c0e0e40b0aa070c706e0ca53013
Reviewed-on: https://go-review.googlesource.com/39555
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-05 15:18:14 +00:00
Josh Bleecher Snyder 26308fb481 cmd/internal/obj: use string instead of LSym in Pcln
In a concurrent backend, Ctxt.Lookup will need some
form of concurrency protection, which will make it
more expensive.

This CL changes the pcln table builder to track
filenames as strings rather than LSyms.
Those strings are then converted into LSyms
at the last moment, for writing the object file.

This CL removes over 85% of the calls to Ctxt.Lookup
in a run of make.bash.

Passes toolstash-check.

Updates #15756

Change-Id: I3c53deff6f16f2643169f3bdfcc7aca2ca58b0a4
Reviewed-on: https://go-review.googlesource.com/39291
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-03 15:19:47 +00:00
Dave Cheney 78f6622b81 cmd/internal/obj/*: rename Rconv to rconv
Each architecture's Rconv function is only used inside its
respective package, so it does not need to be exported.

Change-Id: Ifbd629964d7a9edd66501d7cdf4750621d66d646
Reviewed-on: https://go-review.googlesource.com/39110
Run-TryBot: Dave Cheney <dave@cheney.net>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-04-01 10:41:37 +00:00
Alex Brainman 361af94d5d cmd/internal/obj, cmd/link: remove Hwindowsgui everywhere
Hwindowsgui has the same meaning as Hwindows - build PE
executable. So use Hwindows everywhere.

Change-Id: I2cae5777f17c7bc3a043dfcd014c1620cc35fc20
Reviewed-on: https://go-review.googlesource.com/38761
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-03-30 22:51:43 +00:00
Ben Shi c5ddc558ba cmd/internal/obj/arm: support more ARMv5/ARMv6/ARMv7 instructions
REV/REV16/REVSH were introduced in ARMv6, they offered more efficient
byte reverse operatons.

MMUL/MMULA/MMULS were introduced in ARMv6, they simplified
a serial of mul->shift->add/sub operations into a single instruction.

RBIT was introduced in ARMv7, it inversed a 32-bit word's bit order.

MULS was introduced in ARMv7, it corresponded to MULA.

MULBB/MULABB were introduced in ARMv5TE, they performed 16-bit
multiplication (and accumulation).

Change-Id: I6365b17b3c4eaf382a657c210bb0094b423b11b8
Reviewed-on: https://go-review.googlesource.com/35565
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-03-30 18:19:04 +00:00
Michael Munday 41fd8d6401 cmd/internal/obj: make morestack cutoff the same on all architectures
There is always 128 bytes available below the stackguard. Allow functions
with medium-sized stack frames to use this, potentially allowing them to
avoid growing the stack.

This change makes all architectures use the same calculation as x86.

Change-Id: I2afb1a7c686ae5a933e50903b31ea4106e4cd0a0
Reviewed-on: https://go-review.googlesource.com/38734
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-03-29 20:35:46 +00:00
Dave Cheney 54af18708b cmd/internal/obj/x86: clean up byteswapreg
Make byteswapreg more Go like.

Change-Id: Ibdf3603cae9cad2b3465b4c224a28a4c4c745c2e
Reviewed-on: https://go-review.googlesource.com/38615
Run-TryBot: Dave Cheney <dave@cheney.net>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-29 00:02:06 +00:00
Josh Bleecher Snyder aca58647b7 cmd/internal/obj: eliminate stray ctxt.Cursym write
It is explicitly assigned in each of the
assemblers as needed.
I plan to remove Cursym entirely eventually,
but this is a helpful intermediate step.

Passes toolstash-check -all.

Updates #15756

Change-Id: Id7ddefae2def439af44d03053886ca8cc935731f
Reviewed-on: https://go-review.googlesource.com/38727
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-03-28 17:03:18 +00:00
Josh Bleecher Snyder 7e817859b3 cmd/internal/obj: eliminate Curp
Remove the global obj.Link.Curp.

In asmz.go, replace the only use by passing it as an argument.
In asm0.go and asm9.go, it was written but never read.
In asm5.go and asm7.go, thread it through as an argument.

Passes toolstash-check -all.

Updates #15756

Change-Id: I1a0faa89e768820f35d73a8b37ec8088d78d15f7
Reviewed-on: https://go-review.googlesource.com/38715
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-03-27 18:51:42 +00:00
Josh Bleecher Snyder 1acba7d4fa cmd/internal/obj: remove prasm
Fold the printing of the offending instruction
into the neighboring Diag call, if it is not
already present.

Change-Id: I310f1479e16a4d2a24ff3c2f7e2c60e5e2015c1b
Reviewed-on: https://go-review.googlesource.com/38714
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-03-27 18:51:30 +00:00
Josh Bleecher Snyder 4909ecc462 cmd/internal/obj/x86: change AsmBuf.Lock to bool
Follow-up to CL 38668.

Passes toolstash-check -all.

Change-Id: I78a62509c610b5184b5e7ef2c4aa146fc8038840
Reviewed-on: https://go-review.googlesource.com/38670
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dave Cheney <dave@cheney.net>
2017-03-26 21:29:31 +00:00
Josh Bleecher Snyder cf2b32e719 cmd/internal/obj: eliminate Prog.Mode
Follow-up to CL 38446.

Passes toolstash-check -all.

Change-Id: I04cadc058cbaa5f396136502c574e5a395a33276
Reviewed-on: https://go-review.googlesource.com/38669
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-26 19:48:48 +00:00
Josh Bleecher Snyder 4f122e82fe cmd/internal/obj: move fields from obj.Link to x86.AsmBuf
These fields are used to encode a single instruction.
Add them to AsmBuf, which is also per-instruction,
and which is not global.

Updates #15756

Change-Id: I0e5ea22ffa641b07291e27de6e2ff23b6dc534bd
Reviewed-on: https://go-review.googlesource.com/38668
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-26 19:48:00 +00:00
Josh Bleecher Snyder 166160b446 cmd/internal/obj/x86: make ctxt.Cursym local
Thread it through as an argument instead of using a global.

Passes toolstash-check -all.

Updates #15756

Change-Id: Ia8c6ce09b43dbb2e6c7d889ded8dbaeb5366048d
Reviewed-on: https://go-review.googlesource.com/38667
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-26 19:47:47 +00:00
Josh Bleecher Snyder c06679f7b5 cmd/internal/obj/x86: remove ctxt.Curp references
Empirically, p == ctxt.Curp here.
A scan of (the thousands of lines of) asm6.go
shows no clear opportunity for them to diverge.

Passes toolstash-check -all.

Updates #15756

Change-Id: I9f5ee9585a850fbe24be3b851d8fdc2c966c65ce
Reviewed-on: https://go-review.googlesource.com/38665
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-03-26 14:38:19 +00:00
Josh Bleecher Snyder 2ebe1bdd89 cmd/internal/obj: make x86's asmbuf a local variable
The x86 assembler requires a buffer to build
variable-length instructions.
It used to be an obj.Link field.
That doesn't play nicely with concurrent assembly.
Move the AsmBuf type to the x86 package,
where it belongs anyway,
and make it a local variable.

Passes toolstash-check -all.
No compiler performance impact.

Updates #15756

Change-Id: I8014e52145380bfd378ee374a0c971ee5bada917
Reviewed-on: https://go-review.googlesource.com/38663
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-26 04:12:26 +00:00
Josh Bleecher Snyder 3b251e603d cmd/internal/obj: eagerly initialize x86 assembler
Prior to this CL, instinit was called as needed.
This does not work well in a concurrent backend.
Initialization is very cheap; do it on startup instead.

Passes toolstash-check -all.
No compiler performance impact.

Updates #15756

Change-Id: Ifa5e82e8abf4504435e1b28766f5703a0555f42d
Reviewed-on: https://go-review.googlesource.com/38662
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-03-25 23:20:25 +00:00
Josh Bleecher Snyder 6652572b75 cmd/compile: thread Curfn through to debuginfo
Updates #15756

Change-Id: I860dd45cae9d851c7844654621bbc99efe7c7f03
Reviewed-on: https://go-review.googlesource.com/38591
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-03-24 16:22:58 +00:00
Ben Shi d0ff9ece2b cmd/internal/obj/arm: Fix wrong assembly in the arm assembler
As #19357 reported,

TST has 3 different sub forms
TST $imme, Rx
TST Ry << imme, Rx
TST Ry << Rz, Rx

just like CMP/CMN/TEQ has. But current arm assembler assembles all TST
instructions wrongly. This patch fixes it and adds more tests.

Fixes #19357

Change-Id: Iafedccfaab2cbb2631e7acf259837a782e2e8e2f
Reviewed-on: https://go-review.googlesource.com/37662
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-23 05:05:36 +00:00
Dave Cheney 6f4a4585f2 cmd/internal/obj/mips: standardize on sys.MIPS and sys.MIPS64 constants
CL 38446 introduced the use of the sys.ArchFamily type into the
cmd/internal/obj/mips package and redefined the mips.Mips32 and
mips.Mips64 constants in terms of their sys.ArchFamily counterparts.
This CL removes these local declarations and consolidates on sys.MIPS
and sys.MIPS64 respectively.

Change-Id: Id7aab6c7fd0de42ff43dde605df6bd4c85a3d895
Reviewed-on: https://go-review.googlesource.com/38287
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-03-23 00:37:51 +00:00
Josh Bleecher Snyder cfb3c8df62 cmd/internal/obj: eliminate Link.Asmode
Asmode is always set to p.Mode,
which is always set based on the arch family.
Instead, use the arch family directly.

Passes toolstash-check -all.

Change-Id: Id982472dcc8eeb6dd22cac5ad2f116b54a44caee
Reviewed-on: https://go-review.googlesource.com/38451
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-03-22 17:33:27 +00:00
Josh Bleecher Snyder a470e5d4b8 cmd/internal/obj: eliminate Ctxt.Mode
Replace Ctxt.Mode with a method, Ctxt.RegWidth,
which is calculated directly off the arch info.

I believe that Prog.Mode can also be removed; future CL.

This is a step towards obj.Link immutability.

Passes toolstash-check -all.

Updates #15756

Change-Id: Ifd7f8f6ed0a2fdc032d1dd306fcd695a14aa5bc5
Reviewed-on: https://go-review.googlesource.com/38446
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-03-22 17:27:43 +00:00
Josh Bleecher Snyder 2f2cd557a6 cmd/internal/obj: clean up brloop
Add docs.
Reduce indentation.

Passes toolstash-check -all.

Change-Id: I968d1af25989886ae9945052e05e211a107dde9c
Reviewed-on: https://go-review.googlesource.com/38443
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-03-22 16:11:31 +00:00
Josh Bleecher Snyder 33266df861 cmd/internal/obj: clean up checkaddr
Coalesce identical cases.
Give it a proper doc comment.
Fix comment locations.
Update/delete old comments.

Passes toolstash-check -all.

Change-Id: I88d9cf20e6e04b0c1c6583e92cd96335831f183f
Reviewed-on: https://go-review.googlesource.com/38442
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dave Cheney <dave@cheney.net>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-03-22 15:15:02 +00:00
Josh Bleecher Snyder 3394633db0 cmd/internal/obj: eliminate AMODE
AMODE appears to have been intended to allow
a Prog to switch between 16 (!), 32, or 64 bit x86.
It is unused anywhere in the tree.

Passes toolstash-check -all.

Updates #15756

Change-Id: Ic57b257cfe580f29dad81d97e4193bf3c330c598
Reviewed-on: https://go-review.googlesource.com/38445
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dave Cheney <dave@cheney.net>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-03-22 14:39:07 +00:00
Josh Bleecher Snyder 88d4ab82d5 cmd/internal/obj: eliminate unnecessary ctxt.Cursym assignment
None of the following code uses it.

Passes toolstash-check -all.

Updates #15756

Change-Id: Ieeaaca8ba31e5c345c0c8a758d520b24be88e173
Reviewed-on: https://go-review.googlesource.com/38444
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dave Cheney <dave@cheney.net>
2017-03-22 14:38:59 +00:00
Matthew Dempsky 7bb5b2d33a cmd/internal/obj: remove unneeded Addr.Node and Prog.Opt fields
Change-Id: I218b241c32a5948b66ad0d95ecc368648cf4ddf5
Reviewed-on: https://go-review.googlesource.com/38130
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-03-20 23:49:29 +00:00
Matthew Dempsky cce4c319d6 cmd/internal/obj: remove unneeded AVARFOO ops
Change-Id: I10e36046ebce8a8741ef019cfe266b9ac9fa322d
Reviewed-on: https://go-review.googlesource.com/38088
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-03-20 23:40:09 +00:00
Josh Bleecher Snyder 42a915c933 cmd/internal/obj: convert Debug* Link fields into bools
Change-Id: I9ac274dbfe887675a7820d2f8f87b5887b1c9b0e
Reviewed-on: https://go-review.googlesource.com/38383
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-20 22:08:41 +00:00
Michael Munday df47b82174 cmd/internal/obj/s390x: cleanup objz.go
This CL deletes some unnecessary code in objz.go that existed to
support instruction scheduling. It's likely instruction scheduling
will never be done in this part of the backend so this code can
just be deleted.

This file can probably be cleaned up a bit more, but I think this
is a good start.

Passes: go build -toolexec 'toolstash -cmp' -a std.

Change-Id: I1645632ac551a90a4f4be418045c046b488e9469
Reviewed-on: https://go-review.googlesource.com/38394
Run-TryBot: Michael Munday <munday@ca.ibm.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-03-20 21:58:27 +00:00
Josh Bleecher Snyder a955ece6cd cmd/internal/obj: reduce variable scope
Minor cleanup, to make it clearer
that the two p's are unrelated.

Change-Id: Icb6386c626681f60e5e631b33aa3a0fc84f40e4a
Reviewed-on: https://go-review.googlesource.com/38381
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-20 20:32:52 +00:00
Michael Munday 17570a9afb cmd/compile: emit fused multiply-{add,subtract} on ppc64x
A follow on to CL 36963 adding support for ppc64x.

Performance changes (as posted on the issue):

poly1305:
benchmark               old ns/op new ns/op delta
Benchmark64-16          172       151       -12.21%
Benchmark1K-16          1828      1523      -16.68%
Benchmark64Unaligned-16 172       151       -12.21%
Benchmark1KUnaligned-16 1827      1523      -16.64%

math:
BenchmarkAcos-16        43.9      39.9      -9.11%
BenchmarkAcosh-16       57.0      45.8      -19.65%
BenchmarkAsin-16        35.8      33.0      -7.82%
BenchmarkAsinh-16       68.6      60.8      -11.37%
BenchmarkAtan-16        19.8      16.2      -18.18%
BenchmarkAtanh-16       65.5      57.5      -12.21%
BenchmarkAtan2-16       45.4      34.2      -24.67%
BenchmarkGamma-16       37.6      26.0      -30.85%
BenchmarkLgamma-16      40.0      28.2      -29.50%
BenchmarkLog1p-16       35.1      29.1      -17.09%
BenchmarkSin-16         22.7      18.4      -18.94%
BenchmarkSincos-16      31.7      23.7      -25.24%
BenchmarkSinh-16        146       131       -10.27%
BenchmarkY0-16          130       107       -17.69%
BenchmarkY1-16          127       107       -15.75%
BenchmarkYn-16          278       235       -15.47%

Updates #17895.

Change-Id: I1c16199715d20c9c4bd97c4a950bcfa69eb688c1
Reviewed-on: https://go-review.googlesource.com/38095
Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
2017-03-20 20:01:29 +00:00
Josh Bleecher Snyder 604e4841d6 cmd/internal/obj/ppc64: remove stackbarrier function check
Stack barriers were removed in CL 36620.

Change-Id: If124d65a73a7b344a42be2a4b386a14d7a0a428b
Reviewed-on: https://go-review.googlesource.com/38169
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
Reviewed-by: David Chase <drchase@google.com>
2017-03-17 04:46:58 +00:00
Matthew Dempsky cc71aa9ac4 cmd/compile/internal/ssa: make ARM's udiv like other calls
Passes toolstash-check -all.

Change-Id: Id389f8158cf33a3c0fcef373615b5351e7c74b5b
Reviewed-on: https://go-review.googlesource.com/38082
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-03-13 21:29:02 +00:00
Ilya Tocar 27492a2a54 cmd/internal/obj/x86: remove unused const
Since https://go-review.googlesource.com/24040 we no longer pad functions
in asm6, so funcAlign is unused. Delete it.

Change-Id: Id710e545a76b1797398f2171fe7e0928811fcb31
Reviewed-on: https://go-review.googlesource.com/38134
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-03-13 19:19:09 +00:00
Robert Griesemer 2a5cf48f91 cmd/compile: print columns (not just lines) in error messages
Compiler errors now show the exact line and line byte offset (sometimes
called "column") of where an error occured. For `go tool compile x.go`:

	package p
	const c int = false
	//line foo.go:123
	type t intg

reports

	x.go:2:7: cannot convert false to type int
	foo.go:123[x.go:4:8]: undefined: intg

(Some errors use the "wrong" position for the error message; arguably
the byte offset for the first error should be 15, the position of 'false',
rathen than 7, the position of 'c'. But that is an indepedent issue.)

The byte offset (column) values are measured in bytes; they start at 1,
matching the convention used by editors and IDEs.

Positions modified by //line directives show the line offset only for the
actual source location (in square brackets), not for the "virtual" file and
line number because that code is likely generated and the //line directive
only provides line information.

Because the new format might break existing tools or scripts, printing
of line offsets can be disabled with the new compiler flag -C. We plan
to remove this flag eventually.

Fixes #10324.

Change-Id: I493f5ee6e78457cf7b00025aba6b6e28e50bb740
Reviewed-on: https://go-review.googlesource.com/37970
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-03-09 23:29:49 +00:00
Matthew Dempsky b1a4424a52 cmd/internal/obj: change started to bool
Change-Id: I90143e3c6e95a1495f300ffeb10de554aa41f56a
Reviewed-on: https://go-review.googlesource.com/37889
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-07 19:23:06 +00:00
Matthew Dempsky 68177d9ec0 cmd/internal/obj: move dwarf.Var generation into compiler
Passes toolstash -cmp.

Change-Id: I4bd60f7ebba5457e7b3ece688fee2351bfeeb59a
Reviewed-on: https://go-review.googlesource.com/37874
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alessandro Arzilli <alessandro.arzilli@gmail.com>
Reviewed-by: Than McIntosh <thanm@google.com>
2017-03-07 17:32:54 +00:00
Matthew Dempsky 1874d4a883 cmd/internal/obj, cmd/compile: rip off some toolstash bandaids
Change-Id: I402383e893223facae451adbd640113126d5edd9
Reviewed-on: https://go-review.googlesource.com/37873
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-06 23:29:36 +00:00
Matthew Dempsky 9cb2ee0ff2 cmd/internal/obj: move STEXT-only LSym fields into new FuncInfo struct
Shrinks LSym somewhat for non-STEXT LSyms, which are much more common.

While here, switch to tracking Automs in a slice instead of a linked
list. (Previously, this would have made LSyms larger.)

Passes toolstash-check.

Change-Id: I082e50e1d1f1b544c9e06b6e412a186be6a4a2b5
Reviewed-on: https://go-review.googlesource.com/37872
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-03-06 22:17:23 +00:00
Matthew Dempsky 7a98bdf1c2 cmd/internal/obj: remove AUSEFIELD pseudo-op
Instead, cmd/compile can directly emit R_USEFIELD relocations.

Manually verified rsc.io/tmp/fieldtrack still passes.

Change-Id: Ib1fb5ab902ff0ad17ef6a862a9a5692caf7f87d1
Reviewed-on: https://go-review.googlesource.com/37871
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-03-06 22:16:13 +00:00
Matthew Dempsky 7c84dc79fd cmd/internal/obj, cmd/link: bump magic string to go19ld
golang.org/cl/37231 changed the object file format, but forgot to bump
the version string.

Change-Id: I8351ec8ed55e65479006e7c0df20254d0e31015f
Reviewed-on: https://go-review.googlesource.com/37798
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-03-06 19:13:57 +00:00
Eitan Adler 789c5255a4 all: remove the the duplicate words
Change-Id: I6343c162e27e2e492547c96f1fc504909b1c03c0
Reviewed-on: https://go-review.googlesource.com/37793
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-03-06 04:39:12 +00:00
Josh Bleecher Snyder 57e038615d cmd/internal/src: cache prefixed filenames
CL 37234 introduced string concatenation into some hot code. 
This CL does that work earlier and caches the result.

Updates #19386

Performance impact vs master:

name       old time/op      new time/op      delta
Template        223ms ± 5%       216ms ± 5%   -2.98%  (p=0.001 n=20+20)
Unicode        98.7ms ± 4%      99.0ms ± 4%     ~     (p=0.749 n=20+19)
GoTypes         631ms ± 4%       626ms ± 4%     ~     (p=0.253 n=20+20)
Compiler        2.91s ± 1%       2.87s ± 3%   -1.11%  (p=0.005 n=18+20)
SSA             4.48s ± 2%       4.36s ± 2%   -2.77%  (p=0.000 n=20+20)
Flate           130ms ± 2%       129ms ± 6%     ~     (p=0.428 n=19+20)
GoParser        160ms ± 4%       157ms ± 3%   -1.62%  (p=0.005 n=20+18)
Reflect         395ms ± 2%       394ms ± 4%     ~     (p=0.445 n=20+20)
Tar             120ms ± 5%       118ms ± 6%     ~     (p=0.101 n=19+20)
XML             224ms ± 3%       223ms ± 3%     ~     (p=0.544 n=19+19)

name       old user-ns/op   new user-ns/op   delta
Template   291user-ms ± 5%  265user-ms ± 5%   -9.02%  (p=0.000 n=20+19)
Unicode    140user-ms ± 3%  139user-ms ± 8%     ~     (p=0.904 n=20+20)
GoTypes    844user-ms ± 3%  849user-ms ± 3%     ~     (p=0.251 n=20+18)
Compiler   4.06user-s ± 5%  3.98user-s ± 2%     ~     (p=0.056 n=20+20)
SSA        6.89user-s ± 5%  6.50user-s ± 3%   -5.61%  (p=0.000 n=20+20)
Flate      164user-ms ± 5%  163user-ms ± 4%     ~     (p=0.365 n=20+19)
GoParser   206user-ms ± 6%  204user-ms ± 4%     ~     (p=0.534 n=20+18)
Reflect    501user-ms ± 4%  505user-ms ± 5%     ~     (p=0.383 n=20+20)
Tar        151user-ms ± 3%  152user-ms ± 7%     ~     (p=0.798 n=17+20)
XML        283user-ms ± 7%  280user-ms ± 5%     ~     (p=0.301 n=20+20)

name       old alloc/op     new alloc/op     delta
Template       42.5MB ± 0%      40.2MB ± 0%   -5.59%  (p=0.000 n=20+20)
Unicode        31.7MB ± 0%      31.0MB ± 0%   -2.19%  (p=0.000 n=20+18)
GoTypes         124MB ± 0%       117MB ± 0%   -5.90%  (p=0.000 n=20+20)
Compiler        533MB ± 0%       490MB ± 0%   -8.07%  (p=0.000 n=20+20)
SSA             989MB ± 0%       893MB ± 0%   -9.74%  (p=0.000 n=20+20)
Flate          27.8MB ± 0%      26.1MB ± 0%   -5.92%  (p=0.000 n=20+20)
GoParser       34.3MB ± 0%      32.1MB ± 0%   -6.43%  (p=0.000 n=19+20)
Reflect        84.6MB ± 0%      81.4MB ± 0%   -3.84%  (p=0.000 n=20+20)
Tar            28.8MB ± 0%      27.7MB ± 0%   -3.89%  (p=0.000 n=20+20)
XML            47.2MB ± 0%      44.2MB ± 0%   -6.45%  (p=0.000 n=20+19)

name       old allocs/op    new allocs/op    delta
Template         420k ± 1%        381k ± 1%   -9.35%  (p=0.000 n=20+20)
Unicode          338k ± 1%        324k ± 1%   -4.29%  (p=0.000 n=20+19)
GoTypes         1.28M ± 0%       1.15M ± 0%  -10.30%  (p=0.000 n=20+20)
Compiler        5.06M ± 0%       4.41M ± 0%  -12.92%  (p=0.000 n=20+20)
SSA             9.14M ± 0%       7.91M ± 0%  -13.46%  (p=0.000 n=19+20)
Flate            267k ± 0%        241k ± 1%   -9.53%  (p=0.000 n=20+20)
GoParser         347k ± 1%        312k ± 0%  -10.15%  (p=0.000 n=19+20)
Reflect         1.07M ± 0%       1.00M ± 0%   -6.86%  (p=0.000 n=20+20)
Tar              274k ± 1%        256k ± 1%   -6.73%  (p=0.000 n=20+20)
XML              448k ± 0%        398k ± 0%  -11.17%  (p=0.000 n=20+18)


Performance impact when applied together with CL 37234
atop CL 37234's parent commit (i.e. as if it were
a part of CL 37234), to show that this commit
makes CL 37234 completely performance-neutral:

name       old time/op      new time/op      delta
Template        222ms ±14%       222ms ±14%    ~     (p=1.000 n=14+15)
Unicode         104ms ±18%       106ms ±18%    ~     (p=0.650 n=13+14)
GoTypes         653ms ± 7%       638ms ± 5%    ~     (p=0.145 n=14+12)
Compiler        3.10s ± 1%       3.13s ±10%    ~     (p=1.000 n=2+2)
SSA             4.73s ±11%       4.68s ±11%    ~     (p=0.567 n=15+15)
Flate           136ms ± 4%       133ms ± 7%    ~     (p=0.231 n=12+14)
GoParser        163ms ±11%       169ms ±10%    ~     (p=0.352 n=14+14)
Reflect         415ms ±15%       423ms ±20%    ~     (p=0.715 n=15+14)
Tar             133ms ±17%       130ms ±23%    ~     (p=0.252 n=14+15)
XML             236ms ±16%       235ms ±14%    ~     (p=0.874 n=14+14)

name       old user-ns/op   new user-ns/op   delta
Template   271user-ms ±10%  271user-ms ±10%    ~     (p=0.780 n=14+15)
Unicode    143user-ms ± 5%  146user-ms ±11%    ~     (p=0.432 n=12+14)
GoTypes    864user-ms ± 5%  866user-ms ± 9%    ~     (p=0.905 n=14+13)
Compiler   4.17user-s ± 1%  4.26user-s ± 7%    ~     (p=1.000 n=2+2)
SSA        6.79user-s ± 8%  6.79user-s ± 6%    ~     (p=0.902 n=15+15)
Flate      169user-ms ± 8%  164user-ms ± 5%  -3.13%  (p=0.014 n=14+14)
GoParser   212user-ms ± 7%  217user-ms ±22%    ~     (p=1.000 n=13+15)
Reflect    521user-ms ± 7%  533user-ms ±15%    ~     (p=0.511 n=14+14)
Tar        165user-ms ±17%  161user-ms ±15%    ~     (p=0.345 n=15+15)
XML        294user-ms ±11%  292user-ms ±10%    ~     (p=0.839 n=14+14)

name       old alloc/op     new alloc/op     delta
Template       39.9MB ± 0%      39.9MB ± 0%    ~     (p=0.621 n=15+14)
Unicode        31.0MB ± 0%      31.0MB ± 0%    ~     (p=0.098 n=13+15)
GoTypes         117MB ± 0%       117MB ± 0%    ~     (p=0.775 n=15+15)
Compiler        488MB ± 0%       488MB ± 0%    ~     (p=0.333 n=2+2)
SSA             892MB ± 0%       892MB ± 0%  +0.03%  (p=0.000 n=15+15)
Flate          26.1MB ± 0%      26.1MB ± 0%    ~     (p=0.098 n=15+15)
GoParser       31.8MB ± 0%      31.8MB ± 0%    ~     (p=0.525 n=15+13)
Reflect        81.2MB ± 0%      81.2MB ± 0%  +0.06%  (p=0.001 n=12+14)
Tar            27.5MB ± 0%      27.5MB ± 0%    ~     (p=0.595 n=15+15)
XML            44.1MB ± 0%      44.1MB ± 0%    ~     (p=0.486 n=15+15)

name       old allocs/op    new allocs/op    delta
Template         378k ± 1%        378k ± 0%    ~     (p=0.949 n=15+14)
Unicode          324k ± 0%        324k ± 1%    ~     (p=0.057 n=14+15)
GoTypes         1.15M ± 0%       1.15M ± 0%    ~     (p=0.461 n=15+15)
Compiler        4.39M ± 0%       4.39M ± 0%    ~     (p=0.333 n=2+2)
SSA             7.90M ± 0%       7.90M ± 0%  +0.06%  (p=0.008 n=15+15)
Flate            240k ± 1%        241k ± 0%    ~     (p=0.233 n=15+15)
GoParser         309k ± 1%        309k ± 0%    ~     (p=0.867 n=15+12)
Reflect         1.00M ± 0%       1.00M ± 0%    ~     (p=0.139 n=12+15)
Tar              254k ± 1%        253k ± 1%    ~     (p=0.345 n=15+15)
XML              398k ± 0%        397k ± 1%    ~     (p=0.267 n=15+15)


Change-Id: Ic999a0f456a371c99eebba0f9747263a13836e33
Reviewed-on: https://go-review.googlesource.com/37766
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-04 18:19:06 +00:00
David Lazar 0824ae6dc1 cmd/compile: add flag for debugging PC-value tables
For example, `-d pctab=pctoinline` prints the PC-inline table and
inlining tree for every function.

Change-Id: Ia6b9ce4d83eed0b494318d40ffe06481ec5d58ab
Reviewed-on: https://go-review.googlesource.com/37235
Run-TryBot: David Lazar <lazard@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2017-03-03 21:29:38 +00:00
David Lazar 301149b9e4 cmd/internal/obj: avoid duplicate file name symbols
The meaning of Version=1 was overloaded: it was reserved for file name
symbols (to avoid conflicts with non-file name symbols), but was also
used to mean "give me a fresh version number for this symbol."

With the new inlining tree, the same file name symbol can appear in
multiple entries, but each one would become a distinct symbol with its
own version number.

Now, we avoid duplicating symbols by using Version=0 for file name
symbols and we avoid conflicts with other symbols by prefixing the
symbol name with "gofile..".

Change-Id: I8d0374053b8cdb6a9ca7fb71871b69b4dd369a9c
Reviewed-on: https://go-review.googlesource.com/37234
Run-TryBot: David Lazar <lazard@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
2017-03-03 21:29:36 +00:00
David Lazar 699175a11a cmd/compile,link: generate PC-value tables with inlining information
In order to generate accurate tracebacks, the runtime needs to know the
inlined call stack for a given PC. This creates two tables per function
for this purpose. The first table is the inlining tree (stored in the
function's funcdata), which has a node containing the file, line, and
function name for every inlined call. The second table is a PC-value
table that maps each PC to a node in the inlining tree (or -1 if the PC
is not the result of inlining).

To give the appearance that inlining hasn't happened, the runtime also
needs the original source position information of inlined AST nodes.
Previously the compiler plastered over the line numbers of inlined AST
nodes with the line number of the call. This meant that the PC-line
table mapped each PC to line number of the outermost call in its inlined
call stack, with no way to access the innermost line number.

Now the compiler retains line numbers of inlined AST nodes and writes
the innermost source position information to the PC-line and PC-file
tables. Some tools and tests expect to see outermost line numbers, so we
provide the OutermostLine function for displaying line info.

To keep track of the inlined call stack for an AST node, we extend the
src.PosBase type with an index into a global inlining tree. Every time
the compiler inlines a call, it creates a node in the global inlining
tree for the call, and writes its index to the PosBase of every inlined
AST node. The parent of this node is the inlining tree index of the
call. -1 signifies no parent.

For each function, the compiler creates a local inlining tree and a
PC-value table mapping each PC to an index in the local tree.  These are
written to an object file, which is read by the linker.  The linker
re-encodes these tables compactly by deduplicating function names and
file names.

This change increases the size of binaries by 4-5%. For example, this is
how the go1 benchmark binary is impacted by this change:

section             old bytes   new bytes   delta
.text               3.49M ± 0%  3.49M ± 0%   +0.06%
.rodata             1.12M ± 0%  1.21M ± 0%   +8.21%
.gopclntab          1.50M ± 0%  1.68M ± 0%  +11.89%
.debug_line          338k ± 0%   435k ± 0%  +28.78%
Total               9.21M ± 0%  9.58M ± 0%   +4.01%

Updates #19348.

Change-Id: Ic4f180c3b516018138236b0c35e0218270d957d3
Reviewed-on: https://go-review.googlesource.com/37231
Run-TryBot: David Lazar <lazard@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2017-03-03 21:29:30 +00:00
Philip Hofer a143f5d646 cmd/internal/obj/arm: improve static branch prediction for wrapper prologue
This is a follow-up to CL 36893.

Move the unlikely branch in the wrapper prologue to the end
of the function, where it has minimal impact on the instruction
cache. Static branch prediction is also less likely to choose
a forward branch.

Updates #19042

sort benchmarks:
name                  old time/op  new time/op  delta
SearchWrappers-4      1.44µs ± 0%  1.45µs ± 0%  +1.15%  (p=0.000 n=9+10)
SortString1K-4        1.02ms ± 0%  1.04ms ± 0%  +2.39%  (p=0.000 n=10+10)
SortString1K_Slice-4   960µs ± 0%   989µs ± 0%  +2.95%  (p=0.000 n=9+10)
StableString1K-4       218µs ± 0%   213µs ± 0%  -2.13%  (p=0.000 n=10+10)
SortInt1K-4            541µs ± 0%   543µs ± 0%  +0.30%  (p=0.003 n=9+10)
StableInt1K-4          760µs ± 1%   763µs ± 1%  +0.38%  (p=0.011 n=10+10)
StableInt1K_Slice-4    840µs ± 1%   779µs ± 0%  -7.31%  (p=0.000 n=9+10)
SortInt64K-4          55.2ms ± 0%  55.4ms ± 1%  +0.34%  (p=0.012 n=10+8)
SortInt64K_Slice-4    56.2ms ± 0%  55.6ms ± 1%  -1.16%  (p=0.000 n=10+10)
StableInt64K-4        70.9ms ± 1%  71.0ms ± 0%    ~     (p=0.315 n=10+7)
Sort1e2-4              250µs ± 0%   249µs ± 1%    ~     (p=0.315 n=9+10)
Stable1e2-4            600µs ± 0%   594µs ± 0%  -1.09%  (p=0.000 n=9+10)
Sort1e4-4             51.2ms ± 0%  51.4ms ± 1%  +0.40%  (p=0.001 n=9+10)
Stable1e4-4            204ms ± 1%   199ms ± 1%  -2.27%  (p=0.000 n=10+10)
Sort1e6-4              8.42s ± 0%   8.44s ± 0%  +0.28%  (p=0.000 n=8+9)
Stable1e6-4            43.3s ± 0%   42.5s ± 1%  -1.89%  (p=0.000 n=9+9)

Change-Id: I827559aa557fdba211a38ce3f77137b471c5c67e
Reviewed-on: https://go-review.googlesource.com/37611
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-03-02 05:15:32 +00:00
Keith Randall 1eed80f09a cmd/compile: fix disassembly of invalid instructions
Make sure that if we encode an explicit base register, we print it.
That will ensure that if we make an Addr with an auto variable but
a base that isn't SP, then it will be obvious from the disassembly.

Update #19184

Change-Id: If5556a5183f344d719ec7197aa935a0166061e6f
Reviewed-on: https://go-review.googlesource.com/37255
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-03-01 21:30:49 +00:00
Heschi Kreinick ac7761e1a4 cmd/compile, cmd/asm: remove Link.Plists
Link.Plists never contained more than one Plist, and sometimes none.
Passing around the Plist being worked on is straightforward and makes
the data flow easier to follow.

Change-Id: I79cb30cb2bd3d319fdbb1dfa5d35b27fcb748e5c
Reviewed-on: https://go-review.googlesource.com/37169
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-03-01 00:29:23 +00:00