Commit Graph

938 Commits

Author SHA1 Message Date
isharipo b15e8babc8 cmd/asm: add amd64 PALIGNR instruction
3rd change out of 3 to cover AMD64 SSSE3 instruction set in Go asm.
This commit adds instruction that do require new ytab variable.

Change-Id: I0bc7d9401c9176eb3760c3d59494ef082e97af84
Reviewed-on: https://go-review.googlesource.com/56870
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Russ Cox <rsc@golang.org>
2017-09-06 15:37:20 +00:00
isharipo 4074e4e5be cmd/asm: add amd64 CLFLUSH instruction
This is the last instruction I found missing in SSE2 set.

It does not reuse 'yprefetch' ytabs due to differences in
operands SRC/DST roles:
- PREFETCHx: ModRM:r/m(r) -> FROM
- CLFLUSH:   ModRM:r/m(w) -> TO

unaryDst map is extended accordingly.

Change-Id: I89e34ebb81cc0ee5f9ebbb1301bad417f7ee437f
Reviewed-on: https://go-review.googlesource.com/56833
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Russ Cox <rsc@golang.org>
2017-09-06 15:37:00 +00:00
isharipo 26dadbe32c cmd/asm: add amd64 PAB{SB,SD,SW}, PMADDUBSW, PMULHRSW, PSIG{NB,ND,NW}
instructions

1st change out of 3 to cover AMD64 SSSE3 instruction set in Go asm.
This commit adds instructions that do not require new named ytab sets.

Change-Id: I0c3dfd8d39c3daa8b7683ab163c63145626d042e
Reviewed-on: https://go-review.googlesource.com/56834
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Russ Cox <rsc@golang.org>
2017-09-06 15:35:25 +00:00
Heschi Kreinick 37dbfc51b0 cmd/link: emit DW_AT_data_member_location as a constant
Simplify the DWARF representation of structs by emitting field offsets
as constants rather than location descriptions.

This was not explicitly mentioned as an option in DWARF2. It is
mentioned in DWARF4, but isn't listed in the changes, so it's not clear
if this was always intended to work or is an undocumented change. Either
way, it should be valid DWARF4.

Change-Id: Idf7fdd397a21c8f8745673ecc77ef65afa3ffe1c
Reviewed-on: https://go-review.googlesource.com/51611
Run-TryBot: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2017-08-31 19:42:23 +00:00
Ben Shi 074547a5ce cmd/internal/obj/arm: support more ARM VFP instructions
Add support of more ARM VFP instructions in the assembler.
They were introduced in ARM VFPv2.

"NMULF/NMULD   Fm, Fn, Fd": Fd = -Fn*Fm
"MULAF/MULAD   Fm, Fn, Fd": Fd = Fd + Fn*Fm
"NMULAF/NMULAD Fm, Fn, Fd": Fd = -(Fd + Fn*Fm)
"MULSF/MULSD   Fm, Fn, Fd": Fd = Fd - Fn*Fm
"NMULSF/NMULSD Fm, Fn, Fd": Fd = -(Fd - Fn*Fm)

Change-Id: Icd302676ca44a9f5f153fce734225299403c4163
Reviewed-on: https://go-review.googlesource.com/60170
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-08-31 13:42:17 +00:00
Carlos Eduardo Seo 526f3420c2 cmd/asm, cmd/internal/obj/ppc64: add ISA 3.0 instructions
This change adds new ppc64 instructions from the POWER9 ISA. This includes
compares, loads, maths, register moves and the new random number generator and
copy/paste facilities.

Change-Id: Ife3720b90f5af184ff115bbcdcbce5c1302d39b6
Reviewed-on: https://go-review.googlesource.com/53930
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
2017-08-29 12:28:38 +00:00
Keith Randall 053840dc00 cmd/compile: avoid generating large offsets
The assembler barfs on large offsets. Make sure that all the
instructions that need to have their offsets in an int32
  1) check on any rule that computes offsets for such instructions
  2) change their aux fields so the check builder checks it.

The assembler also silently misassembled offsets between 1<<31
and 1<<32. Add a check in the assembler to barf on those as well.

Fixes #21655

Change-Id: Iebf24bf10f9f37b3ea819ceb7d588251c0f46d7d
Reviewed-on: https://go-review.googlesource.com/59630
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2017-08-28 22:02:59 +00:00
Wei Xiao c195deb48c cmd/internal/objfile: add arm64 disassembler support
Fixes #19157

Change-Id: Ieea286e8dc03929c3645f3113c33df569f8e26f3
Reviewed-on: https://go-review.googlesource.com/58930
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-08-28 14:09:35 +00:00
Daniel Martí 99da8730b0 all: remove some double spaces from comments
Went mainly for the ones that make no sense, such as the ones
mid-sentence or after commas.

Change-Id: Ie245d2c19cc7428a06295635cf6a9482ade25ff0
Reviewed-on: https://go-review.googlesource.com/57293
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-08-26 15:09:09 +00:00
fanzha02 aea286b449 cmd/internal/obj/arm64: fix assemble fcsels/fcseld bug
The current code treats the type of SIMD&FP register as C_REG incorrectly.

The fix code converts C_REG type into C_FREG type.

Uncomment fcsels/fcseld test cases.

Fixes #21582
Change-Id: I754c51f72a0418bd352cbc0f7740f14cc599c72d
Reviewed-on: https://go-review.googlesource.com/58350
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-08-25 20:43:03 +00:00
Heschi Kreinick 38bd725bf1 cmd/compile: bug fixes for DWARF location lists
Fix two small but serious bugs in the DWARF location list code that
should have been caught by the automated tests I didn't write.

After emitting debug information for a user variable, mark it as done
so that it doesn't get emitted again. Otherwise it would be written once
per slot it was decomposed into.

Correct a merge error in CL 44350: the location list abbreviations need
to have DW_AT_decl_line too, otherwise the resulting DWARF is gibberish.

Change-Id: I6ab4b8b32b7870981dac80eadf0ebfc4015ccb01
Reviewed-on: https://go-review.googlesource.com/59070
Run-TryBot: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2017-08-25 20:37:32 +00:00
Wei Xiao c02fc1605a cmd/compile: memory clearing optimization for arm64
Use "STP (ZR, ZR), O(R)" instead of "MOVD ZR, O(R)" to implement memory clearing.
Also improve assembler supports to STP/LDP.
Results (A57@2GHzx8):

benchmark                   old ns/op     new ns/op     delta
BenchmarkClearFat8-8        1.00          1.00          +0.00%
BenchmarkClearFat12-8       1.01          1.01          +0.00%
BenchmarkClearFat16-8       1.01          1.01          +0.00%
BenchmarkClearFat24-8       1.52          1.52          +0.00%
BenchmarkClearFat32-8       3.00          2.02          -32.67%
BenchmarkClearFat40-8       3.50          2.52          -28.00%
BenchmarkClearFat48-8       3.50          3.03          -13.43%
BenchmarkClearFat56-8       4.00          3.50          -12.50%
BenchmarkClearFat64-8       4.25          4.00          -5.88%
BenchmarkClearFat128-8      8.01          8.01          +0.00%
BenchmarkClearFat256-8      16.1          16.0          -0.62%
BenchmarkClearFat512-8      32.1          32.0          -0.31%
BenchmarkClearFat1024-8     64.1          64.1          +0.00%

Change-Id: Ie5f5eac271ff685884775005825f206167a5c146
Reviewed-on: https://go-review.googlesource.com/55610
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-08-25 20:09:06 +00:00
fanzha02 8f1e2a2610 cmd/internal/obj/arm64: fix assemble fcmp/fcmpe bug
The current code treats floating-point constant as integer
and does not treat fcmp/fcmpe as the comparison instrucitons
that requires special handling.

The fix corrects the type of immediate arguments and adds fcmp/fcmpe
in the special handing.

Uncomment the fcmp/fcmpe cases.

Fixes #21567
Change-Id: I6782520e2770f6ce70270b667dd5e68f71e2d5ad
Reviewed-on: https://go-review.googlesource.com/57852
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-08-23 15:59:34 +00:00
Agniva De Sarker ea5e3bd2a1 all: fix easy-to-miss typos
Using the wonderful https://github.com/client9/misspell tool.

Change-Id: Icdbc75a5559854f4a7a61b5271bcc7e3f99a1a24
Reviewed-on: https://go-review.googlesource.com/57851
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-08-23 03:07:12 +00:00
Heschi Kreinick 4a1be1e1da cmd/compile: emit DW_AT_decl_line
Some debuggers use the declaration line to avoid showing variables
before they're declared. Emit them for local variables and function
parameters.

DW_AT_decl_file would be nice too, but since its value is an index
into a table built by the linker, that's dramatically harder. In
practice, with inlining disabled it's safe to assume that all a
function's variables are declared in the same file, so this should still
be pretty useful.

Change-Id: I8105818c8940cd71bc5473ec98797cce2f3f9872
Reviewed-on: https://go-review.googlesource.com/44350
Reviewed-by: David Chase <drchase@google.com>
2017-08-22 18:05:53 +00:00
fanzha02 bdd7c01b55 cmd/internal/obj/arm64: fix assemble movk bug
The current code gets shift arguments value from prog.From3.Offset.
But prog.From3.Offset is not assigned the shift arguments value in
instructions assemble process.

The fix calls movcon() function to get the correct value.

Uncomment the movk/movkw  cases.

Fixes #21398
Change-Id: I78d40c33c24bd4e3688a04622e4af7ddb5333fa6
Reviewed-on: https://go-review.googlesource.com/54990
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-08-22 13:10:08 +00:00
Ben Shi 9bf521b2b4 cmd/internal/obj/arm: support BFX/BFXU instructions
BFX extracts given bits from the source register, sign extends them
to 32-bit, and writes to destination register. BFXU does the similar
operation with zero extention.

They were introduced in ARMv6T2.

Change-Id: I6822ebf663497a87a662d3645eddd7c611de2b1e
Reviewed-on: https://go-review.googlesource.com/56071
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-08-21 16:29:59 +00:00
Daniel Martí 943dd0fe33 cmd/*: remove negative uint checks
All of these are uints of different sizes, so checking >= 0 or < 0 are
effectively no-ops.

Found with staticcheck.

Change-Id: I16ac900eb7007bc8f9018b302136d42e483a4180
Reviewed-on: https://go-review.googlesource.com/56950
Reviewed-by: Matt Layher <mdlayher@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Matt Layher <mdlayher@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-08-18 18:45:05 +00:00
Ben Shi 75cb22cb2f cmd/internal/obj/arm: support new arm instructions
There are two changes in this CL.

1. Add new forms of MOVH/MOVHS/MOVHU.
   MOVHS R0<<0(R1), R2   // ldrsh
   MOVH  R0<<0(R1), R2   // ldrsh
   MOVHU R0<<0(R1), R2   // ldrh
   MOVHS R2, R5<<0(R1)   // strh
   MOVH  R2, R5<<0(R1)   // strh
   MOVHU R2, R5<<0(R1)   // strh

2. Simpify "MVN $0xffffffaa, Rn" to "MOVW $0x55, Rn".
   It is originally assembled to two instructions.
   "MOVW offset(PC), R11"
   "MVN R11, Rn"

Change-Id: I8e863dcfb2bd8f21a04c5d627fa7beec0afe65fb
Reviewed-on: https://go-review.googlesource.com/53690
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-08-18 14:13:41 +00:00
Hiroshi Ioka 0d65cd6c1c cmd/internal/obj/x86: don't apply workaround for solaris to darwin
Currently, we have a workaround for solaris that enforce aboslute
addressing for external symbols. However, We don't want to use the
workaround for darwin.
This CL also refactors code a little bit, because the original function
name is not appropriate now.

Updates #17490

Change-Id: Id21f9cdf33dca6a40647226be49010c2c324ee24
Reviewed-on: https://go-review.googlesource.com/54871
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-08-17 00:43:00 +00:00
ph 4282ba0a65 cmd/internal/obj/arm64: improve arm64 wrapper prologue
Improve static branch prediction in arm64 wrapper prologue
by making the unusual case branch forwards. (Most other
architectures implement this optimization.)

Additionally, replace a CMP+BNE pair with a CBNZ
to save one instruction.

Change-Id: Id970038b34b4aaec18c101d62e2ee00f3e32a761
Reviewed-on: https://go-review.googlesource.com/54070
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-08-15 14:01:54 +00:00
Keith Randall f4abbc0e61 cmd/link,compile: Provide size for func types
They are currently not given a size, which makes the DWARF reader
very confused. Particularly things like [4]func() get a size of -4, not 32.

Fixes #21097

Change-Id: I01e754134d82fbbe6567e3c7847a4843792a3776
Reviewed-on: https://go-review.googlesource.com/55551
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-08-14 23:53:27 +00:00
Austin Clements 6f6a9398e2 Merge remote-tracking branch 'origin/dev.debug' into master
Change-Id: I85df2745af666b533f4f6f1d06f7c8e137590b5b
2017-08-11 12:17:43 -04:00
Cherry Zhang f20944de78 cmd/compile: set/unset base register for better assembly print
For address of an auto or arg, on all non-x86 architectures
the assembler backend encodes the actual SP offset in the
instruction but leaves the offset in Prog unchanged. When the
assembly is printed in compile -S, it shows an offset
relative to pseudo FP/SP with an actual hardware SP base
register (e.g. R13 on ARM). This is confusing. Unset the
base register if it is indeed SP, so the assembly output is
consistent. If the base register isn't SP, it should be an
error and the error output contains the actual base register.

For address loading instructions, the base register isn't set
in the compiler on non-x86 architectures. Set it. Normally it
is SP and will be unset in the change mentioned above for
printing. If it is not, it will be an error and the error
output contains the actual base register.

No change in generated binary, only printed assembly. Passes
"go build -a -toolexec 'toolstash -cmp' std cmd" on all
architectures.

Fixes #21064.

Change-Id: Ifafe8d5f9b437efbe824b63b3cbc2f5f6cdc1fd5
Reviewed-on: https://go-review.googlesource.com/49432
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2017-08-02 12:24:02 +00:00
Heschi Kreinick 4c54a047c6 [dev.debug] cmd/compile: better DWARF with optimizations on
Debuggers use DWARF information to find local variables on the
stack and in registers. Prior to this CL, the DWARF information for
functions claimed that all variables were on the stack at all times.
That's incorrect when optimizations are enabled, and results in
debuggers showing data that is out of date or complete gibberish.

After this CL, the compiler is capable of representing variable
locations more accurately, and attempts to do so. Due to limitations of
the SSA backend, it's not possible to be completely correct.

There are a number of problems in the current design. One of the easier
to understand is that variable names currently must be attached to an
SSA value, but not all assignments in the source code actually result
in machine code. For example:

  type myint int
  var a int
  b := myint(int)
and
  b := (*uint64)(unsafe.Pointer(a))

don't generate machine code because the underlying representation is the
same, so the correct value of b will not be set when the user would
expect.

Generating the more precise debug information is behind a flag,
dwarflocationlists. Because of the issues described above, setting the
flag may not make the debugging experience much better, and may actually
make it worse in cases where the variable actually is on the stack and
the more complicated analysis doesn't realize it.

A number of changes are included:
- Add a new pseudo-instruction, RegKill, which indicates that the value
in the register has been clobbered.
- Adjust regalloc to emit RegKills in the right places. Significantly,
this means that phis are mixed with StoreReg and RegKills after
regalloc.
- Track variable decomposition in ssa.LocalSlots.
- After the SSA backend is done, analyze the result and build location
lists for each LocalSlot.
- After assembly is done, update the location lists with the assembled
PC offsets, recompose variables, and build DWARF location lists. Emit the
list as a new linker symbol, one per function.
- In the linker, aggregate the location lists into a .debug_loc section.

TODO:
- currently disabled for non-X86/AMD64 because there are no data tables.

go build -toolexec 'toolstash -cmp' -a std succeeds.

With -dwarflocationlists false:
before: f02812195637909ff675782c0b46836a8ff01976
after:  06f61e8112a42ac34fb80e0c818b3cdb84a5e7ec
benchstat -geomean  /tmp/220352263 /tmp/621364410
completed   15 of   15, estimated time remaining 0s (eta 3:52PM)
name        old time/op       new time/op       delta
Template          199ms ± 3%        198ms ± 2%     ~     (p=0.400 n=15+14)
Unicode          96.6ms ± 5%       96.4ms ± 5%     ~     (p=0.838 n=15+15)
GoTypes           653ms ± 2%        647ms ± 2%     ~     (p=0.102 n=15+14)
Flate             133ms ± 6%        129ms ± 3%   -2.62%  (p=0.041 n=15+15)
GoParser          164ms ± 5%        159ms ± 3%   -3.05%  (p=0.000 n=15+15)
Reflect           428ms ± 4%        422ms ± 3%     ~     (p=0.156 n=15+13)
Tar               123ms ±10%        124ms ± 8%     ~     (p=0.461 n=15+15)
XML               228ms ± 3%        224ms ± 3%   -1.57%  (p=0.045 n=15+15)
[Geo mean]        206ms             377ms       +82.86%

name        old user-time/op  new user-time/op  delta
Template          292ms ±10%        301ms ±12%     ~     (p=0.189 n=15+15)
Unicode           166ms ±37%        158ms ±14%     ~     (p=0.418 n=15+14)
GoTypes           962ms ± 6%        963ms ± 7%     ~     (p=0.976 n=15+15)
Flate             207ms ±19%        200ms ±14%     ~     (p=0.345 n=14+15)
GoParser          246ms ±22%        240ms ±15%     ~     (p=0.587 n=15+15)
Reflect           611ms ±13%        587ms ±14%     ~     (p=0.085 n=15+13)
Tar               211ms ±12%        217ms ±14%     ~     (p=0.355 n=14+15)
XML               335ms ±15%        320ms ±18%     ~     (p=0.169 n=15+15)
[Geo mean]        317ms             583ms       +83.72%

name        old alloc/op      new alloc/op      delta
Template         40.2MB ± 0%       40.2MB ± 0%   -0.15%  (p=0.000 n=14+15)
Unicode          29.2MB ± 0%       29.3MB ± 0%     ~     (p=0.624 n=15+15)
GoTypes           114MB ± 0%        114MB ± 0%   -0.15%  (p=0.000 n=15+14)
Flate            25.7MB ± 0%       25.6MB ± 0%   -0.18%  (p=0.000 n=13+15)
GoParser         32.2MB ± 0%       32.2MB ± 0%   -0.14%  (p=0.003 n=15+15)
Reflect          77.8MB ± 0%       77.9MB ± 0%     ~     (p=0.061 n=15+15)
Tar              27.1MB ± 0%       27.0MB ± 0%   -0.11%  (p=0.029 n=15+15)
XML              42.7MB ± 0%       42.5MB ± 0%   -0.29%  (p=0.000 n=15+15)
[Geo mean]       42.1MB            75.0MB       +78.05%

name        old allocs/op     new allocs/op     delta
Template           402k ± 1%         398k ± 0%   -0.91%  (p=0.000 n=15+15)
Unicode            344k ± 1%         344k ± 0%     ~     (p=0.715 n=15+14)
GoTypes           1.18M ± 0%        1.17M ± 0%   -0.91%  (p=0.000 n=15+14)
Flate              243k ± 0%         240k ± 1%   -1.05%  (p=0.000 n=13+15)
GoParser           327k ± 1%         324k ± 1%   -0.96%  (p=0.000 n=15+15)
Reflect            984k ± 1%         982k ± 0%     ~     (p=0.050 n=15+15)
Tar                261k ± 1%         259k ± 1%   -0.77%  (p=0.000 n=15+15)
XML                411k ± 0%         404k ± 1%   -1.55%  (p=0.000 n=15+15)
[Geo mean]         439k              755k       +72.01%

name        old text-bytes    new text-bytes    delta
HelloSize         694kB ± 0%        694kB ± 0%   -0.00%  (p=0.000 n=15+15)

name        old data-bytes    new data-bytes    delta
HelloSize        5.55kB ± 0%       5.55kB ± 0%     ~     (all equal)

name        old bss-bytes     new bss-bytes     delta
HelloSize         133kB ± 0%        133kB ± 0%     ~     (all equal)

name        old exe-bytes     new exe-bytes     delta
HelloSize        1.04MB ± 0%       1.04MB ± 0%     ~     (all equal)

Change-Id: I991fc553ef175db46bb23b2128317bbd48de70d8
Reviewed-on: https://go-review.googlesource.com/41770
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-07-27 20:19:44 +00:00
Heschi Kreinick cd702b171c [dev.debug] cmd/internal/dwarf: add DWARF abbrevs with location lists
Location lists require new DWARF abbrev entries. Add them before
CL 41770 to enable binary comparison.

Change-Id: If99461f6896db902f2774e0718065eb3d3522026
Reviewed-on: https://go-review.googlesource.com/50881
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-07-26 18:39:57 +00:00
Heschi Kreinick 045f605ea1 [dev.debug] cmd/compile: rename dwarf.Var.Offset to StackOffset
After we track decomposition, offset could mean stack offset or offset
in recomposed variable. Disambiguate.

Change-Id: I4d810b8c0dcac7a4ec25ac1e52898f55477025df
Reviewed-on: https://go-review.googlesource.com/50875
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-07-25 19:33:14 +00:00
Russ Cox f0cf740733 cmd/compile: omit X:framepointer in compile version
Framepointer is the default now. Only print an X: list
if the settings are _not_ the default.

Before:

$ go tool compile -V
compile version devel +a5f30d9508 Sun Jul 16 14:43:48 2017 -0400 X:framepointer
$ go1.8 tool compile -V
compile version go1.8 X:framepointer
$

After:

$ go tool compile -V
compile version devel +a5f30d9508 Sun Jul 16 14:43:48 2017 -0400
$ go1.9 tool compile -V # imagined
compile version go1.9
$

Perpetuates #18317.

Change-Id: I981ba5c62be32e650a166fc9740703122595639b
Reviewed-on: https://go-review.googlesource.com/49252
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-07-17 16:36:49 +00:00
Ben Shi ef26021d30 cmd/internal/obj/arm: check illegal base registers in ARM instructions
Wrong instructions "MOVW 8(F0), R1" and "MOVW R0<<0(F1), R1"
are silently accepted, and all Fx are treated as Rx.

The patch checks all those illegal base registers.

fixes #20724

Change-Id: I05d41bb43fe774b023205163b7daf4a846e9dc88
Reviewed-on: https://go-review.googlesource.com/46132
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-06-30 19:09:44 +00:00
fanzha02 990dac2723 cmd/internal/obj/arm64: fix assemble LDXP bug
The current code calculates register number incorrectly.

The fix corrects the register number calculation.

Add cases created by decoder to test assembler.

Fixes #20697
Fixes #20723

Change-Id: I73ac153df9ea9f51c43a5104828d7a5389551c92
Reviewed-on: https://go-review.googlesource.com/45850
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-06-30 18:24:58 +00:00
Ben Shi 3785457c76 cmd/internal/obj/arm: fix wrong encoding of MULBB
"MULBB R1, R2, R3" is encoded to 0xe163f182, which should be
0xe1630182.

This patch fix it.

fix #20764

Change-Id: I9d3c3ffa40ecde86638e5e083eacc67578caebf4
Reviewed-on: https://go-review.googlesource.com/46491
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-06-23 18:08:20 +00:00
Ben Shi e00a38c89a cmd/internal/obj/arm: fix setting U bit in shifted register offset of MOVBS
"MOVBS.U R0<<0(R1), R2" is assembled to 0xe19120d0 (ldrsb r2, [r1, r0]),
but it is expected to be 0xe11120d0 (ldrsb r2, [r1, -r0]).

This patch fixes it and adds more encoding tests.

fixes #20701

Change-Id: Ic1fb46438d71a978dbef06d97494a70c95fcbf3a
Reviewed-on: https://go-review.googlesource.com/45996
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-06-23 14:29:57 +00:00
Keith Randall 79d05e75ca runtime: restore arm assembly stubs for div/mod
These are used by DIV[U] and MOD[U] assembly instructions.
Add a test in the stdlib so we actually exercise linking
to these routines.

Update #19507

Change-Id: I0d8e19a53e3744abc0c661ea95486f94ec67585e
Reviewed-on: https://go-review.googlesource.com/45703
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-06-15 03:51:03 +00:00
Keith Randall 4958f9e2fe runtime: remove unused arm assembly for div/mod
Also add runtime· prefixes to the code that is still used.

Fixes #19507

Change-Id: Ib6da6b2a9e398061d3f93958ee1258295b6cc33b
Reviewed-on: https://go-review.googlesource.com/45699
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-06-14 18:00:26 +00:00
Ben Shi a38c8dfa44 cmd/internal/obj/arm: fix MOVW to/from FPSR
"MOVW FPSR, g" should be assembled to 0xeef1aa10, but actually
0xee30a110 (RFS). "MOVW g, FPSR" should be 0xeee1aa10, but actually
0xee20a110 (WFS). They should be updated to VFP forms, since the ARM
back end doesn't support non-VFP floating points.

The patch fixes them and adds more assembly encoding tests.

fixes #20643

Change-Id: I3b29490337c6e8d891b400fcedc8b0a87b82b527
Reviewed-on: https://go-review.googlesource.com/45276
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-06-13 14:06:58 +00:00
Ben Shi d3d5489135 cmd/internal/obj/arm: fix encoding of move register/immediate to CPSR
"MOVW R1, CPSR" is assembled to 0xe129f001, which should be 0xe12cf001.
"MOVW $255, CPSR" is assembled to 0xe329f0ff, which should be 0xe32cf0ff.

This patch fixes them and adds more assembly encoding tests.

fix #20626

Change-Id: Iefc945879ea774edf40438ce39f52c144e1501a1
Reviewed-on: https://go-review.googlesource.com/45170
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-06-09 14:14:30 +00:00
Cherry Zhang 0aede73917 cmd/internal/obj/arm: don't split instructions on NaCl
We insert guard instructions after each "dangerous" instruction
to make NaCl's validator happy. This happens before asmout. If
in asmout an instruction is split to two dangerous instructions,
but only one guard instruction is inserted, the validation fails.
Therefore don't split instructions on NaCl.

Fixes #20595.

Change-Id: Ie34f209bc7d907d6d16ecef6721f88420981ac01
Reviewed-on: https://go-review.googlesource.com/45021
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-06-07 17:33:46 +00:00
Ian Lance Taylor 555d1e36f9 cmd/internal/obj/ppc64: fix MOVFL REG, CONST
The MOVFL instruction (which external PPC64 docs call mtcrf) can take
either a CR register or a constant. It doesn't make sense to specify
both, as the CR register implies the constant value. Specifying either
a register or a constant is enforced by the implementation in the
asmout method (case 69).

However, the optab was providing a form that specified both a constant
and a CR register, and was not providing a form that specified only a
constant. This CL fixes the optab table to provide a form that takes
only a constant.

No test because I don't know where to write it. The next CL in this
series will use the new instruction format.

Change-Id: I8bb5d3ed60f483b54c341ce613931e126f7d7be6
Reviewed-on: https://go-review.googlesource.com/44732
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
2017-06-05 19:59:33 +00:00
Ben Shi c8ab8c1f99 cmd/internal/obj/arm: fix constant decomposition
There are two issues in constant decomposition.

1. A typo in "func immrot2s" blocks "case 107" of []optab be triggered.

2. Though "ADD $0xffff, R0, R0" is decomposed to "ADD $0xff00, R0, R0" and
   "ADD $0x00ff, R0, R0" as expected, "ADD $0xffff, R0" still uses the
   constant pool, which should be the same as "ADD $0xffff, R0, R0".

This patch fixes them and adds more instruction encoding tests.

fix #20516

Change-Id: Icd7bdfa1946b29db15580dcb429111266f1384c6
Reviewed-on: https://go-review.googlesource.com/44335
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-06-05 17:03:54 +00:00
Heschi Kreinick b74f01d76f cmd/internal/dwarf: update to DWARF4, emit frame_base
In preparation for CL 41770, upgrade .debug_info to DWARF4, and emit
DW_AT_frame_base on subprograms. This should make no semantic
difference.

Also fix a long-standing bug/inconsistency in puttattr: it didn't
add the addend to ref_addrs. Previously this didn't matter because it
was only used for types, but now it's used for section offsets into
symbols that have multiple entries.

RELNOTE=yes

Change-Id: Ib10654ac92edfa29c5167c44133648151d70cf76
Reviewed-on: https://go-review.googlesource.com/44210
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
2017-05-26 21:29:21 +00:00
Ben Shi b8a4eb4bd8 cmd/internal/obj/arm: fix illegal forms of ARM VFP instruction
"ADDF F0, R1, F2" is silently accepted by the arm assembler and
assembled to the same binary code of "ADDF F0, F1, F2". So does
"CMPF F0, R1".

"ABSF F0, F1, F2" is also silently accepted and assembled to a
different instruction.

This patch reports those illegal forms and adds test cases.

fix #20464

Change-Id: I88b80dc29de24c6266ac7bf7bce1578c5adbc68c
Reviewed-on: https://go-review.googlesource.com/43931
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-05-25 14:32:35 +00:00
Ben Shi 5e79787935 cmd/internal/obj/arm: report invalid .S/.P/.W suffix in ARM instructions
Many instructions can not have a .S suffix, such as MULS, SWI, CLZ,
CMP, STREX and others. And so do .P and .W suffixes. Even wrong
assembly code is generated for some instructions with invalid
suffixes.

This patch tries to simplify .S/.W/.P checks. And a wrong assembly
test for arm is added.

fixes #20377

Change-Id: Iba1c99d9e6b7b16a749b4d93ca2102e17c5822fe
Reviewed-on: https://go-review.googlesource.com/43561
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-05-22 13:44:12 +00:00
Alessandro Arzilli 2ad41a3090 cmd/compile: output DWARF lexical blocks for local variables
Change compiler and linker to emit DWARF lexical blocks in .debug_info
section when compiling with -N -l.

Version of debug_info is updated from DWARF v2 to DWARF v3 since
version 2 does not allow lexical blocks with discontinuous PC ranges.

Remaining open problems:
- scope information is removed from inlined functions
- variables records do not have DW_AT_start_scope attributes so a
variable will shadow other variables with the same name as soon as its
containing scope begins, even before its declaration.

Updates #6913.
Updates #12899.

Change-Id: Idc6808788512ea20e7e45bcf782453acb416fb49
Reviewed-on: https://go-review.googlesource.com/40095
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-05-18 23:10:50 +00:00
Ben Shi c7cae34b19 cmd/internal/obj/arm: remove illegal form of the SWI instruction
SWI only support "SWI $imm", but currently "SWI (Reg)" is also
accepted. This patch fixes it.

And more instruction tests are added to cmd/asm/internal/asm/testdata/arm.s

fixes #20375

Change-Id: Id437d853924a403e41da9b6cbddd20d994b624ff
Reviewed-on: https://go-review.googlesource.com/43552
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-05-18 13:38:13 +00:00
Cherry Zhang b53acd89db cmd/internal/obj/mips: add support of LLV, SCV, NOOP instructions
LLV and SCV are 64-bit load-linked and store-conditional. They
were used in runtime as #define WORD. Change them to normal
instruction form.

NOOP is hardware no-op. It was written as WORD $0. Make a name
for it for better disassembly output.

Fixes #12561.
Fixes #18238.

Change-Id: I82c667ce756fa83ef37b034b641e8c4366335e83
Reviewed-on: https://go-review.googlesource.com/40297
Reviewed-by: Minux Ma <minux@golang.org>
Run-TryBot: Minux Ma <minux@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-05-16 17:15:11 +00:00
Wei Xiao b2363ee9f6 cmd/internal/objabi: fix the bug of shrinking SymType down to a uint8
Previous CL (cmd/internal/objabi: shrink SymType down to a uint8) shrinks
SymType down to a uint8 but forgot making according change in goobj.

Fixes #20296
Also add a test to catch such Goobj format inconsistency bug

Change-Id: Ib43dd7122cfcacf611a643814e95f8c5a924941f
Reviewed-on: https://go-review.googlesource.com/42971
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-05-16 12:26:10 +00:00
Ben Shi 6897030fe3 cmd/internal/obj: continue to optimize ARM's constant pool
Both Keith's https://go-review.googlesource.com/c/41612/ and
and Ben's https://go-review.googlesource.com/c/41679/ optimized ARM's
constant pool. But neither was complete.

First, BIC was forgotten.
1. "BIC $0xff00ff00, Reg" can be optimized to
   "BIC $0xff000000, Reg
    BIC $0x0000ff00, Reg"
2. "BIC $0xffff00ff, Reg" can be optimized to
   "AND $0x0000ff00, Reg"
3. "AND $0xffff00ff, Reg" can be optimized to
   "BIC $0x0000ff00, Reg"

Second, break a non-ARMImmRot to the subtraction of two ARMImmRots was
left as TODO.
1. "ADD $0x00fffff0, Reg" can be optimized to
   "ADD $0x01000000, Reg
    SUB $0x00000010, Reg"
2. "SUB $0x00fffff0, Reg" can be optimized to
   "SUB $0x01000000, Reg
    ADD $0x00000010, Reg"

This patch fixes them and issue #19844.

The go1 benchmark shows improvements.

name                     old time/op    new time/op    delta
BinaryTree17-4              41.4s ± 1%     41.7s ± 1%  +0.54%  (p=0.000 n=50+49)
Fannkuch11-4                24.7s ± 1%     25.1s ± 0%  +1.70%  (p=0.000 n=50+49)
FmtFprintfEmpty-4           853ns ± 1%     852ns ± 1%    ~     (p=0.833 n=50+50)
FmtFprintfString-4         1.33µs ± 1%    1.33µs ± 1%    ~     (p=0.163 n=50+50)
FmtFprintfInt-4            1.40µs ± 1%    1.40µs ± 0%    ~     (p=0.293 n=50+35)
FmtFprintfIntInt-4         2.09µs ± 1%    2.08µs ± 1%  -0.39%  (p=0.000 n=50+49)
FmtFprintfPrefixedInt-4    2.43µs ± 1%    2.43µs ± 1%    ~     (p=0.552 n=50+50)
FmtFprintfFloat-4          4.57µs ± 1%    4.42µs ± 1%  -3.18%  (p=0.000 n=50+50)
FmtManyArgs-4              8.62µs ± 1%    8.52µs ± 0%  -1.08%  (p=0.000 n=50+50)
GobDecode-4                 101ms ± 1%     101ms ± 2%  +0.45%  (p=0.001 n=49+49)
GobEncode-4                90.7ms ± 1%    91.1ms ± 2%  +0.51%  (p=0.001 n=50+50)
Gzip-4                      4.23s ± 1%     4.21s ± 1%  -0.62%  (p=0.000 n=50+50)
Gunzip-4                    623ms ± 1%     619ms ± 0%  -0.63%  (p=0.000 n=50+42)
HTTPClientServer-4          721µs ± 5%     683µs ± 3%  -5.25%  (p=0.000 n=50+47)
JSONEncode-4                251ms ± 1%     253ms ± 1%  +0.54%  (p=0.000 n=49+50)
JSONDecode-4                941ms ± 1%     944ms ± 1%  +0.30%  (p=0.001 n=49+50)
Mandelbrot200-4            49.3ms ± 1%    49.3ms ± 0%    ~     (p=0.918 n=50+48)
GoParse-4                  47.1ms ± 1%    47.2ms ± 1%  +0.18%  (p=0.025 n=50+50)
RegexpMatchEasy0_32-4      1.23µs ± 1%    1.24µs ± 1%  +0.30%  (p=0.000 n=49+50)
RegexpMatchEasy0_1K-4      7.74µs ± 7%    7.76µs ± 5%    ~     (p=0.888 n=50+50)
RegexpMatchEasy1_32-4      1.32µs ± 1%    1.32µs ± 1%  +0.23%  (p=0.003 n=50+50)
RegexpMatchEasy1_1K-4      10.6µs ± 2%    10.5µs ± 3%  -1.29%  (p=0.000 n=49+50)
RegexpMatchMedium_32-4     2.19µs ± 1%    2.10µs ± 1%  -3.79%  (p=0.000 n=49+49)
RegexpMatchMedium_1K-4      544µs ± 0%     545µs ± 0%    ~     (p=0.123 n=41+50)
RegexpMatchHard_32-4       28.8µs ± 0%    28.8µs ± 1%    ~     (p=0.580 n=46+50)
RegexpMatchHard_1K-4        863µs ± 1%     865µs ± 1%  +0.31%  (p=0.027 n=47+50)
Revcomp-4                  82.2ms ± 2%    82.3ms ± 2%    ~     (p=0.894 n=48+49)
Template-4                  1.06s ± 1%     1.04s ± 1%  -1.18%  (p=0.000 n=50+49)
TimeParse-4                7.25µs ± 1%    7.35µs ± 0%  +1.48%  (p=0.000 n=50+50)
TimeFormat-4               13.3µs ± 1%    13.2µs ± 1%  -0.13%  (p=0.007 n=50+50)
[Geo mean]                  736µs          733µs       -0.37%

name                     old speed      new speed      delta
GobDecode-4              7.60MB/s ± 1%  7.56MB/s ± 2%  -0.46%  (p=0.001 n=49+49)
GobEncode-4              8.47MB/s ± 1%  8.42MB/s ± 2%  -0.50%  (p=0.001 n=50+50)
Gzip-4                   4.58MB/s ± 1%  4.61MB/s ± 1%  +0.59%  (p=0.000 n=50+50)
Gunzip-4                 31.2MB/s ± 1%  31.4MB/s ± 0%  +0.63%  (p=0.000 n=50+42)
JSONEncode-4             7.73MB/s ± 1%  7.69MB/s ± 1%  -0.53%  (p=0.000 n=49+50)
JSONDecode-4             2.06MB/s ± 1%  2.06MB/s ± 1%    ~     (p=0.052 n=44+50)
GoParse-4                1.23MB/s ± 0%  1.23MB/s ± 2%    ~     (p=0.526 n=26+50)
RegexpMatchEasy0_32-4    25.9MB/s ± 1%  25.9MB/s ± 1%  -0.30%  (p=0.000 n=49+50)
RegexpMatchEasy0_1K-4     132MB/s ± 7%   132MB/s ± 6%    ~     (p=0.885 n=50+50)
RegexpMatchEasy1_32-4    24.2MB/s ± 1%  24.1MB/s ± 1%  -0.22%  (p=0.003 n=50+50)
RegexpMatchEasy1_1K-4    96.4MB/s ± 2%  97.8MB/s ± 3%  +1.36%  (p=0.000 n=50+50)
RegexpMatchMedium_32-4    460kB/s ± 0%   476kB/s ± 1%  +3.43%  (p=0.000 n=49+50)
RegexpMatchMedium_1K-4   1.88MB/s ± 0%  1.88MB/s ± 0%    ~     (all equal)
RegexpMatchHard_32-4     1.11MB/s ± 0%  1.11MB/s ± 1%  +0.34%  (p=0.000 n=45+50)
RegexpMatchHard_1K-4     1.19MB/s ± 1%  1.18MB/s ± 1%  -0.34%  (p=0.033 n=50+50)
Revcomp-4                30.9MB/s ± 2%  30.9MB/s ± 2%    ~     (p=0.894 n=48+49)
Template-4               1.84MB/s ± 1%  1.86MB/s ± 2%  +1.19%  (p=0.000 n=48+50)
[Geo mean]               6.63MB/s       6.65MB/s       +0.26%


Fixes #19844.

Change-Id: I5ad16cc0b29267bb4579aca3dcc10a0b8ade1aa4
Reviewed-on: https://go-review.googlesource.com/42430
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-05-11 13:53:54 +00:00
David Chase 41d0bbdc16 cmd/link: include DW_AT_producer in .debug_info
This can make life easier for Delve (and other debuggers),
and can help them with bug reports.

Sample producer field (from objdump):
<48> DW_AT_producer : Go cmd/compile devel +8a59dbf41a Mon May 8 16:02:44 2017 -0400

Change-Id: I0605843c959b53a60a25a3b870aa8755bf5d5b13
Reviewed-on: https://go-review.googlesource.com/33588
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-05-10 14:47:41 +00:00
Ian Lance Taylor 5331e7e9df cmd/internal/obj, cmd/link: fix st_other field on PPC64
In PPC64 ELF files, the st_other field indicates the number of
prologue instructions between the global and local entry points.
We add the instructions in the compiler and assembler if -shared is used.
We were assuming that the instructions were present when building a
c-archive or PIE or doing dynamic linking, on the assumption that those
are the cases where the go tool would be building with -shared.
That assumption fails when using some other tool, such as Bazel,
that does not necessarily use -shared in exactly the same way.

This CL records in the object file whether a symbol was compiled
with -shared (this will be the same for all symbols in a given compilation)
and uses that information when setting the st_other field.

Fixes #20290.

Change-Id: Ib2b77e16aef38824871102e3c244fcf04a86c6ea
Reviewed-on: https://go-review.googlesource.com/43051
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-05-09 23:36:51 +00:00
Cherry Zhang fb0ccc5d0a cmd/internal/obj/arm64, cmd/compile: improve offset folding on ARM64
ARM64 assembler backend only accepts loads and stores with small
or aligned offset. The compiler therefore can only fold small or
aligned offsets into loads and stores. For locals and args, their
offsets to SP are not known until very late, and the compiler
makes conservative decision not folding some of them. However,
in most cases, the offset is indeed small or aligned, and can
be folded into load and store (but actually not).

This CL adds support of loads and stores with large and unaligned
offsets. When the offset doesn't fit into the instruction, it
uses two instructions and (for very large offset) the constant
pool. This way, the compiler doesn't need to be conservative,
and can simply fold the offset.

To make it work, the assembler's optab matching rules need to be
changed. Before, MOVD accepts C_UAUTO32K which matches multiple
of 8 between 0 and 32K, and also C_UAUTO16K, which may not be
multiple of 8 and does not fit into MOVD instruction. The
assembler errors in the latter case. This change makes it only
matches multiple of 8 (or offsets within ±256, which also fits
in instruction), and uses the large-or-unaligned-offset rule
for things doesn't fit (without error). Other sized move rules
are changed similarly.

Class C_UAUTO64K and C_UOREG64K are removed, as they are never
used.

In shared library, load/store of global is rewritten to using
GOT and temp register, which conflicts with the use of temp
register for assembling large offset. So the folding is disabled
for globals in shared library mode.

Reduce cmd/go binary size by 2%.

name                     old time/op    new time/op    delta
BinaryTree17-8              8.67s ± 0%     8.61s ± 0%   -0.60%  (p=0.000 n=9+10)
Fannkuch11-8                6.24s ± 0%     6.19s ± 0%   -0.83%  (p=0.000 n=10+9)
FmtFprintfEmpty-8           116ns ± 0%     116ns ± 0%     ~     (all equal)
FmtFprintfString-8          196ns ± 0%     192ns ± 0%   -1.89%  (p=0.000 n=10+10)
FmtFprintfInt-8             199ns ± 0%     198ns ± 0%   -0.35%  (p=0.001 n=9+10)
FmtFprintfIntInt-8          294ns ± 0%     293ns ± 0%   -0.34%  (p=0.000 n=8+8)
FmtFprintfPrefixedInt-8     318ns ± 1%     318ns ± 1%     ~     (p=1.000 n=10+10)
FmtFprintfFloat-8           537ns ± 0%     531ns ± 0%   -1.17%  (p=0.000 n=9+10)
FmtManyArgs-8              1.19µs ± 1%    1.18µs ± 1%   -1.41%  (p=0.001 n=10+10)
GobDecode-8                17.2ms ± 1%    17.3ms ± 2%     ~     (p=0.165 n=10+10)
GobEncode-8                14.7ms ± 1%    14.7ms ± 2%     ~     (p=0.631 n=10+10)
Gzip-8                      837ms ± 0%     836ms ± 0%   -0.14%  (p=0.006 n=9+10)
Gunzip-8                    141ms ± 0%     139ms ± 0%   -1.24%  (p=0.000 n=9+10)
HTTPClientServer-8          256µs ± 1%     253µs ± 1%   -1.35%  (p=0.000 n=10+10)
JSONEncode-8               40.1ms ± 1%    41.3ms ± 1%   +3.06%  (p=0.000 n=10+9)
JSONDecode-8                157ms ± 1%     156ms ± 1%   -0.83%  (p=0.001 n=9+8)
Mandelbrot200-8            8.94ms ± 0%    8.94ms ± 0%   +0.02%  (p=0.000 n=9+9)
GoParse-8                  8.69ms ± 0%    8.54ms ± 1%   -1.69%  (p=0.000 n=8+10)
RegexpMatchEasy0_32-8       227ns ± 1%     228ns ± 1%   +0.48%  (p=0.016 n=10+9)
RegexpMatchEasy0_1K-8      1.92µs ± 0%    1.63µs ± 0%  -15.08%  (p=0.000 n=10+9)
RegexpMatchEasy1_32-8       256ns ± 0%     251ns ± 0%   -2.19%  (p=0.000 n=10+9)
RegexpMatchEasy1_1K-8      2.38µs ± 0%    2.09µs ± 0%  -12.49%  (p=0.000 n=10+9)
RegexpMatchMedium_32-8      352ns ± 0%     354ns ± 0%   +0.39%  (p=0.002 n=10+9)
RegexpMatchMedium_1K-8      106µs ± 0%     106µs ± 0%   -0.05%  (p=0.005 n=10+9)
RegexpMatchHard_32-8       5.92µs ± 0%    5.89µs ± 0%   -0.40%  (p=0.000 n=9+8)
RegexpMatchHard_1K-8        180µs ± 0%     179µs ± 0%   -0.14%  (p=0.000 n=10+9)
Revcomp-8                   1.20s ± 0%     1.13s ± 0%   -6.29%  (p=0.000 n=9+8)
Template-8                  159ms ± 1%     154ms ± 1%   -3.14%  (p=0.000 n=9+10)
TimeParse-8                 800ns ± 3%     769ns ± 1%   -3.91%  (p=0.000 n=10+10)
TimeFormat-8                826ns ± 2%     817ns ± 2%   -1.04%  (p=0.050 n=10+10)
[Geo mean]                  145µs          143µs        -1.79%

Change-Id: I5fc42087cee9b54ea414f8ef6d6d020b80eb5985
Reviewed-on: https://go-review.googlesource.com/42172
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
2017-05-09 19:41:00 +00:00