Commit Graph

257 Commits

Author SHA1 Message Date
Carlos Eduardo Seo be943df588 runtime: improve IndexByte for ppc64x
This change adds a better implementation of IndexByte in asm that uses the
vector registers/instructions on ppc64x.

benchmark                            old ns/op     new ns/op     delta
BenchmarkIndexByte/10-8              9.70          9.37          -3.40%
BenchmarkIndexByte/32-8              10.9          10.9          +0.00%
BenchmarkIndexByte/4K-8              254           92.8          -63.46%
BenchmarkIndexByte/4M-8              249246        118435        -52.48%
BenchmarkIndexByte/64M-8             10737987      7383096       -31.24%

benchmark                            old MB/s     new MB/s     speedup
BenchmarkIndexByte/10-8              1030.63      1067.24      1.04x
BenchmarkIndexByte/32-8              2922.69      2928.53      1.00x
BenchmarkIndexByte/4K-8              16065.95     44156.45     2.75x
BenchmarkIndexByte/4M-8              16827.96     35414.21     2.10x
BenchmarkIndexByte/64M-8             6249.67      9089.53      1.45x

Change-Id: I81dbdd620f7bb4e395ce4d1f2a14e8e91e39f9a1
Reviewed-on: https://go-review.googlesource.com/71710
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
2017-11-06 21:56:18 +00:00
isharipo 39de58bad7 cmd/internal/obj/x86: add most missing AVX1/2 insts
This change applies x86avxgen output.
https://go-review.googlesource.com/c/arch/+/66972
As an effect, many new AVX instructions are now available.

One of the side-effects of this patch is
sorted AXXX (A-enum) constants.

Some AVX1/2 instructions still not added due to:
1. x86.csv V0.2 does not list them;
2. partially because of (1), test suite does not contain tests for
   these instructions.

Change-Id: I90430d773974ca5c995d6950d90e2c62ec88ef47
Reviewed-on: https://go-review.googlesource.com/75490
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-11-03 15:32:07 +00:00
Ben Shi 1ac8846984 cmd/internal/obj/arm: add BFC/BFI to arm's assembler
BFC (Bit Field Clear) and BFI (Bit Field Insert) were
introduced in ARMv6T2, and the compiler can use them
to do further optimization.

Change-Id: I5a3fbcd2c2400c9bf4b939da6366c854c744c27f
Reviewed-on: https://go-review.googlesource.com/72891
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-11-03 14:06:21 +00:00
Ilya Tocar 44943aff33 cmd/internal/obj/x86: add ADX extension
Add support for ADX cpuid bit detection and all instructions,
implied by that bit (ADOX/ADCX). They are useful for rsa and math/big in
general.

Change-Id: Idaa93303ead48fd18b9b3da09b3e79de2f7e2193
Reviewed-on: https://go-review.googlesource.com/74850
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-11-02 15:41:50 +00:00
Michael Munday c280126557 cmd/asm, cmd/internal/obj/s390x, math: add "test under mask" instructions
Adds the following s390x test under mask (immediate) instructions:

TMHH
TMHL
TMLH
TMLL

These are useful for testing bits and are already used in the math package.

Change-Id: Idffb3f83b238dba76ac1e42ac6b0bf7f1d11bea2
Reviewed-on: https://go-review.googlesource.com/41092
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-10-30 23:55:14 +00:00
Michael Munday 96cdacb971 cmd/asm, cmd/compile: optimize math.Abs and math.Copysign on s390x
This change adds three new instructions:

- LPDFR: load positive (math.Abs(x))
- LNDFR: load negative (-math.Abs(x))
- CPSDR: copy sign (math.Copysign(x, y))

By making use of GPR <-> FPR moves we can now compile math.Abs and
math.Copysign to these instructions using SSA rules.

This CL also adds new rules to merge address generation into combined
load operations. This makes GPR <-> FPR move matching more reliable.

name                 old time/op  new time/op  delta
Copysign             1.85ns ± 0%  1.40ns ± 1%  -24.65%  (p=0.000 n=8+10)
Abs                  1.58ns ± 1%  0.73ns ± 1%  -53.64%  (p=0.000 n=10+10)

The geo mean improvement for all math package benchmarks was 4.6%.

Change-Id: I0cec35c5c1b3fb45243bf666b56b57faca981bc9
Reviewed-on: https://go-review.googlesource.com/73950
Run-TryBot: Michael Munday <mike.munday@ibm.com>
Reviewed-by: Keith Randall <khr@golang.org>
2017-10-30 23:42:51 +00:00
Lynn Boger 4d0151ede5 cmd/compile,cmd/internal/obj/ppc64: make math.Abs,math.Copysign instrinsics on ppc64x
This adds support for math Abs, Copysign to be instrinsics on ppc64x.

New instruction FCPSGN is added to generate fcpsgn. Some new
rules are added to improve the int<->float conversions that are
generated mainly due to the Float64bits and Float64frombits in
the math package. PPC64.rules is also modified as suggested
in the review for CL 63290.

Improvements:
benchmark                           old ns/op     new ns/op     delta
BenchmarkAbs-16                   1.12          0.69          -38.39%
BenchmarkCopysign-16              1.30          0.93          -28.46%
BenchmarkNextafter32-16           9.34          8.05          -13.81%
BenchmarkFrexp-16                 8.81          7.60          -13.73%

Others that used Copysign also saw smaller improvements.

I attempted to make this work using rules since that
seems to be preferred, but due to the use of Float64bits and
Float64frombits in these functions, several rules had to be added and
even then not all cases were matched. Using rules became too
complicated and seemed too fragile for these.

Updates #21390

Change-Id: Ia265da9a18355e08000818a4fba1a40e9e031995
Reviewed-on: https://go-review.googlesource.com/67130
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Keith Randall <khr@golang.org>
2017-10-30 13:56:39 +00:00
Cherry Zhang 083338cb97 cmd/internal/obj/arm64: handle global address in LDP/STP
The addressing mode of global variable was missing, whereas the
compiler may make use of it, causing "illegal combination" error.
This CL adds support of that addressing mode.

Fixes #22390.

Change-Id: Ic8eade31aba73e6fb895f758ee7f277f8f1832ef
Reviewed-on: https://go-review.googlesource.com/72610
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2017-10-23 15:40:08 +00:00
Ben Shi 47193dcc0c cmd/internal/obj/arm: better solution of .S/.P/.U/.W suffix check
Current suffix check is based on instruction, which is not very
accurate. For example, "MOVW.S R1, R2" is valid, but
"MOVW.S $0xaaaaaaaa, R1" and "MOVW.P CPSR, R9" are not.

This patch fixes the above kinds of issues by checking suffix
based on []optab. And also more test cases are added.

fixes #20509

Change-Id: Ibad91be72c78eefa719412a83b4d44370d2202a8
Reviewed-on: https://go-review.googlesource.com/70910
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-10-17 15:18:12 +00:00
Cherry Zhang 290de1f880 cmd/asm: reject STREX with same source and destination register on ARM
On ARM, STREX does not permit the same register used as both the
source and the destination. Reject the bad instruction.

The assembler also accepted special cases
	STREX R0, (R1)	as STREX R0, (R1), R0
	STREX (R1), R0	as STREX R0, (R1), R0
both are illegal. Remove this special case as well.

For STREXD, check that the destination is not source, and not
source+1. Also check that the source register is even numbered,
as required by the architecture's manual.

Fixes #22268.

Change-Id: I6bfde86ae692d8f1d35bd0bd7aac0f8a11ce8e22
Reviewed-on: https://go-review.googlesource.com/71190
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-10-16 18:30:56 +00:00
Wei Xiao 531e6c06c4 cmd/asm: refine Go assembly for ARM64
Some ARM64-specific instructions (such as SIMD instructions) are not supported.
This patch adds support for the following:
1. Extended register, e.g.:
     ADD	Rm.<ext>[<<amount], Rn, Rd
     <ext> can have the following values:
       UXTB, UXTH, UXTW, UXTX, SXTB, SXTH, SXTW and SXTX
2. Arrangement for SIMD instructions, e.g.:
     VADDP	Vm.<T>, Vn.<T>, Vd.<T>
     <T> can have the following values:
       B8, B16, H4, H8, S2, S4 and D2
3. Width specifier and element index for SIMD instructions, e.g.:
     VMOV	Vn.<T>[index], Rd // MOV(to general register)
     <T> can have the following values:
       S and D
4. Register List, e.g.:
     VLD1	(Rn), [Vt1.<T>, Vt2.<T>, Vt3.<T>]
5. Register offset variant, e.g.:
     VLD1.P	(Rn)(Rm), [Vt1.<T>, Vt2.<T>] // Rm is the post-index register
6. Go assembly for ARM64 reference manual
     new added instructions are required to have according explanation items in
     the manual and items for existed instructions will be added incrementally

For more information about the refinement background, please refer to the
discussion (https://groups.google.com/forum/#!topic/golang-dev/rWgDxCrL4GU)

This patch only adds syntax and doesn't break any assembly that already exists.

Change-Id: I34e90b7faae032820593a0e417022c354a882008
Reviewed-on: https://go-review.googlesource.com/41654
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-10-13 13:41:19 +00:00
Carlos Eduardo Seo 3a165bba34 cmd/asm, cmd/internal/obj/ppc64: Fix Data Cache instructions for ppc64x
This change fixes the implementation of Data Cache instructions for
ppc64x, allowing non-zero hint field values.

Change-Id: I454aac9293d069a4817ee574d5809fa1799b3216
Reviewed-on: https://go-review.googlesource.com/68670
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
2017-10-10 21:25:33 +00:00
Russ Cox 840f2c167f cmd/asm, cmd/cgo, cmd/compile, cmd/cover, cmd/link: use standard -V output
Also add -V=full to print a unique identifier of the specific tool being invoked.
This will be used for content-based staleness.

Also sort and clean up a few of the flag doc comments.

Change-Id: I786fe50be0b8e5f77af809d8d2dab721185c2abd
Reviewed-on: https://go-review.googlesource.com/68590
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
2017-10-06 20:28:40 +00:00
isharipo ca5127cbcd cmd/asm: add amd64 EXTRACTPS instruction
Adds last missing SSE4 instruction.
Also introduces additional ytab set 'yextractps'.

See https://golang.org/cl/57470 that adds other SSE4 instructions
but skips this one due to 'yextractps'.

To make EXTRACTPS less "sloppy", Yu2 oclass added to forbid
usage of invalid offset values in immediate operand.

Part of the mission to add missing amd64 SSE4 instructions to Go asm.

Change-Id: I0e67e3497054f53257dd8eb4c6268da5118b4853
Reviewed-on: https://go-review.googlesource.com/57490
TryBot-Result: Gobot Gobot <gobot@golang.org>
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
Reviewed-by: Russ Cox <rsc@golang.org>
2017-10-06 15:44:29 +00:00
Ben Shi 0ed1c38017 cmd/internal/obj/arm: support more ARMv6 instructions
The following instructions were introduced in ARMv6, and the compiler
can do more optimization with them.

1. "MOVBS Rm@>i, Rd": rotates Rm 0/8/16/24 bits, does signed
byte extension to word, and writes the result to Rd.

2. "MOVHS Rm@>i, Rd": rotates Rm 0/8/16/24 bits, does signed
halfword extension to word, and writes the result to Rd.

3. "MOVBU Rm@>i, Rd": rotates Rm 0/8/16/24 bits, does unsigned
byte extension to word, and writes the result to Rd.

4. "MOVHU Rm@>i, Rd": rotates Rm 0/8/16/24 bits, does unsigned
half-word extension to word, and writes the result to Rd.

5. "XTAB Rm@>i, Rn, Rd": rotates Rm 0/8/16/24 bits, does signed
byte extension to word, adds Rn, and writes the result to Rd.

6. "XTAH Rm@>i, Rn, Rd": rotates Rm 0/8/16/24 bits, does signed
half-word extension to word, adds Rn, and writes the result to Rd.

7. "XTABU Rm@>i, Rn, Rd": rotates Rm 0/8/16/24 bits, does unsigned
byte extension to word, adds Rn, and writes the result to Rd.

8. "XTAHU Rm@>i, Rn, Rd": rotates Rm 0/8/16/24 bits, does unsigned
half-word extension to word, adds Rn, and writes the result to Rd.

Change-Id: I4306d7ebac93015d7e2e40d307f2c4271c03f466
Reviewed-on: https://go-review.googlesource.com/65790
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-10-05 13:52:42 +00:00
Damien Lespiau 1e607f225e cmd/internal/obj/x86: add ADDSUBPS/PD
These are the last instructions missing to complete SSE3 support.

For reference what was missing was found by a tool [1]:

$ x86db-gogen list --extension SSE3 --not-known
ADDSUBPD xmmreg,xmmrm [rm: 66 0f d0 /r] PRESCOTT,SSE3,SO
ADDSUBPS xmmreg,xmmrm [rm: f2 0f d0 /r] PRESCOTT,SSE3,SO

[1] https://github.com/dlespiau/x86db

Fixes #20293

Change-Id: Ib5a91bf64dcc5282cdb60eae740ae52b4db16ebd
Reviewed-on: https://go-review.googlesource.com/42990
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
2017-09-18 15:06:05 +00:00
isharipo 8c67f210a1 cmd/internal/obj: change Prog.From3 to RestArgs ([]Addr)
This change makes it easier to express instructions
with arbitrary number of operands.

Rationale: previous approach with operand "hiding" does
not scale well, AVX and especially AVX512 have many
instructions with 3+ operands.

x86 asm backend is updated to handle up to 6 explicit operands.
It also fixes issue with 4-th immediate operand type checks.
All `ytab` tables are updated accordingly.

Changes to non-x86 backends only include these patterns:
`p.From3 = X` => `p.SetFrom3(X)`
`p.From3.X = Y` => `p.GetFrom3().X = Y`

Over time, other backends can adapt Prog.RestArgs
and reduce the amount of workarounds.

-- Performance --

x/benchmark/build:

$ benchstat upstream.bench patched.bench
name      old time/op                 new time/op                 delta
Build-48                  21.7s ± 2%                  21.8s ± 2%   ~     (p=0.218 n=10+10)

name      old binary-size             new binary-size             delta
Build-48                  10.3M ± 0%                  10.3M ± 0%   ~     (all equal)

name      old build-time/op           new build-time/op           delta
Build-48                  21.7s ± 2%                  21.8s ± 2%   ~     (p=0.218 n=10+10)

name      old build-peak-RSS-bytes    new build-peak-RSS-bytes    delta
Build-48                  145MB ± 5%                  148MB ± 5%   ~     (p=0.218 n=10+10)

name      old build-user+sys-time/op  new build-user+sys-time/op  delta
Build-48                  21.0s ± 2%                  21.2s ± 2%   ~     (p=0.075 n=10+10)

Microbenchmark shows a slight slowdown.

name        old time/op  new time/op  delta
AMD64asm-4  49.5ms ± 1%  49.9ms ± 1%  +0.67%  (p=0.001 n=23+15)

func BenchmarkAMD64asm(b *testing.B) {
  for i := 0; i < b.N; i++ {
    TestAMD64EndToEnd(nil)
    TestAMD64Encoder(nil)
  }
}

Change-Id: I4f1d37b5c2c966da3f2127705ccac9bff0038183
Reviewed-on: https://go-review.googlesource.com/63490
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-09-15 21:05:03 +00:00
Lynn Boger 40e25895e3 cmd/compile,math: improve int<->float conversions on ppc64x
The functions Float64bits and Float64frombits perform
poorly on ppc64x because the int<->float conversions
often result in load and store sequences to handle the
type change. This patch adds more rules to recognize
those sequences and use register to register moves
and avoid unnecessary loads and stores where possible.

There were some existing rules to improve these conversions,
but this provides additional improvements. Included here:

- New instruction FCFIDS to improve on conversion to 32 bit
- Rename Xf2i64 and Xi2f64 as MTVSRD, MFVSRD, to match the asm
- Add rules to lower some of the load/store sequences for
- Added new go asm to ppc64.s testcase.
conversions

Improvements:

BenchmarkAbs-16                2.16          0.93          -56.94%
BenchmarkCopysign-16           2.66          1.18          -55.64%
BenchmarkRound-16              4.82          2.69          -44.19%
BenchmarkSignbit-16            1.71          1.14          -33.33%
BenchmarkFrexp-16              11.4          7.94          -30.35%
BenchmarkLogb-16               10.4          7.34          -29.42%
BenchmarkLdexp-16              15.7          11.2          -28.66%
BenchmarkIlogb-16              10.2          7.32          -28.24%
BenchmarkPowInt-16             69.6          55.9          -19.68%
BenchmarkModf-16               10.1          8.19          -18.91%
BenchmarkLog2-16               17.4          14.3          -17.82%
BenchmarkCbrt-16               45.0          37.3          -17.11%
BenchmarkAtanh-16              57.6          48.3          -16.15%
BenchmarkRemainder-16          76.6          65.4          -14.62%
BenchmarkGamma-16              26.0          22.5          -13.46%
BenchmarkPowFrac-16            197           174           -11.68%
BenchmarkMod-16                112           99.8          -10.89%
BenchmarkAsinh-16              59.9          53.7          -10.35%
BenchmarkAcosh-16              44.8          40.3          -10.04%

Updates #21390

Change-Id: I56cc991fc2e55249d69518d4e1ba76cc23904e35
Reviewed-on: https://go-review.googlesource.com/63290
Reviewed-by: Michael Munday <mike.munday@ibm.com>
2017-09-14 12:14:00 +00:00
Ben Shi f727fa7939 cmd/internal/obj/arm: support more ARM VFP instructions
Add support of more ARM VFP instructions in the assembler.
They were introduced in ARM VFPv4.

"FMULAF/FMULAD   Fm, Fn, Fd": Fd = Fd + Fn*Fm
"FNMULAF/FNMULAD Fm, Fn, Fd": Fd = -(Fd + Fn*Fm)
"FMULSF/FMULSD   Fm, Fn, Fd": Fd = Fd - Fn*Fm
"FNMULSF/FNMULSD Fm, Fn, Fd": Fd = -(Fd - Fn*Fm)

The multiplication results are not rounded.

Change-Id: Id9cc52fd8e1b9a708103cd1e514c85a9e1cb3f47
Reviewed-on: https://go-review.googlesource.com/62550
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-09-11 13:10:50 +00:00
Agniva De Sarker 6367c19f26 cmd/internal/obj/x86: add some more AVX2 instructions
This adds the VFMADD[213|231]SD, VFNMADD[213|231]SD,
VADDSD, VSUBSD instructions

This will allow us to write a fast path for exp_amd64.s where
these optimizations can be applied in a lot of places.

Change-Id: Ide292107ab887bd1e225a1ad60880235b5ed7c61
Reviewed-on: https://go-review.googlesource.com/61810
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-09-06 19:31:18 +00:00
isharipo 50f1f639a4 cmd/asm: add most SSE4 missing instructions
Instructions added:
  INSERTPS immb, r/m, xmm
  MPSADBW immb, r/m, xmm
  BLENDPD immb, r/m, xmm
  BLENDPS immb, r/m, xmm
  DPPD immb, r/m, xmm
  DPPS immb, r/m, xmm
  MOVNTDQA r/m, xmm
  PACKUSDW r/m, xmm
  PBLENDW immb, r/m, xmm
  PCMPEQQ r/m, xmm
  PCMPGTQ r/m, xmm
  PCMPISTRI immb, r/m, xmm
  PCMPISTRM immb, r/m, xmm
  PMAXSB r/m, xmm
  PMAXSD r/m, xmm
  PMAXUD r/m, xmm
  PMAXUW r/m, xmm
  PMINSB r/m, xmm
  PMINSD r/m, xmm
  PMINUD r/m, xmm
  PMINUW r/m, xmm
  PTEST r/m, xmm
  PCMPESTRM immb, r/m, xmm

Note: only 'optab' table is extended.

`EXTRACTPS immb, xmm, r/m` is not included in this
change due to new ytab set 'yextractps'. This should simplify
code review.

4-operand instructions are a subject of upcoming changes that
make 4-th (and so on) operands explicit.
Related TODO note in asm6.go:
"dont't hide 4op, some version have xmm version".

Part of the mission to add missing amd64 SSE4 instructions to Go asm.

Change-Id: I71716df14a8a5332e866dd0f0d52d43d7714872f
Reviewed-on: https://go-review.googlesource.com/57470
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
2017-09-06 16:00:54 +00:00
isharipo b15e8babc8 cmd/asm: add amd64 PALIGNR instruction
3rd change out of 3 to cover AMD64 SSSE3 instruction set in Go asm.
This commit adds instruction that do require new ytab variable.

Change-Id: I0bc7d9401c9176eb3760c3d59494ef082e97af84
Reviewed-on: https://go-review.googlesource.com/56870
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Russ Cox <rsc@golang.org>
2017-09-06 15:37:20 +00:00
isharipo 4074e4e5be cmd/asm: add amd64 CLFLUSH instruction
This is the last instruction I found missing in SSE2 set.

It does not reuse 'yprefetch' ytabs due to differences in
operands SRC/DST roles:
- PREFETCHx: ModRM:r/m(r) -> FROM
- CLFLUSH:   ModRM:r/m(w) -> TO

unaryDst map is extended accordingly.

Change-Id: I89e34ebb81cc0ee5f9ebbb1301bad417f7ee437f
Reviewed-on: https://go-review.googlesource.com/56833
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Russ Cox <rsc@golang.org>
2017-09-06 15:37:00 +00:00
isharipo 26dadbe32c cmd/asm: add amd64 PAB{SB,SD,SW}, PMADDUBSW, PMULHRSW, PSIG{NB,ND,NW}
instructions

1st change out of 3 to cover AMD64 SSSE3 instruction set in Go asm.
This commit adds instructions that do not require new named ytab sets.

Change-Id: I0c3dfd8d39c3daa8b7683ab163c63145626d042e
Reviewed-on: https://go-review.googlesource.com/56834
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Russ Cox <rsc@golang.org>
2017-09-06 15:35:25 +00:00
Ben Shi 074547a5ce cmd/internal/obj/arm: support more ARM VFP instructions
Add support of more ARM VFP instructions in the assembler.
They were introduced in ARM VFPv2.

"NMULF/NMULD   Fm, Fn, Fd": Fd = -Fn*Fm
"MULAF/MULAD   Fm, Fn, Fd": Fd = Fd + Fn*Fm
"NMULAF/NMULAD Fm, Fn, Fd": Fd = -(Fd + Fn*Fm)
"MULSF/MULSD   Fm, Fn, Fd": Fd = Fd - Fn*Fm
"NMULSF/NMULSD Fm, Fn, Fd": Fd = -(Fd - Fn*Fm)

Change-Id: Icd302676ca44a9f5f153fce734225299403c4163
Reviewed-on: https://go-review.googlesource.com/60170
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-08-31 13:42:17 +00:00
Carlos Eduardo Seo 526f3420c2 cmd/asm, cmd/internal/obj/ppc64: add ISA 3.0 instructions
This change adds new ppc64 instructions from the POWER9 ISA. This includes
compares, loads, maths, register moves and the new random number generator and
copy/paste facilities.

Change-Id: Ife3720b90f5af184ff115bbcdcbce5c1302d39b6
Reviewed-on: https://go-review.googlesource.com/53930
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
2017-08-29 12:28:38 +00:00
Daniel Martí 5d39af9d9b all: remove some unused result params
Most of these are return values that were part of a receiving parameter,
so they're still accessible.

A few others are not, but those have never had a use.

Found with github.com/mvdan/unparam, after Kevin Burke's suggestion that
the tool should also warn about unused result parameters.

Change-Id: Id8b5ed89912a99db22027703a88bd94d0b292b8b
Reviewed-on: https://go-review.googlesource.com/55910
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-08-28 06:52:55 +00:00
Daniel Martí 99da8730b0 all: remove some double spaces from comments
Went mainly for the ones that make no sense, such as the ones
mid-sentence or after commas.

Change-Id: Ie245d2c19cc7428a06295635cf6a9482ade25ff0
Reviewed-on: https://go-review.googlesource.com/57293
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-08-26 15:09:09 +00:00
fanzha02 aea286b449 cmd/internal/obj/arm64: fix assemble fcsels/fcseld bug
The current code treats the type of SIMD&FP register as C_REG incorrectly.

The fix code converts C_REG type into C_FREG type.

Uncomment fcsels/fcseld test cases.

Fixes #21582
Change-Id: I754c51f72a0418bd352cbc0f7740f14cc599c72d
Reviewed-on: https://go-review.googlesource.com/58350
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-08-25 20:43:03 +00:00
fanzha02 8f1e2a2610 cmd/internal/obj/arm64: fix assemble fcmp/fcmpe bug
The current code treats floating-point constant as integer
and does not treat fcmp/fcmpe as the comparison instrucitons
that requires special handling.

The fix corrects the type of immediate arguments and adds fcmp/fcmpe
in the special handing.

Uncomment the fcmp/fcmpe cases.

Fixes #21567
Change-Id: I6782520e2770f6ce70270b667dd5e68f71e2d5ad
Reviewed-on: https://go-review.googlesource.com/57852
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-08-23 15:59:34 +00:00
fanzha02 bdd7c01b55 cmd/internal/obj/arm64: fix assemble movk bug
The current code gets shift arguments value from prog.From3.Offset.
But prog.From3.Offset is not assigned the shift arguments value in
instructions assemble process.

The fix calls movcon() function to get the correct value.

Uncomment the movk/movkw  cases.

Fixes #21398
Change-Id: I78d40c33c24bd4e3688a04622e4af7ddb5333fa6
Reviewed-on: https://go-review.googlesource.com/54990
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-08-22 13:10:08 +00:00
Ben Shi 9bf521b2b4 cmd/internal/obj/arm: support BFX/BFXU instructions
BFX extracts given bits from the source register, sign extends them
to 32-bit, and writes to destination register. BFXU does the similar
operation with zero extention.

They were introduced in ARMv6T2.

Change-Id: I6822ebf663497a87a662d3645eddd7c611de2b1e
Reviewed-on: https://go-review.googlesource.com/56071
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-08-21 16:29:59 +00:00
isharipo 9f5f51af7c cmd/asm: uncomment tests for amd64 PHADD{SW,W}, PHSUB{D,SW,W}
Instructions added in https://golang.org/cl/18853

2nd change out of 3 to cover AMD64 SSSE3 instruction set in Go asm.
This commit does not actually add any new instructions, only
enables some test cases.

Change-Id: I9596435b31ee4c19460a51dd6cea4530aac9d198
Reviewed-on: https://go-review.googlesource.com/56835
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
2017-08-18 16:10:21 +00:00
Ben Shi 75cb22cb2f cmd/internal/obj/arm: support new arm instructions
There are two changes in this CL.

1. Add new forms of MOVH/MOVHS/MOVHU.
   MOVHS R0<<0(R1), R2   // ldrsh
   MOVH  R0<<0(R1), R2   // ldrsh
   MOVHU R0<<0(R1), R2   // ldrh
   MOVHS R2, R5<<0(R1)   // strh
   MOVH  R2, R5<<0(R1)   // strh
   MOVHU R2, R5<<0(R1)   // strh

2. Simpify "MVN $0xffffffaa, Rn" to "MOVW $0x55, Rn".
   It is originally assembled to two instructions.
   "MOVW offset(PC), R11"
   "MVN R11, Rn"

Change-Id: I8e863dcfb2bd8f21a04c5d627fa7beec0afe65fb
Reviewed-on: https://go-review.googlesource.com/53690
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-08-18 14:13:41 +00:00
isharipo b5dab2b9d9 cmd/asm: uncomment tests for PCMPESTRI, PHMINPOSUW
Instructions are implemented in the following revisions:
PCMPESTRI - https://golang.org/cl/22337
PHMINPOSUW - https://golang.org/cl/18853

It is unknown when x86test will be updated/re-run, but tests are useful
to check which x86 instructions are not yet supported.
As an example of tool that uses this information, there is Damien
Lespiau x86db.

Part of the mission to add missing amd64 SSE4 instructions to Go asm.

Change-Id: I512ff26040f47a0976b3e37000fb1f37eac5b762
Reviewed-on: https://go-review.googlesource.com/55830
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
2017-08-17 15:55:39 +00:00
fanzha02 6e8b10397b cmd/internal/obj/arm64: fix assemble stxr/stxrw/stxrb/stxrh bug
The stxr/stxrw/stxrb/stxrh instructions belong to STLXR-like instructions
set and they require special handling. The current code has no special
handling for those instructions.

The fix adds the special handling for those instructions.

Uncomment stxr/stxrw/stxrb/stxrh test cases.

Fixes #21397
Change-Id: I31cee29dd6b30b1c25badd5c7574dda7a01bf016
Reviewed-on: https://go-review.googlesource.com/54951
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-08-15 14:05:29 +00:00
Ben Shi ef26021d30 cmd/internal/obj/arm: check illegal base registers in ARM instructions
Wrong instructions "MOVW 8(F0), R1" and "MOVW R0<<0(F1), R1"
are silently accepted, and all Fx are treated as Rx.

The patch checks all those illegal base registers.

fixes #20724

Change-Id: I05d41bb43fe774b023205163b7daf4a846e9dc88
Reviewed-on: https://go-review.googlesource.com/46132
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-06-30 19:09:44 +00:00
fanzha02 990dac2723 cmd/internal/obj/arm64: fix assemble LDXP bug
The current code calculates register number incorrectly.

The fix corrects the register number calculation.

Add cases created by decoder to test assembler.

Fixes #20697
Fixes #20723

Change-Id: I73ac153df9ea9f51c43a5104828d7a5389551c92
Reviewed-on: https://go-review.googlesource.com/45850
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-06-30 18:24:58 +00:00
Ben Shi 3785457c76 cmd/internal/obj/arm: fix wrong encoding of MULBB
"MULBB R1, R2, R3" is encoded to 0xe163f182, which should be
0xe1630182.

This patch fix it.

fix #20764

Change-Id: I9d3c3ffa40ecde86638e5e083eacc67578caebf4
Reviewed-on: https://go-review.googlesource.com/46491
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-06-23 18:08:20 +00:00
Ben Shi e00a38c89a cmd/internal/obj/arm: fix setting U bit in shifted register offset of MOVBS
"MOVBS.U R0<<0(R1), R2" is assembled to 0xe19120d0 (ldrsb r2, [r1, r0]),
but it is expected to be 0xe11120d0 (ldrsb r2, [r1, -r0]).

This patch fixes it and adds more encoding tests.

fixes #20701

Change-Id: Ic1fb46438d71a978dbef06d97494a70c95fcbf3a
Reviewed-on: https://go-review.googlesource.com/45996
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-06-23 14:29:57 +00:00
Ben Shi a38c8dfa44 cmd/internal/obj/arm: fix MOVW to/from FPSR
"MOVW FPSR, g" should be assembled to 0xeef1aa10, but actually
0xee30a110 (RFS). "MOVW g, FPSR" should be 0xeee1aa10, but actually
0xee20a110 (WFS). They should be updated to VFP forms, since the ARM
back end doesn't support non-VFP floating points.

The patch fixes them and adds more assembly encoding tests.

fixes #20643

Change-Id: I3b29490337c6e8d891b400fcedc8b0a87b82b527
Reviewed-on: https://go-review.googlesource.com/45276
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-06-13 14:06:58 +00:00
Ben Shi d3d5489135 cmd/internal/obj/arm: fix encoding of move register/immediate to CPSR
"MOVW R1, CPSR" is assembled to 0xe129f001, which should be 0xe12cf001.
"MOVW $255, CPSR" is assembled to 0xe329f0ff, which should be 0xe32cf0ff.

This patch fixes them and adds more assembly encoding tests.

fix #20626

Change-Id: Iefc945879ea774edf40438ce39f52c144e1501a1
Reviewed-on: https://go-review.googlesource.com/45170
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-06-09 14:14:30 +00:00
Ben Shi c8ab8c1f99 cmd/internal/obj/arm: fix constant decomposition
There are two issues in constant decomposition.

1. A typo in "func immrot2s" blocks "case 107" of []optab be triggered.

2. Though "ADD $0xffff, R0, R0" is decomposed to "ADD $0xff00, R0, R0" and
   "ADD $0x00ff, R0, R0" as expected, "ADD $0xffff, R0" still uses the
   constant pool, which should be the same as "ADD $0xffff, R0, R0".

This patch fixes them and adds more instruction encoding tests.

fix #20516

Change-Id: Icd7bdfa1946b29db15580dcb429111266f1384c6
Reviewed-on: https://go-review.googlesource.com/44335
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-06-05 17:03:54 +00:00
Ben Shi ffab6ab877 cmd/asm/internal/asm: fix a bug in ARM assembly encoding test
It is expected to test assembly code for ARMv5, ARMv6 and ARMv7
in cmd/asm/internal/asm/endtoend_test.go. But actually the loop
in "func TestARMEndToEnd(t *testing.T)" runs three times all
for ARMv5.

This patch fixes that bug and adds a new armv6.s which is only tested
with GOARM=6.

fixes #20465

Change-Id: I5dbf00809a47ace2c195335e2c9bdd768479aada
Reviewed-on: https://go-review.googlesource.com/43930
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-05-25 18:23:20 +00:00
Ben Shi b8a4eb4bd8 cmd/internal/obj/arm: fix illegal forms of ARM VFP instruction
"ADDF F0, R1, F2" is silently accepted by the arm assembler and
assembled to the same binary code of "ADDF F0, F1, F2". So does
"CMPF F0, R1".

"ABSF F0, F1, F2" is also silently accepted and assembled to a
different instruction.

This patch reports those illegal forms and adds test cases.

fix #20464

Change-Id: I88b80dc29de24c6266ac7bf7bce1578c5adbc68c
Reviewed-on: https://go-review.googlesource.com/43931
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-05-25 14:32:35 +00:00
Ben Shi 5e79787935 cmd/internal/obj/arm: report invalid .S/.P/.W suffix in ARM instructions
Many instructions can not have a .S suffix, such as MULS, SWI, CLZ,
CMP, STREX and others. And so do .P and .W suffixes. Even wrong
assembly code is generated for some instructions with invalid
suffixes.

This patch tries to simplify .S/.W/.P checks. And a wrong assembly
test for arm is added.

fixes #20377

Change-Id: Iba1c99d9e6b7b16a749b4d93ca2102e17c5822fe
Reviewed-on: https://go-review.googlesource.com/43561
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-05-22 13:44:12 +00:00
Ben Shi c7cae34b19 cmd/internal/obj/arm: remove illegal form of the SWI instruction
SWI only support "SWI $imm", but currently "SWI (Reg)" is also
accepted. This patch fixes it.

And more instruction tests are added to cmd/asm/internal/asm/testdata/arm.s

fixes #20375

Change-Id: Id437d853924a403e41da9b6cbddd20d994b624ff
Reviewed-on: https://go-review.googlesource.com/43552
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-05-18 13:38:13 +00:00
Cherry Zhang b53acd89db cmd/internal/obj/mips: add support of LLV, SCV, NOOP instructions
LLV and SCV are 64-bit load-linked and store-conditional. They
were used in runtime as #define WORD. Change them to normal
instruction form.

NOOP is hardware no-op. It was written as WORD $0. Make a name
for it for better disassembly output.

Fixes #12561.
Fixes #18238.

Change-Id: I82c667ce756fa83ef37b034b641e8c4366335e83
Reviewed-on: https://go-review.googlesource.com/40297
Reviewed-by: Minux Ma <minux@golang.org>
Run-TryBot: Minux Ma <minux@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-05-16 17:15:11 +00:00
Cherry Zhang fb0ccc5d0a cmd/internal/obj/arm64, cmd/compile: improve offset folding on ARM64
ARM64 assembler backend only accepts loads and stores with small
or aligned offset. The compiler therefore can only fold small or
aligned offsets into loads and stores. For locals and args, their
offsets to SP are not known until very late, and the compiler
makes conservative decision not folding some of them. However,
in most cases, the offset is indeed small or aligned, and can
be folded into load and store (but actually not).

This CL adds support of loads and stores with large and unaligned
offsets. When the offset doesn't fit into the instruction, it
uses two instructions and (for very large offset) the constant
pool. This way, the compiler doesn't need to be conservative,
and can simply fold the offset.

To make it work, the assembler's optab matching rules need to be
changed. Before, MOVD accepts C_UAUTO32K which matches multiple
of 8 between 0 and 32K, and also C_UAUTO16K, which may not be
multiple of 8 and does not fit into MOVD instruction. The
assembler errors in the latter case. This change makes it only
matches multiple of 8 (or offsets within ±256, which also fits
in instruction), and uses the large-or-unaligned-offset rule
for things doesn't fit (without error). Other sized move rules
are changed similarly.

Class C_UAUTO64K and C_UOREG64K are removed, as they are never
used.

In shared library, load/store of global is rewritten to using
GOT and temp register, which conflicts with the use of temp
register for assembling large offset. So the folding is disabled
for globals in shared library mode.

Reduce cmd/go binary size by 2%.

name                     old time/op    new time/op    delta
BinaryTree17-8              8.67s ± 0%     8.61s ± 0%   -0.60%  (p=0.000 n=9+10)
Fannkuch11-8                6.24s ± 0%     6.19s ± 0%   -0.83%  (p=0.000 n=10+9)
FmtFprintfEmpty-8           116ns ± 0%     116ns ± 0%     ~     (all equal)
FmtFprintfString-8          196ns ± 0%     192ns ± 0%   -1.89%  (p=0.000 n=10+10)
FmtFprintfInt-8             199ns ± 0%     198ns ± 0%   -0.35%  (p=0.001 n=9+10)
FmtFprintfIntInt-8          294ns ± 0%     293ns ± 0%   -0.34%  (p=0.000 n=8+8)
FmtFprintfPrefixedInt-8     318ns ± 1%     318ns ± 1%     ~     (p=1.000 n=10+10)
FmtFprintfFloat-8           537ns ± 0%     531ns ± 0%   -1.17%  (p=0.000 n=9+10)
FmtManyArgs-8              1.19µs ± 1%    1.18µs ± 1%   -1.41%  (p=0.001 n=10+10)
GobDecode-8                17.2ms ± 1%    17.3ms ± 2%     ~     (p=0.165 n=10+10)
GobEncode-8                14.7ms ± 1%    14.7ms ± 2%     ~     (p=0.631 n=10+10)
Gzip-8                      837ms ± 0%     836ms ± 0%   -0.14%  (p=0.006 n=9+10)
Gunzip-8                    141ms ± 0%     139ms ± 0%   -1.24%  (p=0.000 n=9+10)
HTTPClientServer-8          256µs ± 1%     253µs ± 1%   -1.35%  (p=0.000 n=10+10)
JSONEncode-8               40.1ms ± 1%    41.3ms ± 1%   +3.06%  (p=0.000 n=10+9)
JSONDecode-8                157ms ± 1%     156ms ± 1%   -0.83%  (p=0.001 n=9+8)
Mandelbrot200-8            8.94ms ± 0%    8.94ms ± 0%   +0.02%  (p=0.000 n=9+9)
GoParse-8                  8.69ms ± 0%    8.54ms ± 1%   -1.69%  (p=0.000 n=8+10)
RegexpMatchEasy0_32-8       227ns ± 1%     228ns ± 1%   +0.48%  (p=0.016 n=10+9)
RegexpMatchEasy0_1K-8      1.92µs ± 0%    1.63µs ± 0%  -15.08%  (p=0.000 n=10+9)
RegexpMatchEasy1_32-8       256ns ± 0%     251ns ± 0%   -2.19%  (p=0.000 n=10+9)
RegexpMatchEasy1_1K-8      2.38µs ± 0%    2.09µs ± 0%  -12.49%  (p=0.000 n=10+9)
RegexpMatchMedium_32-8      352ns ± 0%     354ns ± 0%   +0.39%  (p=0.002 n=10+9)
RegexpMatchMedium_1K-8      106µs ± 0%     106µs ± 0%   -0.05%  (p=0.005 n=10+9)
RegexpMatchHard_32-8       5.92µs ± 0%    5.89µs ± 0%   -0.40%  (p=0.000 n=9+8)
RegexpMatchHard_1K-8        180µs ± 0%     179µs ± 0%   -0.14%  (p=0.000 n=10+9)
Revcomp-8                   1.20s ± 0%     1.13s ± 0%   -6.29%  (p=0.000 n=9+8)
Template-8                  159ms ± 1%     154ms ± 1%   -3.14%  (p=0.000 n=9+10)
TimeParse-8                 800ns ± 3%     769ns ± 1%   -3.91%  (p=0.000 n=10+10)
TimeFormat-8                826ns ± 2%     817ns ± 2%   -1.04%  (p=0.050 n=10+10)
[Geo mean]                  145µs          143µs        -1.79%

Change-Id: I5fc42087cee9b54ea414f8ef6d6d020b80eb5985
Reviewed-on: https://go-review.googlesource.com/42172
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
2017-05-09 19:41:00 +00:00
Damien Lespiau 23c5db9bbb cmd/asm: enable MOVSD in the encoding end-to-end test
MOVSD is properly handled but its encoding test wasn't enabled. Enable
it.

For reference this was found with a little tool I wrote [1] to explore
which instructions are missing or not tested in the go obj package and
assembler:

"which SSE2 instructions aren't tested? And don't list instructions
which can take MMX operands"

$ x86db-gogen list --extension SSE2 --not-tested --not-mmx
CLFLUSH mem           [m:  np 0f ae /7] WILLAMETTE,SSE2
MOVSD   xmmreg,xmmreg [rm: f2 0f 10 /r] WILLAMETTE,SSE2
MOVSD   xmmreg,xmmreg [mr: f2 0f 11 /r] WILLAMETTE,SSE2
MOVSD   mem64,xmmreg  [mr: f2 0f 11 /r] WILLAMETTE,SSE2
MOVSD   xmmreg,mem64  [rm: f2 0f 10 /r] WILLAMETTE,SSE2

(CLFLUSH was introduced with SSE2, but has its own CPUID bit)

[1] https://github.com/dlespiau/x86db

Change-Id: Ic3af3028cb8d4f02e53fdebb9b30fb311f4ee454
Reviewed-on: https://go-review.googlesource.com/42814
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-05-07 17:00:58 +00:00