504 Commits

Author SHA1 Message Date
fanzha02 9136d958ab cmd/asm: complete the support for VDUP on arm64
"VMOV Vn.<T>[index], Vn" is equivalent to "VDUP Vn.<T>[index], Vn", and
the latter has a higher priority in the disassembler than the former.
But the assembler doesn't support encoding this combination of VDUP,
which leads to an inconsistency between the assembler and disassembler.

For example, if we assemble "VMOV V20.S[0], V20" to hex then decode it,
we'll get "VDUP V20.S[0], V20".

  VMOV V20.S[0], V20 -> 9406045e -> VDUP V20.S[0], V20 -> error

But we cannot assemble this VDUP again.

The same applies to "VDUP Rn, Vd.<T>". This CL completes the support for
VDUP.
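
The hex word in the example above can be reproduced from the A64 DUP (element, scalar) bit layout. The Go sketch below is illustrative only: the helper names are hypothetical and are not part of cmd/asm, and the bit layout is assumed from the Arm Architecture Reference Manual rather than taken from this CL.

```go
package main

import (
	"encoding/binary"
	"encoding/hex"
	"errors"
	"fmt"
)

// encodeDupElemScalar returns the A64 word for DUP (element, scalar),
// written "VDUP Vn.<T>[index], Vd" in Go assembly.
// Hypothetical helper for illustration; not the cmd/asm implementation.
func encodeDupElemScalar(size byte, index, rn, rd uint32) (uint32, error) {
	if rn > 31 || rd > 31 {
		return 0, errors.New("register number out of range")
	}
	// imm5 carries both the element size (lowest set bit) and the index.
	var imm5 uint32
	switch {
	case size == 'B' && index < 16:
		imm5 = index<<1 | 1
	case size == 'H' && index < 8:
		imm5 = index<<2 | 2
	case size == 'S' && index < 4:
		imm5 = index<<3 | 4
	case size == 'D' && index < 2:
		imm5 = index<<4 | 8
	default:
		return 0, errors.New("invalid element size or index")
	}
	return 0x5e000400 | imm5<<16 | rn<<5 | rd, nil
}

// littleEndianHex formats the word the way the commit message shows it:
// as the four instruction bytes in memory order.
func littleEndianHex(word uint32) string {
	var b [4]byte
	binary.LittleEndian.PutUint32(b[:], word)
	return hex.EncodeToString(b[:])
}

func main() {
	word, err := encodeDupElemScalar('S', 0, 20, 20)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(littleEndianHex(word)) // prints 9406045e
}
```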

This patch is a copy of CL 276092.

Co-authored-by: JunchenLi <junchen.li@arm.com>

Change-Id: I8f8d86cf1911d5b16bb40d189f1dc34b24416aaf
Reviewed-on: https://go-review.googlesource.com/c/go/+/302929
Trust: fannie zhang <Fannie.Zhang@arm.com>
Run-TryBot: fannie zhang <Fannie.Zhang@arm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2021-03-19 01:38:59 +00:00
Austin Clements eaa1ddee84 all: explode GOEXPERIMENT=regabi into 5 sub-experiments
This separates GOEXPERIMENT=regabi into five sub-experiments:
regabiwrappers, regabig, regabireflect, regabidefer, and regabiargs.
Setting GOEXPERIMENT=regabi now implies the working subset of these
(currently, regabiwrappers, regabig, and regabireflect).
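
As a rough illustration of the "implies" relationship described above, the following Go sketch expands a GOEXPERIMENT-style value into its sub-experiments. The function and variable names are hypothetical; the toolchain's real parsing code is not shown in this log and may differ.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// regabiImplied lists the sub-experiments that "regabi" implied at the
// time of this commit, according to the commit message.
var regabiImplied = []string{"regabiwrappers", "regabig", "regabireflect"}

// expandExperiments is a hypothetical helper that splits a comma-separated
// experiment list and replaces "regabi" with the sub-experiments it implies.
func expandExperiments(value string) []string {
	set := map[string]bool{}
	for _, name := range strings.Split(value, ",") {
		name = strings.TrimSpace(name)
		switch name {
		case "":
			continue
		case "regabi":
			for _, sub := range regabiImplied {
				set[sub] = true
			}
		default:
			set[name] = true
		}
	}
	out := make([]string, 0, len(set))
	for name := range set {
		out = append(out, name)
	}
	sort.Strings(out) // sorted so the result is deterministic
	return out
}

func main() {
	fmt.Println(expandExperiments("regabi,fieldtrack"))
}
```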

This simplifies testing, helps derisk the register ABI project,
and will also help with performance comparisons.

This replaces the -abiwrap flag to the compiler and linker with
the regabiwrappers experiment.

As part of this, regabiargs now enables registers for all calls
in the compiler. Previously, this was statically disabled in
regabiEnabledForAllCompilation, but now that we can control it
independently, this isn't necessary.

For #40724.

Change-Id: I5171e60cda6789031f2ef034cc2e7c5d62459122
Reviewed-on: https://go-review.googlesource.com/c/go/+/302070
Trust: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
2021-03-18 16:51:27 +00:00
Joel Sing c2d625168f cmd/compile,cmd/internal/obj/riscv: load >32-bit constants from memory for riscv64
Follow what MIPS does and load >32-bit constants from memory using two instructions,
rather than generating a four to six instruction sequence. This removes more than 2,500
instructions from the Go binary. This also makes it possible to load >32-bit constants
via a single assembly instruction, if required.
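
A minimal Go sketch of the size check behind this choice, assuming ">32-bit" means "does not fit in a signed 32-bit value" (the commit message does not spell out the exact boundary). The function name is hypothetical.

```go
package main

import "fmt"

// needsMemoryLoad reports whether v falls outside the signed 32-bit range,
// which this sketch assumes is the point at which the constant is loaded
// from memory instead of being built from immediates.
func needsMemoryLoad(v int64) bool {
	return v != int64(int32(v))
}

func main() {
	for _, v := range []int64{0x7fffffff, 0x80000000, 0x123456789} {
		fmt.Printf("%#x: load from memory = %v\n", v, needsMemoryLoad(v))
	}
}
```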

Change-Id: Ie679a0754071e6d8c52fe0d027f00eb241b3a758
Reviewed-on: https://go-review.googlesource.com/c/go/+/302609
Trust: Joel Sing <joel@sing.id.au>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2021-03-18 04:17:00 +00:00
Keith Randall c870e86329 cmd/asm: when dynamic linking, reject code that uses a clobbered R15
The assembler uses R15 as scratch space when assembling global variable
references in dynamically linked code. If the assembly code uses the
clobbered value of R15, report an error. The user is probably expecting
some other value in that register.

Getting rid of the R15 use isn't very practical (we could save a
register to a field in the G maybe, but that gets cumbersome).

Fixes #43661

Change-Id: I43f848a3d8b8a28931ec733386b85e6e9a42d8ff
Reviewed-on: https://go-review.googlesource.com/c/go/+/283474
Trust: Keith Randall <khr@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2021-03-16 21:18:43 +00:00
Meng Zhuo e31e84010e cmd/asm: add rotr/drotr for mips64
This CL encodes:

ROTR rd, rt, sa
ROTRV rd, rt, rs

=> ROTR (SCON|REG), (REG,)? REG

DROTR rd, rt, sa
DROTR32 rd, rt, sa
DROTRV rd, rt, rs

=> ROTRV (SCON|REG), (REG,)? REG

Note: ROTRV will handle const over 32
Ref: The MIPS64® Instruction Set Reference Manual Revision 6.05
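
For readers unfamiliar with the rotate instructions, their arithmetic effect can be modelled in Go with math/bits. This is a sketch of the semantics only, not of the encoding added by this CL; the function names are hypothetical.

```go
package main

import (
	"fmt"
	"math/bits"
)

// rotr32 models a 32-bit rotate right by sa bits (ROTR/ROTRV in MIPS terms).
func rotr32(x uint32, sa uint) uint32 {
	return bits.RotateLeft32(x, -int(sa&31))
}

// rotr64 models a 64-bit rotate right by sa bits (the DROTR family).
func rotr64(x uint64, sa uint) uint64 {
	return bits.RotateLeft64(x, -int(sa&63))
}

func main() {
	fmt.Printf("%08x\n", rotr32(0x12345678, 8)) // prints 78123456
	fmt.Printf("%016x\n", rotr64(1, 33))        // prints 0000000080000000
}
```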
Change-Id: Ibe69f999b83eb43843d088cf1ac5a13c995269a5
Reviewed-on: https://go-review.googlesource.com/c/go/+/280114
Trust: Meng Zhuo <mzh@golangcn.org>
Run-TryBot: Meng Zhuo <mzh@golangcn.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2021-03-16 14:31:05 +00:00
fanzha02 a607408403 cmd/internal/obj/arm64: add support for op(extended register) with RSP arguments
According to the ARM reference manual, for instructions such as
add(extended register) the extension is encoded in the "option" field.
If "Rd" or "Rn" is RSP and "option" is "010", then LSL is preferred.
Therefore, the instruction "add Rm<<imm, RSP, RSP" or "add Rm<<imm, RSP"
is valid and can be encoded as an add(extended register) instruction.

But the current assembler cannot handle instructions like
"op R1<<1, RSP, RSP"; this patch adds that support.

Because MVN(extended register) does not exist, remove it.

Add test cases.

Change-Id: I968749d75c6b93a4f297b39c73cc292e6b1035ad
Reviewed-on: https://go-review.googlesource.com/c/go/+/284900
Trust: fannie zhang <Fannie.Zhang@arm.com>
Run-TryBot: fannie zhang <Fannie.Zhang@arm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2021-03-12 01:47:01 +00:00
Matthew Dempsky 7fc638d6f1 cmd: move GOEXPERIMENT knob from make.bash to cmd/go
This CL changes GOEXPERIMENT to act like other GO[CONFIG] environment
variables. Namely, that it can be set at make.bash time to provide a
default value used by the toolchain, but then can be manually set when
running either cmd/go or the individual tools (compiler, assembler,
linker).

For example, it's now possible to test rsc.io/tmp/fieldtrack by simply
running:

GOEXPERIMENT=fieldtrack go test -gcflags=-l rsc.io/tmp/fieldtrack \
  -ldflags=-k=rsc.io/tmp/fieldtrack.tracked

without needing to re-run make.bash. (-gcflags=-l is needed because
the compiler's inlining abilities have improved, so calling a function
with a for loop is no longer sufficient to suppress inlining.)

Fixes #42681.

Change-Id: I2cf8995d5d0d05f6785a2ee1d3b54b2cfb3331ca
Reviewed-on: https://go-review.googlesource.com/c/go/+/300991
Trust: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
2021-03-11 21:43:04 +00:00
Paul E. Murphy 48ddf70128 cmd/asm,cmd/compile: support 5 operand RLWNM/RLWMI on ppc64
These instructions are actually 5 argument opcodes as specified
by the ISA.  Prior to this patch, the MB and ME arguments were
merged into a single bitmask operand to workaround the limitations
of the ppc64 assembler backend.

This limitation no longer exists. Thus, we can pass operands for
these opcodes without having to merge the MB and ME arguments in
the assembler frontend or compiler backend.
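
To make the relationship between the separate MB/ME operands and the merged bitmask concrete, here is a Go sketch that builds the 32-bit mask from MB and ME. It assumes IBM bit numbering (bit 0 is the most significant bit) and the usual wrap-around rule when MB > ME; the function name is hypothetical and this is not the assembler's code.

```go
package main

import "fmt"

// rotateMask32 returns the mask with ones in bits mb through me, using
// IBM bit numbering where bit 0 is the most significant bit.
// Hypothetical helper for illustration.
func rotateMask32(mb, me uint32) uint32 {
	mb &= 31
	me &= 31
	fromMB := uint32(0xffffffff) >> mb           // ones in bits mb..31
	throughME := uint32(0xffffffff) << (31 - me) // ones in bits 0..me
	if mb <= me {
		return fromMB & throughME
	}
	return fromMB | throughME // mask wraps around bit 31 to bit 0
}

func main() {
	fmt.Printf("%08x\n", rotateMask32(16, 31)) // prints 0000ffff
	fmt.Printf("%08x\n", rotateMask32(24, 7))  // prints ff0000ff
}
```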

Likewise, support for 4 operand variants is unchanged.

Change-Id: Ib086774f3581edeaadfd2190d652aaaa8a90daeb
Reviewed-on: https://go-review.googlesource.com/c/go/+/298750
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Carlos Eduardo Seo <carlos.seo@linaro.org>
Trust: Carlos Eduardo Seo <carlos.seo@linaro.org>
2021-03-09 20:35:41 +00:00
eric fang 27dbc4551a cmd/asm: disable scaled register format for arm64
Arm64 has no scaled register format such as (R1*2) or (R1)(R2*3),
but the assembler currently doesn't report an error for operands
written this way. This CL disables the scaled register operand format
for arm64 and reports an error when such an operand is seen.
With this CL, the assembler no longer prints (R1)(R2) as (R1)(R2*1),
which makes the assembly tests simpler.

Change-Id: I6d7569065597215be4c767032a63648d2ad16fed
Reviewed-on: https://go-review.googlesource.com/c/go/+/289589
Trust: eric fang <eric.fang@arm.com>
Run-TryBot: eric fang <eric.fang@arm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: eric fang <eric.fang@arm.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2021-03-04 01:31:21 +00:00
eric fang 355c3a037e cmd/internal/obj/asm64: add support for moving BITCON to RSP
A constant of BITCON type can be moved into RSP directly by a MOVD or MOVW
instruction; this CL enables that form of the two instructions.

For 32-bit ADDWop instructions with constant, rewrite the high 32-bit
to be a repetition of the low 32-bit, just as ANDWop instructions do,
so that we can optimize ADDW $bitcon, Rn, Rt as:
MOVW $bitcon, Rtmp
ADDW Rtmp, Rn, Rt
The original code is:
MOVZ $bitcon_low, Rtmp
MOVK $bitcon_high,Rtmp
ADDW Rtmp, Rn, Rt
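
The "repetition of the low 32-bit" rewrite mentioned above can be sketched in Go as follows. The function name is hypothetical and the snippet only illustrates the bit manipulation, not the assembler's constant classification.

```go
package main

import "fmt"

// replicateLow32 copies the low 32 bits of v into the high 32 bits,
// discarding whatever the high half held before.
func replicateLow32(v uint64) uint64 {
	lo := v & 0xffffffff
	return lo<<32 | lo
}

func main() {
	fmt.Printf("%016x\n", replicateLow32(0xff00ff00)) // prints ff00ff00ff00ff00
}
```

A 32-bit pattern repeated in both halves has a better chance of being representable as a single bitmask immediate, which is what allows the shorter MOVW/ADDW sequence.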

Change-Id: I30e71972bcfd6470a8b6e6ffbacaee79d523805a
Reviewed-on: https://go-review.googlesource.com/c/go/+/289649
Trust: eric fang <eric.fang@arm.com>
Run-TryBot: eric fang <eric.fang@arm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: eric fang <eric.fang@arm.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2021-03-04 01:28:21 +00:00
eric fang 726d704c32 cmd/asm: add arm64 instructions VUMAX and VUMIN
This CL adds support for arm64 fp&simd instructions VUMAX and VUMIN.
Fixes #42326

Change-Id: I3757ba165dc31ce1ce70f3b06a9e5b94c14d2ab9
Reviewed-on: https://go-review.googlesource.com/c/go/+/271497
Trust: eric fang <eric.fang@arm.com>
Run-TryBot: eric fang <eric.fang@arm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: fannie zhang <Fannie.Zhang@arm.com>
Reviewed-by: eric fang <eric.fang@arm.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2021-03-04 01:26:52 +00:00
eric fang 79beddc773 cmd/asm: add 128-bit FLDPQ and FSTPQ instructions for arm64
This CL adds assembly support for 128-bit FLDPQ and FSTPQ instructions.

This CL also deletes some wrong pre/post-indexed LDP and STP instructions,
such as {ALDP, C_UAUTO4K, C_NONE, C_NONE, C_PAIR, 74, 8, REGSP, 0, C_XPRE},
because when the offset type is C_UAUTO4K, pre and post don't work.

Change-Id: Ifd901d4440eb06eb9e86c9dd17518749fdf32848
Reviewed-on: https://go-review.googlesource.com/c/go/+/273668
Trust: eric fang <eric.fang@arm.com>
Run-TryBot: eric fang <eric.fang@arm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: eric fang <eric.fang@arm.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2021-03-04 01:26:21 +00:00
Josh Bleecher Snyder a61524d103 cmd/internal/obj: add Prog.SetFrom3{Reg,Const}
These are the most common uses, and they reduce line noise.

I don't love adding new deprecated APIs,
but since they're trivial wrappers,
it'll be very easy to update them along with the rest.
No functional changes; passes toolstash-check.

Change-Id: I691a8175cfef9081180e463c63f326376af3f3a6
Reviewed-on: https://go-review.googlesource.com/c/go/+/296009
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2021-02-25 23:19:16 +00:00
Joel Sing 0398a771d2 cmd/internal/obj/riscv: prevent constant loads that do not target registers
Check that the target of a constant load is a register and add test coverage
for this error condition. While here, rename the RISC-V testdata and tests
to be consistent with other platforms.

Change-Id: I7fd0bfcee8cf9df0597d72e65cd74a2d0bfd349a
Reviewed-on: https://go-review.googlesource.com/c/go/+/292895
Trust: Joel Sing <joel@sing.id.au>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
2021-02-23 09:15:50 +00:00
Russ Cox 4dd77bdc91 cmd/asm, cmd/link, runtime: introduce FuncInfo flag bits
The runtime traceback code has its own definition of which functions
mark the top frame of a stack, separate from the TOPFRAME bits that
exist in the assembly and are passed along in DWARF information.
It's error-prone and redundant to have two different sources of truth.
This CL provides the actual TOPFRAME bits to the runtime, so that
the runtime can use those bits instead of reinventing its own category.

This CL also adds a new bit, SPWRITE, which marks functions that
write directly to SP (anything but adding and subtracting constants).
Such functions must stop a traceback, because the traceback has no
way to rederive the SP on entry. Again, the runtime has its own definition
which is mostly correct, but also missing some functions. During ordinary
goroutine context switches, such functions do not appear on the stack,
so the incompleteness in the runtime usually doesn't matter.
But profiling signals can arrive at any moment, and the runtime may
crash during traceback if it attempts to unwind an SP-writing frame
and gets out-of-sync with the actual stack. The runtime contains code
to try to detect likely candidates but again it is incomplete.
Deriving the SPWRITE bit automatically from the actual assembly code
provides the complete truth, and passing it to the runtime lets the
runtime use it.
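
A minimal Go sketch of how a traceback could consult such flag bits. The type, the constant names, and the bit values are hypothetical stand-ins chosen for this example; they are not the definitions introduced by the CL.

```go
package main

import "fmt"

// funcFlag is a hypothetical stand-in for the per-function flag byte.
type funcFlag uint8

const (
	flagTopFrame funcFlag = 1 << iota // function marks the top frame of a stack
	flagSPWrite                       // function writes directly to SP
)

// stopsTraceback reports whether unwinding should stop at a function
// carrying these flags: either it is the top frame, or the SP on entry
// cannot be rederived.
func stopsTraceback(f funcFlag) bool {
	return f&(flagTopFrame|flagSPWrite) != 0
}

func main() {
	fmt.Println(stopsTraceback(0))            // prints false
	fmt.Println(stopsTraceback(flagSPWrite))  // prints true
	fmt.Println(stopsTraceback(flagTopFrame)) // prints true
}
```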

This CL is part of a stack adding windows/arm64
support (#36439), intended to land in the Go 1.17 cycle.
This CL is, however, not windows/arm64-specific.
It is cleanup meant to make the port (and future ports) easier.

Change-Id: I227f53b23ac5b3dabfcc5e8ee3f00df4e113cf58
Reviewed-on: https://go-review.googlesource.com/c/go/+/288800
Trust: Russ Cox <rsc@golang.org>
Trust: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-02-19 00:02:30 +00:00
Cherry Zhang 397a46a10a [dev.regabi] cmd/asm: define g register on AMD64
Define g register as R14 on AMD64. It is not used now, but will
be in later CLs.

The name "R14" is still recognized.

Change-Id: I9a066b15bf1051113db8c6640605e350cea397b9
Reviewed-on: https://go-review.googlesource.com/c/go/+/289195
Trust: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
2021-02-05 17:34:26 +00:00
Dan Scales a956a0e909 [dev.regabi] cmd/compile, runtime: fix up comments/error messages from recent renames
Went in a semi-automated way through the clearest renames of functions,
and updated comments and error messages where it made sense.

Change-Id: Ied8e152b562b705da7f52f715991a77dab60da35
Reviewed-on: https://go-review.googlesource.com/c/go/+/284216
Trust: Dan Scales <danscales@google.com>
Run-TryBot: Dan Scales <danscales@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2021-01-16 02:31:08 +00:00
Than McIntosh 56b783ad94 cmd/go, cmd/asm: pass -linkshared to assembler for shared linkage builds
When the -linkshared build mode is in effect, the Go command passes
the "-linkshared" command line option to the compiler so as to ensure
special handling for things like builtin functions (which may appear
in a shared library and not the main executable). This patch extends
this behavior to the assembler, since the assembler may also wind up
referencing builtins when emitting a stack-split prolog.

Fixes #43107.

Change-Id: I56eaded79789b083f3c3d800fb140353dee33ba9
Reviewed-on: https://go-review.googlesource.com/c/go/+/276932
Trust: Than McIntosh <thanm@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Jay Conrod <jayconrod@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-12-10 16:38:15 +00:00
Quey-Liang Kao 0433845ad1 cmd/asm, cmd/internal/obj/riscv: fix branch pseudo-instructions
Pseudo branch instructions BGT, BGTU, BLE, and BLEU implemented in
CL 226397 were translated inconsistently compared to other ones due
to the inversion of registers. For instance, while "BLT a, b" generates
"jump if a < b", "BLE a, b" generates "jump if b <= a."

This CL fixes the translation in the assembler and the tests.
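
The intended reading after the fix can be modelled in Go: each pseudo-instruction is the corresponding real branch with its two registers swapped, so "BLE a, b" means "jump if a <= b", consistent with "BLT a, b". The functions below are a hypothetical model of the conditions only, not assembler code.

```go
package main

import "fmt"

// Conditions of the real RISC-V branches, read as "jump if a OP b".
func blt(a, b int64) bool   { return a < b }
func bge(a, b int64) bool   { return a >= b }
func bltu(a, b uint64) bool { return a < b }
func bgeu(a, b uint64) bool { return a >= b }

// Pseudo-instructions expressed through the real ones with swapped operands.
func bgt(a, b int64) bool   { return blt(b, a) }  // a > b
func ble(a, b int64) bool   { return bge(b, a) }  // a <= b
func bgtu(a, b uint64) bool { return bltu(b, a) } // a > b, unsigned
func bleu(a, b uint64) bool { return bgeu(b, a) } // a <= b, unsigned

func main() {
	fmt.Println(ble(1, 2), ble(2, 1)) // prints true false
	fmt.Println(bgt(2, 1), bgt(1, 2)) // prints true false
}
```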

Change-Id: Ia757be73e848734ca5b3a790e081f7c4f98c30f2
Reviewed-on: https://go-review.googlesource.com/c/go/+/271911
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Joel Sing <joel@sing.id.au>
Run-TryBot: Joel Sing <joel@sing.id.au>
2020-12-02 14:20:12 +00:00
Michael Munday 189931296f cmd/internal/obj/s390x: fix SYNC instruction encoding
SYNC is supposed to correspond to 'fast-BCR-serialization' which is
encoded as 'bcr 14,0'. In CL 197178 I accidentally modified the
encoding to 'bcr 7,0' which is a no-op. This CL reverses that change.
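
The difference between the two encodings is a single mask nibble. The Go sketch below assumes the standard RR instruction format (opcode 0x07 for BCR, followed by the 4-bit mask and the 4-bit register); the function name is hypothetical.

```go
package main

import "fmt"

// encodeBCR returns the 2-byte RR-format encoding of "bcr m1,r2":
// opcode 0x07 in the high byte, then the 4-bit mask and 4-bit register.
func encodeBCR(m1, r2 uint16) uint16 {
	return 0x07<<8 | (m1&0xf)<<4 | r2&0xf
}

func main() {
	fmt.Printf("bcr 14,0 = %04x\n", encodeBCR(14, 0)) // prints bcr 14,0 = 07e0
	fmt.Printf("bcr 7,0  = %04x\n", encodeBCR(7, 0))  // prints bcr 7,0  = 0770
}
```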

Fixes #42479.

Change-Id: I9918d93d720f5e12acc3014cde20d2d32cc87ee5
Reviewed-on: https://go-review.googlesource.com/c/go/+/268797
Run-TryBot: Michael Munday <mike.munday@ibm.com>
Trust: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-11-10 15:48:04 +00:00
Paul E. Murphy 5ed81a3d14 cmd/asm: fix rlwnm reg,reg,const,reg encoding on ppc64
The wrong value for the first reg parameter was selected.
Likewise the wrong opcode was selected.  This should match
rlwnm (rrr type), not rlwinm (irr type).

Similarly, fix the optab matching rules so clrlslwi does
not match reg,reg,const,reg arguments.  This is not a valid
operand combination for clrlslwi.

Fixes #42368

Change-Id: I4eb16d45a760b9fd3f497ef9863f82465351d39f
Reviewed-on: https://go-review.googlesource.com/c/go/+/267421
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Trust: Lynn Boger <laboger@linux.vnet.ibm.com>
2020-11-04 21:26:23 +00:00
Jonathan Swinney 5f0fca1475 cmd/asm: rename arm64 instructions LDANDx to LDCLRx
The LDANDx instructions were misleading because they correspond to the
mnemonic LDCLRx as defined in the Arm Architecture Reference Manual for
Armv8. This changes the assembler to use the same mnemonic as the GNU
assembler and the manual.

The instruction has the form:

LDCLRx Rs, (Rb), Rt: *Rb -> Rt, *Rb AND NOT(Rs) -> *Rb

Change-Id: I94ae003e99e817209bba1afe960e612bf3a0b410
Reviewed-on: https://go-review.googlesource.com/c/go/+/267138
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: fannie zhang <Fannie.Zhang@arm.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: fannie zhang <Fannie.Zhang@arm.com>
2020-11-04 15:53:19 +00:00
Cherry Zhang fdba080220 cmd: remove Go115AMD64
Always do aligned jumps now.

Change-Id: If68a16fe93c9173c83323a9063465c9bd166eeb8
Reviewed-on: https://go-review.googlesource.com/c/go/+/266857
Trust: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2020-11-02 03:02:35 +00:00
Joel Sing 8b51798304 cmd/asm: remove X27 and S11 register names on riscv64
The X27 register (known as S11 via its ABI name) is the g register on riscv64.
Prevent assembly from referring to it by either of these names.

Change-Id: Iba389eb8e44e097c0142c5b3d92e72bcae8a244a
Reviewed-on: https://go-review.googlesource.com/c/go/+/265519
Trust: Joel Sing <joel@sing.id.au>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
2020-10-29 08:00:50 +00:00
fanzha02 15131caeaa cmd/internal/obj/arm64: add CASx/CASPx instructions
This patch adds support for CASx and CASPx atomic instructions.

  go syntax                 gnu syntax
CASD Rs, (Rn|RSP), Rt => cas Xs, Xt, (Xn|SP)
CASALW Rs, (Rn|RSP), Rt => casal Ws, Wt, (Xn|SP)
CASPD (Rs, Rs+1), (Rn|RSP), (Rt, Rt+1) => casp Xs, Xs+1, Xt, Xt+1, (Xn|SP)
CASPW (Rs, Rs+1), (Rn|RSP), (Rt, Rt+1) => casp Ws, Ws+1, Wt, Wt+1, (Xn|SP)

This patch changes the type of prog.RestArgs from "[]Addr" to
"[]struct{Addr, Pos}", where Pos is an enum indicating the position of
the operand.

This patch also adds test cases.

Change-Id: Ib971cfda7890b7aa895d17bab22dea326c7fcaa4
Reviewed-on: https://go-review.googlesource.com/c/go/+/233277
Trust: fannie zhang <Fannie.Zhang@arm.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-10-29 05:07:11 +00:00
fanzha02 53efbdb12e cmd/asm: sort test cases in the arm64.s file
This patch sorts the test cases in the arm64.s file by instruction
category and deletes comments related to the old parser.

Change-Id: I9bbf56281e247a4fd8d5e670e8ad67c923aef1ee
Reviewed-on: https://go-review.googlesource.com/c/go/+/263458
Trust: fannie zhang <Fannie.Zhang@arm.com>
Trust: Emmanuel Odeke <emmanuel@orijtech.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-10-29 04:12:30 +00:00
fanzha02 3089ef6bd7 cmd/asm: add several arm64 SIMD instructions
This patch enables the VSLI, VUADDW(2), VUSRA and FMOVQ SIMD instructions
required by issue #40725. The GNU syntax corresponding to 'FMOVQ' is the
128-bit ldr/str(immediate, simd&fp).

Add test cases.

Fixes #40725

Change-Id: Ide968ef4a9385ce4cd8f69bce854289014d30456
Reviewed-on: https://go-review.googlesource.com/c/go/+/258397
Trust: fannie zhang <Fannie.Zhang@arm.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-10-29 03:52:23 +00:00
Joel Sing 4a67825628 cmd/internal/obj/riscv: support additional register to register moves
Add support for signed and unsigned register to register moves of various
sizes. This makes it easier to handle zero and sign extension and will allow
for further changes that improve the compiler optimisations for riscv64.

While here, change the existing register to register moves from obj.Prog
rewriting to instruction generation.

Change-Id: Id21911019b76922367a134da13c3449a84a1fb08
Reviewed-on: https://go-review.googlesource.com/c/go/+/264657
Trust: Joel Sing <joel@sing.id.au>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-10-24 08:12:11 +00:00
Than McIntosh d595712540 cmd/asm: rename "compiling runtime" flag
Rename the assembler "-compilingRuntime" flag to "-compiling-runtime",
to be more consistent with the flag style of other Go commands.

Change-Id: I8cc5cbf0b9b34d1dd4e9fa499d3fec8c1ef10b6e
Reviewed-on: https://go-review.googlesource.com/c/go/+/263857
Trust: Than McIntosh <thanm@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-10-20 17:47:31 +00:00
Than McIntosh 4d1cecdee8 cmd/dist,cmd/go: broaden use of asm macro GOEXPERIMENT_REGABI
This extends a change made in https://golang.org/cl/252258 to the go
command (to define an asm macro when GOEXPERIMENT=regabi is in
effect); we need this same macro during the bootstrap build in order
to build the runtime correctly.

In addition, expand the set of packages where the macro is applied to
{runtime, reflect, syscall, runtime/internal/*}, and move the logic
for deciding when something is a "runtime package" out of the
assembler and into cmd/{go,dist}, introducing a new assembler command
line flag instead.

Updates #27539, #40724.

Change-Id: Ifcc7f029f56873584de1e543c55b0d3e54ad6c49
Reviewed-on: https://go-review.googlesource.com/c/go/+/262317
Trust: Than McIntosh <thanm@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-10-19 19:27:54 +00:00
Than McIntosh dd58239dd2 cmd/asm: allow def/ref of func ABI when compiling runtime
Function symbols defined and referenced by assembly source currently
always default to ABI0; this patch adds preliminary support for
accepting an explicit ABI selector clause for func defs/refs. This
functionality is currently only enabled when compiling runtime-related
packages (runtime, syscall, reflect). Examples:

  TEXT ·DefinedAbi0Symbol<ABI0>(SB),NOSPLIT,$0
        RET

  TEXT ·DefinedAbi1Symbol<ABIInternal>(SB),NOSPLIT,$0
        CALL    ·AbiZerolSym<ABI0>(SB)
	...
        JMP     ·AbiInternalSym<ABIInternal>(SB)
        RET

Also included is a small change to the code in the compiler that reads
the symabis file emitted by the assembler.

New behavior is currently gated under GOEXPERIMENT=regabi.

Updates #27539, #40724.

Change-Id: Ia22221fe26df0fa002191cfb13bdfaaa38d7df38
Reviewed-on: https://go-review.googlesource.com/c/go/+/260477
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Trust: Than McIntosh <thanm@google.com>
2020-10-19 11:25:35 +00:00
Lynn Boger e981936855 cmd/internal/obj/ppc64,cmd/asm/internal/asm/testdata: fix up ppc64 testcases
When a fix was made at the end of the last release related to
NOPs, it was discovered that the ppc64.s testcase was out of date
and contained comments that weren't being processed. Essentially the
instructions in that test were being assembled but there was no
verification that the encodings were correct. The ppc64enc.s file
was mostly complete and included the valid encodings for verification.
This change moves ppc64enc.s to ppc64.s and adds the instructions
that were missing.

This also adds a minor fix to asm9.go on the assembly of the
addex that was discovered during this testing.

Change-Id: Iaada1563b137849ad195fa88f32ecc9ab3e1e95f
Reviewed-on: https://go-review.googlesource.com/c/go/+/260217
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Trust: Lynn Boger <laboger@linux.vnet.ibm.com>
2020-10-16 12:48:42 +00:00
Russ Cox 912262b806 cmd/internal/obj: move LSym.Func into LSym.Extra
This creates space for a different kind of extension field
in LSym without making the struct any larger.
(There are many LSym, so we care about keeping the struct small.)

Change-Id: Ib16edb9e15f54c2a7351c8b875e19684058711e5
Reviewed-on: https://go-review.googlesource.com/c/go/+/243943
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-10-16 03:02:36 +00:00
Russ Cox 85f829deb8 cmd/asm: reject misplaced go:build comments
We are converting from using error-prone ad-hoc syntax // +build lines
to less error-prone, standard boolean syntax //go:build lines.
The timeline is:

Go 1.16: prepare for transition
 - Builds still use // +build for file selection.
 - Source files may not contain //go:build without // +build.
 - Builds fail when a source file contains //go:build lines without // +build lines. <<<

Go 1.17: start transition
 - Builds prefer //go:build for file selection, falling back to // +build
   for files containing only // +build.
 - Source files may contain //go:build without // +build (but they won't build with Go 1.16).
 - Gofmt moves //go:build and // +build lines to proper file locations.
 - Gofmt introduces //go:build lines into files with only // +build lines.
 - Go vet rejects files with mismatched //go:build and // +build lines.

Go 1.18: complete transition
 - Go fix removes // +build lines, leaving behind equivalent //go:build lines.

This CL provides part of the <<< marked line above in the Go 1.16 step:
rejecting files containing //go:build but not // +build.

Reject any //go:build comments found after actual assembler code
(including #include and other directives), because the go command itself
doesn't read that far.
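
The two checks described above can be sketched in Go as follows. This is a simplified illustration with hypothetical function names: it works line by line, ignores block comments, and is not the logic used by cmd/asm or the go command.

```go
package main

import (
	"fmt"
	"strings"
)

// hasGoBuildWithoutPlusBuild reports whether src contains a //go:build
// line but no // +build line (the Go 1.16 rejection case).
func hasGoBuildWithoutPlusBuild(src string) bool {
	var goBuild, plusBuild bool
	for _, line := range strings.Split(src, "\n") {
		line = strings.TrimSpace(line)
		switch {
		case strings.HasPrefix(line, "//go:build "):
			goBuild = true
		case strings.HasPrefix(line, "// +build "):
			plusBuild = true
		}
	}
	return goBuild && !plusBuild
}

// goBuildAfterCode reports whether a //go:build line appears after the
// first line that is neither blank nor a // comment.
func goBuildAfterCode(src string) bool {
	seenCode := false
	for _, line := range strings.Split(src, "\n") {
		line = strings.TrimSpace(line)
		switch {
		case line == "":
			continue
		case strings.HasPrefix(line, "//go:build "):
			if seenCode {
				return true
			}
		case strings.HasPrefix(line, "//"):
			continue
		default:
			seenCode = true // assembler code or a directive such as #include
		}
	}
	return false
}

func main() {
	src := "#include \"textflag.h\"\n//go:build amd64\n"
	fmt.Println(hasGoBuildWithoutPlusBuild(src)) // prints true
	fmt.Println(goBuildAfterCode(src))           // prints true
}
```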

For #41184.

Change-Id: Ib460bfd380cce4239993980dd208afd07deff3f1
Reviewed-on: https://go-review.googlesource.com/c/go/+/240602
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2020-10-13 01:16:56 +00:00
Meng Zhuo 3036b76df0 cmd/asm: Add SHA3 hardware instructions for ARM64
Armv8.2-SHA introduced four SHA3-related instructions

EOR3 <Vd>.16B, <Vn>.16B, <Vm>.16B, <Va>.16B
RAX1 <Vd>.2D, <Vn>.2D, <Vm>.2D
XAR <Vd>.2D, <Vn>.2D, <Vm>.2D, #<imm6>
BCAX <Vd>.16B, <Vn>.16B, <Vm>.16B, <Va>.16B

We convert them into Go asm style as:

VEOR3 <Va>.B16, <Vm>.B16, <Vn>.B16, <Vd>.B16
VRAX1 <Vm>.D2, <Vn>.D2, <Vd>.D2
VXAR $imm6, <Vm>.D2, <Vn>.D2, <Vd>.D2
VBCAX <Va>.B16, <Vm>.B16, <Vn>.B16, <Vd>.B16
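
The conversion above mostly amounts to reversing the operand order (the arrangement spelling and the V prefix also change, which this sketch ignores). A minimal Go illustration with a hypothetical helper name:

```go
package main

import "fmt"

// reverseOperands returns the operands in reverse order, mirroring how the
// GNU destination-first order maps to Go's destination-last order.
func reverseOperands(ops []string) []string {
	out := make([]string, len(ops))
	for i, op := range ops {
		out[len(ops)-1-i] = op
	}
	return out
}

func main() {
	gnu := []string{"Vd", "Vn", "Vm", "Va"} // EOR3 operand order
	fmt.Println(reverseOperands(gnu))       // prints [Va Vm Vn Vd]
}
```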

Armv8 Reference Manual:
* EOR3 (Three-way Exclusive OR) on C7.2.42
* RAX1 (Rotate and Exclusive OR) on C7.2.217
* XAR (Exclusive OR and Rotate) on C7.2.401
* BCAX (Bit Clear and Exclusive OR) on C7.2.12

Change-Id: I9a5d1b5ad508ed8fd5289d535906c54d9a63ca5a
Reviewed-on: https://go-review.googlesource.com/c/go/+/180757
Run-TryBot: Meng Zhuo <mzh@golangcn.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Trust: Emmanuel Odeke <emm.odeke@gmail.com>
2020-10-10 16:06:07 +00:00
Lynn Boger bdab5df40f cmd/compile,cmd/internal/obj/ppc64: use mulli where possible
This adds support to allow the use of mulli when one of the multiply
operands is a constant that fits in 16 bits.

This especially helps in the case where this instruction appears in
a loop since the load of the constant is not being moved out of the loop.
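
The "fits in 16 bits" condition can be sketched in Go as below, assuming the immediate is a signed 16-bit value as in the Power ISA definition of mulli. The function name is hypothetical and the compiler's actual rule is not reproduced here.

```go
package main

import "fmt"

// fitsInt16 reports whether c survives a round trip through int16,
// that is, whether it is representable as a signed 16-bit immediate.
func fitsInt16(c int64) bool {
	return c == int64(int16(c))
}

func main() {
	fmt.Println(fitsInt16(32767), fitsInt16(32768)) // prints true false
}
```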

Some improvements seen in compress/flate on power9:

name                           old time/op    new time/op    delta
Decode/Digits/Huffman/1e4         259µs ± 0%     261µs ± 0%   +0.57%  (p=1.000 n=1+1)
Decode/Digits/Huffman/1e5        2.43ms ± 0%    2.45ms ± 0%   +0.79%  (p=1.000 n=1+1)
Decode/Digits/Huffman/1e6        23.9ms ± 0%    24.2ms ± 0%   +0.86%  (p=1.000 n=1+1)
Decode/Digits/Speed/1e4           278µs ± 0%     279µs ± 0%   +0.34%  (p=1.000 n=1+1)
Decode/Digits/Speed/1e5          2.80ms ± 0%    2.81ms ± 0%   +0.29%  (p=1.000 n=1+1)
Decode/Digits/Speed/1e6          28.0ms ± 0%    28.1ms ± 0%   +0.28%  (p=1.000 n=1+1)
Decode/Digits/Default/1e4         278µs ± 0%     278µs ± 0%   +0.28%  (p=1.000 n=1+1)
Decode/Digits/Default/1e5        2.68ms ± 0%    2.69ms ± 0%   +0.19%  (p=1.000 n=1+1)
Decode/Digits/Default/1e6        26.6ms ± 0%    26.6ms ± 0%   +0.21%  (p=1.000 n=1+1)
Decode/Digits/Compression/1e4     278µs ± 0%     278µs ± 0%   +0.00%  (p=1.000 n=1+1)
Decode/Digits/Compression/1e5    2.68ms ± 0%    2.69ms ± 0%   +0.21%  (p=1.000 n=1+1)
Decode/Digits/Compression/1e6    26.6ms ± 0%    26.6ms ± 0%   +0.07%  (p=1.000 n=1+1)
Decode/Newton/Huffman/1e4         322µs ± 0%     312µs ± 0%   -2.84%  (p=1.000 n=1+1)
Decode/Newton/Huffman/1e5        3.11ms ± 0%    2.91ms ± 0%   -6.41%  (p=1.000 n=1+1)
Decode/Newton/Huffman/1e6        31.4ms ± 0%    29.3ms ± 0%   -6.85%  (p=1.000 n=1+1)
Decode/Newton/Speed/1e4           282µs ± 0%     269µs ± 0%   -4.69%  (p=1.000 n=1+1)
Decode/Newton/Speed/1e5          2.29ms ± 0%    2.20ms ± 0%   -4.13%  (p=1.000 n=1+1)
Decode/Newton/Speed/1e6          22.7ms ± 0%    21.3ms ± 0%   -6.06%  (p=1.000 n=1+1)
Decode/Newton/Default/1e4         254µs ± 0%     237µs ± 0%   -6.60%  (p=1.000 n=1+1)
Decode/Newton/Default/1e5        1.86ms ± 0%    1.75ms ± 0%   -5.99%  (p=1.000 n=1+1)
Decode/Newton/Default/1e6        18.1ms ± 0%    17.4ms ± 0%   -4.10%  (p=1.000 n=1+1)
Decode/Newton/Compression/1e4     254µs ± 0%     244µs ± 0%   -3.91%  (p=1.000 n=1+1)
Decode/Newton/Compression/1e5    1.85ms ± 0%    1.79ms ± 0%   -3.10%  (p=1.000 n=1+1)
Decode/Newton/Compression/1e6    18.0ms ± 0%    17.3ms ± 0%   -3.88%  (p=1.000 n=1+1)

Change-Id: I840320fab1c4bf64c76b001c2651ab79f23df4eb
Reviewed-on: https://go-review.googlesource.com/c/go/+/259444
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Paul Murphy <murp@ibm.com>
Reviewed-by: Carlos Eduardo Seo <carlos.seo@gmail.com>
Trust: Lynn Boger <laboger@linux.vnet.ibm.com>
2020-10-06 19:40:46 +00:00
Keith Randall fe2cfb74ba all: drop 387 support
My last 387 CL. So sad ... ... ... ... not!

Fixes #40255

Change-Id: I8d4ddb744b234b8adc735db2f7c3c7b6d8bbdfa4
Reviewed-on: https://go-review.googlesource.com/c/go/+/258957
Trust: Keith Randall <khr@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-10-02 00:00:51 +00:00
Lynn Boger cc2a5cf4b8 cmd/compile,cmd/internal/obj/ppc64: fix some shift rules due to a regression
A recent change to improve shifts was generating some
invalid cases when the rule was based on an AND. The
extended mnemonics CLRLSLDI and CLRLSLWI only allow
certain values for the operands and in the mask case
those values were not being checked properly. This
adds a check to those rules to verify that the
'b' and 'n' values used when an AND was part of the rule
have correct values.

There was a bug in some diag messages in asm9. The
message expected 3 values but only provided 2. Those are
corrected here also.

The test/codegen/shift.go was updated to add a few more
cases to check for the case mentioned here.

Some of the comments that mention the order of operands
in these extended mnemonics were wrong and those have been
corrected.

Fixes #41683.

Change-Id: If5bb860acaa5051b9e0cd80784b2868b85898c31
Reviewed-on: https://go-review.googlesource.com/c/go/+/258138
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Paul Murphy <murp@ibm.com>
Reviewed-by: Carlos Eduardo Seo <carlos.seo@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Lynn Boger <laboger@linux.vnet.ibm.com>
2020-10-01 18:51:18 +00:00
Lynn Boger a424f6e45e cmd/asm,cmd/compile,cmd/internal/obj/ppc64: add extswsli support on power9
This adds support for the extswsli instruction which combines
extsw followed by a shift.
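
The combined operation can be modelled in Go as a sign extension of the low 32 bits followed by a left shift. This is only a sketch of the semantics stated above; the function name is hypothetical and the code says nothing about how the compiler rules match the pattern.

```go
package main

import "fmt"

// extswsli models "extsw followed by a shift": sign-extend the low
// 32 bits of x to 64 bits, then shift left by sh.
func extswsli(x uint64, sh uint) int64 {
	return int64(int32(x)) << sh
}

func main() {
	fmt.Println(extswsli(0xFFFFFFFF, 2)) // low word is -1, so the result is -4
	fmt.Println(extswsli(7, 4))          // 7 << 4 = 112
}
```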

New benchmark demonstrates the improvement:
name      old time/op  new time/op  delta
ExtShift  1.34µs ± 0%  1.30µs ± 0%  -3.15%  (p=0.057 n=4+3)

Change-Id: I21b410676fdf15d20e0cbbaa75d7c6dcd3bbb7b0
Reviewed-on: https://go-review.googlesource.com/c/go/+/257017
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Carlos Eduardo Seo <carlos.seo@gmail.com>
Trust: Lynn Boger <laboger@linux.vnet.ibm.com>
2020-09-28 18:13:48 +00:00
fanzha02 fa04d488bd cmd/asm: fix the issue of moving 128-bit integers to vector registers on arm64
The CL 249758 added `FMOVQ $vcon, Vd` instruction and assembler used
128-bit simd literal-loading to load `$vcon` from pool into 128-bit vector
register `Vd`. Because Go does not have 128-bit integers for now, the
assembler will report an error of `immediate out of range` when
assembling the `FMOVQ $0x123456789abcdef0123456789abcdef, V0` instruction.

This patch lets 128-bit integers take two 64-bit operands, for the high
and low parts separately and adds `VMOVQ $hi, $lo, Vd` instruction to
move `$hi<<64+$lo` into 128-bit register `Vd`.
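
A minimal sketch of how a 128-bit constant maps onto the two 64-bit operands, using math/big because Go has no 128-bit integer type. The helper names split128 and join128 are hypothetical, and the input is assumed to fit in 128 bits.

```go
package main

import (
	"fmt"
	"math/big"
)

// split128 returns the high and low 64-bit halves of v.
// v is assumed to be non-negative and to fit in 128 bits.
func split128(v *big.Int) (hi, lo uint64) {
	mask := new(big.Int).SetUint64(^uint64(0))
	lo = new(big.Int).And(v, mask).Uint64()
	hi = new(big.Int).Rsh(v, 64).Uint64()
	return hi, lo
}

// join128 recombines the halves as hi<<64 + lo.
func join128(hi, lo uint64) *big.Int {
	v := new(big.Int).SetUint64(hi)
	v.Lsh(v, 64)
	return v.Add(v, new(big.Int).SetUint64(lo))
}

func main() {
	// The constant from the failing FMOVQ example above.
	v, _ := new(big.Int).SetString("123456789abcdef0123456789abcdef", 16)
	hi, lo := split128(v)
	fmt.Printf("VMOVQ $%#x, $%#x, V0\n", hi, lo)
	fmt.Println(join128(hi, lo).Cmp(v) == 0) // true: the round trip is lossless
}
```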

In addition, this patch renames the `FMOVQ/FMOVD/FMOVS` ops to `VMOVQ/VMOVD/VMOVS`
and uses them to move 128-bit, 64-bit and 32-bit constants into vector
registers, respectively.

Update the go doc.

Fixes #40725

Change-Id: Ia3c83bb6463f104d2bee960905053a97299e0a3a
Reviewed-on: https://go-review.googlesource.com/c/go/+/255900
Trust: fannie zhang <Fannie.Zhang@arm.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-09-25 01:47:40 +00:00
Than McIntosh 75fab04b83 cmd/asm: make asm -S flag consistent with compile -S flag
Change things so that the -S command line option for the assembler
works the same as -S in the compiler, e.g. you can use -S=2 to
get additional detail.

Change-Id: I7bdfba39a98e67c7ae4b93019e171b188bb99a2d
Reviewed-on: https://go-review.googlesource.com/c/go/+/255717
Trust: Than McIntosh <thanm@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
2020-09-17 23:22:25 +00:00
Lynn Boger 967465da29 cmd/compile: use combined shifts to improve array addressing on ppc64x
This change adds rules to find pairs of instructions that can
be combined into a single shift. These instruction sequences
are common in array addressing within loops. Improvements can
be seen in many crypto packages and the hash packages.

These are based on the extended mnemonics found in the ISA
sections C.8.1 and C.8.2.

Some rules in PPC64.rules were moved because the ordering prevented
some matching.

The following results were generated on power9.

hash/crc32:
    CRC32/poly=Koopman/size=40/align=0          195ns ± 0%     163ns ± 0%  -16.41%
    CRC32/poly=Koopman/size=40/align=1          200ns ± 0%     163ns ± 0%  -18.50%
    CRC32/poly=Koopman/size=512/align=0        1.98µs ± 0%    1.67µs ± 0%  -15.46%
    CRC32/poly=Koopman/size=512/align=1        1.98µs ± 0%    1.69µs ± 0%  -14.80%
    CRC32/poly=Koopman/size=1kB/align=0        3.90µs ± 0%    3.31µs ± 0%  -15.27%
    CRC32/poly=Koopman/size=1kB/align=1        3.85µs ± 0%    3.31µs ± 0%  -14.15%
    CRC32/poly=Koopman/size=4kB/align=0        15.3µs ± 0%    13.1µs ± 0%  -14.22%
    CRC32/poly=Koopman/size=4kB/align=1        15.4µs ± 0%    13.1µs ± 0%  -14.79%
    CRC32/poly=Koopman/size=32kB/align=0        137µs ± 0%     105µs ± 0%  -23.56%
    CRC32/poly=Koopman/size=32kB/align=1        137µs ± 0%     105µs ± 0%  -23.53%

crypto/rc4:
    RC4_128    733ns ± 0%    650ns ± 0%  -11.32%  (p=1.000 n=1+1)
    RC4_1K    5.80µs ± 0%   5.17µs ± 0%  -10.89%  (p=1.000 n=1+1)
    RC4_8K    45.7µs ± 0%   40.8µs ± 0%  -10.73%  (p=1.000 n=1+1)

crypto/sha1:
    Hash8Bytes       635ns ± 0%     613ns ± 0%   -3.46%  (p=1.000 n=1+1)
    Hash320Bytes    2.30µs ± 0%    2.18µs ± 0%   -5.38%  (p=1.000 n=1+1)
    Hash1K          5.88µs ± 0%    5.38µs ± 0%   -8.62%  (p=1.000 n=1+1)
    Hash8K          42.0µs ± 0%    37.9µs ± 0%   -9.75%  (p=1.000 n=1+1)

There are other improvements found in golang.org/x/crypto which are all in the
range of 5-15%.

Change-Id: I193471fbcf674151ffe2edab212799d9b08dfb8c
Reviewed-on: https://go-review.googlesource.com/c/go/+/252097
Trust: Lynn Boger <laboger@linux.vnet.ibm.com>
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
2020-09-17 12:37:40 +00:00
diaxu01 a86b6f23f0 cmd/internal/obj/arm64: optimize the instruction of moving long effective stack address
Currently, when the offset of "MOVD $offset(Rn), Rd" is a large positive
constant or a negative constant, the assembler loads this offset from
the constant pool. This patch avoids the constant pool by encoding the
offset into two ADD instructions if it is a large positive constant, or one
SUB instruction if it is negative. Very large negative offsets are rarely
used, so that case is not optimized.

Optimized case 1: MOVD $-0x100000(R7), R0
Before: LDR 0x67670(constant pool), R27; ADD R27.UXTX, R0, R7
After: SUB $0x100000, R7, R0

Optimized case 2: MOVD $0x123468(R7), R0
Before: LDR 0x67670(constant pool), R27; ADD R27.UXTX, R0, R7
After: ADD $0x123000, R7, R27; ADD $0x000468, R27, R0
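
The split in optimized case 2 can be sketched as follows, assuming each ADD carries a 12-bit immediate, optionally shifted left by 12, so the offset must fit in 24 bits. The function name is hypothetical and this is not the assembler's actual code.

```go
package main

import "fmt"

// splitOffset splits off (assumed 0 <= off < 1<<24) into a high part,
// a 12-bit value shifted left by 12, and a low 12-bit part. Each part
// is small enough for one ADD immediate under the assumption above.
func splitOffset(off uint32) (hi, lo uint32) {
	return off &^ 0xfff, off & 0xfff
}

func main() {
	hi, lo := splitOffset(0x123468)
	fmt.Printf("ADD $%#x, R7, R27; ADD $%#x, R27, R0\n", hi, lo)
}
```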

1. Binary size before/after.
binary                 size change
pkg/linux_arm64        +4KB
pkg/tool/linux_arm64   no change
go                     no change
gofmt                  no change

2. go1 benchmark.
name                      old time/op                new time/op                delta
pkg:test/bench/go1 goos:linux goarch:arm64
BinaryTree17-64           7335721401.800000ns +-40%  6264542009.800000ns +-14%    ~     (p=0.421 n=5+5)
Fannkuch11-64             3886551822.600000ns +- 0%  3875870590.200000ns +- 0%    ~     (p=0.151 n=5+5)
FmtFprintfEmpty-64                82.960000ns +- 1%          83.900000ns +- 2%  +1.13%  (p=0.048 n=5+5)
FmtFprintfString-64              149.200000ns +- 1%         148.000000ns +- 0%  -0.80%  (p=0.016 n=5+4)
FmtFprintfInt-64                 177.000000ns +- 0%         178.400000ns +- 2%    ~     (p=0.794 n=4+5)
FmtFprintfIntInt-64              240.200000ns +- 2%         239.400000ns +- 4%    ~     (p=0.302 n=5+5)
FmtFprintfPrefixedInt-64         300.400000ns +- 0%         299.200000ns +- 1%    ~     (p=0.119 n=5+5)
FmtFprintfFloat-64               360.000000ns +- 0%         361.600000ns +- 3%    ~     (p=0.349 n=4+5)
FmtManyArgs-64                  1064.400000ns +- 1%        1061.400000ns +- 0%    ~     (p=0.087 n=5+5)
GobDecode-64                12080404.400000ns +- 2%    11637601.000000ns +- 1%  -3.67%  (p=0.008 n=5+5)
GobEncode-64                 8474973.800000ns +- 2%     7977801.600000ns +- 2%  -5.87%  (p=0.008 n=5+5)
Gzip-64                    416501238.400000ns +- 0%   410463405.400000ns +- 0%  -1.45%  (p=0.008 n=5+5)
Gunzip-64                   58088415.200000ns +- 0%    58826209.600000ns +- 0%  +1.27%  (p=0.008 n=5+5)
HTTPClientServer-64           128660.200000ns +-23%      117840.800000ns +- 8%    ~     (p=0.222 n=5+5)
JSONEncode-64               17547746.800000ns +- 4%    17216180.000000ns +- 1%    ~     (p=0.222 n=5+5)
JSONDecode-64               80879896.000000ns +- 1%    80063737.200000ns +- 0%  -1.01%  (p=0.008 n=5+5)
Mandelbrot200-64             5484901.600000ns +- 0%     5483614.400000ns +- 0%    ~     (p=0.310 n=5+5)
GoParse-64                   6201166.800000ns +- 6%     6150920.600000ns +- 1%    ~     (p=0.548 n=5+5)
RegexpMatchEasy0_32-64           135.000000ns +- 0%         139.200000ns +- 7%    ~     (p=0.643 n=5+5)
RegexpMatchEasy0_1K-64           484.600000ns +- 2%         483.800000ns +- 2%    ~     (p=0.984 n=5+5)
RegexpMatchEasy1_32-64           128.000000ns +- 1%         124.600000ns +- 1%  -2.66%  (p=0.008 n=5+5)
RegexpMatchEasy1_1K-64           769.400000ns +- 2%         761.400000ns +- 1%    ~     (p=0.460 n=5+5)
RegexpMatchMedium_32-64           12.900000ns +- 0%          12.500000ns +- 0%  -3.10%  (p=0.008 n=5+5)
RegexpMatchMedium_1K-64        57879.200000ns +- 1%       56512.200000ns +- 0%  -2.36%  (p=0.008 n=5+5)
RegexpMatchHard_32-64           3091.600000ns +- 1%        3071.000000ns +- 0%  -0.67%  (p=0.048 n=5+5)
RegexpMatchHard_1K-64          92941.200000ns +- 1%       92794.000000ns +- 0%    ~     (p=1.000 n=5+5)
Revcomp-64                1695605187.000000ns +-54%  1821697637.400000ns +-47%    ~     (p=1.000 n=5+5)
Template-64                112839686.800000ns +- 1%   109964069.200000ns +- 3%    ~     (p=0.095 n=5+5)
TimeParse-64                     587.000000ns +- 0%         587.000000ns +- 0%    ~     (all equal)
TimeFormat-64                    586.000000ns +- 1%         584.200000ns +- 1%    ~     (p=0.659 n=5+5)
[Geo mean]                      81804.262218ns             80694.712973ns       -1.36%

name                      old speed                  new speed                  delta
pkg:test/bench/go1 goos:linux goarch:arm64
GobDecode-64                         63.6MB/s +- 2%             66.0MB/s +- 1%  +3.78%  (p=0.008 n=5+5)
GobEncode-64                         90.6MB/s +- 2%             96.2MB/s +- 2%  +6.23%  (p=0.008 n=5+5)
Gzip-64                              46.6MB/s +- 0%             47.3MB/s +- 0%  +1.47%  (p=0.008 n=5+5)
Gunzip-64                             334MB/s +- 0%              330MB/s +- 0%  -1.25%  (p=0.008 n=5+5)
JSONEncode-64                         111MB/s +- 4%              113MB/s +- 1%    ~     (p=0.222 n=5+5)
JSONDecode-64                        24.0MB/s +- 1%             24.2MB/s +- 0%  +1.02%  (p=0.008 n=5+5)
GoParse-64                           9.35MB/s +- 6%             9.42MB/s +- 1%    ~     (p=0.571 n=5+5)
RegexpMatchEasy0_32-64                237MB/s +- 0%              231MB/s +- 7%    ~     (p=0.690 n=5+5)
RegexpMatchEasy0_1K-64               2.11GB/s +- 2%             2.12GB/s +- 2%    ~     (p=1.000 n=5+5)
RegexpMatchEasy1_32-64                250MB/s +- 1%              257MB/s +- 1%  +2.63%  (p=0.008 n=5+5)
RegexpMatchEasy1_1K-64               1.33GB/s +- 2%             1.35GB/s +- 1%    ~     (p=0.548 n=5+5)
RegexpMatchMedium_32-64              77.6MB/s +- 0%             79.8MB/s +- 0%  +2.80%  (p=0.008 n=5+5)
RegexpMatchMedium_1K-64              17.7MB/s +- 1%             18.1MB/s +- 0%  +2.41%  (p=0.008 n=5+5)
RegexpMatchHard_32-64                10.4MB/s +- 1%             10.4MB/s +- 0%    ~     (p=0.056 n=5+5)
RegexpMatchHard_1K-64                11.0MB/s +- 1%             11.0MB/s +- 0%    ~     (p=0.984 n=5+5)
Revcomp-64                            188MB/s +-71%              155MB/s +-71%    ~     (p=1.000 n=5+5)
Template-64                          17.2MB/s +- 1%             17.7MB/s +- 3%    ~     (p=0.095 n=5+5)
[Geo mean]                            79.2MB/s                   79.3MB/s       +0.24%

Change-Id: I593ac3e7037afafc3605ad4b0cfb51d5dd88015d
Reviewed-on: https://go-review.googlesource.com/c/go/+/232438
Trust: Alberto Donizetti <alb.donizetti@gmail.com>
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-09-16 14:56:18 +00:00
Junchen Li d7ab277eed cmd/asm: add more SIMD instructions on arm64
This CL adds USHLL, USHLL2, UZP1, UZP2, and BIF instructions requested
by #40725. And since UXTL* are aliases of USHLL*, this CL also merges
them into one case.

Updates #40725

Change-Id: I404a4fdaf953319f72eea548175bec1097a2a816
Reviewed-on: https://go-review.googlesource.com/c/go/+/253659
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-09-10 15:48:36 +00:00
fanzha02 dfdc3880b0 cmd/internal/obj/arm64: enable some SIMD instructions
Enable the VBSL, VBIT, VCMTST, VUXTL, VUXTL2 and FMOVQ SIMD
instructions required by issue #40725. The FMOVQ
instruction is used to move a large constant to a Vn
register.

Add test cases.

Fixes #40725

Change-Id: I1cac1922a0a0165d698a4b73a41f7a5f0a0ad549
Reviewed-on: https://go-review.googlesource.com/c/go/+/249758
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-09-10 02:22:19 +00:00
fanzha02 0e19aaabc0 cmd/asm: fix the error of checking the post-index offset of VLD[1-4]R instructions of arm64
The post-index offset of VLD[1-4]R instructions is determined by the
"size" field, not the "Q" field. The current assembler uses the "Q" field
to check the post-index offset, which is incorrect.
This patch fixes it.

Fixes #40725

Change-Id: If1cde7f21c6b3ee0e491649eb567700bd1475c84
Reviewed-on: https://go-review.googlesource.com/c/go/+/249757
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-09-07 03:28:25 +00:00
Xiangdong Ji 5d0b35ca98 cmd/asm: Always use go-style arrangement specifiers on ARM64
Fixing several error message and comment texts of the ARM64 assembler
to use arrangement specifiers of Go's assembly style.

Change-Id: Icdbb14fba7aaede40d57d0d754795b050366a1ab
Reviewed-on: https://go-review.googlesource.com/c/go/+/237859
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-08-28 16:40:13 +00:00
Paul E. Murphy 64350f1eab cmd/asm,cmd/internal/obj/ppc64: add {l,st}xvx power9 instructions
These are the indexed vsx load operations with the
same endian and alignment benefits of {l,st}vx.

Likewise, cleanup redundant comments in op{load,store}x and
fix ISA 3.0 typos nearby.

Change-Id: Ie1ace17c6150cf9168a834e435114028ff6eb07c
Reviewed-on: https://go-review.googlesource.com/c/go/+/249025
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
2020-08-19 17:54:18 +00:00
Meng Zhuo 93eeb819ca cmd/asm: Add SHA512 hardware instructions for ARM64
ARMv8.2-SHA adds SHA512 instructions:

1. SHA512H	Vm.D2, Vn, Vd
2. SHA512H2	Vm.D2, Vn, Vd
3. SHA512SU0	Vn.D2, Vd.D2
4. SHA512SU1	Vm.D2, Vn.D2, Vd.D2

ARMv8 Architecture Reference Manual C7.2.234-C7.2.234

Change-Id: Ie970fef1bba5312ad466f246035da4c40a1bbb39
Reviewed-on: https://go-review.googlesource.com/c/go/+/180057
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-08-18 17:30:53 +00:00
Cherry Zhang 526d99a49a [dev.link] cmd/internal/obj: handle content-addressable symbols with relocations
For content-addressable symbols with relocations, we build a
content hash based on its content and relocations. Depending on
the category of the referenced symbol, we choose different hash
algorithms such that the hash is globally consistent.

For now, we only support content-addressable symbols with
relocations when the current package's import path is known, so
that the symbol names are fully expanded. Otherwise, if the
referenced symbol is a named symbol whose name is not fully
expanded, the hash won't be globally consistent, and can cause
erroneous collisions. This is fine for now, as the deduplication
is just an optimization, not a requirement for correctness (until
we get to type descriptors).
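
A heavily simplified sketch of hashing a symbol's content together with its relocations. Everything here is an assumption made for illustration: the reloc type, its fields, the use of SHA-256, and identifying the target by its fully expanded name. The actual cmd/internal/obj code chooses the hash algorithm by the category of the referenced symbol and is not reproduced here.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// reloc is a hypothetical, simplified relocation record.
type reloc struct {
	Off    uint32 // offset of the relocation within the symbol data
	Type   uint8  // relocation kind
	Target string // fully expanded name of the referenced symbol
}

// contentHash hashes the symbol data followed by each relocation.
// Length prefixes keep adjacent variable-length fields unambiguous.
func contentHash(data []byte, relocs []reloc) [sha256.Size]byte {
	h := sha256.New()
	var buf [8]byte
	binary.LittleEndian.PutUint64(buf[:], uint64(len(data)))
	h.Write(buf[:])
	h.Write(data)
	for _, r := range relocs {
		binary.LittleEndian.PutUint32(buf[:4], r.Off)
		h.Write(buf[:4])
		h.Write([]byte{r.Type})
		binary.LittleEndian.PutUint64(buf[:], uint64(len(r.Target)))
		h.Write(buf[:])
		h.Write([]byte(r.Target))
	}
	var sum [sha256.Size]byte
	copy(sum[:], h.Sum(nil))
	return sum
}

func main() {
	data := []byte{0, 0, 0, 0, 0, 0, 0, 0}
	a := contentHash(data, []reloc{{Off: 0, Type: 1, Target: "example.com/p.f"}})
	b := contentHash(data, []reloc{{Off: 0, Type: 1, Target: "example.com/q.f"}})
	fmt.Println(a == b) // false: same bytes, different relocation target
}
```

In this sketch the fully expanded target name is what keeps the hash consistent across packages, which mirrors why the commit requires the import path to be known.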

Change-Id: I639e4e03dd749b5d71f0a55c2525926575b1ac30
Reviewed-on: https://go-review.googlesource.com/c/go/+/243142
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Jeremy Faller <jeremy@golang.org>
2020-07-20 17:26:32 +00:00