The comments in cmd/internal/obj/funcdata.go are identical to the
comments in runtime/funcdata.h, but the majority of the definitions
they refer to don't apply to Go sources and have been stripped out of
funcdata.go.
Remove these stale comments from funcdata.go and clean up the
references to other copies of the PCDATA and FUNCDATA indexes.
Change-Id: I5d6e49a6e586cc9aecd7c3ce1567679f2a605884
Reviewed-on: https://go-review.googlesource.com/37330
Reviewed-by: Keith Randall <khr@golang.org>
Fix the encoding of the SH field for rldimi.
The SH field of rldimi is 6-bit wide and it is not contiguous in the instruction.
Bits 0-4 are placed in bit fields 16-20 in the instruction, while bit 5 is
placed in bit field 30. The current implementation does not consider this and,
therefore, any SH field between 32 and 63 are encoded wrongly in the instruciton.
Change-Id: I4d25a0a70f4219569be0e18160dea5505bd7fff0
Reviewed-on: https://go-review.googlesource.com/37350
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Make the comments a bit clearer and more accurate,
in anticipation of updating the code.
Change-Id: I1111e6c3405a8688fcd29b809a48a762ff41edaa
Reviewed-on: https://go-review.googlesource.com/36833
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
The s390x port was based on the ppc64 port and, because of the way the
port was done, inherited some instructions from it. ppc64 supports
3-operand (4-operand for FMADD etc.) floating point instructions
but s390x doesn't (the destination register is always an input) and
so these were emulated.
There is a bug in the emulation of FMADD whereby if the destination
register is also a source for the multiplication it will be
clobbered. This doesn't break any assembly code in the std lib but
could affect future work.
To fix this I have gone through the floating point instructions and
removed all unnecessary 3-/4-operand emulation. The compiler doesn't
need it and assembly writers don't need it, it's just a source of
bugs.
I've also deleted the FNMADD family of emulated instructions. They
aren't used anywhere.
Change-Id: Ic07cedcf141a6a3b43a0c84895460f6cfbf56c04
Reviewed-on: https://go-review.googlesource.com/33350
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
This change adds instructions from ISA 2.05, 2.06 and 2.07 that are frequently
used in assembly optimizations for ppc64.
It also fixes two problems:
* the implementation of RLDICR[CC]/RLDICL[CC] did not consider all possible
cases for the bit mask.
* removed two non-existing instructions that were added by mistake in the VMX
implementation (VORL/VANDL).
Change-Id: Iaef4e5c6a5240c2156c6c0f28ad3bcd8780e9830
Reviewed-on: https://go-review.googlesource.com/36230
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
In cmd/compile, we can directly construct obj.Auto to represent local
variables and attach them to the function's obj.LSym.
In preparation for being able to emit more precise DWARF info based on
other compiler available information (e.g., lexical scoping).
Change-Id: I9c4225ec59306bec42552838493022e0e9d70228
Reviewed-on: https://go-review.googlesource.com/36420
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
This code is dead as a result of
* removing the Follow pass
* moving rotation detection from walk to ssa
Change-Id: I14599c85bedb4e3148347b547e724187920182c4
Reviewed-on: https://go-review.googlesource.com/36484
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Also adds tests for all missing VRI-a instructions (which may be
affected by this change).
Fixes#18749.
Change-Id: I48249dda626f32555da9ab58659e2e140de6504a
Reviewed-on: https://go-review.googlesource.com/35561
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
XPos is a compact (8 instead of 16 bytes on a 64bit machine) source
position representation. There is a 1:1 correspondence between each
XPos and each regular Pos, translated via a global table.
In some sense this brings back the LineHist, though positions can
track line and column information; there is a O(1) translation
between the representations (no binary search), and the translation
is factored out.
The size increase with the prior change is brought down again and
the compiler speed is in line with the master repo (measured on
the same "quiet" machine as for prior change):
name old time/op new time/op delta
Template 256ms ± 1% 262ms ± 2% ~ (p=0.063 n=5+4)
Unicode 132ms ± 1% 135ms ± 2% ~ (p=0.063 n=5+4)
GoTypes 891ms ± 1% 871ms ± 1% -2.28% (p=0.016 n=5+4)
Compiler 3.84s ± 2% 3.89s ± 2% ~ (p=0.413 n=5+4)
MakeBash 47.1s ± 1% 46.2s ± 2% ~ (p=0.095 n=5+5)
name old user-ns/op new user-ns/op delta
Template 309M ± 1% 314M ± 2% ~ (p=0.111 n=5+4)
Unicode 165M ± 1% 172M ± 9% ~ (p=0.151 n=5+5)
GoTypes 1.14G ± 2% 1.12G ± 1% ~ (p=0.063 n=5+4)
Compiler 5.00G ± 1% 4.96G ± 1% ~ (p=0.286 n=5+4)
Change-Id: Icc570cc60ab014d8d9af6976f1f961ab8828cc47
Reviewed-on: https://go-review.googlesource.com/34506
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This replaces the src.Pos LineHist-based position tracking with
the syntax.Pos implementation and updates all uses.
The LineHist table is not used anymore - the respective code is still
there but should be removed eventually. CL forthcoming.
Passes toolstash -cmp when comparing to the master repo (with the
exception of a couple of swapped assembly instructions, likely due
to different instruction scheduling because the line-based sorting
has changed; though this is won't affect correctness).
The sizes of various important compiler data structures have increased
significantly (see the various sizes_test.go files); this is probably
the reason for an increase of compilation times (to be addressed). Here
are the results of compilebench -count 5, run on a "quiet" machine (no
apps running besides a terminal):
name old time/op new time/op delta
Template 256ms ± 1% 280ms ±15% +9.54% (p=0.008 n=5+5)
Unicode 132ms ± 1% 132ms ± 1% ~ (p=0.690 n=5+5)
GoTypes 891ms ± 1% 917ms ± 2% +2.88% (p=0.008 n=5+5)
Compiler 3.84s ± 2% 3.99s ± 2% +3.95% (p=0.016 n=5+5)
MakeBash 47.1s ± 1% 47.2s ± 2% ~ (p=0.841 n=5+5)
name old user-ns/op new user-ns/op delta
Template 309M ± 1% 326M ± 2% +5.18% (p=0.008 n=5+5)
Unicode 165M ± 1% 168M ± 4% ~ (p=0.421 n=5+5)
GoTypes 1.14G ± 2% 1.18G ± 1% +3.47% (p=0.008 n=5+5)
Compiler 5.00G ± 1% 5.16G ± 1% +3.12% (p=0.008 n=5+5)
Change-Id: I241c4246cdff627d7ecb95cac23060b38f9775ec
Reviewed-on: https://go-review.googlesource.com/34273
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Loop breaking with a counter. Benchmarked (see comments),
eyeball checked for sanity on popular loops. This code
ought to handle loops in general, and properly inserts phi
functions in cases where the earlier version might not have.
Includes test, plus modifications to test/run.go to deal with
timeout and killing looping test. Tests broken by the addition
of extra code (branch frequency and live vars) for added
checks turn the check insertion off.
If GOEXPERIMENT=preemptibleloops, the compiler inserts reschedule
checks on every backedge of every reducible loop. Alternately,
specifying GO_GCFLAGS=-d=ssa/insert_resched_checks/on will
enable it for a single compilation, but because the core Go
libraries contain some loops that may run long, this is less
likely to have the desired effect.
This is intended as a tool to help in the study and diagnosis
of GC and other latency problems, now that goal STW GC latency
is on the order of 100 microseconds or less.
Updates #17831.
Updates #10958.
Change-Id: I6206c163a5b0248e3f21eb4fc65f73a179e1f639
Reviewed-on: https://go-review.googlesource.com/33910
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This is the simplest CL that I can make for Go 1.8. For Go 1.9, we can revisit it
and optimize the redundant address generation instructions or just fix#599 instead.
Fixes#18140.
Change-Id: Ie4804ab0e00dc6bb318da2bece8035c7c71caac3
Reviewed-on: https://go-review.googlesource.com/34193
Run-TryBot: Minux Ma <minux@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
This will let us use the src.Pos struct to thread inlining
information through to obj.
Change-Id: I96a16d3531167396988df66ae70f0b729049cc82
Reviewed-on: https://go-review.googlesource.com/34195
Run-TryBot: David Lazar <lazard@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This file is entirely about the implementation of LineHist, and I can
never remember which generic filename in cmd/internal/obj has it.
Rename to line.go to match the already existing line_test.go.
Change-Id: Id01f3339dc550c9759569d5610d808b17bca44d0
Reviewed-on: https://go-review.googlesource.com/33803
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
func f() {
g()
}
We mistakenly don't add a frame pointer for f. This means f
isn't seen when walking the frame pointer linked list. That
matters for kernel-gathered profiles, and is an impediment for
issues like #16638.
To fix, allocate a stack frame even for otherwise frameless functions
like f. It is a bit tricky because we need to avoid some runtime
internals that really, really don't want one.
No test at the moment, as only kernel CPU profiles would catch it.
Tests will come with the implementation of #16638.
Fixes#18103
Change-Id: I411206cc9de4c8fdd265bee2e4fa61d161ad1847
Reviewed-on: https://go-review.googlesource.com/33754
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Introduce R_WEAKADDROFF, a "weak" variation of the R_ADDROFF relocation
that will only reference the type described if it is in some other way
reachable.
Use this for the ptrToThis field in reflect type information where it
is safe to do so (that is, types that don't need to be included for
interface satisfaction, and types that won't cause the compiler to
recursively generate an endless series of ptr-to-ptr-to-ptr-to...
types).
Also fix a small bug in reflect, where StructOf was not clearing the
ptrToThis field of new types.
Fixes#17931
Change-Id: I4d3b53cb9c916c97b3b16e367794eee142247281
Reviewed-on: https://go-review.googlesource.com/33427
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The assembler backend fixes too-far conditional branches, but only
for BEQ and like. Add a case for CBZ and like.
Fixes#17925.
Change-Id: Ie516e6c5ca165b582367283a0110f7081e00c214
Reviewed-on: https://go-review.googlesource.com/33304
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
Materialize float constant 0 from integer zero register, instead
of loading from constant pool.
Also fix assembling FMOV from zero register to FP register.
Change-Id: Ie413dd342cedebdb95ba8cfc220e23ed2a39e885
Reviewed-on: https://go-review.googlesource.com/32250
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Materialize float constant 0 from integer zero register, instead
of loading from constant pool.
Change-Id: Ie4728895b9d617bec2a29d15729c0efaa10eedbb
Reviewed-on: https://go-review.googlesource.com/32109
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
The current implementation for Power architecture does not include the vector
scalar (VSX) registers. This adds the 63 VSX registers and the most commonly
used instructions: load/store VSX vector/scalar, move to/from VSR, logical
operations, select, merge, splat, permute, shift, FP-FP conversion, FP-integer
conversion and integer-FP conversion.
Change-Id: I0f7572d2359fe7f3ea0124a1eb1b0bebab33649e
Reviewed-on: https://go-review.googlesource.com/30510
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The runtime traceback code assumes non-empty frame has link
link register saved on LR architectures. Make sure it is so in
the assember.
Also make sure that LR is stored before update SP, so the traceback
code will not see a half-updated stack frame if a signal comes
during the execution of function prologue.
Fixes#17381.
Change-Id: I668b04501999b7f9b080275a2d1f8a57029cbbb3
Reviewed-on: https://go-review.googlesource.com/31760
Reviewed-by: Michael Munday <munday@ca.ibm.com>
Instead of generating typelink symbols in the compiler
mark types that should have typelinks with a flag.
The linker detects this flag and adds the marked types
to the typelink table.
name old s/op new s/op delta
LinkCmdCompile 0.27 ± 6% 0.25 ± 6% -6.93% (p=0.000 n=97+98)
LinkCmdGo 0.30 ± 5% 0.29 ±10% -4.22% (p=0.000 n=97+99)
name old MaxRSS new MaxRSS delta
LinkCmdCompile 112k ± 3% 106k ± 2% -4.85% (p=0.000 n=100+100)
LinkCmdGo 107k ± 3% 103k ± 3% -3.00% (p=0.000 n=100+100)
Change-Id: Ic95dd4b0101e90c1fa262c9c6c03a2028d6b3623
Reviewed-on: https://go-review.googlesource.com/31772
Run-TryBot: Shahar Kohanim <skohanim@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Some original shift opcodes for ppc64x expected an operand to be
a mask instead of a shift count, preventing some valid shift counts
from being written.
This adds new opcodes for shifts where needed, using mnemonics that
match the ppc64 asm and allowing the assembler to accept the full set
of valid shift counts.
Fixes#15016
Change-Id: Id573489f852038d06def279c13fd0523736878a7
Reviewed-on: https://go-review.googlesource.com/31853
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
Reviewed-by: David Chase <drchase@google.com>
These are emulated by the assembler and we don't need them.
Change-Id: I2b07c5315a5b642fdb5e50b468453260ae121164
Reviewed-on: https://go-review.googlesource.com/31758
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This prevents the traceback code from seeing a half-updated
stack frame when a profiling signal comes during the execution
of function prologue. Also fixes mips64x part of #17381.
Change-Id: Iec9683427e546e3648b2e8b1dde956d13f6eb938
Reviewed-on: https://go-review.googlesource.com/31721
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
The Gotype field is only used for ATYPE instructions. Instead of
specially storing the Go type symbol in From.Gotype, just store it in
To.Sym like any other 2-argument instruction would.
Modest reduction in allocations:
name old alloc/op new alloc/op delta
Template 42.0MB ± 0% 41.8MB ± 0% -0.40% (p=0.000 n=9+10)
Unicode 34.3MB ± 0% 34.1MB ± 0% -0.48% (p=0.000 n=9+10)
GoTypes 122MB ± 0% 122MB ± 0% -0.14% (p=0.000 n=9+10)
Compiler 518MB ± 0% 518MB ± 0% -0.04% (p=0.000 n=9+10)
Passes toolstash -cmp.
Change-Id: I0e603266b5d7d4e405106a26369e22773a0d3a91
Reviewed-on: https://go-review.googlesource.com/31850
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
The m5 and m6 fields were the wrong way round.
Fixes#17444.
Change-Id: I10297064f2cd09d037eac581c96a011358f70aae
Reviewed-on: https://go-review.googlesource.com/31130
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Just happened to notice that these names (funcAlign and friends) are
never referenced outside their package, so no need to export them.
Change-Id: I4bbdaa4b0ef330c3c3ef50a2ca39593977a83545
Reviewed-on: https://go-review.googlesource.com/31496
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
This change omits the stack check on ppc64 and s390x when the size of
a stack frame is less than obj.StackSmall. This is an optimization
x86 already performs.
The effect on s390x isn't huge because we were already omitting the
stack check when the frame size was 0 (it shaves about 1K from the
size of bin/go). On ppc64 however this change reduces the size of the
.text section in bin/go by 33K (1%).
Updates #13379 (for ppc64).
Change-Id: I6af0eb987646bea47fcaf0a812db3496bab0f680
Reviewed-on: https://go-review.googlesource.com/31357
Reviewed-by: David Chase <drchase@google.com>
Adds the new canMergeLoad function which can be used by rules to
decide whether a load can be merged into an operation. The function
ensures that the merge will not reorder the load relative to memory
operations (for example, stores) in such a way that the block can no
longer be scheduled.
This new function enables transformations such as:
MOVD 0(R1), R2
ADD R2, R3
to:
ADD 0(R1), R3
The two-operand form of the following instructions can now read a
single memory operand:
- ADD
- ADDC
- ADDW
- MULLD
- MULLW
- SUB
- SUBC
- SUBE
- SUBW
- AND
- ANDW
- OR
- ORW
- XOR
- XORW
Improves SHA3 performance by 6-8%.
Updates #15054.
Change-Id: Ibcb9122126cd1a26f2c01c0dfdbb42fe5e7b5b94
Reviewed-on: https://go-review.googlesource.com/29272
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This improves the performance for byte.Compare by rewriting
the cmpbody function in runtime/asm_ppc64x.s. The previous code
had a simple loop which loaded a pair of bytes and compared them,
which is inefficient for long buffers. The updated function checks
for 8 or 32 byte chunks and then loads and compares double words where
possible.
Because the byte.Compare result indicates greater or less than,
the doubleword loads must take endianness into account, using a
byte reversed load in the little endian case.
Fixes#17433
benchmark old ns/op new ns/op delta
BenchmarkBytesCompare/8-16 13.6 7.16 -47.35%
BenchmarkBytesCompare/16-16 25.7 7.83 -69.53%
BenchmarkBytesCompare/32-16 38.1 7.78 -79.58%
BenchmarkBytesCompare/64-16 63.0 10.6 -83.17%
BenchmarkBytesCompare/128-16 112 13.0 -88.39%
BenchmarkBytesCompare/256-16 211 28.1 -86.68%
BenchmarkBytesCompare/512-16 410 38.6 -90.59%
BenchmarkBytesCompare/1024-16 807 60.2 -92.54%
BenchmarkBytesCompare/2048-16 1601 103 -93.57%
Change-Id: I121acc74fcd27c430797647b8d682eb0607c63eb
Reviewed-on: https://go-review.googlesource.com/30949
Reviewed-by: David Chase <drchase@google.com>
Some of the branch instructions (BEQ, BNE, BLT, etc.) accept
all the valid CR values as operands, but the CR register value is
not parsed and not put into the instruction, so that CR0 is always
used regardless of what was specified on the instruction. For example
BEQ CR2,label becomes beq cr0,label.
This adds the change to the PPC64 assembler to recognize the CR value
and set the approppriate field in the instruction so the correct
CR is used. This also adds some general comments on the branch
instruction BC and its operand values.
Fixes#17408
Change-Id: I8e956372a42846a4c09a7259e9172eaa29118e71
Reviewed-on: https://go-review.googlesource.com/30930
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
ARM direct CALL/JMP instruction has 24 bit offset, which can only
encodes jumps within +/-32M. When the target is too far, the top
bits get truncated and the program jumps wild.
This CL detects too-far jumps and automatically insert trampolines,
currently only internal linking on ARM.
It is necessary to make the following changes to the linker:
- Resolve direct jump relocs when assigning addresses to functions.
this allows trampoline insertion without moving all code that
already laid down.
- Lay down packages in dependency order, so that when resolving a
inter-package direct jump reloc, the target address is already
known. Intra-package jumps are assumed never too far.
- a linker flag -debugtramp is added for debugging trampolines:
"-debugtramp=1 -v" prints trampoline debug message
"-debugtramp=2" forces all inter-package jump to use
trampolines (currently ARM only)
"-debugtramp=2 -v" does both
- Some data structures are changed for bookkeeping.
On ARM, pseudo DIV/DIVU/MOD/MODU instructions now clobber R8
(unfortunate). In the standard library there is no ARM assembly
code that uses these instructions, and the compiler no longer emits
them (CL 29390).
all.bash passes with -debugtramp=2, except a disassembly test (this
is unavoidable as we changed the instruction).
TBD: debug info of trampolines?
Fixes#17028.
Change-Id: Idcce347ea7e0af77c4079041a160b2f6e114b474
Reviewed-on: https://go-review.googlesource.com/29397
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This CL removes support for MOVD instructions that store the address
of a global variable. For example:
MOVD $main·a(SB), (R1)
MOVD $main·b(SB), main·c(SB)
These instructions are emulated and the new backend doesn't need them
(the stores now always go through an intermediate register).
Change-Id: I3a1bcb3f19c5096ad0426afd76d35a4d7975733b
Reviewed-on: https://go-review.googlesource.com/30720
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
With this change, the code in bug #15609 compiles and runs properly:
0000000000401070 <main.jump>:
401070: ff 15 aa 7e 06 00 callq *0x67eaa(%rip) # 468f20 <main.pointer>
401076: c3 retq
0000000000468f20 g O .rodata 0000000000000008 main.pointer
Fixes#15609
Change-Id: Iebb4d5a9f9fff335b693f4efcc97882fe04eefd7
Reviewed-on: https://go-review.googlesource.com/22950
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
TESTB was implemented as AND $0xff, Rx, REGTMP. Unfortunately there
is no 3-operand AND-with-immediate instruction and so it was emulated
by the assembler using two instructions.
This CL uses CMPW instead of AND and also optimizes CMPW to use
the chi instruction where possible.
Overall this CL reduces the size of the .text section of the
bin/go binary by ~2%.
Change-Id: Ic335c29fc1129378fcbb1265bfb10f5b744a0f3f
Reviewed-on: https://go-review.googlesource.com/30690
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Deletes the following s390x instructions:
- ADDME
- ADDZE
- SUBME
- SUBZE
They appear to be emulated PPC instructions left over from the
porting process and I don't think they will ever be useful.
Change-Id: I9b1ba78019dbd1218d0c8f8ea2903878802d1990
Reviewed-on: https://go-review.googlesource.com/30538
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Adds the following instructions and uses them in the SSA backend:
- ANDW
- ORW
- XORW
The instruction encodings for 32-bit operations are typically shorter,
particularly when an immediate is used. For example, XORW $-1, R1
only requires one instruction, whereas XOR requires two.
Also removes some unused instructions (that were emulated):
- ANDN
- NAND
- ORN
- NOR
Change-Id: Iff2a16f52004ba498720034e354be9771b10cac4
Reviewed-on: https://go-review.googlesource.com/30291
Reviewed-by: Cherry Zhang <cherryyz@google.com>
These are conditional branches that takes a register instead of
flags as control value.
Reduce binary size by 0.7%, text size by 2.4% (cmd/go as an
exmaple).
Change-Id: I0020cfde745f9eab680b8b949ad28c87fe183afd
Reviewed-on: https://go-review.googlesource.com/30030
Reviewed-by: David Chase <drchase@google.com>
In the function prologue, we emit a jump to the beginning of
the function immediately after calling morestack. And in the
runtime stack growing code, it decodes and emulates that jump.
This emulation was necessary before we had per-PC SP deltas,
since the traceback code assumed that the frame size was fixed
for the whole function, except on the first instruction where
it was 0. Since we now have per-PC SP deltas and PCDATA, we
can correctly record that the frame size is 0. This makes the
emulation unnecessary.
This may be helpful for registerized calling convention, where
there may be unspills of arguments after calling morestack. It
also simplifies the runtime.
Change-Id: I7ebee31eaee81795445b33f521ab6a79624c4ceb
Reviewed-on: https://go-review.googlesource.com/30138
Reviewed-by: David Chase <drchase@google.com>
When offset < 0 and -offset fits in instruction, generate SUB
instruction, instead of ADD with constant from the pool.
Fixes#13280.
Change-Id: I57d97fe9300fe1f6554365e2262393ef50acbdd3
Reviewed-on: https://go-review.googlesource.com/30014
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Remove duplicate vars, commented out code and duplicate lines.
When choosing between 2 aliases, on with more uses was chosen.
Change-Id: I7bc15f1693de3f6d378cef9c09469970a659db40
Reviewed-on: https://go-review.googlesource.com/29152
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Use the A{,G}HI instructions where possible (4 bytes instead of 6 bytes
for A{,G}FI). Also, use 32-bit operations where appropriate for
multiplication.
Change-Id: I4041781cda26be52b54e4804a9e71552310762d0
Reviewed-on: https://go-review.googlesource.com/29733
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bill O'Farrell <billotosyr@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This adds the instructions frim, frip, and friz to the ppc64x
assembler for use in implementing the math.Floor, math.Ceil, and
math.Trunc functions to improve performance.
Fixes#17185
BenchmarkCeil-128 21.4 6.99 -67.34%
BenchmarkFloor-128 13.9 6.37 -54.17%
BenchmarkTrunc-128 12.7 6.33 -50.16%
Change-Id: I96131bd4e8c9c8dbafb25bfeb544cf9d2dbb4282
Reviewed-on: https://go-review.googlesource.com/29654
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Michael Munday <munday@ca.ibm.com>
Abandoned earlier efforts to expose zero register,
but left it in numbering to decrease squirrelyness of
register allocator.
ISELrelOp used in code generation of bool := x relOp y.
Some patterns added to better elide zero case and
some sign extension.
Updates: #17109
Change-Id: Ida7839f0023ca8f0ffddc0545f0ac269e65b05d9
Reviewed-on: https://go-review.googlesource.com/29380
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
This makes it possible for cmd/compile, when run with -dynlink on
darwin/amd64, to generate TLS_LE relocations which the linker then
turns into the appropriate PC-relative GOT load.
Change-Id: I1a71da432608bdb108ff66c22de600100209c873
Reviewed-on: https://go-review.googlesource.com/29393
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Also adds the 'find leftmost one' instruction (FLOGR) and replaces the
WORD-encoded use of FLOGR in math/big with it.
Change-Id: I18e7cd19e75b8501a6ae8bd925471f7e37ded206
Reviewed-on: https://go-review.googlesource.com/29372
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The current implementation for Power architecture does not include the vector
(Altivec) registers. This adds the 32 VMX registers and the most commonly used
instructions: X-form loads/stores; VX-form logical operations, add/sub,
rotate/shift, count, splat, SHA Sigma and AES cipher; VC-form compare; and
VA-form permute, shift, add/sub and select.
Fixes#15619
Change-Id: I544b990631726e8fdfcce8ecca0aeeb72faae9aa
Reviewed-on: https://go-review.googlesource.com/25600
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: David Chase <drchase@google.com>
Keep Plists in a slice instead of a linked list.
Eliminate unnecessary fields.
Also, while here remove gc's unused breakpc and continpc vars.
Change-Id: Ia04264036c0442843869965d247ccf68a5295115
Reviewed-on: https://go-review.googlesource.com/29367
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dave Cheney <dave@cheney.net>
Replace the AGLOBL pseudo-op with a method to directly register an
LSym as a global. Similar to how we previously already replaced the
ADATA pseudo-op with directly writing out data bytes.
Passes toolstash -cmp.
Change-Id: I3631af0a2ab5798152d0c26b833dc309dbec5772
Reviewed-on: https://go-review.googlesource.com/29366
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Dave Cheney <dave@cheney.net>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Since the legacy backends were removed, these fields are write-only.
Change-Id: I4816c39267b7c10a4da2a6d22cd367dc475e564d
Reviewed-on: https://go-review.googlesource.com/29246
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
Reviewed-by: Dave Cheney <dave@cheney.net>
Created by running 'go generate'.
That made debugging fun today.
Change-Id: I9ffe00877851f2b198275220ad6058b9005daa72
Reviewed-on: https://go-review.googlesource.com/29117
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This adds the support for the ppc64le isel instruction so
it can be used by SSA.
Fixed#16771
Change-Id: Ia2517f0834ff5e7ad927e218b84493e0106ab4a7
Reviewed-on: https://go-review.googlesource.com/28611
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Also add assembly implementation, in case intrinsics is disabled.
Change-Id: Iff0a8a8ce326651bd29f6c403f5ec08dd3629993
Reviewed-on: https://go-review.googlesource.com/28979
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This commit adds the following instructions to support the new SSA
backend for s390x:
32-bit operations:
ADDW
SUBW
NEGW
FNEGS
Conditional moves:
MOVDEQ
MOVDGE
MOVDGT
MOVDLE
MOVDLT
MOVDNE
Unordered branches (for floating point comparisons):
BLEU
BLTU
Modulo operations:
MODW
MODWU
MODD
MODDU
The modulo operations might be removed in a future commit because
I'd like to change DIV to produce a tuple once the old backend is
removed.
This commit also removes uses of REGZERO from the assembler. They
aren't necessary and R0 will be used as a GPR by SSA.
Change-Id: I05756c1cbb74bf4a35fc492f8f0cd34b50763dc9
Reviewed-on: https://go-review.googlesource.com/29075
Run-TryBot: Michael Munday <munday@ca.ibm.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
When internally linking with using rel.ro sections, this segment covers
the sections. To do this, move to other read-only sections, SELFROSECT
and SMACHOPLT, out of the way.
Part of adding PIE internal linking on linux/amd64.
Change-Id: I4fb3d180e92f7e801789ab89864010faf5a2cb6d
Reviewed-on: https://go-review.googlesource.com/28538
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Moves the grouping of symbol kinds (sections) into cmd/internal/obj
to keep it near the definition. Groundwork for CL 28538.
Change-Id: I99112981e69b028f366e1333f31cd7defd4ff82c
Reviewed-on: https://go-review.googlesource.com/28691
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
As cmd/internal/obj is coordinating the definition of GOOS, GOARCH,
etc across the compiler and linker, turn its functions into globals
and use them everywhere.
Change-Id: I5db5addda3c6b6435c37fd5581c7c3d9a561f492
Reviewed-on: https://go-review.googlesource.com/28854
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Separate out windows/windowsgui properly so we are not encoding some
of the Headtype value in a separate headstring global.
Remove one of the two copies of the variable from cmd/link.
Remove duplicate string to headtype list.
Change-Id: Ifa20fb9652a1dc95161e154aac11f15ad0f709d0
Reviewed-on: https://go-review.googlesource.com/28853
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This partially reverts commit 4e24e1d999.
Since in release 1.7 VPSHUFD support negative constant as an argument,
removing it as part of 4e24e1d999 was wrong.
Add it back.
Change-Id: Id1a3e062fe8fb4cf538edb3f9970f0664f3f545f
Reviewed-on: https://go-review.googlesource.com/27712
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
On ARM64, MIPS64, and PPC64, some floating point registers were
reserved for constants 0, 1, 2, 0.5, etc. This CL removes them.
On ARM64, they are never used. On MIPS64 and PPC64, the only use
case is a multiplication-by-2 in the old backend of the compiler,
which is replaced with an addition.
Change-Id: I737cbf43283756e3408964fc88c567a938c57036
Reviewed-on: https://go-review.googlesource.com/28095
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This change makes sure that tests are run with the correct
version of the go tool. The correct version is the one that
we invoked with "go test", not the one that is first in our path.
Fixes#16577
Change-Id: If22c8f8c3ec9e7c35d094362873819f2fbb8559b
Reviewed-on: https://go-review.googlesource.com/28089
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The existing implementation used a pure go implementation, leading to slow
cryptographic performance.
Implemented mulWW, subVV, mulAddVWW, addMulVVW, and bitLen for
ppc64{le}.
Implemented divWW for ppc64le only, as the DIVDEU instruction is only
available on Power8 or newer.
benchcmp output:
benchmark old ns/op new ns/op delta
BenchmarkSignP384 28934360 10877330 -62.41%
BenchmarkRSA2048Decrypt 41261033 5139930 -87.54%
BenchmarkRSA2048Sign 45231300 7610985 -83.17%
Benchmark3PrimeRSA2048Decrypt 20487300 2481408 -87.89%
Fixes#16621
Change-Id: If8b68963bb49909bde832f2bda08a3791c4f5b7a
Reviewed-on: https://go-review.googlesource.com/26951
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <munday@ca.ibm.com>
Fixes#16911.
Fix obsolete inferno-os links, since code.google.com shutdown.
This CL points to the right files by replacing
http://code.google.com/p/inferno-os/source/browse
with
https://bitbucket.org/inferno-os/inferno-os/src/default
To implement the change I wrote and ran this script in the root:
$ grep -Rn 'http://code.google.com/p/inferno-os/source/browse' * \
| cut -d":" -f1 | while read F;do perl -pi -e \
's/http:\/\/code.google.com\/p\/inferno-os\/source\/browse/https:\/\/bitbucket.org\/inferno-os\/inferno-os\/src\/default/g'
$F;done
I excluded any cmd/vendor changes from the commit.
Change-Id: Iaaf828ac8f6fc949019fd01832989d00b29b6749
Reviewed-on: https://go-review.googlesource.com/27994
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
FIDBR and FIEBR can be used for floating-point to integer rounding.
The relevant functions (Ceil, Floor and Trunc) will be updated
in a future CL.
Change-Id: I5952d67ab29d5ef8923ff1143e17a8d30169d692
Reviewed-on: https://go-review.googlesource.com/27826
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Adds the following s390x instructions from the interlocked access
facility:
* LAA(G) - load and add
* LAAL(G) - load and add logical
* LAN(G) - load and and
* LAX(G) - load and exclusive or
* LAO(G) - load and or
These instructions can be used for atomic arithmetic/logical
operations. The atomic packages will be updated in future CLs.
Change-Id: Idc850ac6749b3e778fda3da66bcd864f6b1df375
Reviewed-on: https://go-review.googlesource.com/27871
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
- implement *, /, %, shifts, Zero, Move.
- fix mistakes in comparison.
- fix floating point rounding.
- handle RetJmp in assembler (which was not handled, as a consequence
Duff's device was disabled in the old backend.)
all.bash now passes with SSA on.
Updates #16359.
Change-Id: Ia14eed0ed1176b5d800592080c8f53dded7fe73f
Reviewed-on: https://go-review.googlesource.com/27592
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Replace the various calls to Fprintf(ctxt.Bso, ...) with a helper,
ctxt.Logf. This also addresses the various inconsistent flushing of
ctxt.Bso.
Because we have two Link structures, add Link.Logf in both places.
Change-Id: I23093f9b9b3bf33089a0ffd7f815f92dcd1a1fa1
Reviewed-on: https://go-review.googlesource.com/27730
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
In many places where ctx.Bso.Flush is used as the target for some debug
logging, ctx.Bso.Flush is called unconditionally. In the majority of
cases where debug logging is not enabled, this means Flush is called
many times when there is nothing to be flushed (it will be called anyway
when ctx.Bso is eventually closed), sometimes in a loop.
Avoid this by moving the ctx.Bso.Flush call into the same condition
block as the debug print. This pattern was previously applied
sporadically.
Change-Id: I0444cb235cc8b9bac51a59b2e44e59872db2be06
Reviewed-on: https://go-review.googlesource.com/27579
Run-TryBot: Dave Cheney <dave@cheney.net>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Update goobj reader so it can provide all the information
necessary to disassemble .o (and .a) files.
Grab architecture of .o files from header.
.o files have relocations in them. This CL also contains a simple
mechanism to disassemble relocations and add relocation info as an extra
column in the output.
Fixes#13862
Change-Id: I608fd253ff1522ea47f18be650b38d528dae9054
Reviewed-on: https://go-review.googlesource.com/24818
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This time with the cherry-pick from the proper patch of
the old CL.
Stack size increased.
Corrected NaN-comparison glitches.
Marked g register as clobbered by calls.
Fixed shared libraries.
live_ssa.go still disabled because of differences.
Presumably turning on more optimization will fix
both the stack size and the live_ssa.go glitches.
Enhanced debugging output for shared libs test.
Rebased onto master.
Updates #16010.
Change-Id: I40864faf1ef32c118fb141b7ef8e854498e6b2c4
Reviewed-on: https://go-review.googlesource.com/27159
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
This is a copy of golang.org/cl/22092 by Ryan Brown.
Here's his original comment:
On my machine this increases the average time for 'go build cmd/go' from
2.25s to 2.36s. I tried to measure compile and link separately but saw
no significant change.
Change-Id: If0d2b756d52a0d30d4eda526929c82794d89dd7b
Reviewed-on: https://go-review.googlesource.com/25311
Run-TryBot: Michael Matloob <matloob@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
- Use machine instructions for uint64<->float conversions
- Do not enforce alignment on Zero/Move
ARM64 supports unaligned load/stores, but only aligned offset
or small offset can be encoded into instructions.
- Do combined loads
Change-Id: Iffca7dd0f13070b17b784861ce5a30af584680eb
Reviewed-on: https://go-review.googlesource.com/27086
Reviewed-by: David Chase <drchase@google.com>
s390x took up the last available chunk of int16 opcodes.
There are RISC-V and sparc64 ports in progress out of tree,
and there will likely be other architectures.
Reduce the opcode space to allow more architectures to
fit without increasing to int32.
This is the smallest power of two that accomodates all
existing architectures. All else being equal, smaller is
better--smaller numbers are easier to generate immediates
for and easier on the eyes when debugging.
Change-Id: I4d0824b28913892fbd0579d3f90bea34e44c8946
Reviewed-on: https://go-review.googlesource.com/24223
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
This CL adds a safety mechanism
for changing the number of opcodes
available per architecture.
A subsequent CL will actually make the change.
Change-Id: I6332ed5514f2f153c54d11b7da0cc8a6be1c8066
Reviewed-on: https://go-review.googlesource.com/24222
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Now that assembler opcodes have their own type, they can have a true
stringer, rather than explicit calls to Aconv, which makes for nicer
format strings.
Change-Id: Ic77f5f8ac38b4e519dcaa08c93e7b732226f7bfe
Reviewed-on: https://go-review.googlesource.com/25045
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
VPSHUFD should take an unsigned argument to be consistent with
PSHUFD. Also fix all usage.
Fixes#16499
Change-Id: Ie699c102afed0379445914a251710365b14d89b6
Reviewed-on: https://go-review.googlesource.com/25383
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Passes ssa_test.
Requires a few new instructions and some scratchpad
memory to move data between G and F registers.
Also fixed comparisons to be correct in case of NaN.
Added missing instructions for run.bash.
Removed some FP registers that are apparently "reserved"
(but that are also apparently also unused except for a
gratuitous multiplication by two when y = x+x would work
just as well).
Currently failing stack splits.
Updates #16010.
Change-Id: I73b161bfff54445d72bd7b813b1479f89fc72602
Reviewed-on: https://go-review.googlesource.com/26813
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Add more ARM64 optimizations:
- use hardware zero register when it is possible.
- use shifted ops.
The assembler supports shifted ops but not documented, nor knows
how to print it. This CL adds them.
- enable fast division.
This was disabled because it makes the old backend generate slower
code. But with SSA it generates faster code.
Turn on SSA by default, also adjust tests.
Change-Id: I7794479954c83bb65008dcb457bc1e21d7496da6
Reviewed-on: https://go-review.googlesource.com/26950
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
We shouldn't issue instructions like MOVL foo(SB), AX directly from the
SSA backend. Instead we should do LEAL foo(SB), AX; MOVL (AX), AX.
This simplifies obj logic because now only LEAL needs to be treated
specially. The register allocator uses the LEAL to in effect allocate
the temporary register required for the shared library thunk calls.
Also, the LEALs can now be CSEd. So code like
var g int
func f() { g += 5 }
Requires only one thunk call instead of 2.
Change-Id: Ib87d465f617f73af437445871d0ea91a630b2355
Reviewed-on: https://go-review.googlesource.com/26814
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
When a constant can be encoded in a logical instruction (BITCON), do
it this way instead of using the constant pool. The BITCON testing
code runs faster than table lookup (using map):
(on AMD64 machine, with pseudo random input)
BenchmarkIsBitcon-4 300000000 4.04 ns/op
BenchmarkTable-4 50000000 27.3 ns/op
The equivalent C code of BITCON testing is formally verified with
model checker CBMC against linear search of the lookup table.
Also handle cases when a constant can be encoded in a MOV instruction.
In this case, materializa the constant into REGTMP without using the
constant pool.
When constants need to be added to the constant pool, make sure to
check whether it fits in 32-bit. If not, store 64-bit.
Both legacy and SSA compiler backends are happy with this.
Fixes#16226.
Change-Id: I883e3069dee093a1cdc40853c42221a198a152b0
Reviewed-on: https://go-review.googlesource.com/26631
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Use the destination register for materializing the pc
for GOT references also. See https://go-review.googlesource.com/c/25442/
The SSA backend assumes CX does not get clobbered for these instructions.
Mark duffzero as clobbering CX. The linker needs to clobber CX
to materialize the address to call. (This affects the non-shared-library
duffzero also, but hopefully forbidding one register across duffzero
won't be a big deal.)
Hopefully this is all the cases where the linker is clobbering CX
under the hood and SSA assumes it isn't.
Change-Id: I080c938170193df57cd5ce1f2a956b68a34cc886
Reviewed-on: https://go-review.googlesource.com/26611
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
Access to globals requires a 2-instruction sequence on PIC 386.
MOVL foo(SB), AX
is translated by the obj package into:
CALL getPCofNextInstructionInTempRegister(SB)
MOVL (&foo-&thisInstruction)(tmpReg), AX
The call returns the PC of the next instruction in a register.
The next instruction then offsets from that register to get the
address required. The tricky part is the allocation of the
temp register. The legacy compiler always used CX, and forbid
the register allocator from allocating CX when in PIC mode.
We can't easily do that in SSA because CX is actually a required
register for shift instructions. (I think the old backend got away
with this because the register allocator never uses CX, only
codegen knows that shifts must use CX.)
Instead, we allow the temp register to be anything. When the
destination of the MOV (or LEA) is an integer register, we can
use that register. Otherwise, we make sure to compile the
operation using an LEA to reference the global. So
MOVL AX, foo(SB)
is never generated directly. Instead, SSA generates:
LEAL foo(SB), DX
MOVL AX, (DX)
which is then rewritten by the obj package to:
CALL getPcInDX(SB)
LEAL (&foo-&thisInstruction)(DX), AX
MOVL AX, (DX)
So this CL modifies the obj package to use different thunks
to materialize the pc into different registers. We use the
registers that regalloc chose so that SSA can still allocate
the full set of registers.
Change-Id: Ie095644f7164a026c62e95baf9d18a8bcaed0bba
Reviewed-on: https://go-review.googlesource.com/25442
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
This adds the initial SSA implementation for PPC64.
Builds golang and all.bash runs correctly. Simple hello.go
builds but does not run.
Change-Id: I7cec211b934cd7a2dd75a6cdfaf9f71867063466
Reviewed-on: https://go-review.googlesource.com/24453
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Functions should be declared to end after the last real instruction, not
after the last padding byte. We achieve this by adding the padding while
assembling the text section in the linker instead of adding the padding
to the function symbol in the compiler. This change makes dtrace happy.
TODO: check that this works with external linking
Fixes#15969
Change-Id: I973e478d0cd34b61be1ddc55410552cbd645ad62
Reviewed-on: https://go-review.googlesource.com/24040
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Any defer in a shared object crashed when GOARCH=386. This turns out to be two
bugs:
1) Calls to morestack were not processed to be PIC safe (must have been
possible to trigger this another way too)
2) jmpdefer needs to rewind the return address of the deferred function past
the instructions that load the GOT pointer into BX, not just past the call
Bug 2) requires re-introducing the a way for .s files to know when they are
being compiled for dynamic linking but I've tried to do that in as minimal
a way as possible.
Fixes#15916
Change-Id: Ia0d09b69ec272a176934176b8eaef5f3bfcacf04
Reviewed-on: https://go-review.googlesource.com/23623
Run-TryBot: Michael Hudson-Doyle <michael.hudson@canonical.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This has a minor performance cost, but far less than is being gained by SSA.
As an experiment, enable it during the Go 1.7 beta.
Having frame pointers on by default makes Linux's perf, Intel VTune,
and other profilers much more useful, because it lets them gather a
stack trace efficiently on profiling events.
(It doesn't help us that much, since when we walk the stack we usually
need to look up PC-specific information as well.)
Fixes#15840.
Change-Id: I4efd38412a0de4a9c87b1b6e5d11c301e63f1a2a
Reviewed-on: https://go-review.googlesource.com/23451
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
A few other architectures have already defined a NOFRAME flag.
Use it to disable frame pointer code on a few very low-level functions
that must behave like Windows code.
Makes the failing os/signal test pass on a Windows gomote.
Change-Id: I982365f2c59a0aa302b4428c970846c61027cf3e
Reviewed-on: https://go-review.googlesource.com/23456
Reviewed-by: Austin Clements <austin@google.com>
The following performance improvements have been made to the
low-level atomic functions for ppc64le & ppc64:
- For those cases containing a lwarx and stwcx (or other sizes):
sync, lwarx, maybe something, stwcx, loop to sync, sync, isync
The sync is moved before (outside) the lwarx/stwcx loop, and the
sync after is removed, so it becomes:
sync, lwarx, maybe something, stwcx, loop to lwarx, isync
- For the Or8 and And8, the shifting and manipulation of the
address to the word aligned version were removed and the
instructions were changed to use lbarx, stbcx instead of
register shifting, xor, then lwarx, stwcx.
- New instructions LWSYNC, LBAR, STBCC were tested and added.
runtime/atomic_ppc64x.s was changed to use the LWSYNC opcode
instead of the WORD encoding.
Fixes#15469
Ran some of the benchmarks in the runtime and sync directories.
Some results varied from run to run but the trend was improvement
based on best times for base and new:
runtime.test:
BenchmarkChanNonblocking-128 0.88 0.89 +1.14%
BenchmarkChanUncontended-128 569 511 -10.19%
BenchmarkChanContended-128 63110 53231 -15.65%
BenchmarkChanSync-128 691 598 -13.46%
BenchmarkChanSyncWork-128 11355 11649 +2.59%
BenchmarkChanProdCons0-128 2402 2090 -12.99%
BenchmarkChanProdCons10-128 1348 1363 +1.11%
BenchmarkChanProdCons100-128 1002 746 -25.55%
BenchmarkChanProdConsWork0-128 2554 2720 +6.50%
BenchmarkChanProdConsWork10-128 1909 1804 -5.50%
BenchmarkChanProdConsWork100-128 1624 1580 -2.71%
BenchmarkChanCreation-128 237 212 -10.55%
BenchmarkChanSem-128 705 667 -5.39%
BenchmarkChanPopular-128 5081190 4497566 -11.49%
BenchmarkCreateGoroutines-128 532 473 -11.09%
BenchmarkCreateGoroutinesParallel-128 35.0 34.7 -0.86%
BenchmarkCreateGoroutinesCapture-128 4923 4200 -14.69%
sync.test:
BenchmarkUncontendedSemaphore-128 112 94.2 -15.89%
BenchmarkContendedSemaphore-128 133 128 -3.76%
BenchmarkMutexUncontended-128 1.90 1.67 -12.11%
BenchmarkMutex-128 353 310 -12.18%
BenchmarkMutexSlack-128 304 283 -6.91%
BenchmarkMutexWork-128 554 541 -2.35%
BenchmarkMutexWorkSlack-128 567 556 -1.94%
BenchmarkMutexNoSpin-128 275 242 -12.00%
BenchmarkMutexSpin-128 1129 1030 -8.77%
BenchmarkOnce-128 1.08 0.96 -11.11%
BenchmarkPool-128 29.8 27.4 -8.05%
BenchmarkPoolOverflow-128 40564 36583 -9.81%
BenchmarkSemaUncontended-128 3.14 2.63 -16.24%
BenchmarkSemaSyntNonblock-128 1087 1069 -1.66%
BenchmarkSemaSyntBlock-128 897 893 -0.45%
BenchmarkSemaWorkNonblock-128 1034 1028 -0.58%
BenchmarkSemaWorkBlock-128 949 886 -6.64%
Change-Id: I4403fb29d3cd5254b7b1ce87a216bd11b391079e
Reviewed-on: https://go-review.googlesource.com/22549
Reviewed-by: Michael Munday <munday@ca.ibm.com>
Reviewed-by: Minux Ma <minux@golang.org>
Adds the FCFIDU instruction and uses it instead of the FCFID
instruction for unsigned integer to float casts. This change means
that unsigned integers do not have to be cast to signed integers
before being cast to a floating point value. Therefore it is no
longer necessary to insert instructions to detect and fix
values that overflow int64.
The previous code generating the uint64 to int64 cast handled
overflow by truncating the uint64 value. This truncation can
change the result of the rounding performed by the integer to
float cast.
The FCFIDU instruction was added in Power ISA 2.06B.
Fixes#15539.
Change-Id: Ia37a9631293eff91032d4cd9a9bec759d2142437
Reviewed-on: https://go-review.googlesource.com/22772
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The format has been tweaked several times in the latest cycle, so
replace go13ld with go17ld.
Change-Id: I343c49b02b7516fd781bc96ad46640579da68c59
Reviewed-on: https://go-review.googlesource.com/22708
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
a new relocation R_ADDRMIPSTLS is added, which resolves to 16-bit offset
of a TLS address on mips64x.
Change-Id: Ic60d0e1ba49ff1c433cead242f5884677ab227a5
Reviewed-on: https://go-review.googlesource.com/19804
Reviewed-by: Minux Ma <minux@golang.org>
SB register (R28) is introduced for access external addresses with shorter
instruction sequences. It is loaded at entry points. External data within
2G of SB can be accessed this way.
cmd/internal/obj: relocaltion R_ADDRMIPS is split into two relocations
R_ADDRMIPS and R_ADDRMIPSU, handling the low 16 bits and the "upper" 16
bits of external addresses, respectively, since the instructios may not
be adjacent. It might be better if relocation Variant could be used.
cmd/link/internal/mips64: support new relocations.
cmd/compile/internal/mips64: reserve SB register.
runtime: initialize SB register at entry points.
Change-Id: I5f34868f88c5a9698c042a8a1f12f76806c187b9
Reviewed-on: https://go-review.googlesource.com/19802
Reviewed-by: Minux Ma <minux@golang.org>
Leave R28 to SB register, which will be introduced in CL 19802.
Change-Id: I1cf7a789695c5de664267ec8086bfb0b043ebc14
Reviewed-on: https://go-review.googlesource.com/19863
Reviewed-by: Minux Ma <minux@golang.org>
MGHI (16-bit signed immediate) is now used where possible for both
MULLW and MULLD. MGHI is 2-bytes shorter than MSGFI.
Change-Id: I5d0648934f28b3403b1126913fd703d8f62b9e9f
Reviewed-on: https://go-review.googlesource.com/22398
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bill O'Farrell <billotosyr@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Current V-register range is V32~V63 on arm64. This patch changes it to
V0~V31.
fix#15465.
Change-Id: I90dab42dea46825ec5d7a8321ec4f6550735feb8
Reviewed-on: https://go-review.googlesource.com/22520
Reviewed-by: Aram Hăvărneanu <aram@mgk.ro>
Run-TryBot: Aram Hăvărneanu <aram@mgk.ro>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Also add MustClose and MustWriter to cmd/internal/bio, and use them in
cmd/asm.
Change-Id: I07f5df3b66c17bc5b2e6ec9c4357d9b653e354e0
Reviewed-on: https://go-review.googlesource.com/21938
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
By replacing the *string used to represent pkgPath with a
reflect.name everywhere, the embedded *string for package paths
inside the reflect.name can be replaced by an offset, nameOff.
This reduces the number of pointers in the type information.
This also moves all reflect.name types into the same section, making
it possible to use nameOff more widely in later CLs.
No significant binary size change for normal binaries, but:
linux/amd64 PIE:
cmd/go: -440KB (3.7%)
jujud: -2.6MB (3.2%)
For #6853.
Change-Id: I3890b132a784a1090b1b72b32febfe0bea77eaee
Reviewed-on: https://go-review.googlesource.com/21395
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This CL introduces the typeOff type and a lookup method of the same
name that can turn a typeOff offset into an *rtype.
In a typical Go binary (built with buildmode=exe, pie, c-archive, or
c-shared), there is one moduledata and all typeOff values are offsets
relative to firstmoduledata.types. This makes computing the pointer
cheap in typical programs.
With buildmode=shared (and one day, buildmode=plugin) there are
multiple modules whose relative offset is determined at runtime.
We identify a type in the general case by the pair of the original
*rtype that references it and its typeOff value. We determine
the module from the original pointer, and then use the typeOff from
there to compute the final *rtype.
To ensure there is only one *rtype representing each type, the
runtime initializes a typemap for each module, using any identical
type from an earlier module when resolving that offset. This means
that types computed from an offset match the type mapped by the
pointer dynamic relocations.
A series of followup CLs will replace other *rtype values with typeOff
(and name/*string with nameOff).
For types created at runtime by reflect, type offsets are treated as
global IDs and reference into a reflect offset map kept by the runtime.
darwin/amd64:
cmd/go: -57KB (0.6%)
jujud: -557KB (0.8%)
linux/amd64 PIE:
cmd/go: -361KB (3.0%)
jujud: -3.5MB (4.2%)
For #6853.
Change-Id: Icf096fd884a0a0cb9f280f46f7a26c70a9006c96
Reviewed-on: https://go-review.googlesource.com/21285
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Only splits into separate files, no other changes.
Change-Id: Icc0da2c5f18e03e9ed7c0043bd7c790f741900f2
Reviewed-on: https://go-review.googlesource.com/21804
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This is the first in a series of CLs to replace the use of pointers
in binary read-only data with offsets.
In standard Go binaries these CLs have a small effect, shrinking
8-byte pointers to 4-bytes. In position-independent code, it also
saves the dynamic relocation for the pointer. This has a significant
effect on the binary size when building as PIE, c-archive, or
c-shared.
darwin/amd64:
cmd/go: -12KB (0.1%)
jujud: -82KB (0.1%)
linux/amd64 PIE:
cmd/go: -86KB (0.7%)
jujud: -569KB (0.7%)
For #6853.
Change-Id: Iad5625bbeba58dabfd4d334dbee3fcbfe04b2dcf
Reviewed-on: https://go-review.googlesource.com/21284
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Emulate 64-bit signed high multiplication ((a*b)>>64). To do this
we use the 64-bit unsigned high multiplication method and then
fix the result as shown in Hacker's Delight 2nd ed., chapter 8-3.
Required to enable some division optimizations.
Change-Id: I9194f428e09d3d029cb1afb4715cd5424b5d922e
Reviewed-on: https://go-review.googlesource.com/21774
Reviewed-by: Bill O'Farrell <billotosyr@gmail.com>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Generated with honnef.co/go/unused
There is a large amount of unused code in cmd/internal/obj/s390x but
that can wait til the s390x port is merged.
There is some unused code in
cmd/internal/unvendor/golang.org/x/arch/arm/armasm but that should be
addressed upstream and a new revision imported.
Change-Id: I252c0f9ea8c5bb1a0b530a374ef13a0a20ea56aa
Reviewed-on: https://go-review.googlesource.com/21782
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Dave Cheney <dave@cheney.net>
bio.BufReader was never used.
bio.BufWriter was used to wrap an existing io.Writer, but the
bio.Writer returned would not be seekable, so replace all occurences
with bufio.Reader instead.
Change-Id: I9c6779e35c63178aa4e104c17bb5bb8b52de0359
Reviewed-on: https://go-review.googlesource.com/21722
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Dave Cheney <dave@cheney.net>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Replace the bidirectional bio.Buf type with a pair of unidirectional
buffered seekable Reader and Writers.
Change-Id: I86664a06f93c94595dc67c2cbd21356feb6680ef
Reviewed-on: https://go-review.googlesource.com/21720
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Dave Cheney <dave@cheney.net>
TryBot-Result: Gobot Gobot <gobot@golang.org>
API could still be made more Go-ey.
Updates #15165.
Change-Id: I514ffceffa43c293ae5d7e5f1e9193fda0098865
Reviewed-on: https://go-review.googlesource.com/21644
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Information about CPU architectures (e.g., name, family, byte
ordering, pointer and register size) is currently redundantly
scattered around the source tree. Instead consolidate the basic
information into a single new package cmd/internal/sys.
Also, introduce new sys.I386, sys.AMD64, etc. names for the constants
'8', '6', etc. and replace most uses of the latter. The notable
exceptions are a couple of error messages that still refer to the old
char-based toolchain names and function reltype in cmd/link.
Passes toolstash/buildall.
Change-Id: I8a6f0cbd49577ec1672a98addebc45f767e36461
Reviewed-on: https://go-review.googlesource.com/21623
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This updates dwarf.go to generate debug information as symbols
instead of directly writing to the output file. This should make
it easier to move generation of some of the debug info into the compiler.
Change-Id: Id2358988bfb689865ab4d68f82716f0676336df4
Reviewed-on: https://go-review.googlesource.com/20679
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Change-Id: I91873aaebf79bdf1c00d38aacc1a1fb8d79656a7
Reviewed-on: https://go-review.googlesource.com/21433
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
We create appropriate ELF files automatically based on GOOS. There's
no point in supporting -H elf flag, particularly since we need to emit
different flavors of ELF depending on GOOS anyway.
If that weren't reason enough, -H elf appears to be broken since at
least Go 1.4. At least I wasn't able to find a way to make use of it.
As best I can tell digging through commit history, -H elf is just an
artifact leftover from Plan 9's 6l linker.
Change-Id: I7393caaadbc60107bbd6bc99b976a4f4fe6b5451
Reviewed-on: https://go-review.googlesource.com/21343
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Based on the ppc64 port.
s390x supports 2, 4 and 6 byte instructions and Go assembly
instructions sometimes map to several s390x instructions. The
assembler loops until a fixed point is reached in order to use
branch instructions that can only handle a short offset in a
similar way to other ports.
Change-Id: I4278bf46aca35a96ca9cea0857e6229643c9c1e3
Reviewed-on: https://go-review.googlesource.com/20942
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Record total number of relocations, pcdata, automatics, funcdata and files in
object file and use these numbers in the linker to allocate contiguous
slices to later be filled by the defined symbols.
name old secs new secs delta
LinkCmdGo 0.52 ± 3% 0.49 ± 3% -4.21% (p=0.000 n=91+92)
LinkJuju 4.48 ± 4% 4.21 ± 7% -6.08% (p=0.000 n=96+100)
name old MaxRSS new MaxRSS delta
LinkCmdGo 122k ± 2% 120k ± 4% -1.66% (p=0.000 n=98+93)
LinkJuju 799k ± 5% 865k ± 8% +8.29% (p=0.000 n=89+99)
GOGC=off
name old secs new secs delta
LinkCmdGo 0.42 ± 2% 0.41 ± 0% -2.98% (p=0.000 n=89+70)
LinkJuju 3.61 ± 0% 3.52 ± 1% -2.46% (p=0.000 n=80+89)
name old MaxRSS new MaxRSS delta
LinkCmdGo 130k ± 1% 128k ± 1% -1.33% (p=0.000 n=100+100)
LinkJuju 1.00M ± 0% 0.99M ± 0% -1.70% (p=0.000 n=100+100)
Change-Id: Ie08f6ccd4311bb78d8950548c678230a58635c73
Reviewed-on: https://go-review.googlesource.com/21026
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
See #14874
This change tells the linker to collect all the itablink symbols and
collect them so that moduledata can have a slice of all compiler
generated itabs.
The logic is shamelessly adapted from what is done with typelink symbols.
Change-Id: Ie93b59acf0fcba908a876d506afbf796f222dbac
Reviewed-on: https://go-review.googlesource.com/20889
Reviewed-by: Keith Randall <khr@golang.org>
Adds a new R_PCRELDBL relocation for 2-byte aligned relative
relocations on s390x. Should be removed once #14218 is
implemented.
Change-Id: I79dd2d8e746ba8cbc26c570faccfdd691e8161e8
Reviewed-on: https://go-review.googlesource.com/20941
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The obj.Nocache helper was only used by the arm back end, move it there.
Change-Id: I5c9faf995499991ead1f3d8c8ffc3b6af7346876
Reviewed-on: https://go-review.googlesource.com/20868
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This CL addresses a long standing CL by rsc by pushing the use of
Link.Windows down to its two users.
Link.Window was always initalised with the value of runtime.GOOS so
this does not affect cross compilation.
Change-Id: Ibbae068f8b5aad06336909691f094384caf12352
Reviewed-on: https://go-review.googlesource.com/20869
Run-TryBot: Dave Cheney <dave@cheney.net>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Another object file change, gives a reasonable improvement:
name old s/op new s/op delta
LinkCmdGo 0.46 ± 3% 0.44 ± 9% -3.34% (p=0.000 n=98+82)
LinkJuju 4.09 ± 4% 3.92 ± 5% -4.30% (p=0.000 n=98+99)
I guess the data section could be mmap-ed instead of read, I haven't tried
that.
Change-Id: I959eee470a05526ab1579e3f5d3ede41c16c954f
Reviewed-on: https://go-review.googlesource.com/20928
Run-TryBot: Michael Hudson-Doyle <michael.hudson@canonical.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Reduces size of archives in pkg/linux_amd64 by 3% from 41.5MB to 40.2MB
Change-Id: Id64ca7995de8dd84c9e7ce1985730927cf4bfd66
Reviewed-on: https://go-review.googlesource.com/20912
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Most 64-bit x86 ops can only take a signed 32-bit constant.
Clean up our rewrite rules to enforce this restriction.
Modify the assembler to fail if the offset does not fit
in the instruction.
That last check triggers a few times on weird testing code.
Suppress those errors if the compiler itself generated errors.
Fixes#14862
Change-Id: I76559af035b38483b1e59621a8029fc66b3a5d1e
Reviewed-on: https://go-review.googlesource.com/20815
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reduces size of archives in pkg/linux_amd64 by 1.4MB (3.2%),
slightly improving link time.
name old s/op new s/op delta
LinkCmdGo 0.52 ± 3% 0.51 ± 2% -0.65% (p=0.000 n=98+99)
Change-Id: I7e265f4d4dd08967c5c5d55c1045e533466bbbec
Reviewed-on: https://go-review.googlesource.com/20802
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
For every string constant the compiler was creating 2 Sym's and 2
Node's. It would never refer to them again, but would keep them alive
in gostringpkg. This changes the code to just use obj.LSym's instead.
When compiling x/tools/go/types, this yields about a 15% reduction in
the number of calls to newname and a 3% reduction in the total number of
Node objects. Unfortunately I couldn't see any change in compile time,
but reducing memory usage is desirable anyhow.
Passes toolstash -cmp.
Change-Id: I24f1cb1e6cff0a3afba4ca66f7166874917a036b
Reviewed-on: https://go-review.googlesource.com/20792
Reviewed-by: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
c2go translated writing and advancing a pointer using slices.
Switch to something more idiomatic.
It is also more efficient, but not enough to matter.
Change-Id: I67709632ac53253615a35365824ae97bbe5458d5
Reviewed-on: https://go-review.googlesource.com/20767
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Use a local variable instead.
Passes toolstash -cmp.
Change-Id: I9623a40ff0d568f11afd1279b6aaa1c33eda644c
Reviewed-on: https://go-review.googlesource.com/20730
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The obj.Fmt* values are only used by gc/fmt.go, so just move them
there. Also, add comments documenting the correspondance between
FmtFoo names and their flag characters to make understanding the
existing documentation slightly less confusing.
While here, add a new FmtFlag named type to represent these values.
Change-Id: I9631214b892557d094823f1ac575d0c43a84007b
Reviewed-on: https://go-review.googlesource.com/20717
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Symbols in the object file currently refer to each other using symbol name
and version. Referring to the same symbol many times in an object file takes
up space and causes redundant map lookups. Instead write out a list of unique
symbol references and have symbols refer to each other using indexes into this
list.
Credit to Michael Hudson-Doyle for kicking this off.
Reduces pkg/linux_amd64 size by 30% from 61MB to 43MB
name old s/op new s/op delta
LinkCmdGo 0.74 ± 3% 0.63 ± 4% -15.22% (p=0.000 n=20+20)
LinkJuju 6.38 ± 6% 5.73 ± 6% -10.16% (p=0.000 n=20+19)
Change-Id: I7e101a0c80b8e673a3ba688295e6f80ea04e1cfb
Reviewed-on: https://go-review.googlesource.com/20099
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Partial automatic cleanup driven by Dominik Honnef's unused tool.
As _lookup now only has one caller, merge it into the caller and remove
the conditional create logic.
Change-Id: I2ea354d9d4b32a19905271eca74725231b6d8a93
Reviewed-on: https://go-review.googlesource.com/20589
Run-TryBot: Dave Cheney <dave@cheney.net>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Just recognize "DATA" as a special pseudo op word in the assembler
directly.
Change-Id: I508e111fd71f561efa600ad69567a7089a57adb2
Reviewed-on: https://go-review.googlesource.com/20648
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
The only remaining place that generated ADATA
Prog was the assembler. Stop, and delete some
now-dead code.
Passes toolstash -cmp.
Change-Id: I26578ff1b4868e98562b44f69d909c083e96f8d5
Reviewed-on: https://go-review.googlesource.com/20646
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Instead of generating ADATA instructions for
static data, write that static data directly
into the linker sym.
This is considerably more efficient.
The assembler still generates
ADATA instructions, so the ADATA machinery
cannot be dismantled yet. (Future work.)
Skipping ADATA has a significant impact
compiling the unicode package, which has lots
of static data.
name old time/op new time/op delta
Unicode 227ms ±10% 192ms ± 4% -15.61% (p=0.000 n=29+30)
name old alloc/op new alloc/op delta
Unicode 51.0MB ± 0% 45.8MB ± 0% -10.29% (p=0.000 n=30+30)
name old allocs/op new allocs/op delta
Unicode 610k ± 0% 578k ± 0% -5.29% (p=0.000 n=30+30)
This does not pass toolstash -cmp, because
this changes the order in which some relocations
get added, and thus it changes the output from
the compiler. It is not worth the execution time
to sort the relocs in the normal case.
However, compiling with -S -v generates identical
output if (1) you suppress printing of ADATA progs
in flushplist and (2) you suppress printing of
cpu timing. It is reasonable to suppress printing
the ADATA progs, since the data itself is dumped
later. I am therefore fairly confident that all
changes are superficial and non-functional.
Fixes#14786, although there's more to do
in general.
Change-Id: I8dfabe7b423b31a30e516cfdf005b62a2e9ccd82
Reviewed-on: https://go-review.googlesource.com/20645
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This makes the output of compiling with -S more
stable in the face of unimportant variation in the
order in which relocs are generated.
It is also more pleasant to read the relocs when
they are sorted.
Also, do some minor cleanup.
For #14786
Change-Id: Id92020b13fd21777dfb5b29c2722c3b2eb27001b
Reviewed-on: https://go-review.googlesource.com/20641
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Removes an intermediate layer of functions that was clogging up a
corner of the compiler's profile graph.
I can't measure a performance improvement running a large build
like jujud, but the profile reports less total time spent in
gc.(*lexer).getr.
Change-Id: I3000585cfcb0f9729d3a3859e9023690a6528591
Reviewed-on: https://go-review.googlesource.com/20565
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
In addition to reflect.Value.Call, exported methods can be invoked
by the Func value in the reflect.Method struct. This CL has the
compiler track what functions get access to a legitimate reflect.Method
struct by looking for interface calls to either of:
Method(int) reflect.Method
MethodByName(string) (reflect.Method, bool)
This is a little overly conservative. If a user implements a type
with one of these methods without using the underlying calls on
reflect.Type, the linker will assume the worst and include all
exported methods. But it's cheap.
No change to any of the binary sizes reported in cl/20483.
For #14740
Change-Id: Ie17786395d0453ce0384d8b240ecb043b7726137
Reviewed-on: https://go-review.googlesource.com/20489
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
p can be nil in Dconv so we need to do a check before dereferencing
it. Fixes a problem I was having running toolstash.
Change-Id: I34d6d278b319583d8454c2342ac88e054fc4b641
Reviewed-on: https://go-review.googlesource.com/20595
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Today the linker keeps all methods of reachable types. This is
necessary if a program uses reflect.Value.Call. But while use of
reflection is widespread in Go for encoders and decoders, using
it to call a method is rare.
This CL looks for the use of reflect.Value.Call in a program, and
if it is absent, adopts a (reasonably conservative) method pruning
strategy as part of dead code elimination. Any method that is
directly called is kept, and any method that matches a used
interface's method signature is kept.
Whether or not a method body is kept is determined by the relocation
from its receiver's *rtype to its *rtype. A small change in the
compiler marks these relocations as R_METHOD so they can be easily
collected and manipulated by the linker.
As a bonus, this technique removes the text segment of methods that
have been inlined. Looking at the output of building cmd/objdump with
-ldflags=-v=2 shows that inlined methods like
runtime.(*traceAllocBlockPtr).ptr are removed from the program.
Relatively little work is necessary to do this. Linking two
examples, jujud and cmd/objdump show no more than +2% link time.
Binaries that do not use reflect.Call.Value drop 4 - 20% in size:
addr2line: -793KB (18%)
asm: -346KB (8%)
cgo: -490KB (10%)
compile: -564KB (4%)
dist: -736KB (17%)
fix: -404KB (12%)
link: -328KB (7%)
nm: -827KB (19%)
objdump: -712KB (16%)
pack: -327KB (14%)
yacc: -350KB (10%)
Binaries that do use reflect.Call.Value see a modest size decrease
of 2 - 6% thanks to pruning of unexported methods:
api: -151KB (3%)
cover: -222KB (4%)
doc: -106KB (2.5%)
pprof: -314KB (3%)
trace: -357KB (4%)
vet: -187KB (2.7%)
jujud: -4.4MB (5.8%)
cmd/go: -384KB (3.4%)
The trivial Hello example program goes from 2MB to 1.68MB:
package main
import "fmt"
func main() {
fmt.Println("Hello, 世界")
}
Method pruning also helps when building small binaries with
"-ldflags=-s -w". The above program goes from 1.43MB to 1.2MB.
Unfortunately the linker can only tell if reflect.Value.Call has been
statically linked, not if it is dynamically used. And while use is
rare, it is linked into a very common standard library package,
text/template. The result is programs like cmd/go, which don't use
reflect.Value.Call, see limited benefit from this CL. If binary size
is important enough it may be possible to address this in future work.
For #6853.
Change-Id: Iabe90e210e813b08c3f8fd605f841f0458973396
Reviewed-on: https://go-review.googlesource.com/20483
Reviewed-by: Russ Cox <rsc@golang.org>
Drops cmd/binary size from 14.41 MiB to 11.42 MiB.
Before:
text data bss dec hex filename
8121210 3521696 737960 12380866 bceac2 ../pkg/tool/linux_amd64/compile
bradfitz@dev-bradfitz-debian2:~/go/src$ ls -l ../pkg/tool/linux_amd64/compile
-rwxr-xr-x 1 bradfitz bradfitz 15111272 Mar 8 23:32 ../pkg/tool/linux_amd64/compile
a2afc0 51312 R html.statictmp_0085
6753f0 56592 T cmd/internal/obj/x86.doasm
625480 58080 T cmd/compile/internal/gc.typecheck1
f34c40 65688 D runtime.trace
be0a20 133552 D cmd/compile/internal/ppc64.varianttable
c013e0 265856 D cmd/compile/internal/arm.progtable
c42260 417280 D cmd/compile/internal/amd64.progtable
ca8060 417280 D cmd/compile/internal/x86.progtable
f44ce0 500640 D cmd/internal/obj/arm64.oprange
d0de60 534208 D cmd/compile/internal/ppc64.progtable
d90520 667520 D cmd/compile/internal/arm64.progtable
e334a0 790368 D cmd/compile/internal/mips64.progtable
a3e8c0 1579362 r runtime.pclntab
After:
text data bss dec hex filename
8128226 375954 246432 8750612 858614 ../pkg/tool/linux_amd64/compile
-rwxr-xr-x 1 bradfitz bradfitz 11971432 Mar 8 23:35 ../pkg/tool/linux_amd64/compile
6436d0 43936 T cmd/compile/internal/gc.walkexpr
c13ca0 45056 D cmd/compile/internal/ssa.opcodeTable
5d8ea0 50256 T cmd/compile/internal/gc.(*state).expr
818c50 50448 T cmd/compile/internal/ssa.rewriteValueAMD64_OpMove
a2d0e0 51312 R html.statictmp_0085
6753d0 56592 T cmd/internal/obj/x86.doasm
625460 58080 T cmd/compile/internal/gc.typecheck1
c38fe0 65688 D runtime.trace
a409e0 1578810 r runtime.pclntab
Fixes#14703
Change-Id: I2177596d5c7fd67db0a3c423cd90801cf52adb12
Reviewed-on: https://go-review.googlesource.com/20450
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Deleting the string merging pass makes the linker 30-35% faster
but makes jujud (using the github.com/davecheney/benchjuju snapshot) 2.5% larger.
Two optimizations bring the space overhead down to 0.6%.
First, change the default alignment for string data to 1 byte.
(It was previously defaulting to larger amounts, usually pointer width.)
Second, write out the type string for T (usually a bigger expression) as "*T"[1:],
so that the type strings for T and *T share storage.
Combined, these obtain the bulk of the benefit of string merging
at essentially no cost. The remaining benefit from string merging
is not worth the excessive cost, so delete it.
As penance for making the jujud binary 0.6% larger,
the next CL in this sequence trims the reflect functype
information enough to make the jujud binary overall 0.75% smaller
(that is, that CL has a net -1.35% effect).
For #6853.
Fixes#14648.
Change-Id: I3fdd74c85410930c36bb66160ca4174ed540fc6e
Reviewed-on: https://go-review.googlesource.com/20334
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
Currently, package obj reserves a range of 1<<12 opcodes for each
target architecture. E.g., mips64 has [6<<12, 7<<12).
However, because mips.ABEQ and mips.ALAST are both within that range,
the expression mips.ABEQ+mips.ALAST in turn falls (far) outside that
range around 12<<12, meaning it could theoretically collide with
another arch's opcodes.
More practically, it's a problem because 12<<12 overflows an int16,
which hampers fixing #14692. (We could also just switch to uint16 to
avoid the overflow, but that still leaves the first problem.)
As a workaround, use Michael Hudson-Doyle's solution from
https://golang.org/cl/20182 and use negative values for these variant
instructions.
Passes toolstash -cmp for GOARCH=arm and GOARCH=mips64.
Updates #14692.
Change-Id: Iad797d10652360109fa4db19d4d1edb6529fc2c0
Reviewed-on: https://go-review.googlesource.com/20345
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>