Commit Graph

200 Commits

Author SHA1 Message Date
Russ Cox d2b0c387b2 cmd/asm: add YMM registers Y0 through Y15
Not recognized in any instructions yet, but this lets the
assembler parse them at least.

For #14068.

Change-Id: Id4f7329a969b747a867ce261b20165fab2cdcab8
Reviewed-on: https://go-review.googlesource.com/18846
Reviewed-by: Rob Pike <r@golang.org>
2016-01-24 05:00:43 +00:00
Russ Cox 36edf48a10 cmd/asm: report more than one instruction encoding error
Also, remove output file if there are encoding errors.
The extra reports are convenient.
Removing the output file is very important.
Noticed while testing.

Change-Id: I0fab17d4078f93c5a0d6d1217d8d9a63ac789696
Reviewed-on: https://go-review.googlesource.com/18845
Reviewed-by: Rob Pike <r@golang.org>
2016-01-24 05:00:28 +00:00
Russ Cox a5ba581ae0 cmd/asm: simplify golden test maintenance
Instead of two parallel files that look almost identical,
mark the expected differences in the original file.

The annotations being added here keep the tests passing,
but they also make clear a number of printing or parsing
errors that were not as easily seen when the data was
split across two files.

Fix a few diagnostic problems in cmd/internal/obj as well.

A step toward #13822.

Change-Id: I997172681ea6fa7da915ff0f0ab93d2b76f8dce2
Reviewed-on: https://go-review.googlesource.com/18823
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-01-24 05:00:04 +00:00
Shenghou Ma 1b6d55acab cmd/internal/obj/mips, cmd/internal/obj: reduce MIPS register space
Change-Id: I43458ce0e78ffc3d0943d28dc8df8e1c9e4cf679
Reviewed-on: https://go-review.googlesource.com/18821
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Minux Ma <minux@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-01-22 04:33:47 +00:00
Russ Cox 5f23bc8903 cmd/compile: add AVARLIVE to peep for arm, arm64, mips64, ppc64
Fixes build on those systems.

Also fix printing of AVARLIVE.

Change-Id: I1b38cca0125689bc08e4e1bdd0d0c140b1ea079a
Reviewed-on: https://go-review.googlesource.com/18641
Reviewed-by: Russ Cox <rsc@golang.org>
2016-01-14 02:04:50 +00:00
Russ Cox ed03dab853 cmd/internal/obj: separate code layout from object writing
This will allow the compiler to crunch Prog lists down to code as each
function is compiled, instead of waiting until the end, which should
reduce the working set of the compiler. But not until Go 1.7.

This also makes it easier to write some machine code output tests
for the assembler, which is why it's being done now.

For #13822.

Change-Id: I0811123bc6e5717cebb8948f9cea18e1b9baf6f7
Reviewed-on: https://go-review.googlesource.com/18311
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-01-14 01:51:27 +00:00
Russ Cox 1ac637c766 cmd/compile: recognize Syscall-like functions for liveness analysis
Consider this code:

	func f(*int)

	func g() {
		p := new(int)
		f(p)
	}

where f is an assembly function.
In general liveness analysis assumes that during the call to f, p is dead
in this frame. If f has retained p, p will be found alive in f's frame and keep
the new(int) from being garbage collected. This is all correct and works.
We use the Go func declaration for f to give the assembly function
liveness information (the arguments are assumed live for the entire call).

Now consider this code:

	func h1() {
		p := new(int)
		syscall.Syscall(1, 2, 3, uintptr(unsafe.Pointer(p)))
	}

Here syscall.Syscall is taking the place of f, but because its arguments
are uintptr, the liveness analysis and the garbage collector ignore them.
Since p is no longer live in h1 once the call starts, if the garbage collector
scans the stack while the system call is blocked, it will find no reference
to the new(int) and reclaim it. If the kernel is going to write to *p once
the call finishes, reclaiming the memory is a mistake.

We can't change the arguments or the liveness information for
syscall.Syscall itself, both for compatibility and because sometimes the
arguments really are integers, and the garbage collector will get quite upset
if it finds an integer where it expects a pointer. The problem is that
these arguments are fundamentally untyped.

The solution we have taken in the syscall package's wrappers in past
releases is to insert a call to a dummy function named "use", to make
it look like the argument is live during the call to syscall.Syscall:

	func h2() {
		p := new(int)
		syscall.Syscall(1, 2, 3, uintptr(unsafe.Pointer(p)))
		use(unsafe.Pointer(p))
	}

Keeping p alive during the call means that if the garbage collector
scans the stack during the system call now, it will find the reference to p.

Unfortunately, this approach is not available to users outside syscall,
because 'use' is unexported, and people also have to realize they need
to use it and do so. There is much existing code using syscall.Syscall
without a 'use'-like function. That code will fail very occasionally in
mysterious ways (see #13372).

This CL fixes all that existing code by making the compiler do the right
thing automatically, without any code modifications. That is, it takes h1
above, which is incorrect code today, and makes it correct code.

Specifically, if the compiler sees a foreign func definition (one
without a body) that has uintptr arguments, it marks those arguments
as "unsafe uintptrs". If it later sees the function being called
with uintptr(unsafe.Pointer(x)) as an argument, it arranges to mark x
as having escaped, and it makes sure to hold x in a live temporary
variable until the call returns, so that the garbage collector cannot
reclaim whatever heap memory x points to.
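
As a minimal sketch of the call shape this rule recognizes, the following keeps the uintptr(unsafe.Pointer(...)) conversion inside the argument list of syscall.Syscall. The helpers writeRaw and roundTrip are hypothetical names used only for illustration, and the example assumes a Unix-like system where syscall.SYS_WRITE is defined.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"syscall"
	"unsafe"
)

// writeRaw (hypothetical helper) writes buf to fd with a raw system call.
// The pointer-to-uintptr conversion appears directly in the call's argument
// list, which is the pattern the commit message describes.
func writeRaw(fd uintptr, buf []byte) (int, error) {
	if len(buf) == 0 {
		return 0, nil
	}
	n, _, errno := syscall.Syscall(
		syscall.SYS_WRITE,
		fd,
		uintptr(unsafe.Pointer(&buf[0])), // conversion stays in the call expression
		uintptr(len(buf)),
	)
	if errno != 0 {
		return 0, errno
	}
	return int(n), nil
}

// roundTrip (hypothetical helper) sends msg through a pipe using writeRaw
// and reads it back, so the effect of the raw write can be observed.
func roundTrip(msg string) (string, error) {
	r, w, err := os.Pipe()
	if err != nil {
		return "", err
	}
	defer r.Close()
	defer w.Close()

	n, err := writeRaw(w.Fd(), []byte(msg))
	if err != nil {
		return "", err
	}
	out := make([]byte, n)
	if _, err := io.ReadFull(r, out); err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	got, err := roundTrip("hello")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(got)
}
```

Storing the converted value in a uintptr variable first and passing that variable later would not match the pattern described above, so the sketch avoids it.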

For now I am leaving the explicit calls to use in package syscall,
but they can be removed early in a future cycle (likely Go 1.7).

The rule has no effect on escape analysis, only on liveness analysis.

Fixes #13372.

Change-Id: I2addb83f70d08db08c64d394f9d06ff0a063c500
Reviewed-on: https://go-review.googlesource.com/18584
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-01-14 01:16:45 +00:00
Ilya Tocar 1d1f2fb4c6 cmd/internal/obj/x86: add new instructions, cleanup.
Add several instructions that were used via BYTE and use them.
Instructions added: PEXTRB, PEXTRD, PEXTRQ, PINSRB, XGETBV, POPCNT.

Change-Id: I5a80cd390dc01f3555dbbe856a475f74b5e6df65
Reviewed-on: https://go-review.googlesource.com/18593
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2016-01-13 14:04:44 +00:00
David Chase ab5d2bf92f cmd/compile: suppress export of Note field within exported bodies
Added a format option to inhibit output of .Note field in
printing, and enabled that option during export.
Added test.

Fixes #13777.

Change-Id: I739f9785eb040f2fecbeb96d5a9ceb8c1ca0f772
Reviewed-on: https://go-review.googlesource.com/18217
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: David Chase <drchase@google.com>
2016-01-05 15:42:12 +00:00
Matthew Dempsky 66f1f89dc0 cmd/internal/obj: fix PCSP table at runtime.morestack calls
Fixes #13346.

Change-Id: Ic903ee90575e8dbe23905d0678d3295745d1d47f
Reviewed-on: https://go-review.googlesource.com/18154
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Russ Cox <rsc@golang.org>
2016-01-05 01:48:31 +00:00
Russ Cox 83746fd55a cmd/link: use current GOROOT for source file paths for standard library
This CL changes the source file information in the
standard library's .a files to say "$GOROOT/src/runtime/chan.go"
(with a literal "$GOROOT") instead of spelling out the actual directory.
The linker then substitutes the actual $GOROOT (or $GOROOT_FINAL)
as appropriate.
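
The substitution can be pictured with a small sketch. This is not the linker's actual code: expandGoroot and gorootPlaceholder are hypothetical names, and the sketch only assumes that a recorded path beginning with a literal "$GOROOT" component has that prefix replaced by the real directory.

```go
package main

import (
	"fmt"
	"strings"
)

// gorootPlaceholder is the literal prefix assumed to be recorded in .a files.
const gorootPlaceholder = "$GOROOT"

// expandGoroot (hypothetical) replaces a leading "$GOROOT" path component
// with goroot and leaves every other path untouched.
func expandGoroot(path, goroot string) string {
	if path == gorootPlaceholder {
		return goroot
	}
	if strings.HasPrefix(path, gorootPlaceholder+"/") {
		return goroot + path[len(gorootPlaceholder):]
	}
	return path
}

func main() {
	// Standard library file: the placeholder is expanded.
	fmt.Println(expandGoroot("$GOROOT/src/runtime/chan.go", "/opt/go"))
	// User file: no placeholder, so the path is returned as-is.
	fmt.Println(expandGoroot("/home/me/app/main.go", "/opt/go"))
}
```

Matching on "$GOROOT/" rather than the bare prefix keeps an unrelated name such as "$GOROOTX/foo" from being rewritten.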

If people download a binary distribution to an alternate location,
following the instructions at https://golang.org/doc/install#install,
the code before this CL would end up with source paths pointing to
/usr/local/go no matter where the actual sources were.
Now the source paths for built binaries will point to the actual sources
(hopefully).

The source line information in distributed binaries is not affected:
those will still say /usr/local/go. But binaries people build themselves
(their own programs, not the go distribution programs) will be correct.

Fixing this path also fixes the lookup of the runtime-gdb.py file.

Fixes #5533.

Change-Id: I03729baae3fbd8cd636e016275ee5ad2606e4663
Reviewed-on: https://go-review.googlesource.com/18200
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-01-04 20:09:57 +00:00
Russ Cox 9917165083 cmd/internal/obj: remove 3 incorrect copyright notices
These three files contain only code written for Go
(and trivial amounts at that), not any code ported
from Inferno or Plan 9.

Remove the incorrect Inferno/Plan 9 notices.

Fixes #13576.

Change-Id: Ib9901fb360232282aae5ee0f4aa527bd6f4eaaed
Reviewed-on: https://go-review.googlesource.com/17779
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2015-12-14 21:09:39 +00:00
Ian Lance Taylor 0b37a6f47b cmd/compile, cmd/internal/obj: ignore AUSEFIELD
When using GOEXPERIMENT=fieldtrack, we can see AUSEFIELD instructions.
We generally want to ignore them.

No tests because as far as I can tell there are no tests for
GOEXPERIMENT=fieldtrack.

Change-Id: Iee26f25592158e5db691a36cf8d77fc54d051314
Reviewed-on: https://go-review.googlesource.com/17610
Reviewed-by: David Symonds <dsymonds@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2015-12-09 21:07:32 +00:00
Michael Hudson-Doyle 3a1bed82a6 cmd/internal/obj: fix stack barriers in ppc64le shared libs
runtime.stackBarrier is a strange function: it is only ever "called" by
smashing its address into a LR slot on the stack. Calling it like this
certainly does not adhere to the rule that r12 is set to the global entry point
before calling it and the prologue instructions that compute r2 from r12 in fact
just corrupt r2, which is bad because the function that stackBarrier returns to
probably uses r2 to access global data.

Fortunately stackBarrier itself does not access any global data and so does not
depend on the value of r2, meaning we can ignore the ABI rules and simply skip
inserting the prologue instructions into this specific function.

Fixes 64bit.go, append.go and fixedbugs/issue13169.go from "cd test; go run
run.go -linkshared".

Change-Id: I606864133a83935899398e2d42edd08a946aab24
Reviewed-on: https://go-review.googlesource.com/17281
Reviewed-by: Austin Clements <austin@google.com>
2015-12-02 21:15:37 +00:00
sergey.arseev e9081b3c76 cmd/internal/obj/x86: add support for TSX instructions
Transactional memory, will later be used for semaphore implementation.
Nacl not supported yet.

Change-Id: Ic18453dcaa08d07bb217c0b95461584f007d518b
Reviewed-on: https://go-review.googlesource.com/16479
Reviewed-by: Russ Cox <rsc@golang.org>
2015-11-26 16:39:00 +00:00
Ilya Tocar b597e1ed54 runtime: speed up memclr with avx2 on amd64
Results are a bit noisy, but show good improvement (haswell)

name            old time/op    new time/op     delta
Memclr5-48        6.06ns ± 8%     5.65ns ± 8%    -6.81%  (p=0.000 n=20+20)
Memclr16-48       5.75ns ± 6%     5.71ns ± 6%      ~     (p=0.545 n=20+19)
Memclr64-48       6.54ns ± 5%     6.14ns ± 9%    -6.12%  (p=0.000 n=18+20)
Memclr256-48      10.1ns ±12%      9.9ns ±14%      ~     (p=0.285 n=20+19)
Memclr4096-48      104ns ± 8%       57ns ±15%   -44.98%  (p=0.000 n=20+20)
Memclr65536-48    2.45µs ± 5%     2.43µs ± 8%      ~     (p=0.665 n=16+20)
Memclr1M-48       58.7µs ±13%     56.4µs ±11%    -3.92%  (p=0.033 n=20+19)
Memclr4M-48        233µs ± 9%      234µs ± 9%      ~     (p=0.728 n=20+19)
Memclr8M-48        469µs ±11%      472µs ±16%      ~     (p=0.947 n=20+20)
Memclr16M-48       947µs ±10%      916µs ±10%      ~     (p=0.050 n=20+19)
Memclr64M-48      10.9ms ±10%      4.5ms ± 9%   -58.43%  (p=0.000 n=20+20)
GoMemclr5-48      3.80ns ±13%     3.38ns ± 6%   -11.02%  (p=0.000 n=20+20)
GoMemclr16-48     3.34ns ±15%     3.40ns ± 9%      ~     (p=0.351 n=20+20)
GoMemclr64-48     4.10ns ±15%     4.04ns ±10%      ~     (p=1.000 n=20+19)
GoMemclr256-48    7.75ns ±20%     7.88ns ± 9%      ~     (p=0.227 n=20+19)

name            old speed      new speed       delta
Memclr5-48       826MB/s ± 7%    886MB/s ± 8%    +7.32%  (p=0.000 n=20+20)
Memclr16-48     2.78GB/s ± 5%   2.81GB/s ± 6%      ~     (p=0.550 n=20+19)
Memclr64-48     9.79GB/s ± 5%  10.44GB/s ±10%    +6.64%  (p=0.000 n=18+20)
Memclr256-48    25.4GB/s ±14%   25.6GB/s ±12%      ~     (p=0.647 n=20+19)
Memclr4096-48   39.4GB/s ± 8%   72.0GB/s ±13%   +82.81%  (p=0.000 n=20+20)
Memclr65536-48  26.6GB/s ± 6%   27.0GB/s ± 9%      ~     (p=0.517 n=17+20)
Memclr1M-48     17.9GB/s ±12%   18.5GB/s ±11%      ~     (p=0.068 n=20+20)
Memclr4M-48     18.0GB/s ± 9%   17.8GB/s ±14%      ~     (p=0.547 n=20+20)
Memclr8M-48     17.9GB/s ±10%   17.8GB/s ±14%      ~     (p=0.947 n=20+20)
Memclr16M-48    17.8GB/s ± 9%   18.4GB/s ± 9%      ~     (p=0.050 n=20+19)
Memclr64M-48    6.19GB/s ±10%  14.87GB/s ± 9%  +140.11%  (p=0.000 n=20+20)
GoMemclr5-48    1.31GB/s ±10%   1.48GB/s ± 6%   +13.06%  (p=0.000 n=19+20)
GoMemclr16-48   4.81GB/s ±14%   4.71GB/s ± 8%      ~     (p=0.341 n=20+20)
GoMemclr64-48   15.7GB/s ±13%   15.8GB/s ±11%      ~     (p=0.967 n=20+19)

Change-Id: I393f3f20e2f31538d1b1dd70d6e5c201c106a095
Reviewed-on: https://go-review.googlesource.com/16773
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Klaus Post <klauspost@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
2015-11-24 16:49:30 +00:00
Michael Hudson-Doyle 0fbf0955d4 cmd/internal/obj/x86: still use (fake) local exec TLS mode on android/386
golang.org/cl/16383 broke android/386 because by a sort of confluence of hacks
no TLS relocations were emitted at all when Flag_shared != 0. The hack in
runtime/cgo works as well in a PIE executable as it does with a position
dependent one, so the simplest fix is to still emit a R_TLS_LE reloc when goos
== "android".

A real fix is to use something more like the IE model code but loading the
offset from %gs to the thread local storage from a global variable rather than
from a location chosen by the system linker (this is how android/arm works).

Issue #9327.

Change-Id: I9fbfc890ec7fe191f80a595b6cf8e2a1fcbe3034
Reviewed-on: https://go-review.googlesource.com/17049
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
2015-11-19 17:43:37 +00:00
Michael Hudson-Doyle 342f17eaf7 cmd/internal/obj/x86, cmd/link: enable access to global data via GOT when -dynlink on 386
Change-Id: I97504a11291ee60e656efb7704e37387e864d74f
Reviewed-on: https://go-review.googlesource.com/16385
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2015-11-18 21:27:45 +00:00
Michael Hudson-Doyle cb0393866a cmd/internal/obj/x86: position independent access to global data on 386 when -shared
This works by adding a call to __x86.get_pc_thunk.cx immediately before any
instruction that accesses global data and then assembling the instruction to
use the appropriate offset from CX instead of the absolute address. Some forms
cannot be assembled that way and are rewritten to load the address into CX
first.

-buildmode=pie works now, but is not yet tested.

Fixes #13201 (I think)

Change-Id: I32a8561e7fc9dd4ca6ae3b0e57ad78a6c50bf1f5
Reviewed-on: https://go-review.googlesource.com/17014
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2015-11-18 21:26:42 +00:00
Michael Hudson-Doyle 3c85e1b186 cmd/internal/obj/x86: factor rewriting to use GOT into separate function
I was prodded into doing this in review comments for the ARM version, and it's
going to make shared libs for 386 easier.

Change-Id: Id12de801b1425b8c6b5736fe91b418fc123a4e40
Reviewed-on: https://go-review.googlesource.com/17012
Reviewed-by: David Crawshaw <crawshaw@golang.org>
2015-11-18 21:25:55 +00:00
Michael Hudson-Doyle 3bf61fb2e5 cmd/internal/obj/x86, cmd/link/internal/x86: support IE model TLS on linux/386
This includes the first parts of the general approach to PIC: load PC into CX
whenever it is needed. This is going to lead to large binaries and poor
performance but it's a start and easy to get right.

Change-Id: Ic8bf1d0a74284cca0d94a68cf75024e8ab063b4e
Reviewed-on: https://go-review.googlesource.com/16383
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2015-11-18 01:57:01 +00:00
Michael Hudson-Doyle 3534e2bef4 cmd/internal/obj, cmd/link: access global data via a GOT in -dynlink mode on arm64
Change-Id: I6ca9406207e40c7c2c661075ccfe57b6600235cf
Reviewed-on: https://go-review.googlesource.com/13997
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2015-11-15 23:40:22 +00:00
David Crawshaw 9958a7b563 Revert "cmd/internal/obj/arm64, cmd/link: use two instructions rather than three for loads from memory"
This reverts commit 3a9bc571b0.

Breaks darwin/arm64.

Change-Id: Ib958beacabca48020a6a47332fbdec99d994060b
Reviewed-on: https://go-review.googlesource.com/16906
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
2015-11-13 11:40:10 +00:00
Shenghou Ma 2a031e6a2a cmd/internal/obj/arm64: rewrite branches that are too far
Fixes #12540.

Change-Id: I7893fdc023145b0aca4b4c7df7e08e47edcf5bba
Reviewed-on: https://go-review.googlesource.com/16902
Run-TryBot: Minux Ma <minux@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2015-11-13 03:40:58 +00:00
Michael Hudson-Doyle 64fbca41c8 cmd/internal/obj/ppc64: avoid calling morestack via a PLT when dynamically linking
Change-Id: Ie79f72786b1d7154f1910e717a0faf354b913b89
Reviewed-on: https://go-review.googlesource.com/15970
Reviewed-by: Russ Cox <rsc@golang.org>
2015-11-13 00:36:22 +00:00
Michael Hudson-Doyle 2ac993107f cmd/internal/obj, cmd/link: access global data via GOT when dynlinking on ppc64le
Change-Id: I79c60241df6c785f35371e70c777a7bd6e93571c
Reviewed-on: https://go-review.googlesource.com/15968
Reviewed-by: Russ Cox <rsc@golang.org>
2015-11-13 00:25:21 +00:00
Michael Hudson-Doyle a35c85c0cc cmd/internal/obj, runtime: implement IE model TLS on ppc64le
This requires changing the tls access code to match the patterns documented in
the ABI documentation or the system linker will "optimize" it into ridiculousness.

With this change, -buildmode=pie works, although as it is tested in testshared,
the tests are not run yet.

Change-Id: I1efa6687af0a5b8db3385b10f6542a49056b2eb3
Reviewed-on: https://go-review.googlesource.com/15971
Reviewed-by: Russ Cox <rsc@golang.org>
2015-11-12 23:50:27 +00:00
Michael Hudson-Doyle bd329d47d9 cmd/internal/obj, cmd/link: generate position independent loads of static data
Change-Id: I0a8448c2b69f5cfa6f099d772f5eb3412f853045
Reviewed-on: https://go-review.googlesource.com/15969
Reviewed-by: Russ Cox <rsc@golang.org>
2015-11-12 23:33:12 +00:00
Michael Hudson-Doyle 368d548417 cmd/compile, cmd/link, runtime: on ppc64x, maintain the TOC pointer in R2 when compiling PIC
The PowerPC ISA does not have a PC-relative load instruction, which poses
obvious challenges when generating position-independent code. The way the ELFv2
ABI addresses this is to specify that r2 points to a per "module" (shared
library or executable) TOC pointer. Maintaining this pointer requires
cooperation between codegen and the system linker:

 * Non-leaf functions leave space on the stack at r1+24 to save the TOC pointer.
 * A call to a function that *might* have to go via a PLT stub must be followed
   by a nop instruction that the system linker can replace with "ld r2, 24(r1)"
   to restore the TOC pointer (only when dynamically linking Go code).
 * When calling a function via a function pointer, the address of the function
   must be in r12, and the first couple of instructions (the "global entry
   point") of the called function use this to derive the address of the TOC
   for the module it is in.
 * When calling a function that is implemented in the same module, the system
   linker adjusts the call to skip over the instructions mentioned above (the
   "local entry point"), assuming that r2 is already correctly set.

So this changeset adds the global entry point instructions, sets the metadata so
the system linker knows where the local entry point is, inserts code to save the
TOC pointer at 24(r1), adds a nop after any call not known to be local and copes
with the odd non-local code transfer in the runtime (e.g. the stuff around
jmpdefer). It does not actually compile PIC yet.

Change-Id: I7522e22bdfd2f891745a900c60254fe9e372c854
Reviewed-on: https://go-review.googlesource.com/15967
Reviewed-by: Russ Cox <rsc@golang.org>
2015-11-12 23:18:58 +00:00
Michael Hudson-Doyle c83c806535 cmd/internal/obj, cmd/link, runtime: use a larger stack frame on ppc64
The larger stack frames cause the nosplit stack to overflow so the next change
increases the stackguard.

Change-Id: Ib2b4f24f0649eb1d13e3a58d265f13d1b6cc9bf9
Reviewed-on: https://go-review.googlesource.com/15964
Reviewed-by: Russ Cox <rsc@golang.org>
2015-11-12 22:32:37 +00:00
Michael Hudson-Doyle c1b6e392f5 cmd/internal/obj, cmd/link, runtime: increase stack limit to accommodate larger frames on ppc64x
Larger stack frames mean nosplit functions use more stack and so the limit
needs to increase.

The change to test/nosplit.go is a bit ugly but I can't really think of a
way to make it nicer.

Change-Id: I2616b58015f0b62abbd62951575fcd0d2d8643c2
Reviewed-on: https://go-review.googlesource.com/16504
Reviewed-by: Russ Cox <rsc@golang.org>
2015-11-12 22:32:16 +00:00
Yao Zhang fa6a1ecd63 cmd/internal/obj/mips: added support for GOARCH=mips64{,le}
MIPS64 has 32 general purpose 64-bit integer registers (R0-R31), 32
64-bit floating point registers (F0-F31). Instructions are fixed-width,
and are 32-bit wide. Instructions are all in standard 1-, 2-, 3-operand
forms.

MIPS64-specific relocations are added. For this reason, test data of
cmd/newlink are regenerated.

No other changes are made to portable structures.

Branch delay slots are currently filled with NOP instructions. The function
for instruction scheduling (try to fill the delay slot with a useful
instruction) is implemented but disabled for now.

Change-Id: Ic364999c7a33245260c1381fc26a2fa8972d38b3
Reviewed-on: https://go-review.googlesource.com/14442
Reviewed-by: Minux Ma <minux@golang.org>
2015-11-12 04:42:44 +00:00
Hyang-Ah Hana Kim 05c4c6e2f4 cmd,runtime: TLS setup for android/386
Same ugly hack as https://go-review.googlesource.com/15991.

Update golang/go#9327.

Change-Id: I58284e83268a15de95eabc833c3e01bf1e3faa2e
Reviewed-on: https://go-review.googlesource.com/16678
Reviewed-by: David Crawshaw <crawshaw@golang.org>
2015-11-11 21:59:24 +00:00
Michael Hudson-Doyle e6ceb92e1c cmd/internal/obj/arm: access global data via GOT on arm when -dynlink
Change-Id: I88034611f56cc06bb47b0c431075cc78ca8dbb09
Reviewed-on: https://go-review.googlesource.com/14188
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2015-11-10 19:57:05 +00:00
Michael Hudson-Doyle 3a9bc571b0 cmd/internal/obj/arm64, cmd/link: use two instructions rather than three for loads from memory
Reduces size of godoc .text section by about 75k (or 1.4%).

Change-Id: I65850aa569aefbddd6cb07c6ae1addcc39cab6a5
Reviewed-on: https://go-review.googlesource.com/13993
Reviewed-by: Russ Cox <rsc@golang.org>
2015-11-09 22:03:11 +00:00
Michael Hudson-Doyle 5e1d0fcbed cmd/internal/obj, cmd/link: handle the fact that a few store/loads on ppc64 are DS form
Change-Id: I4fe1af48ec1cd8a23e2f7f2a0257dc989ff7aced
Reviewed-on: https://go-review.googlesource.com/14235
Reviewed-by: Russ Cox <rsc@golang.org>
2015-11-08 19:35:47 +00:00
Ilya Tocar 321a40721b runtime: optimize indexbytebody on amd64
Use avx2 to compare 32 bytes per iteration.
Results (haswell):

name                    old time/op    new time/op     delta
IndexByte32-6             15.5ns ± 0%     14.7ns ± 5%   -4.87%        (p=0.000 n=16+20)
IndexByte4K-6              360ns ± 0%      183ns ± 0%  -49.17%        (p=0.000 n=19+20)
IndexByte4M-6              384µs ± 0%      256µs ± 1%  -33.41%        (p=0.000 n=20+20)
IndexByte64M-6            6.20ms ± 0%     4.18ms ± 1%  -32.52%        (p=0.000 n=19+20)
IndexBytePortable32-6     73.4ns ± 5%     75.8ns ± 3%   +3.35%        (p=0.000 n=20+19)
IndexBytePortable4K-6     5.15µs ± 0%     5.15µs ± 0%     ~     (all samples are equal)
IndexBytePortable4M-6     5.26ms ± 0%     5.25ms ± 0%   -0.12%        (p=0.000 n=20+18)
IndexBytePortable64M-6    84.1ms ± 0%     84.1ms ± 0%   -0.08%        (p=0.012 n=18+20)
Index32-6                  352ns ± 0%      352ns ± 0%     ~     (all samples are equal)
Index4K-6                 53.8µs ± 0%     53.8µs ± 0%   -0.03%        (p=0.000 n=16+18)
Index4M-6                 55.4ms ± 0%     55.4ms ± 0%     ~           (p=0.149 n=20+19)
Index64M-6                 886ms ± 0%      886ms ± 0%     ~           (p=0.108 n=20+20)
IndexEasy32-6             80.3ns ± 0%     80.1ns ± 0%   -0.21%        (p=0.000 n=20+20)
IndexEasy4K-6              426ns ± 0%      215ns ± 0%  -49.53%        (p=0.000 n=20+20)
IndexEasy4M-6              388µs ± 0%      262µs ± 1%  -32.42%        (p=0.000 n=18+20)
IndexEasy64M-6            6.20ms ± 0%     4.19ms ± 1%  -32.47%        (p=0.000 n=18+20)

name                    old speed      new speed       delta
IndexByte32-6           2.06GB/s ± 1%   2.17GB/s ± 5%   +5.19%        (p=0.000 n=18+20)
IndexByte4K-6           11.4GB/s ± 0%   22.3GB/s ± 0%  +96.45%        (p=0.000 n=17+20)
IndexByte4M-6           10.9GB/s ± 0%   16.4GB/s ± 1%  +50.17%        (p=0.000 n=20+20)
IndexByte64M-6          10.8GB/s ± 0%   16.0GB/s ± 1%  +48.19%        (p=0.000 n=19+20)
IndexBytePortable32-6    436MB/s ± 5%    422MB/s ± 3%   -3.27%        (p=0.000 n=20+19)
IndexBytePortable4K-6    795MB/s ± 0%    795MB/s ± 0%     ~           (p=0.940 n=17+18)
IndexBytePortable4M-6    798MB/s ± 0%    799MB/s ± 0%   +0.12%        (p=0.000 n=20+18)
IndexBytePortable64M-6   798MB/s ± 0%    798MB/s ± 0%   +0.08%        (p=0.011 n=18+20)
Index32-6               90.9MB/s ± 0%   90.9MB/s ± 0%   -0.00%        (p=0.025 n=20+20)
Index4K-6               76.1MB/s ± 0%   76.1MB/s ± 0%   +0.03%        (p=0.000 n=14+15)
Index4M-6               75.7MB/s ± 0%   75.7MB/s ± 0%     ~           (p=0.076 n=20+19)
Index64M-6              75.7MB/s ± 0%   75.7MB/s ± 0%     ~           (p=0.456 n=20+17)
IndexEasy32-6            399MB/s ± 0%    399MB/s ± 0%   +0.20%        (p=0.000 n=20+19)
IndexEasy4K-6           9.60GB/s ± 0%  19.02GB/s ± 0%  +98.19%        (p=0.000 n=20+20)
IndexEasy4M-6           10.8GB/s ± 0%   16.0GB/s ± 1%  +47.98%        (p=0.000 n=18+20)
IndexEasy64M-6          10.8GB/s ± 0%   16.0GB/s ± 1%  +48.08%        (p=0.000 n=18+20)

Change-Id: I46075921dde9f3580a89544c0b3a2d8c9181ebc4
Reviewed-on: https://go-review.googlesource.com/16484
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Klaus Post <klauspost@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2015-11-06 15:16:28 +00:00
Michael Hudson-Doyle 10c0753761 cmd/internal/obj/ppc64: fix assembly of SRADCC with immediate
"sradi" and "sradi." hide the top bit of their immediate argument apart from the
rest of it, but the code only handled the "sradi" case.

I'm pretty sure this is the only instruction missing (a couple of the rotate
instructions encode their immediate the same way but their handling looks OK).

This fixes the failure of "GOARCH=amd64 ~/go/bin/go install -v runtime" as
reported in the bug.

Fixes #11987

Change-Id: I0cdefcd7a04e0e8fce45827e7054ffde9a83f589
Reviewed-on: https://go-review.googlesource.com/16710
Reviewed-by: Minux Ma <minux@golang.org>
2015-11-05 22:54:21 +00:00
Ilya Tocar 967564be7e runtime: optimize string comparison on amd64
Use AVX2 if possible.
Results below (haswell):

name                            old time/op    new time/op     delta
CompareStringEqual-6              8.77ns ± 0%     8.63ns ± 1%   -1.58%        (p=0.000 n=20+19)
CompareStringIdentical-6          5.02ns ± 0%     5.02ns ± 0%     ~     (all samples are equal)
CompareStringSameLength-6         7.51ns ± 0%     7.51ns ± 0%     ~     (all samples are equal)
CompareStringDifferentLength-6    1.56ns ± 0%     1.56ns ± 0%     ~     (all samples are equal)
CompareStringBigUnaligned-6        124µs ± 1%      105µs ± 5%  -14.99%        (p=0.000 n=20+18)
CompareStringBig-6                 112µs ± 1%      103µs ± 0%   -7.87%        (p=0.000 n=20+17)

name                            old speed      new speed       delta
CompareStringBigUnaligned-6     8.48GB/s ± 1%   9.98GB/s ± 5%  +17.67%        (p=0.000 n=20+18)
CompareStringBig-6              9.37GB/s ± 1%  10.17GB/s ± 0%   +8.54%        (p=0.000 n=20+17)

Change-Id: I1c949626dd2aaf9f633e3c888a9df71c82eed7e1
Reviewed-on: https://go-review.googlesource.com/16481
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Klaus Post <klauspost@gmail.com>
2015-11-05 15:42:33 +00:00
Ilya Tocar 0e23ca41d9 bytes: speed up Compare() on amd64
Use AVX2 if available.
Results (haswell), below:

name                           old time/op    new time/op     delta
BytesCompare1-6                  11.4ns ± 0%     11.4ns ± 0%     ~     (all samples are equal)
BytesCompare2-6                  11.4ns ± 0%     11.4ns ± 0%     ~     (all samples are equal)
BytesCompare4-6                  11.4ns ± 0%     11.4ns ± 0%     ~     (all samples are equal)
BytesCompare8-6                  9.29ns ± 2%     8.76ns ± 0%   -5.72%        (p=0.000 n=16+17)
BytesCompare16-6                 9.29ns ± 2%     9.20ns ± 0%   -1.02%        (p=0.000 n=20+16)
BytesCompare32-6                 11.4ns ± 1%     11.4ns ± 0%     ~           (p=0.191 n=20+20)
BytesCompare64-6                 14.4ns ± 0%     13.1ns ± 0%   -8.68%        (p=0.000 n=20+20)
BytesCompare128-6                20.2ns ± 0%     18.5ns ± 0%   -8.27%        (p=0.000 n=16+20)
BytesCompare256-6                29.3ns ± 0%     24.5ns ± 0%  -16.38%        (p=0.000 n=16+16)
BytesCompare512-6                46.8ns ± 0%     37.1ns ± 0%  -20.78%        (p=0.000 n=18+16)
BytesCompare1024-6               82.9ns ± 0%     62.3ns ± 0%  -24.86%        (p=0.000 n=20+14)
BytesCompare2048-6                155ns ± 0%      112ns ± 0%  -27.74%        (p=0.000 n=20+20)
CompareBytesEqual-6              10.1ns ± 1%     10.0ns ± 1%     ~           (p=0.527 n=20+20)
CompareBytesToNil-6              10.0ns ± 2%      9.4ns ± 0%   -6.57%        (p=0.000 n=20+17)
CompareBytesEmpty-6              8.76ns ± 0%     8.76ns ± 0%     ~     (all samples are equal)
CompareBytesIdentical-6          8.76ns ± 0%     8.76ns ± 0%     ~     (all samples are equal)
CompareBytesSameLength-6         10.6ns ± 1%     10.6ns ± 1%     ~           (p=0.240 n=20+20)
CompareBytesDifferentLength-6    10.6ns ± 0%     10.6ns ± 1%     ~           (p=1.000 n=20+20)
CompareBytesBigUnaligned-6        132µs ± 1%      105µs ± 1%  -20.61%        (p=0.000 n=20+18)
CompareBytesBig-6                 125µs ± 1%      105µs ± 1%  -16.31%        (p=0.000 n=20+20)
CompareBytesBigIdentical-6       8.13ns ± 0%     8.13ns ± 0%     ~     (all samples are equal)

name                           old speed      new speed       delta
CompareBytesBigUnaligned-6     7.94GB/s ± 1%  10.01GB/s ± 1%  +25.96%        (p=0.000 n=20+18)
CompareBytesBig-6              8.38GB/s ± 1%  10.01GB/s ± 1%  +19.48%        (p=0.000 n=20+20)
CompareBytesBigIdentical-6      129TB/s ± 0%    129TB/s ± 0%   +0.01%        (p=0.003 n=17+19)

Change-Id: I820f31bab4582dd4204b146bb077c0d2f24cd8f5
Reviewed-on: https://go-review.googlesource.com/16434
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Klaus Post <klauspost@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2015-11-02 18:39:38 +00:00
Michael Hudson-Doyle c9b8cab16c cmd/internal/obj, cmd/link, runtime: handle TLS more like a platform linker on ppc64
On ppc64x, the thread pointer, held in R13, points 0x7000 bytes past where
thread-local storage begins (presumably to maximize the amount of storage that
can be accessed with a 16-bit signed displacement). The relocations used to
indicate thread-local storage to the platform linker account for this, so to be
able to support external linking we need to change things so the linker applies
this offset instead of the runtime assembly.
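
The offset arithmetic described above can be sketched as follows. This is only an illustration: the helper names are hypothetical and are not taken from the Go linker or runtime, and the only fact borrowed from the message is the 0x7000 bias between the start of thread-local storage and the thread pointer.

```go
package main

import "fmt"

// tpBias is the distance from the start of thread-local storage to the
// thread pointer (R13) on ppc64x, as stated in the commit message.
const tpBias = 0x7000

// dispFromTP converts an offset measured from the start of the TLS block
// into the displacement to use relative to the thread pointer.
// Hypothetical helper, for illustration only.
func dispFromTP(tlsOffset int64) int64 {
	return tlsOffset - tpBias
}

// fitsInt16 reports whether d can be encoded as a 16-bit signed displacement.
func fitsInt16(d int64) bool {
	return d >= -0x8000 && d <= 0x7fff
}

func main() {
	for _, off := range []int64{0, 0x7000, 0xefff, 0xf000} {
		d := dispFromTP(off)
		fmt.Printf("TLS offset %#x -> displacement %d (fits in 16 bits: %v)\n",
			off, d, fitsInt16(d))
	}
}
```

With the bias, offsets 0 through 0xefff from the start of the block are reachable with a single 16-bit displacement, rather than 0 through 0x7fff without it, which is consistent with the "presumably" in the message.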

Change-Id: I2556c249ab2d802cae62c44b2b4c5b44787d7059
Reviewed-on: https://go-review.googlesource.com/14233
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2015-10-29 22:24:29 +00:00
Michael Hudson-Doyle d821ae2a9e cmd/internal/obj, cmd/link: simplify ppc64 archreloc now that the original value is passed to it
And get rid of the stupid game of encoding the instruction in the addend.

Change-Id: Ib4de7515196cbc1e63b4261b01931cf02a44c1e6
Reviewed-on: https://go-review.googlesource.com/14055
Reviewed-by: Russ Cox <rsc@golang.org>
2015-10-29 20:46:23 +00:00
Hyang-Ah Hana Kim dfc8649854 runtime, cmd: TLS setup for android/amd64.
The Android linker does not handle TLS for us. Instead, we set up the
TLS slot for g ourselves, as darwin/386 and darwin/amd64 do. This is
disgusting and fragile. We will eventually fix this ugly hack by taking
advantage of the recent TLS IE model implementation. (Instead of
referencing a GOT entry, make the code sequence look into the TLS
variable that holds the offset.)

The TLS slot for g in android/amd64 assumes a fixed offset from %fs.
See runtime/cgo/gcc_android_amd64.c for details.

For golang/go#10743

Change-Id: I1a3fc207946c665515f79026a56ea19134ede2dd
Reviewed-on: https://go-review.googlesource.com/15991
Reviewed-by: David Crawshaw <crawshaw@golang.org>
2015-10-28 20:54:28 +00:00
Michael Hudson-Doyle 80d9106487 cmd/internal/obj, cmd/link: support initial-exec TLS on arm64
Change-Id: Iaf9159a68fa395245bc20ccb4a2a377f89371a7e
Reviewed-on: https://go-review.googlesource.com/13996
Reviewed-by: David Crawshaw <crawshaw@golang.org>
2015-10-28 19:51:28 +00:00
Michael Hudson-Doyle 72180c3b82 cmd/internal/obj, cmd/link, runtime: native-ish support for tls on arm64
Fixes #10560

Change-Id: Iedffd9c236c4fbb386c3afc52c5a1457f96ef122
Reviewed-on: https://go-review.googlesource.com/13991
Reviewed-by: David Crawshaw <crawshaw@golang.org>
2015-10-28 19:51:05 +00:00
Michael Hudson-Doyle 00f42437fd cmd/internal/obj/x86: remove REGTMP
Nothing uses this.

Change-Id: Ibc13066940bd2ea5c74d955a67f9dc531bef2758
Reviewed-on: https://go-review.googlesource.com/16344
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2015-10-27 03:48:21 +00:00
Robert Griesemer ae2f54a771 cmd/compile/internal/gc: compact binary export format
The binary import/export format is significantly more
compact than the existing textual format. It should
also be faster to read and write (to be measured).

Use -newexport to enable, for instance:
export GO_GCFLAGS=-newexport; make.bash

The compiler can import packages using both the old
and the new format ("mixed mode").

Missing: export info for inlined function bodies
(performance issue, does not affect correctness).

Disabled by default until we have inlined function
bodies and confirmation of no regression and equality
of binaries.

For #6110.
For #1909.

This change depends on:

   https://go-review.googlesource.com/16220
   https://go-review.googlesource.com/16222

(already submitted) for all.bash to work.

Some initial export data sizes for std lib packages. This data
is without exported functions with inlineable function bodies.

Package                                       old      new    new/old

archive/tar.................................13875.....3883    28%
archive/zip.................................19464.....5046    26%
bufio....................................... 7733.....2222    29%
bytes.......................................10342.....3347    32%
cmd/addr2line.................................242.......26    11%
cmd/api.....................................39305....10368    26%
cmd/asm/internal/arch.......................27732.....7939    29%
cmd/asm/internal/asm........................35264....10295    29%
cmd/asm/internal/flags........................629......178    28%
cmd/asm/internal/lex........................39248....11128    28%
cmd/asm.......................................306.......26     8%
cmd/cgo.....................................40197....10570    26%
cmd/compile/internal/amd64...................1106......214    19%
cmd/compile/internal/arm....................27891.....7710    28%
cmd/compile/internal/arm64....................891......153    17%
cmd/compile/internal/big....................21637.....8336    39%
cmd/compile/internal/gc....................109845....29727    27%
cmd/compile/internal/mips64...................972......168    17%
cmd/compile/internal/ppc64....................972......168    17%
cmd/compile/internal/x86.....................1104......195    18%
cmd/compile...................................329.......26     8%
cmd/cover...................................12986.....3749    29%
cmd/dist......................................477.......67    14%
cmd/doc.....................................23043.....6793    29%
cmd/expdump...................................167.......26    16%
cmd/fix......................................1190......208    17%
cmd/go......................................26399.....5629    21%
cmd/gofmt.....................................499.......26     5%
cmd/internal/gcprog..........................1342......490    37%
cmd/internal/goobj...........................2690......980    36%
cmd/internal/obj/arm........................32740....10057    31%
cmd/internal/obj/arm64......................46542....15364    33%
cmd/internal/obj/mips.......................42140....13731    33%
cmd/internal/obj/ppc64......................42140....13731    33%
cmd/internal/obj/x86........................52732....19015    36%
cmd/internal/obj............................36729....11690    32%
cmd/internal/objfile........................36365....10287    28%
cmd/link/internal/amd64.....................45893....12220    27%
cmd/link/internal/arm.........................307.......96    31%
cmd/link/internal/arm64.......................345.......98    28%
cmd/link/internal/ld.......................109300....46326    42%
cmd/link/internal/ppc64.......................344.......99    29%
cmd/link/internal/x86.........................334......107    32%
cmd/link......................................314.......26     8%
cmd/newlink..................................8110.....2544    31%
cmd/nm........................................210.......26    12%
cmd/objdump...................................244.......26    11%
cmd/pack....................................14248.....4066    29%
cmd/pprof/internal/commands..................5239.....1285    25%
cmd/pprof/internal/driver...................37967.....8860    23%
cmd/pprof/internal/fetch....................30962.....7337    24%
cmd/pprof/internal/plugin...................47734.....7719    16%
cmd/pprof/internal/profile..................22286.....6922    31%
cmd/pprof/internal/report...................31187.....7838    25%
cmd/pprof/internal/svg.......................4315......965    22%
cmd/pprof/internal/symbolizer...............30051.....7397    25%
cmd/pprof/internal/symbolz..................28545.....6949    24%
cmd/pprof/internal/tempfile.................12550.....3356    27%
cmd/pprof.....................................563.......26     5%
cmd/trace....................................1455......636    44%
cmd/vendor/golang.org/x/arch/arm/armasm....168035....64737    39%
cmd/vendor/golang.org/x/arch/x86/x86asm.....26871.....8578    32%
cmd/vet.....................................38980.....9913    25%
cmd/vet/whitelist.............................102.......49    48%
cmd/yacc.....................................2518......926    37%
compress/bzip2...............................6326......129     2%
compress/flate...............................7069.....2541    36%
compress/gzip...............................20143.....5069    25%
compress/lzw..................................828......295    36%
compress/zlib...............................10676.....2692    25%
container/heap................................523......181    35%
container/list...............................3517......740    21%
container/ring................................881......229    26%
crypto/aes....................................550......187    34%
crypto/cipher................................1966......825    42%
crypto.......................................1836......646    35%
crypto/des....................................632......235    37%
crypto/dsa..................................18718.....5035    27%
crypto/ecdsa................................23131.....6097    26%
crypto/elliptic.............................20790.....5740    28%
crypto/hmac...................................455......186    41%
crypto/md5...................................1375......171    12%
crypto/rand.................................18132.....4748    26%
crypto/rc4....................................561......240    43%
crypto/rsa..................................22094.....6380    29%
crypto/sha1..................................1416......172    12%
crypto/sha256.................................551......238    43%
crypto/sha512.................................839......378    45%
crypto/subtle................................1153......250    22%
crypto/tls..................................58203....17984    31%
crypto/x509/pkix............................29447.....8161    28%
database/sql/driver..........................3318.....1096    33%
database/sql................................11258.....3942    35%
debug/dwarf.................................18416.....7006    38%
debug/elf...................................57530....21014    37%
debug/gosym..................................4992.....2058    41%
debug/macho.................................23037.....6538    28%
debug/pe....................................21063.....6619    31%
debug/plan9obj...............................2467......802    33%
encoding/ascii85.............................1523......360    24%
encoding/asn1................................1718......527    31%
encoding/base32..............................2642......686    26%
encoding/base64..............................3077......800    26%
encoding/binary..............................4727.....1040    22%
encoding/csv................................12223.....2850    23%
encoding......................................383......217    57%
encoding/gob................................37563....10113    27%
encoding/hex.................................1327......390    29%
encoding/json...............................30897.....7804    25%
encoding/pem..................................595......200    34%
encoding/xml................................37798.....9336    25%
errors........................................274.......36    13%
expvar.......................................3155.....1021    32%
flag........................................19860.....2849    14%
fmt..........................................3137.....1263    40%
go/ast......................................44729....13422    30%
go/build....................................16336.....4657    29%
go/constant..................................3703......846    23%
go/doc.......................................9877.....2807    28%
go/format....................................5472.....1575    29%
go/importer..................................4980.....1301    26%
go/internal/gccgoimporter....................5587.....1525    27%
go/internal/gcimporter.......................8979.....2186    24%
go/parser...................................20692.....5304    26%
go/printer...................................7015.....2029    29%
go/scanner...................................9719.....2824    29%
go/token.....................................7933.....2465    31%
go/types....................................64569....19978    31%
hash/adler32.................................1176......176    15%
hash/crc32...................................1663......360    22%
hash/crc64...................................1587......306    19%
hash/fnv.....................................3964......260     7%
hash..........................................591......278    47%
html..........................................217.......74    34%
html/template...............................69623....12588    18%
image/color/palette...........................315.......98    31%
image/color..................................5565.....1036    19%
image/draw...................................6917.....1028    15%
image/gif....................................8894.....1654    19%
image/internal/imageutil.....................9112.....1476    16%
image/jpeg...................................6647.....1026    15%
image/png....................................6906.....1069    15%
image.......................................28992.....6139    21%
index/suffixarray...........................17106.....4773    28%
internal/singleflight........................1614......506    31%
internal/testenv............................12212.....3152    26%
internal/trace...............................2762.....1323    48%
io/ioutil...................................13502.....3682    27%
io...........................................6765.....2482    37%
log.........................................11620.....3317    29%
log/syslog..................................13516.....3821    28%
math/big....................................21819.....8320    38%
math/cmplx...................................2816......438    16%
math/rand....................................2317......929    40%
math.........................................7511.....2444    33%
mime/multipart..............................12679.....3360    27%
mime/quotedprintable.........................5458.....1235    23%
mime.........................................6076.....1628    27%
net/http/cgi................................59796....17173    29%
net/http/cookiejar..........................14781.....3739    25%
net/http/fcgi...............................57861....16426    28%
net/http/httptest...........................84100....24365    29%
net/http/httputil...........................67763....18869    28%
net/http/internal............................6907......637     9%
net/http/pprof..............................57945....16316    28%
net/http....................................95391....30210    32%
net/internal/socktest........................4555.....1453    32%
net/mail....................................14481.....3608    25%
net/rpc/jsonrpc.............................33335......988     3%
net/rpc.....................................79950....23106    29%
net/smtp....................................57790....16468    28%
net/textproto...............................11356.....3248    29%
net/url......................................3123.....1009    32%
os/exec.....................................20738.....5769    28%
os/signal.....................................437......167    38%
os..........................................24875.....6668    27%
path/filepath...............................11340.....2826    25%
path..........................................778......285    37%
reflect.....................................15469.....5198    34%
regexp......................................13627.....4661    34%
regexp/syntax................................5539.....2249    41%
runtime/debug................................9275.....2322    25%
runtime/pprof................................1355......477    35%
runtime/race...................................39.......17    44%
runtime/trace.................................228.......92    40%
runtime.....................................13498.....1821    13%
sort.........................................2848......842    30%
strconv......................................2947.....1252    42%
strings......................................7983.....2456    31%
sync/atomic..................................2666.....1149    43%
sync.........................................2568......845    33%
syscall.....................................81252....38398    47%
testing/iotest...............................2444......302    12%
testing/quick...............................18890.....5076    27%
testing.....................................16502.....4800    29%
text/scanner.................................6849.....2052    30%
text/tabwriter...............................6607.....1863    28%
text/template/parse.........................22978.....6183    27%
text/template...............................64153....11518    18%
time........................................12103.....3546    29%
unicode......................................9706.....3320    34%
unicode/utf16................................1055......148    14%
unicode/utf8.................................1118......513    46%
vendor/golang.org/x/net/http2/hpack..........8905.....2636    30%

All packages                              3518505  1017774    29%
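
The new/old column appears to be the new size as a percentage of the old size, rounded to the nearest integer. A small sketch of that calculation, using a hypothetical helper name and sizes copied from the table above:

```go
package main

import (
	"fmt"
	"math"
)

// ratioPercent returns newSize as a percentage of oldSize, rounded to the
// nearest integer. Hypothetical helper; the rounding rule is an assumption
// inferred from the table, not taken from the tool that produced it.
func ratioPercent(oldSize, newSize int) int {
	return int(math.Round(100 * float64(newSize) / float64(oldSize)))
}

func main() {
	rows := []struct {
		pkg      string
		old, new int
	}{
		{"archive/tar", 13875, 3883},
		{"cmd/asm", 306, 26},
		{"All packages", 3518505, 1017774},
	}
	for _, r := range rows {
		fmt.Printf("%-14s %8d %8d %4d%%\n", r.pkg, r.old, r.new, ratioPercent(r.old, r.new))
	}
}
```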

Change-Id: Id657334f276383ff1e6fa91472d3d1db5a03349c
Reviewed-on: https://go-review.googlesource.com/13937
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Chris Manghane <cmang@golang.org>
2015-10-22 21:01:29 +00:00
Keith Randall d96b4c494f cmd/internal/obj: fix PSRLW opcode
The reg-reg version compiled to PSRAW, not PSRLW (arithmetic
instead of logical shift right).
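
To show why the mix-up matters, here is a sketch that models a single 16-bit lane of each instruction in plain Go. The function names are hypothetical; the real instructions operate on every 16-bit lane of a packed register at once.

```go
package main

import "fmt"

// logicalShiftRight16 models one 16-bit lane of PSRLW:
// zeros are shifted in from the left.
func logicalShiftRight16(x uint16, n uint) uint16 {
	return x >> n
}

// arithmeticShiftRight16 models one 16-bit lane of PSRAW:
// the sign bit is replicated into the vacated positions.
func arithmeticShiftRight16(x uint16, n uint) uint16 {
	return uint16(int16(x) >> n)
}

func main() {
	x := uint16(0x8000) // top bit set, so the two shifts disagree
	fmt.Printf("logical:    %#x\n", logicalShiftRight16(x, 4))
	fmt.Printf("arithmetic: %#x\n", arithmeticShiftRight16(x, 4))
}
```

The two results differ only when the top bit of the lane is set, which is why a bug like this can go unnoticed with small test values.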

Fixes #13010.

Change-Id: I69a47bd83c8bbe66c7f8d82442ab45e9bf3b94fb
Reviewed-on: https://go-review.googlesource.com/16168
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2015-10-21 20:06:01 +00:00
Ilya Tocar 7d2c6eb3f5 cmd/internal/obj/x86: align functions with trap instruction
Align functions with 0xCC (INT $3) - breakpoint instruction,
instead of 0x00, which can disassemble into valid instruction.
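
A minimal sketch of the padding idea, not the assembler's actual code: the helper name and the 16-byte alignment are assumptions chosen for illustration.

```go
package main

import "fmt"

// int3 is the one-byte x86 breakpoint instruction (INT $3).
const int3 = 0xCC

// padToAlignment appends fill bytes to code until its length is a multiple
// of align. Hypothetical helper, for illustration only.
func padToAlignment(code []byte, align int, fill byte) []byte {
	for len(code)%align != 0 {
		code = append(code, fill)
	}
	return code
}

func main() {
	fn := []byte{0x90, 0xC3} // NOP; RET
	padded := padToAlignment(fn, 16, int3)
	fmt.Printf("% x\n", padded)
}
```

If execution ever falls into the padding, a 0xCC byte traps immediately, whereas runs of 0x00 decode as ordinary instructions.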

Change-Id: Ieda191886efc4aacb86f58bea1169fd1b3b57636
Reviewed-on: https://go-review.googlesource.com/16102
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Gregory Shimansky <gregory.shimansky@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2015-10-20 15:48:49 +00:00
Michael Hudson-Doyle 97055dc1f1 cmd/compile, cmd/internal/obj: centralize knowledge of size of fixed part of stack
Shared libraries on ppc64le will require a larger minimum stack frame (because
the ABI mandates that the TOC pointer is available at 24(R1)). Part 2a of
preparing for that is to have all bits of arch-independent and ppc64-specific
codegen that need to know call a function to find out.
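
The refactoring idea can be sketched as below. Every name here is hypothetical and the default value is an illustrative assumption, not the compiler's actual table; the only figure derived from the message is that a slot at 24(R1) holding an 8-byte pointer implies a fixed area of at least 32 bytes.

```go
package main

import "fmt"

// fixedFrameSize returns the number of bytes reserved at the bottom of
// every stack frame for the given architecture. Hypothetical helper:
// callers ask this one function instead of hard-coding a constant.
func fixedFrameSize(goarch string, ptrSize int64) int64 {
	switch goarch {
	case "ppc64le":
		// The TOC pointer must be available at 24(R1), so the fixed
		// area has to extend through the end of that slot.
		const tocOffset = 24
		return tocOffset + ptrSize
	default:
		// Assumption for this sketch: a single pointer-sized slot.
		return ptrSize
	}
}

// argOffset computes where the i-th pointer-sized outgoing argument lives,
// using the centralized helper rather than a literal offset.
func argOffset(goarch string, ptrSize, i int64) int64 {
	return fixedFrameSize(goarch, ptrSize) + i*ptrSize
}

func main() {
	fmt.Println(fixedFrameSize("ppc64le", 8))
	fmt.Println(argOffset("ppc64le", 8, 2))
}
```

Changing the minimum frame size then means editing one function, which is the point of centralizing the knowledge.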

Change-Id: I55899f73037e92227813c491049a3bd6f30bd41f
Reviewed-on: https://go-review.googlesource.com/15524
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2015-10-18 22:19:06 +00:00