Commit Graph

606 Commits

Author SHA1 Message Date
Josh Bleecher Snyder 92cffa1391 cmd/internal/obj: remove dead func Copyp
Change-Id: Iaeb7bcbcdbc46c0e0e40b0aa070c706e0ca53013
Reviewed-on: https://go-review.googlesource.com/39555
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-05 15:18:14 +00:00
Josh Bleecher Snyder 26308fb481 cmd/internal/obj: use string instead of LSym in Pcln
In a concurrent backend, Ctxt.Lookup will need some
form of concurrency protection, which will make it
more expensive.

This CL changes the pcln table builder to track
filenames as strings rather than LSyms.
Those strings are then converted into LSyms
at the last moment, for writing the object file.

This CL removes over 85% of the calls to Ctxt.Lookup
in a run of make.bash.

Passes toolstash-check.

Updates #15756

Change-Id: I3c53deff6f16f2643169f3bdfcc7aca2ca58b0a4
Reviewed-on: https://go-review.googlesource.com/39291
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-03 15:19:47 +00:00
Dave Cheney 78f6622b81 cmd/internal/obj/*: rename Rconv to rconv
Each architecture's Rconv function is only used inside its
respective package, so it does not need to be exported.

Change-Id: Ifbd629964d7a9edd66501d7cdf4750621d66d646
Reviewed-on: https://go-review.googlesource.com/39110
Run-TryBot: Dave Cheney <dave@cheney.net>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-04-01 10:41:37 +00:00
Alex Brainman 361af94d5d cmd/internal/obj, cmd/link: remove Hwindowsgui everywhere
Hwindowsgui has the same meaning as Hwindows - build PE
executable. So use Hwindows everywhere.

Change-Id: I2cae5777f17c7bc3a043dfcd014c1620cc35fc20
Reviewed-on: https://go-review.googlesource.com/38761
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-03-30 22:51:43 +00:00
Ben Shi c5ddc558ba cmd/internal/obj/arm: support more ARMv5/ARMv6/ARMv7 instructions
REV/REV16/REVSH were introduced in ARMv6, they offered more efficient
byte reverse operatons.

MMUL/MMULA/MMULS were introduced in ARMv6, they simplified
a serial of mul->shift->add/sub operations into a single instruction.

RBIT was introduced in ARMv7, it inversed a 32-bit word's bit order.

MULS was introduced in ARMv7, it corresponded to MULA.

MULBB/MULABB were introduced in ARMv5TE, they performed 16-bit
multiplication (and accumulation).

Change-Id: I6365b17b3c4eaf382a657c210bb0094b423b11b8
Reviewed-on: https://go-review.googlesource.com/35565
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-03-30 18:19:04 +00:00
Michael Munday 41fd8d6401 cmd/internal/obj: make morestack cutoff the same on all architectures
There is always 128 bytes available below the stackguard. Allow functions
with medium-sized stack frames to use this, potentially allowing them to
avoid growing the stack.

This change makes all architectures use the same calculation as x86.

Change-Id: I2afb1a7c686ae5a933e50903b31ea4106e4cd0a0
Reviewed-on: https://go-review.googlesource.com/38734
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-03-29 20:35:46 +00:00
Dave Cheney 54af18708b cmd/internal/obj/x86: clean up byteswapreg
Make byteswapreg more Go like.

Change-Id: Ibdf3603cae9cad2b3465b4c224a28a4c4c745c2e
Reviewed-on: https://go-review.googlesource.com/38615
Run-TryBot: Dave Cheney <dave@cheney.net>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-29 00:02:06 +00:00
Josh Bleecher Snyder aca58647b7 cmd/internal/obj: eliminate stray ctxt.Cursym write
It is explicitly assigned in each of the
assemblers as needed.
I plan to remove Cursym entirely eventually,
but this is a helpful intermediate step.

Passes toolstash-check -all.

Updates #15756

Change-Id: Id7ddefae2def439af44d03053886ca8cc935731f
Reviewed-on: https://go-review.googlesource.com/38727
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-03-28 17:03:18 +00:00
Josh Bleecher Snyder 7e817859b3 cmd/internal/obj: eliminate Curp
Remove the global obj.Link.Curp.

In asmz.go, replace the only use by passing it as an argument.
In asm0.go and asm9.go, it was written but never read.
In asm5.go and asm7.go, thread it through as an argument.

Passes toolstash-check -all.

Updates #15756

Change-Id: I1a0faa89e768820f35d73a8b37ec8088d78d15f7
Reviewed-on: https://go-review.googlesource.com/38715
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-03-27 18:51:42 +00:00
Josh Bleecher Snyder 1acba7d4fa cmd/internal/obj: remove prasm
Fold the printing of the offending instruction
into the neighboring Diag call, if it is not
already present.

Change-Id: I310f1479e16a4d2a24ff3c2f7e2c60e5e2015c1b
Reviewed-on: https://go-review.googlesource.com/38714
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-03-27 18:51:30 +00:00
Josh Bleecher Snyder 4909ecc462 cmd/internal/obj/x86: change AsmBuf.Lock to bool
Follow-up to CL 38668.

Passes toolstash-check -all.

Change-Id: I78a62509c610b5184b5e7ef2c4aa146fc8038840
Reviewed-on: https://go-review.googlesource.com/38670
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dave Cheney <dave@cheney.net>
2017-03-26 21:29:31 +00:00
Josh Bleecher Snyder cf2b32e719 cmd/internal/obj: eliminate Prog.Mode
Follow-up to CL 38446.

Passes toolstash-check -all.

Change-Id: I04cadc058cbaa5f396136502c574e5a395a33276
Reviewed-on: https://go-review.googlesource.com/38669
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-26 19:48:48 +00:00
Josh Bleecher Snyder 4f122e82fe cmd/internal/obj: move fields from obj.Link to x86.AsmBuf
These fields are used to encode a single instruction.
Add them to AsmBuf, which is also per-instruction,
and which is not global.

Updates #15756

Change-Id: I0e5ea22ffa641b07291e27de6e2ff23b6dc534bd
Reviewed-on: https://go-review.googlesource.com/38668
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-26 19:48:00 +00:00
Josh Bleecher Snyder 166160b446 cmd/internal/obj/x86: make ctxt.Cursym local
Thread it through as an argument instead of using a global.

Passes toolstash-check -all.

Updates #15756

Change-Id: Ia8c6ce09b43dbb2e6c7d889ded8dbaeb5366048d
Reviewed-on: https://go-review.googlesource.com/38667
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-26 19:47:47 +00:00
Josh Bleecher Snyder c06679f7b5 cmd/internal/obj/x86: remove ctxt.Curp references
Empirically, p == ctxt.Curp here.
A scan of (the thousands of lines of) asm6.go
shows no clear opportunity for them to diverge.

Passes toolstash-check -all.

Updates #15756

Change-Id: I9f5ee9585a850fbe24be3b851d8fdc2c966c65ce
Reviewed-on: https://go-review.googlesource.com/38665
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-03-26 14:38:19 +00:00
Josh Bleecher Snyder 2ebe1bdd89 cmd/internal/obj: make x86's asmbuf a local variable
The x86 assembler requires a buffer to build
variable-length instructions.
It used to be an obj.Link field.
That doesn't play nicely with concurrent assembly.
Move the AsmBuf type to the x86 package,
where it belongs anyway,
and make it a local variable.

Passes toolstash-check -all.
No compiler performance impact.

Updates #15756

Change-Id: I8014e52145380bfd378ee374a0c971ee5bada917
Reviewed-on: https://go-review.googlesource.com/38663
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-26 04:12:26 +00:00
Josh Bleecher Snyder 3b251e603d cmd/internal/obj: eagerly initialize x86 assembler
Prior to this CL, instinit was called as needed.
This does not work well in a concurrent backend.
Initialization is very cheap; do it on startup instead.

Passes toolstash-check -all.
No compiler performance impact.

Updates #15756

Change-Id: Ifa5e82e8abf4504435e1b28766f5703a0555f42d
Reviewed-on: https://go-review.googlesource.com/38662
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-03-25 23:20:25 +00:00
Josh Bleecher Snyder 6652572b75 cmd/compile: thread Curfn through to debuginfo
Updates #15756

Change-Id: I860dd45cae9d851c7844654621bbc99efe7c7f03
Reviewed-on: https://go-review.googlesource.com/38591
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-03-24 16:22:58 +00:00
Ben Shi d0ff9ece2b cmd/internal/obj/arm: Fix wrong assembly in the arm assembler
As #19357 reported,

TST has 3 different sub forms
TST $imme, Rx
TST Ry << imme, Rx
TST Ry << Rz, Rx

just like CMP/CMN/TEQ has. But current arm assembler assembles all TST
instructions wrongly. This patch fixes it and adds more tests.

Fixes #19357

Change-Id: Iafedccfaab2cbb2631e7acf259837a782e2e8e2f
Reviewed-on: https://go-review.googlesource.com/37662
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-23 05:05:36 +00:00
Dave Cheney 6f4a4585f2 cmd/internal/obj/mips: standardize on sys.MIPS and sys.MIPS64 constants
CL 38446 introduced the use of the sys.ArchFamily type into the
cmd/internal/obj/mips package and redefined the mips.Mips32 and
mips.Mips64 constants in terms of their sys.ArchFamily counterparts.
This CL removes these local declarations and consolidates on sys.MIPS
and sys.MIPS64 respectively.

Change-Id: Id7aab6c7fd0de42ff43dde605df6bd4c85a3d895
Reviewed-on: https://go-review.googlesource.com/38287
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-03-23 00:37:51 +00:00
Josh Bleecher Snyder cfb3c8df62 cmd/internal/obj: eliminate Link.Asmode
Asmode is always set to p.Mode,
which is always set based on the arch family.
Instead, use the arch family directly.

Passes toolstash-check -all.

Change-Id: Id982472dcc8eeb6dd22cac5ad2f116b54a44caee
Reviewed-on: https://go-review.googlesource.com/38451
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-03-22 17:33:27 +00:00
Josh Bleecher Snyder a470e5d4b8 cmd/internal/obj: eliminate Ctxt.Mode
Replace Ctxt.Mode with a method, Ctxt.RegWidth,
which is calculated directly off the arch info.

I believe that Prog.Mode can also be removed; future CL.

This is a step towards obj.Link immutability.

Passes toolstash-check -all.

Updates #15756

Change-Id: Ifd7f8f6ed0a2fdc032d1dd306fcd695a14aa5bc5
Reviewed-on: https://go-review.googlesource.com/38446
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-03-22 17:27:43 +00:00
Josh Bleecher Snyder 2f2cd557a6 cmd/internal/obj: clean up brloop
Add docs.
Reduce indentation.

Passes toolstash-check -all.

Change-Id: I968d1af25989886ae9945052e05e211a107dde9c
Reviewed-on: https://go-review.googlesource.com/38443
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-03-22 16:11:31 +00:00
Josh Bleecher Snyder 33266df861 cmd/internal/obj: clean up checkaddr
Coalesce identical cases.
Give it a proper doc comment.
Fix comment locations.
Update/delete old comments.

Passes toolstash-check -all.

Change-Id: I88d9cf20e6e04b0c1c6583e92cd96335831f183f
Reviewed-on: https://go-review.googlesource.com/38442
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dave Cheney <dave@cheney.net>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-03-22 15:15:02 +00:00
Josh Bleecher Snyder 3394633db0 cmd/internal/obj: eliminate AMODE
AMODE appears to have been intended to allow
a Prog to switch between 16 (!), 32, or 64 bit x86.
It is unused anywhere in the tree.

Passes toolstash-check -all.

Updates #15756

Change-Id: Ic57b257cfe580f29dad81d97e4193bf3c330c598
Reviewed-on: https://go-review.googlesource.com/38445
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dave Cheney <dave@cheney.net>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-03-22 14:39:07 +00:00
Josh Bleecher Snyder 88d4ab82d5 cmd/internal/obj: eliminate unnecessary ctxt.Cursym assignment
None of the following code uses it.

Passes toolstash-check -all.

Updates #15756

Change-Id: Ieeaaca8ba31e5c345c0c8a758d520b24be88e173
Reviewed-on: https://go-review.googlesource.com/38444
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dave Cheney <dave@cheney.net>
2017-03-22 14:38:59 +00:00
Matthew Dempsky 7bb5b2d33a cmd/internal/obj: remove unneeded Addr.Node and Prog.Opt fields
Change-Id: I218b241c32a5948b66ad0d95ecc368648cf4ddf5
Reviewed-on: https://go-review.googlesource.com/38130
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-03-20 23:49:29 +00:00
Matthew Dempsky cce4c319d6 cmd/internal/obj: remove unneeded AVARFOO ops
Change-Id: I10e36046ebce8a8741ef019cfe266b9ac9fa322d
Reviewed-on: https://go-review.googlesource.com/38088
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-03-20 23:40:09 +00:00
Josh Bleecher Snyder 42a915c933 cmd/internal/obj: convert Debug* Link fields into bools
Change-Id: I9ac274dbfe887675a7820d2f8f87b5887b1c9b0e
Reviewed-on: https://go-review.googlesource.com/38383
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-20 22:08:41 +00:00
Michael Munday df47b82174 cmd/internal/obj/s390x: cleanup objz.go
This CL deletes some unnecessary code in objz.go that existed to
support instruction scheduling. It's likely instruction scheduling
will never be done in this part of the backend so this code can
just be deleted.

This file can probably be cleaned up a bit more, but I think this
is a good start.

Passes: go build -toolexec 'toolstash -cmp' -a std.

Change-Id: I1645632ac551a90a4f4be418045c046b488e9469
Reviewed-on: https://go-review.googlesource.com/38394
Run-TryBot: Michael Munday <munday@ca.ibm.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-03-20 21:58:27 +00:00
Josh Bleecher Snyder a955ece6cd cmd/internal/obj: reduce variable scope
Minor cleanup, to make it clearer
that the two p's are unrelated.

Change-Id: Icb6386c626681f60e5e631b33aa3a0fc84f40e4a
Reviewed-on: https://go-review.googlesource.com/38381
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-20 20:32:52 +00:00
Michael Munday 17570a9afb cmd/compile: emit fused multiply-{add,subtract} on ppc64x
A follow on to CL 36963 adding support for ppc64x.

Performance changes (as posted on the issue):

poly1305:
benchmark               old ns/op new ns/op delta
Benchmark64-16          172       151       -12.21%
Benchmark1K-16          1828      1523      -16.68%
Benchmark64Unaligned-16 172       151       -12.21%
Benchmark1KUnaligned-16 1827      1523      -16.64%

math:
BenchmarkAcos-16        43.9      39.9      -9.11%
BenchmarkAcosh-16       57.0      45.8      -19.65%
BenchmarkAsin-16        35.8      33.0      -7.82%
BenchmarkAsinh-16       68.6      60.8      -11.37%
BenchmarkAtan-16        19.8      16.2      -18.18%
BenchmarkAtanh-16       65.5      57.5      -12.21%
BenchmarkAtan2-16       45.4      34.2      -24.67%
BenchmarkGamma-16       37.6      26.0      -30.85%
BenchmarkLgamma-16      40.0      28.2      -29.50%
BenchmarkLog1p-16       35.1      29.1      -17.09%
BenchmarkSin-16         22.7      18.4      -18.94%
BenchmarkSincos-16      31.7      23.7      -25.24%
BenchmarkSinh-16        146       131       -10.27%
BenchmarkY0-16          130       107       -17.69%
BenchmarkY1-16          127       107       -15.75%
BenchmarkYn-16          278       235       -15.47%

Updates #17895.

Change-Id: I1c16199715d20c9c4bd97c4a950bcfa69eb688c1
Reviewed-on: https://go-review.googlesource.com/38095
Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
2017-03-20 20:01:29 +00:00
Josh Bleecher Snyder 604e4841d6 cmd/internal/obj/ppc64: remove stackbarrier function check
Stack barriers were removed in CL 36620.

Change-Id: If124d65a73a7b344a42be2a4b386a14d7a0a428b
Reviewed-on: https://go-review.googlesource.com/38169
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
Reviewed-by: David Chase <drchase@google.com>
2017-03-17 04:46:58 +00:00
Matthew Dempsky cc71aa9ac4 cmd/compile/internal/ssa: make ARM's udiv like other calls
Passes toolstash-check -all.

Change-Id: Id389f8158cf33a3c0fcef373615b5351e7c74b5b
Reviewed-on: https://go-review.googlesource.com/38082
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-03-13 21:29:02 +00:00
Ilya Tocar 27492a2a54 cmd/internal/obj/x86: remove unused const
Since https://go-review.googlesource.com/24040 we no longer pad functions
in asm6, so funcAlign is unused. Delete it.

Change-Id: Id710e545a76b1797398f2171fe7e0928811fcb31
Reviewed-on: https://go-review.googlesource.com/38134
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-03-13 19:19:09 +00:00
Robert Griesemer 2a5cf48f91 cmd/compile: print columns (not just lines) in error messages
Compiler errors now show the exact line and line byte offset (sometimes
called "column") of where an error occured. For `go tool compile x.go`:

	package p
	const c int = false
	//line foo.go:123
	type t intg

reports

	x.go:2:7: cannot convert false to type int
	foo.go:123[x.go:4:8]: undefined: intg

(Some errors use the "wrong" position for the error message; arguably
the byte offset for the first error should be 15, the position of 'false',
rathen than 7, the position of 'c'. But that is an indepedent issue.)

The byte offset (column) values are measured in bytes; they start at 1,
matching the convention used by editors and IDEs.

Positions modified by //line directives show the line offset only for the
actual source location (in square brackets), not for the "virtual" file and
line number because that code is likely generated and the //line directive
only provides line information.

Because the new format might break existing tools or scripts, printing
of line offsets can be disabled with the new compiler flag -C. We plan
to remove this flag eventually.

Fixes #10324.

Change-Id: I493f5ee6e78457cf7b00025aba6b6e28e50bb740
Reviewed-on: https://go-review.googlesource.com/37970
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-03-09 23:29:49 +00:00
Matthew Dempsky b1a4424a52 cmd/internal/obj: change started to bool
Change-Id: I90143e3c6e95a1495f300ffeb10de554aa41f56a
Reviewed-on: https://go-review.googlesource.com/37889
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-07 19:23:06 +00:00
Matthew Dempsky 68177d9ec0 cmd/internal/obj: move dwarf.Var generation into compiler
Passes toolstash -cmp.

Change-Id: I4bd60f7ebba5457e7b3ece688fee2351bfeeb59a
Reviewed-on: https://go-review.googlesource.com/37874
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alessandro Arzilli <alessandro.arzilli@gmail.com>
Reviewed-by: Than McIntosh <thanm@google.com>
2017-03-07 17:32:54 +00:00
Matthew Dempsky 1874d4a883 cmd/internal/obj, cmd/compile: rip off some toolstash bandaids
Change-Id: I402383e893223facae451adbd640113126d5edd9
Reviewed-on: https://go-review.googlesource.com/37873
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-06 23:29:36 +00:00
Matthew Dempsky 9cb2ee0ff2 cmd/internal/obj: move STEXT-only LSym fields into new FuncInfo struct
Shrinks LSym somewhat for non-STEXT LSyms, which are much more common.

While here, switch to tracking Automs in a slice instead of a linked
list. (Previously, this would have made LSyms larger.)

Passes toolstash-check.

Change-Id: I082e50e1d1f1b544c9e06b6e412a186be6a4a2b5
Reviewed-on: https://go-review.googlesource.com/37872
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-03-06 22:17:23 +00:00
Matthew Dempsky 7a98bdf1c2 cmd/internal/obj: remove AUSEFIELD pseudo-op
Instead, cmd/compile can directly emit R_USEFIELD relocations.

Manually verified rsc.io/tmp/fieldtrack still passes.

Change-Id: Ib1fb5ab902ff0ad17ef6a862a9a5692caf7f87d1
Reviewed-on: https://go-review.googlesource.com/37871
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-03-06 22:16:13 +00:00
Matthew Dempsky 7c84dc79fd cmd/internal/obj, cmd/link: bump magic string to go19ld
golang.org/cl/37231 changed the object file format, but forgot to bump
the version string.

Change-Id: I8351ec8ed55e65479006e7c0df20254d0e31015f
Reviewed-on: https://go-review.googlesource.com/37798
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-03-06 19:13:57 +00:00
Eitan Adler 789c5255a4 all: remove the the duplicate words
Change-Id: I6343c162e27e2e492547c96f1fc504909b1c03c0
Reviewed-on: https://go-review.googlesource.com/37793
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-03-06 04:39:12 +00:00
Josh Bleecher Snyder 57e038615d cmd/internal/src: cache prefixed filenames
CL 37234 introduced string concatenation into some hot code. 
This CL does that work earlier and caches the result.

Updates #19386

Performance impact vs master:

name       old time/op      new time/op      delta
Template        223ms ± 5%       216ms ± 5%   -2.98%  (p=0.001 n=20+20)
Unicode        98.7ms ± 4%      99.0ms ± 4%     ~     (p=0.749 n=20+19)
GoTypes         631ms ± 4%       626ms ± 4%     ~     (p=0.253 n=20+20)
Compiler        2.91s ± 1%       2.87s ± 3%   -1.11%  (p=0.005 n=18+20)
SSA             4.48s ± 2%       4.36s ± 2%   -2.77%  (p=0.000 n=20+20)
Flate           130ms ± 2%       129ms ± 6%     ~     (p=0.428 n=19+20)
GoParser        160ms ± 4%       157ms ± 3%   -1.62%  (p=0.005 n=20+18)
Reflect         395ms ± 2%       394ms ± 4%     ~     (p=0.445 n=20+20)
Tar             120ms ± 5%       118ms ± 6%     ~     (p=0.101 n=19+20)
XML             224ms ± 3%       223ms ± 3%     ~     (p=0.544 n=19+19)

name       old user-ns/op   new user-ns/op   delta
Template   291user-ms ± 5%  265user-ms ± 5%   -9.02%  (p=0.000 n=20+19)
Unicode    140user-ms ± 3%  139user-ms ± 8%     ~     (p=0.904 n=20+20)
GoTypes    844user-ms ± 3%  849user-ms ± 3%     ~     (p=0.251 n=20+18)
Compiler   4.06user-s ± 5%  3.98user-s ± 2%     ~     (p=0.056 n=20+20)
SSA        6.89user-s ± 5%  6.50user-s ± 3%   -5.61%  (p=0.000 n=20+20)
Flate      164user-ms ± 5%  163user-ms ± 4%     ~     (p=0.365 n=20+19)
GoParser   206user-ms ± 6%  204user-ms ± 4%     ~     (p=0.534 n=20+18)
Reflect    501user-ms ± 4%  505user-ms ± 5%     ~     (p=0.383 n=20+20)
Tar        151user-ms ± 3%  152user-ms ± 7%     ~     (p=0.798 n=17+20)
XML        283user-ms ± 7%  280user-ms ± 5%     ~     (p=0.301 n=20+20)

name       old alloc/op     new alloc/op     delta
Template       42.5MB ± 0%      40.2MB ± 0%   -5.59%  (p=0.000 n=20+20)
Unicode        31.7MB ± 0%      31.0MB ± 0%   -2.19%  (p=0.000 n=20+18)
GoTypes         124MB ± 0%       117MB ± 0%   -5.90%  (p=0.000 n=20+20)
Compiler        533MB ± 0%       490MB ± 0%   -8.07%  (p=0.000 n=20+20)
SSA             989MB ± 0%       893MB ± 0%   -9.74%  (p=0.000 n=20+20)
Flate          27.8MB ± 0%      26.1MB ± 0%   -5.92%  (p=0.000 n=20+20)
GoParser       34.3MB ± 0%      32.1MB ± 0%   -6.43%  (p=0.000 n=19+20)
Reflect        84.6MB ± 0%      81.4MB ± 0%   -3.84%  (p=0.000 n=20+20)
Tar            28.8MB ± 0%      27.7MB ± 0%   -3.89%  (p=0.000 n=20+20)
XML            47.2MB ± 0%      44.2MB ± 0%   -6.45%  (p=0.000 n=20+19)

name       old allocs/op    new allocs/op    delta
Template         420k ± 1%        381k ± 1%   -9.35%  (p=0.000 n=20+20)
Unicode          338k ± 1%        324k ± 1%   -4.29%  (p=0.000 n=20+19)
GoTypes         1.28M ± 0%       1.15M ± 0%  -10.30%  (p=0.000 n=20+20)
Compiler        5.06M ± 0%       4.41M ± 0%  -12.92%  (p=0.000 n=20+20)
SSA             9.14M ± 0%       7.91M ± 0%  -13.46%  (p=0.000 n=19+20)
Flate            267k ± 0%        241k ± 1%   -9.53%  (p=0.000 n=20+20)
GoParser         347k ± 1%        312k ± 0%  -10.15%  (p=0.000 n=19+20)
Reflect         1.07M ± 0%       1.00M ± 0%   -6.86%  (p=0.000 n=20+20)
Tar              274k ± 1%        256k ± 1%   -6.73%  (p=0.000 n=20+20)
XML              448k ± 0%        398k ± 0%  -11.17%  (p=0.000 n=20+18)


Performance impact when applied together with CL 37234
atop CL 37234's parent commit (i.e. as if it were
a part of CL 37234), to show that this commit
makes CL 37234 completely performance-neutral:

name       old time/op      new time/op      delta
Template        222ms ±14%       222ms ±14%    ~     (p=1.000 n=14+15)
Unicode         104ms ±18%       106ms ±18%    ~     (p=0.650 n=13+14)
GoTypes         653ms ± 7%       638ms ± 5%    ~     (p=0.145 n=14+12)
Compiler        3.10s ± 1%       3.13s ±10%    ~     (p=1.000 n=2+2)
SSA             4.73s ±11%       4.68s ±11%    ~     (p=0.567 n=15+15)
Flate           136ms ± 4%       133ms ± 7%    ~     (p=0.231 n=12+14)
GoParser        163ms ±11%       169ms ±10%    ~     (p=0.352 n=14+14)
Reflect         415ms ±15%       423ms ±20%    ~     (p=0.715 n=15+14)
Tar             133ms ±17%       130ms ±23%    ~     (p=0.252 n=14+15)
XML             236ms ±16%       235ms ±14%    ~     (p=0.874 n=14+14)

name       old user-ns/op   new user-ns/op   delta
Template   271user-ms ±10%  271user-ms ±10%    ~     (p=0.780 n=14+15)
Unicode    143user-ms ± 5%  146user-ms ±11%    ~     (p=0.432 n=12+14)
GoTypes    864user-ms ± 5%  866user-ms ± 9%    ~     (p=0.905 n=14+13)
Compiler   4.17user-s ± 1%  4.26user-s ± 7%    ~     (p=1.000 n=2+2)
SSA        6.79user-s ± 8%  6.79user-s ± 6%    ~     (p=0.902 n=15+15)
Flate      169user-ms ± 8%  164user-ms ± 5%  -3.13%  (p=0.014 n=14+14)
GoParser   212user-ms ± 7%  217user-ms ±22%    ~     (p=1.000 n=13+15)
Reflect    521user-ms ± 7%  533user-ms ±15%    ~     (p=0.511 n=14+14)
Tar        165user-ms ±17%  161user-ms ±15%    ~     (p=0.345 n=15+15)
XML        294user-ms ±11%  292user-ms ±10%    ~     (p=0.839 n=14+14)

name       old alloc/op     new alloc/op     delta
Template       39.9MB ± 0%      39.9MB ± 0%    ~     (p=0.621 n=15+14)
Unicode        31.0MB ± 0%      31.0MB ± 0%    ~     (p=0.098 n=13+15)
GoTypes         117MB ± 0%       117MB ± 0%    ~     (p=0.775 n=15+15)
Compiler        488MB ± 0%       488MB ± 0%    ~     (p=0.333 n=2+2)
SSA             892MB ± 0%       892MB ± 0%  +0.03%  (p=0.000 n=15+15)
Flate          26.1MB ± 0%      26.1MB ± 0%    ~     (p=0.098 n=15+15)
GoParser       31.8MB ± 0%      31.8MB ± 0%    ~     (p=0.525 n=15+13)
Reflect        81.2MB ± 0%      81.2MB ± 0%  +0.06%  (p=0.001 n=12+14)
Tar            27.5MB ± 0%      27.5MB ± 0%    ~     (p=0.595 n=15+15)
XML            44.1MB ± 0%      44.1MB ± 0%    ~     (p=0.486 n=15+15)

name       old allocs/op    new allocs/op    delta
Template         378k ± 1%        378k ± 0%    ~     (p=0.949 n=15+14)
Unicode          324k ± 0%        324k ± 1%    ~     (p=0.057 n=14+15)
GoTypes         1.15M ± 0%       1.15M ± 0%    ~     (p=0.461 n=15+15)
Compiler        4.39M ± 0%       4.39M ± 0%    ~     (p=0.333 n=2+2)
SSA             7.90M ± 0%       7.90M ± 0%  +0.06%  (p=0.008 n=15+15)
Flate            240k ± 1%        241k ± 0%    ~     (p=0.233 n=15+15)
GoParser         309k ± 1%        309k ± 0%    ~     (p=0.867 n=15+12)
Reflect         1.00M ± 0%       1.00M ± 0%    ~     (p=0.139 n=12+15)
Tar              254k ± 1%        253k ± 1%    ~     (p=0.345 n=15+15)
XML              398k ± 0%        397k ± 1%    ~     (p=0.267 n=15+15)


Change-Id: Ic999a0f456a371c99eebba0f9747263a13836e33
Reviewed-on: https://go-review.googlesource.com/37766
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-04 18:19:06 +00:00
David Lazar 0824ae6dc1 cmd/compile: add flag for debugging PC-value tables
For example, `-d pctab=pctoinline` prints the PC-inline table and
inlining tree for every function.

Change-Id: Ia6b9ce4d83eed0b494318d40ffe06481ec5d58ab
Reviewed-on: https://go-review.googlesource.com/37235
Run-TryBot: David Lazar <lazard@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2017-03-03 21:29:38 +00:00
David Lazar 301149b9e4 cmd/internal/obj: avoid duplicate file name symbols
The meaning of Version=1 was overloaded: it was reserved for file name
symbols (to avoid conflicts with non-file name symbols), but was also
used to mean "give me a fresh version number for this symbol."

With the new inlining tree, the same file name symbol can appear in
multiple entries, but each one would become a distinct symbol with its
own version number.

Now, we avoid duplicating symbols by using Version=0 for file name
symbols and we avoid conflicts with other symbols by prefixing the
symbol name with "gofile..".

Change-Id: I8d0374053b8cdb6a9ca7fb71871b69b4dd369a9c
Reviewed-on: https://go-review.googlesource.com/37234
Run-TryBot: David Lazar <lazard@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
2017-03-03 21:29:36 +00:00
David Lazar 699175a11a cmd/compile,link: generate PC-value tables with inlining information
In order to generate accurate tracebacks, the runtime needs to know the
inlined call stack for a given PC. This creates two tables per function
for this purpose. The first table is the inlining tree (stored in the
function's funcdata), which has a node containing the file, line, and
function name for every inlined call. The second table is a PC-value
table that maps each PC to a node in the inlining tree (or -1 if the PC
is not the result of inlining).

To give the appearance that inlining hasn't happened, the runtime also
needs the original source position information of inlined AST nodes.
Previously the compiler plastered over the line numbers of inlined AST
nodes with the line number of the call. This meant that the PC-line
table mapped each PC to line number of the outermost call in its inlined
call stack, with no way to access the innermost line number.

Now the compiler retains line numbers of inlined AST nodes and writes
the innermost source position information to the PC-line and PC-file
tables. Some tools and tests expect to see outermost line numbers, so we
provide the OutermostLine function for displaying line info.

To keep track of the inlined call stack for an AST node, we extend the
src.PosBase type with an index into a global inlining tree. Every time
the compiler inlines a call, it creates a node in the global inlining
tree for the call, and writes its index to the PosBase of every inlined
AST node. The parent of this node is the inlining tree index of the
call. -1 signifies no parent.

For each function, the compiler creates a local inlining tree and a
PC-value table mapping each PC to an index in the local tree.  These are
written to an object file, which is read by the linker.  The linker
re-encodes these tables compactly by deduplicating function names and
file names.

This change increases the size of binaries by 4-5%. For example, this is
how the go1 benchmark binary is impacted by this change:

section             old bytes   new bytes   delta
.text               3.49M ± 0%  3.49M ± 0%   +0.06%
.rodata             1.12M ± 0%  1.21M ± 0%   +8.21%
.gopclntab          1.50M ± 0%  1.68M ± 0%  +11.89%
.debug_line          338k ± 0%   435k ± 0%  +28.78%
Total               9.21M ± 0%  9.58M ± 0%   +4.01%

Updates #19348.

Change-Id: Ic4f180c3b516018138236b0c35e0218270d957d3
Reviewed-on: https://go-review.googlesource.com/37231
Run-TryBot: David Lazar <lazard@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2017-03-03 21:29:30 +00:00
Philip Hofer a143f5d646 cmd/internal/obj/arm: improve static branch prediction for wrapper prologue
This is a follow-up to CL 36893.

Move the unlikely branch in the wrapper prologue to the end
of the function, where it has minimal impact on the instruction
cache. Static branch prediction is also less likely to choose
a forward branch.

Updates #19042

sort benchmarks:
name                  old time/op  new time/op  delta
SearchWrappers-4      1.44µs ± 0%  1.45µs ± 0%  +1.15%  (p=0.000 n=9+10)
SortString1K-4        1.02ms ± 0%  1.04ms ± 0%  +2.39%  (p=0.000 n=10+10)
SortString1K_Slice-4   960µs ± 0%   989µs ± 0%  +2.95%  (p=0.000 n=9+10)
StableString1K-4       218µs ± 0%   213µs ± 0%  -2.13%  (p=0.000 n=10+10)
SortInt1K-4            541µs ± 0%   543µs ± 0%  +0.30%  (p=0.003 n=9+10)
StableInt1K-4          760µs ± 1%   763µs ± 1%  +0.38%  (p=0.011 n=10+10)
StableInt1K_Slice-4    840µs ± 1%   779µs ± 0%  -7.31%  (p=0.000 n=9+10)
SortInt64K-4          55.2ms ± 0%  55.4ms ± 1%  +0.34%  (p=0.012 n=10+8)
SortInt64K_Slice-4    56.2ms ± 0%  55.6ms ± 1%  -1.16%  (p=0.000 n=10+10)
StableInt64K-4        70.9ms ± 1%  71.0ms ± 0%    ~     (p=0.315 n=10+7)
Sort1e2-4              250µs ± 0%   249µs ± 1%    ~     (p=0.315 n=9+10)
Stable1e2-4            600µs ± 0%   594µs ± 0%  -1.09%  (p=0.000 n=9+10)
Sort1e4-4             51.2ms ± 0%  51.4ms ± 1%  +0.40%  (p=0.001 n=9+10)
Stable1e4-4            204ms ± 1%   199ms ± 1%  -2.27%  (p=0.000 n=10+10)
Sort1e6-4              8.42s ± 0%   8.44s ± 0%  +0.28%  (p=0.000 n=8+9)
Stable1e6-4            43.3s ± 0%   42.5s ± 1%  -1.89%  (p=0.000 n=9+9)

Change-Id: I827559aa557fdba211a38ce3f77137b471c5c67e
Reviewed-on: https://go-review.googlesource.com/37611
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-03-02 05:15:32 +00:00
Keith Randall 1eed80f09a cmd/compile: fix disassembly of invalid instructions
Make sure that if we encode an explicit base register, we print it.
That will ensure that if we make an Addr with an auto variable but
a base that isn't SP, then it will be obvious from the disassembly.

Update #19184

Change-Id: If5556a5183f344d719ec7197aa935a0166061e6f
Reviewed-on: https://go-review.googlesource.com/37255
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-03-01 21:30:49 +00:00
Heschi Kreinick ac7761e1a4 cmd/compile, cmd/asm: remove Link.Plists
Link.Plists never contained more than one Plist, and sometimes none.
Passing around the Plist being worked on is straightforward and makes
the data flow easier to follow.

Change-Id: I79cb30cb2bd3d319fdbb1dfa5d35b27fcb748e5c
Reviewed-on: https://go-review.googlesource.com/37169
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-03-01 00:29:23 +00:00
Matthew Dempsky 9bc67bb4f4 cmd/internal/obj: remove unused Getcallerpc function
Change-Id: I0c7b677657326f318e906e109cbda0cfa78c4973
Reviewed-on: https://go-review.googlesource.com/37537
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
2017-02-28 20:33:03 +00:00
Austin Clements bab191042b cmd/internal/obj, runtime: update funcdata comments
The comments in cmd/internal/obj/funcdata.go are identical to the
comments in runtime/funcdata.h, but the majority of the definitions
they refer to don't apply to Go sources and have been stripped out of
funcdata.go.

Remove these stale comments from funcdata.go and clean up the
references to other copies of the PCDATA and FUNCDATA indexes.

Change-Id: I5d6e49a6e586cc9aecd7c3ce1567679f2a605884
Reviewed-on: https://go-review.googlesource.com/37330
Reviewed-by: Keith Randall <khr@golang.org>
2017-02-27 22:29:28 +00:00
Josh Bleecher Snyder 88f423edac cmd/internal/obj/x86: improve static branch prediction for wrapper prologue
Static branch prediction assumes that forward branches are not taken.
The existing wrapper prologue almost always takes the first forward
branch.
Move the rare case to the end of the function.

This CL is amd64 only. Other architectures will be done in separate CLs.

Updates #19042.

Package sort benchmarks:

SearchWrappers-8       104ns ± 2%   104ns ± 0%  -0.41%  (p=0.006 n=30+41)
SortString1K-8         128µs ± 1%   128µs ± 1%  -0.25%  (p=0.045 n=30+56)
SortString1K_Slice-8   117µs ± 1%   117µs ± 1%    ~     (p=0.855 n=30+59)
StableString1K-8      18.6µs ± 1%  18.6µs ± 1%    ~     (p=0.599 n=29+60)
SortInt1K-8           61.0µs ± 1%  56.5µs ± 1%  -7.36%  (p=0.000 n=29+58)
StableInt1K-8         74.6µs ± 1%  70.4µs ± 3%  -5.54%  (p=0.000 n=28+60)
StableInt1K_Slice-8   59.9µs ± 1%  58.3µs ± 4%  -2.64%  (p=0.000 n=29+60)
SortInt64K-8          6.02ms ± 2%  5.98ms ± 2%  -0.60%  (p=0.000 n=29+59)
SortInt64K_Slice-8    5.07ms ± 2%  5.05ms ± 2%  -0.38%  (p=0.006 n=30+58)
StableInt64K-8        6.41ms ± 1%  6.22ms ± 1%  -3.00%  (p=0.000 n=27+58)
Sort1e2-8             37.4µs ± 1%  37.1µs ± 1%  -0.91%  (p=0.000 n=30+57)
Stable1e2-8           74.8µs ± 1%  75.2µs ± 1%  +0.52%  (p=0.000 n=30+57)
Sort1e4-8             8.11ms ± 1%  8.01ms ± 1%  -1.20%  (p=0.000 n=30+59)
Stable1e4-8           24.3ms ± 1%  24.3ms ± 1%    ~     (p=0.157 n=30+60)
Sort1e6-8              1.25s ± 1%   1.23s ± 1%  -1.43%  (p=0.000 n=29+58)
Stable1e6-8            4.93s ± 1%   4.90s ± 1%  -0.56%  (p=0.000 n=29+59)
[Geo mean]             720µs        709µs       -1.52%

Assembly for sort.(*intPairs).Swap:

Before:

"".(*intPairs).Swap t=1 size=147 args=0x18 locals=0x8
	0x0000 00000 (<autogenerated>:1)	TEXT	"".(*intPairs).Swap(SB), $8-24
	0x0000 00000 (<autogenerated>:1)	MOVQ	(TLS), CX
	0x0009 00009 (<autogenerated>:1)	SUBQ	$8, SP
	0x000d 00013 (<autogenerated>:1)	MOVQ	BP, (SP)
	0x0011 00017 (<autogenerated>:1)	LEAQ	(SP), BP
	0x0015 00021 (<autogenerated>:1)	MOVQ	32(CX), BX
	0x0019 00025 (<autogenerated>:1)	TESTQ	BX, BX
	0x001c 00028 (<autogenerated>:1)	JEQ	43
	0x001e 00030 (<autogenerated>:1)	LEAQ	16(SP), DI
	0x0023 00035 (<autogenerated>:1)	CMPQ	(BX), DI
	0x0026 00038 (<autogenerated>:1)	JNE	43
	0x0028 00040 (<autogenerated>:1)	MOVQ	SP, (BX)
	0x002b 00043 (<autogenerated>:1)	NOP
	0x002b 00043 (<autogenerated>:1)	FUNCDATA	$0, gclocals·e6397a44f8e1b6e77d0f200b4fba5269(SB)
	0x002b 00043 (<autogenerated>:1)	FUNCDATA	$1, gclocals·69c1753bd5f81501d95132d08af04464(SB)
	0x002b 00043 (<autogenerated>:1)	MOVQ	""..this+16(FP), AX
	0x0030 00048 (<autogenerated>:1)	TESTQ	AX, AX
	0x0033 00051 (<autogenerated>:1)	JEQ	$0, 140
	0x0035 00053 (<autogenerated>:1)	MOVQ	(AX), CX
	0x0038 00056 (<autogenerated>:1)	MOVQ	8(AX), AX
	0x003c 00060 (<autogenerated>:1)	MOVQ	"".i+24(FP), DX
	0x0041 00065 (<autogenerated>:1)	CMPQ	DX, AX
	0x0044 00068 (<autogenerated>:1)	JCC	$0, 133
	0x0046 00070 (<autogenerated>:1)	SHLQ	$4, DX
	0x004a 00074 (<autogenerated>:1)	MOVQ	8(CX)(DX*1), BX
	0x004f 00079 (<autogenerated>:1)	MOVQ	(CX)(DX*1), SI
	0x0053 00083 (<autogenerated>:1)	MOVQ	"".j+32(FP), DI
	0x0058 00088 (<autogenerated>:1)	CMPQ	DI, AX
	0x005b 00091 (<autogenerated>:1)	JCC	$0, 133
	0x005d 00093 (<autogenerated>:1)	SHLQ	$4, DI
	0x0061 00097 (<autogenerated>:1)	MOVQ	8(CX)(DI*1), AX
	0x0066 00102 (<autogenerated>:1)	MOVQ	(CX)(DI*1), R8
	0x006a 00106 (<autogenerated>:1)	MOVQ	R8, (CX)(DX*1)
	0x006e 00110 (<autogenerated>:1)	MOVQ	AX, 8(CX)(DX*1)
	0x0073 00115 (<autogenerated>:1)	MOVQ	SI, (CX)(DI*1)
	0x0077 00119 (<autogenerated>:1)	MOVQ	BX, 8(CX)(DI*1)
	0x007c 00124 (<autogenerated>:1)	MOVQ	(SP), BP
	0x0080 00128 (<autogenerated>:1)	ADDQ	$8, SP
	0x0084 00132 (<autogenerated>:1)	RET
	0x0085 00133 (<autogenerated>:1)	PCDATA	$0, $1
	0x0085 00133 (<autogenerated>:1)	CALL	runtime.panicindex(SB)
	0x008a 00138 (<autogenerated>:1)	UNDEF
	0x008c 00140 (<autogenerated>:1)	PCDATA	$0, $1
	0x008c 00140 (<autogenerated>:1)	CALL	runtime.panicwrap(SB)
	0x0091 00145 (<autogenerated>:1)	UNDEF

After:

"".(*intPairs).Swap t=1 size=149 args=0x18 locals=0x8
	0x0000 00000 (<autogenerated>:1)	TEXT	"".(*intPairs).Swap(SB), $8-24
	0x0000 00000 (<autogenerated>:1)	MOVQ	(TLS), CX
	0x0009 00009 (<autogenerated>:1)	SUBQ	$8, SP
	0x000d 00013 (<autogenerated>:1)	MOVQ	BP, (SP)
	0x0011 00017 (<autogenerated>:1)	LEAQ	(SP), BP
	0x0015 00021 (<autogenerated>:1)	MOVQ	32(CX), BX
	0x0019 00025 (<autogenerated>:1)	TESTQ	BX, BX
	0x001c 00028 (<autogenerated>:1)	JNE	134
	0x001e 00030 (<autogenerated>:1)	NOP
	0x001e 00030 (<autogenerated>:1)	FUNCDATA	$0, gclocals·e6397a44f8e1b6e77d0f200b4fba5269(SB)
	0x001e 00030 (<autogenerated>:1)	FUNCDATA	$1, gclocals·69c1753bd5f81501d95132d08af04464(SB)
	0x001e 00030 (<autogenerated>:1)	MOVQ	""..this+16(FP), AX
	0x0023 00035 (<autogenerated>:1)	TESTQ	AX, AX
	0x0026 00038 (<autogenerated>:1)	JEQ	$0, 127
	0x0028 00040 (<autogenerated>:1)	MOVQ	(AX), CX
	0x002b 00043 (<autogenerated>:1)	MOVQ	8(AX), AX
	0x002f 00047 (<autogenerated>:1)	MOVQ	"".i+24(FP), DX
	0x0034 00052 (<autogenerated>:1)	CMPQ	DX, AX
	0x0037 00055 (<autogenerated>:1)	JCC	$0, 120
	0x0039 00057 (<autogenerated>:1)	SHLQ	$4, DX
	0x003d 00061 (<autogenerated>:1)	MOVQ	8(CX)(DX*1), BX
	0x0042 00066 (<autogenerated>:1)	MOVQ	(CX)(DX*1), SI
	0x0046 00070 (<autogenerated>:1)	MOVQ	"".j+32(FP), DI
	0x004b 00075 (<autogenerated>:1)	CMPQ	DI, AX
	0x004e 00078 (<autogenerated>:1)	JCC	$0, 120
	0x0050 00080 (<autogenerated>:1)	SHLQ	$4, DI
	0x0054 00084 (<autogenerated>:1)	MOVQ	8(CX)(DI*1), AX
	0x0059 00089 (<autogenerated>:1)	MOVQ	(CX)(DI*1), R8
	0x005d 00093 (<autogenerated>:1)	MOVQ	R8, (CX)(DX*1)
	0x0061 00097 (<autogenerated>:1)	MOVQ	AX, 8(CX)(DX*1)
	0x0066 00102 (<autogenerated>:1)	MOVQ	SI, (CX)(DI*1)
	0x006a 00106 (<autogenerated>:1)	MOVQ	BX, 8(CX)(DI*1)
	0x006f 00111 (<autogenerated>:1)	MOVQ	(SP), BP
	0x0073 00115 (<autogenerated>:1)	ADDQ	$8, SP
	0x0077 00119 (<autogenerated>:1)	RET
	0x0078 00120 (<autogenerated>:1)	PCDATA	$0, $1
	0x0078 00120 (<autogenerated>:1)	CALL	runtime.panicindex(SB)
	0x007d 00125 (<autogenerated>:1)	UNDEF
	0x007f 00127 (<autogenerated>:1)	PCDATA	$0, $1
	0x007f 00127 (<autogenerated>:1)	CALL	runtime.panicwrap(SB)
	0x0084 00132 (<autogenerated>:1)	UNDEF
	0x0086 00134 (<autogenerated>:1)	LEAQ	16(SP), DI
	0x008b 00139 (<autogenerated>:1)	CMPQ	(BX), DI
	0x008e 00142 (<autogenerated>:1)	JNE	30
	0x0090 00144 (<autogenerated>:1)	MOVQ	SP, (BX)
	0x0093 00147 (<autogenerated>:1)	JMP	30

Change-Id: Ie8c37f384bba10fbacaa754bb0a6b0a7e520ef01
Reviewed-on: https://go-review.googlesource.com/36893
Reviewed-by: Keith Randall <khr@golang.org>
2017-02-27 20:47:37 +00:00
Carlos Eduardo Seo c12cd31a33 cmd/internal/obj/ppc64: Fix RLDIMI
Fix the encoding of the SH field for rldimi.

The SH field of rldimi is 6-bit wide and it is not contiguous in the instruction.
Bits 0-4 are placed in bit fields 16-20 in the instruction, while bit 5 is
placed in bit field 30. The current implementation does not consider this and,
therefore, any SH field between 32 and 63 are encoded wrongly in the instruciton.

Change-Id: I4d25a0a70f4219569be0e18160dea5505bd7fff0
Reviewed-on: https://go-review.googlesource.com/37350
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
2017-02-22 20:29:48 +00:00
Matthew Dempsky 02de5ed748 cmd/internal/obj: add AddrName type and cleanup AddrType values
Passes toolstash -cmp.

Change-Id: Ida3eda9bd9d79a34c1c3f18cb41aea9392698076
Reviewed-on: https://go-review.googlesource.com/36950
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-02-13 21:56:17 +00:00
Josh Bleecher Snyder 5030bfdf81 cmd/internal/obj/x86: add comments to wrapper prologue insertion
Make the comments a bit clearer and more accurate,
in anticipation of updating the code.

Change-Id: I1111e6c3405a8688fcd29b809a48a762ff41edaa
Reviewed-on: https://go-review.googlesource.com/36833
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-02-11 23:38:25 +00:00
Josh Bleecher Snyder 2c91bb4c8a cmd/compile: make panicwrap argument-free
When code defines a method on T,
the compiler generates a corresponding wrapper method on *T.
The first thing the wrapper does is check whether
the pointer is nil and if so, call panicwrap.
This is done to provide a useful error message.

The existing implementation gets its information
from arguments set up by the compiler.
However, with some trouble, this information can
be extracted from the name of the wrapper method itself.

Removing the arguments to panicwrap simplifies and
shrinks the wrapper method.
It also means that the call to panicwrap does not
require any stack space.
This enables a further optimization on amd64/x86,
which is to skip the function prologue if nothing
else in the method requires stack space.
This is frequently the case in simple, hot methods,
such as Less and Swap in sort.Interface implementations.

Fixes #19040.

Benchmarks for package sort on amd64:

name                  old time/op  new time/op  delta
SearchWrappers-8       104ns ± 1%   104ns ± 1%    ~     (p=0.286 n=27+27)
SortString1K-8         128µs ± 1%   128µs ± 1%  -0.44%  (p=0.004 n=30+30)
SortString1K_Slice-8   118µs ± 2%   117µs ± 1%    ~     (p=0.106 n=30+30)
StableString1K-8      18.6µs ± 1%  18.6µs ± 1%    ~     (p=0.446 n=28+26)
SortInt1K-8           65.9µs ± 1%  60.7µs ± 1%  -7.96%  (p=0.000 n=28+30)
StableInt1K-8         75.3µs ± 2%  72.8µs ± 1%  -3.41%  (p=0.000 n=30+30)
StableInt1K_Slice-8   57.7µs ± 1%  57.7µs ± 1%    ~     (p=0.515 n=30+30)
SortInt64K-8          6.28ms ± 1%  6.01ms ± 1%  -4.19%  (p=0.000 n=28+28)
SortInt64K_Slice-8    5.04ms ± 1%  5.04ms ± 1%    ~     (p=0.927 n=28+27)
StableInt64K-8        6.65ms ± 1%  6.38ms ± 1%  -3.97%  (p=0.000 n=26+30)
Sort1e2-8             37.9µs ± 1%  37.2µs ± 1%  -1.89%  (p=0.000 n=29+27)
Stable1e2-8           77.0µs ± 1%  74.7µs ± 1%  -3.06%  (p=0.000 n=27+30)
Sort1e4-8             8.21ms ± 2%  7.98ms ± 1%  -2.77%  (p=0.000 n=29+30)
Stable1e4-8           24.8ms ± 1%  24.3ms ± 1%  -2.31%  (p=0.000 n=28+30)
Sort1e6-8              1.27s ± 4%   1.22s ± 1%  -3.42%  (p=0.000 n=30+29)
Stable1e6-8            5.06s ± 1%   4.92s ± 1%  -2.77%  (p=0.000 n=25+29)
[Geo mean]             731µs        714µs       -2.29%

Before/after assembly for sort.(*intPairs).Less follows.
It can be optimized further, but that's for a follow-up CL.

Before:

"".(*intPairs).Less t=1 size=214 args=0x20 locals=0x38
	0x0000 00000 (<autogenerated>:1)	TEXT	"".(*intPairs).Less(SB), $56-32
	0x0000 00000 (<autogenerated>:1)	MOVQ	(TLS), CX
	0x0009 00009 (<autogenerated>:1)	CMPQ	SP, 16(CX)
	0x000d 00013 (<autogenerated>:1)	JLS	204
	0x0013 00019 (<autogenerated>:1)	SUBQ	$56, SP
	0x0017 00023 (<autogenerated>:1)	MOVQ	BP, 48(SP)
	0x001c 00028 (<autogenerated>:1)	LEAQ	48(SP), BP
	0x0021 00033 (<autogenerated>:1)	MOVQ	32(CX), BX
	0x0025 00037 (<autogenerated>:1)	TESTQ	BX, BX
	0x0028 00040 (<autogenerated>:1)	JEQ	55
	0x002a 00042 (<autogenerated>:1)	LEAQ	64(SP), DI
	0x002f 00047 (<autogenerated>:1)	CMPQ	(BX), DI
	0x0032 00050 (<autogenerated>:1)	JNE	55
	0x0034 00052 (<autogenerated>:1)	MOVQ	SP, (BX)
	0x0037 00055 (<autogenerated>:1)	NOP
	0x0037 00055 (<autogenerated>:1)	FUNCDATA	$0, gclocals·4032f753396f2012ad1784f398b170f4(SB)
	0x0037 00055 (<autogenerated>:1)	FUNCDATA	$1, gclocals·69c1753bd5f81501d95132d08af04464(SB)
	0x0037 00055 (<autogenerated>:1)	MOVQ	""..this+64(FP), AX
	0x003c 00060 (<autogenerated>:1)	TESTQ	AX, AX
	0x003f 00063 (<autogenerated>:1)	JEQ	$0, 135
	0x0041 00065 (<autogenerated>:1)	MOVQ	(AX), CX
	0x0044 00068 (<autogenerated>:1)	MOVQ	8(AX), AX
	0x0048 00072 (<autogenerated>:1)	MOVQ	"".i+72(FP), DX
	0x004d 00077 (<autogenerated>:1)	CMPQ	DX, AX
	0x0050 00080 (<autogenerated>:1)	JCC	$0, 128
	0x0052 00082 (<autogenerated>:1)	SHLQ	$4, DX
	0x0056 00086 (<autogenerated>:1)	MOVQ	(CX)(DX*1), DX
	0x005a 00090 (<autogenerated>:1)	MOVQ	"".j+80(FP), BX
	0x005f 00095 (<autogenerated>:1)	CMPQ	BX, AX
	0x0062 00098 (<autogenerated>:1)	JCC	$0, 128
	0x0064 00100 (<autogenerated>:1)	SHLQ	$4, BX
	0x0068 00104 (<autogenerated>:1)	MOVQ	(CX)(BX*1), AX
	0x006c 00108 (<autogenerated>:1)	CMPQ	DX, AX
	0x006f 00111 (<autogenerated>:1)	SETLT	AL
	0x0072 00114 (<autogenerated>:1)	MOVB	AL, "".~r2+88(FP)
	0x0076 00118 (<autogenerated>:1)	MOVQ	48(SP), BP
	0x007b 00123 (<autogenerated>:1)	ADDQ	$56, SP
	0x007f 00127 (<autogenerated>:1)	RET
	0x0080 00128 (<autogenerated>:1)	PCDATA	$0, $1
	0x0080 00128 (<autogenerated>:1)	CALL	runtime.panicindex(SB)
	0x0085 00133 (<autogenerated>:1)	UNDEF
	0x0087 00135 (<autogenerated>:1)	LEAQ	go.string."sort_test"(SB), AX
	0x008e 00142 (<autogenerated>:1)	MOVQ	AX, (SP)
	0x0092 00146 (<autogenerated>:1)	MOVQ	$9, 8(SP)
	0x009b 00155 (<autogenerated>:1)	LEAQ	go.string."intPairs"(SB), AX
	0x00a2 00162 (<autogenerated>:1)	MOVQ	AX, 16(SP)
	0x00a7 00167 (<autogenerated>:1)	MOVQ	$8, 24(SP)
	0x00b0 00176 (<autogenerated>:1)	LEAQ	go.string."Less"(SB), AX
	0x00b7 00183 (<autogenerated>:1)	MOVQ	AX, 32(SP)
	0x00bc 00188 (<autogenerated>:1)	MOVQ	$4, 40(SP)
	0x00c5 00197 (<autogenerated>:1)	PCDATA	$0, $1
	0x00c5 00197 (<autogenerated>:1)	CALL	runtime.panicwrap(SB)
	0x00ca 00202 (<autogenerated>:1)	UNDEF
	0x00cc 00204 (<autogenerated>:1)	NOP
	0x00cc 00204 (<autogenerated>:1)	PCDATA	$0, $-1
	0x00cc 00204 (<autogenerated>:1)	CALL	runtime.morestack_noctxt(SB)
	0x00d1 00209 (<autogenerated>:1)	JMP	0

After:

"".(*intPairs).Swap t=1 size=147 args=0x18 locals=0x8
	0x0000 00000 (<autogenerated>:1)	TEXT	"".(*intPairs).Swap(SB), $8-24
	0x0000 00000 (<autogenerated>:1)	MOVQ	(TLS), CX
	0x0009 00009 (<autogenerated>:1)	SUBQ	$8, SP
	0x000d 00013 (<autogenerated>:1)	MOVQ	BP, (SP)
	0x0011 00017 (<autogenerated>:1)	LEAQ	(SP), BP
	0x0015 00021 (<autogenerated>:1)	MOVQ	32(CX), BX
	0x0019 00025 (<autogenerated>:1)	TESTQ	BX, BX
	0x001c 00028 (<autogenerated>:1)	JEQ	43
	0x001e 00030 (<autogenerated>:1)	LEAQ	16(SP), DI
	0x0023 00035 (<autogenerated>:1)	CMPQ	(BX), DI
	0x0026 00038 (<autogenerated>:1)	JNE	43
	0x0028 00040 (<autogenerated>:1)	MOVQ	SP, (BX)
	0x002b 00043 (<autogenerated>:1)	NOP
	0x002b 00043 (<autogenerated>:1)	FUNCDATA	$0, gclocals·e6397a44f8e1b6e77d0f200b4fba5269(SB)
	0x002b 00043 (<autogenerated>:1)	FUNCDATA	$1, gclocals·69c1753bd5f81501d95132d08af04464(SB)
	0x002b 00043 (<autogenerated>:1)	MOVQ	""..this+16(FP), AX
	0x0030 00048 (<autogenerated>:1)	TESTQ	AX, AX
	0x0033 00051 (<autogenerated>:1)	JEQ	$0, 140
	0x0035 00053 (<autogenerated>:1)	MOVQ	(AX), CX
	0x0038 00056 (<autogenerated>:1)	MOVQ	8(AX), AX
	0x003c 00060 (<autogenerated>:1)	MOVQ	"".i+24(FP), DX
	0x0041 00065 (<autogenerated>:1)	CMPQ	DX, AX
	0x0044 00068 (<autogenerated>:1)	JCC	$0, 133
	0x0046 00070 (<autogenerated>:1)	SHLQ	$4, DX
	0x004a 00074 (<autogenerated>:1)	MOVQ	8(CX)(DX*1), BX
	0x004f 00079 (<autogenerated>:1)	MOVQ	(CX)(DX*1), SI
	0x0053 00083 (<autogenerated>:1)	MOVQ	"".j+32(FP), DI
	0x0058 00088 (<autogenerated>:1)	CMPQ	DI, AX
	0x005b 00091 (<autogenerated>:1)	JCC	$0, 133
	0x005d 00093 (<autogenerated>:1)	SHLQ	$4, DI
	0x0061 00097 (<autogenerated>:1)	MOVQ	8(CX)(DI*1), AX
	0x0066 00102 (<autogenerated>:1)	MOVQ	(CX)(DI*1), R8
	0x006a 00106 (<autogenerated>:1)	MOVQ	R8, (CX)(DX*1)
	0x006e 00110 (<autogenerated>:1)	MOVQ	AX, 8(CX)(DX*1)
	0x0073 00115 (<autogenerated>:1)	MOVQ	SI, (CX)(DI*1)
	0x0077 00119 (<autogenerated>:1)	MOVQ	BX, 8(CX)(DI*1)
	0x007c 00124 (<autogenerated>:1)	MOVQ	(SP), BP
	0x0080 00128 (<autogenerated>:1)	ADDQ	$8, SP
	0x0084 00132 (<autogenerated>:1)	RET
	0x0085 00133 (<autogenerated>:1)	PCDATA	$0, $1
	0x0085 00133 (<autogenerated>:1)	CALL	runtime.panicindex(SB)
	0x008a 00138 (<autogenerated>:1)	UNDEF
	0x008c 00140 (<autogenerated>:1)	PCDATA	$0, $1
	0x008c 00140 (<autogenerated>:1)	CALL	runtime.panicwrap(SB)
	0x0091 00145 (<autogenerated>:1)	UNDEF

Change-Id: I15bb8435f0690badb868799f313ed8817335efd3
Reviewed-on: https://go-review.googlesource.com/36809
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-02-11 23:27:35 +00:00
Michael Munday a524616860 cmd/{asm,internal/obj/s390x}, math: remove emulated float instructions
The s390x port was based on the ppc64 port and, because of the way the
port was done, inherited some instructions from it. ppc64 supports
3-operand (4-operand for FMADD etc.) floating point instructions
but s390x doesn't (the destination register is always an input) and
so these were emulated.

There is a bug in the emulation of FMADD whereby if the destination
register is also a source for the multiplication it will be
clobbered. This doesn't break any assembly code in the std lib but
could affect future work.

To fix this I have gone through the floating point instructions and
removed all unnecessary 3-/4-operand emulation. The compiler doesn't
need it and assembly writers don't need it, it's just a source of
bugs.

I've also deleted the FNMADD family of emulated instructions. They
aren't used anywhere.

Change-Id: Ic07cedcf141a6a3b43a0c84895460f6cfbf56c04
Reviewed-on: https://go-review.googlesource.com/33350
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-02-10 16:11:25 +00:00
Carlos Eduardo Seo 85ecc51c48 cmd/asm, cmd/internal/obj/ppc64: Add ISA 2.05, 2.06 and 2.07 instructions.
This change adds instructions from ISA 2.05, 2.06 and 2.07 that are frequently
used in assembly optimizations for ppc64.

It also fixes two problems:

  * the implementation of RLDICR[CC]/RLDICL[CC] did not consider all possible
  cases for the bit mask.
  * removed two non-existing instructions that were added by mistake in the VMX
  implementation (VORL/VANDL).

Change-Id: Iaef4e5c6a5240c2156c6c0f28ad3bcd8780e9830
Reviewed-on: https://go-review.googlesource.com/36230
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
2017-02-09 15:27:12 +00:00
Matthew Dempsky 7bad00366b cmd/internal/obj: remove ATYPE
In cmd/compile, we can directly construct obj.Auto to represent local
variables and attach them to the function's obj.LSym.

In preparation for being able to emit more precise DWARF info based on
other compiler available information (e.g., lexical scoping).

Change-Id: I9c4225ec59306bec42552838493022e0e9d70228
Reviewed-on: https://go-review.googlesource.com/36420
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-02-07 22:38:18 +00:00
Matthew Dempsky 1a7582f5e9 cmd/internal/dwarf: use []*Var instead of linked lists
Passes toolstash -cmp.

Change-Id: I202b29495ca1aaf3c52879fa99fdc0a4b86703af
Reviewed-on: https://go-review.googlesource.com/36419
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-02-07 20:17:24 +00:00
Josh Bleecher Snyder 46085c4b36 cmd/compile: cmd/internal/obj: cull dead code
This code is dead as a result of

* removing the Follow pass
* moving rotation detection from walk to ssa

Change-Id: I14599c85bedb4e3148347b547e724187920182c4
Reviewed-on: https://go-review.googlesource.com/36484
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-02-07 17:47:19 +00:00
Cherry Zhang bed8129ee6 cmd/internal/obj: remove Follow pass
The Follow pass in the assembler backend reorders and copies
instructions. This even applies to hand-written assembly code,
which in many cases don't want to be reordered. Now that the
SSA compiler does a good job for laying out instructions, the
benefit of this pass is very little:

AMD64: (old = with Follow, new = without Follow)
name                      old time/op    new time/op    delta
BinaryTree17-12              2.78s ± 1%     2.79s ± 1%  +0.44%  (p=0.000 n=20+19)
Fannkuch11-12                3.11s ± 0%     3.31s ± 1%  +6.16%  (p=0.000 n=19+19)
FmtFprintfEmpty-12          50.9ns ± 1%    51.6ns ± 3%  +1.40%  (p=0.000 n=17+20)
FmtFprintfString-12          127ns ± 0%     128ns ± 1%  +0.88%  (p=0.000 n=17+17)
FmtFprintfInt-12             122ns ± 0%     123ns ± 1%  +0.76%  (p=0.000 n=20+19)
FmtFprintfIntInt-12          185ns ± 1%     186ns ± 1%  +0.65%  (p=0.000 n=20+19)
FmtFprintfPrefixedInt-12     192ns ± 1%     202ns ± 1%  +4.99%  (p=0.000 n=20+19)
FmtFprintfFloat-12           284ns ± 0%     288ns ± 0%  +1.33%  (p=0.000 n=15+19)
FmtManyArgs-12               807ns ± 0%     804ns ± 0%  -0.44%  (p=0.000 n=16+18)
GobDecode-12                7.23ms ± 1%    7.21ms ± 1%    ~     (p=0.052 n=20+20)
GobEncode-12                6.09ms ± 1%    6.12ms ± 1%  +0.41%  (p=0.002 n=19+19)
Gzip-12                      253ms ± 1%     255ms ± 1%  +0.95%  (p=0.000 n=18+20)
Gunzip-12                   38.4ms ± 0%    38.5ms ± 0%  +0.34%  (p=0.000 n=17+17)
HTTPClientServer-12         95.4µs ± 2%    96.1µs ± 1%  +0.78%  (p=0.002 n=19+19)
JSONEncode-12               16.5ms ± 1%    16.6ms ± 1%  +1.17%  (p=0.000 n=19+19)
JSONDecode-12               54.6ms ± 1%    55.3ms ± 1%  +1.23%  (p=0.000 n=18+18)
Mandelbrot200-12            4.47ms ± 0%    4.47ms ± 0%  +0.06%  (p=0.000 n=18+18)
GoParse-12                  3.47ms ± 1%    3.47ms ± 1%    ~     (p=0.583 n=20+20)
RegexpMatchEasy0_32-12      84.8ns ± 1%    85.2ns ± 2%  +0.51%  (p=0.022 n=20+20)
RegexpMatchEasy0_1K-12       206ns ± 1%     206ns ± 1%    ~     (p=0.770 n=20+20)
RegexpMatchEasy1_32-12      82.8ns ± 1%    83.4ns ± 1%  +0.64%  (p=0.000 n=20+19)
RegexpMatchEasy1_1K-12       363ns ± 1%     361ns ± 1%  -0.48%  (p=0.007 n=20+20)
RegexpMatchMedium_32-12      126ns ± 1%     126ns ± 0%  +0.72%  (p=0.000 n=20+20)
RegexpMatchMedium_1K-12     39.1µs ± 1%    39.8µs ± 0%  +1.73%  (p=0.000 n=19+19)
RegexpMatchHard_32-12       1.97µs ± 0%    1.98µs ± 1%  +0.29%  (p=0.005 n=18+20)
RegexpMatchHard_1K-12       59.5µs ± 1%    59.8µs ± 1%  +0.36%  (p=0.000 n=18+20)
Revcomp-12                   442ms ± 1%     445ms ± 2%  +0.67%  (p=0.000 n=19+20)
Template-12                 58.0ms ± 1%    57.5ms ± 1%  -0.85%  (p=0.000 n=19+19)
TimeParse-12                 311ns ± 0%     314ns ± 0%  +0.94%  (p=0.000 n=20+18)
TimeFormat-12                350ns ± 3%     346ns ± 0%    ~     (p=0.076 n=20+19)
[Geo mean]                  55.9µs         56.4µs       +0.80%

ARM32:
name                     old time/op    new time/op    delta
BinaryTree17-4              30.4s ± 0%     30.1s ± 0%  -1.14%  (p=0.000 n=10+8)
Fannkuch11-4                13.7s ± 0%     13.6s ± 0%  -0.75%  (p=0.000 n=10+10)
FmtFprintfEmpty-4           664ns ± 1%     651ns ± 1%  -1.96%  (p=0.000 n=7+8)
FmtFprintfString-4         1.83µs ± 2%    1.77µs ± 2%  -3.21%  (p=0.000 n=10+10)
FmtFprintfInt-4            1.57µs ± 2%    1.54µs ± 2%  -2.25%  (p=0.007 n=10+10)
FmtFprintfIntInt-4         2.37µs ± 2%    2.31µs ± 1%  -2.68%  (p=0.000 n=10+10)
FmtFprintfPrefixedInt-4    2.14µs ± 2%    2.10µs ± 1%  -1.83%  (p=0.006 n=10+10)
FmtFprintfFloat-4          3.69µs ± 2%    3.74µs ± 1%  +1.60%  (p=0.000 n=10+10)
FmtManyArgs-4              9.43µs ± 1%    9.17µs ± 1%  -2.70%  (p=0.000 n=10+10)
GobDecode-4                76.3ms ± 1%    75.5ms ± 1%  -1.14%  (p=0.003 n=10+10)
GobEncode-4                70.7ms ± 2%    69.0ms ± 1%  -2.36%  (p=0.000 n=10+10)
Gzip-4                      2.64s ± 1%     2.65s ± 0%  +0.59%  (p=0.002 n=10+10)
Gunzip-4                    402ms ± 0%     398ms ± 0%  -1.11%  (p=0.000 n=10+9)
HTTPClientServer-4          458µs ± 0%     457µs ± 0%    ~     (p=0.247 n=10+10)
JSONEncode-4                171ms ± 0%     172ms ± 0%  +0.56%  (p=0.000 n=10+10)
JSONDecode-4                672ms ± 1%     668ms ± 1%    ~     (p=0.105 n=10+10)
Mandelbrot200-4            33.5ms ± 0%    33.5ms ± 0%    ~     (p=0.156 n=9+10)
GoParse-4                  33.9ms ± 0%    34.0ms ± 0%  +0.36%  (p=0.031 n=9+9)
RegexpMatchEasy0_32-4       823ns ± 1%     835ns ± 1%  +1.49%  (p=0.000 n=8+8)
RegexpMatchEasy0_1K-4      3.99µs ± 0%    4.02µs ± 1%  +0.92%  (p=0.000 n=8+10)
RegexpMatchEasy1_32-4       877ns ± 3%     904ns ± 2%  +3.07%  (p=0.012 n=10+10)
RegexpMatchEasy1_1K-4      5.99µs ± 0%    5.97µs ± 1%  -0.38%  (p=0.023 n=8+8)
RegexpMatchMedium_32-4     1.40µs ± 2%    1.40µs ± 2%    ~     (p=0.590 n=10+9)
RegexpMatchMedium_1K-4      357µs ± 0%     355µs ± 1%  -0.72%  (p=0.000 n=7+8)
RegexpMatchHard_32-4       22.3µs ± 0%    22.1µs ± 0%  -0.49%  (p=0.000 n=8+7)
RegexpMatchHard_1K-4        661µs ± 0%     658µs ± 0%  -0.42%  (p=0.000 n=8+7)
Revcomp-4                  46.3ms ± 0%    46.3ms ± 0%    ~     (p=0.393 n=10+10)
Template-4                  753ms ± 1%     750ms ± 0%    ~     (p=0.211 n=10+9)
TimeParse-4                4.28µs ± 1%    4.22µs ± 1%  -1.34%  (p=0.000 n=8+10)
TimeFormat-4               9.00µs ± 0%    9.05µs ± 0%  +0.59%  (p=0.000 n=10+10)
[Geo mean]                  538µs          535µs       -0.55%

ARM64:
name                     old time/op    new time/op    delta
BinaryTree17-8              8.39s ± 0%     8.39s ± 0%    ~     (p=0.684 n=10+10)
Fannkuch11-8                5.95s ± 0%     5.99s ± 0%  +0.63%  (p=0.000 n=10+10)
FmtFprintfEmpty-8           116ns ± 0%     116ns ± 0%    ~     (all equal)
FmtFprintfString-8          361ns ± 0%     360ns ± 0%  -0.31%  (p=0.003 n=8+6)
FmtFprintfInt-8             290ns ± 0%     290ns ± 0%    ~     (p=0.620 n=9+9)
FmtFprintfIntInt-8          476ns ± 1%     469ns ± 0%  -1.47%  (p=0.000 n=10+6)
FmtFprintfPrefixedInt-8     412ns ± 2%     417ns ± 2%  +1.39%  (p=0.006 n=9+10)
FmtFprintfFloat-8           652ns ± 1%     652ns ± 0%    ~     (p=0.161 n=10+8)
FmtManyArgs-8              1.94µs ± 0%    1.94µs ± 2%    ~     (p=0.781 n=10+10)
GobDecode-8                17.7ms ± 1%    17.7ms ± 0%    ~     (p=0.962 n=10+7)
GobEncode-8                15.6ms ± 0%    15.6ms ± 1%    ~     (p=0.063 n=10+10)
Gzip-8                      786ms ± 0%     787ms ± 0%    ~     (p=0.356 n=10+9)
Gunzip-8                    127ms ± 0%     127ms ± 0%  +0.08%  (p=0.028 n=10+9)
HTTPClientServer-8          198µs ± 6%     198µs ± 7%    ~     (p=0.796 n=10+10)
JSONEncode-8               42.5ms ± 0%    42.2ms ± 0%  -0.73%  (p=0.000 n=9+8)
JSONDecode-8                158ms ± 1%     162ms ± 0%  +2.28%  (p=0.000 n=10+9)
Mandelbrot200-8            10.1ms ± 0%    10.1ms ± 0%  -0.01%  (p=0.000 n=10+9)
GoParse-8                  8.54ms ± 1%    8.63ms ± 1%  +1.06%  (p=0.000 n=10+9)
RegexpMatchEasy0_32-8       231ns ± 1%     225ns ± 0%  -2.52%  (p=0.000 n=9+10)
RegexpMatchEasy0_1K-8      1.63µs ± 0%    1.63µs ± 0%    ~     (p=0.170 n=10+10)
RegexpMatchEasy1_32-8       253ns ± 0%     249ns ± 0%  -1.41%  (p=0.000 n=9+10)
RegexpMatchEasy1_1K-8      2.08µs ± 0%    2.08µs ± 0%  -0.32%  (p=0.000 n=9+10)
RegexpMatchMedium_32-8      355ns ± 1%     351ns ± 0%  -1.04%  (p=0.007 n=10+7)
RegexpMatchMedium_1K-8      104µs ± 0%     104µs ± 0%    ~     (p=0.148 n=10+10)
RegexpMatchHard_32-8       5.79µs ± 0%    5.79µs ± 0%    ~     (p=0.578 n=10+10)
RegexpMatchHard_1K-8        176µs ± 0%     176µs ± 0%    ~     (p=0.137 n=10+10)
Revcomp-8                   1.37s ± 1%     1.36s ± 1%  -0.26%  (p=0.023 n=10+10)
Template-8                  151ms ± 1%     154ms ± 1%  +2.14%  (p=0.000 n=9+10)
TimeParse-8                 723ns ± 2%     721ns ± 1%    ~     (p=0.592 n=10+10)
TimeFormat-8                804ns ± 2%     798ns ± 3%    ~     (p=0.344 n=10+10)
[Geo mean]                  154µs          154µs       -0.02%

Therefore remove this pass. Also reduce text size by 0.5~2%.

Comment out some dead code in runtime/sys_nacl_amd64p32.s
which contains undefined symbols.

Change-Id: I1473986fe5b18b3d2554ce96cdc6f0999b8d955d
Reviewed-on: https://go-review.googlesource.com/36205
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2017-02-07 15:00:48 +00:00
Josh Bleecher Snyder c682d3239e cmd/compile: convert constants to interfaces without allocating
The order pass is responsible for ensuring that
values passed to runtime functions, including
convT2E/convT2I, are addressable.

Prior to this CL, this was always accomplished
by creating a temp, which frequently escaped to
the heap, causing allocations, perhaps most
notably in code like:

fmt.Println(1, 2, 3) // allocates three times

None of the runtime routines modify the contents
of the pointers they receive, so in the case of
constants, instead of creating a temp value,
we can create a static value.

(Marking the static value as read-only provides
protection against accidental attempts by the runtime
to modify the constant data.)

This improves code generation for code like:

panic("abc")
c <- 2 // c is a chan int

which can now simply refer to "abc" and 2,
rather than going by way of a temporary.

It also allows us to optimize convT2E/convT2I,
by recognizing static readonly values
and directly constructing the interface.

This CL adds ~0.5% to binary size, despite
decreasing the size of many functions,
because it also adds many static symbols.

This binary size regression could be recovered in
future (but currently unplanned) work.

There is a lot of content-duplication in these
symbols; this statement generates six new symbols,
three containing an int 1 and three containing
a pointer to the string "a":

fmt.Println(1, 1, 1, "a", "a", "a")

These symbols could be made content-addressable.

Furthermore, these symbols are small, so the
alignment and naming overhead is large.
As with the go.strings section, these symbols
could be hidden and have their alignment reduced.

The changes to test/live.go make it impossible
(at least with current optimization techniques)
to place the values being passed to the runtime
in static symbols, preserving autotmp creation.

Fixes #18704

Benchmarks from fmt and go-kit's logging package:

github.com/go-kit/kit/log

name                      old time/op    new time/op    delta
JSONLoggerSimple-8          1.91µs ± 2%    2.11µs ±22%     ~     (p=1.000 n=9+10)
JSONLoggerContextual-8      2.60µs ± 6%    2.43µs ± 2%   -6.29%  (p=0.000 n=9+10)
Discard-8                    101ns ± 2%      34ns ±14%  -66.33%  (p=0.000 n=10+9)
OneWith-8                    161ns ± 1%     102ns ±16%  -36.78%  (p=0.000 n=10+10)
TwoWith-8                    175ns ± 3%     106ns ± 7%  -39.36%  (p=0.000 n=10+9)
TenWith-8                    293ns ± 3%     227ns ±15%  -22.44%  (p=0.000 n=9+10)
LogfmtLoggerSimple-8         704ns ± 2%     608ns ± 2%  -13.65%  (p=0.000 n=10+9)
LogfmtLoggerContextual-8     962ns ± 1%     860ns ±17%  -10.57%  (p=0.003 n=9+10)
NopLoggerSimple-8            188ns ± 1%     120ns ± 1%  -36.39%  (p=0.000 n=9+10)
NopLoggerContextual-8        379ns ± 1%     243ns ± 0%  -35.77%  (p=0.000 n=9+10)
ValueBindingTimestamp-8      577ns ± 1%     499ns ± 1%  -13.51%  (p=0.000 n=10+10)
ValueBindingCaller-8         898ns ± 2%     844ns ± 2%   -6.00%  (p=0.000 n=10+10)

name                      old alloc/op   new alloc/op   delta
JSONLoggerSimple-8            904B ± 0%      872B ± 0%   -3.54%  (p=0.000 n=10+10)
JSONLoggerContextual-8      1.20kB ± 0%    1.14kB ± 0%   -5.33%  (p=0.000 n=10+10)
Discard-8                    64.0B ± 0%     32.0B ± 0%  -50.00%  (p=0.000 n=10+10)
OneWith-8                    96.0B ± 0%     64.0B ± 0%  -33.33%  (p=0.000 n=10+10)
TwoWith-8                     160B ± 0%      128B ± 0%  -20.00%  (p=0.000 n=10+10)
TenWith-8                     672B ± 0%      640B ± 0%   -4.76%  (p=0.000 n=10+10)
LogfmtLoggerSimple-8          128B ± 0%       96B ± 0%  -25.00%  (p=0.000 n=10+10)
LogfmtLoggerContextual-8      304B ± 0%      240B ± 0%  -21.05%  (p=0.000 n=10+10)
NopLoggerSimple-8             128B ± 0%       96B ± 0%  -25.00%  (p=0.000 n=10+10)
NopLoggerContextual-8         304B ± 0%      240B ± 0%  -21.05%  (p=0.000 n=10+10)
ValueBindingTimestamp-8       159B ± 0%      127B ± 0%  -20.13%  (p=0.000 n=10+10)
ValueBindingCaller-8          112B ± 0%       80B ± 0%  -28.57%  (p=0.000 n=10+10)

name                      old allocs/op  new allocs/op  delta
JSONLoggerSimple-8            19.0 ± 0%      17.0 ± 0%  -10.53%  (p=0.000 n=10+10)
JSONLoggerContextual-8        25.0 ± 0%      21.0 ± 0%  -16.00%  (p=0.000 n=10+10)
Discard-8                     3.00 ± 0%      1.00 ± 0%  -66.67%  (p=0.000 n=10+10)
OneWith-8                     3.00 ± 0%      1.00 ± 0%  -66.67%  (p=0.000 n=10+10)
TwoWith-8                     3.00 ± 0%      1.00 ± 0%  -66.67%  (p=0.000 n=10+10)
TenWith-8                     3.00 ± 0%      1.00 ± 0%  -66.67%  (p=0.000 n=10+10)
LogfmtLoggerSimple-8          4.00 ± 0%      2.00 ± 0%  -50.00%  (p=0.000 n=10+10)
LogfmtLoggerContextual-8      7.00 ± 0%      3.00 ± 0%  -57.14%  (p=0.000 n=10+10)
NopLoggerSimple-8             4.00 ± 0%      2.00 ± 0%  -50.00%  (p=0.000 n=10+10)
NopLoggerContextual-8         7.00 ± 0%      3.00 ± 0%  -57.14%  (p=0.000 n=10+10)
ValueBindingTimestamp-8       5.00 ± 0%      3.00 ± 0%  -40.00%  (p=0.000 n=10+10)
ValueBindingCaller-8          4.00 ± 0%      2.00 ± 0%  -50.00%  (p=0.000 n=10+10)

fmt

name                             old time/op    new time/op    delta
SprintfPadding-8                   88.9ns ± 3%    79.1ns ± 1%   -11.09%  (p=0.000 n=10+7)
SprintfEmpty-8                     12.6ns ± 3%    12.8ns ± 3%      ~     (p=0.136 n=10+10)
SprintfString-8                    38.7ns ± 5%    26.9ns ± 6%   -30.65%  (p=0.000 n=10+10)
SprintfTruncateString-8            56.7ns ± 2%    47.0ns ± 3%   -17.05%  (p=0.000 n=10+10)
SprintfQuoteString-8                164ns ± 2%     153ns ± 2%    -7.01%  (p=0.000 n=10+10)
SprintfInt-8                       38.9ns ±15%    26.5ns ± 2%   -31.93%  (p=0.000 n=10+9)
SprintfIntInt-8                    60.3ns ± 9%    38.2ns ± 1%   -36.67%  (p=0.000 n=10+8)
SprintfPrefixedInt-8               58.6ns ±13%    51.2ns ±11%   -12.66%  (p=0.001 n=10+10)
SprintfFloat-8                     71.4ns ± 3%    64.2ns ± 3%   -10.08%  (p=0.000 n=8+10)
SprintfComplex-8                    175ns ± 3%     159ns ± 2%    -9.03%  (p=0.000 n=10+10)
SprintfBoolean-8                   33.5ns ± 4%    25.7ns ± 5%   -23.28%  (p=0.000 n=10+10)
SprintfHexString-8                 65.3ns ± 3%    51.7ns ± 5%   -20.86%  (p=0.000 n=10+9)
SprintfHexBytes-8                  67.2ns ± 5%    67.9ns ± 4%      ~     (p=0.383 n=10+10)
SprintfBytes-8                      129ns ± 7%     124ns ± 7%      ~     (p=0.074 n=9+10)
SprintfStringer-8                   127ns ± 4%     126ns ± 8%      ~     (p=0.506 n=9+10)
SprintfStructure-8                  357ns ± 3%     359ns ± 3%      ~     (p=0.469 n=10+10)
ManyArgs-8                          203ns ± 6%     126ns ± 3%   -37.94%  (p=0.000 n=10+10)
FprintInt-8                         119ns ±10%      74ns ± 3%   -37.54%  (p=0.000 n=10+10)
FprintfBytes-8                      122ns ± 4%     120ns ± 3%      ~     (p=0.124 n=10+10)
FprintIntNoAlloc-8                 78.2ns ± 5%    74.1ns ± 3%    -5.28%  (p=0.000 n=10+10)
ScanInts-8                          349µs ± 1%     349µs ± 0%      ~     (p=0.606 n=9+8)
ScanRecursiveInt-8                 43.8ms ± 7%    40.1ms ± 2%    -8.42%  (p=0.000 n=10+10)
ScanRecursiveIntReaderWrapper-8    43.5ms ± 4%    40.4ms ± 2%    -7.16%  (p=0.000 n=10+9)

name                             old alloc/op   new alloc/op   delta
SprintfPadding-8                    24.0B ± 0%     16.0B ± 0%   -33.33%  (p=0.000 n=10+10)
SprintfEmpty-8                      0.00B          0.00B           ~     (all equal)
SprintfString-8                     21.0B ± 0%      5.0B ± 0%   -76.19%  (p=0.000 n=10+10)
SprintfTruncateString-8             32.0B ± 0%     16.0B ± 0%   -50.00%  (p=0.000 n=10+10)
SprintfQuoteString-8                48.0B ± 0%     32.0B ± 0%   -33.33%  (p=0.000 n=10+10)
SprintfInt-8                        16.0B ± 0%      1.0B ± 0%   -93.75%  (p=0.000 n=10+10)
SprintfIntInt-8                     24.0B ± 0%      3.0B ± 0%   -87.50%  (p=0.000 n=10+10)
SprintfPrefixedInt-8                72.0B ± 0%     64.0B ± 0%   -11.11%  (p=0.000 n=10+10)
SprintfFloat-8                      16.0B ± 0%      8.0B ± 0%   -50.00%  (p=0.000 n=10+10)
SprintfComplex-8                    48.0B ± 0%     32.0B ± 0%   -33.33%  (p=0.000 n=10+10)
SprintfBoolean-8                    8.00B ± 0%     4.00B ± 0%   -50.00%  (p=0.000 n=10+10)
SprintfHexString-8                  96.0B ± 0%     80.0B ± 0%   -16.67%  (p=0.000 n=10+10)
SprintfHexBytes-8                    112B ± 0%      112B ± 0%      ~     (all equal)
SprintfBytes-8                      96.0B ± 0%     96.0B ± 0%      ~     (all equal)
SprintfStringer-8                   32.0B ± 0%     32.0B ± 0%      ~     (all equal)
SprintfStructure-8                   256B ± 0%      256B ± 0%      ~     (all equal)
ManyArgs-8                          80.0B ± 0%      0.0B       -100.00%  (p=0.000 n=10+10)
FprintInt-8                         8.00B ± 0%     0.00B       -100.00%  (p=0.000 n=10+10)
FprintfBytes-8                      32.0B ± 0%     32.0B ± 0%      ~     (all equal)
FprintIntNoAlloc-8                  0.00B          0.00B           ~     (all equal)
ScanInts-8                         15.2kB ± 0%    15.2kB ± 0%      ~     (p=0.248 n=9+10)
ScanRecursiveInt-8                 21.6kB ± 0%    21.6kB ± 0%      ~     (all equal)
ScanRecursiveIntReaderWrapper-8    21.7kB ± 0%    21.7kB ± 0%      ~     (all equal)

name                             old allocs/op  new allocs/op  delta
SprintfPadding-8                     2.00 ± 0%      1.00 ± 0%   -50.00%  (p=0.000 n=10+10)
SprintfEmpty-8                       0.00           0.00           ~     (all equal)
SprintfString-8                      2.00 ± 0%      1.00 ± 0%   -50.00%  (p=0.000 n=10+10)
SprintfTruncateString-8              2.00 ± 0%      1.00 ± 0%   -50.00%  (p=0.000 n=10+10)
SprintfQuoteString-8                 2.00 ± 0%      1.00 ± 0%   -50.00%  (p=0.000 n=10+10)
SprintfInt-8                         2.00 ± 0%      1.00 ± 0%   -50.00%  (p=0.000 n=10+10)
SprintfIntInt-8                      3.00 ± 0%      1.00 ± 0%   -66.67%  (p=0.000 n=10+10)
SprintfPrefixedInt-8                 2.00 ± 0%      1.00 ± 0%   -50.00%  (p=0.000 n=10+10)
SprintfFloat-8                       2.00 ± 0%      1.00 ± 0%   -50.00%  (p=0.000 n=10+10)
SprintfComplex-8                     2.00 ± 0%      1.00 ± 0%   -50.00%  (p=0.000 n=10+10)
SprintfBoolean-8                     2.00 ± 0%      1.00 ± 0%   -50.00%  (p=0.000 n=10+10)
SprintfHexString-8                   2.00 ± 0%      1.00 ± 0%   -50.00%  (p=0.000 n=10+10)
SprintfHexBytes-8                    2.00 ± 0%      2.00 ± 0%      ~     (all equal)
SprintfBytes-8                       2.00 ± 0%      2.00 ± 0%      ~     (all equal)
SprintfStringer-8                    4.00 ± 0%      4.00 ± 0%      ~     (all equal)
SprintfStructure-8                   7.00 ± 0%      7.00 ± 0%      ~     (all equal)
ManyArgs-8                           8.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
FprintInt-8                          1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
FprintfBytes-8                       1.00 ± 0%      1.00 ± 0%      ~     (all equal)
FprintIntNoAlloc-8                   0.00           0.00           ~     (all equal)
ScanInts-8                          1.60k ± 0%     1.60k ± 0%      ~     (all equal)
ScanRecursiveInt-8                  1.71k ± 0%     1.71k ± 0%      ~     (all equal)
ScanRecursiveIntReaderWrapper-8     1.71k ± 0%     1.71k ± 0%      ~     (all equal)

Change-Id: I7ba72a25fea4140a0ba40a9f443103ed87cc69b5
Reviewed-on: https://go-review.googlesource.com/35554
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-02-02 21:02:23 +00:00
Michael Munday fd118b69fa cmd/asm, cmd/internal/obj/s390x: fix encoding of VREPI{H,F,G}
Also adds tests for all missing VRI-a instructions (which may be
affected by this change).

Fixes #18749.

Change-Id: I48249dda626f32555da9ab58659e2e140de6504a
Reviewed-on: https://go-review.googlesource.com/35561
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2017-02-01 19:37:18 +00:00
Russ Cox 47ce87877b all: merge dev.inline into master
Change-Id: I7715581a04e513dcda9918e853fa6b1ddc703770
2017-02-01 09:47:23 -05:00
Robert Griesemer 33c036867f [dev.inline] cmd/internal/obj: remove vestiges of LineHist - not used anymore
Change-Id: I9d3fcdd5b002953fa9d2f001bf7a834073443794
Reviewed-on: https://go-review.googlesource.com/34722
TryBot-Result: Gobot Gobot <gobot@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-01-09 22:52:34 +00:00
Robert Griesemer 472c792e0a [dev.inline] cmd/internal/src: introduce compact source position representation
XPos is a compact (8 instead of 16 bytes on a 64bit machine) source
position representation. There is a 1:1 correspondence between each
XPos and each regular Pos, translated via a global table.

In some sense this brings back the LineHist, though positions can
track line and column information; there is a O(1) translation
between the representations (no binary search), and the translation
is factored out.

The size increase with the prior change is brought down again and
the compiler speed is in line with the master repo (measured on
the same "quiet" machine as for prior change):

name       old time/op     new time/op     delta
Template       256ms ± 1%      262ms ± 2%    ~             (p=0.063 n=5+4)
Unicode        132ms ± 1%      135ms ± 2%    ~             (p=0.063 n=5+4)
GoTypes        891ms ± 1%      871ms ± 1%  -2.28%          (p=0.016 n=5+4)
Compiler       3.84s ± 2%      3.89s ± 2%    ~             (p=0.413 n=5+4)
MakeBash       47.1s ± 1%      46.2s ± 2%    ~             (p=0.095 n=5+5)

name       old user-ns/op  new user-ns/op  delta
Template        309M ± 1%       314M ± 2%    ~             (p=0.111 n=5+4)
Unicode         165M ± 1%       172M ± 9%    ~             (p=0.151 n=5+5)
GoTypes        1.14G ± 2%      1.12G ± 1%    ~             (p=0.063 n=5+4)
Compiler       5.00G ± 1%      4.96G ± 1%    ~             (p=0.286 n=5+4)

Change-Id: Icc570cc60ab014d8d9af6976f1f961ab8828cc47
Reviewed-on: https://go-review.googlesource.com/34506
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-01-09 22:43:22 +00:00
Robert Griesemer 4808fc4443 [dev.inline] cmd/internal/src: replace src.Pos with syntax.Pos
This replaces the src.Pos LineHist-based position tracking with
the syntax.Pos implementation and updates all uses.

The LineHist table is not used anymore - the respective code is still
there but should be removed eventually. CL forthcoming.

Passes toolstash -cmp when comparing to the master repo (with the
exception of a couple of swapped assembly instructions, likely due
to different instruction scheduling because the line-based sorting
has changed; though this is won't affect correctness).

The sizes of various important compiler data structures have increased
significantly (see the various sizes_test.go files); this is probably
the reason for an increase of compilation times (to be addressed). Here
are the results of compilebench -count 5, run on a "quiet" machine (no
apps running besides a terminal):

name       old time/op     new time/op     delta
Template       256ms ± 1%      280ms ±15%  +9.54%          (p=0.008 n=5+5)
Unicode        132ms ± 1%      132ms ± 1%    ~             (p=0.690 n=5+5)
GoTypes        891ms ± 1%      917ms ± 2%  +2.88%          (p=0.008 n=5+5)
Compiler       3.84s ± 2%      3.99s ± 2%  +3.95%          (p=0.016 n=5+5)
MakeBash       47.1s ± 1%      47.2s ± 2%    ~             (p=0.841 n=5+5)

name       old user-ns/op  new user-ns/op  delta
Template        309M ± 1%       326M ± 2%  +5.18%          (p=0.008 n=5+5)
Unicode         165M ± 1%       168M ± 4%    ~             (p=0.421 n=5+5)
GoTypes        1.14G ± 2%      1.18G ± 1%  +3.47%          (p=0.008 n=5+5)
Compiler       5.00G ± 1%      5.16G ± 1%  +3.12%          (p=0.008 n=5+5)

Change-Id: I241c4246cdff627d7ecb95cac23060b38f9775ec
Reviewed-on: https://go-review.googlesource.com/34273
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-01-09 22:33:23 +00:00
David Chase 7f1ff65c39 cmd/compile: insert scheduling checks on loop backedges
Loop breaking with a counter.  Benchmarked (see comments),
eyeball checked for sanity on popular loops.  This code
ought to handle loops in general, and properly inserts phi
functions in cases where the earlier version might not have.

Includes test, plus modifications to test/run.go to deal with
timeout and killing looping test.  Tests broken by the addition
of extra code (branch frequency and live vars) for added
checks turn the check insertion off.

If GOEXPERIMENT=preemptibleloops, the compiler inserts reschedule
checks on every backedge of every reducible loop.  Alternately,
specifying GO_GCFLAGS=-d=ssa/insert_resched_checks/on will
enable it for a single compilation, but because the core Go
libraries contain some loops that may run long, this is less
likely to have the desired effect.

This is intended as a tool to help in the study and diagnosis
of GC and other latency problems, now that goal STW GC latency
is on the order of 100 microseconds or less.

Updates #17831.
Updates #10958.

Change-Id: I6206c163a5b0248e3f21eb4fc65f73a179e1f639
Reviewed-on: https://go-review.googlesource.com/33910
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-01-09 21:01:29 +00:00
Shenghou Ma 9fe2291efd cmd/internal/obj/mips: replace MOVD with MOVF on 32-bit to avoid unaligned memory access
This is the simplest CL that I can make for Go 1.8. For Go 1.9, we can revisit it
and optimize the redundant address generation instructions or just fix #599 instead.

Fixes #18140.

Change-Id: Ie4804ab0e00dc6bb318da2bece8035c7c71caac3
Reviewed-on: https://go-review.googlesource.com/34193
Run-TryBot: Minux Ma <minux@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2016-12-12 23:25:06 +00:00
David Lazar 48d029fe43 [dev.inline] cmd/internal/obj: rename Prog.Lineno to Prog.Pos
Change-Id: I7585d85907869f5a286b36936dfd035f1e8e9906
Reviewed-on: https://go-review.googlesource.com/34197
Run-TryBot: David Lazar <lazard@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2016-12-09 20:35:56 +00:00
David Lazar ad4efedc6c [dev.inline] cmd/internal/obj: use src.Pos in obj.Prog
This will let us use the src.Pos struct to thread inlining
information through to obj.

Change-Id: I96a16d3531167396988df66ae70f0b729049cc82
Reviewed-on: https://go-review.googlesource.com/34195
Run-TryBot: David Lazar <lazard@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2016-12-09 20:25:10 +00:00
Robert Griesemer 00efa446e1 cmd/compile/internal/obj: remove superfluous addvarint parameter and assignment
Change-Id: I395625dca9b719290c52d2c46f60b53e8fb3abc4
Reviewed-on: https://go-review.googlesource.com/34139
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-12-09 01:04:29 +00:00
David Crawshaw d4177877c6 cmd/internal/obj: regenerate relocation strings
Change-Id: Ib9ba8f0b8785f1b0ddb29214beb8674dc06f7422
Reviewed-on: https://go-review.googlesource.com/34111
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-12-07 20:28:36 +00:00
Matthew Dempsky e3a1d0cb7c cmd/internal/obj: rename obj.go to line.go
This file is entirely about the implementation of LineHist, and I can
never remember which generic filename in cmd/internal/obj has it.
Rename to line.go to match the already existing line_test.go.

Change-Id: Id01f3339dc550c9759569d5610d808b17bca44d0
Reviewed-on: https://go-review.googlesource.com/33803
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-12-01 23:11:15 +00:00
Keith Randall c96e94e69d cmd/compile: generate frame pointers for otherwise frameless functions
func f() {
    g()
}

We mistakenly don't add a frame pointer for f.  This means f
isn't seen when walking the frame pointer linked list.  That
matters for kernel-gathered profiles, and is an impediment for
issues like #16638.

To fix, allocate a stack frame even for otherwise frameless functions
like f.  It is a bit tricky because we need to avoid some runtime
internals that really, really don't want one.

No test at the moment, as only kernel CPU profiles would catch it.
Tests will come with the implementation of #16638.

Fixes #18103

Change-Id: I411206cc9de4c8fdd265bee2e4fa61d161ad1847
Reviewed-on: https://go-review.googlesource.com/33754
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2016-12-01 19:25:17 +00:00
David Crawshaw 6f31abd23a cmd/compile, cmd/link: weak relocation for ptrTo
Introduce R_WEAKADDROFF, a "weak" variation of the R_ADDROFF relocation
that will only reference the type described if it is in some other way
reachable.

Use this for the ptrToThis field in reflect type information where it
is safe to do so (that is, types that don't need to be included for
interface satisfaction, and types that won't cause the compiler to
recursively generate an endless series of ptr-to-ptr-to-ptr-to...
types).

Also fix a small bug in reflect, where StructOf was not clearing the
ptrToThis field of new types.

Fixes #17931

Change-Id: I4d3b53cb9c916c97b3b16e367794eee142247281
Reviewed-on: https://go-review.googlesource.com/33427
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-11-22 03:10:14 +00:00
Cherry Zhang 1e3c57c2cc cmd/internal/obj/arm64: fix branch too far for CBZ (and like)
The assembler backend fixes too-far conditional branches, but only
for BEQ and like. Add a case for CBZ and like.

Fixes #17925.

Change-Id: Ie516e6c5ca165b582367283a0110f7081e00c214
Reviewed-on: https://go-review.googlesource.com/33304
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
2016-11-16 20:31:40 +00:00
Vladimir Stefanovic 5d28bc58b6 cmd/internal/obj/mips: add support for GOARCH=mips{,le}
Implements subset of MIPS32(r1) instruction set.

Change-Id: Iba017350f6c2763de05d4d1bc2f123e8eb76d0ff
Reviewed-on: https://go-review.googlesource.com/31475
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2016-11-08 17:46:35 +00:00
Cherry Zhang a866df2671 cmd/internal/obj/arm64: materialize float constant 0 from zero register
Materialize float constant 0 from integer zero register, instead
of loading from constant pool.

Also fix assembling FMOV from zero register to FP register.

Change-Id: Ie413dd342cedebdb95ba8cfc220e23ed2a39e885
Reviewed-on: https://go-review.googlesource.com/32250
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2016-10-28 20:18:29 +00:00
Cherry Zhang 4f1ca8b6f9 cmd/internal/obj/mips: materialize float constant 0 from zero register
Materialize float constant 0 from integer zero register, instead
of loading from constant pool.

Change-Id: Ie4728895b9d617bec2a29d15729c0efaa10eedbb
Reviewed-on: https://go-review.googlesource.com/32109
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2016-10-28 14:20:09 +00:00
Carlos Eduardo Seo 0acefdbea0 cmd/asm, cmd/internal/obj/ppc64: Add vector scalar (VSX) registers and instructions
The current implementation for Power architecture does not include the vector
scalar (VSX) registers.  This adds the 63 VSX registers and the most commonly
used instructions: load/store VSX vector/scalar, move to/from VSR, logical
operations, select, merge, splat, permute, shift, FP-FP conversion, FP-integer
conversion and integer-FP conversion.

Change-Id: I0f7572d2359fe7f3ea0124a1eb1b0bebab33649e
Reviewed-on: https://go-review.googlesource.com/30510
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-10-28 13:38:30 +00:00
Cherry Zhang 698bfa17a8 cmd/internal/obj: save link register in leaf function with non-empty frame on PPC64, ARM64, S390X
The runtime traceback code assumes non-empty frame has link
link register saved on LR architectures. Make sure it is so in
the assember.

Also make sure that LR is stored before update SP, so the traceback
code will not see a half-updated stack frame if a signal comes
during the execution of function prologue.

Fixes #17381.

Change-Id: I668b04501999b7f9b080275a2d1f8a57029cbbb3
Reviewed-on: https://go-review.googlesource.com/31760
Reviewed-by: Michael Munday <munday@ca.ibm.com>
2016-10-25 21:44:32 +00:00
shaharko d391dc260a cmd/internal/obj: Use bitfield for LSym attributes
Reduces the size of LSym struct.

On 32bit: before 84  after 76
On 64bit: before 136 after 128

name       old time/op     new time/op     delta
Template       182ms ± 3%      182ms ± 3%    ~           (p=0.607 n=19+20)
Unicode       93.5ms ± 4%     94.2ms ± 3%    ~           (p=0.141 n=20+19)
GoTypes        608ms ± 1%      605ms ± 2%    ~           (p=0.056 n=20+20)

name       old user-ns/op  new user-ns/op  delta
Template        249M ± 7%       249M ± 4%    ~           (p=0.605 n=18+19)
Unicode         149M ±14%       151M ± 5%    ~           (p=0.724 n=20+17)
GoTypes         855M ± 4%       853M ± 3%    ~           (p=0.537 n=19+19)

name       old alloc/op    new alloc/op    delta
Template      40.3MB ± 0%     40.3MB ± 0%  -0.11%        (p=0.000 n=19+20)
Unicode       33.8MB ± 0%     33.8MB ± 0%  -0.08%        (p=0.000 n=20+20)
GoTypes        119MB ± 0%      119MB ± 0%  -0.10%        (p=0.000 n=19+20)

name       old allocs/op   new allocs/op   delta
Template        383k ± 0%       383k ± 0%    ~           (p=0.703 n=20+20)
Unicode         317k ± 0%       317k ± 0%    ~           (p=0.982 n=19+18)
GoTypes        1.14M ± 0%      1.14M ± 0%    ~           (p=0.086 n=20+20)

Change-Id: Id6ba0db3ecc4503a4e9af3ed0d5884d4366e8bf9
Reviewed-on: https://go-review.googlesource.com/31870
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Run-TryBot: Shahar Kohanim <skohanim@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-10-25 20:10:05 +00:00
shaharko d8d445280a cmd/compile, cmd/link: more efficient typelink generation
Instead of generating typelink symbols in the compiler
mark types that should have typelinks with a flag.
The linker detects this flag and adds the marked types
to the typelink table.

name            old s/op    new s/op    delta
LinkCmdCompile   0.27 ± 6%   0.25 ± 6%  -6.93%    (p=0.000 n=97+98)
LinkCmdGo        0.30 ± 5%   0.29 ±10%  -4.22%    (p=0.000 n=97+99)

name            old MaxRSS  new MaxRSS  delta
LinkCmdCompile   112k ± 3%   106k ± 2%  -4.85%  (p=0.000 n=100+100)
LinkCmdGo        107k ± 3%   103k ± 3%  -3.00%  (p=0.000 n=100+100)

Change-Id: Ic95dd4b0101e90c1fa262c9c6c03a2028d6b3623
Reviewed-on: https://go-review.googlesource.com/31772
Run-TryBot: Shahar Kohanim <skohanim@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
2016-10-25 19:44:06 +00:00
Lynn Boger b10b2f8d40 cmd/internal: add shift opcodes with shift operands on ppc64x
Some original shift opcodes for ppc64x expected an operand to be
a mask instead of a shift count, preventing some valid shift counts
from being written.

This adds new opcodes for shifts where needed, using mnemonics that
match the ppc64 asm and allowing the assembler to accept the full set
of valid shift counts.

Fixes #15016

Change-Id: Id573489f852038d06def279c13fd0523736878a7
Reviewed-on: https://go-review.googlesource.com/31853
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
Reviewed-by: David Chase <drchase@google.com>
2016-10-25 14:56:20 +00:00
Michael Munday 3ef07c412f cmd, runtime: remove s390x 3 operand immediate logical ops
These are emulated by the assembler and we don't need them.

Change-Id: I2b07c5315a5b642fdb5e50b468453260ae121164
Reviewed-on: https://go-review.googlesource.com/31758
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-10-25 12:36:06 +00:00
shaharko 80a034642e cmd/compile, cmd/link: stop generating unused go.string.hdr symbols.
name       old s/op    new s/op    delta
LinkCmdGo   0.29 ± 5%   0.29 ± 8%  -2.60%   (p=0.000 n=97+98)

name       old MaxRSS  new MaxRSS  delta
LinkCmdGo   106k ± 4%   105k ± 3%  -1.00%  (p=0.000 n=100+99)

Change-Id: I75a1c3b24ea711a15a5d2eae026b70b97ee7bad4
Reviewed-on: https://go-review.googlesource.com/31030
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
2016-10-25 05:17:05 +00:00
Cherry Zhang b55cee1893 cmd/internal/obj/mips: store LR before update SP in function prologue
This prevents the traceback code from seeing a half-updated
stack frame when a profiling signal comes during the execution
of function prologue. Also fixes mips64x part of #17381.

Change-Id: Iec9683427e546e3648b2e8b1dde956d13f6eb938
Reviewed-on: https://go-review.googlesource.com/31721
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2016-10-24 22:39:15 +00:00
Matthew Dempsky 7124056f7e cmd/internal/obj: drop Addr's Gotype field
The Gotype field is only used for ATYPE instructions. Instead of
specially storing the Go type symbol in From.Gotype, just store it in
To.Sym like any other 2-argument instruction would.

Modest reduction in allocations:

name       old alloc/op    new alloc/op    delta
Template      42.0MB ± 0%     41.8MB ± 0%  -0.40%         (p=0.000 n=9+10)
Unicode       34.3MB ± 0%     34.1MB ± 0%  -0.48%         (p=0.000 n=9+10)
GoTypes        122MB ± 0%      122MB ± 0%  -0.14%         (p=0.000 n=9+10)
Compiler       518MB ± 0%      518MB ± 0%  -0.04%         (p=0.000 n=9+10)

Passes toolstash -cmp.

Change-Id: I0e603266b5d7d4e405106a26369e22773a0d3a91
Reviewed-on: https://go-review.googlesource.com/31850
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-10-24 19:29:18 +00:00
Michael Munday 930ab0afd7 cmd/asm, cmd/internal/obj/s390x: fix VFMA and VFMS encoding
The m5 and m6 fields were the wrong way round.

Fixes #17444.

Change-Id: I10297064f2cd09d037eac581c96a011358f70aae
Reviewed-on: https://go-review.googlesource.com/31130
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2016-10-21 02:14:57 +00:00
Ilya Tocar 5ce715fdfe cmd/internal/obj/x86: add some missing AMD64 instructions
Add VBROADCASTSD, BROADCASTSS, MOVDDUP, MOVSHDUP, MOVSLDUP,
VMOVDDUP, VMOVSHDUP, VMOVSLDUP.

Fixes #16007

Change-Id: I9614e58eed6c1b6f299d9b4f0b1a7750aa7c1725
Reviewed-on: https://go-review.googlesource.com/31491
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2016-10-20 14:03:14 +00:00
Ian Lance Taylor e32ac7978d cmd/link, cmd/internal/obj: stop exporting various names
Just happened to notice that these names (funcAlign and friends) are
never referenced outside their package, so no need to export them.

Change-Id: I4bbdaa4b0ef330c3c3ef50a2ca39593977a83545
Reviewed-on: https://go-review.googlesource.com/31496
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
2016-10-19 21:16:58 +00:00
Michael Munday 430b82009c cmd/internal/obj/{ppc64,s390x}: mark functions with small stacks NOSPLIT
This change omits the stack check on ppc64 and s390x when the size of
a stack frame is less than obj.StackSmall. This is an optimization
x86 already performs.

The effect on s390x isn't huge because we were already omitting the
stack check when the frame size was 0 (it shaves about 1K from the
size of bin/go). On ppc64 however this change reduces the size of the
.text section in bin/go by 33K (1%).

Updates #13379 (for ppc64).

Change-Id: I6af0eb987646bea47fcaf0a812db3496bab0f680
Reviewed-on: https://go-review.googlesource.com/31357
Reviewed-by: David Chase <drchase@google.com>
2016-10-18 17:07:14 +00:00
Michael Munday 1cfb5c3fd5 cmd/compile: merge loads into operations on s390x
Adds the new canMergeLoad function which can be used by rules to
decide whether a load can be merged into an operation. The function
ensures that the merge will not reorder the load relative to memory
operations (for example, stores) in such a way that the block can no
longer be scheduled.

This new function enables transformations such as:

MOVD 0(R1), R2
ADD  R2, R3

to:

ADD  0(R1), R3

The two-operand form of the following instructions can now read a
single memory operand:

 - ADD
 - ADDC
 - ADDW
 - MULLD
 - MULLW
 - SUB
 - SUBC
 - SUBE
 - SUBW
 - AND
 - ANDW
 - OR
 - ORW
 - XOR
 - XORW

Improves SHA3 performance by 6-8%.

Updates #15054.

Change-Id: Ibcb9122126cd1a26f2c01c0dfdbb42fe5e7b5b94
Reviewed-on: https://go-review.googlesource.com/29272
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-10-17 19:45:20 +00:00
Lynn Boger 1e28dce80a bytes: improve performance for bytes.Compare on ppc64x
This improves the performance for byte.Compare by rewriting
the cmpbody function in runtime/asm_ppc64x.s.  The previous code
had a simple loop which loaded a pair of bytes and compared them,
which is inefficient for long buffers.  The updated function checks
for 8 or 32 byte chunks and then loads and compares double words where
possible.

Because the byte.Compare result indicates greater or less than,
the doubleword loads must take endianness into account, using a
byte reversed load in the little endian case.

Fixes #17433

benchmark                                   old ns/op     new ns/op     delta
BenchmarkBytesCompare/8-16                  13.6          7.16          -47.35%
BenchmarkBytesCompare/16-16                 25.7          7.83          -69.53%
BenchmarkBytesCompare/32-16                 38.1          7.78          -79.58%
BenchmarkBytesCompare/64-16                 63.0          10.6          -83.17%
BenchmarkBytesCompare/128-16                112           13.0          -88.39%
BenchmarkBytesCompare/256-16                211           28.1          -86.68%
BenchmarkBytesCompare/512-16                410           38.6          -90.59%
BenchmarkBytesCompare/1024-16               807           60.2          -92.54%
BenchmarkBytesCompare/2048-16               1601          103           -93.57%

Change-Id: I121acc74fcd27c430797647b8d682eb0607c63eb
Reviewed-on: https://go-review.googlesource.com/30949
Reviewed-by: David Chase <drchase@google.com>
2016-10-17 18:46:20 +00:00
Michael Pratt ab019da727 cmd/internal/obj: document Prog
Change-Id: Iafc392ba06452419542ec85e91d44991839eb6f8
Reviewed-on: https://go-review.googlesource.com/19593
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2016-10-13 01:27:34 +00:00
Lynn Boger 6da8bdd2cc cmd/asm: recognize CR1-CR7 on ppc64x branch instructions
Some of the branch instructions (BEQ, BNE, BLT, etc.) accept
all the valid CR values as operands, but the CR register value is
not parsed and not put into the instruction, so that CR0 is always
used regardless of what was specified on the instruction.  For example
BEQ CR2,label becomes beq cr0,label.

This adds the change to the PPC64 assembler to recognize the CR value
and set the approppriate field in the instruction so the correct
CR is used.  This also adds some general comments on the branch
instruction BC and its operand values.

Fixes #17408

Change-Id: I8e956372a42846a4c09a7259e9172eaa29118e71
Reviewed-on: https://go-review.googlesource.com/30930
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2016-10-12 21:17:47 +00:00
Cherry Zhang 7c431cb7f9 cmd/link: insert trampolines for too-far jumps on ARM
ARM direct CALL/JMP instruction has 24 bit offset, which can only
encodes jumps within +/-32M. When the target is too far, the top
bits get truncated and the program jumps wild.

This CL detects too-far jumps and automatically insert trampolines,
currently only internal linking on ARM.

It is necessary to make the following changes to the linker:
- Resolve direct jump relocs when assigning addresses to functions.
  this allows trampoline insertion without moving all code that
  already laid down.
- Lay down packages in dependency order, so that when resolving a
  inter-package direct jump reloc, the target address is already
  known. Intra-package jumps are assumed never too far.
- a linker flag -debugtramp is added for debugging trampolines:
    "-debugtramp=1 -v" prints trampoline debug message
    "-debugtramp=2"    forces all inter-package jump to use
                       trampolines (currently ARM only)
    "-debugtramp=2 -v" does both
- Some data structures are changed for bookkeeping.

On ARM, pseudo DIV/DIVU/MOD/MODU instructions now clobber R8
(unfortunate). In the standard library there is no ARM assembly
code that uses these instructions, and the compiler no longer emits
them (CL 29390).

all.bash passes with -debugtramp=2, except a disassembly test (this
is unavoidable as we changed the instruction).

TBD: debug info of trampolines?

Fixes #17028.

Change-Id: Idcce347ea7e0af77c4079041a160b2f6e114b474
Reviewed-on: https://go-review.googlesource.com/29397
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-10-11 13:35:33 +00:00
Michael Munday f13372c9f7 cmd/internal/obj/s390x: remove support for stores of global addresses
This CL removes support for MOVD instructions that store the address
of a global variable. For example:

  MOVD $main·a(SB), (R1)
  MOVD $main·b(SB), main·c(SB)

These instructions are emulated and the new backend doesn't need them
(the stores now always go through an intermediate register).

Change-Id: I3a1bcb3f19c5096ad0426afd76d35a4d7975733b
Reviewed-on: https://go-review.googlesource.com/30720
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-10-09 20:19:31 +00:00
Wedson Almeida Filho 13c829e5f6 cmd/internal/obj/x86: On amd64, relocation type for and indirect call is pc-relative.
With this change, the code in bug #15609 compiles and runs properly:

0000000000401070 <main.jump>:
  401070:	ff 15 aa 7e 06 00    	callq  *0x67eaa(%rip)        # 468f20 <main.pointer>
  401076:	c3                   	retq

0000000000468f20 g     O .rodata	0000000000000008 main.pointer

Fixes #15609

Change-Id: Iebb4d5a9f9fff335b693f4efcc97882fe04eefd7
Reviewed-on: https://go-review.googlesource.com/22950
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2016-10-09 19:50:09 +00:00
Michael Munday 45b26a93f3 cmd/{asm,compile}: replace TESTB op with CMPWconst on s390x
TESTB was implemented as AND $0xff, Rx, REGTMP. Unfortunately there
is no 3-operand AND-with-immediate instruction and so it was emulated
by the assembler using two instructions.

This CL uses CMPW instead of AND and also optimizes CMPW to use
the chi instruction where possible.

Overall this CL reduces the size of the .text section of the
bin/go binary by ~2%.

Change-Id: Ic335c29fc1129378fcbb1265bfb10f5b744a0f3f
Reviewed-on: https://go-review.googlesource.com/30690
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-10-07 20:02:59 +00:00
Michael Munday 91706c04b9 cmd/asm, cmd/internal/obj/s390x: delete unused instructions
Deletes the following s390x instructions:

 - ADDME
 - ADDZE
 - SUBME
 - SUBZE

They appear to be emulated PPC instructions left over from the
porting process and I don't think they will ever be useful.

Change-Id: I9b1ba78019dbd1218d0c8f8ea2903878802d1990
Reviewed-on: https://go-review.googlesource.com/30538
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-10-06 11:45:48 +00:00
Michael Munday dd1dcf9496 cmd/{asm,compile}: add ANDW, ORW and XORW instructions to s390x
Adds the following instructions and uses them in the SSA backend:

 - ANDW
 - ORW
 - XORW

The instruction encodings for 32-bit operations are typically shorter,
particularly when an immediate is used. For example, XORW $-1, R1
only requires one instruction, whereas XOR requires two.

Also removes some unused instructions (that were emulated):

 - ANDN
 - NAND
 - ORN
 - NOR

Change-Id: Iff2a16f52004ba498720034e354be9771b10cac4
Reviewed-on: https://go-review.googlesource.com/30291
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2016-10-06 02:59:04 +00:00
Cherry Zhang b662e524e4 cmd/compile: use CBZ/CBNZ instrunctions on ARM64
These are conditional branches that takes a register instead of
flags as control value.

Reduce binary size by 0.7%, text size by 2.4% (cmd/go as an
exmaple).

Change-Id: I0020cfde745f9eab680b8b949ad28c87fe183afd
Reviewed-on: https://go-review.googlesource.com/30030
Reviewed-by: David Chase <drchase@google.com>
2016-10-05 18:22:56 +00:00
Cherry Zhang 4c9a372946 runtime, cmd/internal/obj: get rid of rewindmorestack
In the function prologue, we emit a jump to the beginning of
the function immediately after calling morestack. And in the
runtime stack growing code, it decodes and emulates that jump.
This emulation was necessary before we had per-PC SP deltas,
since the traceback code assumed that the frame size was fixed
for the whole function, except on the first instruction where
it was 0. Since we now have per-PC SP deltas and PCDATA, we
can correctly record that the frame size is 0. This makes the
emulation unnecessary.

This may be helpful for registerized calling convention, where
there may be unspills of arguments after calling morestack. It
also simplifies the runtime.

Change-Id: I7ebee31eaee81795445b33f521ab6a79624c4ceb
Reviewed-on: https://go-review.googlesource.com/30138
Reviewed-by: David Chase <drchase@google.com>
2016-10-05 18:19:46 +00:00
Matthew Dempsky 518cc7f307 cmd/internal/obj/arm: cleanup some unnecessary temps and conversions
Change-Id: I573278c9aee80e62463b2542774dabeec7c3b098
Reviewed-on: https://go-review.googlesource.com/29969
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2016-09-29 18:14:05 +00:00
Cherry Zhang fedb0b3018 cmd/internal/obj/arm: optimize MOVW $-off(R), R
When offset < 0 and -offset fits in instruction, generate SUB
instruction, instead of ADD with constant from the pool.

Fixes #13280.

Change-Id: I57d97fe9300fe1f6554365e2262393ef50acbdd3
Reviewed-on: https://go-review.googlesource.com/30014
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2016-09-29 15:20:46 +00:00
Ilya Tocar 26531b3846 cmd/internal/obj/x86: cleanup
Remove duplicate vars, commented out code  and duplicate lines.
When choosing between 2 aliases, on with more uses was chosen.

Change-Id: I7bc15f1693de3f6d378cef9c09469970a659db40
Reviewed-on: https://go-review.googlesource.com/29152
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-09-29 11:45:43 +00:00
Cherry Zhang ba94dd3438 math: add some assembly implementations on ARM64
Also add GP<->FP move addressing mode to FMOVS, FMOVD
instructions.

Ceil-8                 37.1ns ± 0%   7.9ns ± 0%  -78.64%          (p=0.000 n=4+5)
Dim-8                  20.9ns ± 1%  11.3ns ± 0%  -45.93%          (p=0.008 n=5+5)
Floor-8                22.9ns ± 0%   7.9ns ± 0%  -65.41%          (p=0.029 n=4+4)
Gamma-8                 117ns ± 0%    94ns ± 1%  -19.50%          (p=0.016 n=4+5)
PowInt-8                121ns ± 0%   108ns ± 1%  -11.07%          (p=0.008 n=5+5)
PowFrac-8               331ns ± 0%   318ns ± 0%   -3.93%          (p=0.000 n=5+4)
Trunc-8                18.8ns ± 0%   7.9ns ± 0%  -57.83%          (p=0.016 n=4+5)

Change-Id: I709b7f1a914b28adc27414522db551e2630cfb92
Reviewed-on: https://go-review.googlesource.com/29734
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2016-09-27 23:52:12 +00:00
Michael Munday 17a8ec2c4f cmd/asm, cmd/internal/obj/s390x: improve add/multiply-immediate codegen
Use the A{,G}HI instructions where possible (4 bytes instead of 6 bytes
for A{,G}FI). Also, use 32-bit operations where appropriate for
multiplication.

Change-Id: I4041781cda26be52b54e4804a9e71552310762d0
Reviewed-on: https://go-review.googlesource.com/29733
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bill O'Farrell <billotosyr@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-09-27 16:00:51 +00:00
Cherry Zhang 9d4b40f55d runtime, cmd/compile: implement and use DUFFCOPY on ARM64
Change-Id: I8984eac30e5df78d4b94f19412135d3cc36969f8
Reviewed-on: https://go-review.googlesource.com/29910
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2016-09-27 15:07:31 +00:00
Alberto Donizetti 590f3f0c9d cmd/compile: fix misaligned comments
Realign multi-line comments that got misaligned by the c->go
conversion.

Change-Id: I584b902e95cf588aa14febf1e0b6dfa499c303c2
Reviewed-on: https://go-review.googlesource.com/29871
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-09-27 14:00:39 +00:00
Lynn Boger 3311275ce8 math, cmd/internal/obj/ppc64: improve floor, ceil, trunc with asm
This adds the instructions frim, frip, and friz to the ppc64x
assembler for use in implementing the math.Floor, math.Ceil, and
math.Trunc functions to improve performance.

Fixes #17185

BenchmarkCeil-128                    21.4          6.99          -67.34%
BenchmarkFloor-128                   13.9          6.37          -54.17%
BenchmarkTrunc-128                   12.7          6.33          -50.16%

Change-Id: I96131bd4e8c9c8dbafb25bfeb544cf9d2dbb4282
Reviewed-on: https://go-review.googlesource.com/29654
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Michael Munday <munday@ca.ibm.com>
2016-09-23 13:03:08 +00:00
David Chase cddddbc623 cmd/compile: use ISEL, cleanup use of zero & extensions
Abandoned earlier efforts to expose zero register,
but left it in numbering to decrease squirrelyness of
register allocator.

ISELrelOp used in code generation of bool := x relOp y.
Some patterns added to better elide zero case and
some sign extension.

Updates: #17109

Change-Id: Ida7839f0023ca8f0ffddc0545f0ac269e65b05d9
Reviewed-on: https://go-review.googlesource.com/29380
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2016-09-22 17:36:39 +00:00
Matthew Dempsky e6134702bb cmd/compile: simplify obj.ProgInfo and extract from obj.Prog
Updates #16357.

Change-Id: Ia837dd44bad76931baa9469e64371bc253d6694b
Reviewed-on: https://go-review.googlesource.com/29219
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-09-21 22:08:45 +00:00
Matthew Dempsky 35d22afb4b cmd/internal/obj: remove unused GOROOT-related fields
Change-Id: I6634f70d6bd1a4eced47eda69a2d9b207d222a1b
Reviewed-on: https://go-review.googlesource.com/29470
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
2016-09-21 01:25:29 +00:00
Matthew Dempsky e6158b3c46 cmd/internal/obj: remove unused Textp and Etextp fields
Change-Id: Idcb5a8d6676aa38b4ebd0975edd2068386f5ca83
Reviewed-on: https://go-review.googlesource.com/29449
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-09-21 01:15:56 +00:00
David Crawshaw e9fddf8f86 cmd/internal/obj, cmd/link: darwin dynlink support
This makes it possible for cmd/compile, when run with -dynlink on
darwin/amd64, to generate TLS_LE relocations which the linker then
turns into the appropriate PC-relative GOT load.

Change-Id: I1a71da432608bdb108ff66c22de600100209c873
Reviewed-on: https://go-review.googlesource.com/29393
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-09-20 03:15:15 +00:00
Michael Munday e94c52933b cmd/compile: intrinsify Ctz{32,64} and Bswap{32,64} on s390x
Also adds the 'find leftmost one' instruction (FLOGR) and replaces the
WORD-encoded use of FLOGR in math/big with it.

Change-Id: I18e7cd19e75b8501a6ae8bd925471f7e37ded206
Reviewed-on: https://go-review.googlesource.com/29372
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-09-19 19:03:01 +00:00
Carlos Eduardo Seo f1973fca71 cmd/asm, cmd/internal/obj/ppc64: add ppc64 vector registers and instructions
The current implementation for Power architecture does not include the vector
(Altivec) registers.  This adds the 32 VMX registers and the most commonly used
instructions: X-form loads/stores; VX-form logical operations, add/sub,
rotate/shift, count, splat, SHA Sigma and AES cipher; VC-form compare; and
VA-form permute, shift, add/sub and select.

Fixes #15619

Change-Id: I544b990631726e8fdfcce8ecca0aeeb72faae9aa
Reviewed-on: https://go-review.googlesource.com/25600
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: David Chase <drchase@google.com>
2016-09-19 18:39:36 +00:00
Matthew Dempsky 246074d043 cmd/internal/obj: remove ACHECKNIL
Updates #16357.

Change-Id: I35f938d675ca5c31f65c4419ee0732bbc593b5cb
Reviewed-on: https://go-review.googlesource.com/29368
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Dave Cheney <dave@cheney.net>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-09-17 02:38:24 +00:00
Matthew Dempsky e518962a27 cmd/internal/obj: simplify Plists
Keep Plists in a slice instead of a linked list.
Eliminate unnecessary fields.
Also, while here remove gc's unused breakpc and continpc vars.

Change-Id: Ia04264036c0442843869965d247ccf68a5295115
Reviewed-on: https://go-review.googlesource.com/29367
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dave Cheney <dave@cheney.net>
2016-09-17 02:09:23 +00:00
Matthew Dempsky 6fe1febc86 cmd/internal/obj: replace AGLOBL with (*Link).Globl
Replace the AGLOBL pseudo-op with a method to directly register an
LSym as a global. Similar to how we previously already replaced the
ADATA pseudo-op with directly writing out data bytes.

Passes toolstash -cmp.

Change-Id: I3631af0a2ab5798152d0c26b833dc309dbec5772
Reviewed-on: https://go-review.googlesource.com/29366
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Dave Cheney <dave@cheney.net>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-09-17 00:51:47 +00:00
Matthew Dempsky 1bd91d4ccc cmd/internal/obj: remove Addr.Etype and Addr.Width
Since the legacy backends were removed, these fields are write-only.

Change-Id: I4816c39267b7c10a4da2a6d22cd367dc475e564d
Reviewed-on: https://go-review.googlesource.com/29246
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
Reviewed-by: Dave Cheney <dave@cheney.net>
2016-09-16 02:47:34 +00:00
Josh Bleecher Snyder b92d39ef69 cmd/compile/internal/obj/x86: eliminate some function prologues
The standard sort swap method

func (t T) Swap(i, j int) {
  t[i], t[j] = t[j], t[i]
}

uses no stack space on architectures for which
FixedFrameSize == 0, currently 386 and amd64.

Nevertheless, we insert a stack check prologue.
This is because it contains a call to
runtime.panicindex.

However, for a few common runtime functions,
we know at compile time that they require
no arguments. Allow them to pass unnoticed.

Triggers for 380 functions during make.bash.
Cuts 4k off cmd/go.

encoding/binary benchmarks:

ReadSlice1000Int32s-8     9.49µs ± 3%    9.41µs ± 5%    ~     (p=0.075 n=29+27)
ReadStruct-8              1.50µs ± 3%    1.48µs ± 2%  -1.49%  (p=0.000 n=30+28)
ReadInts-8                 599ns ± 3%     600ns ± 3%    ~     (p=0.471 n=30+29)
WriteInts-8                836ns ± 4%     841ns ± 3%    ~     (p=0.371 n=30+29)
WriteSlice1000Int32s-8    8.84µs ± 3%    8.69µs ± 5%  -1.71%  (p=0.001 n=30+30)
PutUvarint32-8            29.6ns ± 1%    28.1ns ± 3%  -5.21%  (p=0.000 n=28+28)
PutUvarint64-8            82.6ns ± 5%    82.3ns ±10%  -0.43%  (p=0.014 n=27+30)

Swap assembly before:

"".T.Swap t=1 size=74 args=0x28 locals=0x0
	0x0000 00000 (swap.go:5)	TEXT	"".T.Swap(SB), $0-40
	0x0000 00000 (swap.go:5)	MOVQ	(TLS), CX
	0x0009 00009 (swap.go:5)	CMPQ	SP, 16(CX)
	0x000d 00013 (swap.go:5)	JLS	67
	0x000f 00015 (swap.go:5)	FUNCDATA	$0, gclocals·3cadd97b66f25a3a642be35e9362338f(SB)
	0x000f 00015 (swap.go:5)	FUNCDATA	$1, gclocals·69c1753bd5f81501d95132d08af04464(SB)
	0x000f 00015 (swap.go:5)	MOVQ	"".i+32(FP), AX
	0x0014 00020 (swap.go:5)	MOVQ	"".t+16(FP), CX
	0x0019 00025 (swap.go:5)	CMPQ	AX, CX
	0x001c 00028 (swap.go:5)	JCC	$0, 60
	0x001e 00030 (swap.go:5)	MOVQ	"".t+8(FP), DX
	0x0023 00035 (swap.go:5)	MOVBLZX	(DX)(AX*1), BX
	0x0027 00039 (swap.go:5)	MOVQ	"".j+40(FP), SI
	0x002c 00044 (swap.go:5)	CMPQ	SI, CX
	0x002f 00047 (swap.go:5)	JCC	$0, 60
	0x0031 00049 (swap.go:5)	MOVBLZX	(DX)(SI*1), CX
	0x0035 00053 (swap.go:5)	MOVB	CL, (DX)(AX*1)
	0x0038 00056 (swap.go:5)	MOVB	BL, (DX)(SI*1)
	0x003b 00059 (swap.go:5)	RET
	0x003c 00060 (swap.go:5)	PCDATA	$0, $1
	0x003c 00060 (swap.go:5)	CALL	runtime.panicindex(SB)
	0x0041 00065 (swap.go:5)	UNDEF
	0x0043 00067 (swap.go:5)	NOP
	0x0043 00067 (swap.go:5)	CALL	runtime.morestack_noctxt(SB)
	0x0048 00072 (swap.go:5)	JMP	0

Swap assembly after:

"".T.Swap t=1 size=52 args=0x28 locals=0x0
	0x0000 00000 (swap.go:5)	TEXT	"".T.Swap(SB), $0-40
	0x0000 00000 (swap.go:5)	FUNCDATA	$0, gclocals·3cadd97b66f25a3a642be35e9362338f(SB)
	0x0000 00000 (swap.go:5)	FUNCDATA	$1, gclocals·69c1753bd5f81501d95132d08af04464(SB)
	0x0000 00000 (swap.go:5)	MOVQ	"".i+32(FP), AX
	0x0005 00005 (swap.go:5)	MOVQ	"".t+16(FP), CX
	0x000a 00010 (swap.go:5)	CMPQ	AX, CX
	0x000d 00013 (swap.go:5)	JCC	$0, 45
	0x000f 00015 (swap.go:5)	MOVQ	"".t+8(FP), DX
	0x0014 00020 (swap.go:5)	MOVBLZX	(DX)(AX*1), BX
	0x0018 00024 (swap.go:5)	MOVQ	"".j+40(FP), SI
	0x001d 00029 (swap.go:5)	CMPQ	SI, CX
	0x0020 00032 (swap.go:5)	JCC	$0, 45
	0x0022 00034 (swap.go:5)	MOVBLZX	(DX)(SI*1), CX
	0x0026 00038 (swap.go:5)	MOVB	CL, (DX)(AX*1)
	0x0029 00041 (swap.go:5)	MOVB	BL, (DX)(SI*1)
	0x002c 00044 (swap.go:5)	RET
	0x002d 00045 (swap.go:5)	PCDATA	$0, $1
	0x002d 00045 (swap.go:5)	CALL	runtime.panicindex(SB)
	0x0032 00050 (swap.go:5)	UNDEF

Change-Id: I57dad14af8aaa5e6112deac407cfadc2bfaf1f54
Reviewed-on: https://go-review.googlesource.com/24814
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-09-14 18:26:52 +00:00
David Crawshaw 6488201b9b cmd/internal/obj: regenerate stringer values
Created by running 'go generate'.

That made debugging fun today.

Change-Id: I9ffe00877851f2b198275220ad6058b9005daa72
Reviewed-on: https://go-review.googlesource.com/29117
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-09-13 23:35:14 +00:00
Lynn Boger b6946fb120 cmd/asm: ppc64le support for ISEL for use by SSA
This adds the support for the ppc64le isel instruction so
it can be used by SSA.

Fixed #16771

Change-Id: Ia2517f0834ff5e7ad927e218b84493e0106ab4a7
Reviewed-on: https://go-review.googlesource.com/28611
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-09-13 14:54:31 +00:00
Cherry Zhang 38d35e714a cmd/compile, runtime/internal/atomic: intrinsify And8, Or8 on ARM64
Also add assembly implementation, in case intrinsics is disabled.

Change-Id: Iff0a8a8ce326651bd29f6c403f5ec08dd3629993
Reviewed-on: https://go-review.googlesource.com/28979
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-09-13 02:09:15 +00:00
Michael Munday 43bdfa9337 cmd/asm, cmd/internal/obj/s390x: add new s390x instructions for SSA
This commit adds the following instructions to support the new SSA
backend for s390x:

32-bit operations:
ADDW
SUBW
NEGW
FNEGS

Conditional moves:
MOVDEQ
MOVDGE
MOVDGT
MOVDLE
MOVDLT
MOVDNE

Unordered branches (for floating point comparisons):
BLEU
BLTU

Modulo operations:
MODW
MODWU
MODD
MODDU

The modulo operations might be removed in a future commit because
I'd like to change DIV to produce a tuple once the old backend is
removed.

This commit also removes uses of REGZERO from the assembler. They
aren't necessary and R0 will be used as a GPR by SSA.

Change-Id: I05756c1cbb74bf4a35fc492f8f0cd34b50763dc9
Reviewed-on: https://go-review.googlesource.com/29075
Run-TryBot: Michael Munday <munday@ca.ibm.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-09-12 17:12:59 +00:00
David Crawshaw 276803d611 cmd/link: introduce a rel.ro segment
When internally linking with using rel.ro sections, this segment covers
the sections. To do this, move to other read-only sections, SELFROSECT
and SMACHOPLT, out of the way.

Part of adding PIE internal linking on linux/amd64.

Change-Id: I4fb3d180e92f7e801789ab89864010faf5a2cb6d
Reviewed-on: https://go-review.googlesource.com/28538
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-09-11 20:10:47 +00:00
Josh Bleecher Snyder 0e435347b1 cmd: fix format strings used with obj.Headtype
Found by vet. Introduced by CL 28853.

Change-Id: I3199e0cbdb1c512ba29eb7e4d5c1c98963f5a954
Reviewed-on: https://go-review.googlesource.com/28957
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-09-10 21:40:10 +00:00
David Crawshaw 3877f820a6 cmd/link, etc: introduce SymKind type
Moves the grouping of symbol kinds (sections) into cmd/internal/obj
to keep it near the definition. Groundwork for CL 28538.

Change-Id: I99112981e69b028f366e1333f31cd7defd4ff82c
Reviewed-on: https://go-review.googlesource.com/28691
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-09-09 16:56:03 +00:00
David Crawshaw 791f71d192 cmd: use obj.GOOS, obj.GOARCH, etc
As cmd/internal/obj is coordinating the definition of GOOS, GOARCH,
etc across the compiler and linker, turn its functions into globals
and use them everywhere.

Change-Id: I5db5addda3c6b6435c37fd5581c7c3d9a561f492
Reviewed-on: https://go-review.googlesource.com/28854
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2016-09-09 16:38:45 +00:00
David Crawshaw 428c349552 cmd/link, cmd/internal/obj: give Headtype a type
Separate out windows/windowsgui properly so we are not encoding some
of the Headtype value in a separate headstring global.

Remove one of the two copies of the variable from cmd/link.

Remove duplicate string to headtype list.

Change-Id: Ifa20fb9652a1dc95161e154aac11f15ad0f709d0
Reviewed-on: https://go-review.googlesource.com/28853
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-09-09 12:25:00 +00:00
Cherry Zhang 4354ffd38b cmd/compile: intrinsify Ctz, Bswap, and some atomics on ARM64
Change-Id: Ia5bf72b70e6f6522d6fb8cd050e78f862d37b5ae
Reviewed-on: https://go-review.googlesource.com/27936
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-09-08 19:45:25 +00:00
Ilya Tocar 04ade8e428 cmd/internal/obj/x86: Make VPSHUFD accept negative constant
This partially reverts commit 4e24e1d999.
Since in release 1.7 VPSHUFD support negative constant as an argument,
removing it as part of 4e24e1d999 was wrong.
Add it back.

Change-Id: Id1a3e062fe8fb4cf538edb3f9970f0664f3f545f
Reviewed-on: https://go-review.googlesource.com/27712
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-08-31 15:55:49 +00:00
Cherry Zhang f9dafc742d cmd/compile, runtime, etc: get rid of constant FP registers
On ARM64, MIPS64, and PPC64, some floating point registers were
reserved for constants 0, 1, 2, 0.5, etc. This CL removes them.

On ARM64, they are never used. On MIPS64 and PPC64, the only use
case is a multiplication-by-2 in the old backend of the compiler,
which is replaced with an addition.

Change-Id: I737cbf43283756e3408964fc88c567a938c57036
Reviewed-on: https://go-review.googlesource.com/28095
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-08-30 23:16:17 +00:00
Keith Randall 842b05832f all: use testing.GoToolPath instead of "go"
This change makes sure that tests are run with the correct
version of the go tool.  The correct version is the one that
we invoked with "go test", not the one that is first in our path.

Fixes #16577

Change-Id: If22c8f8c3ec9e7c35d094362873819f2fbb8559b
Reviewed-on: https://go-review.googlesource.com/28089
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-08-30 22:49:11 +00:00
Ethan Miller 4955147291 math/big: add assembly implementation of arith for ppc64{le}
The existing implementation used a pure go implementation, leading to slow
cryptographic performance.

Implemented mulWW, subVV, mulAddVWW, addMulVVW, and bitLen for
ppc64{le}.
Implemented divWW for ppc64le only, as the DIVDEU instruction is only
available on Power8 or newer.

benchcmp output:

benchmark                         old ns/op     new ns/op     delta
BenchmarkSignP384                 28934360      10877330      -62.41%
BenchmarkRSA2048Decrypt           41261033      5139930       -87.54%
BenchmarkRSA2048Sign              45231300      7610985       -83.17%
Benchmark3PrimeRSA2048Decrypt     20487300      2481408       -87.89%

Fixes #16621

Change-Id: If8b68963bb49909bde832f2bda08a3791c4f5b7a
Reviewed-on: https://go-review.googlesource.com/26951
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <munday@ca.ibm.com>
2016-08-29 21:03:21 +00:00
Emmanuel Odeke 7c04633e0c all: fix obsolete inferno-os links
Fixes #16911.

Fix obsolete inferno-os links, since code.google.com shutdown.
This CL points to the right files by replacing
http://code.google.com/p/inferno-os/source/browse
with
https://bitbucket.org/inferno-os/inferno-os/src/default

To implement the change I wrote and ran this script in the root:
$ grep -Rn 'http://code.google.com/p/inferno-os/source/browse' * \
| cut -d":" -f1 | while read F;do perl -pi -e \
's/http:\/\/code.google.com\/p\/inferno-os\/source\/browse/https:\/\/bitbucket.org\/inferno-os\/inferno-os\/src\/default/g'
$F;done

I excluded any cmd/vendor changes from the commit.

Change-Id: Iaaf828ac8f6fc949019fd01832989d00b29b6749
Reviewed-on: https://go-review.googlesource.com/27994
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-08-29 04:54:42 +00:00
Michael Munday d2dd0dfda8 cmd/internal/obj/s390x: add FIDBR and FIEBR instructions
FIDBR and FIEBR can be used for floating-point to integer rounding.
The relevant functions (Ceil, Floor and Trunc) will be updated
in a future CL.

Change-Id: I5952d67ab29d5ef8923ff1143e17a8d30169d692
Reviewed-on: https://go-review.googlesource.com/27826
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-08-26 17:23:16 +00:00
Michael Munday 266b349b2d cmd/internal/obj/s390x: add atomic operation instructions
Adds the following s390x instructions from the interlocked access
facility:

 * LAA(G)  - load and add
 * LAAL(G) - load and add logical
 * LAN(G)  - load and and
 * LAX(G)  - load and exclusive or
 * LAO(G)  - load and or

These instructions can be used for atomic arithmetic/logical
operations. The atomic packages will be updated in future CLs.

Change-Id: Idc850ac6749b3e778fda3da66bcd864f6b1df375
Reviewed-on: https://go-review.googlesource.com/27871
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-08-26 15:08:58 +00:00
Josh Bleecher Snyder 307de6540a cmd/compile/internal/obj/x86: clean up "is leaf?" check
Minor code cleanup. No functional changes.

Change-Id: I2e631b43b122174302a182a1a286c0f873851ce6
Reviewed-on: https://go-review.googlesource.com/24813
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-08-25 16:36:18 +00:00
Josh Bleecher Snyder 64e152910e cmd/internal/obj/x86: remove pointless NOPs
They are no longer needed by stkcheck.

Fixes #16057

Change-Id: I57cb55de5b7a7a1d31a3da200a3a2d51576b68f5
Reviewed-on: https://go-review.googlesource.com/26667
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-08-25 15:52:41 +00:00
Cherry Zhang e71e1fe87e cmd/compile: get MIPS64 SSA working
- implement *, /, %, shifts, Zero, Move.
- fix mistakes in comparison.
- fix floating point rounding.
- handle RetJmp in assembler (which was not handled, as a consequence
  Duff's device was disabled in the old backend.)

all.bash now passes with SSA on.

Updates #16359.

Change-Id: Ia14eed0ed1176b5d800592080c8f53dded7fe73f
Reviewed-on: https://go-review.googlesource.com/27592
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-08-25 12:53:36 +00:00
Dave Cheney d61c07ffd8 cmd/link/internal, cmd/internal/obj: introduce ctxt.Logf
Replace the various calls to Fprintf(ctxt.Bso, ...) with a helper,
ctxt.Logf. This also addresses the various inconsistent flushing of
ctxt.Bso.

Because we have two Link structures, add Link.Logf in both places.

Change-Id: I23093f9b9b3bf33089a0ffd7f815f92dcd1a1fa1
Reviewed-on: https://go-review.googlesource.com/27730
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2016-08-25 03:34:06 +00:00
Dave Cheney 3167ff7ca9 cmd/internal/*: only call ctx.Bso.Flush when something has been written
In many places where ctx.Bso.Flush is used as the target for some debug
logging, ctx.Bso.Flush is called unconditionally. In the majority of
cases where debug logging is not enabled, this means Flush is called
many times when there is nothing to be flushed (it will be called anyway
when ctx.Bso is eventually closed), sometimes in a loop.

Avoid this by moving the ctx.Bso.Flush call into the same condition
block as the debug print. This pattern was previously applied
sporadically.

Change-Id: I0444cb235cc8b9bac51a59b2e44e59872db2be06
Reviewed-on: https://go-review.googlesource.com/27579
Run-TryBot: Dave Cheney <dave@cheney.net>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2016-08-25 01:31:17 +00:00
Keith Randall a99f812cba cmd/objdump: implement objdump of .o files
Update goobj reader so it can provide all the information
necessary to disassemble .o (and .a) files.

Grab architecture of .o files from header.

.o files have relocations in them.  This CL also contains a simple
mechanism to disassemble relocations and add relocation info as an extra
column in the output.

Fixes #13862

Change-Id: I608fd253ff1522ea47f18be650b38d528dae9054
Reviewed-on: https://go-review.googlesource.com/24818
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-08-24 17:36:59 +00:00