Remove the runtime ismapkey check from makemap and
add a check that the map key type supports comparison
to the hmap construction in the compiler.
Move the ismapkey check for the reflect code path
into reflect_makemap.
Change-Id: I718f79b0670c05b63ef31721e72408f59ec4ae86
Reviewed-on: https://go-review.googlesource.com/61035
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Currently, we handle "x op= y" by rewriting as "x = x op y", while
ensuring that any calls or receive operations in 'x' are only
evaluated once. Notably, pointer indirection, indexing operations,
etc. are left alone as it's typically safe to re-evaluate those.
However, those operations were interleaved with evaluating 'y', which
could include function calls that might cause re-evaluation to yield
different memory addresses.
As a fix, simply ensure that we order side-effecting operations in 'y'
before either evaluation of 'x'.
Fixes#21687.
Change-Id: Ib14e77760fda9c828e394e8e362dc9e5319a84b2
Reviewed-on: https://go-review.googlesource.com/60091
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Do the similar thing to CL 55143 to reduce IMUL.
Change-Id: I1bd38f618058e3cd74fac181f003610ea13f2294
Reviewed-on: https://go-review.googlesource.com/56252
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Currently we only use 1 and 4 as a scale for indexed 4-byte load.
In code generated in #20711 we can use indexed load with scale=8,
to improve performance:
name old time/op new time/op delta
GM-6 108µs ± 0% 95µs ± 0% -12.06% (p=0.000 n=10+10)
So add new ops and combine loadidx1(shift 3..).. into loadidx8,
same for stores.
Change-Id: I5ed1c250ac40960e20606580cf9de221e75b72f1
Reviewed-on: https://go-review.googlesource.com/46134
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Normal shift rules plus constant folding are enough to generate
efficient shift-by-constant instructions.
Add test to make sure we don't generate comparisons for constant
shifts.
TODO: there are still constant shift rules on PPC64. If they
are removed, the constant folding rules are not enough to remove
all the test and mask stuff for constant shifts. Leave them in
for now.
Fixes#20663.
Change-Id: I724cc324aa8607762d0c8aacf9bfa641bda5c2a1
Reviewed-on: https://go-review.googlesource.com/60330
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Some instructions operating on <= 32 bits also zero out upper 32bits.
Remove zeroextensions of such values. Triggers a few times during
all.bash. Also removes ugly code like:
MOVL CX,CX
Change-Id: I66a46c190dd6929b7e3c52f3fe6b967768d00638
Reviewed-on: https://go-review.googlesource.com/58090
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This makes the name of the function to construct the map bucket type
consistent with runtimes naming and the existing hmap function.
Change-Id: If4d8b4a54c92ab914d4adcb96022b48d8b5db631
Reviewed-on: https://go-review.googlesource.com/59915
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Padding needed for map buckets is dependent on the types used to
construct the map bucket. In case of indirect keys or values pointers
are used in the map bucket to the keys or values.
Change the map bucket padding calculation to take the alignment of
the key and value types used to construct the map bucket into account
instead of the original key and value type.
Since pointers are always 32bit aligned on amd64p32 this prevents
adding unneeded padding in case the key or value would have needed
64bit alignment without indirect referencing.
Change-Id: I7943448e882d269b5cff7e921a2a6f3430c50878
Reviewed-on: https://go-review.googlesource.com/60030
Reviewed-by: Keith Randall <khr@golang.org>
Check map invariants, type size and alignments during compile time.
Keep runtime checks for reflect by adding them to reflect_makemap.
Change-Id: Ia28610626591bf7fafb7d5a1ca318da272e54879
Reviewed-on: https://go-review.googlesource.com/59914
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
The assembler barfs on large offsets. Make sure that all the
instructions that need to have their offsets in an int32
1) check on any rule that computes offsets for such instructions
2) change their aux fields so the check builder checks it.
The assembler also silently misassembled offsets between 1<<31
and 1<<32. Add a check in the assembler to barf on those as well.
Fixes#21655
Change-Id: Iebf24bf10f9f37b3ea819ceb7d588251c0f46d7d
Reviewed-on: https://go-review.googlesource.com/59630
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Just to make it clearer which regexps are positive and which
regexps are negative.
Change-Id: Ia190e89be28048fcae2491506f552afad90a5f85
Reviewed-on: https://go-review.googlesource.com/59490
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
We can rematerialize only ops that have SP or SB as their only argument.
There are some ADDQconst(SP) that can be rematerialized, but are spilled/filled instead,
so mark addconst as rematerializeable. This shaves ~1kb from go tool.
Change-Id: Ib4cf4fe5f2ec9d3d7e5f0f77f1193eba66ca2f08
Reviewed-on: https://go-review.googlesource.com/54393
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
The SSA compiler currently generates MOVOstore instructions
to optimize 16 bytes moves on AMD64 architecture.
However, we can't use the MOVOstore instruction on Plan 9,
because floating point operations are not allowed in the
note handler.
We rely on the useSSE flag to disable the use of the
MOVOstore instruction on Plan 9 and replace it by two
MOVQstore instructions.
Fixes#21625
Change-Id: Idfefcceadccafe1752b059b5fe113ce566c0e71c
Reviewed-on: https://go-review.googlesource.com/59171
Run-TryBot: David du Colombier <0intro@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
Most of these are return values that were part of a receiving parameter,
so they're still accessible.
A few others are not, but those have never had a use.
Found with github.com/mvdan/unparam, after Kevin Burke's suggestion that
the tool should also warn about unused result parameters.
Change-Id: Id8b5ed89912a99db22027703a88bd94d0b292b8b
Reviewed-on: https://go-review.googlesource.com/55910
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
One example of a heavily-used zero-size value is encoding/binary.BigEndian.
Change-Id: I8e873c447e154ab2ca61b7315df774693891270c
Reviewed-on: https://go-review.googlesource.com/59330
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Went mainly for the ones that make no sense, such as the ones
mid-sentence or after commas.
Change-Id: Ie245d2c19cc7428a06295635cf6a9482ade25ff0
Reviewed-on: https://go-review.googlesource.com/57293
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Fix two small but serious bugs in the DWARF location list code that
should have been caught by the automated tests I didn't write.
After emitting debug information for a user variable, mark it as done
so that it doesn't get emitted again. Otherwise it would be written once
per slot it was decomposed into.
Correct a merge error in CL 44350: the location list abbreviations need
to have DW_AT_decl_line too, otherwise the resulting DWARF is gibberish.
Change-Id: I6ab4b8b32b7870981dac80eadf0ebfc4015ccb01
Reviewed-on: https://go-review.googlesource.com/59070
Run-TryBot: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Sometimes (often for calls) we generate code like this:
MOVQ (addr),AX
MOVQ 8(addr),BX
MOVQ AX,(otheraddr)
MOVQ BX,8(otheraddr)
Replace it with
MOVUPS (addr),X0
MOVUPS X0,(otheraddr)
For completeness do the same for 8,16,32-bit loads/stores too.
Shaves 1% from code sections of go tool.
/localdisk/itocar/golang/bin/go 10293917
go_old 10334877 [40960 bytes]
read-only data = 682 bytes (0.040769%)
global text (code) = 38961 bytes (1.036503%)
Total difference 39643 bytes (0.674628%)
Updates #6853
Change-Id: I1f0d2f60273a63a079b58927cd1c4e3429d2e7ae
Reviewed-on: https://go-review.googlesource.com/57130
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Implement int reg <-> fp reg moves on amd64.
If we see a load to int reg followed by an int->fp move, then we can just
load to the fp reg instead. Same for stores.
math.Abs is now:
MOVQ "".x+8(SP), AX
SHLQ $1, AX
SHRQ $1, AX
MOVQ AX, "".~r1+16(SP)
math.Copysign is now:
MOVQ "".x+8(SP), AX
SHLQ $1, AX
SHRQ $1, AX
MOVQ "".y+16(SP), CX
SHRQ $63, CX
SHLQ $63, CX
ORQ CX, AX
MOVQ AX, "".~r2+24(SP)
math.Float64bits is now:
MOVSD "".x+8(SP), X0
MOVSD X0, "".~r1+16(SP)
(it would be nicer to use a non-SSE reg for this, nothing is perfect)
And due to the fix for #21440, the inlined version of these improve as well.
name old time/op new time/op delta
Abs 1.38ns ± 5% 0.89ns ±10% -35.54% (p=0.000 n=10+10)
Copysign 1.56ns ± 7% 1.35ns ± 6% -13.77% (p=0.000 n=9+10)
Fixes#13095
Change-Id: Ibd7f2792412a6668608780b0688a77062e1f1499
Reviewed-on: https://go-review.googlesource.com/58732
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
Since golang.org/cl/31670, we've stopped using the 'embedded' function
for handling struct embeddings within package export data. Now the
only remaining use is for Go source files, which allows for some
substantial simplifications:
1. CenterDot never appears within Go source files, so that logic can
simply be removed.
2. The field name will always be declared in the local package.
Passes toolstash-check.
Change-Id: I59505f62824206dd5de0782918f98fbef6e93224
Reviewed-on: https://go-review.googlesource.com/58790
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
Discovered while debugging CL 53644.
No test case because these are purely internal conversions that should
never end up resulting in compiler warnings or even generated code.
Updates #19683.
Change-Id: I0d9333ef2c963fa22eb9b5335bb022bcc9b25708
Reviewed-on: https://go-review.googlesource.com/58190
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
When we remove a nil check, add it back to the free Value pool immediately.
Fixes#18732
Change-Id: I8d644faabbfb52157d3f2d071150ff0342ac28dc
Reviewed-on: https://go-review.googlesource.com/58810
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
CL 54410 and CL 56250 recently added use of the MOVOstore
instruction to improve performance.
However, we can't use the MOVOstore instruction on Plan 9,
because floating point operations are not allowed in the
note handler.
This change adds a configuration flag useSSE to enable the
use of SSE instructions for non-floating point operations.
This flag is enabled by default and disabled on Plan 9.
When this flag is disabled, the MOVOstore instruction is
not used and the MOVQstoreconst instruction is used instead.
Fixes#21599
Change-Id: Ie609e5d9b82ec0092ae874bab4ce01caa5bc8fb8
Reviewed-on: https://go-review.googlesource.com/58850
Reviewed-by: Keith Randall <khr@golang.org>
For code like the following (where x escapes):
x := []int{1}
We're currently generating a nil check. The line above is really 3 operations:
t := new([1]int)
t[0] = 1
x := t[:]
We remove the nil check for t[0] = 1, but not for t[:].
Our current nil check removal rule is too strict about the possible
memory arguments of the nil check. Unlike zeroing or storing to the
result of runtime.newobject, the nilness of runtime.newobject is
always false, even after other stores have happened in the meantime.
Change-Id: I95fad4e3a59c27effdb37c43ea215e18f30b1e5f
Reviewed-on: https://go-review.googlesource.com/58711
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
This is a crude compiler pass to eliminate stores to auto variables
that are only ever written to.
Eliminates an unnecessary store to x from the following code:
func f() int {
var x := 1
return *(&x)
}
Fixes#19765.
Change-Id: If2c63a8ae67b8c590b6e0cc98a9610939a3eeffa
Reviewed-on: https://go-review.googlesource.com/38746
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This prevents unnecessary reg-reg moves during pointer arithmetic.
This change reduces the size of the full hello world binary by 0.4%.
Updates #21572
Change-Id: Ia0427021e5c94545a0dbd83a6801815806e5b12d
Reviewed-on: https://go-review.googlesource.com/58371
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
mkcall is used to construct calls to builtin functions.
Instead of silently ignoring any additional arguments to mkcall
abort compilation with an error.
This protects against accidentally supplying too many arguments to mkcall
when compiler changes are made.
Change appendslice and copyany to construct calls to
slicestringcopy and slicecopy explicitly instead of
relying on the old behavior as a feature.
Change-Id: I3cfe815a57d454a129e3c08aac824f6107779a42
Reviewed-on: https://go-review.googlesource.com/57770
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Where possible generate calls to runtime makemap with int hint argument
during compile time instead of makemap with int64 hint argument.
This eliminates converting the hint argument for calls to makemap with
int64 hint argument for platforms where int64 values do not fit into
an argument of type int.
A similar optimization for makeslice was introduced in CL
golang.org/cl/27851.
386:
name old time/op new time/op delta
NewEmptyMap 53.5ns ± 5% 41.9ns ± 5% -21.56% (p=0.000 n=10+10)
NewSmallMap 182ns ± 1% 165ns ± 1% -8.92% (p=0.000 n=10+10)
Change-Id: Ibd2b4c57b36f171b173bf7a0602b3a59771e6e44
Reviewed-on: https://go-review.googlesource.com/55142
Reviewed-by: Keith Randall <khr@golang.org>
This change adds to the code-generation harness in asm_test.go support
for the use of a '$' placeholder name for test functions.
A few of uninformative function names are also changed to use the
placeholder, to confirm that the change works as expected.
Fixes#21500
Change-Id: Iba168bd85efc9822253305d003b06682cf8a6c5c
Reviewed-on: https://go-review.googlesource.com/57292
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Some debuggers use the declaration line to avoid showing variables
before they're declared. Emit them for local variables and function
parameters.
DW_AT_decl_file would be nice too, but since its value is an index
into a table built by the linker, that's dramatically harder. In
practice, with inlining disabled it's safe to assume that all a
function's variables are declared in the same file, so this should still
be pretty useful.
Change-Id: I8105818c8940cd71bc5473ec98797cce2f3f9872
Reviewed-on: https://go-review.googlesource.com/44350
Reviewed-by: David Chase <drchase@google.com>
eqstring is only called for strings with equal lengths.
Instead of pushing a pointer and length for each argument string
on the stack we can omit pushing one of the lengths on the stack.
Changing eqstrings signature to eqstring(*uint8, *uint8, int) bool
to implement the above optimization would make it very similar to the
existing memequal(*any, *any, uintptr) bool function.
Since string lengths are positive we can avoid code redundancy and
use memequal instead of using eqstring with an optimized signature.
go command binary size reduced by 4128 bytes on amd64.
name old time/op new time/op delta
CompareStringEqual 6.03ns ± 1% 5.71ns ± 1% -5.23% (p=0.000 n=19+18)
CompareStringIdentical 2.88ns ± 1% 3.22ns ± 7% +11.86% (p=0.000 n=20+20)
CompareStringSameLength 4.31ns ± 1% 4.01ns ± 1% -7.17% (p=0.000 n=19+19)
CompareStringDifferentLength 0.29ns ± 2% 0.29ns ± 2% ~ (p=1.000 n=20+20)
CompareStringBigUnaligned 64.3µs ± 2% 64.1µs ± 3% ~ (p=0.164 n=20+19)
CompareStringBig 61.9µs ± 1% 61.6µs ± 2% -0.46% (p=0.033 n=20+19)
Change-Id: Ice15f3b937c981f0d3bc8479a9ea0d10658ac8df
Reviewed-on: https://go-review.googlesource.com/53650
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
If an error was already printed during LHS conversion step, we don't reprint
the "cannot convert" error.
In particular, this prevents `_ = int("1")` (and all similar casts) from
resulting in multiple identical error messages being printed.
Fixes#20812.
Change-Id: If6e52c59eab438599d641ecf6f110ebafca740a9
Reviewed-on: https://go-review.googlesource.com/46912
Reviewed-by: Robert Griesemer <gri@golang.org>
On arm64, all boolean-generating instructions (CSET, etc.) set the upper
63 bits of the destination register to zero, so there is no need
to zero-extend the lower 8 bits again.
Fixes#21445
Change-Id: I3b176baab706eb684105400bacbaa24175f721f3
Reviewed-on: https://go-review.googlesource.com/55671
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
We can add a constant to loaction in memory with 1 instruction,
as opposed to load+add+store, so add a new op and relevent ssa rules.
Triggers in e. g. encoding/json isValidNumber:
NumberIsValid-6 36.4ns ± 0% 35.2ns ± 1% -3.32% (p=0.000 n=6+10)
Shaves ~2.5 kb from go tool.
Change-Id: I7ba576676c2522432360f77b290cecb9574a93c3
Reviewed-on: https://go-review.googlesource.com/54431
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
All of these are uints of different sizes, so checking >= 0 or < 0 are
effectively no-ops.
Found with staticcheck.
Change-Id: I16ac900eb7007bc8f9018b302136d42e483a4180
Reviewed-on: https://go-review.googlesource.com/56950
Reviewed-by: Matt Layher <mdlayher@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Matt Layher <mdlayher@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Normally 64-bit div/mod is turned into runtime calls on 32-bit
arch, but the front end leaves power-of-two constant division
and hopes the SSA backend turns into a shift or AND. The SSA rule is
(Mod64u <t> n (Const64 [c])) && isPowerOfTwo(c) -> (And64 n (Const64 <t> [c-1]))
But isPowerOfTwo returns true only for positive int64, which leaves
out 1<<63 unhandled. Add a special case for 1<<63.
Fixes#21517.
Change-Id: I02d27dc7177d4af0ee8d7f5533714edecddf8c95
Reviewed-on: https://go-review.googlesource.com/56890
Reviewed-by: Keith Randall <khr@golang.org>
Just to get rid of lots of .Name() stutter in printf calls.
Change-Id: I86cf00b3f7b2172387a1c6a7f189c1897fab6300
Reviewed-on: https://go-review.googlesource.com/56630
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
We already combine const stores up-to MOVQstoreconst.
Combine 2 64-bit stores of const zero into 1 sse store of 128-bit zero.
Shaves significant (>1%) amount of code from go tool:
/localdisk/itocar/golang/bin/go 10334877
go_old 10388125 [53248 bytes]
global text (code) = 51041 bytes (1.343944%)
read-only data = 663 bytes (0.039617%)
Total difference 51704 bytes (0.873981%)
Change-Id: I7bc40968023c3a69f379b10fbb433cdb11364f1b
Reviewed-on: https://go-review.googlesource.com/56250
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
Reviewed-by: Keith Randall <khr@golang.org>
Prioritized the chunks of code with 8 or more levels of indentation.
Basically early breaks/returns and joining nested ifs.
Change-Id: I6817df1303226acf2eb904a29f2db720e4f7427a
Reviewed-on: https://go-review.googlesource.com/55630
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
There are a few cases where this can be useful. Apart from the obvious
(and silly)
100*n + 200*n
where we generate one IMUL instead of two, consider:
15*n + 31*n
Currently, the compiler strength-reduces both imuls, generating:
0x0000 00000 MOVQ "".n+8(SP), AX
0x0005 00005 MOVQ AX, CX
0x0008 00008 SHLQ $4, AX
0x000c 00012 SUBQ CX, AX
0x000f 00015 MOVQ CX, DX
0x0012 00018 SHLQ $5, CX
0x0016 00022 SUBQ DX, CX
0x0019 00025 ADDQ CX, AX
0x001c 00028 MOVQ AX, "".~r1+16(SP)
0x0021 00033 RET
But combining the imuls is both faster and shorter:
0x0000 00000 MOVQ "".n+8(SP), AX
0x0005 00005 IMULQ $46, AX
0x0009 00009 MOVQ AX, "".~r1+16(SP)
0x000e 00014 RET
even without strength-reduction.
Moreover, consider:
5*n + 7*(n+1) + 11*(n+2)
We already have a rule that rewrites 7(n+1) into 7n+7, so the
generated code (without imuls merging) looks like this:
0x0000 00000 MOVQ "".n+8(SP), AX
0x0005 00005 LEAQ (AX)(AX*4), CX
0x0009 00009 MOVQ AX, DX
0x000c 00012 NEGQ AX
0x000f 00015 LEAQ (AX)(DX*8), AX
0x0013 00019 ADDQ CX, AX
0x0016 00022 LEAQ (DX)(CX*2), CX
0x001a 00026 LEAQ 29(AX)(CX*1), AX
0x001f 00031 MOVQ AX, "".~r1+16(SP)
But with imuls merging, the 5n, 7n and 11n factors get merged, and the
generated code looks like this:
0x0000 00000 MOVQ "".n+8(SP), AX
0x0005 00005 IMULQ $23, AX
0x0009 00009 ADDQ $29, AX
0x000d 00013 MOVQ AX, "".~r1+16(SP)
0x0012 00018 RET
Which is both faster and shorter; that's also the exact same code that
clang and the intel c compiler generate for the above expression.
Change-Id: Ib4d5503f05d2f2efe31a1be14e2fe6cac33730a9
Reviewed-on: https://go-review.googlesource.com/55143
Reviewed-by: Keith Randall <khr@golang.org>
This reduces the code footprint of code like:
println("foo=", foo, "bar=", bar)
which is fairly common in the runtime.
Prior to this change, this makes function calls to print each of:
"foo=", " ", foo, " ", "bar=", " ", bar, "\n"
After this change, this prints:
"foo= ", foo, " bar= ", bar, "\n"
This shrinks the hello world binary by 0.4%.
More importantly, this improves the instruction
density of important runtime routines.
Change-Id: I8971bdf5382fbaaf4a82bad4442f9da07c28d395
Reviewed-on: https://go-review.googlesource.com/55098
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Rather than emitting spaces and newlines for println
as we walk the expression, construct it all up front.
This enables further optimizations.
This requires using printstring instead of print in
the implementation of printsp and printnl,
on pain of infinite recursion.
That's ok; it's more efficient anyway, and just as simple.
While we're here, do it for other print routines as well.
Change-Id: I61d7df143810e00710c4d4d948d904007a7fd190
Reviewed-on: https://go-review.googlesource.com/55097
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
I noticed that we don't set an itab's function pointers at compile
time. Instead, we currently do it at executable startup.
Set the function pointers at compile time instead. This shortens
startup time. It has no effect on normal binary size. Object files
will have more relocations, but that isn't a big deal.
For PIE there are additional pointers that will need to be adjusted at
load time. There are already other pointers in an itab that need to be
adjusted, so the cache line will already be paged in. There might be
some binary size overhead to mark these pointers. The "go test -c
-buildmode=pie net/http" binary is 0.18% bigger.
Update #20505
Change-Id: I267c82489915b509ff66e512fc7319b2dd79b8f7
Reviewed-on: https://go-review.googlesource.com/44341
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
Add support for generating TBZ/TBNZ instructions.
The bit-test-and-branch pattern shows up in a number of
important places, including the runtime (gc bitmaps).
Before this change, there were 3 TB[N]?Z instructions in the Go tool,
all of which were in hand-written assembly. After this change, there
are 285. Also, the go1 benchmark binary gets about 4.5kB smaller.
Fixes#21361
Change-Id: I170c138b852754b9b8df149966ca5e62e6dfa771
Reviewed-on: https://go-review.googlesource.com/54470
Run-TryBot: Philip Hofer <phofer@umich.edu>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
We don't use it any more, remove it.
Change-Id: I76ce1a4c2e7048fdd13a37d3718b5abf39ed9d26
Reviewed-on: https://go-review.googlesource.com/44474
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Just use fun[0]==0 to indicate a bad itab.
Change-Id: I28ecb2d2d857090c1ecc40b1d1866ac24a844848
Reviewed-on: https://go-review.googlesource.com/44473
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Keep itabs in a growable hash table.
Use a simple open-addressable hash table, quadratic probing, power
of two sized.
Synchronization gets a bit more tricky. The common read path now
has two atomic reads, one to get the table pointer and one to read
the entry out of the table.
I set the max load factor to 75%, kind of arbitrarily. There's a
space-speed tradeoff here, and I'm not sure where we should land.
Because we use open addressing the itab.link field is no longer needed.
I'll remove it in a separate CL.
Fixes#20505
Change-Id: Ifb3d9a337512d6cf968c1fceb1eeaf89559afebf
Reviewed-on: https://go-review.googlesource.com/44472
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
This reverts commit f612cd704a.
Reason for revert: We thought the original change had broken the
linux/amd64 and linux/386 builders, but it turned out to be a problem
with the build infrastructure, not the change.
Change-Id: Ic3318a63464fcba8d845ac04494115a7ba620364
Reviewed-on: https://go-review.googlesource.com/55050
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
This implements trunc, floor, and ceil in the math package
as intrinsics on ppc64x. Significant improvement mainly due
to avoiding call overhead of args and return value.
BenchmarkCeil-16 5.95 0.69 -88.40%
BenchmarkFloor-16 5.95 0.69 -88.40%
BenchmarkTrunc-16 5.82 0.69 -88.14%
Updates #21390
Change-Id: I951e182694f6e0c431da79c577272b81fb0ebad0
Reviewed-on: https://go-review.googlesource.com/54654
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
Reviewed-by: David Chase <drchase@google.com>
This reverts commit f0b3626904.
Reason for revert: this change caused the runtime tests on all linux/amd64 and linux/386 builders to timeout
Change-Id: Idf8cfdfc84540e21e8da403e74df5596a1d9327b
Reviewed-on: https://go-review.googlesource.com/54490
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Mostly node and position parameters that are no longer used.
Also remove an unnecessary node variable while at it.
Found with github.com/mvdan/unparam.
Change-Id: I88f9bd5d20bfc5b0f6f63ea81869daa246175061
Reviewed-on: https://go-review.googlesource.com/54130
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
If we've already imported a named type, then there's no need to
process its associated methods except to validate that the signature
matches the existing known method.
However, the current import code still creates a new function node for
each method, saves its inline body (if any), and adds the node to the
global importlist. Because of this, the duplicate methods are never
garbage collected.
This CL changes the compiler to avoid amassing uncollectable garbage
or performing any unnecessary processing.
This is particularly noticeable for protobuf-heavy code. For the
motivating Go package, this CL reduced compile max-RSS from ~12GB to
~3GB and compile time from ~65s to ~50s.
Passes toolstash -cmp for std, cmd, and k8s.io/kubernetes/cmd/....
Change-Id: Ib53ba9f2ad3212995671cf6ba220ee8a56d8d009
Reviewed-on: https://go-review.googlesource.com/51331
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
These rules trigger a few times during make.bash.
When we eliminate boundedness checks from walk.go
we'll rely on them more heavily.
Updates #19692
Change-Id: I268c36ae2f1401c68dd685b15f2d30f5d6971176
Reviewed-on: https://go-review.googlesource.com/43775
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
gc.Sysfunc must not be called concurrently.
We set up runtime routines used by the backend
prior to doing any backend compilation.
I missed the 387 ones; fix that.
Sysfunc should have been unexported during 1.9.
I will rectify that in a subsequent CL.
Fixes#21352
Change-Id: I8386eaa1e05879c25c672b9c9fc693c938e9aeb6
Reviewed-on: https://go-review.googlesource.com/54090
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Avelino <t@avelino.xxx>
TryBot-Result: Gobot Gobot <gobot@golang.org>
ADDSDmem comment said f32 (likely a copy-paste mistake).
Also swap ADDSSmem and ADDSDmem positions in the list to uniform the
list order.
Fixes#21225
Change-Id: I26bb116900c1cf4c4e6faeef613d7318c9c85b98
Reviewed-on: https://go-review.googlesource.com/52071
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Keith Randall <khr@golang.org>
For address of an auto or arg, on all non-x86 architectures
the assembler backend encodes the actual SP offset in the
instruction but leaves the offset in Prog unchanged. When the
assembly is printed in compile -S, it shows an offset
relative to pseudo FP/SP with an actual hardware SP base
register (e.g. R13 on ARM). This is confusing. Unset the
base register if it is indeed SP, so the assembly output is
consistent. If the base register isn't SP, it should be an
error and the error output contains the actual base register.
For address loading instructions, the base register isn't set
in the compiler on non-x86 architectures. Set it. Normally it
is SP and will be unset in the change mentioned above for
printing. If it is not, it will be an error and the error
output contains the actual base register.
No change in generated binary, only printed assembly. Passes
"go build -a -toolexec 'toolstash -cmp' std cmd" on all
architectures.
Fixes#21064.
Change-Id: Ifafe8d5f9b437efbe824b63b3cbc2f5f6cdc1fd5
Reviewed-on: https://go-review.googlesource.com/49432
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Debuggers use DWARF information to find local variables on the
stack and in registers. Prior to this CL, the DWARF information for
functions claimed that all variables were on the stack at all times.
That's incorrect when optimizations are enabled, and results in
debuggers showing data that is out of date or complete gibberish.
After this CL, the compiler is capable of representing variable
locations more accurately, and attempts to do so. Due to limitations of
the SSA backend, it's not possible to be completely correct.
There are a number of problems in the current design. One of the easier
to understand is that variable names currently must be attached to an
SSA value, but not all assignments in the source code actually result
in machine code. For example:
type myint int
var a int
b := myint(int)
and
b := (*uint64)(unsafe.Pointer(a))
don't generate machine code because the underlying representation is the
same, so the correct value of b will not be set when the user would
expect.
Generating the more precise debug information is behind a flag,
dwarflocationlists. Because of the issues described above, setting the
flag may not make the debugging experience much better, and may actually
make it worse in cases where the variable actually is on the stack and
the more complicated analysis doesn't realize it.
A number of changes are included:
- Add a new pseudo-instruction, RegKill, which indicates that the value
in the register has been clobbered.
- Adjust regalloc to emit RegKills in the right places. Significantly,
this means that phis are mixed with StoreReg and RegKills after
regalloc.
- Track variable decomposition in ssa.LocalSlots.
- After the SSA backend is done, analyze the result and build location
lists for each LocalSlot.
- After assembly is done, update the location lists with the assembled
PC offsets, recompose variables, and build DWARF location lists. Emit the
list as a new linker symbol, one per function.
- In the linker, aggregate the location lists into a .debug_loc section.
TODO:
- currently disabled for non-X86/AMD64 because there are no data tables.
go build -toolexec 'toolstash -cmp' -a std succeeds.
With -dwarflocationlists false:
before: f02812195637909ff675782c0b46836a8ff01976
after: 06f61e8112a42ac34fb80e0c818b3cdb84a5e7ec
benchstat -geomean /tmp/220352263 /tmp/621364410
completed 15 of 15, estimated time remaining 0s (eta 3:52PM)
name old time/op new time/op delta
Template 199ms ± 3% 198ms ± 2% ~ (p=0.400 n=15+14)
Unicode 96.6ms ± 5% 96.4ms ± 5% ~ (p=0.838 n=15+15)
GoTypes 653ms ± 2% 647ms ± 2% ~ (p=0.102 n=15+14)
Flate 133ms ± 6% 129ms ± 3% -2.62% (p=0.041 n=15+15)
GoParser 164ms ± 5% 159ms ± 3% -3.05% (p=0.000 n=15+15)
Reflect 428ms ± 4% 422ms ± 3% ~ (p=0.156 n=15+13)
Tar 123ms ±10% 124ms ± 8% ~ (p=0.461 n=15+15)
XML 228ms ± 3% 224ms ± 3% -1.57% (p=0.045 n=15+15)
[Geo mean] 206ms 377ms +82.86%
name old user-time/op new user-time/op delta
Template 292ms ±10% 301ms ±12% ~ (p=0.189 n=15+15)
Unicode 166ms ±37% 158ms ±14% ~ (p=0.418 n=15+14)
GoTypes 962ms ± 6% 963ms ± 7% ~ (p=0.976 n=15+15)
Flate 207ms ±19% 200ms ±14% ~ (p=0.345 n=14+15)
GoParser 246ms ±22% 240ms ±15% ~ (p=0.587 n=15+15)
Reflect 611ms ±13% 587ms ±14% ~ (p=0.085 n=15+13)
Tar 211ms ±12% 217ms ±14% ~ (p=0.355 n=14+15)
XML 335ms ±15% 320ms ±18% ~ (p=0.169 n=15+15)
[Geo mean] 317ms 583ms +83.72%
name old alloc/op new alloc/op delta
Template 40.2MB ± 0% 40.2MB ± 0% -0.15% (p=0.000 n=14+15)
Unicode 29.2MB ± 0% 29.3MB ± 0% ~ (p=0.624 n=15+15)
GoTypes 114MB ± 0% 114MB ± 0% -0.15% (p=0.000 n=15+14)
Flate 25.7MB ± 0% 25.6MB ± 0% -0.18% (p=0.000 n=13+15)
GoParser 32.2MB ± 0% 32.2MB ± 0% -0.14% (p=0.003 n=15+15)
Reflect 77.8MB ± 0% 77.9MB ± 0% ~ (p=0.061 n=15+15)
Tar 27.1MB ± 0% 27.0MB ± 0% -0.11% (p=0.029 n=15+15)
XML 42.7MB ± 0% 42.5MB ± 0% -0.29% (p=0.000 n=15+15)
[Geo mean] 42.1MB 75.0MB +78.05%
name old allocs/op new allocs/op delta
Template 402k ± 1% 398k ± 0% -0.91% (p=0.000 n=15+15)
Unicode 344k ± 1% 344k ± 0% ~ (p=0.715 n=15+14)
GoTypes 1.18M ± 0% 1.17M ± 0% -0.91% (p=0.000 n=15+14)
Flate 243k ± 0% 240k ± 1% -1.05% (p=0.000 n=13+15)
GoParser 327k ± 1% 324k ± 1% -0.96% (p=0.000 n=15+15)
Reflect 984k ± 1% 982k ± 0% ~ (p=0.050 n=15+15)
Tar 261k ± 1% 259k ± 1% -0.77% (p=0.000 n=15+15)
XML 411k ± 0% 404k ± 1% -1.55% (p=0.000 n=15+15)
[Geo mean] 439k 755k +72.01%
name old text-bytes new text-bytes delta
HelloSize 694kB ± 0% 694kB ± 0% -0.00% (p=0.000 n=15+15)
name old data-bytes new data-bytes delta
HelloSize 5.55kB ± 0% 5.55kB ± 0% ~ (all equal)
name old bss-bytes new bss-bytes delta
HelloSize 133kB ± 0% 133kB ± 0% ~ (all equal)
name old exe-bytes new exe-bytes delta
HelloSize 1.04MB ± 0% 1.04MB ± 0% ~ (all equal)
Change-Id: I991fc553ef175db46bb23b2128317bbd48de70d8
Reviewed-on: https://go-review.googlesource.com/41770
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
When the compiler decomposes a user variable, track its origin so that
it can be recomposed during DWARF generation.
Change-Id: Ia71c7f8e7f4d65f0652f1c97b0dda5d9cad41936
Reviewed-on: https://go-review.googlesource.com/50878
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Fix an oversight in decompose that caused floats to be missing from the
Names list.
Change-Id: I5db9c9498e9a4421742389eb929752fdac873b38
Reviewed-on: https://go-review.googlesource.com/50877
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
When we start tracking the mapping from Value to Prog, valueProgs will
be confusing. Disambiguate.
Change-Id: Ib3b302fedb7eb0ff1bde789d70a11656d82f0897
Reviewed-on: https://go-review.googlesource.com/50876
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
After we track decomposition, offset could mean stack offset or offset
in recomposed variable. Disambiguate.
Change-Id: I4d810b8c0dcac7a4ec25ac1e52898f55477025df
Reviewed-on: https://go-review.googlesource.com/50875
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
It is possible to have an unexported name with a nil package,
for an embedded field whose type is a pointer to an unexported type.
We must encode that fact in the type..namedata symbol name,
to avoid incorrectly merging an unexported name with an exported name.
Fixes#21120
Change-Id: I2e3879d77fa15c05ad92e0bf8e55f74082db5111
Reviewed-on: https://go-review.googlesource.com/50710
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Load/store-merging and move optimizations can result in unaligned
memory accesses. This is fine so long as the load/store instruction
used does not take a relative offset. In the SSA rules this means we
must not merge (MOVDaddr (SB)) ops into loads/stores unless we can
guarantee the alignment of the target.
Fixes#21048.
Change-Id: I70f13a62a148d5f0a56e704e8f76e36b4a4226d9
Reviewed-on: https://go-review.googlesource.com/49250
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Framepointer is the default now. Only print an X: list
if the settings are _not_ the default.
Before:
$ go tool compile -V
compile version devel +a5f30d9508 Sun Jul 16 14:43:48 2017 -0400 X:framepointer
$ go1.8 tool compile -V
compile version go1.8 X:framepointer
$
After:
$ go tool compile -V
compile version devel +a5f30d9508 Sun Jul 16 14:43:48 2017 -0400
$ go1.9 tool compile -V # imagined
compile version go1.9
$
Perpetuates #18317.
Change-Id: I981ba5c62be32e650a166fc9740703122595639b
Reviewed-on: https://go-review.googlesource.com/49252
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Previous code failed to account for particular control flow
involving nested loops when updating phi function inputs.
Fix involves:
1) remove incorrect shortcut
2) generate a "better" order for children in dominator tree
3) note inner-loop updates and check before applying
outer-loop updates.
Fixes#20675.
Change-Id: I2fe21470604b5c259e777ad8b15de95f7706894d
Reviewed-on: https://go-review.googlesource.com/45791
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
When a local variable is moved to the heap the declaration position
should be preserved so that later on we can assign it to the correct
DW_TAG_lexical_block.
Fixes#20959
Change-Id: I3700ef53c68ccd506d0633f11374ad88a52b2898
Reviewed-on: https://go-review.googlesource.com/47852
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
If the LHS is unassignable, there's no point in trying to make sure
the RHS can be assigned to it or making sure they're realizable
types. This is consistent with go/types.
In particular, this prevents "1 = 2" from causing a panic when "1"
still ends up with the type "untyped int", which is not realizable.
Fixes#20813.
Change-Id: I4710bdaac2e375ef12ec29b888b8ac84fb640e56
Reviewed-on: https://go-review.googlesource.com/46835
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
Fixes crash when printing a related error message later on.
Fixes#20789.
Change-Id: I6d2c35aafcaeda26a211fc6c8b7dfe4a095a3efe
Reviewed-on: https://go-review.googlesource.com/46713
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Loops of the form "for i,e := range" needed to have their
condition rotated to the "bottom" for the preemptible loops
GOEXPERIMENT, but this caused a performance regression
because it degraded bounds check removal. For now, make
the loop rotation/guarding conditional on the experiment.
Fixes#20711.
Updates #10958.
Change-Id: Icfba14cb3b13a910c349df8f84838cf4d9d20cf6
Reviewed-on: https://go-review.googlesource.com/46410
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>