The location list for OpArg starts where the OpArg appears;
this is not necessarily as soon as the OpArg coulde be
observed, and it is reasonable for a user to expect that
if a breakpoint is set "on f" then the arguments to f will
be observable where that breakpoint happens to be set (this
may also require setting the breakpoint after the prologue,
but that is another issue).
Change-Id: I0a1b848e50f475e5d8a5fad781241126872a0400
Reviewed-on: https://go-review.googlesource.com/c/142819
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: Alessandro Arzilli <alessandro.arzilli@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
This change creates the infrastructure for new lightweight atomics
primitives in runtime/internal/atomic:
- LoadAcq, for load-acquire
- StoreRel, for store-release
- CasRel, for Compare-and-Swap-release
and implements them for ppc64x. There is visible performance improvement
in producer-consumer scenarios, like BenchmarkChanProdCons*:
benchmark old ns/op new ns/op delta
BenchmarkChanProdCons0-48 2034 2034 +0.00%
BenchmarkChanProdCons10-48 1798 1608 -10.57%
BenchmarkChanProdCons100-48 1596 1585 -0.69%
BenchmarkChanProdConsWork0-48 2084 2046 -1.82%
BenchmarkChanProdConsWork10-48 1829 1668 -8.80%
BenchmarkChanProdConsWork100-48 1650 1650 +0.00%
Fixes#21348
Change-Id: I1f6ce377e4a0fe4bd7f5f775e8036f50070ad8db
Reviewed-on: https://go-review.googlesource.com/c/142277
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
All uses of these have been converted to use runtime/internal/math
functions for overflow checking.
Fixes#21588
Change-Id: I0ba57028e471803dc7d445e66d77a8f87edfdafb
Reviewed-on: https://go-review.googlesource.com/c/144037
Run-TryBot: Martin Möhrmann <martisch@uos.de>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
The code path for []byte is unused.
Rename function to stringtoruneslit to reflect change in the behavior.
Note that removed code had a bug in it,
it used [0] index instead of [i] inside a loop body.
Change-Id: I58ece5d9d3835887b014446f8a7d3e7fc2fdcaa3
Reviewed-on: https://go-review.googlesource.com/c/125796
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
prove is able to find 94 occurrences in std cmd where a divisor
can't have the value -1. The change removes
the extraneous fix-up code for these cases.
Fixes#25239
Change-Id: Ic184de971f47cc57c702eb72805b8e291c14035d
Reviewed-on: https://go-review.googlesource.com/c/130215
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
The second and subsequent return values from f() need to be
converted to the element type of the first return value from f()
(which must be a slice).
Fixes#22327
Change-Id: I5c0a424812c82c1b95b6d124c5626cfc4408bdb6
Reviewed-on: https://go-review.googlesource.com/c/142718
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
The goal of this change is to move work from walk to SSA,
and simplify things along the way.
This is hard to accomplish cleanly with small incremental changes,
so this large commit message aims to provide a roadmap to the diff.
High level description:
Prior to this change, walk was responsible for constructing (most of) the stack for function calls.
ascompatte gathered variadic arguments into a slice.
It also rewrote n.List from a list of arguments to a list of assignments to stack slots.
ascompatte was called multiple times to handle the receiver in a method call.
reorder1 then introduced temporaries into n.List as needed to avoid smashing the stack.
adjustargs then made extra stack space for go/defer args as needed.
Node to SSA construction evaluated all the statements in n.List,
and issued the function call, assuming that the stack was correctly constructed.
Intrinsic calls had to dig around inside n.List to extract the arguments,
since intrinsics don't use the stack to make function calls.
This change moves stack construction to the SSA construction phase.
ascompatte, now called walkParams, does all the work that ascompatte and reorder1 did.
It handles variadic arguments, inserts the method receiver if needed, and allocates temporaries.
It does not, however, make any assignments to stack slots.
Instead, it moves the function arguments to n.Rlist, leaving assignments to temporaries in n.List.
(It would be better to use Ninit instead of List; future work.)
During SSA construction, after doing all the temporary assignments in n.List,
the function arguments are assigned to stack slots by
constructing the appropriate SSA Value, using (*state).storeArg.
SSA construction also now handles adjustments for go/defer args.
This change also simplifies intrinsic calls, since we no longer need to undo walk's work.
Along the way, we simplify nodarg by pushing the fp==1 case to its callers, where it fits nicely.
Generated code differences:
There were a few optimizations applied along the way, the old way.
f(g()) was rewritten to do a block copy of function results to function arguments.
And reorder1 avoided introducing the final "save the stack" temporary in n.List.
The f(g()) block copy optimization never actually triggered; the order pass rewrote away g(), so that has been removed.
SSA optimizations mostly obviated the need for reorder1's optimization of avoiding the final temporary.
The exception was when the temporary's type was not SSA-able;
in that case, we got a Move into an autotmp and then an immediate Move onto the stack,
with the autotmp never read or used again.
This change introduces a new rewrite rule to detect such pointless double Moves
and collapse them into a single Move.
This is actually more powerful than the original optimization,
since the original optimization relied on the imprecise Node.HasCall calculation.
The other significant difference in the generated code is that the stack is now constructed
completely in SP-offset order. Prior to this change, the stack was constructed somewhat
haphazardly: first the final argument that Node.HasCall deemed to require a temporary,
then other arguments, then the method receiver, then the defer/go args.
SP-offset is probably a good default order. See future work.
There are a few minor object file size changes as a result of this change.
I investigated some regressions in early versions of this change.
One regression (in archive/tar) was the addition of a single CMPQ instruction,
which would be eliminated were this TODO from flagalloc to be done:
// TODO: Remove original instructions if they are never used.
One regression (in text/template) was an ADDQconstmodify that is now
a regular MOVQLoad+ADDQconst+MOVQStore, due to an unlucky change
in the order in which arguments are written. The argument change
order can also now be luckier, so this appears to be a wash.
All in all, though there will be minor winners and losers,
this change appears to be performance neutral.
Future work:
Move loading the result of function calls to SSA construction; eliminate OINDREGSP.
Consider pushing stack construction deeper into SSA world, perhaps in an arch-specific pass.
Among other benefits, this would make it easier to transition to a new calling convention.
This would require rethinking the handling of stack conflicts and is non-trivial.
Figure out some clean way to indicate that stack construction Stores/Moves
do not alias each other, so that subsequent passes may do things like
CSE+tighten shared stack setup, do DSE using non-first Stores, etc.
This would allow us to eliminate the minor text/template regression.
Possibly make assignments to stack slots not treated as statements by DWARF.
Compiler benchmarks:
name old time/op new time/op delta
Template 182ms ± 2% 179ms ± 2% -1.69% (p=0.000 n=47+48)
Unicode 86.3ms ± 5% 85.1ms ± 4% -1.36% (p=0.001 n=50+50)
GoTypes 646ms ± 1% 642ms ± 1% -0.63% (p=0.000 n=49+48)
Compiler 2.89s ± 1% 2.86s ± 2% -1.36% (p=0.000 n=48+50)
SSA 8.47s ± 1% 8.37s ± 2% -1.22% (p=0.000 n=47+50)
Flate 122ms ± 2% 121ms ± 2% -0.66% (p=0.000 n=47+45)
GoParser 147ms ± 2% 146ms ± 2% -0.53% (p=0.006 n=46+49)
Reflect 406ms ± 2% 403ms ± 2% -0.76% (p=0.000 n=48+43)
Tar 162ms ± 3% 162ms ± 4% ~ (p=0.191 n=46+50)
XML 223ms ± 2% 222ms ± 2% -0.37% (p=0.031 n=45+49)
[Geo mean] 382ms 378ms -0.89%
name old user-time/op new user-time/op delta
Template 219ms ± 3% 216ms ± 3% -1.56% (p=0.000 n=50+48)
Unicode 109ms ± 6% 109ms ± 5% ~ (p=0.190 n=50+49)
GoTypes 836ms ± 2% 828ms ± 2% -0.96% (p=0.000 n=49+48)
Compiler 3.87s ± 2% 3.80s ± 1% -1.81% (p=0.000 n=49+46)
SSA 12.0s ± 1% 11.8s ± 1% -2.01% (p=0.000 n=48+50)
Flate 142ms ± 3% 141ms ± 3% -0.85% (p=0.003 n=50+48)
GoParser 178ms ± 4% 175ms ± 4% -1.66% (p=0.000 n=48+46)
Reflect 520ms ± 2% 512ms ± 2% -1.44% (p=0.000 n=45+48)
Tar 200ms ± 3% 198ms ± 4% -0.61% (p=0.037 n=47+50)
XML 277ms ± 3% 275ms ± 3% -0.85% (p=0.000 n=49+48)
[Geo mean] 482ms 476ms -1.23%
name old alloc/op new alloc/op delta
Template 36.1MB ± 0% 35.3MB ± 0% -2.18% (p=0.008 n=5+5)
Unicode 29.8MB ± 0% 29.3MB ± 0% -1.58% (p=0.008 n=5+5)
GoTypes 125MB ± 0% 123MB ± 0% -2.13% (p=0.008 n=5+5)
Compiler 531MB ± 0% 513MB ± 0% -3.40% (p=0.008 n=5+5)
SSA 2.00GB ± 0% 1.93GB ± 0% -3.34% (p=0.008 n=5+5)
Flate 24.5MB ± 0% 24.3MB ± 0% -1.18% (p=0.008 n=5+5)
GoParser 29.4MB ± 0% 28.7MB ± 0% -2.34% (p=0.008 n=5+5)
Reflect 87.1MB ± 0% 86.0MB ± 0% -1.33% (p=0.008 n=5+5)
Tar 35.3MB ± 0% 34.8MB ± 0% -1.44% (p=0.008 n=5+5)
XML 47.9MB ± 0% 47.1MB ± 0% -1.86% (p=0.008 n=5+5)
[Geo mean] 82.8MB 81.1MB -2.08%
name old allocs/op new allocs/op delta
Template 352k ± 0% 347k ± 0% -1.32% (p=0.008 n=5+5)
Unicode 342k ± 0% 339k ± 0% -0.66% (p=0.008 n=5+5)
GoTypes 1.29M ± 0% 1.27M ± 0% -1.30% (p=0.008 n=5+5)
Compiler 4.98M ± 0% 4.87M ± 0% -2.14% (p=0.008 n=5+5)
SSA 15.7M ± 0% 15.2M ± 0% -2.86% (p=0.008 n=5+5)
Flate 233k ± 0% 231k ± 0% -0.83% (p=0.008 n=5+5)
GoParser 296k ± 0% 291k ± 0% -1.54% (p=0.016 n=5+4)
Reflect 1.05M ± 0% 1.04M ± 0% -0.65% (p=0.008 n=5+5)
Tar 343k ± 0% 339k ± 0% -0.97% (p=0.008 n=5+5)
XML 432k ± 0% 426k ± 0% -1.19% (p=0.008 n=5+5)
[Geo mean] 815k 804k -1.35%
name old object-bytes new object-bytes delta
Template 505kB ± 0% 505kB ± 0% -0.01% (p=0.008 n=5+5)
Unicode 224kB ± 0% 224kB ± 0% ~ (all equal)
GoTypes 1.82MB ± 0% 1.83MB ± 0% +0.06% (p=0.008 n=5+5)
Flate 324kB ± 0% 324kB ± 0% +0.00% (p=0.008 n=5+5)
GoParser 402kB ± 0% 402kB ± 0% +0.04% (p=0.008 n=5+5)
Reflect 1.39MB ± 0% 1.39MB ± 0% -0.01% (p=0.008 n=5+5)
Tar 449kB ± 0% 449kB ± 0% -0.02% (p=0.008 n=5+5)
XML 598kB ± 0% 597kB ± 0% -0.05% (p=0.008 n=5+5)
Change-Id: Ifc9d5c1bd01f90171414b8fb18ffe2290d271143
Reviewed-on: https://go-review.googlesource.com/c/114797
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
This CL add 3 rules to combine byte-store to word-store on386 and
amd64.
Change-Id: Iffd9cda42f1961680c81def4edc773ad58f211b3
Reviewed-on: https://go-review.googlesource.com/c/143057
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
The fix in CL 141649 is not right, the line in question got moved,
not added. Not sure why the -u option didn't do the right thing
when preparing the diff.
Fixes#28198
Change-Id: I6d45fdbbd5a9487cc70da07ab84e090b689a57f5
Reviewed-on: https://go-review.googlesource.com/c/142298
Reviewed-by: Alberto Donizetti <alb.donizetti@gmail.com>
This change attaches a slots to the OpArg values for
incoming params, and this in turn causes location lists
to be generated for params, and that yields better
debugging, in delve and sometimes in gdb.
The parameter lifetimes could start earlier; they are in
fact defined on entry, not at the point where the OpArg is
finally mentioned. (that will be addressed in another CL)
Change-Id: Icca891e118291d260c35a14acd5bc92bb82d9e9f
Reviewed-on: https://go-review.googlesource.com/c/141697
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Nowadays there are better ways to safely run untrusted Go programs, like
NaCl and gVisor.
Change-Id: I20c45f13a50dbcf35c343438b720eb93e7b4e13a
Reviewed-on: https://go-review.googlesource.com/c/142717
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Too deeply nested code is hard to fit in ssa.html.
This CL reduces the tab size to 4 characters.
Change-Id: I08643b0868bce3439567084c7d701654655f23d7
Reviewed-on: https://go-review.googlesource.com/c/142857
Reviewed-by: David Chase <drchase@google.com>
Ensure that label redefinition error column numbers
print the actual start of the label instead of the
position of the label's delimiting token ":".
For example, given this program:
package main
func main() {
foo:
foo:
foo:
foo :
}
* Before:
main.go:5:13: label foo defined and not used
main.go:6:7: label foo already defined at main.go:5:13
main.go:7:4: label foo already defined at main.go:5:13
main.go:8:16: label foo already defined at main.go:5:13
* After:
main.go:5:13: label foo defined and not used
main.go:6:4: label foo already defined at main.go:5:13
main.go:7:1: label foo already defined at main.go:5:13
main.go:8:1: label foo already defined at main.go:5:13
Fixes#26411
Change-Id: I8eb874b97fdc8862547176d57ac2fa0f075f2367
Reviewed-on: https://go-review.googlesource.com/c/124595
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
This used to be used by cmd/vet and some assembly generation tests, but
those were removed in CL 37691 and CL 107336. No point in keeping an
unneeded flag around.
Fixes#28220.
Change-Id: I59f8546954ab36ea61ceba81c10d6e16d74b966a
Reviewed-on: https://go-review.googlesource.com/c/142677
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
There are only a handful of nodes that we need to pass to
typecheckdef (OLITERAL, ONAME, OTYPE, and ONONAME), but typecheck1
takes the awkward approach of calling typecheckdef on every node with
Sym != nil, and then excluding a long list of uninteresting Ops that
have a non-nil Sym.
Passes toolstash-check.
Change-Id: I0271d2faff0208ad57ddc1f1a540a5fbed870234
Reviewed-on: https://go-review.googlesource.com/c/142657
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
In golang.org/cl/75310, the compiler's typechecker was changed so that
map key types were validated at a later stage, to make sure that all the
necessary type information was present.
This still worked for map type declarations, but caused a regression for
top-level map variable declarations. These now caused a fatal panic
instead of a typechecking error.
The cause was that checkMapKeys was run too early, before all
typechecking was done. In particular, top-level map variable
declarations are typechecked as external declarations, much later than
where checkMapKeys was run.
Add a test case for both exported and unexported top-level map
declarations, and add a second call to checkMapKeys at the actual end of
typechecking. Simply moving the one call isn't a good solution either;
the comments expand on that.
Fixes#28058.
Change-Id: Ia5febb01a1d877447cf66ba44fb49a7e0f4f18a5
Reviewed-on: https://go-review.googlesource.com/c/140417
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
x = map[string(byteslice)] is already optimized by the compiler to avoid a
string allocation. This CL generalizes this optimization to:
x = map[T1{ ... Tn{..., string(byteslice), ...} ... }]
where T1 to Tn is a nesting of struct and array literals.
Found in a hot code path that used a struct of strings made from []byte
slices to make a map lookup.
There are no uses of the more generalized optimization in the standard library.
Passes toolstash -cmp.
MapStringConversion/32/simple 21.9ns ± 2% 21.9ns ± 3% ~ (p=0.995 n=17+20)
MapStringConversion/32/struct 28.8ns ± 3% 22.0ns ± 2% -23.80% (p=0.000 n=20+20)
MapStringConversion/32/array 28.5ns ± 2% 21.9ns ± 2% -23.14% (p=0.000 n=19+16)
MapStringConversion/64/simple 21.0ns ± 2% 21.1ns ± 3% ~ (p=0.072 n=19+18)
MapStringConversion/64/struct 72.4ns ± 3% 21.3ns ± 2% -70.53% (p=0.000 n=20+20)
MapStringConversion/64/array 72.8ns ± 1% 21.0ns ± 2% -71.13% (p=0.000 n=17+19)
name old allocs/op new allocs/op delta
MapStringConversion/32/simple 0.00 0.00 ~ (all equal)
MapStringConversion/32/struct 0.00 0.00 ~ (all equal)
MapStringConversion/32/array 0.00 0.00 ~ (all equal)
MapStringConversion/64/simple 0.00 0.00 ~ (all equal)
MapStringConversion/64/struct 1.00 ± 0% 0.00 -100.00% (p=0.000 n=20+20)
MapStringConversion/64/array 1.00 ± 0% 0.00 -100.00% (p=0.000 n=20+20)
Change-Id: I483b4d84d8d74b1025b62c954da9a365e79b7a3a
Reviewed-on: https://go-review.googlesource.com/c/116275
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Add generic, 386 and amd64 specific ops and SSA rules for multiplication
with overflow and branching based on overflow flags. Use these to intrinsify
runtime/internal/math.MulUinptr.
On amd64
mul, overflow := math.MulUintptr(a, b)
if overflow {
is lowered to two instructions:
MULQ SI
JO 0x10ee35c
No codegen tests as codegen can not currently test unexported internal runtime
functions.
amd64:
name old time/op new time/op delta
MulUintptr/small 1.16ns ± 5% 0.88ns ± 6% -24.36% (p=0.000 n=19+20)
MulUintptr/large 10.7ns ± 1% 1.1ns ± 1% -89.28% (p=0.000 n=17+19)
Change-Id: If60739a86f820e5044d677276c21df90d3c7a86a
Reviewed-on: https://go-review.googlesource.com/c/141820
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
The generated code for the append builtin already checks if the appended
to slice is large enough and calls growslice if that is not the case.
Trust that this ensures the slice is large enough and avoid the
implicit bounds check when slicing the slice to its new size.
Removes 365 panicslice calls (-14%) from the go binary which
reduces the binary size by ~12kbyte.
Change-Id: I1b88418675ff409bc0b956853c9e95241274d5a6
Reviewed-on: https://go-review.googlesource.com/c/119315
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This CL adds a new internal math package for use by the runtime.
The new package exports a MulUintptr function with uintptr arguments
a and b and returns uintptr(a*b) and whether the full-width product
x*y does overflow the uintptr value range (uintptr(x*y) != x*y).
Uses of MulUinptr in the runtime and intrinsics for performance
will be added in followup CLs.
Updates #21588
Change-Id: Ia5a02eeabc955249118e4edf68c67d9fc0858058
Reviewed-on: https://go-review.googlesource.com/c/91755
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Ensure that we correctly type the stack temps for regular closures,
method function closures, and slice literals.
Then we don't need to override the dummy types later.
Furthermore, this allows order to reuse temporaries of these types.
OARRAYLIT doesn't need a temporary as far as I can tell, so I
removed that case from order.
Change-Id: Ic58520fa50c90639393ff78f33d3c831d5c4acb9
Reviewed-on: https://go-review.googlesource.com/c/140306
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Not sure why this changed behavior, but seems mostly harmless.
Fixes#28198
Change-Id: Ie25c6e1fcb64912a582c7ae7bf92c4c1642e83cb
Reviewed-on: https://go-review.googlesource.com/c/141649
Reviewed-by: David Chase <drchase@google.com>
Merge the two for loops that set up the node lists for
temporaries into one for loop.
Passes toolstash -cmp
Change-Id: Ibc739115f38c8869b0dcfbf9819fdc2fc96962e0
Reviewed-on: https://go-review.googlesource.com/c/141819
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Instead of allocating a new temporary each time one
is needed, keep a list of temporaries which are free
(have already been VARKILLed on every path) and use
one of them.
Should save a lot of stack space. In a function like this:
func main() {
fmt.Printf("%d %d\n", 2, 3)
fmt.Printf("%d %d\n", 4, 5)
fmt.Printf("%d %d\n", 6, 7)
}
The three [2]interface{} arrays used to hold the ... args
all use the same autotmp, instead of 3 different autotmps
as happened previous to this CL.
Change-Id: I2d728e226f81e05ae68ca8247af62014a1b032d3
Reviewed-on: https://go-review.googlesource.com/c/140301
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
When we pass these types by reference, we usually have to allocate
temporaries on the stack, initialize them, then pass their address
to the conversion functions. It's simpler to pass these types
directly by value.
This particularly applies to conversions needed for fmt.Printf
(to interface{} for constructing a [...]interface{}).
func f(a, b, c string) {
fmt.Printf("%s %s\n", a, b)
fmt.Printf("%s %s\n", b, c)
}
This function's stack frame shrinks from 200 to 136 bytes, and
its code shrinks from 535 to 453 bytes.
The go binary shrinks 0.3%.
Update #24286
Aside: for this function f, we don't really need to allocate
temporaries for the convT2E function. We could use the address
of a, b, and c directly. That might get similar (or maybe better?)
improvements. I investigated a bit, but it seemed complicated
to do it safely. This change was much easier.
Change-Id: I78cbe51b501fb41e1e324ce4203f0de56a1db82d
Reviewed-on: https://go-review.googlesource.com/c/135377
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Instead of
MOVB go.string."foo"(SB), AX
do
MOVB $102, AX
When we know the global we're loading from is readonly, we can
do that read at compile time.
I've made this arch-dependent mostly because the cases where this
happens often are memory->memory moves, and those don't get
decomposed until lowering.
Did amd64/386/arm/arm64. Other architectures could follow.
Update #26498
Change-Id: I41b1dc831b2cd0a52dac9b97f4f4457888a46389
Reviewed-on: https://go-review.googlesource.com/c/141118
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
This CL makes sure we walk the newly generated assignment. Part of
that walk makes sure that all symbols for strings are emitted before
we start referencing them during the parallel compilation
phase. Without this change, those references during the parallel phase
do a create-if-not-exist, which leads to a data race.
I'm not 100% sure this is the fix for the issues below, but optimistically
assuming it is...
Fixes#28170Fixes#28159
Change-Id: Ic63d5160ad9be5cb23fa6bbb2183e4848776c0ff
Reviewed-on: https://go-review.googlesource.com/c/141648
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This flag lost its usefulness in CL 34273.
Change-Id: I033c29f105937139b4e359a340906be439f1ed07
Reviewed-on: https://go-review.googlesource.com/c/141646
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
Interface and string comparisons don't need separate Ops any more than
struct or array comparisons do.
Removing them requires shuffling some code around in walk (and a
little in order), but overall allows simplifying things a bit.
Passes toolstash-check.
Change-Id: I084b8a6c089b768dc76d220379f4daed8a35db15
Reviewed-on: https://go-review.googlesource.com/c/141637
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This restores the printing of vXX and bYY in the left-hand
edge of the last column of ssa.html, where the generated
progs appear.
Change-Id: I81ab9b2fa5ae28e6e5de1b77665cfbed8d14e000
Reviewed-on: https://go-review.googlesource.com/c/141277
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Yury Smolsky <yury@smolsky.by>
Forgot to do this for golang.org/cl/76312.
Change-Id: Ic20fef3eeff14a805f608221aff1fa03934be3ca
Reviewed-on: https://go-review.googlesource.com/c/141357
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Prior to stack tracing, inlining could cause
dead pointers to be kept alive in some loops.
See #18336 and CL 31674.
The adjustment removed by this change preserved the inlining status quo
in the face of Node structure changes, to avoid creating new problems.
Now that stack tracing provides precision, these hacks can be removed.
Of course, our inlining code model is already hacky (#17566),
but at least now there will be fewer epicyclical hacks.
Newly inline-able functions in std cmd as a result of this change:
hash/adler32/adler32.go:65:6: can inline (*digest).UnmarshalBinary
hash/fnv/fnv.go:281:6: can inline (*sum32).UnmarshalBinary
hash/fnv/fnv.go:292:6: can inline (*sum32a).UnmarshalBinary
reflect/value.go:1298:6: can inline Value.OverflowComplex
compress/bzip2/bit_reader.go:25:6: can inline newBitReader
encoding/xml/xml.go:365:6: can inline (*Decoder).switchToReader
vendor/golang_org/x/crypto/cryptobyte/builder.go:77:6: can inline (*Builder).AddUint16
crypto/x509/x509.go:1851:58: can inline buildExtensions.func2.1.1
crypto/x509/x509.go:1871:58: can inline buildExtensions.func2.3.1
crypto/x509/x509.go:1883:58: can inline buildExtensions.func2.4.1
cmd/vet/internal/cfg/builder.go:463:6: can inline (*builder).labeledBlock
crypto/tls/handshake_messages.go:1450:6: can inline (*newSessionTicketMsg).marshal
crypto/tls/handshake_server.go:769:6: can inline (*serverHandshakeState).clientHelloInfo
crypto/tls/handshake_messages.go:1171:6: can inline (*nextProtoMsg).unmarshal
cmd/link/internal/amd64/obj.go:40:6: can inline Init
cmd/link/internal/ppc64/obj.go:40:6: can inline Init
net/http/httputil/persist.go:54:6: can inline NewServerConn
net/http/fcgi/child.go:83:6: can inline newResponse
cmd/compile/internal/ssa/poset.go:245:6: can inline (*poset).newnode
Change-Id: I19e8e383a6273849673d35189a9358870665f82f
Reviewed-on: https://go-review.googlesource.com/c/141117
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Do []byte(string) conversions more efficiently when the string
is a constant. Instead of calling stringtobyteslice, allocate
just the space we need and encode the initialization directly.
[]byte("foo") rewrites to the following pseudocode:
var s [3]byte // on heap or stack, depending on whether b escapes
s = *(*[3]byte)(&"foo"[0]) // initialize s from the string
b = s[:]
which generates this assembly:
0x001d 00029 (tmp1.go:9) LEAQ type.[3]uint8(SB), AX
0x0024 00036 (tmp1.go:9) MOVQ AX, (SP)
0x0028 00040 (tmp1.go:9) CALL runtime.newobject(SB)
0x002d 00045 (tmp1.go:9) MOVQ 8(SP), AX
0x0032 00050 (tmp1.go:9) MOVBLZX go.string."foo"+2(SB), CX
0x0039 00057 (tmp1.go:9) MOVWLZX go.string."foo"(SB), DX
0x0040 00064 (tmp1.go:9) MOVW DX, (AX)
0x0043 00067 (tmp1.go:9) MOVB CL, 2(AX)
// Then the slice is b = {AX, 3, 3}
The generated code is still not optimal, as it still does load/store
from read-only memory instead of constant stores. Next CL...
Update #26498Fixes#10170
Change-Id: I4b990b19f9a308f60c8f4f148934acffefe0a5bd
Reviewed-on: https://go-review.googlesource.com/c/140698
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Major cleanup to structure the code more similarly to go/constant.
Passes "toolstash -cmp" on std cmd.
Change-Id: I3ec7a7a24e313f119b0da4095001aad02e317894
Reviewed-on: https://go-review.googlesource.com/c/139901
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>