This reverts CL 485500.
Reason for revert: This breaks internal tests at Google, see b/280861579 and b/280820455.
Change-Id: I426278d400f7611170918fc07c524cb059b9cc55
Reviewed-on: https://go-review.googlesource.com/c/go/+/492995
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Chressie Himpel <chressie@google.com>
This reapplies CL 481061, with the followup fixes in CL 482975, CL 485315, and
CL 485316 incorporated.
CL 481061, by doujiang24 <doujiang24@gmail.com>, speed up C to Go
calls by binding the M to the C thread. See below for its
description.
CL 482975 is a followup fix to a C declaration in testprogcgo.
CL 485315 is a followup fix for x_cgo_getstackbound on Illumos.
CL 485316 is a followup cleanup for ppc64 assembly.
[Original CL 481061 description]
This reapplies CL 392854, with the followup fixes in CL 479255,
CL 479915, and CL 481057 incorporated.
CL 392854, by doujiang24 <doujiang24@gmail.com>, speed up C to Go
calls by binding the M to the C thread. See below for its
description.
CL 479255 is a followup fix for a small bug in ARM assembly code.
CL 479915 is another followup fix to address C to Go calls after
the C code uses some stack, but that CL is also buggy.
CL 481057, by Michael Knyszek, is a followup fix for a memory leak
bug of CL 479915.
[Original CL 392854 description]
In a C thread, it's necessary to acquire an extra M by using needm while invoking a Go function from C. But, needm and dropm are heavy costs due to the signal-related syscalls.
So, we change to not dropm while returning back to C, which means binding the extra M to the C thread until it exits, to avoid needm and dropm on each C to Go call.
Instead, we only dropm while the C thread exits, so the extra M won't leak.
When invoking a Go function from C:
Allocate a pthread variable using pthread_key_create, only once per shared object, and register a thread-exit-time destructor.
And store the g0 of the current m into the thread-specified value of the pthread key, only once per C thread, so that the destructor will put the extra M back onto the extra M list while the C thread exits.
When returning back to C:
Skip dropm in cgocallback, when the pthread variable has been created, so that the extra M will be reused the next time invoke a Go function from C.
This is purely a performance optimization. The old version, in which needm & dropm happen on each cgo call, is still correct too, and we have to keep the old version on systems with cgo but without pthreads, like Windows.
This optimization is significant, and the specific value depends on the OS system and CPU, but in general, it can be considered as 10x faster, for a simple Go function call from a C thread.
For the newly added BenchmarkCGoInCThread, some benchmark results:
1. it's 28x faster, from 3395 ns/op to 121 ns/op, in darwin OS & Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
2. it's 6.5x faster, from 1495 ns/op to 230 ns/op, in Linux OS & Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
[CL 479915 description]
Currently, when C calls into Go the first time, we grab an M
using needm, which sets m.g0's stack bounds using the SP. We don't
know how big the stack is, so we simply assume 32K. Previously,
when the Go function returns to C, we drop the M, and the next
time C calls into Go, we put a new stack bound on the g0 based on
the current SP. After CL 392854, we don't drop the M, and the next
time C calls into Go, we reuse the same g0, without recomputing
the stack bounds. If the C code uses quite a bit of stack space
before calling into Go, the SP may be well below the 32K stack
bound we assumed, so the runtime thinks the g0 stack overflows.
This CL makes needm get a more accurate stack bound from
pthread. (In some platforms this may still be a guess as we don't
know exactly where we are in the C stack), but it is probably
better than simply assuming 32K.
[CL 485500 description]
CL 479915 passed the G to _cgo_getstackbound for direct updates to
gp.stack.lo. A G can be reused on a new thread after the previous thread
exited. This could trigger the C TSAN race detector because it couldn't
see the synchronization in Go (lockextra) preventing the same G from
being used on multiple threads at the same time.
We work around this by passing the address of a stack variable to
_cgo_getstackbound rather than the G. The stack is generally unique per
thread, so TSAN won't see the same address from multiple threads. Even
if stacks are reused across threads by pthread, C TSAN should see the
synchonization in the stack allocator.
A regression test is added to misc/cgo/testsanitizer.
Fixes#51676.
Fixes#59294.
Fixes#59678.
Change-Id: Ic62be31a06ee83568215e875a891df37084e08ca
Reviewed-on: https://go-review.googlesource.com/c/go/+/485500
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
This reverts commit CL 486381.
Submitted out of order and breaks bootstrap.
Change-Id: Ia472111cb966e884a48f8ee3893b3bf4b4f4f875
Reviewed-on: https://go-review.googlesource.com/c/go/+/486915
Reviewed-by: David Chase <drchase@google.com>
TryBot-Bypass: Austin Clements <austin@google.com>
This reverts CL 481061.
Reason for revert: When built with C TSAN, x_cgo_getstackbound triggers
race detection on `g->stacklo` because the synchronization is in Go,
which isn't instrumented.
For #51676.
For #59294.
For #59678.
Change-Id: I38afcda9fcffd6537582a39a5214bc23dc147d47
Reviewed-on: https://go-review.googlesource.com/c/go/+/485275
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
This reapplies CL 392854, with the followup fixes in CL 479255,
CL 479915, and CL 481057 incorporated.
CL 392854, by doujiang24 <doujiang24@gmail.com>, speed up C to Go
calls by binding the M to the C thread. See below for its
description.
CL 479255 is a followup fix for a small bug in ARM assembly code.
CL 479915 is another followup fix to address C to Go calls after
the C code uses some stack, but that CL is also buggy.
CL 481057, by Michael Knyszek, is a followup fix for a memory leak
bug of CL 479915.
[Original CL 392854 description]
In a C thread, it's necessary to acquire an extra M by using needm while invoking a Go function from C. But, needm and dropm are heavy costs due to the signal-related syscalls.
So, we change to not dropm while returning back to C, which means binding the extra M to the C thread until it exits, to avoid needm and dropm on each C to Go call.
Instead, we only dropm while the C thread exits, so the extra M won't leak.
When invoking a Go function from C:
Allocate a pthread variable using pthread_key_create, only once per shared object, and register a thread-exit-time destructor.
And store the g0 of the current m into the thread-specified value of the pthread key, only once per C thread, so that the destructor will put the extra M back onto the extra M list while the C thread exits.
When returning back to C:
Skip dropm in cgocallback, when the pthread variable has been created, so that the extra M will be reused the next time invoke a Go function from C.
This is purely a performance optimization. The old version, in which needm & dropm happen on each cgo call, is still correct too, and we have to keep the old version on systems with cgo but without pthreads, like Windows.
This optimization is significant, and the specific value depends on the OS system and CPU, but in general, it can be considered as 10x faster, for a simple Go function call from a C thread.
For the newly added BenchmarkCGoInCThread, some benchmark results:
1. it's 28x faster, from 3395 ns/op to 121 ns/op, in darwin OS & Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
2. it's 6.5x faster, from 1495 ns/op to 230 ns/op, in Linux OS & Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
[CL 479915 description]
Currently, when C calls into Go the first time, we grab an M
using needm, which sets m.g0's stack bounds using the SP. We don't
know how big the stack is, so we simply assume 32K. Previously,
when the Go function returns to C, we drop the M, and the next
time C calls into Go, we put a new stack bound on the g0 based on
the current SP. After CL 392854, we don't drop the M, and the next
time C calls into Go, we reuse the same g0, without recomputing
the stack bounds. If the C code uses quite a bit of stack space
before calling into Go, the SP may be well below the 32K stack
bound we assumed, so the runtime thinks the g0 stack overflows.
This CL makes needm get a more accurate stack bound from
pthread. (In some platforms this may still be a guess as we don't
know exactly where we are in the C stack), but it is probably
better than simply assuming 32K.
Fixes#51676.
Fixes#59294.
Change-Id: I9bf1400106d5c08ce621d2ed1df3a2d9e3f55494
Reviewed-on: https://go-review.googlesource.com/c/go/+/481061
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: DeJiang Zhu (doujiang) <doujiang24@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
This reverts CL 392854.
Reason for revert: caused #59294, which was derived from google
internal tests. The attempted fix of #59294 caused more breakage.
Change-Id: I5a061561ac2740856b7ecc09725ac28bd30f8bba
Reviewed-on: https://go-review.googlesource.com/c/go/+/481060
Reviewed-by: Heschi Kreinick <heschi@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
This reverts CL 479255.
Reason for revert: need to revert CL 392854, and this caused a conflict.
Change-Id: I6cb105c62e51b47de3f652df5f5ee92673a93919
Reviewed-on: https://go-review.googlesource.com/c/go/+/481058
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
A comparison instruction was missing in CL 392854.
Should fix ARM builders.
For #51676.
Change-Id: Ica27a99be10e595bab4fad35e2e6c00a1c68a662
Reviewed-on: https://go-review.googlesource.com/c/go/+/479255
TryBot-Bypass: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
In a C thread, it's necessary to acquire an extra M by using needm while invoking a Go function from C. But, needm and dropm are heavy costs due to the signal-related syscalls.
So, we change to not dropm while returning back to C, which means binding the extra M to the C thread until it exits, to avoid needm and dropm on each C to Go call.
Instead, we only dropm while the C thread exits, so the extra M won't leak.
When invoking a Go function from C:
Allocate a pthread variable using pthread_key_create, only once per shared object, and register a thread-exit-time destructor.
And store the g0 of the current m into the thread-specified value of the pthread key, only once per C thread, so that the destructor will put the extra M back onto the extra M list while the C thread exits.
When returning back to C:
Skip dropm in cgocallback, when the pthread variable has been created, so that the extra M will be reused the next time invoke a Go function from C.
This is purely a performance optimization. The old version, in which needm & dropm happen on each cgo call, is still correct too, and we have to keep the old version on systems with cgo but without pthreads, like Windows.
This optimization is significant, and the specific value depends on the OS system and CPU, but in general, it can be considered as 10x faster, for a simple Go function call from a C thread.
For the newly added BenchmarkCGoInCThread, some benchmark results:
1. it's 28x faster, from 3395 ns/op to 121 ns/op, in darwin OS & Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
2. it's 6.5x faster, from 1495 ns/op to 230 ns/op, in Linux OS & Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
Fixes#51676
Change-Id: I380702fe2f9b6b401b2d6f04b0aba990f4b9ee6c
GitHub-Last-Rev: 93dc64ad98
GitHub-Pull-Request: golang/go#51679
Reviewed-on: https://go-review.googlesource.com/c/go/+/392854
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: thepudds <thepudds1460@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Have the write barrier call return a pointer to a buffer into which
the generated code records pointers that need write barrier treatment.
Change-Id: I7871764298e0aa1513de417010c8d46b296b199e
Reviewed-on: https://go-review.googlesource.com/c/go/+/447781
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Bypass: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Previously, the write barrier calls themselves did the actual
writes to memory. Instead, move those writes out to a common location
that both the wb-enabled and wb-disabled code paths share.
This enables us to optimize the write barrier path without having
to worry about performing the actual writes.
Change-Id: Ia71ab651908ec124cc33141afb52e4ca19733ac6
Reviewed-on: https://go-review.googlesource.com/c/go/+/447780
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Bypass: Keith Randall <khr@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Future CLs will remove the invariant that pointers are always put in
the write barrier in pairs.
The behavior of the assembly code changes a bit, where instead of writing
the pointers unconditionally and then checking for overflow, check for
overflow first and then write the pointers.
Also changed the write barrier flush function to not take the src/dst
as arguments.
Change-Id: I2ef708038367b7b82ea67cbaf505a1d5904c775c
Reviewed-on: https://go-review.googlesource.com/c/go/+/447779
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Bypass: Keith Randall <khr@golang.org>
On LR architectures, morestack (and morestack_noctxt) are called
with a special calling convention, where the caller doesn't save
LR on stack but passes it as a register, which morestack will save
to g.sched.lr. The stack unwinder currently doesn't understand it,
and would fail to unwind from it. morestack already writes SP (as
it switches stack), but morestack_noctxt (which tailcalls
morestack) doesn't. If a profiling signal lands right in
morestack_noctxt, the unwinder will try to unwind the stack and
go off, and possibly crash.
Marking morestack_noctxt SPWRITE stops the unwinding.
Ideally we could teach the unwinder about the special calling
convention, or change the calling convention to be less special
(so the unwinder doesn't need to fetch a register from the signal
context). This is a stop-gap solution, to stop the unwinder from
crashing.
Fixes#54332.
Change-Id: I75295f2e27ddcf05f1ea0b541aedcb9000ae7576
Reviewed-on: https://go-review.googlesource.com/c/go/+/425396
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
In asmcgocall() we need to switch to the g0 stack if we're not already on
the g0 stack or the gsignal stack. The prefered way of doing this is to
check gsignal first, then g0, since if we are going to switch to g0 we will
need g0 handy (thus avoiding a second load).
Rewrite/reorder 386 and amd64 to check gsignal first - this shaves a few
assembly instructions off and makes the order consistent with arm, arm64,
mips64 and ppc64. Add missing gsignal checks to mips, riscv64 and s390x.
Change-Id: I1b027bf393c25e0c33e1d8eb80de67e4a0a3f561
Reviewed-on: https://go-review.googlesource.com/c/go/+/335869
Trust: Joel Sing <joel@sing.id.au>
Run-TryBot: Joel Sing <joel@sing.id.au>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Currently, deferreturn runs deferred functions by backing up its
return PC to the deferreturn call, and then effectively tail-calling
the deferred function (via jmpdefer). The effect of this is that the
deferred function appears to be called directly from the deferee, and
when it returns, the deferee calls deferreturn again so it can run the
next deferred function if necessary.
This unusual flow control leads to a large number of special cases and
complications all over the tool chain.
This used to be necessary because deferreturn copied the deferred
function's argument frame directly into its caller's frame and then
had to invoke that call as if it had been called from its caller's
frame so it could access it arguments. But now that we've simplified
defer processing so the runtime only deals with argument-less
closures, this approach is no longer necessary.
This CL simplifies all of this by making deferreturn simply call
deferred functions in a loop.
This eliminates the need for jmpdefer, so we can delete a bunch of
per-architecture assembly code.
This eliminates several special cases on Wasm, since it couldn't
support these calling shenanigans directly and thus had to simulate
the loop a different way. Now Wasm can largely work the way the other
platforms do.
This eliminates the per-architecture Ginsnopdefer operation. On PPC64,
this was necessary to reload the TOC pointer after the tail call
(since TOC pointers in general make tail calls impossible). The tail
call is gone, and in the case where we do force a jump to the
deferreturn call when recovering from an open-coded defer, we go
through gogo (via runtime.recovery), which handles the TOC. On other
platforms, we needed a NOP so traceback didn't get confused by seeing
the return to the CALL instruction, rather than the usual return to
the instruction following the CALL instruction. Now we don't inject a
return to the CALL instruction at all, so this NOP is also
unnecessary.
The one potential effect of this is that deferreturn could now appear
in stack traces from deferred functions. However, this could already
happen from open-coded defers, so we've long since marked deferreturn
as a "wrapper" so it gets elided not only from printed stack traces,
but from runtime.Callers*.
This is a retry of CL 337652 because we had to back out its parent.
There are no changes in this version.
Change-Id: I3f54b7fec1d7ccac71cc6cf6835c6a46b7e5fb6c
Reviewed-on: https://go-review.googlesource.com/c/go/+/339397
Trust: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
This reverts CL 227652.
I'm reverting CL 337651 and this builds on top of it.
Change-Id: I03ce363be44c2a3defff2e43e7b1aad83386820d
Reviewed-on: https://go-review.googlesource.com/c/go/+/338709
Trust: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Currently, deferreturn runs deferred functions by backing up its
return PC to the deferreturn call, and then effectively tail-calling
the deferred function (via jmpdefer). The effect of this is that the
deferred function appears to be called directly from the deferee, and
when it returns, the deferee calls deferreturn again so it can run the
next deferred function if necessary.
This unusual flow control leads to a large number of special cases and
complications all over the tool chain.
This used to be necessary because deferreturn copied the deferred
function's argument frame directly into its caller's frame and then
had to invoke that call as if it had been called from its caller's
frame so it could access it arguments. But now that we've simplified
defer processing so the runtime only deals with argument-less
closures, this approach is no longer necessary.
This CL simplifies all of this by making deferreturn simply call
deferred functions in a loop.
This eliminates the need for jmpdefer, so we can delete a bunch of
per-architecture assembly code.
This eliminates several special cases on Wasm, since it couldn't
support these calling shenanigans directly and thus had to simulate
the loop a different way. Now Wasm can largely work the way the other
platforms do.
This eliminates the per-architecture Ginsnopdefer operation. On PPC64,
this was necessary to reload the TOC pointer after the tail call
(since TOC pointers in general make tail calls impossible). The tail
call is gone, and in the case where we do force a jump to the
deferreturn call when recovering from an open-coded defer, we go
through gogo (via runtime.recovery), which handles the TOC. On other
platforms, we needed a NOP so traceback didn't get confused by seeing
the return to the CALL instruction, rather than the usual return to
the instruction following the CALL instruction. Now we don't inject a
return to the CALL instruction at all, so this NOP is also
unnecessary.
The one potential effect of this is that deferreturn could now appear
in stack traces from deferred functions. However, this could already
happen from open-coded defers, so we've long since marked deferreturn
as a "wrapper" so it gets elided not only from printed stack traces,
but from runtime.Callers*.
Change-Id: Ie9f700cd3fb774f498c9edce363772a868407bf7
Reviewed-on: https://go-review.googlesource.com/c/go/+/337652
Trust: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
newproc/deferproc takes a siz argument for the go'd/deferred
function's argument size. Now it is always zero. Remove the
argument.
Change-Id: If1bb8d427e34015ccec0ba10dbccaae96757fa8c
Reviewed-on: https://go-review.googlesource.com/c/go/+/325917
Trust: Cherry Mui <cherryyz@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
At runtime startup it calls newproc from assembly code to start
the main goroutine. runtime.main has no arguments, so the arg
size should be 0, instead of 8.
While here, use clearer code sequence to open the frame.
Change-Id: I2bbb26a83521ea867897530b86a85b22a3c8be9d
Reviewed-on: https://go-review.googlesource.com/c/go/+/321957
Trust: Cherry Mui <cherryyz@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
This switches openbsd/arm to thread creation via pthreads, rather than doing
direct system calls.
Update #36435
Change-Id: Ia8749e3723a9967905c33b6d93dfd9be797a486c
Reviewed-on: https://go-review.googlesource.com/c/go/+/315790
Trust: Joel Sing <joel@sing.id.au>
Reviewed-by: Cherry Mui <cherryyz@google.com>
The setg call a few lines earlier has already performed the same iscgo check
and called save_g if necessary.
Change-Id: I6e7c44cef4e0397d6001a3d5b7e334cdfbc3ce22
Reviewed-on: https://go-review.googlesource.com/c/go/+/316929
Trust: Joel Sing <joel@sing.id.au>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
The regabi builders are unhappy about badctxt calling throw
calling systemstack calling gosave_systemstack_switch calling
badctxt, all nosplit, repeating. This wouldn't actually happen
since after one systemstack we'd end up on the system stack
and the next one wouldn't call gosave_systemstack_switch at all.
The badctxt call itself is in a very unlikely assertion failure
inside gosave_systemstack_switch.
Keep the assertion check but call runtime.abort instead on failure,
breaking the detected (but not real) cycle.
Change-Id: Iaf5c0fc065783b8c1c6d0f62d848f023a0714b96
Reviewed-on: https://go-review.googlesource.com/c/go/+/294069
Trust: Russ Cox <rsc@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
No change to actual runtime, but helps reduce the laundry list
of functions.
mcall, morestack, and asmcgocall are not actually top-of-frame,
so those need more attention in follow-up CLs.
mstart moved to assembly so that it can be marked TOPFRAME.
Since TOPFRAME also tells DWARF consumers not to unwind
this way, this change should also improve debuggers a
marginal amount.
This CL is part of a stack adding windows/arm64
support (#36439), intended to land in the Go 1.17 cycle.
This CL is, however, not windows/arm64-specific.
It is cleanup meant to make the port (and future ports) easier.
Change-Id: If1e0d46ca973de5e46b62948d076f675f285b5d9
Reviewed-on: https://go-review.googlesource.com/c/go/+/288802
Trust: Russ Cox <rsc@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
The old code was very clever about predicting whether a traceback was safe.
That cleverness has not aged well. In particular, the setsSP function is missing
a bunch of functions that write to SP and will confuse traceback.
And one such function - jmpdefer - was handled as a special case in
gentraceback instead of simply listing it in setsSP.
Throw away all the clever prediction about whether traceback will crash.
Instead, make traceback NOT crash, by checking whether the function
being walked writes to SP.
This CL is part of a stack adding windows/arm64
support (#36439), intended to land in the Go 1.17 cycle.
This CL is, however, not windows/arm64-specific.
It is cleanup meant to make the port (and future ports) easier.
Change-Id: I3d55fe257a22745e4919ac4dc9a9378c984ba0da
Reviewed-on: https://go-review.googlesource.com/c/go/+/288801
Trust: Russ Cox <rsc@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
A g's sched.g is set in newproc1:
newg.sched.g = guintptr(unsafe.Pointer(newg))
After that, it never changes. Yet lots of assembly code does
"g.sched.g = g" unnecessarily. Remove all those lines to avoid
confusion about whether it ever changes.
Also, split gogo into two functions, one that does the nil g check
and a second that does the actual switch. This way, if the nil g check
fails, we get a stack trace showing the call stack that led to the failure.
(The SP write would otherwise cause the stack trace to abort.)
Also restore the proper nil g check in a handful of assembly functions.
(There is little point in checking for nil g *after* installing it as the real g.)
Change-Id: I22866b093f901f765de1d074e36eeec10366abfb
Reviewed-on: https://go-review.googlesource.com/c/go/+/292109
Trust: Russ Cox <rsc@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Both asmcgocall and systemstack need to save the calling Go code's
context for use by traceback, but they do it differently.
Systemstack's appraoch is better, because it doesn't require a
special case in traceback.
So make them both use that.
While we are here, the fake mstart caller in systemstack is
no longer needed and can be removed.
(traceback knows to stop in systemstack because of the writes to SP.)
Also remove the fake mstarts in sys_windows_*.s.
And while we are there, fix the control flow guard code in sys_windows_arm.s.
The current code is using pointers to a stack frame that technically is gone
once we hit the RET instruction. Clearly it's working OK, but better not to depend
on data below SP being preserved, even for just a few instructions.
Store the value we need in other registers instead.
(This code is only used for pushing a sigpanic call, which does not
actually return to the site of the fault and therefore doesn't need to
preserve any of the registers.)
This CL is part of a stack adding windows/arm64
support (#36439), intended to land in the Go 1.17 cycle.
This CL is, however, not windows/arm64-specific.
It is cleanup meant to make the port (and future ports) easier.
Change-Id: Id1e3ef5e54f7ad786e4b87043f2626eba7c3bbd9
Reviewed-on: https://go-review.googlesource.com/c/go/+/288799
Trust: Russ Cox <rsc@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
During a cgocallback, the runtime calls needm to get an m.
The calls made during needm cannot themselves assume that
there is an m or a g (which is attached to the m).
In the old days of making direct system calls, the only thing
you had to do for such functions was mark them //go:nosplit,
to avoid the use of g in the stack split prologue.
But now, on operating systems that make system calls through
shared libraries and use code that saves state in the g or m
before doing so, it's not safe to assume g exists. In fact, it is
not even safe to call getg(), because it might fault deferencing
the TLS storage to find the g pointer (that storage may not be
initialized yet, at least on Windows, and perhaps on other systems
in the future).
The specific routines that are problematic are usleep and osyield,
which are called during lock contention in lockextra, called
from needm.
All this is rather subtle and hidden, so in addition to fixing the
problem on Windows, this CL makes the fact of not running on
a g much clearer by introducing variants usleep_no_g and
osyield_no_g whose names should make clear that there is no g.
And then we can remove the various sketchy getg() == nil checks
in the existing routines.
As part of this cleanup, this CL also deletes onosstack on Windows.
onosstack is from back when the runtime was implemented in C.
It predates systemstack but does essentially the same thing.
Instead of having two different copies of this code, we can use
systemstack consistently. This way we need not port onosstack
to each architecture.
This CL is part of a stack adding windows/arm64
support (#36439), intended to land in the Go 1.17 cycle.
This CL is, however, not windows/arm64-specific.
It is cleanup meant to make the port (and future ports) easier.
Change-Id: I3352de1fd0a3c26267c6e209063e6e86abd26187
Reviewed-on: https://go-review.googlesource.com/c/go/+/288793
Trust: Russ Cox <rsc@golang.org>
Trust: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
This change adds support for the new register ABI on amd64 to
reflect.(Value).Call. If internal/abi's register counts are non-zero,
reflect will try to set up arguments in registers on the Call path.
Note that because the register ABI becomes ABI0 with zero registers
available, this should keep working as it did before.
This change does not add any tests for the register ABI case because
there's no way to do so at the moment.
For #40724.
Change-Id: I8aa089a5aa5a31b72e56b3d9388dd3f82203985b
Reviewed-on: https://go-review.googlesource.com/c/go/+/272568
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
The runtime.gosave function is not used anywhere. Delete.
Note: there is also a gosave<> function, which is actually used
and not deleted.
Change-Id: I64149a7afdd217de26d1e6396233f2becfad7153
Reviewed-on: https://go-review.googlesource.com/c/go/+/289719
Trust: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
This redesigns the way calls work from C to exported Go functions. It
removes several steps from the call path, makes cmd/cgo no longer
sensitive to the Go calling convention, and eliminates the use of
reflectcall from cgo.
In order to avoid generating a large amount of FFI glue between the C
and Go ABIs, the cgo tool has long depended on generating a C function
that marshals the arguments into a struct, and then the actual ABI
switch happens in functions with fixed signatures that simply take a
pointer to this struct. In a way, this CL simply pushes this idea
further.
Currently, the cgo tool generates this argument struct in the exact
layout of the Go stack frame and depends on reflectcall to unpack it
into the appropriate Go call (even though it's actually
reflectcall'ing a function generated by cgo).
In this CL, we decouple this struct from the Go stack layout. Instead,
cgo generates a Go function that takes the struct, unpacks it, and
calls the exported function. Since this generated function has a
generic signature (like the rest of the call path), we don't need
reflectcall and can instead depend on the Go compiler itself to
implement the call to the exported Go function.
One complication is that syscall.NewCallback on Windows, which
converts a Go function into a C function pointer, depends on
cgocallback's current dynamic calling approach since the signatures of
the callbacks aren't known statically. For this specific case, we
continue to depend on reflectcall. Really, the current approach makes
some overly simplistic assumptions about translating the C ABI to the
Go ABI. Now we're at least in a much better position to do a proper
ABI translation.
For comparison, the current cgo call path looks like:
GoF (generated C function) ->
crosscall2 (in cgo/asm_*.s) ->
_cgoexp_GoF (generated Go function) ->
cgocallback (in asm_*.s) ->
cgocallback_gofunc (in asm_*.s) ->
cgocallbackg (in cgocall.go) ->
cgocallbackg1 (in cgocall.go) ->
reflectcall (in asm_*.s) ->
_cgoexpwrap_GoF (generated Go function) ->
p.GoF
Now the call path looks like:
GoF (generated C function) ->
crosscall2 (in cgo/asm_*.s) ->
cgocallback (in asm_*.s) ->
cgocallbackg (in cgocall.go) ->
cgocallbackg1 (in cgocall.go) ->
_cgoexp_GoF (generated Go function) ->
p.GoF
Notably:
1. We combine _cgoexp_GoF and _cgoexpwrap_GoF and move the combined
operation to the end of the sequence. This combined function also
handles reflectcall's previous role.
2. We combined cgocallback and cgocallback_gofunc since the only
purpose of having both was to convert a raw PC into a Go function
value. We instead construct the Go function value in cgocallbackg1.
3. cgocallbackg1 no longer reaches backwards through the stack to get
the arguments to cgocallback_gofunc. Instead, we just pass the
arguments down.
4. Currently, we need an explicit msanwrite to mark the results struct
as written because reflectcall doesn't do this. Now, the results are
written by regular Go assignments, so the Go compiler generates the
necessary MSAN annotations. This also means we no longer need to track
the size of the arguments frame.
Updates #40724, since now we don't need to teach cgo about the
register ABI or change how it uses reflectcall.
Change-Id: I7840489a2597962aeb670e0c1798a16a7359c94f
Reviewed-on: https://go-review.googlesource.com/c/go/+/258938
Trust: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
You were a useful port and you've served your purpose.
Thanks for all the play.
A subsequent CL will remove amd64p32 (including assembly files and
toolchain bits) and remaining bits. The amd64p32 removal will be
separated into its own CL in case we want to support the Linux x32 ABI
in the future and want our old amd64p32 support as a starting point.
Updates #30439
Change-Id: Ia3a0c7d49804adc87bf52a4dea7e3d3007f2b1cd
Reviewed-on: https://go-review.googlesource.com/c/go/+/199499
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Currently the standard hasher is memhash, which checks whether aes
instructions are available, and if so redirects to aeshash.
With this CL, we call aeshash directly, which then redirects to the
fallback hash if aes instructions are not available.
This reduces the overhead for the hash function in the common case,
as it requires just one call instead of two. On architectures which
have no assembly hasher, it's a single jump slower.
Thanks to Martin for this idea.
name old time/op new time/op delta
BigKeyMap-4 22.6ns ± 1% 21.1ns ± 2% -6.55% (p=0.000 n=9+10)
Change-Id: Ib7ca77b63d28222eb0189bc3d7130531949d853c
Reviewed-on: https://go-review.googlesource.com/c/go/+/190998
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
Working toward making the tree vet-safe instead of having
so many exceptions in cmd/vet/all/whitelist.
This CL makes "GOOS=linux GOARCH=386 go vet -unsafeptr=false runtime" happy,
while keeping "GO_BUILDER_NAME=misc-vetall go tool dist test" happy too.
For #31916.
Change-Id: I3e5586a7ff6e359357350d0602c2259493280ded
Reviewed-on: https://go-review.googlesource.com/c/go/+/176099
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Allows us to stop whitelisting this error on many OS/arch combinations
XXX I'm not sure I am running vet correctly, and testing all platforms right.
Change-Id: I29f548bd5f4a63bd13c4d0667d4209c75c886fd9
GitHub-Last-Rev: 52f6ff4a6b
GitHub-Pull-Request: golang/go#31583
Reviewed-on: https://go-review.googlesource.com/c/go/+/173157
Run-TryBot: Benny Siegert <bsiegert@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Benny Siegert <bsiegert@gmail.com>
This CL adds a new attribute, TOPFRAME, which can be used to mark
functions that should be treated as being at the top of the call
stack. The function `runtime.goexit` has been marked this way on
architectures that use a link register.
This will stop programs that use DWARF to unwind the call stack
from unwinding past `runtime.goexit` on architectures that use a
link register. For example, it eliminates "corrupt stack?"
warnings when generating a backtrace that hits `runtime.goexit`
in GDB on s390x.
Similar code should be added for non-link-register architectures
(i.e. amd64, 386). They mark the top of the call stack slightly
differently to link register architectures so I haven't added
that code (they need to mark "rip" as undefined).
Fixes#24385.
Change-Id: I15b4c69ac75b491daa0acf0d981cb80eb06488de
Reviewed-on: https://go-review.googlesource.com/c/go/+/169726
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Some of the registers in which indexes + length were supposed to
be passed were wrong.
Update #30116
Change-Id: I1089366b7429c1e0ecad9219b847db069ce6b5d6
Reviewed-on: https://go-review.googlesource.com/c/go/+/168041
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
A few examples (for accessing a slice of length 3):
s[-1] runtime error: index out of range [-1]
s[3] runtime error: index out of range [3] with length 3
s[-1:0] runtime error: slice bounds out of range [-1:]
s[3:0] runtime error: slice bounds out of range [3:0]
s[3:-1] runtime error: slice bounds out of range [:-1]
s[3:4] runtime error: slice bounds out of range [:4] with capacity 3
s[0:3:4] runtime error: slice bounds out of range [::4] with capacity 3
Note that in cases where there are multiple things wrong with the
indexes (e.g. s[3:-1]), we report one of those errors kind of
arbitrarily, currently the rightmost one.
An exhaustive set of examples is in issue30116[u].out in the CL.
The message text has the same prefix as the old message text. That
leads to slightly awkward phrasing but hopefully minimizes the chance
that code depending on the error text will break.
Increases the size of the go binary by 0.5% (amd64). The panic functions
take arguments in registers in order to keep the size of the compiled code
as small as possible.
Fixes#30116
Change-Id: Idb99a827b7888822ca34c240eca87b7e44a04fdd
Reviewed-on: https://go-review.googlesource.com/c/go/+/161477
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Currently, package runtime contains the definition of reflect.call,
even though it's just a jump to runtime.reflectcall. This "push"
symbol is confusing, since it's not clear where the definition of
reflect.call comes from when you're in the reflect package.
Replace this with a "pull" symbol: the runtime now defines only
runtime.reflectcall and package reflect uses a go:linkname to access
this symbol directly. This makes it clear where reflect.call is coming
from without any spooky action at a distance and eliminates all of the
definitions of reflect.call in the runtime.
Change-Id: I3ec73cd394efe9df8d3061a57c73aece2e7048dd
Reviewed-on: https://go-review.googlesource.com/c/148657
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Don't worry, this patch just remove trailing whitespace from
assembly files, and does not touch any logical changes.
Change-Id: Ia724ac0b1abf8bc1e41454bdc79289ef317c165d
Reviewed-on: https://go-review.googlesource.com/c/113595
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Updates #26148
Change-Id: I8f68b2c926c7b11dc86c9664ed7ff2d2f78b64b4
Reviewed-on: https://go-review.googlesource.com/128715
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Also:
- Add extra SystemStack space for darwin/arm64 just
like for darwin/arm.
- Removed redundant stack alignment; the arm64 hardware enforces
the 16 byte alignment.
- Save and restore the g registers at library initialization.
- Zero g registers since libpreinit can call libc functions
that in turn use asmcgocall. asmcgocall requires an initialized g.
- Change asmcgocall to work even if no g is set. The change mimics
amd64.
Change-Id: I1b8c63b07cfec23b909c0d215b50dc229f8adbc8
Reviewed-on: https://go-review.googlesource.com/117176
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This CL is the darwin/arm and darwin/arm64 equivalent to CL 108679,
110215, 110437, 110438, 111258, 110655.
Updates #17490
Change-Id: Ia95b27b38f9c3535012c566f17a44b4ed26b9db6
Reviewed-on: https://go-review.googlesource.com/111015
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Currently, throw may grow the stack, which means whenever we call it
from a context where it's not safe to grow the stack, we first have to
switch to the system stack. This is pretty easy to get wrong.
Fix this by making throw switch to the system stack so it doesn't grow
the stack and is hence safe to call without a system stack switch at
the call site.
The only thing this complicates is badsystemstack itself, which would
now go into an infinite loop before printing anything (previously it
would also go into an infinite loop, but would at least print the
error first). Fix this by making badsystemstack do a direct write and
then crash hard.
Change-Id: Ic5b4a610df265e47962dcfa341cabac03c31c049
Reviewed-on: https://go-review.googlesource.com/93659
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>