This reapplies CL 392854, with the followup fixes in CL 479255,
CL 479915, and CL 481057 incorporated.
CL 392854, by doujiang24 <doujiang24@gmail.com>, speed up C to Go
calls by binding the M to the C thread. See below for its
description.
CL 479255 is a followup fix for a small bug in ARM assembly code.
CL 479915 is another followup fix to address C to Go calls after
the C code uses some stack, but that CL is also buggy.
CL 481057, by Michael Knyszek, is a followup fix for a memory leak
bug of CL 479915.
[Original CL 392854 description]
In a C thread, it's necessary to acquire an extra M by using needm while invoking a Go function from C. But, needm and dropm are heavy costs due to the signal-related syscalls.
So, we change to not dropm while returning back to C, which means binding the extra M to the C thread until it exits, to avoid needm and dropm on each C to Go call.
Instead, we only dropm while the C thread exits, so the extra M won't leak.
When invoking a Go function from C:
Allocate a pthread variable using pthread_key_create, only once per shared object, and register a thread-exit-time destructor.
And store the g0 of the current m into the thread-specified value of the pthread key, only once per C thread, so that the destructor will put the extra M back onto the extra M list while the C thread exits.
When returning back to C:
Skip dropm in cgocallback, when the pthread variable has been created, so that the extra M will be reused the next time invoke a Go function from C.
This is purely a performance optimization. The old version, in which needm & dropm happen on each cgo call, is still correct too, and we have to keep the old version on systems with cgo but without pthreads, like Windows.
This optimization is significant, and the specific value depends on the OS system and CPU, but in general, it can be considered as 10x faster, for a simple Go function call from a C thread.
For the newly added BenchmarkCGoInCThread, some benchmark results:
1. it's 28x faster, from 3395 ns/op to 121 ns/op, in darwin OS & Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
2. it's 6.5x faster, from 1495 ns/op to 230 ns/op, in Linux OS & Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
[CL 479915 description]
Currently, when C calls into Go the first time, we grab an M
using needm, which sets m.g0's stack bounds using the SP. We don't
know how big the stack is, so we simply assume 32K. Previously,
when the Go function returns to C, we drop the M, and the next
time C calls into Go, we put a new stack bound on the g0 based on
the current SP. After CL 392854, we don't drop the M, and the next
time C calls into Go, we reuse the same g0, without recomputing
the stack bounds. If the C code uses quite a bit of stack space
before calling into Go, the SP may be well below the 32K stack
bound we assumed, so the runtime thinks the g0 stack overflows.
This CL makes needm get a more accurate stack bound from
pthread. (In some platforms this may still be a guess as we don't
know exactly where we are in the C stack), but it is probably
better than simply assuming 32K.
Fixes#51676.
Fixes#59294.
Change-Id: I9bf1400106d5c08ce621d2ed1df3a2d9e3f55494
Reviewed-on: https://go-review.googlesource.com/c/go/+/481061
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: DeJiang Zhu (doujiang) <doujiang24@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
This reverts CL 392854.
Reason for revert: caused #59294, which was derived from google
internal tests. The attempted fix of #59294 caused more breakage.
Change-Id: I5a061561ac2740856b7ecc09725ac28bd30f8bba
Reviewed-on: https://go-review.googlesource.com/c/go/+/481060
Reviewed-by: Heschi Kreinick <heschi@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
In a C thread, it's necessary to acquire an extra M by using needm while invoking a Go function from C. But, needm and dropm are heavy costs due to the signal-related syscalls.
So, we change to not dropm while returning back to C, which means binding the extra M to the C thread until it exits, to avoid needm and dropm on each C to Go call.
Instead, we only dropm while the C thread exits, so the extra M won't leak.
When invoking a Go function from C:
Allocate a pthread variable using pthread_key_create, only once per shared object, and register a thread-exit-time destructor.
And store the g0 of the current m into the thread-specified value of the pthread key, only once per C thread, so that the destructor will put the extra M back onto the extra M list while the C thread exits.
When returning back to C:
Skip dropm in cgocallback, when the pthread variable has been created, so that the extra M will be reused the next time invoke a Go function from C.
This is purely a performance optimization. The old version, in which needm & dropm happen on each cgo call, is still correct too, and we have to keep the old version on systems with cgo but without pthreads, like Windows.
This optimization is significant, and the specific value depends on the OS system and CPU, but in general, it can be considered as 10x faster, for a simple Go function call from a C thread.
For the newly added BenchmarkCGoInCThread, some benchmark results:
1. it's 28x faster, from 3395 ns/op to 121 ns/op, in darwin OS & Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
2. it's 6.5x faster, from 1495 ns/op to 230 ns/op, in Linux OS & Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
Fixes#51676
Change-Id: I380702fe2f9b6b401b2d6f04b0aba990f4b9ee6c
GitHub-Last-Rev: 93dc64ad98
GitHub-Pull-Request: golang/go#51679
Reviewed-on: https://go-review.googlesource.com/c/go/+/392854
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: thepudds <thepudds1460@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Rename it TestIssue1560 since it no longer sleeps.
For #1560Fixes#45586
Change-Id: I338eee9de43e871da142143943e9435218438e90
Reviewed-on: https://go-review.googlesource.com/c/go/+/400194
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
This test otherwise fails to build on windows/arm64 as of CL 364774
due to a warning (promoted to an error) about a mismatched dllexport
attribute. Fortunately, it seems not to need the forward-declared
function in this file anyway.
Updates #49633
Updates #49721
Change-Id: Ia4698b85077d0718a55d2cc667a7950f1d8e50ab
Reviewed-on: https://go-review.googlesource.com/c/go/+/366075
Trust: Bryan C. Mills <bcmills@google.com>
Run-TryBot: Bryan C. Mills <bcmills@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Fixes#49633
Change-Id: I12ca350f7dd6bfc8753a4a169f29b89ef219b035
Reviewed-on: https://go-review.googlesource.com/c/go/+/364774
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Bryan C. Mills <bcmills@google.com>
An exported Go function like
//export F
func F() {}
gets declared in _cgo_export.h as something like
extern void F(void);
The exact declaration varies by operating system.
In particular, Windows adds __declspec(dllimport).
Clang on Windows/ARM64 rejects code that contains
conflicting declarations for F, like:
extern void F(void);
extern void __declspec(dllimport) F(void);
This means that F must not be declared separately from _cgo_export.h:
any code that wants to refer to F must use #include "_cgo_export.h".
Unfortunately, the cgo prologue itself (the commented code before import "C")
cannot include "_cgo_export.h", because that file is itself produced from the
cgo Go sources and therefore cannot be a dependency of the cgo Go sources.
This CL rewrites misc/cgo/test to avoid redeclaring exported functions.
Most of the time, this is not a significant problem: just move the code
that needs the header into a .c file, perhaps with a wrapper exposed
to the cgo Go sources.
The one case that is potentially problematic is f7665, which is part of
the test for golang.org/issue/7665. That bug report explicitly identified
a bug in referring to the C name for an exported function in the same
Go source file as it was exported function. That is now impossible,
at least on Windows/ARM64, so the test is modified a bit and possibly
does not test what the original bug was. But the original bug should
be long gone: that part of the compiler has been rewritten.
Change-Id: I0d14d9336632f0e5e3db4273d9d32ef2cca0298d
Reviewed-on: https://go-review.googlesource.com/c/go/+/312029
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
A non-trivial Cgo program may need to use callbacks and interact with
go objects per goroutine. Because of the rules for passing pointers
between Go and C, such a program needs to store handles to associated
Go values. This often causes much extra effort to figure out a way to
correctly deal with: 1) map collision; 2) identifying leaks and 3)
concurrency.
This CL implements a Handle representation in runtime/cgo package, and
related methods such as Value, Delete, etc. which allows Go users can
use a standard way to handle the above difficulties.
In addition, the CL allows a Go value to have multiple handles, and the
NewHandle always returns a different handle compare to the previously
returned handles. In comparison, CL 294670 implements a different
behavior of NewHandle that returns a unique handle when the Go value is
referring to the same object.
Benchmark:
name time/op
Handle/non-concurrent-16 487ns ± 1%
Handle/concurrent-16 674ns ± 1%
Fixes#37033
Change-Id: I0eadb9d44332fffef8fb567c745246a49dd6d4c1
Reviewed-on: https://go-review.googlesource.com/c/go/+/295369
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Trust: Cherry Zhang <cherryyz@google.com>
Introduce GOOS=ios for iOS systems. GOOS=ios matches "darwin"
build tag, like GOOS=android matches "linux" and GOOS=illumos
matches "solaris". Only ios/arm64 is supported (ios/amd64 is
not).
GOOS=ios and GOOS=darwin remain essentially the same at this
point. They will diverge at later time, to differentiate macOS
and iOS.
Uses of GOOS=="darwin" are changed to (GOOS=="darwin" || GOOS=="ios"),
except if it clearly means macOS (e.g. GOOS=="darwin" && GOARCH=="amd64"),
it remains GOOS=="darwin".
Updates #38485.
Change-Id: I4faacdc1008f42434599efb3c3ad90763a83b67c
Reviewed-on: https://go-review.googlesource.com/c/go/+/254740
Trust: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This removes all conditions and conditional code (that I could find)
that depended on darwin/arm.
Fixes#35439 (since that only happened on darwin/arm)
Fixes#37611.
Change-Id: Ia4c32a5a4368ed75231075832b0b5bfb1ad11986
Reviewed-on: https://go-review.googlesource.com/c/go/+/227198
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
The test for issue 8945 was marked to only run on gccgo, but there was
no reason for that. It broke for gccgo using GCC 10, because GCC 10
defaults to -fno-common. Make the test run on gc, and split it into
test.go and testx.go to make it work with GCC 10.
The test for issue 9026 used two identical structs which GCC 10 turns
into the same type. The point of the test is not that the structs are
identical, but that they are handled in a particular order. So make
them different.
Updates #8945
Updates #9026
Change-Id: I000fb02f88f346cfbbe5dbefedd944a2c64e8d8e
Reviewed-on: https://go-review.googlesource.com/c/go/+/211217
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
When translating C types, cache the in-progress type under its parent
names, so that anonymous structs can also be translated for multiple
typedefs, without clashing.
Standalone types are not affected by this change.
Also updated the test for issue 9026 because the C struct name
generation algorithm has changed.
Fixes#31891
Change-Id: I00cc64852a2617ce33da13f74caec886af05b9f2
Reviewed-on: https://go-review.googlesource.com/c/go/+/181857
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
GCC has supported the __atomic intrinsics since 4.7, and clang
supports them as well. They are better than the __sync intrinsics in
that they specify a memory model and, more importantly for our purposes,
they are reliably implemented either in the compiler or in libatomic.
Change-Id: I5e0036ea3300f65c28b1c3d1f3b93fb61c1cd646
Reviewed-on: https://go-review.googlesource.com/c/go/+/193603
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
Roll back CL 159258 and CL 168337. Those changes broke existing
code. I can't see any way to keep existing code working while also
producing good error messages for types like C.ulong (such as the ones
already tested for in misc/cgo/errors).
This is not an exact roll back because parts of the code have changed
since those CLs.
Updates #29878Fixes#31093
Change-Id: I56fe76c167ff0ab381ed273b9ca4b952402e1434
Reviewed-on: https://go-review.googlesource.com/c/go/+/180357
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Each different file that does import "C" must be compiled
and analyzed separately by cgo. Having fewer files import "C"
reduces the cost of building the test. This is especially important
because this test is built and run four different times (with different
settings) during all.bash.
go test -c in this directory used to take over 20 seconds on my laptop.
Now it takes under 5 seconds.
Removes 23.4r 29.0u 21.5s from all.bash.
For #26473.
Change-Id: Ie7cb7b0d9d6138ebd2eb548d0d8ea6e409ae10b9
Reviewed-on: https://go-review.googlesource.com/c/go/+/177558
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>