Implement CLONE_INTO_CGROUP feature, allowing to put a child in a
specified cgroup in a clean and simple way. Note that the feature only
works for cgroup v2, and requires Linux kernel 5.7 or newer.
Using the feature requires a new syscall, clone3. Currently this is the
only reason to use clone3, but the code is structured in a way so that
other cases may be easily added in the future.
Add a test case.
While at it, try to simplify the syscall calling code in
forkAndExecInChild1, which became complicated over time because:
1. It was using either rawVforkSyscall or RawSyscall6 depending on
whether CLONE_NEWUSER was set.
2. On Linux/s390, the first two arguments to clone(2) system call are
swapped (which deserved a mention in Linux ABI hall of shame). It
was worked around in rawVforkSyscall on s390, but had to be
implemented via a switch/case when using RawSyscall6, making the code
less clear.
Let's
- modify rawVforkSyscall to have two arguments (which is also required
for clone3);
- remove the arguments workaround from s390 asm, instead implementing
arguments swap in the caller (which still looks ugly but at least
it's done once and is clearly documented now);
- use rawVforkSyscall for all cases (since it is essentially similar to
RawSyscall6, except for having less parameters, not returning r2, and
saving/restoring the return address before/after syscall on 386 and
amd64).
Updates #51246.
Change-Id: Ifcd418ebead9257177338ffbcccd0bdecb94474e
Reviewed-on: https://go-review.googlesource.com/c/go/+/417695
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Run-TryBot: Kirill Kolyshkin <kolyshkin@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
This is an exact copy of CL 388478 after fixing #52472 in CL 401654.
For #51087
For #52472
Change-Id: I6c6bd7ddcab1512c682e6b44f61c7bcde97f5c58
Reviewed-on: https://go-review.googlesource.com/c/go/+/401655
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
This is a re-do of CL 388477, fixing #52472.
It is unsafe to call syscall.RawSyscall from syscall.Syscall with
-coverpkg=all and -race. This is because:
1. Coverage adds a sync/atomic call in RawSyscall to increment the
coverage counter.
2. Race mode instruments sync/atomic calls with TSAN runtime calls. TSAN
eventually calls runtime.racecallbackfunc, which expects
getg().m.p != 0, which is no longer true after entersyscall().
cmd/go actually avoids adding coverage instrumention to package runtime
in race mode entirely to avoid these kinds of problems. Rather than also
excluding all of syscall for this one function, work around by calling
RawSyscall6 instead, which avoids coverage instrumention both by being
written in assembly and in package runtime/*.
For #51087Fixes#52472
Change-Id: Iaffd27df03753020c4716059a455d6ca7b62f347
Reviewed-on: https://go-review.googlesource.com/c/go/+/401654
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Currently, when register ABI is used, syscall.Syscall calls
entersyscall via a wrapper, so the actual entersyscall records the
caller PC and SP of the wrapper. At the point of the actual
syscall, the wrapper frame is gone, so the recorded PC and SP are
technically invalid. Furthermore, in some functions on some
platforms (e.g. Syscall9 on NetBSD/AMD64), that frame is
overwritten. If we unwind the stack from the recorded syscallpc
and syscallsp, it may go wrong. Fix this by calling the
ABIInternal function directly.
exitsyscall calls are changed as well. It doesn't really matter,
just changed for consistency.
Change-Id: Iead8dd22cf32b05e382414fef664b7c4c1719b7c
Reviewed-on: https://go-review.googlesource.com/c/go/+/376356
Trust: Cherry Mui <cherryyz@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
This allows the use of CLONE_VFORK and CLONE_VM for fork/exec, preventing
'fork/exec ...: cannot allocate memory' failures from occuring when attempting
to execute commands from a Go process that has a large memory footprint.
Additionally, this should reduce the latency of fork/exec on these platforms.
Fixes#31936
Change-Id: I4e28cf0763173145cacaa5340680dca9ff449305
Reviewed-on: https://go-review.googlesource.com/c/go/+/295849
Trust: Joel Sing <joel@sing.id.au>
Run-TryBot: Joel Sing <joel@sing.id.au>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
This allows the use of CLONE_VFORK and CLONE_VM for fork/exec, preventing
"fork/exec ...: cannot allocate memory" failures from occuring when attempting
to execute commands from a Go process that has a large memory footprint.
Additionally, this should reduce the latency of fork/exec on linux/arm64.
With CLONE_VM the child process shares the same memory with the parent
process. On its own this would lead to conflicting use of the same
memory, so CLONE_VFORK is used to suspend the parent process until the
child releases the memory when switching to the new program binary
via the exec syscall. When the parent process continues to run, one
has to consider the changes to memory that the child process did,
namely the return address of the syscall function needs to be restored
from a register.
exec.Command() callers can start in a faster manner, as child process who
do exec commands job can be cloned faster via vfork than via fork on arm64.
The same problem was addressed on linux/amd64 via issue #5838.
Updates #31936
Contributed by Howard Zhang <howard.zhang@arm.com> and Bin Lu <bin.lu@arm.com>
Change-Id: Ia99d81d877f564ec60d19f17e596276836576eaf
Reviewed-on: https://go-review.googlesource.com/c/go/+/189418
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Add the rawSyscallNoError wrapper function which is used for Linux
syscalls that don't return an error and convert all applicable
occurences of RawSyscall to use it instead.
Fixes#22924
Change-Id: Iff1eddb54573d459faa01471f10398b3d38528dd
Reviewed-on: https://go-review.googlesource.com/84485
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>