During garbage collection, after scanning a stack, we think about
shrinking it to reclaim some memory. The shrinking code (called
while the world is stopped) checked that the status was Gwaiting
or Grunnable and then changed the state to Gcopystack, to essentially
lock the stack so that no other GC thread is scanning it.
The same locking happens for stack growth (and is more necessary there).
oldstatus = runtime·readgstatus(gp);
oldstatus &= ~Gscan;
if(oldstatus == Gwaiting || oldstatus == Grunnable)
runtime·casgstatus(gp, oldstatus, Gcopystack); // oldstatus is Gwaiting or Grunnable
else
runtime·throw("copystack: bad status, not Gwaiting or Grunnable");
Unfortunately, "stop the world" doesn't stop everything. It stops all
normal goroutine execution, but the network polling thread is still
blocked in epoll and may wake up. If it does, and it chooses a goroutine
to mark runnable, and that goroutine is the one whose stack is shrinking,
then it can happen that between readgstatus and casgstatus, the status
changes from Gwaiting to Grunnable.
casgstatus assumes that if the status is not what is expected, it is a
transient change (like from Gwaiting to Gscanwaiting and back, or like
from Gwaiting to Gcopystack and back), and it loops until the status
has been restored to the expected value. In this case, the status has
changed semi-permanently from Gwaiting to Grunnable - it won't
change again until the GC is done and the world can continue, but the
GC is waiting for the status to change back. This wedges the program.
To fix, call a special variant of casgstatus that accepts either Gwaiting
or Grunnable as valid statuses.
Without the fix bug with the extra check+throw in casgstatus, the
program below dies in a few seconds (2-10) with GOMAXPROCS=8
on a 2012 Retina MacBook Pro. With the fix, it runs for minutes
and minutes.
package main
import (
"io"
"log"
"net"
"runtime"
)
func main() {
const N = 100
for i := 0; i < N; i++ {
l, err := net.Listen("tcp", "127.0.0.1:0")
if err != nil {
log.Fatal(err)
}
ch := make(chan net.Conn, 1)
go func() {
var err error
c1, err := net.Dial("tcp", l.Addr().String())
if err != nil {
log.Fatal(err)
}
ch <- c1
}()
c2, err := l.Accept()
if err != nil {
log.Fatal(err)
}
c1 := <-ch
l.Close()
go netguy(c1, c2)
go netguy(c2, c1)
c1.Write(make([]byte, 100))
}
for {
runtime.GC()
}
}
func netguy(r, w net.Conn) {
buf := make([]byte, 100)
for {
bigstack(1000)
_, err := io.ReadFull(r, buf)
if err != nil {
log.Fatal(err)
}
w.Write(buf)
}
}
var g int
func bigstack(n int) {
var buf [100]byte
if n > 0 {
bigstack(n - 1)
}
g = int(buf[0]) + int(buf[99])
}
Fixes#9186.
LGTM=rlh
R=austin, rlh
CC=dvyukov, golang-codereviews, iant, khr, r
https://golang.org/cl/179680043
If you get a stack of PCs from Callers, it would be expected
that every PC is immediately after a call instruction, so to find
the line of the call, you look up the line for PC-1.
CL 163550043 now explicitly documents that.
The most common exception to this is the top-most return PC
on the stack, which is the entry address of the runtime.goexit
function. Subtracting 1 from that PC will end up in a different
function entirely.
To remove this special case, make the top-most return PC
goexit+PCQuantum and then implement goexit in assembly
so that the first instruction can be skipped.
Fixes#7690.
LGTM=r
R=r
CC=golang-codereviews
https://golang.org/cl/170720043
Originally traceback was only used for printing the stack
when an unexpected signal came in. In that case, the
initial PC is taken from the signal and should be used
unaltered. For the callers, the PC is the return address,
which might be on the line after the call; we subtract 1
to get to the CALL instruction.
Traceback is now used for a variety of things, and for
almost all of those the initial PC is a return address,
whether from getcallerpc, or gp->sched.pc, or gp->syscallpc.
In those cases, we need to subtract 1 from this initial PC,
but the traceback code had a hard rule "never subtract 1
from the initial PC", left over from the signal handling days.
Change gentraceback to take a flag that specifies whether
we are tracing a trap.
Change traceback to default to "starting with a return PC",
which is the overwhelmingly common case.
Add tracebacktrap, like traceback but starting with a trap PC.
Use tracebacktrap in signal handlers.
Fixes#7690.
LGTM=iant, r
R=r, iant
CC=golang-codereviews
https://golang.org/cl/167810044
Dmitriy believes this broke Windows.
It looks like build.golang.org stopped before that,
but it's worth a shot.
««« original CL description
runtime: make pprof a little nicer
Update #8942
This does not fully address issue 8942 but it does make
the profiles much more useful, until that issue can be
fixed completely.
LGTM=dvyukov
R=r, dvyukov
CC=golang-codereviews
https://golang.org/cl/159990043
»»»
TBR=dvyukov
CC=golang-codereviews
https://golang.org/cl/160030043
Update #8942
This does not fully address issue 8942 but it does make
the profiles much more useful, until that issue can be
fixed completely.
LGTM=dvyukov
R=r, dvyukov
CC=golang-codereviews
https://golang.org/cl/159990043
Structs without tags have no unique name to use in the
Go definitions generated from the C types.
This caused issue 8812, fixed by CL 149260043.
Avoid future problems by requiring struct tags.
Update runtime as needed.
(There is no other C code in the tree.)
LGTM=bradfitz, iant
R=golang-codereviews, bradfitz, dave, iant
CC=golang-codereviews, khr, r
https://golang.org/cl/150360043
Our traceback code needs to know the PC of several special
functions, including goexit, mcall, etc. Make sure that
these PCs are initialized before any traceback occurs.
Fixes#8766
LGTM=rsc
R=golang-codereviews, rsc, khr, bradfitz
CC=golang-codereviews
https://golang.org/cl/145570043
In linker, refuse to write conservative (array of pointers) as the
garbage collection type for any variable in the data/bss GC program.
In the linker, attach the Go type to an already-read C declaration
during dedup. This gives us Go types for C globals for free as long
as the cmd/dist-generated Go code contains the declaration.
(Most runtime C declarations have a corresponding Go declaration.
Both are bss declarations and so the linker dedups them.)
In cmd/dist, add a few more C files to the auto-Go-declaration list
in order to get Go type information for the C declarations into the linker.
In C compiler, mark all non-pointer-containing global declarations
and all string data as NOPTR. This allows them to exist in C files
without any corresponding Go declaration. Count C function pointers
as "non-pointer-containing", since we have no heap-allocated C functions.
In runtime, add NOPTR to the remaining pointer-containing declarations,
none of which refer to Go heap objects.
In runtime, also move os.Args and syscall.envs data into runtime-owned
variables. Otherwise, in programs that do not import os or syscall, the
runtime variables named os.Args and syscall.envs will be missing type
information.
I believe that this CL eliminates the final source of conservative GC scanning
in non-SWIG Go programs, and therefore...
Fixes#909.
LGTM=iant
R=iant
CC=golang-codereviews
https://golang.org/cl/149770043
Normally, the caller to runtime.entersyscall() must not return before
calling runtime.exitsyscall(), lest g->syscallsp become a dangling
pointer. runtime.cgocallbackg() violates this constraint. To work around
this, save g->syscallsp and g->syscallpc around cgo->Go callbacks, then
restore them after calling runtime.entersyscall(), which restores the
syscall stack frame pointer saved by cgocall. This allows the GC to
correctly trace a goroutine that is currently returning from a
Go->cgo->Go chain.
This also adds a check to proc.c that panics if g->syscallsp is clearly
invalid. It is not 100% foolproof, as it will not catch a case where the
stack was popped then pushed back beyond g->syscallsp, but it does catch
the present cgo issue and makes existing tests fail without the bugfix.
Fixes#7978.
LGTM=dvyukov, rsc
R=golang-codereviews, dvyukov, minux, bradfitz, iant, gobot, rsc
CC=golang-codereviews, rsc
https://golang.org/cl/131910043
CL 144940043 renamed it from Sched to SchedType
to avoid a lowercasing conflict in the Go code with
the variable named sched.
We've been using just T resolve those conflicts, not Type.
The FooType pattern is already taken for the kind-specific
variants of the runtime Type structure: ChanType, MapType,
and so on. SchedType isn't a Type.
LGTM=bradfitz, khr
R=khr, bradfitz
CC=golang-codereviews
https://golang.org/cl/145180043
In Go 1.3 the runtime called panicstring to report errors like
divide by zero or memory faults. Now we call panic (gopanic)
with pre-allocated error values. That new path is missing the
checking that panicstring did, so add it there.
The only call to panicstring left is in cnew, which is problematic
because if it fails, probably the heap is corrupt. In that case,
calling panicstring creates a new errorCString (no allocation there),
but then panic tries to print it, invoking errorCString.Error, which
does a string concatenation (allocating), which then dies.
Replace that one panicstring with a throw: cnew is for allocating
runtime data structures and should never ask for an inappropriate
amount of memory.
With panicstring gone, delete newErrorCString, errorCString.
While we're here, delete newErrorString, not called by anyone.
(It can't be: that would be C code calling Go code that might
block or grow the stack.)
Found while debugging a malloc corruption.
This resulted in 'panic during panic' instead of a more useful message.
LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/138290045
It will be 8K on windows because it needs 4K for the OS.
Similarly, plan9 will be 4K.
On linux/amd64, reduces size of 100,000 goroutines
from ~819MB to ~245MB.
Update #7514
LGTM=dvyukov
R=golang-codereviews, dvyukov, khr, aram
CC=golang-codereviews
https://golang.org/cl/145790043
semacquire might need to park the currently running G. It can
only park if called from the G stack (because it has no way of
saving the M stack state). So all calls to semacquire must come
from the G stack.
The three violators are GOMAXPROCS, ReadMemStats, and WriteHeapDump.
This change moves the semacquire call earlier, out of their C code
and into their Go code.
This seldom caused bugs because semacquire seldom actually had
to park the caller. But it did happen intermittently.
Fixes#8749
LGTM=dvyukov
R=golang-codereviews, dvyukov, bradfitz
CC=golang-codereviews
https://golang.org/cl/144940043
The uses of onM in dopanic/startpanic are okay even from the signal stack.
Fixes#8666.
LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/134710043
A race exists between the parent and child processes after a fork.
The child needs to access the new M pointer passed as an argument
but the parent may have already returned and clobbered it.
Previously, we avoided this by saving the necessary data into
registers before the rfork system call but this isn't guaranteed
to work because Plan 9 makes no promises about the register state
after a system call. Only the 386 kernel seems to save them.
For amd64 and arm, this method won't work.
We eliminate the race by allocating stack space for the scheduler
goroutines (g0) in the per-process copy-on-write stack segment and
by only calling rfork on the scheduler stack.
LGTM=aram, 0intro, rsc
R=aram, 0intro, mischief, rsc
CC=golang-codereviews
https://golang.org/cl/110680044
Start the stack a few words below the actual top, so that
if something tries to read goexit's caller PC from the stack,
it won't fault on a bad memory address.
Today, heapdump does that.
Maybe tomorrow, traceback or something else will do that.
Make it not a bug.
TBR=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/136450043
Commit to stack copying for stack growth.
We're carrying around a surprising amount of cruft from older schemes.
I am confident that precise stack scans and stack copying are here to stay.
Delete fallback code for when precise stack info is disabled.
Delete fallback code for when copying stacks is disabled.
Delete fallback code for when StackCopyAlways is disabled.
Delete Stktop chain - there is only one stack segment now.
Delete M.moreargp, M.moreargsize, M.moreframesize, M.cret.
Delete G.writenbuf (unrelated, just dead).
Delete runtime.lessstack, runtime.oldstack.
Delete many amd64 morestack variants.
Delete initialization of morestack frame/arg sizes (shortens split prologue!).
Replace G's stackguard/stackbase/stack0/stacksize/
syscallstack/syscallguard/forkstackguard with simple stack
bounds (lo, hi).
Update liblink, runtime/cgo for adjustments to G.
LGTM=khr
R=khr, bradfitz
CC=golang-codereviews, iant, r
https://golang.org/cl/137410043
I assumed they were the same when I wrote
cgocallback.go earlier today. Merge them
to eliminate confusion.
I can't tell what gomallocgc did before with
a nil type but without FlagNoScan.
I created a call like that in cgocallback.go
this morning, translating from a C file.
It was supposed to do what the C version did,
namely treat the block conservatively.
Now it will.
LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/141810043
This CL contains compiler+runtime changes that detect C code
running on Go (not g0, not gsignal) stacks, and it contains
corrections for what it detected.
The detection works by changing the C prologue to use a different
stack guard word in the G than Go prologue does. On the g0 and
gsignal stacks, that stack guard word is set to the usual
stack guard value. But on ordinary Go stacks, that stack
guard word is set to ^0, which will make any stack split
check fail. The C prologue then calls morestackc instead
of morestack, and morestackc aborts the program with
a message about running C code on a Go stack.
This check catches all C code running on the Go stack
except NOSPLIT code. The NOSPLIT code is allowed,
so the check is complete. Since it is a dynamic check,
the code must execute to be caught. But unlike the static
checks we've been using in cmd/ld, the dynamic check
works with function pointers and other indirect calls.
For example it caught sigpanic being pushed onto Go
stacks in the signal handlers.
Fixes#8667.
LGTM=khr, iant
R=golang-codereviews, khr, iant
CC=golang-codereviews, r
https://golang.org/cl/133700043
avoid tight coupling between deferreturn and jmpdefer.
before, jmpdefer knew the exact frame size of deferreturn
in order to pop it off the stack. now, deferreturn passes
jmpdefer a pointer to the frame above it explicitly.
that avoids a magic constant and should be less fragile.
R=r
DELTA=32 (6 added, 3 deleted, 23 changed)
OCL=29801
CL=29804
160 - 75 was just barely not enough for deferproc + morestack.
added enum names and bumped to 256 - 128.
added explanation.
changed a few mal() (garbage-collected) to
malloc()/free() (manually collected).
R=ken
OCL=26981
CL=26981
trying to find all the places where
spans might be recorded.
Free can cascade into complicated
span manipulations that move them
from list to list; the old code had the
possibility of accidentally processing
a span twice or jumping to a different
list, causing an infinite loop.
R=r
DELTA=70 (28 added, 25 deleted, 17 changed)
OCL=23704
CL=23710
mark and sweep, stop the world garbage collector
(intermediate step in the way to ref counting).
can run pretty with an explicit gc after each file.
R=r
DELTA=502 (346 added, 143 deleted, 13 changed)
OCL=20630
CL=20635
run oldstack on g0's stack, just like newstack does,
so that oldstack can free the old stack.
R=r
DELTA=53 (44 added, 0 deleted, 9 changed)
OCL=20404
CL=20433
not number of threads. can still starve all the other threads,
but only by looping, not by waiting in a system call.
fix darwin syscall.Syscall6 bug.
fix chanclient bug.
delete $GOMAXPROCS from network tests.
add stripped down printf, sys.printhex to runtime.
R=r
DELTA=355 (217 added, 36 deleted, 102 changed)
OCL=20017
CL=20019