Manifested as increased memory usage in a Google production system.
Not an unbounded leak, but can significantly increase the number
of sudogs allocated between garbage collections.
I checked all the other calls to acquireSudog.
This is the only one that was missing a releaseSudog.
LGTM=r, dneil
R=dneil, r
CC=golang-codereviews
https://golang.org/cl/169260043
These are being built into the runtime/cgo for every
operating system. It doesn't seem to matter, but
restore the Go 1.3 behavior anyway.
LGTM=r
R=r, dave
CC=golang-codereviews
https://golang.org/cl/171290043
Stack bitmaps need to be scanned past any BitsDead entries.
Object bitmaps will not have any BitsDead in them (bitmap extraction stops at
the first BitsDead entry in makeheapobjbv). data/bss bitmaps also have no BitsDead entries.
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/168270043
Gentraceback may grow the stack.
One of the gentraceback wrappers may grow the stack.
One of the gentraceback callback calls may grow the stack.
Various stack pointers are stored in various stack locations
as type uintptr during the execution of these calls.
If the stack does grow, these stack pointers will not be
updated and will start trying to decode stack memory that
is no longer valid.
It may be possible to change the type of the stack pointer
variables to be unsafe.Pointer, but that's pretty subtle and
may still have problems, even if we catch every last one.
An easier, more obviously correct fix is to require that
gentraceback of the currently running goroutine must run
on the g0 stack, not on the goroutine's own stack.
Not doing this causes faults when you set
StackFromSystem = 1
StackFaultOnFree = 1
The new check in gentraceback will catch future lapses.
The more general problem is calling getcallersp but then
calling a function that might relocate the stack, which would
invalidate the result of getcallersp. Add note to stubs.go
declaration of getcallersp explaining the problem, and
check all existing calls to getcallersp. Most needed fixes.
This affects Callers, Stack, and nearly all the runtime
profiling routines. It does not affect stack copying directly
nor garbage collection.
LGTM=khr
R=khr, bradfitz
CC=golang-codereviews, r
https://golang.org/cl/167060043
Previously, the flags argument to mallocgc was an int in Go,
but a uint32 in C. Change the Go type to use uint32 so these
agree. The largest flag value is 2 (and of course no flag
values are negative), so this won't change anything on little
endian architectures, but it matters on big endian.
LGTM=rsc
R=khr, rsc
CC=golang-codereviews
https://golang.org/cl/169920043
If you get a stack of PCs from Callers, it would be expected
that every PC is immediately after a call instruction, so to find
the line of the call, you look up the line for PC-1.
CL 163550043 now explicitly documents that.
The most common exception to this is the top-most return PC
on the stack, which is the entry address of the runtime.goexit
function. Subtracting 1 from that PC will end up in a different
function entirely.
To remove this special case, make the top-most return PC
goexit+PCQuantum and then implement goexit in assembly
so that the first instruction can be skipped.
Fixes#7690.
LGTM=r
R=r
CC=golang-codereviews
https://golang.org/cl/170720043
Originally traceback was only used for printing the stack
when an unexpected signal came in. In that case, the
initial PC is taken from the signal and should be used
unaltered. For the callers, the PC is the return address,
which might be on the line after the call; we subtract 1
to get to the CALL instruction.
Traceback is now used for a variety of things, and for
almost all of those the initial PC is a return address,
whether from getcallerpc, or gp->sched.pc, or gp->syscallpc.
In those cases, we need to subtract 1 from this initial PC,
but the traceback code had a hard rule "never subtract 1
from the initial PC", left over from the signal handling days.
Change gentraceback to take a flag that specifies whether
we are tracing a trap.
Change traceback to default to "starting with a return PC",
which is the overwhelmingly common case.
Add tracebacktrap, like traceback but starting with a trap PC.
Use tracebacktrap in signal handlers.
Fixes#7690.
LGTM=iant, r
R=r, iant
CC=golang-codereviews
https://golang.org/cl/167810044
Attempt to clear up confusion about how to turn
the PCs reported by Callers into the file and line
number people actually want.
Fixes#7690.
LGTM=r, chris.cs.guy
R=r, chris.cs.guy
CC=golang-codereviews
https://golang.org/cl/163550043
goprintf is a printf-like print for Go.
It is used in the code generated by 'defer print(...)' and 'go print(...)'.
Normally print(1, 2, 3) turns into
printint(1)
printint(2)
printint(3)
but defer and go need a single function call to give the runtime;
they give the runtime something like goprintf("%d%d%d", 1, 2, 3).
Variadic functions like goprintf cannot be described in the new
type information world, so we have to replace it.
Replace with a custom function, so that defer print(1, 2, 3) turns
into
defer func(a1, a2, a3 int) {
print(a1, a2, a3)
}(1, 2, 3)
(and then the print becomes three different printints as usual).
Fixes#8614.
LGTM=austin
R=austin
CC=golang-codereviews, r
https://golang.org/cl/159700043
The ftab ends with a half functab record consisting only of
the 'entry' field followed by a uint32 giving the offset of
the next table. Previously, symtabinit assumed it could read
this uint32 as a uintptr. Since this is unsafe on big endian,
explicitly read the offset as a uint32.
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/157660043
Fixes#8348.
Trying to work around clang's dodgy support for .arch by reverting to the external assembler didn't work out so well. Minux had a much better solution to encode the instructions we need as .word directives which avoids .arch altogether.
I've confirmed with gdb that this form produces the expected machine code
Dump of assembler code for function crosscall_arm1:
0x00000000 <+0>: push {r4, r5, r6, r7, r8, r9, r10, r11, r12, lr}
0x00000004 <+4>: mov r4, r0
0x00000008 <+8>: mov r5, r1
0x0000000c <+12>: mov r0, r2
0x00000010 <+16>: blx r5
0x00000014 <+20>: blx r4
0x00000018 <+24>: pop {r4, r5, r6, r7, r8, r9, r10, r11, r12, pc}
There is another compilation failure that blocks building Go with clang on arm
# ../misc/cgo/test
# _/home/dfc/go/misc/cgo/test
/tmp/--407b12.s: Assembler messages:
/tmp/--407b12.s:59: Error: selected processor does not support ARM mode `blx r0'
clang: error: assembler command failed with exit code 1 (use -v to see invocation)
FAIL _/home/dfc/go/misc/cgo/test [build failed]
I'll open a new issue for that
LGTM=iant
R=iant, minux
CC=golang-codereviews
https://golang.org/cl/158180047
Get rid of gocputicks(), it is no longer used.
LGTM=bradfitz, dave
R=golang-codereviews, bradfitz, dave, minux
CC=golang-codereviews
https://golang.org/cl/161110044
It has been failing periodically on Solaris/x64.
Change blockevent so it always records an event if we called
SetBlockProfileRate(1), even if the time delta is negative or zero.
Hopefully this will fix the test on Solaris.
Caveat: I don't actually know what the Solaris problem is, this
is just an educated guess.
LGTM=dave
R=dvyukov, dave
CC=golang-codereviews
https://golang.org/cl/159150043
Russ Cox pointed out that environment strings are not
required to be nil-terminated on Plan 9.
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/159130044
select {
case <- c:
case <- c:
}
In this case, c.recvq lists two SudoGs which have the same G.
So we can't use the G as the key to dequeue the correct SudoG,
as that key is ambiguous. Dequeueing the wrong SudoG ends up
freeing a SudoG that is still in c.recvq.
The fix is to use the actual SudoG pointer as the key.
LGTM=dvyukov
R=rsc, bradfitz, dvyukov, khr
CC=austin, golang-codereviews
https://golang.org/cl/159040043
Don't use cmd/pprof as it is not necessary installed
and does not work on nacl and plan9.
Instead just look at the raw profile.
LGTM=crawshaw, rsc
R=golang-codereviews, crawshaw, 0intro, rsc
CC=golang-codereviews
https://golang.org/cl/159010043
gogo called from GC is okay
for the same reasons that
gogo called from System or ExternalCode is okay.
All three are fake stack traces.
Fixes#8408.
LGTM=dvyukov, r
R=r, dvyukov
CC=golang-codereviews
https://golang.org/cl/152580043
Dmitriy believes this broke Windows.
It looks like build.golang.org stopped before that,
but it's worth a shot.
««« original CL description
runtime: make pprof a little nicer
Update #8942
This does not fully address issue 8942 but it does make
the profiles much more useful, until that issue can be
fixed completely.
LGTM=dvyukov
R=r, dvyukov
CC=golang-codereviews
https://golang.org/cl/159990043
»»»
TBR=dvyukov
CC=golang-codereviews
https://golang.org/cl/160030043
It cannot run 'go tool pprof'. There is no guarantee that's installed.
It needs to build a temporary pprof binary and run that.
It also needs to skip the test on systems that can't build and
run binaries, namely android and nacl.
See src/cmd/nm/nm_test.go's TestNM for a template.
Update #8867
Status: Accepted
TBR=dvyukov
CC=golang-codereviews
https://golang.org/cl/153710043
Update #8942
This does not fully address issue 8942 but it does make
the profiles much more useful, until that issue can be
fixed completely.
LGTM=dvyukov
R=r, dvyukov
CC=golang-codereviews
https://golang.org/cl/159990043
There are 3 issues:
1. Skip argument of callers is off by 3,
so that all allocations are deep inside of memory profiler.
2. Memory profiling statistics are not updated after runtime.GC.
3. Testing package does not update memory profiling statistics
before capturing the profile.
Also add an end-to-end test.
Fixes#8867.
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/148710043
A Go prototype can be used instead now, and the compiler
will do a better job than we will doing it by hand.
(We got it wrong in amd64p32, causing the current build
breakage.)
The auto-prototype-matching only applies to functions
without an explicit package path, so the TEXT lines for
reflectcall and callXX are s/runtime·/·/.
LGTM=khr
R=khr
CC=golang-codereviews, iant, r
https://golang.org/cl/153600043
The racewalk code was not updated for the new write barriers.
Make it more future-proof.
The new write barrier code assumed that +1 pointer would
be aligned properly for any type that might follow, but that's
not true on 32-bit systems where some types are 64-bit aligned.
The only system like that today is nacl/amd64p32.
Insert a dummy pointer so that the ambiguously typed
value is at +2 pointers, which is always max-aligned.
LGTM=r
R=r
CC=golang-codereviews, iant, khr
https://golang.org/cl/158890046
includes undo of 22318cd31d7d and also:
- always use SetUnhandledExceptionFilter on windows-386;
- crash when receive EXCEPTION_BREAKPOINT in exception handler.
Fixes#8006.
LGTM=rsc
R=golang-codereviews, rsc
CC=golang-codereviews
https://golang.org/cl/155360043
Assignments of 2-, 3-, and 4-word values were handled
by individual MOV instructions (and for scalars still are).
But if there are pointers involved, those assignments now
go through the write barrier routine. Before this CL, they
went to writebarrierfat, which calls memmove.
Memmove is too much overhead for these small
amounts of data.
Instead, call writebarrierfat{2,3,4}, which are specialized
for the specific amount of data being copied.
Today the write barrier does not care which words are
pointers, so size alone is enough to distinguish the cases.
If we keep these distinctions in Go 1.5 we will need to
expand them for all the pointer-vs-scalar possibilities,
so the current 3 functions will become 3+7+15 = 25,
still not a large burden (we deleted more morestack
functions than that when we dropped segmented stacks).
BenchmarkBinaryTree17 3250972583 3123910344 -3.91%
BenchmarkFannkuch11 3067605223 2964737839 -3.35%
BenchmarkFmtFprintfEmpty 101 96.0 -4.95%
BenchmarkFmtFprintfString 267 235 -11.99%
BenchmarkFmtFprintfInt 261 253 -3.07%
BenchmarkFmtFprintfIntInt 444 402 -9.46%
BenchmarkFmtFprintfPrefixedInt 374 346 -7.49%
BenchmarkFmtFprintfFloat 472 449 -4.87%
BenchmarkFmtManyArgs 1537 1476 -3.97%
BenchmarkGobDecode 13986528 12432985 -11.11%
BenchmarkGobEncode 13120323 12537420 -4.44%
BenchmarkGzip 451925758 437500578 -3.19%
BenchmarkGunzip 113267612 110053644 -2.84%
BenchmarkHTTPClientServer 103151 77100 -25.26%
BenchmarkJSONEncode 25002733 23435278 -6.27%
BenchmarkJSONDecode 94213717 82568789 -12.36%
BenchmarkMandelbrot200 4804246 4713070 -1.90%
BenchmarkGoParse 4646114 4379456 -5.74%
BenchmarkRegexpMatchEasy0_32 163 158 -3.07%
BenchmarkRegexpMatchEasy0_1K 433 391 -9.70%
BenchmarkRegexpMatchEasy1_32 154 138 -10.39%
BenchmarkRegexpMatchEasy1_1K 1481 1132 -23.57%
BenchmarkRegexpMatchMedium_32 282 270 -4.26%
BenchmarkRegexpMatchMedium_1K 92421 86149 -6.79%
BenchmarkRegexpMatchHard_32 5209 4718 -9.43%
BenchmarkRegexpMatchHard_1K 158141 147921 -6.46%
BenchmarkRevcomp 699818791 642222464 -8.23%
BenchmarkTemplate 132402383 108269713 -18.23%
BenchmarkTimeParse 509 478 -6.09%
BenchmarkTimeFormat 462 456 -1.30%
LGTM=r
R=r
CC=golang-codereviews
https://golang.org/cl/156200043
That was complete failure - builders are broken,
but original cl worked fine on my system.
I will need access to builders
to test this change properly.
««« original CL description
runtime: handle all windows exception
Fixes#8006.
LGTM=rsc
R=golang-codereviews, rsc
CC=golang-codereviews
https://golang.org/cl/145150043
»»»
TBR=rsc
R=golang-codereviews
CC=golang-codereviews
https://golang.org/cl/154180043
In channels, zeroing of gp.waiting is missed on a closed channel panic.
m.morebuf.g is not zeroed.
I don't expect the latter causes any problems, but just in case.
LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews
https://golang.org/cl/151610043
This change prevents confusion in the garbage collector.
The collector wants to make sure that every pointer it finds
isn't junk. Its criteria for junk is (among others) points
to a "free" span.
Because the stack shrinker modifies pointers in the heap,
there is a race condition between the GC scanner and the
shrinker. The GC scanner can see old pointers (pointers to
freed stacks). In particular this happens with SudoG.elem
pointers.
Normally this is not a problem, as pointers into stack spans
are ok. But if the freed stack is the last one in its span,
the span is marked as "free" instead of "contains stacks".
This change makes sure that even if the GC scanner sees
an old pointer, the span into which it points is still
marked as "contains stacks", and thus the GC doesn't
complain about it.
This change will make the GC pause a tiny bit slower, as
the stack freeing now happens in serial with the mark pause.
We could delay the freeing until the mutators start back up,
but this is the simplest change for now.
TBR=dvyukov
CC=golang-codereviews
https://golang.org/cl/158750043
The change contains 3 spot optimizations to scan loop:
1. Don't use byte vars, use uintptr's instead.
This seems to alleviate some codegen issue,
and alone accounts to a half of speedup.
2. Remove bitmap cache. Currently we cache only 1 byte,
so caching is not particularly effective anyway.
Removal of the cache simplifies code and positively affects regalloc.
3. Replace BitsMultiword switch with if and
do debug checks only in Debug mode.
I've benchmarked changes separately and ensured that
each of them provides speedup on top of the previous one.
This change as a whole fixes the unintentional regressions
of scan loop that were introduced during development cycle.
Fixes#8625.
Fixes#8565.
On go.benchmarks/garbage benchmark:
GOMAXPROCS=1
time: -3.13%
cputime: -3.22%
gc-pause-one: -15.71%
gc-pause-total: -15.71%
GOMAXPROCS=32
time: -1.96%
cputime: -4.43%
gc-pause-one: -6.22%
gc-pause-total: -6.22%
LGTM=khr, rsc
R=golang-codereviews, khr
CC=golang-codereviews, rlh, rsc
https://golang.org/cl/153990043
Out of stack space due to new 2-word call in freedefer.
Go back to smaller function calls.
TBR=brainman
CC=golang-codereviews
https://golang.org/cl/152340043
It appears to be an opaque bit pattern more than a pointer.
The Go garbage collector has discovered that for m0
it is set to 0x4c.
Should fix Windows build.
TBR=brainman
CC=golang-codereviews
https://golang.org/cl/149640043
Another dangling stack pointer in a cached structure.
Same as SudoG.elem and SudoG.selectdone.
Definitely a fix, and the new test in freedefer makes the
crash reproducible, but probably not a complete fix.
I have seen one dangling pointer in a Defer.panic even
after this fix; I cannot see where it could be coming from.
I think this will fix the solaris build.
I do not think this will fix the occasional failure on the darwin build.
TBR=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/155080043
I have a CL which at every gc looks through data and bss
sections for nonpointer data (according to gc maps) that
looks like a pointer. These are potential missing roots.
The only thing it finds are begnign, storing stack pointers
into m0.scalararg[1] and never cleaning them up. Let's
clean them up now so the test CL passes all.bash cleanly.
The test CL can't be checked in because we might store
pointer-looking things in nonpointer data by accident.
LGTM=iant
R=golang-codereviews, iant, khr
CC=golang-codereviews
https://golang.org/cl/153210043
We no longer have full type information in the heap, so
we can't dump that any more. Instead we dump ptr/noptr
maps so at least we can compute graph connectivity.
In addition, we still dump Iface/Eface types so together
with dwarf type info we might be able to reconstruct
types of most things in the heap.
LGTM=dvyukov
R=golang-codereviews, dvyukov, rsc, khr
CC=golang-codereviews
https://golang.org/cl/155940043