This gives them correct types in Go and also makes it
possible to use them to run Go code on an m stack.
LGTM=iant
R=golang-codereviews, dave, iant
CC=dvyukov, golang-codereviews, khr, r
https://golang.org/cl/137970044
The two converted files were nearly identical.
Instead of continuing that duplication, I merged them
into a single traceback.go.
Tested on arm, amd64, amd64p32, and 386.
LGTM=r
R=golang-codereviews, remyoudompheng, dave, r
CC=dvyukov, golang-codereviews, iant, khr
https://golang.org/cl/134200044
The exported Go definitions appearing in mprof.go are
copied verbatim from debug.go.
The unexported Go funcs and types are new.
The C Bucket type used a union and was not a line-for-line translation.
LGTM=remyoudompheng
R=golang-codereviews, remyoudompheng
CC=dvyukov, golang-codereviews, iant, khr, r
https://golang.org/cl/137040043
uintptr is better when translating to Go,
and in a few places it's better in C too.
LGTM=r
R=golang-codereviews, r
CC=golang-codereviews, iant, khr
https://golang.org/cl/138980043
Renaming the C SysAlloc will let Go define a prototype without exporting it.
For use in cpuprof.goc's translation to Go.
LGTM=mdempsky
R=golang-codereviews, mdempsky
CC=golang-codereviews, iant
https://golang.org/cl/140060043
Every change to g->atomicstatus is now done atomically so that we can
ensure that all gs pass through a gc safepoint on demand. This allows
the GC to move from one phase to the next safely. In some phases the
stack will be scanned. This CL only deals with the infrastructure that
allows g->atomicstatus to go from one state to another. Future CLs
will deal with scanning and monitoring what phase the GC is in.
The major change was to moving to using a Gscan bit to indicate that
the status is in a scan state. The only bug fix was in oldstack where
I wasn't moving to a Gcopystack state in order to block scanning until
the new stack was in place. The proc.go file is waiting for an atomic
load instruction.
LGTM=rsc
R=golang-codereviews, dvyukov, josharian, rsc
CC=golang-codereviews, khr
https://golang.org/cl/132960044
Before, a slice with cap=0 or a string with len=0 might have its
base pointer pointing beyond the actual slice/string data into
the next block. The collector had to ignore slices and strings with
cap=0 in order to avoid misinterpreting the base pointer.
Now, a slice with cap=0 or a string with len=0 still has a base
pointer pointing into the actual slice/string data, no matter what.
The collector can now always scan the pointer, which means
strings and slices are no longer special.
Fixes#8404.
LGTM=khr, josharian
R=josharian, khr, dvyukov
CC=golang-codereviews
https://golang.org/cl/112570044
A whole thread is too much for background scavenger that sleeps all the time anyway.
We already have sysmon thread that can do this work.
Also remove g->isbackground and simplify enter/exitsyscall.
LGTM=rsc
R=golang-codereviews, rsc
CC=golang-codereviews, khr, rlh
https://golang.org/cl/108640043
These are required for chans, semaphores, timers, etc.
LGTM=khr
R=golang-codereviews, khr
CC=golang-codereviews, rlh, rsc
https://golang.org/cl/123640043
Calling ReadMemStats which does stoptheworld on m0 holding locks
was not a good idea.
Stoptheworld holding locks is a recipe for deadlocks (added check for this).
Stoptheworld on g0 may or may not work (added check for this as well).
As far as I understand scavenger will print incorrect numbers now,
as stack usage is not subtracted from heap. But it's better than deadlocking.
LGTM=khr
R=golang-codereviews, rsc, khr
CC=golang-codereviews, rlh
https://golang.org/cl/124670043
Half the code in the garbage collector accesses the bitmap
as an array of bytes instead of as an array of uintptrs.
This is tricky to do correctly in a portable fashion,
it breaks on big-endian systems.
Make the bitmap a byte array.
Simplifies markallocated, scanblock and span sweep along the way,
as we don't need to recalculate bitmap position for each word.
LGTM=khr
R=golang-codereviews, khr
CC=golang-codereviews, rlh, rsc
https://golang.org/cl/125250043
We need to change the interface value representation for
concurrent garbage collection, so that there is no ambiguity
about whether the data word holds a pointer or scalar.
This CL does NOT make any representation changes.
Instead, it removes representation assumptions from
various pieces of code throughout the tree.
The isdirectiface function in cmd/gc/subr.c is now
the only place that decides that policy.
The policy propagates out from there in the reflect
metadata, as a new flag in the internal kind value.
A follow-up CL will change the representation by
changing the isdirectiface function. If that CL causes
problems, it will be easy to roll back.
Update #8405.
LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews, r
https://golang.org/cl/129090043
Restore https://golang.org/cl/41040043 after GC rewrite.
Original description:
On the plus side, we don't need to change the bits on malloc and free.
On the downside, we need to mark objects in the free lists during GC.
But the free lists are small at GC time, so it should be a net win.
benchmark old ns/op new ns/op delta
BenchmarkMalloc8 21.9 20.4 -6.85%
BenchmarkMalloc16 31.1 29.6 -4.82%
LGTM=khr
R=khr
CC=golang-codereviews, rlh, rsc
https://golang.org/cl/122280043
Eliminating use of this extension makes it easier to port the Go runtime
to other compilers. This CL also disables the extension in cc to prevent
accidental use.
LGTM=rsc, khr
R=rsc, aram, khr, dvyukov
CC=axwalk, golang-codereviews
https://golang.org/cl/106790044
FlagNoGC is unused now.
FlagNoInvokeGC is unneeded as we don't invoke GC
on g0 and when holding locks anyway.
mal/malloc have very few uses and you never remember
the exact set of flags they use and the difference between them.
Moreover, eventually we need to give exact types to all allocations,
something what mal/malloc do not support.
LGTM=khr
R=golang-codereviews, khr
CC=golang-codereviews, rsc
https://golang.org/cl/117580043
Implement the design described in:
https://docs.google.com/document/d/1v4Oqa0WwHunqlb8C3ObL_uNQw3DfSY-ztoA-4wWbKcg/pub
Summary of the changes:
GC uses "2-bits per word" pointer type info embed directly into bitmap.
Scanning of stacks/data/heap is unified.
The old spans types go away.
Compiler generates "sparse" 4-bits type info for GC (directly for GC bitmap).
Linker generates "dense" 2-bits type info for data/bss (the same as stacks use).
Summary of results:
-1680 lines of code total (-1000+ in mgc0.c only)
-25% memory consumption
-3-7% binary size
-15% GC pause reduction
-7% run time reduction
LGTM=khr
R=golang-codereviews, rsc, christoph, khr
CC=golang-codereviews, rlh
https://golang.org/cl/106260045
The runtime has historically held two dedicated values g (current goroutine)
and m (current thread) in 'extern register' slots (TLS on x86, real registers
backed by TLS on ARM).
This CL removes the extern register m; code now uses g->m.
On ARM, this frees up the register that formerly held m (R9).
This is important for NaCl, because NaCl ARM code cannot use R9 at all.
The Go 1 macrobenchmarks (those with per-op times >= 10 µs) are unaffected:
BenchmarkBinaryTree17 5491374955 5471024381 -0.37%
BenchmarkFannkuch11 4357101311 4275174828 -1.88%
BenchmarkGobDecode 11029957 11364184 +3.03%
BenchmarkGobEncode 6852205 6784822 -0.98%
BenchmarkGzip 650795967 650152275 -0.10%
BenchmarkGunzip 140962363 141041670 +0.06%
BenchmarkHTTPClientServer 71581 73081 +2.10%
BenchmarkJSONEncode 31928079 31913356 -0.05%
BenchmarkJSONDecode 117470065 113689916 -3.22%
BenchmarkMandelbrot200 6008923 5998712 -0.17%
BenchmarkGoParse 6310917 6327487 +0.26%
BenchmarkRegexpMatchMedium_1K 114568 114763 +0.17%
BenchmarkRegexpMatchHard_1K 168977 169244 +0.16%
BenchmarkRevcomp 935294971 914060918 -2.27%
BenchmarkTemplate 145917123 148186096 +1.55%
Minux previous reported larger variations, but these were caused by
run-to-run noise, not repeatable slowdowns.
Actual code changes by Minux.
I only did the docs and the benchmarking.
LGTM=dvyukov, iant, minux
R=minux, josharian, iant, dave, bradfitz, dvyukov
CC=golang-codereviews
https://golang.org/cl/109050043
C globals are conservatively scanned. This helps
avoid false retention, especially for 32 bit.
LGTM=rsc
R=golang-codereviews, khr, rsc
CC=golang-codereviews
https://golang.org/cl/102040043
The 'continuation pc' is where the frame will continue
execution, if anywhere. For a frame that stopped execution
due to a CALL instruction, the continuation pc is immediately
after the CALL. But for a frame that stopped execution due to
a fault, the continuation pc is the pc after the most recent CALL
to deferproc in that frame, or else 0. That is where execution
will continue, if anywhere.
The liveness information is only recorded for CALL instructions.
This change makes sure that we never look for liveness information
except for CALL instructions.
Using a valid PC fixes crashes when a garbage collection or
stack copying tries to process a stack frame that has faulted.
Record continuation pc in heapdump (format change).
Fixes#8048.
LGTM=iant, khr
R=khr, iant, dvyukov
CC=golang-codereviews, r
https://golang.org/cl/100870044
Iterate the right number of times in arrays and channels.
Handle channels with zero-sized objects in them.
Output longer type names if we have them.
Compute argument offset correctly.
LGTM=rsc
R=golang-codereviews, rsc
CC=golang-codereviews
https://golang.org/cl/82980043
Reduce footprint of liveness bitmaps by about 5x.
1. Mark all liveness bitmap symbols as 4-byte aligned
(they were aligned to a larger size by default).
2. The bitmap data is a bitmap count n followed by n bitmaps.
Each bitmap begins with its own count m giving the number
of bits. All the m's are the same for the n bitmaps.
Emit this bitmap length once instead of n times.
3. Many bitmaps within a function have the same bit values,
but each call site was given a distinct bitmap. Merge duplicate
bitmaps so that no bitmap is written more than once.
4. Many functions end up with the same aggregate bitmap data.
We used to name the bitmap data funcname.gcargs and funcname.gclocals.
Instead, name it gclocals.<md5 of data> and mark it dupok so
that the linker coalesces duplicate sets. This cut the bitmap
data remaining after step 3 by 40%; I was not expecting it to
be quite so dramatic.
Applied to "go build -ldflags -w code.google.com/p/go.tools/cmd/godoc":
bitmaps pclntab binary on disk
before this CL 1326600 1985854 12738268
4-byte align 1154288 (0.87x) 1985854 (1.00x) 12566236 (0.99x)
one bitmap len 782528 (0.54x) 1985854 (1.00x) 12193500 (0.96x)
dedup bitmap 414748 (0.31x) 1948478 (0.98x) 11787996 (0.93x)
dedup bitmap set 245580 (0.19x) 1948478 (0.98x) 11620060 (0.91x)
While here, remove various dead blocks of code from plive.c.
Fixes#6929.
Fixes#7568.
LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/83630044