Commit to stack copying for stack growth.
We're carrying around a surprising amount of cruft from older schemes.
I am confident that precise stack scans and stack copying are here to stay.
Delete fallback code for when precise stack info is disabled.
Delete fallback code for when copying stacks is disabled.
Delete fallback code for when StackCopyAlways is disabled.
Delete Stktop chain - there is only one stack segment now.
Delete M.moreargp, M.moreargsize, M.moreframesize, M.cret.
Delete G.writenbuf (unrelated, just dead).
Delete runtime.lessstack, runtime.oldstack.
Delete many amd64 morestack variants.
Delete initialization of morestack frame/arg sizes (shortens split prologue!).
Replace G's stackguard/stackbase/stack0/stacksize/
syscallstack/syscallguard/forkstackguard with simple stack
bounds (lo, hi).
Update liblink, runtime/cgo for adjustments to G.
LGTM=khr
R=khr, bradfitz
CC=golang-codereviews, iant, r
https://golang.org/cl/137410043
This CL contains compiler+runtime changes that detect C code
running on Go (not g0, not gsignal) stacks, and it contains
corrections for what it detected.
The detection works by changing the C prologue to use a different
stack guard word in the G than Go prologue does. On the g0 and
gsignal stacks, that stack guard word is set to the usual
stack guard value. But on ordinary Go stacks, that stack
guard word is set to ^0, which will make any stack split
check fail. The C prologue then calls morestackc instead
of morestack, and morestackc aborts the program with
a message about running C code on a Go stack.
This check catches all C code running on the Go stack
except NOSPLIT code. The NOSPLIT code is allowed,
so the check is complete. Since it is a dynamic check,
the code must execute to be caught. But unlike the static
checks we've been using in cmd/ld, the dynamic check
works with function pointers and other indirect calls.
For example it caught sigpanic being pushed onto Go
stacks in the signal handlers.
Fixes#8667.
LGTM=khr, iant
R=golang-codereviews, khr, iant
CC=golang-codereviews, r
https://golang.org/cl/133700043
This CL adjusts code referring to src/pkg to refer to src.
Immediately after submitting this CL, I will submit
a change doing 'hg mv src/pkg/* src'.
That change will be too large to review with Rietveld
but will contain only the 'hg mv'.
This CL will break the build.
The followup 'hg mv' will fix it.
For more about the move, see golang.org/s/go14nopkg.
LGTM=r
R=r
CC=golang-codereviews
https://golang.org/cl/134570043
This was supposed to be in CL 135490044
but got lost in a transfer from machine to machine.
TBR=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/135560043
The gp->panicwrap adjustment is just fatally flawed.
Now that there is a Panic.argp field, update that instead.
That can be done on entry only, so that unwinding doesn't
need to worry about undoing anything. The wrappers
emit a few more instructions in the prologue but everything
else in the system gets much simpler.
It also fixes (without trying) a broken test I never checked in.
Fixes#7491.
LGTM=khr
R=khr
CC=dvyukov, golang-codereviews, iant, r
https://golang.org/cl/135490044
The helps certain diagnostics and also removed duplicated enums as a side effect.
LGTM=dave, rsc
R=rsc, dave
CC=golang-codereviews
https://golang.org/cl/115060044
Encode MOV $0, %ax as XOR %eax, %eax instead of
XOR %rax, %rax. If an operand register does not
need REX.w bit (i.e. not one of R8-R15), it is
encoded in 2 bytes instead of 3 bytes.
LGTM=rsc
R=golang-codereviews, gobot, rsc
CC=golang-codereviews
https://golang.org/cl/115580044
While we're here, make it lookup the tlsfallback symbol only once.
LGTM=crawshaw
R=golang-codereviews, crawshaw, dave
CC=golang-codereviews
https://golang.org/cl/107430044
There are fields in the Addr that do not matter for the
purpose of deciding that the same word is already
in the current literal pool. Copy only the fields that
do matter.
This came up when comparing against the Go version
because the way it is invoked doesn't copy a few fields
(like node) that are never directly used by liblink itself.
Also remove a stray print that is not well-defined in
the new liblink. (Cannot use %D outside of %P, because
%D needs the outer Prog*.)
LGTM=minux
R=minux
CC=golang-codereviews
https://golang.org/cl/119000043
Rewrite gotos that violate Go's stricter rules.
Use uchar* instead of char* in a few places that aren't strings.
Remove dead opcross code from asm5.c.
Declare regstr (in both list6 and list8) static.
LGTM=minux, dave
R=minux, dave
CC=golang-codereviews
https://golang.org/cl/113230043
As written, the ! applies before the &1.
This would crash writing out missing pcdata tables
if we ever used non-contiguous IDs in a function.
We don't, but fix anyway.
LGTM=iant, minux
R=minux, iant
CC=golang-codereviews
https://golang.org/cl/117810047
warning: /usr/go/src/liblink/asm5.c:720 set and not used: m
warning: /usr/go/src/liblink/asm5.c:807 set and not used: c
LGTM=minux
R=minux
CC=golang-codereviews
https://golang.org/cl/108570043
The main changes fall into a few patterns:
1. Replace #define with enum.
2. Add /*c2go */ comment giving effect of #define.
This is necessary for function-like #defines and
non-enum-able #defined constants.
(Not all compilers handle negative or large enums.)
3. Add extra braces in struct initializer.
(c2go does not implement the full rules.)
This is enough to let c2go typecheck the source tree.
There may be more changes once it is doing
other semantic analyses.
LGTM=minux, iant
R=minux, dave, iant
CC=golang-codereviews
https://golang.org/cl/106860045
A TLS slot is reserved by _rt0_.*_plan9 as an automatic and
its address (which is static on Plan 9) is saved in the
global _privates symbol. The startup linkage now is exactly
like that from Plan 9 libc, and the way we access g is
exactly as if we'd have used privalloc(2).
Aside from making the code more standard, this change
drastically simplifies it, both for 386 and for amd64, and
makes the Plan 9 code in liblink common for both 386 and
amd64.
The amd64 runtime code was cleared of nxm assumptions, and
now runs on the standard Plan 9 kernel.
Note handling fixes will follow in a separate CL.
LGTM=rsc
R=golang-codereviews, rsc, bradfitz, dave
CC=0intro, ality, golang-codereviews, jas, minux.ma, mischief
https://golang.org/cl/101510049
The runtime has historically held two dedicated values g (current goroutine)
and m (current thread) in 'extern register' slots (TLS on x86, real registers
backed by TLS on ARM).
This CL removes the extern register m; code now uses g->m.
On ARM, this frees up the register that formerly held m (R9).
This is important for NaCl, because NaCl ARM code cannot use R9 at all.
The Go 1 macrobenchmarks (those with per-op times >= 10 µs) are unaffected:
BenchmarkBinaryTree17 5491374955 5471024381 -0.37%
BenchmarkFannkuch11 4357101311 4275174828 -1.88%
BenchmarkGobDecode 11029957 11364184 +3.03%
BenchmarkGobEncode 6852205 6784822 -0.98%
BenchmarkGzip 650795967 650152275 -0.10%
BenchmarkGunzip 140962363 141041670 +0.06%
BenchmarkHTTPClientServer 71581 73081 +2.10%
BenchmarkJSONEncode 31928079 31913356 -0.05%
BenchmarkJSONDecode 117470065 113689916 -3.22%
BenchmarkMandelbrot200 6008923 5998712 -0.17%
BenchmarkGoParse 6310917 6327487 +0.26%
BenchmarkRegexpMatchMedium_1K 114568 114763 +0.17%
BenchmarkRegexpMatchHard_1K 168977 169244 +0.16%
BenchmarkRevcomp 935294971 914060918 -2.27%
BenchmarkTemplate 145917123 148186096 +1.55%
Minux previous reported larger variations, but these were caused by
run-to-run noise, not repeatable slowdowns.
Actual code changes by Minux.
I only did the docs and the benchmarking.
LGTM=dvyukov, iant, minux
R=minux, josharian, iant, dave, bradfitz, dvyukov
CC=golang-codereviews
https://golang.org/cl/109050043
The USEFIELD instructions no longer make it to the linker,
so we have to do something else to pin the references
they were pinning. Emit a 0-length relocation of type R_USEFIELD.
Fixes#7486.
LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews, r
https://golang.org/cl/95530043
Before we used line 1 of the first source file.
This should be clearer.
Fixes#4388.
LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews
https://golang.org/cl/92250044
The new code is adapted from the Go 1.2 nosplit code,
but it does not have the bug reported in issue 7623:
g% go run nosplit.go
g% go1.2 run nosplit.go
BUG
rejected incorrectly:
main 0 call f; f 120
linker output:
# _/tmp/go-test-nosplit021064539
main.main: nosplit stack overflow
120 guaranteed after split check in main.main
112 on entry to main.f
-8 after main.f uses 120
g%
Fixes#6931.
Fixes#7623.
LGTM=iant
R=golang-codereviews, iant, ality
CC=golang-codereviews, r
https://golang.org/cl/88190043
The name linkwriteobj is misleading because it implies that
the function has something to do with the linker, which it
does not. The name is historical: the function performs an
operation that was previously performed by the linker, but no
longer is.
LGTM=rsc
R=rsc, minux.ma
CC=golang-codereviews
https://golang.org/cl/88210045
Without the leaf bit, the linker cannot record
the correct frame size in the symbol table, and
then stack traces get mangled. (Only for ARM.)
Fixes#7338.
Fixes#7347.
LGTM=iant
R=iant
CC=golang-codereviews
https://golang.org/cl/88550043
linklookup uses hash(name, v) as the hash table index but then
only compares name to find a symbol to return.
If hash(name, v1) == hash(name, v2) for v1 != v2, the lookup
for v2 will return the symbol with v1.
The input routines assume that each symbol is found only once,
and then each symbol is added to a linked list, with the list header
in the symbol. Adding a symbol to such a list multiple times
short-circuits the list the second time it is added, causing symbols
to be dropped.
The liblink rewrite introduced an elegant, if inefficient, handling
of duplicated symbols by creating a dummy symbol to read the
duplicate into. The dummy symbols are named .dup with
sequential version numbers. With many .dup symbols, eventually
there will be a conflict, causing a duplicate list add, causing elided
symbols, causing a crash when calling one of the elided symbols.
The bug is old (2011) but could not have manifested until the
liblink rewrite introduced this heavily duplicated symbol .dup.
(See History section below.)
1. Correct the lookup function.
2. Since we want all the .dup symbols to be different, there's no
point in inserting them into the table. Call linknewsym directly,
avoiding the lookup function entirely.
3. Since nothing can refer to the .dup symbols, do not bother
adding them to the list of functions (textp) at all.
4. In lieu of a unit test, introduce additional consistency checks to
detect adding a symbol to a list multiple times. This would have
caught the short-circuit more directly, and it will detect a variety
of double-use bugs, including the one arising from the bad lookup.
Fixes#7749.
History
On April 9, 2011, I submitted CL 4383047, making ld 25% faster.
Much of the focus was on the hash table lookup function, and
one of the changes was to remove the s->version == v comparison [1].
I don't know if this was a simple editing error or if I reasoned that
same name but different v would yield a different hash slot and
so the name test alone sufficed. It is tempting to claim the former,
but it was probably the latter.
Because the hash is an iterated multiply+add, the version ends up
adding v*3ⁿ to the hash, where n is the length of the name.
A collision would need x*3ⁿ ≡ y*3ⁿ (mod 2²⁴ mod 100003),
or equivalently x*3ⁿ ≡ x*3ⁿ + (y-x)*3ⁿ (mod 2²⁴ mod 100003),
so collisions will actually be periodic: versions x and y collide
when d = y-x satisfies d*3ⁿ ≡ 0 (mod 2²⁴ mod 100003).
Since we allocate version numbers sequentially, this is actually
about the best case one could imagine: the collision rate is
much lower than if the hash were more random.
http://play.golang.org/p/TScD41c_hA computes the collision
period for various name lengths.
The most common symbol in the new linker is .dup, and for n=4
the period is maximized: the 100004th symbol is the first collision.
Unfortunately, there are programs with more duplicated symbols
than that.
In Go 1.2 and before, duplicate symbols were handled without
creating a dummy symbol, so this particular case for generating
many duplicate symbols could not happen. Go does not use
versioned symbols. Only C does; each input file gives a different
version to its static declarations. There just aren't enough C files
for this to come up in that context.
So the bug is old but the realization of the bug is new.
[1] https://golang.org/cl/4383047/diff/5001/src/cmd/ld/lib.c
LGTM=minux.ma, iant, dave
R=golang-codereviews, minux.ma, bradfitz, iant, dave
CC=golang-codereviews, r
https://golang.org/cl/87910047
If we compile a generated file stored in a temporary
directory - let's say /tmp/12345/work/x.c - then by default
6c stores the full path and then the pcln table in the
final binary includes the full path. This makes repeated builds
(using different temporary directories) produce different
binaries, even if the inputs are the same.
In the old 'go tool pack', the P flag specified a prefix to remove
from all stored paths (if present), and cmd/go invoked
'go tool pack grcP $WORK' to remove references to the
temporary work directory.
We've changed the build to avoid pack as much as possible,
under the theory that instead of making pack convert from
.6 to .a, the tools should just write the .a directly and save a
round of I/O.
Instead of going back to invoking pack always, define a common
flag -trimpath in the assemblers, C compilers, and Go compilers,
implemented in liblink, and arrange for cmd/go to use the flag.
Then the object files being written out have the shortened paths
from the start.
While we are here, reimplement pcln support for GOROOT_FINAL.
A build in /tmp/go uses GOROOT=/tmp/go, but if GOROOT_FINAL=/usr/local/go
is set, then a source file named /tmp/go/x.go is recorded instead as
/usr/local/go/x.go. We use this so that we can prepare distributions
to be installed in /usr/local/go without actually working in that
directory. The conversion to liblink deleted all the old file name
handling code, including the GOROOT_FINAL translation.
Bring the GOROOT_FINAL translation back.
Before this CL, using GOROOT_FINAL=/goroot make.bash:
g% strings $(which go) | grep -c $TMPDIR
6
g% strings $(which go) | grep -c $GOROOT
793
g%
After this CL:
g% strings $(which go) | grep -c $TMPDIR
0
g% strings $(which go) | grep -c $GOROOT
0
g%
(The references to $TMPDIR tend to be cgo-generated source files.)
Adding the -trimpath flag to the assemblers required converting
them to the new Go-semantics flag parser. The text in go1.3.html
is copied and adjusted from go1.1.html, which is when we applied
that conversion to the compilers and linkers.
Fixes#6989.
LGTM=iant
R=r, iant
CC=golang-codereviews
https://golang.org/cl/88300045
When I did the original 386 ports on Linux and OS X, I chose to
define GS-relative expressions like 4(GS) as relative to the actual
thread-local storage base, which was usually GS but might not be
(it might be FS, or it might be a different constant offset from GS or FS).
The original scope was limited but since then the rewrites have
gotten out of control. Sometimes GS is rewritten, sometimes FS.
Some ports do other rewrites to enable shared libraries and
other linking. At no point in the code is it clear whether you are
looking at the real GS/FS or some synthesized thing that will be
rewritten. The code manipulating all these is duplicated in many
places.
The first step to fixing issue 7719 is to make the code intelligible
again.
This CL adds an explicit TLS pseudo-register to the 386 and amd64.
As a register, TLS refers to the thread-local storage base, and it
can only be loaded into another register:
MOVQ TLS, AX
An offset from the thread-local storage base is written off(reg)(TLS*1).
Semantically it is off(reg), but the (TLS*1) annotation marks this as
indexing from the loaded TLS base. This emits a relocation so that
if the linker needs to adjust the offset, it can. For example:
MOVQ TLS, AX
MOVQ 8(AX)(TLS*1), CX // load m into CX
On systems that support direct access to the TLS memory, this
pair of instructions can be reduced to a direct TLS memory reference:
MOVQ 8(TLS), CX // load m into CX
The 2-instruction and 1-instruction forms correspond roughly to
ELF TLS initial exec mode and ELF TLS local exec mode, respectively.
Liblink applies this rewrite on systems that support the 1-instruction form.
The decision is made using only the operating system (and probably
the -shared flag, eventually), not the link mode. If some link modes
on a particular operating system require the 2-instruction form,
then all builds for that operating system will use the 2-instruction
form, so that the link mode decision can be delayed to link time.
Obviously it is late to be making changes like this, but I despair
of correcting issue 7719 and issue 7164 without it. To make sure
I am not changing existing behavior, I built a "hello world" program
for every GOOS/GOARCH combination we have and then worked
to make sure that the rewrite generates exactly the same binaries,
byte for byte. There are a handful of TODOs in the code marking
kludges to get the byte-for-byte property, but at least now I can
explain exactly how each binary is handled.
The targets I tested this way are:
darwin-386
darwin-amd64
dragonfly-386
dragonfly-amd64
freebsd-386
freebsd-amd64
freebsd-arm
linux-386
linux-amd64
linux-arm
nacl-386
nacl-amd64p32
netbsd-386
netbsd-amd64
openbsd-386
openbsd-amd64
plan9-386
plan9-amd64
solaris-amd64
windows-386
windows-amd64
There were four exceptions to the byte-for-byte goal:
windows-386 and windows-amd64 have a time stamp
at bytes 137 and 138 of the header.
darwin-386 and plan9-386 have five or six modified
bytes in the middle of the Go symbol table, caused by
editing comments in runtime/sys_{darwin,plan9}_386.s.
Fixes#7164.
LGTM=iant
R=iant, aram, minux.ma, dave
CC=golang-codereviews
https://golang.org/cl/87920043
The relocation and automatic variable types were using
arch-specific numbers. Introduce portable enumerations
instead.
To the best of my knowledge, these are the only arch-specific
bits left in the new object file format.
Remove now, before Go 1.3, because file formats are forever.
LGTM=iant
R=iant
CC=golang-codereviews
https://golang.org/cl/87670044
There are changes we know we want to make, but not before Go 1.3
Add a version number so that we can make them more easily later.
LGTM=iant
R=iant
CC=golang-codereviews
https://golang.org/cl/87670043
This code tests linkmode == LinkExternal but is only invoked
by the compiler/assembler, not the linker.
Update #7164
LGTM=rsc
R=rsc, dave
CC=golang-codereviews
https://golang.org/cl/85080043
Reduce footprint of liveness bitmaps by about 5x.
1. Mark all liveness bitmap symbols as 4-byte aligned
(they were aligned to a larger size by default).
2. The bitmap data is a bitmap count n followed by n bitmaps.
Each bitmap begins with its own count m giving the number
of bits. All the m's are the same for the n bitmaps.
Emit this bitmap length once instead of n times.
3. Many bitmaps within a function have the same bit values,
but each call site was given a distinct bitmap. Merge duplicate
bitmaps so that no bitmap is written more than once.
4. Many functions end up with the same aggregate bitmap data.
We used to name the bitmap data funcname.gcargs and funcname.gclocals.
Instead, name it gclocals.<md5 of data> and mark it dupok so
that the linker coalesces duplicate sets. This cut the bitmap
data remaining after step 3 by 40%; I was not expecting it to
be quite so dramatic.
Applied to "go build -ldflags -w code.google.com/p/go.tools/cmd/godoc":
bitmaps pclntab binary on disk
before this CL 1326600 1985854 12738268
4-byte align 1154288 (0.87x) 1985854 (1.00x) 12566236 (0.99x)
one bitmap len 782528 (0.54x) 1985854 (1.00x) 12193500 (0.96x)
dedup bitmap 414748 (0.31x) 1948478 (0.98x) 11787996 (0.93x)
dedup bitmap set 245580 (0.19x) 1948478 (0.98x) 11620060 (0.91x)
While here, remove various dead blocks of code from plive.c.
Fixes#6929.
Fixes#7568.
LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/83630044
REP MOVSQ and REP STOSQ have a really high startup overhead.
Use a Duff's device to do the repetition instead.
benchmark old ns/op new ns/op delta
BenchmarkClearFat32 7.20 1.60 -77.78%
BenchmarkCopyFat32 6.88 2.38 -65.41%
BenchmarkClearFat64 7.15 3.20 -55.24%
BenchmarkCopyFat64 6.88 3.44 -50.00%
BenchmarkClearFat128 9.53 5.34 -43.97%
BenchmarkCopyFat128 9.27 5.56 -40.02%
BenchmarkClearFat256 13.8 9.53 -30.94%
BenchmarkCopyFat256 13.5 10.3 -23.70%
BenchmarkClearFat512 22.3 18.0 -19.28%
BenchmarkCopyFat512 22.0 19.7 -10.45%
BenchmarkCopyFat1024 36.5 38.4 +5.21%
BenchmarkClearFat1024 35.1 35.0 -0.28%
TODO: use for stack frame zeroing
TODO: REP prefixes are still used for "reverse" copying when src/dst
regions overlap. Might be worth fixing.
LGTM=rsc
R=golang-codereviews, rsc
CC=golang-codereviews, r
https://golang.org/cl/81370046
They were rejected by NaCl due to AES instructions and
accesses to %gs:0x8, caused by wrong tlsoffset value.
LGTM=iant
R=rsc, dave, iant
CC=golang-codereviews
https://golang.org/cl/76050044
The byte that r is or'd into is already 0x7, so the failure to zero r only
impacts the generated machine code if the register is > 7.
Fixes#7044.
LGTM=dave, minux.ma, rsc
R=dave, minux.ma, bradfitz, rsc
CC=golang-codereviews
https://golang.org/cl/73730043
It was using MOVL to pass a 64-bit argument
(concatenated framesize and argsize) to morestack11.
LGTM=dave, rsc
R=dave, rsc, iant
CC=golang-codereviews
https://golang.org/cl/72360044
The arm puts the text flags in a different place
than the other architectures. This needs to be
cleaned up.
TBR=minux
CC=golang-codereviews
https://golang.org/cl/71260043
For non-closure functions, the context register is uninitialized
on entry and will not be used, but morestack saves it and then the
garbage collector treats it as live. This can be a source of memory
leaks if the context register points at otherwise dead memory.
Avoid this by introducing a parallel set of morestack functions
that clear the context register, and use those for the non-closure functions.
I hope this will help with some of the finalizer flakiness, but it probably won't.
Fixes#7244.
LGTM=dvyukov
R=khr, dvyukov
CC=golang-codereviews
https://golang.org/cl/71030044
This CL replays the following one CL from the rsc-go13nacl repo.
This is the last replay CL: after this CL the main repo will have
everything the rsc-go13nacl repo did. Changes made to the main
repo after the rsc-go13nacl repo branched off probably mean that
NaCl doesn't actually work after this CL, but all the code is now moved
over and just needs to be redebugged.
---
cmd/6l, cmd/8l, cmd/ld: support for Native Client
See golang.org/s/go13nacl for design overview.
This CL is publicly visible but not CC'ed to golang-dev,
to avoid distracting from the preparation of the Go 1.2
release.
This CL and the others will be checked into my rsc-go13nacl
clone repo for now, and I will send CLs against the main
repo early in the Go 1.3 development.
R≡khr
https://golang.org/cl/15750044
---
LGTM=bradfitz, dave, iant
R=dave, bradfitz, iant
CC=golang-codereviews
https://golang.org/cl/69040044
The "fat" referred to being used for multiword values only.
We're going to use it for non-fat values sometimes too.
No change other than the renaming.
TBR=iant
CC=golang-codereviews
https://golang.org/cl/63650043
We now use the %A, %D, %P, and %R routines from liblink
across the board.
Fixes#7178.
Fixes#7055.
LGTM=iant
R=golang-codereviews, gobot, rsc, dave, iant, remyoudompheng
CC=golang-codereviews
https://golang.org/cl/49170043
rsc suggested that we split the whole linker changes into three parts.
This is the first one, mostly dealing with adding Hsolaris.
LGTM=iant
R=golang-codereviews, iant, dave
CC=golang-codereviews
https://golang.org/cl/54210050