When I did the original 386 ports on Linux and OS X, I chose to
define GS-relative expressions like 4(GS) as relative to the actual
thread-local storage base, which was usually GS but might not be
(it might be FS, or it might be a different constant offset from GS or FS).
The original scope was limited but since then the rewrites have
gotten out of control. Sometimes GS is rewritten, sometimes FS.
Some ports do other rewrites to enable shared libraries and
other linking. At no point in the code is it clear whether you are
looking at the real GS/FS or some synthesized thing that will be
rewritten. The code manipulating all these is duplicated in many
places.
The first step to fixing issue 7719 is to make the code intelligible
again.
This CL adds an explicit TLS pseudo-register to the 386 and amd64.
As a register, TLS refers to the thread-local storage base, and it
can only be loaded into another register:
MOVQ TLS, AX
An offset from the thread-local storage base is written off(reg)(TLS*1).
Semantically it is off(reg), but the (TLS*1) annotation marks this as
indexing from the loaded TLS base. This emits a relocation so that
if the linker needs to adjust the offset, it can. For example:
MOVQ TLS, AX
MOVQ 8(AX)(TLS*1), CX // load m into CX
On systems that support direct access to the TLS memory, this
pair of instructions can be reduced to a direct TLS memory reference:
MOVQ 8(TLS), CX // load m into CX
The 2-instruction and 1-instruction forms correspond roughly to
ELF TLS initial exec mode and ELF TLS local exec mode, respectively.
Liblink applies this rewrite on systems that support the 1-instruction form.
The decision is made using only the operating system (and probably
the -shared flag, eventually), not the link mode. If some link modes
on a particular operating system require the 2-instruction form,
then all builds for that operating system will use the 2-instruction
form, so that the link mode decision can be delayed to link time.
Obviously it is late to be making changes like this, but I despair
of correcting issue 7719 and issue 7164 without it. To make sure
I am not changing existing behavior, I built a "hello world" program
for every GOOS/GOARCH combination we have and then worked
to make sure that the rewrite generates exactly the same binaries,
byte for byte. There are a handful of TODOs in the code marking
kludges to get the byte-for-byte property, but at least now I can
explain exactly how each binary is handled.
The targets I tested this way are:
darwin-386
darwin-amd64
dragonfly-386
dragonfly-amd64
freebsd-386
freebsd-amd64
freebsd-arm
linux-386
linux-amd64
linux-arm
nacl-386
nacl-amd64p32
netbsd-386
netbsd-amd64
openbsd-386
openbsd-amd64
plan9-386
plan9-amd64
solaris-amd64
windows-386
windows-amd64
There were four exceptions to the byte-for-byte goal:
windows-386 and windows-amd64 have a time stamp
at bytes 137 and 138 of the header.
darwin-386 and plan9-386 have five or six modified
bytes in the middle of the Go symbol table, caused by
editing comments in runtime/sys_{darwin,plan9}_386.s.
Fixes#7164.
LGTM=iant
R=iant, aram, minux.ma, dave
CC=golang-codereviews
https://golang.org/cl/87920043
This CL replays the following one CL from the rsc-go13nacl repo.
This is the last replay CL: after this CL the main repo will have
everything the rsc-go13nacl repo did. Changes made to the main
repo after the rsc-go13nacl repo branched off probably mean that
NaCl doesn't actually work after this CL, but all the code is now moved
over and just needs to be redebugged.
---
cmd/6l, cmd/8l, cmd/ld: support for Native Client
See golang.org/s/go13nacl for design overview.
This CL is publicly visible but not CC'ed to golang-dev,
to avoid distracting from the preparation of the Go 1.2
release.
This CL and the others will be checked into my rsc-go13nacl
clone repo for now, and I will send CLs against the main
repo early in the Go 1.3 development.
R≡khr
https://golang.org/cl/15750044
---
LGTM=bradfitz, dave, iant
R=dave, bradfitz, iant
CC=golang-codereviews
https://golang.org/cl/69040044
rsc suggested that we split the whole linker changes into three parts.
This is the first one, mostly dealing with adding Hsolaris.
LGTM=iant
R=golang-codereviews, iant, dave
CC=golang-codereviews
https://golang.org/cl/54210050
- new object file reader/writer (liblink/objfile.c)
- remove old object file writing routines
- add pcdata iterator
- remove all trace of "line number stack" and "path fragments" from
object files, linker (!!!)
- dwarf now writes a single "compilation unit" instead of one per package
This CL disables the check for chains of no-split functions that
could overflow the stack red zone. A future CL will attack the problem
of reenabling that check (issue 6931).
This CL is just the liblink and cmd/ld changes.
There are minor associated adjustments in CL 37030045.
Each depends on the other.
R=golang-dev, dave, iant
CC=golang-dev
https://golang.org/cl/39680043
There is an enormous amount of code moving around in this CL,
but the code is the same, and it is invoked in the same ways.
This CL is preparation for the new linker structure, not the new
structure itself.
The new library's definition is in include/link.h.
The main change is the use of a Link structure to hold all the
linker-relevant state, replacing the smattering of global variables.
The Link structure should both make it clearer which state must
be carried around and make it possible to parallelize more easily
later.
The main body of the linker has moved into the architecture-independent
cmd/ld directory. That includes the list of known header types, so the
distinction between Hplan9x32 and Hplan9x64 is removed (no other
header type distinguished 32- and 64-bit formats), and code for unused
formats such as ipaq kernels has been deleted.
The code being deleted from 5l, 6l, and 8l reappears in liblink or in ld.
Because multiple files are being merged in the liblink directory,
it is not possible to show the diffs nicely in hg.
The Prog and Addr structures have been unified into an
architecture-independent form and moved to link.h, where they will
be shared by all tools: the assemblers, the compilers, and the linkers.
The unification makes it possible to write architecture-independent
traversal of Prog lists, among other benefits.
The Sym structures cannot be unified: they are too fundamentally
different between the linker and the compilers. Instead, liblink defines
an LSym - a linker Sym - to be used in the Prog and Addr structures,
and the linker now refers exclusively to LSyms. The compilers will
keep using their own syms but will fill out the corresponding LSyms in
the Prog and Addr structures.
Although code from 5l, 6l, and 8l is now in a single library, the
code has been arranged so that only one architecture needs to
be linked into a particular program: 5l will not contain the code
needed for x86 instruction layout, for example.
The object file writing code in liblink/obj.c is from cmd/gc/obj.c.
Preparation for golang.org/s/go13linker work.
This CL does not build by itself. It depends on 35740044
and will be submitted at the same time.
R=iant
CC=golang-dev
https://golang.org/cl/35790044
Add the -installsuffix flag to gc and {5,6,8}l, which overrides -race
for the suffix if both are supplied.
Pass this flag from the go tool for build and install.
R=rsc
CC=golang-dev
https://golang.org/cl/14246044
Also introduce BGET2/4, BPUT2/4 as they are widely used.
Slightly improve BGETC/BPUTC implementation.
This gives ~5% CPU time improvement on go install -a -p1 std.
Before:
real user sys
0m23.561s 0m16.625s 0m5.848s
0m23.766s 0m16.624s 0m5.846s
0m23.742s 0m16.621s 0m5.868s
after:
0m22.999s 0m15.841s 0m5.889s
0m22.845s 0m15.808s 0m5.850s
0m22.889s 0m15.832s 0m5.848s
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/12745047
This CL is an aggregate of 10271047, 10499043, 9733044. Descriptions of each follow:
10499043
runtime,cmd/ld: Merge TLS symbols and teach 5l about ARM TLS
This CL prepares for external linking support to ARM.
The pseudo-symbols runtime.g and runtime.m are merged into a single
runtime.tlsgm symbol. When external linking, the offset of a thread local
variable is stored at a memory location instead of being embedded into a offset
of a ldr instruction. With a single runtime.tlsgm symbol for both g and m, only
one such offset is needed.
The larger part of this CL moves TLS code from gcc compiled to internally
compiled. The TLS code now uses the modern MRC instruction, and 5l is taught
about TLS fallbacks in case the instruction is not available or appropriate.
10271047
This CL adds support for -linkmode external to 5l.
For 5l itself, use addrel to allow for D_CALL relocations to be handled by the
host linker. Of the cases listed in rsc's comment in issue 4069, only case 5 and
63 needed an update. One of the TODO: addrel cases was since replaced, and the
rest of the cases are either covered by indirection through addpool (cases with
LTO or LFROM flags) or stubs (case 74). The addpool cases are covered because
addpool emits AWORD instructions, which in turn are handled by case 11.
In the runtime, change the argv argument in the rt0* functions slightly to be a
pointer to the argv list, instead of relying on a particular location of argv.
9733044
The -shared flag to 6l outputs a shared library, implemented in Go
and callable from non-Go programs such as C.
The main part of this CL change the thread local storage model.
Go uses the fastest and least general mode, local exec. TLS data in shared
libraries normally requires at least the local dynamic mode, however, this CL
instead opts for using the initial exec mode. Initial exec mode is faster than
local dynamic mode and can be used in linux since the linker has reserved a
limited amount of TLS space for performance sensitive TLS code.
Initial exec mode requires an extra load from the GOT table to determine the
TLS offset. This penalty will not be paid if ld is not in -shared mode, since
TLS accesses will be reduced to local exec.
The elf sections .init_array and .rela.init_array are added to register the Go
runtime entry with cgo at library load time.
The "hidden" attribute is added to Cgo functions called from Go, since Go
does not generate call through the GOT table, and adding non-GOT relocations for
a global function is not supported by gcc. Cgo symbols don't need to be global
and avoiding the GOT table is also faster.
The changes to 8l are only removes code relevant to the old -shared mode where
internal linking was used.
This CL only address the low level linker work. It can be submitted by itself,
but to be useful, the runtime changes in CL 9738047 is also needed.
Design discussion at
https://groups.google.com/forum/?fromgroups#!topic/golang-nuts/zmjXkGrEx6QFixes#5590.
R=rsc
CC=golang-dev
https://golang.org/cl/12871044
This CL introduces a FUNCDATA number for runtime-specific
garbage collection metadata, changes the C and Go compilers
to emit that metadata, and changes the runtime to expect it.
The old pseudo-instructions that carried this information
are gone, as is the linker code to process them.
R=golang-dev, dvyukov, cshapiro
CC=golang-dev
https://golang.org/cl/11406044
Design at http://golang.org/s/go12symtab.
This enables some cleanup of the garbage collector metadata
that will be done in future CLs.
This CL does not move the old symtab and pclntab back into
an unmapped section of the file. That's a bit tricky and will be
done separately.
Fixes#4020.
R=golang-dev, dave, cshapiro, iant, r
CC=golang-dev, nigeltao
https://golang.org/cl/11085043
With this change the compiler emits a bitmap for each function
covering its stack frame arguments area. If an argument word
is known to contain a pointer, a bit is set. The garbage
collector reads this information when scanning the stack by
frames and uses it to ignores locations known to not contain a
pointer.
R=golang-dev, bradfitz, daniel.morsing, dvyukov, khr, khr, iant, cshapiro
CC=golang-dev
https://golang.org/cl/9223046
Permits specifying the linker to use, and trailing flags to
pass to that linker, when linking in external mode. External
mode linking is used when building a package that uses cgo, as
described in the cgo docs.
Also document -linkmode and -tmpdir.
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/8225043
Change build system to set GO_EXTLINK_ENABLED=0 by default for
OS X 10.6, since the system linker has a bug and can not
handle the object files generated by 6l.
Fixes#5130.
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/8183043
This CL was written by rsc. I just tweaked 8l.
This CL adds TLS relocation to the ELF .o file we write during external linking,
so that the host linker (gcc) can decide the final location of m and g.
Similar relocations are not necessary on OS X because we use an alternate
program start-time mechanism to acquire thread-local storage.
Similar relocations are not necessary on ARM or Plan 9 or Windows
because external linking mode is not yet supported on those systems.
On almost all ELF systems, the references we use are like %fs:-0x4 or %gs:-0x4,
which we write in 6a/8a as -0x4(FS) or -0x4(GS). On Linux/ELF, however,
Xen's lack of support for this mode forced us long ago to use a two-instruction
sequence: first we load %gs:0x0 into a register r, and then we use -0x4(r).
(The ELF program loader arranges that %gs:0x0 contains a regular pointer to
that same memory location.) In order to relocate those -0x4(r) references,
the linker must know where they are. This CL adds the equivalent notation
-0x4(r)(GS*1) for this purpose: it assembles to the same encoding as -0x4(r)
but the (GS*1) indicates to the linker that this is one of those thread-local
references that needs relocation.
Thanks to Elias Naur for reminding me about this missing piece and
also for writing the test.
R=r
CC=golang-dev
https://golang.org/cl/7891047
Still to do: non-linux and non-amd64.
It may work on other ELF-based amd64 systems too, but untested.
"go test -ldflags -hostobj $GOROOT/misc/cgo/test" passes.
Much may yet change, but this seems a reasonable checkpoint.
R=iant
CC=golang-dev
https://golang.org/cl/7369057
The type information is (and for years has been) included
as an extra field in the address chunk of an instruction.
Unfortunately, suppose there is a string at a+24(FP) and
we have an instruction reading its length. It will say:
MOVQ x+32(FP), AX
and the type of *that* argument is int (not slice), because
it is the length being read. This confuses the picture seen
by debuggers and now, worse, by the garbage collector.
Instead of attaching the type information to all uses,
emit an explicit list of TYPE instructions with the information.
The TYPE instructions are no-ops whose only role is to
provide an address to attach type information to.
For example, this function:
func f(x, y, z int) (a, b string) {
return
}
now compiles into:
--- prog list "f" ---
0000 (/Users/rsc/x.go:3) TEXT f+0(SB),$0-56
0001 (/Users/rsc/x.go:3) LOCALS ,
0002 (/Users/rsc/x.go:3) TYPE x+0(FP){int},$8
0003 (/Users/rsc/x.go:3) TYPE y+8(FP){int},$8
0004 (/Users/rsc/x.go:3) TYPE z+16(FP){int},$8
0005 (/Users/rsc/x.go:3) TYPE a+24(FP){string},$16
0006 (/Users/rsc/x.go:3) TYPE b+40(FP){string},$16
0007 (/Users/rsc/x.go:3) MOVQ $0,b+40(FP)
0008 (/Users/rsc/x.go:3) MOVQ $0,b+48(FP)
0009 (/Users/rsc/x.go:3) MOVQ $0,a+24(FP)
0010 (/Users/rsc/x.go:3) MOVQ $0,a+32(FP)
0011 (/Users/rsc/x.go:4) RET ,
The { } show the formerly hidden type information.
The { } syntax is used when printing from within the gc compiler.
It is not accepted by the assemblers.
The same type information is now included on global variables:
0055 (/Users/rsc/x.go:15) GLOBL slice+0(SB){[]string},$24(AL*0)
This more accurate type information fixes a bug in the
garbage collector's precise heap collection.
The linker only cares about globals right now, but having the
local information should make things a little nicer for Carl
in the future.
Fixes#4907.
R=ken2
CC=golang-dev
https://golang.org/cl/7395056
Previously, the func structure contained an inaccurate value for
the args member and a 0 value for the locals member.
This change populates the func structure with args and locals
values computed by the compiler. The number of args was
already available in the ATEXT instruction. The number of
locals is now passed through in the new ALOCALS instruction.
This change also switches the unit of args and locals to be
bytes, just like the frame member, instead of 32-bit words.
R=golang-dev, bradfitz, cshapiro, dave, rsc
CC=golang-dev
https://golang.org/cl/7399045
A step toward a fix for issue 4069.
To allow linking with arbitrary host object files, add a linker mode
that can generate a host object file instead of an executable.
Then the host linker can be invoked to generate the final executable.
This CL adds a new -hostobj flag that instructs the linker to write
a host object file instead of an executable.
That is, this works:
go tool 6g x.go
go tool 6l -hostobj -o x.o x.6
ld -e _rt0_amd64_linux x.o
./a.out
as does:
go tool 8g x.go
go tool 8l -hostld ignored -o x.o x.8
ld -m elf_i386 -e _rt0_386_linux x.o
./a.out
Because 5l was never updated to use the standard relocation scheme,
it will take more work to get this working on ARM.
This is a checkpoint of the basic functionality. It does not work
with cgo yet, and cgo is the main reason for the change.
The command-line interface will likely change too.
The gc linker has other information that needs to be returned to
the caller for use when invoking the host linker besides the single
object file.
R=iant, iant
CC=golang-dev
https://golang.org/cl/7060044
There's no b in race detector.
The new flag matches the one in the go command
(go test -race math).
R=golang-dev, dsymonds
CC=golang-dev
https://golang.org/cl/7072043
This CL adds a flag parser that matches the semantics of Go's
package flag. It also changes the linkers and compilers to use
the new flag parser.
Command lines that used to work, like
8c -FVw
6c -Dfoo
5g -I/foo/bar
now need to be split into separate arguments:
8c -F -V -w
6c -D foo
5g -I /foo/bar
The new spacing will work with both old and new tools.
The new parser also allows = for arguments, as in
6c -D=foo
5g -I=/foo/bar
but that syntax will not work with the old tools.
In addition to matching standard Go binary flag parsing,
the new flag parser generates more detailed usage messages
and opens the door to long flag names.
The recently added gc flag -= has been renamed -complete.
R=remyoudompheng, daniel.morsing, minux.ma, iant
CC=golang-dev
https://golang.org/cl/7035043
This is an experiment in static analysis of Go programs
to understand which struct fields a program might use.
It is not part of the Go language specification, it must
be enabled explicitly when building the toolchain,
and it may be removed at any time.
After building the toolchain with GOEXPERIMENT=fieldtrack,
a specific field can be marked for tracking by including
`go:"track"` in the field tag:
package pkg
type T struct {
F int `go:"track"`
G int // untracked
}
To simplify usage, only named struct types can have
tracked fields, and only exported fields can be tracked.
The implementation works by making each function begin
with a sequence of no-op USEFIELD instructions declaring
which tracked fields are accessed by a specific function.
After the linker's dead code elimination removes unused
functions, the fields referred to by the remaining
USEFIELD instructions are the ones reported as used by
the binary.
The -k option to the linker specifies the fully qualified
symbol name (such as my/pkg.list) of a string variable that
should be initialized with the field tracking information
for the program. The field tracking string is a sequence
of lines, each terminated by a \n and describing a single
tracked field referred to by the program. Each line is made
up of one or more tab-separated fields. The first field is
the name of the tracked field, fully qualified, as in
"my/pkg.T.F". Subsequent fields give a shortest path of
reverse references from that field to a global variable or
function, corresponding to one way in which the program
might reach that field.
A common source of false positives in field tracking is
types with large method sets, because a reference to the
type descriptor carries with it references to all methods.
To address this problem, the CL also introduces a comment
annotation
//go:nointerface
that marks an upcoming method declaration as unavailable
for use in satisfying interfaces, both statically and
dynamically. Such a method is also invisible to package
reflect.
Again, all of this is disabled by default. It only turns on
if you have GOEXPERIMENT=fieldtrack set during make.bash.
R=iant, ken
CC=golang-dev
https://golang.org/cl/6749064
Plan 9 versions for amd64 have 2 megabyte pages.
This also fixes the logic for 32-bit vs 64-bit Plan 9,
making 64-bit the default, and adds logic to generate
a symbols table.
R=golang-dev, rsc, rminnich, ality, 0intro
CC=golang-dev, john
https://golang.org/cl/6218046
Some newer Linux distributions (Ubuntu ARM at least) use a new multiarch
directory organization, where dynamic linker is no longer in the hardcoded
path in our linker.
For example, Ubuntu 12.04 ARM hardfloat places its dynamic linker at
/lib/arm-linux-gnueabihf/ld-linux.so.3
Ref: http://lackof.org/taggart/hacking/multiarch/
Also, to support Debian GNU/kFreeBSD as a FreeBSD variant, we need this capability, so it's part of issue 3533.
This CL add a new pragma (#pragma dynlinker "path") to cc.
R=iant, rsc
CC=golang-dev
https://golang.org/cl/6086043
dodata will convert to SNOPTRDATA if appropriate.
Should fix arm build (hope springs eternal).
TBR=golang-dev
CC=golang-dev
https://golang.org/cl/5687074
cc: add #pragma textflag to set it
runtime: mark mheap to go into noptr-bss.
remove special case in garbage collector
Remove the ARM from.flag field created by CL 5687044.
The DUPOK flag was already in p->reg, so keep using that.
Otherwise test/nilptr.go creates a very large binary.
Should fix the arm build.
Diagnosed by minux.ma; replacement for CL 5690044.
R=golang-dev, minux.ma, r
CC=golang-dev
https://golang.org/cl/5686060
The garbage collector can avoid scanning this section, with
reduces collection time as well as the number of false positives.
Helps a little bit with issue 909, but certainly does not solve it.
R=ken2
CC=golang-dev
https://golang.org/cl/5671099
To allow these types as map keys, we must fill in
equal and hash functions in their algorithm tables.
Structs or arrays that are "just memory", like [2]int,
can and do continue to use the AMEM algorithm.
Structs or arrays that contain special values like
strings or interface values use generated functions
for both equal and hash.
The runtime helper func runtime.equal(t, x, y) bool handles
the general equality case for x == y and calls out to
the equal implementation in the algorithm table.
For short values (<= 4 struct fields or array elements),
the sequence of elementwise comparisons is inlined
instead of calling runtime.equal.
R=ken, mpimenov
CC=golang-dev
https://golang.org/cl/5451105
Reduces number of write+seek's from 88516 to 2080
when linking godoc with 6l.
Thanks to Alex Brainman for pointing out the
many small writes.
R=golang-dev, r, alex.brainman, robert.hencke
CC=golang-dev
https://golang.org/cl/4743043