It's next to useless and confusing as well. Let's make godoc better instead.
Fixes #4849.
R=golang-dev, dsymonds, adg, rogpeppe, rsc
CC=golang-dev
https://golang.org/cl/12974043
See golang.org/s/go12nil.
This CL is about getting all the right checks inserted.
A followup CL will add an optimization pass to
remove redundant checks.
R=ken2
CC=golang-dev
https://golang.org/cl/12970043
Was checking for nil map; must check for empty map instead.
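The distinction can be shown with a minimal sketch. The helper name hasEntries is hypothetical and is not taken from cmd/go; the point is only that a map can be non-nil and still empty, so a nil test alone misses that case, while len covers both.

```go
package main

import "fmt"

// hasEntries is a hypothetical helper: it reports whether m holds
// at least one key. len is defined for nil maps too, so this one
// test covers both the nil and the empty case.
func hasEntries(m map[string]bool) bool {
	return len(m) > 0
}

func main() {
	var nilMap map[string]bool    // nil: never allocated
	emptyMap := map[string]bool{} // non-nil, but holds nothing

	fmt.Println(nilMap == nil, hasEntries(nilMap))     // true false
	fmt.Println(emptyMap == nil, hasEntries(emptyMap)) // false false
}
```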
Fixes #6065
Before:
go test -cover
# testmain
/var/folders/00/013l0000h01000cxqpysvccm0004fc/T/go-build233480051/_/Users/r/issue/_test/_testmain.go:11: imported and not used: "_/Users/r/issue"
FAIL _/Users/r/issue [build failed]
Now:
go test -cover
testing: warning: no tests to run
PASS
coverage: 0.0% of statements
ok _/Users/r/issue 0.021s
R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/12916043
The baseline architecture had been left to the GCC configured
default which can be more accommodating than the rest of the Go
toolchain. This prevented instructions used by the 5g compiler,
like BLX, from being used in GCC compiled assembler code.
R=golang-dev, dave, rsc, elias.naur, cshapiro
CC=golang-dev
https://golang.org/cl/12954043
The shared library changes broke the windows build because __attribute__ ((visibility ("hidden"))) is not supported in windows gcc. This change removes the attribute, as it is only needed when building shared libraries.
R=rsc
CC=golang-dev
https://golang.org/cl/12829044
This CL is an aggregate of 10271047, 10499043, 9733044. Descriptions of each follow:
10499043
runtime,cmd/ld: Merge TLS symbols and teach 5l about ARM TLS
This CL prepares for external linking support to ARM.
The pseudo-symbols runtime.g and runtime.m are merged into a single
runtime.tlsgm symbol. When external linking, the offset of a thread local
variable is stored at a memory location instead of being embedded into an offset
of a ldr instruction. With a single runtime.tlsgm symbol for both g and m, only
one such offset is needed.
The larger part of this CL moves TLS code from gcc compiled to internally
compiled. The TLS code now uses the modern MRC instruction, and 5l is taught
about TLS fallbacks in case the instruction is not available or appropriate.
10271047
This CL adds support for -linkmode external to 5l.
For 5l itself, use addrel to allow for D_CALL relocations to be handled by the
host linker. Of the cases listed in rsc's comment in issue 4069, only cases 5 and
63 needed an update. One of the TODO: addrel cases was since replaced, and the
rest of the cases are either covered by indirection through addpool (cases with
LTO or LFROM flags) or stubs (case 74). The addpool cases are covered because
addpool emits AWORD instructions, which in turn are handled by case 11.
In the runtime, change the argv argument in the rt0* functions slightly to be a
pointer to the argv list, instead of relying on a particular location of argv.
9733044
The -shared flag to 6l outputs a shared library, implemented in Go
and callable from non-Go programs such as C.
The main part of this CL changes the thread local storage model.
Go uses the fastest and least general mode, local exec. TLS data in shared
libraries normally requires at least the local dynamic mode; however, this CL
instead opts for using the initial exec mode. Initial exec mode is faster than
local dynamic mode and can be used in linux since the linker has reserved a
limited amount of TLS space for performance sensitive TLS code.
Initial exec mode requires an extra load from the GOT table to determine the
TLS offset. This penalty will not be paid if ld is not in -shared mode, since
TLS accesses will be reduced to local exec.
The elf sections .init_array and .rela.init_array are added to register the Go
runtime entry with cgo at library load time.
The "hidden" attribute is added to Cgo functions called from Go, since Go
does not generate calls through the GOT table, and adding non-GOT relocations for
a global function is not supported by gcc. Cgo symbols don't need to be global
and avoiding the GOT table is also faster.
The changes to 8l only remove code relevant to the old -shared mode where
internal linking was used.
This CL only addresses the low level linker work. It can be submitted by itself,
but to be useful, the runtime changes in CL 9738047 are also needed.
Design discussion at
https://groups.google.com/forum/?fromgroups#!topic/golang-nuts/zmjXkGrEx6Q
Fixes #5590.
R=rsc
CC=golang-dev
https://golang.org/cl/12871044
mkvar was taking care of the "LeftAddr" case,
effectively hiding it from the temp-merging optimization.
Move it into prog.c.
R=ken2
CC=golang-dev
https://golang.org/cl/12884045
Before,
go test -bench .
would just dump the long generic "go help" message. Confusing and
unhelpful. Now the message is short and on point and also reminds the
user about the oft-forgotten "go help testflag".
% go test -bench
go test: missing argument for flag bench
run "go help test" or "go help testflag" for more information
%
R=rsc
CC=golang-dev
https://golang.org/cl/12662046
* Add a new kind of Name, "fpvar" which stands for function pointer variable
* When walking the AST, find functions used as expressions and create a new Name object for them
* Track functions which are only used in expr contexts, and avoid generating bridge code for them
R=golang-dev, minux.ma, fullung, rsc, iant
CC=golang-dev
https://golang.org/cl/9835047
The compilers assume they can generate temporary variables
as needed to preserve the right semantics or simplify code
generation and the back end will still generate good code.
This turns out not to be true. The back ends will only
track the first 128 variables per function and give up
on the remainder. That needs to be fixed too, in a later CL.
This CL merges temporary variables with equal types and
non-overlapping lifetimes using the greedy algorithm in
Poletto and Sarkar, "Linear Scan Register Allocation",
ACM TOPLAS 1999.
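A rough Go sketch of the idea follows. It is a simplification and not the compiler's code (which is written in C): the temp record, the instruction-number live ranges, and the mergeTemps function are all hypothetical names introduced for illustration. Temporaries are visited in order of increasing start point, and each one reuses the first slot of the same type whose previous occupant is already dead.

```go
package main

import (
	"fmt"
	"sort"
)

// temp is a hypothetical record for one compiler temporary.
type temp struct {
	name       string
	typ        string
	start, end int // live range in instruction numbers, inclusive
}

// mergeTemps maps each temporary to the name of the slot it shares.
// Two temporaries share a slot only if their types are equal and
// their live ranges do not overlap.
func mergeTemps(temps []temp) map[string]string {
	sorted := append([]temp(nil), temps...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].start < sorted[j].start
	})

	type slot struct {
		name   string
		typ    string
		lastUse int // end of the current occupant's live range
	}
	var slots []*slot
	assign := make(map[string]string)

	for _, t := range sorted {
		var chosen *slot
		for _, s := range slots {
			if s.typ == t.typ && s.lastUse < t.start {
				chosen = s // occupant is dead before t starts
				break
			}
		}
		if chosen == nil {
			chosen = &slot{name: t.name, typ: t.typ}
			slots = append(slots, chosen)
		}
		chosen.lastUse = t.end
		assign[t.name] = chosen.name
	}
	return assign
}

func main() {
	temps := []temp{
		{"a", "int", 0, 3},
		{"b", "int", 4, 7},    // starts after a ends: can reuse a's slot
		{"c", "string", 5, 9}, // different type: needs its own slot
		{"d", "int", 2, 5},    // overlaps a: needs its own slot
	}
	fmt.Println(mergeTemps(temps))
}
```

In this example four temporaries need only three slots, because b takes over the slot that a no longer uses.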
The result can be striking in the right functions.
Top 20 frame size changes in a 6g godoc binary by bytes saved:
5464 1984 (-3480, -63.7%) go/build.(*Context).Import
4456 1824 (-2632, -59.1%) go/printer.(*printer).expr1
2560 80 (-2480, -96.9%) time.nextStdChunk
3496 1608 (-1888, -54.0%) go/printer.(*printer).stmt
1896 272 (-1624, -85.7%) net/http.init
2688 1400 (-1288, -47.9%) fmt.(*pp).printReflectValue
2800 1512 (-1288, -46.0%) main.main
3296 2016 (-1280, -38.8%) crypto/tls.(*Conn).clientHandshake
1664 488 (-1176, -70.7%) time.loadZoneZip
1760 608 (-1152, -65.5%) time.parse
4104 3072 (-1032, -25.1%) runtime/pprof.writeHeap
1680 712 ( -968, -57.6%) go/ast.Walk
2488 1560 ( -928, -37.3%) crypto/x509.parseCertificate
1128 392 ( -736, -65.2%) math/big.nat.divLarge
1528 864 ( -664, -43.5%) go/printer.(*printer).fieldList
1360 712 ( -648, -47.6%) regexp/syntax.(*parser).factor
2104 1528 ( -576, -27.4%) encoding/asn1.parseField
1064 504 ( -560, -52.6%) encoding/xml.(*Decoder).text
584 48 ( -536, -91.8%) html.init
1400 864 ( -536, -38.3%) go/doc.playExample
In the same godoc build, cuts the number of functions with
too many vars from 83 to 32.
R=ken2
CC=golang-dev
https://golang.org/cl/12829043
If the hg checkout of go.tools fails, check for Internet
connectivity before failing.
R=golang-dev, shivakumar.gn
CC=golang-dev
https://golang.org/cl/12814043
Now there's only one copy of the flow graph construction
and dominator computation, and different optimizations
can attach different annotations to the instructions.
R=ken2
CC=golang-dev
https://golang.org/cl/12797045
Code in gc/popt.c is compiled as part of 5g, 6g, and 8g,
meaning it can use arch-specific headers but there's
just one copy of the code.
This is the same arrangement we use for the portable
code generation logic in gc/pgen.c.
Move fixjmp and noreturn there to get the ball rolling.
R=ken2
CC=golang-dev
https://golang.org/cl/12789043
Add new proginfo function that returns information about a
Prog*. The information includes various instruction
description bits as well as the lists of required registers
set, required registers used, and indexing registers used.
Convert the large instruction switches to use proginfo.
This information was formerly duplicated in multiple
optimization passes, inconsistently. For example, the
information about which registers an instruction requires
appeared three times for most instructions.
Most of the switches were incomplete or incorrect in some way.
For example, the switch in copyu did not list cases for INCB,
JPS, MOVAPD, MOVBWSX, MOVBWZX, PCDATA, POPQ, PUSHQ, STD,
TESTB, TESTQ, and XCHGL. Those were all falling into the
"unknown instruction" default case and stopping the rewrite,
perhaps unnecessarily. Similarly, the switch in needc only
listed a handful of the instructions that use or set the carry bit.
We still need to decide whether to use proginfo to generalize
a few of the remaining smaller switches in peep.c.
If this goes well, we'll make similar changes in 8g and 5g.
R=ken2
CC=golang-dev
https://golang.org/cl/12637051
On entry to a function, zero the results and zero the pointer
section of the local variables.
This is an intermediate step on the way to precise collection
of Go frames.
This can incur a significant (up to 30%) slowdown, but it also ensures
that the garbage collector never looks at a word in a Go frame
and sees a stale pointer value that could cause a space leak.
(C frames and assembly frames are still possibly problematic.)
This CL is required to start making collection of interface values
as precise as collection of pointer values is today.
Since we have to dereference the interface type to understand
whether the value is a pointer, it is critical that the type field be
initialized.
A future CL by Carl will make the garbage collection pointer
bitmaps context-sensitive. At that point it will be possible to
remove most of the zeroing. The only values that will still need
zeroing are values whose addresses escape the block scoping
of the function but do not escape to the heap.
benchmark old ns/op new ns/op delta
BenchmarkBinaryTree17 4420289180 4331060459 -2.02%
BenchmarkFannkuch11 3442469663 3277706251 -4.79%
BenchmarkFmtFprintfEmpty 100 142 +42.00%
BenchmarkFmtFprintfString 262 310 +18.32%
BenchmarkFmtFprintfInt 213 281 +31.92%
BenchmarkFmtFprintfIntInt 355 431 +21.41%
BenchmarkFmtFprintfPrefixedInt 321 383 +19.31%
BenchmarkFmtFprintfFloat 444 533 +20.05%
BenchmarkFmtManyArgs 1380 1559 +12.97%
BenchmarkGobDecode 10240054 11794915 +15.18%
BenchmarkGobEncode 17350274 19970478 +15.10%
BenchmarkGzip 455179460 460699139 +1.21%
BenchmarkGunzip 114271814 119291574 +4.39%
BenchmarkHTTPClientServer 89051 89894 +0.95%
BenchmarkJSONEncode 40486799 52691558 +30.15%
BenchmarkJSONDecode 94193361 112428781 +19.36%
BenchmarkMandelbrot200 4747060 4748043 +0.02%
BenchmarkGoParse 6363798 6675098 +4.89%
BenchmarkRegexpMatchEasy0_32 129 171 +32.56%
BenchmarkRegexpMatchEasy0_1K 365 395 +8.22%
BenchmarkRegexpMatchEasy1_32 106 152 +43.40%
BenchmarkRegexpMatchEasy1_1K 952 1245 +30.78%
BenchmarkRegexpMatchMedium_32 198 283 +42.93%
BenchmarkRegexpMatchMedium_1K 79006 101097 +27.96%
BenchmarkRegexpMatchHard_32 3478 5115 +47.07%
BenchmarkRegexpMatchHard_1K 110245 163582 +48.38%
BenchmarkRevcomp 777384355 793270857 +2.04%
BenchmarkTemplate 136713089 157093609 +14.91%
BenchmarkTimeParse 1511 1761 +16.55%
BenchmarkTimeFormat 535 850 +58.88%
benchmark old MB/s new MB/s speedup
BenchmarkGobDecode 74.95 65.07 0.87x
BenchmarkGobEncode 44.24 38.43 0.87x
BenchmarkGzip 42.63 42.12 0.99x
BenchmarkGunzip 169.81 162.67 0.96x
BenchmarkJSONEncode 47.93 36.83 0.77x
BenchmarkJSONDecode 20.60 17.26 0.84x
BenchmarkGoParse 9.10 8.68 0.95x
BenchmarkRegexpMatchEasy0_32 247.24 186.31 0.75x
BenchmarkRegexpMatchEasy0_1K 2799.20 2591.93 0.93x
BenchmarkRegexpMatchEasy1_32 299.31 210.44 0.70x
BenchmarkRegexpMatchEasy1_1K 1074.71 822.45 0.77x
BenchmarkRegexpMatchMedium_32 5.04 3.53 0.70x
BenchmarkRegexpMatchMedium_1K 12.96 10.13 0.78x
BenchmarkRegexpMatchHard_32 9.20 6.26 0.68x
BenchmarkRegexpMatchHard_1K 9.29 6.26 0.67x
BenchmarkRevcomp 326.95 320.40 0.98x
BenchmarkTemplate 14.19 12.35 0.87x
R=cshapiro
CC=golang-dev
https://golang.org/cl/12616045
Prior to this change, pointer maps encoded the disposition of
a word using a single bit. A zero signaled a non-pointer
value and a one signaled a pointer value. Interface values,
which are effectively a union type, were conservatively
labeled as a pointer.
This change widens the logical element size of the pointer map
to two bits per word. As before, zero signals a non-pointer
value and one signals a pointer value. Additionally, a two
signals an iface pointer and a three signals an eface pointer.
Following other changes to the runtime, values two and three
will allow type information to drive interpretation of the
subsequent word so only those interface values containing a
pointer value will be scanned.
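The two-bit encoding can be sketched in Go as below. Only the four values (0 non-pointer, 1 pointer, 2 iface, 3 eface) come from the description above; the constant names, the ptrMap type, the uint32 backing words, and the get/set helpers are assumptions made for illustration and do not reflect the actual compiler or runtime layout.

```go
package main

import "fmt"

// Values per the description above; the names are hypothetical.
const (
	bitsNonPointer = 0
	bitsPointer    = 1
	bitsIface      = 2
	bitsEface      = 3

	bitsPerWord = 2 // logical element size of the pointer map
)

// ptrMap packs one two-bit entry per stack or argument word.
type ptrMap []uint32

// newPtrMap allocates a map able to describe nwords words.
func newPtrMap(nwords int) ptrMap {
	return make(ptrMap, (nwords*bitsPerWord+31)/32)
}

// set records the disposition v of the given word.
func (m ptrMap) set(word int, v uint32) {
	idx := word * bitsPerWord / 32
	shift := uint(word*bitsPerWord) % 32
	m[idx] &^= 3 << shift      // clear the old entry
	m[idx] |= (v & 3) << shift // store the new one
}

// get returns the disposition of the given word.
func (m ptrMap) get(word int) uint32 {
	idx := word * bitsPerWord / 32
	shift := uint(word*bitsPerWord) % 32
	return (m[idx] >> shift) & 3
}

func main() {
	m := newPtrMap(20)
	m.set(0, bitsPointer)
	m.set(16, bitsIface)
	m.set(17, bitsEface)
	for _, w := range []int{0, 1, 16, 17} {
		fmt.Printf("word %d: %d\n", w, m.get(w))
	}
}
```

Because two divides 32 evenly, an entry never straddles two backing words, which keeps get and set to a single shift and mask.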
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/12689046
On my Mac, cuts the API checks from 15 seconds to 6 seconds.
Also clean up some tag confusion: go run list-of-files ignores tags.
R=bradfitz, gri
CC=golang-dev
https://golang.org/cl/12699048
This change brings the way cc constructs pointer maps closer to
what gc does. It prepares for changes to the internal content of
the pointer map, such as distinguishing interface pointers from
ordinary pointers.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/12692043
g% 6c ~/x.c
/Users/rsc/x.c:1 duplicate types given: STRUCT s and VOID
/Users/rsc/x.c:1 no return at end of function: f
g%
Fixes #6083.
R=ken2
CC=golang-dev
https://golang.org/cl/12691043
MOVBS and MOVHS are defined as duplicates of MOVB and MOVH,
and perform sign-extending moves.
No change is made to code generation.
Update #1837
R=rsc, bradfitz
CC=golang-dev
https://golang.org/cl/12682043
- adjusted test files so that they actually type-check
- adjusted go1.txt, go1.1.txt, next.txt
- to run, provide build tag: api_tool
Fixes #4538.
R=bradfitz
CC=golang-dev
https://golang.org/cl/12300043
I moved the pointer block from one end of the frame
to the other toward the end of working on the last CL,
and of course that made the optimization no longer work.
Now it works again:
0030 (bug361.go:12) DATA gclocals·0+0(SB)/4,$4
0030 (bug361.go:12) DATA gclocals·0+4(SB)/4,$3
0030 (bug361.go:12) GLOBL gclocals·0+0(SB),8,$8
Fixes arm build (this time for sure!).
TBR=golang-dev
CC=cshapiro, golang-dev, iant
https://golang.org/cl/12627044
Sort non-pointer-containing data to the low end of the
stack frame, and make the bitmaps only cover the
pointer-containing top end.
Generates significantly less garbage collection bitmap
for programs with large byte buffers on the stack.
Only 2% shorter for godoc, but 99.99998% shorter
in some test cases.
Fixes arm build.
TBR=golang-dev
CC=cshapiro, golang-dev, iant
https://golang.org/cl/12541047
Individual variables bigger than 10 MB are now
moved to the heap, as if they had escaped on
their own.
This avoids ridiculous stacks for programs that
do things like
x := [1<<30]byte{}
... use x ...
If 10 MB is too small, we can raise the limit.
Fixes #6077.
R=ken2
CC=golang-dev
https://golang.org/cl/12650045
In prep for Robert's forthcoming cmd/api rewrite which
depends on the go.tools subrepo, we'll need to be more
careful about how and when we run cmd/api.
Rather than implement this policy in both run.bash and
run.bat, this change moves the policy and mechanism into
cmd/api/run.go, which will then evolve.
The plan is in a TODO in run.go.
R=golang-dev, gri
CC=golang-dev
https://golang.org/cl/12482044
Previously, all word aligned locations in the local variables
area were scanned as conservative roots. With this change, a
bitmap is generated describing the locations of pointer values
in local variables.
With this change the argument bitmap information has been
changed to only store information about arguments. The locals
member has been removed. In its place, the bitmap data for
local variables is now used to store the size of locals. If
the size is negative, the magnitude indicates the size of the
local variables area.
R=rsc
CC=golang-dev
https://golang.org/cl/12328044
We can then include this file in assembly to replace
cryptic constants like "7" with meaningful constants
like "(NOPROF|DUPOK|NOSPLIT)".
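To illustrate why the named form is easier to read, here is a small Go sketch that decodes such a flag word. The header itself is a C-style include for assembly, not Go. The specific bit assigned to each name below is an assumption; the text above only establishes that the three flags together equal 7. The describe helper is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// Flag bits; the individual assignments are assumed for illustration.
const (
	NOPROF  = 1 << iota // 1
	DUPOK               // 2
	NOSPLIT             // 4
)

// describe turns a numeric textflag value into its symbolic form.
func describe(flags int) string {
	if flags == 0 {
		return "0"
	}
	var names []string
	if flags&NOPROF != 0 {
		names = append(names, "NOPROF")
	}
	if flags&DUPOK != 0 {
		names = append(names, "DUPOK")
	}
	if flags&NOSPLIT != 0 {
		names = append(names, "NOSPLIT")
	}
	return strings.Join(names, "|")
}

func main() {
	fmt.Println(describe(7)) // NOPROF|DUPOK|NOSPLIT
	fmt.Println(describe(4)) // NOSPLIT
}
```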
Converting just pkg/runtime/asm*.s for now. Dropping NOPROF
and DUPOK from lots of places where they aren't needed.
More .s files to come in a subsequent changelist.
A nonzero number in the textflag field now means
"has not been converted yet".
R=golang-dev, daniel.morsing, rsc, khr
CC=golang-dev
https://golang.org/cl/12568043
For normal slices a[i:j] we're generating 3 bounds
checks: j<={len(string),cap(slice)}, j<=j (!), and i<=j.
Somehow snuck in as part of the [i:j:k] implementation
where the second check does something.
Remove the second check when we don't need it.
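As a sketch of what remains, the function below spells out the two useful checks for a two-index slice expression on a slice. It is an illustration written in Go, not the code the compiler emits, and the name sliceChecked is hypothetical. The comparisons are done on unsigned values so that negative indices are rejected by the same tests.

```go
package main

import (
	"errors"
	"fmt"
)

// sliceChecked mimics a[i:j] for a slice, making the two remaining
// bounds checks explicit instead of panicking.
func sliceChecked(a []byte, i, j int) ([]byte, error) {
	if uint(j) > uint(cap(a)) { // check 1: j <= cap(a)
		return nil, errors.New("slice bounds out of range: j > cap")
	}
	if uint(i) > uint(j) { // check 2: i <= j
		return nil, errors.New("slice bounds out of range: i > j")
	}
	return a[i:j], nil
}

func main() {
	a := make([]byte, 5) // len 5, cap 5
	fmt.Println(sliceChecked(a, 1, 3))
	fmt.Println(sliceChecked(a, 3, 1))
	fmt.Println(sliceChecked(a, 0, 6))
}
```

For the three-index form a[i:j:k], the middle comparison becomes j <= k and is no longer trivially true, which is why it exists in the shared code path.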
R=rsc, r
CC=golang-dev
https://golang.org/cl/12311046
Also, add a meaningful error message when an encoding which
can't be parsed is found.
Fixes #5801.
R=golang-dev, bradfitz, rsc
CC=golang-dev
https://golang.org/cl/12343043