All three cases of clearfat were wrong on power64x.
The cases that handle 1032 bytes and up and 32 bytes and up
both use MOVDU (one directly generated in a loop and the other
via duffzero), which leaves the pointer register pointing at
the *last written* address. The generated code was not
accounting for this, so the byte fill loop was re-zeroing the
last zeroed dword, rather than the bytes following the last
zeroed dword. Fix this by simply adding an additional 8 byte
offset to the byte zeroing loop.
The case that handled under 32 bytes was also wrong. It
didn't update the pointer register at all, so the byte zeroing
loop was simply re-zeroing the beginning of region. Again,
the fix is to add an offset to the byte zeroing loop to
account for this.
LGTM=dave, bradfitz
R=rsc, dave, bradfitz
CC=golang-codereviews
https://golang.org/cl/168870043
goprintf is a printf-like print for Go.
It is used in the code generated by 'defer print(...)' and 'go print(...)'.
Normally print(1, 2, 3) turns into
printint(1)
printint(2)
printint(3)
but defer and go need a single function call to give the runtime;
they give the runtime something like goprintf("%d%d%d", 1, 2, 3).
Variadic functions like goprintf cannot be described in the new
type information world, so we have to replace it.
Replace with a custom function, so that defer print(1, 2, 3) turns
into
defer func(a1, a2, a3 int) {
print(a1, a2, a3)
}(1, 2, 3)
(and then the print becomes three different printints as usual).
Fixes#8614.
LGTM=austin
R=austin
CC=golang-codereviews, r
https://golang.org/cl/159700043
I removed support for jumping between functions years ago,
as part of doing the instruction layout for each function separately.
Given that, it makes sense to treat labels as function-scoped.
This lets each function have its own 'loop' label, for example.
Makes the assembly much cleaner and removes the last
reason anyone would reach for the 123(PC) form instead.
Note that this is on the dev.power64 branch, but it changes all
the assemblers. The change will ship in Go 1.5 (perhaps after
being ported into the new assembler).
Came up as part of CL 167730043.
LGTM=r
R=r
CC=austin, dave, golang-codereviews, minux
https://golang.org/cl/159670043
The "to" field was the penultimate argument to outgcode,
instead of the last argument, which swapped the third and
fourth operands. The argument order was correct in a.y, so
just swap the meaning of the arguments in outgcode. This
hadn't come up because we hadn't used these more obscure
operations in any hand-written assembly until now.
LGTM=rsc, dave
R=rsc, dave
CC=golang-codereviews
https://golang.org/cl/160690043
get -u now checks that remote repo paths match the
ones predicted by the import paths: if you are get -u'ing
rsc.io/pdf, it has to be checked out from the right location.
This is important in case the rsc.io/pdf redirect changes.
In some cases, people have good reasons to use
non-standard remote repos. Add -f flag to allow that.
The f can stand for force or fork, as you see fit.
Fixes#8850.
LGTM=r
R=r
CC=golang-codereviews
https://golang.org/cl/164120043
This brings dev.power64 up-to-date with the current tip of
default. go_bootstrap is still panicking with a bad defer
when initializing the runtime (even on amd64).
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/152570049
This also removes pkg/runtime/traceback_lr.c, which was ported
to Go in an earlier commit and then moved to
runtime/traceback.go.
Reviewer: rsc@golang.org
rsc: LGTM
Partial undo, changes to ldelf.c retained.
Some platforms are still not working even with the integrated assembler disabled, will have to find another solution.
««« original CL description
cmd/cgo: disable clang's integrated assembler
Fixes#8348.
Clang's internal assembler (introduced by default in clang 3.4) understands the .arch directive, but doesn't change the default value of -march. This causes the build to fail when we use BLX (armv5 and above) when clang is compiled for the default armv4t architecture (which appears to be the default on all the distros I've used).
This is probably a clang bug, so work around it for the time being by disabling the integrated assembler when compiling the cgo assembly shim.
This CL also includes a small change to ldelf.c which was required as clang 3.4 and above generate more weird symtab entries.
LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews
https://golang.org/cl/156430044
»»»
LGTM=minux
R=iant, minux
CC=golang-codereviews
https://golang.org/cl/162880044
This brings cmd/gc in line with the spec on this question.
It might break existing code, but that code was not conformant
with the spec.
Credit to Rémy for finding the broken code.
Fixes#6366.
LGTM=r
R=golang-codereviews, r
CC=adonovan, golang-codereviews, gri
https://golang.org/cl/129550043
Fixes#8348.
Clang's internal assembler (introduced by default in clang 3.4) understands the .arch directive, but doesn't change the default value of -march. This causes the build to fail when we use BLX (armv5 and above) when clang is compiled for the default armv4t architecture (which appears to be the default on all the distros I've used).
This is probably a clang bug, so work around it for the time being by disabling the integrated assembler when compiling the cgo assembly shim.
This CL also includes a small change to ldelf.c which was required as clang 3.4 and above generate more weird symtab entries.
LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews
https://golang.org/cl/156430044
https://golang.org/cl/152700045/ made it possible for struct literals assigned to globals to use <N> as the RHS. Normally, this is to zero out variables on first use. Because globals are already zero (or their linker initialized value), we just ignored this.
Now that <N> can occur from non-initialization code, we need to emit this code. We don't use <N> for initialization of globals any more, so this shouldn't cause any excessive zeroing.
Fixes#8961.
LGTM=rsc
R=golang-codereviews, rsc
CC=bradfitz, golang-codereviews
https://golang.org/cl/154540044
Better to avoid the memory loads and just use immediate constants.
This especially applies to zeroing, which was being done by
copying zeros from elsewhere in the binary, even if the value
was going to be completely initialized with non-zero values.
The zero writes were optimized away but the zero loads from
the data segment were not.
LGTM=r
R=r, bradfitz, dvyukov
CC=golang-codereviews
https://golang.org/cl/152700045
Both of these forms can avoid writing to the base pointer in x
(in the slice, always, and in the append, most of the time).
For Go 1.5, will need to change the compilation of x = x[0:y]
to avoid writing to the base pointer, so that the elision is safe,
and will need to change the compilation of x = append(x, ...)
to write to the base pointer (through a barrier) only when
growing the underlying array, so that the general elision is safe.
For Go 1.4, elide the write barrier always, a change that should
have equivalent performance characteristics but is much
simpler and therefore safer.
benchmark old ns/op new ns/op delta
BenchmarkBinaryTree17 3910526122 3918802545 +0.21%
BenchmarkFannkuch11 3747650699 3732600693 -0.40%
BenchmarkFmtFprintfEmpty 106 98.7 -6.89%
BenchmarkFmtFprintfString 280 269 -3.93%
BenchmarkFmtFprintfInt 296 282 -4.73%
BenchmarkFmtFprintfIntInt 467 470 +0.64%
BenchmarkFmtFprintfPrefixedInt 418 398 -4.78%
BenchmarkFmtFprintfFloat 574 535 -6.79%
BenchmarkFmtManyArgs 1768 1818 +2.83%
BenchmarkGobDecode 14916799 14925182 +0.06%
BenchmarkGobEncode 14110076 13358298 -5.33%
BenchmarkGzip 546609795 542630402 -0.73%
BenchmarkGunzip 136270657 136496277 +0.17%
BenchmarkHTTPClientServer 126574 125245 -1.05%
BenchmarkJSONEncode 30006238 27862354 -7.14%
BenchmarkJSONDecode 106020889 102664600 -3.17%
BenchmarkMandelbrot200 5793550 5818320 +0.43%
BenchmarkGoParse 5437608 5463962 +0.48%
BenchmarkRegexpMatchEasy0_32 192 179 -6.77%
BenchmarkRegexpMatchEasy0_1K 462 460 -0.43%
BenchmarkRegexpMatchEasy1_32 168 153 -8.93%
BenchmarkRegexpMatchEasy1_1K 1420 1280 -9.86%
BenchmarkRegexpMatchMedium_32 338 286 -15.38%
BenchmarkRegexpMatchMedium_1K 107435 98027 -8.76%
BenchmarkRegexpMatchHard_32 5941 4846 -18.43%
BenchmarkRegexpMatchHard_1K 185965 153830 -17.28%
BenchmarkRevcomp 795497458 798447829 +0.37%
BenchmarkTemplate 132091559 134938425 +2.16%
BenchmarkTimeParse 604 608 +0.66%
BenchmarkTimeFormat 551 548 -0.54%
LGTM=r
R=r, dave
CC=golang-codereviews, iant, khr, rlh
https://golang.org/cl/159960043
Among other things, *x = T{} does not need a write barrier.
The changes here avoid an unnecessary copy even when
no pointers are involved, so it may have larger effects.
In 6g and 8g, avoid manually repeated STOSQ in favor of
writing explicit MOVs, under the theory that the MOVs
should have fewer dependencies and pipeline better.
Benchmarks compare best of 5 on a 2012 MacBook Pro Core i5
with TurboBoost disabled. Most improvements can be explained
by the changes in this CL.
The effect in Revcomp is real but harder to explain: none of
the instructions in the inner loop changed. I suspect loop
alignment but really have no idea.
benchmark old new delta
BenchmarkBinaryTree17 3809027371 3819907076 +0.29%
BenchmarkFannkuch11 3607547556 3686983012 +2.20%
BenchmarkFmtFprintfEmpty 118 103 -12.71%
BenchmarkFmtFprintfString 289 277 -4.15%
BenchmarkFmtFprintfInt 304 290 -4.61%
BenchmarkFmtFprintfIntInt 507 458 -9.66%
BenchmarkFmtFprintfPrefixedInt 425 408 -4.00%
BenchmarkFmtFprintfFloat 555 555 +0.00%
BenchmarkFmtManyArgs 1835 1733 -5.56%
BenchmarkGobDecode 14738209 14639331 -0.67%
BenchmarkGobEncode 14239039 13703571 -3.76%
BenchmarkGzip 538211054 538701315 +0.09%
BenchmarkGunzip 135430877 134818459 -0.45%
BenchmarkHTTPClientServer 116488 116618 +0.11%
BenchmarkJSONEncode 28923406 29294334 +1.28%
BenchmarkJSONDecode 105779820 104289543 -1.41%
BenchmarkMandelbrot200 5791758 5771964 -0.34%
BenchmarkGoParse 5376642 5310943 -1.22%
BenchmarkRegexpMatchEasy0_32 195 190 -2.56%
BenchmarkRegexpMatchEasy0_1K 477 455 -4.61%
BenchmarkRegexpMatchEasy1_32 170 165 -2.94%
BenchmarkRegexpMatchEasy1_1K 1410 1394 -1.13%
BenchmarkRegexpMatchMedium_32 336 329 -2.08%
BenchmarkRegexpMatchMedium_1K 108979 106328 -2.43%
BenchmarkRegexpMatchHard_32 5854 5821 -0.56%
BenchmarkRegexpMatchHard_1K 185089 182838 -1.22%
BenchmarkRevcomp 834920364 780202624 -6.55%
BenchmarkTemplate 137046937 129728756 -5.34%
BenchmarkTimeParse 600 594 -1.00%
BenchmarkTimeFormat 559 539 -3.58%
LGTM=r
R=r
CC=golang-codereviews, iant, khr, rlh
https://golang.org/cl/157910047
The assembler could give a better error, but this one
is good enough for now.
Fixes#8880.
LGTM=r
R=r
CC=golang-codereviews
https://golang.org/cl/153610043
The racewalk code was not updated for the new write barriers.
Make it more future-proof.
The new write barrier code assumed that +1 pointer would
be aligned properly for any type that might follow, but that's
not true on 32-bit systems where some types are 64-bit aligned.
The only system like that today is nacl/amd64p32.
Insert a dummy pointer so that the ambiguously typed
value is at +2 pointers, which is always max-aligned.
LGTM=r
R=r
CC=golang-codereviews, iant, khr
https://golang.org/cl/158890046
Assignments of 2-, 3-, and 4-word values were handled
by individual MOV instructions (and for scalars still are).
But if there are pointers involved, those assignments now
go through the write barrier routine. Before this CL, they
went to writebarrierfat, which calls memmove.
Memmove is too much overhead for these small
amounts of data.
Instead, call writebarrierfat{2,3,4}, which are specialized
for the specific amount of data being copied.
Today the write barrier does not care which words are
pointers, so size alone is enough to distinguish the cases.
If we keep these distinctions in Go 1.5 we will need to
expand them for all the pointer-vs-scalar possibilities,
so the current 3 functions will become 3+7+15 = 25,
still not a large burden (we deleted more morestack
functions than that when we dropped segmented stacks).
BenchmarkBinaryTree17 3250972583 3123910344 -3.91%
BenchmarkFannkuch11 3067605223 2964737839 -3.35%
BenchmarkFmtFprintfEmpty 101 96.0 -4.95%
BenchmarkFmtFprintfString 267 235 -11.99%
BenchmarkFmtFprintfInt 261 253 -3.07%
BenchmarkFmtFprintfIntInt 444 402 -9.46%
BenchmarkFmtFprintfPrefixedInt 374 346 -7.49%
BenchmarkFmtFprintfFloat 472 449 -4.87%
BenchmarkFmtManyArgs 1537 1476 -3.97%
BenchmarkGobDecode 13986528 12432985 -11.11%
BenchmarkGobEncode 13120323 12537420 -4.44%
BenchmarkGzip 451925758 437500578 -3.19%
BenchmarkGunzip 113267612 110053644 -2.84%
BenchmarkHTTPClientServer 103151 77100 -25.26%
BenchmarkJSONEncode 25002733 23435278 -6.27%
BenchmarkJSONDecode 94213717 82568789 -12.36%
BenchmarkMandelbrot200 4804246 4713070 -1.90%
BenchmarkGoParse 4646114 4379456 -5.74%
BenchmarkRegexpMatchEasy0_32 163 158 -3.07%
BenchmarkRegexpMatchEasy0_1K 433 391 -9.70%
BenchmarkRegexpMatchEasy1_32 154 138 -10.39%
BenchmarkRegexpMatchEasy1_1K 1481 1132 -23.57%
BenchmarkRegexpMatchMedium_32 282 270 -4.26%
BenchmarkRegexpMatchMedium_1K 92421 86149 -6.79%
BenchmarkRegexpMatchHard_32 5209 4718 -9.43%
BenchmarkRegexpMatchHard_1K 158141 147921 -6.46%
BenchmarkRevcomp 699818791 642222464 -8.23%
BenchmarkTemplate 132402383 108269713 -18.23%
BenchmarkTimeParse 509 478 -6.09%
BenchmarkTimeFormat 462 456 -1.30%
LGTM=r
R=r
CC=golang-codereviews
https://golang.org/cl/156200043
Our current pe object reader assumes that every symbol starting with
'.' is section. It appeared to be true, until now gcc 4.9.1 generates
some symbols with '.' at the front. Change that logic to check other
symbol fields in addition to checking for '.'. I am not an expert
here, but it seems reasonable to me.
Added test, but it is only good, if tested with gcc 4.9.1. Otherwise
the test PASSes regardless.
Fixes#8811.
Fixes#8856.
LGTM=jfrederich, iant, stephen.gutekanst
R=golang-codereviews, jfrederich, stephen.gutekanst, iant
CC=alex.brainman, golang-codereviews
https://golang.org/cl/152410043
gcc 4.9.1 generates pe sections with names longer then 8 charters.
From IMAGE_SECTION_HEADER definition:
Name
An 8-byte, null-padded UTF-8 string. There is no terminating null character
if the string is exactly eight characters long. For longer names, this
member contains a forward slash (/) followed by an ASCII representation
of a decimal number that is an offset into the string table.
Our current pe object file reader does not read string table when section
names starts with /. Do that, so (issue 8811 example)
c:\go\path\src\isssue8811>go build
# isssue8811
isssue8811/glfw(.text): isssue8811/glfw(/76): not defined
isssue8811/glfw(.text): undefined: isssue8811/glfw(/76)
becomes
c:\go\path\src\isssue8811>go build
# isssue8811
isssue8811/glfw(.text): isssue8811/glfw(.rdata$.refptr._glfwInitialized): not defined
isssue8811/glfw(.text): undefined: isssue8811/glfw(.rdata$.refptr._glfwInitialized)
Small progress to
Update #8811
LGTM=iant, jfrederich
R=golang-codereviews, iant, jfrederich
CC=golang-codereviews
https://golang.org/cl/154210044
I diffed the output of `nm -n gofmt' before and after this change,
and verified that all changes are correct and all corrupted symbol
names are fixed.
Fixes#8906.
LGTM=iant, cookieo9
R=golang-codereviews, iant, cookieo9
CC=golang-codereviews
https://golang.org/cl/159750043
It seems reasonable that people might want to look up the
ImportComment with "go list".
LGTM=r
R=golang-codereviews, r
CC=golang-codereviews
https://golang.org/cl/143600043
This will help find bugs during the release freeze.
It's not clear it should be kept for the release itself.
That's issue 8861.
The most likely thing that would trigger this is stale
pointers that previously were ignored or caused memory
leaks. These were allowed due to the use of conservative
collection. Now that everything is precise, we should not
see them anymore.
The small number check reinforces what the stack copier
is already doing, catching the storage of integers in pointers.
It caught issue 8864.
The check is disabled if _cgo_allocate is linked into the binary,
which is to say if the binary is using SWIG to allocate untyped
Go memory. In that case, there are invalid pointers and there's
nothing we can do about it.
LGTM=rlh
R=golang-codereviews, dvyukov, rlh
CC=golang-codereviews, iant, khr, r
https://golang.org/cl/148470043
Depending on flags&KindGCProg,
gc[0] and gc[1] are either pointers or inlined bitmap bits.
That's not compatible with a precise garbage collector:
it needs to be always pointers or never pointers.
Change the inlined bitmap case to store a pointer to an
out-of-line bitmap in gc[0]. The out-of-line bitmaps are
dedup'ed, so that for example all pointer types share the
same out-of-line bitmap.
Fixes#8864.
LGTM=r
R=golang-codereviews, dvyukov, r
CC=golang-codereviews, iant, khr, rlh
https://golang.org/cl/155820043
http://build.golang.org/log/c7a91b6eac8f8daa2bd17801be273e58403a15f2
# cmd/pprof
/linux-386-clang-9115aad1dc4a/go/pkg/linux_386/net.a(_all.o): sym#16: ignoring .Linfo_string0 in section 16 (type 0)
/linux-386-clang-9115aad1dc4a/go/pkg/linux_386/net.a(_all.o): sym#17: ignoring .Linfo_string1 in section 16 (type 0)
/linux-386-clang-9115aad1dc4a/go/pkg/linux_386/net.a(_all.o): sym#18: ignoring .Linfo_string2 in section 16 (type 0)
/linux-386-clang-9115aad1dc4a/go/pkg/linux_386/net.a(_all.o): sym#20: ignoring .Linfo_string0 in section 16 (type 0)
/linux-386-clang-9115aad1dc4a/go/pkg/linux_386/net.a(_all.o): sym#21: ignoring .Linfo_string1 in section 16 (type 0)
...
I don't know what these are. Let's ignore them and see if we get any further.
TBR=iant
CC=golang-codereviews
https://golang.org/cl/155030043
For example, fixes 'go vet syscall', which has source
files in package syscall_test.
Fixes#8511.
LGTM=r
R=golang-codereviews, r
CC=golang-codereviews, iant
https://golang.org/cl/152220044
Structs without tags have no unique name to use in the
Go definitions generated from the C types.
This caused issue 8812, fixed by CL 149260043.
Avoid future problems by requiring struct tags.
Update runtime as needed.
(There is no other C code in the tree.)
LGTM=bradfitz, iant
R=golang-codereviews, bradfitz, dave, iant
CC=golang-codereviews, khr, r
https://golang.org/cl/150360043
+ static test
NB: there's a preexisting (dynamic) failure of test issue7978.go.
LGTM=iant
R=rsc, iant
CC=golang-codereviews
https://golang.org/cl/144650045