Commit Graph

5178 Commits

Author SHA1 Message Date
Cherry Mui 6436f5c13d runtime: handle end PC in textAddr
As the func table contains the end marker of the text section, we
sometimes need to get that address from an offset. Currently
textAddr doesn't handle that address, as it is not within any
text section. Instead of letting the callers not call textAddr
with the end offset, just handle it more elegantly in textAddr.

For #48837.

Change-Id: I6e97e455f6cb66e9680a7aac6152ba6f4cda2e12
Reviewed-on: https://go-review.googlesource.com/c/go/+/354635
Trust: Cherry Mui <cherryyz@google.com>
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-10-07 22:19:51 +00:00
Keith Randall 2043b3b47b cmd/compile,runtime: implement uint64->float32 correctly on 32-bit archs
The old way of implementing it, float32(float64(x)), involves 2 roundings
which can cause accuracy errors in some strange cases. Implement a runtime
version of [u]int64tofloat32 which only does one rounding.

Fixes #48807

Change-Id: Ie580be480bee4f3a479e58ef8dce23032f231704
Reviewed-on: https://go-review.googlesource.com/c/go/+/354429
Trust: Keith Randall <khr@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2021-10-07 18:34:24 +00:00
Meng Zhuo ecb2f231fa runtime,sync: using fastrandn instead of modulo reduction
fastrandn is ~50% faster than fastrand() % n.
`ack -v 'fastrand\(\)\s?\%'` finds all modulo on fastrand()

name              old time/op  new time/op  delta
Fastrandn/2       2.86ns ± 0%  1.59ns ± 0%  -44.35%  (p=0.000 n=9+10)
Fastrandn/3       2.87ns ± 1%  1.59ns ± 0%  -44.41%  (p=0.000 n=10+9)
Fastrandn/4       2.87ns ± 1%  1.58ns ± 1%  -45.10%  (p=0.000 n=10+10)
Fastrandn/5       2.86ns ± 1%  1.58ns ± 1%  -44.84%  (p=0.000 n=10+10)

Change-Id: Ic91f5ca9b9e3b65127bc34792b62fd64fbd13b5c
Reviewed-on: https://go-review.googlesource.com/c/go/+/353269
Trust: Meng Zhuo <mzh@golangcn.org>
Run-TryBot: Meng Zhuo <mzh@golangcn.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2021-10-07 14:01:52 +00:00
Josh Bleecher Snyder 8238f82bf1 runtime: streamline moduledata.textAddr
Accept a uint32 instead of a uintptr to make call sites simpler.

Do less work in the common case in which len(textsectmap) == 1.

Change-Id: Idd6cdc3fdad7a9356864c83790463b5d3000171b
Reviewed-on: https://go-review.googlesource.com/c/go/+/354132
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2021-10-06 20:29:59 +00:00
Josh Bleecher Snyder 058fa255bc cmd/link,runtime: make textsectmap fields more convenient for runtime
They're only used in a single place.
Instead of calculating the end every time,
calculate it in the linker.

It'd be nice to recalculate baseaddr-vaddr,
but that generates relocations that are too large.

While we're here, remove some pointless uintptr -> uintptr conversions.

Change-Id: I91758f9bff11b365bc3a63fee172dbdc3d90b966
Reviewed-on: https://go-review.googlesource.com/c/go/+/354089
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
2021-10-06 19:54:45 +00:00
Russ Cox 4d8db00641 all: use bytes.Cut, strings.Cut
Many uses of Index/IndexByte/IndexRune/Split/SplitN
can be written more clearly using the new Cut functions.
Do that. Also rewrite to other functions if that's clearer.

For #46336.

Change-Id: I68d024716ace41a57a8bf74455c62279bde0f448
Reviewed-on: https://go-review.googlesource.com/c/go/+/351711
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2021-10-06 15:53:04 +00:00
Josh Bleecher Snyder e82ed0cd83 runtime: start moduledata memory load early
The slowest thing that can happen in funcdata is a cache miss
on moduledata.gofunc. Move that memory load earlier.

Also, for better ergonomics when working on this code,
do more calculations as uintptrs.

name                   old time/op  new time/op  delta
StackCopyWithStkobj-8  10.5ms ± 5%   9.9ms ± 4%  -6.03%  (p=0.000 n=15+15)

Change-Id: I590f4449725983c7f8d274c4ac7ed384d9018d85
Reviewed-on: https://go-review.googlesource.com/c/go/+/354134
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2021-10-05 23:36:43 +00:00
Josh Bleecher Snyder 2a5d4ea97e runtime: make funcspdelta inlineable
funcspdelta should be inlined: It is a tiny wrapper around another func.
The sanity check prevents that. Condition the sanity check on debugPcln.
While we're here, make the sanity check throw when it fails.

Change-Id: Iec022b8463b13a8e5a6d8479e7ddcb68909d6fe0
Reviewed-on: https://go-review.googlesource.com/c/go/+/354133
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2021-10-05 23:36:03 +00:00
Josh Bleecher Snyder 5758c40ac8 runtime: add a single-text-section fast path to findfunc
name                   old time/op  new time/op  delta
StackCopyWithStkobj-8  11.5ms ± 4%  10.7ms ± 7%  -7.10%  (p=0.000 n=10+10)

Change-Id: Ib806d732ec11f2a6cfde229fd88aff0fe68d9e7d
Reviewed-on: https://go-review.googlesource.com/c/go/+/354129
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2021-10-05 23:34:32 +00:00
Josh Bleecher Snyder 77bd0da688 cmd/link,runtime: remove unnecessary funcdata alignment
Change-Id: I2777feaae4f266de99b56b444045370c82447cff
Reviewed-on: https://go-review.googlesource.com/c/go/+/354011
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2021-10-05 23:25:24 +00:00
Josh Bleecher Snyder e31c9ab557 cmd/link,runtime: remove functab relocations
Use an offset from runtime.text instead.
This removes the last relocation from functab generation,
which lets us simplify that code.

size      before    after     Δ        %
addr2line 3680818   3652498   -28320   -0.769%
api       4944850   4892418   -52432   -1.060%
asm       4757586   4711266   -46320   -0.974%
buildid   2418546   2392578   -25968   -1.074%
cgo       4197346   4164818   -32528   -0.775%
compile   22076882  21875890  -200992  -0.910%
cover     4411362   4358418   -52944   -1.200%
dist      3091346   3062738   -28608   -0.925%
doc       3563234   3532610   -30624   -0.859%
fix       3020658   2991666   -28992   -0.960%
link      6164642   6110834   -53808   -0.873%
nm        3646818   3618482   -28336   -0.777%
objdump   4012594   3983042   -29552   -0.736%
pack      2153554   2128338   -25216   -1.171%
pprof     13011666  12870114  -141552  -1.088%
test2json 2383906   2357554   -26352   -1.105%
trace     9736514   9631186   -105328  -1.082%
vet       6655058   6580370   -74688   -1.122%
total     103927380 102914820 -1012560 -0.974%

relocs    before  after   Δ       %
addr2line 25069   22709   -2360   -9.414%
api       17176   13321   -3855   -22.444%
asm       18271   15630   -2641   -14.455%
buildid   9233    7352    -1881   -20.373%
cgo       16222   13044   -3178   -19.591%
compile   60421   46299   -14122  -23.373%
cover     18479   14526   -3953   -21.392%
dist      10135   7733    -2402   -23.700%
doc       12735   9940    -2795   -21.947%
fix       10820   8341    -2479   -22.911%
link      21849   17785   -4064   -18.600%
nm        24988   22642   -2346   -9.389%
objdump   26060   23462   -2598   -9.969%
pack      7665    5936    -1729   -22.557%
pprof     60764   50998   -9766   -16.072%
test2json 8389    6431    -1958   -23.340%
trace     37180   29382   -7798   -20.974%
vet       24044   19055   -4989   -20.749%
total     409499  334585  -74914  -18.294%


Caching the field size in debug/gosym.funcTab
avoids a 20% PCToLine performance regression.

name            old time/op    new time/op    delta
115/LineToPC-8    56.4µs ± 3%    57.3µs ± 2%  +1.66%  (p=0.006 n=15+13)
115/PCToLine-8     188ns ± 2%     190ns ± 3%  +1.46%  (p=0.030 n=15+15)


Change-Id: I2816a1b28e62b01852e3b306f08546f1e56cd5ac
Reviewed-on: https://go-review.googlesource.com/c/go/+/352191
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
2021-10-05 23:25:06 +00:00
Josh Bleecher Snyder 990c9c6cab Revert "runtime: use unsafe.Slice in getStackMap"
This reverts commit golang.org/cl/352953.

Reason for revert: unsafe.Slice is considerably slower.
Part of this is extra safety checks (good), but most of it
is the function call overhead. We should consider open-coding it (#48798).

Impact of this change:

name                   old time/op  new time/op  delta
StackCopyWithStkobj-8  12.1ms ± 5%  11.6ms ± 3%  -4.03%  (p=0.009 n=10+8)

Change-Id: Ib2448e3edac25afd8fb55ffbea073b8b11521bde
Reviewed-on: https://go-review.googlesource.com/c/go/+/354090
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2021-10-05 20:35:54 +00:00
Josh Bleecher Snyder 113b52979f runtime: remove a branch from funcdata
name                   old time/op  new time/op  delta
StackCopyWithStkobj-8  12.1ms ± 7%  11.6ms ± 8%  -3.88%  (p=0.002 n=19+19)

Change-Id: Idf810017d541eba70bcf9c736267de9efae916d5
Reviewed-on: https://go-review.googlesource.com/c/go/+/354072
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2021-10-05 20:35:48 +00:00
Josh Bleecher Snyder 75773b0e7b runtime: add BenchmarkStackCopyWithStkobj
For benchmarking and improving recent stkobj-related changes.

Co-Authored-By: Cherry Mui <cherryyz@google.com>
Change-Id: I34c8b1a09e4cf98547460882b0d3908158269f57
Reviewed-on: https://go-review.googlesource.com/c/go/+/354071
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2021-10-05 20:35:41 +00:00
Josh Bleecher Snyder 3100dc1a7f cmd/link,runtime: remove relocations from stkobjs
Use an offset from go.func.* instead.
This removes the last relocation from funcdata symbols,
which lets us simplify that code.

size      before    after     Δ       %
addr2line 3683218   3680706   -2512   -0.068%
api       4951074   4944850   -6224   -0.126%
asm       4744258   4757586   +13328  +0.281%
buildid   2419986   2418546   -1440   -0.060%
cgo       4218306   4197346   -20960  -0.497%
compile   22132066  22076882  -55184  -0.249%
cover     4432834   4411362   -21472  -0.484%
dist      3111202   3091346   -19856  -0.638%
doc       3583602   3563234   -20368  -0.568%
fix       3023922   3020658   -3264   -0.108%
link      6188034   6164642   -23392  -0.378%
nm        3665826   3646818   -19008  -0.519%
objdump   4015234   4012450   -2784   -0.069%
pack      2155010   2153554   -1456   -0.068%
pprof     13044178  13011522  -32656  -0.250%
test2json 2402146   2383906   -18240  -0.759%
trace     9765410   9736514   -28896  -0.296%
vet       6681250   6655058   -26192  -0.392%
total     104217556 103926980 -290576 -0.279%

relocs    before  after   Δ       %
addr2line 25563   25066   -497    -1.944%
api       18409   17176   -1233   -6.698%
asm       18903   18271   -632    -3.343%
buildid   9513    9233    -280    -2.943%
cgo       17103   16222   -881    -5.151%
compile   64825   60421   -4404   -6.794%
cover     19464   18479   -985    -5.061%
dist      10798   10135   -663    -6.140%
doc       13503   12735   -768    -5.688%
fix       11465   10820   -645    -5.626%
link      23214   21849   -1365   -5.880%
nm        25480   24987   -493    -1.935%
objdump   26610   26057   -553    -2.078%
pack      7951    7665    -286    -3.597%
pprof     63964   60761   -3203   -5.008%
test2json 8735    8389    -346    -3.961%
trace     39639   37180   -2459   -6.203%
vet       25970   24044   -1926   -7.416%
total     431108  409489  -21619  -5.015%

Change-Id: I43c26196a008da6d1cb3a782eea2f428778bd569
Reviewed-on: https://go-review.googlesource.com/c/go/+/353138
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2021-10-05 19:16:08 +00:00
Josh Bleecher Snyder 017ffcd10d cmd/link, runtime: convert FUNCDATA relocations to offsets
Every function has associated numbered extra funcdata to another symbol.
Prior to this change, a funcdata pointer was stored as a relocation.

This change alters this to be an offset relative to go.func.* or go.funcrel.*.

This reduces the number of relocations on darwin/arm64 by about 40%.
It also shrinks externally linked binaries. On darwin/arm64:

size      before    after     Δ        %
addr2line 3788498   3699730   -88768   -2.343%
api       5100018   4951074   -148944  -2.920%
asm       4855234   4744274   -110960  -2.285%
buildid   2500162   2419986   -80176   -3.207%
cgo       4338258   4218306   -119952  -2.765%
compile   22764418  22132226  -632192  -2.777%
cover     4583186   4432770   -150416  -3.282%
dist      3200962   3094626   -106336  -3.322%
doc       3680402   3583602   -96800   -2.630%
fix       3114914   3023922   -90992   -2.921%
link      6308578   6154786   -153792  -2.438%
nm        3754338   3665826   -88512   -2.358%
objdump   4124738   4015234   -109504  -2.655%
pack      2232626   2155010   -77616   -3.476%
pprof     13497474  13044066  -453408  -3.359%
test2json 2483810   2402146   -81664   -3.288%
trace     10108898  9748802   -360096  -3.562%
vet       6884322   6681314   -203008  -2.949%
total     107320836 104167700 -3153136 -2.938%

relocs    before  after   Δ       %
addr2line 33357   25563   -7794   -23.365%
api       31589   18409   -13180  -41.723%
asm       27825   18904   -8921   -32.061%
buildid   15603   9513    -6090   -39.031%
cgo       27809   17103   -10706  -38.498%
compile   114769  64829   -49940  -43.513%
cover     32932   19462   -13470  -40.902%
dist      18797   10796   -8001   -42.565%
doc       22891   13503   -9388   -41.012%
fix       19700   11465   -8235   -41.802%
link      37324   23198   -14126  -37.847%
nm        33226   25480   -7746   -23.313%
objdump   35237   26610   -8627   -24.483%
pack      13535   7951    -5584   -41.256%
pprof     97986   63961   -34025  -34.724%
test2json 15113   8735    -6378   -42.202%
trace     66786   39636   -27150  -40.652%
vet       43328   25971   -17357  -40.060%
total     687806  431088  -256718 -37.324%

It should also incrementally speed up binary launching
and may reduce linker memory use.

This is another step towards removing relocations so
that pages that were previously dirtied by the loader may remain clean,
which will offer memory savings useful in constrained environments like iOS.

Removing the relocations in .stkobj symbols will allow some simplifications.
There will be no references into go.funcrel.*,
so we will no longer need to use the bottom bit to distinguish offset bases.

Change-Id: I83d34c1701d6f3f515b9905941477d522441019d
Reviewed-on: https://go-review.googlesource.com/c/go/+/352110
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2021-10-05 15:47:57 +00:00
Lynn Boger 7983830423 runtime: add ABIInternal to strhash and memhash on ppc64x
In testing the register ABI changes I found that the benchmarks
for strhash and memhash degraded unless I marked them as
ABIInternal. This fixes that.

Change-Id: I9c7a04eaa6a66b888877f43454c51277c07e638a
Reviewed-on: https://go-review.googlesource.com/c/go/+/353832
Reviewed-by: Cherry Mui <cherryyz@google.com>
Trust: Lynn Boger <laboger@linux.vnet.ibm.com>
2021-10-05 14:12:44 +00:00
Josh Bleecher Snyder c2483a5c03 cmd, runtime: eliminate runtime.no_pointers_stackmap
runtime.no_pointers_stackmap is an odd beast.
It is defined in a Go file, populated by assembly,
used by the GC, and its address is magic used
by async pre-emption to ascertain whether a
routine was implemented in assembly.

A subsequent change will force all GC data into the go.func.* linker symbol.
runtime.no_pointers_stackmap is GC data, so it must go there.
Yet it also needs to go into rodata, for the runtime address trick.

This change eliminates it entirely.

Replace the runtime address check with the newly introduced asm funcflag.

Handle the assembly macro as magic, similarly to our handling of go_args_stackmap.
This allows the no_pointers_stackmap to be identical in all ways
to other gclocals stackmaps, including content-addressability.

Change-Id: Id2f20a262cfab0719beb88e6342984ec4b196268
Reviewed-on: https://go-review.googlesource.com/c/go/+/353672
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2021-10-04 22:45:17 +00:00
Josh Bleecher Snyder 752cc07c77 cmd, runtime: mark assembly routines in FuncFlags
There's no good way to ascertain at runtime whether
a function was implemented in assembly.
The existing workaround doesn't play nicely
with some upcoming linker changes.

This change introduces an explicit marker for routines
implemented in assembly.

This change doesn't use the new bit anywhere,
it only introduces it.

Change-Id: I4051dc0afc15b260724a04b9d18aeeb94911bb29
Reviewed-on: https://go-review.googlesource.com/c/go/+/353671
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2021-10-04 22:27:40 +00:00
Michael Pratt 7ee4c16654 Revert "runtime: add padding to Linux kernel structures"
This reverts commit f0db7eae74.

Reason for revert: Breaks linux-386 tests

Change-Id: Ia51fbf97460ab52920b67d6db6177ac2d6b0058e
Reviewed-on: https://go-review.googlesource.com/c/go/+/353432
Trust: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
2021-10-04 22:08:53 +00:00
Rhys Hiltner f0db7eae74 runtime: add padding to Linux kernel structures
Go exchanges siginfo and sigevent structures with the kernel. They
contain unions, but Go's use is limited to the first few fields. Pad out
the rest so the size Go sees is the same as what the Linux kernel sees.

This is a follow-up to CL 342052 which added the sigevent struct without
padding. It updates the siginfo struct as well so there are no bad
examples in the defs_linux_*.go files.

Change-Id: Id991d4a57826677dd7e6cc30ad113fa3b321cddf
Reviewed-on: https://go-review.googlesource.com/c/go/+/353136
Run-TryBot: Rhys Hiltner <rhys@justin.tv>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Trust: Michael Knyszek <mknyszek@google.com>
2021-10-04 21:10:16 +00:00
Changkun Ou 205640ed7b runtime: avoid run TestSyscallN in parallel
Fixes #48012

Change-Id: Ie27eb864ac387ecf5155a3aefa81661f1448ace5
Reviewed-on: https://go-review.googlesource.com/c/go/+/345670
Trust: Bryan C. Mills <bcmills@google.com>
Trust: Heschi Kreinick <heschi@google.com>
Run-TryBot: Bryan C. Mills <bcmills@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
2021-09-30 18:10:18 +00:00
Josh Bleecher Snyder d3ad216f8e cmd/link, runtime: use offset for _func.entry
The first field of the func data stored by the linker is the
entry PC for the function. Prior to this change, this was stored
as a relocation to the function. Change this to be an offset
relative to runtime.text.

This reduces the number of relocations on darwin/arm64 by about 10%.
It also slightly shrinks binaries:

file      before    after     Δ       %
addr2line 3803058   3791298   -11760  -0.309%
api       5140114   5104242   -35872  -0.698%
asm       4886850   4840626   -46224  -0.946%
buildid   2512466   2503042   -9424   -0.375%
cgo       4374770   4342274   -32496  -0.743%
compile   22920530  22769202  -151328 -0.660%
cover     4624626   4588242   -36384  -0.787%
dist      3217570   3205522   -12048  -0.374%
doc       3715026   3684498   -30528  -0.822%
fix       3148226   3119266   -28960  -0.920%
link      6350226   6313362   -36864  -0.581%
nm        3768850   3757106   -11744  -0.312%
objdump   4140594   4127618   -12976  -0.313%
pack      2227474   2218818   -8656   -0.389%
pprof     13598706  13506786  -91920  -0.676%
test2json 2497234   2487426   -9808   -0.393%
trace     10198066  10118498  -79568  -0.780%
vet       6930658   6889074   -41584  -0.600%
total     108055044 107366900 -688144 -0.637%

It should also incrementally speed up binary launching.

This is the first step towards removing enough relocations
that pages that were previously dirtied by the loader may remain clean,
which will offer memory savings useful in constrained environments.

Change-Id: Icfba55e696ba2f9c99c4f179125ba5a3ba4369c9
Reviewed-on: https://go-review.googlesource.com/c/go/+/351463
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2021-09-29 22:14:22 +00:00
Josh Bleecher Snyder 88ea8a5fe0 runtime: extract text address calculation into a separate method
Pure code movement.

Change-Id: I7216e50fe14afa3d19c5047c92e515c90838f834
Reviewed-on: https://go-review.googlesource.com/c/go/+/353129
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Trust: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2021-09-29 22:12:05 +00:00
Josh Bleecher Snyder a9d5ea650b runtime: use unsafe.Slice in getStackMap
It's not less code, but it is clearer code.

Change-Id: I32239baf92487a56900a4edd8a2593014f37d093
Reviewed-on: https://go-review.googlesource.com/c/go/+/352953
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Trust: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2021-09-29 17:50:31 +00:00
Michael Pratt aeb4fbabc0 runtime: drop nowritebarrier from gcParkAssist
Nothing in this function is at odds with having write barriers. It
originally inherited the annotation from gcAssistAlloc
http://golang.org/cl/30700, which subsequently dropped the annotation in
http://golang.org/cl/32431 as it was unnecessary.

Change-Id: Ie464e6b4ed957f57e922ec043728ff4e15bf35ad
Reviewed-on: https://go-review.googlesource.com/c/go/+/352811
Trust: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2021-09-29 15:23:27 +00:00
Lynn Boger daec057602 runtime: port memmove, memclr to register ABI on ppc64x
This allows memmove and memclr to be invoked using the new
register ABI on ppc64x.

Change-Id: Ie397a942d7ebf76f62896924c3bb5b3a3dbba73e
Reviewed-on: https://go-review.googlesource.com/c/go/+/352891
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Trust: Lynn Boger <laboger@linux.vnet.ibm.com>
2021-09-28 20:40:48 +00:00
Lynn Boger 9c43872bd8 reflect,runtime: add reflect support for regabi on PPC64
This adds the regabi support needed for reflect including:
- implementation of the makeFuncSub and methodValueCall for
reflect
- implementations of archFloat32FromReg and archFloat32ToReg
needed for PPC64 due to differences in the way float32 are
represented in registers as compared to other platforms
- change needed to stack.go due to the functions that are
changed above

Change-Id: Ida40d831370e39b91711ccb9616492b7fad3debf
Reviewed-on: https://go-review.googlesource.com/c/go/+/352429
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Lynn Boger <laboger@linux.vnet.ibm.com>
2021-09-28 18:58:50 +00:00
Leonard Wang 84ba117fd7 runtime: add mp parameter for getMCache
Since all callers of getMCache appear to have mp available,
we pass the mp to getMCache, and reduce one call to getg.
And after modification, getMCache is also inlined.

Change-Id: Ib7880c118336acc026ecd7c60c5a88722c3ddfc7
Reviewed-on: https://go-review.googlesource.com/c/go/+/349329
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Trust: Michael Knyszek <mknyszek@google.com>
Trust: Carlos Amedee <carlos@golang.org>
2021-09-28 18:43:19 +00:00
Josh Bleecher Snyder cd4d59232e runtime: fix and simplify printing on bad ftab
Unilaterally print plugin.
Use println instead of print.

Change-Id: Ib58f187bff9c3dbedfa2725c44754a222807cc36
Reviewed-on: https://go-review.googlesource.com/c/go/+/352072
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2021-09-27 22:22:35 +00:00
Cherry Mui 7887313879 runtime/pprof: skip TestTimeVDSO on Android
The test is flaky on Android. VDSO may not be enabled so it may
not have the original problem anyway.

Change-Id: I73c2902c682a44d893e0d4e34f006c2377ef8816
Reviewed-on: https://go-review.googlesource.com/c/go/+/352509
Trust: Cherry Mui <cherryyz@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
2021-09-27 21:53:31 +00:00
Josh Bleecher Snyder 8e34d77957 runtime, cmd/link: minor cleanup
Fix some comments.
Adjust capitalization for initialisms.
Use a println directly instead of emulating it.

Change-Id: I0d8fa0eb39547e2db8113fd0358136285b86f16a
Reviewed-on: https://go-review.googlesource.com/c/go/+/351462
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
2021-09-27 20:59:14 +00:00
Josh Bleecher Snyder f0c79caa13 runtime: move entry method from _func to funcInfo
This will be required when we change from storing entry PCs in _func
to entry PC offsets, which are relative to the containing module.

Notably, almost all uses of the entry method were already called
on a funcInfo. Only Func.Entry incurs the additional module
lookup cost.

This makes Entry considerably slower, but it is probably
still fast enough in absolute terms that it is OK.

name             old time/op  new time/op  delta
Func/Name-8      8.86ns ± 0%  8.33ns ± 2%    -5.92%  (p=0.000 n=12+13)
Func/Entry-8     0.64ns ± 0%  2.62ns ±36%  +310.07%  (p=0.000 n=14+15)
Func/FileLine-8  24.5ns ± 0%  25.0ns ± 4%    +2.21%  (p=0.015 n=14+13)

Change-Id: Ia2d5de5f2f83fab334f1875452b9e8e87651d340
Reviewed-on: https://go-review.googlesource.com/c/go/+/351461
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2021-09-27 20:59:03 +00:00
Josh Bleecher Snyder 61a0a70113 runtime: convert _func.entry to a method
A subsequent change will alter the semantics of _func.entry.
To make that change obvious and clear, change _func.entry to a method,
and rename the field to _func.entryPC.

Change-Id: I05d66b54d06c5956d4537b0729ddf4290c3e2635
Reviewed-on: https://go-review.googlesource.com/c/go/+/351460
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2021-09-27 20:58:49 +00:00
Josh Bleecher Snyder 6c163e5ac9 runtime: change funcinl sentinel value from 0 to ^0
_func and funcinl are type-punned.
We distinguish them at runtime by inspecting the first word.

Prior to this change, we used 0 as the sentinel value
that means that a Func is a funcinl.
That worked because _func's first word is the functions' entry PC,
and 0 is not a valid PC. I plan to make *_func's entry PC relative
to the containing module. As a result, 0 will be a valid value,
for the first function in the module.

Switch to ^0 as the new sentinel value, which is neither a valid
entry PC nor a valid PC offset.

Change-Id: I4c718523a083ed6edd57767c3548640681993522
Reviewed-on: https://go-review.googlesource.com/c/go/+/351459
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2021-09-27 20:58:24 +00:00
Josh Bleecher Snyder e54843f2f4 runtime: look up funcInfo by func pointer
runtime.Func.{Name,FileLine} need to be able to
go from a *_func to a funcInfo. The missing bit of
information is what module contains that *_func.

The existing implementation looked up the module
using the *_func's entry PC. A subsequent change will
store *_func's entry PC relative to the containing module.
Change the module lookup to instead for the module
whose pclntable contains the *_func,
cutting all dependencies on the contents of the *_func.

Change-Id: I2dbbfec043ebc2e9a6ef19bbdec623ac84353b10
Reviewed-on: https://go-review.googlesource.com/c/go/+/351458
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2021-09-27 20:58:06 +00:00
Josh Bleecher Snyder f961d8e5b1 runtime: add Func method benchmarks
Change-Id: Ib76872c22b1be9e611199b84fd96b59beedf786c
Reviewed-on: https://go-review.googlesource.com/c/go/+/351457
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2021-09-27 20:56:37 +00:00
Keith Randall 2dda92ff6f runtime: make slice growth formula a bit smoother
Instead of growing 2x for < 1024 elements and 1.25x for >= 1024 elements,
use a somewhat smoother formula for the growth factor. Start reducing
the growth factor after 256 elements, but slowly.

starting cap    growth factor
256             2.0
512             1.63
1024            1.44
2048            1.35
4096            1.30

(Note that the real growth factor, both before and now, is somewhat
larger because we round up to the next size class.)

This CL also makes the growth monotonic (larger initial capacities
make larger final capacities, which was not true before). See discussion
at https://groups.google.com/g/golang-nuts/c/UaVlMQ8Nz3o

256 was chosen as the threshold to roughly match the total number of
reallocations when appending to eventually make a very large
slice. (We allocate smaller when appending to capacities [256,1024]
and larger with capacities [1024,...]).

Change-Id: I876df09fdc9ae911bb94e41cb62675229cb10512
Reviewed-on: https://go-review.googlesource.com/c/go/+/347917
Trust: Keith Randall <khr@golang.org>
Trust: Martin Möhrmann <martin@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Martin Möhrmann <martin@golang.org>
2021-09-27 20:53:51 +00:00
Rhys Hiltner 8d09f7c517 runtime: use per-thread profiler for SetCgoTraceback platforms
Updates #35057

Change-Id: I61d772a2cbfb27540fb70c14676c68593076ca94
Reviewed-on: https://go-review.googlesource.com/c/go/+/342054
Run-TryBot: Rhys Hiltner <rhys@justin.tv>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Trust: Michael Knyszek <mknyszek@google.com>
2021-09-27 18:58:41 +00:00
Rhys Hiltner 5b90958084 runtime: move sigprofNonGo
The sigprofNonGo and sigprofNonGoPC functions are only used on unix-like
platforms. In preparation for unix-specific changes to sigprofNonGo,
move it (plus its close relative) to a unix-specific file.

Updates #35057

Change-Id: I9c814127c58612ea9a9fbd28a992b04ace5c604d
Reviewed-on: https://go-review.googlesource.com/c/go/+/351790
Run-TryBot: Rhys Hiltner <rhys@justin.tv>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Trust: David Chase <drchase@google.com>
2021-09-27 18:58:36 +00:00
Rhys Hiltner 8cfd8c3db8 runtime: profile with per-thread timers on Linux
Using setitimer on Linux to request SIGPROF signal deliveries in
proportion to the process's on-CPU time results in under-reporting when
the program uses several goroutines in parallel. Linux calculates the
process's total CPU spend on a regular basis (often every 4ms); if the
process has spent enough CPU time since the last calculation to warrant
more than one SIGPROF (usually 10ms for the default sample rate of 100
Hz), the kernel is often able to deliver only one of them. With these
common settings, that results in Go CPU profiles being attenuated for
programs that use more than 2.5 goroutines in parallel.

To avoid in effect overflowing the kernel's process-wide CPU counter,
and relying on Linux's typical behavior of having the active thread
handle the resulting process-targeted signal, use timer_create to
request a timer for each OS thread that the Go runtime manages. Have
each timer track the CPU time of a single thread, with the resulting
SIGPROF going directly to that thread.

To continue tracking CPU time spent on threads that don't interact with
the Go runtime (such as those created and used in cgo), keep using
setitimer in addition to the new mechanism. When a SIGPROF signal
arrives, check whether it's due to setitimer or timer_create and filter
as appropriate: If the thread is known to Go (has an M) and has a
timer_create timer, ignore SIGPROF signals from setitimer. If the thread
is not known to Go (does not have an M), ignore SIGPROF signals that are
not from setitimer.

Counteract the new bias that per-thread profiling adds against
short-lived threads (or those that are only active on occasion for a
short time, such as garbage collection workers on mostly-idle systems)
by configuring the timers' initial trigger to be from a uniform random
distribution between "immediate trigger" and the full requested sample
period.

Updates #35057

Change-Id: Iab753c4e5101bdc09ef9132eec84a75478e05579
Reviewed-on: https://go-review.googlesource.com/c/go/+/324129
Run-TryBot: Rhys Hiltner <rhys@justin.tv>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: David Chase <drchase@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2021-09-27 18:58:29 +00:00
Rhys Hiltner f9e90f7e74 runtime: allow per-OS changes to unix profiler
Updates #35057

Change-Id: I56ea8f4750022847f0866c85e237a2cea40e0ff7
Reviewed-on: https://go-review.googlesource.com/c/go/+/342053
Run-TryBot: Rhys Hiltner <rhys@justin.tv>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Trust: Michael Knyszek <mknyszek@google.com>
2021-09-27 18:58:20 +00:00
Rhys Hiltner 3d795ea798 runtime: add timer_create syscalls for Linux
Updates #35057

Change-Id: Id702b502fa4e4005ba1e450a945bc4420a8a8b8c
Reviewed-on: https://go-review.googlesource.com/c/go/+/342052
Run-TryBot: Rhys Hiltner <rhys@justin.tv>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Trust: Than McIntosh <thanm@google.com>
2021-09-27 18:57:20 +00:00
Lynn Boger 40fce515f9 runtime: mark race functions as ABIInternal
This adds ABIInternal to the race function declarations.

Change-Id: I99f8a310972ff09b4d56eedbcc6e9609bab0f224
Reviewed-on: https://go-review.googlesource.com/c/go/+/352369
Trust: Lynn Boger <laboger@linux.vnet.ibm.com>
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2021-09-27 18:23:35 +00:00
Lynn Boger 6e5dd0b59b runtime: add runtime changes for register ABI on ppc64x
This adds the changes for the register ABI in the runtime
functions for ppc64x:
- Add spill functions used by runtime
- Add ABIInternal to functions

Some changes were needed to the stubs files
due to vet issues when compiling for linux/ppc64.

Change-Id: I010ddbc774ed4f22e1f9d77833bd55b919d95c99
Reviewed-on: https://go-review.googlesource.com/c/go/+/351590
Trust: Lynn Boger <laboger@linux.vnet.ibm.com>
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2021-09-27 11:52:07 +00:00
Cherry Mui 217507eb03 runtime: set vdsoSP to caller's SP consistently
m.vdsoSP should be set to the SP of the caller of nanotime1,
instead of the SP of nanotime1 itself, which matches m.vdsoPC.
Otherwise the unmatched vdsoPC and vdsoSP would make the stack
trace look like recursive.

We already do it correctly on AMD64, 386, and RISCV64. This CL
fixes the rest.

Fixes #47324.

Change-Id: I98b6fcfbe9fc6bdd28b8fe2a1299b7c505371dd4
Reviewed-on: https://go-review.googlesource.com/c/go/+/337590
Trust: Cherry Mui <cherryyz@google.com>
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-09-24 14:52:47 +00:00
Meng Zhuo 483533df9e runtime: using wyrand for fastrand
For modern 64-bit CPU architecture multiplier is faster than xorshift

darwin/amd64
name              old time/op  new time/op  delta
Fastrand          2.13ns ± 1%  1.78ns ± 1%  -16.47%  (p=0.000 n=9+10)
FastrandHashiter  32.5ns ± 4%  32.1ns ± 3%     ~     (p=0.277 n=8+9)
Fastrandn/2       2.16ns ± 1%  1.99ns ± 1%   -7.53%  (p=0.000 n=10+10)
Fastrandn/3       2.13ns ± 3%  2.00ns ± 1%   -5.88%  (p=0.000 n=10+10)
Fastrandn/4       2.08ns ± 2%  1.98ns ± 2%   -4.71%  (p=0.000 n=10+10)
Fastrandn/5       2.08ns ± 2%  1.98ns ± 1%   -4.90%  (p=0.000 n=10+9)

linux/mips64le
name              old time/op  new time/op  delta
Fastrand          12.1ns ± 0%  10.8ns ± 1%  -10.58%  (p=0.000 n=8+10)
FastrandHashiter   105ns ± 1%   105ns ± 1%     ~     (p=0.138 n=10+10)
Fastrandn/2       16.9ns ± 0%  16.4ns ± 4%   -2.84%  (p=0.020 n=10+10)
Fastrandn/3       16.9ns ± 0%  16.4ns ± 3%   -3.03%  (p=0.000 n=10+10)
Fastrandn/4       16.9ns ± 0%  16.5ns ± 2%   -2.01%  (p=0.002 n=8+10)
Fastrandn/5       16.9ns ± 0%  16.4ns ± 3%   -2.70%  (p=0.000 n=8+10)

linux/riscv64
name              old time/op  new time/op  delta
Fastrand          22.7ns ± 0%  12.7ns ±19%  -44.09%  (p=0.000 n=9+10)
FastrandHashiter   255ns ± 4%   250ns ± 7%     ~     (p=0.363 n=10+10)
Fastrandn/2       31.8ns ± 2%  28.5ns ±13%  -10.45%  (p=0.000 n=10+10)
Fastrandn/3       33.0ns ± 2%  27.4ns ± 8%  -17.16%  (p=0.000 n=9+10)
Fastrandn/4       29.6ns ± 3%  28.2ns ± 5%   -4.81%  (p=0.000 n=8+9)
Fastrandn/5       33.4ns ± 3%  26.5ns ± 9%  -20.49%  (p=0.000 n=8+10)

Change-Id: I88ac69625ef923f8be66647e3361e3be986de002
Reviewed-on: https://go-review.googlesource.com/c/go/+/337350
Trust: Meng Zhuo <mzh@golangcn.org>
Run-TryBot: Meng Zhuo <mzh@golangcn.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2021-09-24 01:25:24 +00:00
Lynn Boger 24c2ee7b65 cmd/compile: enable reg args and add duffcopy support on ppc64x
This adds support for duffcopy on ppc64x and updates the
ssa/config.go file to enable register args and recognize
the duffDevice is available on ppc64x.

Change-Id: Ifc472cc9cc19c9a80e468fb52078c75f7dd44d36
Reviewed-on: https://go-review.googlesource.com/c/go/+/351490
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2021-09-23 15:51:39 +00:00
Cherry Mui 91c2318e67 runtime: call cgocallbackg indirectly on PPC64
This is CL 312669, for PPC64.

cgocallback calls cgocallbackg after switching the stack. Call it
indirectly to bypass the linker's nosplit check. The nosplit check
fails after CL 351271, which removes ABI aliases. It would have
been failing before but the linker's nosplit check didn't resolve
ABI alias (it should) so it didn't catch that. Removing the ABI
aliases exposes it. For this partuclar case it is benign as there
is actually a stack switch in between.

Should fix PPC64 build.

Change-Id: I49617aea55270663a9ee4692c54c070c5ab85470
Reviewed-on: https://go-review.googlesource.com/c/go/+/351469
Trust: Cherry Mui <cherryyz@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2021-09-22 15:00:53 +00:00
Josh Bleecher Snyder 13aa0d8f57 runtime: fix output for bad pcHeader
With print, the output all runs together.
Take this opportunity to clean up and label all the fields.
Print pluginpath unilaterally; no reason not to.
Wrap long lines. Remove pointless newline from throw.

Change-Id: I37af15dc8fcb3dbdbc6da8bbea2c0ceaf7b5b889
Reviewed-on: https://go-review.googlesource.com/c/go/+/350734
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2021-09-21 16:34:59 +00:00