Instead of always allocating variable-sized "make" calls on the heap,
allocate a small, constant-sized array on the stack and use that array
as the backing store if it is big enough.
Requires the result of the "make" doesn't escape.
if cap <= K {
var arr [K]E
slice = arr[:len:cap]
} else {
slice = makeslice(E, len, cap)
}
Pretty conservatively for now, K = 32/sizeof(E). The slice header is
already 24 bytes, so wasting 32 bytes of stack if the requested size
is too big isn't that bad. Larger would waste more stack space but
maybe avoid more allocations.
This CL also requires the element type be pointer-free. Maybe we
could relax that at some point, but it is hard. If the element type
has pointers we can get heap->stack pointers (in the case where the
requested size is too big and the slice is heap allocated).
Note that this only handles the case of makeslice called directly from
compiler-generated code. It does not handle slices built in the
runtime on behalf of the program (e.g. in growslice). Some of those
are currently handled by passing in a tmpBuf (e.g. concatstrings),
but we could probably do more.
Change-Id: I8378efad527cd00d25948a80b82a68d88fbd93a1
Reviewed-on: https://go-review.googlesource.com/c/go/+/653856
Reviewed-by: Robert Griesemer <gri@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Improve the compiler's store-to-load forwarding optimization by relaxing the
type comparison condition. Instead of requiring exact type equality (CMPeq),
we now use copyCompatibleType which allows forwarding between compatible
types where safe.
Fix several size comparison bugs in the nested store patterns. Previously,
we were comparing the size of the outer store with the load type,
rather than comparing with the size of the actual store being forwarded
from.
Skip OpConvert in dead store elimination to help get rid of dead stores such
as zeroing slices. OpConvert, like OpInlMark, doesn't really use the memory.
This optimization is particularly beneficial for code that creates slices with
computed pointers, such as the runtime's heapBitsSlice function, where
intermediate calculations were previously causing the compiler to miss
store-to-load forwarding opportunities.
Local sweet run result on an x86_64 laptop:
│ Orig.res │ Hopt.res │
│ sec/op │ sec/op vs base │
BiogoIgor-8 5.303 ± 1% 5.322 ± 1% ~ (p=0.190 n=10)
BiogoKrishna-8 7.894 ± 1% 7.828 ± 2% ~ (p=0.190 n=10)
BleveIndexBatch100-8 2.257 ± 1% 2.248 ± 2% ~ (p=0.529 n=10)
EtcdPut-8 30.12m ± 1% 30.03m ± 1% ~ (p=0.796 n=10)
EtcdSTM-8 127.1m ± 1% 126.2m ± 0% -0.74% (p=0.023 n=10)
GoBuildKubelet-8 52.21 ± 0% 52.05 ± 1% ~ (p=0.063 n=10)
GoBuildKubeletLink-8 4.342 ± 1% 4.305 ± 0% -0.85% (p=0.000 n=10)
GoBuildIstioctl-8 43.33 ± 0% 43.24 ± 0% -0.22% (p=0.015 n=10)
GoBuildIstioctlLink-8 4.604 ± 1% 4.598 ± 0% ~ (p=0.063 n=10)
GoBuildFrontend-8 15.33 ± 0% 15.29 ± 0% ~ (p=0.143 n=10)
GoBuildFrontendLink-8 740.0m ± 1% 737.7m ± 1% ~ (p=0.912 n=10)
GopherLuaKNucleotide-8 9.590 ± 1% 9.656 ± 1% ~ (p=0.165 n=10)
MarkdownRenderXHTML-8 96.97m ± 1% 97.26m ± 2% ~ (p=0.105 n=10)
Tile38QueryLoad-8 335.9µ ± 1% 335.6µ ± 1% ~ (p=0.481 n=10)
geomean 1.336 1.333 -0.22%
Change-Id: I031552623e6d5a3b1b5be8325e6314706e45534f
Reviewed-on: https://go-review.googlesource.com/c/go/+/662075
Reviewed-by: Carlos Amedee <carlos@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Carlos Amedee <carlos@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
execIO has multiple return paths and multiple places where error is
mangled. This CL simplifies the function by just having one return
path.
Some more tests have been added to ensure that the error handling
is done correctly.
Updates #19098.
Change-Id: Ida0b1e85d4d123914054306e5bef8da94408b91c
Reviewed-on: https://go-review.googlesource.com/c/go/+/662215
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Damien Neil <dneil@google.com>
Auto-Submit: Carlos Amedee <carlos@golang.org>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Add a regression test similar to the reproducer from #73141 to try to
help catch future issues with vgetrandom and thread exit. Though the
test isn't very precise, it just hammers thread exit.
When the test reproduces #73141, it simply crashes with a SIGSEGV and no
output or stack trace, which would be very unfortunate on a builder.
https://go.dev/issue/49165 tracks collecting core dumps from builders,
which would make this more tractable to debug.
For #73141.
Change-Id: I6a6a636c7d7b41e2729ff6ceb30fd7f979aa9978
Reviewed-on: https://go-review.googlesource.com/c/go/+/662636
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
This change borrows code from CL 631356 by Emmanuel Odeke (thanks!).
Fixes#70549.
Change-Id: Id6f794ea2a95b4297999456f22c6e02890fce13b
Reviewed-on: https://go-review.googlesource.com/c/go/+/662775
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Reviewed-by: Mark Freeman <mark@golang.org>
As discussed in #73137, we want to clarify the description of how
B.Loop avoids surprising optimizations, while also hinting that
the exact approach might change in the future.
Updates #73137
Change-Id: I8536540cd5d79804a47fba8cd6eec3821864309d
Reviewed-on: https://go-review.googlesource.com/c/go/+/662356
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Auto-Submit: Alan Donovan <adonovan@google.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
The unsafe.Pointer -> uintptr conversion must happen when calling
syscall.Syscall, not when calling the auto-generated wrapper function,
else the Go compiler doesn't know that it has to keep the pointer alive.
This can cause undefined behavior and stack corruption.
Fixes#73135.
Fixes#73112 (potentially).
Fixes#73128 (potentially).
Cq-Include-Trybots: luci.golang.try:gotip-windows-amd64-race
Change-Id: Ib3ad8b99618d8997bfd0742c0e44aeda696856c4
Reviewed-on: https://go-review.googlesource.com/c/go/+/662575
Reviewed-by: Alan Donovan <adonovan@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Auto-Submit: Carlos Amedee <carlos@golang.org>
This patch adds linker option -funcalign=N that allows to set alignment
for function entries.
This CL is based on vasiliy.leonenko@gmail.com's cl/615736.
For #72130
Change-Id: I57e5c9c4c71a989533643fda63a9a79c5c897dea
Reviewed-on: https://go-review.googlesource.com/c/go/+/660996
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Due to a flaw in the %GOROOT_BOOTSTRAP% detection logic, the last Go
executable found by `where go` was taking precedence over the first one.
In batch scripts, environment variable expansion happens when each line
of the script is read, not when it is executed. Thus, the check in the
loop for GOROOT_BOOTSTRAP being unset would always be true, even when
the variable had been set in a previous loop iteration.
See SET /? for more information.
Change-Id: I15ddcbe771a902acb47a1f07ba7f4cb8a311e0dc
Reviewed-on: https://go-review.googlesource.com/c/go/+/653535
Auto-Submit: Carlos Amedee <carlos@golang.org>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Quim Muntal <quimmuntal@gmail.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
My CL 645115 added the new entries in the wrong place,
prematurely creating the go1.25 file.
Also, add the missing release note.
Change-Id: Ib5b5ccfb42757a9ea9dc93e33b3e3ed8e8bd7d3f
Reviewed-on: https://go-review.googlesource.com/c/go/+/662615
Auto-Submit: Alan Donovan <adonovan@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Return the shift in bits from movcon, rather than returning an index.
This allows a number of multiplications to be removed, making the code
more readable. Scale down to an index only when encoding.
Change-Id: I1be91eb526ad95d389e2f8ce97212311551790df
Reviewed-on: https://go-review.googlesource.com/c/go/+/650939
Auto-Submit: Joel Sing <joel@sing.id.au>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Teach conclass how to handle 32 bit values and deduplicate the code
between con32class and conclass.
Change-Id: I9c5eea31d443fd4c2ce700c6ea21e1d0bef665b0
Reviewed-on: https://go-review.googlesource.com/c/go/+/650938
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Joel Sing <joel@sing.id.au>
Reduce repetition by pulling some common conversions into variables.
Change-Id: I8c1cc806236b5ecdadf90f4507923718fa5de9b6
Reviewed-on: https://go-review.googlesource.com/c/go/+/650937
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
When an M is destroyed, we put its vgetrandom state back on the shared
list for another M to reuse. This list is simply a slice, so appending
to the slice may allocate. Currently this operation is performed in
mdestroy, after the P is released, meaning allocation is not allowed.
More the cleanup earlier in mdestroy when allocation is still OK.
Also add //go:nowritebarrierrec to mdestroy since it runs without a P,
which would have caught this bug.
Fixes#73141.
Change-Id: I6a6a636c3fbf5c6eec09d07a260e39dbb4d2db12
Reviewed-on: https://go-review.googlesource.com/c/go/+/662455
Reviewed-by: Jason Donenfeld <Jason@zx2c4.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
When both a direct call and an interface call appear on the same line,
PGO devirtualization may make a suboptimal decision. In some cases,
the directly called function becomes a candidate for devirtualization
if no other relevant outgoing edges with non-zero weight exist for the
caller's IRNode in the WeightedCG. The edge to this candidate is
considered the hottest. Despite having zero weight, this edge still
causes the interface call to be devirtualized.
This CL prevents devirtualization when the weight of the hottest edge
is 0.
Fixes#72092
Change-Id: I06c0c5e080398d86f832e09244aceaa4aeb98721
Reviewed-on: https://go-review.googlesource.com/c/go/+/655475
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Change-Id: I52e5de04d137238d6f6779edcc662f5c7433c61e
Reviewed-on: https://go-review.googlesource.com/c/go/+/660195
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Robert Griesemer <gri@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
We always need to zero allocations with pointers in them. So we don't
need some of the mallocs to take a needzero argument.
Change-Id: Ideaa7b738873ba6a93addb5169791b42e2baad7c
Reviewed-on: https://go-review.googlesource.com/c/go/+/660455
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
For consistency, prefer crypto/hkdf over crypto/internal/fips140/hkdf.
Both should have the same behavior given the constrained use of HKDF
in TLS.
Change-Id: Ia982b9f7a6ea66537d748eb5ecae1ac1eade68a5
Reviewed-on: https://go-review.googlesource.com/c/go/+/658217
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Roland Shoemaker <roland@golang.org>
Remove the 'NoInline' field from CallExpr stucture, as it's no longer
used after enabling of tail call inlining.
Change-Id: Ief3ada9938589e7a2f181582ef2758ebc4d03aad
Reviewed-on: https://go-review.googlesource.com/c/go/+/655816
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
Defer the association of the IOCP to the handle until the first
I/O operation is performed.
A handle can only be associated with one IOCP at a time, so this allows
external code to associate the handle with their own IOCP and still be
able to use a FD (through os.NewFile) to pass the handle around
(e.g. to a child process standard input, output, and error) without
having to worry about the IOCP association.
This CL doesn't change any user-visible behavior, as os.NewFile still
initializes the FD as non-pollable.
For #19098.
Change-Id: Id22a49846d4fda3a66ffcc0bc1b48eb39b395dc5
Reviewed-on: https://go-review.googlesource.com/c/go/+/661955
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
In extreme cases (e.g., ctx = nil), it is recommended to initialize the
context only once at the entry point before using log and logAttrs.
Change-Id: Ib191963f52183406d7fcd5104b60fea1a9e1bc80
GitHub-Last-Rev: e1719b9539
GitHub-Pull-Request: golang/go#73066
Reviewed-on: https://go-review.googlesource.com/c/go/+/661255
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Jonathan Amsterdam <jba@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Add support for vector fixed-point arithmetic instructions to the
RISC-V assembler. This includes single width saturating addition
and subtraction, averaging addition and subtraction and scaling
shift instructions.
Change-Id: I9aa27e9565ad016ba5bb2b479e1ba70db24e4ff5
Reviewed-on: https://go-review.googlesource.com/c/go/+/646776
Reviewed-by: Mark Ryan <markdryan@rivosinc.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Open /dev/bintime at process start on Plan 9,
marked close-on-exec, hold it open for the duration of the
process, and use it for obtaining time.
The change to using /dev/bintime also sets up for an upcoming
Plan 9 change to add monotonic time to that file. If the monotonic
field is available, then nanotime1 and time.now use that field.
Otherwise they fall back to using Unix nanoseconds as "monotonic",
as they always have.
Before this CL, monotonic time went backward any time
aux/timesync decided to adjust the system's time-of-day backward.
Also use /dev/random for randomness (once at startup).
Before this CL, there was no real randomness in the runtime
on Plan 9 (the crypto/rand package still had some). Now there will be.
Change-Id: I0c20ae79d3d96eff1a5f839a56cec5c4bc517e61
Reviewed-on: https://go-review.googlesource.com/c/go/+/656755
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Bypass: Russ Cox <rsc@golang.org>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Make the openat-using version of RemoveAll use the appropriate
Windows equivalent, via new portable (but internal) functions
added for os.Root.
We could reimplement everything in terms of os.Root,
but this is a bit simpler and keeps the existing code structure.
Fixes#52745
Change-Id: I0eba0286398b351f2ee9abaa60e1675173988787
Reviewed-on: https://go-review.googlesource.com/c/go/+/661575
Reviewed-by: Alan Donovan <adonovan@google.com>
Auto-Submit: Damien Neil <dneil@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Calling syscall.ReadFile and syscall.WriteFile on overlapped handles
always need to be passed a valid *syscall.Overlapped structure, even if
the handle is not added to a IOCP (like the Go runtime poller). Else,
the syscall will fail with ERROR_INVALID_PARAMETER.
We also need to handle ERROR_IO_PENDING errors when the overlapped
handle is not added to the poller, in which case we need to block until
the operation completes.
Previous CLs already added support for overlapped handles to the poller,
mostly to keep track of the file offset independently of the file
pointer (which is not supported for overlapped handles).
Fixed#15388.
Updates #19098.
Change-Id: I2103ab892a37d0e326752ae8c2771a43c13ba42e
Reviewed-on: https://go-review.googlesource.com/c/go/+/661795
Auto-Submit: Quim Muntal <quimmuntal@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Damien Neil <dneil@google.com>
Change-Id: I0a3ce2e823697eee5bb5e7d5ea0ef025132c0689
Reviewed-on: https://go-review.googlesource.com/c/go/+/661655
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
On master, lookups on small Swiss Table maps (<= 8 elements) for
non-specialized key types are seemingly a performance regression
compared to the Go 1.23 map implementation (reported in #70849).
Currently, a linear scan is used for gets in these cases.
This CL changes (*Map).getWithKeySmall to instead use the SIMD or SWAR
match on the control bytes to then jump to candidate matching slots,
with sample results below for a 16-byte key. This especially helps the
hit case when the key is unpredictable, which previously had to scan an
unpredictable number of control bytes to find a candidate slot when the
key is unpredictable.
Separately, other CLs in this stack modify the main Swiss Table
benchmarks to randomize lookup key order (vs. previously most of the
benchmarks had a repeating lookup key ordering, which likely is
predictable until the map is too big). We have sample results for the
randomized key order benchmarks followed by results from the older
benchmarks.
The first table below is with randomized key order. For hits, the older
results get slower as there are more elements. With this CL, we see hits
for unpredictable key ordering (sizes 2-8) get a ~1.7x speedup from
~25ns to ~14ns, with a now consistent lookup time for the different
sizes. (The 1 element size map has a predictable key ordering because
there is only one key, and that reports a modest ~0.5ns or ~3%
performance penalty). Misses for unpredictable key order get a ~1.3x
speedup, from ~13ns to ~10ns, with similar results for the 1 element
size.
│ no-fix-new-bmarks │ fix-with-new-bmarks │
│ sec/op │ sec/op vs base │
MapSmallAccessHit/Key=smallType/Elem=int32/len=1-4 13.26n ± 0% 13.64n ± 0% +2.90% (p=0.000 n=20)
MapSmallAccessHit/Key=smallType/Elem=int32/len=2-4 19.47n ± 0% 13.62n ± 0% -30.05% (p=0.000 n=20)
MapSmallAccessHit/Key=smallType/Elem=int32/len=3-4 22.23n ± 0% 13.64n ± 0% -38.68% (p=0.000 n=20)
MapSmallAccessHit/Key=smallType/Elem=int32/len=4-4 23.98n ± 0% 13.64n ± 0% -43.11% (p=0.000 n=20)
MapSmallAccessHit/Key=smallType/Elem=int32/len=5-4 25.02n ± 0% 13.67n ± 0% -45.35% (p=0.000 n=20)
MapSmallAccessHit/Key=smallType/Elem=int32/len=6-4 25.77n ± 1% 13.68n ± 2% -46.89% (p=0.000 n=20)
MapSmallAccessHit/Key=smallType/Elem=int32/len=7-4 26.38n ± 0% 13.64n ± 0% -48.28% (p=0.000 n=20)
MapSmallAccessHit/Key=smallType/Elem=int32/len=8-4 26.31n ± 0% 13.71n ± 21% -47.90% (p=0.000 n=20)
MapSmallAccessMiss/Key=smallType/Elem=int32/len=1-4 13.055n ± 0% 9.815n ± 0% -24.82% (p=0.000 n=20)
MapSmallAccessMiss/Key=smallType/Elem=int32/len=2-4 13.070n ± 0% 9.813n ± 0% -24.92% (p=0.000 n=20)
MapSmallAccessMiss/Key=smallType/Elem=int32/len=3-4 13.060n ± 0% 9.819n ± 0% -24.82% (p=0.000 n=20)
MapSmallAccessMiss/Key=smallType/Elem=int32/len=4-4 13.075n ± 0% 9.816n ± 0% -24.92% (p=0.000 n=20)
MapSmallAccessMiss/Key=smallType/Elem=int32/len=5-4 13.060n ± 0% 9.826n ± 0% -24.76% (p=0.000 n=20)
MapSmallAccessMiss/Key=smallType/Elem=int32/len=6-4 13.095n ± 19% 9.834n ± 31% -24.90% (p=0.000 n=20)
MapSmallAccessMiss/Key=smallType/Elem=int32/len=7-4 13.075n ± 19% 9.822n ± 27% -24.88% (p=0.000 n=20)
MapSmallAccessMiss/Key=smallType/Elem=int32/len=8-4 13.11n ± 16% 12.14n ± 19% -7.43% (p=0.000 n=20)
The next table uses the original benchmarks from just before this CL
stack (i.e., without shuffling lookup keys).
With this CL, we see improvement that is directionally similar to the
above results but not as large, presumably because the branches in the
linear scan are fairly predictable with predictable keys. (The numbers
here also include the time from a mod in the benchmark code, which
seemed to take around ~1/3 of CPU time based on spot checking a couple
of examples, vs. the modified benchmarks shown above have removed that
mod).
│ master-8c3e391573 │ just-fix-with-old-bmarks │
│ sec/op │ sec/op vs base │
MapSmallAccessHit/Key=smallType/Elem=int32/len=1-4 20.85n ± 0% 21.69n ± 0% +4.03% (p=0.000 n=20)
MapSmallAccessHit/Key=smallType/Elem=int32/len=2-4 21.22n ± 0% 21.70n ± 0% +2.24% (p=0.000 n=20)
MapSmallAccessHit/Key=smallType/Elem=int32/len=3-4 21.73n ± 0% 21.71n ± 0% ~ (p=0.158 n=20)
MapSmallAccessHit/Key=smallType/Elem=int32/len=4-4 22.06n ± 0% 21.71n ± 0% -1.56% (p=0.000 n=20)
MapSmallAccessHit/Key=smallType/Elem=int32/len=5-4 22.41n ± 0% 21.73n ± 0% -3.01% (p=0.000 n=20)
MapSmallAccessHit/Key=smallType/Elem=int32/len=6-4 22.71n ± 0% 21.72n ± 0% -4.38% (p=0.000 n=20)
MapSmallAccessHit/Key=smallType/Elem=int32/len=7-4 22.98n ± 0% 21.71n ± 0% -5.53% (p=0.000 n=20)
MapSmallAccessHit/Key=smallType/Elem=int32/len=8-4 23.20n ± 0% 21.72n ± 0% -6.36% (p=0.000 n=20)
MapSmallAccessMiss/Key=smallType/Elem=int32/len=1-4 19.95n ± 0% 17.30n ± 0% -13.28% (p=0.000 n=20)
MapSmallAccessMiss/Key=smallType/Elem=int32/len=2-4 19.96n ± 0% 17.31n ± 0% -13.28% (p=0.000 n=20)
MapSmallAccessMiss/Key=smallType/Elem=int32/len=3-4 19.95n ± 0% 17.29n ± 0% -13.33% (p=0.000 n=20)
MapSmallAccessMiss/Key=smallType/Elem=int32/len=4-4 19.95n ± 0% 17.30n ± 0% -13.29% (p=0.000 n=20)
MapSmallAccessMiss/Key=smallType/Elem=int32/len=5-4 19.96n ± 25% 17.32n ± 0% -13.22% (p=0.000 n=20)
MapSmallAccessMiss/Key=smallType/Elem=int32/len=6-4 19.99n ± 24% 17.29n ± 0% -13.51% (p=0.000 n=20)
MapSmallAccessMiss/Key=smallType/Elem=int32/len=7-4 19.97n ± 20% 17.34n ± 16% -13.14% (p=0.000 n=20)
MapSmallAccessMiss/Key=smallType/Elem=int32/len=8-4 20.02n ± 11% 17.33n ± 14% -13.44% (p=0.000 n=20)
geomean 21.02n 19.39n -7.78%
See #70849 for additional benchmark results, including results for arm64
(which also means without SIMD support).
Updates #54766
Updates #70700Fixes#70849
Change-Id: Ic2361bb6fc15b4436d1d1d5be7e4712e547f611b
Reviewed-on: https://go-review.googlesource.com/c/go/+/634396
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
This will allow for further improvements and deduplication.
Change-Id: I9374fc2d16168ced06f3fcc9e558a9c85e24fd01
Reviewed-on: https://go-review.googlesource.com/c/go/+/650936
Reviewed-by: Fannie Zhang <Fannie.Zhang@arm.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Add support for vector integer arithmetic instructions to the RISC-V
assembler. This includes vector addition, subtraction, integer
extension, add-with-carry, subtract-with-borrow, bitwise logical
operations, comparison, min/max, integer division and multiplication
instructions.
Change-Id: I8c191ef8e31291e13743732903e4f12356133a46
Reviewed-on: https://go-review.googlesource.com/c/go/+/646775
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Mark Ryan <markdryan@rivosinc.com>
With recent LLVM toolchain, on macOS/AMD64, the race detector syso
file built from it contains X86_64_RELOC_SUBTRACTOR relocations,
which the Go linker currently doesn't handle in internal linking
mode. To ensure internal linking mode continue to work with the
race detector syso, this CL adds support of X86_64_RELOC_SUBTRACTOR
relocations.
X86_64_RELOC_SUBTRACTOR is actually a pair of relocations that
resolves to the difference between two symbol addresses (each
relocation specifies a symbol). For the cases we care (the race
syso), the symbol being subtracted out is always in the current
section, so we can just convert it to a PC-relative relocation,
with the addend adjusted. If later we need the more general form,
we can introduce a new mechanism (say, objabi.R_DIFF) that works
as a pair of relocations like the Mach-O one.
As we expect the pair of relocations be consecutive, don't reorder
(sort) relocation records when loading Mach-O objects.
Change-Id: I757456b07270fb4b2a41fd0fef67a2b39dd6b238
Reviewed-on: https://go-review.googlesource.com/c/go/+/660715
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Than McIntosh <thanm@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
FD.Read converts a syscall.ERROR_OPERATION_ABORTED error to
ErrFileClosing. It does that in case the pipe operation was aborted by
a CancelIoEx call in FD.Close.
It doesn't take into account that the operation might have been
aborted by a CancelIoEx call in external code. In that case, the
operation should return the error as is.
Change-Id: I75dcf0edaace8b57dc47b398ea591ca9f116112b
Reviewed-on: https://go-review.googlesource.com/c/go/+/661555
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
The docs currently are imprecise about comparisons. This could lead
users to believe that objects of the same type, allocated at the same
address, could produce weak pointers that are equal to
previously-created weak pointers. This is not the case. Weak pointers
map to objects, not addresses.
Update the documentation to state precisely that if two pointers do not
compare equal, then two weak pointers created from those two pointers
are guaranteed not to compare equal. Since a future pointer pointing to
the same address is not comparable with a pointer produced *before* an
object at that address has been reclaimed, this is sufficient to explain
that weak pointers map 1:1 with object offsets, not addresses.
(An object slot cannot be reused unless that slot is unreachable, so
by construction, there's never an opportunity to compare an "old" and
"new" pointer unless one uses unsafe tricks that violate the
unsafe.Pointer rules.)
Fixes#71381.
Change-Id: I5509fd433cde013926d725694d480c697a8bc911
Reviewed-on: https://go-review.googlesource.com/c/go/+/643935
Reviewed-by: Carlos Amedee <carlos@golang.org>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
When two packages declare a variable with the same name (with
linkname at least on one side), the linker will choose one as the
actual definition of the symbol if one has content (i.e. a DATA
symbol) and the other does not (i.e. a BSS symbol). When both have
content, it is redefinition error. When neither has content,
currently the choice is sort of arbitrary (depending on symbol
loading order, etc. which are subject to change).
One use case for that is that one wants to reference a symbol
defined in another package, and the reference side just wants to
see some of the fields, so it may be declared with a smaller type.
In this case, we want to choose the one with the larger size as
the true definition. Otherwise the code accessing the larger
sized one may read/write out of bounds, corrupting the next
variable. This CL makes the linker do so.
Fixes#72032.
Change-Id: I160aa9e0234702066cb8f141c186eaa89d0fcfed
Reviewed-on: https://go-review.googlesource.com/c/go/+/660696
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Than McIntosh <thanm@golang.org>
Empty writes might be important for some protocols. Let Windows decide
what do with them rather than skipping them on our side. This is inline
with the behavior of other platforms.
While here, refactor the Read/Write/Pwrite methods to reduce one
indentation level and make the code easier to read.
Fixes#73084.
Change-Id: Ic5393358e237d53b8be6097cd7359ac0ff205309
Reviewed-on: https://go-review.googlesource.com/c/go/+/661435
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Damien Neil <dneil@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Optimise more branches with zero on riscv64. In particular, BLTU with
zero occurs with IsInBounds checks for index zero. This currently results
in two instructions and requires an additional register:
li t2, 0
bltu t2, t1, 0x174b4
This is equivalent to checking if the bounds is not equal to zero. With
this change:
bnez t1, 0x174c0
This removes more than 500 instructions from the Go binary on riscv64.
Change-Id: I6cd861d853e3ef270bd46dacecdfaa205b1c4644
Reviewed-on: https://go-review.googlesource.com/c/go/+/606715
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
We would panic when opening a symlink ending in ..,
where the symlink references the root itself.
Fixes#73081
Change-Id: I7dc3f041ca79df7942feec58c197fde6881ecae5
Reviewed-on: https://go-review.googlesource.com/c/go/+/661416
Reviewed-by: Alan Donovan <adonovan@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
All other files here use the codegen package.
Change-Id: I714162941b9fa9051dacc29643e905fe60b9304b
Reviewed-on: https://go-review.googlesource.com/c/go/+/661135
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
Also, rewrite some uses of LookupFieldOrMethod in terms of it.
+ doc, relnote
Fixes#70737
Change-Id: I58a6dd78ee78560d8b6ea2d821381960a72660ab
Reviewed-on: https://go-review.googlesource.com/c/go/+/647196
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Robert Griesemer <gri@google.com>
ssa.Sym is only implemented by *ir.Name or *obj.LSym.
Change-Id: Ia171db618abd8b438fcc2cf402f40f3fe3ec6833
Reviewed-on: https://go-review.googlesource.com/c/go/+/660995
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
CL 660595 manually edited zsyscall_windows.go, making it be out of sync
with its associated //sys directive. Longtest builders are failing
due to that.
Fixes#73069.
Change-Id: If7256ef4b831423e4925fb6e5656fe3f7ef77fea
Reviewed-on: https://go-review.googlesource.com/c/go/+/661275
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>