Commit Graph

54849 Commits

Author SHA1 Message Date
Daniel McCarney deb9a7e4ad crypto/tls: match compression method alert across versions
When a pre-TLS 1.3 server processes a client hello message that
indicates compression methods that don't include the null compression
method, send an illegal parameter alert.

Previously we did this for TLS 1.3 server handshakes only, and the
legacy TLS versions used alertHandshakeFailure for this circumstance. By
switching this to alertIllegalParameter we use a consistent alert across
all TLS versions, and can also enable the NoNullCompression-TLS12 BoGo
test we were skipping.

Updates #72006

Change-Id: I27a2cd231e4b8762b0d9e2dbd3d8ddd5b87fd5ce
Reviewed-on: https://go-review.googlesource.com/c/go/+/673736
TryBot-Bypass: Daniel McCarney <daniel@binaryparadox.net>
Reviewed-by: Roland Shoemaker <roland@golang.org>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2025-05-21 12:58:42 -07:00
Daniel McCarney cb7fe2a05c crypto/tls: delete dead code curveIDForCurve
This unexported function has no call-sites.

Change-Id: I27a2cd231e4b8762b0d9e2dbd3d8ddd5b87fd5cd
Reviewed-on: https://go-review.googlesource.com/c/go/+/673755
Reviewed-by: Roland Shoemaker <roland@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
TryBot-Bypass: Daniel McCarney <daniel@binaryparadox.net>
2025-05-21 12:58:29 -07:00
Daniel McCarney 7ba996874b crypto/tls: verify server chooses advertised curve
When a crypto/tls client using TLS < 1.3 sends supported elliptic_curves
in a client hello message the server must limit itself to choosing one
of the supported options from our message. If we process a server key
exchange message that chooses an unadvertised curve, abort the
handshake w/ an error.

Previously we would not note that the server chose a curve we didn't
include in the client hello message, and would proceed with the
handshake as long as the chosen curve was one that we've implemented.
However, RFC 8422 5.1 makes it clear this is a server acting
out-of-spec, as it says:

  If a server does not understand the Supported Elliptic Curves
  Extension, does not understand the Supported Point Formats Extension,
  or is unable to complete the ECC handshake while restricting itself
  to the enumerated curves and point formats, it MUST NOT negotiate the
  use of an ECC cipher suite.

Changing our behaviour to enforce this also allows enabling the
UnsupportedCurve BoGo test.

Updates #72006

Change-Id: I27a2cd231e4b8762b0d9e2dbd3d8ddd5b87fd5cc
Reviewed-on: https://go-review.googlesource.com/c/go/+/673735
TryBot-Bypass: Daniel McCarney <daniel@binaryparadox.net>
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Daniel McCarney <daniel@binaryparadox.net>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
Reviewed-by: Roland Shoemaker <roland@golang.org>
2025-05-21 12:56:52 -07:00
thepudds a8e0641d5b reflect: optimize IsZero with a pointer comparison to global zeroVal
Our prior CL 649078 teaches the compiler to use a pointer to
runtime.zeroVal as the data pointer for an interface in cases it where
it can see that a zero value struct or array is being used in
an interface conversion.

This applies to some uses with reflect, such as:

  s := S{}
  v := reflect.ValueOf(s)

This CL builds on that to do a cheap pointer check in reflect.IsZero
to see if the Value points to runtime.zeroVal, which means it is a zero
value.

An alternative might be to do an initial pointer check in the typ.Equal
function for types where it makes sense to do but doesn't already.

This CL gives a performance boost of -51.71% geomean for
BenchmarkZero/IsZero, with most of the impact there on
arrays of structs. (The left column is CL 649078 and the right column
is this CL).

goos: linux
goarch: amd64
pkg: reflect
cpu: Intel(R) Xeon(R) CPU @ 2.80GHz
                                         │ find-zeroVal │          check-zeroVal              │
                                         │    sec/op    │   sec/op     vs base                │
Zero/IsZero/ByteArray/size=16-4             4.171n ± 0%   3.123n ± 0%  -25.13% (p=0.000 n=20)
Zero/IsZero/ByteArray/size=64-4             3.864n ± 0%   3.129n ± 0%  -19.02% (p=0.000 n=20)
Zero/IsZero/ByteArray/size=1024-4           3.878n ± 0%   3.126n ± 0%  -19.39% (p=0.000 n=20)
Zero/IsZero/BigStruct/size=1024-4           5.061n ± 0%   3.273n ± 0%  -35.34% (p=0.000 n=20)
Zero/IsZero/SmallStruct/size=16-4           4.191n ± 0%   3.275n ± 0%  -21.87% (p=0.000 n=20)
Zero/IsZero/SmallStructArray/size=64-4      8.636n ± 0%   3.127n ± 0%  -63.79% (p=0.000 n=20)
Zero/IsZero/SmallStructArray/size=1024-4   80.055n ± 0%   3.126n ± 0%  -96.10% (p=0.000 n=20)
Zero/IsZero/Time/size=24-4                  3.865n ± 0%   3.274n ± 0%  -15.29% (p=0.000 n=20)
geomean                                     6.587n        3.181n       -51.71%

Note these are of course micro benchmarks with easily predicted
branches. The extra branch we introduce in the CL might hurt if there
was for example a tight loop where 50% of the values used the
global zeroVal and 50% didn't in a way that is not well predicted,
although if the typ.Equal for many types already does an initial
pointer check, it might not matter much.

For the older BenchmarkIsZero in reflect, this change does not help.
(The compiler does not use the global zeroVal as the data word for the
interfaces in this benchmark because values are part of a larger value
that is too big to be used in the global zeroVal, and also a piece of
the larger value is mutated and is not zero).

                              │ find-zeroVal │           check-zeroVal            │
                              │   sec/op     │   sec/op     vs base               │
IsZero/ArrayComparable-4        14.58n ± 0%    14.59n ± 0%       ~ (p=0.177 n=20)
IsZero/ArrayIncomparable-4      163.8n ± 0%    167.5n ± 0%  +2.26% (p=0.000 n=20)
IsZero/StructComparable-4       6.847n ± 0%    6.847n ± 0%       ~ (p=0.703 n=20)
IsZero/StructIncomparable-4     35.41n ± 0%    35.10n ± 0%  -0.86% (p=0.000 n=20)
IsZero/ArrayInt_4-4             8.631n ± 0%    8.363n ± 0%  -3.10% (p=0.000 n=20)
IsZero/ArrayInt_1024-4          265.5n ± 0%    265.4n ± 0%       ~ (p=0.288 n=20)
IsZero/ArrayInt_1024_NoZero-4   135.8n ± 0%    136.2n ± 0%  +0.33% (p=0.000 n=20)
IsZero/Struct4Int-4             8.451n ± 0%    8.386n ± 0%  -0.77% (p=0.000 n=20)
IsZero/ArrayStruct4Int_1024-4   265.2n ± 0%    266.0n ± 0%  +0.30% (p=0.000 n=20)
IsZero/ArrayChanInt_1024-4      265.5n ± 0%    265.4n ± 0%       ~ (p=0.605 n=20)
IsZero/StructInt_512-4          135.8n ± 0%    135.8n ± 0%       ~ (p=0.396 n=20)
geomean                         55.22n         55.12n       -0.18%

Updates #71323

Change-Id: Ie083853a5bff03856277a293d94532a681f4a8d5
Reviewed-on: https://go-review.googlesource.com/c/go/+/654135
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
2025-05-21 12:25:04 -07:00
thepudds c3bb27bbc7 cmd/compile/internal/walk: use global zeroVal in interface conversions for zero values
This is a small-ish adjustment to the change earlier in our
stack in CL 649555, which started creating read-only global storage
for a composite literal used in an interface conversion and setting
the interface data pointer to point to that global storage.

In some cases, there are execution-time performance benefits to point
to runtime.zeroVal in particular. In reflect, pointer checks against
the runtime.zeroVal memory address are used to side-step some work,
such as in reflect.Value.Set and reflect.Value.IsZero.

In this CL, we therefore dig up the zeroVal symbol, and we use the
machinery from earlier in our stack to use a pointer to zeroVal for
the interface data pointer if we see examples like:

    sink = S{}
or:
    s := S{}
    sink = s

CL 649076 (also earlier in our stack) added most of the tests
along with debug diagnostics in convert.go to make it easier
to test this change.

We add a benchmark in reflect to show examples of performance benefit.
The left column is our immediately prior CL 649555, and the right is
this CL. (The arrays of structs here do not seem to benefit, which
we attempt to address in our next CL).

goos: linux
goarch: amd64
pkg: reflect
cpu: Intel(R) Xeon(R) CPU @ 2.80GHz
                                          │  cl-649555   │           new                       │
                                          │    sec/op    │   sec/op     vs base                │
Zero/IsZero/ByteArray/size=16-4              4.176n ± 0%   4.171n ± 0%        ~ (p=0.151 n=20)
Zero/IsZero/ByteArray/size=64-4              6.921n ± 0%   3.864n ± 0%  -44.16% (p=0.000 n=20)
Zero/IsZero/ByteArray/size=1024-4           21.210n ± 0%   3.878n ± 0%  -81.72% (p=0.000 n=20)
Zero/IsZero/BigStruct/size=1024-4           25.505n ± 0%   5.061n ± 0%  -80.15% (p=0.000 n=20)
Zero/IsZero/SmallStruct/size=16-4            4.188n ± 0%   4.191n ± 0%        ~ (p=0.106 n=20)
Zero/IsZero/SmallStructArray/size=64-4       8.639n ± 0%   8.636n ± 0%        ~ (p=0.973 n=20)
Zero/IsZero/SmallStructArray/size=1024-4     79.99n ± 0%   80.06n ± 0%        ~ (p=0.213 n=20)
Zero/IsZero/Time/size=24-4                   7.232n ± 0%   3.865n ± 0%  -46.56% (p=0.000 n=20)
Zero/SetZero/ByteArray/size=16-4             13.47n ± 0%   13.09n ± 0%   -2.78% (p=0.000 n=20)
Zero/SetZero/ByteArray/size=64-4             14.14n ± 0%   13.70n ± 0%   -3.15% (p=0.000 n=20)
Zero/SetZero/ByteArray/size=1024-4           24.22n ± 0%   20.18n ± 0%  -16.68% (p=0.000 n=20)
Zero/SetZero/BigStruct/size=1024-4           24.24n ± 0%   20.18n ± 0%  -16.73% (p=0.000 n=20)
Zero/SetZero/SmallStruct/size=16-4           13.45n ± 0%   13.10n ± 0%   -2.60% (p=0.000 n=20)
Zero/SetZero/SmallStructArray/size=64-4      14.12n ± 0%   13.69n ± 0%   -3.05% (p=0.000 n=20)
Zero/SetZero/SmallStructArray/size=1024-4    24.62n ± 0%   21.61n ± 0%  -12.26% (p=0.000 n=20)
Zero/SetZero/Time/size=24-4                  13.59n ± 0%   13.40n ± 0%   -1.40% (p=0.000 n=20)
geomean                                      14.06n        10.19n       -27.54%

Finally, here are results from the benchmark example from #71323.
Note however that almost all the benefit shown here is from our earlier
CL 649555, which is a more general purpose change and eliminates
the allocation using a different read-only global than this CL.

             │   go1.24       │               new                    │
             │     sec/op     │    sec/op     vs base                │
InterfaceAny   112.6000n ± 5%   0.8078n ± 3%  -99.28% (p=0.000 n=20)
ReflectValue      11.63n ± 2%    11.59n ± 0%        ~ (p=0.330 n=20)

             │  go1.24.out  │                 new.out                 │
             │     B/op     │    B/op     vs base                     │
InterfaceAny   224.0 ± 0%       0.0 ± 0%  -100.00% (p=0.000 n=20)
ReflectValue   0.000 ± 0%     0.000 ± 0%         ~ (p=1.000 n=20) ¹

             │  go1.24.out  │                 new.out                 │
             │  allocs/op   │ allocs/op   vs base                     │
InterfaceAny   1.000 ± 0%     0.000 ± 0%  -100.00% (p=0.000 n=20)
ReflectValue   0.000 ± 0%     0.000 ± 0%         ~ (p=1.000 n=20) ¹

Updates #71359
Updates #71323

Change-Id: I64d8cf1a7900f011d2ec59b948388aeda1150676
Reviewed-on: https://go-review.googlesource.com/c/go/+/649078
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
2025-05-21 12:24:22 -07:00
thepudds f4de2ecffb cmd/compile/internal/walk: convert composite literals to interfaces without allocating
Today, this interface conversion causes the struct literal
to be heap allocated:

    var sink any

    func example1() {
        sink = S{1, 1}
    }

For basic literals like integers that are directly used in
an interface conversion that would otherwise allocate, the compiler
is able to use read-only global storage (see #18704).

This CL extends that to struct and array literals as well by creating
read-only global storage that is able to represent for example S{1, 1},
and then using a pointer to that storage in the interface
when the interface conversion happens.

A more challenging example is:

    func example2() {
        v := S{1, 1}
        sink = v
    }

In this case, the struct literal is not directly part of the
interface conversion, but is instead assigned to a local variable.

To still avoid heap allocation in cases like this, in walk we
construct a cache that maps from expressions used in interface
conversions to earlier expressions that can be used to represent the
same value (via ir.ReassignOracle.StaticValue). This is somewhat
analogous to how we avoided heap allocation for basic literals in
CL 649077 earlier in our stack, though here we also need to do a
little more work to create the read-only global.

CL 649076 (also earlier in our stack) added most of the tests
along with debug diagnostics in convert.go to make it easier
to test this change.

See the writeup in #71359 for details.

Fixes #71359
Fixes #71323
Updates #62653
Updates #53465
Updates #8618

Change-Id: I8924f0c69ff738ea33439bd6af7b4066af493b90
Reviewed-on: https://go-review.googlesource.com/c/go/+/649555
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@google.com>
2025-05-21 12:23:26 -07:00
Filippo Valsorda ce46c9db86 internal/godebug,crypto/fips140: make fips140 setting immutable
Updates #70123

Co-authored-by: qmuntal <quimmuntal@gmail.com>
Change-Id: I6a6a4656fd23ecd82428cccbd7c48692287fc75a
Reviewed-on: https://go-review.googlesource.com/c/go/+/657116
Reviewed-by: Daniel McCarney <daniel@binaryparadox.net>
Reviewed-by: Roland Shoemaker <roland@golang.org>
Auto-Submit: Filippo Valsorda <filippo@golang.org>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Quim Muntal <quimmuntal@gmail.com>
2025-05-21 12:21:44 -07:00
Filippo Valsorda d327e52d43 crypto/internal/fips140: use hash.Hash
Since package hash is just the interface definition, not an
implementation, we can make a good argument that it doesn't impact the
security of the module and can be imported from outside.

For #69521

Change-Id: I6a6a4656b9c3cac8bb9ab8e8df11fa3238dc5d1d
Reviewed-on: https://go-review.googlesource.com/c/go/+/674917
Reviewed-by: Roland Shoemaker <roland@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Daniel McCarney <daniel@binaryparadox.net>
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Filippo Valsorda <filippo@golang.org>
2025-05-21 12:20:10 -07:00
Junyang Shao d6c29c7156 cmd/compile: fix offset calculation error in memcombine
Fixes #73812

Change-Id: If7a6e103ae9e1442a2cf4a3c6b1270b6a1887196
Reviewed-on: https://go-review.googlesource.com/c/go/+/675175
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-21 12:17:08 -07:00
Daniel McCarney a21b71daf5 crypto/tls: have servers prefer TLS 1.3 when supported
Previously the common Config.mutualVersion() code prioritized the
selected version based on the provided peerVersions being sent in peer
preference order.

Instead we would prefer to see TLS 1.3 used whenever it is
supported, even if the peer would prefer an older protocol version.
This commit updates mutualVersions() to implement this policy change.

Our new behaviour matches the behaviour of other TLS stacks, notably
BoringSSL, and so also allows enabling the IgnoreClientVersionOrder BoGo
test that we otherwise must skip.

Updates #72006

Change-Id: I27a2cd231e4b8762b0d9e2dbd3d8ddd5b87fd5cb
Reviewed-on: https://go-review.googlesource.com/c/go/+/673236
Auto-Submit: Daniel McCarney <daniel@binaryparadox.net>
TryBot-Bypass: Daniel McCarney <daniel@binaryparadox.net>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Roland Shoemaker <roland@golang.org>
2025-05-21 12:17:01 -07:00
Roland Shoemaker c5a1fc1f97 crypto/tls: add GetEncryptedClientHelloKeys
This allows servers to rotate their ECH keys without needing to restart
the server.

Fixes #71920

Change-Id: I55591ab3303d5fde639038541c50edcf1fafc9aa
Reviewed-on: https://go-review.googlesource.com/c/go/+/670655
TryBot-Bypass: Roland Shoemaker <roland@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Roland Shoemaker <roland@golang.org>
Reviewed-by: Daniel McCarney <daniel@binaryparadox.net>
2025-05-21 12:15:37 -07:00
Filippo Valsorda a731955f0f crypto/sha1: use cryptotest.TestAllImplementations and impl.Register
Not running TryBots on s390x because the new LUCI builder is broken.

Change-Id: I6a6a4656a8d52fa5ace9effa67a88fbfd7d19b04
Reviewed-on: https://go-review.googlesource.com/c/go/+/674915
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Roland Shoemaker <roland@golang.org>
Reviewed-by: Daniel McCarney <daniel@binaryparadox.net>
Auto-Submit: Filippo Valsorda <filippo@golang.org>
2025-05-21 12:11:13 -07:00
Roland Shoemaker 272262750f crypto: drop pre-AVX2 SHA assembly implementations
Drop the entire pre-AVX2 assembly implementation of SHA-1, SHA-256, and
SHA-512. This also technically impacts the SHA-1 AVX2 implementation,
since it previously called the pre-AVX2 implementation for the last
block if the number of blocks wasn't a multiple of 2.

Instead of keeping the entire implementation just for that case, we
just call the generic implementation for the last block. This will be a
little slower, but still seems like a win.

Updates #69587

Change-Id: Id5234c42910d8c6ec6b8df700a721c0953dff02b
Reviewed-on: https://go-review.googlesource.com/c/go/+/657716
Reviewed-by: Daniel McCarney <daniel@binaryparadox.net>
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Roland Shoemaker <roland@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
2025-05-21 12:10:59 -07:00
Roland Shoemaker 919d9858bc crypto/internal/fips140/sha3: remove usages of WORD for s390x
We support KIMD and KLMD now, paves the way for banning usage of BYTE
and WORD instructions in crypto assembly.

Change-Id: I0f93744663f23866b2269591db70389e0c77fa4a
Reviewed-on: https://go-review.googlesource.com/c/go/+/671095
Reviewed-by: Daniel McCarney <daniel@binaryparadox.net>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2025-05-21 12:10:20 -07:00
Julian Zhu 3dbc775d60 cmd/compile/internal: intrinsify publicationBarrier on mipsx
This enables publicationBarrier to be used as an intrinsic on mipsx.

Change-Id: Ic199f34b84b3058bcfab79aac8f2399ff21a97ce
Reviewed-on: https://go-review.googlesource.com/c/go/+/674856
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
2025-05-21 12:08:18 -07:00
Mateusz Poliwczak 3a7a856951 crypto/x509: disallow negative path length
pathLenConstraint is restricted to unsigned integers.
Also the -1 value of cert.MaxPathLength has a special
meaning, so we shouldn't allow unmarshaling -1.

BasicConstraints ::= SEQUENCE {
     cA                      BOOLEAN DEFAULT FALSE,
     pathLenConstraint       INTEGER (0..MAX) OPTIONAL }

Change-Id: I485a6aa7223127becc86c423e1ef9ed2fbd48209
GitHub-Last-Rev: 75a11b47b9
GitHub-Pull-Request: golang/go#60706
Reviewed-on: https://go-review.googlesource.com/c/go/+/502076
Reviewed-by: Roland Shoemaker <roland@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Roland Shoemaker <roland@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2025-05-21 12:07:24 -07:00
Julian Zhu 94e3caeec1 cmd/compile/internal: intrinsify publicationBarrier on mips64x
This enables publicationBarrier to be used as an intrinsic on mips64x.

Change-Id: I4030ea65086c37ee1dcc1675d0d5d40ef8683851
Reviewed-on: https://go-review.googlesource.com/c/go/+/674855
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2025-05-21 12:06:44 -07:00
Michael Anthony Knyszek 77345f41ee internal/trace: skip clock snapshot checks on Windows in stress mode
Windows' monotonic and wall clock granularity is just too coarse to get
reasonable values out of stress mode, which is creating new trace
generations constantly.

Fixes #73813.

Change-Id: Id9cb2fed9775ce8d78a736d0164daa7bf45075e0
Reviewed-on: https://go-review.googlesource.com/c/go/+/675096
Reviewed-by: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
Reviewed-by: Roland Shoemaker <roland@golang.org>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-21 12:03:26 -07:00
thepudds ed24bb4e60 cmd/compile/internal/escape: propagate constants to interface conversions to avoid allocs
Currently, the integer value in the following interface conversion gets
heap allocated:

   v := 1000
   fmt.Println(v)

In contrast, this conversion does not currently cause the integer value
to be heap allocated:

   fmt.Println(1000)

The second example is able to avoid heap allocation because of an
optimization in walk (by Josh in #18704 and related issues) that
recognizes a literal is being used. In the first example, that
optimization is currently thwarted by the literal getting assigned
to a local variable prior to use in the interface conversion.

This CL propagates constants to interface conversions like
in the first example to avoid heap allocations, instead using
a read-only global. The net effect is roughly turning the first example
into the second.

One place this comes up in practice currently is with logging or
debug prints. For example, if we have something like:

   func conditionalDebugf(format string, args ...interface{}) {
   	if debugEnabled {
   		fmt.Fprintf(io.Discard, format, args...)
   	}
   }

Prior to this CL, this integer is heap allocated, even when the
debugEnabled flag is false, and even when the compiler
inlines conditionalDebugf:

   v := 1000
   conditionalDebugf("hello %d", v)

With this CL, the integer here is no longer heap allocated, even when
the debugEnabled flag is enabled, because the compiler can now see that
it can use a read-only global.

See the writeup in #71359 for more details.

CL 649076 (earlier in our stack) added most of the tests
along with debug diagnostics in convert.go to make it easier
to test this change.

Updates #71359
Updates #62653
Updates #53465
Updates #8618

Change-Id: I19a51e74b36576ebb0b9cf599267cbd2bd847ce4
Reviewed-on: https://go-review.googlesource.com/c/go/+/649079
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@google.com>
2025-05-21 12:02:43 -07:00
thepudds e89791983a cmd/compile/internal/escape: use an ir.ReassignOracle
Using the new-ish ir.ReassignOracle is more efficient than calling
ir.StaticValue repeatedly.

This CL now uses an ir.ReassignOracle for the recent
make constant propagation introduced in CL 649035.

We also pull the main change from CL 649035 into a new function,
which we will update later in our stack. We will also use the
ReassignOracles introduced here later in our stack.

(We originally did most of this work in CL 649077, but we abandoned
that in favor of CL 649035).

We could also use an ir.ReassignOracle in the older processing of
ir.OCALLFUNC in (*escape).call, but for now, we just leave that
as a TODO.

Updates #71359

Change-Id: I6e02eeac269bde3a302622b4dfe0c8dc63ec9ffc
Reviewed-on: https://go-review.googlesource.com/c/go/+/673795
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
2025-05-21 12:01:01 -07:00
Damien Neil 4b7aa542eb os: add Root.ReadFile and Root.WriteFile
For #73126

Change-Id: Ie69cc274e7b59f958c239520318b89ff0141e26b
Reviewed-on: https://go-review.googlesource.com/c/go/+/674315
Reviewed-by: Alan Donovan <adonovan@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Damien Neil <dneil@google.com>
2025-05-21 11:59:27 -07:00
Sean Liao 3ae95aafb5 log/slog: add GroupAttrs
GroupAttrs is a more efficient version of Group
that takes a slice of Attr values.

Fixes #66365

Change-Id: Ic3046704825e17098f2fea5751f2959dce1073e2
Reviewed-on: https://go-review.googlesource.com/c/go/+/672915
Reviewed-by: Jonathan Amsterdam <jba@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-05-21 11:29:28 -07:00
Michael Pratt ce49eb488a runtime: skip windows stack tests in race mode
These became race instrumented in CL 643897, but race mode uses more
memory, so the test doesn't make much sense.

For #71395.

Change-Id: I6a6a636cf09ba29625aa9a22550314845fb2e611
Reviewed-on: https://go-review.googlesource.com/c/go/+/675077
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-21 11:05:35 -07:00
Michael Pratt d2f229db7a runtime: avoid register clobber in s390x racecall
This is a regression in CL 643875. Loading gsignal clobbers R8, which
contains the m pointer needed for loading g0.

For #71395.

Change-Id: I6a6a636ca95442767efe0eb1b358f2139d18c5b8
Reviewed-on: https://go-review.googlesource.com/c/go/+/675035
Auto-Submit: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-05-21 11:05:32 -07:00
Lokesh Kumar 304d9e2fd1 bufio: update buffer documentation
Fixes #73778

Change-Id: If6d87a92786c9b0ee2bd790b57937919afe0fc5c
GitHub-Last-Rev: 4b4c7595d5
GitHub-Pull-Request: golang/go#73804
Reviewed-on: https://go-review.googlesource.com/c/go/+/674695
Auto-Submit: Alan Donovan <adonovan@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
2025-05-21 11:05:26 -07:00
Michael Pratt 419367969c cmd/link: require cgo internal linking in TestIssue33979
This was a typo regression in CL 643897, which accidentally dropped the
requirement for cgo internal linking. As a result, this test is
continuously failing on windows-arm64.

For #71395.

Cq-Include-Trybots: luci.golang.try:gotip-windows-arm64
Change-Id: I6a6a636c25fd399cda6649ef94655aa112f10f63
Reviewed-on: https://go-review.googlesource.com/c/go/+/675015
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Quim Muntal <quimmuntal@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-21 10:42:43 -07:00
Roland Shoemaker 360600b1d2 crypto/tls: replace custom intern cache with weak cache
Uses the new weak package to replace the existing custom intern cache
with a map of weak.Pointers instead. This simplifies the cache, and
means we don't need to store a slice of handles on the Conn anymore.

Change-Id: I5c2bf6ef35fac4255e140e184f4e48574b34174c
Reviewed-on: https://go-review.googlesource.com/c/go/+/644176
TryBot-Bypass: Roland Shoemaker <roland@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Roland Shoemaker <roland@golang.org>
2025-05-21 10:31:41 -07:00
Michael Pratt e6dacf91ff runtime: use cgroup CPU limit to set GOMAXPROCS
This CL adds two related features enabled by default via compatibility
GODEBUGs containermaxprocs and updatemaxprocs.

On Linux, containermaxprocs makes the Go runtime consider cgroup CPU
bandwidth limits (quota/period) when setting GOMAXPROCS. If the cgroup
limit is lower than the number of logical CPUs available, then the
cgroup limit takes precedence.

On all OSes, updatemaxprocs makes the Go runtime periodically
recalculate the default GOMAXPROCS value and update GOMAXPROCS if it has
changed. If GOMAXPROCS is set manually, this update does not occur. This
is intended primarily to detect changes to cgroup limits, but it applies
on all OSes because the CPU affinity mask can change as well.

The runtime only considers the limit in the leaf cgroup (the one that
actually contains the process), caching the CPU limit file
descriptor(s), which are periodically reread for updates. This is a
small departure from the original proposed design. It will not consider
limits of parent cgroups (which may be lower than the leaf), and it will
not detection cgroup migration after process start.

We can consider changing this in the future, but the simpler approach is
less invasive; less risk to packages that have some awareness of runtime
internals. e.g., if the runtime periodically opens new files during
execution, file descriptor leak detection is difficult to implement in a
stable way.

For #73193.

Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest
Change-Id: I6a6a636c631c1ae577fb8254960377ba91c5dc98
Reviewed-on: https://go-review.googlesource.com/c/go/+/670497
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-05-21 10:21:55 -07:00
Michael Pratt f12c66fbed internal/runtime/cgroup: CPU cgroup limit discovery
For #73193.

Change-Id: I6a6a636ca9fa9cba429cf053468c56c2939cb1ac
Reviewed-on: https://go-review.googlesource.com/c/go/+/668638
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-05-21 10:21:47 -07:00
Michael Pratt 06450a82b0 internal/runtime/cgroup: add line-by-line reader using a single scratch buffer
Change-Id: I6a6a636ca21edcc6f16705fbb72a5241d4f7f22d
Reviewed-on: https://go-review.googlesource.com/c/go/+/668637
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-21 10:21:38 -07:00
Damien Neil e59e128f90 os: add Root.MkdirAll
For #67002

Change-Id: Idd74b5b59e787e89bdfad82171b6a7719465f501
Reviewed-on: https://go-review.googlesource.com/c/go/+/674116
Reviewed-by: Alan Donovan <adonovan@google.com>
Auto-Submit: Damien Neil <dneil@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-21 10:14:43 -07:00
Roland Shoemaker 63dcc7b906 crypto/sha1: add sha-ni AMD64 implementation
Based on the Intel docs. Provides a ~44% speed-up compared to the AVX
implementation and a ~57% speed-up compared to the generic AMD64
assembly implementation.

                    │ /usr/local/google/home/bracewell/sha1-avx.bench │ /usr/local/google/home/bracewell/sha1-ni-stack.bench │
                    │                     sec/op                      │            sec/op             vs base                │
Hash8Bytes/New-24                                        157.60n ± 0%                    92.51n ± 0%  -41.30% (p=0.000 n=20)
Hash8Bytes/Sum-24                                        147.00n ± 0%                    85.06n ± 0%  -42.14% (p=0.000 n=20)
Hash320Bytes/New-24                                       625.3n ± 0%                    276.7n ± 0%  -55.75% (p=0.000 n=20)
Hash320Bytes/Sum-24                                       626.2n ± 0%                    272.4n ± 0%  -56.51% (p=0.000 n=20)
Hash1K/New-24                                            1206.5n ± 0%                    692.2n ± 0%  -42.63% (p=0.000 n=20)
Hash1K/Sum-24                                            1210.0n ± 0%                    688.2n ± 0%  -43.13% (p=0.000 n=20)
Hash8K/New-24                                             7.744µ ± 0%                    4.920µ ± 0%  -36.46% (p=0.000 n=20)
Hash8K/Sum-24                                             7.737µ ± 0%                    4.913µ ± 0%  -36.50% (p=0.000 n=20)
geomean                                                   971.5n                         536.1n       -44.81%

                    │ /usr/local/google/home/bracewell/sha1-avx.bench │ /usr/local/google/home/bracewell/sha1-ni-stack.bench │
                    │                       B/s                       │             B/s              vs base                 │
Hash8Bytes/New-24                                        48.41Mi ± 0%                  82.47Mi ± 0%   +70.37% (p=0.000 n=20)
Hash8Bytes/Sum-24                                        51.90Mi ± 0%                  89.70Mi ± 0%   +72.82% (p=0.000 n=20)
Hash320Bytes/New-24                                      488.0Mi ± 0%                 1103.0Mi ± 0%  +126.01% (p=0.000 n=20)
Hash320Bytes/Sum-24                                      487.4Mi ± 0%                 1120.5Mi ± 0%  +129.91% (p=0.000 n=20)
Hash1K/New-24                                            809.6Mi ± 0%                 1410.8Mi ± 0%   +74.26% (p=0.000 n=20)
Hash1K/Sum-24                                            806.9Mi ± 0%                 1419.1Mi ± 0%   +75.86% (p=0.000 n=20)
Hash8K/New-24                                           1008.9Mi ± 0%                 1588.0Mi ± 0%   +57.40% (p=0.000 n=20)
Hash8K/Sum-24                                           1009.8Mi ± 0%                 1590.1Mi ± 0%   +57.47% (p=0.000 n=20)
geomean                                                  375.8Mi                       680.9Mi        +81.20%

                    │ /usr/local/google/home/bracewell/sha1-amd64.bench │ /usr/local/google/home/bracewell/sha1-ni-stack.bench │
                    │                      sec/op                       │            sec/op             vs base                │
Hash8Bytes/New-24                                          153.90n ± 0%                    92.51n ± 0%  -39.89% (p=0.000 n=20)
Hash8Bytes/Sum-24                                          145.90n ± 0%                    85.06n ± 0%  -41.70% (p=0.000 n=20)
Hash320Bytes/New-24                                         666.8n ± 0%                    276.7n ± 0%  -58.50% (p=0.000 n=20)
Hash320Bytes/Sum-24                                         660.3n ± 0%                    272.4n ± 0%  -58.75% (p=0.000 n=20)
Hash1K/New-24                                              1810.5n ± 0%                    692.2n ± 0%  -61.77% (p=0.000 n=20)
Hash1K/Sum-24                                              1806.0n ± 0%                    688.2n ± 0%  -61.90% (p=0.000 n=20)
Hash8K/New-24                                              13.509µ ± 0%                    4.920µ ± 0%  -63.58% (p=0.000 n=20)
Hash8K/Sum-24                                              13.515µ ± 0%                    4.913µ ± 0%  -63.65% (p=0.000 n=20)
geomean                                                     1.248µ                         536.1n       -57.05%

                    │ /usr/local/google/home/bracewell/sha1-amd64.bench │ /usr/local/google/home/bracewell/sha1-ni-stack.bench │
                    │                        B/s                        │             B/s              vs base                 │
Hash8Bytes/New-24                                          49.57Mi ± 0%                  82.47Mi ± 0%   +66.37% (p=0.000 n=20)
Hash8Bytes/Sum-24                                          52.29Mi ± 0%                  89.70Mi ± 0%   +71.52% (p=0.000 n=20)
Hash320Bytes/New-24                                        457.7Mi ± 0%                 1103.0Mi ± 0%  +140.97% (p=0.000 n=20)
Hash320Bytes/Sum-24                                        462.2Mi ± 0%                 1120.5Mi ± 0%  +142.45% (p=0.000 n=20)
Hash1K/New-24                                              539.4Mi ± 0%                 1410.8Mi ± 0%  +161.57% (p=0.000 n=20)
Hash1K/Sum-24                                              540.7Mi ± 0%                 1419.1Mi ± 0%  +162.44% (p=0.000 n=20)
Hash8K/New-24                                              578.4Mi ± 0%                 1588.0Mi ± 0%  +174.57% (p=0.000 n=20)
Hash8K/Sum-24                                              578.1Mi ± 0%                 1590.1Mi ± 0%  +175.07% (p=0.000 n=20)
geomean                                                    292.4Mi                       680.9Mi       +132.86%

Change-Id: Ife90386ba410a80c2e6222c1fe4df2368c4e12b2
Reviewed-on: https://go-review.googlesource.com/c/go/+/642157
Reviewed-by: Filippo Valsorda <filippo@golang.org>
Auto-Submit: Roland Shoemaker <roland@golang.org>
Reviewed-by: Neal Patel <nealpatel@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-21 10:14:30 -07:00
Roland Shoemaker 40b19b56a9 runtime: add valgrind instrumentation
Add build tag gated Valgrind annotations to the runtime which let it
understand how the runtime manages memory. This allows for Go binaries
to be run under Valgrind without emitting spurious errors.

Instead of adding the Valgrind headers to the tree, and using cgo to
call the various Valgrind client request macros, we just add an assembly
function which emits the necessary instructions to trigger client
requests.

In particular we add instrumentation of the memory allocator, using a
two-level mempool structure (as described in the Valgrind manual [0]).
We also add annotations which allow Valgrind to track which memory we
use for stacks, which seems necessary to let it properly function.

We describe the memory model to Valgrind as follows: we treat heap
arenas as a "pool" created with VALGRIND_CREATE_MEMPOOL_EXT (so that we
can use VALGRIND_MEMPOOL_METAPOOL and VALGRIND_MEMPOOL_AUTO_FREE).
Within the pool we treat spans as "superblocks", annotated with
VALGRIND_MEMPOOL_ALLOC. We then allocate individual objects within spans
with VALGRIND_MALLOCLIKE_BLOCK.

It should be noted that running binaries under Valgrind can be _quite
slow_, and certain operations, such as running the GC, can be _very
slow_. It is recommended to run programs with GOGC=off. Additionally,
async preemption should be turned off, since it'll cause strange
behavior (GODEBUG=asyncpreemptoff=1).

Running Valgrind with --leak-check=yes will result in some errors
resulting from some things not being marked fully free'd. These likely
need more annotations to rectify, but for now it is recommended to run
with --leak-check=off.

Updates #73602

[0] https://valgrind.org/docs/manual/mc-manual.html#mc-manual.mempools

Change-Id: I71b26c47d7084de71ef1e03947ef6b1cc6d38301
Reviewed-on: https://go-review.googlesource.com/c/go/+/674077
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-05-21 10:08:08 -07:00
Michael Matloob 2a5ac1a993 cmd/doc: allow go doc -http without package in current directory
go doc tries to find a package to display documentation for. In the case
that no package is provided, it uses "." just like go list does. So if
go doc -http is run without any arguments, it tries to show the
documentation for the package in the current directory. As a special
case, if no arguments are provided, allow no package to match the
current directory and just open the root pkgsite page.

For #68106

Change-Id: I6d65b160a838591db953fac630eced6b09106877
Reviewed-on: https://go-review.googlesource.com/c/go/+/675075
Reviewed-by: Alan Donovan <adonovan@google.com>
Reviewed-by: Michael Matloob <matloob@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-21 09:58:00 -07:00
Michael Anthony Knyszek 8b45a3f78b runtime: guarantee checkfinalizers test allocates in a shared tiny block
Currently the checkfinalizers test (TestDetectCleanupOrFinalizerLeak)
only *tries* to ensure the tiny alloc with a cleanup attached shares a
block with other objects. However, what it does is insufficient, because
it could get unlucky and have the last object allocated be the first
object of a new block.

This change changes the test to guarantee that a tiny object is not at
the start of a fresh block by looking at the alignment of the object's
pointer. If the object's pointer is odd, then that's good enough to know
that it shares a block with something else, since the blocks themselves
are aligned to a much higher power of two.

This fixes a failure I've seen on the builders.

Fixes #73810.

Change-Id: Ieafdbb9cccb0d2dc3659a9a5d9d9233718461635
Reviewed-on: https://go-review.googlesource.com/c/go/+/674655
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2025-05-21 09:50:31 -07:00
Damien Neil 8960970009 os: add Root.RemoveAll
For #67002

Change-Id: If59dab4fd934a115d8ff383826525330de750b54
Reviewed-on: https://go-review.googlesource.com/c/go/+/661595
Reviewed-by: Alan Donovan <adonovan@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Damien Neil <dneil@google.com>
2025-05-21 09:30:51 -07:00
Mark Freeman 26e05b95c2 internal/pkgbits: specify that RelIdx is an element index
Without this, it's not clear what this is relative to or the
granularity of the index.

Change-Id: Ibaabe47e089f0ba9b084523969c5347ed4c9dbee
Reviewed-on: https://go-review.googlesource.com/c/go/+/674636
Auto-Submit: Mark Freeman <mark@golang.org>
Reviewed-by: Robert Griesemer <gri@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-21 08:48:01 -07:00
Xiaolin Zhao 4ce1c8e9e1 cmd/compile: add rules about ORN and ANDN
Reduce the number of go toolchain instructions on loong64 as follows.

    file      before    after     Δ       %
    addr2line 279880    279776  -104   -0.0372%
    asm       556638    556410  -228   -0.0410%
    buildid   272272    272072  -200   -0.0735%
    cgo       481522    481318  -204   -0.0424%
    compile   2457788   2457580 -208   -0.0085%
    covdata   323384    323280  -104   -0.0322%
    cover     518450    518234  -216   -0.0417%
    dist      340790    340686  -104   -0.0305%
    distpack  282456    282252  -204   -0.0722%
    doc       789932    789688  -244   -0.0309%
    fix       324332    324228  -104   -0.0321%
    link      704622    704390  -232   -0.0329%
    nm        277132    277028  -104   -0.0375%
    objdump   507862    507758  -104   -0.0205%
    pack      221774    221674  -100   -0.0451%
    pprof     1469816   1469552 -264   -0.0180%
    test2json 254836    254732  -104   -0.0408%
    trace     1100002   1099738 -264   -0.0240%
    vet       781078    780874  -204   -0.0261%
    go        1529116   1528848 -268   -0.0175%
    gofmt     318556    318448  -108   -0.0339%
    total     13792238 13788566 -3672  -0.0266%

Change-Id: I23fb3ebd41309252c7075e57ea7094e79f8c4fef
Reviewed-on: https://go-review.googlesource.com/c/go/+/674335
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
2025-05-21 08:28:37 -07:00
Guoqi Chen 0810fd2d92 cmd/internal/obj/loong64: remove unused register alias definitions
Change-Id: Ie788747372cd47cb3780e75b35750bb08bd166fc
Reviewed-on: https://go-review.googlesource.com/c/go/+/542835
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Auto-Submit: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-05-21 08:27:58 -07:00
Mark Freeman bfb8f13274 internal/pkgbits: indent productions and hoist some types up
The types being hoisted are those which cannot be referenced; that is,
where Ref[T] is illegal. These are most clearly owned by pkgbits. The
types which follow are those which can be referenced.

Referenceable types are more hazy due to the reference mechanism of UIR
- sections. These are a detail of the UIR file format and are surfaced
directly to importers.

I suspect that pkgbits would benefit from a reference mechanism not
dependent on sections. This would permit us to push down many types
from the noder into pkgbits, reducing the interface surface without
giving up deduplication.

Change-Id: Ifaf5cd9de20c767ad0941413385b308d628aac6c
Reviewed-on: https://go-review.googlesource.com/c/go/+/674635
Auto-Submit: Mark Freeman <mark@golang.org>
TryBot-Bypass: Mark Freeman <mark@golang.org>
Reviewed-by: Robert Griesemer <gri@google.com>
2025-05-21 08:25:32 -07:00
Felix Geisendörfer 07b94b2db2 internal/trace: add generator tests for sync events
Add generator tests that verify the timestamps for the sync events
emitted in the go1.25 trace format and earlier versions.

Add the ability to configure the properties of the per-generation sync
batches in testgen. Also refactor testgen to produce more realistic
timestamps by keeping track of lastTs and using it for structural
batches that don't have their own timestamps. Otherwise they default to
zero which means the minTs of the generation can't be controlled.

For #69869

Change-Id: I92a49b8281bc4169b63e13c030c1de7720cd6f26
Reviewed-on: https://go-review.googlesource.com/c/go/+/653876
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-05-21 08:23:41 -07:00
Felix Geisendörfer b22da3f544 internal/trace/internal/testgen: make generated trace version configurable
Replace hard coded references to version.Go122 with the trace version
passed to NewTrace. This allows writing testgen tests for newer trace
versions.

For #69869

Change-Id: Id25350cea1c397a09ca23465526ff259e34a4752
Reviewed-on: https://go-review.googlesource.com/c/go/+/653875
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-21 08:23:37 -07:00
Felix Geisendörfer 847f157166 internal/trace: add a validator test for the new clock snapshots
Check that the clock snapshots, when expected to be present, are
non-zero and monotonically increasing.

This required some refactoring to make the validator aware of the
version of the trace it is validating.

Change-Id: I04c4dd10fe6975cbac12bb0ddaebcec3a5284e7b
Reviewed-on: https://go-review.googlesource.com/c/go/+/669715
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: David Chase <drchase@google.com>
2025-05-21 08:22:16 -07:00
Felix Geisendörfer 2d216141a1 internal/trace: expose clock snapshot timestamps on sync event
Add ClockSnapshot field to the Sync event type and populate it with the
information from the new EvClockSnapshot event when available.

For #69869

Change-Id: I3b24b5bfa15cc7a7dba270f5e6bf189adb096840
Reviewed-on: https://go-review.googlesource.com/c/go/+/653576
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
2025-05-21 08:22:01 -07:00
Felix Geisendörfer 112c23612f runtime,internal/trace: emit clock snapshots at the start of trace generations
Replace the per-generation EvEventBatch containing a lone EvFrequency
event with a per-generation EvEventBatch containing a EvSync header
followed by an EvFrequency and EvClockSnapshot event.

The new EvClockSnapshot event contains trace, mono and wall clock
snapshots taken in close time proximity. Ignoring minor resolution
differences, the trace and mono clock are the same on linux, but not on
windows (which still uses a TSC based trace clock).

Emit the new sync batch at the very beginning of every new generation
rather than the end to be in harmony with the internal/trace reader
which emits a sync event at the beginning of every generation as well
and guarantees monotonically increasing event timestamps.

Bump the version of the trace file format to 1.25 since this change is
not backwards compatible.

Update the internal/trace reader implementation to decode the new
events, but do not expose them to the public reader API yet. This is
done in the next CL.

For #69869

Change-Id: I5bfedccdd23dc0adaf2401ec0970cbcc32363393
Reviewed-on: https://go-review.googlesource.com/c/go/+/653575
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-05-21 08:20:34 -07:00
Mark Ryan 0d7dc6842b cmd/internal/obj/riscv: fix vector integer multiply add
The RISC-V integer vector multiply add instructions are not encoded
correctly; the first and second arguments are swapped. For example,
the instruction

VMACCVV V1, V2, V3

encodes to

b620a1d7 or vmacc.vv v3,v1,v2

and not

b61121d7 or vmacc.vv v3,v2,v1

as expected.

This is inconsistent with the argument ordering we use for 3
argument vector instructions, in which the argument order, as given
in the RISC-V specifications, is reversed, and also with the vector
FMA instructions which have the same argument ordering as the vector
integer multiply add instructions in the "The RISC-V Instruction Set
Manual Volume I". For example, in the ISA manual we have the
following instruction definitions

; Integer multiply-add, overwrite addend
vmacc.vv vd, vs1, vs2, vm    # vd[i] = +(vs1[i] * vs2[i]) + vd[i]

; FP multiply-accumulate, overwrites addend
vfmacc.vv vd, vs1, vs2, vm    # vd[i] = +(vs1[i] * vs2[i]) + vd[i]

It's reasonable to expect that the Go assembler would use the same
argument ordering for both of these instructions. It currently does
not.

We fix the issue by switching the argument ordering for the vector
integer multiply add instructions to match those of the vector FMA
instructions.

Change-Id: Ib98e9999617f991969e5c831734b3bb3324439f6
Reviewed-on: https://go-review.googlesource.com/c/go/+/670335
Reviewed-by: Carlos Amedee <carlos@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2025-05-21 07:19:19 -07:00
Damien Neil 0375edd901 os: skip TestOpenFileCreateExclDanglingSymlink when no symlinks
Skip this test on plan9, and any other platform that doesn't
have symlinks.

Fixes #73729

Change-Id: I8052db24ed54c3361530bd4f54c96c9d10c4714c
Reviewed-on: https://go-review.googlesource.com/c/go/+/674697
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Alan Donovan <adonovan@google.com>
Commit-Queue: Alan Donovan <adonovan@google.com>
Reviewed-by: Richard Miller <millerresearch@gmail.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
2025-05-21 06:56:39 -07:00
Guoqi Chen 7f806c1052 runtime, internal/fuzz: optimize build tag combination on loong64
Change-Id: I971b789beb08e0c6b11169fd5547a8d4ab74fab5
Reviewed-on: https://go-review.googlesource.com/c/go/+/668155
Reviewed-by: Junyang Shao <shaojunyang@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Jorropo <jorropo.pgm@gmail.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2025-05-21 00:07:41 -07:00
Xiaolin Zhao 5b17e2f927 crypto/subtle: optimize function xorBytes using SIMD on loong64
On the Loongson-3A6000-HV and Loongson-3A5000, there has been
a significant improvement in all performance metrics except
for '8Bytes', which has experienced a decline, as follows.

goos: linux
goarch: loong64
pkg: crypto/subtle
cpu: Loongson-3A6000-HV @ 2500.00MHz
                                   |  bench.old   |              bench.new              |
                                   |    sec/op    |   sec/op     vs base                |
XORBytes/8Bytes                       7.282n ± 0%   8.805n ± 0%  +20.91% (p=0.000 n=10)
XORBytes/128Bytes                     14.43n ± 0%   10.01n ± 0%  -30.63% (p=0.000 n=10)
XORBytes/2048Bytes                   110.60n ± 0%   46.57n ± 0%  -57.89% (p=0.000 n=10)
XORBytes/8192Bytes                    418.7n ± 0%   161.8n ± 0%  -61.36% (p=0.000 n=10)
XORBytes/32768Bytes                   3.220µ ± 0%   1.673µ ± 0%  -48.04% (p=0.000 n=10)
XORBytesAlignment/8Bytes0Offset       7.621n ± 0%   9.305n ± 0%  +22.10% (p=0.000 n=10)
XORBytesAlignment/8Bytes1Offset       7.621n ± 0%   9.305n ± 0%  +22.10% (p=0.000 n=10)
XORBytesAlignment/8Bytes2Offset       7.621n ± 0%   9.305n ± 0%  +22.10% (p=0.000 n=10)
XORBytesAlignment/8Bytes3Offset       7.621n ± 0%   9.305n ± 0%  +22.10% (p=0.000 n=10)
XORBytesAlignment/8Bytes4Offset       7.621n ± 0%   9.305n ± 0%  +22.10% (p=0.000 n=10)
XORBytesAlignment/8Bytes5Offset       7.621n ± 0%   9.305n ± 0%  +22.10% (p=0.000 n=10)
XORBytesAlignment/8Bytes6Offset       7.621n ± 0%   9.305n ± 0%  +22.10% (p=0.000 n=10)
XORBytesAlignment/8Bytes7Offset       7.621n ± 0%   9.305n ± 0%  +22.10% (p=0.000 n=10)
XORBytesAlignment/128Bytes0Offset    14.430n ± 0%   9.973n ± 0%  -30.88% (p=0.000 n=10)
XORBytesAlignment/128Bytes1Offset     20.83n ± 0%   11.03n ± 0%  -47.05% (p=0.000 n=10)
XORBytesAlignment/128Bytes2Offset     20.83n ± 0%   11.03n ± 0%  -47.07% (p=0.000 n=10)
XORBytesAlignment/128Bytes3Offset     20.83n ± 0%   11.03n ± 0%  -47.07% (p=0.000 n=10)
XORBytesAlignment/128Bytes4Offset     20.83n ± 0%   11.03n ± 0%  -47.05% (p=0.000 n=10)
XORBytesAlignment/128Bytes5Offset     20.83n ± 0%   11.03n ± 0%  -47.05% (p=0.000 n=10)
XORBytesAlignment/128Bytes6Offset     20.83n ± 0%   11.03n ± 0%  -47.05% (p=0.000 n=10)
XORBytesAlignment/128Bytes7Offset     20.83n ± 0%   11.03n ± 0%  -47.05% (p=0.000 n=10)
XORBytesAlignment/2048Bytes0Offset   110.60n ± 0%   46.82n ± 0%  -57.67% (p=0.000 n=10)
XORBytesAlignment/2048Bytes1Offset    234.4n ± 0%   109.3n ± 0%  -53.37% (p=0.000 n=10)
XORBytesAlignment/2048Bytes2Offset    234.4n ± 0%   109.3n ± 0%  -53.37% (p=0.000 n=10)
XORBytesAlignment/2048Bytes3Offset    234.4n ± 0%   109.3n ± 0%  -53.37% (p=0.000 n=10)
XORBytesAlignment/2048Bytes4Offset    234.5n ± 0%   109.3n ± 0%  -53.39% (p=0.000 n=10)
XORBytesAlignment/2048Bytes5Offset    234.4n ± 0%   109.3n ± 0%  -53.37% (p=0.000 n=10)
XORBytesAlignment/2048Bytes6Offset    234.4n ± 0%   109.3n ± 0%  -53.37% (p=0.000 n=10)
XORBytesAlignment/2048Bytes7Offset    234.5n ± 0%   109.3n ± 0%  -53.39% (p=0.000 n=10)
geomean                               39.42n        26.00n       -34.05%

goos: linux
goarch: loong64
pkg: crypto/subtle
cpu: Loongson-3A5000 @ 2500.00MHz
                                   |  bench.old   |              bench.new              |
                                   |    sec/op    |   sec/op     vs base                |
XORBytes/8Bytes                       11.21n ± 0%   12.41n ± 1%  +10.70% (p=0.000 n=10)
XORBytes/128Bytes                     18.22n ± 0%   13.61n ± 0%  -25.30% (p=0.000 n=10)
XORBytes/2048Bytes                   162.20n ± 0%   48.46n ± 0%  -70.13% (p=0.000 n=10)
XORBytes/8192Bytes                    629.8n ± 0%   163.8n ± 0%  -73.99% (p=0.000 n=10)
XORBytes/32768Bytes                  4731.0n ± 1%   632.8n ± 0%  -86.63% (p=0.000 n=10)
XORBytesAlignment/8Bytes0Offset       11.61n ± 1%   12.42n ± 0%   +6.98% (p=0.000 n=10)
XORBytesAlignment/8Bytes1Offset       11.61n ± 0%   12.41n ± 0%   +6.89% (p=0.000 n=10)
XORBytesAlignment/8Bytes2Offset       11.61n ± 0%   12.42n ± 0%   +6.98% (p=0.000 n=10)
XORBytesAlignment/8Bytes3Offset       11.61n ± 0%   12.41n ± 0%   +6.89% (p=0.000 n=10)
XORBytesAlignment/8Bytes4Offset       11.61n ± 0%   12.42n ± 0%   +6.98% (p=0.000 n=10)
XORBytesAlignment/8Bytes5Offset       11.61n ± 0%   12.41n ± 0%   +6.89% (p=0.000 n=10)
XORBytesAlignment/8Bytes6Offset       11.61n ± 0%   12.41n ± 1%   +6.89% (p=0.000 n=10)
XORBytesAlignment/8Bytes7Offset       11.61n ± 0%   12.42n ± 0%   +6.98% (p=0.000 n=10)
XORBytesAlignment/128Bytes0Offset     17.82n ± 0%   13.62n ± 0%  -23.57% (p=0.000 n=10)
XORBytesAlignment/128Bytes1Offset     26.62n ± 0%   18.43n ± 0%  -30.78% (p=0.000 n=10)
XORBytesAlignment/128Bytes2Offset     26.64n ± 0%   18.43n ± 0%  -30.85% (p=0.000 n=10)
XORBytesAlignment/128Bytes3Offset     26.65n ± 0%   18.42n ± 0%  -30.90% (p=0.000 n=10)
XORBytesAlignment/128Bytes4Offset     26.65n ± 0%   18.42n ± 0%  -30.88% (p=0.000 n=10)
XORBytesAlignment/128Bytes5Offset     26.62n ± 0%   18.42n ± 0%  -30.82% (p=0.000 n=10)
XORBytesAlignment/128Bytes6Offset     26.63n ± 0%   18.42n ± 0%  -30.84% (p=0.000 n=10)
XORBytesAlignment/128Bytes7Offset     26.64n ± 0%   18.42n ± 0%  -30.86% (p=0.000 n=10)
XORBytesAlignment/2048Bytes0Offset   161.80n ± 0%   48.25n ± 0%  -70.18% (p=0.000 n=10)
XORBytesAlignment/2048Bytes1Offset    354.6n ± 0%   189.2n ± 0%  -46.64% (p=0.000 n=10)
XORBytesAlignment/2048Bytes2Offset    354.6n ± 0%   189.2n ± 0%  -46.64% (p=0.000 n=10)
XORBytesAlignment/2048Bytes3Offset    354.7n ± 0%   189.2n ± 0%  -46.66% (p=0.000 n=10)
XORBytesAlignment/2048Bytes4Offset    354.7n ± 0%   189.2n ± 1%  -46.66% (p=0.000 n=10)
XORBytesAlignment/2048Bytes5Offset    354.7n ± 0%   189.2n ± 0%  -46.66% (p=0.000 n=10)
XORBytesAlignment/2048Bytes6Offset    354.7n ± 0%   189.2n ± 0%  -46.66% (p=0.000 n=10)
XORBytesAlignment/2048Bytes7Offset    354.8n ± 0%   189.2n ± 0%  -46.67% (p=0.000 n=10)
geomean                               56.46n        36.46n       -35.42%

Change-Id: I66e150b132517e9ff4827abf796812ffe608c052
Reviewed-on: https://go-review.googlesource.com/c/go/+/673355
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-05-20 20:28:34 -07:00
limeidan a2eb643cbf cmd/dist, internal/platform: enable internal linking feature and test on loong64
Change-Id: Ifea676e9eb44281465832fc4050f6286e50f4543
Reviewed-on: https://go-review.googlesource.com/c/go/+/533717
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
2025-05-20 20:28:18 -07:00
Xiaolin Zhao d37a1bdd48 cmd/compile: fix the implementation of NORconst on loong64
In the loong64 instruction set, there is no NORI instruction,
so the immediate value in NORconst need to be stored in register
and then use the three-register NOR instruction.

Change-Id: I5ef697450619317218cb3ef47fc07e238bdc2139
Reviewed-on: https://go-review.googlesource.com/c/go/+/673836
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-20 20:24:09 -07:00
thepudds 74304cda29 cmd/compile/internal/escape: improve order of work to speed up analyzing many locations
For the package github.com/microsoft/typescript-go/internal/checker,
compilation currently spends most of its time in escape analysis.

Here, we re-order work to be more efficient when analyzing many
locations, and delay visiting some locations to prioritize locations
that might be more likely to reach a terminal point of reaching the
heap and possibly reduce the count of intermediate states for each location.

Action graph reported build times show roughly a 5x improvement for
compilation of the typescript-go/internal/checker package:

  go1.24.0:      91.792s
  cl-657179-ps1: 17.578s

with timing via:

  go build -a -debug-actiongraph=/tmp/actiongraph-cl-657179-ps1 -v github.com/microsoft/typescript-go/internal/checker

There are some additional adjustments to make here, including we can
consider a follow-on CL I have that parallelizes the operations of the
core loop, but this seems to be a nice win as is, and my understanding
is the desire is to merge this as it stands.

Updates #72815

Change-Id: I1753c5354b495b059f68fb97f3103ee7834f9eee
Reviewed-on: https://go-review.googlesource.com/c/go/+/657179
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-20 20:11:56 -07:00
khr@golang.org a070533633 reflect: turn off allocation test if instrumentation is on
Help fix the asan builders.

Change-Id: I980f5171519643c3543bdefc6ea46fd0fca17c28
Reviewed-on: https://go-review.googlesource.com/c/go/+/674616
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
2025-05-20 17:27:54 -07:00
khr@golang.org 4cdca1342b runtime: disable stack allocation test when instrumentation is on
Should fix some asan build failures.

Change-Id: Ic0a816b56a1a278aa0ad541aea962f9fea7b10fc
Reviewed-on: https://go-review.googlesource.com/c/go/+/674696
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2025-05-20 17:00:09 -07:00
Filippo Valsorda fccac5fe98 crypto/ecdsa,crypto/ed25519: cache FIPS private keys
All private keys need to go through a slow PCT in FIPS-140 mode.

ECDH and RSA keys have places to hide a precomputed value without
causing races, but Ed25519 and ECDSA keys might be constructed by the
application and then used with concurrent Sign calls.

For these, implement an equivalent to crypto/internal/boring/bcache
using weak.Pointer and runtime.AddCleanup.

fips140: latest
goos: linux
goarch: amd64
pkg: crypto/ed25519
cpu: AMD Ryzen 7 PRO 8700GE w/ Radeon 780M Graphics
           │ 1a93e4a2cf  │             78a819ea78             │
           │   sec/op    │   sec/op     vs base               │
Signing-16   72.72µ ± 0%   16.93µ ± 1%  -76.72% (p=0.002 n=6)

fips140: off
goos: linux
goarch: amd64
pkg: crypto/ed25519
cpu: AMD Ryzen 7 PRO 8700GE w/ Radeon 780M Graphics
           │ 310bad31e5  │         310bad31e5-dirty          │
           │   sec/op    │   sec/op     vs base              │
Signing-16   17.18µ ± 1%   16.95µ ± 1%  -1.36% (p=0.002 n=6)

fips140: latest
goos: linux
goarch: amd64
pkg: crypto/ecdsa
cpu: AMD Ryzen 7 PRO 8700GE w/ Radeon 780M Graphics
             │  1a93e4a2cf  │             78a819ea78             │
             │    sec/op    │   sec/op     vs base               │
Sign/P256-16    90.97µ ± 0%   21.04µ ± 0%  -76.87% (p=0.002 n=6)
Sign/P384-16    701.6µ ± 1%   142.0µ ± 0%  -79.75% (p=0.002 n=6)
Sign/P521-16   2943.5µ ± 1%   491.9µ ± 0%  -83.29% (p=0.002 n=6)

fips140: off
goos: linux
goarch: amd64
pkg: crypto/ecdsa
cpu: AMD Ryzen 7 PRO 8700GE w/ Radeon 780M Graphics
             │ 1a93e4a2cf  │             78a819ea78             │
             │   sec/op    │   sec/op     vs base               │
Sign/P256-16   21.27µ ± 0%   21.13µ ± 0%   -0.65% (p=0.002 n=6)
Sign/P384-16   143.3µ ± 0%   142.4µ ± 0%   -0.63% (p=0.009 n=6)
Sign/P521-16   525.3µ ± 0%   462.1µ ± 0%  -12.04% (p=0.002 n=6)

This unavoidably introduces allocations in the very first use of Ed25519
private keys, but usually that's not in the hot path.

Change-Id: I6a6a465640a5dff64edd73ee5dda5f2ad1b476b9
Reviewed-on: https://go-review.googlesource.com/c/go/+/654096
Auto-Submit: Filippo Valsorda <filippo@golang.org>
Reviewed-by: Daniel McCarney <daniel@binaryparadox.net>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Roland Shoemaker <roland@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-20 16:33:12 -07:00
Damien Neil 0afcf9192c runtime: record synctest bubble ownership in hchan
Replace the hchan.synctest bool with an hchan.bubble reference
to the synctest bubble that created the chan. I originally used
a bool to avoid increasing the size of hchan, but we have space
in hchan's current size class for another pointer.

This lets us detect one bubble operating on a chan created
in a different bubble.

For #67434

Change-Id: If6cf9ffcb372fe7fb3f8f4ef27b664848578ba5c
Reviewed-on: https://go-review.googlesource.com/c/go/+/674515
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Damien Neil <dneil@google.com>
2025-05-20 15:57:34 -07:00
Damien Neil 68bc0d84e9 encoding/json: avoid supurious synctest deadlock detection
Use a sync.OnceValue rather than a sync.WaitGroup to
coordinate access to encoderCache entries.

The OnceValue better expresses the intent of the code
(we want to initialize the cache entry only once).

However, the motivation for this change is to avoid
testing/synctest incorrectly reporting a deadlock
when multiple bubbles call Marshal at the same time.
Goroutines blocked on WaitGroup.Wait are "durably blocked",
causing confusion when a goroutine in one bubble Waits
for a goroutine in a different bubble. Goroutines blocked
on OnceValue are not durably blocked, avoiding the problem.

Fixes #73733
For #67434

Change-Id: I81cddda80af67cf5c280fd4327620bc37e7a6fe6
Reviewed-on: https://go-review.googlesource.com/c/go/+/673335
Auto-Submit: Damien Neil <dneil@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-20 15:46:07 -07:00
Damien Neil 49a660e22c testing/synctest: add Test
Add a synctest.Test function, superseding the experimental
synctest.Run function. Promote the testing/synctest package
out of experimental status.

For #67434
For #73567

Change-Id: I3c5ba030860d90fe2ddb517a2f3536efd60181a9
Reviewed-on: https://go-review.googlesource.com/c/go/+/671961
Auto-Submit: Damien Neil <dneil@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2025-05-20 15:46:03 -07:00
Michael Matloob 609197b406 cmd/doc: use golang.org/x/pkgsite/cmd/internal/doc to start server
This change switches the pkgsite command invoked to start a pkgsite
server from golang.org/x/pkgsite/cmd/pkgsite to
golang.org/x/pkgsite/cmd/internal/doc. The doc command is a simplified
version of cmd/pkgsite that changes some options to improve the user
experience. For example, it limits logging informational log messages,
doesn't always expect to find modules (for example if we're outside of a
module getting documentation for the standard library), and it takes the
address of the page to open in the browser (which simplifies waiting for
the server to start listening).

Fixes #68106

Change-Id: I667a49d03823242fa1aff333ecb1c0f198e92412
Reviewed-on: https://go-review.googlesource.com/c/go/+/674158
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Jonathan Amsterdam <jba@google.com>
Reviewed-by: David Chase <drchase@google.com>
2025-05-20 15:06:58 -07:00
Michael Matloob 546761aff4 cmd/doc: use go list to determine import path if it's missing
cmd/doc uses go/build to get information about the packages it's
documenting. In some cases, go/build can return a build.Package that it
couldn't determine an import path for, in which case it sets the import
path to ".". This can happen for relative package paths in in a module:
for relative package paths we don't use the go command to get
information about the module and just open the source files directly
instead, and will be missing the import path. This is usually okay
because go doc doesn't need to print the import path of the package it's
documenting, but for go doc -http, we want to know the import path so we
can open the right page in the browser.

For #68106

Change-Id: Ifba92862ad01d8d63f531c2451f18db2b0d7a3e5
Reviewed-on: https://go-review.googlesource.com/c/go/+/674556
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Jonathan Amsterdam <jba@google.com>
Reviewed-by: Michael Matloob <matloob@google.com>
2025-05-20 13:30:32 -07:00
Michael Matloob 1972493904 cmd/doc: show page for the requested object
This fixes a bug where we start pkgsite for every requested object,
rather than the one that we would have printed the documentation for.
To make things simple, we'll run the logic that prints the
documentation, but with an io.Discard writer. Then we can tell if the
documentation was found based on the return values of those functions.

For #68106

Change-Id: Ibf2ab1720f381d7214fc9239b9c2e915c91f7f7b
Reviewed-on: https://go-review.googlesource.com/c/go/+/674555
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Matloob <matloob@google.com>
Reviewed-by: Jonathan Amsterdam <jba@google.com>
2025-05-20 13:30:28 -07:00
Junyang Shao 113b25774e cmd/compile: memcombine different size stores
This CL implements the TODO in combineStores to allow combining
stores of different sizes, as long as the total size aligns to
2, 4, 8.

Fixes #72832.

Change-Id: I6d1d471335da90d851ad8f3b5a0cf10bdcfa17c4
Reviewed-on: https://go-review.googlesource.com/c/go/+/661855
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-20 13:00:16 -07:00
Mark Freeman fa42585dad internal/pkgbits: rename RelocEnt to RefTableEntry
Change-Id: I9b1c9a0499ad3444e8cb3e4be187f9fab816c90c
Reviewed-on: https://go-review.googlesource.com/c/go/+/674159
Reviewed-by: Robert Griesemer <gri@google.com>
Auto-Submit: Mark Freeman <mark@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-20 12:55:36 -07:00
Mark Freeman 96d2211c61 cmd/compile/internal/noder: mark Ref[T] as a primitive
Like Sync, Ref[T] is also used to define things like StringRef.

Change-Id: I9e10234504ee4dd03907bb058a6f3ae7e6a287ca
Reviewed-on: https://go-review.googlesource.com/c/go/+/674157
Reviewed-by: Robert Griesemer <gri@google.com>
TryBot-Bypass: Mark Freeman <mark@golang.org>
Auto-Submit: Mark Freeman <mark@golang.org>
2025-05-20 12:52:51 -07:00
Mateusz Poliwczak 2541a68a70 reflect: add TypeAssert[T]
This implementation is zero-alloc when T is a concrete type,
allocates when val contains a method or when T is a interface
and Value was obtained for example through Elem(), in which case
it has to be allocated to avoid sharing the same memory.

goos: linux
goarch: amd64
pkg: reflect
cpu: AMD Ryzen 5 4600G with Radeon Graphics
                                                                         │ /tmp/bench2 │
                                                                         │   sec/op    │
TypeAssert/TypeAssert[int](int)-12                                         2.725n ± 1%
TypeAssert/TypeAssert[uint8](int)-12                                       2.599n ± 1%
TypeAssert/TypeAssert[fmt.Stringer](reflect_test.testTypeWithMethod)-12    8.470n ± 0%
TypeAssert/TypeAssert[fmt.Stringer](*reflect_test.testTypeWithMethod)-12   8.460n ± 1%
TypeAssert/TypeAssert[interface_{}](int)-12                                4.181n ± 1%
TypeAssert/TypeAssert[interface_{}](reflect_test.testTypeWithMethod)-12    4.178n ± 1%
TypeAssert/TypeAssert[time.Time](time.Time)-12                             2.839n ± 0%
TypeAssert/TypeAssert[func()_string](func()_string)-12                     151.1n ± 1%
geomean                                                                    6.645n

                                                                         │ /tmp/bench2  │
                                                                         │     B/op     │
TypeAssert/TypeAssert[int](int)-12                                         0.000 ± 0%
TypeAssert/TypeAssert[uint8](int)-12                                       0.000 ± 0%
TypeAssert/TypeAssert[fmt.Stringer](reflect_test.testTypeWithMethod)-12    0.000 ± 0%
TypeAssert/TypeAssert[fmt.Stringer](*reflect_test.testTypeWithMethod)-12   0.000 ± 0%
TypeAssert/TypeAssert[interface_{}](int)-12                                0.000 ± 0%
TypeAssert/TypeAssert[interface_{}](reflect_test.testTypeWithMethod)-12    0.000 ± 0%
TypeAssert/TypeAssert[time.Time](time.Time)-12                             0.000 ± 0%
TypeAssert/TypeAssert[func()_string](func()_string)-12                     72.00 ± 0%
geomean                                                                               ¹

Fixes #62121

Change-Id: I0911c70c5966672c930d387438643f94a40441c4
GitHub-Last-Rev: ce89a53097
GitHub-Pull-Request: golang/go#71639
Reviewed-on: https://go-review.googlesource.com/c/go/+/648056
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2025-05-20 12:40:33 -07:00
Damien Neil d596bc0e81 runtime: disallow closing bubbled chans from outside bubble
A chan created within a synctest bubble may not be
operated on from outside the bubble.
We panicked on send and receive, but not close.
Panic on close as well.

For #67434

Change-Id: I98d39e0cf7baa1a679aca1fb325453d69c535308
Reviewed-on: https://go-review.googlesource.com/c/go/+/671960
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Damien Neil <dneil@google.com>
2025-05-20 12:36:34 -07:00
Damien Neil b7382cc1f0 runtime: print blocking status of bubbled goroutines in stacks
For goroutines in a synctest bubble, include whether the goroutine
is "durably blocked" or not in the goroutine status.

Synctest categorizes goroutines in certain states as "durably"
blocked, where the goroutine is not merely idle but can only
be awoken by another goroutine in its bubble. To make it easier
for users to understand why a bubble is or is not idle,
print the state of each bubbled goroutine.

For example:

  goroutine 36 [chan receive, synctest bubble 34, not durably blocked]:
  goroutine 37 [chan receive (synctest), synctest bubble 34, durably blocked]:

Goroutine 36 is receiving from a channel created outside its bubble.
Goroutine 36 is receiving from a channel created inside its bubble.

For #67434

Change-Id: I006b656a9ce7eeb75b2be21e748440a5dd57ceb0
Reviewed-on: https://go-review.googlesource.com/c/go/+/670976
Auto-Submit: Damien Neil <dneil@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2025-05-20 12:36:29 -07:00
Michael Anthony Knyszek 5b0b4c01ba runtime: add package doc for checkfinalizer mode
Fixes #72949.

Change-Id: I114eda73c57bc7d596eb1656e738b80c1cbe5254
Reviewed-on: https://go-review.googlesource.com/c/go/+/662039
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-20 12:05:59 -07:00
Michael Anthony Knyszek 0d42cebacd runtime: report finalizer and cleanup queue length with checkfinalizer>0
This change adds tracking for approximate finalizer and cleanup queue
lengths. These lengths are reported once every GC cycle as a single line
printed to stderr when GODEBUG=checkfinalizer>0.

This change lays the groundwork for runtime/metrics metrics to produce
the same values.

For #72948.
For #72950.

Change-Id: I081721238a0fc4c7e5bee2dbaba6cfb4120d1a33
Reviewed-on: https://go-review.googlesource.com/c/go/+/671437
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-20 12:04:52 -07:00
Michael Pratt 2aac5a5cba runtime: skip testprogcgo tests in race mode on freebsd
These were just enabled by https://go.dev/cl/643897, but freebsd
unfortunately doesn't seem to support cgo + race mode by default.

For #73788.

Cq-Include-Trybots: luci.golang.try:gotip-freebsd-amd64-race
Change-Id: I6a6a636c06176ca746548d0588283b1429d7c6d5
Reviewed-on: https://go-review.googlesource.com/c/go/+/674160
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
2025-05-20 11:31:20 -07:00
Jake Bailey ca3b474702 unique: add alloc test for Make
This will be useful to show how the next CLs improve things.

Change-Id: I49a691295c1fe3c7455a67c7d19e5c03979f714a
Reviewed-on: https://go-review.googlesource.com/c/go/+/673015
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-20 11:16:49 -07:00
Michael Anthony Knyszek eec8dd0836 runtime: add scan trace for checkfinalizers>1
This change dumps a scan trace (each pointer marked and where it came
from) for the partial GC cycle performed by checkfinalizers mode when
checkfinalizers>1. This is useful for quickly understanding why certain
values are reachable without having to pull out tools like viewcore.

For #72949.

Change-Id: Ic583f80e9558cdfe1c667d27a1d975008dd39a9c
Reviewed-on: https://go-review.googlesource.com/c/go/+/662038
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-20 11:16:40 -07:00
MaineK00n 89af77deef internal/filepathlite: fix comment
fix typo

Change-Id: I46f0b052615d388a852439e63542b43e2ca62b7e
GitHub-Last-Rev: 96ac66c036
GitHub-Pull-Request: golang/go#73725
Reviewed-on: https://go-review.googlesource.com/c/go/+/672955
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Sean Liao <sean@liao.dev>
Reviewed-by: Sean Liao <sean@liao.dev>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-20 11:15:21 -07:00
Michael Anthony Knyszek c58f58b9f8 runtime: mark and identify tiny blocks in checkfinalizers mode
This change adds support for identifying cleanups and finalizers
attached to tiny blocks to checkfinalizers mode. It also notes a subtle
pitfall, which is that the cleanup arg, if tiny-allocated, could end up
co-located with the object with the cleanup attached! Oops...

For #72949.

Change-Id: Icbe0112f7dcfc63f35c66cf713216796a70121ce
Reviewed-on: https://go-review.googlesource.com/c/go/+/662037
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
2025-05-20 11:15:12 -07:00
Michael Anthony Knyszek 913c069819 runtime: annotate checkfinalizers reports with source and type info
This change adds a new special kind called CheckFinalizer which is used
to annotate finalizers and cleanups with extra information about where
that cleanup or finalizer came from.

For #72949.

Change-Id: I3c1ace7bd580293961b7f0ea30345a6ce956d340
Reviewed-on: https://go-review.googlesource.com/c/go/+/662135
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-20 11:13:54 -07:00
Julian Zhu dfebef1c04 cmd/compile: fold negation into addition/subtraction on arm64
Fold negation into addition/subtraction and avoid double negation.

platform: linux/arm64

file      before    after     Δ       %
addr2line 3628108   3628116   +8      +0.000%
asm       6208353   6207857   -496    -0.008%
buildid   3460682   3460418   -264    -0.008%
cgo       5572988   5572492   -496    -0.009%
compile   26042159  26041039  -1120   -0.004%
cover     6304328   6303472   -856    -0.014%
dist      4139330   4139098   -232    -0.006%
doc       9429305   9428065   -1240   -0.013%
fix       3997189   3996733   -456    -0.011%
link      8212128   8210280   -1848   -0.023%
nm        3620056   3619696   -360    -0.010%
objdump   5920289   5919233   -1056   -0.018%
pack      2892250   2891778   -472    -0.016%
pprof     17094569  17092745  -1824   -0.011%
test2json 3335825   3335529   -296    -0.009%
trace     15842080  15841456  -624    -0.004%
vet       9472194   9471106   -1088   -0.011%
go        19081541  19081509  -32     -0.000%
total     154253374 154240622 -12752  -0.008%

platform: darwin/arm64

file    before    after     Δ       %
compile 27152002  27135490  -16512  -0.061%
link    8372914   8356402   -16512  -0.197%
go      19154802  19154778  -24     -0.000%
total   157734180 157701132 -33048  -0.021%

Change-Id: I15a349bfbaf7333ec3e4a62ae4d06f3f371dfb1d
Reviewed-on: https://go-review.googlesource.com/c/go/+/673715
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-20 11:08:28 -07:00
Shibi J M be0cc937ec net: avoid using Windows' TransmitFile on non-server machines
Windows API's TransmitFile function is limited to two concurrent
operations on workstation and client versions of Windows. This change
modifies the net.sendFile function to perform no work in such cases
so that TransmitFile is avoided.

Fixes #73746

Change-Id: Iba70d5d2758bf986e80c78254c8e9e10b39bb368
GitHub-Last-Rev: 315ddc0cd8
GitHub-Pull-Request: golang/go#73758
Reviewed-on: https://go-review.googlesource.com/c/go/+/673855
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
Auto-Submit: Damien Neil <dneil@google.com>
Reviewed-by: Quim Muntal <quimmuntal@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Damien Neil <dneil@google.com>
2025-05-20 11:06:59 -07:00
Michael Matloob 0c7311e9ca cmd/go: do not try to load 'all' packages with invalid import paths
Before this change, when we tried to compute the set of packages in
'all', we'd add packages with invalid import paths to the set and try to
load them, which would fail. Instead, do not add them to the list of
packages to load in the second iteration of the loader. We'll still
return errors for invalid imports in the importing packages.
Change-Id: I682229011f555ed1d0c827f79100c1c43bf7f93a
Reviewed-on: https://go-review.googlesource.com/c/go/+/673655
Reviewed-by: Michael Matloob <matloob@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-20 10:51:31 -07:00
Mateusz Poliwczak 3e82316a43 cmd/compile: don't instrument counter globals in internal/fuzz
Fixes: #72766

Change-Id: I45b521e53c2a11e259dc99e2dfc8e40cac39139a
Reviewed-on: https://go-review.googlesource.com/c/go/+/673575
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2025-05-20 10:47:59 -07:00
Julian Zhu 123141166b cmd/compile: add generic simplifications on riscv64
file      before    after     Δ       %
addr2line 3636263   3636215   -48     -0.001%
asm       6318110   6317966   -144    -0.002%
buildid   3463352   3463224   -128    -0.004%
cgo       5672502   5672214   -288    -0.005%
compile   26904997  26905719  +722    +0.003%
cover     6405603   6405467   -136    -0.002%
dist      4092630   4092494   -136    -0.003%
doc       9728281   9723977   -4304   -0.044%
fix       4014891   4014835   -56     -0.001%
link      8327674   8327426   -248    -0.003%
nm        3628718   3628494   -224    -0.006%
objdump   5951778   5951626   -152    -0.003%
pack      2896080   2896040   -40     -0.001%
pprof     17596796  17591908  -4888   -0.028%
test2json 3346622   3346566   -56     -0.002%
trace     16179738  16175706  -4032   -0.025%
vet       9603472   9603264   -208    -0.002%
total     156070021 156055655 -14366  -0.009%

Change-Id: Ie4a79a3c410eb79155ce2418ae64fa670d1ccd53
Reviewed-on: https://go-review.googlesource.com/c/go/+/673477
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2025-05-20 10:39:07 -07:00
Michael Anthony Knyszek 3df078fc74 runtime: add new GODEBUG checkfinalizer
This new debug mode detects cleanup/finalizer leaks using checkmark
mode. It runs a partial GC using only specials as roots. If the GC can
find a path from one of these roots back to the object the special is
attached to, then the object might never be reclaimed. (The cycle could
be broken in the future, but it's almost certainly a bug.)

This debug mode is very barebones. It contains no type information and
no stack location for where the finalizer or cleanup was created.

For #72949.

Change-Id: Ibffd64c1380b51f281950e4cfe61f677385d42a5
Reviewed-on: https://go-review.googlesource.com/c/go/+/634599
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
2025-05-20 10:37:12 -07:00
Jake Bailey 2a65100e68 cmd/internal/testdir: filter out errors outside input file set
When an errorcheck test uses -m and instantiates an imported generic
function, the errors will include -m messages from the imported package
(since the new function has not previously been walked). These errors
cannot be matched since we can't write errors in files outside the test
input.

To fix this (and enable the other CLs in this stack), drop any unmatched
errors that occur in files outside those in the input set.

Change-Id: I2fcf0dd4693125d2e5823ea4437011730d8b1b1f
Reviewed-on: https://go-review.googlesource.com/c/go/+/672515
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
2025-05-20 10:15:47 -07:00
khr@golang.org ff9da9bcd5 cmd/dist: pass GO_GCFLAGS to cpuN runtime tests
We want gcflags, which control builder type (e.g. noopt) to be used
for these tests also.

Should fix noopt and maybe other builders.

Change-Id: Iad34beab51714f0c38989ec0fc8778cf79087f72
Reviewed-on: https://go-review.googlesource.com/c/go/+/674455
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
2025-05-20 09:45:30 -07:00
thepudds 326e5e1b7a cmd/compile/internal/escape: additional constant and zero value tests and logging
This adds additional logging for the work that walk does to reduce
how often an interface conversion results in an allocation.

Also, as part of #71359, we will be updating how escape analysis and
walk handle basic literals, composite literals, and zero values,
so add some tests that uses this new logging.

By the end of our CL stack, we address all of these tests.

Updates #71359

Change-Id: I43fde8343d9aacaec1e05360417908014a86c8bd
Reviewed-on: https://go-review.googlesource.com/c/go/+/649076
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-20 09:19:42 -07:00
Vladislav Yarmak 1635aed941 hash/maphash: hash channels in purego version of maphash.Comparable
This change makes purego implementation of maphash.Comparable consistent
with the one in runtime and fixes hashing of channels.

Fixes #73657

Change-Id: If78a21d996f0c20c0224d4014e4a4177b09c3aa3
GitHub-Last-Rev: 2537216a1e
GitHub-Pull-Request: golang/go#73660
Reviewed-on: https://go-review.googlesource.com/c/go/+/671655
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: qiu laidongfeng2 <2645477756@qq.com>
2025-05-20 08:41:59 -07:00
Marc-Antoine Ruel b69f50faef net/http: upon http redirect, copy Request.GetBody in new request
This enable http.RoundTripper implementation to retry POST request (let's
say after a 500) after a 307/308 redirect.

Fixes #73439

Change-Id: I4365ff58b012c7f0d60e0317a08c98b1d48f657e
Reviewed-on: https://go-review.googlesource.com/c/go/+/666735
Reviewed-by: Sean Liao <sean@liao.dev>
Auto-Submit: Damien Neil <dneil@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Damien Neil <dneil@google.com>
2025-05-20 08:40:50 -07:00
Michael Anthony Knyszek df9888ea4e runtime: prevent unnecessary zeroing of large objects with pointers
CL 614257 refactored mallocgc but lost an optimization: if a span for a
large object is already backed by memory fresh from the OS (and thus
zeroed), we don't need to zero it. CL 614257 unconditionally zeroed
spans for large objects that contain pointers.

This change restores the optimization from before CL 614257, which seems
to matter in some real-world programs.

While we're here, let's also fix a hole with the garbage collector being
able to observe uninitialized memory of the large object is observed
by the conservative scanner before being published. The gory details are
in a comment in heapSetTypeLarge. In short, this change makes
span.largeType an atomic variable, such that the GC can only observe
initialized memory if span.largeType != nil.

Fixes #72991.

Change-Id: I2048aeb220ab363d252ffda7d980b8788e9674dc
Reviewed-on: https://go-review.googlesource.com/c/go/+/659956
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
2025-05-20 08:39:41 -07:00
Michael Anthony Knyszek 24ea1aa25c runtime: only update freeIndexForScan outside of the mark phase
Currently, it's possible for asynchronous preemption to observe a
partially initialized object. The sequence of events goes like this:
- The GC is in the mark phase.
- Thread T1 is allocating object O1.
- Thread T1 zeroes the allocation, runs the publication barrier, and
  updates freeIndexForScan. It has not yet updated the mark bit on O1.
- Thread T2 is conservatively scanning some stack frame.
  That stack frame has a dead pointer with the same address as O1.
- T2 picks up the pointer, checks isFree (which checks
  freeIndexForScan without an import barrier), and sees that O1 is
  allocated. It marks and queues O1.
- T2 then goes to scan O1, and observes uninitialized memory.

Although a publication barrier was executed, T2 did not have an import
barrier. T2 may thus observe T1's writes to zero the object out-of-order
with the write to freeIndexForScan.

Normally this would be impossible if T2 got a pointer to O1 from
somewhere written by T1. The publication barrier guarantees that if the
read side is data-dependent on the write side then we'd necessarily
observe all writes to O1 before T1 published it. However, T2 got the
pointer 'out of thin air' by scanning a stack frame with a dead pointer
on it.

One fix to this problem would be to add the import barrier in the
conservative scanner. We would then also need to put freeIndexForScan
behind the publication barrier, or make the write to freeIndexForScan
exactly that barrier.

However, there's a simpler way. We don't actually care if conservative
scanning observes a stale freeIndexForScan during the mark phase.
Newly-allocated memory is always marked at the point of allocation (the
allocate-black policy part of the GC's design). So it doesn't actually
matter that if the garbage collector scans that memory or not.

This change modifies the allocator to only update freeIndexForScan
outside the mark phase. This means freeIndexForScan is essentially
a snapshot of freeindex at the point the mark phase started. Because
there's no more race between conservative scanning and newly-allocated
objects, the complicated scenario above is no longer a possibility.

One thing we do have to be careful of is other callers of isFree.
Previously freeIndexForScan would always track freeindex, now it no
longer does. This change thus introduces isFreeOrNewlyAllocated which is
used by the conservative scanner, and uses freeIndexForScan. Meanwhile
isFree goes back to using freeindex like it used to. This change also
documents the requirement on isFree that the caller must have obtained
the pointer not 'out of thin air' but after the object was published.
isFree is not currently used anywhere particularly sensitive (heap dump
and checkmark mode, where the world is stopped in both cases) so using
freeindex is both conceptually simple and also safe.

Change-Id: If66b8c536b775971203fb4358c17d711c2944723
Reviewed-on: https://go-review.googlesource.com/c/go/+/672340
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-20 08:39:26 -07:00
Sam Thanawalla 693d8d920c cmd/go: extend the ignore directive for indexed modules
For modules that have already been indexed, we can skip ignored paths.
We already skip 'testdata' and '_' for this case so we can extend the
ignore directive for this case as well.

Updates: #42965
Change-Id: I076a242ba65c7b905b9dc65dcfb0a0247cbd68d6
Reviewed-on: https://go-review.googlesource.com/c/go/+/674076
Reviewed-by: Michael Matloob <matloob@google.com>
Reviewed-by: Michael Matloob <matloob@golang.org>
Auto-Submit: Sam Thanawalla <samthanawalla@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-20 08:17:44 -07:00
David Finkel 7b91ec07eb cmd/go: add 2 scripts test for git sha256 fetching
Fast follow to golang.org/cl/636475 with a couple script tests that
build/runs a module that depends on a function inside a git repo using
sha256 hashes. (one with go get of a branch-name and the other
configuring go.mod directly)

Change-Id: Ief6c7efaf6d5c066dc54a3e4a63aad109f625abe
Reviewed-on: https://go-review.googlesource.com/c/go/+/672435
Reviewed-by: Michael Matloob <matloob@google.com>
Reviewed-by: Michael Matloob <matloob@golang.org>
Auto-Submit: Sam Thanawalla <samthanawalla@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Sam Thanawalla <samthanawalla@google.com>
2025-05-20 07:59:04 -07:00
thepudds 312ceba318 cmd/go/internal/modload: remove likely vestigial ability to infer module path from Godeps.json and vendor.json
CL 518776 deleted the cmd/go/internal/modconv package and dropped the
ability to import dependency requirements from ~nine or so legacy
pre-module dependency configuration files. Part of the rationale from
Russ in 2023 for dropping that support was that "by now no one is
running into those configs anymore during 'go mod init'".

For two of those legacy file formats, Godeps.json and vendor.json, the
ability to import their listed dependencies was dropped in CL 518776,
but what remained for those two formats was the ability to guess the
resulting module name in the absence of a name being supplied to 'go mod
init'.

This could be explained by the fact that this smaller functionality for
guessing a module name was separate, did not rely on the deleted modconv
package, and instead only relied on simple JSON parsing.

The name guessing was helpful as part of the transition when module
support was initially released, but it was never perfect, including the
various third-party dependency managers did not all have the same naming
rules that were enforced by modules.

In short, it is very unlikely anyone is relying on this now, so we
delete it.

This CL was spawned from discussion in two related documentation CLs
(CL 662675 and CL 662695).

Updates #71537

Change-Id: I9e087aa296580239562a0ecee58913c5edc533ee
Reviewed-on: https://go-review.googlesource.com/c/go/+/664315
Reviewed-by: Michael Matloob <matloob@google.com>
Reviewed-by: Michael Matloob <matloob@golang.org>
Reviewed-by: Sam Thanawalla <samthanawalla@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Sam Thanawalla <samthanawalla@google.com>
2025-05-20 07:58:55 -07:00
Mateusz Poliwczak 4f1146e661 testing: use a pattern to match the elapsed time in TestTRun
Fixes #73723
Fixes #73737
Fixes #73739

Change-Id: I1ebd3614614285c3e660d48241389bb0f896be23
Reviewed-on: https://go-review.googlesource.com/c/go/+/674355
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Alan Donovan <adonovan@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
2025-05-20 07:37:42 -07:00
Keith Randall c8bf388bad cmd/compile: align stack-allocated backing stores higher than required
Because that's what mallocgc did and some user code came to rely on it.

Fixes #73199

Change-Id: I45ca00d2ea448e6729ef9ac4cec3c1eb0ceccc89
Reviewed-on: https://go-review.googlesource.com/c/go/+/666116
Reviewed-by: t hepudds <thepudds1460@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2025-05-19 18:12:50 -07:00
khr@golang.org 524946d247 cmd/compile: don't preload registers if destination already scheduled
In regalloc, we allocate some values to registers before loop entry,
so that they don't need to be loaded (from spill locations) during
the loop.

But it is pointless if we've already regalloc'd the loop body.
Whatever restores we needed for the body are already generated.

It's not clear if this code is ever useful. No tests fail if I just
remove it. But at least this change is worthwhile. It doesn't help,
and it actively inserts more restores than we really need (mostly
because the desired register list is approximate - I have seen cases
where the loads implicated here end up being dead because the restores
hit the wrong registers and the edge shuffle pass knows it needs
the restores in different registers).

While we are here, might as well have layoutRegallocOrder return
the standard layout order instead of recomputing it.

Change-Id: Ia624d5121de59b6123492603695de50b272b277f
Reviewed-on: https://go-review.googlesource.com/c/go/+/672735
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
2025-05-19 17:13:21 -07:00
Keith Randall ce88e341b9 cmd/compile: allocate backing store for append on the stack
When appending, if the backing store doesn't escape and a
constant-sized backing store is big enough, use a constant-sized
stack-allocated backing store instead of allocating it from the heap.

cmd/go is <0.1% bigger.

As an example of how this helps, if you edit strings/strings.go:FieldsFunc
to replace
    spans := make([]span, 0, 32)
with
    var spans []span

then this CL removes the first 2 allocations that are part of the growth sequence:

                            │    base      │                 exp                  │
                            │  allocs/op   │  allocs/op   vs base                 │
FieldsFunc/ASCII/16-24         3.000 ± ∞ ¹   2.000 ± ∞ ¹  -33.33% (p=0.008 n=5)
FieldsFunc/ASCII/256-24        7.000 ± ∞ ¹   5.000 ± ∞ ¹  -28.57% (p=0.008 n=5)
FieldsFunc/ASCII/4096-24      11.000 ± ∞ ¹   9.000 ± ∞ ¹  -18.18% (p=0.008 n=5)
FieldsFunc/ASCII/65536-24      18.00 ± ∞ ¹   16.00 ± ∞ ¹  -11.11% (p=0.008 n=5)
FieldsFunc/ASCII/1048576-24    30.00 ± ∞ ¹   28.00 ± ∞ ¹   -6.67% (p=0.008 n=5)
FieldsFunc/Mixed/16-24         2.000 ± ∞ ¹   2.000 ± ∞ ¹        ~ (p=1.000 n=5)
FieldsFunc/Mixed/256-24        7.000 ± ∞ ¹   5.000 ± ∞ ¹  -28.57% (p=0.008 n=5)
FieldsFunc/Mixed/4096-24      11.000 ± ∞ ¹   9.000 ± ∞ ¹  -18.18% (p=0.008 n=5)
FieldsFunc/Mixed/65536-24      18.00 ± ∞ ¹   16.00 ± ∞ ¹  -11.11% (p=0.008 n=5)
FieldsFunc/Mixed/1048576-24    30.00 ± ∞ ¹   28.00 ± ∞ ¹   -6.67% (p=0.008 n=5)

(Of course, people have spotted and fixed a bunch of allocation sites
like this, but now we're ~automatically doing it everywhere going forward.)

No significant increases in frame sizes in cmd/go.

Change-Id: I301c4d9676667eacdae0058960321041d173751a
Reviewed-on: https://go-review.googlesource.com/c/go/+/664299
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
2025-05-19 16:14:53 -07:00
Keith Randall 3baf53aec6 cmd/compile: derive bounds on signed %N for N a power of 2
-N+1 <= x % N <= N-1

This is useful for cases like:

func setBit(b []byte, i int) {
    b[i/8] |= 1<<(i%8)
}

The shift does not need protection against larger-than-7 cases.
(It does still need protection against <0 cases.)

Change-Id: Idf83101386af538548bfeb6e2928cea855610ce2
Reviewed-on: https://go-review.googlesource.com/c/go/+/672995
Reviewed-by: Jorropo <jorropo.pgm@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-05-19 15:21:54 -07:00
ICHINOSE Shogo 498899e205 math: fix portable FMA implementation when x*y ~ 0, x*y < 0 and z = 0
Adding zero usually does not change the original value.
However, there is an exception with negative zero. (e.g. (-0) + (+0) = (+0))
This applies when x * y is negative and underflows.

Fixes #73757

Change-Id: Ib7b54bdacd1dcfe3d392802ea35cdb4e989f9371
GitHub-Last-Rev: 30d74883b2
GitHub-Pull-Request: golang/go#73759
Reviewed-on: https://go-review.googlesource.com/c/go/+/673856
Auto-Submit: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Robert Griesemer <gri@google.com>
2025-05-19 14:45:48 -07:00
Michael Pratt 2cde950049 runtime: disable TestSegv in race mode
This was just enabled in CL 643897. It seems to work fine on Linux, but
there are traceback issues on Darwin. We could disable just on Darwin,
but I'm not sure SIGSEGV inside of TSAN is something we care to support.

Fixes #73784.

Cq-Include-Trybots: luci.golang.try:gotip-darwin-arm64-race
Change-Id: I6a6a636cb15d7affaeb22c4c13d8f2a5c9bb31fd
Reviewed-on: https://go-review.googlesource.com/c/go/+/674276
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
2025-05-19 14:07:20 -07:00
Mark Freeman fd6afa352d cmd/compile/internal/noder: mark Sync as a primitive
Sync is used in the definition of primitives and documented by pkgbits.
It's not much help to also document it here.

Change-Id: I18bd0c7816f8249483550a1f0af7c76b9cfe09fb
Reviewed-on: https://go-review.googlesource.com/c/go/+/674156
Auto-Submit: Mark Freeman <mark@golang.org>
Reviewed-by: Robert Griesemer <gri@google.com>
TryBot-Bypass: Mark Freeman <mark@golang.org>
2025-05-19 12:52:12 -07:00
Michael Pratt db956262ac runtime: rename ncpu to numCPUStartup
ncpu is the total logical CPU count at startup. It is never updated. For
#73193, we will start using updated CPU counts for updated GOMAXPROCS,
making the ncpu name a bit ambiguous. Change to a less ambiguous name.

While we're at it, give the OS specific lookup functions a common name,
so it can be used outside of osinit later.

For #73193.

Change-Id: I6a6a636cf21cc60de36b211f3c374080849fc667
Reviewed-on: https://go-review.googlesource.com/c/go/+/672277
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
2025-05-19 12:47:30 -07:00
Michael Pratt 76e7bfbb4e runtime: move atoi to internal/runtime/strconv
Moving to a smaller package allows its use in other internal/runtime
packages.

This isn't internal/strconvlite since it can't be used directly by
strconv.

For #73193.

Change-Id: I6a6a636c9c8b3f06b5fd6c07fe9dd5a7a37d1429
Reviewed-on: https://go-review.googlesource.com/c/go/+/672697
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
2025-05-19 12:46:19 -07:00
Mark Freeman d93bea0e59 cmd/compile/internal/noder: document SectionPkg
The package section holds package stubs, which are a package
(path, name) pair and a series of declared imports.

Change-Id: If2a260c5e0a3522851be9808de46a3f128902002
Reviewed-on: https://go-review.googlesource.com/c/go/+/674175
Auto-Submit: Mark Freeman <mark@golang.org>
TryBot-Bypass: Mark Freeman <mark@golang.org>
Reviewed-by: Robert Griesemer <gri@google.com>
2025-05-19 12:42:07 -07:00
Michael Pratt 2a29cddec3 internal/runtime/syscall: add basic file system calls
Change-Id: I6a6a636c5e119165dc1018d1fc0354f5b6929656
Reviewed-on: https://go-review.googlesource.com/c/go/+/670496
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-19 12:42:00 -07:00
Mark Freeman 195e64232d cmd/compile/internal/noder: format grammar
This just wraps column width to 72 and indents production definitions
so they are easier to distinguish from prose.

Change-Id: I386b122b4f617db4b182ebb549fbee4f35a0122c
Reviewed-on: https://go-review.googlesource.com/c/go/+/673536
TryBot-Bypass: Mark Freeman <mark@golang.org>
Reviewed-by: Robert Griesemer <gri@google.com>
Auto-Submit: Mark Freeman <mark@golang.org>
2025-05-19 12:40:33 -07:00
Mark Freeman 2ab210bc74 cmd/compile/internal/noder: document the PosBase section
Positions mostly borrow their representation from package syntax. Of
note, constants (such as the zero value for positions) are not encoded
directly. Rather, a flag typically signals such values.

Change-Id: I6b4bafc6e96bb21902dd2d6e164031e7dd5aabdd
Reviewed-on: https://go-review.googlesource.com/c/go/+/673535
TryBot-Bypass: Mark Freeman <mark@golang.org>
Reviewed-by: Robert Griesemer <gri@google.com>
Auto-Submit: Mark Freeman <mark@golang.org>
2025-05-19 12:40:29 -07:00
Jonathan Amsterdam 177d5eb630 net/http: clarify ServeMux.Handler behavior
Explain that ServeMux.Handler doesn't populate the request with
matches.

Fixes #69623.

Change-Id: If625b3f8e8f4e54b05e1d9a86e8c471045e77763
Reviewed-on: https://go-review.googlesource.com/c/go/+/674095
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-by: Chressie Himpel <chressie@google.com>
Reviewed-by: Sean Liao <sean@liao.dev>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-19 12:26:19 -07:00
Jonathan Amsterdam 3602cec3af net/http: fix ServeMux.Handler on trailing-slash redirect
When a match involves a trailing-slash redirect,  ServeMux.Handler now
returns the pattern that matched.

Fixes #73688.

Change-Id: I682d9cc9a3628bed8bf21139b98369ffa6c53792
Reviewed-on: https://go-review.googlesource.com/c/go/+/673815
Reviewed-by: Filippo Valsorda <filippo@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Damien Neil <dneil@google.com>
2025-05-19 12:25:47 -07:00
Julian Zhu d52679006c cmd/compile: fold negation into addition/subtraction on mipsx
Fold negation into addition/subtraction and avoid double negation.

file      before    after     Δ       %
addr2line 3742022   3741986   -36     -0.001%
asm       6668616   6668628   +12     +0.000%
buildid   3583786   3583630   -156    -0.004%
cgo       6020370   6019634   -736    -0.012%
compile   29416016  29417336  +1320   +0.004%
cover     6801903   6801675   -228    -0.003%
dist      4485916   4485816   -100    -0.002%
doc       10652787  10652251  -536    -0.005%
fix       4115988   4115560   -428    -0.010%
link      9002328   9001616   -712    -0.008%
nm        3733148   3732780   -368    -0.010%
objdump   6163292   6163068   -224    -0.004%
pack      2944768   2944604   -164    -0.006%
pprof     18909973  18908773  -1200   -0.006%
test2json 3394662   3394778   +116    +0.003%
trace     17350911  17349751  -1160   -0.007%
vet       10077727  10077527  -200    -0.002%
go        19118769  19118609  -160    -0.001%
total     166182982 166178022 -4960   -0.003%

Change-Id: Id55698800fd70f3cb2ff48393584456b87208921
Reviewed-on: https://go-review.googlesource.com/c/go/+/673556
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2025-05-19 11:27:35 -07:00
Alan Donovan 972639fc4c go/token: add FileSet.AddExistingFiles
+ test, doc, relnote

Fixes #73205

Change-Id: Id3a4cc6290c55ffa518ad174a02ccca85e8636f7
Reviewed-on: https://go-review.googlesource.com/c/go/+/672875
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Robert Findley <rfindley@google.com>
2025-05-19 11:26:48 -07:00
Michael Pratt 11c86ddcb8 runtime: check for gsignal in asancall/msancall/racecall
asancall and msancall are reachable from the signal handler, where we
are running on gsignal. Currently, these calls will use the g0 stack in
this case, but if the interrupted code was running on g0 this will
corrupt the stack and likely cause a crash.

As far as I know, racecall is not reachable from the signal handler, but
I have updated it as well for consistency.

This is the most straightforward fix, though it would be nice to
eventually migrate these wrappers to asmcgocall, which already handled
this case.

Fixes #71395.

Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-asan-clang15,gotip-linux-amd64-msan-clang15,gotip-linux-amd64-race
Change-Id: I6a6a636ccba826dd53e31c0e85b5d42fb1e98d12
Reviewed-on: https://go-review.googlesource.com/c/go/+/643875
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-19 11:20:58 -07:00
Mark Freeman bc5aa2f7d3 go/types, types2: improve error message for init without body
Change-Id: I8a684965e88e0e33a6ff33a16e08d136e3267f7e
Reviewed-on: https://go-review.googlesource.com/c/go/+/663636
TryBot-Bypass: Mark Freeman <mark@golang.org>
Auto-Submit: Mark Freeman <mark@golang.org>
Reviewed-by: Robert Griesemer <gri@google.com>
2025-05-19 11:00:10 -07:00
Michael Pratt 2c929d6f4c runtime: pass through -asan/-msan/-race to testprog tests
The tests using testprog / testprogcgo are currently not covered on the
asan/msan/race builders because they don't build testprog with the
sanitizer flag.

Explicitly pass the flag if the test itself is built with the sanitizer.

There were a few tests that explicitly passed -race (even on non-race
builders). These tests will now only run on race builders.

For #71395.

Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-asan-clang15,gotip-linux-amd64-msan-clang15,gotip-linux-amd64-race
Change-Id: I6a6a636ce8271246316a80d426c0e4e2f6ab99c5
Reviewed-on: https://go-review.googlesource.com/c/go/+/643897
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
2025-05-19 11:00:01 -07:00
Alan Donovan 5afada035c go/ast: add PreorderStack, a variant of Inspect that builds a stack
+ doc, test, relnote

Fixes #73319

Change-Id: Ib7c9d0d7107cd62dc7f09120dfb475c4a469ddc9
Reviewed-on: https://go-review.googlesource.com/c/go/+/672696
Reviewed-by: Robert Findley <rfindley@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Alan Donovan <adonovan@google.com>
2025-05-19 10:54:45 -07:00
Alan Donovan a74ae95282 strings,bytes: add internal docs about perennial noCopy questions
Updates #26462
Updates #25907
Updates #47276
Updates #48398

Change-Id: Ic64fc8d0c284f6e5aa383a8d417fa5768dcd7925
Reviewed-on: https://go-review.googlesource.com/c/go/+/674096
Auto-Submit: Alan Donovan <adonovan@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-05-19 10:38:56 -07:00
Michael Matloob bd29985212 cmd/distpack: use positive list of tools to keep
Previously, distpack filtered out tools from the packaged distribution
using a list of tools to remove. Instead follow mpratt's suggestion on
CL 666755 and instead filter out tools that are not on a list of tools
to keep. This will make it easier to tell which tools are actually in
the distribution.

For #71867

Change-Id: I8336465703ac820028c3381a0a743c457997e78a
Reviewed-on: https://go-review.googlesource.com/c/go/+/673696
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Michael Matloob <matloob@google.com>
2025-05-19 10:20:17 -07:00
Alan Donovan 198c3cb785 std: pass bytes.Buffer and strings.Builder by pointer
This CL fixes a number of (all true positive) findings of vet's
copylock analyzer patched to treat the Bu{ff,uild}er types
as non-copyable after first use.

This does require imposing an additional indirection
between noder.writer and Encoder since the field is
embedded by value but its constructor now returns a pointer.

Updates golang/go#25907
Updates golang/go#47276

Change-Id: I0b4d77ac12bcecadf06a91709e695365da10766c
Reviewed-on: https://go-review.googlesource.com/c/go/+/635339
Reviewed-by: Robert Findley <rfindley@google.com>
Commit-Queue: Alan Donovan <adonovan@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Alan Donovan <adonovan@google.com>
2025-05-19 09:13:04 -07:00
Guoqi Chen da9c5b142c cmd/compile: add prefetch intrinsic support on loong64
This CL enables intrinsic support to emit the following prefetch
instructions for loong64 platform:
  1.Prefetch - prefetches data from memory address to cache;
  2.PrefetchStreamed - prefetches data from memory address, with a
    hint that this data is being streamed.

Benchmarks picked from go/test/bench/garbage
Parameters tested with:
GOMAXPROCS=8
tree2 -heapsize=1000000000 -cpus=8
tree -n=18
parser
peano

Benchmarks Loongson-3A6000-HV @ 2500.00MHz:
         |   bench.old   |              bench.new               |
         |    sec/op     |    sec/op      vs base               |
Tree2-8    1238.2µ ± 24%   999.9µ ± 453%       ~ (p=0.089 n=10)
Tree-8      277.4m ±  1%   275.5m ±   1%       ~ (p=0.063 n=10)
Parser-8     3.564 ±  0%    3.509 ±   1%  -1.56% (p=0.000 n=10)
Peano-8     39.12m ±  2%   38.85m ±   2%       ~ (p=0.353 n=10)
geomean     83.19m         78.28m         -5.90%

Change-Id: I59e9aa4f609a106d4f70706e6d6d1fe6738ab72a
Reviewed-on: https://go-review.googlesource.com/c/go/+/671876
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2025-05-19 00:27:10 -07:00
limeidan f212b6499a cmd/link/internal: add support for internal linking on loong64
Change-Id: Ic0d36f27481ac707d04aaf7001f26061e510dd8f
Reviewed-on: https://go-review.googlesource.com/c/go/+/533716
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-19 00:25:59 -07:00
Ville Vesilehto 42f9ee904c text/template: limit expression parenthesis nesting
Deeply nested parenthesized expressions could cause a stack
overflow during parsing. This change introduces a depth limit
(maxStackDepth) tracked in Tree.stackDepth to prevent this.

Additionally, this commit clarifies the security model in
the package documentation, noting that template authors
are trusted as text/template does not auto-escape.

Fixes #71201

Change-Id: Iab2c2ea6c193ceb44bb2bc7554f3fccf99a9542f
GitHub-Last-Rev: f4ebd1719f
GitHub-Pull-Request: golang/go#73670
Reviewed-on: https://go-review.googlesource.com/c/go/+/671755
Reviewed-by: Roland Shoemaker <roland@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Sean Liao <sean@liao.dev>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Rob Pike <r@golang.org>
2025-05-17 03:27:48 -07:00
Michael Matloob 6425749695 cmd/distpack: remove more tools from packaged distribution
The "doc", "fix", and "covdata" tools invoked by the go command are not
needed for builds. Instead of invoking them directly using the installed
binary in the tool directory, use "go tool" to run them, building them
if needed. We can then stop distributing those tools in the
distribution.

covdata is used in tests and can form part of a cached test result, but
test results don't have the same requirements as build outputs to be
completely determined by the action id. We already don't include a
toolid for the covdata tool in the action id for a test run. The more
principled way to do things would be to load the covdata package,
create the actions to build it, and then depend on the output of
that action from the the test action and use that as the covdata tool.
For now, it's probably not worth the effort, but, in the future, if we
wanted to build a tool like cgo as needed, it would be best to build it
in the same action graph. That would introduce a whole bunch of complexity
because we'd need to build the tool in the host configuration, and all
the configuration parameters are global.

For #71867

Change-Id: Id9bbbb5c169296f66c072949f9da552424ecfa2f
Reviewed-on: https://go-review.googlesource.com/c/go/+/673119
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Michael Matloob <matloob@google.com>
2025-05-16 15:32:15 -07:00
Michael Matloob 8798f9e7a4 cmd/distpack: remove some tools from packaged distribution
This change removes some tools that are not used for builds, or
otherwise invoked by the go command (other than through "go tool"
itself) from the packaged distributions produced by distpack. When these
tools are missing, "go tool" will build and run them as needed.

Also update a case where we print a buildid commandline to specify
invoking buildid using "go tool" rather than the binary at it's install
location, because it may not exist there in packaged distributions
anymore.

The tools in this CL are the lowest hanging fruit. There are a few more
tools that aren't used by builds, but we'd have to get the go command to
run them using "go tool" rather than finding them in the tool install
directory.

For #71867

Change-Id: I217683bd549962a1add87405bf3fb1225e2333c5
Reviewed-on: https://go-review.googlesource.com/c/go/+/666755
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Michael Matloob <matloob@google.com>
2025-05-16 15:31:52 -07:00
Julian Zhu 8097cf14d2 cmd/compile: fold negation into addition/subtraction on mips64x
Fold negation into addition/subtraction and avoid double negation.

file      before    after     Δ       %
addr2line 4007310   4007470   +160    +0.004%
asm       7007636   7007436   -200    -0.003%
buildid   3839268   3838972   -296    -0.008%
cgo       6353466   6352738   -728    -0.011%
compile   30426920  30426896  -24     -0.000%
cover     7005408   7004744   -664    -0.009%
dist      4651192   4650872   -320    -0.007%
doc       10606050  10606034  -16     -0.000%
fix       4446414   4446390   -24     -0.001%
link      9237736   9237024   -712    -0.008%
nm        3999107   3999323   +216    +0.005%
objdump   6762424   6762144   -280    -0.004%
pack      3270757   3270493   -264    -0.008%
pprof     19428299  19361939  -66360  -0.342%
test2json 3717345   3717217   -128    -0.003%
trace     17382273  17381657  -616    -0.004%
vet       10689481  10688985  -496    -0.005%
go        19118769  19118609  -160    -0.001%
total     171949855 171878943 -70912  -0.041%

Change-Id: I35c1f264d216c214ea3f56252a9ddab8ea850fa6
Reviewed-on: https://go-review.googlesource.com/c/go/+/673555
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-05-16 11:06:06 -07:00
Michael Matloob 07a279794d cmd/go/internal/modget: check in workspace mode if go.work present
The purpose of this change is to enable go get to be used when working
on a module that is usually built from a workspace and has unreleased
dependencies in that workspacae that have no requirements satisfying
them. These modules can't build in single module mode, and the
expectation is that a workspace will provide the unreleased
requirements.

Before this change, if go get was run in a module, and any of the
module's imports, that were not already satisfied using requirements,
could not be resolved from that module in single module mode, go get
would report an error. This could happen if, for example, the dependency
was unreleased, even privately, and couldn't be fetched using version
control or a proxy.  go get would also do a check using
cmd/go/internal/modget.(*resolver).checkPackageProblems that, among
other things, any package patterns provided to go get (the pkgPattern
argument to checkPackageProblems) could properly load. When checking in
single-module mode, this would cause an error because imports in the
non-required workspace dependencies could not be resolved.

This change makes a couple of changes to address each of those problems.
First, while "go get" still uses the single module's module graph to
load packages and determine which imports are not satisfied by a module
in the build list (the set of modules the build is done with), it will
"cheat" and look up the set of modules that would be loaded in workspace
mode. It will not try to fetch modules to satisfy imports of packages in
those modules.  (Alternatively, it could have tried to fetch modules to
satisfy the requirements, and allowed an error if it could not be found,
but we took the route of not doing the fetch to preserve the invariant
that the behavior wouldn't change if the network was down). The second,
and by far more complex, change is that the load that's done in
checkPackageProblems will be done in workspace mode rather than module
mode. While it is important that the requirements added by "go get" are
not determined using the workspace (with the necessary exception of the
skipped fetches) it is okay to use the workspace to load the modules,
as, if a go.work file is present, the go command would by default run
builds in workspace mode rather than single module mode. This more
relaxed check will allow get to succeed if a go list would succeed in
the workspace, even if it wouldn't from the single module.--

To avoid trying to satisfy imports that are in the other workspace
modules, we add a workspace field to the resolver that can be used to
check if an import is in the workspace. It reads the go.work file and
determines each of the modules' modroots. The hasPackage function will
call into modload logic using the new PkgIsInLocalModule function that
in turn calls into dirInModule to determine if the directory that would
contain the package sources exists in the module. We do that check in
cmd/go/internal/modget.(*resolver).loadPackages, which is used to
resolve modules to satisfy imports in the package graph supplied on the
command line. (Note that we do not skip resolving modules in the
functions that query for a package at a specific module version (such as
in "go get golang.org/x/tools/go/packages@latest), which are done in
(*resolver).queryPath. In that case, the user is explicitly requesting
to add a requirement on that package's module.)

The next step, checking for issues in the workspace mode, is more
complex because of two reasons. First, can't do all of
checkPackageProblems's work in a workspace, and second, that the module
loading state in the go command is global, so we have to manage the
global state in switching to workspace mode and back.

On the work that checkPackageProblems does: it broadly does three things:
first, a load of the packages specified to "go get", to make sure that
imports are satisfied and not ambiguous. Second, using that loaded
information, reporting any retracted or deprecated modules that are
relevant to the user and that they may want to take action on. And
third, checking that all the modules in the build list (the set of
module versions used in a load or build) have sums, and adding those
sums to the in-memory representation of the go.sum file that will be
written out at the end.  When there's a workspace, the first two checks
need to be done in workspace mode so that we properly point out issues
in the build, but the sums need to be updated in module mode so that the
module's go.sum file is updated to reflect changes in go.mod.

To do the first two steps in workspace mode, we add a new
modload.EnterWorkspace function that will reset the global modload state
and load from the workspace, using the updated requirements that have
been calculated for the module. It returns a cleanup function that will
exit the workspace and reset to the previous global state. (We need the
previous global state because it holds the updated in memory
representations of go.mod and go.sum that will be written out by go get
if there are no errors.) We switch to workspace mode if there's a
relevant go.work file that would trigger a workspace load _and_ the
module go get is being run from belongs to that workspace (it wouldn't
make sense to use a workspace that the module itself didn't belong to).
We then switch back to module mode after the first two steps are
complete using the cleanup function. We have to be careful to finish
all the tasks checking for deprecations and retractions before to start
looking at the sums because they retraction and deprecation checking
tasks will depend on the global workspace state.

It's a bit unfortunate that much of the modload and modfetch state is
global. It's pretty gross doing the switch. It would be a lot of
work, but if we need to do a switch in a third instance other than for
go work sync and go get it might be worth doing the work to refactor
modload so that the state isn't global.

The EnterWorkspace function that does the switch to workspace mode (in
cmd/go/internal/modload/init.go) first saves the in memory
representation of the go.mod file (calculated using UpdateGoModFromReqs)
so that it can be applied in the workspace. It then uses the new
setState function to save the old state and reset to a clean state,
loads in workspace mode (using InitWorkfile to so that the go.work file
is used by LoadModFile), and then replaces the go.mod file
representation for the previous main module with the contents saved
earlier and reloads the requirements (holding the roots of the module
graph) to represent the updates. In workspace mode, the roots field of the
requirements is the same, but the reload is used to update the set of
direct requirements. rawGoModSummary is also update to use the in-
memory representation of the previous main module's go.mod file rather
than always reading it from disk so that it can take those updated
contents into account. (It previously didn't need to do this because
rawGoModSummary is primarily used to get module information for
dependency modules. The exception is in workspace mode but the same
logic worked because we didn't update workspace module's go.mod files in
a go command running in workspace mode).  Finally, EnterWorkspace returns
a function that calls setState with the old state, to revert to the
previous state.

When we save the state of the modload package, we also need to save the
state of the modfetch package because it holds the state of the go.sum
file and the operation of looking up a value in the lookupCache or
downloadCache affects whether it appears in the go.sum file. lookupCache
and downloadCache are turned into pointers so that they can be saved in
the modfetchState.

Fixes #73654

Change-Id: I65cf835ec2293d4e3f66b91d3e77d3bb8d2f26d7
Reviewed-on: https://go-review.googlesource.com/c/go/+/669635
Reviewed-by: Michael Matloob <matloob@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Sam Thanawalla <samthanawalla@google.com>
2025-05-16 11:05:22 -07:00
Michael Anthony Knyszek e6cd9c083e sync: set GOMAXPROCS to 1 in TestPoolGC
This test expects to be able to drain a Pool using only Get. This isn't
actually possible in the general case, since a pooled value could get
stuck in some P's private slot. However, if GOMAXPROCS=1, there's only 1
P we could be running on, so getting stuck becomes impossible.

This test isn't checking any concurrent properties of Pool, so this is
fine. Just set GOMAXPROCS=1 for this one particular test.

Fixes #73728.

Change-Id: I9053e28118060650f2cd7d0d58f5a86d630b36f7
Reviewed-on: https://go-review.googlesource.com/c/go/+/673375
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2025-05-16 09:55:03 -07:00
Srinivas Pokala 2b3794e3e8 cmd/go/internal/work: update minimum supported s390x version on go
This updates cgo support for s390x changing from z196 to z13, as
z13 is the minimum machine level running on go for s390x.

Change-Id: I1a102294b2108c35ddb1428bf287ce83debaeac8
Reviewed-on: https://go-review.googlesource.com/c/go/+/666995
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-16 09:15:37 -07:00
Sam Thanawalla f529d56508 cmd/go: add global ignore mechanism
This CL adds the ignore directive which enables users to tell the Go
Command to skip traversing into a given directory.

This behaves similar to how '_' or 'testdata' are currently treated.
This mainly has benefits for go list and go mod tidy.
This does not affect what is packed into a module.

Fixes: #42965
Change-Id: I232e27c1a065bb6eb2d210dbddad0208426a1fdd
Reviewed-on: https://go-review.googlesource.com/c/go/+/643355
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Matloob <matloob@golang.org>
Reviewed-by: Michael Matloob <matloob@google.com>
2025-05-16 08:39:57 -07:00
Michael Anthony Knyszek b450b5409d runtime: prevent cleanup goroutines from missing work
Currently, there's a window of time where each cleanup goroutine has
committed to going to sleep (immediately after full.pop() == nil) but
hasn't yet marked itself as asleep (state.sleep()). If new work arrives
in this window, it might get missed. This is what we see in #73642, and
I can reproduce it with stress2.

Side-note: even if the work gets missed by the existing sleeping
goroutines, needg is incremented. So in theory a new goroutine will
handle the work. Right now that doesn't happen in tests like the one
running in #73642, where there might never be another call to AddCleanup
to create the additional goroutine. Also, if we've hit the maximum on
cleanup goroutines and all of them are in this window simultaneously, we
can still end up missing work, it's just more rare. So this is still a
problem even if we choose to just be more aggressive about creating new
cleanup goroutines.

This change fixes the problem and also aims to make the cleanup
wake/sleep code clearer. The way this change fixes this problem is to
have cleanup goroutines re-check the work list before going to sleep,
but after having already marked themselves as sleeping. This way, if new
work comes in before the cleanup goroutine marks itself as going to
sleep, we can rely on the re-check to pick up that work. If new work
comes after the goroutine marks itself as going to sleep and after the
re-check, we can rely on the scheduler noticing that the goroutine is
asleep and waking it up. If work comes in between a goroutine marking
itself as sleeping and the re-check, then the re-check will catch that
piece of work. However, the scheduler might now get a false signal that
the goroutine is asleep and try to wake it up. This is OK. The sleeping
signal is now mutated and double-checked under the queue lock, so the
scheduler will grab the lock, may notice there are no sleeping
goroutines, and go on its way. This may cause spurious lock acquisitions
but it should be very rare. The window between a cleanup goroutine
marking itself as going to sleep and re-checking the work list is a
handful of instructions at most.

This seems subtle but overall it's a simplification of the code. We
rely more on the lock, which is easier to reason about, and we track two
separate atomic variables instead of the merged cleanupSleepState: the
length of the full list, and the number of cleanup goroutines that are
asleep. The former is now the primary way to acquire work. Cleanup
goroutines must decrement the length successfully to obtain an item off
the full list. The number of cleanup goroutines asleep, meanwhile, is
now only updated with the queue lock held. It can be checked without the
lock held, and the invariant to make that safe is simple: it must always
be an overestimate of the number of sleeping cleanup goroutines.

The changes here do change some other behaviors.

First, since we're tracking the length of the full list instead of the
abstract concept of a wake-up, the waker can't consume wake-ups anymore.
This means that cleanup goroutines may be created more aggressively. If
two threads in the scheduler see that there are goroutines that are
asleep, only one will win the race, but the other will observe zero
asleep goroutines but potentially many work units available. This will
cause it to signal many goroutines to be created. This is OK since we
have a cap on the number of cleanup goroutines, and the race should be
relatively rare.

Second, because cleanup goroutines can now fail to go to sleep if any
units of work come in, they might spend more time contended on the lock.
For example, if we have N cleanup goroutines and work comes in at *just*
the wrong rate, in the worst case we'll have each of G goroutines loop
N times for N blocks, resulting in O(G*N) thread time to handle each
block in the worst case. To paint a picture, imagine each goroutine
trying to go to sleep, fail because a new block of work came in, and
only one goroutine will get that block. Then once that goroutine is
done, we all try again, fail because a new block of work came in, and so
on and so forth. This case is unlikely, though, and probably not worth
worrying about until it actually becomes a problem. (A similar problem
exists with parking (and exists before this change, too) but at least in
that case each goroutine parks, so it doesn't block the thread.)

Fixes #73642.

Change-Id: I6bbe1b789e7eb7e8168e56da425a6450fbad9625
Reviewed-on: https://go-review.googlesource.com/c/go/+/671676
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2025-05-16 07:43:49 -07:00
Guoqi Chen 045b5c1bfb cmd/internal/obj/loong64: change the plan9 format of the prefetch instruction PRELDX
before:
    MOVV    $n + $offset, Roff
    PRELDX  (Rbase)(Roff), $hint
after:
    PRELDX  offset(Rbase), $n, $hint

This instruction is supported in CL 671875, but is not actually used

Change-Id: I943d488ea6dc77781cd796ef480a89fede666bab
Reviewed-on: https://go-review.googlesource.com/c/go/+/673155
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2025-05-15 22:00:16 -07:00
qmuntal 4b5a64f467 os: set FILE_FLAG_BACKUP_SEMANTICS when opening without I/O access
FILE_FLAG_BACKUP_SEMANTICS is necessary to open directories on Windows,
and to enable backup applications do extended operations on files if
they hold the SE_BACKUP_NAME and SE_RESTORE_NAME privileges.

os.OpenFile currently sets FILE_FLAG_BACKUP_SEMANTICS for all supported
cases except when the file is opened with O_WRONLY | O_RDWR (that is,
access mode 3). This access mode doesn't correspond to any of the
standard POSIX access modes, but some OSes special case it to mean
different things. For example, on Linux, O_WRONLY | O_RDWR means check
for read and write permission on the file and return a file descriptor
that can't be used for reading or writing.

On Windows, os.OpenFile has historically mapped O_WRONLY | O_RDWR to a
0 access mode, which Windows internally interprets as
FILE_READ_ATTRIBUTES. Additionally, it doesn't prepare the file for I/O,
given that the read attributes permission doesn't allow reading or
writing (not that this is similar to what happens on Linux). This
makes opening the file around 50% faster, and one can still use the
handle to stat it, so some projects have been using this behavior
to open files without I/O access.

This CL updates os.OpenFile so that directories can also be opened
without I/O access. This effectively closes #23312, as all the remaining
cases where we don't set FILE_FLAG_BACKUP_SEMANTICS imply opening
with O_WRONLY or O_RDWR, and that's not allowed by Unix's open.

Closes #23312.

Change-Id: I77c4f55e1ca377789aef75bd8a9bce2b7499f91d
Reviewed-on: https://go-review.googlesource.com/c/go/+/673035
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-15 21:37:32 -07:00
Daniel McCarney 7b4a3d93d7 crypto/tls: fix bogo IgnoreClientVersionOrder skip reason
The BoGo IgnoreClientVersionOrder test checks that a client that sends
a supported_versions extension with the list [TLS 1.2, TLS 1.3] ends up
negotiating TLS 1.3.

However, the crypto/tls module treats this list as being in client
preference order, and so negotiates TLS 1.2, failing the test.

Our behaviour appears to be the correct handling based on RFC 8446
§4.2.1 where it says:
  The extension contains a list of supported versions in preference
  order, with the most preferred version first.

This commit updates the reason we skip this test to cite the RFC instead
of saying it's something to be fixed.

Updates #72006
Change-Id: I27a2cd231e4b8762b0d9e2dbd3d8ddd5b87fd5ca
Reviewed-on: https://go-review.googlesource.com/c/go/+/671415
Reviewed-by: Roland Shoemaker <roland@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Daniel McCarney <daniel@binaryparadox.net>
2025-05-15 20:14:22 -07:00
Keith Randall d681270714 cmd/compile: allow load-op merging in additional situations
x += *p

We want to do this with a single load+add operation on amd64.
The tricky part is that we don't want to combine if there are
other uses of x after this instruction.

Implement a simple detector that seems to capture a common situation -
x += *p is in a loop, and the other use of x is after loop exit.
In that case, it does not hurt to do the load+add combo.

Change-Id: I466174cce212e78bde83f908cc1f2752b560c49c
Reviewed-on: https://go-review.googlesource.com/c/go/+/672957
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-15 15:21:36 -07:00
Keith Randall a88f093aaa strings,bytes: make benchmark work deterministic
It's hard to compare two different runs of a benchmark if they
are doing different amounts of work.

Change-Id: I5d6845f3d11bb10136f745e6207d5f683612276d
Reviewed-on: https://go-review.googlesource.com/c/go/+/672895
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-15 14:24:16 -07:00
Keith Randall 19f05770b0 cmd/compile: schedule induction variable increments late
for ..; ..; i++ {
 ...
}

We want to schedule the i++ late in the block, so that all other
uses of i in the block are scheduled first. That way, i++ can
happen in place in a register instead of requiring a temporary register.

Change-Id: Id777407c7e67a5ddbd8e58251099b0488138c0df
Reviewed-on: https://go-review.googlesource.com/c/go/+/672998
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@google.com>
2025-05-15 14:06:41 -07:00
Keith Randall 11fa0de475 cmd/compile: use OpMove instead of memmove more on arm64
OpMove is faster for small moves of fixed size.

For safety, we have to rewrite the Move rewrite rules a bit so that
all the loads are done before any stores happen.

Also use an 8-byte move instead of a 16-byte move if the tail is
at most 8 bytes.

Change-Id: I7f6c7496ac6d5eb2e0706fd59ca4b5d797c51101
Reviewed-on: https://go-review.googlesource.com/c/go/+/672997
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
2025-05-15 14:06:16 -07:00
Yongyue Sun fc641e7fae cmd/compile: create LSym for closures with type conversion
Follow-up to #54959 with another failing case.

The linker needs FuncInfo metadata for all inlined functions. CL 436240 explicitly creates LSym for direct closure calls to ensure we keep the FuncInfo metadata.

However, CL 436240 won't work if the direct closure call is wrapped by a no-effect type conversion, even if that closure could be inlined.

This commit should fix such case.

Fixes #73716

Change-Id: Icda6024da54c8d933f87300e691334c080344695
GitHub-Last-Rev: e9aed02eb6
GitHub-Pull-Request: golang/go#73718
Reviewed-on: https://go-review.googlesource.com/c/go/+/672855
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-05-15 13:16:39 -07:00
Jonathan Amsterdam 6df855ebac testing: fix panic in t.Log
If a testing.TB is no longer on the stack, t.Log would panic because
its outputWriter is nil. Check for nil and drop the write, which
is the previous behavior.

Change-Id: Ifde97997a3aa26ae604ac9c218588c1980110cbf
Reviewed-on: https://go-review.googlesource.com/c/go/+/673215
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
Auto-Submit: Jonathan Amsterdam <jba@google.com>
2025-05-15 10:24:18 -07:00
Nick Ripley 01e0e8b6b3 runtime/pprof: include PCs for deduplication in TestMutexBlockFullAggregation
TestMutexBlockFullAggregation aggregates stacks by function, file, and
line number. But there can be multiple function calls on the same line,
giving us different sequences of PCs. This causes the test to spuriously
fail in some cases. Include PCs in the stacks for this test.

Also pick up a small "range over int" modernize suggestion while we're
looking at the test.

Fixes #73641

Change-Id: I50489e19fcf920e27b9eebd9d4b35feb89981cbc
Reviewed-on: https://go-review.googlesource.com/c/go/+/673115
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-15 08:26:27 -07:00
Zxilly b338f6bfa6 cmd/link: fix outdated output mmap check
Outbuf.View used to perform a mmap check by default
and return an error if the check failed,
this behavior has been changed so that now
the View never returns any error,
so the usage needs to be modified accordingly.

Change-Id: I76ffcda5476847f6fed59856a5a5161734f47562
GitHub-Last-Rev: 6449f2973d
GitHub-Pull-Request: golang/go#73730
Reviewed-on: https://go-review.googlesource.com/c/go/+/673095
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-15 08:21:31 -07:00
qmuntal 8105ea53c3 net: avoid windows hang in TestCloseWrite
On Windows, reading from a socket at the same time as the other
end is closing it will occasionally hang. This is a Windows issue, not
a Go issue, similar to what happens in macOS (see #49352).

Work around this condition by adding a brief sleep before the read.

Fixes #73140.

Change-Id: I24e457a577e507d0d69924af6ffa1aa24c4aaaa6
Reviewed-on: https://go-review.googlesource.com/c/go/+/671457
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Alan Donovan <adonovan@google.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
Commit-Queue: Alan Donovan <adonovan@google.com>
Reviewed-by: Damien Neil <dneil@google.com>
2025-05-15 07:44:10 -07:00
qmuntal bb0c14b895 os: don't fallback to the Stat slow path if file doesn't exist on Windows
os.Stat and os.Lstat first try stating the file without opening it. If
that fails, then they open the file and try again, operations that tends
to be slow. There is no point in trying the slow path if the file
doesn't exist, we should just return an error immediately.

This CL makes stating a non-existent file on Windows 50% faster:

goos: windows
goarch: amd64
pkg: os
cpu: Intel(R) Core(TM) i7-10850H CPU @ 2.70GHz
                │   old.txt    │                new.txt                 │
                │    sec/op    │    sec/op     vs base                  │
StatNotExist-12   43.65µ ± 15%   20.02µ ± 10%  -54.14% (p=0.000 n=10+7)

                │  old.txt   │             new.txt              │
                │    B/op    │    B/op     vs base              │
StatNotExist-12   224.0 ± 0%   224.0 ± 0%  ~ (p=1.000 n=10+7) ¹
¹ all samples are equal

                │  old.txt   │             new.txt              │
                │ allocs/op  │ allocs/op   vs base              │
StatNotExist-12   2.000 ± 0%   2.000 ± 0%  ~ (p=1.000 n=10+7) ¹

Updates #72992.

Change-Id: Iaeb9596d0d18e5a5a1bd1970e296a3480501af78
Reviewed-on: https://go-review.googlesource.com/c/go/+/671458
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Jake Bailey <jacob.b.bailey@gmail.com>
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-05-14 22:22:10 -07:00
qmuntal 3be537e663 net: use closesocket when closing socket os.File's on Windows
The WSASocket documentation states that the returned socket must be
closed by calling closesocket instead of CloseHandle. The different
File methods on the net package return an os.File that is not aware
that it should use closesocket. Ideally, os.NewFile should detect that
the passed handle is a socket and use the appropriate close function,
but there is no reliable way to detect that a handle is a socket on
Windows (see CL 671455).

To work around this, we add a hidden function to the os package that
can be used to return an os.File that uses closesocket. This approach
is the same as used on Unix, which also uses a hidden function for other
purposes.

While here, fix a potential issue with FileConn, which was using File.Fd
rather than File.SyscallConn to get the handle. This could result in the
File being closed and garbage collected before the syscall was made.

Fixes #73683.

Change-Id: I179405f34c63cbbd555d8119e0f77157c670eb3e
Reviewed-on: https://go-review.googlesource.com/c/go/+/672195
Reviewed-by: Damien Neil <dneil@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-05-14 22:21:50 -07:00
Michael Anthony Knyszek a197a471b9 sync: use blockUntilCleanupQueueEmpty instead of busy-looping in tests
testPool currently does the old-style busy loop to wait until cleanups
have executed. Clean this up by using the linkname'd
blockUntilCleanupQueueEmpty.

For #73642.

Change-Id: Ie0c2614db858a984f25b33a805dc52948069eb52
Reviewed-on: https://go-review.googlesource.com/c/go/+/671675
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-14 20:42:54 -07:00
Michael Anthony Knyszek 3ea94ae446 runtime: help the race detector detect possible concurrent cleanups
This change makes it so that cleanup goroutines, in race mode, create a
fake race context and switch to it, emulating cleanups running on new
goroutines. This helps in catching races between cleanups that might run
concurrently.

Change-Id: I4c4e33054313798d4ac4e5d91ff2487ea3eb4b16
Reviewed-on: https://go-review.googlesource.com/c/go/+/652635
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2025-05-14 19:12:19 -07:00
Keith Randall b30fa1bcc4 runtime: improve scan inner loop
On every arch except amd64, it is faster to do x&(x-1) than x^(1<<n).

Most archs need 3 instructions for the latter: MOV $1, R; SLL n, R;
ANDN R, x. Maybe 4 if there's no ANDN.

Most archs need only 2 instructions to do x&(x-1). It takes 3 on
x86/amd64 because NEG only works in place.

Only amd64 can do x^(1<<n) in a single instruction.
(We could on 386 also, but that's currently not implemented.)

Change-Id: I3b74b7a466ab972b20a25dbb21b572baf95c3467
Reviewed-on: https://go-review.googlesource.com/c/go/+/672956
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-14 18:11:51 -07:00
Xiaolin Zhao c31a5c571f cmd/compile: fold negation into addition/subtraction on loong64
This change also avoid double negation, and add loong64 codegen for arithmetic tests.
Reduce the number of go toolchain instructions on loong64 as follows.

    file      before    after     Δ       %
    addr2line 279972    279896  -76    -0.0271%
    asm       556390    556310  -80    -0.0144%
    buildid   272376    272300  -76    -0.0279%
    cgo       481534    481550  +16    +0.0033%
    compile   2457992   2457396 -596   -0.0242%
    covdata   323488    323404  -84    -0.0260%
    cover     518630    518490  -140   -0.0270%
    dist      340894    340814  -80    -0.0235%
    distpack  282568    282484  -84    -0.0297%
    doc       790224    789984  -240   -0.0304%
    fix       324408    324348  -60    -0.0185%
    link      704910    704666  -244   -0.0346%
    nm        277220    277144  -76    -0.0274%
    objdump   508026    507878  -148   -0.0291%
    pack      221810    221786  -24    -0.0108%
    pprof     1470284   1469880 -404   -0.0275%
    test2json 254896    254852  -44    -0.0173%
    trace     1100390   1100074 -316   -0.0287%
    vet       781398    781142  -256   -0.0328%
    go        1529668   1529128 -540   -0.0353%
    gofmt     318668    318568  -100   -0.0314%
    total     13795746 13792094 -3652  -0.0265%

Change-Id: I88d1f12cfc4be0e92687c48e06a57213aa484aca
Reviewed-on: https://go-review.googlesource.com/c/go/+/672555
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-05-14 17:46:58 -07:00
Julian Zhu de86d02c32 crypto/internal/fips140/subtle: add assembly implementation of xorBytes for arm
goos: linux
goarch: arm
pkg: crypto/subtle
                                     │       o        │                  n                  │
                                     │     sec/op     │    sec/op     vs base               │
ConstantTimeByteEq-4                    5.353n ±  88%   4.012n ± 67%        ~ (p=0.381 n=8)
ConstantTimeEq-4                        4.151n ±   1%   4.078n ±  0%   -1.76% (p=0.000 n=8)
ConstantTimeLessOrEq-4                  4.010n ±  15%   4.154n ±  3%        ~ (p=0.584 n=8)
XORBytes/8Bytes-4                       85.69n ±  13%   44.02n ±  1%  -48.64% (p=0.000 n=8)
XORBytes/128Bytes-4                    164.85n ±   9%   84.62n ±  5%  -48.67% (p=0.000 n=8)
XORBytes/2048Bytes-4                   1374.0n ±   1%   741.2n ± 15%  -46.05% (p=0.000 n=8)
XORBytes/8192Bytes-4                    4.357µ ±   0%   2.801µ ±  0%  -35.71% (p=0.000 n=8)
XORBytes/32768Bytes-4                   16.67µ ±   0%   11.96µ ±  0%  -28.26% (p=0.000 n=8)
XORBytesAlignment/8Bytes0Offset-4       83.28n ±   0%   42.77n ±  1%  -48.65% (p=0.000 n=8)
XORBytesAlignment/8Bytes1Offset-4       61.52n ±   1%   50.30n ± 16%  -18.24% (p=0.000 n=8)
XORBytesAlignment/8Bytes2Offset-4       61.75n ±   1%   42.72n ±  1%  -30.82% (p=0.000 n=8)
XORBytesAlignment/8Bytes3Offset-4       61.53n ±   1%   42.70n ±  1%  -30.60% (p=0.000 n=8)
XORBytesAlignment/8Bytes4Offset-4       83.28n ±   0%   42.71n ±  1%  -48.72% (p=0.000 n=8)
XORBytesAlignment/8Bytes5Offset-4       61.53n ±   0%   42.73n ±  1%  -30.55% (p=0.000 n=8)
XORBytesAlignment/8Bytes6Offset-4       61.58n ±   0%   42.69n ±  1%  -30.68% (p=0.000 n=8)
XORBytesAlignment/8Bytes7Offset-4       61.63n ±   1%   42.70n ±  1%  -30.72% (p=0.000 n=8)
XORBytesAlignment/128Bytes0Offset-4    154.15n ±   4%   83.48n ±  0%  -45.84% (p=0.000 n=8)
XORBytesAlignment/128Bytes1Offset-4    265.25n ±   0%   91.70n ±  8%  -65.43% (p=0.000 n=8)
XORBytesAlignment/128Bytes2Offset-4    265.20n ±   0%   98.09n ± 13%  -63.01% (p=0.000 n=8)
XORBytesAlignment/128Bytes3Offset-4    265.20n ±   0%   85.48n ±  0%  -67.77% (p=0.000 n=8)
XORBytesAlignment/128Bytes4Offset-4    150.05n ±   0%   83.52n ± 15%  -44.34% (p=0.000 n=8)
XORBytesAlignment/128Bytes5Offset-4    265.20n ±   0%   85.48n ± 15%  -67.77% (p=0.000 n=8)
XORBytesAlignment/128Bytes6Offset-4    265.20n ±   0%   96.16n ± 11%  -63.74% (p=0.000 n=8)
XORBytesAlignment/128Bytes7Offset-4    265.20n ±   0%   85.49n ±  0%  -67.76% (p=0.000 n=8)
XORBytesAlignment/2048Bytes0Offset-4   1114.0n ±   0%   739.5n ±  0%  -33.62% (p=0.000 n=8)
XORBytesAlignment/2048Bytes1Offset-4   3285.0n ±  15%   783.5n ±  0%  -76.15% (p=0.000 n=8)
XORBytesAlignment/2048Bytes2Offset-4   3288.0n ±  15%   783.6n ± 25%  -76.17% (p=0.000 n=8)
XORBytesAlignment/2048Bytes3Offset-4   3286.0n ±   0%   783.5n ±  0%  -76.15% (p=0.000 n=8)
XORBytesAlignment/2048Bytes4Offset-4   1116.0n ± 115%   742.9n ±  0%  -33.43% (p=0.000 n=8)
XORBytesAlignment/2048Bytes5Offset-4   3285.0n ±   0%   785.0n ±  0%  -76.10% (p=0.000 n=8)
XORBytesAlignment/2048Bytes6Offset-4   3284.0n ±   0%   784.8n ±  0%  -76.10% (p=0.000 n=8)
XORBytesAlignment/2048Bytes7Offset-4   3283.0n ±   0%   784.9n ±  0%  -76.09% (p=0.000 n=8)
geomean                                 269.5n          129.5n        -51.93%

                                     │       o       │                   n                    │
                                     │      B/s      │      B/s        vs base                │
XORBytes/8Bytes-4                      89.08Mi ± 11%   173.34Mi ±  1%   +94.58% (p=0.000 n=8)
XORBytes/128Bytes-4                    741.9Mi ± 10%   1442.6Mi ± 13%   +94.45% (p=0.000 n=8)
XORBytes/2048Bytes-4                   1.388Gi ±  0%    2.573Gi ± 13%   +85.40% (p=0.000 n=8)
XORBytes/8192Bytes-4                   1.751Gi ±  1%    2.724Gi ±  0%   +55.57% (p=0.000 n=8)
XORBytes/32768Bytes-4                  1.830Gi ±  0%    2.551Gi ±  0%   +39.38% (p=0.000 n=8)
XORBytesAlignment/8Bytes0Offset-4      91.61Mi ±  0%   178.40Mi ±  1%   +94.75% (p=0.000 n=8)
XORBytesAlignment/8Bytes1Offset-4      124.0Mi ±  1%    152.2Mi ± 18%   +22.73% (p=0.000 n=8)
XORBytesAlignment/8Bytes2Offset-4      123.6Mi ±  1%    178.6Mi ± 14%   +44.54% (p=0.000 n=8)
XORBytesAlignment/8Bytes3Offset-4      124.0Mi ±  1%    178.6Mi ±  1%   +44.10% (p=0.000 n=8)
XORBytesAlignment/8Bytes4Offset-4      91.61Mi ±  0%   178.65Mi ±  1%   +95.01% (p=0.000 n=8)
XORBytesAlignment/8Bytes5Offset-4      124.0Mi ±  1%    178.5Mi ±  1%   +43.98% (p=0.000 n=8)
XORBytesAlignment/8Bytes6Offset-4      123.9Mi ±  1%    178.7Mi ±  1%   +44.23% (p=0.000 n=8)
XORBytesAlignment/8Bytes7Offset-4      123.8Mi ±  6%    178.7Mi ±  1%   +44.33% (p=0.000 n=8)
XORBytesAlignment/128Bytes0Offset-4    792.5Mi ±  4%   1462.3Mi ± 13%   +84.51% (p=0.000 n=8)
XORBytesAlignment/128Bytes1Offset-4    460.2Mi ±  0%   1337.2Mi ±  8%  +190.56% (p=0.000 n=8)
XORBytesAlignment/128Bytes2Offset-4    460.2Mi ±  0%   1244.6Mi ± 15%  +170.42% (p=0.000 n=8)
XORBytesAlignment/128Bytes3Offset-4    460.3Mi ±  0%   1428.1Mi ±  0%  +210.27% (p=0.000 n=8)
XORBytesAlignment/128Bytes4Offset-4    813.5Mi ±  0%   1461.6Mi ± 13%   +79.67% (p=0.000 n=8)
XORBytesAlignment/128Bytes5Offset-4    460.3Mi ±  0%   1428.0Mi ± 13%  +210.25% (p=0.000 n=8)
XORBytesAlignment/128Bytes6Offset-4    460.3Mi ±  0%   1285.1Mi ± 11%  +179.16% (p=0.000 n=8)
XORBytesAlignment/128Bytes7Offset-4    460.2Mi ±  0%   1427.9Mi ± 18%  +210.25% (p=0.000 n=8)
XORBytesAlignment/2048Bytes0Offset-4   1.711Gi ±  0%    2.579Gi ±  0%   +50.71% (p=0.000 n=8)
XORBytesAlignment/2048Bytes1Offset-4   594.5Mi ± 13%   2493.0Mi ± 20%  +319.35% (p=0.000 n=8)
XORBytesAlignment/2048Bytes2Offset-4   594.0Mi ± 13%   2492.7Mi ± 20%  +319.63% (p=0.000 n=8)
XORBytesAlignment/2048Bytes3Offset-4   594.4Mi ± 53%   2492.8Mi ±  0%  +319.35% (p=0.000 n=8)
XORBytesAlignment/2048Bytes4Offset-4   1.710Gi ± 53%    2.567Gi ±  0%   +50.17% (p=0.000 n=8)
XORBytesAlignment/2048Bytes5Offset-4   594.5Mi ±  0%   2487.9Mi ±  0%  +318.47% (p=0.000 n=8)
XORBytesAlignment/2048Bytes6Offset-4   594.8Mi ±  0%   2488.6Mi ±  0%  +318.41% (p=0.000 n=8)
XORBytesAlignment/2048Bytes7Offset-4   594.9Mi ±  0%   2488.3Mi ±  0%  +318.28% (p=0.000 n=8)
geomean                                414.2Mi          921.5Mi        +122.46%

Change-Id: I0ac50135de2e69fcf802be31e5175d666c93ad4c
Reviewed-on: https://go-review.googlesource.com/c/go/+/667817
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
2025-05-14 15:10:32 -07:00
kmvijay a177448765 runtime: Improvement in perf of s390x memclr
Memclr routine of s390x architecture is now implemented with vector operations.
And loop unrolling is used for larger sizes.

goos: linux
goarch: s390x
pkg: runtime
                        |    old.txt    |            new_final.txt             |
                        |    sec/op     |    sec/op     vs base                |
Memclr/5                   2.485n ±  5%   2.421n ±  0%   -2.54% (p=0.000 n=10)
Memclr/16                  3.037n ±  2%   2.969n ±  0%   -2.26% (p=0.001 n=10)
Memclr/64                  9.623n ±  0%   4.455n ±  1%  -53.70% (p=0.000 n=10)
Memclr/256                 3.347n ±  3%   3.312n ±  4%        ~ (p=0.670 n=10)
Memclr/4096                15.53n ±  0%   15.54n ±  0%   +0.06% (p=0.000 n=10)
Memclr/65536               329.8n ±  2%   228.4n ±  0%  -30.74% (p=0.000 n=10)
Memclr/1M                  13.09µ ±  0%   12.78µ ±  0%   -2.34% (p=0.000 n=10)
Memclr/4M                  52.33µ ±  0%   51.16µ ±  0%   -2.24% (p=0.000 n=10)
Memclr/8M                  104.6µ ±  0%   102.3µ ±  0%   -2.20% (p=0.000 n=10)
Memclr/16M                 209.4µ ±  0%   204.9µ ±  0%   -2.17% (p=0.000 n=10)
Memclr/64M                 977.8µ ±  0%   967.8µ ±  0%   -1.02% (p=0.000 n=10)
MemclrUnaligned/0_5        3.398n ±  0%   3.657n ±  0%   +7.62% (p=0.000 n=10)
MemclrUnaligned/0_16       3.957n ±  0%   3.958n ±  0%        ~ (p=0.325 n=10)
MemclrUnaligned/0_64      11.550n ±  0%   5.139n ±  0%  -55.51% (p=0.000 n=10)
MemclrUnaligned/0_256      4.288n ±  0%   4.025n ±  4%   -6.14% (p=0.000 n=10)
MemclrUnaligned/0_4096     15.53n ±  0%   15.53n ±  0%        ~ (p=1.000 n=10)
MemclrUnaligned/0_65536    318.3n ±  1%   233.9n ±  0%  -26.52% (p=0.000 n=10)
MemclrUnaligned/1_5        3.398n ±  0%   3.657n ±  0%   +7.62% (p=0.000 n=10)
MemclrUnaligned/1_16       3.965n ±  0%   3.969n ±  0%   +0.10% (p=0.000 n=10)
MemclrUnaligned/1_64      11.550n ±  0%   5.109n ±  0%  -55.76% (p=0.000 n=10)
MemclrUnaligned/1_256      4.385n ±  0%   4.174n ±  1%   -4.80% (p=0.000 n=10)
MemclrUnaligned/1_4096     26.23n ±  0%   26.24n ±  0%   +0.04% (p=0.005 n=10)
MemclrUnaligned/1_65536    570.5n ±  0%   401.3n ±  0%  -29.66% (p=0.000 n=10)
MemclrUnaligned/4_5        3.398n ±  0%   3.657n ±  0%   +7.62% (p=0.000 n=10)
MemclrUnaligned/4_16       3.965n ±  0%   3.973n ±  1%   +0.19% (p=0.000 n=10)
MemclrUnaligned/4_64      11.550n ±  0%   5.131n ±  0%  -55.58% (p=0.000 n=10)
MemclrUnaligned/4_256      4.419n ±  0%   4.187n ±  1%   -5.25% (p=0.000 n=10)
MemclrUnaligned/4_4096     26.23n ±  0%   26.24n ±  0%   +0.04% (p=0.011 n=10)
MemclrUnaligned/4_65536    570.5n ±  0%   401.2n ±  0%  -29.67% (p=0.000 n=10)
MemclrUnaligned/7_5        3.397n ±  0%   3.657n ±  0%   +7.65% (p=0.000 n=10)
MemclrUnaligned/7_16       3.965n ±  0%   3.969n ±  0%   +0.10% (p=0.000 n=10)
MemclrUnaligned/7_64      11.550n ±  0%   5.120n ±  0%  -55.67% (p=0.000 n=10)
MemclrUnaligned/7_256      4.407n ±  0%   4.188n ±  2%   -4.99% (p=0.000 n=10)
MemclrUnaligned/7_4096     26.24n ±  0%   26.24n ±  0%        ~ (p=1.000 n=10)
MemclrUnaligned/7_65536    570.8n ±  0%   401.3n ±  0%  -29.69% (p=0.000 n=10)
MemclrUnaligned/0_1M       13.08µ ±  0%   12.81µ ±  0%   -2.06% (p=0.000 n=10)
MemclrUnaligned/0_4M       52.28µ ±  0%   51.13µ ±  0%   -2.21% (p=0.000 n=10)
MemclrUnaligned/0_8M       104.6µ ±  0%   102.3µ ±  0%   -2.18% (p=0.000 n=10)
MemclrUnaligned/0_16M      209.5µ ±  0%   204.8µ ±  0%   -2.24% (p=0.000 n=10)
MemclrUnaligned/0_64M      977.7µ ±  0%   969.1µ ±  0%   -0.88% (p=0.000 n=10)
MemclrUnaligned/1_1M       17.49µ ±  0%   16.04µ ±  0%   -8.32% (p=0.000 n=10)
MemclrUnaligned/1_4M       69.92µ ±  0%   64.13µ ±  0%   -8.28% (p=0.000 n=10)
MemclrUnaligned/1_8M       139.8µ ±  0%   128.2µ ±  0%   -8.32% (p=0.000 n=10)
MemclrUnaligned/1_16M      279.9µ ±  0%   256.1µ ±  0%   -8.50% (p=0.000 n=10)
MemclrUnaligned/1_64M      1.250m ±  0%   1.216m ±  0%   -2.73% (p=0.000 n=10)
MemclrUnaligned/4_1M       17.50µ ±  0%   16.04µ ±  0%   -8.33% (p=0.000 n=10)
MemclrUnaligned/4_4M       69.93µ ±  0%   64.12µ ±  0%   -8.30% (p=0.000 n=10)
MemclrUnaligned/4_8M       139.8µ ±  0%   128.2µ ±  0%   -8.32% (p=0.000 n=10)
MemclrUnaligned/4_16M      280.2µ ±  0%   256.2µ ±  0%   -8.55% (p=0.000 n=10)
MemclrUnaligned/4_64M      1.250m ±  0%   1.216m ±  0%   -2.73% (p=0.000 n=10)
MemclrUnaligned/7_1M       17.50µ ±  0%   16.04µ ±  0%   -8.35% (p=0.000 n=10)
MemclrUnaligned/7_4M       69.92µ ±  0%   64.13µ ±  0%   -8.28% (p=0.000 n=10)
MemclrUnaligned/7_8M       139.8µ ±  0%   128.2µ ±  0%   -8.34% (p=0.000 n=10)
MemclrUnaligned/7_16M      279.6µ ±  0%   256.2µ ±  0%   -8.35% (p=0.000 n=10)
MemclrUnaligned/7_64M      1.250m ±  0%   1.216m ±  0%   -2.73% (p=0.000 n=10)
MemclrRange/1K_2K          1.053µ ±  0%   1.020µ ±  1%   -3.09% (p=0.000 n=10)
MemclrRange/2K_8K          1.552µ ±  0%   1.570µ ± 12%        ~ (p=0.137 n=10)
MemclrRange/4K_16K         1.283µ ±  0%   1.250µ ±  0%   -2.61% (p=0.000 n=10)
MemclrRange/160K_228K      20.62µ ±  0%   19.86µ ±  0%   -3.70% (p=0.000 n=10)
MemclrKnownSize1           1.732n ±  0%   1.732n ±  0%        ~ (p=1.000 n=10)
MemclrKnownSize2           1.925n ± 34%   1.967n ±  8%        ~ (p=0.080 n=10)
MemclrKnownSize4           1.808n ±  3%   1.732n ±  0%   -4.20% (p=0.000 n=10)
MemclrKnownSize8           2.002n ±  9%   1.773n ±  5%  -11.46% (p=0.000 n=10)
MemclrKnownSize16          2.880n ±  5%   2.461n ±  5%  -14.53% (p=0.000 n=10)
MemclrKnownSize32          8.082n ±  0%   2.838n ±  5%  -64.88% (p=0.000 n=10)
MemclrKnownSize64          8.083n ±  0%   4.960n ±  4%  -38.63% (p=0.000 n=10)
MemclrKnownSize112         8.082n ±  0%   5.533n ±  1%  -31.53% (p=0.000 n=10)
MemclrKnownSize128         8.082n ±  0%   5.534n ±  1%  -31.54% (p=0.000 n=10)
MemclrKnownSize192         8.082n ±  0%   6.833n ±  2%  -15.45% (p=0.000 n=10)
MemclrKnownSize248         8.082n ±  0%   7.165n ±  1%  -11.34% (p=0.000 n=10)
MemclrKnownSize256         2.995n ±  6%   3.226n ±  4%   +7.70% (p=0.006 n=10)
MemclrKnownSize512         3.356n ±  8%   3.595n ±  3%   +7.14% (p=0.007 n=10)
MemclrKnownSize1024        4.664n ±  0%   4.665n ±  0%        ~ (p=0.426 n=10)
MemclrKnownSize4096        15.80n ±  4%   15.15n ±  0%        ~ (p=0.449 n=10)
MemclrKnownSize512KiB      6.543µ ±  0%   6.380µ ±  0%   -2.48% (p=0.000 n=10)
geomean                    327.2n         286.6n        -12.42%

Change-Id: I0f8450743e2f7e736c5ff96a316a8b5d98b27222
Reviewed-on: https://go-review.googlesource.com/c/go/+/662475
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-14 14:33:09 -07:00
Ian Alexander ac341b8e6b cmd/go/internal/pkg: fail on bad filenames
Unhidden filenames with forbidden characters in subdirectories now
correctly fail the build instead of silently being skipped.
Previously this behavior would only trigger on files in the root of
the embedded directory.

Fixes #54003
Change-Id: I52967d90d6b7929e4ae474b78d3f88732bdd3ac4
Reviewed-on: https://go-review.googlesource.com/c/go/+/670615
Reviewed-by: Michael Matloob <matloob@google.com>
Reviewed-by: Michael Matloob <matloob@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-14 13:48:02 -07:00
qiulaidongfeng 12e11e7523 cmd/go: fix not print GCCGO when it's not the default
Fixes #69994

Change-Id: I2a23e5998b7421fd5ae0fdb68303d3244361b341
Reviewed-on: https://go-review.googlesource.com/c/go/+/671635
Reviewed-by: Michael Matloob <matloob@golang.org>
Reviewed-by: Michael Matloob <matloob@google.com>
Reviewed-by: Sean Liao <sean@liao.dev>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-05-14 13:45:40 -07:00
Mark Freeman a24f4db2a2 internal/pkgbits, cmd/compile/internal/noder: document string section
To understand this change, we begin with a short description of the UIR
file format.

Every file is a header followed by a series of sections. Each section
has a kind, which determines the type of elements it contains. An
element is just a collection of one or more primitives, as defined by
package pkgbits.

Strings have their own section. Elements in the string section contain
only string primitives. To use a string, elements in other sections
encode a reference to the string section.

To illustrate, consider a simple file which exports nothing at all.

  package p

In the meta section, there is an element representing a package stub.
In that package stub, a string ("p") represents both the path and name
of the package. Again, these are encoded as references.

To manage references, every element begins with a reference table.
Instead of writing the bytes for "p" directly, the package stub encodes
an index in this reference table. At that index, a pair of numbers is
stored, indicating:

  1. which section
  2. which element index within the section

Effectively, elements always use *2* layers of indirection; first to the
reference table, then to the bytes themselves.

With some minor hand-waving, an encoding for the above package is given
below, with (S)ections, (E)lements and (P)rimitives denoted.

+ Header
| + Section Ends                           // each section has 1 element
| | + 1                                    // String is elements [0, 1)
| | + 2                                    // Meta   is elements [1, 2)
| + Element Ends
| | + 1                                    // "p"    is bytes    [0, 1)
| | + 6                                    // stub   is bytes    [1, 6)
+ Payload
| + (S) String
| | + (E) String
| | | + (P) String           { byte } 0x70 // "p"
| + (S) Meta
| | + (E) Package Stub
| | | + Reference Table
| | | | + (P) Entry Count    uvarint  1    // there is a single entry
| | | | + (P) 0th Section    uvarint  0    // to String, 0th section
| | | | + (P) 0th Index      uvarint  0    // to 0th element in String
| | | + Internals
| | | | + (P) Path           uvarint  0    // 0th entry in table
| | | | + (P) Name           uvarint  0    // 0th entry in table

Note that string elements do not have reference tables like other
elements. They behave more like a primitive.

As this is a bit complicated and getting into details of the UIR file
format, we omit some details in the documentation here. The structure
will become clearer as we continue documenting.

Change-Id: I12a5ce9a34251c5358a20f2f2c4d0f9bd497f4d0
Reviewed-on: https://go-review.googlesource.com/c/go/+/671997
Reviewed-by: Robert Griesemer <gri@google.com>
Auto-Submit: Mark Freeman <mark@golang.org>
TryBot-Bypass: Mark Freeman <mark@golang.org>
2025-05-14 13:18:06 -07:00