Commit Graph

62581 Commits

Author SHA1 Message Date
Damien Neil 0a0e6af39b context: use atomic operation in ctx.Err
oos: darwin
goarch: arm64
pkg: context
cpu: Apple M1 Pro
               │ /tmp/bench.0.mac │          /tmp/bench.1.mac           │
               │      sec/op      │   sec/op     vs base                │
ErrOK-10             13.750n ± 1%   2.080n ± 0%  -84.87% (p=0.000 n=10)
ErrCanceled-10       13.530n ± 1%   3.248n ± 1%  -76.00% (p=0.000 n=10)
geomean               13.64n        2.599n       -80.94%

goos: linux
goarch: amd64
pkg: context
cpu: Intel(R) Xeon(R) CPU @ 2.30GHz
               │ /tmp/bench.0.linux │         /tmp/bench.1.linux          │
               │       sec/op       │   sec/op     vs base                │
ErrOK-16               21.435n ± 0%   4.243n ± 0%  -80.21% (p=0.000 n=10)
ErrCanceled-16         21.445n ± 0%   5.070n ± 0%  -76.36% (p=0.000 n=10)
geomean                 21.44n        4.638n       -78.37%

Fixes #72040

Change-Id: I3b337ab1934689d2da4134492ee7c5aac8f92845
Reviewed-on: https://go-review.googlesource.com/c/go/+/653795
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Damien Neil <dneil@google.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
2025-03-03 12:32:57 -08:00
qmuntal 14647b0ac8 os: only call GetConsoleMode for char devices
There is no need to call GetConsoleMode if we know that the file
type is not FILE_TYPE_CHAR. This is a tiny performance optimization,
as I sometimes see this call in profiles.

Change-Id: I9e9237908585d0ec8360930a0406b26f52699b92
Reviewed-on: https://go-review.googlesource.com/c/go/+/654155
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
2025-03-03 11:33:48 -08:00
Robert Griesemer 05354fc3b4 go/types, types2: remove remaining references to coreType in literals.go
For now, use commonUnder (formerly called sharedUnder) and update
error messages and comments. We can provide better error messages
in individual cases eventually.

For #70128.

Change-Id: I906ba9a0c768f6499c1683dc9be3ad27da8007a9
Reviewed-on: https://go-review.googlesource.com/c/go/+/653156
Reviewed-by: Robert Griesemer <gri@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Robert Findley <rfindley@google.com>
2025-03-03 11:26:34 -08:00
Robert Griesemer 26ba61dfad go/types, types2: remove most remaining references to coreType in builtin.go
For now, use commonUnder (formerly called sharedUnder) and update
error messages and comments. We can provide better error messages
in individual cases eventually.

Kepp using coreType for make built-in for now because it must accept
different channel types with non-conflicting directions and identical
element types. Added extra test cases.

While at it, rename sharedUnder, sharedUnderOrChan to commonUnder
and commonUnderOrChan, respectively (per suggestion from rfindley).

For #70128.

Change-Id: I11f3d5ce858746574f4302271d8cb763c2cdcf98
Reviewed-on: https://go-review.googlesource.com/c/go/+/653139
Reviewed-by: Robert Findley <rfindley@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
2025-03-03 11:26:31 -08:00
Filippo Valsorda 19d0b3e81f crypto/rsa: use Div instead of GCD for trial division
Div is way faster. We could actually test a lot more primes and still
gain performance despite the diminishing returns, but necessarily it
would have marginal impact overall.

fips140: off
goos: linux
goarch: amd64
pkg: crypto/rsa
cpu: AMD Ryzen 7 PRO 8700GE w/ Radeon 780M Graphics
                    │  e325b41ad1  │             0f611af2e1              │
                    │    sec/op    │   sec/op     vs base                │
GenerateKey/2048-16   124.19m ± 0%   39.93m ± 0%  -67.85% (p=0.000 n=20)

Surprisingly, the performance gain is similar on ARM64, which doesn't
have intrinsified math.Div.

fips140: off
goos: darwin
goarch: arm64
pkg: crypto/rsa
cpu: Apple M2
                   │  e325b41ad1  │             6276161a7f              │
                   │    sec/op    │   sec/op     vs base                │
GenerateKey/2048-8   136.49m ± 0%   47.97m ± 1%  -64.86% (p=0.000 n=20)

Change-Id: I6a6a46560331198312bd09c1cbe4d2b3c370c552
Reviewed-on: https://go-review.googlesource.com/c/go/+/639955
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Auto-Submit: Filippo Valsorda <filippo@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Daniel McCarney <daniel@binaryparadox.net>
2025-03-03 11:17:41 -08:00
Robert Griesemer 4b1ac7bbfe go/types, types2: remove references to core type in append
Writing explicit code for this case turned out to be simpler
and easier to reason about then relying on a helper functions
(except for typeset).

While at it, make append error messages more consistent.

For #70128.

Change-Id: I3dc79774249929de5061b4301ab2506d4b3da0d0
Reviewed-on: https://go-review.googlesource.com/c/go/+/653095
Reviewed-by: Robert Findley <rfindley@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
2025-03-03 11:07:14 -08:00
Junyang Shao f48b53f0f6 testing: fix testing.B.Loop doc on loop condition
As mentioned by
https://github.com/golang/go/issues/61515#issuecomment-2656656554,
the documentation should be relaxed.

Change-Id: I9f18301e1a4e4d9a72c9fa0b1132b1ba3cc57b03
Reviewed-on: https://go-review.googlesource.com/c/go/+/651435
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Commit-Queue: Junyang Shao <shaojunyang@google.com>
2025-03-03 10:37:29 -08:00
Brad Fitzpatrick 0312e31ed1 net: fix parsing of interfaces on plan9 without associated devices
Fixes #72060
Updates #39908

Change-Id: I7d5bda1654753acebc8aa9937d010b41c5722b36
Reviewed-on: https://go-review.googlesource.com/c/go/+/654055
Reviewed-by: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Russ Cox <rsc@golang.org>
Auto-Submit: Brad Fitzpatrick <bradfitz@golang.org>
2025-03-03 08:07:04 -08:00
Jakob Ackermann bda9f85e5c net/http: allocate CloseNotifier channel lazily
The CloseNotifier interface is deprecated. We can defer allocating the
backing channel until the first use of CloseNotifier.

goos: linux
goarch: amd64
pkg: net/http
cpu: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
                   │   before    │               after                │
                   │   sec/op    │   sec/op     vs base               │
Server-8             160.8µ ± 2%   160.1µ ± 1%       ~ (p=0.353 n=10)
CloseNotifier/h1-8   222.1µ ± 4%   226.4µ ± 7%       ~ (p=0.143 n=10)
geomean              189.0µ        190.4µ       +0.75%

                   │    before    │                after                │
                   │     B/op     │     B/op      vs base               │
Server-8             2.292Ki ± 0%   2.199Ki ± 0%  -4.07% (p=0.000 n=10)
CloseNotifier/h1-8   3.224Ki ± 0%   3.241Ki ± 0%  +0.51% (p=0.000 n=10)
geomean              2.718Ki        2.669Ki       -1.80%

                   │   before   │                after                │
                   │ allocs/op  │ allocs/op   vs base                 │
Server-8             21.00 ± 0%   20.00 ± 0%  -4.76% (p=0.000 n=10)
CloseNotifier/h1-8   50.00 ± 0%   50.00 ± 0%       ~ (p=1.000 n=10) ¹
geomean              32.40        31.62       -2.41%
¹ all samples are equal

Change-Id: I3f35d56b8356fb660589b7708a023e4480f32067
GitHub-Last-Rev: c75696b9b8
GitHub-Pull-Request: golang/go#71163
Reviewed-on: https://go-review.googlesource.com/c/go/+/640598
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Damien Neil <dneil@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-03-03 07:45:07 -08:00
Joel Sing b199d9766a runtime: add padding to m struct for 64 bit architectures
CL 652276 reduced the m struct by 8 bytes, which has changed the
allocation class on 64 bit OpenBSD platforms. This results in build
failures due to:

    M structure uses sizeclass 1792/0x700 bytes; incompatible with mutex flag mask 0x3ff

Add 128 bytes of padding when spinbitmutex is enabled on 64 bit
architectures, moving the size to the half point between the
1792 and 2048 allocation size.

Change-Id: I71623a1f75714543c302217e619d20cf0e717aeb
Reviewed-on: https://go-review.googlesource.com/c/go/+/653335
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2025-03-01 09:22:41 -08:00
Mark Ryan 5a7db813a6 cmd/internal/obj/riscv: add riscv64 CSR map
The map is automatically generated by running the latest version of
parse.py from github.com/riscv/riscv-opcodes.

Change-Id: I05e00ab27ec583750752c25e1835c2578b339fbf
Reviewed-on: https://go-review.googlesource.com/c/go/+/630518
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Pengcheng Wang <wangpengcheng.pp@bytedance.com>
Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Joel Sing <joel@sing.id.au>
Reviewed-by: Michael Pratt <mpratt@google.com>
2025-03-01 04:59:21 -08:00
qmuntal f4750a6cfb cmd/go/internal/work: use par.Cache to cache tool IDs.
The tool IDs can be calculated once and reused across multiple
threads. This is a small optimization that helps optimize system
resources.

On a normal Windows machine with 12 virtual CPUs, the time to build
a hello world program is reduced from over 1 second, with spikes of 2
seconds, to a consistent 0.7 seconds.

Updates #71981.

Change-Id: I85f4a19f8ad4230afa32213780c761b7eb22fa29
Reviewed-on: https://go-review.googlesource.com/c/go/+/653715
Reviewed-by: Michael Matloob <matloob@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-03-01 00:29:04 -08:00
Xiaolin Zhao 039b3ebeba runtime: use ABIInternal on syscall and other sys.stuff for loong64
Change-Id: I6b2942c413eab58c457980131022dace036cd76c
Reviewed-on: https://go-review.googlesource.com/c/go/+/623475
Reviewed-by: Meidan Li <limeidan@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
2025-02-28 17:10:54 -08:00
limeidan f24d2e175e cmd/internal/obj, cmd/asm: reclassify 32-bit immediate value of loong64
Change-Id: If9fd257ca0837a8c8597889c4f5ed3d4edc602c1
Reviewed-on: https://go-review.googlesource.com/c/go/+/636995
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-02-28 17:10:24 -08:00
Michael Matloob 6894904974 cmd/go/internal/modindex: clean modroot and pkgdir for openIndexPackage
GetPackage is sometimes called with a modroot or pkgdir that ends with a
path separator, and sometimes without. Clean them before passing them to
openIndexPackage to deduplicate calls to mcache.Do and action entry
files.

This shouldn't affect #71698 but was discovered while debugging that
issue.

Change-Id: I6a7fa4de93f45801504abea11bd97f6c6577f296
Reviewed-on: https://go-review.googlesource.com/c/go/+/652435
Reviewed-by: Alan Donovan <adonovan@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-02-28 14:09:39 -08:00
guoguangwu 64feded8af cmd/covdata: close output meta-data file
Change-Id: Idd2a324eb51ffa3f40cb3df03a82a1d6d882295a
GitHub-Last-Rev: 62e22b309d
GitHub-Pull-Request: golang/go#71993
Reviewed-on: https://go-review.googlesource.com/c/go/+/653140
Reviewed-by: Than McIntosh <thanm@golang.org>
Commit-Queue: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
2025-02-28 12:43:43 -08:00
qmuntal 2298215f5b cmd/link: use __got as the .got section name
The __nl_symbol_ptr is not a common section name anymore. LLVM prefers
__got for GOT symbols in the __DATA_CONST segment.

Note that the Go linker used to place the GOT section in the __DATA
segment, but since CL 644055 we place it in the __DATA_CONST segment.

Updates #71416.

Cq-Include-Trybots: luci.golang.try:gotip-darwin-amd64-longtest
Change-Id: Icb776e19855eaabb4777a9b1eb433497842413b4
Reviewed-on: https://go-review.googlesource.com/c/go/+/652555
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2025-02-28 12:43:36 -08:00
Jes Cok 74ba2164a0 reflect: add more tests for Type.{CanSeq,CanSeq2}
For #71874.

Change-Id: I3850edfb3104305f3bf4847a73cdd826cc99837f
GitHub-Last-Rev: 574c1edb7a
GitHub-Pull-Request: golang/go#71890
Reviewed-on: https://go-review.googlesource.com/c/go/+/651775
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-02-28 11:09:39 -08:00
Michael Anthony Knyszek 1dafccaaf3 runtime: increase timeout in TestSpuriousWakeupsNeverHangSemasleep
This change tries increasing the timeout in
TestSpuriousWakeupsNeverHangSemasleep. I'm not entirely sure of the
mechanism, but GODEBUG=gcstoptheworld=2 and GODEBUG=gccheckmark=1 can
cause this test to fail at it's regular timeout. It does not seem to
indicate a deadlock, because bumping the timeout 10x make the problem
go away. I suspect the problem is due to the long STW times these two
modes can induce, plus the fact this test runs in parallel with others.

Let's just bump the timeout. The test is fundamentally sound, and it's
unclear to me how else to test for a deadlock here.

Fixes #71691.
Fixes #71548.

Change-Id: I649531eeec8a8408ba90823ce5223f3a17863124
Reviewed-on: https://go-review.googlesource.com/c/go/+/652756
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2025-02-28 10:17:23 -08:00
Filippo Valsorda 6e8d7a113c crypto/x509: avoid crypto/rand.Int to generate serial number
It's probabyl safe enough, but just reading bytes from rand and then
using SetBytes is simpler, and doesn't require allowing calls from
crypto into math/big's Lsh, Sub, and Cmp.

Change-Id: I6a6a4656761f7073f9e149f288c48e97048ab13c
Reviewed-on: https://go-review.googlesource.com/c/go/+/643278
Auto-Submit: Filippo Valsorda <filippo@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Daniel McCarney <daniel@binaryparadox.net>
Reviewed-by: Roland Shoemaker <roland@golang.org>
2025-02-28 08:54:13 -08:00
KangJi 555974734f cmd/cgo: update generated headers for compatibility with latest MSVC C++ standards
Updates #71921

Change-Id: Idfbb72e259b169121c8ced6d89ee2f13d6254d0d
GitHub-Last-Rev: fcf12e5a22
GitHub-Pull-Request: golang/go#72004
Reviewed-on: https://go-review.googlesource.com/c/go/+/653141
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-02-28 08:48:12 -08:00
Jes Cok b3e36364b9 flag: replace interface{} -> any for textValue.Get method
Make it literally match the Getter interface.

Change-Id: I73f03780ba1d3fd2230e0e5e2343d40530d9e6d8
GitHub-Last-Rev: 398b90b2fb
GitHub-Pull-Request: golang/go#71975
Reviewed-on: https://go-review.googlesource.com/c/go/+/652795
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-02-28 08:43:46 -08:00
Damien Neil d31c805535 net/http: reject newlines in chunk-size lines
Unlike request headers, where we are allowed to leniently accept
a bare LF in place of a CRLF, chunked bodies must always use CRLF
line terminators. We were already enforcing this for chunk-data lines;
do so for chunk-size lines as well. Also reject bare CRs anywhere
other than as part of the CRLF terminator.

Fixes CVE-2025-22871
Fixes #71988

Change-Id: Ib0e21af5a8ba28c2a1ca52b72af8e2265ec79e4a
Reviewed-on: https://go-review.googlesource.com/c/go/+/652998
Reviewed-by: Jonathan Amsterdam <jba@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-02-27 09:23:42 -08:00
Lin Lin 0135120922 cmd/go: update c document
Fixes: #11875

Change-Id: I0ea2c3e94d7d1647c2aaa3d488ac3c1f5fb6cb18
GitHub-Last-Rev: 7512b33f05
GitHub-Pull-Request: golang/go#71966
Reviewed-on: https://go-review.googlesource.com/c/go/+/652675
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Michael Matloob <matloob@golang.org>
2025-02-27 07:57:37 -08:00
Russ Cox 2b73707358 math/big: add tests for allocation during multiply
Test that big.Int.Mul reusing the same target is not allocating
temporary garbage during its computation. That code is going
to be modified in an upcoming CL.

Change-Id: I3ed55c06da030282233c29cd7af2a04f395dc7a2
Reviewed-on: https://go-review.googlesource.com/c/go/+/652056
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
Auto-Submit: Russ Cox <rsc@golang.org>
2025-02-27 06:05:02 -08:00
Russ Cox 0ab2253ce1 math/big: move multiplication to natmul.go
No code changes.

This CL moves the multiplication (and squaring) code into natmul.go,
in preparation for cleaning up Karatsuba and then adding Toom-Cook
and FFT-based multiplication.

Change-Id: I7f84328284cc4e1ca4da0ebb9f666a5535e8d7f2
Reviewed-on: https://go-review.googlesource.com/c/go/+/652055
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Alan Donovan <adonovan@google.com>
2025-02-27 06:05:00 -08:00
Russ Cox 1f7e28acdf math/big: optimize atoi of base 2, 4, 16
Avoid multiplies when converting base 2, 4, 16 inputs,
reducing conversion time from O(N²) to O(N).

The Base8 and Base10 code paths should be unmodified,
but the base-2,4,16 changes tickle the compiler to generate
better (amd64) or worse (arm64) when really it should not.
This is described in detail in #71868 and should be ignored
for the purposes of this CL.

goos: linux
goarch: amd64
pkg: math/big
cpu: Intel(R) Xeon(R) CPU @ 3.10GHz
                      │     old      │                 new                 │
                      │    sec/op    │   sec/op     vs base                │
Scan/10/Base2-16         324.4n ± 0%   258.7n ± 0%  -20.25% (p=0.000 n=15)
Scan/100/Base2-16        2.376µ ± 0%   1.968µ ± 0%  -17.17% (p=0.000 n=15)
Scan/1000/Base2-16       23.89µ ± 0%   19.16µ ± 0%  -19.80% (p=0.000 n=15)
Scan/10000/Base2-16      311.5µ ± 0%   190.4µ ± 0%  -38.86% (p=0.000 n=15)
Scan/100000/Base2-16    10.508m ± 0%   1.904m ± 0%  -81.88% (p=0.000 n=15)
Scan/10/Base8-16         138.3n ± 0%   127.9n ± 0%   -7.52% (p=0.000 n=15)
Scan/100/Base8-16        886.1n ± 0%   790.2n ± 0%  -10.82% (p=0.000 n=15)
Scan/1000/Base8-16       9.227µ ± 0%   8.234µ ± 0%  -10.76% (p=0.000 n=15)
Scan/10000/Base8-16      165.8µ ± 0%   155.6µ ± 0%   -6.19% (p=0.000 n=15)
Scan/100000/Base8-16     9.044m ± 0%   8.935m ± 0%   -1.20% (p=0.000 n=15)
Scan/10/Base10-16        129.9n ± 0%   120.0n ± 0%   -7.62% (p=0.000 n=15)
Scan/100/Base10-16       816.3n ± 0%   730.0n ± 0%  -10.57% (p=0.000 n=15)
Scan/1000/Base10-16      8.518µ ± 0%   7.628µ ± 0%  -10.45% (p=0.000 n=15)
Scan/10000/Base10-16     158.6µ ± 0%   149.4µ ± 0%   -5.80% (p=0.000 n=15)
Scan/100000/Base10-16    8.962m ± 0%   8.855m ± 0%   -1.20% (p=0.000 n=15)
Scan/10/Base16-16        114.5n ± 0%   108.6n ± 0%   -5.15% (p=0.000 n=15)
Scan/100/Base16-16       648.3n ± 0%   525.0n ± 0%  -19.02% (p=0.000 n=15)
Scan/1000/Base16-16      7.375µ ± 0%   5.636µ ± 0%  -23.58% (p=0.000 n=15)
Scan/10000/Base16-16    171.18µ ± 0%   66.99µ ± 0%  -60.87% (p=0.000 n=15)
Scan/100000/Base16-16   9490.9µ ± 0%   682.8µ ± 0%  -92.81% (p=0.000 n=15)
geomean                  20.11µ        13.69µ       -31.94%

goos: linux
goarch: amd64
pkg: math/big
cpu: Intel(R) Xeon(R) Platinum 8481C CPU @ 2.70GHz
                      │      old      │                 new                 │
                      │    sec/op     │   sec/op     vs base                │
Scan/10/Base2-88          275.4n ± 0%   215.0n ± 0%  -21.93% (p=0.000 n=15)
Scan/100/Base2-88         1.869µ ± 0%   1.629µ ± 0%  -12.84% (p=0.000 n=15)
Scan/1000/Base2-88        18.56µ ± 0%   15.81µ ± 0%  -14.82% (p=0.000 n=15)
Scan/10000/Base2-88       270.0µ ± 0%   157.2µ ± 0%  -41.77% (p=0.000 n=15)
Scan/100000/Base2-88     11.518m ± 0%   1.571m ± 0%  -86.36% (p=0.000 n=15)
Scan/10/Base8-88          108.9n ± 0%   106.0n ± 0%   -2.66% (p=0.000 n=15)
Scan/100/Base8-88         655.2n ± 0%   594.9n ± 0%   -9.20% (p=0.000 n=15)
Scan/1000/Base8-88        6.467µ ± 0%   5.966µ ± 0%   -7.75% (p=0.000 n=15)
Scan/10000/Base8-88       151.2µ ± 0%   147.4µ ± 0%   -2.53% (p=0.000 n=15)
Scan/100000/Base8-88      10.33m ± 0%   10.30m ± 0%   -0.25% (p=0.000 n=15)
Scan/10/Base10-88        100.20n ± 0%   98.53n ± 0%   -1.67% (p=0.000 n=15)
Scan/100/Base10-88        596.9n ± 0%   543.3n ± 0%   -8.98% (p=0.000 n=15)
Scan/1000/Base10-88       5.904µ ± 0%   5.485µ ± 0%   -7.10% (p=0.000 n=15)
Scan/10000/Base10-88      145.7µ ± 0%   142.0µ ± 0%   -2.55% (p=0.000 n=15)
Scan/100000/Base10-88     10.26m ± 0%   10.24m ± 0%   -0.18% (p=0.000 n=15)
Scan/10/Base16-88         90.33n ± 0%   87.60n ± 0%   -3.02% (p=0.000 n=15)
Scan/100/Base16-88        506.4n ± 0%   437.7n ± 0%  -13.57% (p=0.000 n=15)
Scan/1000/Base16-88       5.056µ ± 0%   4.007µ ± 0%  -20.75% (p=0.000 n=15)
Scan/10000/Base16-88     163.35µ ± 0%   65.37µ ± 0%  -59.98% (p=0.000 n=15)
Scan/100000/Base16-88   11027.2µ ± 0%   735.1µ ± 0%  -93.33% (p=0.000 n=15)
geomean                   17.13µ        11.74µ       -31.46%

goos: linux
goarch: arm64
pkg: math/big
                      │     old      │                 new                  │
                      │    sec/op    │    sec/op     vs base                │
Scan/10/Base2-16         324.7n ± 0%    348.4n ± 0%   +7.30% (p=0.000 n=15)
Scan/100/Base2-16        2.604µ ± 0%    3.031µ ± 0%  +16.40% (p=0.000 n=15)
Scan/1000/Base2-16       26.15µ ± 0%    29.94µ ± 0%  +14.52% (p=0.000 n=15)
Scan/10000/Base2-16      334.3µ ± 0%    298.8µ ± 0%  -10.64% (p=0.000 n=15)
Scan/100000/Base2-16    10.664m ± 0%    2.991m ± 0%  -71.95% (p=0.000 n=15)
Scan/10/Base8-16         144.4n ± 1%    162.2n ± 1%  +12.33% (p=0.000 n=15)
Scan/100/Base8-16        917.2n ± 0%   1084.0n ± 0%  +18.19% (p=0.000 n=15)
Scan/1000/Base8-16       9.367µ ± 0%   10.901µ ± 0%  +16.38% (p=0.000 n=15)
Scan/10000/Base8-16      164.2µ ± 0%    181.2µ ± 0%  +10.34% (p=0.000 n=15)
Scan/100000/Base8-16     8.871m ± 1%    9.140m ± 0%   +3.04% (p=0.000 n=15)
Scan/10/Base10-16        134.6n ± 1%    148.3n ± 1%  +10.18% (p=0.000 n=15)
Scan/100/Base10-16       837.1n ± 0%    986.6n ± 0%  +17.86% (p=0.000 n=15)
Scan/1000/Base10-16      8.563µ ± 0%    9.936µ ± 0%  +16.03% (p=0.000 n=15)
Scan/10000/Base10-16     156.5µ ± 1%    171.3µ ± 0%   +9.41% (p=0.000 n=15)
Scan/100000/Base10-16    8.863m ± 1%    9.011m ± 0%   +1.66% (p=0.000 n=15)
Scan/10/Base16-16        115.7n ± 2%    129.1n ± 1%  +11.58% (p=0.000 n=15)
Scan/100/Base16-16       708.6n ± 0%    796.8n ± 0%  +12.45% (p=0.000 n=15)
Scan/1000/Base16-16      7.314µ ± 0%    7.554µ ± 0%   +3.28% (p=0.000 n=15)
Scan/10000/Base16-16    149.05µ ± 0%    74.60µ ± 0%  -49.95% (p=0.000 n=15)
Scan/100000/Base16-16   9091.6µ ± 0%    741.5µ ± 0%  -91.84% (p=0.000 n=15)
geomean                  20.39µ         17.65µ       -13.44%

goos: darwin
goarch: arm64
pkg: math/big
cpu: Apple M3 Pro
                      │     old      │                 new                 │
                      │    sec/op    │   sec/op     vs base                │
Scan/10/Base2-12         193.8n ± 2%   157.3n ± 1%  -18.83% (p=0.000 n=15)
Scan/100/Base2-12        1.445µ ± 2%   1.362µ ± 1%   -5.74% (p=0.000 n=15)
Scan/1000/Base2-12       14.28µ ± 0%   13.51µ ± 0%   -5.42% (p=0.000 n=15)
Scan/10000/Base2-12      177.1µ ± 0%   134.6µ ± 0%  -24.04% (p=0.000 n=15)
Scan/100000/Base2-12     5.429m ± 1%   1.333m ± 0%  -75.45% (p=0.000 n=15)
Scan/10/Base8-12         75.52n ± 2%   76.09n ± 1%        ~ (p=0.010 n=15)
Scan/100/Base8-12        528.4n ± 1%   532.1n ± 1%        ~ (p=0.003 n=15)
Scan/1000/Base8-12       5.423µ ± 1%   5.427µ ± 0%        ~ (p=0.183 n=15)
Scan/10000/Base8-12      89.26µ ± 1%   89.37µ ± 0%        ~ (p=0.237 n=15)
Scan/100000/Base8-12     4.543m ± 2%   4.560m ± 1%        ~ (p=0.595 n=15)
Scan/10/Base10-12        69.87n ± 1%   70.51n ± 0%        ~ (p=0.002 n=15)
Scan/100/Base10-12       488.4n ± 1%   491.2n ± 0%        ~ (p=0.060 n=15)
Scan/1000/Base10-12      5.014µ ± 1%   5.008µ ± 0%        ~ (p=0.783 n=15)
Scan/10000/Base10-12     84.90µ ± 0%   85.10µ ± 0%        ~ (p=0.109 n=15)
Scan/100000/Base10-12    4.516m ± 1%   4.521m ± 1%        ~ (p=0.713 n=15)
Scan/10/Base16-12        59.21n ± 1%   57.70n ± 1%   -2.55% (p=0.000 n=15)
Scan/100/Base16-12       380.0n ± 1%   360.7n ± 1%   -5.08% (p=0.000 n=15)
Scan/1000/Base16-12      3.775µ ± 0%   3.421µ ± 0%   -9.38% (p=0.000 n=15)
Scan/10000/Base16-12     80.62µ ± 0%   34.44µ ± 1%  -57.28% (p=0.000 n=15)
Scan/100000/Base16-12   4826.4µ ± 2%   450.9µ ± 2%  -90.66% (p=0.000 n=15)
geomean                  11.05µ        8.448µ       -23.52%

Change-Id: Ifdb2049545f34072aa75cdbb72bed4cf465f0ad7
Reviewed-on: https://go-review.googlesource.com/c/go/+/650640
Auto-Submit: Russ Cox <rsc@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Robert Griesemer <gri@google.com>
2025-02-27 06:04:57 -08:00
Russ Cox f8f08bfd7c math/big: improve scan test and benchmark
Add a few more test cases for scanning (integer conversion),
which were helpful in debugging some upcoming changes.

BenchmarkScan currently times converting the value 10**N
represented in base B back into []Word form.
When B = 10, the text is 1 followed by many zeros, which
could hit a "multiply by zero" special case when processing
many digit chunks, misrepresenting the actual time required
depending on whether that case is optimized.

Change the benchmark to use 9**N, which is about as big and
will not cause runs of zeros in any of the tested bases.

The benchmark comparison below is not showing faster code,
since of course the code is not changing at all here. Instead,
it is showing that the new benchmark work is roughly the same
size as the old benchmark work.

goos: darwin
goarch: arm64
pkg: math/big
cpu: Apple M3 Pro
                      │     old     │                new                 │
                      │   sec/op    │   sec/op     vs base               │
ScanPi-12               43.35µ ± 1%   43.59µ ± 1%       ~ (p=0.069 n=15)
Scan/10/Base2-12        202.3n ± 2%   193.7n ± 1%  -4.25% (p=0.000 n=15)
Scan/100/Base2-12       1.512µ ± 3%   1.447µ ± 1%  -4.30% (p=0.000 n=15)
Scan/1000/Base2-12      15.06µ ± 2%   14.33µ ± 0%  -4.83% (p=0.000 n=15)
Scan/10000/Base2-12     188.0µ ± 5%   177.3µ ± 1%  -5.65% (p=0.000 n=15)
Scan/100000/Base2-12    5.814m ± 3%   5.382m ± 1%  -7.43% (p=0.000 n=15)
Scan/10/Base8-12        78.57n ± 2%   75.02n ± 1%  -4.52% (p=0.000 n=15)
Scan/100/Base8-12       548.2n ± 2%   526.8n ± 1%  -3.90% (p=0.000 n=15)
Scan/1000/Base8-12      5.674µ ± 2%   5.421µ ± 0%  -4.46% (p=0.000 n=15)
Scan/10000/Base8-12     94.42µ ± 1%   88.61µ ± 1%  -6.15% (p=0.000 n=15)
Scan/100000/Base8-12    4.906m ± 2%   4.498m ± 3%  -8.31% (p=0.000 n=15)
Scan/10/Base10-12       73.42n ± 1%   69.56n ± 0%  -5.26% (p=0.000 n=15)
Scan/100/Base10-12      511.9n ± 1%   488.2n ± 0%  -4.63% (p=0.000 n=15)
Scan/1000/Base10-12     5.254µ ± 2%   5.009µ ± 0%  -4.66% (p=0.000 n=15)
Scan/10000/Base10-12    90.22µ ± 2%   84.52µ ± 0%  -6.32% (p=0.000 n=15)
Scan/100000/Base10-12   4.842m ± 3%   4.471m ± 3%  -7.65% (p=0.000 n=15)
Scan/10/Base16-12       62.28n ± 1%   58.70n ± 1%  -5.75% (p=0.000 n=15)
Scan/100/Base16-12      398.6n ± 0%   377.9n ± 1%  -5.19% (p=0.000 n=15)
Scan/1000/Base16-12     4.108µ ± 1%   3.782µ ± 0%  -7.94% (p=0.000 n=15)
Scan/10000/Base16-12    83.78µ ± 2%   80.51µ ± 1%  -3.90% (p=0.000 n=15)
Scan/100000/Base16-12   5.080m ± 3%   4.698m ± 3%  -7.53% (p=0.000 n=15)
geomean                 12.41µ        11.74µ       -5.36%

Change-Id: If3ce290ecc7f38672f11b42fd811afb53dee665d
Reviewed-on: https://go-review.googlesource.com/c/go/+/650639
Reviewed-by: Alan Donovan <adonovan@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Russ Cox <rsc@golang.org>
2025-02-27 05:58:49 -08:00
Russ Cox 872496da2b math/big: replace nat pool with Word stack
In the early days of math/big, algorithms that needed more space
grew the result larger than it needed to be and then used the
high words as extra space. This made results their own temporary
space caches, at the cost that saving a result in a data structure
might hold significantly more memory than necessary.
Specifically, new(big.Int).Mul(x, y) returned a big.Int with a
backing slice 3X as big as it strictly needed to be.
If you are storing many multiplication results, or even a single
large result, the 3X overhead can add up.

This approach to storage for temporaries also requires being able
to analyze the algorithms to predict the exact amount they need,
which can be difficult.

For both these reasons, the implementation of recursive long division,
which came later, introduced a “nat pool” where temporaries could be
stored and reused, or reclaimed by the GC when no longer used.
This avoids the storage and bookkeeping overheads but introduces a
per-temporary sync.Pool overhead. divRecursiveStep takes an array
of cached temporaries to remove some of that overhead.
The nat pool was better but is still not quite right.

This CL introduces something even better than the nat pool
(still probably not quite right, but the best I can see for now):
a sync.Pool holding stacks for allocating temporaries.
Now an operation can get one stack out of the pool and then
allocate as many temporaries as it needs during the operation,
eventually returning the stack back to the pool. The sync.Pool
operations are now per-exported-operation (like big.Int.Mul),
not per-temporary.

This CL converts both the pre-allocation in nat.mul and the
uses of the nat pool to use stack pools instead. This simplifies
some code and sets us up better for more complex algorithms
(such as Toom-Cook or FFT-based multiplication) that need
more temporaries. It is also a little bit faster.

goos: linux
goarch: amd64
pkg: math/big
cpu: Intel(R) Xeon(R) CPU @ 3.10GHz
                         │     old     │                 new                 │
                         │   sec/op    │   sec/op     vs base                │
Div/20/10-16               23.68n ± 0%   22.21n ± 0%   -6.21% (p=0.000 n=15)
Div/40/20-16               23.68n ± 0%   22.21n ± 0%   -6.21% (p=0.000 n=15)
Div/100/50-16              56.65n ± 0%   55.53n ± 0%   -1.98% (p=0.000 n=15)
Div/200/100-16             194.6n ± 1%   172.8n ± 0%  -11.20% (p=0.000 n=15)
Div/400/200-16             232.1n ± 0%   206.7n ± 0%  -10.94% (p=0.000 n=15)
Div/1000/500-16            405.3n ± 1%   383.8n ± 0%   -5.30% (p=0.000 n=15)
Div/2000/1000-16           810.4n ± 1%   795.2n ± 0%   -1.88% (p=0.000 n=15)
Div/20000/10000-16         25.88µ ± 0%   25.39µ ± 0%   -1.89% (p=0.000 n=15)
Div/200000/100000-16       931.5µ ± 0%   924.3µ ± 0%   -0.77% (p=0.000 n=15)
Div/2000000/1000000-16     37.77m ± 0%   37.75m ± 0%        ~ (p=0.098 n=15)
Div/20000000/10000000-16    1.367 ± 0%    1.377 ± 0%   +0.72% (p=0.003 n=15)
NatMul/10-16               168.5n ± 3%   164.0n ± 4%        ~ (p=0.751 n=15)
NatMul/100-16              6.086µ ± 3%   5.380µ ± 3%  -11.60% (p=0.000 n=15)
NatMul/1000-16             238.1µ ± 3%   228.3µ ± 1%   -4.12% (p=0.000 n=15)
NatMul/10000-16            8.721m ± 2%   8.518m ± 1%   -2.33% (p=0.000 n=15)
NatMul/100000-16           369.6m ± 0%   371.1m ± 0%   +0.42% (p=0.000 n=15)
geomean                    19.57µ        18.74µ        -4.21%

                 │     old      │                  new                   │
                 │     B/op     │     B/op      vs base                  │
NatMul/10-16         192.0 ± 0%     192.0 ± 0%        ~ (p=1.000 n=15) ¹
NatMul/100-16      4.750Ki ± 0%   1.751Ki ± 0%  -63.14% (p=0.000 n=15)
NatMul/1000-16     48.16Ki ± 0%   16.02Ki ± 0%  -66.73% (p=0.000 n=15)
NatMul/10000-16    482.9Ki ± 1%   165.4Ki ± 3%  -65.75% (p=0.000 n=15)
NatMul/100000-16   5.747Mi ± 7%   4.197Mi ± 0%  -26.97% (p=0.000 n=15)
geomean            41.42Ki        20.63Ki       -50.18%
¹ all samples are equal

                 │     old     │                 new                  │
                 │  allocs/op  │  allocs/op   vs base                 │
NatMul/10-16       1.000 ±  0%   1.000 ±  0%       ~ (p=1.000 n=15) ¹
NatMul/100-16      1.000 ±  0%   1.000 ±  0%       ~ (p=1.000 n=15) ¹
NatMul/1000-16     1.000 ±  0%   1.000 ±  0%       ~ (p=1.000 n=15) ¹
NatMul/10000-16    1.000 ±  0%   1.000 ±  0%       ~ (p=1.000 n=15) ¹
NatMul/100000-16   7.000 ± 14%   7.000 ± 14%       ~ (p=0.668 n=15)
geomean            1.476         1.476        +0.00%
¹ all samples are equal

goos: linux
goarch: amd64
pkg: math/big
cpu: Intel(R) Xeon(R) Platinum 8481C CPU @ 2.70GHz
                         │     old     │                 new                 │
                         │   sec/op    │   sec/op     vs base                │
Div/20/10-88               15.84n ± 1%   13.12n ± 0%  -17.17% (p=0.000 n=15)
Div/40/20-88               15.88n ± 1%   13.12n ± 0%  -17.38% (p=0.000 n=15)
Div/100/50-88              26.42n ± 0%   25.47n ± 0%   -3.60% (p=0.000 n=15)
Div/200/100-88             132.4n ± 0%   114.9n ± 0%  -13.22% (p=0.000 n=15)
Div/400/200-88             150.1n ± 0%   135.6n ± 0%   -9.66% (p=0.000 n=15)
Div/1000/500-88            275.5n ± 0%   264.1n ± 0%   -4.14% (p=0.000 n=15)
Div/2000/1000-88           586.5n ± 0%   581.1n ± 0%   -0.92% (p=0.000 n=15)
Div/20000/10000-88         25.87µ ± 0%   25.72µ ± 0%   -0.59% (p=0.000 n=15)
Div/200000/100000-88       772.2µ ± 0%   779.0µ ± 0%   +0.88% (p=0.000 n=15)
Div/2000000/1000000-88     33.36m ± 0%   33.63m ± 0%   +0.80% (p=0.000 n=15)
Div/20000000/10000000-88    1.307 ± 0%    1.320 ± 0%   +1.03% (p=0.000 n=15)
NatMul/10-88               140.4n ± 0%   148.8n ± 4%   +5.98% (p=0.000 n=15)
NatMul/100-88              4.663µ ± 1%   4.388µ ± 1%   -5.90% (p=0.000 n=15)
NatMul/1000-88             207.7µ ± 0%   205.8µ ± 0%   -0.89% (p=0.000 n=15)
NatMul/10000-88            8.456m ± 0%   8.468m ± 0%   +0.14% (p=0.021 n=15)
NatMul/100000-88           295.1m ± 0%   297.9m ± 0%   +0.94% (p=0.000 n=15)
geomean                    14.96µ        14.33µ        -4.23%

                 │     old      │                   new                   │
                 │     B/op     │     B/op       vs base                  │
NatMul/10-88         192.0 ± 0%     192.0 ±  0%        ~ (p=1.000 n=15) ¹
NatMul/100-88      4.750Ki ± 0%   1.758Ki ±  0%  -62.99% (p=0.000 n=15)
NatMul/1000-88     48.44Ki ± 0%   16.08Ki ±  0%  -66.80% (p=0.000 n=15)
NatMul/10000-88    489.7Ki ± 1%   166.1Ki ±  3%  -66.08% (p=0.000 n=15)
NatMul/100000-88   5.546Mi ± 0%   3.819Mi ± 60%  -31.15% (p=0.000 n=15)
geomean            41.29Ki        20.30Ki        -50.85%
¹ all samples are equal

                 │     old     │                 new                  │
                 │  allocs/op  │  allocs/op   vs base                 │
NatMul/10-88       1.000 ±  0%   1.000 ±  0%       ~ (p=1.000 n=15) ¹
NatMul/100-88      1.000 ±  0%   1.000 ±  0%       ~ (p=1.000 n=15) ¹
NatMul/1000-88     1.000 ±  0%   1.000 ±  0%       ~ (p=1.000 n=15) ¹
NatMul/10000-88    1.000 ±  0%   1.000 ±  0%       ~ (p=1.000 n=15) ¹
NatMul/100000-88   5.000 ± 20%   6.000 ± 67%       ~ (p=0.672 n=15)
geomean            1.380         1.431        +3.71%
¹ all samples are equal

goos: linux
goarch: arm64
pkg: math/big
                         │     old     │                 new                 │
                         │   sec/op    │   sec/op     vs base                │
Div/20/10-16               15.85n ± 0%   15.23n ± 0%   -3.91% (p=0.000 n=15)
Div/40/20-16               15.88n ± 0%   15.22n ± 0%   -4.16% (p=0.000 n=15)
Div/100/50-16              29.69n ± 0%   26.39n ± 0%  -11.11% (p=0.000 n=15)
Div/200/100-16             149.2n ± 0%   123.3n ± 0%  -17.36% (p=0.000 n=15)
Div/400/200-16             160.3n ± 0%   139.2n ± 0%  -13.16% (p=0.000 n=15)
Div/1000/500-16            271.0n ± 0%   256.1n ± 0%   -5.50% (p=0.000 n=15)
Div/2000/1000-16           545.3n ± 0%   527.0n ± 0%   -3.36% (p=0.000 n=15)
Div/20000/10000-16         22.60µ ± 0%   22.20µ ± 0%   -1.77% (p=0.000 n=15)
Div/200000/100000-16       889.0µ ± 0%   892.2µ ± 0%   +0.35% (p=0.000 n=15)
Div/2000000/1000000-16     38.01m ± 0%   38.12m ± 0%   +0.30% (p=0.000 n=15)
Div/20000000/10000000-16    1.437 ± 0%    1.444 ± 0%   +0.50% (p=0.000 n=15)
NatMul/10-16               166.4n ± 2%   169.5n ± 1%   +1.86% (p=0.000 n=15)
NatMul/100-16              5.733µ ± 1%   5.570µ ± 1%   -2.84% (p=0.000 n=15)
NatMul/1000-16             232.6µ ± 1%   229.8µ ± 0%   -1.22% (p=0.000 n=15)
NatMul/10000-16            9.039m ± 1%   8.969m ± 0%   -0.77% (p=0.000 n=15)
NatMul/100000-16           367.0m ± 0%   368.8m ± 0%   +0.48% (p=0.000 n=15)
geomean                    16.15µ        15.50µ        -4.01%

                 │     old      │                  new                   │
                 │     B/op     │     B/op      vs base                  │
NatMul/10-16         192.0 ± 0%     192.0 ± 0%        ~ (p=1.000 n=15) ¹
NatMul/100-16      4.750Ki ± 0%   1.751Ki ± 0%  -63.14% (p=0.000 n=15)
NatMul/1000-16     48.33Ki ± 0%   16.02Ki ± 0%  -66.85% (p=0.000 n=15)
NatMul/10000-16    536.5Ki ± 1%   165.7Ki ± 3%  -69.12% (p=0.000 n=15)
NatMul/100000-16   6.078Mi ± 6%   4.197Mi ± 0%  -30.94% (p=0.000 n=15)
geomean            42.81Ki        20.64Ki       -51.78%
¹ all samples are equal

                 │     old     │                  new                  │
                 │  allocs/op  │  allocs/op   vs base                  │
NatMul/10-16       1.000 ±  0%   1.000 ±  0%        ~ (p=1.000 n=15) ¹
NatMul/100-16      1.000 ±  0%   1.000 ±  0%        ~ (p=1.000 n=15) ¹
NatMul/1000-16     1.000 ±  0%   1.000 ±  0%        ~ (p=1.000 n=15) ¹
NatMul/10000-16    2.000 ± 50%   1.000 ±  0%  -50.00% (p=0.001 n=15)
NatMul/100000-16   9.000 ± 11%   8.000 ± 12%  -11.11% (p=0.001 n=15)
geomean            1.783         1.516        -14.97%
¹ all samples are equal

goos: darwin
goarch: arm64
pkg: math/big
cpu: Apple M3 Pro
                         │     old      │                new                 │
                         │    sec/op    │   sec/op     vs base               │
Div/20/10-12                9.850n ± 1%   9.405n ± 1%  -4.52% (p=0.000 n=15)
Div/40/20-12                9.858n ± 0%   9.403n ± 1%  -4.62% (p=0.000 n=15)
Div/100/50-12               16.40n ± 1%   14.81n ± 0%  -9.70% (p=0.000 n=15)
Div/200/100-12              88.48n ± 2%   80.88n ± 0%  -8.59% (p=0.000 n=15)
Div/400/200-12             107.90n ± 1%   99.28n ± 1%  -7.99% (p=0.000 n=15)
Div/1000/500-12             188.8n ± 1%   178.6n ± 1%  -5.40% (p=0.000 n=15)
Div/2000/1000-12            399.9n ± 0%   389.1n ± 0%  -2.70% (p=0.000 n=15)
Div/20000/10000-12          13.94µ ± 2%   13.81µ ± 1%       ~ (p=0.574 n=15)
Div/200000/100000-12        523.8µ ± 0%   521.7µ ± 0%  -0.40% (p=0.000 n=15)
Div/2000000/1000000-12      21.46m ± 0%   21.48m ± 0%       ~ (p=0.067 n=15)
Div/20000000/10000000-12    812.5m ± 0%   812.9m ± 0%       ~ (p=0.061 n=15)
NatMul/10-12                77.14n ± 0%   78.35n ± 1%  +1.57% (p=0.000 n=15)
NatMul/100-12               2.999µ ± 0%   2.871µ ± 1%  -4.27% (p=0.000 n=15)
NatMul/1000-12              126.2µ ± 0%   126.8µ ± 0%  +0.51% (p=0.011 n=15)
NatMul/10000-12             5.099m ± 0%   5.125m ± 0%  +0.51% (p=0.000 n=15)
NatMul/100000-12            206.7m ± 0%   208.4m ± 0%  +0.80% (p=0.000 n=15)
geomean                     9.512µ        9.236µ       -2.91%

                 │     old      │                   new                    │
                 │     B/op     │      B/op       vs base                  │
NatMul/10-12         192.0 ± 0%     192.0 ±   0%        ~ (p=1.000 n=15) ¹
NatMul/100-12      4.750Ki ± 0%   1.750Ki ±   0%  -63.16% (p=0.000 n=15)
NatMul/1000-12     48.13Ki ± 0%   16.01Ki ±   0%  -66.73% (p=0.000 n=15)
NatMul/10000-12    483.5Ki ± 1%   163.2Ki ±   2%  -66.24% (p=0.000 n=15)
NatMul/100000-12   5.480Mi ± 4%   1.532Mi ± 104%  -72.05% (p=0.000 n=15)
geomean            41.03Ki        16.82Ki         -59.01%
¹ all samples are equal

                 │    old     │                  new                   │
                 │ allocs/op  │  allocs/op    vs base                  │
NatMul/10-12       1.000 ± 0%   1.000 ±   0%        ~ (p=1.000 n=15) ¹
NatMul/100-12      1.000 ± 0%   1.000 ±   0%        ~ (p=1.000 n=15) ¹
NatMul/1000-12     1.000 ± 0%   1.000 ±   0%        ~ (p=1.000 n=15) ¹
NatMul/10000-12    1.000 ± 0%   1.000 ±   0%        ~ (p=1.000 n=15) ¹
NatMul/100000-12   5.000 ± 0%   1.000 ± 400%  -80.00% (p=0.007 n=15)
geomean            1.380        1.000         -27.52%
¹ all samples are equal

Change-Id: I7efa6fe37971ed26ae120a32250fcb47ece0a011
Reviewed-on: https://go-review.googlesource.com/c/go/+/650638
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
2025-02-27 05:58:09 -08:00
Russ Cox 763766e9ab math/big: report allocs in BenchmarkNatMul, BenchmarkNatSqr
Change-Id: I112f55c0e3ee3b75e615a06b27552de164565c04
Reviewed-on: https://go-review.googlesource.com/c/go/+/650637
Reviewed-by: Robert Griesemer <gri@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Russ Cox <rsc@golang.org>
2025-02-27 05:50:45 -08:00
Russ Cox b66d83ccea math/big: clean up GCD a little
The GCD code was setting one *Int to the value of another
by smashing one struct on top of the other, instead of using Set.
That was safe in this one case, but it's not idiomatic in math/big
nor safe in general, so rewrite the code not to do that.
(In one case, by swapping variables around; in another, by calling Set.)

The added Set call does slow down GCDs by a small amount,
since the answer has to be copied out. To compensate for that,
optimize a bit: remove the s, t temporaries entirely and handle
vector x word multiplication directly. The net result is that almost
all GCDs are faster, except for small ones, which are a few
nanoseconds slower.

goos: darwin
goarch: arm64
pkg: math/big
cpu: Apple M3 Pro
                              │ bench.before │             bench.after             │
                              │    sec/op    │   sec/op     vs base                │
GCD10x10/WithoutXY-12            23.80n ± 1%   31.71n ± 1%  +33.24% (p=0.000 n=10)
GCD10x10/WithXY-12              100.40n ± 0%   92.14n ± 1%   -8.22% (p=0.000 n=10)
GCD10x100/WithoutXY-12           63.70n ± 0%   70.73n ± 0%  +11.05% (p=0.000 n=10)
GCD10x100/WithXY-12              278.6n ± 0%   233.1n ± 1%  -16.35% (p=0.000 n=10)
GCD10x1000/WithoutXY-12          153.4n ± 0%   162.2n ± 1%   +5.74% (p=0.000 n=10)
GCD10x1000/WithXY-12             456.0n ± 0%   411.8n ± 1%   -9.69% (p=0.000 n=10)
GCD10x10000/WithoutXY-12         1.002µ ± 1%   1.036µ ± 0%   +3.39% (p=0.000 n=10)
GCD10x10000/WithXY-12            2.330µ ± 1%   2.210µ ± 0%   -5.13% (p=0.000 n=10)
GCD10x100000/WithoutXY-12        8.894µ ± 0%   8.889µ ± 1%        ~ (p=0.754 n=10)
GCD10x100000/WithXY-12           20.84µ ± 0%   20.24µ ± 0%   -2.84% (p=0.000 n=10)
GCD100x100/WithoutXY-12          373.3n ± 3%   314.4n ± 0%  -15.76% (p=0.000 n=10)
GCD100x100/WithXY-12             662.5n ± 0%   572.4n ± 1%  -13.59% (p=0.000 n=10)
GCD100x1000/WithoutXY-12         641.8n ± 0%   598.1n ± 1%   -6.81% (p=0.000 n=10)
GCD100x1000/WithXY-12            1.123µ ± 0%   1.019µ ± 1%   -9.26% (p=0.000 n=10)
GCD100x10000/WithoutXY-12        2.870µ ± 0%   2.831µ ± 0%   -1.38% (p=0.000 n=10)
GCD100x10000/WithXY-12           4.930µ ± 1%   4.675µ ± 0%   -5.16% (p=0.000 n=10)
GCD100x100000/WithoutXY-12       24.08µ ± 0%   23.97µ ± 0%   -0.48% (p=0.007 n=10)
GCD100x100000/WithXY-12          43.66µ ± 0%   42.52µ ± 0%   -2.61% (p=0.001 n=10)
GCD1000x1000/WithoutXY-12        3.999µ ± 0%   3.569µ ± 1%  -10.75% (p=0.000 n=10)
GCD1000x1000/WithXY-12           6.397µ ± 0%   5.534µ ± 0%  -13.49% (p=0.000 n=10)
GCD1000x10000/WithoutXY-12       6.875µ ± 0%   6.450µ ± 0%   -6.18% (p=0.000 n=10)
GCD1000x10000/WithXY-12          20.75µ ± 1%   19.17µ ± 1%   -7.64% (p=0.000 n=10)
GCD1000x100000/WithoutXY-12      36.38µ ± 0%   35.60µ ± 1%   -2.13% (p=0.000 n=10)
GCD1000x100000/WithXY-12         172.1µ ± 0%   174.4µ ± 3%        ~ (p=0.052 n=10)
GCD10000x10000/WithoutXY-12      79.89µ ± 1%   75.16µ ± 2%   -5.92% (p=0.000 n=10)
GCD10000x10000/WithXY-12         160.1µ ± 0%   150.0µ ± 0%   -6.33% (p=0.000 n=10)
GCD10000x100000/WithoutXY-12     213.2µ ± 1%   209.0µ ± 1%   -1.98% (p=0.000 n=10)
GCD10000x100000/WithXY-12        1.399m ± 0%   1.342m ± 3%   -4.08% (p=0.002 n=10)
GCD100000x100000/WithoutXY-12    5.463m ± 1%   5.504m ± 2%        ~ (p=0.190 n=10)
GCD100000x100000/WithXY-12       11.36m ± 0%   11.46m ± 1%   +0.86% (p=0.000 n=10)
geomean                          6.953µ        6.695µ        -3.71%

goos: linux
goarch: amd64
pkg: math/big
cpu: AMD Ryzen 9 7950X 16-Core Processor
                              │ bench.before │             bench.after             │
                              │    sec/op    │   sec/op     vs base                │
GCD10x10/WithoutXY-32           39.66n ±  4%   44.34n ± 4%  +11.77% (p=0.000 n=10)
GCD10x10/WithXY-32              156.7n ± 12%   130.8n ± 2%  -16.53% (p=0.000 n=10)
GCD10x100/WithoutXY-32          115.8n ±  5%   120.2n ± 2%   +3.89% (p=0.000 n=10)
GCD10x100/WithXY-32             465.3n ±  3%   368.1n ± 2%  -20.91% (p=0.000 n=10)
GCD10x1000/WithoutXY-32         201.1n ±  1%   210.8n ± 2%   +4.82% (p=0.000 n=10)
GCD10x1000/WithXY-32            652.9n ±  4%   605.0n ± 1%   -7.32% (p=0.002 n=10)
GCD10x10000/WithoutXY-32        1.046µ ±  2%   1.143µ ± 1%   +9.33% (p=0.000 n=10)
GCD10x10000/WithXY-32           3.360µ ±  1%   3.258µ ± 1%   -3.04% (p=0.000 n=10)
GCD10x100000/WithoutXY-32       9.391µ ±  3%   9.997µ ± 1%   +6.46% (p=0.000 n=10)
GCD10x100000/WithXY-32          27.92µ ±  1%   28.21µ ± 0%   +1.04% (p=0.043 n=10)
GCD100x100/WithoutXY-32         443.7n ±  5%   320.0n ± 2%  -27.88% (p=0.000 n=10)
GCD100x100/WithXY-32            789.9n ±  2%   690.4n ± 1%  -12.60% (p=0.000 n=10)
GCD100x1000/WithoutXY-32        718.4n ±  3%   600.0n ± 1%  -16.48% (p=0.000 n=10)
GCD100x1000/WithXY-32           1.388µ ±  4%   1.175µ ± 1%  -15.28% (p=0.000 n=10)
GCD100x10000/WithoutXY-32       2.750µ ±  1%   2.668µ ± 1%   -2.96% (p=0.000 n=10)
GCD100x10000/WithXY-32          6.016µ ±  1%   5.590µ ± 1%   -7.09% (p=0.000 n=10)
GCD100x100000/WithoutXY-32      21.40µ ±  1%   22.30µ ± 1%   +4.21% (p=0.000 n=10)
GCD100x100000/WithXY-32         47.02µ ±  4%   48.80µ ± 0%   +3.78% (p=0.015 n=10)
GCD1000x1000/WithoutXY-32       3.417µ ±  4%   3.020µ ± 1%  -11.65% (p=0.000 n=10)
GCD1000x1000/WithXY-32          5.752µ ±  0%   5.418µ ± 2%   -5.81% (p=0.000 n=10)
GCD1000x10000/WithoutXY-32      6.150µ ±  0%   6.246µ ± 1%   +1.55% (p=0.000 n=10)
GCD1000x10000/WithXY-32         24.68µ ±  3%   25.07µ ± 1%        ~ (p=0.051 n=10)
GCD1000x100000/WithoutXY-32     34.60µ ±  2%   36.85µ ± 1%   +6.51% (p=0.000 n=10)
GCD1000x100000/WithXY-32        209.5µ ±  4%   227.4µ ± 0%   +8.56% (p=0.000 n=10)
GCD10000x10000/WithoutXY-32     90.69µ ±  0%   88.48µ ± 0%   -2.44% (p=0.000 n=10)
GCD10000x10000/WithXY-32        197.1µ ±  0%   200.5µ ± 0%   +1.73% (p=0.000 n=10)
GCD10000x100000/WithoutXY-32    239.1µ ±  0%   242.5µ ± 0%   +1.42% (p=0.000 n=10)
GCD10000x100000/WithXY-32       1.963m ±  3%   2.028m ± 0%   +3.28% (p=0.000 n=10)
GCD100000x100000/WithoutXY-32   7.466m ±  0%   7.412m ± 0%   -0.71% (p=0.000 n=10)
GCD100000x100000/WithXY-32      16.10m ±  2%   16.47m ± 0%   +2.25% (p=0.000 n=10)
geomean                         8.388µ         8.127µ        -3.12%

Change-Id: I161dc409bad11bcc553bc8116449905ae5b06742
Reviewed-on: https://go-review.googlesource.com/c/go/+/650636
Reviewed-by: Robert Griesemer <gri@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
Auto-Submit: Russ Cox <rsc@golang.org>
2025-02-27 05:49:50 -08:00
Joel Sing 37e9c5eaba cmd/internal/obj/riscv: implement vector load/store instructions
Implement vector unit stride, vector strided, vector indexed and
vector whole register load and store instructions.

The vector unit stride instructions take an optional vector mask
register, which if specified must be register V0. If only two
operands are given, the instruction is encoded as unmasked.

The vector strided and vector indexed instructions also take an
optional vector mask register, which if specified must be register
V0. If only three operands are given, the instruction is encoded as
unmasked.

Cq-Include-Trybots: luci.golang.try:gotip-linux-riscv64
Change-Id: I35e43bb8f1cf6ae8826fbeec384b95ac945da50f
Reviewed-on: https://go-review.googlesource.com/c/go/+/631937
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Mark Ryan <markdryan@rivosinc.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Pengcheng Wang <wangpengcheng.pp@bytedance.com>
2025-02-27 03:47:20 -08:00
Joel Sing 927fdb7843 cmd/compile: simplify intrinsification of TrailingZeros16 and TrailingZeros8
Decompose Ctz16 and Ctz8 within the SSA rules for LOONG64, MIPS, PPC64
and S390X, rather than having a custom intrinsic. Note that for PPC64 this
actually allows the existing Ctz16 and Ctz8 rules to be used.

Change-Id: I27a5e978f852b9d75396d2a80f5d7dfcb5ef7dd4
Reviewed-on: https://go-review.googlesource.com/c/go/+/651816
Reviewed-by: Paul Murphy <murp@ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Joel Sing <joel@sing.id.au>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
2025-02-27 03:45:44 -08:00
Guoqi Chen 01ba8bfe86 runtime/cgo: use standard ABI call setg_gcc in crosscall1 on loong64
Change-Id: Ie38583d667d579751d643b2da2aa56390b69904c
Reviewed-on: https://go-review.googlesource.com/c/go/+/652255
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
2025-02-26 17:36:57 -08:00
Guoqi Chen 11c847e536 runtime: use correct memory barrier in exitThread function on loong64
In the runtime.exitThread function, a storeRelease barrier
is required instead of a full barrier.

Change-Id: I2815ddb03e4984c891d71811ccf650a82325e10d
Reviewed-on: https://go-review.googlesource.com/c/go/+/631915
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-02-26 17:35:25 -08:00
Keith Randall c594762ad5 runtime: remove ret field from gobuf
It's not used for anything.

Change-Id: I031b3cdfe52b6b1cff4b3cb6713ffe588084542f
Reviewed-on: https://go-review.googlesource.com/c/go/+/652276
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-02-26 14:48:55 -08:00
Ian Lance Taylor 2e71ae33ca runtime/cgo: avoid errors from -Wdeclaration-after-statement
CL 652181 accidentally missed this iPhone only code.

For #71961

Change-Id: I567f8bb38958907442e69494da330d5199d11f54
Reviewed-on: https://go-review.googlesource.com/c/go/+/653135
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Commit-Queue: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2025-02-26 14:39:37 -08:00
Mateusz Poliwczak c55c1cbd04 net: return proper error from Context
Sadly err was a named parameter so this did not cause
compile error.

Fixes #71974

Change-Id: I10cf29ae14c52d48a793c9a6cb01b01d79b1b356
GitHub-Last-Rev: 4dc0e6670a
GitHub-Pull-Request: golang/go#71976
Reviewed-on: https://go-review.googlesource.com/c/go/+/652815
Auto-Submit: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Commit-Queue: Ian Lance Taylor <iant@google.com>
2025-02-26 14:16:12 -08:00
Ian Lance Taylor 9cceaf8736 cmd/link: require cgo for -linkmode=external test
For #71416
Fixes #71957

Change-Id: I2180dada34d9dd2d3f5b0aaf8525951fd2e86a27
Reviewed-on: https://go-review.googlesource.com/c/go/+/652277
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Quim Muntal <quimmuntal@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2025-02-26 14:12:58 -08:00
Egon Elbre 983e30bd3b crypto/internal/fips140/edwards25519/field: optimize AMD64
Replace constant multiplication with shift and adds,
this reduces pressure on multiplications, making things overall
faster.

goos: windows
goarch: amd64
pkg: crypto/internal/fips140/edwards25519/field
cpu: AMD Ryzen Threadripper 2950X 16-Core Processor
            │   v0.log~   │              v1.log~               │
            │   sec/op    │   sec/op     vs base               │
Add-32        4.768n ± 1%   4.763n ± 0%       ~ (p=0.683 n=20)
Multiply-32   20.93n ± 0%   19.48n ± 0%  -6.88% (p=0.000 n=20)
Square-32     15.88n ± 0%   15.00n ± 0%  -5.51% (p=0.000 n=20)
Invert-32     4.291µ ± 0%   4.072µ ± 0%  -5.10% (p=0.000 n=20)
Mult32-32     5.184n ± 0%   5.169n ± 0%  -0.30% (p=0.032 n=20)
Bytes-32      11.36n ± 0%   11.34n ± 0%       ~ (p=0.106 n=20)
geomean       27.15n        26.32n       -3.06%

Change-Id: I9c2f588fad29d89c3e6c712c092b32b66479f596
Reviewed-on: https://go-review.googlesource.com/c/go/+/652716
Reviewed-by: Filippo Valsorda <filippo@golang.org>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Commit-Queue: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Filippo Valsorda <filippo@golang.org>
2025-02-26 13:58:20 -08:00
Egon Elbre 16959799bd crypto/internal/fips140/edwards25519/field/_asm: update avo dependency
v0.4.0 does not work with Go 1.25.

Change-Id: Ib9081b0ee78cad7974b038d89fa92d9ccbfa305a
Reviewed-on: https://go-review.googlesource.com/c/go/+/652715
Reviewed-by: Filippo Valsorda <filippo@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Commit-Queue: Junyang Shao <shaojunyang@google.com>
Auto-Submit: Filippo Valsorda <filippo@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-02-26 13:58:17 -08:00
Mateusz Poliwczak ee0d03fab6 sync: don't keep func alive after OnceFunc panics
This moves the f = nil assignment to the defer statement,
so that in case the functions panics, the f func is not
referenced anymore.

Change-Id: I3e53b90a10f21741e26602270822c8a75679f163
GitHub-Last-Rev: bda01100c6
GitHub-Pull-Request: golang/go#68636
Reviewed-on: https://go-review.googlesource.com/c/go/+/601240
Reviewed-by: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
2025-02-26 12:52:02 -08:00
Michael Anthony Knyszek 1421b982dc runtime: document that cleanups can run concurrently with each other
Fixes #71825.

Change-Id: I25af19eb72d75f13cf661fc47ee5717782785326
Reviewed-on: https://go-review.googlesource.com/c/go/+/650696
Reviewed-by: Carlos Amedee <carlos@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2025-02-26 11:35:44 -08:00
Ian Lance Taylor 76c7028253 runtime/cgo: avoid errors from -Wdeclaration-after-statement
It's used by the SWIG CI build, at least, and it's an easy fix.

Fixes #71961

Change-Id: Id21071a5aef216b35ecf0e9cd3e05d08972d92fe
Reviewed-on: https://go-review.googlesource.com/c/go/+/652181
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
2025-02-26 10:52:55 -08:00
qiulaidongfeng 194696f1d1 reflect: let Value.Seq return the iteration value correct type
Fixes #71905

Change-Id: I50a418f8552e071c6e5011af5b9accc7d41548d0
Reviewed-on: https://go-review.googlesource.com/c/go/+/651855
Reviewed-by: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2025-02-26 09:41:14 -08:00
Keith Randall 8b8bff7bb2 cmd/compile: don't pull constant offsets out of pointer arithmetic
This could lead to manufacturing a pointer that points outside
its original allocation.

Bug was introduced in CL 629858.

Fixes #71932

Change-Id: Ia86ab0b65ce5f80a8e0f4f4c81babd07c5904f8d
Reviewed-on: https://go-review.googlesource.com/c/go/+/652078
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2025-02-26 09:39:12 -08:00
Alexandr Primak 4c75671871 runtime: Added usage example for the runtime.AddCleanup() function.
The existing description of the function lacks usage examples, which makes it difficult to understand, so I added one.

There is no open issue about this, since the implementation seems trivial.

Change-Id: I96b29f0b21d1c7fda04128239633c8a2fc36fef2
Reviewed-on: https://go-review.googlesource.com/c/go/+/649995
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Carlos Amedee <carlos@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-02-26 08:54:51 -08:00
Cuong Manh Le 011da163f4 cmd/compile/internal/test: fix noopt builder
The function argument passed to hash function escaped to heap when
optimization is disabled, causing the builder failed.

To fix this, skip the test on noopt builder.

Updates #71943
Fixes #71965

Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-noopt
Change-Id: I3a9ece09bfa10bf5eb102a7da3ade65634565cb4
Reviewed-on: https://go-review.googlesource.com/c/go/+/652735
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2025-02-26 07:04:08 -08:00
Joel Sing 1b1c6b838e cmd/compile: simplify intrinsification of BitLen16 and BitLen8
Decompose BitLen16 and BitLen8 within the SSA rules for architectures that
support BitLen32 or BitLen64, rather than having a custom intrinsic.

Change-Id: Ie4188ce69d1021e63cec27a8e7418efb0714812b
Reviewed-on: https://go-review.googlesource.com/c/go/+/651817
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Joel Sing <joel@sing.id.au>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-02-26 02:02:07 -08:00
Ian Lance Taylor 969a0da362 runtime: route calls to msan_memmove through cgo
This avoids problems when the C linker doesn't want to see the Go relocation.

Fixes #71954

Change-Id: I7cf884c4059d596cad6074ade02020d5a724f20e
Reviewed-on: https://go-review.googlesource.com/c/go/+/652180
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Commit-Queue: Ian Lance Taylor <iant@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
2025-02-25 21:41:00 -08:00