mirror of https://github.com/golang/go.git
62712 Commits
| Author | SHA1 | Message | Date |
|---|---|---|---|
|
|
fb8691edae |
syscall: use testing.T.Context
Change-Id: I62763878d51598bf1ae0a4e75441e1d3a4b86aa3
Reviewed-on: https://go-review.googlesource.com/c/go/+/656955
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com> |
|
|
|
af92bb594d |
test/codegen: remove plan9/amd64 specific array zeroing/copying tests
The compiler previously avoided the use of MOVUPS on plan9/amd64. This was changed in CL 655875, however the codegen tests were not updated and now fail (seemingly the full codegen tests do not run anywhere, not even on the longtest builders).
Change-Id: I388b60e7b0911048d4949c5029347f9801c018a9
Reviewed-on: https://go-review.googlesource.com/c/go/+/656997
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Auto-Submit: Keith Randall <khr@google.com> |
|
|
|
bdfa604b2e |
cmd/internal/dwarf: always use AT_ranges for scopes with DWARF 5
This patch extends the change in CL 657175 to apply the same abbrev selection strategy to single-range lexical scopes that we're now using for inlined routine bodies, when DWARF 5 is in effect. Ranges are more compact and use fewer relocations than explicit hi/lo PC values, so we might as well always use them.
Updates #26379.
Change-Id: Ieeaddf50e82acc4866010e29af32bcd1fb3b4f02
Reviewed-on: https://go-review.googlesource.com/c/go/+/657177
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com> |
|
|
|
d7f58834cb |
doc/next: add tentative DWARF 5 release note fragment
Add a small fragment describing the move to DWARF 5 for this release, along with the name of the GOEXPERIMENT.
Updates #26379.
Change-Id: I3a30a71436133e2e0a5edf1ba0db84b9cc17cc5c
Reviewed-on: https://go-review.googlesource.com/c/go/+/657176
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com> |
|
|
|
8cdef129fb |
cmd/link: only check PIE size difference when the linkmode is the same
Currently we check the size difference between non-PIE and PIE binaries without specifying a linkmode (and that is presumed to be internal). However, on some platforms (like openbsd/arm64), the use of -buildmode=pie results in external linking. Ensure that we only test internally linked non-PIE against internally linked PIE and externally linked non-PIE against externally linked PIE, avoiding unexpected differences.
Fixes #72818
Change-Id: I7e1da0976a4b5de387a59d0d6c04f58498a8eca0
Reviewed-on: https://go-review.googlesource.com/c/go/+/657035
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Than McIntosh <thanm@golang.org> |
|
|
|
b143c98169 |
cmd/compile: simplify bounded shift on loong64
Use the shiftIsBounded function to generate more efficient shift instructions.
This change also optimizes shift ops when the shift value is v&63 or v&31.
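To illustrate why a masked shift count helps, here is a minimal sketch in plain Go (the function names are hypothetical and this is not compiler code). Go defines a shift by a count at or above the operand width to produce 0 for unsigned values, so an unbounded shift generally needs extra handling on hardware that only looks at the low bits of the count. When the count is written as `s & 63`, it is provably in range.

```go
package main

import "fmt"

// shiftUnbounded follows Go semantics for an arbitrary count:
// a count of 64 or more must yield 0 for a uint64 operand.
func shiftUnbounded(x uint64, s uint) uint64 {
	return x << s
}

// shiftBounded masks the count to the range 0..63, so the count is
// provably smaller than the operand width.
func shiftBounded(x uint64, s uint) uint64 {
	return x << (s & 63)
}

func main() {
	fmt.Println(shiftUnbounded(1, 64)) // count out of range: result is 0
	fmt.Println(shiftBounded(1, 64))   // 64&63 == 0: result is 1
	fmt.Println(shiftBounded(1, 3))    // result is 8
}
```

The same reasoning applies to 32-bit operands with a count of the form `s & 31`.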
goos: linux
goarch: loong64
pkg: math/bits
cpu: Loongson-3A6000-HV @ 2500.00MHz
| CL 627855 | this CL |
| sec/op | sec/op vs base |
LeadingZeros 1.1005n ± 0% 0.8425n ± 1% -23.44% (p=0.000 n=10)
LeadingZeros8 1.502n ± 0% 1.501n ± 0% -0.07% (p=0.001 n=10)
LeadingZeros16 1.502n ± 0% 1.501n ± 0% -0.07% (p=0.000 n=10)
LeadingZeros32 0.9511n ± 0% 0.8050n ± 0% -15.36% (p=0.000 n=10)
LeadingZeros64 1.1195n ± 0% 0.8423n ± 0% -24.76% (p=0.000 n=10)
TrailingZeros 0.8086n ± 0% 0.8005n ± 0% -1.00% (p=0.000 n=10)
TrailingZeros8 1.031n ± 1% 1.035n ± 1% ~ (p=0.136 n=10)
TrailingZeros16 0.8114n ± 0% 0.8254n ± 1% +1.73% (p=0.000 n=10)
TrailingZeros32 0.8090n ± 0% 0.8005n ± 0% -1.05% (p=0.000 n=10)
TrailingZeros64 0.8089n ± 1% 0.8005n ± 0% -1.04% (p=0.000 n=10)
OnesCount 0.8677n ± 0% 1.2010n ± 0% +38.41% (p=0.000 n=10)
OnesCount8 0.8009n ± 0% 0.8004n ± 0% -0.06% (p=0.000 n=10)
OnesCount16 0.9344n ± 0% 1.2010n ± 0% +28.53% (p=0.000 n=10)
OnesCount32 0.8677n ± 0% 1.2010n ± 0% +38.41% (p=0.000 n=10)
OnesCount64 1.2010n ± 0% 0.8671n ± 0% -27.80% (p=0.000 n=10)
RotateLeft 0.8009n ± 0% 0.6671n ± 0% -16.71% (p=0.000 n=10)
RotateLeft8 1.202n ± 0% 1.327n ± 0% +10.40% (p=0.000 n=10)
RotateLeft16 0.8036n ± 0% 0.8218n ± 0% +2.26% (p=0.000 n=10)
RotateLeft32 0.6674n ± 0% 0.8004n ± 0% +19.94% (p=0.000 n=10)
RotateLeft64 0.6674n ± 0% 0.8004n ± 0% +19.94% (p=0.000 n=10)
Reverse 0.4067n ± 1% 0.4122n ± 1% +1.38% (p=0.001 n=10)
Reverse8 0.8009n ± 0% 0.8004n ± 0% -0.06% (p=0.000 n=10)
Reverse16 0.8009n ± 0% 0.8005n ± 0% -0.05% (p=0.000 n=10)
Reverse32 0.8009n ± 0% 0.8004n ± 0% -0.06% (p=0.001 n=10)
Reverse64 0.8009n ± 0% 0.8004n ± 0% -0.06% (p=0.008 n=10)
ReverseBytes 0.4057n ± 1% 0.4133n ± 1% +1.90% (p=0.000 n=10)
ReverseBytes16 0.8009n ± 0% 0.8004n ± 0% -0.07% (p=0.000 n=10)
ReverseBytes32 0.8009n ± 0% 0.8005n ± 0% -0.05% (p=0.000 n=10)
ReverseBytes64 0.8009n ± 0% 0.8004n ± 0% -0.06% (p=0.000 n=10)
Add 1.201n ± 0% 1.201n ± 0% ~ (p=1.000 n=10)
Add32 1.201n ± 0% 1.201n ± 0% ~ (p=0.474 n=10)
Add64 1.201n ± 0% 1.201n ± 0% ~ (p=1.000 n=10)
Add64multiple 1.832n ± 0% 1.828n ± 0% -0.22% (p=0.001 n=10)
Sub 1.201n ± 0% 1.201n ± 0% ~ (p=1.000 n=10)
Sub32 1.602n ± 0% 1.601n ± 0% -0.06% (p=0.000 n=10)
Sub64 1.201n ± 0% 1.201n ± 0% ~ (p=0.474 n=10)
Sub64multiple 2.402n ± 0% 2.400n ± 0% -0.10% (p=0.000 n=10)
Mul 0.8009n ± 0% 0.8004n ± 0% -0.06% (p=0.000 n=10)
Mul32 0.8009n ± 0% 0.8004n ± 0% -0.06% (p=0.000 n=10)
Mul64 0.8008n ± 0% 0.8004n ± 0% -0.05% (p=0.000 n=10)
Div 9.083n ± 0% 7.638n ± 0% -15.91% (p=0.000 n=10)
Div32 4.011n ± 0% 4.009n ± 0% -0.05% (p=0.000 n=10)
Div64 9.711n ± 0% 8.204n ± 0% -15.51% (p=0.000 n=10)
geomean 1.083n 1.078n -0.40%
goos: linux
goarch: loong64
pkg: math/bits
cpu: Loongson-3A5000 @ 2500.00MHz
| CL 627855 | this CL |
| sec/op | sec/op vs base |
LeadingZeros 1.341n ± 4% 1.331n ± 2% -0.71% (p=0.008 n=10)
LeadingZeros8 1.781n ± 0% 1.766n ± 1% -0.84% (p=0.011 n=10)
LeadingZeros16 1.782n ± 0% 1.767n ± 0% -0.79% (p=0.001 n=10)
LeadingZeros32 1.341n ± 1% 1.333n ± 0% -0.52% (p=0.001 n=10)
LeadingZeros64 1.338n ± 0% 1.333n ± 0% -0.37% (p=0.008 n=10)
TrailingZeros 0.9025n ± 0% 0.8077n ± 0% -10.50% (p=0.000 n=10)
TrailingZeros8 1.056n ± 0% 1.089n ± 1% +3.17% (p=0.001 n=10)
TrailingZeros16 1.101n ± 0% 1.102n ± 0% +0.09% (p=0.011 n=10)
TrailingZeros32 0.9024n ± 1% 0.8083n ± 0% -10.43% (p=0.000 n=10)
TrailingZeros64 0.9028n ± 1% 0.8087n ± 0% -10.43% (p=0.000 n=10)
OnesCount 1.482n ± 1% 1.302n ± 0% -12.15% (p=0.000 n=10)
OnesCount8 1.206n ± 0% 1.207n ± 2% +0.12% (p=0.000 n=10)
OnesCount16 1.534n ± 0% 1.402n ± 0% -8.58% (p=0.000 n=10)
OnesCount32 1.531n ± 1% 1.302n ± 0% -14.99% (p=0.000 n=10)
OnesCount64 1.302n ± 0% 1.538n ± 1% +18.16% (p=0.000 n=10)
RotateLeft 0.8083n ± 0% 0.8087n ± 1% ~ (p=0.579 n=10)
RotateLeft8 1.310n ± 0% 1.323n ± 0% +0.95% (p=0.001 n=10)
RotateLeft16 1.149n ± 0% 1.165n ± 1% +1.35% (p=0.001 n=10)
RotateLeft32 0.8093n ± 0% 0.8105n ± 0% ~ (p=0.393 n=10)
RotateLeft64 0.8088n ± 0% 0.8090n ± 0% ~ (p=0.739 n=10)
Reverse 0.5109n ± 0% 0.5172n ± 1% +1.25% (p=0.000 n=10)
Reverse8 0.8010n ± 0% 0.8011n ± 0% +0.01% (p=0.000 n=10)
Reverse16 0.8010n ± 0% 0.8011n ± 0% +0.01% (p=0.002 n=10)
Reverse32 0.8010n ± 0% 0.8011n ± 0% +0.01% (p=0.000 n=10)
Reverse64 0.8010n ± 0% 0.8011n ± 0% +0.01% (p=0.005 n=10)
ReverseBytes 0.5122n ± 2% 0.5182n ± 1% ~ (p=0.060 n=10)
ReverseBytes16 0.8010n ± 0% 0.8011n ± 0% +0.01% (p=0.005 n=10)
ReverseBytes32 0.8010n ± 0% 0.8011n ± 0% +0.01% (p=0.005 n=10)
ReverseBytes64 0.8010n ± 0% 0.8011n ± 0% +0.01% (p=0.001 n=10)
Add 1.201n ± 4% 1.202n ± 0% +0.08% (p=0.028 n=10)
Add32 1.201n ± 0% 1.202n ± 2% +0.08% (p=0.014 n=10)
Add64 1.201n ± 1% 1.202n ± 0% +0.08% (p=0.025 n=10)
Add64multiple 1.902n ± 0% 1.913n ± 0% +0.55% (p=0.004 n=10)
Sub 1.201n ± 0% 1.202n ± 3% +0.08% (p=0.001 n=10)
Sub32 1.654n ± 0% 1.656n ± 1% ~ (p=0.117 n=10)
Sub64 1.201n ± 0% 1.202n ± 0% +0.08% (p=0.001 n=10)
Sub64multiple 2.180n ± 4% 2.159n ± 1% -0.96% (p=0.006 n=10)
Mul 0.9345n ± 0% 0.9346n ± 0% +0.01% (p=0.000 n=10)
Mul32 1.030n ± 0% 1.050n ± 1% +1.94% (p=0.000 n=10)
Mul64 0.9345n ± 0% 0.9346n ± 1% +0.01% (p=0.000 n=10)
Div 11.57n ± 1% 11.12n ± 0% -3.85% (p=0.000 n=10)
Div32 4.337n ± 1% 4.341n ± 1% ~ (p=0.286 n=10)
Div64 12.76n ± 0% 12.02n ± 3% -5.80% (p=0.000 n=10)
geomean 1.252n 1.235n -1.32%
Change-Id: Iec4cfd2b83bb0f946068c1d657369ff081d95b04
Reviewed-on: https://go-review.googlesource.com/c/go/+/628575
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
|
|
b10c35945d |
cmd/internal/obj/loong64: add {V,XV}DIV{B/H/W/V}[U] and {V,XV}MOD{B/H/W/V}[U] instructions support
Go asm syntax:
VDIV{B/H/W/V}[U] VK, VJ, VD
XVDIV{B/H/W/V}[U] XK, XJ, XD
VMOD{B/H/W/V}[U] VK, VJ, VD
XVMOD{B/H/W/V}[U] XK, XJ, XD
Equivalent platform assembler syntax:
vdiv.{b/h/w/d}[u] vd, vj, vk
xvdiv.{b/h/w/d}[u] xd, xj, xk
vmod.{b/h/w/d}[u] vd, vj, vk
xvmod.{b/h/w/d}[u] xd, xj, xk
Change-Id: I3676721c3c415de0f2ebbd480ecd1b2400a28dba
Reviewed-on: https://go-review.googlesource.com/c/go/+/636376
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
|
|
d729053edf |
mime/multipart: add helper to build content-disposition header contents
This PR adds a helper FileContentDisposition that builds multipart
Content-Disposition header contents with field name and file name,
escaping quotes and escape characters.
The function is then called in the related helper CreateFormFile.
The new function allows users to add other custom MIMEHeaders,
without having to rewrite the char escaping logic of field name and
file name, which is provided by the new helper.
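As a rough illustration of the use case described above, the sketch below builds a part with a custom header set. The `contentDisposition` function and `quoteEscaper` variable are hypothetical local stand-ins written for this example; they are not the new helper itself, and the exact set of characters the real helper escapes is an assumption here (backslash and double quote only).

```go
package main

import (
	"bytes"
	"fmt"
	"mime/multipart"
	"net/textproto"
	"strings"
)

// quoteEscaper backslash-escapes backslashes and double quotes.
// Assumption: this mirrors the escaping the commit message describes.
var quoteEscaper = strings.NewReplacer(`\`, `\\`, `"`, `\"`)

// contentDisposition is a hypothetical stand-in for the helper: it builds
// the Content-Disposition value from a field name and a file name.
func contentDisposition(fieldname, filename string) string {
	return fmt.Sprintf(`form-data; name="%s"; filename="%s"`,
		quoteEscaper.Replace(fieldname), quoteEscaper.Replace(filename))
}

// writeFilePart creates a part with a caller-chosen Content-Type,
// which CreateFormFile alone does not let the caller set.
func writeFilePart(w *multipart.Writer, fieldname, filename, contentType string, data []byte) error {
	h := make(textproto.MIMEHeader)
	h.Set("Content-Disposition", contentDisposition(fieldname, filename))
	h.Set("Content-Type", contentType)
	part, err := w.CreatePart(h)
	if err != nil {
		return err
	}
	_, err = part.Write(data)
	return err
}

func main() {
	var buf bytes.Buffer
	w := multipart.NewWriter(&buf)
	if err := writeFilePart(w, "upload", `my "report".txt`, "text/plain", []byte("hello")); err != nil {
		panic(err)
	}
	if err := w.Close(); err != nil {
		panic(err)
	}
	fmt.Println(contentDisposition("upload", `my "report".txt`))
}
```

With the helper in the standard library, callers would not need to carry their own copy of the escaping logic as this sketch does.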
Fixes #46771
Change-Id: Ifc82a79583feb6dd609ca1e6024e612fb58c05ce
GitHub-Last-Rev:
|
|
|
|
a68bf75d34 |
cmd/go: don't write own toolchain line when updating go line
The Go command had a behavior of writing its own toolchain name when updating the go line in a go.mod (for example when a user runs go get go@version). This behavior was often undesirable and the toolchain line was often removed by users before checking in go.mod files (including in the x/ repos). It also led to user confusion.
This change removes that behavior. A toolchain line will not be added if one wasn't present before. The toolchain line can still be removed though: the toolchain line must be at least the go version, so if the go version is increased above the toolchain version, the toolchain version will be bumped up to that go version. The toolchain line will then be dropped because go <version> implies toolchain <version>.
Making this change slightly hurts reproducibility because future go commands run on the go.mod file may be run with a different toolchain than the one that used it, but that doesn't seem to be worth the confusion the behavior resulted in.
We expect this change will not have negative consequences, but it could be possible, and we would like to hear from any users that depended on the previous behavior in case we need to roll it back before the release.
Fixes #65847
Change-Id: Id795b7f762e4f90ba0fa8c7935d03f32dfc8590e
Reviewed-on: https://go-review.googlesource.com/c/go/+/656835
Reviewed-by: Alan Donovan <adonovan@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
|
|
|
485480faaa |
net: deflake recently added TestCloseUnblocksReadUDP
Fixes #72802
Change-Id: I0dd457ef81a354f61c9de306e4609efdbe3d69b4
Reviewed-on: https://go-review.googlesource.com/c/go/+/656857
Auto-Submit: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Bypass: Brad Fitzpatrick <bradfitz@golang.org> |
|
|
|
955cf0873f |
cmd/internal/dwarf: fix bug in inlined func DIE range DWARF 5 info
This patch changes the strategy we use in the compiler for handling range information for inlined subroutine bodies, fixing a bug in how this was handled for DWARF 5. The high and low PC values being emitted for DW_TAG_inlined_subroutine DIEs were incorrect, pointing to the start of functions instead of the proper location. The fix in this patch is to move to unconditionally using DW_AT_ranges for inlined subroutines, even those with only a single range.
Background: prior to this point, if a given inlined function body had a single contiguous range, we'd pick an abbrev entry for it with explicit DW_AT_low_pc and DW_AT_high_pc attributes. If the extent of the code for the inlined body was not contiguous (which can happen), we'd select an abbrev that used a DW_AT_ranges attribute instead. This strategy (preferring explicit hi/lo PC attrs for a single-range func) made sense for DWARF 4, since in DWARF 4 the representation used in the .debug_ranges section was especially heavyweight (lots of space, lots of relocations), so having explicit hi/lo PC attrs was less expensive.
With DWARF 5 range info is written to the .debug_rnglists section, and the representation here is much more compact. Specifically, a single hi/lo range can be represented using a base address in addrx format (max of 4 bytes, but more likely 2 or 3) followed by start and endpoints of the range in ULEB128 format. This combination is more compact spacewise than the explicit hi/lo values, and has fewer relocations (0 as opposed to 2).
Note: we should at some point consider applying this same strategy to lexical scopes, since we can probably reap some of the same benefits there as well.
Updates #26379.
Fixes #72821.
Change-Id: Ifb65ecc6221601bad2ca3939f9b69964c1fafc7c
Reviewed-on: https://go-review.googlesource.com/c/go/+/657175
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Alessandro Arzilli <alessandro.arzilli@gmail.com> |
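The space argument in the message above rests on ULEB128 being a variable-length encoding. The following sketch shows only that part: it encodes two made-up range offsets and compares their size with two fixed 8-byte addresses. The offsets and the function name `appendULEB128` are illustrative assumptions, and the sketch deliberately leaves out the addrx base and any entry-kind bytes of a real .debug_rnglists entry.

```go
package main

import "fmt"

// appendULEB128 appends v in unsigned LEB128 form: seven payload bits per
// byte, least significant group first, with the high bit set on every byte
// except the last.
func appendULEB128(b []byte, v uint64) []byte {
	for {
		c := byte(v & 0x7f)
		v >>= 7
		if v != 0 {
			c |= 0x80
		}
		b = append(b, c)
		if v == 0 {
			return b
		}
	}
}

func main() {
	// Hypothetical start and end offsets of an inlined body,
	// relative to some base address.
	start, end := uint64(0x40), uint64(0x1a3)

	var enc []byte
	enc = appendULEB128(enc, start)
	enc = appendULEB128(enc, end)

	fmt.Printf("ULEB128 offsets: % x (%d bytes)\n", enc, len(enc))
	fmt.Printf("two fixed 8-byte addresses: %d bytes\n", 2*8)
}
```

Small offsets fit in one or two bytes each, which is where the saving over explicit 8-byte low and high PC values comes from.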
|
|
|
bec12f153a |
log/slog: optimize appendKey to reduce allocations
This change introduces a new method, `appendTwoStrings`, which
optimizes the `appendKey` function by avoiding the allocation of a
temporary string (string concatenation of prefix and key). Instead, it
directly appends the prefix and key to the buffer.
Additionally, this change adds `BenchmarkAppendKey` benchmarks to validate the performance improvements.
This change improves performance in cases where large prefixes are used,
as verified by the following benchmarks:
goos: darwin
goarch: arm64
pkg: log/slog
cpu: Apple M1 Max
│ old.out │ new.out │
│ sec/op │ sec/op vs base │
AppendKey/prefix_size_5-10 44.41n ± 0% 35.62n ± 0% -19.80% (p=0.000 n=10)
AppendKey/prefix_size_10-10 48.17n ± 0% 39.12n ± 0% -18.80% (p=0.000 n=10)
AppendKey/prefix_size_30-10 84.50n ± 0% 62.30n ± 0% -26.28% (p=0.000 n=10)
AppendKey/prefix_size_50-10 124.9n ± 0% 102.3n ± 0% -18.09% (p=0.000 n=10)
AppendKey/prefix_size_100-10 203.6n ± 1% 168.7n ± 0% -17.14% (p=0.000 n=10)
geomean 85.61n 68.41n -20.09%
│ old.out │ new.out │
│ B/op │ B/op vs base │
AppendKey/prefix_size_5-10 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
AppendKey/prefix_size_10-10 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
AppendKey/prefix_size_30-10 48.00 ± 0% 0.00 ± 0% -100.00% (p=0.000 n=10)
AppendKey/prefix_size_50-10 128.00 ± 0% 64.00 ± 0% -50.00% (p=0.000 n=10)
AppendKey/prefix_size_100-10 224.0 ± 0% 112.0 ± 0% -50.00% (p=0.000 n=10)
geomean ² ? ² ³
¹ all samples are equal
² summaries must be >0 to compute geomean
³ ratios must be >0 to compute geomean
│ old.out │ new.out │
│ allocs/op │ allocs/op vs base │
AppendKey/prefix_size_5-10 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
AppendKey/prefix_size_10-10 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
AppendKey/prefix_size_30-10 1.000 ± 0% 0.000 ± 0% -100.00% (p=0.000 n=10)
AppendKey/prefix_size_50-10 2.000 ± 0% 1.000 ± 0% -50.00% (p=0.000 n=10)
AppendKey/prefix_size_100-10 2.000 ± 0% 1.000 ± 0% -50.00% (p=0.000 n=10)
geomean ² ? ² ³
¹ all samples are equal
² summaries must be >0 to compute geomean
³ ratios must be >0 to compute geomean
This patch improves performance without altering the external behavior of the `slog` package.
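The idea described in this message can be sketched outside of `slog` as follows. The two functions below are hypothetical illustrations, not the package's internal code: one concatenates before appending, the other appends the two strings in sequence.

```go
package main

import "fmt"

// appendConcat builds a temporary string with prefix+key before appending.
// For longer strings this temporary may require a heap allocation.
func appendConcat(buf []byte, prefix, key string) []byte {
	return append(buf, prefix+key...)
}

// appendDirect appends prefix and key one after the other,
// so no temporary concatenated string is needed.
func appendDirect(buf []byte, prefix, key string) []byte {
	buf = append(buf, prefix...)
	return append(buf, key...)
}

func main() {
	buf := make([]byte, 0, 64)
	a := appendConcat(buf, "request.", "id")
	b := appendDirect(buf[:0], "request.", "id")
	fmt.Println(string(a) == string(b), string(b))
}
```

Both functions produce the same bytes; only the intermediate string differs, which is consistent with the benchmark above showing savings only for the larger prefix sizes.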
Change-Id: I8b47718de522196f06e0ddac48af73e352d2e5cd
Reviewed-on: https://go-review.googlesource.com/c/go/+/631415
Reviewed-by: Alan Donovan <adonovan@google.com>
Reviewed-by: Jonathan Amsterdam <jba@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
|
|
7e8ceadf85 |
cmd/compile/internal/ssagen: use an alias for math/bits.Len
Rather than using a specific intrinsic for math/bits.Len, use a pair of aliases instead. This requires less code and automatically adapts when platforms have a math/bits.Len32 or math/bits.Len64 intrinsic.
Change-Id: I28b300172daaee26ef82a7530d9e96123663f541
Reviewed-on: https://go-review.googlesource.com/c/go/+/656995
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Auto-Submit: Jorropo <jorropo.pgm@gmail.com>
Reviewed-by: Jorropo <jorropo.pgm@gmail.com> |
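The equivalence that makes such an alias possible can be checked from user code. This is a minimal sketch with hypothetical function names; it shows the relationship between `bits.Len` and the sized variants, not how the compiler registers aliases.

```go
package main

import (
	"fmt"
	"math/bits"
)

// lenViaSized computes the bit length of x by delegating to the sized
// function that matches the platform's uint width.
func lenViaSized(x uint) int {
	if bits.UintSize == 32 {
		return bits.Len32(uint32(x))
	}
	return bits.Len64(uint64(x))
}

// agrees reports whether the delegated result matches bits.Len.
func agrees(x uint) bool {
	return lenViaSized(x) == bits.Len(x)
}

func main() {
	for _, x := range []uint{0, 1, 2, 255, 256, 1 << 20, ^uint(0)} {
		fmt.Println(x, bits.Len(x), agrees(x))
	}
}
```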
|
|
|
a812e5f3c3 |
math/big: update calibration tests and recalibrate
Refactor calibration tests to use the same logic for all.
Choosing thresholds that are broadly appropriate for all systems is part science but also part guesswork and judgement. We could instead set per-GOOS/GOARCH thresholds, but that seems like too much work, and even then there would be variation between different chips within a GOOS/GOARCH. (For example see the three linux/amd64 systems benchmarked below.)
The thresholds chosen in this CL are:
karatsubaThreshold = 40 // unchanged
basicSqrThreshold = 12 // was 20
karatsubaSqrThreshold = 80 // was 260
divRecursiveThreshold = 40 // was 100
The new file calibrate.md explains the calibration process and links to graphs justifying those values. (The graphs are hosted on swtch.com to avoid adding a megabyte of extra data to the Go repo and Go distributions.) A rendered copy of calibrate.md is at https://swtch.com/math/big/calibrate.html.
goos: linux goarch: amd64 pkg: math/big cpu: Intel(R) Xeon(R) Platinum 8481C CPU @ 2.70GHz │ old │ new │ │ sec/op │ sec/op vs base │ Div/20/10-88 13.13n ± 2% 13.14n ± 2% ~ (p=0.494 n=15) Div/40/20-88 13.13n ± 2% 13.14n ± 2% ~ (p=0.137 n=15) Div/100/50-88 25.50n ± 0% 25.51n ± 0% ~ (p=0.038 n=15) Div/200/100-88 113.1n ± 1% 116.0n ± 3% +2.56% (p=0.000 n=15) Div/400/200-88 135.3n ± 0% 137.1n ± 1% ~ (p=0.004 n=15) Div/1000/500-88 259.9n ± 1% 259.0n ± 2% ~ (p=0.182 n=15) Div/2000/1000-88 568.8n ± 1% 564.7n ± 3% ~ (p=0.927 n=15) Div/20000/10000-88 25.79µ ± 1% 22.11µ ± 2% -14.26% (p=0.000 n=15) Div/200000/100000-88 755.1µ ± 1% 737.6µ ± 1% -2.32% (p=0.000 n=15) Div/2000000/1000000-88 31.30m ± 0% 31.20m ± 1% ~ (p=0.081 n=15) Div/20000000/10000000-88 1.268 ± 0% 1.265 ± 0% ~ (p=0.011 n=15) NatMul/10-88 142.6n ± 0% 142.9n ± 7% ~ (p=0.145 n=15) NatMul/100-88 4.347µ ± 0% 4.350µ ± 3% ~ (p=0.430 n=15) NatMul/1000-88 187.6µ ± 0% 188.4µ ± 2% ~ (p=0.004 n=15) NatMul/10000-88 8.052m ± 0% 8.057m ± 1% ~ (p=0.148 n=15) NatMul/100000-88 260.6m ± 0% 260.7m ± 0% ~ (p=0.512 n=15) NatSqr/1-88 26.58n ± 5% 27.96n ± 
8% ~ (p=0.574 n=15) NatSqr/2-88 42.35n ± 7% 44.87n ± 6% ~ (p=0.690 n=15) NatSqr/3-88 53.28n ± 4% 55.62n ± 5% ~ (p=0.151 n=15) NatSqr/5-88 76.26n ± 6% 81.43n ± 6% +6.78% (p=0.000 n=15) NatSqr/8-88 110.8n ± 5% 116.4n ± 6% ~ (p=0.040 n=15) NatSqr/10-88 141.4n ± 4% 147.8n ± 4% ~ (p=0.011 n=15) NatSqr/20-88 325.8n ± 3% 341.7n ± 4% +4.88% (p=0.000 n=15) NatSqr/30-88 536.8n ± 3% 556.1n ± 4% ~ (p=0.027 n=15) NatSqr/50-88 1.168µ ± 3% 1.197µ ± 3% ~ (p=0.442 n=15) NatSqr/80-88 2.527µ ± 2% 2.480µ ± 2% -1.86% (p=0.000 n=15) NatSqr/100-88 3.771µ ± 2% 3.535µ ± 2% -6.26% (p=0.000 n=15) NatSqr/200-88 14.03µ ± 2% 10.57µ ± 3% -24.68% (p=0.000 n=15) NatSqr/300-88 24.06µ ± 2% 20.57µ ± 2% -14.52% (p=0.000 n=15) NatSqr/500-88 65.43µ ± 1% 45.45µ ± 1% -30.55% (p=0.000 n=15) NatSqr/800-88 126.41µ ± 1% 94.13µ ± 2% -25.54% (p=0.000 n=15) NatSqr/1000-88 196.4µ ± 1% 135.1µ ± 1% -31.18% (p=0.000 n=15) NatSqr/10000-88 6.404m ± 0% 5.326m ± 1% -16.84% (p=0.000 n=15) NatSqr/100000-88 267.2m ± 0% 198.7m ± 0% -25.64% (p=0.000 n=15) geomean 7.318µ 6.948µ -5.06% goos: linux goarch: amd64 pkg: math/big cpu: Intel(R) Xeon(R) CPU @ 3.10GHz │ old │ new │ │ sec/op │ sec/op vs base │ Div/20/10-16 22.23n ± 0% 22.23n ± 0% ~ (p=0.973 n=15) Div/40/20-16 22.23n ± 0% 22.23n ± 0% ~ (p=0.226 n=15) Div/100/50-16 55.27n ± 1% 55.59n ± 0% ~ (p=0.004 n=15) Div/200/100-16 174.7n ± 3% 175.9n ± 2% ~ (p=0.645 n=15) Div/400/200-16 208.8n ± 1% 209.5n ± 2% ~ (p=0.169 n=15) Div/1000/500-16 378.7n ± 2% 380.5n ± 2% ~ (p=0.091 n=15) Div/2000/1000-16 778.4n ± 1% 781.1n ± 2% ~ (p=0.104 n=15) Div/20000/10000-16 25.16µ ± 1% 24.93µ ± 1% -0.91% (p=0.000 n=15) Div/200000/100000-16 926.4µ ± 0% 927.7µ ± 1% ~ (p=0.436 n=15) Div/2000000/1000000-16 35.58m ± 0% 35.53m ± 0% ~ (p=0.267 n=15) Div/20000000/10000000-16 1.333 ± 0% 1.330 ± 0% ~ (p=0.126 n=15) NatMul/10-16 172.6n ± 0% 165.4n ± 0% -4.17% (p=0.000 n=15) NatMul/100-16 5.706µ ± 0% 5.503µ ± 0% -3.56% (p=0.000 n=15) NatMul/1000-16 220.8µ ± 0% 219.1µ ± 0% -0.76% (p=0.000 n=15) NatMul/10000-16 
8.688m ± 0% 8.621m ± 0% -0.77% (p=0.000 n=15) NatMul/100000-16 333.3m ± 0% 333.5m ± 0% ~ (p=0.512 n=15) NatSqr/1-16 28.66n ± 1% 28.42n ± 3% -0.84% (p=0.000 n=15) NatSqr/2-16 48.29n ± 2% 48.19n ± 2% ~ (p=0.042 n=15) NatSqr/3-16 59.93n ± 0% 59.64n ± 2% -0.48% (p=0.000 n=15) NatSqr/5-16 88.05n ± 0% 87.89n ± 3% ~ (p=0.066 n=15) NatSqr/8-16 127.7n ± 0% 126.9n ± 3% -0.63% (p=0.000 n=15) NatSqr/10-16 170.4n ± 0% 169.7n ± 3% ~ (p=0.004 n=15) NatSqr/20-16 388.8n ± 0% 392.9n ± 3% ~ (p=0.123 n=15) NatSqr/30-16 635.2n ± 0% 641.7n ± 3% ~ (p=0.123 n=15) NatSqr/50-16 1.304µ ± 1% 1.314µ ± 3% ~ (p=0.927 n=15) NatSqr/80-16 2.709µ ± 1% 2.899µ ± 4% +7.01% (p=0.000 n=15) NatSqr/100-16 3.885µ ± 0% 3.981µ ± 4% ~ (p=0.123 n=15) NatSqr/200-16 13.29µ ± 2% 12.14µ ± 4% -8.67% (p=0.000 n=15) NatSqr/300-16 23.39µ ± 0% 22.51µ ± 3% -3.78% (p=0.000 n=15) NatSqr/500-16 58.13µ ± 1% 50.56µ ± 2% -13.02% (p=0.000 n=15) NatSqr/800-16 118.4µ ± 1% 107.6µ ± 2% -9.11% (p=0.000 n=15) NatSqr/1000-16 172.7µ ± 1% 151.8µ ± 2% -12.11% (p=0.000 n=15) NatSqr/10000-16 6.065m ± 1% 5.757m ± 1% -5.08% (p=0.000 n=15) NatSqr/100000-16 240.9m ± 0% 228.1m ± 0% -5.32% (p=0.000 n=15) geomean 8.601µ 8.453µ -1.71% goos: linux goarch: amd64 pkg: math/big cpu: AMD Ryzen 9 7950X 16-Core Processor │ old │ new │ │ sec/op │ sec/op vs base │ Div/20/10-32 11.11n ± 0% 11.11n ± 1% ~ (p=0.532 n=15) Div/40/20-32 11.08n ± 1% 11.11n ± 0% ~ (p=0.815 n=15) Div/100/50-32 16.81n ± 0% 16.84n ± 29% ~ (p=0.020 n=15) Div/200/100-32 73.91n ± 0% 76.85n ± 11% +3.98% (p=0.000 n=15) Div/400/200-32 87.35n ± 0% 88.91n ± 34% +1.79% (p=0.000 n=15) Div/1000/500-32 169.3n ± 1% 168.9n ± 1% ~ (p=0.049 n=15) Div/2000/1000-32 369.3n ± 0% 369.0n ± 0% ~ (p=0.108 n=15) Div/20000/10000-32 15.92µ ± 0% 13.55µ ± 2% -14.91% (p=0.000 n=15) Div/200000/100000-32 491.4µ ± 0% 482.4µ ± 1% -1.84% (p=0.000 n=15) Div/2000000/1000000-32 20.09m ± 0% 19.96m ± 0% -0.69% (p=0.000 n=15) Div/20000000/10000000-32 756.5m ± 0% 755.5m ± 0% ~ (p=0.089 n=15) NatMul/10-32 125.4n ± 5% 124.8n ± 
1% ~ (p=0.588 n=15) NatMul/100-32 2.952µ ± 3% 2.969µ ± 0% ~ (p=0.237 n=15) NatMul/1000-32 120.7µ ± 0% 121.1µ ± 0% +0.30% (p=0.000 n=15) NatMul/10000-32 4.845m ± 0% 4.839m ± 1% ~ (p=0.653 n=15) NatMul/100000-32 173.3m ± 0% 173.3m ± 0% ~ (p=0.838 n=15) NatSqr/1-32 31.18n ± 23% 32.08n ± 2% ~ (p=0.015 n=15) NatSqr/2-32 57.22n ± 28% 58.88n ± 2% ~ (p=0.054 n=15) NatSqr/3-32 61.34n ± 18% 64.33n ± 2% ~ (p=0.237 n=15) NatSqr/5-32 72.47n ± 17% 79.81n ± 3% ~ (p=0.067 n=15) NatSqr/8-32 83.26n ± 26% 100.10n ± 3% ~ (p=0.016 n=15) NatSqr/10-32 87.31n ± 43% 125.50n ± 2% ~ (p=0.003 n=15) NatSqr/20-32 193.5n ± 25% 244.4n ± 13% ~ (p=0.002 n=15) NatSqr/30-32 323.9n ± 17% 380.9n ± 6% ~ (p=0.003 n=15) NatSqr/50-32 713.4n ± 9% 761.7n ± 8% ~ (p=0.419 n=15) NatSqr/80-32 1.486µ ± 7% 1.609µ ± 5% +8.28% (p=0.000 n=15) NatSqr/100-32 2.115µ ± 9% 2.253µ ± 1% ~ (p=0.104 n=15) NatSqr/200-32 7.201µ ± 4% 6.610µ ± 1% -8.21% (p=0.000 n=15) NatSqr/300-32 13.08µ ± 2% 12.37µ ± 1% -5.41% (p=0.000 n=15) NatSqr/500-32 32.56µ ± 2% 27.83µ ± 2% -14.52% (p=0.000 n=15) NatSqr/800-32 66.83µ ± 3% 59.59µ ± 1% -10.83% (p=0.000 n=15) NatSqr/1000-32 98.09µ ± 1% 83.59µ ± 1% -14.78% (p=0.000 n=15) NatSqr/10000-32 3.445m ± 1% 3.245m ± 0% -5.81% (p=0.000 n=15) NatSqr/100000-32 137.3m ± 0% 127.0m ± 0% -7.54% (p=0.000 n=15) geomean 4.897µ 4.972µ +1.52% goos: linux goarch: arm64 pkg: math/big │ old │ new │ │ sec/op │ sec/op vs base │ Div/20/10-16 15.26n ± 2% 15.14n ± 1% ~ (p=0.212 n=15) Div/40/20-16 15.22n ± 1% 15.16n ± 0% ~ (p=0.190 n=15) Div/100/50-16 26.53n ± 2% 26.42n ± 0% -0.41% (p=0.000 n=15) Div/200/100-16 124.3n ± 0% 124.0n ± 0% ~ (p=0.704 n=15) Div/400/200-16 142.4n ± 0% 141.8n ± 0% ~ (p=0.074 n=15) Div/1000/500-16 262.0n ± 1% 261.3n ± 1% ~ (p=0.046 n=15) Div/2000/1000-16 532.6n ± 0% 532.5n ± 1% ~ (p=0.798 n=15) Div/20000/10000-16 22.27µ ± 0% 22.88µ ± 0% +2.73% (p=0.000 n=15) Div/200000/100000-16 890.4µ ± 0% 902.8µ ± 0% +1.39% (p=0.000 n=15) Div/2000000/1000000-16 35.03m ± 0% 35.10m ± 0% ~ (p=0.305 n=15) 
Div/20000000/10000000-16 1.380 ± 0% 1.385 ± 0% ~ (p=0.019 n=15) NatMul/10-16 177.6n ± 1% 175.6n ± 3% ~ (p=0.480 n=15) NatMul/100-16 5.675µ ± 0% 5.669µ ± 1% ~ (p=0.705 n=15) NatMul/1000-16 224.3µ ± 0% 224.6µ ± 0% ~ (p=0.653 n=15) NatMul/10000-16 8.735m ± 0% 8.739m ± 0% ~ (p=0.567 n=15) NatMul/100000-16 331.6m ± 0% 331.6m ± 1% ~ (p=0.412 n=15) NatSqr/1-16 43.69n ± 2% 42.77n ± 6% ~ (p=0.383 n=15) NatSqr/2-16 65.26n ± 2% 63.91n ± 5% ~ (p=0.285 n=15) NatSqr/3-16 73.95n ± 1% 72.25n ± 6% ~ (p=0.198 n=15) NatSqr/5-16 95.06n ± 1% 94.21n ± 3% ~ (p=0.721 n=15) NatSqr/8-16 155.5n ± 1% 153.4n ± 4% ~ (p=0.170 n=15) NatSqr/10-16 175.4n ± 1% 174.0n ± 2% ~ (p=0.271 n=15) NatSqr/20-16 360.8n ± 0% 358.5n ± 2% ~ (p=0.170 n=15) NatSqr/30-16 584.7n ± 0% 582.9n ± 1% ~ (p=0.170 n=15) NatSqr/50-16 1.323µ ± 0% 1.322µ ± 0% ~ (p=0.627 n=15) NatSqr/80-16 2.916µ ± 0% 2.674µ ± 0% -8.30% (p=0.000 n=15) NatSqr/100-16 4.365µ ± 0% 3.802µ ± 0% -12.90% (p=0.000 n=15) NatSqr/200-16 16.42µ ± 0% 11.29µ ± 0% -31.26% (p=0.000 n=15) NatSqr/300-16 28.07µ ± 0% 22.83µ ± 0% -18.68% (p=0.000 n=15) NatSqr/500-16 76.30µ ± 0% 50.06µ ± 0% -34.39% (p=0.000 n=15) NatSqr/800-16 147.5µ ± 0% 101.2µ ± 1% -31.41% (p=0.000 n=15) NatSqr/1000-16 228.6µ ± 0% 149.5µ ± 0% -34.61% (p=0.000 n=15) NatSqr/10000-16 7.417m ± 0% 6.025m ± 0% -18.76% (p=0.000 n=15) NatSqr/100000-16 309.2m ± 0% 214.9m ± 0% -30.50% (p=0.000 n=15) geomean 8.559µ 7.906µ -7.63% goos: darwin goarch: arm64 pkg: math/big cpu: Apple M3 Pro │ old │ new │ │ sec/op │ sec/op vs base │ Div/20/10-12 9.577n ± 6% 9.473n ± 5% ~ (p=0.384 n=15) Div/40/20-12 9.480n ± 1% 9.430n ± 1% ~ (p=0.019 n=15) Div/100/50-12 14.82n ± 0% 14.82n ± 0% ~ (p=0.845 n=15) Div/200/100-12 83.94n ± 1% 84.35n ± 4% ~ (p=0.512 n=15) Div/400/200-12 102.7n ± 1% 102.9n ± 0% ~ (p=0.845 n=15) Div/1000/500-12 185.3n ± 1% 181.9n ± 1% -1.83% (p=0.000 n=15) Div/2000/1000-12 397.0n ± 1% 396.7n ± 0% ~ (p=0.959 n=15) Div/20000/10000-12 14.05µ ± 0% 13.70µ ± 1% ~ (p=0.002 n=15) Div/200000/100000-12 529.4µ ± 3% 
526.7µ ± 2% ~ (p=0.967 n=15) Div/2000000/1000000-12 20.05m ± 0% 20.05m ± 0% ~ (p=0.653 n=15) Div/20000000/10000000-12 788.2m ± 1% 789.0m ± 1% ~ (p=0.412 n=15) NatMul/10-12 79.95n ± 1% 80.87n ± 1% +1.15% (p=0.000 n=15) NatMul/100-12 2.973µ ± 0% 2.986µ ± 2% ~ (p=0.051 n=15) NatMul/1000-12 122.6µ ± 5% 123.0µ ± 1% ~ (p=0.783 n=15) NatMul/10000-12 4.990m ± 1% 5.000m ± 1% ~ (p=0.653 n=15) NatMul/100000-12 185.3m ± 3% 190.3m ± 1% ~ (p=0.089 n=15) NatSqr/1-12 11.84n ± 1% 11.88n ± 1% ~ (p=0.735 n=15) NatSqr/2-12 21.01n ± 1% 21.44n ± 6% ~ (p=0.039 n=15) NatSqr/3-12 25.59n ± 0% 26.74n ± 9% +4.49% (p=0.000 n=15) NatSqr/5-12 36.78n ± 0% 37.04n ± 1% +0.71% (p=0.000 n=15) NatSqr/8-12 63.09n ± 3% 63.22n ± 1% ~ (p=0.846 n=15) NatSqr/10-12 79.98n ± 0% 79.78n ± 0% ~ (p=0.100 n=15) NatSqr/20-12 174.0n ± 0% 175.5n ± 1% ~ (p=0.361 n=15) NatSqr/30-12 290.0n ± 0% 291.4n ± 0% ~ (p=0.002 n=15) NatSqr/50-12 655.2n ± 4% 658.1n ± 0% ~ (p=0.060 n=15) NatSqr/80-12 1.506µ ± 0% 1.397µ ± 5% -7.24% (p=0.000 n=15) NatSqr/100-12 2.273µ ± 0% 2.005µ ± 5% -11.79% (p=0.000 n=15) NatSqr/200-12 8.833µ ± 6% 6.109µ ± 0% -30.84% (p=0.000 n=15) NatSqr/300-12 15.15µ ± 4% 12.37µ ± 0% -18.34% (p=0.000 n=15) NatSqr/500-12 41.89µ ± 0% 27.70µ ± 1% -33.88% (p=0.000 n=15) NatSqr/800-12 80.72µ ± 0% 56.40µ ± 0% -30.12% (p=0.000 n=15) NatSqr/1000-12 127.06µ ± 1% 84.06µ ± 1% -33.84% (p=0.000 n=15) NatSqr/10000-12 4.130m ± 0% 3.390m ± 0% -17.91% (p=0.000 n=15) NatSqr/100000-12 173.2m ± 0% 131.2m ± 6% -24.25% (p=0.000 n=15) geomean 4.489µ 4.189µ -6.68% Change-Id: Iaf65fd85457b003ebf07a787c875cda321b40cc9 Reviewed-on: https://go-review.googlesource.com/c/go/+/652058 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Robert Griesemer <gri@google.com> Reviewed-by: Alan Donovan <adonovan@google.com> Auto-Submit: Russ Cox <rsc@golang.org> |
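To make the role of such thresholds concrete, here is a small sketch of threshold-based algorithm selection for squaring an n-word number. The two constant values are taken from the commit message above; the three-way dispatch and the string labels are assumptions made for illustration and are not copied from math/big.

```go
package main

import "fmt"

// Values as listed in the commit message.
const (
	basicSqrThreshold     = 12
	karatsubaSqrThreshold = 80
)

// sqrAlgorithm returns a label for the algorithm a threshold scheme like
// this might pick for an n-word operand. The labels are hypothetical.
func sqrAlgorithm(n int) string {
	switch {
	case n < basicSqrThreshold:
		return "basic multiplication"
	case n < karatsubaSqrThreshold:
		return "basic squaring"
	default:
		return "Karatsuba squaring"
	}
}

func main() {
	for _, n := range []int{5, 12, 50, 80, 1000} {
		fmt.Printf("%4d words: %s\n", n, sqrAlgorithm(n))
	}
}
```

Lowering a threshold moves the crossover point, so mid-sized operands switch to the asymptotically faster algorithm sooner; the NatSqr rows at sizes 80 and above in the tables above are where that effect would show up.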
|
|
|
40c953cd46 |
runtime: remove nextSampleNoFP from plan9
Plan 9 can use floating point now.
Change-Id: If721b243daa31853609cb3d2c535d86c106a1ee1
Reviewed-on: https://go-review.googlesource.com/c/go/+/655879
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Russ Cox <rsc@golang.org> |
|
|
|
d037ed62bc |
math/big: simplify, speed up Karatsuba multiplication
The old Karatsuba implementation only operated on lengths that are
a power of two times a number smaller than karatsubaThreshold.
For example, when karatsubaThreshold = 40, multiplying a pair
of 99-word numbers runs karatsuba on the low 96 (= 39<<2) words
and then has to fix up the answer to include the high 3 words of each.
I suspect this requirement was needed to make the analysis of
how many temporary words to reserve easier, back when the
answer was 3*n and depended on exactly halving the size at
each Karatsuba step.
Now that we have the more flexible temporary allocation stack,
we can change Karatsuba to accept operands of odd length.
Doing so avoids most of the fixup that the old approach required.
For example, multiplying a pair of 99-word numbers runs
karatsuba on all 99 words now.
This is simpler and about the same speed or, for large cases, faster.
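A single Karatsuba step on operands of odd length can be sketched with `math/big`'s public API. This is an illustration of the identity only, not the package's internal implementation: the 64-bit word size, the function names, and the use of `big.Int.Mul` for the three sub-products (where a real implementation would recurse) are all assumptions of the sketch.

```go
package main

import (
	"fmt"
	"math/big"
)

const wordBits = 64 // assumed word size for this sketch

// karatsubaStep multiplies non-negative x and y, each viewed as n words,
// using one Karatsuba split. For odd n the low part gets (n+1)/2 words and
// the high part one word fewer; no power-of-two length is required.
func karatsubaStep(x, y *big.Int, n int) *big.Int {
	m := uint((n+1)/2) * wordBits
	mask := new(big.Int).Lsh(big.NewInt(1), m)
	mask.Sub(mask, big.NewInt(1))

	x0, x1 := new(big.Int).And(x, mask), new(big.Int).Rsh(x, m)
	y0, y1 := new(big.Int).And(y, mask), new(big.Int).Rsh(y, m)

	z0 := new(big.Int).Mul(x0, y0)
	z2 := new(big.Int).Mul(x1, y1)
	// z1 = (x0+x1)*(y0+y1) - z0 - z2 = x0*y1 + x1*y0, with one multiplication.
	z1 := new(big.Int).Mul(new(big.Int).Add(x0, x1), new(big.Int).Add(y0, y1))
	z1.Sub(z1, z0).Sub(z1, z2)

	// result = z2<<(2m) + z1<<m + z0
	res := new(big.Int).Lsh(z2, 2*m)
	res.Add(res, z1.Lsh(z1, m))
	return res.Add(res, z0)
}

// demoOperands returns two arbitrary values that occupy exactly n words.
func demoOperands(n int) (x, y *big.Int) {
	top := new(big.Int).Lsh(big.NewInt(1), uint(n)*wordBits)
	x = new(big.Int).Sub(top, big.NewInt(12345))
	y = new(big.Int).Sub(top, big.NewInt(67890))
	return x, y
}

// agreesWithMul checks the sketch against the library's multiplication.
func agreesWithMul(x, y *big.Int, n int) bool {
	return karatsubaStep(x, y, n).Cmp(new(big.Int).Mul(x, y)) == 0
}

func main() {
	x, y := demoOperands(99) // the 99-word case from the message above
	fmt.Println(agreesWithMul(x, y, 99))
}
```

With n = 99 the split is 50 low words and 49 high words, so all 99 words take part in the Karatsuba step and no separate fixup for leftover high words is needed.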
goos: linux
goarch: amd64
pkg: math/big
cpu: Intel(R) Xeon(R) CPU @ 3.10GHz
│ old │ new │
│ sec/op │ sec/op vs base │
GCD10x10/WithoutXY-16 99.62n ± 3% 99.10n ± 3% ~ (p=0.009 n=15)
GCD10x10/WithXY-16 243.4n ± 1% 245.2n ± 1% ~ (p=0.009 n=15)
GCD100x100/WithoutXY-16 921.9n ± 1% 919.2n ± 1% ~ (p=0.076 n=15)
GCD100x100/WithXY-16 1.527µ ± 1% 1.526µ ± 0% ~ (p=0.813 n=15)
GCD1000x1000/WithoutXY-16 9.704µ ± 1% 9.696µ ± 0% ~ (p=0.532 n=15)
GCD1000x1000/WithXY-16 14.03µ ± 1% 13.96µ ± 0% ~ (p=0.014 n=15)
GCD10000x10000/WithoutXY-16 206.5µ ± 2% 206.5µ ± 0% ~ (p=0.967 n=15)
GCD10000x10000/WithXY-16 398.0µ ± 1% 397.4µ ± 0% ~ (p=0.683 n=15)
Div/20/10-16 22.22n ± 0% 22.23n ± 0% ~ (p=0.105 n=15)
Div/40/20-16 22.23n ± 0% 22.23n ± 0% ~ (p=0.307 n=15)
Div/100/50-16 55.47n ± 0% 55.47n ± 0% ~ (p=0.573 n=15)
Div/200/100-16 174.9n ± 1% 174.6n ± 1% ~ (p=0.814 n=15)
Div/400/200-16 209.5n ± 1% 210.5n ± 1% ~ (p=0.454 n=15)
Div/1000/500-16 379.9n ± 0% 383.5n ± 2% ~ (p=0.123 n=15)
Div/2000/1000-16 780.1n ± 0% 784.6n ± 1% +0.58% (p=0.000 n=15)
Div/20000/10000-16 25.22µ ± 1% 25.15µ ± 0% ~ (p=0.213 n=15)
Div/200000/100000-16 921.8µ ± 1% 926.1µ ± 0% ~ (p=0.009 n=15)
Div/2000000/1000000-16 37.91m ± 0% 35.63m ± 0% -6.02% (p=0.000 n=15)
Div/20000000/10000000-16 1.378 ± 0% 1.336 ± 0% -3.03% (p=0.000 n=15)
NatMul/10-16 166.8n ± 4% 168.9n ± 3% ~ (p=0.008 n=15)
NatMul/100-16 5.519µ ± 2% 5.548µ ± 4% ~ (p=0.032 n=15)
NatMul/1000-16 230.4µ ± 1% 220.2µ ± 1% -4.43% (p=0.000 n=15)
NatMul/10000-16 8.569m ± 1% 8.640m ± 1% ~ (p=0.005 n=15)
NatMul/100000-16 376.5m ± 1% 334.1m ± 0% -11.26% (p=0.000 n=15)
NatSqr/1-16 27.85n ± 5% 28.60n ± 2% ~ (p=0.123 n=15)
NatSqr/2-16 47.99n ± 2% 48.84n ± 1% ~ (p=0.008 n=15)
NatSqr/3-16 59.41n ± 2% 60.87n ± 2% +2.46% (p=0.001 n=15)
NatSqr/5-16 87.27n ± 2% 89.31n ± 3% ~ (p=0.087 n=15)
NatSqr/8-16 124.6n ± 3% 128.9n ± 3% ~ (p=0.006 n=15)
NatSqr/10-16 166.3n ± 3% 172.7n ± 3% ~ (p=0.002 n=15)
NatSqr/20-16 385.2n ± 2% 394.7n ± 3% ~ (p=0.036 n=15)
NatSqr/30-16 622.7n ± 3% 642.9n ± 3% ~ (p=0.032 n=15)
NatSqr/50-16 1.274µ ± 3% 1.323µ ± 4% ~ (p=0.003 n=15)
NatSqr/80-16 2.606µ ± 4% 2.714µ ± 4% ~ (p=0.044 n=15)
NatSqr/100-16 3.731µ ± 4% 3.871µ ± 4% ~ (p=0.038 n=15)
NatSqr/200-16 12.99µ ± 2% 13.09µ ± 3% ~ (p=0.838 n=15)
NatSqr/300-16 22.87µ ± 2% 23.25µ ± 2% ~ (p=0.285 n=15)
NatSqr/500-16 58.43µ ± 1% 58.25µ ± 2% ~ (p=0.345 n=15)
NatSqr/800-16 115.3µ ± 3% 116.2µ ± 3% ~ (p=0.126 n=15)
NatSqr/1000-16 173.9µ ± 1% 174.3µ ± 1% ~ (p=0.935 n=15)
NatSqr/10000-16 6.133m ± 2% 6.034m ± 1% -1.62% (p=0.000 n=15)
NatSqr/100000-16 253.8m ± 1% 241.5m ± 0% -4.87% (p=0.000 n=15)
geomean 7.745µ 7.760µ +0.19%
goos: linux
goarch: amd64
pkg: math/big
cpu: Intel(R) Xeon(R) Platinum 8481C CPU @ 2.70GHz
│ old │ new │
│ sec/op │ sec/op vs base │
GCD10x10/WithoutXY-88 62.17n ± 4% 61.44n ± 0% -1.17% (p=0.000 n=15)
GCD10x10/WithXY-88 173.4n ± 2% 172.4n ± 4% ~ (p=0.615 n=15)
GCD100x100/WithoutXY-88 584.0n ± 1% 582.9n ± 0% ~ (p=0.009 n=15)
GCD100x100/WithXY-88 1.098µ ± 1% 1.091µ ± 2% ~ (p=0.002 n=15)
GCD1000x1000/WithoutXY-88 6.055µ ± 0% 6.049µ ± 0% ~ (p=0.007 n=15)
GCD1000x1000/WithXY-88 9.430µ ± 0% 9.417µ ± 1% ~ (p=0.123 n=15)
GCD10000x10000/WithoutXY-88 153.4µ ± 2% 149.0µ ± 2% -2.85% (p=0.000 n=15)
GCD10000x10000/WithXY-88 350.6µ ± 3% 349.0µ ± 2% ~ (p=0.126 n=15)
Div/20/10-88 13.12n ± 0% 13.12n ± 1% 0.00% (p=0.042 n=15)
Div/40/20-88 13.12n ± 0% 13.13n ± 0% ~ (p=0.004 n=15)
Div/100/50-88 25.49n ± 0% 25.49n ± 0% ~ (p=0.452 n=15)
Div/200/100-88 115.7n ± 2% 113.8n ± 2% ~ (p=0.212 n=15)
Div/400/200-88 135.0n ± 1% 136.1n ± 1% ~ (p=0.005 n=15)
Div/1000/500-88 257.5n ± 1% 259.9n ± 1% ~ (p=0.004 n=15)
Div/2000/1000-88 567.5n ± 1% 572.4n ± 2% ~ (p=0.616 n=15)
Div/20000/10000-88 25.65µ ± 0% 25.77µ ± 1% ~ (p=0.032 n=15)
Div/200000/100000-88 777.4µ ± 1% 754.3µ ± 1% -2.97% (p=0.000 n=15)
Div/2000000/1000000-88 33.66m ± 0% 31.37m ± 0% -6.81% (p=0.000 n=15)
Div/20000000/10000000-88 1.320 ± 0% 1.266 ± 0% -4.04% (p=0.000 n=15)
NatMul/10-88 151.9n ± 7% 143.3n ± 7% ~ (p=0.878 n=15)
NatMul/100-88 4.418µ ± 2% 4.337µ ± 3% ~ (p=0.512 n=15)
NatMul/1000-88 206.8µ ± 1% 189.8µ ± 1% -8.25% (p=0.000 n=15)
NatMul/10000-88 8.531m ± 1% 8.095m ± 0% -5.12% (p=0.000 n=15)
NatMul/100000-88 298.9m ± 0% 260.5m ± 1% -12.85% (p=0.000 n=15)
NatSqr/1-88 27.55n ± 6% 28.25n ± 7% ~ (p=0.024 n=15)
NatSqr/2-88 44.71n ± 6% 46.21n ± 9% ~ (p=0.024 n=15)
NatSqr/3-88 55.44n ± 4% 58.41n ± 10% ~ (p=0.126 n=15)
NatSqr/5-88 80.71n ± 5% 81.41n ± 5% ~ (p=0.032 n=15)
NatSqr/8-88 115.7n ± 4% 115.4n ± 5% ~ (p=0.814 n=15)
NatSqr/10-88 147.4n ± 4% 147.3n ± 4% ~ (p=0.505 n=15)
NatSqr/20-88 337.8n ± 3% 337.3n ± 4% ~ (p=0.814 n=15)
NatSqr/30-88 556.9n ± 3% 557.6n ± 4% ~ (p=0.814 n=15)
NatSqr/50-88 1.208µ ± 4% 1.208µ ± 3% ~ (p=0.910 n=15)
NatSqr/80-88 2.591µ ± 3% 2.581µ ± 3% ~ (p=0.705 n=15)
NatSqr/100-88 3.870µ ± 3% 3.858µ ± 3% ~ (p=0.846 n=15)
NatSqr/200-88 14.43µ ± 3% 14.28µ ± 2% ~ (p=0.383 n=15)
NatSqr/300-88 24.68µ ± 2% 24.49µ ± 2% ~ (p=0.624 n=15)
NatSqr/500-88 66.27µ ± 1% 66.18µ ± 1% ~ (p=0.735 n=15)
NatSqr/800-88 128.7µ ± 1% 127.4µ ± 1% ~ (p=0.050 n=15)
NatSqr/1000-88 198.7µ ± 1% 197.7µ ± 1% ~ (p=0.229 n=15)
NatSqr/10000-88 6.582m ± 1% 6.426m ± 1% -2.37% (p=0.000 n=15)
NatSqr/100000-88 274.3m ± 0% 267.3m ± 0% -2.57% (p=0.000 n=15)
geomean 6.518µ 6.438µ -1.22%
goos: linux
goarch: arm64
pkg: math/big
│ old │ new │
│ sec/op │ sec/op vs base │
GCD10x10/WithoutXY-16 61.70n ± 1% 61.32n ± 1% ~ (p=0.361 n=15)
GCD10x10/WithXY-16 217.3n ± 1% 217.0n ± 1% ~ (p=0.395 n=15)
GCD100x100/WithoutXY-16 569.7n ± 0% 572.6n ± 2% ~ (p=0.213 n=15)
GCD100x100/WithXY-16 1.241µ ± 1% 1.236µ ± 1% ~ (p=0.157 n=15)
GCD1000x1000/WithoutXY-16 5.558µ ± 0% 5.566µ ± 0% ~ (p=0.228 n=15)
GCD1000x1000/WithXY-16 9.319µ ± 0% 9.326µ ± 0% ~ (p=0.233 n=15)
GCD10000x10000/WithoutXY-16 126.4µ ± 2% 128.7µ ± 3% ~ (p=0.081 n=15)
GCD10000x10000/WithXY-16 279.3µ ± 0% 278.3µ ± 5% ~ (p=0.187 n=15)
Div/20/10-16 15.12n ± 1% 15.21n ± 1% ~ (p=0.490 n=15)
Div/40/20-16 15.11n ± 0% 15.23n ± 1% ~ (p=0.107 n=15)
Div/100/50-16 26.53n ± 0% 26.50n ± 0% ~ (p=0.299 n=15)
Div/200/100-16 123.7n ± 0% 124.0n ± 0% ~ (p=0.086 n=15)
Div/400/200-16 142.5n ± 0% 142.4n ± 0% ~ (p=0.039 n=15)
Div/1000/500-16 259.9n ± 1% 261.2n ± 1% ~ (p=0.044 n=15)
Div/2000/1000-16 539.4n ± 1% 532.3n ± 1% -1.32% (p=0.001 n=15)
Div/20000/10000-16 22.43µ ± 0% 22.32µ ± 0% -0.49% (p=0.000 n=15)
Div/200000/100000-16 898.3µ ± 0% 889.6µ ± 0% -0.96% (p=0.000 n=15)
Div/2000000/1000000-16 38.37m ± 0% 35.11m ± 0% -8.49% (p=0.000 n=15)
Div/20000000/10000000-16 1.449 ± 0% 1.384 ± 0% -4.48% (p=0.000 n=15)
NatMul/10-16 182.0n ± 1% 177.8n ± 1% -2.31% (p=0.000 n=15)
NatMul/100-16 5.537µ ± 0% 5.693µ ± 0% +2.82% (p=0.000 n=15)
NatMul/1000-16 229.9µ ± 0% 224.8µ ± 0% -2.24% (p=0.000 n=15)
NatMul/10000-16 8.985m ± 0% 8.751m ± 0% -2.61% (p=0.000 n=15)
NatMul/100000-16 371.1m ± 0% 331.5m ± 0% -10.66% (p=0.000 n=15)
NatSqr/1-16 46.77n ± 6% 42.76n ± 1% -8.57% (p=0.000 n=15)
NatSqr/2-16 66.99n ± 4% 63.62n ± 1% -5.03% (p=0.000 n=15)
NatSqr/3-16 76.79n ± 4% 73.42n ± 1% ~ (p=0.007 n=15)
NatSqr/5-16 99.00n ± 3% 95.35n ± 1% -3.69% (p=0.000 n=15)
NatSqr/8-16 160.0n ± 3% 155.1n ± 1% -3.06% (p=0.001 n=15)
NatSqr/10-16 178.4n ± 2% 175.9n ± 0% -1.40% (p=0.001 n=15)
NatSqr/20-16 361.9n ± 2% 361.3n ± 0% ~ (p=0.083 n=15)
NatSqr/30-16 584.7n ± 0% 586.8n ± 0% +0.36% (p=0.000 n=15)
NatSqr/50-16 1.327µ ± 0% 1.329µ ± 0% ~ (p=0.349 n=15)
NatSqr/80-16 2.893µ ± 1% 2.925µ ± 0% +1.11% (p=0.000 n=15)
NatSqr/100-16 4.330µ ± 1% 4.381µ ± 0% +1.18% (p=0.000 n=15)
NatSqr/200-16 16.25µ ± 1% 16.43µ ± 0% +1.07% (p=0.000 n=15)
NatSqr/300-16 27.85µ ± 1% 28.06µ ± 0% +0.77% (p=0.000 n=15)
NatSqr/500-16 76.01µ ± 0% 76.34µ ± 0% ~ (p=0.002 n=15)
NatSqr/800-16 146.8µ ± 0% 148.1µ ± 0% +0.83% (p=0.000 n=15)
NatSqr/1000-16 228.2µ ± 0% 228.6µ ± 0% ~ (p=0.123 n=15)
NatSqr/10000-16 7.524m ± 0% 7.426m ± 0% -1.31% (p=0.000 n=15)
NatSqr/100000-16 316.7m ± 0% 309.2m ± 0% -2.36% (p=0.000 n=15)
geomean 7.264µ 7.172µ -1.27%
goos: darwin
goarch: arm64
pkg: math/big
cpu: Apple M3 Pro
│ old │ new │
│ sec/op │ sec/op vs base │
GCD10x10/WithoutXY-12 32.61n ± 1% 32.42n ± 1% ~ (p=0.021 n=15)
GCD10x10/WithXY-12 87.70n ± 1% 88.42n ± 1% ~ (p=0.010 n=15)
GCD100x100/WithoutXY-12 305.9n ± 0% 306.4n ± 0% ~ (p=0.003 n=15)
GCD100x100/WithXY-12 560.3n ± 2% 556.6n ± 1% ~ (p=0.018 n=15)
GCD1000x1000/WithoutXY-12 3.509µ ± 2% 3.464µ ± 1% ~ (p=0.145 n=15)
GCD1000x1000/WithXY-12 5.347µ ± 2% 5.372µ ± 1% ~ (p=0.046 n=15)
GCD10000x10000/WithoutXY-12 73.75µ ± 1% 73.99µ ± 1% ~ (p=0.004 n=15)
GCD10000x10000/WithXY-12 148.4µ ± 0% 147.8µ ± 1% ~ (p=0.076 n=15)
Div/20/10-12 9.481n ± 0% 9.462n ± 1% ~ (p=0.631 n=15)
Div/40/20-12 9.457n ± 0% 9.462n ± 1% ~ (p=0.798 n=15)
Div/100/50-12 14.91n ± 0% 14.79n ± 1% -0.80% (p=0.000 n=15)
Div/200/100-12 84.56n ± 1% 84.60n ± 1% ~ (p=0.271 n=15)
Div/400/200-12 103.8n ± 0% 102.8n ± 0% -0.96% (p=0.000 n=15)
Div/1000/500-12 181.3n ± 1% 184.2n ± 2% ~ (p=0.091 n=15)
Div/2000/1000-12 397.5n ± 0% 397.4n ± 0% ~ (p=0.299 n=15)
Div/20000/10000-12 14.04µ ± 1% 13.99µ ± 0% ~ (p=0.221 n=15)
Div/200000/100000-12 523.1µ ± 0% 514.0µ ± 3% ~ (p=0.775 n=15)
Div/2000000/1000000-12 21.58m ± 0% 20.01m ± 1% -7.29% (p=0.000 n=15)
Div/20000000/10000000-12 813.5m ± 0% 796.2m ± 1% -2.13% (p=0.000 n=15)
NatMul/10-12 80.46n ± 1% 80.02n ± 1% ~ (p=0.063 n=15)
NatMul/100-12 2.904µ ± 0% 2.979µ ± 1% +2.58% (p=0.000 n=15)
NatMul/1000-12 127.8µ ± 0% 122.3µ ± 0% -4.28% (p=0.000 n=15)
NatMul/10000-12 5.141m ± 0% 4.975m ± 1% -3.23% (p=0.000 n=15)
NatMul/100000-12 208.8m ± 0% 189.6m ± 3% -9.21% (p=0.000 n=15)
NatSqr/1-12 11.90n ± 1% 11.76n ± 1% ~ (p=0.059 n=15)
NatSqr/2-12 21.33n ± 1% 21.12n ± 0% ~ (p=0.063 n=15)
NatSqr/3-12 26.05n ± 1% 25.79n ± 0% ~ (p=0.002 n=15)
NatSqr/5-12 37.31n ± 0% 36.98n ± 1% ~ (p=0.008 n=15)
NatSqr/8-12 63.07n ± 0% 62.75n ± 1% ~ (p=0.061 n=15)
NatSqr/10-12 79.48n ± 0% 79.59n ± 0% ~ (p=0.455 n=15)
NatSqr/20-12 173.1n ± 0% 173.2n ± 1% ~ (p=0.518 n=15)
NatSqr/30-12 288.6n ± 1% 289.2n ± 0% ~ (p=0.030 n=15)
NatSqr/50-12 653.3n ± 0% 653.3n ± 0% ~ (p=0.361 n=15)
NatSqr/80-12 1.492µ ± 0% 1.496µ ± 0% ~ (p=0.018 n=15)
NatSqr/100-12 2.270µ ± 1% 2.270µ ± 0% ~ (p=0.326 n=15)
NatSqr/200-12 8.776µ ± 1% 8.784µ ± 1% ~ (p=0.083 n=15)
NatSqr/300-12 15.07µ ± 0% 15.09µ ± 0% ~ (p=0.455 n=15)
NatSqr/500-12 41.71µ ± 0% 41.77µ ± 1% ~ (p=0.305 n=15)
NatSqr/800-12 80.77µ ± 1% 80.59µ ± 0% ~ (p=0.113 n=15)
NatSqr/1000-12 126.4µ ± 1% 126.5µ ± 0% ~ (p=0.683 n=15)
NatSqr/10000-12 4.204m ± 0% 4.119m ± 0% -2.02% (p=0.000 n=15)
NatSqr/100000-12 177.0m ± 0% 172.9m ± 0% -2.31% (p=0.000 n=15)
geomean 3.790µ 3.757µ -0.87%
Change-Id: Ifc7a9b61f678df216690511ac8bb9143189a795e
Reviewed-on: https://go-review.googlesource.com/c/go/+/652057
Auto-Submit: Russ Cox <rsc@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Robert Griesemer <gri@google.com>
|
|
|
|
26040b1dd7 |
cmd/compile: remove noDuffDevice
noDuffDevice was for Plan 9, but Plan 9 doesn't need it anymore. It was also being set in s390x, mips, mipsle, and wasm, but on those systems it had no effect since the SSA rules for those architectures don't refer to it at all. Change-Id: Ib85c0832674c714f3ad5091f0a022eb7cd3ebcdf Reviewed-on: https://go-review.googlesource.com/c/go/+/655878 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> Auto-Submit: Russ Cox <rsc@golang.org> |
|
|
|
c9b07e8871 |
cmd/compile: use FMA on plan9, and drop UseFMA
Every OS uses FMA now. Change-Id: Ia7ffa77c52c45aefca611ddc54e9dfffb27a48da Reviewed-on: https://go-review.googlesource.com/c/go/+/655877 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Russ Cox <rsc@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
|
|
|
35cb497d6e |
cmd/compile: remove useSSE
Every OS uses SSE now. Change-Id: I4df7e2fbc8e5ccb1fc84a884d4c922b7a2a628e4 Reviewed-on: https://go-review.googlesource.com/c/go/+/655876 Auto-Submit: Russ Cox <rsc@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> |
|
|
|
644b984027 |
cmd/compile: compute bitsize from type size in prove to clean some switches
Change-Id: I215adda9050d214576433700aed4c371a36aaaed Reviewed-on: https://go-review.googlesource.com/c/go/+/656335 Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Keith Randall <khr@golang.org> Auto-Submit: Jorropo <jorropo.pgm@gmail.com> |
|
|
|
b60b9cf21f |
cmd/compile: add constant folding for bits.Add64
Change-Id: I0ed4ebeaaa68e274e5902485ccc1165c039440bd Reviewed-on: https://go-review.googlesource.com/c/go/+/656275 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com> Auto-Submit: Jorropo <jorropo.pgm@gmail.com> |
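As a rough illustration of the kind of call this folding targets, the sketch below calls `bits.Add64` with arguments that are all known at compile time. The helper name `addConstants` is hypothetical, and whether a given compiler version actually folds this call is an assumption that the program cannot observe; it only shows the values a folded result would have to match.

```go
package main

import (
	"fmt"
	"math"
	"math/bits"
)

// addConstants adds two compile-time constants with a carry-in of zero.
// MaxUint64 + 1 wraps around to 0 and produces a carry-out of 1.
func addConstants() (sum, carry uint64) {
	return bits.Add64(math.MaxUint64, 1, 0)
}

func main() {
	sum, carry := addConstants()
	fmt.Println("sum:", sum, "carry:", carry)
}
```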
|
|
|
4ff70cf868 |
cmd/compile: add MakeTuple generic SSA op to remove duplicate Select[01] rules
Change-Id: Id94a5e503f02aa29dc1e334b521770107d4261db Reviewed-on: https://go-review.googlesource.com/c/go/+/656615 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Jorropo <jorropo.pgm@gmail.com> |
|
|
|
99411d7847 |
cmd/compile: compute bits.OnesCount's limits from argument's limits
Change-Id: Ia90d48ea0fab363c8592221fad88958b522edefe Reviewed-on: https://go-review.googlesource.com/c/go/+/656159 Reviewed-by: David Chase <drchase@google.com> Auto-Submit: Jorropo <jorropo.pgm@gmail.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
|
|
|
8d767ff38d |
runtime: increase GDB version testing requirement to 10 from 7.7
Bump the required version of GDB up to 10 from 7.7 in the runtime GDB tests, so as to ensure that we have something that can handle DWARF 5 when running tests. In theory there is some DWARF 5 support on the version 9 release branch, but we get "Dwarf Error: DW_FORM_addrx" errors for some archs on builders where GDB 9.2 is installed. Updates #26379. Change-Id: I1b7b45f8e4dd1fafccf22f2dda0124458ecf7cba Reviewed-on: https://go-review.googlesource.com/c/go/+/656836 Auto-Submit: Ian Lance Taylor <iant@google.com> Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Ian Lance Taylor <iant@google.com> |
|
|
|
c032b04219 |
internal/buildcfg: fix typo in DWARF 5 enabling code
Fix a typo in the code that decides which GOOS values will support use
of DWARF 5 ("darwin" was not spelled correctly).
Updates #26379.
Change-Id: I3a7906d708550fcedc3a8e89d0444bf12b9143f1
Reviewed-on: https://go-review.googlesource.com/c/go/+/656895
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
|
|
c00647b49b |
cmd/compile: set bits.OnesCount's limits to [0, 64]
Change-Id: I2f60de836f58ef91baae856f44d8f73c190326f2 Reviewed-on: https://go-review.googlesource.com/c/go/+/656158 Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Keith Randall <khr@golang.org> Auto-Submit: Jorropo <jorropo.pgm@gmail.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Keith Randall <khr@golang.org> |
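A sketch of why the [0, 64] range matters: a 64-bit population count can index a 65-entry table without ever going out of range. The table and helper names below are hypothetical, and whether the compiler removes the bounds check in `hasOddPopCount` depends on the toolchain version; the program only demonstrates that the extremes are 0 and 64.

```go
package main

import (
	"fmt"
	"math/bits"
)

// oddCount[i] reports whether i is odd. It has 65 entries because a 64-bit
// population count can be any value from 0 through 64 inclusive.
var oddCount [65]bool

func init() {
	for i := range oddCount {
		oddCount[i] = i%2 == 1
	}
}

// hasOddPopCount indexes the table with the population count of x.
// The index is always within [0, 64], so it can never panic.
func hasOddPopCount(x uint64) bool {
	return oddCount[bits.OnesCount64(x)]
}

// popCountExtremes returns the smallest and largest possible results.
func popCountExtremes() (lo, hi int) {
	return bits.OnesCount64(0), bits.OnesCount64(^uint64(0))
}

func main() {
	lo, hi := popCountExtremes()
	fmt.Println("range:", lo, "to", hi)
	fmt.Println("0b111 has odd count:", hasOddPopCount(0b111))
}
```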
|
|
|
554a3c51dc |
cmd/compile: use min & max builtins to assert constant bounds in prove's tests
I originally used |= and &= to set up assumptions exploitable by the operation under test, but these have multiple issues that make them a poor fit for this use case: - &= does not pass the minimum value as-is, but always sets it to 0 - |= rounds up the max value to a number of the same length with all ones set - I never implemented them to work with negative signed numbers Change-Id: Ie43c576fb10393e69d6f989b048823daa02b1df8 Reviewed-on: https://go-review.googlesource.com/c/go/+/656160 Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: David Chase <drchase@google.com> Auto-Submit: Jorropo <jorropo.pgm@gmail.com> |
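The difference between the two ways of constraining a value can be sketched like this. The helper names `clampRange` and `maskLow` are hypothetical and are not taken from the prove tests; the sketch assumes Go 1.21 or later for the `min` and `max` builtins. It shows that clamping keeps an arbitrary lower bound, including a negative one, while masking always admits 0.

```go
package main

import "fmt"

// clampRange pins x into [lo, hi] using the min and max builtins.
// Both bounds are preserved exactly, and negative bounds work as well.
func clampRange(x, lo, hi int) int {
	return min(max(x, lo), hi)
}

// maskLow constrains x with &=. The result is at most 0x1F, but the lower
// bound is always 0, so a nonzero minimum cannot be expressed this way.
func maskLow(x uint) uint {
	x &= 0x1F
	return x
}

func main() {
	fmt.Println(clampRange(-100, -5, 5)) // below the range, pinned to lo
	fmt.Println(clampRange(3, -5, 5))    // inside the range, unchanged
	fmt.Println(clampRange(100, -5, 5))  // above the range, pinned to hi
	fmt.Println(maskLow(100))            // only the low five bits survive
}
```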
|
|
|
d2842229fc |
cmd/compile: compute min's & max's limits from argument's limits inside flowLimit
Updates #68857 Change-Id: Ied07e656bba42f3b1b5f9b9f5442806aa2e7959b Reviewed-on: https://go-review.googlesource.com/c/go/+/656157 Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Jorropo <jorropo.pgm@gmail.com> |
|
|
|
bcd0ebbd2a |
internal/cpu: use correct variable when parsing CPU features lamcas and lam_bh on loong64
Change-Id: I5019f4e32243911f735f775bcb3c0dba5adb4162 Reviewed-on: https://go-review.googlesource.com/c/go/+/655395 Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Junyang Shao <shaojunyang@google.com> Reviewed-by: Meidan Li <limeidan@loongson.cn> Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
|
|
|
4364893149 |
cmd/internal/script/scripttest: use GOHOSTARCH to find tool directory
Fixes #72800 Change-Id: Idde7eae13d1c0098e5314935cf8ca823cbc7a7cc Reviewed-on: https://go-review.googlesource.com/c/go/+/656855 Auto-Submit: Ian Lance Taylor <iant@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
|
|
|
17b9c9f2ad |
internal/bytealg: optimize Count{,String} in loong64
Benchmark on Loongson 3A6000 and 3A5000:
goos: linux
goarch: loong64
pkg: bytes
cpu: Loongson-3A6000 @ 2500.00MHz
| bench.old | bench.new |
| sec/op | sec/op vs base |
CountSingle/10 13.210n ± 0% 9.984n ± 0% -24.42% (p=0.000 n=15)
CountSingle/32 31.970n ± 1% 7.205n ± 0% -77.46% (p=0.000 n=15)
CountSingle/4K 4039.0n ± 0% 108.7n ± 0% -97.31% (p=0.000 n=15)
CountSingle/4M 4158.9µ ± 0% 117.3µ ± 0% -97.18% (p=0.000 n=15)
CountSingle/64M 68.641m ± 0% 2.585m ± 1% -96.23% (p=0.000 n=15)
geomean 13.72µ 1.189µ -91.34%
| bench.old | bench.new |
| B/s | B/s vs base |
CountSingle/10 722.0Mi ± 0% 955.2Mi ± 0% +32.30% (p=0.000 n=15)
CountSingle/32 954.6Mi ± 1% 4235.4Mi ± 0% +343.68% (p=0.000 n=15)
CountSingle/4K 967.2Mi ± 0% 35947.6Mi ± 0% +3616.64% (p=0.000 n=15)
CountSingle/4M 961.8Mi ± 0% 34092.7Mi ± 0% +3444.71% (p=0.000 n=15)
CountSingle/64M 932.4Mi ± 0% 24757.2Mi ± 1% +2555.24% (p=0.000 n=15)
geomean 902.2Mi 10.17Gi +1054.77%
goos: linux
goarch: loong64
pkg: bytes
cpu: Loongson-3A5000 @ 2500.00MHz
| bench.old | bench.new |
| sec/op | sec/op vs base |
CountSingle/10 14.41n ± 0% 12.81n ± 0% -11.10% (p=0.000 n=15)
CountSingle/32 36.230n ± 0% 9.609n ± 0% -73.48% (p=0.000 n=15)
CountSingle/4K 4366.0n ± 0% 165.5n ± 0% -96.21% (p=0.000 n=15)
CountSingle/4M 4464.7µ ± 0% 325.2µ ± 0% -92.72% (p=0.000 n=15)
CountSingle/64M 75.627m ± 0% 8.307m ± 69% -89.02% (p=0.000 n=15)
geomean 15.04µ 2.229µ -85.18%
| bench.old | bench.new |
| B/s | B/s vs base |
CountSingle/10 661.8Mi ± 0% 744.4Mi ± 0% +12.49% (p=0.000 n=15)
CountSingle/32 842.4Mi ± 0% 3176.1Mi ± 0% +277.03% (p=0.000 n=15)
CountSingle/4K 894.7Mi ± 0% 23596.7Mi ± 0% +2537.34% (p=0.000 n=15)
CountSingle/4M 895.9Mi ± 0% 12299.7Mi ± 0% +1272.88% (p=0.000 n=15)
CountSingle/64M 846.3Mi ± 0% 7703.9Mi ± 41% +810.34% (p=0.000 n=15)
geomean 823.3Mi 5.424Gi +574.68%
Change-Id: Ie07592beac61bdb093470c524049ed494df4d703
Reviewed-on: https://go-review.googlesource.com/c/go/+/586055
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
|
|
ca19f987ca |
internal/buildcfg: enable DWARF version 5 by default
This patch enables the DWARF version 5 experiment by default for most platforms that support DWARF. Note that macOS is kept at version 4, due to problems with CGO builds; the "dsymutil" tool from older versions of Xcode (prior to V16) can't handle DWARF5. Similarly, we keep DWARF 4 for GOOS=aix, where XCOFF doesn't appear to support the new section subtypes in DWARF 5. Updates #26379. Change-Id: I5edd600c611f03ce8e11be3ca18c1e6686ac74ef Reviewed-on: https://go-review.googlesource.com/c/go/+/637895 Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com> |
|
|
|
4acc5b4da6 |
cmp: add examples for Compare and Less
Change-Id: I6900f52736d5316ca523a213c65896861d855433 Reviewed-on: https://go-review.googlesource.com/c/go/+/656635 Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Ian Lance Taylor <iant@google.com> |
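For readers unfamiliar with the two functions, here is a short usage sketch. It is not one of the examples added by this commit; the `person` type and the helper names are hypothetical. It relies only on documented behavior: `cmp.Compare` returns -1, 0, or +1, and both functions treat a NaN as less than any non-NaN value.

```go
package main

import (
	"cmp"
	"fmt"
	"math"
	"slices"
)

// person is a hypothetical record type used to demonstrate sorting.
type person struct {
	name string
	age  int
}

// byAge orders people by age using cmp.Compare as the comparison function.
func byAge(a, b person) int {
	return cmp.Compare(a.age, b.age)
}

// nanIsLess reports whether cmp.Less treats NaN as less than 1.0.
func nanIsLess() bool {
	return cmp.Less(math.NaN(), 1.0)
}

func main() {
	people := []person{{"Gopher", 13}, {"Alice", 55}, {"Bob", 24}}
	slices.SortFunc(people, byAge)
	fmt.Println(people)
	fmt.Println(cmp.Compare("a", "b"), nanIsLess())
}
```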
|
|
|
908af6529c |
archive/zip: error on ReadDir if there are invalid file names
Fixes #50179 Change-Id: I616a6d1279d025e345d2daa6d44b687c8a2d09e1 Reviewed-on: https://go-review.googlesource.com/c/go/+/656495 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Ian Lance Taylor <iant@google.com> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> |
|
|
|
817218a26c |
net/http: document Redirect behavior for non-ASCII characters
For #4385 For #72745 Change-Id: Ibd54fc03467eb948001299001bb2e2529512a7c0 Reviewed-on: https://go-review.googlesource.com/c/go/+/656135 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Ian Lance Taylor <iant@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Reviewed-by: Damien Neil <dneil@google.com> |
|
|
|
3fb8b4f3db |
all: move //go:debug decoratemappings=0 test to cmd/go
test/decoratemappingszero.go is intended to test that //go:debug decoratemappings=0 disables annotations. Unfortunately, //go:debug processing is handled by cmd/go, but cmd/internal/testdir (which runs tests from test/) generally invokes the compiler directly, thus it does not set default GODEBUGs. Move this test to the cmd/go script tests, alongside the similar test for language version. Fixes #72772. Cq-Include-Trybots: luci.golang.try:gotip-linux-ppc64le_power10 Change-Id: I6a6a636c9d380ef984f760be5689fdc7f5cb2aeb Reviewed-on: https://go-review.googlesource.com/c/go/+/656795 Reviewed-by: Ian Lance Taylor <iant@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Michael Pratt <mpratt@google.com> |
|
|
|
8867af9207 |
os: add more File.WriteAt tests
The existing File.WriteAt tests don't verify that the file offset is unchanged after calling WriteAt, although that is what users expect. Add some new tests to verify that this behavior doesn't regress. Change-Id: Ib1e048c7333d6efec71bd8f75a4fa745775306f9 Reviewed-on: https://go-review.googlesource.com/c/go/+/656355 Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Ian Lance Taylor <iant@google.com> |
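The property these tests guard can be sketched in a few lines. This is an illustration, not one of the tests added by the commit, and the helper name `offsetsAroundWriteAt` is hypothetical. It writes some bytes, records the current offset, calls `WriteAt` at position 0, and records the offset again; the two values are expected to match.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// offsetsAroundWriteAt returns the file offset just before and just after
// a WriteAt call on a temporary file.
func offsetsAroundWriteAt() (before, after int64, err error) {
	f, err := os.CreateTemp("", "writeat-*")
	if err != nil {
		return 0, 0, err
	}
	defer os.Remove(f.Name())
	defer f.Close()

	// A sequential write advances the offset to the end of the data.
	if _, err := f.Write([]byte("hello world")); err != nil {
		return 0, 0, err
	}
	before, err = f.Seek(0, io.SeekCurrent)
	if err != nil {
		return 0, 0, err
	}

	// A positional write should leave the offset where it was.
	if _, err := f.WriteAt([]byte("HELLO"), 0); err != nil {
		return 0, 0, err
	}
	after, err = f.Seek(0, io.SeekCurrent)
	if err != nil {
		return 0, 0, err
	}
	return before, after, nil
}

func main() {
	before, after, err := offsetsAroundWriteAt()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("offset before:", before, "after:", after)
}
```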
|
|
|
b0e2f185c5 |
cmd/internal/obj/loong64: add {V,XV}MUL{B/H/W/V} and {V,XV}MUH{B/H/W/V}[U] instructions support
Go asm syntax:
VMUL{B/H/W/V} VK, VJ, VD
VMUH{B/H/W/V}[U] VK, VJ, VD
XVMUL{B/H/W/V} XK, XJ, XD
XVMUH{B/H/W/V}[U] XK, XJ, XD
Equivalent platform assembler syntax:
vmul.{b/h/w/d} vd, vj, vk
vmuh.{b/h/w/d}[u] vd, vj, vk
xvmul.{b/h/w/d} xd, xj, xk
xvmuh.{b/h/w/d}[u] xd, xj, xk
Change-Id: I2f15a5b4b6303a0f82cb85114477f58e1b5fd950
Reviewed-on: https://go-review.googlesource.com/c/go/+/636375
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
|
|
|
|
6c70f2b960 |
cmd/compile: Enable inlining of tail calls
Enable inlining tail calls and do not limit emitting tail calls only to the
non-inlineable methods when generating wrappers. This change produces
additional code size reduction.
Code size difference measured with this change (tried for x86_64):
etcd binary:
.text section size: 10613393 -> 10593841 (0.18%)
total binary size: 33450787 -> 33424307 (0.07%)
compile binary:
.text section size: 10171025 -> 10126545 (0.43%)
total binary size: 28241012 -> 28192628 (0.17%)
cockroach binary:
.text section size: 83947260 -> 83694140 (0.3%)
total binary size: 263799808 -> 263534160 (0.1%)
Change-Id: I694f83cb838e64bd4c51f05b7b9f2bf0193bb551
Reviewed-on: https://go-review.googlesource.com/c/go/+/650455
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
|
|
|
|
c18ff21cc8 |
cmd/compile, runtime: remove plan9 special case avoiding SSE
Change-Id: Id5258a72b0727bf7c66d558e30486eac2c6c8c36 Reviewed-on: https://go-review.googlesource.com/c/go/+/655875 Reviewed-by: Keith Randall <khr@golang.org> Auto-Submit: Russ Cox <rsc@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: David du Colombier <0intro@gmail.com> |
|
|
|
3e033b7553 |
cmd/compile: add constant folding for PopCount
Change-Id: I6ea3f75ddd5c7af114ef77bc48f28c7f8570997b Reviewed-on: https://go-review.googlesource.com/c/go/+/656156 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: David Chase <drchase@google.com> Auto-Submit: Jorropo <jorropo.pgm@gmail.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org> |
|
|
|
8591f8e19e |
log/slog: use consistent call depth for all output
This makes all log functions keep a consistent call structure to be nice
with the handleWriter in the slog package which expects a strict level
of 4.
Fixes #67362.
Change-Id: Ib967c696074b1ca931f6656dd27ff1ec484233b8
GitHub-Last-Rev:
|
|
|
|
39b783780a |
net/mail: use sync.OnceValue to build dateLayouts
Simplify buildDateLayouts with sync.OnceValue.
Change-Id: Ib48ab20ee00f5e44cc1b0f6e1afe3fcd1b7dc3c7
GitHub-Last-Rev:
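The pattern named in this commit can be sketched as follows. This is not the net/mail code: the `layouts` variable, the `buildCalls` counter, and the small set of layout fragments are hypothetical, chosen only to show how `sync.OnceValue` (Go 1.21 or later) runs a builder function once and hands the cached result to every later caller.

```go
package main

import (
	"fmt"
	"sync"
)

// buildCalls counts how often the builder function runs (for illustration).
var buildCalls int

// layouts lazily builds a small table of date layouts on first use and
// returns the same slice on every later call.
var layouts = sync.OnceValue(func() []string {
	buildCalls++
	days := []string{"", "Mon, "}
	zones := []string{"-0700", "MST"}
	var out []string
	for _, d := range days {
		for _, z := range zones {
			out = append(out, d+"2 Jan 2006 15:04:05 "+z)
		}
	}
	return out
})

func main() {
	fmt.Println(len(layouts()), "layouts")
	layouts() // second call reuses the cached result
	fmt.Println("builder ran", buildCalls, "time(s)")
}
```

Compared with a hand-written `sync.Once` plus a package-level variable, this keeps the cached value and its initialization in one declaration.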
|
|
|
|
31658ace9d |
runtime/internal: clean up completely
We've been slowly moving packages from runtime/internal to
internal/runtime. For now, runtime/internal only has test packages.
It's a good chance to clean up the references to runtime/internal
in the toolchain.
For #65355.
Change-Id: Ie6f9091a44511d0db9946ea6de7a78d3afe9f063
GitHub-Last-Rev:
|
|
|
|
598df45fce |
net: unblock UDP Reads upon Close on plan9, add test
Fixes #72770 Change-Id: I42be7c7349961188f4b5d73287a3550aba323893 Reviewed-on: https://go-review.googlesource.com/c/go/+/656395 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Reviewed-by: David du Colombier <0intro@gmail.com> Reviewed-by: Russ Cox <rsc@golang.org> |
|
|
|
be2ecfbff8 |
debug/dwarf: read DWARF 5 cu base offsets on SeekPC() path
This patch fixes a bug in CL 655976 relating to DWARF 5 support; we were reading in compile unit base offsets on the Seek() path but not on the corresponding SeekPC path (we need the offsets to be read in both cases). Updates #26379. Fixes #72778. Change-Id: I02850b786a53142307219292f2c5099eb0271559 Reviewed-on: https://go-review.googlesource.com/c/go/+/656675 Reviewed-by: Ian Lance Taylor <iant@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
|
|
|
73fea035bf |
cmd/go: allow symlinks of non-directory files in embed
We previously disallowed all non-regular files being embedded. This CL relaxes the restriction a little: if the GODEBUG embedfollowsymlinks=1 is set, we allow the leaf files being embedded (not the directories containing them) to be symlinks. The files pointed to by the symlinks must still be regular files. This will be used when a Bazel build action executing the Go command is running in a symlink-based sandbox. It's not something we want to enable in general for now, so it's behind a GODEBUG. Fixes #59924 Change-Id: I895be14c12de55b7d1b663d81bdda1df37d54804 Reviewed-on: https://go-review.googlesource.com/c/go/+/643215 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Alan Donovan <adonovan@google.com> |
|
|
|
a588c6fba6 |
go/types, types2: report better error messages for make calls
Change-Id: I4593aeb4cad1e2c3f4705ed5249ac0bad910162f Reviewed-on: https://go-review.googlesource.com/c/go/+/655518 Auto-Submit: Robert Griesemer <gri@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Robert Griesemer <gri@google.com> Reviewed-by: Robert Findley <rfindley@google.com> |
|
|
|
ae4c13afc5 |
go/types, types2: report better error messages for slice expressions
Explicitly compute the common underlying type and while doing so report better slice-expression relevant error messages. Streamline message format for index and slice errors. This removes the last uses of the coreString and match functions. Delete them. Change-Id: I4b50dda1ef7e2ab5e296021458f7f0b6f6e229cd Reviewed-on: https://go-review.googlesource.com/c/go/+/655935 Reviewed-by: Robert Griesemer <gri@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Robert Findley <rfindley@google.com> Auto-Submit: Robert Griesemer <gri@google.com> |
|
|
|
e5d3ece35d |
go/types, types2: remove need for coreString in signature.go
Also, add additional test cases for NewSignatureType to check expected panic behavior. Change-Id: If26cd81a2af384bf2084dd09119483c0584715c8 Reviewed-on: https://go-review.googlesource.com/c/go/+/655695 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Robert Griesemer <gri@google.com> Auto-Submit: Robert Griesemer <gri@google.com> Reviewed-by: Robert Findley <rfindley@google.com> |