Commit Graph

5235 Commits

Author SHA1 Message Date
Joel Sing c1c7e5902f test/codegen: tighten the TrailingZeros64 test on 386
Make the TrailingZeros64 code generation check more specific for 386.
Just checking for BSFL will match both the generic 64 bit decomposition
and the custom 386 lowering.

Change-Id: I62076f1889af0ef1f29704cba01ab419cae0c6e3
Reviewed-on: https://go-review.googlesource.com/c/go/+/656996
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2025-03-14 15:04:38 -07:00
Keith Randall a1ddbdd3ef cmd/compile: don't move nilCheck operations during tighten
Nil checks need to stay in their original blocks. They cannot
be moved to a following conditionally-executed block.

Fixes #72860

Change-Id: Ic2d66cdf030357d91f8a716a004152ba4c016f77
Reviewed-on: https://go-review.googlesource.com/c/go/+/657715
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2025-03-13 21:24:20 -07:00
Joel Sing af92bb594d test/codegen: remove plan9/amd64 specific array zeroing/copying tests
The compiler previously avoided the use of MOVUPS on plan9/amd64. This
was changed in CL 655875, however the codegen tests were not updated
and now fail (seemingly the full codegen tests do not run anywhere,
not even on the longtest builders).

Change-Id: I388b60e7b0911048d4949c5029347f9801c018a9
Reviewed-on: https://go-review.googlesource.com/c/go/+/656997
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Auto-Submit: Keith Randall <khr@google.com>
2025-03-13 05:19:13 -07:00
Xiaolin Zhao b143c98169 cmd/compile: simplify bounded shift on loong64
Use the shiftIsBounded function to generate more efficient shift instructions.
This change also optimize shift ops when the shift value is v&63 and v&31.

goos: linux
goarch: loong64
pkg: math/bits
cpu: Loongson-3A6000-HV @ 2500.00MHz
                |  CL 627855   |               this CL                |
                |    sec/op    |    sec/op     vs base                |
LeadingZeros      1.1005n ± 0%   0.8425n ± 1%  -23.44% (p=0.000 n=10)
LeadingZeros8      1.502n ± 0%    1.501n ± 0%   -0.07% (p=0.001 n=10)
LeadingZeros16     1.502n ± 0%    1.501n ± 0%   -0.07% (p=0.000 n=10)
LeadingZeros32    0.9511n ± 0%   0.8050n ± 0%  -15.36% (p=0.000 n=10)
LeadingZeros64    1.1195n ± 0%   0.8423n ± 0%  -24.76% (p=0.000 n=10)
TrailingZeros     0.8086n ± 0%   0.8005n ± 0%   -1.00% (p=0.000 n=10)
TrailingZeros8     1.031n ± 1%    1.035n ± 1%        ~ (p=0.136 n=10)
TrailingZeros16   0.8114n ± 0%   0.8254n ± 1%   +1.73% (p=0.000 n=10)
TrailingZeros32   0.8090n ± 0%   0.8005n ± 0%   -1.05% (p=0.000 n=10)
TrailingZeros64   0.8089n ± 1%   0.8005n ± 0%   -1.04% (p=0.000 n=10)
OnesCount         0.8677n ± 0%   1.2010n ± 0%  +38.41% (p=0.000 n=10)
OnesCount8        0.8009n ± 0%   0.8004n ± 0%   -0.06% (p=0.000 n=10)
OnesCount16       0.9344n ± 0%   1.2010n ± 0%  +28.53% (p=0.000 n=10)
OnesCount32       0.8677n ± 0%   1.2010n ± 0%  +38.41% (p=0.000 n=10)
OnesCount64       1.2010n ± 0%   0.8671n ± 0%  -27.80% (p=0.000 n=10)
RotateLeft        0.8009n ± 0%   0.6671n ± 0%  -16.71% (p=0.000 n=10)
RotateLeft8        1.202n ± 0%    1.327n ± 0%  +10.40% (p=0.000 n=10)
RotateLeft16      0.8036n ± 0%   0.8218n ± 0%   +2.26% (p=0.000 n=10)
RotateLeft32      0.6674n ± 0%   0.8004n ± 0%  +19.94% (p=0.000 n=10)
RotateLeft64      0.6674n ± 0%   0.8004n ± 0%  +19.94% (p=0.000 n=10)
Reverse           0.4067n ± 1%   0.4122n ± 1%   +1.38% (p=0.001 n=10)
Reverse8          0.8009n ± 0%   0.8004n ± 0%   -0.06% (p=0.000 n=10)
Reverse16         0.8009n ± 0%   0.8005n ± 0%   -0.05% (p=0.000 n=10)
Reverse32         0.8009n ± 0%   0.8004n ± 0%   -0.06% (p=0.001 n=10)
Reverse64         0.8009n ± 0%   0.8004n ± 0%   -0.06% (p=0.008 n=10)
ReverseBytes      0.4057n ± 1%   0.4133n ± 1%   +1.90% (p=0.000 n=10)
ReverseBytes16    0.8009n ± 0%   0.8004n ± 0%   -0.07% (p=0.000 n=10)
ReverseBytes32    0.8009n ± 0%   0.8005n ± 0%   -0.05% (p=0.000 n=10)
ReverseBytes64    0.8009n ± 0%   0.8004n ± 0%   -0.06% (p=0.000 n=10)
Add                1.201n ± 0%    1.201n ± 0%        ~ (p=1.000 n=10)
Add32              1.201n ± 0%    1.201n ± 0%        ~ (p=0.474 n=10)
Add64              1.201n ± 0%    1.201n ± 0%        ~ (p=1.000 n=10)
Add64multiple      1.832n ± 0%    1.828n ± 0%   -0.22% (p=0.001 n=10)
Sub                1.201n ± 0%    1.201n ± 0%        ~ (p=1.000 n=10)
Sub32              1.602n ± 0%    1.601n ± 0%   -0.06% (p=0.000 n=10)
Sub64              1.201n ± 0%    1.201n ± 0%        ~ (p=0.474 n=10)
Sub64multiple      2.402n ± 0%    2.400n ± 0%   -0.10% (p=0.000 n=10)
Mul               0.8009n ± 0%   0.8004n ± 0%   -0.06% (p=0.000 n=10)
Mul32             0.8009n ± 0%   0.8004n ± 0%   -0.06% (p=0.000 n=10)
Mul64             0.8008n ± 0%   0.8004n ± 0%   -0.05% (p=0.000 n=10)
Div                9.083n ± 0%    7.638n ± 0%  -15.91% (p=0.000 n=10)
Div32              4.011n ± 0%    4.009n ± 0%   -0.05% (p=0.000 n=10)
Div64              9.711n ± 0%    8.204n ± 0%  -15.51% (p=0.000 n=10)
geomean            1.083n         1.078n        -0.40%

goos: linux
goarch: loong64
pkg: math/bits
cpu: Loongson-3A5000 @ 2500.00MHz
                |  CL 627855   |               this CL                |
                |    sec/op    |    sec/op     vs base                |
LeadingZeros       1.341n ± 4%    1.331n ± 2%   -0.71% (p=0.008 n=10)
LeadingZeros8      1.781n ± 0%    1.766n ± 1%   -0.84% (p=0.011 n=10)
LeadingZeros16     1.782n ± 0%    1.767n ± 0%   -0.79% (p=0.001 n=10)
LeadingZeros32     1.341n ± 1%    1.333n ± 0%   -0.52% (p=0.001 n=10)
LeadingZeros64     1.338n ± 0%    1.333n ± 0%   -0.37% (p=0.008 n=10)
TrailingZeros     0.9025n ± 0%   0.8077n ± 0%  -10.50% (p=0.000 n=10)
TrailingZeros8     1.056n ± 0%    1.089n ± 1%   +3.17% (p=0.001 n=10)
TrailingZeros16    1.101n ± 0%    1.102n ± 0%   +0.09% (p=0.011 n=10)
TrailingZeros32   0.9024n ± 1%   0.8083n ± 0%  -10.43% (p=0.000 n=10)
TrailingZeros64   0.9028n ± 1%   0.8087n ± 0%  -10.43% (p=0.000 n=10)
OnesCount          1.482n ± 1%    1.302n ± 0%  -12.15% (p=0.000 n=10)
OnesCount8         1.206n ± 0%    1.207n ± 2%   +0.12% (p=0.000 n=10)
OnesCount16        1.534n ± 0%    1.402n ± 0%   -8.58% (p=0.000 n=10)
OnesCount32        1.531n ± 1%    1.302n ± 0%  -14.99% (p=0.000 n=10)
OnesCount64        1.302n ± 0%    1.538n ± 1%  +18.16% (p=0.000 n=10)
RotateLeft        0.8083n ± 0%   0.8087n ± 1%        ~ (p=0.579 n=10)
RotateLeft8        1.310n ± 0%    1.323n ± 0%   +0.95% (p=0.001 n=10)
RotateLeft16       1.149n ± 0%    1.165n ± 1%   +1.35% (p=0.001 n=10)
RotateLeft32      0.8093n ± 0%   0.8105n ± 0%        ~ (p=0.393 n=10)
RotateLeft64      0.8088n ± 0%   0.8090n ± 0%        ~ (p=0.739 n=10)
Reverse           0.5109n ± 0%   0.5172n ± 1%   +1.25% (p=0.000 n=10)
Reverse8          0.8010n ± 0%   0.8011n ± 0%   +0.01% (p=0.000 n=10)
Reverse16         0.8010n ± 0%   0.8011n ± 0%   +0.01% (p=0.002 n=10)
Reverse32         0.8010n ± 0%   0.8011n ± 0%   +0.01% (p=0.000 n=10)
Reverse64         0.8010n ± 0%   0.8011n ± 0%   +0.01% (p=0.005 n=10)
ReverseBytes      0.5122n ± 2%   0.5182n ± 1%        ~ (p=0.060 n=10)
ReverseBytes16    0.8010n ± 0%   0.8011n ± 0%   +0.01% (p=0.005 n=10)
ReverseBytes32    0.8010n ± 0%   0.8011n ± 0%   +0.01% (p=0.005 n=10)
ReverseBytes64    0.8010n ± 0%   0.8011n ± 0%   +0.01% (p=0.001 n=10)
Add                1.201n ± 4%    1.202n ± 0%   +0.08% (p=0.028 n=10)
Add32              1.201n ± 0%    1.202n ± 2%   +0.08% (p=0.014 n=10)
Add64              1.201n ± 1%    1.202n ± 0%   +0.08% (p=0.025 n=10)
Add64multiple      1.902n ± 0%    1.913n ± 0%   +0.55% (p=0.004 n=10)
Sub                1.201n ± 0%    1.202n ± 3%   +0.08% (p=0.001 n=10)
Sub32              1.654n ± 0%    1.656n ± 1%        ~ (p=0.117 n=10)
Sub64              1.201n ± 0%    1.202n ± 0%   +0.08% (p=0.001 n=10)
Sub64multiple      2.180n ± 4%    2.159n ± 1%   -0.96% (p=0.006 n=10)
Mul               0.9345n ± 0%   0.9346n ± 0%   +0.01% (p=0.000 n=10)
Mul32              1.030n ± 0%    1.050n ± 1%   +1.94% (p=0.000 n=10)
Mul64             0.9345n ± 0%   0.9346n ± 1%   +0.01% (p=0.000 n=10)
Div                11.57n ± 1%    11.12n ± 0%   -3.85% (p=0.000 n=10)
Div32              4.337n ± 1%    4.341n ± 1%        ~ (p=0.286 n=10)
Div64              12.76n ± 0%    12.02n ± 3%   -5.80% (p=0.000 n=10)
geomean            1.252n         1.235n        -1.32%

Change-Id: Iec4cfd2b83bb0f946068c1d657369ff081d95b04
Reviewed-on: https://go-review.googlesource.com/c/go/+/628575
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: David Chase <drchase@google.com>
2025-03-12 18:18:03 -07:00
Jorropo 99411d7847 cmd/compile: compute bits.OnesCount's limits from argument's limits
Change-Id: Ia90d48ea0fab363c8592221fad88958b522edefe
Reviewed-on: https://go-review.googlesource.com/c/go/+/656159
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Jorropo <jorropo.pgm@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-03-11 20:17:36 -07:00
Jorropo c00647b49b cmd/compile: set bits.OnesCount's limits to [0, 64]
Change-Id: I2f60de836f58ef91baae856f44d8f73c190326f2
Reviewed-on: https://go-review.googlesource.com/c/go/+/656158
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Jorropo <jorropo.pgm@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
2025-03-11 19:52:29 -07:00
Jorropo 554a3c51dc cmd/compile: use min & max builtins to assert constant bounds in prove's tests
I've originally used |= and &= to setup assumptions exploitable by the
operation under test but theses have multiple issues making it poor
for this usecase:
- &= does not pass the minimum value as-is, rather always set it to 0
- |= rounds up the max value to a number of the same length with all ones set
- I've never implemented them to work with negative signed numbers

Change-Id: Ie43c576fb10393e69d6f989b048823daa02b1df8
Reviewed-on: https://go-review.googlesource.com/c/go/+/656160
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Jorropo <jorropo.pgm@gmail.com>
2025-03-11 19:51:59 -07:00
Jorropo d2842229fc cmd/compile: compute min's & max's limits from argument's limits inside flowLimit
Updates #68857

Change-Id: Ied07e656bba42f3b1b5f9b9f5442806aa2e7959b
Reviewed-on: https://go-review.googlesource.com/c/go/+/656157
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Jorropo <jorropo.pgm@gmail.com>
2025-03-11 19:51:31 -07:00
Michael Pratt 3fb8b4f3db all: move //go:debug decoratemappings=0 test to cmd/go
test/decoratemappingszero.go is intended to test that
//go:debug decoratemappings=0 disables annonations.

Unfortunately, //go:debug processing is handled by cmd/go, but
cmd/internal/testdir (which runs tests from test/) generally invokes the
compiler directly, thus it does not set default GODEBUGs.

Move this test to the cmd/go script tests, alongside the similar test
for language version.

Fixes #72772.

Cq-Include-Trybots: luci.golang.try:gotip-linux-ppc64le_power10
Change-Id: I6a6a636c9d380ef984f760be5689fdc7f5cb2aeb
Reviewed-on: https://go-review.googlesource.com/c/go/+/656795
Reviewed-by: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
2025-03-11 15:07:33 -07:00
Alexander Musman 6c70f2b960 cmd/compile: Enable inlining of tail calls
Enable inlining tail calls and do not limit emitting tail calls only to the
non-inlineable methods when generating wrappers. This change produces
additional code size reduction.

 Code size difference measured with this change (tried for x86_64):
    etcd binary:
    .text section size: 10613393 -> 10593841 (0.18%)
    total binary size:  33450787 -> 33424307 (0.07%)

    compile binary:
    .text section size: 10171025 -> 10126545 (0.43%)
    total binary size:  28241012 -> 28192628 (0.17%)

    cockroach binary:
    .text section size:  83947260 -> 83694140  (0.3%)
    total binary size:  263799808 -> 263534160 (0.1%)

Change-Id: I694f83cb838e64bd4c51f05b7b9f2bf0193bb551
Reviewed-on: https://go-review.googlesource.com/c/go/+/650455
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
2025-03-11 14:18:43 -07:00
Robert Griesemer ae4c13afc5 go/types, types2: report better error messages for slice expressions
Explicitly compute the common underlying type and while doing
so report better slice-expression relevant error messages.
Streamline message format for index and slice errors.

This removes the last uses of the coreString and match functions.
Delete them.

Change-Id: I4b50dda1ef7e2ab5e296021458f7f0b6f6e229cd
Reviewed-on: https://go-review.googlesource.com/c/go/+/655935
Reviewed-by: Robert Griesemer <gri@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Robert Findley <rfindley@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
2025-03-11 05:44:15 -07:00
Robert Griesemer 2d097e363a go/types, types2: better error messages for copy built-in
Rather than relying on coreString, use the new commonUnder function
to determine the argument slice element types.

Factor out this functionality, which is shared for append and copy,
into a new helper function sliceElem (similar to chanElem).
Use sliceElem for both the append and copy implementation.
As a result, the error messages for invalid copy calls are
now more detailed.

While at it, handle the special cases for append and copy first
because they don't need the slice element computation.

Finally, share the same type recording code for the special and
general cases.

As an aside, in commonUnder, be clearer in the code that the
result is either a nil type and an error, or a non-nil type
and a nil error. This matches in style what we do in sliceElem.

Change-Id: I318bafc0d2d31df04f33b1b464ad50d581918671
Reviewed-on: https://go-review.googlesource.com/c/go/+/655675
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Findley <rfindley@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
2025-03-10 21:30:51 -07:00
Xiaolin Zhao 2a772a2fe7 cmd/compile: optimize shifts of int32 and uint32 on loong64
goos: linux
goarch: loong64
pkg: math/bits
cpu: Loongson-3A6000-HV @ 2500.00MHz
                |  bench.old   |              bench.new               |
                |    sec/op    |    sec/op     vs base                |
LeadingZeros       1.100n ± 1%    1.101n ± 0%        ~ (p=0.566 n=10)
LeadingZeros8      1.501n ± 0%    1.502n ± 0%   +0.07% (p=0.000 n=10)
LeadingZeros16     1.501n ± 0%    1.502n ± 0%   +0.07% (p=0.000 n=10)
LeadingZeros32    1.2010n ± 0%   0.9511n ± 0%  -20.81% (p=0.000 n=10)
LeadingZeros64     1.104n ± 1%    1.119n ± 0%   +1.40% (p=0.000 n=10)
TrailingZeros     0.8137n ± 0%   0.8086n ± 0%   -0.63% (p=0.001 n=10)
TrailingZeros8     1.031n ± 1%    1.031n ± 1%        ~ (p=0.956 n=10)
TrailingZeros16   0.8204n ± 1%   0.8114n ± 0%   -1.11% (p=0.000 n=10)
TrailingZeros32   0.8145n ± 0%   0.8090n ± 0%   -0.68% (p=0.000 n=10)
TrailingZeros64   0.8159n ± 0%   0.8089n ± 1%   -0.86% (p=0.000 n=10)
OnesCount         0.8672n ± 0%   0.8677n ± 0%   +0.06% (p=0.000 n=10)
OnesCount8        0.8005n ± 0%   0.8009n ± 0%   +0.06% (p=0.000 n=10)
OnesCount16       0.9339n ± 0%   0.9344n ± 0%   +0.05% (p=0.000 n=10)
OnesCount32       0.8672n ± 0%   0.8677n ± 0%   +0.06% (p=0.000 n=10)
OnesCount64        1.201n ± 0%    1.201n ± 0%        ~ (p=0.474 n=10)
RotateLeft        0.8005n ± 0%   0.8009n ± 0%   +0.05% (p=0.000 n=10)
RotateLeft8        1.202n ± 0%    1.202n ± 0%        ~ (p=0.210 n=10)
RotateLeft16      0.8050n ± 0%   0.8036n ± 0%   -0.17% (p=0.002 n=10)
RotateLeft32      0.6674n ± 0%   0.6674n ± 0%        ~ (p=1.000 n=10)
RotateLeft64      0.6673n ± 0%   0.6674n ± 0%        ~ (p=0.072 n=10)
Reverse           0.4123n ± 0%   0.4067n ± 1%   -1.37% (p=0.000 n=10)
Reverse8          0.8005n ± 0%   0.8009n ± 0%   +0.05% (p=0.000 n=10)
Reverse16         0.8004n ± 0%   0.8009n ± 0%   +0.06% (p=0.000 n=10)
Reverse32         0.8004n ± 0%   0.8009n ± 0%   +0.06% (p=0.000 n=10)
Reverse64         0.8004n ± 0%   0.8009n ± 0%   +0.06% (p=0.001 n=10)
ReverseBytes      0.4100n ± 1%   0.4057n ± 1%   -1.06% (p=0.002 n=10)
ReverseBytes16    0.8004n ± 0%   0.8009n ± 0%   +0.07% (p=0.000 n=10)
ReverseBytes32    0.8005n ± 0%   0.8009n ± 0%   +0.05% (p=0.000 n=10)
ReverseBytes64    0.8005n ± 0%   0.8009n ± 0%   +0.05% (p=0.000 n=10)
Add                1.201n ± 0%    1.201n ± 0%        ~ (p=1.000 n=10)
Add32              1.201n ± 0%    1.201n ± 0%        ~ (p=0.474 n=10)
Add64              1.201n ± 0%    1.201n ± 0%        ~ (p=1.000 n=10)
Add64multiple      1.831n ± 0%    1.832n ± 0%        ~ (p=1.000 n=10)
Sub                1.201n ± 0%    1.201n ± 0%        ~ (p=1.000 n=10)
Sub32              1.601n ± 0%    1.602n ± 0%   +0.06% (p=0.000 n=10)
Sub64              1.201n ± 0%    1.201n ± 0%        ~ (p=0.474 n=10)
Sub64multiple      2.400n ± 0%    2.402n ± 0%   +0.10% (p=0.000 n=10)
Mul               0.8005n ± 0%   0.8009n ± 0%   +0.05% (p=0.000 n=10)
Mul32             0.8005n ± 0%   0.8009n ± 0%   +0.05% (p=0.000 n=10)
Mul64             0.8004n ± 0%   0.8008n ± 0%   +0.05% (p=0.000 n=10)
Div                9.107n ± 0%    9.083n ± 0%        ~ (p=0.255 n=10)
Div32              4.009n ± 0%    4.011n ± 0%   +0.05% (p=0.000 n=10)
Div64              9.705n ± 0%    9.711n ± 0%   +0.06% (p=0.000 n=10)
geomean            1.089n         1.083n        -0.62%

goos: linux
goarch: loong64
pkg: math/bits
cpu: Loongson-3A5000 @ 2500.00MHz
                |  bench.old   |              bench.new               |
                |    sec/op    |    sec/op     vs base                |
LeadingZeros       1.352n ± 0%    1.341n ± 4%   -0.81% (p=0.024 n=10)
LeadingZeros8      1.766n ± 0%    1.781n ± 0%   +0.88% (p=0.000 n=10)
LeadingZeros16     1.766n ± 0%    1.782n ± 0%   +0.88% (p=0.000 n=10)
LeadingZeros32     1.536n ± 0%    1.341n ± 1%  -12.73% (p=0.000 n=10)
LeadingZeros64     1.351n ± 1%    1.338n ± 0%   -0.96% (p=0.000 n=10)
TrailingZeros     0.9037n ± 0%   0.9025n ± 0%   -0.12% (p=0.020 n=10)
TrailingZeros8     1.087n ± 3%    1.056n ± 0%        ~ (p=0.060 n=10)
TrailingZeros16    1.101n ± 0%    1.101n ± 0%        ~ (p=0.211 n=10)
TrailingZeros32   0.9040n ± 0%   0.9024n ± 1%   -0.18% (p=0.017 n=10)
TrailingZeros64   0.9043n ± 0%   0.9028n ± 1%        ~ (p=0.118 n=10)
OnesCount          1.503n ± 2%    1.482n ± 1%   -1.43% (p=0.001 n=10)
OnesCount8         1.207n ± 0%    1.206n ± 0%   -0.12% (p=0.000 n=10)
OnesCount16        1.501n ± 0%    1.534n ± 0%   +2.13% (p=0.000 n=10)
OnesCount32        1.483n ± 1%    1.531n ± 1%   +3.27% (p=0.000 n=10)
OnesCount64        1.301n ± 0%    1.302n ± 0%   +0.08% (p=0.000 n=10)
RotateLeft        0.8136n ± 4%   0.8083n ± 0%   -0.66% (p=0.002 n=10)
RotateLeft8        1.311n ± 0%    1.310n ± 0%        ~ (p=0.786 n=10)
RotateLeft16       1.165n ± 0%    1.149n ± 0%   -1.33% (p=0.001 n=10)
RotateLeft32      0.8138n ± 1%   0.8093n ± 0%   -0.57% (p=0.017 n=10)
RotateLeft64      0.8149n ± 1%   0.8088n ± 0%   -0.74% (p=0.000 n=10)
Reverse           0.5195n ± 1%   0.5109n ± 0%   -1.67% (p=0.000 n=10)
Reverse8          0.8007n ± 0%   0.8010n ± 0%   +0.04% (p=0.000 n=10)
Reverse16         0.8007n ± 0%   0.8010n ± 0%   +0.04% (p=0.000 n=10)
Reverse32         0.8007n ± 0%   0.8010n ± 0%   +0.04% (p=0.012 n=10)
Reverse64         0.8007n ± 0%   0.8010n ± 0%   +0.04% (p=0.010 n=10)
ReverseBytes      0.5120n ± 1%   0.5122n ± 2%        ~ (p=0.306 n=10)
ReverseBytes16    0.8007n ± 0%   0.8010n ± 0%   +0.04% (p=0.000 n=10)
ReverseBytes32    0.8007n ± 0%   0.8010n ± 0%   +0.04% (p=0.000 n=10)
ReverseBytes64    0.8007n ± 0%   0.8010n ± 0%   +0.04% (p=0.000 n=10)
Add                1.201n ± 0%    1.201n ± 4%        ~ (p=0.334 n=10)
Add32              1.201n ± 0%    1.201n ± 0%        ~ (p=0.563 n=10)
Add64              1.201n ± 0%    1.201n ± 1%        ~ (p=0.652 n=10)
Add64multiple      1.909n ± 0%    1.902n ± 0%        ~ (p=0.126 n=10)
Sub                1.201n ± 0%    1.201n ± 0%        ~ (p=1.000 n=10)
Sub32              1.655n ± 0%    1.654n ± 0%        ~ (p=0.589 n=10)
Sub64              1.201n ± 0%    1.201n ± 0%        ~ (p=1.000 n=10)
Sub64multiple      2.150n ± 0%    2.180n ± 4%   +1.37% (p=0.000 n=10)
Mul               0.9341n ± 0%   0.9345n ± 0%   +0.04% (p=0.011 n=10)
Mul32              1.053n ± 0%    1.030n ± 0%   -2.23% (p=0.000 n=10)
Mul64             0.9341n ± 0%   0.9345n ± 0%   +0.04% (p=0.018 n=10)
Div                11.59n ± 0%    11.57n ± 1%        ~ (p=0.091 n=10)
Div32              4.337n ± 0%    4.337n ± 1%        ~ (p=0.783 n=10)
Div64              12.81n ± 0%    12.76n ± 0%   -0.39% (p=0.001 n=10)
geomean            1.257n         1.252n        -0.46%

Change-Id: I9e93ea49736760c19dc6b6463d2aa95878121b7b
Reviewed-on: https://go-review.googlesource.com/c/go/+/627855
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
2025-03-10 17:55:10 -07:00
Michael Pratt c40a3731f4 internal/godebugs: add decoratemappings as an opaque godebug setting
This adds a new godebug to control whether the runtime applies the
anonymous memory mapping annotations added in https://go.dev/cl/646095.
It is enabled by default.

This has several effects:

* The feature is only enabled by default when the main go.mod has go >=
  1.25.
* This feature can be disabled with GODEBUG=decoratemappings=0, or the
  equivalents in go.mod or package main. See https://go.dev/doc/godebug.
* As an opaque setting, this option will not appear in runtime/metrics.
* This setting is non-atomic, so it cannot be changed after startup.

I am not 100% sure about my decision for the last two points.

I've made this an opaque setting because it affects every memory mapping
the runtime performs. Thus every mapping would report "non-default
behavior", which doesn't seem useful.

This setting could trivially be atomic and allow changes at run time,
but those changes would only affect future mappings. That seems
confusing and not helpful. On the other hand, going back to annotate or
unannotate every previous mapping when the setting changes is
unwarranted complexity.

For #71546.

Change-Id: I6a6a636c5ad551d76691cba2a6f668d5cff0e352
Reviewed-on: https://go-review.googlesource.com/c/go/+/655895
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
2025-03-10 08:29:59 -07:00
Robert Griesemer 584e631023 go/types, types2: better error messages for invalid calls
Rather than reporting "non-function" for an invalid type parameter,
report which type in the type parameter's type set is not a function.

Change-Id: I8beec25cc337bae8e03d23e62d97aa82db46bab4
Reviewed-on: https://go-review.googlesource.com/c/go/+/654475
Reviewed-by: Robert Griesemer <gri@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Findley <rfindley@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-03-06 13:35:49 -08:00
David Chase 1cf6b50263 cmd/compile: remove no-longer-necessary recursive inlining checks
this does result in a little bit more inlining,
cmd/compile text is 0.5% larger,
bent-benchmark text geomeans grow by only 0.02%.
some of our tests make assumptions about inlining.

Change-Id: I999d1798aca5dc64a1928bd434258a61e702951a
Reviewed-on: https://go-review.googlesource.com/c/go/+/655157
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2025-03-06 10:07:17 -08:00
David Chase cad4dca518 cmd/compile: use inline-Pos-based recursion test
Look at the inlining stack of positions for a call site,
if the line/col/file of the call site appears in that
stack, do not inline.  This subsumes all the other
recently-added recursive inlining checks, but they are
left in to make this easier+safer to backport.

Fixes #72090

Change-Id: I0f487bb0d4c514015907c649312672b7be464abd
Reviewed-on: https://go-review.googlesource.com/c/go/+/655155
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2025-03-05 15:02:38 -08:00
Cuong Manh Le 4f45b2b7e0 cmd/compile: fix out of memory when inlining closure
CL 629195 strongly favor closure inlining, allowing closures to be
inlined more aggressively.

However, if the closure body contains a call to a function, which itself
is one of the call arguments, it causes the infinite inlining.

Fixing this by prevent this kind of functions from being inlinable.

Fixes #72063

Change-Id: I5fb5723a819b1e2c5aadb57c1023ec84ca9fa53c
Reviewed-on: https://go-review.googlesource.com/c/go/+/654195
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-03-04 03:10:17 -08:00
Joel Sing 927fdb7843 cmd/compile: simplify intrinsification of TrailingZeros16 and TrailingZeros8
Decompose Ctz16 and Ctz8 within the SSA rules for LOONG64, MIPS, PPC64
and S390X, rather than having a custom intrinsic. Note that for PPC64 this
actually allows the existing Ctz16 and Ctz8 rules to be used.

Change-Id: I27a5e978f852b9d75396d2a80f5d7dfcb5ef7dd4
Reviewed-on: https://go-review.googlesource.com/c/go/+/651816
Reviewed-by: Paul Murphy <murp@ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Joel Sing <joel@sing.id.au>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
2025-02-27 03:45:44 -08:00
Keith Randall 8b8bff7bb2 cmd/compile: don't pull constant offsets out of pointer arithmetic
This could lead to manufacturing a pointer that points outside
its original allocation.

Bug was introduced in CL 629858.

Fixes #71932

Change-Id: Ia86ab0b65ce5f80a8e0f4f4c81babd07c5904f8d
Reviewed-on: https://go-review.googlesource.com/c/go/+/652078
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2025-02-26 09:39:12 -08:00
khr@golang.org cc16fb52e6 cmd/compile: ensure we don't reuse temporary register
Before this CL, we could use the same register for both a temporary
register and for moving a value in the output register out of the way.

Fixes #71857

Change-Id: Iefbfd9d4139136174570d8aadf8a0fb391791ea9
Reviewed-on: https://go-review.googlesource.com/c/go/+/651221
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
2025-02-25 10:19:25 -08:00
Jorropo 00635de759 cmd/compile: don't report newLimit discovered when unsat happens multiple times
Fixes #71852

Change-Id: I696fcb8fc8c0c2e5e5ae6ab50596f6bdb9b7d498
Reviewed-on: https://go-review.googlesource.com/c/go/+/650975
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2025-02-20 09:44:28 -08:00
Mateusz Poliwczak 43e6525986 cmd/compile: load properly constant values from itabs
While looking at the SSA of following code, i noticed
that these rules do not work properly, and the types
are loaded indirectly through an itab, instead of statically.

type M interface{ M() }
type A interface{ A() }

type Impl struct{}
func (*Impl) M() {}
func (*Impl) A() {}

func main() {
        var a M = &Impl{}
        a.(A).A()
}

Change-Id: Ia275993f81a2e7302102d4ff87ac28586023d13c
GitHub-Last-Rev: 4bfc901917
GitHub-Pull-Request: golang/go#71784
Reviewed-on: https://go-review.googlesource.com/c/go/+/649500
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
2025-02-19 13:39:00 -08:00
Mateusz Poliwczak ba3f568988 cmd/compile: determine static values of len and cap in make() calls
This change improves escape analysis by attempting to
deduce static values for the len and cap parameters,
allowing allocations to be made on the stack.

Change-Id: I1161019aed9f60cf2c2fe4d405da94ad415231ac
GitHub-Last-Rev: d78c1b4ca5
GitHub-Pull-Request: golang/go#71693
Reviewed-on: https://go-review.googlesource.com/c/go/+/649035
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
2025-02-19 13:38:58 -08:00
David Chase 9ddeac30b5 cmd/compile, runtime: use deferreturn as target PC for recover from deferrangefunc
The existing code for recover from deferrangefunc was broken in
several ways.

1. the code following a deferrangefunc call did not check the return
value for an out-of-band value indicating "return now" (i.e., recover
was called)

2. the returned value was delivered using a bespoke ABI that happened
to match on register-ABI platforms, but not on older stack-based
ABI.

3. the returned value was the wrong width (1 word versus 2) and
type/value(integer 1, not a pointer to anything) for deferrangefunc's
any-typed return value (in practice, the OOB value check could catch
this, but still, it's sketchy).

This -- using the deferreturn lookup method already in place for
open-coded defers -- turned out to be a much-less-ugly way of
obtaining the desired transfer of control for recover().

TODO: we also could do this for regular defer, and delete some code.

Fixes #71675

Change-Id: If7d7ea789ad4320821aab3b443759a7d71647ff0
Reviewed-on: https://go-review.googlesource.com/c/go/+/650476
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2025-02-19 08:27:06 -08:00
Cuong Manh Le 34073a736a cmd/compile: avoid infinite recursion when inlining closures
CL 630696 changes budget for once-called closures, making them more
inlinable. However, when recursive inlining involve both the closure and
its parent, the inliner goes into an infinite loop:

	parent (a closure)  -> closure -> parent -> ...

The problem here dues to the closure name mangling, causing the inlined
checking condition failed, since the closure name affects how the
linker symbol generated.

To fix this, just prevent the closure from inlining its parent into
itself, avoid the infinite inlining loop.

Fixes #71680

Change-Id: Ib27626d70f95e5f1c24a3eb1c8e6c3443b7d90c8
Reviewed-on: https://go-review.googlesource.com/c/go/+/649656
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-02-18 17:52:51 -08:00
Jakub Ciolek d524e1eccd cmd/compile: on AMD64, turn x < 128 into x <= 127
x < 128 -> x <= 127
x >= 128 -> x > 127

This allows for shorter encoding as 127 fits into
a single-byte immediate.

archive/tar benchmark (Alder Lake 12600K)

name              old time/op    new time/op    delta
/Writer/USTAR-16    1.46µs ± 0%    1.32µs ± 0%  -9.43%  (p=0.008 n=5+5)
/Writer/GNU-16      1.85µs ± 1%    1.79µs ± 1%  -3.47%  (p=0.008 n=5+5)
/Writer/PAX-16      3.21µs ± 0%    3.11µs ± 2%  -2.96%  (p=0.008 n=5+5)
/Reader/USTAR-16    1.38µs ± 1%    1.37µs ± 0%    ~     (p=0.127 n=5+4)
/Reader/GNU-16       798ns ± 1%     800ns ± 2%    ~     (p=0.548 n=5+5)
/Reader/PAX-16      3.07µs ± 1%    3.00µs ± 0%  -2.35%  (p=0.008 n=5+5)
[Geo mean]          1.76µs         1.70µs       -3.15%

compilecmp:

hash/maphash
hash/maphash.(*Hash).Write 517 -> 510  (-1.35%)

runtime
runtime.traceReadCPU 1626 -> 1615  (-0.68%)

runtime [cmd/compile]
runtime.traceReadCPU 1626 -> 1615  (-0.68%)

math/rand/v2
type:.eq.[128]float32 65 -> 59  (-9.23%)

bytes
bytes.trimLeftUnicode 378 -> 373  (-1.32%)
bytes.IndexAny 1189 -> 1157  (-2.69%)
bytes.LastIndexAny 1256 -> 1239  (-1.35%)
bytes.lastIndexFunc 263 -> 261  (-0.76%)

strings
strings.FieldsFuncSeq.func1 411 -> 399  (-2.92%)
strings.EqualFold 625 -> 624  (-0.16%)
strings.trimLeftUnicode 248 -> 231  (-6.85%)

math/rand
type:.eq.[128]float32 65 -> 59  (-9.23%)

bytes [cmd/compile]
bytes.LastIndexAny 1256 -> 1239  (-1.35%)
bytes.lastIndexFunc 263 -> 261  (-0.76%)
bytes.trimLeftUnicode 378 -> 373  (-1.32%)
bytes.IndexAny 1189 -> 1157  (-2.69%)

regexp/syntax
regexp/syntax.(*parser).parseEscape 1113 -> 1102  (-0.99%)

math/rand/v2 [cmd/compile]
type:.eq.[128]float32 65 -> 59  (-9.23%)

strings [cmd/compile]
strings.EqualFold 625 -> 624  (-0.16%)
strings.FieldsFuncSeq.func1 411 -> 399  (-2.92%)
strings.trimLeftUnicode 248 -> 231  (-6.85%)

math/rand [cmd/compile]
type:.eq.[128]float32 65 -> 59  (-9.23%)

regexp
regexp.(*inputString).context 198 -> 197  (-0.51%)
regexp.(*inputBytes).context 221 -> 212  (-4.07%)

image/jpeg
image/jpeg.(*decoder).processDQT 500 -> 491  (-1.80%)

regexp/syntax [cmd/compile]
regexp/syntax.(*parser).parseEscape 1113 -> 1102  (-0.99%)

regexp [cmd/compile]
regexp.(*inputString).context 198 -> 197  (-0.51%)
regexp.(*inputBytes).context 221 -> 212  (-4.07%)

encoding/csv
encoding/csv.(*Writer).fieldNeedsQuotes 269 -> 266  (-1.12%)

cmd/vendor/golang.org/x/sys/unix
type:.eq.[131]struct 855 -> 823  (-3.74%)

vendor/golang.org/x/text/unicode/norm
vendor/golang.org/x/text/unicode/norm.nextDecomposed 4831 -> 4826  (-0.10%)
vendor/golang.org/x/text/unicode/norm.(*Iter).returnSlice 281 -> 275  (-2.14%)

vendor/golang.org/x/text/secure/bidirule
vendor/golang.org/x/text/secure/bidirule.init.0 85 -> 83  (-2.35%)

go/scanner
go/scanner.isDigit 100 -> 98  (-2.00%)
go/scanner.(*Scanner).next 431 -> 422  (-2.09%)
go/scanner.isLetter 142 -> 124  (-12.68%)

encoding/asn1
encoding/asn1.parseTagAndLength 1189 -> 1182  (-0.59%)
encoding/asn1.makeField 3481 -> 3463  (-0.52%)

text/scanner
text/scanner.(*Scanner).next 1242 -> 1236  (-0.48%)

archive/tar
archive/tar.isASCII 133 -> 127  (-4.51%)
archive/tar.(*Writer).writeRawFile 1206 -> 1198  (-0.66%)
archive/tar.(*Reader).readHeader.func1 9 -> 7  (-22.22%)
archive/tar.toASCII 393 -> 383  (-2.54%)
archive/tar.splitUSTARPath 405 -> 396  (-2.22%)
archive/tar.(*Writer).writePAXHeader.func1 627 -> 620  (-1.12%)

text/template
text/template.jsIsSpecial 59 -> 57  (-3.39%)

go/doc
go/doc.assumedPackageName 714 -> 701  (-1.82%)

vendor/golang.org/x/net/http/httpguts
vendor/golang.org/x/net/http/httpguts.headerValueContainsToken 965 -> 952  (-1.35%)
vendor/golang.org/x/net/http/httpguts.tokenEqual 280 -> 269  (-3.93%)
vendor/golang.org/x/net/http/httpguts.IsTokenRune 28 -> 26  (-7.14%)

net/mail
net/mail.isVchar 26 -> 24  (-7.69%)
net/mail.isAtext 106 -> 104  (-1.89%)
net/mail.(*Address).String 1084 -> 1052  (-2.95%)
net/mail.isQtext 39 -> 37  (-5.13%)
net/mail.isMultibyte 9 -> 7  (-22.22%)
net/mail.isDtext 45 -> 43  (-4.44%)
net/mail.(*addrParser).consumeQuotedString 1050 -> 1029  (-2.00%)
net/mail.quoteString 741 -> 714  (-3.64%)

cmd/internal/obj/s390x
cmd/internal/obj/s390x.preprocess 6405 -> 6393  (-0.19%)

cmd/internal/obj/x86
cmd/internal/obj/x86.toDisp8 303 -> 301  (-0.66%)

fmt [cmd/compile]
fmt.Fprintf 4726 -> 4662  (-1.35%)

go/scanner [cmd/compile]
go/scanner.(*Scanner).next 431 -> 422  (-2.09%)
go/scanner.isLetter 142 -> 124  (-12.68%)
go/scanner.isDigit 100 -> 98  (-2.00%)

cmd/compile/internal/syntax
cmd/compile/internal/syntax.(*source).nextch 879 -> 847  (-3.64%)

cmd/vendor/golang.org/x/mod/module
cmd/vendor/golang.org/x/mod/module.checkElem 1253 -> 1235  (-1.44%)
cmd/vendor/golang.org/x/mod/module.escapeString 519 -> 517  (-0.39%)

go/doc [cmd/compile]
go/doc.assumedPackageName 714 -> 701  (-1.82%)

cmd/compile/internal/syntax [cmd/compile]
cmd/compile/internal/syntax.(*scanner).escape 1965 -> 1933  (-1.63%)
cmd/compile/internal/syntax.(*scanner).next 8975 -> 8847  (-1.43%)

cmd/internal/obj/s390x [cmd/compile]
cmd/internal/obj/s390x.preprocess 6405 -> 6393  (-0.19%)

cmd/internal/obj/x86 [cmd/compile]
cmd/internal/obj/x86.toDisp8 303 -> 301  (-0.66%)

cmd/internal/gcprog
cmd/internal/gcprog.(*Writer).Repeat 688 -> 677  (-1.60%)
cmd/internal/gcprog.(*Writer).varint 442 -> 439  (-0.68%)

cmd/compile/internal/ir
cmd/compile/internal/ir.splitPkg 331 -> 325  (-1.81%)

cmd/compile/internal/ir [cmd/compile]
cmd/compile/internal/ir.splitPkg 331 -> 325  (-1.81%)

net/http
net/http.containsDotDot.FieldsFuncSeq.func1 411 -> 399  (-2.92%)
net/http.isNotToken 33 -> 30  (-9.09%)
net/http.containsDotDot 606 -> 588  (-2.97%)
net/http.isCookieNameValid 197 -> 191  (-3.05%)
net/http.parsePattern 4330 -> 4317  (-0.30%)
net/http.ParseCookie 1099 -> 1096  (-0.27%)
net/http.validMethod 197 -> 187  (-5.08%)

cmd/vendor/golang.org/x/text/unicode/norm
cmd/vendor/golang.org/x/text/unicode/norm.(*Iter).returnSlice 281 -> 275  (-2.14%)
cmd/vendor/golang.org/x/text/unicode/norm.nextDecomposed 4831 -> 4826  (-0.10%)

net/http/cookiejar
net/http/cookiejar.encode 1936 -> 1918  (-0.93%)

expvar
expvar.appendJSONQuote 972 -> 965  (-0.72%)

cmd/cgo/internal/test
cmd/cgo/internal/test.stack128 116 -> 114  (-1.72%)

cmd/vendor/rsc.io/markdown
cmd/vendor/rsc.io/markdown.newATXHeading 1637 -> 1628  (-0.55%)
cmd/vendor/rsc.io/markdown.isUnicodePunct 197 -> 179  (-9.14%)

Change-Id: I578bdf42ef229d687d526e378d697ced51e1880c
Reviewed-on: https://go-review.googlesource.com/c/go/+/639935
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@google.com>
2025-02-16 07:23:13 -08:00
Keith Randall beac2f7d3b cmd/compile: fix sign extension of paired 32-bit loads on arm64
Fixes #71759

Change-Id: Iab05294ac933cc9972949158d3fe2bdc3073df5e
Reviewed-on: https://go-review.googlesource.com/c/go/+/649895
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2025-02-15 07:53:28 -08:00
Keith Randall 187fd2698d cmd/compile: make write barrier code amenable to paired loads/stores
It currently isn't because it does load/store/load/store/...
Rework to do overwrite processing in pairs so it is instead
load/load/store/store/...

Change-Id: If7be629bc4048da5f2386dafb8f05759b79e9e2b
Reviewed-on: https://go-review.googlesource.com/c/go/+/631495
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-02-13 14:08:14 -08:00
Keith Randall a0029e95e5 cmd/compile: regalloc: handle desired registers of 2-output insns
Particularly with 2-word load instructions, this becomes important.
Classic example is:

    func f(p *string) string {
        return *p
    }

We want the two loads to put the return values directly into
the two ABI return registers.

At this point in the stack, cmd/go is 1.1% smaller.

Change-Id: I51fd1710238e81d15aab2bfb816d73c8e7c207b1
Reviewed-on: https://go-review.googlesource.com/c/go/+/631137
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-02-13 14:08:07 -08:00
khr@golang.org 20d7c57422 cmd/compile: pair loads and stores on arm64
Look for possible paired load/store operations on arm64.
I don't expect this would be a lot faster, but it will save
binary space, and indirectly through the icache at least a bit
of time.

Change-Id: I4dd73b0e6329c4659b7453998f9b75320fcf380b
Reviewed-on: https://go-review.googlesource.com/c/go/+/629256
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
2025-02-13 14:07:47 -08:00
Keith Randall 89c2f282dc cmd/compile: move []byte->string map key optimization to ssa
If we call slicebytetostring immediately (with no intervening writes)
before calling map access or delete functions with the resulting
string as the key, then we can just use the ptr/len of the
slicebytetostring argument as the key. This avoids an allocation.

Fixes #44898
Update #71132

There's old code in cmd/compile/internal/walk/order.go that handles
some of these cases.

1. m[string(b)]
2. s := string(b); m[s]
3. m[[2]string{string(b1),string(b2)}]

The old code handled cases 1&3. The new code handles cases 1&2.
We'll leave the old code around to keep 3 working, although it seems
not terribly common.

Case 2 happens particularly after inlining, so it is pretty common.

Change-Id: I8913226ca79d2c65f4e2bd69a38ac8c976a57e43
Reviewed-on: https://go-review.googlesource.com/c/go/+/640656
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-02-13 13:03:07 -08:00
Jakub Ciolek 43b7e67040 cmd/compile: lower x*z + y to FMA if FMA enabled
There is a generic opcode for FMA, but we don't use it in rewrite rules.
This is maybe because some archs, like WASM and MIPS don't have a late
lowering rule for it.

Fixes #71204

Intel Alder Lake 12600k (GOAMD64=v3):

math:

name                    old time/op  new time/op  delta
Acos-16                 4.58ns ± 0%  3.36ns ± 0%  -26.68%  (p=0.008 n=5+5)
Acosh-16                8.04ns ± 1%  6.44ns ± 0%  -19.95%  (p=0.008 n=5+5)
Asin-16                 4.28ns ± 0%  3.32ns ± 0%  -22.24%  (p=0.008 n=5+5)
Asinh-16                9.92ns ± 0%  8.62ns ± 0%  -13.13%  (p=0.008 n=5+5)
Atan-16                 2.31ns ± 0%  1.84ns ± 0%  -20.02%  (p=0.008 n=5+5)
Atanh-16                7.79ns ± 0%  7.03ns ± 0%   -9.67%  (p=0.008 n=5+5)
Atan2-16                3.93ns ± 0%  3.52ns ± 0%  -10.35%  (p=0.000 n=5+4)
Cbrt-16                 4.62ns ± 0%  4.41ns ± 0%   -4.57%  (p=0.016 n=4+5)
Ceil-16                 0.14ns ± 1%  0.14ns ± 2%     ~     (p=0.103 n=5+5)
Copysign-16             0.33ns ± 0%  0.33ns ± 0%   +0.03%  (p=0.029 n=4+4)
Cos-16                  4.87ns ± 0%  4.75ns ± 0%   -2.44%  (p=0.016 n=5+4)
Cosh-16                 4.86ns ± 0%  4.86ns ± 0%     ~     (p=0.317 n=5+5)
Erf-16                  2.71ns ± 0%  2.25ns ± 0%  -16.69%  (p=0.008 n=5+5)
Erfc-16                 3.06ns ± 0%  2.67ns ± 0%  -13.00%  (p=0.016 n=5+4)
Erfinv-16               3.88ns ± 0%  2.84ns ± 3%  -26.83%  (p=0.008 n=5+5)
Erfcinv-16              4.08ns ± 0%  3.01ns ± 1%  -26.27%  (p=0.008 n=5+5)
Exp-16                  3.29ns ± 0%  3.37ns ± 2%   +2.64%  (p=0.016 n=4+5)
ExpGo-16                8.44ns ± 0%  7.48ns ± 1%  -11.37%  (p=0.008 n=5+5)
Expm1-16                4.46ns ± 0%  3.69ns ± 2%  -17.26%  (p=0.016 n=4+5)
Exp2-16                 8.20ns ± 0%  7.39ns ± 2%   -9.94%  (p=0.008 n=5+5)
Exp2Go-16               8.26ns ± 0%  7.23ns ± 0%  -12.49%  (p=0.016 n=4+5)
Abs-16                  0.26ns ± 3%  0.22ns ± 1%  -16.34%  (p=0.008 n=5+5)
Dim-16                  0.38ns ± 1%  0.40ns ± 2%   +5.02%  (p=0.008 n=5+5)
Floor-16                0.11ns ± 1%  0.17ns ± 4%  +54.99%  (p=0.008 n=5+5)
Max-16                  1.24ns ± 0%  1.24ns ± 0%     ~     (p=0.619 n=5+5)
Min-16                  1.24ns ± 0%  1.24ns ± 0%     ~     (p=0.484 n=5+5)
Mod-16                  13.4ns ± 1%  12.8ns ± 0%   -4.21%  (p=0.016 n=5+4)
Frexp-16                1.70ns ± 0%  1.71ns ± 0%   +0.46%  (p=0.008 n=5+5)
Gamma-16                3.97ns ± 0%  3.97ns ± 0%     ~     (p=0.643 n=5+5)
Hypot-16                2.11ns ± 0%  2.11ns ± 0%     ~     (p=0.762 n=5+5)
HypotGo-16              2.48ns ± 4%  2.26ns ± 0%   -8.94%  (p=0.008 n=5+5)
Ilogb-16                1.67ns ± 0%  1.67ns ± 0%   -0.07%  (p=0.048 n=5+5)
J0-16                   19.8ns ± 0%  19.3ns ± 0%     ~     (p=0.079 n=4+5)
J1-16                   19.4ns ± 0%  18.9ns ± 0%   -2.63%  (p=0.000 n=5+4)
Jn-16                   41.5ns ± 0%  40.6ns ± 0%   -2.32%  (p=0.016 n=4+5)
Ldexp-16                2.26ns ± 0%  2.26ns ± 0%     ~     (p=0.683 n=5+5)
Lgamma-16               4.40ns ± 0%  4.21ns ± 0%   -4.21%  (p=0.008 n=5+5)
Log-16                  4.05ns ± 0%  4.05ns ± 0%     ~     (all equal)
Logb-16                 1.69ns ± 0%  1.69ns ± 0%     ~     (p=0.429 n=5+5)
Log1p-16                5.00ns ± 0%  3.99ns ± 0%  -20.14%  (p=0.008 n=5+5)
Log10-16                4.22ns ± 0%  4.21ns ± 0%   -0.15%  (p=0.008 n=5+5)
Log2-16                 2.27ns ± 0%  2.25ns ± 0%   -0.94%  (p=0.008 n=5+5)
Modf-16                 1.44ns ± 0%  1.44ns ± 0%     ~     (p=0.492 n=5+5)
Nextafter32-16          2.09ns ± 0%  2.09ns ± 0%     ~     (p=0.079 n=4+5)
Nextafter64-16          2.09ns ± 0%  2.09ns ± 0%     ~     (p=0.095 n=4+5)
PowInt-16               10.8ns ± 0%  10.8ns ± 0%     ~     (all equal)
PowFrac-16              25.3ns ± 0%  25.3ns ± 0%   -0.09%  (p=0.000 n=5+4)
Pow10Pos-16             0.52ns ± 1%  0.52ns ± 0%     ~     (p=0.810 n=5+5)
Pow10Neg-16             0.82ns ± 0%  0.82ns ± 0%     ~     (p=0.381 n=5+5)
Round-16                0.93ns ± 0%  0.93ns ± 0%     ~     (p=0.056 n=5+5)
RoundToEven-16          1.64ns ± 0%  1.64ns ± 0%     ~     (all equal)
Remainder-16            12.4ns ± 2%  12.0ns ± 0%   -3.27%  (p=0.008 n=5+5)
Signbit-16              0.37ns ± 0%  0.37ns ± 0%   -0.19%  (p=0.008 n=5+5)
Sin-16                  4.04ns ± 0%  3.92ns ± 0%   -3.13%  (p=0.000 n=4+5)
Sincos-16               5.99ns ± 0%  5.80ns ± 0%   -3.03%  (p=0.008 n=5+5)
Sinh-16                 5.22ns ± 0%  5.22ns ± 0%     ~     (p=0.651 n=5+4)
SqrtIndirect-16         0.41ns ± 0%  0.41ns ± 0%     ~     (p=0.333 n=4+5)
SqrtLatency-16          2.66ns ± 0%  2.66ns ± 0%     ~     (p=0.079 n=4+5)
SqrtIndirectLatency-16  2.66ns ± 0%  2.66ns ± 0%     ~     (p=1.000 n=5+5)
SqrtGoLatency-16        30.1ns ± 0%  28.6ns ± 1%   -4.84%  (p=0.008 n=5+5)
SqrtPrime-16             645ns ± 0%   645ns ± 0%     ~     (p=0.095 n=5+4)
Tan-16                  4.21ns ± 0%  4.09ns ± 0%   -2.76%  (p=0.029 n=4+4)
Tanh-16                 5.36ns ± 0%  5.36ns ± 0%     ~     (p=0.444 n=5+5)
Trunc-16                0.12ns ± 6%  0.11ns ± 1%   -6.79%  (p=0.008 n=5+5)
Y0-16                   19.2ns ± 0%  18.7ns ± 0%   -2.52%  (p=0.000 n=5+4)
Y1-16                   19.1ns ± 0%  18.4ns ± 0%     ~     (p=0.079 n=4+5)
Yn-16                   40.7ns ± 0%  39.5ns ± 0%   -2.82%  (p=0.008 n=5+5)
Float64bits-16          0.21ns ± 0%  0.21ns ± 0%     ~     (p=0.603 n=5+5)
Float64frombits-16      0.21ns ± 0%  0.21ns ± 0%     ~     (p=0.984 n=4+5)
Float32bits-16          0.21ns ± 0%  0.21ns ± 0%     ~     (p=0.778 n=4+5)
Float32frombits-16      0.21ns ± 0%  0.20ns ± 0%     ~     (p=0.397 n=5+5)
FMA-16                  0.82ns ± 0%  0.82ns ± 0%   +0.02%  (p=0.029 n=4+4)
[Geo mean]              2.87ns       2.74ns        -4.61%

math/cmplx:

name        old time/op  new time/op  delta
Abs-16      2.07ns ± 0%  2.05ns ± 0%   -0.70%  (p=0.016 n=5+4)
Acos-16     36.5ns ± 0%  35.7ns ± 0%   -2.33%  (p=0.029 n=4+4)
Acosh-16    37.0ns ± 0%  36.2ns ± 0%   -2.20%  (p=0.008 n=5+5)
Asin-16     36.5ns ± 0%  35.7ns ± 0%   -2.29%  (p=0.008 n=5+5)
Asinh-16    33.5ns ± 0%  31.6ns ± 0%   -5.51%  (p=0.008 n=5+5)
Atan-16     15.5ns ± 0%  13.9ns ± 0%  -10.61%  (p=0.008 n=5+5)
Atanh-16    15.0ns ± 0%  13.6ns ± 0%   -9.73%  (p=0.008 n=5+5)
Conj-16     0.11ns ± 5%  0.11ns ± 1%     ~     (p=0.421 n=5+5)
Cos-16      12.3ns ± 0%  12.2ns ± 0%   -0.60%  (p=0.000 n=4+5)
Cosh-16     12.1ns ± 0%  12.0ns ± 0%     ~     (p=0.079 n=4+5)
Exp-16      10.0ns ± 0%   9.8ns ± 0%   -1.77%  (p=0.008 n=5+5)
Log-16      14.5ns ± 0%  13.7ns ± 0%   -5.67%  (p=0.008 n=5+5)
Log10-16    14.5ns ± 0%  13.7ns ± 0%   -5.55%  (p=0.000 n=5+4)
Phase-16    5.11ns ± 0%  4.25ns ± 0%  -16.90%  (p=0.008 n=5+5)
Polar-16    7.12ns ± 0%  6.35ns ± 0%  -10.90%  (p=0.008 n=5+5)
Pow-16      64.3ns ± 0%  63.7ns ± 0%   -0.97%  (p=0.008 n=5+5)
Rect-16     5.74ns ± 0%  5.58ns ± 0%   -2.73%  (p=0.016 n=4+5)
Sin-16      12.2ns ± 0%  12.2ns ± 0%   -0.54%  (p=0.000 n=4+5)
Sinh-16     12.1ns ± 0%  12.0ns ± 0%   -0.58%  (p=0.000 n=5+4)
Sqrt-16     5.30ns ± 0%  5.18ns ± 0%   -2.36%  (p=0.008 n=5+5)
Tan-16      22.7ns ± 0%  22.6ns ± 0%   -0.33%  (p=0.008 n=5+5)
Tanh-16     21.2ns ± 0%  20.9ns ± 0%   -1.32%  (p=0.008 n=5+5)
[Geo mean]  11.3ns       10.8ns        -3.97%

Change-Id: Idcc4b357ba68477929c126289e5095b27a827b1b
Reviewed-on: https://go-review.googlesource.com/c/go/+/646335
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2025-02-13 12:34:33 -08:00
Keith Randall a7e331e671 cmd/compile: implement signed loads from read-only memory
In addition to unsigned loads which already exist.

This helps code that does switches on strings to constant-fold
the switch away when the string being switched on is constant.

Fixes #71699

Change-Id: If3051af0f7255d2a573da6f96b153a987a7f159d
Reviewed-on: https://go-review.googlesource.com/c/go/+/649295
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@google.com>
2025-02-13 12:27:55 -08:00
Keith Randall 072eea9b3b cmd/compile: avoid ifaceeq call if we know the interface is direct
We can just use == if the interface is direct.

Fixes #70738

Change-Id: Ia9a644791a370fec969c04c42d28a9b58f16911f
Reviewed-on: https://go-review.googlesource.com/c/go/+/635435
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-02-10 13:28:41 -08:00
thepudds 8163ea1458 weak: prevent unsafe conversions using weak pointers
Prevent conversions between Pointer types,
like we do for sync/atomic.Pointer.

Fixes #71583

Change-Id: I20e83106d8a27996f221e6cd9d52637b0442cea4
Reviewed-on: https://go-review.googlesource.com/c/go/+/647195
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2025-02-06 13:16:59 -08:00
Jakub Ciolek cd595be6d6 cmd/compile: prefer an add when shifting left by 1
ADD(Q|L) has generally twice the throughput.

Came up in CL 626998.

Throughput by arch:

Zen 4:

SHLL (R64, 1):   0.5
ADD  (R64, R64): 0.25

Intel Alder Lake:

SHLL (R64, 1):   0.5
ADD  (R64, R64): 0.2

Intel Haswell:

SHLL (R64, 1):   0.5
ADD  (R64, R64): 0.25

Also include a minor opt for:

(x + x) << c -> x << (c + 1)

Before this, the code:

func addShift(x int64) int64 {
    return (x + x) << 1
}

emitted two instructions:

        ADDQ    AX, AX
        SHLQ    $1, AX

but we can do it in a single shift:

        SHLQ    $2, AX

Add a codegen test for clearing the last bit.

compilecmp linux/amd64:

math
math.sqrt 243 -> 242  (-0.41%)

math [cmd/compile]
math.sqrt 243 -> 242  (-0.41%)

runtime
runtime.selectgo 5455 -> 5445  (-0.18%)
runtime.sysargs 665 -> 662  (-0.45%)
runtime.isPinned 145 -> 141  (-2.76%)
runtime.atoi64 198 -> 194  (-2.02%)
runtime.setPinned 714 -> 709  (-0.70%)

runtime [cmd/compile]
runtime.sysargs 665 -> 662  (-0.45%)
runtime.setPinned 714 -> 709  (-0.70%)
runtime.atoi64 198 -> 194  (-2.02%)
runtime.isPinned 145 -> 141  (-2.76%)

strconv
strconv.computeBounds 109 -> 107  (-1.83%)
strconv.FormatInt 201 -> 197  (-1.99%)
strconv.ryuFtoaShortest 1298 -> 1266  (-2.47%)
strconv.small 144 -> 134  (-6.94%)
strconv.AppendInt 357 -> 344  (-3.64%)
strconv.ryuDigits32 490 -> 488  (-0.41%)
strconv.AppendUint 342 -> 340  (-0.58%)

strconv [cmd/compile]
strconv.FormatInt 201 -> 197  (-1.99%)
strconv.ryuFtoaShortest 1298 -> 1266  (-2.47%)
strconv.ryuDigits32 490 -> 488  (-0.41%)
strconv.AppendUint 342 -> 340  (-0.58%)
strconv.computeBounds 109 -> 107  (-1.83%)
strconv.small 144 -> 134  (-6.94%)
strconv.AppendInt 357 -> 344  (-3.64%)

image
image.Rectangle.Inset 101 -> 97  (-3.96%)

regexp/syntax
regexp/syntax.inCharClass.func1 111 -> 110  (-0.90%)
regexp/syntax.(*compiler).quest 586 -> 573  (-2.22%)
regexp/syntax.ranges.Less 153 -> 150  (-1.96%)
regexp/syntax.(*compiler).loop 583 -> 568  (-2.57%)

time
time.Time.Before 179 -> 161  (-10.06%)
time.Time.Compare 189 -> 166  (-12.17%)
time.Time.Sub 444 -> 425  (-4.28%)
time.Time.UnixMicro 106 -> 95  (-10.38%)
time.div 592 -> 587  (-0.84%)
time.Time.UnixNano 85 -> 78  (-8.24%)
time.(*Time).UnixMilli 141 -> 140  (-0.71%)
time.Time.UnixMilli 106 -> 95  (-10.38%)
time.(*Time).UnixMicro 141 -> 140  (-0.71%)
time.Time.After 179 -> 161  (-10.06%)
time.Time.Equal 170 -> 150  (-11.76%)
time.Time.AppendBinary 766 -> 757  (-1.17%)
time.Time.IsZero 74 -> 66  (-10.81%)
time.(*Time).UnixNano 124 -> 113  (-8.87%)
time.(*Time).IsZero 113 -> 108  (-4.42%)

regexp
regexp.(*Regexp).FindAllStringSubmatch.func1 590 -> 569  (-3.56%)
regexp.QuoteMeta 485 -> 469  (-3.30%)

regexp/syntax [cmd/compile]
regexp/syntax.inCharClass.func1 111 -> 110  (-0.90%)
regexp/syntax.(*compiler).loop 583 -> 568  (-2.57%)
regexp/syntax.(*compiler).quest 586 -> 573  (-2.22%)
regexp/syntax.ranges.Less 153 -> 150  (-1.96%)

encoding/base64
encoding/base64.decodedLen 92 -> 90  (-2.17%)
encoding/base64.(*Encoding).DecodedLen 99 -> 97  (-2.02%)

time [cmd/compile]
time.(*Time).IsZero 113 -> 108  (-4.42%)
time.Time.IsZero 74 -> 66  (-10.81%)
time.(*Time).UnixNano 124 -> 113  (-8.87%)
time.Time.UnixMilli 106 -> 95  (-10.38%)
time.Time.Equal 170 -> 150  (-11.76%)
time.Time.UnixMicro 106 -> 95  (-10.38%)
time.(*Time).UnixMicro 141 -> 140  (-0.71%)
time.Time.Before 179 -> 161  (-10.06%)
time.Time.UnixNano 85 -> 78  (-8.24%)
time.Time.AppendBinary 766 -> 757  (-1.17%)
time.div 592 -> 587  (-0.84%)
time.Time.After 179 -> 161  (-10.06%)
time.Time.Compare 189 -> 166  (-12.17%)
time.(*Time).UnixMilli 141 -> 140  (-0.71%)
time.Time.Sub 444 -> 425  (-4.28%)

index/suffixarray
index/suffixarray.sais_8_32 1677 -> 1645  (-1.91%)
index/suffixarray.sais_32 1677 -> 1645  (-1.91%)
index/suffixarray.sais_64 1677 -> 1654  (-1.37%)
index/suffixarray.sais_8_64 1677 -> 1654  (-1.37%)
index/suffixarray.writeInt 249 -> 247  (-0.80%)

os
os.Expand 1070 -> 1051  (-1.78%)
os.Chtimes 787 -> 774  (-1.65%)

regexp [cmd/compile]
regexp.(*Regexp).FindAllStringSubmatch.func1 590 -> 569  (-3.56%)
regexp.QuoteMeta 485 -> 469  (-3.30%)

encoding/base64 [cmd/compile]
encoding/base64.decodedLen 92 -> 90  (-2.17%)
encoding/base64.(*Encoding).DecodedLen 99 -> 97  (-2.02%)

encoding/hex
encoding/hex.Encode 138 -> 136  (-1.45%)
encoding/hex.(*decoder).Read 830 -> 824  (-0.72%)

crypto/des
crypto/des.initFeistelBox 235 -> 229  (-2.55%)
crypto/des.cryptBlock 549 -> 538  (-2.00%)

os [cmd/compile]
os.Chtimes 787 -> 774  (-1.65%)
os.Expand 1070 -> 1051  (-1.78%)

math/big
math/big.newFloat 238 -> 223  (-6.30%)
math/big.nat.mul 2138 -> 2122  (-0.75%)
math/big.karatsubaSqr 1372 -> 1369  (-0.22%)
math/big.(*Float).sqrtInverse 895 -> 878  (-1.90%)
math/big.basicSqr 1032 -> 1017  (-1.45%)

cmd/vendor/golang.org/x/sys/unix
cmd/vendor/golang.org/x/sys/unix.TimeToTimespec 72 -> 66  (-8.33%)

encoding/json
encoding/json.Indent 404 -> 403  (-0.25%)
encoding/json.MarshalIndent 303 -> 297  (-1.98%)

testing
testing.(*T).Deadline 84 -> 82  (-2.38%)
testing.(*M).Run 3545 -> 3525  (-0.56%)

archive/zip
archive/zip.headerFileInfo.ModTime 229 -> 223  (-2.62%)

encoding/gob
encoding/gob.(*encoderState).encodeInt 474 -> 469  (-1.05%)

crypto/elliptic
crypto/elliptic.Marshal 728 -> 714  (-1.92%)

debug/buildinfo
debug/buildinfo.readString 325 -> 315  (-3.08%)

image/png
image/png.(*decoder).readImagePass 10866 -> 10834  (-0.29%)

archive/tar
archive/tar.Header.allowedFormats.func3 1768 -> 1736  (-1.81%)
archive/tar.formatPAXTime 389 -> 358  (-7.97%)
archive/tar.(*Writer).writeGNUHeader 741 -> 727  (-1.89%)
archive/tar.readGNUSparseMap0x1 709 -> 695  (-1.97%)
archive/tar.(*Writer).templateV7Plus 915 -> 909  (-0.66%)

crypto/internal/cryptotest
crypto/internal/cryptotest.TestHash.func4 890 -> 879  (-1.24%)
crypto/internal/cryptotest.TestStream.func6.1 646 -> 645  (-0.15%)
crypto/internal/cryptotest.testCipher.func3 1300 -> 1289  (-0.85%)

internal/pkgbits
internal/pkgbits.(*Encoder).Int64 113 -> 103  (-8.85%)
internal/pkgbits.(*Encoder).rawVarint 74 -> 72  (-2.70%)

testing/quick
testing/quick.(*Config).getRand 316 -> 315  (-0.32%)

log/slog
log/slog.TimeValue 489 -> 479  (-2.04%)

runtime/pprof
runtime/pprof.(*profileBuilder).build 2341 -> 2322  (-0.81%)

internal/coverage/cfile
internal/coverage/cfile.(*emitState).openMetaFile 824 -> 822  (-0.24%)
internal/coverage/cfile.(*emitState).openCounterFile 904 -> 892  (-1.33%)

cmd/internal/objabi
cmd/internal/objabi.expandArgs 1177 -> 1169  (-0.68%)

crypto/ecdsa
crypto/ecdsa.pointFromAffine 1162 -> 1144  (-1.55%)

net
net.minNonzeroTime 313 -> 308  (-1.60%)
net.cgoLookupAddrPTR 812 -> 797  (-1.85%)
net.(*IPNet).String 851 -> 827  (-2.82%)
net.IP.AppendText 488 -> 471  (-3.48%)
net.IPMask.String 281 -> 270  (-3.91%)
net.partialDeadline 374 -> 366  (-2.14%)
net.hexString 249 -> 240  (-3.61%)
net.IP.String 454 -> 453  (-0.22%)

internal/fuzz
internal/fuzz.newPcgRand 240 -> 234  (-2.50%)

crypto/x509
crypto/x509.(*Certificate).isValid 2642 -> 2611  (-1.17%)

cmd/internal/obj/s390x
cmd/internal/obj/s390x.buildop 33676 -> 33644  (-0.10%)

encoding/hex [cmd/compile]
encoding/hex.(*decoder).Read 830 -> 824  (-0.72%)
encoding/hex.Encode 138 -> 136  (-1.45%)

cmd/internal/objabi [cmd/compile]
cmd/internal/objabi.expandArgs 1177 -> 1169  (-0.68%)

math/big [cmd/compile]
math/big.(*Float).sqrtInverse 895 -> 878  (-1.90%)
math/big.nat.mul 2138 -> 2122  (-0.75%)
math/big.karatsubaSqr 1372 -> 1369  (-0.22%)
math/big.basicSqr 1032 -> 1017  (-1.45%)
math/big.newFloat 238 -> 223  (-6.30%)

encoding/json [cmd/compile]
encoding/json.MarshalIndent 303 -> 297  (-1.98%)
encoding/json.Indent 404 -> 403  (-0.25%)

cmd/covdata
main.(*metaMerge).emitCounters 985 -> 973  (-1.22%)

runtime/pprof [cmd/compile]
runtime/pprof.(*profileBuilder).build 2341 -> 2322  (-0.81%)

cmd/compile/internal/syntax
cmd/compile/internal/syntax.(*source).fill 722 -> 703  (-2.63%)

cmd/dist
main.runInstall 19081 -> 19049  (-0.17%)

crypto/tls
crypto/tls.extractPadding 176 -> 175  (-0.57%)
slices.Clone[[]crypto/tls.SignatureScheme,crypto/tls.SignatureScheme] 253 -> 247  (-2.37%)
slices.Clone[[]uint16,uint16] 253 -> 247  (-2.37%)
slices.Clone[[]crypto/tls.CurveID,crypto/tls.CurveID] 253 -> 247  (-2.37%)
crypto/tls.(*Config).cipherSuites 335 -> 326  (-2.69%)
slices.DeleteFunc[go.shape.[]crypto/tls.CurveID,go.shape.uint16] 437 -> 434  (-0.69%)
crypto/tls.dial 1349 -> 1339  (-0.74%)
slices.DeleteFunc[go.shape.[]uint16,go.shape.uint16] 437 -> 434  (-0.69%)

internal/pkgbits [cmd/compile]
internal/pkgbits.(*Encoder).Int64 113 -> 103  (-8.85%)
internal/pkgbits.(*Encoder).rawVarint 74 -> 72  (-2.70%)

cmd/compile/internal/syntax [cmd/compile]
cmd/compile/internal/syntax.(*source).fill 722 -> 703  (-2.63%)

cmd/internal/obj/s390x [cmd/compile]
cmd/internal/obj/s390x.buildop 33676 -> 33644  (-0.10%)

cmd/go/internal/trace
cmd/go/internal/trace.Flow 910 -> 886  (-2.64%)
cmd/go/internal/trace.(*Span).Done 311 -> 304  (-2.25%)
cmd/go/internal/trace.StartSpan 620 -> 615  (-0.81%)

cmd/internal/script
cmd/internal/script.(*Engine).Execute.func2 534 -> 528  (-1.12%)

cmd/link/internal/loader
cmd/link/internal/loader.(*Loader).SetSymSect 344 -> 338  (-1.74%)

net/http
net/http.(*Transport).queueForIdleConn 1797 -> 1766  (-1.73%)
net/http.(*Transport).getConn 2149 -> 2131  (-0.84%)
net/http.(*http2ClientConn).tooIdleLocked 207 -> 197  (-4.83%)
net/http.(*http2responseWriter).SetWriteDeadline.func1 520 -> 508  (-2.31%)
net/http.(*Cookie).Valid 837 -> 818  (-2.27%)
net/http.(*http2responseWriter).SetReadDeadline 373 -> 357  (-4.29%)
net/http.checkIfRange 701 -> 690  (-1.57%)
net/http.(*http2SettingsFrame).Value 325 -> 298  (-8.31%)
net/http.(*http2SettingsFrame).HasDuplicates 777 -> 767  (-1.29%)
net/http.(*Server).Serve 1746 -> 1739  (-0.40%)
net/http.http2traceGotConn 569 -> 556  (-2.28%)

net/http/pprof
net/http/pprof.collectProfile 242 -> 239  (-1.24%)

cmd/compile/internal/coverage
cmd/compile/internal/coverage.metaHashAndLen 439 -> 438  (-0.23%)

cmd/vendor/golang.org/x/telemetry/internal/upload
cmd/vendor/golang.org/x/telemetry/internal/upload.(*uploader).findWork 4570 -> 4540  (-0.66%)
cmd/vendor/golang.org/x/telemetry/internal/upload.(*uploader).reports 3604 -> 3572  (-0.89%)

cmd/compile/internal/coverage [cmd/compile]
cmd/compile/internal/coverage.metaHashAndLen 439 -> 438  (-0.23%)

cmd/vendor/golang.org/x/text/language
cmd/vendor/golang.org/x/text/language.regionGroupDist 287 -> 284  (-1.05%)

cmd/go/internal/vcweb
cmd/go/internal/vcweb.(*Server).overview.func1 1045 -> 1041  (-0.38%)

cmd/go/internal/vcs
cmd/go/internal/vcs.expand 761 -> 741  (-2.63%)

cmd/compile/internal/inline/inlheur
slices.stableCmpFunc[go.shape.struct 2300 -> 2284  (-0.70%)

cmd/compile/internal/inline/inlheur [cmd/compile]
slices.stableCmpFunc[go.shape.struct 2300 -> 2284  (-0.70%)

cmd/go/internal/modfetch/codehost
cmd/go/internal/modfetch/codehost.bzrParseStat 2217 -> 2213  (-0.18%)

cmd/link/internal/ld
cmd/link/internal/ld.decodetypeStructFieldCount 157 -> 152  (-3.18%)
cmd/link/internal/ld.(*Link).address 12559 -> 12495  (-0.51%)
cmd/link/internal/ld.(*dodataState).allocateDataSections 18345 -> 18205  (-0.76%)
cmd/link/internal/ld.elfshreloc 618 -> 616  (-0.32%)
cmd/link/internal/ld.(*deadcodePass).decodetypeMethods 794 -> 779  (-1.89%)
cmd/link/internal/ld.(*dodataState).assignDsymsToSection 668 -> 663  (-0.75%)
cmd/link/internal/ld.relocSectFn 285 -> 284  (-0.35%)
cmd/link/internal/ld.decodetypeIfaceMethodCount 146 -> 144  (-1.37%)
cmd/link/internal/ld.decodetypeArrayLen 157 -> 152  (-3.18%)

cmd/link/internal/arm64
cmd/link/internal/arm64.gensymlate.func1 895 -> 888  (-0.78%)

cmd/go/internal/modload
cmd/go/internal/modload.queryProxy.func3 1029 -> 1012  (-1.65%)

cmd/go/internal/load
cmd/go/internal/load.(*Package).setBuildInfo 8453 -> 8447  (-0.07%)

cmd/go/internal/clean
cmd/go/internal/clean.runClean 2120 -> 2104  (-0.75%)

cmd/compile/internal/ssa
cmd/compile/internal/ssa.(*poset).aliasnodes 2010 -> 1978  (-1.59%)
cmd/compile/internal/ssa.rewriteValueARM64_OpARM64MOVHstoreidx2 730 -> 719  (-1.51%)
cmd/compile/internal/ssa.(*debugState).buildLocationLists 3326 -> 3294  (-0.96%)
cmd/compile/internal/ssa.rewriteValueAMD64_OpAMD64ADDLconst 3069 -> 2941  (-4.17%)
cmd/compile/internal/ssa.(*debugState).processValue 9756 -> 9724  (-0.33%)
cmd/compile/internal/ssa.rewriteValueAMD64_OpAMD64ADDQconst 3069 -> 2941  (-4.17%)
cmd/compile/internal/ssa.(*poset).mergeroot 1079 -> 1054  (-2.32%)

cmd/compile/internal/ssa [cmd/compile]
cmd/compile/internal/ssa.rewriteValueARM64_OpARM64MOVHstoreidx2 730 -> 719  (-1.51%)
cmd/compile/internal/ssa.(*poset).aliasnodes 2010 -> 1978  (-1.59%)
cmd/compile/internal/ssa.(*poset).mergeroot 1079 -> 1054  (-2.32%)
cmd/compile/internal/ssa.rewriteValueAMD64_OpAMD64ADDQconst 3069 -> 2941  (-4.17%)
cmd/compile/internal/ssa.rewriteValueAMD64_OpAMD64ADDLconst 3069 -> 2941  (-4.17%)

file                                                before   after    Δ       %
math/bits.s                                         2352     2354     +2      +0.085%
math/bits [cmd/compile].s                           2352     2354     +2      +0.085%
math.s                                              35675    35674    -1      -0.003%
math [cmd/compile].s                                35675    35674    -1      -0.003%
runtime.s                                           577251   577245   -6      -0.001%
runtime [cmd/compile].s                             642419   642438   +19     +0.003%
sort.s                                              37434    37435    +1      +0.003%
strconv.s                                           48391    48343    -48     -0.099%
sort [cmd/compile].s                                37434    37435    +1      +0.003%
bufio.s                                             21386    21418    +32     +0.150%
strconv [cmd/compile].s                             48391    48343    -48     -0.099%
image.s                                             34978    35022    +44     +0.126%
regexp/syntax.s                                     81719    81781    +62     +0.076%
time.s                                              94341    94184    -157    -0.166%
regexp.s                                            60411    60399    -12     -0.020%
bufio [cmd/compile].s                               21512    21544    +32     +0.149%
encoding/binary.s                                   34062    34087    +25     +0.073%
regexp/syntax [cmd/compile].s                       81719    81781    +62     +0.076%
encoding/base64.s                                   11907    11903    -4      -0.034%
time [cmd/compile].s                                94341    94184    -157    -0.166%
index/suffixarray.s                                 41633    41527    -106    -0.255%
os.s                                                101770   101738   -32     -0.031%
regexp [cmd/compile].s                              60411    60399    -12     -0.020%
encoding/binary [cmd/compile].s                     37173    37198    +25     +0.067%
encoding/base64 [cmd/compile].s                     11907    11903    -4      -0.034%
os/exec.s                                           23900    23907    +7      +0.029%
encoding/hex.s                                      6038     6030     -8      -0.132%
crypto/des.s                                        5073     5056     -17     -0.335%
os [cmd/compile].s                                  102030   101998   -32     -0.031%
vendor/golang.org/x/net/http2/hpack.s               22027    22033    +6      +0.027%
math/big.s                                          164808   164753   -55     -0.033%
cmd/vendor/golang.org/x/sys/unix.s                  121450   121444   -6      -0.005%
encoding/json.s                                     110294   110287   -7      -0.006%
testing.s                                           115303   115281   -22     -0.019%
archive/zip.s                                       65329    65325    -4      -0.006%
os/user.s                                           10078    10080    +2      +0.020%
encoding/gob.s                                      143788   143783   -5      -0.003%
crypto/elliptic.s                                   30686    30704    +18     +0.059%
go/doc/comment.s                                    49401    49433    +32     +0.065%
debug/buildinfo.s                                   9095     9085     -10     -0.110%
image/png.s                                         36113    36081    -32     -0.089%
archive/tar.s                                       71994    71897    -97     -0.135%
crypto/internal/cryptotest.s                        60872    60849    -23     -0.038%
internal/pkgbits.s                                  20441    20429    -12     -0.059%
testing/quick.s                                     8236     8235     -1      -0.012%
log/slog.s                                          77568    77558    -10     -0.013%
internal/trace/internal/oldtrace.s                  52885    52896    +11     +0.021%
runtime/pprof.s                                     123978   123969   -9      -0.007%
internal/coverage/cfile.s                           25198    25184    -14     -0.056%
cmd/internal/objabi.s                               19954    19946    -8      -0.040%
crypto/ecdsa.s                                      29159    29141    -18     -0.062%
log/slog/internal/benchmarks.s                      6694     6695     +1      +0.015%
net.s                                               299569   299503   -66     -0.022%
os/exec [cmd/compile].s                             23888    23895    +7      +0.029%
internal/trace.s                                    179226   179240   +14     +0.008%
internal/fuzz.s                                     86190    86191    +1      +0.001%
crypto/x509.s                                       177195   177164   -31     -0.017%
cmd/internal/obj/s390x.s                            121642   121610   -32     -0.026%
cmd/internal/obj/ppc64.s                            140118   140122   +4      +0.003%
encoding/hex [cmd/compile].s                        6149     6141     -8      -0.130%
cmd/internal/objabi [cmd/compile].s                 19954    19946    -8      -0.040%
cmd/internal/obj/arm64.s                            158523   158555   +32     +0.020%
go/doc/comment [cmd/compile].s                      49512    49544    +32     +0.065%
math/big [cmd/compile].s                            166394   166339   -55     -0.033%
encoding/json [cmd/compile].s                       110712   110705   -7      -0.006%
cmd/covdata.s                                       39699    39687    -12     -0.030%
runtime/pprof [cmd/compile].s                       125209   125200   -9      -0.007%
cmd/compile/internal/syntax.s                       181755   181736   -19     -0.010%
cmd/dist.s                                          177893   177861   -32     -0.018%
crypto/tls.s                                        389157   389113   -44     -0.011%
internal/pkgbits [cmd/compile].s                    41644    41632    -12     -0.029%
cmd/compile/internal/syntax [cmd/compile].s         196105   196086   -19     -0.010%
cmd/compile/internal/types.s                        71315    71345    +30     +0.042%
cmd/internal/obj/s390x [cmd/compile].s              121733   121701   -32     -0.026%
cmd/go/internal/trace.s                             4796     4760     -36     -0.751%
cmd/internal/obj/arm64 [cmd/compile].s              168120   168147   +27     +0.016%
cmd/internal/obj/ppc64 [cmd/compile].s              140219   140223   +4      +0.003%
cmd/internal/script.s                               83442    83436    -6      -0.007%
cmd/link/internal/loader.s                          93299    93294    -5      -0.005%
net/http.s                                          620639   620472   -167    -0.027%
net/http/pprof.s                                    35016    35013    -3      -0.009%
cmd/compile/internal/coverage.s                     6668     6667     -1      -0.015%
cmd/vendor/golang.org/x/telemetry/internal/upload.s 34210    34148    -62     -0.181%
cmd/compile/internal/coverage [cmd/compile].s       6664     6663     -1      -0.015%
cmd/vendor/golang.org/x/text/language.s             48077    48074    -3      -0.006%
cmd/go/internal/vcweb.s                             45193    45189    -4      -0.009%
cmd/go/internal/vcs.s                               44749    44729    -20     -0.045%
cmd/compile/internal/inline/inlheur.s               83758    83742    -16     -0.019%
cmd/compile/internal/inline/inlheur [cmd/compile].s 84773    84757    -16     -0.019%
cmd/go/internal/modfetch/codehost.s                 89098    89094    -4      -0.004%
cmd/trace.s                                         257550   257564   +14     +0.005%
cmd/link/internal/ld.s                              641945   641706   -239    -0.037%
cmd/link/internal/arm64.s                           34805    34798    -7      -0.020%
cmd/go/internal/modload.s                           328971   328954   -17     -0.005%
cmd/go/internal/load.s                              178877   178871   -6      -0.003%
cmd/go/internal/clean.s                             11006    10990    -16     -0.145%
cmd/compile/internal/ssa.s                          3552843  3553347  +504    +0.014%
cmd/compile/internal/ssa [cmd/compile].s            3752511  3753123  +612    +0.016%
total                                               36179015 36178687 -328    -0.001%

Change-Id: I251c2898ccf3c9931d162d87dabbd49cf4ec73a5
Reviewed-on: https://go-review.googlesource.com/c/go/+/641757
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-02-06 09:05:20 -08:00
Jakub Ciolek b7c9cdd53c cmd/compile: establish limits of bool to uint8 conversions
Improves bound check elimination for:

func arrayLargeEnough(b bool, a [2]int64) int64 {
    c := byte(0)
    if b {
        c = 1
    }

    // this bound check gets elided
    return a[c]
}

We also detect never true branches like:

func cCanOnlyBe0or1(b bool) byte {
    var c byte
    if b {
        c = 1
    }
    // this statement can never be true so we can elide it
    if c == 2 {
        c = 3
    }

    return c
}

Hits a few times:

crypto/internal/sysrand
crypto/internal/sysrand.Read 357 -> 349  (-2.24%)

testing
testing.(*F).Fuzz.func1.1 837 -> 828  (-1.08%)

image/png
image/png.(*Encoder).Encode 1735 -> 1733  (-0.12%)

vendor/golang.org/x/crypto/cryptobyte
vendor/golang.org/x/crypto/cryptobyte.(*Builder).callContinuation 187 -> 185  (-1.07%)

crypto/internal/sysrand [cmd/compile]
crypto/internal/sysrand.Read 357 -> 349  (-2.24%)

go/parser
go/parser.(*parser).parseType 463 -> 457  (-1.30%)
go/parser.(*parser).embeddedElem 633 -> 626  (-1.11%)
go/parser.(*parser).parseFuncDecl 917 -> 914  (-0.33%)
go/parser.(*parser).parseDotsType 393 -> 391  (-0.51%)
go/parser.(*parser).error 1061 -> 1054  (-0.66%)
go/parser.(*parser).parseTypeName 537 -> 532  (-0.93%)
go/parser.(*parser).parseParamDecl 1478 -> 1451  (-1.83%)
go/parser.(*parser).parseFuncTypeOrLit 498 -> 495  (-0.60%)
go/parser.(*parser).parseValue 375 -> 371  (-1.07%)
go/parser.(*parser).parseElementList 594 -> 593  (-0.17%)
go/parser.(*parser).parseResult 593 -> 583  (-1.69%)
go/parser.(*parser).parseElement 506 -> 504  (-0.40%)
go/parser.(*parser).parseImportSpec 1110 -> 1108  (-0.18%)
go/parser.(*parser).parseStructType 741 -> 735  (-0.81%)
go/parser.(*parser).parseTypeSpec 1054 -> 1048  (-0.57%)
go/parser.(*parser).parseIdentList 625 -> 623  (-0.32%)
go/parser.(*parser).parseOperand 1221 -> 1199  (-1.80%)
go/parser.(*parser).parseIndexOrSliceOrInstance 2713 -> 2694  (-0.70%)
go/parser.(*parser).parseSwitchStmt 1458 -> 1447  (-0.75%)
go/parser.(*parser).parseArrayFieldOrTypeInstance 1865 -> 1861  (-0.21%)
go/parser.(*parser).parseExpr 307 -> 305  (-0.65%)
go/parser.(*parser).parseSelector 427 -> 425  (-0.47%)
go/parser.(*parser).parseTypeInstance 1433 -> 1420  (-0.91%)
go/parser.(*parser).parseCaseClause 629 -> 626  (-0.48%)
go/parser.(*parser).parseParameterList 4212 -> 4189  (-0.55%)
go/parser.(*parser).parsePointerType 393 -> 391  (-0.51%)
go/parser.(*parser).parseFuncType 465 -> 463  (-0.43%)
go/parser.(*parser).parseTypeAssertion 559 -> 557  (-0.36%)
go/parser.(*parser).parseSimpleStmt 2443 -> 2388  (-2.25%)
go/parser.(*parser).parseCallOrConversion 1093 -> 1087  (-0.55%)
go/parser.(*parser).parseForStmt 2168 -> 2159  (-0.42%)
go/parser.(*parser).embeddedTerm 657 -> 649  (-1.22%)
go/parser.(*parser).parseCommClause 1509 -> 1501  (-0.53%)

cmd/internal/objfile
cmd/internal/objfile.(*goobjFile).symbols 5299 -> 5274  (-0.47%)

net
net.initConfVal 378 -> 374  (-1.06%)
net.(*conf).hostLookupOrder 269 -> 267  (-0.74%)
net.(*conf).addrLookupOrder 261 -> 255  (-2.30%)

cmd/internal/obj/loong64
cmd/internal/obj/loong64.(*ctxt0).oplook 1829 -> 1813  (-0.87%)

cmd/internal/obj/mips
cmd/internal/obj/mips.(*ctxt0).oplook 1428 -> 1400  (-1.96%)

go/types
go/types.(*typeWriter).signature 605 -> 601  (-0.66%)
go/types.(*Checker).instantiateSignature 1469 -> 1467  (-0.14%)

go/parser [cmd/compile]
go/parser.(*parser).parseSwitchStmt 1458 -> 1447  (-0.75%)
go/parser.(*parser).parseDotsType 393 -> 391  (-0.51%)
go/parser.(*parser).embeddedElem 633 -> 626  (-1.11%)
go/parser.(*parser).parseTypeAssertion 559 -> 557  (-0.36%)
go/parser.(*parser).parseCommClause 1509 -> 1501  (-0.53%)
go/parser.(*parser).parseCaseClause 629 -> 626  (-0.48%)
go/parser.(*parser).parseImportSpec 1110 -> 1108  (-0.18%)
go/parser.(*parser).parseTypeSpec 1054 -> 1048  (-0.57%)
go/parser.(*parser).parseElementList 594 -> 593  (-0.17%)
go/parser.(*parser).parseParamDecl 1478 -> 1451  (-1.83%)
go/parser.(*parser).parseType 463 -> 457  (-1.30%)
go/parser.(*parser).parseSimpleStmt 2443 -> 2388  (-2.25%)
go/parser.(*parser).parseIdentList 625 -> 623  (-0.32%)
go/parser.(*parser).parseTypeInstance 1433 -> 1420  (-0.91%)
go/parser.(*parser).parseResult 593 -> 583  (-1.69%)
go/parser.(*parser).parseValue 375 -> 371  (-1.07%)
go/parser.(*parser).parseFuncDecl 917 -> 914  (-0.33%)
go/parser.(*parser).error 1061 -> 1054  (-0.66%)
go/parser.(*parser).parseElement 506 -> 504  (-0.40%)
go/parser.(*parser).parseFuncType 465 -> 463  (-0.43%)
go/parser.(*parser).parsePointerType 393 -> 391  (-0.51%)
go/parser.(*parser).parseTypeName 537 -> 532  (-0.93%)
go/parser.(*parser).parseExpr 307 -> 305  (-0.65%)
go/parser.(*parser).parseFuncTypeOrLit 498 -> 495  (-0.60%)
go/parser.(*parser).parseStructType 741 -> 735  (-0.81%)
go/parser.(*parser).parseOperand 1221 -> 1199  (-1.80%)
go/parser.(*parser).parseIndexOrSliceOrInstance 2713 -> 2694  (-0.70%)
go/parser.(*parser).parseForStmt 2168 -> 2159  (-0.42%)
go/parser.(*parser).parseParameterList 4212 -> 4189  (-0.55%)
go/parser.(*parser).parseArrayFieldOrTypeInstance 1865 -> 1861  (-0.21%)
go/parser.(*parser).parseSelector 427 -> 425  (-0.47%)
go/parser.(*parser).parseCallOrConversion 1093 -> 1087  (-0.55%)
go/parser.(*parser).embeddedTerm 657 -> 649  (-1.22%)

crypto/tls
crypto/tls.(*Conn).clientHandshake 3430 -> 3421  (-0.26%)

cmd/internal/obj/mips [cmd/compile]
cmd/internal/obj/mips.(*ctxt0).oplook 1428 -> 1400  (-1.96%)

cmd/internal/obj/loong64 [cmd/compile]
cmd/internal/obj/loong64.(*ctxt0).oplook 1829 -> 1813  (-0.87%)

cmd/compile/internal/types2
cmd/compile/internal/types2.(*typeWriter).signature 605 -> 601  (-0.66%)
cmd/compile/internal/types2.(*Checker).infer 10646 -> 10614  (-0.30%)
cmd/compile/internal/types2.(*Checker).instantiateSignature 1567 -> 1561  (-0.38%)

cmd/compile/internal/types2 [cmd/compile]
cmd/compile/internal/types2.(*Checker).instantiateSignature 1567 -> 1561  (-0.38%)
cmd/compile/internal/types2.(*typeWriter).signature 605 -> 601  (-0.66%)
cmd/compile/internal/types2.(*Checker).infer 10718 -> 10654  (-0.60%)

cmd/vendor/golang.org/x/arch/s390x/s390xasm
cmd/vendor/golang.org/x/arch/s390x/s390xasm.GoSyntax 36778 -> 36682  (-0.26%)

net/http
net/http.(*Client).do 4202 -> 4170  (-0.76%)
net/http.(*http2clientStream).writeRequest 3692 -> 3686  (-0.16%)

cmd/vendor/github.com/ianlancetaylor/demangle
cmd/vendor/github.com/ianlancetaylor/demangle.(*rustState).genericArgs 466 -> 463  (-0.64%)

cmd/compile/internal/devirtualize
cmd/compile/internal/devirtualize.ProfileGuided.func1 1364 -> 1357  (-0.51%)

cmd/compile/internal/inline/interleaved
cmd/compile/internal/inline/interleaved.DevirtualizeAndInlinePackage.func2 533 -> 526  (-1.31%)

cmd/compile/internal/devirtualize [cmd/compile]
cmd/compile/internal/devirtualize.ProfileGuided.func1 1343 -> 1332  (-0.82%)

cmd/compile/internal/inline/interleaved [cmd/compile]
cmd/compile/internal/inline/interleaved.DevirtualizeAndInlinePackage.func2 533 -> 526  (-1.31%)

cmd/link/internal/ld
cmd/link/internal/ld.mustLinkExternal 2739 -> 2674  (-2.37%)

cmd/compile/internal/ssa
cmd/compile/internal/ssa.(*poset).Ordered 391 -> 389  (-0.51%)
cmd/compile/internal/ssa.(*poset).Equal 318 -> 313  (-1.57%)
cmd/compile/internal/ssa.(*poset).Undo 1842 -> 1832  (-0.54%)
cmd/compile/internal/ssa.(*expandState).decomposeAsNecessary 4587 -> 4555  (-0.70%)
cmd/compile/internal/ssa.(*poset).OrderedOrEqual 390 -> 389  (-0.26%)
cmd/compile/internal/ssa.(*poset).NonEqual 613 -> 606  (-1.14%)

cmd/compile/internal/ssa [cmd/compile]
cmd/compile/internal/ssa.(*poset).OrderedOrEqual 368 -> 365  (-0.82%)
cmd/compile/internal/ssa.(*poset).Equal 318 -> 313  (-1.57%)
cmd/compile/internal/ssa.(*expandState).decomposeAsNecessary 4952 -> 4938  (-0.28%)
cmd/compile/internal/ssa.(*poset).NonEqual 613 -> 606  (-1.14%)
cmd/compile/internal/ssa.(*poset).SetEqual 2533 -> 2505  (-1.11%)
cmd/compile/internal/ssa.(*poset).SetNonEqual 785 -> 777  (-1.02%)
cmd/compile/internal/ssa.(*poset).Ordered 370 -> 366  (-1.08%)

cmd/compile/internal/gc [cmd/compile]
cmd/compile/internal/gc.Main.DevirtualizeAndInlinePackage.func2 492 -> 489  (-0.61%)

file                                                    before   after    Δ       %
crypto/internal/sysrand.s                               1553     1545     -8      -0.515%
internal/zstd.s                                         49179    49190    +11     +0.022%
testing.s                                               115197   115188   -9      -0.008%
image/png.s                                             36109    36107    -2      -0.006%
vendor/golang.org/x/crypto/cryptobyte.s                 30980    30978    -2      -0.006%
crypto/internal/sysrand [cmd/compile].s                 1553     1545     -8      -0.515%
go/parser.s                                             112638   112354   -284    -0.252%
cmd/internal/objfile.s                                  49994    49969    -25     -0.050%
net.s                                                   299558   299546   -12     -0.004%
cmd/internal/obj/loong64.s                              71651    71635    -16     -0.022%
cmd/internal/obj/mips.s                                 59681    59653    -28     -0.047%
go/types.s                                              558839   558833   -6      -0.001%
cmd/compile/internal/types.s                            71305    71306    +1      +0.001%
go/parser [cmd/compile].s                               112749   112465   -284    -0.252%
crypto/tls.s                                            388859   388850   -9      -0.002%
cmd/internal/obj/mips [cmd/compile].s                   59792    59764    -28     -0.047%
cmd/internal/obj/loong64 [cmd/compile].s                71762    71746    -16     -0.022%
cmd/compile/internal/types2.s                           540608   540566   -42     -0.008%
cmd/compile/internal/types2 [cmd/compile].s             577428   577354   -74     -0.013%
cmd/vendor/golang.org/x/arch/s390x/s390xasm.s           267664   267568   -96     -0.036%
net/http.s                                              620704   620666   -38     -0.006%
cmd/vendor/github.com/ianlancetaylor/demangle.s         299991   299988   -3      -0.001%
cmd/compile/internal/devirtualize.s                     21452    21445    -7      -0.033%
cmd/compile/internal/inline/interleaved.s               8358     8351     -7      -0.084%
cmd/compile/internal/devirtualize [cmd/compile].s       20994    20983    -11     -0.052%
cmd/compile/internal/inline/interleaved [cmd/compile].s 8328     8321     -7      -0.084%
cmd/link/internal/ld.s                                  641802   641737   -65     -0.010%
cmd/compile/internal/ssa.s                              3552939  3552957  +18     +0.001%
cmd/compile/internal/ssa [cmd/compile].s                3752191  3752197  +6      +0.000%
cmd/compile/internal/ssagen.s                           405780   405786   +6      +0.001%
cmd/compile/internal/ssagen [cmd/compile].s             434472   434496   +24     +0.006%
cmd/compile/internal/gc [cmd/compile].s                 38499    38496    -3      -0.008%
total                                                   36185267 36184243 -1024   -0.003%

Change-Id: I867222b0f907b29d32b2676e55c6b5789ec56511
Reviewed-on: https://go-review.googlesource.com/c/go/+/642716
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2025-02-05 09:04:25 -08:00
Youlin Feng 0825475599 cmd/compile: do not treat OpLocalAddr as load in DSE
Fixes #70409
Fixes #47107

Change-Id: I82a66c46f6b76c68e156b5d937273b0316975d44
Reviewed-on: https://go-review.googlesource.com/c/go/+/629016
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2025-02-04 12:52:01 -08:00
Ian Lance Taylor 82337de9f2 test/issue71226: add cast to avoid clang error
Change-Id: I2d8ecb7b5f48943697d454d09947fdb1817809d0
Reviewed-on: https://go-review.googlesource.com/c/go/+/646295
TryBot-Bypass: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
2025-02-03 10:14:38 -08:00
Jakub Ciolek e57769d5ad cmd/compile: on AMD64, prefer XOR/AND for (x & 1) == 0 check
It's shorter to encode. Additionally, XOR and AND generally
have higher throughput than BT/SET*.

compilecmp:

runtime
runtime.(*sweepClass).split 58 -> 56  (-3.45%)
runtime.sweepClass.split 14 -> 11  (-21.43%)

runtime [cmd/compile]
runtime.(*sweepClass).split 58 -> 56  (-3.45%)
runtime.sweepClass.split 14 -> 11  (-21.43%)

strconv
strconv.ryuFtoaShortest changed

strconv [cmd/compile]
strconv.ryuFtoaShortest changed

math/big
math/big.(*Int).MulRange 255 -> 252  (-1.18%)

testing/quick
testing/quick.sizedValue changed

internal/fuzz
internal/fuzz.(*pcgRand).bool 69 -> 70  (+1.45%)

cmd/internal/obj/x86
cmd/internal/obj/x86.(*AsmBuf).asmevex changed

math/big [cmd/compile]
math/big.(*Int).MulRange 255 -> 252  (-1.18%)

cmd/internal/obj/x86 [cmd/compile]
cmd/internal/obj/x86.(*AsmBuf).asmevex changed

net/http
net/http.(*http2stream).isPushed 11 -> 10  (-9.09%)

cmd/vendor/github.com/google/pprof/internal/binutils
cmd/vendor/github.com/google/pprof/internal/binutils.(*file).computeBase changed

Change-Id: I9cb2987eb263c85ee4e93d6f8455c91a55273173
Reviewed-on: https://go-review.googlesource.com/c/go/+/640975
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-02-03 08:42:01 -08:00
Ian Lance Taylor 6adf54a3eb cmd/cgo: declare _GoString{Len,Ptr} in _cgo_export.h
Fixes #71226

Change-Id: I91c46a4310a9c7a9fcd1e3a131ca16e46949edb3
Reviewed-on: https://go-review.googlesource.com/c/go/+/642235
Auto-Submit: Ian Lance Taylor <iant@golang.org>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
2025-02-03 08:24:48 -08:00
Ian Lance Taylor bf351677c4 cmd/cgo: add C declaration parameter unused attribute
Fixes #71225

Change-Id: I3e60fdf632f2aa0e63b24225f13e4ace49906925
Reviewed-on: https://go-review.googlesource.com/c/go/+/642196
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
2025-02-03 08:24:47 -08:00
Michael Pratt 78e6f2a1c8 runtime: rename mapiterinit and mapiternext
mapiterinit allows external linkname. These users must allocate their
own iter struct for initialization by mapiterinit. Since the type is
unexported, they also must define the struct themselves. As a result,
they of course define the struct matching the old hiter definition (in
map_noswiss.go).

The old definition is smaller on 32-bit platforms. On those platforms,
mapiternext will clobber memory outside of the caller's allocation.

On all platforms, the pointer layout between the old hiter and new
maps.Iter does not match. Thus the GC may miss pointers and free
reachable objects early, or it may see non-pointers that look like heap
pointers and throw due to invalid references to free objects.

To avoid these issues, we must keep mapiterinit and mapiternext with the
old hiter definition. The most straightforward way to do this is to use
mapiterinit and mapiternext as a compatibility layer between the old and
new iter types.

The first step to that is to move normal map use off of these functions,
which is what this CL does.

Introduce new mapIterStart and mapIterNext functions that replace the
former functions everywhere in the toolchain. These have the same
behavior as the old functions.

This CL temporarily makes the old functions throw to ensure we don't
have hidden dependencies on them. We cannot remove them entirely because
GOEXPERIMENT=noswissmap still uses the old names, and internal/goobj
requires all builtins to exist regardless of GOEXPERIMENT. The next CL
will introduce the compatibility layer.

I want to avoid using linkname between runtime and reflect, as that
would also allow external linknames. So mapIterStart and mapIterNext are
duplicated in reflect, which can be done trivially, as it imports
internal/runtime/maps.

For #71408.

Change-Id: I6a6a636c6d4bd1392618c67ca648d3f061afe669
Reviewed-on: https://go-review.googlesource.com/c/go/+/643898
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
2025-01-28 10:54:43 -08:00
Keith Randall c5e205e928 internal/runtime/maps: re-enable some tests
Re-enable tests for stack-allocated maps and fast map accessors.
Those are implemented now.

Update #54766

Change-Id: I8c019702bd9fb077b2fe3f7c78e8e9e10d2263a6
Reviewed-on: https://go-review.googlesource.com/c/go/+/642376
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
2025-01-14 09:55:06 -08:00
Keith Randall 44a6f817ea cmd/compile: fix write barrier coalescing
We can't coalesce a non-WB store with a subsequent Move, as the
result of the store might be the source of the move.

There's a simple codegen test. Not sure how we might do a real test,
as all the repro's I've come up with are very expensive and unreliable.

Fixes #71228

Change-Id: If18bf181a266b9b90964e2591cd2e61a7168371c
Reviewed-on: https://go-review.googlesource.com/c/go/+/642197
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
2025-01-12 22:49:39 -08:00
Damien Neil 4da905bcf0 cmd/internal/objabi, internal/runtime: increase nosplit limit on OpenBSD
OpenBSD is bumping up against the nosplit limit, and openbsd/ppc64
is over it. Increase StackGuardMultiplier on OpenBSD, matching AIX.

Change-Id: I61e17c99ce77e1fd3f368159dc4615aeae99e913
Reviewed-on: https://go-review.googlesource.com/c/go/+/632996
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Damien Neil <dneil@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-12-06 00:35:41 +00:00
Jorropo d241ea8d5c sync/atomic: add missing leak tests for And & Or
Theses tests were forgot because when CL 462298 was originally written
And & Or atomics were not available in go.
Git were smart enough to rebase over And's & Or's addition.
After most reviews and before merging it were pointed I should
make theses new intrinsics noescape.
When doing this last minute addition I forgot to add tests.

Change-Id: I457f98315c0aee91d5743058ab76f256856cb782
Reviewed-on: https://go-review.googlesource.com/c/go/+/633416
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-12-04 17:30:39 +00:00
David Chase d524c954b1 cmd/compile: use very high budget for once-called closures
This should make it much more likely that rangefunc
iterators become "plain inline code".

Change-Id: I8026603afdc5249f60cc663c4bc15cb1d26d1c83
Reviewed-on: https://go-review.googlesource.com/c/go/+/630696
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-11-22 02:04:41 +00:00
Youlin Feng c4e6ab9750 cmd/compile: modify CSE to remove redundant OpLocalAddrs
Remove the OpLocalAddrs that are unnecessary in the CSE pass, so the
following passes like DSE and memcombine can do its work better.

Fixes #70300

Change-Id: I600025d49eeadb3ca4f092d614428399750f69bc
Reviewed-on: https://go-review.googlesource.com/c/go/+/628075
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
2024-11-22 00:12:03 +00:00