This complements the unsigned loads, which already exist.
This helps code that does switches on strings to constant-fold
the switch away when the string being switched on is constant.
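For illustration, a made-up example (not from the CL) of code that can
now fold away entirely once the argument is a known constant, e.g.
after inlining:

    func kindOf(s string) int {
        switch s { // with s constant, the whole switch constant-folds
        case "bool":
            return 1
        case "string":
            return 2
        default:
            return 0
        }
    }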
Fixes #71699
Change-Id: If3051af0f7255d2a573da6f96b153a987a7f159d
Reviewed-on: https://go-review.googlesource.com/c/go/+/649295
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@google.com>
For c + nil, we want the result to still be of pointer type.
Fixes ppc64le build failure with CL 468455, in issue33724.go.
The problem in that test is that it requires a nil check to be
scheduled before the corresponding load. This normally happens fine
because we prioritize nil checks. If we have nilcheck(p) and load(p),
once p is scheduled the nil check will always go before the load.
The issue we saw in 33724 is that when p is a nil pointer, we ended up
with two different p's, an int64(0) as the argument to the nil check
and an (*Outer)(0) as the argument to the load. Those two zeroes don't
get CSEd, so if the (*Outer)(0) happens to get scheduled first, the
load can end up before the nilcheck.
Fix this by always having constant arithmetic preserve the pointerness
of the value, so that both zeroes are of type *Outer and get CSEd.
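A hedged sketch (condensed; not the actual issue33724.go test) of the
shape involved:

    type Outer struct{ x int }

    func f() int {
        var p *Outer // a constant nil pointer
        return p.x   // nilcheck(p) must be scheduled before load(p.x),
                     // which requires both to see the same typed zero
    }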
Update #58482
Update #33724
Change-Id: Ib9b8c0446f1690b574e0f3c0afb9934efbaf3513
Reviewed-on: https://go-review.googlesource.com/c/go/+/468615
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Bypass: Keith Randall <khr@golang.org>
The SPanchored opcode is identical to SP, except that it takes a memory
argument so that it (and more importantly, anything that uses it)
must be scheduled at or after that memory argument.
This opcode ensures that a LEAQ of a variable gets scheduled after the
corresponding VARDEF for that variable.
This may lead to less CSE of LEAQ operations. The effect is very small.
The go binary is only 80 bytes bigger after this CL. Usually LEAQs get
folded into load/store operations, so the effect is limited to
pointerful types that are large enough to need a duffzero and have
their address passed somewhere. Even then, usually the CSEd LEAQs
will be un-CSEd because
the two uses are on different sides of a function call and the LEAQ
ends up being rematerialized at the second use anyway.
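A hedged sketch of the affected shape (types and names made up): a
pointerful local, large enough to be zeroed with duffzero, whose
address is passed to a call:

    type big struct {
        elems [32]*int
    }

    func use(f func(*big)) {
        var b big // VARDEF b, then duffzero
        f(&b)     // the LEAQ for &b must be scheduled after the VARDEF
    }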
Change-Id: Ib893562cd05369b91dd563b48fb83f5250950293
Reviewed-on: https://go-review.googlesource.com/c/go/+/452916
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
Reviewed-by: Martin Möhrmann <martin@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
The standard way to generate code in a Go package is via //go:generate
directives, which are invoked by the developer explicitly running:
    go generate import/path/of/said/package
Switch to using that approach here.
This way, developers don't need to learn and remember a custom way that
each particular Go package may choose to implement its code generation.
It also enables conveniences such as 'go generate -n' to discover how
code is generated without running anything (this works on all packages
that rely on //go:generate directives), being able to generate multiple
packages at once and from any directory, and so on.
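A hypothetical instance of the pattern (paths made up): a package
declares its generator in a source file,

    //go:generate go run _gen/main.go -out rewrite_generated.go

and anyone can then regenerate it, or preview what would run, from any
directory:

    go generate import/path/of/said/package
    go generate -n import/path/of/said/package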
Change-Id: I0e5b6a1edeff670a8e588befeef0c445613803c7
Reviewed-on: https://go-review.googlesource.com/c/go/+/460135
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
The existing implementation uses loops to implement bulk memory
operations such as memcpy and memclr. Now that bulk memory operations
have been standardized and are implemented in all major browsers and
engines (see https://webassembly.org/roadmap/), we should use them
to improve performance.
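A hedged sketch of Go patterns that benefit (the loop below is the
form the compiler already recognizes as memclr):

    func clearBytes(b []byte) {
        for i := range b {
            b[i] = 0 // memclr, now a single wasm memory.fill
        }
    }

    func moveBytes(dst, src []byte) {
        copy(dst, src) // memmove, now a single wasm memory.copy
    }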
Updates #28360
Change-Id: I28df0e0350287d5e7e1d1c09a4064ea1054e7575
Reviewed-on: https://go-review.googlesource.com/c/go/+/444935
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Richard Musiol <neelance@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Richard Musiol <neelance@gmail.com>
Reviewed-by: Richard Musiol <neelance@gmail.com>
The gen folder was renamed to _gen in CL 435472, but references in code
and docs were not updated. This updates the references.
Change-Id: Ibadc0cdcb5bed145c3257b58465a8df370487ae5
Reviewed-on: https://go-review.googlesource.com/c/go/+/444355
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Johan Brandhorst-Satzkorn <johan.brandhorst@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
For certain types of method wrappers we used to generate a tail
call. That was disabled in CL 307234 when the register ABI is used,
because with the current IR it was difficult to generate a tail
call with the arguments in the right places. The problem was that
the IR did not contain a CALL-like node with arguments; instead,
it contained an OAS node that adjusts the receiver, then an
OTAILCALL node that contains just the target but no arguments
(with the assumption that the OAS node will put the adjusted
receiver in the right place). With the register ABI, putting
arguments in registers is done in SSA, and the assignment (OAS)
doesn't put the receiver in a register.
This CL changes the IR of a tail call to take an actual OCALL
node. Specifically, a tail call is now represented as

    OTAILCALL (OCALL target args...)
This way, the call target and args are connected through the OCALL
node. So the call can be analyzed in SSA and the args can be passed
in the right places.
(Alternatively, we could have the OTAILCALL node directly take the
target and the args, without the OCALL node. Using an OCALL node is
convenient because existing code that processes OCALL nodes does not
need to be changed. Also, a tail call is similar to
ORETURN (OCALL target args...), except that it doesn't preserve the
frame. I chose the former, but I'm open to changing it.)
The SSA representation is similar. Previously, the IR lowered to
a Store of the receiver, then a BlockRetJmp which jumps to the target
(without putting the arg in a register). Now we use a TailCall op,
which takes the target and the args. The call expansion pass and
the register allocator handle TailCall pretty much like a
StaticCall, and they do the right ABI analysis and put the args
in the right places. (Args other than the receiver are already in
the right places. For register args, no code is generated for them.
For stack args, a self copy is currently generated; I'll work on
optimizing that out.) BlockRetJmp is still used, signaling that this
is a tail call. The actual call is made by the TailCall op, so
BlockRetJmp generates no code (we could use BlockExit if we like).
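For concreteness, a hedged sketch of the kind of wrapper involved
(made-up types; the real wrappers are compiler-generated):

    type T struct{ x int }

    func (t T) M() int { return t.x }

    // Satisfying this interface with *T makes the compiler generate a
    // (*T).M wrapper, which adjusts the receiver and can now tail-call T.M:
    var _ interface{ M() int } = &T{}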
This slightly reduces binary size:
                  old        new
    cmd/go    14003088   13953936
    cmd/link   6275552    6271456
Change-Id: I2d16d8d419fe1f17554916d317427383e17e27f0
Reviewed-on: https://go-review.googlesource.com/c/go/+/350145
Trust: Cherry Mui <cherryyz@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: David Chase <drchase@google.com>
In external linking mode the external linker may insert
trampolines, which use R12 as a scratch register. So a call could
potentially clobber R12 if the target is laid out too far away. Mark
R12 as clobbered.
Also, we will use R12 for trampolines in the Go linker as well.
CL 310731 updated the generated rewrite files so imports are
grouped, but the generator was not updated to do so. Grouped
imports are nice, but as those are generated files, for simplicity
(and out of laziness) just regenerate them with the current
generator (which leaves imports ungrouped).
Change-Id: Iddb741ff7314a291ade5fbffc7d315f555808409
Reviewed-on: https://go-review.googlesource.com/c/go/+/314453
Trust: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
The go/build package needs access to this configuration,
so move it into a new package available to the standard library.
Change-Id: I868a94148b52350c76116451f4ad9191246adcff
Reviewed-on: https://go-review.googlesource.com/c/go/+/310731
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Jay Conrod <jayconrod@google.com>
This doesn't change any behavior, but should help the compiler realise
that these funcs really do nothing at all.
Change-Id: Ib26c02ef264691acac983538ec300f91d6ff98db
Reviewed-on: https://go-review.googlesource.com/c/go/+/280314
Trust: Daniel Martí <mvdan@mvdan.cc>
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Add a generic rule to rewrite a single-precision square root expression
as a single single-precision instruction. The optimization removes two
precision conversions between double precision and single precision.
On the arm64 platform:

previous:

    FCVTSD F0, F0
    FSQRTD F0, F0
    FCVTDS F0, F0

optimized:

    FSQRTS S0, S0

This patch also adds a test case to check correctness.
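In Go source, the pattern being optimized looks like this (a hedged
illustration; assumes "math" is imported):

    func sqrt32(x float32) float32 {
        // previously: widen to float64, FSQRTD, narrow back;
        // now a single FSQRTS on arm64
        return float32(math.Sqrt(float64(x)))
    }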
This patch refers to CL 241877, contributed by Alice Xu
(dianhong.xu@arm.com)
Change-Id: I6de5d02281c693017ac4bd4c10963dd55989bd7e
Reviewed-on: https://go-review.googlesource.com/c/go/+/276873
Trust: fannie zhang <Fannie.Zhang@arm.com>
Run-TryBot: fannie zhang <Fannie.Zhang@arm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
An address of offset(SP) may point to the callee args area, and
may be used to move things into/out of the args/results. If an
address like that is spilled and picked up by the GC, it may hold
an arg/result live in the callee, which may not actually be live
(e.g. a result not initialized at function entry). Make sure
they are rematerializeable, so they are always short-lived and
never picked up by the GC.
This CL changes 386, PPC64, and Wasm. On AMD64 we already have
the rule (line 2159). On other architectures, we already have
similar rules like

    (OffPtr [off] ptr:(SP)) => (MOVDaddr [int32(off)] ptr)

to avoid this problem. (Probably past me had run into
this...)
Fixes #42944.
Change-Id: Id2ec73ac08f8df1829a9a7ceb8f749d67fe86d1e
Reviewed-on: https://go-review.googlesource.com/c/go/+/275174
Trust: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Keep track of all expressions encountered while
generating a rewrite result, and re-use them whenever possible.
Named expressions may still be used for clarity when desired.
Change-Id: I640dca108763eb8baeff8f9a4169300af3445b82
Reviewed-on: https://go-review.googlesource.com/c/go/+/229800
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Otherwise, just copying the aux and auxint fields doesn't make much sense.
(Although there's no bug - it just means it isn't typechecked correctly.)
Change-Id: I4e21ac67f0c7bfd04ed5af1713cd24bca08af092
Reviewed-on: https://go-review.googlesource.com/c/go/+/227962
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Extend CL 220417 (which removed the integer Greater and Geq ops) to
floating point comparisons. Greater and Geq can always be
implemented using Less and Leq.
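The equivalences being exploited (shown schematically, not as verbatim
rules):

    Greater64F x y  =  Less64F y x
    Geq64F x y      =  Leq64F y x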
Fixes #37316.
Change-Id: Ieaddb4877dd0ff9037a1dd11d0a9a9e45ced71e7
Reviewed-on: https://go-review.googlesource.com/c/go/+/222397
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
For 1.15, unless someone really wants it in 1.14.
A performance-sensitive user thought this would be useful,
though "large" was not well-defined. If 128 is large,
there are 139 static instances of "large" copies in the compiler
itself.
Includes test.
Change-Id: I81f20c62da59d37072429f3a22c1809e6fb2946d
Reviewed-on: https://go-review.googlesource.com/c/go/+/205066
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Trying this CL again, with a fixed test that allows platforms
to disagree on the exact behavior of converting NaNs.
We store 32-bit floating point constants in a 64-bit field, by
converting that 32-bit float to 64-bit float to store it, and convert
it back to use it.
That works for *almost* all floating-point constants. The exception is
signaling NaNs. The round trip described above means we can't represent
a 32-bit signaling NaN, because conversions strip the signaling bit.
To fix this issue, just forbid NaNs as floating-point constants in SSA
form. This shouldn't affect any real-world code, as people seldom
constant-propagate NaNs (except in test code).
Additionally, NaNs are somewhat underspecified (which of the many NaNs
do you get when dividing 0/0?), so when cross-compiling there's a
danger of using the compiler machine's NaN regime for some math, and
the target machine's NaN regime for other math. Better to use the
target machine's NaN regime always.
Update #36400
Change-Id: Idf203b688a15abceabbd66ba290d4e9f63619ecb
Reviewed-on: https://go-review.googlesource.com/c/go/+/221790
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
In cases in which we had a named value whose args were all _,
like this rule from ARM.rules:

    (MOVBUreg x:(MOVBUload _ _)) -> (MOVWreg x)

We previously inserted

    _ = x.Args[1]

even though it is unnecessary.
This change eliminates this pointless bounds check.
And in other cases, we now check bounds just as far as strictly necessary.
No significant movement on any compiler metrics.
Just nicer (and less) code.
Passes toolstash-check -all.
Change-Id: I075dfe9f926cc561cdc705e9ddaab563164bed3a
Reviewed-on: https://go-review.googlesource.com/c/go/+/221781
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This:
* Simplifies and shortens the generated code for rewrite rules.
* Shrinks cmd/compile by 86k (0.4%) and makes it easier to compile.
* Removes the stmt boundary code wrangling from Value.reset,
in favor of doing it in the one place where it actually does some work,
namely the writebarrier pass. (This was ascertained by inspecting the
code for cases in which notStmtBoundary values were generated.)
Passes toolstash-check -all.
Change-Id: I25671d4c4bbd772f235195d11da090878ea2cc07
Reviewed-on: https://go-review.googlesource.com/c/go/+/221421
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Instead of writing AuxInt during prove and then zeroing it during lower,
just don't write it in the first place.
Passes toolstash-check -all.
Change-Id: Iea4b555029a9d69332e835536f9cf3a42b8223db
Reviewed-on: https://go-review.googlesource.com/c/go/+/220682
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
The SSA backend has rules to read the contents of readonly Lsyms.
However, this rule was failing to trigger for many readonly Lsyms.
This is because the readonly attribute that was set on the Node.Name
was not propagated to its Lsym until the dump globals phase, after SSA runs.
To work around this phase ordering problem, introduce Node.SetReadonly,
which sets Node.Name.Readonly and also configures the Lsym
enough that SSA can use it.
This change also fixes a latent problem in the rewrite rule function,
namely that reads past the end of lsym.P were treated as entirely zero,
instead of merely requiring padding with trailing zeros.
This change also adds an amd64 rule needed to fully optimize
the results of this change. It would be better not to need this,
but the zero extension that should handle this for us
gets optimized away too soon (see #36897 for a similar problem).
I have not investigated whether other platforms also need new
rules to take full advantage of the new optimizations.
Compiled code for (interface{})(true) on amd64 goes from:

    LEAQ type.bool(SB), AX
    MOVBLZX ""..stmp_0(SB), BX
    LEAQ runtime.staticbytes(SB), CX
    ADDQ CX, BX

to:

    LEAQ type.bool(SB), AX
    LEAQ runtime.staticbytes+1(SB), BX
Prior to this change, the readonly symbol rewrite rules
fired a total of 884 times during make.bash.
Afterwards they fire 1807 times.
    file         before        after        Δ        %
    cgo         4827832      4823736     -4096  -0.085%
    compile    24907768     24895656    -12112  -0.049%
    fix         3376952      3368760     -8192  -0.243%
    pprof      14751700     14747604     -4096  -0.028%
    total     120343528    120315032    -28496  -0.024%
Change-Id: I59ea52138276c37840f69e30fb109fd376d579ec
Reviewed-on: https://go-review.googlesource.com/c/go/+/220499
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
The generic Greater and Geq ops can always be replaced with the Less and
Leq ops. This CL therefore removes them. This simplifies the compiler since
it reduces the number of operations that need handling in both code and in
rewrite rules. This will be especially true when adding control flow
optimizations such as the integer-in-range optimizations in CL 165998.
Change-Id: If0648b2b19998ac1bddccbf251283f3be4ec3040
Reviewed-on: https://go-review.googlesource.com/c/go/+/220417
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This reverts CL 213477.
Reason for revert: tests are failing on linux-mips*-rtrk builders.
Change-Id: I8168f7450890233f1bd7e53930b73693c26d4dc0
Reviewed-on: https://go-review.googlesource.com/c/go/+/220897
Run-TryBot: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
We store 32-bit floating point constants in a 64-bit field, by
converting that 32-bit float to 64-bit float to store it, and convert
it back to use it.
That works for *almost* all floating-point constants. The exception is
signaling NaNs. The round trip described above means we can't represent
a 32-bit signaling NaN, because conversions strip the signaling bit.
To fix this issue, just forbid NaNs as floating-point constants in SSA
form. This shouldn't affect any real-world code, as people seldom
constant-propagate NaNs (except in test code).
Additionally, NaNs are somewhat underspecified (which of the many NaNs
do you get when dividing 0/0?), so when cross-compiling there's a
danger of using the compiler machine's NaN regime for some math, and
the target machine's NaN regime for other math. Better to use the
target machine's NaN regime always.
This has been a bug since 1.10, and there's an easy workaround
(declare a global variable containing the signaling NaN pattern, and
use that as the argument to math.Float32frombits), so we'll fix it in
1.15.
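A hedged sketch of that workaround (0x7F800001 is one 32-bit
signaling NaN bit pattern; names are illustrative):

    var sNaNBits uint32 = 0x7F800001 // a global defeats constant propagation

    func signalingNaN32() float32 {
        return math.Float32frombits(sNaNBits)
    }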
Fixes #36400
Update #36399
Change-Id: Icf155e743281560eda2eed953d19a829552ccfda
Reviewed-on: https://go-review.googlesource.com/c/go/+/213477
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Also, explicitly zero AuxInt in some ops (like Div),
to make it clear why they do not use an ellipsis.
Passes toolstash-check -all.
Change-Id: I2294d10e47d904d03e489e6ca43d46679323f75d
Reviewed-on: https://go-review.googlesource.com/c/go/+/217009
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
CL 213703 converted generated rewrite rules for commutative ops
to use loops instead of duplicated code.
However, it loaded args using expressions like
v.Args[i] and v.Args[i^1], for which the compiler could
not eliminate bounds checks (even with all outstanding
prove CLs applied).
Also, given a series of separate rewrite rules for the same op,
we generated bounds checks for every rewrite rule, even though
we were repeatedly loading the same set of args.
This change reduces both sets of bounds checks.
Instead of loading v.Args[i] and v.Args[i^1] for commutative loops,
we now preload v.Args[0] and v.Args[1] into local variables,
and then swap them (as needed) in the commutative loop post statement.
And we now load all top level v.Args into local variables
at the beginning of every rewrite rule function.
The second optimization is the more significant,
but the first helps a little, and they play together
nicely from the perspective of generating the code.
This does increase register pressure, but the reduced bounds
checks more than compensate.
Note that the vast majority of rewrite rules evaluated
are not applied, so the prologue is the most important
part of the rewrite rules.
There is one subtle aspect to the new generated code.
Because the top level v.Args are shared across rewrite rules,
and rule evaluation can swap v_0 and v_1, v_0 and v_1
can end up being swapped from one rule to the next.
That is OK, because any time a rule does not get applied,
they will have been swapped exactly twice.
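A hedged sketch of the new generated shape (identifiers are
illustrative):

    v_0 := v.Args[0]
    v_1 := v.Args[1]
    // for each commutative rule:
    for _i := 0; _i <= 1; _i, v_0, v_1 = _i+1, v_1, v_0 {
        // Match against v_0 and v_1; the post statement swaps them,
        // so both operand orders are tried without indexing v.Args.
    }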
Passes toolstash-check -all.
name old time/op new time/op delta
Template 213ms ± 2% 211ms ± 2% -0.85% (p=0.000 n=92+96)
Unicode 83.5ms ± 2% 83.2ms ± 2% -0.41% (p=0.004 n=95+90)
GoTypes 737ms ± 2% 733ms ± 2% -0.51% (p=0.000 n=91+94)
Compiler 3.45s ± 2% 3.43s ± 2% -0.44% (p=0.000 n=99+100)
SSA 8.54s ± 1% 8.32s ± 2% -2.56% (p=0.000 n=96+99)
Flate 136ms ± 2% 135ms ± 1% -0.47% (p=0.000 n=96+96)
GoParser 169ms ± 1% 168ms ± 1% -0.33% (p=0.000 n=96+93)
Reflect 456ms ± 3% 455ms ± 3% ~ (p=0.261 n=95+94)
Tar 186ms ± 2% 185ms ± 2% -0.48% (p=0.000 n=94+95)
XML 251ms ± 1% 250ms ± 1% -0.51% (p=0.000 n=91+94)
[Geo mean] 424ms 421ms -0.68%
name old user-time/op new user-time/op delta
Template 275ms ± 1% 274ms ± 2% -0.55% (p=0.000 n=95+98)
Unicode 118ms ± 4% 118ms ± 4% ~ (p=0.642 n=98+90)
GoTypes 983ms ± 1% 980ms ± 1% -0.30% (p=0.000 n=93+93)
Compiler 4.56s ± 6% 4.52s ± 6% -0.72% (p=0.003 n=100+100)
SSA 11.4s ± 1% 11.1s ± 1% -2.50% (p=0.000 n=96+97)
Flate 168ms ± 1% 167ms ± 1% -0.49% (p=0.000 n=92+92)
GoParser 204ms ± 1% 204ms ± 2% -0.27% (p=0.003 n=99+96)
Reflect 599ms ± 2% 598ms ± 2% ~ (p=0.116 n=95+92)
Tar 227ms ± 2% 225ms ± 2% -0.57% (p=0.000 n=95+98)
XML 313ms ± 2% 312ms ± 1% -0.37% (p=0.000 n=89+95)
[Geo mean] 547ms 544ms -0.61%
    file         before        after       Δ        %
    compile    21113112     21109016    -4096  -0.019%
    total     131704940    131700844    -4096  -0.003%
Change-Id: Id6c39e0367e597c0c75b8a4b1eb14cc3cbd11956
Reviewed-on: https://go-review.googlesource.com/c/go/+/216218
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
First, remove unnecessary "// cond:" lines from the generated files.
This shaves off about 7k lines.
Second, join "if cond { break }" statements via "||", which allows us to
deduplicate a large number of them. This shaves off another ~25k lines.
This change is not for readability or simplicity; but rather, to avoid
unnecessary verbosity that makes the generated files larger. All in all,
git reports that the generated files overall weigh ~200KiB less, or
about 2.7% less.
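Schematically, the joining looks like this (a hedged sketch of the
generated code):

    // before:
    if cond1 {
        break
    }
    if cond2 {
        break
    }
    // after:
    if cond1 || cond2 {
        break
    }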
While at it, add a -trace flag to rulegen.
Updates #33644.
Change-Id: I3fac0290a6066070cc62400bf970a4ae0929470a
Reviewed-on: https://go-review.googlesource.com/c/go/+/196498
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Before this change, wasm only used float variables with a size of 64 bit
and applied rounding to 32 bit precision where necessary. This change
adds proper 32 bit float variables.
Reduces the size of pkg/js_wasm by 254 bytes.
Change-Id: Ieabe846a8cb283d66def3cdf11e2523b3b31f345
Reviewed-on: https://go-review.googlesource.com/c/go/+/195117
Reviewed-by: Cherry Zhang <cherryyz@google.com>
This removes the unnecessary code to check whether the shift
is within limits or not when the shift amount is a constant.
The rules hit 23034 times when building std and cmd:

    grep -E "Wasm.rules:(106|107|121|122|139|140)" rulelog | wc -l
    23034
Reduces the size of pkg/js_wasm by 132 bytes.
Change-Id: I64a2b8faca08c3b5039d6a027d4676130d2db18d
Reviewed-on: https://go-review.googlesource.com/c/go/+/194239
Run-TryBot: Agniva De Sarker <agniva.quicksilver@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Richard Musiol <neelance@gmail.com>
Extend the optimization introduced in CL 141118 to the wasm architecture.
For reference, the rules trigger 212 times while building std and cmd:

    $ GOOS=js GOARCH=wasm gotip build std cmd
    $ grep -E "Wasm.rules:44(1|2|3|4)" rulelog | wc -l
    212
Updates #26498
Change-Id: I153684a2b98589ae812b42268da08b65679e09d1
Reviewed-on: https://go-review.googlesource.com/c/go/+/185477
Run-TryBot: Agniva De Sarker <agniva.quicksilver@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Richard Musiol <neelance@gmail.com>
Use the shiftIsBounded function to generate more efficient
Shift instructions.
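An illustrative rule of this shape (hedged; not necessarily verbatim
from Wasm.rules):

    (Lsh64x64 x y) && shiftIsBounded(v) -> (I64Shl x y)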
Updates #25167
Change-Id: Id350f8462dc3a7ed3bfed0bcbea2860b8f40048a
Reviewed-on: https://go-review.googlesource.com/c/go/+/182558
Run-TryBot: Agniva De Sarker <agniva.quicksilver@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Richard Musiol <neelance@gmail.com>
This CL performs the branchelim optimization on WASM with its
select instruction. The total size of pkg/js_wasm decreased by
about 80KB with this optimization.
Change-Id: I868eb146120a1cac5c4609c8e9ddb07e4da8a1d9
Reviewed-on: https://go-review.googlesource.com/c/go/+/190957
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Richard Musiol <neelance@gmail.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
First, add cpu and memory profiling flags, as these are useful to see
where rulegen is spending its time. It now takes many seconds to run on
a recent laptop, so we have to keep an eye on what it's doing.
Second, stop writing '_ = var' lines to keep imports and variables used
at all times. Now that rulegen removes all such unused names, they're
unnecessary.
To perform the removal, lean on go/types to first detect what names are
unused. We can configure it to give us all the type-checking errors in a
file, so we can collect all "declared but not used" errors in a single
pass.
We then use astutil.Apply to remove the relevant nodes based on the line
information from each unused error. This allows us to apply the changes
without having to do extra parser+printer roundtrips to plaintext, which
are far too expensive.
We need to do multiple such passes, as removing an unused variable
declaration might then make another declaration unused. Two passes are
enough to clean every file at the moment, so add a limit of three passes
for now to avoid eating cpu uncontrollably by accident.
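A minimal sketch of the collection pass, under assumed setup (rulegen's
actual code differs; the helper name is made up):

    import (
        "go/ast"
        "go/token"
        "go/types"
        "strings"
    )

    // unusedErrors type-checks one file and collects every
    // "declared but not used" error in a single pass.
    func unusedErrors(fset *token.FileSet, file *ast.File) []types.Error {
        var unused []types.Error
        conf := types.Config{
            Error: func(err error) { // called for every error, not just the first
                if te, ok := err.(types.Error); ok &&
                    strings.Contains(te.Msg, "declared but not used") {
                    unused = append(unused, te)
                }
            },
        }
        conf.Check("p", fset, []*ast.File{file}, nil) // errors arrive via the hook
        return unused
    }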
The resulting performance of the changes above is a ~30% loss across the
table, since go/types is fairly expensive. The numbers were obtained
with 'benchcmd Rulegen go run *.go', which involves compiling rulegen
itself, but that seems reflective of how the program is used.
name old time/op new time/op delta
Rulegen 5.61s ± 0% 7.36s ± 0% +31.17% (p=0.016 n=5+4)
name old user-time/op new user-time/op delta
Rulegen 7.20s ± 1% 9.92s ± 1% +37.76% (p=0.016 n=5+4)
name old sys-time/op new sys-time/op delta
Rulegen 135ms ±19% 169ms ±17% +25.66% (p=0.032 n=5+5)
name old peak-RSS-bytes new peak-RSS-bytes delta
Rulegen 71.0MB ± 2% 85.6MB ± 2% +20.56% (p=0.008 n=5+5)
We can live with a bit more resource usage, but the time/op getting
close to 10s isn't good. To win that back, introduce concurrency in
main.go. This further increases resource usage a bit, but the real time
on this quad-core laptop is greatly reduced. The final benchstat is as
follows:
name old time/op new time/op delta
Rulegen 5.61s ± 0% 3.97s ± 1% -29.26% (p=0.008 n=5+5)
name old user-time/op new user-time/op delta
Rulegen 7.20s ± 1% 13.91s ± 1% +93.09% (p=0.008 n=5+5)
name old sys-time/op new sys-time/op delta
Rulegen 135ms ±19% 269ms ± 9% +99.17% (p=0.008 n=5+5)
name old peak-RSS-bytes new peak-RSS-bytes delta
Rulegen 71.0MB ± 2% 226.3MB ± 1% +218.72% (p=0.008 n=5+5)
It might be possible to reduce the cpu or memory usage in the future,
such as configuring go/types to do less work, or taking shortcuts to
avoid having to run it many times. For now, ~2x cpu and ~4x memory usage
seems like a fair trade for a faster and better rulegen.
Finally, we can remove the old code that tried to remove some unused
variables in a hacky and unmaintainable way.
Change-Id: Iff9e83e3f253babf5a1bd48cc993033b8550cee6
Reviewed-on: https://go-review.googlesource.com/c/go/+/189798
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
"Division by invariant integers using multiplication" paper
by Granlund and Montgomery contains a method for directly computing
divisibility (x%c == 0 for c constant) by means of the modular inverse.
The method is further elaborated in "Hacker's Delight" by Warren Section 10-17
This general rule can compute divisibilty by one multiplication and a compare
for odd divisors and an additional rotate for even divisors.
To apply the divisibility rule, we must take into account
the rules that rewrite x%c = x-((x/c)*c) and (x/c) for constant c on the first
optimization pass "opt". This complicates the matching, as we want to match
only in the cases where the result of (x/c) is not also available.
So we must match on the expanded form of (x/c) in the expression x == c*(x/c)
in the "late opt" pass, after common subexpression elimination.
Note that if an intermediate opt pass is introduced in the future, we
could simplify these rules by delaying the magic division rewrite to "late opt"
and matching directly on (x/c) in the intermediate opt pass.
Additional rules to lower the generic RotateLeft* ops were also applied.
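As a worked instance of the odd-divisor case, for c = 5 and 32-bit
unsigned x (a hedged sketch of the arithmetic, not the compiler's
generated code):

    func divisibleBy5(x uint32) bool {
        const inv = 0xCCCCCCCD   // multiplicative inverse of 5 mod 2^32
        const limit = 0x33333333 // floor((2^32-1)/5)
        return x*inv <= limit    // one multiply and one compare
    }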
On amd64, the divisibility check is 25-50% faster.
name old time/op new time/op delta
DivconstI64-4 2.08ns ± 0% 2.08ns ± 1% ~ (p=0.881 n=5+5)
DivisibleconstI64-4 2.67ns ± 0% 2.67ns ± 1% ~ (p=1.000 n=5+5)
DivisibleWDivconstI64-4 2.67ns ± 0% 2.67ns ± 0% ~ (p=0.683 n=5+5)
DivconstU64-4 2.08ns ± 1% 2.08ns ± 1% ~ (p=1.000 n=5+5)
DivisibleconstU64-4 2.77ns ± 1% 1.55ns ± 2% -43.90% (p=0.008 n=5+5)
DivisibleWDivconstU64-4 2.99ns ± 1% 2.99ns ± 1% ~ (p=1.000 n=5+5)
DivconstI32-4 1.53ns ± 2% 1.53ns ± 0% ~ (p=1.000 n=5+5)
DivisibleconstI32-4 2.23ns ± 0% 2.25ns ± 3% ~ (p=0.167 n=5+5)
DivisibleWDivconstI32-4 2.27ns ± 1% 2.27ns ± 1% ~ (p=0.429 n=5+5)
DivconstU32-4 1.78ns ± 0% 1.78ns ± 1% ~ (p=1.000 n=4+5)
DivisibleconstU32-4 2.52ns ± 2% 1.26ns ± 0% -49.96% (p=0.000 n=5+4)
DivisibleWDivconstU32-4 2.63ns ± 0% 2.85ns ±10% +8.29% (p=0.016 n=4+5)
DivconstI16-4 1.54ns ± 0% 1.54ns ± 0% ~ (p=0.333 n=4+5)
DivisibleconstI16-4 2.10ns ± 0% 2.10ns ± 1% ~ (p=0.571 n=4+5)
DivisibleWDivconstI16-4 2.22ns ± 0% 2.23ns ± 1% ~ (p=0.556 n=4+5)
DivconstU16-4 1.09ns ± 0% 1.01ns ± 1% -7.74% (p=0.000 n=4+5)
DivisibleconstU16-4 1.83ns ± 0% 1.26ns ± 0% -31.52% (p=0.008 n=5+5)
DivisibleWDivconstU16-4 1.88ns ± 0% 1.89ns ± 1% ~ (p=0.365 n=5+5)
DivconstI8-4 1.54ns ± 1% 1.54ns ± 1% ~ (p=1.000 n=5+5)
DivisibleconstI8-4 2.10ns ± 0% 2.11ns ± 0% ~ (p=0.238 n=5+4)
DivisibleWDivconstI8-4 2.22ns ± 0% 2.23ns ± 2% ~ (p=0.762 n=5+5)
DivconstU8-4 0.92ns ± 1% 0.94ns ± 1% +2.65% (p=0.008 n=5+5)
DivisibleconstU8-4 1.66ns ± 0% 1.26ns ± 1% -24.28% (p=0.008 n=5+5)
DivisibleWDivconstU8-4 1.79ns ± 0% 1.80ns ± 1% ~ (p=0.079 n=4+5)
A follow-up change will address the signed division case.
Updates #30282
Change-Id: I7e995f167179aa5c76bb10fbcbeb49c520943403
Reviewed-on: https://go-review.googlesource.com/c/go/+/168037
Run-TryBot: Brian Kessler <brian.m.kessler@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
A lot of the naked for loops begin like:
    for {
        v := b.Control
        if v.Op != OpConstBool {
            break
        }
        ...
        return true
    }
Instead, write them out in a more compact and readable way:
    for v.Op == OpConstBool {
        ...
        return true
    }
This requires the addition of two bytes.Buffer writers, as this helps us
make a decision based on future pieces of generated code. This probably
makes rulegen slightly slower, but that's not noticeable; the code
generation still takes ~3.5s on my laptop, excluding build time.
The "v := b.Control" declaration can be moved to the top of each
function. Even though the rules can modify b.Control when firing, they
also make the function return, so v can't be used again.
While at it, remove three unnecessary lines from the top of each
rewriteBlock func.
In total, this results in ~4k lines removed from the generated code, and
a slight improvement in readability.
Change-Id: I317e4c6a4842c64df506f4513375475fad2aeec5
Reviewed-on: https://go-review.googlesource.com/c/go/+/167399
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This reduces the number of extra bounds check hints we need to
insert. For example, rather than producing:
    _ = v.Args[2]
    x := v.Args[0]
    y := v.Args[1]
    z := v.Args[2]
We now produce:
    z := v.Args[2]
    x := v.Args[0]
    y := v.Args[1]
This gets rid of about 7000 lines of code from the rewrite rules.
Change-Id: I1291cf0f82e8d035a6d65bce7dee6cedee04cbcd
Reviewed-on: https://go-review.googlesource.com/c/go/+/167397
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The sheer length of the generated rules files makes my
editor and git client unhappy.
This change is a small step towards shortening them.
We recognize a few magic variables during rulegen: b, config, fe, and typ.
Of these, only b appears prone to false positives.
By tightening the heuristic and fixing one case in MIPS.rules,
we can make the heuristic reliable enough that it has no failures.
That allows us to remove the hedge assignments to _,
removing 3000 pointless lines of code.
Change-Id: I080cde5db28c8277cb3fd9ddcd829306c9a27785
Reviewed-on: https://go-review.googlesource.com/c/go/+/166979
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
The names of some instructions have been updated in the WebAssembly
specification to be more consistent, see
994591e51c.
This change to the spec is possible because it is still in a draft
state.
Go's support for WebAssembly is still experimental and thus exempt from
the compatibility promise. Being consistent with the spec should
warrant this breaking change to the assembly instruction names.
Change-Id: Iafb8b18ee7f55dd0e23c6c7824aa1fad43117ef1
Reviewed-on: https://go-review.googlesource.com/c/153797
Run-TryBot: Richard Musiol <neelance@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
On wasm every integer is stored with 64 bits. We can do zero
extension by simply zeroing the upper bits.
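An illustrative rule of the idea (hedged; not necessarily the exact
rule):

    (ZeroExt32to64 x) -> (I64And x (I64Const [0xffffffff]))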
Change-Id: I02c54a38b3b2b7654fff96055edab1b92d48ff32
Reviewed-on: https://go-review.googlesource.com/c/164461
Run-TryBot: Richard Musiol <neelance@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
During development and debugging, I often want to
write noteRule(fmt.Sprintf(...)), and end up
manually adding the import to the generated code.
Let's just make it always available instead.
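For example (illustrative; assumes a value v in scope, as in the
rewrite functions):

    noteRule(fmt.Sprintf("fired for %s", v.LongString()))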
Change-Id: I1e2d47c98ba056e1b5da42e35fb6ad26f1d9cc3d
Reviewed-on: https://go-review.googlesource.com/c/145207
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Martin Möhrmann <moehrmann@google.com>