mirror/go - go - Git Fam. Sieh

Commit Graph

Author	SHA1	Message	Date
David Chase	f7404974da	cmd/compile: finish GOEXPERIMENT=preemptibleloops repair A newish check for branch-likely on single-successor blocks caught a case where the preemption-check inserter was setting "likely" on an unconditional branch. Fixed by checking for that case before setting likely. Also removed an overconservative restriction on parallel compilation for GOEXPERIMENT=preemptibleloops; it works fine, it is just another control-flow transformation. Change-Id: I8e786e6281e0631cac8d80cff67bfb6402b4d225 Reviewed-on: https://go-review.googlesource.com/102317 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2018-03-26 17:44:14 +00:00
Michael Munday	8afa8a3374	cmd/compile: use 32-bit comparisons where possible on s390x We use 32-bit operations for 8- and 16-bit arithmetic, so use them for comparisons too. This won't change performance but it is more consistent and makes testing 8- and 16-bit comparison codegen slightly more straightforward (for follow up CL). Also fix a typo and add some additional double sign and zero extension rules to remove the operations inserted by the comparison rules. Change-Id: I89ec1b0e09cb8be8090cf007be283ad88bba75a4 Reviewed-on: https://go-review.googlesource.com/102556 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-26 17:41:34 +00:00
Hana Kim	68a1c9c400	internal/trace: compute span stats as computing goroutine stats Move part of UserSpan event processing from cmd/trace.analyzeAnnotations to internal/trace.GoroutineStats that returns analyzed per-goroutine execution information. Now the execution information includes list of spans and their execution information. cmd/trace.analyzeAnnotations utilizes the span execution information from internal/trace.GoroutineStats and connects them with task information. Change-Id: Ib7f79a3ba652a4ae55cd81ea17565bcc7e241c5c Reviewed-on: https://go-review.googlesource.com/101917 Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Heschi Kreinick <heschi@google.com> Reviewed-by: Peter Weinberger <pjw@google.com>	2018-03-26 16:59:01 +00:00
Ilya Tocar	24cd112086	cmd/compile/internal/ssa: optimize away double NEG on amd64 When lowering some ops on amd64 we generate additional NEGQ. This may result in code like this: NEGQ R12 NEGQ R12 Optimize it away. Gain is not significant, about ~0.5% gain in geomean in compress/flate and 200 bytes codesize reduction in go tool. Full results below: name old time/op new time/op delta Encode/Digits/Huffman/1e4-6 65.8µs ± 0% 65.7µs ± 0% -0.21% (p=0.010 n=10+9) Encode/Digits/Huffman/1e5-6 633µs ± 0% 632µs ± 0% ~ (p=0.370 n=8+9) Encode/Digits/Huffman/1e6-6 6.30ms ± 1% 6.29ms ± 1% ~ (p=0.796 n=10+10) Encode/Digits/Speed/1e4-6 281µs ± 0% 280µs ± 1% -0.34% (p=0.043 n=8+10) Encode/Digits/Speed/1e5-6 2.66ms ± 0% 2.66ms ± 0% -0.09% (p=0.043 n=10+10) Encode/Digits/Speed/1e6-6 26.3ms ± 0% 26.3ms ± 0% ~ (p=0.190 n=10+10) Encode/Digits/Default/1e4-6 554µs ± 0% 557µs ± 0% +0.46% (p=0.001 n=9+10) Encode/Digits/Default/1e5-6 8.63ms ± 1% 8.62ms ± 1% ~ (p=0.912 n=10+10) Encode/Digits/Default/1e6-6 92.7ms ± 1% 92.2ms ± 1% ~ (p=0.052 n=10+10) Encode/Digits/Compression/1e4-6 558µs ± 1% 557µs ± 1% ~ (p=0.481 n=10+10) Encode/Digits/Compression/1e5-6 8.58ms ± 0% 8.61ms ± 1% ~ (p=0.315 n=8+10) Encode/Digits/Compression/1e6-6 92.3ms ± 1% 92.4ms ± 1% ~ (p=0.971 n=10+10) Encode/Twain/Huffman/1e4-6 89.5µs ± 0% 89.0µs ± 1% -0.48% (p=0.001 n=9+9) Encode/Twain/Huffman/1e5-6 727µs ± 1% 728µs ± 0% ~ (p=0.604 n=10+9) Encode/Twain/Huffman/1e6-6 7.21ms ± 0% 7.19ms ± 1% ~ (p=0.696 n=8+10) Encode/Twain/Speed/1e4-6 320µs ± 1% 321µs ± 1% ~ (p=0.353 n=10+10) Encode/Twain/Speed/1e5-6 2.63ms ± 0% 2.62ms ± 1% -0.33% (p=0.016 n=8+10) Encode/Twain/Speed/1e6-6 25.8ms ± 0% 25.8ms ± 0% ~ (p=0.360 n=10+8) Encode/Twain/Default/1e4-6 677µs ± 1% 671µs ± 1% -0.88% (p=0.000 n=10+10) Encode/Twain/Default/1e5-6 10.5ms ± 1% 10.3ms ± 0% -2.06% (p=0.000 n=10+10) Encode/Twain/Default/1e6-6 113ms ± 1% 111ms ± 1% -1.96% (p=0.000 n=10+9) Encode/Twain/Compression/1e4-6 688µs ± 0% 679µs ± 1% -1.30% (p=0.000 n=7+10) Encode/Twain/Compression/1e5-6 11.6ms ± 1% 11.3ms ± 1% -2.10% (p=0.000 n=10+10) Encode/Twain/Compression/1e6-6 126ms ± 1% 124ms ± 0% -1.57% (p=0.000 n=10+10) [Geo mean] 3.45ms 3.44ms -0.46% name old speed new speed delta Encode/Digits/Huffman/1e4-6 152MB/s ± 0% 152MB/s ± 0% +0.21% (p=0.009 n=10+9) Encode/Digits/Huffman/1e5-6 158MB/s ± 0% 158MB/s ± 0% ~ (p=0.336 n=8+9) Encode/Digits/Huffman/1e6-6 159MB/s ± 1% 159MB/s ± 1% ~ (p=0.781 n=10+10) Encode/Digits/Speed/1e4-6 35.6MB/s ± 0% 35.7MB/s ± 1% +0.34% (p=0.020 n=8+10) Encode/Digits/Speed/1e5-6 37.6MB/s ± 0% 37.7MB/s ± 0% +0.09% (p=0.049 n=10+10) Encode/Digits/Speed/1e6-6 38.0MB/s ± 0% 38.0MB/s ± 0% ~ (p=0.146 n=10+10) Encode/Digits/Default/1e4-6 18.0MB/s ± 0% 18.0MB/s ± 0% -0.45% (p=0.002 n=9+10) Encode/Digits/Default/1e5-6 11.6MB/s ± 1% 11.6MB/s ± 1% ~ (p=0.644 n=10+10) Encode/Digits/Default/1e6-6 10.8MB/s ± 1% 10.8MB/s ± 1% +0.51% (p=0.044 n=10+10) Encode/Digits/Compression/1e4-6 17.9MB/s ± 1% 17.9MB/s ± 1% ~ (p=0.468 n=10+10) Encode/Digits/Compression/1e5-6 11.7MB/s ± 0% 11.6MB/s ± 1% ~ (p=0.322 n=8+10) Encode/Digits/Compression/1e6-6 10.8MB/s ± 1% 10.8MB/s ± 1% ~ (p=0.983 n=10+10) Encode/Twain/Huffman/1e4-6 112MB/s ± 0% 112MB/s ± 1% +0.42% (p=0.002 n=8+9) Encode/Twain/Huffman/1e5-6 138MB/s ± 1% 137MB/s ± 0% ~ (p=0.616 n=10+9) Encode/Twain/Huffman/1e6-6 139MB/s ± 0% 139MB/s ± 1% ~ (p=0.652 n=8+10) Encode/Twain/Speed/1e4-6 31.3MB/s ± 1% 31.2MB/s ± 1% ~ (p=0.342 n=10+10) Encode/Twain/Speed/1e5-6 38.0MB/s ± 0% 38.1MB/s ± 1% +0.33% (p=0.011 n=8+10) Encode/Twain/Speed/1e6-6 38.8MB/s ± 0% 38.7MB/s ± 0% ~ (p=0.325 n=10+8) Encode/Twain/Default/1e4-6 14.8MB/s ± 1% 14.9MB/s ± 1% +0.88% (p=0.000 n=10+10) Encode/Twain/Default/1e5-6 9.48MB/s ± 1% 9.68MB/s ± 0% +2.11% (p=0.000 n=10+10) Encode/Twain/Default/1e6-6 8.86MB/s ± 1% 9.03MB/s ± 1% +1.97% (p=0.000 n=10+9) Encode/Twain/Compression/1e4-6 14.5MB/s ± 0% 14.7MB/s ± 1% +1.31% (p=0.000 n=7+10) Encode/Twain/Compression/1e5-6 8.63MB/s ± 1% 8.82MB/s ± 1% +2.17% (p=0.000 n=10+10) Encode/Twain/Compression/1e6-6 7.92MB/s ± 1% 8.05MB/s ± 1% +1.59% (p=0.000 n=10+10) [Geo mean] 29.0MB/s 29.1MB/s +0.47% // symSizeComp `which go` go_old: section differences: global text (code) = 203 bytes (0.005131%) read-only data = 1 bytes (0.000057%) Total difference 204 bytes (0.003297%) Change-Id: Ie2cdfa1216472d78694fff44d215b3b8e71cf7bf Reviewed-on: https://go-review.googlesource.com/102277 Run-TryBot: Ilya Tocar <ilya.tocar@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-26 16:19:53 +00:00
Hana (Hyang-Ah) Kim	ea1f483240	cmd/trace: beautify goroutine page - Summary: also includes links to pprof data. - Sortable table: sorting is done on server-side. The intention is that later, I want to add pagination feature and limit the page size the browser has to handle. - Stacked horizontal bar graph to present total time breakdown. - Human-friendly time representation. - No dependency on external fancy javascript libraries to allow it to function without an internet connection. Change-Id: I91e5c26746e59ad0329dfb61e096e11f768c7b73 Reviewed-on: https://go-review.googlesource.com/102156 Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Andrew Bonventre <andybons@golang.org> Reviewed-by: Heschi Kreinick <heschi@google.com>	2018-03-26 15:36:56 +00:00
Alberto Donizetti	2ba98f1ae9	cmd/compile: avoid some allocations in regalloc Compilebench: name old time/op new time/op delta Template 283ms ± 3% 281ms ± 4% ~ (p=0.242 n=20+20) Unicode 137ms ± 6% 135ms ± 6% ~ (p=0.194 n=20+19) GoTypes 890ms ± 2% 883ms ± 1% -0.74% (p=0.001 n=19+19) Compiler 4.21s ± 2% 4.20s ± 2% -0.40% (p=0.033 n=20+19) SSA 9.86s ± 2% 9.68s ± 1% -1.80% (p=0.000 n=20+19) Flate 185ms ± 5% 185ms ± 7% ~ (p=0.429 n=20+20) GoParser 222ms ± 3% 222ms ± 4% ~ (p=0.588 n=19+20) Reflect 572ms ± 2% 570ms ± 3% ~ (p=0.113 n=19+20) Tar 263ms ± 4% 259ms ± 2% -1.41% (p=0.013 n=20+20) XML 321ms ± 2% 321ms ± 4% ~ (p=0.835 n=20+19) name old user-time/op new user-time/op delta Template 400ms ± 5% 405ms ± 5% ~ (p=0.096 n=20+20) Unicode 217ms ± 8% 213ms ± 8% ~ (p=0.242 n=20+20) GoTypes 1.23s ± 3% 1.22s ± 3% ~ (p=0.923 n=19+20) Compiler 5.76s ± 6% 5.81s ± 2% ~ (p=0.687 n=20+19) SSA 14.2s ± 4% 14.0s ± 4% ~ (p=0.121 n=20+20) Flate 248ms ± 7% 251ms ±10% ~ (p=0.369 n=20+20) GoParser 308ms ± 5% 305ms ± 6% ~ (p=0.336 n=19+20) Reflect 771ms ± 2% 766ms ± 2% ~ (p=0.113 n=20+19) Tar 370ms ± 5% 362ms ± 7% -2.06% (p=0.036 n=19+20) XML 435ms ± 4% 432ms ± 5% ~ (p=0.369 n=20+20) name old alloc/op new alloc/op delta Template 39.5MB ± 0% 39.4MB ± 0% -0.20% (p=0.000 n=20+20) Unicode 29.1MB ± 0% 29.1MB ± 0% ~ (p=0.064 n=20+20) GoTypes 117MB ± 0% 117MB ± 0% -0.17% (p=0.000 n=20+20) Compiler 503MB ± 0% 502MB ± 0% -0.15% (p=0.000 n=19+19) SSA 1.42GB ± 0% 1.42GB ± 0% -0.16% (p=0.000 n=20+20) Flate 25.3MB ± 0% 25.3MB ± 0% -0.19% (p=0.000 n=20+20) GoParser 31.4MB ± 0% 31.3MB ± 0% -0.14% (p=0.000 n=20+18) Reflect 78.1MB ± 0% 77.9MB ± 0% -0.34% (p=0.000 n=20+19) Tar 40.1MB ± 0% 40.0MB ± 0% -0.17% (p=0.000 n=20+20) XML 45.3MB ± 0% 45.2MB ± 0% -0.13% (p=0.000 n=20+20) name old allocs/op new allocs/op delta Template 393k ± 0% 392k ± 0% -0.21% (p=0.000 n=20+19) Unicode 337k ± 0% 337k ± 0% -0.02% (p=0.000 n=20+20) GoTypes 1.22M ± 0% 1.22M ± 0% -0.21% (p=0.000 n=20+20) Compiler 4.77M ± 0% 4.76M ± 0% -0.16% (p=0.000 n=20+20) SSA 11.8M ± 0% 11.8M ± 0% -0.12% (p=0.000 n=20+20) Flate 242k ± 0% 241k ± 0% -0.20% (p=0.000 n=20+20) GoParser 324k ± 0% 324k ± 0% -0.14% (p=0.000 n=20+20) Reflect 985k ± 0% 981k ± 0% -0.38% (p=0.000 n=20+20) Tar 403k ± 0% 402k ± 0% -0.19% (p=0.000 n=20+20) XML 424k ± 0% 424k ± 0% -0.16% (p=0.000 n=19+20) Change-Id: I131e382b64cd6db11a9263a477d45d80c180c499 Reviewed-on: https://go-review.googlesource.com/102421 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-25 18:27:49 +00:00
Daniel Martí	b1892d740e	cmd/compile/internal/gc: various cleanups Remove a couple of unnecessary var declarations, an unused sort.Sort type, and simplify a range by using the two-name variant. Change-Id: Ia251f634db0bfbe8b1d553b8659272ddbd13b2c3 Reviewed-on: https://go-review.googlesource.com/102336 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-03-24 19:44:18 +00:00
Daniel Nephin	5526ef1c51	cmd/test2json: document missing "skip" action Change-Id: I906e61170279f0647598e2fd4fa931aac1b69288 GitHub-Last-Rev: `f6df43e8e1` GitHub-Pull-Request: golang/go#24517 Reviewed-on: https://go-review.googlesource.com/102396 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-24 18:37:22 +00:00
Alberto Donizetti	a27cd4fd31	test/codegen: port tbz/tbnz arm64 tests And delete them from asm_test. Change-Id: I34fcf85ae8ce09cd146fe4ce6a0ae7616bd97e2d Reviewed-on: https://go-review.googlesource.com/102296 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Giovanni Bajo <rasky@develer.com>	2018-03-24 09:35:53 +00:00
Giovanni Bajo	d54902ece9	cmd/compile: in prove, shortcircuit self-facts Sometimes, we can end up calling update with a self-relation about a variable (x REL x). In this case, there is no need to record anything: the relation is unsatisfiable if and only if it doesn't contain eq. This also helps avoiding infinite loop in next CL that will introduce transitive closure of relations. Passes toolstash -cmp. Change-Id: Ic408452ec1c13653f22ada35466ec98bc14aaa8e Reviewed-on: https://go-review.googlesource.com/100276 Reviewed-by: Austin Clements <austin@google.com>	2018-03-24 03:06:21 +00:00
Giovanni Bajo	385d936fb2	cmd/compile: in prove, fail fast when unsat is found When an unsatisfiable relation is recorded in the facts table, there is no need to compute further relations or updates additional data structures. Since we're about to transitively propagate relations, make sure to fail as fast as possible to avoid doing useless work in dead branches. Passes toolstash -cmp. Change-Id: I23eed376d62776824c33088163c7ac9620abce85 Reviewed-on: https://go-review.googlesource.com/100275 Reviewed-by: Austin Clements <austin@google.com>	2018-03-24 03:06:01 +00:00
Giovanni Bajo	79112707bb	cmd/compile: add patterns for bit set/clear/complement on amd64 This patch completes implementation of BT(Q\|L), and adds support for BT(S\|R\|C)(Q\|L). Example of code changes from time.(*Time).addSec: if t.wall&hasMonotonic != 0 { 0x1073465 488b08 MOVQ 0(AX), CX 0x1073468 4889ca MOVQ CX, DX 0x107346b 48c1e93f SHRQ $0x3f, CX 0x107346f 48c1e13f SHLQ $0x3f, CX 0x1073473 48f7c1ffffffff TESTQ $-0x1, CX 0x107347a 746b JE 0x10734e7 if t.wall&hasMonotonic != 0 { 0x1073435 488b08 MOVQ 0(AX), CX 0x1073438 480fbae13f BTQ $0x3f, CX 0x107343d 7363 JAE 0x10734a2 Another example: t.wall = t.wall&nsecMask \| uint64(dsec)<<nsecShift \| hasMonotonic 0x10734c8 4881e1ffffff3f ANDQ $0x3fffffff, CX 0x10734cf 48c1e61e SHLQ $0x1e, SI 0x10734d3 4809ce ORQ CX, SI 0x10734d6 48b90000000000000080 MOVQ $0x8000000000000000, CX 0x10734e0 4809f1 ORQ SI, CX 0x10734e3 488908 MOVQ CX, 0(AX) t.wall = t.wall&nsecMask \| uint64(dsec)<<nsecShift \| hasMonotonic 0x107348b 4881e2ffffff3f ANDQ $0x3fffffff, DX 0x1073492 48c1e61e SHLQ $0x1e, SI 0x1073496 4809f2 ORQ SI, DX 0x1073499 480fbaea3f BTSQ $0x3f, DX 0x107349e 488910 MOVQ DX, 0(AX) Go1 benchmarks seem unaffected, and I would be surprised otherwise: name old time/op new time/op delta BinaryTree17-4 2.64s ± 4% 2.56s ± 9% -2.92% (p=0.008 n=9+9) Fannkuch11-4 2.90s ± 1% 2.95s ± 3% +1.76% (p=0.010 n=10+9) FmtFprintfEmpty-4 35.3ns ± 1% 34.5ns ± 2% -2.34% (p=0.004 n=9+8) FmtFprintfString-4 57.0ns ± 1% 58.4ns ± 5% +2.52% (p=0.029 n=9+10) FmtFprintfInt-4 59.8ns ± 3% 59.8ns ± 6% ~ (p=0.565 n=10+10) FmtFprintfIntInt-4 93.9ns ± 3% 91.2ns ± 5% -2.94% (p=0.014 n=10+9) FmtFprintfPrefixedInt-4 107ns ± 6% 104ns ± 6% ~ (p=0.099 n=10+10) FmtFprintfFloat-4 187ns ± 3% 188ns ± 3% ~ (p=0.505 n=10+9) FmtManyArgs-4 410ns ± 1% 415ns ± 6% ~ (p=0.649 n=8+10) GobDecode-4 5.30ms ± 3% 5.27ms ± 3% ~ (p=0.436 n=10+10) GobEncode-4 4.62ms ± 5% 4.47ms ± 2% -3.24% (p=0.001 n=9+10) Gzip-4 197ms ± 4% 193ms ± 3% ~ (p=0.123 n=10+10) Gunzip-4 30.4ms ± 3% 30.1ms ± 3% ~ (p=0.481 n=10+10) HTTPClientServer-4 76.3µs ± 1% 76.0µs ± 1% ~ (p=0.236 n=8+9) JSONEncode-4 10.5ms ± 9% 10.3ms ± 3% ~ (p=0.280 n=10+10) JSONDecode-4 42.3ms ±10% 41.3ms ± 2% ~ (p=0.053 n=9+10) Mandelbrot200-4 3.80ms ± 2% 3.72ms ± 2% -2.15% (p=0.001 n=9+10) GoParse-4 2.88ms ±10% 2.81ms ± 2% ~ (p=0.247 n=10+10) RegexpMatchEasy0_32-4 69.5ns ± 4% 68.6ns ± 2% ~ (p=0.171 n=10+10) RegexpMatchEasy0_1K-4 165ns ± 3% 162ns ± 3% ~ (p=0.137 n=10+10) RegexpMatchEasy1_32-4 65.7ns ± 6% 64.4ns ± 2% -2.02% (p=0.037 n=10+10) RegexpMatchEasy1_1K-4 278ns ± 2% 279ns ± 3% ~ (p=0.991 n=8+9) RegexpMatchMedium_32-4 99.3ns ± 3% 98.5ns ± 4% ~ (p=0.457 n=10+9) RegexpMatchMedium_1K-4 30.1µs ± 1% 30.4µs ± 2% ~ (p=0.173 n=8+10) RegexpMatchHard_32-4 1.40µs ± 2% 1.41µs ± 4% ~ (p=0.565 n=10+10) RegexpMatchHard_1K-4 42.5µs ± 1% 41.5µs ± 3% -2.13% (p=0.002 n=8+9) Revcomp-4 332ms ± 4% 328ms ± 5% ~ (p=0.720 n=9+10) Template-4 48.3ms ± 2% 49.6ms ± 3% +2.56% (p=0.002 n=8+10) TimeParse-4 252ns ± 2% 249ns ± 3% ~ (p=0.116 n=9+10) TimeFormat-4 262ns ± 4% 252ns ± 3% -4.01% (p=0.000 n=9+10) name old speed new speed delta GobDecode-4 145MB/s ± 3% 146MB/s ± 3% ~ (p=0.436 n=10+10) GobEncode-4 166MB/s ± 5% 172MB/s ± 2% +3.28% (p=0.001 n=9+10) Gzip-4 98.6MB/s ± 4% 100.4MB/s ± 3% ~ (p=0.123 n=10+10) Gunzip-4 639MB/s ± 3% 645MB/s ± 3% ~ (p=0.481 n=10+10) JSONEncode-4 185MB/s ± 8% 189MB/s ± 3% ~ (p=0.280 n=10+10) JSONDecode-4 46.0MB/s ± 9% 47.0MB/s ± 2% +2.21% (p=0.046 n=9+10) GoParse-4 20.1MB/s ± 9% 20.6MB/s ± 2% ~ (p=0.239 n=10+10) RegexpMatchEasy0_32-4 460MB/s ± 4% 467MB/s ± 2% ~ (p=0.165 n=10+10) RegexpMatchEasy0_1K-4 6.19GB/s ± 3% 6.28GB/s ± 3% ~ (p=0.165 n=10+10) RegexpMatchEasy1_32-4 487MB/s ± 5% 497MB/s ± 2% +2.00% (p=0.043 n=10+10) RegexpMatchEasy1_1K-4 3.67GB/s ± 2% 3.67GB/s ± 3% ~ (p=0.963 n=8+9) RegexpMatchMedium_32-4 10.1MB/s ± 3% 10.1MB/s ± 4% ~ (p=0.435 n=10+9) RegexpMatchMedium_1K-4 34.0MB/s ± 1% 33.7MB/s ± 2% ~ (p=0.173 n=8+10) RegexpMatchHard_32-4 22.9MB/s ± 2% 22.7MB/s ± 4% ~ (p=0.565 n=10+10) RegexpMatchHard_1K-4 24.0MB/s ± 3% 24.7MB/s ± 3% +2.64% (p=0.001 n=9+9) Revcomp-4 766MB/s ± 4% 775MB/s ± 5% ~ (p=0.720 n=9+10) Template-4 40.2MB/s ± 2% 39.2MB/s ± 3% -2.47% (p=0.002 n=8+10) The rules match ~1800 times during all.bash. Fixes #18943 Change-Id: I64be1ada34e89c486dfd935bf429b35652117ed4 Reviewed-on: https://go-review.googlesource.com/94766 Run-TryBot: Giovanni Bajo <rasky@develer.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-24 02:38:50 +00:00
isharipo	3afd2d7fc8	cmd/compile/internal/gc: properly initialize ssa.Func Type field The ssa.Func has Type field that is described as function signature type. It never gets any value and remains nil. This leads to "<T>" signature printed representation. Given this function declaration: func foo(x int, f func() string) (int, error) GOSSAFUNC printed it as below: compiling foo foo <T> After this change: compiling foo foo func(int, func() string) (int, error) Change-Id: Iec5eec8aac5c76ff184659e30f41b2f5fe86d329 Reviewed-on: https://go-review.googlesource.com/102375 Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2018-03-24 02:18:52 +00:00
Matthew Dempsky	ea668e18a6	cmd/compile: always write pack files By always writing out pack files, the object file format can be simplified somewhat. In particular, the export data format will no longer require escaping, because the pack file provides appropriate framing. This CL does not affect build systems that use -pack, which includes all major Go build systems (cmd/go, gb, bazel). Also, existing package import logic already distinguishes pack/object files based on file contents rather than file extension. The only exception is cmd/pack, which specially handled object files created by cmd/compile when used with the 'c' mode. This mode is extended to now recognize the pack files produced by cmd/compile and handle them as before. Passes toolstash-check. Updates #21705. Updates #24512. Change-Id: Idf131013bfebd73a5cde7e087eb19964503a9422 Reviewed-on: https://go-review.googlesource.com/102236 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-03-24 00:51:24 +00:00
Matthew Dempsky	699b0d4e52	cmd/link: skip __.PKGDEF in archives The __.PKGDEF file is a compiler object file only intended for other compilers. Also, for build systems that use -linkobj, all of the information it contains is present within the linker object files already, so look for it there instead. This requires a little bit of code reorganization. Significantly, previously when loading an archive file, the __.PKGDEF file was authoritative on whether the package was "main" and/or "safe". Now that we're using the Go object files instead, there's the issue that there can be multiple Go object files in an archive (because when using assembly, each assembly file becomes its own additional object file). The solution taken here is to check if any object file within the package declares itself as "main" and/or "safe". Updates #24512. Change-Id: I70243a293bdf34b8555c0bf1833f8933b2809449 Reviewed-on: https://go-review.googlesource.com/102281 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-03-24 00:17:33 +00:00
Matthew Dempsky	946af1b658	cmd/link: make sure we're hashing __.PKGDEF in genhash This is currently always the case because loadobjfile complains if it's not, but that will be changed soon. Updates #24512. Change-Id: I262daca765932a0f4cea3fcc1cc80ca90de07a59 Reviewed-on: https://go-review.googlesource.com/102280 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-03-23 22:28:03 +00:00
Hyang-Ah Hana Kim	58734039bd	Revert "cmd/vendor/.../pprof: refresh from upstream@a74ae6f" This reverts commit `c6e69ec7f9`. Reason for revert: Broke builders. #24508 Change-Id: I66abff0dd14ec6e1f8d8d982ccfb0438633b639d Reviewed-on: https://go-review.googlesource.com/102316 Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>	2018-03-23 15:09:04 +00:00
Hana (Hyang-Ah) Kim	c6e69ec7f9	cmd/vendor/.../pprof: refresh from upstream@a74ae6f Merges updates listed in `0e0e5b725...a74ae6f` Update #24443 cmd/vendor/vendor.json was updated manually. Change-Id: I15d5fe82ac18263d4d54f5773cee0e197e93dd59 Reviewed-on: https://go-review.googlesource.com/101736 Reviewed-by: Alberto Donizetti <alb.donizetti@gmail.com>	2018-03-23 14:36:29 +00:00
Matthew Dempsky	50921bfa2e	cmd/compile: change unsafeUintptrTag from var to const Change-Id: Ie30878199e24cce5b75428e6b602c017ebd16642 Reviewed-on: https://go-review.googlesource.com/102175 Run-TryBot: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Daniel Martí <mvdan@mvdan.cc>	2018-03-22 19:38:06 +00:00
Daniel Martí	02798ed936	cmd/compile: use more range fors in gc Slightly simplifies the code. Made sure to exclude the cases that would change behavior, such as when the iterated value is a string, when the index is modified within the body, or when the slice is modified. Also checked that all the elements are of pointer type, to avoid the corner case where non-pointer types could be copied by mistake. Change-Id: Iea64feb2a9a6a4c94ada9ff3ace40ee173505849 Reviewed-on: https://go-review.googlesource.com/100557 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2018-03-22 18:38:19 +00:00
Austin Clements	48f990b4a5	cmd/compile: fix GOEXPERIMENT=preemptibleloops type-checking This experiment has gone stale. It causes a type-checking failure because the condition of the OIF produced by range loop lowering has type "untyped bool". Fix this by typechecking the whole OIF statement, not just its condition. This doesn't quite fix the whole experiment, but it gets further. Something about preemption point insertion is causing failures like "internal compiler error: likeliness prediction 1 for block b10 with 1 successors" in cmd/compile/internal/gc. Change-Id: I7d80d618d7c91c338bf5f2a8dc174d582a479df3 Reviewed-on: https://go-review.googlesource.com/102157 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2018-03-22 18:20:31 +00:00
Travis Bischel	4f7b774822	cmd/compile: specialize Move up to 79B on amd64 Move currently uses mov instructions directly up to 31 bytes and then switches to duffcopy. Moving 31 bytes is 4 instructions corresponding to two loads and two stores, (or 6 if !useSSE) depending on the usage, duffcopy is five (one or two mov, two or three lea, one call). This adds direct mov instructions for Move's of size 32, 48, and 64 with sse and for only size 32 without. With useSSE: - 32 is 4 instructions (byte +/- comparison below) - 33 thru 48 is 6 - 49 thru 64 is 8 Without: - 32 is 8 Note that the only platform with useSSE set to false is plan 9. I have built three projects based off tip and tip with this patch and the project's byte size is equal to or less than they were prior. The basis of this change is that copying data with instructions directly is nearly free, whereas calling into duffcopy adds a bit of overhead. This is most noticeable in range statements where elements are 32+ bytes. For code with the following pattern: func Benchmark32Range(b testing.B) { var f s32 for _, count := range []int{10, 100, 1000, 10000} { name := strconv.Itoa(count) b.Run(name, func(b testing.B) { base := make([]s32, count) for i := 0; i < b.N; i++ { for _, v := range base { f = v } } }) } _ = f } These are the resulting benchmarks: Benchmark16Range/10-4 19.1 19.1 +0.00% Benchmark16Range/100-4 169 170 +0.59% Benchmark16Range/1000-4 1684 1691 +0.42% Benchmark16Range/10000-4 18147 18124 -0.13% Benchmark31Range/10-4 141 142 +0.71% Benchmark31Range/100-4 1407 1410 +0.21% Benchmark31Range/1000-4 14070 14074 +0.03% Benchmark31Range/10000-4 141781 141759 -0.02% Benchmark32Range/10-4 71.4 32.2 -54.90% Benchmark32Range/100-4 695 326 -53.09% Benchmark32Range/1000-4 7166 3313 -53.77% Benchmark32Range/10000-4 72571 35425 -51.19% Benchmark64Range/10-4 87.8 64.9 -26.08% Benchmark64Range/100-4 868 629 -27.53% Benchmark64Range/1000-4 9355 6907 -26.17% Benchmark64Range/10000-4 94463 70385 -25.49% Benchmark79Range/10-4 177 152 -14.12% Benchmark79Range/100-4 1769 1531 -13.45% Benchmark79Range/1000-4 17893 15532 -13.20% Benchmark79Range/10000-4 178947 155551 -13.07% Benchmark80Range/10-4 99.6 99.7 +0.10% Benchmark80Range/100-4 987 985 -0.20% Benchmark80Range/1000-4 10573 10560 -0.12% Benchmark80Range/10000-4 106792 106639 -0.14% For runtime's BenchCopyFat* benchmarks: CopyFat8-4 0.40ns ± 0% 0.40ns ± 0% ~ (all equal) CopyFat12-4 0.40ns ± 0% 0.80ns ± 0% +100.00% (p=0.000 n=9+9) CopyFat16-4 0.40ns ± 0% 0.80ns ± 0% +100.00% (p=0.000 n=10+8) CopyFat24-4 0.80ns ± 0% 0.40ns ± 0% -50.00% (p=0.001 n=8+9) CopyFat32-4 2.01ns ± 0% 0.40ns ± 0% -80.10% (p=0.000 n=8+8) CopyFat64-4 2.87ns ± 0% 0.40ns ± 0% -86.07% (p=0.000 n=8+10) CopyFat128-4 4.82ns ± 0% 4.82ns ± 0% ~ (p=1.000 n=8+8) CopyFat256-4 8.83ns ± 0% 8.83ns ± 0% ~ (p=1.000 n=8+8) CopyFat512-4 16.9ns ± 0% 16.9ns ± 0% ~ (all equal) CopyFat520-4 14.6ns ± 0% 14.6ns ± 1% ~ (p=0.529 n=8+9) CopyFat1024-4 32.9ns ± 0% 33.0ns ± 0% +0.20% (p=0.041 n=8+9) Function calls are not benefitted as much due how they are compiled, but other benchmarks I ran show that calling function with 64 byte elements is marginally improved. The main downside with this change is that it may increase binary sizes depending on the size of the copy, but this change also decreases binaries for moves of 48 bytes or less. For the following code: package main type size [32]byte //go:noinline func use(t size) { } //go:noinline func get() size { var z size return z } func main() { var a size use(a) } Changing size around gives the following assembly leading up to the call (the initialization and actual call are removed): tip func call with 32B arg: 27B 48 89 e7 mov %rsp,%rdi 48 8d 74 24 20 lea 0x20(%rsp),%rsi 48 89 6c 24 f0 mov %rbp,-0x10(%rsp) 48 8d 6c 24 f0 lea -0x10(%rsp),%rbp e8 53 ab ff ff callq 448964 <runtime.duffcopy+0x364> 48 8b 6d 00 mov 0x0(%rbp),%rbp modified: 19B (-8B) 0f 10 44 24 20 movups 0x20(%rsp),%xmm0 0f 11 04 24 movups %xmm0,(%rsp) 0f 10 44 24 30 movups 0x30(%rsp),%xmm0 0f 11 44 24 10 movups %xmm0,0x10(%rsp) - tip with 47B arg: 29B 48 8d 7c 24 0f lea 0xf(%rsp),%rdi 48 8d 74 24 40 lea 0x40(%rsp),%rsi 48 89 6c 24 f0 mov %rbp,-0x10(%rsp) 48 8d 6c 24 f0 lea -0x10(%rsp),%rbp e8 43 ab ff ff callq 448964 <runtime.duffcopy+0x364> 48 8b 6d 00 mov 0x0(%rbp),%rbp modified: 20B (-9B) 0f 10 44 24 40 movups 0x40(%rsp),%xmm0 0f 11 44 24 0f movups %xmm0,0xf(%rsp) 0f 10 44 24 50 movups 0x50(%rsp),%xmm0 0f 11 44 24 1f movups %xmm0,0x1f(%rsp) - tip with 64B arg: 27B 48 89 e7 mov %rsp,%rdi 48 8d 74 24 40 lea 0x40(%rsp),%rsi 48 89 6c 24 f0 mov %rbp,-0x10(%rsp) 48 8d 6c 24 f0 lea -0x10(%rsp),%rbp e8 1f ab ff ff callq 448948 <runtime.duffcopy+0x348> 48 8b 6d 00 mov 0x0(%rbp),%rbp modified: 39B [+12B] 0f 10 44 24 40 movups 0x40(%rsp),%xmm0 0f 11 04 24 movups %xmm0,(%rsp) 0f 10 44 24 50 movups 0x50(%rsp),%xmm0 0f 11 44 24 10 movups %xmm0,0x10(%rsp) 0f 10 44 24 60 movups 0x60(%rsp),%xmm0 0f 11 44 24 20 movups %xmm0,0x20(%rsp) 0f 10 44 24 70 movups 0x70(%rsp),%xmm0 0f 11 44 24 30 movups %xmm0,0x30(%rsp) - tip with 79B arg: 29B 48 8d 7c 24 0f lea 0xf(%rsp),%rdi 48 8d 74 24 60 lea 0x60(%rsp),%rsi 48 89 6c 24 f0 mov %rbp,-0x10(%rsp) 48 8d 6c 24 f0 lea -0x10(%rsp),%rbp e8 09 ab ff ff callq 448948 <runtime.duffcopy+0x348> 48 8b 6d 00 mov 0x0(%rbp),%rbp modified: 46B [+17B] 0f 10 44 24 60 movups 0x60(%rsp),%xmm0 0f 11 44 24 0f movups %xmm0,0xf(%rsp) 0f 10 44 24 70 movups 0x70(%rsp),%xmm0 0f 11 44 24 1f movups %xmm0,0x1f(%rsp) 0f 10 84 24 80 00 00 movups 0x80(%rsp),%xmm0 00 0f 11 44 24 2f movups %xmm0,0x2f(%rsp) 0f 10 84 24 90 00 00 movups 0x90(%rsp),%xmm0 00 0f 11 44 24 3f movups %xmm0,0x3f(%rsp) So, at best we save 9B, at worst we gain 17. I do not think that copying around 65+B sized types is common enough to bloat program sizes. Using bincmp on the go binary itself shows a zero byte difference; there are gains and losses all over. One of the largest gains in binary size comes from cmd/go/internal/cache.(*Cache).Get, which passes around a 64 byte sized type -- this is one of the cases I would expect to be benefitted by this change. I think that this marginal improvement in struct copying for 64 byte structs is worth it: most data structs / work items I use in my programs are small, but few are smaller than 32 bytes: with one slice, the budget is up. The 32 rule alone would allow another 16 bytes, the 48 and 64 rules allow another 32 and 48. Change-Id: I19a8f9190d5d41825091f17f268f4763bfc12a62 Reviewed-on: https://go-review.googlesource.com/100718 Reviewed-by: Ilya Tocar <ilya.tocar@intel.com> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-22 18:17:37 +00:00
Alberto Donizetti	fc6280d4b0	test/codegen: port direct comparisons with memory tests And remove them from asm_test. Change-Id: I1ca29b40546d6de06f20bfd550ed8ff87f495454 Reviewed-on: https://go-review.googlesource.com/102115 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-22 17:20:09 +00:00
Carlos Eduardo Seo	6633bb2aa7	cmd/compile/internal/ppc64, runtime internal/atomic, sync/atomic: implement faster atomics for ppc64x This change implements faster atomics for ppc64x based on the ISA 2.07B, Appendix B.2 recommendations, replacing SYNC/ISYNC by LWSYNC in some cases. Updates #21348 name old time/op new time/op delta Cond1-16 955ns 856ns -10.33% Cond2-16 2.38µs 2.03µs -14.59% Cond4-16 5.90µs 5.44µs -7.88% Cond8-16 12.1µs 11.1µs -8.42% Cond16-16 27.0µs 25.1µs -7.04% Cond32-16 59.1µs 55.5µs -6.14% LoadMostlyHits/sync_test.DeepCopyMap-16 22.1ns 24.1ns +9.02% LoadMostlyHits/sync_test.RWMutexMap-16 252ns 249ns -1.20% LoadMostlyHits/sync.Map-16 16.2ns 16.3ns ~ LoadMostlyMisses/sync_test.DeepCopyMap-16 22.3ns 22.6ns ~ LoadMostlyMisses/sync_test.RWMutexMap-16 249ns 247ns -0.51% LoadMostlyMisses/sync.Map-16 12.7ns 12.7ns ~ LoadOrStoreBalanced/sync_test.RWMutexMap-16 1.27µs 1.17µs -7.54% LoadOrStoreBalanced/sync.Map-16 1.12µs 1.10µs -2.35% LoadOrStoreUnique/sync_test.RWMutexMap-16 1.75µs 1.68µs -3.84% LoadOrStoreUnique/sync.Map-16 2.07µs 1.97µs -5.13% LoadOrStoreCollision/sync_test.DeepCopyMap-16 15.8ns 15.9ns ~ LoadOrStoreCollision/sync_test.RWMutexMap-16 496ns 424ns -14.48% LoadOrStoreCollision/sync.Map-16 6.07ns 6.07ns ~ Range/sync_test.DeepCopyMap-16 1.65µs 1.64µs ~ Range/sync_test.RWMutexMap-16 278µs 288µs +3.75% Range/sync.Map-16 2.00µs 2.01µs ~ AdversarialAlloc/sync_test.DeepCopyMap-16 3.45µs 3.44µs ~ AdversarialAlloc/sync_test.RWMutexMap-16 226ns 227ns ~ AdversarialAlloc/sync.Map-16 1.09µs 1.07µs -2.36% AdversarialDelete/sync_test.DeepCopyMap-16 553ns 550ns -0.57% AdversarialDelete/sync_test.RWMutexMap-16 273ns 274ns ~ AdversarialDelete/sync.Map-16 247ns 249ns ~ UncontendedSemaphore-16 79.0ns 65.5ns -17.11% ContendedSemaphore-16 112ns 97ns -13.77% MutexUncontended-16 3.34ns 2.51ns -24.69% Mutex-16 266ns 191ns -28.26% MutexSlack-16 226ns 159ns -29.55% MutexWork-16 377ns 338ns -10.14% MutexWorkSlack-16 335ns 308ns -8.20% MutexNoSpin-16 196ns 184ns -5.91% MutexSpin-16 710ns 666ns -6.21% Once-16 1.29ns 1.29ns ~ Pool-16 8.64ns 8.71ns ~ PoolOverflow-16 1.60µs 1.44µs -10.25% SemaUncontended-16 5.39ns 4.42ns -17.96% SemaSyntNonblock-16 539ns 483ns -10.42% SemaSyntBlock-16 413ns 354ns -14.20% SemaWorkNonblock-16 305ns 258ns -15.36% SemaWorkBlock-16 266ns 229ns -14.06% RWMutexUncontended-16 12.9ns 9.7ns -24.80% RWMutexWrite100-16 203ns 147ns -27.47% RWMutexWrite10-16 177ns 119ns -32.74% RWMutexWorkWrite100-16 435ns 403ns -7.39% RWMutexWorkWrite10-16 642ns 611ns -4.79% WaitGroupUncontended-16 4.67ns 3.70ns -20.92% WaitGroupAddDone-16 402ns 355ns -11.54% WaitGroupAddDoneWork-16 208ns 250ns +20.09% WaitGroupWait-16 1.21ns 1.21ns ~ WaitGroupWaitWork-16 5.91ns 5.87ns -0.81% WaitGroupActuallyWait-16 92.2ns 85.8ns -6.91% Updates #21348 Change-Id: Ibb9b271d11b308264103829e176c6d9fe8f867d3 Reviewed-on: https://go-review.googlesource.com/95175 Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>	2018-03-22 14:13:01 +00:00
Tim Wright	88129f0cb2	all: enable c-shared/c-archive support for freebsd/amd64 Fixes #14327 Much of the code is based on the linux/amd64 code that implements these build modes, and code is shared where possible. Change-Id: Ia510f2023768c0edbc863aebc585929ec593b332 Reviewed-on: https://go-review.googlesource.com/93875 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-03-21 21:56:20 +00:00
isharipo	ff5cf43df5	runtime,sync/atomic: replace asm BYTEs with insts for x86 For each replacement, test case is added to new 386enc.s file with exception of EMMS, SYSENTER, MFENCE and LFENCE as they are already covered in amd64enc.s (same on amd64 and 386). The replacement became less obvious after go vet suggested changes Before: BYTE $0x0f; BYTE $0x7f; BYTE $0x44; BYTE $0x24; BYTE $0x08 Changed to MOVQ (this form is being tested): MOVQ M0, 8(SP) Refactored to FP-relative access (go vet advice): MOVQ M0, val+4(FP) Change-Id: I56b87cf3371b6ad81ad0cd9db2033aee407b5818 Reviewed-on: https://go-review.googlesource.com/101475 Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>	2018-03-21 20:51:04 +00:00
Daniel Martí	77c3ef6f6f	cmd/doc: use empty GOPATH when running the tests Otherwise, a populated GOPATH might result in failures such as: $ go test [...] no buildable Go source files in [...]/gopherjs/compiler/natives/src/crypto/rand exit status 1 Move the initialization of the dirs walker out of the init func, so that we can control its behavior in the tests. Updates #24464. Change-Id: I4b26a7d3d6809bdd8e9b6b0556d566e7855f80fe Reviewed-on: https://go-review.googlesource.com/101836 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-03-21 13:43:22 +00:00
Alberto Donizetti	041c5d8348	cmd/trace: remove unused variable in tests Unused variables in closures are currently not diagnosed by the compiler (this is Issue #3059), while go/types catches them. One unused variable in the cmd/trace tests is causing the go/types test that typechecks the whole standard library to fail: FAIL: TestStdlib (8.05s) stdlib_test.go:223: cmd/trace/annotations_test.go:241:6: gcTime declared but not used FAIL Remove it. Updates #24464 Change-Id: I0f1b9db6ae1f0130616ee649bdbfdc91e38d2184 Reviewed-on: https://go-review.googlesource.com/101815 Reviewed-by: Daniel Martí <mvdan@mvdan.cc>	2018-03-21 11:10:03 +00:00
Ilya Tocar	983dcf70ba	cmd/compile/internal/ssa: update regalloc in loops Currently we don't lift spill out of loop if loop contains call. However often we have code like this: for .. { if hard_case { call() } // simple case, without call } So instead of checking for any call, check for unavoidable call. For #22698 cases I see: mime/quotedprintable/Writer-6 10.9µs ± 4% 9.2µs ± 3% -15.02% (p=0.000 n=8+8) And: compress/flate/Encode/Twain/Huffman/1e4-6 99.4µs ± 6% 90.9µs ± 0% -8.57% (p=0.000 n=8+8) compress/flate/Encode/Twain/Huffman/1e5-6 760µs ± 1% 725µs ± 1% -4.56% (p=0.000 n=8+8) compress/flate/Encode/Twain/Huffman/1e6-6 7.55ms ± 0% 7.24ms ± 0% -4.07% (p=0.000 n=8+7) There are no significant changes on go1 benchmarks. But for cases with runtime arch checks, where we call generic version on old hardware, there are respectable performance gains: math/RoundToEven-6 1.43ns ± 0% 1.25ns ± 0% -12.59% (p=0.001 n=7+7) math/bits/OnesCount64-6 1.60ns ± 1% 1.42ns ± 1% -11.32% (p=0.000 n=8+8) Also on some runtime benchmarks loops have less loads and higher performance: runtime/RuneIterate/range1/ASCII-6 15.6ns ± 1% 13.9ns ± 1% -10.74% (p=0.000 n=7+8) runtime/ArrayEqual-6 3.22ns ± 0% 2.86ns ± 2% -11.06% (p=0.000 n=7+8) Fixes #22698 Updates #22234 Change-Id: I0ae2f19787d07a9026f064366dedbe601bf7257a Reviewed-on: https://go-review.googlesource.com/84055 Run-TryBot: Ilya Tocar <ilya.tocar@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2018-03-20 21:02:39 +00:00
Alberto Donizetti	be371edd67	test/codegen: port comparisons tests to codegen And delete them from asm_test. Change-Id: I64c512bfef3b3da6db5c5d29277675dade28b8ab Reviewed-on: https://go-review.googlesource.com/101595 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Giovanni Bajo <rasky@develer.com>	2018-03-20 19:38:06 +00:00
Than McIntosh	f45c07e84a	cmd/compile: fix regression in DWARF inlined routine variable tracking Fix a bug in the code that generates the pre-inlined variable declaration table used as raw material for emitting DWARF inline routine records. The fix for issue 23704 altered the recipe for assigning file/line/col to variables in one part of the compiler, but didn't update a similar recipe in the code for variable tracking. Added a new test that should catch problems of a similar nature. Fixes #24460. Change-Id: I255c036637f4151aa579c0e21d123fd413724d61 Reviewed-on: https://go-review.googlesource.com/101676 Reviewed-by: Alessandro Arzilli <alessandro.arzilli@gmail.com> Reviewed-by: Heschi Kreinick <heschi@google.com>	2018-03-20 18:56:52 +00:00
Michael Munday	ae10914e67	cmd/compile: mark LAA and LAAG as clobbering flags on s390x The atomic add instructions modify the condition code and so need to be marked as clobbering flags. Fixes #24449. Change-Id: Ic69c8d775fbdbfb2a56c5e0cfca7a49c0d7f6897 Reviewed-on: https://go-review.googlesource.com/101455 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-20 09:44:50 +00:00
Fangming.Fang	9c312245ac	cmd/asm: fix bug about VMOV instruction (move a vector element to another) on ARM64 This change fixes index error when encoding VMOV instruction which pattern is vmov Vn.<T>[index], Vd.<T>[index] Change-Id: I949166e6dfd63fb0a9365f183b6c50d452614f9d Reviewed-on: https://go-review.googlesource.com/101335 Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-03-20 03:45:04 +00:00
Fangming.Fang	7673e30503	cmd/asm: fix bug about VMOV instruction (move register to vector element) on ARM64 This change fixes index error when encoding VMOV instruction which pattern is VMOV Rn, V.<T>[index]. For example VMOV R1, V1.S[1] is assembled as VMOV R1, V1.S[0] Fixes #24400 Change-Id: I82b5edc8af4e06862bc4692b119697c6bb7dc3fb Reviewed-on: https://go-review.googlesource.com/101297 Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-03-20 03:43:37 +00:00
Vladimir Kuzmin	c12b185a6e	cmd/compile: avoid mapaccess at m[k]=append(m[k].. Currently rvalue m[k] is transformed during walk into: tmp1 := mapaccess(m, k) tmp2 := append(tmp1, ...) mapassign(m, k) = tmp2 However, this is suboptimal, as we could instead produce just: tmp := mapassign(m, k) tmp := append(tmp, ...) Optimization is possible only if during Order it may tell that m[k] is exactly the same at left and right part of assignment. It doesn't work: 1) m[f(k)] = append(m[f(k)], ...) 2) sink, m[k] = sink, append(m[k]...) 3) m[k] = append(..., m[k],...) Benchmark: name old time/op new time/op delta MapAppendAssign/Int32/256-8 33.5ns ± 3% 22.4ns ±10% -33.24% (p=0.000 n=16+18) MapAppendAssign/Int32/65536-8 68.2ns ± 6% 48.5ns ±29% -28.90% (p=0.000 n=20+20) MapAppendAssign/Int64/256-8 34.3ns ± 4% 23.3ns ± 5% -32.23% (p=0.000 n=17+18) MapAppendAssign/Int64/65536-8 65.9ns ± 7% 61.2ns ±19% -7.06% (p=0.002 n=18+20) MapAppendAssign/Str/256-8 116ns ±12% 79ns ±16% -31.70% (p=0.000 n=20+19) MapAppendAssign/Str/65536-8 134ns ±15% 111ns ±45% -16.95% (p=0.000 n=19+20) name old alloc/op new alloc/op delta MapAppendAssign/Int32/256-8 47.0B ± 0% 46.0B ± 0% -2.13% (p=0.000 n=19+18) MapAppendAssign/Int32/65536-8 27.0B ± 0% 20.7B ±30% -23.33% (p=0.000 n=20+20) MapAppendAssign/Int64/256-8 47.0B ± 0% 46.0B ± 0% -2.13% (p=0.000 n=20+17) MapAppendAssign/Int64/65536-8 27.0B ± 0% 27.0B ± 0% ~ (all equal) MapAppendAssign/Str/256-8 94.0B ± 0% 78.0B ± 0% -17.02% (p=0.000 n=20+16) MapAppendAssign/Str/65536-8 54.0B ± 0% 54.0B ± 0% ~ (all equal) Fixes #24364 Updates #5147 Change-Id: Id257d052b75b9a445b4885dc571bf06ce6f6b409 Reviewed-on: https://go-review.googlesource.com/100838 Reviewed-by: Matthew Dempsky <mdempsky@google.com> Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-03-20 01:47:07 +00:00
fanzha02	910c3a9dfc	cmd/asm: add ARM64 assembler check for incorrect input Current ARM64 assembler has no check for the invalid value of both shift amount and post-index immediate offset of LD1/ST1. This patch adds the check. This patch also fixes the printing error of register number equals to 31, which should be printed as ZR instead of R31. Test cases are also added. Change-Id: I476235f3ab3a3fc91fe89c5a3149a4d4529c05c7 Reviewed-on: https://go-review.googlesource.com/100255 Reviewed-by: Cherry Zhang <cherryyz@google.com> Run-TryBot: Cherry Zhang <cherryyz@google.com>	2018-03-19 23:45:50 +00:00
Alberto Donizetti	5a4e09837c	test/codegen: port maps test to codegen And delete them from asm_test. Change-Id: I3cf0934706a640136cb0f646509174f8c1bf3363 Reviewed-on: https://go-review.googlesource.com/101395 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Giovanni Bajo <rasky@develer.com>	2018-03-19 13:39:34 +00:00
Alberto Donizetti	b61b1d2c57	test/codegen: port structs test to codegen And delete them from asm_test. Change-Id: Ia286239a3d8f3915f2ca25dbcb39f3354a4f8aea Reviewed-on: https://go-review.googlesource.com/101138 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-03-18 16:53:53 +00:00
Daniel Martí	2767c4e285	cmd/go: remove some unused parameters Change-Id: I441b3045e76afc1c561914926c14efc8a116c8a7 Reviewed-on: https://go-review.googlesource.com/101195 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-16 21:01:28 +00:00
David Chase	b30bf958da	cmd/compile: enable scopes unconditionally This revives Alessandro Arzilli's CL to enable scopes whenever any dwarf is emitted (with optimization or not), adds a test that detects this changes and shows that it creates more truthful debugging output. Reverted change to ssa/debug_test tests made when scopes were disabled during dwarflocationlist development. Also included are updates to the Delve test output (it had fallen out of sync; creating test output for one updates it for all) and minor naming changes in ssa/debug_test. Compile-time/space changes (relative to tip including dwarflocationlists): benchstat -geomean after.log scopes.log name old time/op new time/op delta Template 182ms ± 1% 182ms ± 1% ~ (p=0.666 n=9+9) Unicode 82.8ms ± 1% 86.6ms ±14% ~ (p=0.211 n=9+10) GoTypes 611ms ± 1% 616ms ± 2% +0.97% (p=0.001 n=10+9) Compiler 2.95s ± 1% 2.95s ± 0% ~ (p=0.573 n=10+8) SSA 6.70s ± 1% 6.81s ± 1% +1.68% (p=0.000 n=9+10) Flate 117ms ± 1% 118ms ± 1% +0.60% (p=0.036 n=9+8) GoParser 145ms ± 1% 145ms ± 1% ~ (p=1.000 n=9+9) Reflect 398ms ± 1% 396ms ± 1% ~ (p=0.053 n=9+10) Tar 171ms ± 1% 171ms ± 1% ~ (p=0.356 n=9+10) XML 214ms ± 1% 214ms ± 1% ~ (p=0.605 n=9+9) StdCmd 12.4s ± 2% 12.4s ± 1% ~ (p=1.000 n=9+9) [Geo mean] 506ms 509ms +0.71% name old user-ns/op new user-ns/op delta Template 254M ± 4% 249M ± 6% ~ (p=0.155 n=10+10) Unicode 121M ±11% 124M ± 6% ~ (p=0.516 n=10+10) GoTypes 824M ± 2% 869M ± 5% +5.49% (p=0.001 n=8+10) Compiler 4.01G ± 2% 4.02G ± 1% ~ (p=0.561 n=9+9) SSA 10.0G ± 2% 10.2G ± 2% +2.29% (p=0.000 n=9+10) Flate 154M ± 7% 154M ± 7% ~ (p=0.960 n=10+9) GoParser 190M ± 7% 196M ± 6% ~ (p=0.064 n=9+10) Reflect 528M ± 2% 517M ± 3% -1.97% (p=0.025 n=10+10) Tar 227M ± 5% 232M ± 3% ~ (p=0.061 n=9+10) XML 286M ± 4% 283M ± 4% ~ (p=0.343 n=9+9) [Geo mean] 502M 508M +1.09% name old text-bytes new text-bytes delta HelloSize 672k ± 0% 672k ± 0% +0.01% (p=0.000 n=10+10) CmdGoSize 7.21M ± 0% 7.21M ± 0% -0.00% (p=0.000 n=10+10) [Geo mean] 2.20M 2.20M +0.00% name old data-bytes new data-bytes delta HelloSize 9.88k ± 0% 9.88k ± 0% ~ (all equal) CmdGoSize 248k ± 0% 248k ± 0% ~ (all equal) [Geo mean] 49.5k 49.5k +0.00% name old bss-bytes new bss-bytes delta HelloSize 125k ± 0% 125k ± 0% ~ (all equal) CmdGoSize 144k ± 0% 144k ± 0% -0.04% (p=0.000 n=10+10) [Geo mean] 135k 135k -0.02% name old exe-bytes new exe-bytes delta HelloSize 1.30M ± 0% 1.34M ± 0% +3.15% (p=0.000 n=10+10) CmdGoSize 13.5M ± 0% 13.9M ± 0% +2.70% (p=0.000 n=10+10) [Geo mean] 4.19M 4.31M +2.92% Change-Id: Id53b8d57bd00440142ccbd39b95710e14e083fb5 Reviewed-on: https://go-review.googlesource.com/101217 Reviewed-by: Heschi Kreinick <heschi@google.com>	2018-03-16 20:25:10 +00:00
Matthew Dempsky	86a338960d	reflect: sort exported methods first By moving exported methods to the front of method lists, filtering down to only the exported methods just needs a count of how many exported methods exist, which the compiler can statically provide. This allows getting rid of the exported method cache. For #22075. Change-Id: I8eeb274563a2940e1347c34d673f843ae2569064 Reviewed-on: https://go-review.googlesource.com/100846 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-03-15 21:56:08 +00:00
Matthew Dempsky	91bbe5388d	cmd/compile: sort method sets earlier By sorting method sets earlier, we can change the interface satisfaction problem from taking O(NM) time to O(N+M). This is the same algorithm already used by runtime and reflect for dynamic interface satisfaction testing. For #22075. Change-Id: I3d889f0227f37704535739bbde11f5107b4eea17 Reviewed-on: https://go-review.googlesource.com/100845 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2018-03-15 21:53:01 +00:00
David Chase	1c24ffbf93	cmd/compile: turn on DWARF locations lists for ssa vars This changes the default setting for -dwarflocationlists from false to true, removes the flag from ssa/debug_test.go, and updates runtime/runtime-gdb_test.go to match a change in debugging output for composite variables. Current benchmarks (perflock, -count 10) benchstat -geomean before.log after.log name old time/op new time/op delta Template 175ms ± 0% 182ms ± 1% +3.68% (p=0.000 n=8+9) Unicode 82.0ms ± 2% 82.8ms ± 1% +0.96% (p=0.019 n=9+9) GoTypes 590ms ± 1% 611ms ± 1% +3.42% (p=0.000 n=9+10) Compiler 2.85s ± 0% 2.95s ± 1% +3.60% (p=0.000 n=9+10) SSA 6.42s ± 1% 6.70s ± 1% +4.31% (p=0.000 n=10+9) Flate 113ms ± 2% 117ms ± 1% +3.11% (p=0.000 n=10+9) GoParser 140ms ± 1% 145ms ± 1% +3.47% (p=0.000 n=10+9) Reflect 384ms ± 0% 398ms ± 1% +3.56% (p=0.000 n=8+9) Tar 165ms ± 1% 171ms ± 1% +3.33% (p=0.000 n=9+9) XML 207ms ± 2% 214ms ± 1% +3.41% (p=0.000 n=9+9) StdCmd 11.8s ± 2% 12.4s ± 2% +4.41% (p=0.000 n=10+9) [Geo mean] 489ms 506ms +3.38% name old user-ns/op new user-ns/op delta Template 247M ± 4% 254M ± 4% +2.76% (p=0.040 n=10+10) Unicode 118M ±16% 121M ±11% ~ (p=0.364 n=10+10) GoTypes 805M ± 2% 824M ± 2% +2.37% (p=0.003 n=9+8) Compiler 3.92G ± 2% 4.01G ± 2% +2.20% (p=0.001 n=9+9) SSA 9.63G ± 4% 10.00G ± 2% +3.81% (p=0.000 n=10+9) Flate 155M ±10% 154M ± 7% ~ (p=0.718 n=9+10) GoParser 184M ±11% 190M ± 7% ~ (p=0.220 n=10+9) Reflect 506M ± 4% 528M ± 2% +4.27% (p=0.000 n=10+10) Tar 224M ± 4% 227M ± 5% ~ (p=0.207 n=10+9) XML 272M ± 7% 286M ± 4% +5.23% (p=0.010 n=10+9) [Geo mean] 489M 502M +2.76% name old text-bytes new text-bytes delta HelloSize 672k ± 0% 672k ± 0% ~ (all equal) CmdGoSize 7.21M ± 0% 7.21M ± 0% ~ (all equal) [Geo mean] 2.20M 2.20M +0.00% name old data-bytes new data-bytes delta HelloSize 9.88k ± 0% 9.88k ± 0% ~ (all equal) CmdGoSize 248k ± 0% 248k ± 0% ~ (all equal) [Geo mean] 49.5k 49.5k +0.00% name old bss-bytes new bss-bytes delta HelloSize 125k ± 0% 125k ± 0% ~ (all equal) CmdGoSize 144k ± 0% 144k ± 0% ~ (all equal) [Geo mean] 135k 135k +0.00% name old exe-bytes new exe-bytes delta HelloSize 1.10M ± 0% 1.30M ± 0% +17.82% (p=0.000 n=10+10) CmdGoSize 11.6M ± 0% 13.5M ± 0% +16.90% (p=0.000 n=10+10) [Geo mean] 3.57M 4.19M +17.36% Change-Id: I250055813cadd25cebee8da1f9a7f995a6eae432 Reviewed-on: https://go-review.googlesource.com/100738 Reviewed-by: Heschi Kreinick <heschi@google.com>	2018-03-15 21:34:17 +00:00
Heschi Kreinick	1814a0595c	cmd/trace: filter tasks by log text Add a search box to the top of the user task views that only displays tasks containing a particular log message. Change-Id: I92f4aa113f930954e8811416901e37824f0eb884 Reviewed-on: https://go-review.googlesource.com/100843 Run-TryBot: Heschi Kreinick <heschi@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>	2018-03-15 19:59:22 +00:00
Alberto Donizetti	cceee685be	test/codegen: port floats tests to codegen And delete them from asm_test. Change-Id: Ibdaca3496eefc73c731b511ddb9636a1f3dff68c Reviewed-on: https://go-review.googlesource.com/100915 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-15 18:05:59 +00:00
Keith Randall	9d4215311b	runtime: identify special functions by flag instead of address When there are plugins, there may not be a unique copy of runtime functions like goexit, mcall, etc. So identifying them by entry address is problematic. Instead, keep track of each special function using a field in the symbol table. That way, multiple copies of the same runtime function will be treated identically. Fixes #24351 Fixes #23133 Change-Id: Iea3232df8a6af68509769d9ca618f530cc0f84fd Reviewed-on: https://go-review.googlesource.com/100739 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-03-15 17:31:57 +00:00
Daniel Martí	cd2cb6e3f5	cmd/compile: cache sparse maps across ssa passes This is done for sparse sets already, but it was missing for sparse maps. Only affects deadstore and regalloc, as they're the only ones that use sparse maps. name old time/op new time/op delta DSEPass-4 247µs ± 0% 216µs ± 0% -12.75% (p=0.008 n=5+5) DSEPassBlock-4 3.05ms ± 1% 2.87ms ± 1% -6.02% (p=0.002 n=6+6) CSEPass-4 2.30ms ± 0% 2.32ms ± 0% +0.53% (p=0.026 n=6+6) CSEPassBlock-4 23.8ms ± 0% 23.8ms ± 0% ~ (p=0.931 n=6+5) DeadcodePass-4 51.7µs ± 1% 51.5µs ± 2% ~ (p=0.429 n=5+6) DeadcodePassBlock-4 734µs ± 1% 742µs ± 3% ~ (p=0.394 n=6+6) MultiPass-4 152µs ± 0% 149µs ± 2% ~ (p=0.082 n=5+6) MultiPassBlock-4 2.67ms ± 1% 2.41ms ± 2% -9.77% (p=0.008 n=5+5) name old alloc/op new alloc/op delta DSEPass-4 41.2kB ± 0% 0.1kB ± 0% -99.68% (p=0.002 n=6+6) DSEPassBlock-4 560kB ± 0% 4kB ± 0% -99.34% (p=0.026 n=5+6) CSEPass-4 189kB ± 0% 189kB ± 0% ~ (all equal) CSEPassBlock-4 3.10MB ± 0% 3.10MB ± 0% ~ (p=0.444 n=5+5) DeadcodePass-4 10.5kB ± 0% 10.5kB ± 0% ~ (all equal) DeadcodePassBlock-4 164kB ± 0% 164kB ± 0% ~ (all equal) MultiPass-4 240kB ± 0% 199kB ± 0% -17.06% (p=0.002 n=6+6) MultiPassBlock-4 3.60MB ± 0% 2.99MB ± 0% -17.06% (p=0.002 n=6+6) name old allocs/op new allocs/op delta DSEPass-4 8.00 ± 0% 4.00 ± 0% -50.00% (p=0.002 n=6+6) DSEPassBlock-4 240 ± 0% 120 ± 0% -50.00% (p=0.002 n=6+6) CSEPass-4 9.00 ± 0% 9.00 ± 0% ~ (all equal) CSEPassBlock-4 1.35k ± 0% 1.35k ± 0% ~ (all equal) DeadcodePass-4 3.00 ± 0% 3.00 ± 0% ~ (all equal) DeadcodePassBlock-4 9.00 ± 0% 9.00 ± 0% ~ (all equal) MultiPass-4 11.0 ± 0% 10.0 ± 0% -9.09% (p=0.002 n=6+6) MultiPassBlock-4 165 ± 0% 150 ± 0% -9.09% (p=0.002 n=6+6) Change-Id: I43860687c88f33605eb1415f36473c5cfe8fde4a Reviewed-on: https://go-review.googlesource.com/98449 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2018-03-15 17:24:39 +00:00
Giovanni Bajo	a35ec9a59e	cmd/compile: implement CMOV on amd64 This builds upon the branchelim pass, activating it for amd64 and lowering CondSelect. Special care is made to FPU instructions for NaN handling. Benchmark results on Xeon E5630 (Westmere EP): name old time/op new time/op delta BinaryTree17-16 4.99s ± 9% 4.66s ± 2% ~ (p=0.095 n=5+5) Fannkuch11-16 4.93s ± 3% 5.04s ± 2% ~ (p=0.548 n=5+5) FmtFprintfEmpty-16 58.8ns ± 7% 61.4ns ±14% ~ (p=0.579 n=5+5) FmtFprintfString-16 114ns ± 2% 114ns ± 4% ~ (p=0.603 n=5+5) FmtFprintfInt-16 181ns ± 4% 125ns ± 3% -30.90% (p=0.008 n=5+5) FmtFprintfIntInt-16 263ns ± 2% 217ns ± 2% -17.34% (p=0.008 n=5+5) FmtFprintfPrefixedInt-16 230ns ± 1% 212ns ± 1% -7.99% (p=0.008 n=5+5) FmtFprintfFloat-16 411ns ± 3% 344ns ± 5% -16.43% (p=0.008 n=5+5) FmtManyArgs-16 828ns ± 4% 790ns ± 2% -4.59% (p=0.032 n=5+5) GobDecode-16 10.9ms ± 4% 10.8ms ± 5% ~ (p=0.548 n=5+5) GobEncode-16 9.52ms ± 5% 9.46ms ± 2% ~ (p=1.000 n=5+5) Gzip-16 334ms ± 2% 337ms ± 2% ~ (p=0.548 n=5+5) Gunzip-16 64.4ms ± 1% 65.0ms ± 1% +1.00% (p=0.008 n=5+5) HTTPClientServer-16 156µs ± 3% 155µs ± 3% ~ (p=0.690 n=5+5) JSONEncode-16 21.0ms ± 1% 21.8ms ± 0% +3.76% (p=0.016 n=5+4) JSONDecode-16 95.1ms ± 0% 95.7ms ± 1% ~ (p=0.151 n=5+5) Mandelbrot200-16 6.38ms ± 1% 6.42ms ± 1% ~ (p=0.095 n=5+5) GoParse-16 5.47ms ± 2% 5.36ms ± 1% -1.95% (p=0.016 n=5+5) RegexpMatchEasy0_32-16 111ns ± 1% 111ns ± 1% ~ (p=0.635 n=5+4) RegexpMatchEasy0_1K-16 408ns ± 1% 411ns ± 2% ~ (p=0.087 n=5+5) RegexpMatchEasy1_32-16 103ns ± 1% 104ns ± 1% ~ (p=0.484 n=5+5) RegexpMatchEasy1_1K-16 659ns ± 2% 652ns ± 1% ~ (p=0.571 n=5+5) RegexpMatchMedium_32-16 176ns ± 2% 174ns ± 1% ~ (p=0.476 n=5+5) RegexpMatchMedium_1K-16 58.6µs ± 4% 57.7µs ± 4% ~ (p=0.548 n=5+5) RegexpMatchHard_32-16 3.07µs ± 3% 3.04µs ± 4% ~ (p=0.421 n=5+5) RegexpMatchHard_1K-16 89.2µs ± 1% 87.9µs ± 2% -1.52% (p=0.032 n=5+5) Revcomp-16 575ms ± 0% 587ms ± 2% +2.12% (p=0.032 n=4+5) Template-16 110ms ± 1% 107ms ± 3% -3.00% (p=0.032 n=5+5) TimeParse-16 463ns ± 0% 462ns ± 0% ~ (p=0.810 n=5+4) TimeFormat-16 538ns ± 0% 535ns ± 0% -0.63% (p=0.024 n=5+5) name old speed new speed delta GobDecode-16 70.7MB/s ± 4% 71.4MB/s ± 5% ~ (p=0.452 n=5+5) GobEncode-16 80.7MB/s ± 5% 81.2MB/s ± 2% ~ (p=1.000 n=5+5) Gzip-16 58.2MB/s ± 2% 57.7MB/s ± 2% ~ (p=0.452 n=5+5) Gunzip-16 302MB/s ± 1% 299MB/s ± 1% -0.99% (p=0.008 n=5+5) JSONEncode-16 92.4MB/s ± 1% 89.1MB/s ± 0% -3.63% (p=0.016 n=5+4) JSONDecode-16 20.4MB/s ± 0% 20.3MB/s ± 1% ~ (p=0.135 n=5+5) GoParse-16 10.6MB/s ± 2% 10.8MB/s ± 1% +2.00% (p=0.016 n=5+5) RegexpMatchEasy0_32-16 286MB/s ± 1% 285MB/s ± 3% ~ (p=1.000 n=5+5) RegexpMatchEasy0_1K-16 2.51GB/s ± 1% 2.49GB/s ± 2% ~ (p=0.095 n=5+5) RegexpMatchEasy1_32-16 309MB/s ± 1% 307MB/s ± 1% ~ (p=0.548 n=5+5) RegexpMatchEasy1_1K-16 1.55GB/s ± 2% 1.57GB/s ± 1% ~ (p=0.690 n=5+5) RegexpMatchMedium_32-16 5.68MB/s ± 2% 5.73MB/s ± 1% ~ (p=0.579 n=5+5) RegexpMatchMedium_1K-16 17.5MB/s ± 4% 17.8MB/s ± 4% ~ (p=0.500 n=5+5) RegexpMatchHard_32-16 10.4MB/s ± 3% 10.5MB/s ± 4% ~ (p=0.460 n=5+5) RegexpMatchHard_1K-16 11.5MB/s ± 1% 11.7MB/s ± 2% +1.57% (p=0.032 n=5+5) Revcomp-16 442MB/s ± 0% 433MB/s ± 2% -2.05% (p=0.032 n=4+5) Template-16 17.7MB/s ± 1% 18.2MB/s ± 3% +3.12% (p=0.032 n=5+5) Change-Id: I6972e8f35f2b31f9a42ac473a6bf419a18022558 Reviewed-on: https://go-review.googlesource.com/100935 Run-TryBot: Giovanni Bajo <rasky@develer.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-15 16:41:59 +00:00
James Cowgill	423111081b	cmd/internal/obj/mips: load/store even float registers first There is a bug in Octeon III processors where storing an odd floating point register after it has recently been written to by a double floating point operation will store the old value from before the double operation (there are some extra details - the operation and store must be a certain number of cycles apart). However, this bug does not occur if the even register is stored first. Currently the bug only happens on big endian because go always loads the even register first on little endian. Workaround the bug by always loading / storing the even floating point register first. Since this is just an instruction reordering, it should have no performance penalty. This follows other compilers like GCC which will always store the even register first (although you do have to set the ISA level to MIPS I to prevent it from using SDC1). Change-Id: I5e73daa4d724ca1df7bf5228aab19f53f26a4976 Reviewed-on: https://go-review.googlesource.com/97735 Reviewed-by: Keith Randall <khr@golang.org>	2018-03-15 15:40:39 +00:00
Geoff Berry	e244a7a7d3	cmd/compile/internal/ssa: add patterns for arm64 bitfield opcodes Add patterns to match common idioms for EXTR, BFI, BFXIL, SBFIZ, SBFX, UBFIZ and UBFX opcodes. go1 benchmarks results on Amberwing: name old time/op new time/op delta FmtManyArgs 786ns ± 2% 714ns ± 1% -9.20% (p=0.000 n=10+10) Gzip 437ms ± 0% 402ms ± 0% -7.99% (p=0.000 n=10+10) FmtFprintfIntInt 196ns ± 0% 182ns ± 0% -7.28% (p=0.000 n=10+9) FmtFprintfPrefixedInt 207ns ± 0% 199ns ± 0% -3.86% (p=0.000 n=10+10) FmtFprintfFloat 324ns ± 0% 316ns ± 0% -2.47% (p=0.000 n=10+8) FmtFprintfInt 119ns ± 0% 117ns ± 0% -1.68% (p=0.000 n=10+9) GobDecode 12.8ms ± 2% 12.6ms ± 1% -1.62% (p=0.002 n=10+10) JSONDecode 94.4ms ± 1% 93.4ms ± 0% -1.10% (p=0.000 n=10+10) RegexpMatchEasy0_32 247ns ± 0% 245ns ± 0% -0.65% (p=0.000 n=10+10) RegexpMatchMedium_32 314ns ± 0% 312ns ± 0% -0.64% (p=0.000 n=10+10) RegexpMatchEasy0_1K 541ns ± 0% 538ns ± 0% -0.55% (p=0.000 n=10+9) TimeParse 450ns ± 1% 448ns ± 1% -0.42% (p=0.035 n=9+9) RegexpMatchEasy1_32 244ns ± 0% 243ns ± 0% -0.41% (p=0.000 n=10+10) GoParse 6.03ms ± 0% 6.00ms ± 0% -0.40% (p=0.002 n=10+10) RegexpMatchEasy1_1K 779ns ± 0% 777ns ± 0% -0.26% (p=0.000 n=10+10) RegexpMatchHard_32 2.75µs ± 0% 2.74µs ± 1% -0.06% (p=0.026 n=9+9) BinaryTree17 11.7s ± 0% 11.6s ± 0% ~ (p=0.089 n=10+10) HTTPClientServer 89.1µs ± 1% 89.5µs ± 2% ~ (p=0.436 n=10+10) RegexpMatchHard_1K 78.9µs ± 0% 79.5µs ± 2% ~ (p=0.469 n=10+10) FmtFprintfEmpty 58.5ns ± 0% 58.5ns ± 0% ~ (all equal) GobEncode 12.0ms ± 1% 12.1ms ± 0% ~ (p=0.075 n=10+10) Revcomp 669ms ± 0% 668ms ± 0% ~ (p=0.091 n=7+9) Mandelbrot200 5.35ms ± 0% 5.36ms ± 0% +0.07% (p=0.000 n=9+9) RegexpMatchMedium_1K 52.1µs ± 0% 52.1µs ± 0% +0.10% (p=0.000 n=9+9) Fannkuch11 3.25s ± 0% 3.26s ± 0% +0.36% (p=0.000 n=9+10) FmtFprintfString 114ns ± 1% 115ns ± 0% +0.52% (p=0.011 n=10+10) JSONEncode 20.2ms ± 0% 20.3ms ± 0% +0.65% (p=0.000 n=10+10) Template 91.3ms ± 0% 92.3ms ± 0% +1.08% (p=0.000 n=10+10) TimeFormat 484ns ± 0% 495ns ± 1% +2.30% (p=0.000 n=9+10) There are some opportunities to improve this change further by adding patterns to match the "extended register" versions of ADD/SUB/CMP, but I think that should be evaluated on its own. The regressions in Template and TimeFormat would likely be recovered by this, as they seem to be due to generating: ubfiz x0, x0, #3, #8 add x1, x2, x0 instead of add x1, x2, x0, lsl #3 Change-Id: I5644a8d70ac7a98e784a377a2b76ab47a3415a4b Reviewed-on: https://go-review.googlesource.com/88355 Reviewed-by: Cherry Zhang <cherryyz@google.com> Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-03-15 14:10:41 +00:00
Alberto Donizetti	ded9a1b372	test/codegen: port len/cap pow2 div tests to codegen And delete them from asm_test. Change-Id: I29c8d098a8893e6b669b6272a2f508985ac9d618 Reviewed-on: https://go-review.googlesource.com/100876 Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-03-15 13:34:01 +00:00
Matthew Dempsky	29517daff9	cmd/compile: extract common noding code from func{Decl,Lit} Passes toolstash-check. Change-Id: I8290221d6169e077dfa4ea737d685c7fcecf6841 Reviewed-on: https://go-review.googlesource.com/100835 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2018-03-15 01:09:42 +00:00
Matthew Dempsky	463fe95bdd	cmd/compile: fix duplicate code generation in swt.go When combining adjacent type switch cases with the same type hash, we failed to actually remove the combined cases, so we would generate code for them twice. We use MD5 for type hashes, so collisions are rare, but they do currently appear in test/fixedbugs/bug248.dir/bug2.go, which is how I noticed this failure. Passes toolstash-check. Change-Id: I66729b3366b96cb8ddc8fa6f3ebea11ef6d74012 Reviewed-on: https://go-review.googlesource.com/100461 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2018-03-15 01:01:56 +00:00
Matthew Dempsky	eb3c44b2c4	cmd/compile: cleanup closure.go The main thing is we now eagerly create the ODCLFUNC node for closures, immediately cross-link them, and assign fields (e.g., Nbody, Dcl, Parents, Marks) directly on the ODCLFUNC (previously they were assigned on the OCLOSURE and later moved to the ODCLFUNC). This allows us to set Curfn to the ODCLFUNC instead of the OCLOSURE, which makes things more consistent with normal function declarations. (Notably, this means Cvars now hang off the ODCLFUNC instead of the OCLOSURE.) Assignment of xfunc symbol names also now happens before typechecking their body, which means debugging output now provides a more helpful name than "<S>". In golang.org/cl/66810, we changed "x := y" statements to avoid creating false closure variables for x, but we still create them for struct literals like "s{f: x}". Update comment in capturevars accordingly. More opportunity for cleanups still, but this makes some substantial progress, IMO. Passes toolstash-check. Change-Id: I65a4efc91886e3dcd1000561348af88297775cd7 Reviewed-on: https://go-review.googlesource.com/100197 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2018-03-14 23:54:39 +00:00
Ilya Tocar	644d14ea0f	Revert "cmd/compile: implement CMOV on amd64" This reverts commit `080187f4f7`. It broke build of golang.org/x/exp/shiny/iconvg See issue 24395 for details Change-Id: Ifd6134f6214e6cee40bd3c63c32941d5fc96ae8b Reviewed-on: https://go-review.googlesource.com/100755 Run-TryBot: Ilya Tocar <ilya.tocar@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-14 21:21:23 +00:00
Heschi Kreinick	44e65f2c94	cmd/compile/internal/ssa: track stack-only vars User variables that cannot be SSA'd, either because their addresses are taken or because they are too large for the decomposition heuristic, do not explicitly appear as operands of SSA values. Instead they are written to directly via the stack pointer. This hid them from the location list generation, which is only interested in the named value table. Fortunately, the lifetime of stack-only variables is delineated by VarDef/VarKill ops, and it's easy enough to turn those into location list bounds. One wrinkle: stack frame information is not explicitly available in the SSA phases, because it's owned by the frontend in AllocFrame. It would be easier if the set of live LocalSlots were returned by that, but this is the minimal change to fix missing variables. Or VarDef/VarKills could appear in NamedValues, which would make this change even easier. Change-Id: Ice6654dad6f9babb0286e95c7ec28594561dc91f Reviewed-on: https://go-review.googlesource.com/100458 Reviewed-by: David Chase <drchase@google.com> Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-03-14 20:52:29 +00:00
Lynn Boger	aff222cd18	cmd/compile: improve PPC64.rules to reduce size of rewritePPC64.go Some rules in PPC64.rules cause an extremely large rewritePPC64.go file to be generated, due to rules with commutative operations and many operands. This happens with the existing rules for combining byte loads in little endian order, and also happens with the pending change to do the same for bytes in big endian order. The change improves the existing rules and reduces the size of the rewrite file by more than 60%. Once this change is merged, then the pending change for big endian ordered rules will be updated to use rules that avoid generating an excessively large rewrite file. This also includes a fix to a performance regression for littleEndian.PutUint16 on ppc64le. Change-Id: I8d2ea42885fa2b84b30c63aa124b0a9b130564ff Reviewed-on: https://go-review.googlesource.com/100675 Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-14 19:03:05 +00:00
Balaram Makam	b46d398887	runtime: improve arm64 memclr implementation Improve runtime memclr_arm64.s using ZVA feature to zero out memory when n is at least 64 bytes. Also add DCZID_EL0 system register to use in MRS instruction. Benchmark results of runtime/Memclr on Amberwing: name old time/op new time/op delta Memclr/5 12.7ns ± 0% 12.7ns ± 0% ~ (all equal) Memclr/16 12.7ns ± 0% 12.2ns ± 1% -4.13% (p=0.000 n=7+8) Memclr/64 14.0ns ± 0% 14.6ns ± 1% +4.29% (p=0.000 n=7+8) Memclr/256 23.7ns ± 0% 25.7ns ± 0% +8.44% (p=0.000 n=8+7) Memclr/4096 204ns ± 0% 74ns ± 0% -63.71% (p=0.000 n=8+8) Memclr/65536 2.89µs ± 0% 0.84µs ± 0% -70.91% (p=0.000 n=8+8) Memclr/1M 45.9µs ± 0% 17.0µs ± 0% -62.88% (p=0.000 n=8+8) Memclr/4M 184µs ± 0% 77µs ± 4% -57.94% (p=0.001 n=6+8) Memclr/8M 367µs ± 0% 144µs ± 1% -60.72% (p=0.000 n=7+8) Memclr/16M 734µs ± 0% 293µs ± 1% -60.09% (p=0.000 n=8+8) Memclr/64M 2.94ms ± 0% 1.23ms ± 0% -58.06% (p=0.000 n=7+8) GoMemclr/5 8.00ns ± 0% 8.79ns ± 0% +9.83% (p=0.000 n=8+8) GoMemclr/16 8.00ns ± 0% 7.60ns ± 0% -5.00% (p=0.000 n=8+8) GoMemclr/64 10.8ns ± 0% 10.4ns ± 0% -3.70% (p=0.000 n=8+8) GoMemclr/256 20.4ns ± 0% 21.2ns ± 0% +3.92% (p=0.000 n=8+8) name old speed new speed delta Memclr/5 394MB/s ± 0% 393MB/s ± 0% -0.28% (p=0.006 n=8+8) Memclr/16 1.26GB/s ± 0% 1.31GB/s ± 1% +4.07% (p=0.000 n=7+8) Memclr/64 4.57GB/s ± 0% 4.39GB/s ± 2% -3.91% (p=0.000 n=7+8) Memclr/256 10.8GB/s ± 0% 10.0GB/s ± 0% -7.95% (p=0.001 n=7+6) Memclr/4096 20.1GB/s ± 0% 55.3GB/s ± 0% +175.46% (p=0.000 n=8+8) Memclr/65536 22.6GB/s ± 0% 77.8GB/s ± 0% +243.63% (p=0.000 n=7+8) Memclr/1M 22.8GB/s ± 0% 61.5GB/s ± 0% +169.38% (p=0.000 n=8+8) Memclr/4M 22.8GB/s ± 0% 54.3GB/s ± 4% +137.85% (p=0.001 n=6+8) Memclr/8M 22.8GB/s ± 0% 58.1GB/s ± 1% +154.56% (p=0.000 n=7+8) Memclr/16M 22.8GB/s ± 0% 57.2GB/s ± 1% +150.54% (p=0.000 n=8+8) Memclr/64M 22.8GB/s ± 0% 54.4GB/s ± 0% +138.42% (p=0.000 n=7+8) GoMemclr/5 625MB/s ± 0% 569MB/s ± 0% -8.90% (p=0.000 n=7+8) GoMemclr/16 2.00GB/s ± 0% 2.10GB/s ± 0% +5.26% (p=0.000 n=8+8) GoMemclr/64 5.92GB/s ± 0% 6.15GB/s ± 0% +3.83% (p=0.000 n=7+8) GoMemclr/256 12.5GB/s ± 0% 12.1GB/s ± 0% -3.77% (p=0.000 n=8+7) Benchmark results of runtime/Memclr on Amberwing without ZVA: name old time/op new time/op delta Memclr/5 12.7ns ± 0% 12.8ns ± 0% +0.79% (p=0.008 n=5+5) Memclr/16 12.7ns ± 0% 12.7ns ± 0% ~ (p=0.444 n=5+5) Memclr/64 14.0ns ± 0% 14.4ns ± 0% +2.86% (p=0.008 n=5+5) Memclr/256 23.7ns ± 1% 19.2ns ± 0% -19.06% (p=0.008 n=5+5) Memclr/4096 203ns ± 0% 119ns ± 0% -41.38% (p=0.008 n=5+5) Memclr/65536 2.89µs ± 0% 1.66µs ± 0% -42.76% (p=0.008 n=5+5) Memclr/1M 45.9µs ± 0% 26.2µs ± 0% -42.82% (p=0.008 n=5+5) Memclr/4M 184µs ± 0% 105µs ± 0% -42.81% (p=0.008 n=5+5) Memclr/8M 367µs ± 0% 210µs ± 0% -42.76% (p=0.008 n=5+5) Memclr/16M 734µs ± 0% 420µs ± 0% -42.74% (p=0.008 n=5+5) Memclr/64M 2.94ms ± 0% 1.69ms ± 0% -42.46% (p=0.008 n=5+5) GoMemclr/5 8.00ns ± 0% 8.40ns ± 0% +5.00% (p=0.008 n=5+5) GoMemclr/16 8.00ns ± 0% 8.40ns ± 0% +5.00% (p=0.008 n=5+5) GoMemclr/64 10.8ns ± 0% 9.6ns ± 0% -11.02% (p=0.008 n=5+5) GoMemclr/256 20.4ns ± 0% 17.2ns ± 0% -15.69% (p=0.008 n=5+5) name old speed new speed delta Memclr/5 393MB/s ± 0% 391MB/s ± 0% -0.64% (p=0.008 n=5+5) Memclr/16 1.26GB/s ± 0% 1.26GB/s ± 0% -0.55% (p=0.008 n=5+5) Memclr/64 4.57GB/s ± 0% 4.44GB/s ± 0% -2.79% (p=0.008 n=5+5) Memclr/256 10.8GB/s ± 0% 13.3GB/s ± 0% +23.07% (p=0.016 n=4+5) Memclr/4096 20.1GB/s ± 0% 34.3GB/s ± 0% +70.91% (p=0.008 n=5+5) Memclr/65536 22.7GB/s ± 0% 39.6GB/s ± 0% +74.65% (p=0.008 n=5+5) Memclr/1M 22.8GB/s ± 0% 40.0GB/s ± 0% +74.88% (p=0.008 n=5+5) Memclr/4M 22.8GB/s ± 0% 39.9GB/s ± 0% +74.84% (p=0.008 n=5+5) Memclr/8M 22.9GB/s ± 0% 39.9GB/s ± 0% +74.71% (p=0.008 n=5+5) Memclr/16M 22.9GB/s ± 0% 39.9GB/s ± 0% +74.64% (p=0.008 n=5+5) Memclr/64M 22.8GB/s ± 0% 39.7GB/s ± 0% +73.79% (p=0.008 n=5+5) GoMemclr/5 625MB/s ± 0% 595MB/s ± 0% -4.77% (p=0.000 n=4+5) GoMemclr/16 2.00GB/s ± 0% 1.90GB/s ± 0% -4.77% (p=0.008 n=5+5) GoMemclr/64 5.92GB/s ± 0% 6.66GB/s ± 0% +12.48% (p=0.016 n=4+5) GoMemclr/256 12.5GB/s ± 0% 14.9GB/s ± 0% +18.95% (p=0.008 n=5+5) Fixes #22948 Change-Id: Iaae4e22391e25b54d299821bb7f8a81ac3986b93 Reviewed-on: https://go-review.googlesource.com/82055 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-03-14 18:20:40 +00:00
Robert Griesemer	e65d6a6abe	cmd/compile: document new line directives Fixes #24183. Change-Id: I5ef31c4a3aad7e05568b7de1227745d686d4aff8 Reviewed-on: https://go-review.googlesource.com/100462 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-03-14 18:11:30 +00:00
Alberto Donizetti	cd3aae9b81	test/codegen: port all small memmove tests to codegen This change ports all the remaining tests checking that small memmoves are replaced with MOVs to the new codegen test harness, and deletes them from the asm_test file. Change-Id: I01c94b441e27a5d61518035af62d62779dafeb56 Reviewed-on: https://go-review.googlesource.com/100476 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-14 15:57:07 +00:00
Daniel Martí	b8d26225c1	cmd/asm: move manual tests out of generated file Thanks to Iskander Sharipov for spotting this in an earlier CL of mine. Change-Id: Idf45ad266205ff83985367cb38f585badfbed151 Reviewed-on: https://go-review.googlesource.com/100535 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> Reviewed-by: Iskander Sharipov <iskander.sharipov@intel.com> Reviewed-by: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-03-14 15:32:55 +00:00
Giovanni Bajo	eb44b8635c	cmd/compile: remove BTQconst rule This rule is meant for code optimization, but it makes other rules potentially more complex, as they need to cope with the fact that a 32-bit op (BTLconst) can appear everywhere a 64-bit rule maches. Move the optimization to opcode expansion instead. Tests will be added in following CL. Change-Id: Ica5ef291e7963c4af17c124d4a2869e6c8f7b0c7 Reviewed-on: https://go-review.googlesource.com/99995 Reviewed-by: Keith Randall <khr@golang.org>	2018-03-13 23:21:38 +00:00
Matthew Dempsky	e601c07908	cmd/compile: reject type switch with guarded declaration and no cases Fixes #23116. Change-Id: I5db5c5c39bbb50148ffa18c9393b045f255f80a3 Reviewed-on: https://go-review.googlesource.com/100459 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2018-03-13 22:02:46 +00:00
Robert Griesemer	363bcd7b4f	cmd/compile: use key position for key:val elements in composite literals Fixes #24339. Change-Id: Ie47764fed27f76b480834b1fdbed0512c94831d9 Reviewed-on: https://go-review.googlesource.com/100457 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2018-03-13 21:44:12 +00:00
Daniel Martí	63f4ab98eb	cmd/compile: deduplicate racewalk switch cases Only the contiguous ones, to keep the patch simple. Remove some unnecessary newlines, while at it. Change-Id: Ia588f80538b49a169fbf49835979ebff5a0a7b6d Reviewed-on: https://go-review.googlesource.com/94756 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2018-03-13 21:42:58 +00:00
Daniel Martí	1178e51a37	cmd/asm: VPERMQ's imm8 arg is an uint8 The imm8 argument consists of 4 2-bit indices, so it can take values up to $255. However, the assembler was treating it as Yi8, which reads "fits in int8". Add a Yu8 variant, to also keep backwards compatibility with negative values possible with Yi8. Fixes #24378. Change-Id: I24ddb19c219b54d039a6c1bcdb903717d1c7c3b8 Reviewed-on: https://go-review.googlesource.com/100475 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-13 21:29:45 +00:00
pityonline	375280f1bb	cmd/vendor: fix JSON format Change-Id: I9c5a4a4031c12d67c7e75e9a276a766927abf83d Reviewed-on: https://go-review.googlesource.com/100415 Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-03-13 20:41:16 +00:00
Matthew Dempsky	09d4455f45	cmd/compile: enable inlining variadic functions As a side effect of working on mid-stack inlining, we've fixed support for inlining variadic functions. Might as well enable it. Change-Id: I7f555f8b941969791db7eb598c0b49f6dc0820aa Reviewed-on: https://go-review.googlesource.com/100456 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-13 20:34:03 +00:00
Matthew Dempsky	c74aa39f47	cmd/compile: eliminate mkinlcall's isddd parameter These are always set to n.Isddd(), which is readily available within mkinlcall. Change-Id: I3d7fbc9dc19a40d6b905691c666eee9bcd031a00 Reviewed-on: https://go-review.googlesource.com/100455 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-13 20:23:55 +00:00
Carlos Eduardo Seo	e1f8fe8dff	cmd/internal/obj/ppc64: implement full operand support for larx instructions The current implementation of larx instructions does not accept non-zero offsets in RA nor the EH field. This change adds full functionality to those instructions. Updates #23845 Change-Id: If113f70d11de5f35f8389520b049390dbc40e863 Reviewed-on: https://go-review.googlesource.com/99635 Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>	2018-03-13 19:46:54 +00:00
Daniel Martí	ca9abbb731	cmd/compile: remove some unused parameters As reported by unparam. Passes toolstash -cmp on std cmd. Change-Id: I55473e1eed096ed1c3e431aed2cbf0b6b5444b91 Reviewed-on: https://go-review.googlesource.com/97895 Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2018-03-13 16:50:11 +00:00
Cherry Zhang	518e6f0893	cmd/internal/obj/arm64: support logical instructions targeting RSP Logical instructions can have RSP as its destination. Support it. Note that the two-operand form, like "AND $1, RSP", which is equivalent to the three-operand form "AND $1, RSP, RSP", is invalid, because the source register is not allowed to be RSP. Also note that instructions that set the conditional flags, like ANDS, cannot target RSP. Because of this, we split out the optab entries of AND et al. and ANDS et al. Merge the optab entries of BIC et al. to AND et al., because they are same. Fixes #24332. Change-Id: I3584d6f2e7cea98a659a1ed9fdf67c353e090637 Reviewed-on: https://go-review.googlesource.com/100217 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-13 16:09:57 +00:00
Cherry Zhang	911839c1f4	cmd/internal/obj/arm64: fix branch-too-far with TBZ like instructions The compiler now emits TBZ like instructions, but the assembler's too-far-branch patch code didn't include that case. Add it. Fixes #23889. Change-Id: Ib75f9250c660b9fb652835fbc83263a5d5073dc5 Reviewed-on: https://go-review.googlesource.com/94902 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2018-03-13 13:27:58 +00:00
David Chase	3c16934f16	cmd/compile: fix failure to reset reused bit of storage This is the "3rd bug" that caused compilations to sometimes produce different results when dwarf location lists were enabled. A loop had not been properly rewritten in an earlier optimization CL, and it accessed uninitialized data, which was deterministically perhaps wrong when single threaded, but variably wrong when multithreaded. Change-Id: Ib3da538762fdf7d5e4407106f2434f3b14a1d7ea Reviewed-on: https://go-review.googlesource.com/99935 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Heschi Kreinick <heschi@google.com>	2018-03-13 05:03:31 +00:00
Vladimir Kuzmin	7395083136	cmd/compile: avoid extra mapaccess in "m[k] op= r" Currently, order desugars map assignment operations like m[k] op= r into m[k] = m[k] op r which in turn is transformed during walk into: tmp := mapaccess(m, k) tmp = tmp op r mapassign(m, k) = tmp However, this is suboptimal, as we could instead produce just: mapassign(m, k) op= r One complication though is if "r == 0", then "m[k] /= r" and "m[k] %= r" will panic, and they need to do so before* calling mapassign, otherwise we may insert a new zero-value element into the map. It would be spec compliant to just emit the "r != 0" check before calling mapassign (see #23735), but currently these checks aren't generated until SSA construction. For now, it's simpler to continue desugaring /= and %= into two map indexing operations. Fixes #23661. Change-Id: I46e3739d9adef10e92b46fdd78b88d5aabe68952 Reviewed-on: https://go-review.googlesource.com/91557 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2018-03-12 19:27:44 +00:00
isharipo	85a8d25d53	cmd/compile/internal/ssa: emit IMUL3{L/Q} for MUL{L/Q}const on x86 cmd/asm now supports three-operand form of IMUL, so instead of using IMUL with resultInArg0, emit IMUL3 instruction. This results in less redundant MOVs where SSA assigns different registers to input[0] and dst arguments. Note: these have exactly the same encoding when reg0=reg1: IMUL3x $const, reg0, reg1 IMULx $const, reg Two-operand IMULx is like a crippled IMUL3x, with dst fixed to input[0]. This is why we don't bother to generate IMULx for the case where dst is the same as input[0]. Change-Id: I4becda475b3dffdd07b6fdf1c75bacc82af654e4 Reviewed-on: https://go-review.googlesource.com/99656 Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Giovanni Bajo <rasky@develer.com> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-12 19:02:36 +00:00
Giovanni Bajo	080187f4f7	cmd/compile: implement CMOV on amd64 This builds upon the branchelim pass, activating it for amd64 and lowering CondSelect. Special care is made to FPU instructions for NaN handling. Benchmark results on Xeon E5630 (Westmere EP): name old time/op new time/op delta BinaryTree17-16 4.99s ± 9% 4.66s ± 2% ~ (p=0.095 n=5+5) Fannkuch11-16 4.93s ± 3% 5.04s ± 2% ~ (p=0.548 n=5+5) FmtFprintfEmpty-16 58.8ns ± 7% 61.4ns ±14% ~ (p=0.579 n=5+5) FmtFprintfString-16 114ns ± 2% 114ns ± 4% ~ (p=0.603 n=5+5) FmtFprintfInt-16 181ns ± 4% 125ns ± 3% -30.90% (p=0.008 n=5+5) FmtFprintfIntInt-16 263ns ± 2% 217ns ± 2% -17.34% (p=0.008 n=5+5) FmtFprintfPrefixedInt-16 230ns ± 1% 212ns ± 1% -7.99% (p=0.008 n=5+5) FmtFprintfFloat-16 411ns ± 3% 344ns ± 5% -16.43% (p=0.008 n=5+5) FmtManyArgs-16 828ns ± 4% 790ns ± 2% -4.59% (p=0.032 n=5+5) GobDecode-16 10.9ms ± 4% 10.8ms ± 5% ~ (p=0.548 n=5+5) GobEncode-16 9.52ms ± 5% 9.46ms ± 2% ~ (p=1.000 n=5+5) Gzip-16 334ms ± 2% 337ms ± 2% ~ (p=0.548 n=5+5) Gunzip-16 64.4ms ± 1% 65.0ms ± 1% +1.00% (p=0.008 n=5+5) HTTPClientServer-16 156µs ± 3% 155µs ± 3% ~ (p=0.690 n=5+5) JSONEncode-16 21.0ms ± 1% 21.8ms ± 0% +3.76% (p=0.016 n=5+4) JSONDecode-16 95.1ms ± 0% 95.7ms ± 1% ~ (p=0.151 n=5+5) Mandelbrot200-16 6.38ms ± 1% 6.42ms ± 1% ~ (p=0.095 n=5+5) GoParse-16 5.47ms ± 2% 5.36ms ± 1% -1.95% (p=0.016 n=5+5) RegexpMatchEasy0_32-16 111ns ± 1% 111ns ± 1% ~ (p=0.635 n=5+4) RegexpMatchEasy0_1K-16 408ns ± 1% 411ns ± 2% ~ (p=0.087 n=5+5) RegexpMatchEasy1_32-16 103ns ± 1% 104ns ± 1% ~ (p=0.484 n=5+5) RegexpMatchEasy1_1K-16 659ns ± 2% 652ns ± 1% ~ (p=0.571 n=5+5) RegexpMatchMedium_32-16 176ns ± 2% 174ns ± 1% ~ (p=0.476 n=5+5) RegexpMatchMedium_1K-16 58.6µs ± 4% 57.7µs ± 4% ~ (p=0.548 n=5+5) RegexpMatchHard_32-16 3.07µs ± 3% 3.04µs ± 4% ~ (p=0.421 n=5+5) RegexpMatchHard_1K-16 89.2µs ± 1% 87.9µs ± 2% -1.52% (p=0.032 n=5+5) Revcomp-16 575ms ± 0% 587ms ± 2% +2.12% (p=0.032 n=4+5) Template-16 110ms ± 1% 107ms ± 3% -3.00% (p=0.032 n=5+5) TimeParse-16 463ns ± 0% 462ns ± 0% ~ (p=0.810 n=5+4) TimeFormat-16 538ns ± 0% 535ns ± 0% -0.63% (p=0.024 n=5+5) name old speed new speed delta GobDecode-16 70.7MB/s ± 4% 71.4MB/s ± 5% ~ (p=0.452 n=5+5) GobEncode-16 80.7MB/s ± 5% 81.2MB/s ± 2% ~ (p=1.000 n=5+5) Gzip-16 58.2MB/s ± 2% 57.7MB/s ± 2% ~ (p=0.452 n=5+5) Gunzip-16 302MB/s ± 1% 299MB/s ± 1% -0.99% (p=0.008 n=5+5) JSONEncode-16 92.4MB/s ± 1% 89.1MB/s ± 0% -3.63% (p=0.016 n=5+4) JSONDecode-16 20.4MB/s ± 0% 20.3MB/s ± 1% ~ (p=0.135 n=5+5) GoParse-16 10.6MB/s ± 2% 10.8MB/s ± 1% +2.00% (p=0.016 n=5+5) RegexpMatchEasy0_32-16 286MB/s ± 1% 285MB/s ± 3% ~ (p=1.000 n=5+5) RegexpMatchEasy0_1K-16 2.51GB/s ± 1% 2.49GB/s ± 2% ~ (p=0.095 n=5+5) RegexpMatchEasy1_32-16 309MB/s ± 1% 307MB/s ± 1% ~ (p=0.548 n=5+5) RegexpMatchEasy1_1K-16 1.55GB/s ± 2% 1.57GB/s ± 1% ~ (p=0.690 n=5+5) RegexpMatchMedium_32-16 5.68MB/s ± 2% 5.73MB/s ± 1% ~ (p=0.579 n=5+5) RegexpMatchMedium_1K-16 17.5MB/s ± 4% 17.8MB/s ± 4% ~ (p=0.500 n=5+5) RegexpMatchHard_32-16 10.4MB/s ± 3% 10.5MB/s ± 4% ~ (p=0.460 n=5+5) RegexpMatchHard_1K-16 11.5MB/s ± 1% 11.7MB/s ± 2% +1.57% (p=0.032 n=5+5) Revcomp-16 442MB/s ± 0% 433MB/s ± 2% -2.05% (p=0.032 n=4+5) Template-16 17.7MB/s ± 1% 18.2MB/s ± 3% +3.12% (p=0.032 n=5+5) Change-Id: Ic7cb7374d07da031e771bdcbfdd832fd1b17159c Reviewed-on: https://go-review.googlesource.com/98695 Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>	2018-03-12 18:01:33 +00:00
fanzha02	fdf5aaf555	cmd/asm: fix ARM64 vector register arrangement encoding bug The current code assigns vector register arrangement a wrong value when the arrangement specifier is S2, which causes the incorrect assembly. The patch fixes the issue and adds the test cases. Fixes #24249 Change-Id: I9736df1279494003d0b178da1af9cee9cd85ce21 Reviewed-on: https://go-review.googlesource.com/98555 Reviewed-by: Cherry Zhang <cherryyz@google.com> Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-03-12 15:05:53 +00:00
Giovanni Bajo	f7ac70a566	test: move rotate tests to top-level testsuite. Remove old tests from asm_test. Change-Id: Ib408ec7faa60068bddecf709b93ce308e0ef665a Reviewed-on: https://go-review.googlesource.com/100075 Reviewed-by: Alberto Donizetti <alb.donizetti@gmail.com>	2018-03-11 10:08:18 +00:00
Daniel Martí	c15b7b2a54	cmd: re-generate all stringer files The tool has gotten better over time, so re-generating the files brings some advantages like fewer objects, dropping the use of fmt, and dropping unnecessary bounds checks. While at it, add the missing go:generate line for obj.AddrType. Change-Id: I120c9795ee8faddf5961ff0384b9dcaf58d831ff Reviewed-on: https://go-review.googlesource.com/100015 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-10 21:20:50 +00:00
Giovanni Bajo	5c432fe0e3	cmd/compile: gofmt rewriteARM64.go Change-Id: I7424257e496f8f40c9601b62335b64d641dcd3b5 Reviewed-on: https://go-review.googlesource.com/99996 Run-TryBot: Giovanni Bajo <rasky@develer.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Daniel Martí <mvdan@mvdan.cc>	2018-03-10 13:03:55 +00:00
Daniel Martí	0c5cfec844	cmd/internal/test2json: support subtests containing colons The "updates" lines, such as RUN, do not contain a colon. However, test2json looked for one anyway, meaning that it would be thrown off if it encountered a line like: === RUN TestWithColons/[::1] In that case, it must not use the first colon it encounters to separate the action from the test name. Fixes #23920. Change-Id: I82eff23e24b83dae183c0cf9f85fc5f409f51c25 Reviewed-on: https://go-review.googlesource.com/98445 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-03-10 10:13:25 +00:00
David Chase	0eacf8cbdf	cmd/compile: add DWARF reg defs & fix 32-bit location list bug Before DWARF location lists can be turned on, 3 bugs need fixing. This CL addresses two -- lack of register definitions for various architectures, and bugs on 32-bit platforms. The third bug comes later. Passes GO_GCFLAGS=-dwarflocationlists ./run.bash -no-rebuild (-no-rebuild because the map dependence causes trouble) Change-Id: I4223b48ade84763e4b048e4aeb81149f082c7bc7 Reviewed-on: https://go-review.googlesource.com/99255 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-03-09 23:17:18 +00:00
Ben Shi	20046020c4	cmd/compile: fix an issue in MNEG of ARM64 There are two less optimized SSA rules in my previous CL https://go-review.googlesource.com/c/go/+/95075 . This CL fixes that issue and a test case gets about 10% performance improvement. name old time/op new time/op delta MNEG-4 263µs ± 3% 235µs ± 3% -10.53% (p=0.000 n=20+20) (https://github.com/benshi001/ugo1/blob/master/mneg_7_test.go) Change-Id: I30087097e281dd9d9d1c870d32e13b4ef4a96ad3 Reviewed-on: https://go-review.googlesource.com/99495 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-03-09 22:45:21 +00:00
Matthew Dempsky	e4de522c95	cmd/compile: fix Node.Etype overloading Add helper methods that validate n.Op and convert to/from the appropriate type. Notably, there was a lot of code in walk.go that thought setting Etype=1 on an OADDR node affected escape analysis. Passes toolstash-check. TBR=marvin Change-Id: Ieae7c67225c1459c9719f9e6a748a25b975cf758 Reviewed-on: https://go-review.googlesource.com/99535 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-09 21:44:35 +00:00
Alberto Donizetti	5f541b11aa	test/codegen: port MULs merging tests to codegen And delete them from asm_go. Change-Id: I0057cbd90ca55fa51c596e32406e190f3866f93e Reviewed-on: https://go-review.googlesource.com/99815 Reviewed-by: Keith Randall <khr@golang.org>	2018-03-09 17:01:56 +00:00
Jean de Klerk	709317138f	cmd/go: briefly document test caching in go test -h output Fixes #23971 Change-Id: I073f278cc058aa15a23c0ea06292c02d50a3df21 Reviewed-on: https://go-review.googlesource.com/95582 Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Rob Pike <r@golang.org>	2018-03-09 11:39:11 +00:00
Alberto Donizetti	cde34780b7	test/codegen: port math/bits.RotateLeft tests to codegen Only RotateLeft{64,32} were tested, and just for ppc64. This CL adds tests for RotateLeft{64,32,16,8} on arm64 and amd64/386, for the cases where the calls are actually instrinsified. RotateLeft tests (the last ones for math/bits functions) are deleted from asm_test. This CL also adds a space between the "//" and the arch name in the comments, to uniform this file to the style used in all the other files. Change-Id: Ifc2a27261d70bcc294b4ec64490d8367f62d2b89 Reviewed-on: https://go-review.googlesource.com/99596 Reviewed-by: Giovanni Bajo <rasky@develer.com>	2018-03-09 10:53:38 +00:00
Hana Kim	6b5a0b5c16	cmd/trace: set cname for span slices Define a set of color names available in trace viewer https://user-images.githubusercontent.com/4999471/37063995-5d0bad48-2169-11e8-92be-9cb363e21c38.png Change-Id: I312fcbc5430d7512b4c39ddc79a769259bad8c22 Reviewed-on: https://go-review.googlesource.com/99055 Reviewed-by: Heschi Kreinick <heschi@google.com>	2018-03-09 01:58:41 +00:00
Hana Kim	476f71c958	cmd/trace: remove unrelated arrows in task-oriented traceview Also grey out instants that represent events occurred outside the task's span. Furthermore, if the unrelated instants represent user annotation events but not for the task of the interest, skip rendering completely. This helps users to focus on the task-related events better. UI screen shot: https://gist.github.com/hyangah/1df5d2c8f429fd933c481e9636b89b55#file-golang-org_cl_99035 Change-Id: I2b5aef41584c827f8c1e915d0d8e5c95fe2b4b65 Reviewed-on: https://go-review.googlesource.com/99035 Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com> Run-TryBot: Heschi Kreinick <heschi@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Heschi Kreinick <heschi@google.com>	2018-03-09 01:34:12 +00:00
Austin Clements	60a9e5d613	runtime: ensure abort actually crashes the process On all non-x86 arches, runtime.abort simply reads from nil. Unfortunately, if this happens on a user stack, the signal handler will dutifully turn this into a panicmem, which lets user defers run and which user code can even recover from. To fix this, add an explicit check to the signal handler that turns faults in abort into hard crashes directly in the signal handler. This has the added benefit of giving a register dump at the abort point. Change-Id: If26a7f13790745ee3867db7f53b72d8281176d70 Reviewed-on: https://go-review.googlesource.com/93661 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-08 22:55:55 +00:00
Austin Clements	c950a90d72	runtime: call abort instead of raw INT $3 or bad MOV Everything except for amd64, amd64p32, and 386 currently defines and uses an abort function. This CL makes these match. The next CL will recognize the abort function to make this more useful. Change-Id: I7c155871ea48919a9220417df0630005b444f488 Reviewed-on: https://go-review.googlesource.com/93660 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-08 22:55:54 +00:00
Austin Clements	da022da900	cmd/compile: simplify OpSlicemask optimization The previous CL introduced isConstDelta. Use it to simplify the OpSlicemask optimization in the prove pass. This passes toolstash -cmp. Change-Id: If2aa762db4cdc0cd1c581a536340530a9831081b Reviewed-on: https://go-review.googlesource.com/87481 Reviewed-by: Keith Randall <khr@golang.org>	2018-03-08 22:25:29 +00:00
Austin Clements	6436270dad	cmd/compile: add fence-post implications to prove This adds four new deductions to the prove pass, all related to adding or subtracting one from a value. This is the first hint of actual arithmetic relations in the prove pass. The most effective of these is x-1 >= w && x > min ⇒ x > w This helps eliminate bounds checks in code like if x > 0 { // do something with s[x-1] } Altogether, these deductions prove an additional 260 branches in std and cmd. Furthermore, they will let us eliminate some tricky compiler-inserted panics in the runtime that are interfering with static analysis. Fixes #23354. Change-Id: I7088223e0e0cd6ff062a75c127eb4bb60e6dce02 Reviewed-on: https://go-review.googlesource.com/87480 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>	2018-03-08 22:25:28 +00:00
Austin Clements	941fc129e2	cmd/compile: derive unsigned limits from signed limits in prove This adds a few simple deductions to the prove pass' fact table to derive unsigned concrete limits from signed concrete limits where possible. This tweak lets the pass prove 70 additional branch conditions in std and cmd. This is based on a comment from the recently-deleted factsTable.get: "// TODO: also use signed data if lim.min >= 0". Change-Id: Ib4340249e7733070f004a0aa31254adf5df8a392 Reviewed-on: https://go-review.googlesource.com/87479 Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-08 22:25:27 +00:00
Austin Clements	669db2cef5	cmd/compile: make prove pass use unsatisfiability Currently the prove pass uses implication queries. For each block, it collects the set of branch conditions leading to that block, and queries this fact table for whether any of these facts imply the block's own branch condition (or its inverse). This works remarkably well considering it doesn't do any deduction on these facts, but it has various downsides: 1. It requires an implementation both of adding facts to the table and determining implications. These are very nearly duals of each other, but require separate implementations. Likewise, the process of asserting facts of dominating branch conditions is very nearly the dual of the process of querying implied branch conditions. 2. It leads to less effective use of derived facts. For example, the prove pass currently derives facts about the relations between len and cap, but can't make use of these unless a branch condition is in the exact form of a derived fact. If one of these derived facts contradicts another fact, it won't notice or make use of this. This CL changes the approach of the prove pass to instead use contradiction instead of implication. Rather than ever querying a branch condition, it simply adds branch conditions to the fact table. If this leads to a contradiction (specifically, it makes the fact set unsatisfiable), that branch is impossible and can be cut. As a result, 1. We can eliminate the code for determining implications (factsTable.get disappears entirely). Also, there is now a single implementation of visiting and asserting branch conditions, since we don't have to flip them around to treat them as facts in one place and queries in another. 2. Derived facts can be used effectively. It doesn't matter why the fact table is unsatisfiable; a contradiction in any of the facts is enough. 3. As an added benefit, it's now quite easy to avoid traversing beyond provably-unreachable blocks. In contrast, the current implementation always visits all blocks. The prove pass already has nearly all of the mechanism necessary to compute unsatisfiability, which means this both simplifies the code and makes it more powerful. The only complication is that the current implication procedure has a hack for dealing with the 0 <= Args[0] condition of OpIsInBounds and OpIsSliceInBounds. We replace this with asserting the appropriate fact when we process one of these conditions. This seems much cleaner anyway, and works because we can now take advantage of derived facts. This has no measurable effect on compiler performance. Effectiveness: There is exactly one condition in all of std and cmd that this fails to prove that the old implementation could: (int64(^uint(0)>>1) < x) in encoding/gob. This can never be true because x is an int, and it's basically coincidence that the old code gets this. (For example, it fails to prove the similar (x < ^int64(^uint(0)>>1)) condition that immediately precedes it, and even though the conditions are logically unrelated, it wouldn't get the second one if it hadn't first processed the first!) It does, however, prove a few dozen additional branches. These come from facts that are added to the fact table about the relations between len and cap. These were almost never queried directly before, but could lead to contradictions, which the unsat-based approach is able to use. There are exactly two branches in std and cmd that this implementation proves in the other direction. This sounds scary, but is okay because both occur in already-unreachable blocks, so it doesn't matter what we chose. Because the fact table logic is sound but incomplete, it fails to prove that the block isn't reachable, even though it is able to prove that both outgoing branches are impossible. We could turn these blocks into BlockExit blocks, but it doesn't seem worth the trouble of the extra proof effort for something that happens twice in all of std and cmd. Tests: This CL updates test/prove.go to change the expected messages because it can no longer give a "reason" why it proved or disproved a condition. It also adds a new test of a branch it couldn't prove before. It mostly guts test/sliceopt.go, removing everything related to slice bounds optimizations and moving a few relevant tests to test/prove.go. Much of this test is actually unreachable. The new prove pass figures this out and doesn't try to prove anything about the unreachable parts. The output on the unreachable parts is already suspect because anything can be proved at that point, so it's really just a regression test for an algorithm the compiler no longer uses. This is a step toward fixing #23354. That issue is quite easy to fix once we can use derived facts effectively. Change-Id: Ia48a1b9ee081310579fe474e4a61857424ff8ce8 Reviewed-on: https://go-review.googlesource.com/87478 Reviewed-by: Keith Randall <khr@golang.org>	2018-03-08 22:25:25 +00:00
Austin Clements	2e9cf5f66e	cmd/compile: simplify limit logic in prove This replaces the open-coded intersection of limits in the prove pass with a general limit intersection operation. This should get identical results except in one case where it's more precise: when handling an equality relation, if the value is outside the existing range, this will reduce the range to empty rather than resetting it. This will be important to a follow-up CL where we can take advantage of empty ranges. For #23354. Change-Id: I3d3d75924f61b1da1cb604b3a9d189b26fb3a14e Reviewed-on: https://go-review.googlesource.com/87477 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>	2018-03-08 22:25:24 +00:00
Austin Clements	44e20b64ef	cmd/compile: more String methods for prove types These aid in debugging. Change-Id: Ieb38c996765f780f6103f8c3292639d408c25123 Reviewed-on: https://go-review.googlesource.com/87476 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-08 22:25:23 +00:00
Austin Clements	491f409a32	cmd/compile: minor comment improvements/corrections Change-Id: Ie0934f1528d58d4971cdef726d3e2d23cf3935d3 Reviewed-on: https://go-review.googlesource.com/87475 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>	2018-03-08 22:25:21 +00:00
Matthew Dempsky	b55eedd173	Revert "cmd/compile: cleanup nodpc and nodfp" This reverts commit `dcac984b97`. Reason for revert: broke LR architectures (arm64, ppc64, s390x) Change-Id: I531d311c9053e81503c8c78d6cf044b318fc828b Reviewed-on: https://go-review.googlesource.com/99695 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2018-03-08 21:23:01 +00:00
isharipo	d2a5263a9c	math/big: speedup nat.setBytes for bigger slices Set up to _S (number of bytes in Uint) bytes at time by using BigEndian.Uint32 and BigEndian.Uint64. The performance improves for slices bigger than _S bytes. This is the case for 128/256bit arith that initializes it's objects from bytes. name old time/op new time/op delta NatSetBytes/8-4 29.8ns ± 1% 11.4ns ± 0% -61.63% (p=0.000 n=9+8) NatSetBytes/24-4 109ns ± 1% 56ns ± 0% -48.75% (p=0.000 n=9+8) NatSetBytes/128-4 420ns ± 2% 110ns ± 1% -73.83% (p=0.000 n=10+10) NatSetBytes/7-4 26.2ns ± 1% 21.3ns ± 2% -18.63% (p=0.000 n=8+9) NatSetBytes/23-4 106ns ± 1% 67ns ± 1% -36.93% (p=0.000 n=9+10) NatSetBytes/127-4 410ns ± 2% 121ns ± 0% -70.46% (p=0.000 n=9+8) Found this optimization opportunity by looking at ethereum_corevm community benchmark cpuprofile. name old time/op new time/op delta OpDiv256-4 715ns ± 1% 596ns ± 1% -16.57% (p=0.008 n=5+5) OpDiv128-4 373ns ± 1% 314ns ± 1% -15.83% (p=0.008 n=5+5) OpDiv64-4 301ns ± 0% 285ns ± 1% -5.12% (p=0.008 n=5+5) Change-Id: I8e5a680ae6284c8233d8d7431d51253a8a740b57 Reviewed-on: https://go-review.googlesource.com/98775 Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com> Reviewed-by: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-03-08 18:50:10 +00:00
Matthew Dempsky	dcac984b97	cmd/compile: cleanup nodpc and nodfp Instead of creating a new &nodfp expression for every recover() call, or a new nodpc variable for every function instrumented by the race detector, this CL introduces two new uintptr-typed pseudo-variables callerSP and callerPC. These pseudo-variables act just like calls to the runtime's getcallersp() and getcallerpc() functions. For consistency, change runtime.gorecover's builtin stub's parameter type from "*int32" to "uintptr". Passes toolstash-check, but toolstash-check -race fails because of register allocator changes. Change-Id: I985d644653de2dac8b7b03a28829ad04dfd4f358 Reviewed-on: https://go-review.googlesource.com/99416 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-08 18:22:29 +00:00
Matthew Dempsky	6a5cfa8b63	cmd/compile: remove two out-of-phase calls to walk All calls to walkstmt/walkexpr/etc should be rooted from funccompile, whereas transformclosure and fninit are called by main. Passes toolstash-check. Change-Id: Ic880e2d2d83af09618ce4daa8e7716f6b389e53e Reviewed-on: https://go-review.googlesource.com/99418 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-08 18:22:13 +00:00
Matthew Dempsky	8b766e5d09	cmd/compile: remove state.exitCode We're holding onto the function's complete AST anyway, so might as well grab the exit code from there. Passes toolstash-check. Change-Id: I851b5dfdb53f991e9cd9488d25d0d2abc2a8379f Reviewed-on: https://go-review.googlesource.com/99417 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-08 18:22:04 +00:00
Matthew Dempsky	e3127f023f	cmd/compile: fuse escape analysis parameter tagging loops Simplifies the code somewhat and allows removing Param.Field. Passes toolstash-check. Change-Id: Id854416aea8afd27ce4830ff0f5ff940f7353792 Reviewed-on: https://go-review.googlesource.com/99336 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-03-08 18:21:52 +00:00
Alberto Donizetti	3772b2e1d5	test/codegen: port 2^n muls tests to codegen harness And delete them from the asm_test.go file. Change-Id: I124c8c352299646ec7db0968cdb0fe59a3b5d83d Reviewed-on: https://go-review.googlesource.com/99475 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Giovanni Bajo <rasky@develer.com>	2018-03-08 16:30:14 +00:00
Lynn Boger	5b14c7b324	cmd/asm, cmd/internal/obj/ppc64: avoid unnecessary load zeros When instructions add, and, or, xor, and movd have constant operands in some cases more instructions are generated than necessary by the assembler. This adds more opcode/operand combinations to the optab and improves the code generation for the cases where the size and sign of the constant allows the use of 1 instructions instead of 2. Example of previous code: oris r3, r0, 0 ori r3, r3, 65533 now: ori r3, r0, 65533 This does not significantly reduce the overall binary size because the improvement depends on the constant value. Some procedures show a 1-2% reduction in size. This improvement could also be significant in cases where the extra instructions occur in a critical loop. Testcase ppc64enc.s was added to cmd/asm/internal/asm/testdata with the variations affected by this change. Updates #23845 Change-Id: I7fdf2320c95815d99f2755ba77d0c6921cd7fad7 Reviewed-on: https://go-review.googlesource.com/95135 Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2018-03-08 14:17:34 +00:00
Matthew Dempsky	88466e93a4	cmd/compile: mark anonymous receiver parameters as non-escaping This was already done for normal parameters, and the same logic applies for receiver parameters too. Updates #24305. Change-Id: Ia2a46f68d14e8fb62004ff0da1db0f065a95a1b7 Reviewed-on: https://go-review.googlesource.com/99335 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-08 00:20:01 +00:00
Ian Lance Taylor	8b8625a328	cmd/cover: don't crash on non-gofmt'ed input Without the change to cover.go, the new test fails with panic: overlapping edits: [4946,4950)->"", [4947,4947)->"thisNameMustBeVeryLongToCauseOverflowOfCounterIncrementStatementOntoNextLineForTest.Count[112]++;" The original code inserts "else{", deletes "else", and then positions a new block just after the "}" that must come before the "else". That works on gofmt'ed code, but fails if the code looks like "}else". When there is no space between the "{" and the "else", the new block is inserted into a location that we are deleting, leading to the "overlapping edits" mentioned above. This CL fixes this case by not deleting the "else" but just using the one that is already there. That requires adjust the block offset to come after the "{" that we insert. Fixes #23927 Change-Id: I40ef592490878765bbce6550ddb439e43ac525b2 Reviewed-on: https://go-review.googlesource.com/98935 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2018-03-07 23:36:25 +00:00
Brad Fitzpatrick	d8c9ef9e5c	cmd/dist: skip rebuild before running tests when on the build systems Updates #24300 Change-Id: I7752dab67e15a6dfe5fffe5b5ccbf3373bbc2c13 Reviewed-on: https://go-review.googlesource.com/99296 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-03-07 23:27:24 +00:00
David du Colombier	b1335037fa	cmd/go: skip TestVetWithOnlyCgoFiles when cgo is disabled CL 99175 added TestVetWithOnlyCgoFiles. However, this test is failing on platforms where cgo is disabled, because no file can be built. This change fixes TestVetWithOnlyCgoFiles by skipping this test when cgo is disabled. Fixes #24304. Change-Id: Ibb38fcd3e0ed1a791782145d3f2866f12117c6fe Reviewed-on: https://go-review.googlesource.com/99275 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-07 22:03:43 +00:00
Ian Lance Taylor	709da95513	cmd/go: run vet on packages with only cgo files CgoFiles is not included in GoFiles, so we need to check both. Fixes #24193 Change-Id: I6a67bd912e3d9a4be0eae8fa8db6fa8a07fb5df3 Reviewed-on: https://go-review.googlesource.com/99175 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-07 18:42:17 +00:00
Matthew Dempsky	a3b3284ddc	cmd/compile: prevent untyped types from reaching walk We already require expressions to have already been typechecked before reaching walk. Moreover, all untyped expressions should have been converted to their default type by walk. However, in practice, we've been somewhat sloppy and inconsistent about ensuring this. In particular, a lot of AST rewrites ended up leaving untyped bool expressions scattered around. These likely aren't harmful in practice, but it seems worth cleaning up. The two most common cases addressed by this CL are: 1) When generating OIF and OFOR nodes, we would often typecheck the conditional expression, but not apply defaultlit to force it to the expression's default type. 2) When rewriting string comparisons into more fundamental primitives, we were simply overwriting r.Type with the desired type, which didn't propagate the type to nested subexpressions. These are fixed by utilizing finishcompare, which correctly handles this (and is already used by other comparison lowering rewrites). Lastly, walkexpr is extended to assert that it's not called on untyped expressions. Fixes #23834. Change-Id: Icbd29648a293555e4015d3b06a95a24ccbd3f790 Reviewed-on: https://go-review.googlesource.com/98337 Reviewed-by: Robert Griesemer <gri@golang.org>	2018-03-07 18:14:22 +00:00
Kunpei Sakai	ed8b7a7785	cmd/compile: go fmt Change-Id: I2eae33928641c6ed74badfe44d079ae90e5cc8c8 Reviewed-on: https://go-review.googlesource.com/99195 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-07 16:57:03 +00:00
Hana Kim	93b0261d0a	cmd/trace: force GC occassionally to return memory to the OS after completing potentially large operations. Update #21870 Sys went down to 3.7G $ DEBUG_MEMORY_USAGE=1 go tool trace trace.out 2018/03/07 09:35:52 Parsing trace... after parsing trace Alloc: 3385754360 Bytes Sys: 3662047864 Bytes HeapReleased: 0 Bytes HeapSys: 3488907264 Bytes HeapInUse: 3426549760 Bytes HeapAlloc: 3385754360 Bytes Enter to continue... 2018/03/07 09:36:09 Splitting trace... after spliting trace Alloc: 3238309424 Bytes Sys: 3684410168 Bytes HeapReleased: 0 Bytes HeapSys: 3488874496 Bytes HeapInUse: 3266461696 Bytes HeapAlloc: 3238309424 Bytes Enter to continue... 2018/03/07 09:36:39 Opening browser. Trace viewer is listening on http://100.101.224.241:12345 after httpJsonTrace Alloc: 3000633872 Bytes Sys: 3693978424 Bytes HeapReleased: 0 Bytes HeapSys: 3488743424 Bytes HeapInUse: 3030966272 Bytes HeapAlloc: 3000633872 Bytes Enter to continue... Change-Id: I56f64cae66c809cbfbad03fba7bd0d35494c1d04 Reviewed-on: https://go-review.googlesource.com/92376 Reviewed-by: Peter Weinberger <pjw@google.com>	2018-03-07 14:39:25 +00:00
Hana Kim	ee465831ec	cmd/trace: generate jsontrace data in a streaming fashion Update #21870 The Sys went down to 4.25G from 6.2G. $ DEBUG_MEMORY_USAGE=1 go tool trace trace.out 2018/03/07 08:49:01 Parsing trace... after parsing trace Alloc: 3385757184 Bytes Sys: 3661195896 Bytes HeapReleased: 0 Bytes HeapSys: 3488841728 Bytes HeapInUse: 3426516992 Bytes HeapAlloc: 3385757184 Bytes Enter to continue... 2018/03/07 08:49:18 Splitting trace... after spliting trace Alloc: 2352071904 Bytes Sys: 4243825464 Bytes HeapReleased: 0 Bytes HeapSys: 4025712640 Bytes HeapInUse: 2377703424 Bytes HeapAlloc: 2352071904 Bytes Enter to continue... after httpJsonTrace Alloc: 3228697832 Bytes Sys: 4250379064 Bytes HeapReleased: 0 Bytes HeapSys: 4025647104 Bytes HeapInUse: 3260014592 Bytes HeapAlloc: 3228697832 Bytes Change-Id: I546f26bdbc68b1e58f1af1235a0e299dc0ff115e Reviewed-on: https://go-review.googlesource.com/92375 Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com> Reviewed-by: Peter Weinberger <pjw@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-03-07 14:33:54 +00:00
Matthew Dempsky	d7eb4901f1	cmd/compile: remove funcdepth variables There were only two large classes of use for these variables: 1) Testing "funcdepth != 0" or "funcdepth > 0", which is equivalent to checking "Curfn != nil". 2) In oldname, detecting whether a closure variable has been created for the current function, which can be handled by instead testing "n.Name.Curfn != Curfn". Lastly, merge funcstart into funchdr, since it's only called once, and it better matches up with funcbody now. Passes toolstash-check. Change-Id: I8fe159a9d37ef7debc4cd310354cea22a8b23394 Reviewed-on: https://go-review.googlesource.com/99076 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-07 06:05:18 +00:00
Matthew Dempsky	aa00ca12fe	cmd/compile: cleanup funccompile and compile Bring these functions next to each other, and clean them up a little bit. Also, change emitptrargsmap to take Curfn as a parameter instead of a global. Passes toolstash-check. Change-Id: Ib9c94fda3b2cb6f0dcec1585622b33b4f311b5e9 Reviewed-on: https://go-review.googlesource.com/99075 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-07 03:12:38 +00:00
Kunpei Sakai	b75e8a2a3b	cmd/compile: prevent detection of wrong duplicates by including *types.Type in typeVal. Updates #21866 Fixes #24159 Change-Id: I2f8cac252d88d43e723124f2867b1410b7abab7b Reviewed-on: https://go-review.googlesource.com/98476 Run-TryBot: Kunpei Sakai <namusyaka@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2018-03-07 01:26:00 +00:00
Matthew Dempsky	2c0c68d621	cmd/compile: fix miscompilation of "defer delete(m, k)" Previously, for slow map key types (i.e., any type other than a 32-bit or 64-bit plain memory type), we would rewrite defer delete(m, k) into ktmp := k defer delete(m, &ktmp) However, if the defer statement was inside a loop, we would end up reusing the same ktmp value for all of the deferred deletes. We already rewrite defer print(x, y, z) into defer func(a1, a2, a3) { print(a1, a2, a3) }(x, y, z) This CL generalizes this rewrite to also apply for slow map deletes. This could be extended to apply even more generally to other builtins, but as discussed on #24259, there are cases where we must not do this (e.g., "defer recover()"). However, if we elect to do this more generally, this CL should still make that easier. Lastly, while here, fix a few isues in wrapCall (nee walkprintfunc): 1) lookupN appends the generation number to the symbol anyway, so "%d" was being literally included in the generated function names. 2) walkstmt will be called when the function is compiled later anyway, so no need to do it now. Fixes #24259. Change-Id: I70286867c64c69c18e9552f69e3f4154a0fc8b04 Reviewed-on: https://go-review.googlesource.com/99017 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-06 23:33:28 +00:00
ChrisALiles	42ecf39e85	cmd/compile: improve compiler error on embedded structs Fixes #23609 Change-Id: I751aae3d849de7fce1306324fcb1a4c3842d873e Reviewed-on: https://go-review.googlesource.com/97076 Reviewed-by: Matthew Dempsky <mdempsky@google.com> Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-03-06 21:06:46 +00:00
Alberto Donizetti	8516ecd05f	test/codegen: port math/bits.ReverseBytes tests to codegen And remove them from ssa_test. Change-Id: If767af662801219774d1bdb787c77edfa6067770 Reviewed-on: https://go-review.googlesource.com/98976 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Giovanni Bajo <rasky@develer.com>	2018-03-06 20:34:33 +00:00
Wei Xiao	05962561ae	cmd/compile/internal/ssa: improve store combine optimization on arm64 Current implementation doesn't consider MOVDreg type operand and fail to combine it into larger store. This patch fixes the issue. Fixes #24242 Change-Id: I7d68697f80e76f48c3528ece01a602bf513248ec Reviewed-on: https://go-review.googlesource.com/98397 Run-TryBot: Giovanni Bajo <rasky@develer.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-03-06 20:29:04 +00:00
Balaram Makam	0e8b7110f6	cmd/compile/internal/ssa: inline small memmove for arm64 This patch enables the optimization for arm64 target. Performance results on Amberwing for strconv benchmark: name old time/op new time/op delta Quote 721ns ± 0% 617ns ± 0% -14.40% (p=0.016 n=5+4) QuoteRune 118ns ± 0% 117ns ± 0% -0.85% (p=0.008 n=5+5) AppendQuote 436ns ± 2% 321ns ± 0% -26.31% (p=0.008 n=5+5) AppendQuoteRune 34.7ns ± 0% 28.4ns ± 0% -18.16% (p=0.000 n=5+4) [Geo mean] 189ns 160ns -15.41% Change-Id: I5714c474e7483d07ca338fbaf49beb4bbcc11c44 Reviewed-on: https://go-review.googlesource.com/98735 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Giovanni Bajo <rasky@develer.com> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-03-06 18:37:19 +00:00
Alberto Donizetti	18ae5eca3b	test/codegen: port math/bits.OnesCount tests to codegen And remove them from ssa_test. Change-Id: I3efac5fea529bb0efa2dae32124530482ba5058e Reviewed-on: https://go-review.googlesource.com/98815 Reviewed-by: Keith Randall <khr@golang.org>	2018-03-06 17:53:00 +00:00
Cherry Zhang	f624445473	cmd/internal/obj/arm64: gofmt Change-Id: Ica778fef2d0245fbb14f595597e45c7cf6adef84 Reviewed-on: https://go-review.googlesource.com/98895 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-06 16:35:20 +00:00
Elias Naur	ad87a67cdf	cmd/dist: default to GOARM=7 on android Auto-detecting GOARM on Android makes as little sense as for nacl/arm and darwin/arm. Also update androidtest.sh to not require GOARM set. Change-Id: Id409ce1573d3c668d00fa4b7e3562ad7ece6fef5 Reviewed-on: https://go-review.googlesource.com/98875 Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com> Run-TryBot: Elias Naur <elias.naur@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-03-06 16:08:04 +00:00
Alberto Donizetti	85dcc709a8	test/codegen: port math/bits.TrailingZeros tests to codegen And remove them from ssa_test. Change-Id: Ib5de5c0d908f23915e0847eca338cacf2fa5325b Reviewed-on: https://go-review.googlesource.com/98795 Reviewed-by: Giovanni Bajo <rasky@develer.com>	2018-03-06 11:48:37 +00:00
Meng Zhuo	8916773a3d	runtime, cmd/compile: use ldp for DUFFCOPY on ARM64 name old time/op new time/op delta CopyFat8 2.15ns ± 1% 2.19ns ± 6% ~ (p=0.171 n=8+9) CopyFat12 2.15ns ± 0% 2.17ns ± 2% ~ (p=0.137 n=8+10) CopyFat16 2.17ns ± 3% 2.15ns ± 0% ~ (p=0.211 n=10+10) CopyFat24 2.16ns ± 1% 2.15ns ± 0% ~ (p=0.087 n=10+10) CopyFat32 11.5ns ± 0% 12.8ns ± 2% +10.87% (p=0.000 n=8+10) CopyFat64 20.2ns ± 2% 12.9ns ± 0% -36.11% (p=0.000 n=10+10) CopyFat128 37.2ns ± 0% 21.5ns ± 0% -42.20% (p=0.000 n=10+10) CopyFat256 71.6ns ± 0% 38.7ns ± 0% -45.95% (p=0.000 n=10+10) CopyFat512 140ns ± 0% 73ns ± 0% -47.86% (p=0.000 n=10+9) CopyFat520 142ns ± 0% 74ns ± 0% -47.54% (p=0.000 n=10+10) CopyFat1024 277ns ± 0% 141ns ± 0% -49.10% (p=0.000 n=10+10) Change-Id: If54bc571add5db674d5e081579c87e80153d0a5a Reviewed-on: https://go-review.googlesource.com/97395 Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-03-06 04:14:59 +00:00
Rob Pike	baf3eb1625	cmd/doc: make local dot-slash path names work Before, an argument that started ./ or ../ was not treated as a package relative to the current directory. Thus $ cd $GOROOT/src/text $ go doc ./template could find html/template as $GOROOT/src/html/./template is a valid Go source directory. Fix this by catching such paths and making them absolute before processing. Fixes #23383. Change-Id: Ic2a92eaa3a6328f728635657f9de72ac3ee82afb Reviewed-on: https://go-review.googlesource.com/98396 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-03-06 01:11:26 +00:00
Matthew Dempsky	26708439ec	cmd/compile: refactor order.go into methods No functional changes, just changing all the orderfoo functions into (*Order).foo methods. Passes toolstash-check. Change-Id: Ib9833daa98aff3c645ce56794a414f8472689152 Reviewed-on: https://go-review.googlesource.com/98617 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-05 21:25:24 +00:00
Hana Kim	d3946f75d3	internal/trace: remove backlinks from span/task end to start This is an updated version of golang.org/cl/96395, with the fix to TestUserSpan. This reverts commit 7b6f6267e90a8e4eab37a3f2164ba882e6222adb. Change-Id: I31eec8ba0997f9178dffef8dac608e731ab70872 Reviewed-on: https://go-review.googlesource.com/98236 Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Heschi Kreinick <heschi@google.com>	2018-03-05 20:10:22 +00:00
Alberto Donizetti	83e41b3e76	test/codegen: port math/bits.Leadingzero tests to codegen Change-Id: Ic21d25db5d56ce77516c53082dfbc010e5875b81 Reviewed-on: https://go-review.googlesource.com/98655 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-05 19:52:04 +00:00
Alberto Donizetti	c1806906d8	test: port bits.Len intrinsics tests to the new codegen harness This change move bits.Len* intrinsification tests to the new codegen test harness, removing them from the old ssa_test file. Five different test functions (one for each bit.Len function tested) was used, to avoid possible unwanted interactions between multiple calls inside one function. Change-Id: Iffd5be55b58e88597fa30a562a28dacb01236d8b Reviewed-on: https://go-review.googlesource.com/98156 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Giovanni Bajo <rasky@develer.com>	2018-03-05 18:01:19 +00:00
Giovanni Bajo	29fcd57a9f	cmd/compile: fold offsets into memory ops Fold offsets for: {ADD,SUB,MUL}[SD]mem ADD[LQ]constmem {ADD,SUB,AND,OR,XOR}[LQ]mem Cumulatively, the rules trigger ~900 times in all.bash. Fixes #23325 Change-Id: If6c701f68fa0b57907a353a07a516b914127d0d8 Reviewed-on: https://go-review.googlesource.com/98035 Run-TryBot: Giovanni Bajo <rasky@develer.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-04 22:53:50 +00:00
Keith Randall	ee58eccc56	internal/bytealg: move short string Index implementations into bytealg Also move the arm64 CountByte implementation while we're here. Fixes #19792 Change-Id: I1e0fdf1e03e3135af84150a2703b58dad1b0d57e Reviewed-on: https://go-review.googlesource.com/98518 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-04 19:49:44 +00:00
Keith Randall	f6332bb84a	internal/bytealg: move compare functions to bytealg Move bytes.Compare and runtime·cmpstring to bytealg. Update #19792 Change-Id: I139e6d7c59686bef7a3017e3dec99eba5fd10447 Reviewed-on: https://go-review.googlesource.com/98515 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-04 17:49:39 +00:00
Keith Randall	45964e4f9c	internal/bytealg: move Count to bytealg Move bytes.Count and strings.Count to bytealg. Update #19792 Change-Id: I3e4e14b504a0b71758885bb131e5656e342cf8cb Reviewed-on: https://go-review.googlesource.com/98495 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-04 17:49:25 +00:00
Giovanni Bajo	89ae7045f3	test: convert all math-related tests from asm_test Change-Id: If542f0b5c5754e6eb2f9b302fe5a148ba9a57338 Reviewed-on: https://go-review.googlesource.com/98443 Run-TryBot: Giovanni Bajo <rasky@develer.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-04 16:52:33 +00:00
Giovanni Bajo	fad31e513d	test: move load/store combines into asmcheck This CL moves the load/store combining tests into asmcheck. In addition at being more compact, it's also now easier to spot what it is missing in each architecture. While doing so, I think I uncovered a bug in ppc64le and arm64 rules, because they fail to load/store combine in non-trivial functions. Not sure why, I'll open an issue. Change-Id: Ia1572d53c0553d9104f3e52b95e4d1768a8440a3 Reviewed-on: https://go-review.googlesource.com/98441 Run-TryBot: Giovanni Bajo <rasky@develer.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-04 16:52:03 +00:00
Giovanni Bajo	8ce74b7d11	test: port a nil-check interface test from asm_test Change-Id: I69c1688506d1aeca655047acf35d1bff966fc01e Reviewed-on: https://go-review.googlesource.com/98442 Run-TryBot: Giovanni Bajo <rasky@develer.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-03-03 20:20:54 +00:00
Keith Randall	1dfa380e3d	internal/bytealg: move equal functions to bytealg Move bytes.Equal, runtime.memequal, and runtime.memequal_varlen to the bytealg package. Update #19792 Change-Id: Ic4175e952936016ea0bda6c7c3dbb33afdc8e4ac Reviewed-on: https://go-review.googlesource.com/98355 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-03 04:18:27 +00:00
Keith Randall	403ab0f221	internal/bytealg: move IndexByte asssembly to the new bytealg package Move the IndexByte function from the runtime to a new bytealg package. The new package will eventually hold all the optimized assembly for groveling through byte slices and strings. It seems a better home for this code than randomly keeping it in runtime. Once this is in, the next step is to move the other functions (Compare, Equal, ...). Update #19792 This change seems complicated enough that we might just declare "not worth it" and abandon. Opinions welcome. The core assembly is all unchanged, except minor modifications where the code reads cpu feature bits. The wrapper functions have been cleaned up as they are now actually checked by vet. Change-Id: I9fa75bee5d85db3a65b3fd3b7997e60367523796 Reviewed-on: https://go-review.googlesource.com/98016 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-02 22:46:15 +00:00
David du Colombier	1c9297c365	cmd/compile: skip TestEmptyDwarfRanges on Plan 9 TestEmptyDwarfRanges has been added in CL 94816. This test is failing on Plan 9 because executables don't have a DWARF symbol table. Fixes #24226. Change-Id: Iff7e34b8c2703a2f19ee8087a4d64d0bb98496cd Reviewed-on: https://go-review.googlesource.com/98275 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-02 21:23:07 +00:00
Hana Kim	d3562c9db9	internal/trace: Revert "remove backlinks from span/task end to start" This reverts commit `16398894dc`. This broke TestUserTaskSpan test. Change-Id: If5ff8bdfe84e8cb30787b03ead87205ece3d5601 Reviewed-on: https://go-review.googlesource.com/98235 Reviewed-by: Heschi Kreinick <heschi@google.com>	2018-03-02 20:32:08 +00:00
Hana Kim	16398894dc	internal/trace: remove backlinks from span/task end to start Even though undocumented, the assumption is the Event's link field points to the following event in the future. The new span/task event processing breaks the assumption. Change-Id: I4ce2f30c67c4f525ec0a121a7e43d8bdd2ec3f77 Reviewed-on: https://go-review.googlesource.com/96395 Reviewed-by: Heschi Kreinick <heschi@google.com>	2018-03-02 20:15:57 +00:00
Michael Fraenkel	5b071bfa88	cmd/compile: convert type during finishcompare When recursively calling walkexpr, r.Type is still the untyped value. It then sometimes recursively calls finishcompare, which complains that you can't compare the resulting expression to that untyped value. Updates #23834. Change-Id: I6b7acd3970ceaff8da9216bfa0ae24aca5dee828 Reviewed-on: https://go-review.googlesource.com/97856 Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2018-03-02 19:48:23 +00:00
Than McIntosh	9b95611e38	cmd/compile: add DWARF register mappings for ARM64. Add DWARF register mappings for ARM64, so that that arch will become usable with "-dwarflocationlists". [NB: I've plugged in a set of numbers from the doc, but this will require additional manual testing.] Change-Id: Id9aa63857bc8b4f5c825f49274101cf372e9e856 Reviewed-on: https://go-review.googlesource.com/82515 Reviewed-by: Heschi Kreinick <heschi@google.com>	2018-03-02 19:40:29 +00:00
Alessandro Arzilli	eca41af012	cmd/link: fix up debug_range for dsymutil (revert CL 72371) Dsymutil, an utility used on macOS when externally linking executables, does not support base address selector entries in debug_ranges. CL 73271 worked around this problem by removing base address selectors and emitting CU-relative relocations for each list entry. This commit, as an optimization, reintroduces the base address selectors and changes the linker to remove them again, but only when it knows that it will have to invoke the external linker on macOS. Compilecmp comparing master with a branch that has scope tracking always enabled: completed 15 of 15, estimated time remaining 0s (eta 2:43PM) name old time/op new time/op delta Template 272ms ± 8% 257ms ± 5% -5.33% (p=0.000 n=15+14) Unicode 124ms ± 7% 122ms ± 5% ~ (p=0.210 n=14+14) GoTypes 873ms ± 3% 870ms ± 5% ~ (p=0.856 n=15+13) Compiler 4.49s ± 2% 4.49s ± 5% ~ (p=0.982 n=14+14) SSA 11.8s ± 4% 11.8s ± 3% ~ (p=0.653 n=15+15) Flate 163ms ± 6% 164ms ± 9% ~ (p=0.914 n=14+15) GoParser 203ms ± 6% 202ms ±10% ~ (p=0.571 n=14+14) Reflect 547ms ± 7% 542ms ± 4% ~ (p=0.914 n=15+14) Tar 244ms ± 7% 237ms ± 3% -2.80% (p=0.002 n=14+13) XML 289ms ± 6% 289ms ± 5% ~ (p=0.839 n=14+14) [Geo mean] 537ms 531ms -1.10% name old user-time/op new user-time/op delta Template 360ms ± 4% 341ms ± 7% -5.16% (p=0.000 n=14+14) Unicode 189ms ±11% 190ms ± 8% ~ (p=0.844 n=15+15) GoTypes 1.13s ± 4% 1.14s ± 7% ~ (p=0.582 n=15+14) Compiler 5.34s ± 2% 5.40s ± 4% +1.19% (p=0.036 n=11+13) SSA 14.7s ± 2% 14.7s ± 3% ~ (p=0.602 n=15+15) Flate 211ms ± 7% 214ms ± 8% ~ (p=0.252 n=14+14) GoParser 267ms ±12% 266ms ± 2% ~ (p=0.837 n=15+11) Reflect 706ms ± 4% 701ms ± 3% ~ (p=0.213 n=14+12) Tar 331ms ± 9% 320ms ± 5% -3.30% (p=0.025 n=15+14) XML 378ms ± 4% 373ms ± 6% ~ (p=0.253 n=14+15) [Geo mean] 704ms 700ms -0.58% name old alloc/op new alloc/op delta Template 38.0MB ± 0% 38.4MB ± 0% +1.12% (p=0.000 n=15+15) Unicode 28.8MB ± 0% 28.8MB ± 0% +0.17% (p=0.000 n=15+15) GoTypes 112MB ± 0% 114MB ± 0% +1.47% (p=0.000 n=15+15) Compiler 465MB ± 0% 473MB ± 0% +1.71% (p=0.000 n=15+15) SSA 1.48GB ± 0% 1.53GB ± 0% +3.07% (p=0.000 n=15+15) Flate 24.3MB ± 0% 24.7MB ± 0% +1.67% (p=0.000 n=15+15) GoParser 30.7MB ± 0% 31.0MB ± 0% +1.15% (p=0.000 n=12+15) Reflect 76.3MB ± 0% 77.1MB ± 0% +0.97% (p=0.000 n=15+15) Tar 39.2MB ± 0% 39.6MB ± 0% +0.91% (p=0.000 n=15+15) XML 41.5MB ± 0% 42.0MB ± 0% +1.29% (p=0.000 n=15+15) [Geo mean] 77.5MB 78.6MB +1.35% name old allocs/op new allocs/op delta Template 385k ± 0% 387k ± 0% +0.51% (p=0.000 n=15+15) Unicode 342k ± 0% 343k ± 0% +0.10% (p=0.000 n=14+15) GoTypes 1.19M ± 0% 1.19M ± 0% +0.62% (p=0.000 n=15+15) Compiler 4.51M ± 0% 4.54M ± 0% +0.50% (p=0.000 n=14+15) SSA 12.2M ± 0% 12.4M ± 0% +1.12% (p=0.000 n=14+15) Flate 234k ± 0% 236k ± 0% +0.60% (p=0.000 n=15+15) GoParser 318k ± 0% 320k ± 0% +0.60% (p=0.000 n=15+15) Reflect 974k ± 0% 977k ± 0% +0.27% (p=0.000 n=15+15) Tar 395k ± 0% 397k ± 0% +0.37% (p=0.000 n=14+15) XML 404k ± 0% 407k ± 0% +0.53% (p=0.000 n=15+15) [Geo mean] 794k 798k +0.52% name old text-bytes new text-bytes delta HelloSize 680kB ± 0% 680kB ± 0% ~ (all equal) name old data-bytes new data-bytes delta HelloSize 9.62kB ± 0% 9.62kB ± 0% ~ (all equal) name old bss-bytes new bss-bytes delta HelloSize 125kB ± 0% 125kB ± 0% ~ (all equal) name old exe-bytes new exe-bytes delta HelloSize 1.11MB ± 0% 1.13MB ± 0% +1.85% (p=0.000 n=15+15) Change-Id: I61c98ba0340cb798034b2bb55e3ab3a58ac1cf23 Reviewed-on: https://go-review.googlesource.com/98075 Reviewed-by: Heschi Kreinick <heschi@google.com>	2018-03-02 19:33:44 +00:00
Heschi Kreinick	9dc351beba	cmd/compile/internal/ssa: batch up all zero-width instructions When generating location lists, batch up changes for all zero-width instructions, not just phis. This prevents the creation of location list entries that don't actually cover any instructions. This isn't perfect because of the caveats in the prior CL (Copy is zero-width sometimes) but in practice this seems to fix all of the empty lists in std. Change-Id: Ice4a9ade36b6b24ca111d1494c414eec96e5af25 Reviewed-on: https://go-review.googlesource.com/97958 Run-TryBot: Heschi Kreinick <heschi@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2018-03-02 18:55:56 +00:00

1 2 3 4 5 ...

11738 Commits