Ilya Tocar
f31492ffe7
bytes,strings: use IndexByte more often in Index on AMD64
IndexByte+compare is faster than indexShortStr in the good case, when
the first byte is rare, but is more costly in bad cases.
Start with IndexByte and switch to indexShortStr if we encounter
false positives more often than once per 8 bytes.
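The strategy described above can be sketched in pure Go. This is a minimal illustration, not the assembly-backed code from the CL: indexAdaptive and index are hypothetical names, strings.Index stands in for indexShortStr, and the slack of 16 in the threshold (tolerating a few early misses) is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// indexAdaptive is a hypothetical illustration. It scans for the first byte
// of sep with strings.IndexByte and verifies each candidate position. Once
// false positives exceed roughly one per 8 bytes scanned, it hands the rest
// of the search to fallback.
func indexAdaptive(s, sep string, fallback func(s, sep string) int) int {
	n := len(sep)
	switch {
	case n == 0:
		return 0
	case n > len(s):
		return -1
	}
	c := sep[0]
	last := len(s) - n // last offset at which sep can still fit
	i, fails := 0, 0
	for i <= last {
		if s[i] != c {
			// Skip ahead to the next occurrence of the first byte.
			o := strings.IndexByte(s[i:last+1], c)
			if o < 0 {
				return -1
			}
			i += o
		}
		if s[i:i+n] == sep {
			return i
		}
		fails++
		i++
		// Assumed threshold: more than one false positive per 8 bytes,
		// with a slack of 16 bytes so a few early misses are tolerated.
		if fails > (i+16)/8 {
			if r := fallback(s[i:], sep); r >= 0 {
				return r + i
			}
			return -1
		}
	}
	return -1
}

// index wires in strings.Index as a stand-in for indexShortStr.
func index(s, sep string) int {
	return indexAdaptive(s, sep, strings.Index)
}

func main() {
	fmt.Println(index("hello world", "wor"))
	// A haystack full of false positives triggers the fallback path.
	fmt.Println(index(strings.Repeat("a", 40)+"b", "ab"))
}
```

With a rare first byte the loop spends nearly all its time inside IndexByte; with a common first byte the failure counter trips quickly and the fallback takes over.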
Benchmark changes for package bytes:
name             old time/op  new time/op  delta
IndexRune/4K-8    416ns ± 0%    86ns ± 0%  -79.24%  (p=0.000 n=10+10)
IndexRune/4M-8    413µs ± 0%   100µs ± 1%  -75.88%  (p=0.000 n=10+10)
IndexRune/64M-8  6.73ms ± 0%  2.86ms ± 1%  -57.49%  (p=0.000 n=10+10)
Index/10-8       8.45ns ± 0%  8.96ns ± 0%   +6.04%  (p=0.000 n=9+10)
Index/32-8       9.64ns ± 0%  9.51ns ± 0%   -1.30%  (p=0.000 n=8+9)
Index/4K-8       2.11µs ± 0%  2.12µs ± 0%   +0.26%  (p=0.000 n=10+10)
Index/4M-8       3.60ms ± 5%  3.59ms ± 7%        ~  (p=0.497 n=9+10)
Index/64M-8      57.1ms ± 3%  58.7ms ± 5%        ~  (p=0.113 n=9+10)
IndexEasy/10-8   7.10ns ± 1%  7.71ns ± 1%   +8.60%  (p=0.000 n=10+10)
IndexEasy/32-8   9.29ns ± 1%  9.22ns ± 0%   -0.75%  (p=0.000 n=9+10)
IndexEasy/4K-8   1.06µs ± 0%  0.08µs ± 0%  -92.18%  (p=0.000 n=10+10)
IndexEasy/4M-8   1.07ms ± 0%  0.10ms ± 1%  -90.74%  (p=0.000 n=9+10)
IndexEasy/64M-8  17.3ms ± 0%   2.8ms ± 1%  -83.76%  (p=0.000 n=10+9)
name             old speed      new speed       delta
IndexRune/4K-8   9.84GB/s ± 0%  47.42GB/s ± 0%   +381.85%  (p=0.000 n=8+10)
IndexRune/4M-8   10.1GB/s ± 0%   42.1GB/s ± 1%   +314.56%  (p=0.000 n=10+10)
IndexRune/64M-8  10.0GB/s ± 0%   23.4GB/s ± 1%   +135.25%  (p=0.000 n=10+10)
Index/10-8       1.18GB/s ± 0%   1.12GB/s ± 0%     -5.67%  (p=0.000 n=10+9)
Index/32-8       3.32GB/s ± 0%   3.36GB/s ± 0%     +1.27%  (p=0.000 n=10+9)
Index/4K-8       1.94GB/s ± 0%   1.93GB/s ± 0%     -0.25%  (p=0.000 n=10+9)
Index/4M-8       1.17GB/s ± 5%   1.17GB/s ± 7%          ~  (p=0.497 n=9+10)
Index/64M-8      1.17GB/s ± 3%   1.15GB/s ± 6%          ~  (p=0.113 n=9+10)
IndexEasy/10-8   1.41GB/s ± 1%   1.30GB/s ± 1%     -7.90%  (p=0.000 n=10+10)
IndexEasy/32-8   3.45GB/s ± 1%   3.47GB/s ± 0%     +0.73%  (p=0.000 n=9+10)
IndexEasy/4K-8   3.84GB/s ± 0%  49.16GB/s ± 0%  +1178.78%  (p=0.000 n=9+10)
IndexEasy/4M-8   3.91GB/s ± 0%  42.19GB/s ± 1%   +980.37%  (p=0.000 n=9+10)
IndexEasy/64M-8  3.88GB/s ± 0%  23.91GB/s ± 1%   +515.76%  (p=0.000 n=10+9)
No significant changes in strings.
In regexp I see:
name               old speed      new speed      delta
Match/Easy0/32-8    536MB/s ± 1%   540MB/s ± 1%    +0.75%  (p=0.001 n=9+10)
Match/Easy0/1K-8   1.62GB/s ± 0%  4.42GB/s ± 1%  +172.48%  (p=0.000 n=10+10)
Match/Easy0/32K-8  1.87GB/s ± 0%  9.07GB/s ± 1%  +384.24%  (p=0.000 n=7+10)
Match/Easy0/1M-8   1.90GB/s ± 0%  4.83GB/s ± 0%  +154.56%  (p=0.000 n=8+10)
Match/Easy0/32M-8  1.90GB/s ± 0%  4.53GB/s ± 0%  +138.62%  (p=0.000 n=7+10)
Compared to 1.7:
name               old time/op  new time/op  delta
Match/Easy0/32-8   59.5ns ± 0%  59.2ns ± 1%  -0.45%  (p=0.008 n=9+10)
Match/Easy0/1K-8    226ns ± 1%   231ns ± 1%  +2.30%  (p=0.000 n=10+10)
Match/Easy0/32K-8  3.73µs ± 2%  3.61µs ± 1%  -3.12%  (p=0.000 n=10+10)
Match/Easy0/1M-8    206µs ± 1%   217µs ± 0%  +5.34%  (p=0.000 n=10+10)
Match/Easy0/32M-8  7.03ms ± 1%  7.40ms ± 0%  +5.23%  (p=0.000 n=10+10)
Fixes #17456
Change-Id: I38b2fabcaed7119cc4bf37007ba7bfe7504c8f9f
Reviewed-on: https://go-review.googlesource.com/31690
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Keith Randall <khr@golang.org>
2016-11-01 18:30:52 +00:00
Ilya Tocar
0cff219c12
strings: use AVX2 for Index if available
IndexHard4-4 1.50ms ± 2% 0.71ms ± 0% -52.36% (p=0.000 n=20+19)
This also fixes a bug that caused a string of length 16 to use
two 8-byte comparisons instead of one 16-byte comparison, and adds
a test for cases where partial_match fails.
Change-Id: I1ee8fc4e068bb36c95c45de78f067c822c0d9df0
Reviewed-on: https://go-review.googlesource.com/22551
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-09-07 10:43:13 +00:00
Hiroshi Ioka
8737dac1f2
strings: make IndexRune faster
Reimplement IndexRune using Index, which is well optimized, to get a
performance gain.
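A minimal sketch of the idea, assuming a hypothetical helper named indexRune built on the public strings API (the CL's actual diff is not shown here): ASCII runes go straight to IndexByte, and any other valid rune is encoded as a string and passed to Index. Rejecting invalid runes up front is a choice made for this sketch, since string(r) would otherwise turn them into U+FFFD.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// indexRune is a hypothetical illustration of implementing a rune search on
// top of the optimized substring search. It returns the byte offset of the
// first occurrence of r in s, or -1.
func indexRune(s string, r rune) int {
	switch {
	case r >= 0 && r < utf8.RuneSelf:
		// Single-byte rune: a plain byte search is enough.
		return strings.IndexByte(s, byte(r))
	case !utf8.ValidRune(r):
		// Sketch-only choice: do not search for runes that cannot be encoded.
		return -1
	default:
		// Multi-byte rune: search for its UTF-8 encoding as a substring.
		return strings.Index(s, string(r))
	}
}

func main() {
	fmt.Println(indexRune("chicken", 'k'))
	fmt.Println(indexRune("日本語", '語'))
	fmt.Println(indexRune("abc", 'x'))
}
```

Delegating to Index means the rune search inherits whatever platform-specific fast paths Index has, which is consistent with the larger gain on the long-string benchmark.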
name                   old time/op  new time/op  delta
IndexRune-4            30.2ns ± 1%  28.3ns ± 1%   -6.22%  (p=0.000 n=20+19)
IndexRuneLongString-4   156ns ± 1%    49ns ± 1%  -68.72%  (p=0.000 n=19+19)
IndexRuneFastPath-4    10.6ns ± 2%  10.0ns ± 1%   -6.30%  (p=0.000 n=18+18)
Change-Id: Ie663b8f7860ca51892dd4be182fca3caa5f8ae61
Reviewed-on: https://go-review.googlesource.com/28546
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-09-07 01:03:10 +00:00
Ilya Tocar
429bbf3312
strings: fix and reenable amd64 Index for 17-31 byte strings
Fixes #15689
Change-Id: I56d0103738cc35cd5bc5e77a0e0341c0dd55530e
Reviewed-on: https://go-review.googlesource.com/23440
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
2016-05-27 22:57:32 +00:00
Keith Randall
0bc14f57ec
strings: fix Contains on amd64
The 17-31 byte code is broken. Disabled it.
Added a bunch of tests to at least cover the cases
in indexShortStr. I'll channel Brad and wonder why
this CL ever got in without any tests.
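The kind of coverage described above can be sketched as a differential check: compare strings.Index against a naive reference for every separator length across the short-string range and a little beyond. This is an illustration only; naiveIndex, checkIndex, and the chosen haystacks are hypothetical and are not the tests added in this CL.

```go
package main

import (
	"fmt"
	"strings"
)

// naiveIndex is a reference implementation: try every offset in turn.
func naiveIndex(s, sep string) int {
	for i := 0; i+len(sep) <= len(s); i++ {
		if s[i:i+len(sep)] == sep {
			return i
		}
	}
	return -1
}

// checkIndex compares strings.Index with naiveIndex for every separator
// length from 1 to maxLen. The separator is a run of 'x' ending in 'y', so
// haystacks made of 'x' produce many partial matches.
func checkIndex(maxLen int) error {
	for n := 1; n <= maxLen; n++ {
		sep := strings.Repeat("x", n-1) + "y"
		cases := []string{
			strings.Repeat("x", 2*n) + "y" + "tail", // match after partial matches
			strings.Repeat("x", 2*n),                // partial matches only
			sep[:n-1],                               // haystack shorter than sep
			"prefix" + sep,                          // match at the very end
		}
		for _, s := range cases {
			got, want := strings.Index(s, sep), naiveIndex(s, sep)
			if got != want {
				return fmt.Errorf("Index(%q, %q) = %d, want %d", s, sep, got, want)
			}
		}
	}
	return nil
}

func main() {
	// 40 covers every length under 32 bytes plus a few longer separators.
	if err := checkIndex(40); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("ok")
}
```

Looping over every length, rather than a handful of representative ones, is what would catch a defect confined to a single size class such as 17-31 bytes.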
Fixes #15679
Change-Id: I84a7b283a74107db865b9586c955dcf5f2d60161
Reviewed-on: https://go-review.googlesource.com/23106
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-15 05:21:03 +00:00
Brad Fitzpatrick
519474451a
all: make copyright headers consistent with one space after period
This is a subset of https://golang.org/cl/20022 with only the copyright
header lines, so the next CL will be smaller and more reviewable.
Go policy has been single space after periods in comments for some time.
The copyright header template at:
https://golang.org/doc/contribute.html#copyright
also uses a single space.
Make them all consistent.
Change-Id: Icc26c6b8495c3820da6b171ca96a74701b4a01b0
Reviewed-on: https://go-review.googlesource.com/20111
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-03-01 23:34:33 +00:00
Ilya Tocar
95333aea53
strings: add asm version of Index() for short strings on amd64
Currently we have a special case for 1-byte strings.
This extends it to strings shorter than 32 bytes on amd64.
Results (broadwell):
name                 old time/op  new time/op  delta
IndexRune-4          57.4ns ± 0%  57.5ns ± 0%   +0.10%  (p=0.000 n=20+19)
IndexRuneFastPath-4  20.4ns ± 0%  20.4ns ± 0%        ~  (all samples are equal)
Index-4              21.0ns ± 0%  21.8ns ± 0%   +3.81%  (p=0.000 n=20+20)
LastIndex-4          7.07ns ± 1%  6.98ns ± 0%   -1.21%  (p=0.000 n=20+16)
IndexByte-4          18.3ns ± 0%  18.3ns ± 0%        ~  (all samples are equal)
IndexHard1-4         1.46ms ± 0%  0.39ms ± 0%  -73.06%  (p=0.000 n=16+16)
IndexHard2-4         1.46ms ± 0%  0.30ms ± 0%  -79.55%  (p=0.000 n=18+18)
IndexHard3-4         1.46ms ± 0%  0.66ms ± 0%  -54.68%  (p=0.000 n=19+19)
LastIndexHard1-4     1.46ms ± 0%  1.46ms ± 0%   -0.01%  (p=0.036 n=18+20)
LastIndexHard2-4     1.46ms ± 0%  1.46ms ± 0%        ~  (p=0.588 n=19+19)
LastIndexHard3-4     1.46ms ± 0%  1.46ms ± 0%        ~  (p=0.283 n=17+20)
IndexTorture-4       11.1µs ± 0%  11.1µs ± 0%   +0.01%  (p=0.000 n=18+17)
Change-Id: I892781549f558f698be4e41f9f568e3d0611efb5
Reviewed-on: https://go-review.googlesource.com/16430
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
2015-11-03 16:04:28 +00:00