5 Commits

Author SHA1 Message Date
Michael Munday cfd89164bb all: make copyright headers consistent with one space after period
Continuation of CL 20111.

Change-Id: Ie2f62237e6ec316989c021de9b267cc9d6ee6676
Reviewed-on: https://go-review.googlesource.com/32830
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-11-04 20:46:25 +00:00
Ilya Tocar f31492ffe7 bytes,strings: use IndexByte more often in Index on AMD64
IndexByte+compare is faster than indexShortStr in the good case, when
the first byte is rare, but is more costly in bad cases.
Start with IndexByte and switch to indexShortStr if we encounter
false positives more often than once per 8 bytes.
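
The heuristic described above can be sketched roughly as follows. This is a simplified illustration, not the standard library's code: indexAdaptive and bruteForceIndex are hypothetical names, bruteForceIndex stands in for the assembly routine indexShortStr, and the slack of 16 positions before the check starts to bite is an assumption.

```go
package main

import (
	"bytes"
	"fmt"
)

// bruteForceIndex is a hypothetical stand-in for indexShortStr.
// It compares sep against every candidate position in turn.
func bruteForceIndex(s, sep []byte) int {
	for i := 0; i+len(sep) <= len(s); i++ {
		if bytes.Equal(s[i:i+len(sep)], sep) {
			return i
		}
	}
	return -1
}

// indexAdaptive returns the index of the first occurrence of sep in s,
// or -1 if sep is not present. It starts with IndexByte plus a compare
// and gives up on that strategy once false positives become frequent.
func indexAdaptive(s, sep []byte) int {
	n := len(sep)
	if n == 0 {
		return 0
	}
	if n > len(s) {
		return -1
	}
	c := sep[0]
	t := s[:len(s)-n+1] // every position where a match can start
	fails := 0
	for i := 0; i < len(t); {
		if t[i] != c {
			// Good case: the first byte is rare, so IndexByte skips far ahead.
			o := bytes.IndexByte(t[i:], c)
			if o < 0 {
				return -1
			}
			i += o
		}
		if bytes.Equal(s[i:i+n], sep) {
			return i
		}
		// The first byte matched but the rest did not: a false positive.
		fails++
		i++
		// More than about one false positive per 8 bytes scanned:
		// switch to the fallback search for the rest of the input.
		// The +16 slack (an assumption) tolerates a few early misses.
		if fails > (i+16)/8 {
			r := bruteForceIndex(s[i:], sep)
			if r < 0 {
				return -1
			}
			return r + i
		}
	}
	return -1
}

func main() {
	fmt.Println(indexAdaptive([]byte("hello world"), []byte("world"))) // prints 6
}
```

With an input such as a long run of "a" followed by "b" and the separator "ab", every position is a false positive, so the sketch leaves the IndexByte loop after a few iterations and finishes in the fallback.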

Benchmark changes for package bytes:

name                           old time/op     new time/op     delta
IndexRune/4K-8                    416ns ± 0%       86ns ± 0%    -79.24%        (p=0.000 n=10+10)
IndexRune/4M-8                    413µs ± 0%      100µs ± 1%    -75.88%        (p=0.000 n=10+10)
IndexRune/64M-8                  6.73ms ± 0%     2.86ms ± 1%    -57.49%        (p=0.000 n=10+10)
Index/10-8                       8.45ns ± 0%     8.96ns ± 0%     +6.04%         (p=0.000 n=9+10)
Index/32-8                       9.64ns ± 0%     9.51ns ± 0%     -1.30%          (p=0.000 n=8+9)
Index/4K-8                       2.11µs ± 0%     2.12µs ± 0%     +0.26%        (p=0.000 n=10+10)
Index/4M-8                       3.60ms ± 5%     3.59ms ± 7%       ~            (p=0.497 n=9+10)
Index/64M-8                      57.1ms ± 3%     58.7ms ± 5%       ~            (p=0.113 n=9+10)
IndexEasy/10-8                   7.10ns ± 1%     7.71ns ± 1%     +8.60%        (p=0.000 n=10+10)
IndexEasy/32-8                   9.29ns ± 1%     9.22ns ± 0%     -0.75%         (p=0.000 n=9+10)
IndexEasy/4K-8                   1.06µs ± 0%     0.08µs ± 0%    -92.18%        (p=0.000 n=10+10)
IndexEasy/4M-8                   1.07ms ± 0%     0.10ms ± 1%    -90.74%         (p=0.000 n=9+10)
IndexEasy/64M-8                  17.3ms ± 0%      2.8ms ± 1%    -83.76%         (p=0.000 n=10+9)

name                           old speed      new speed       delta
IndexRune/4K-8                 9.84GB/s ± 0%  47.42GB/s ± 0%   +381.85%         (p=0.000 n=8+10)
IndexRune/4M-8                 10.1GB/s ± 0%   42.1GB/s ± 1%   +314.56%        (p=0.000 n=10+10)
IndexRune/64M-8                10.0GB/s ± 0%   23.4GB/s ± 1%   +135.25%        (p=0.000 n=10+10)
Index/10-8                     1.18GB/s ± 0%   1.12GB/s ± 0%     -5.67%         (p=0.000 n=10+9)
Index/32-8                     3.32GB/s ± 0%   3.36GB/s ± 0%     +1.27%         (p=0.000 n=10+9)
Index/4K-8                     1.94GB/s ± 0%   1.93GB/s ± 0%     -0.25%         (p=0.000 n=10+9)
Index/4M-8                     1.17GB/s ± 5%   1.17GB/s ± 7%       ~            (p=0.497 n=9+10)
Index/64M-8                    1.17GB/s ± 3%   1.15GB/s ± 6%       ~            (p=0.113 n=9+10)
IndexEasy/10-8                 1.41GB/s ± 1%   1.30GB/s ± 1%     -7.90%        (p=0.000 n=10+10)
IndexEasy/32-8                 3.45GB/s ± 1%   3.47GB/s ± 0%     +0.73%         (p=0.000 n=9+10)
IndexEasy/4K-8                 3.84GB/s ± 0%  49.16GB/s ± 0%  +1178.78%         (p=0.000 n=9+10)
IndexEasy/4M-8                 3.91GB/s ± 0%  42.19GB/s ± 1%   +980.37%         (p=0.000 n=9+10)
IndexEasy/64M-8                3.88GB/s ± 0%  23.91GB/s ± 1%   +515.76%         (p=0.000 n=10+9)

No significant changes in strings.

In regexp I see:

Match/Easy0/32-8                 536MB/s ± 1%   540MB/s ± 1%    +0.75%         (p=0.001 n=9+10)
Match/Easy0/1K-8                1.62GB/s ± 0%  4.42GB/s ± 1%  +172.48%        (p=0.000 n=10+10)
Match/Easy0/32K-8               1.87GB/s ± 0%  9.07GB/s ± 1%  +384.24%         (p=0.000 n=7+10)
Match/Easy0/1M-8                1.90GB/s ± 0%  4.83GB/s ± 0%  +154.56%         (p=0.000 n=8+10)
Match/Easy0/32M-8               1.90GB/s ± 0%  4.53GB/s ± 0%  +138.62%         (p=0.000 n=7+10)

Compared to 1.7:

Match/Easy0/32-8                  59.5ns ± 0%    59.2ns ± 1%   -0.45%         (p=0.008 n=9+10)
Match/Easy0/1K-8                   226ns ± 1%     231ns ± 1%   +2.30%        (p=0.000 n=10+10)
Match/Easy0/32K-8                 3.73µs ± 2%    3.61µs ± 1%   -3.12%        (p=0.000 n=10+10)
Match/Easy0/1M-8                   206µs ± 1%     217µs ± 0%   +5.34%        (p=0.000 n=10+10)
Match/Easy0/32M-8                 7.03ms ± 1%    7.40ms ± 0%   +5.23%        (p=0.000 n=10+10)

Fixes #17456

Change-Id: I38b2fabcaed7119cc4bf37007ba7bfe7504c8f9f
Reviewed-on: https://go-review.googlesource.com/31690
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Keith Randall <khr@golang.org>
2016-11-01 18:30:52 +00:00
Ilya Tocar 0cff219c12 strings: use AVX2 for Index if available
IndexHard4-4      1.50ms ± 2%  0.71ms ± 0%  -52.36%  (p=0.000 n=20+19)

This also fixes a bug that caused a string of length 16 to use
two 8-byte comparisons instead of one 16-byte comparison, and adds
a test for cases when partial_match fails.

Change-Id: I1ee8fc4e068bb36c95c45de78f067c822c0d9df0
Reviewed-on: https://go-review.googlesource.com/22551
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-09-07 10:43:13 +00:00
Hiroshi Ioka e10286aeda bytes: make IndexRune faster
Reimplement IndexRune using IndexByte and Index, which are well
optimized, to gain performance.
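
A minimal sketch of that idea, assuming the split is made on whether the rune fits in a single byte: ASCII runes go to bytes.IndexByte, and anything else is UTF-8 encoded and handed to bytes.Index. The name indexRune is hypothetical, and the sketch leaves out any special handling of invalid runes, which utf8.EncodeRune encodes as U+FFFD.

```go
package main

import (
	"bytes"
	"fmt"
	"unicode/utf8"
)

// indexRune returns the byte index of the first occurrence of r in s,
// or -1 if r is not present. It delegates to the optimized byte and
// subslice searches instead of decoding s rune by rune.
func indexRune(s []byte, r rune) int {
	if 0 <= r && r < utf8.RuneSelf {
		// Single-byte (ASCII) rune: a plain byte search is enough.
		return bytes.IndexByte(s, byte(r))
	}
	// Multi-byte rune: encode it once, then search for the encoded bytes.
	var buf [utf8.UTFMax]byte
	n := utf8.EncodeRune(buf[:], r)
	return bytes.Index(s, buf[:n])
}

func main() {
	fmt.Println(indexRune([]byte("chicken"), 'k')) // prints 4
	fmt.Println(indexRune([]byte("日本語"), '語'))    // prints 6
}
```

Searching for the encoded bytes is safe for valid UTF-8 input because the encoding of one rune never appears inside the encoding of another.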

name                  old time/op   new time/op     delta
IndexRune/10-4         53.2ns ± 1%     29.1ns ± 1%    -45.32%  (p=0.008 n=5+5)
IndexRune/32-4          191ns ± 1%       27ns ± 1%    -85.75%  (p=0.008 n=5+5)
IndexRune/4K-4         23.5µs ± 1%      1.0µs ± 1%    -95.77%  (p=0.008 n=5+5)
IndexRune/4M-4         23.8ms ± 0%      1.0ms ± 2%    -95.90%  (p=0.008 n=5+5)
IndexRune/64M-4         384ms ± 1%       15ms ± 1%    -95.98%  (p=0.008 n=5+5)
IndexRuneASCII/10-4    61.5ns ± 0%     10.3ns ± 4%    -83.17%  (p=0.008 n=5+5)
IndexRuneASCII/32-4     203ns ± 0%       11ns ± 5%    -94.68%  (p=0.008 n=5+5)
IndexRuneASCII/4K-4    23.4µs ± 0%      0.3µs ± 2%    -98.60%  (p=0.008 n=5+5)
IndexRuneASCII/4M-4    24.0ms ± 1%      0.3ms ± 1%    -98.60%  (p=0.008 n=5+5)
IndexRuneASCII/64M-4    386ms ± 2%        6ms ± 1%    -98.57%  (p=0.008 n=5+5)

name                  old speed     new speed       delta
IndexRune/10-4        188MB/s ± 1%    344MB/s ± 1%    +82.91%  (p=0.008 n=5+5)
IndexRune/32-4        167MB/s ± 0%   1175MB/s ± 1%   +603.52%  (p=0.008 n=5+5)
IndexRune/4K-4        174MB/s ± 1%   4117MB/s ± 1%  +2262.71%  (p=0.008 n=5+5)
IndexRune/4M-4        176MB/s ± 0%   4299MB/s ± 2%  +2340.46%  (p=0.008 n=5+5)
IndexRune/64M-4       175MB/s ± 1%   4354MB/s ± 1%  +2388.57%  (p=0.008 n=5+5)
IndexRuneASCII/10-4   163MB/s ± 0%    968MB/s ± 4%   +494.66%  (p=0.008 n=5+5)
IndexRuneASCII/32-4   157MB/s ± 0%   2974MB/s ± 4%  +1788.59%  (p=0.008 n=5+5)
IndexRuneASCII/4K-4   175MB/s ± 0%  12481MB/s ± 2%  +7027.71%  (p=0.008 n=5+5)
IndexRuneASCII/4M-4   175MB/s ± 1%  12510MB/s ± 1%  +7061.15%  (p=0.008 n=5+5)
IndexRuneASCII/64M-4  174MB/s ± 2%  12143MB/s ± 1%  +6881.70%  (p=0.008 n=5+5)

Change-Id: I0632eadb83937c2a9daa7f0ce79df1dee64f992e
Reviewed-on: https://go-review.googlesource.com/28537
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-09-06 23:32:57 +00:00
Ilya Tocar 44f1854c9d bytes: Use the same algorithm as strings for Index
name                     old time/op    new time/op      delta
IndexByte32-48             9.05ns ± 7%      9.59ns ±11%     +5.93%  (p=0.001 n=19+20)
IndexByte4K-48              118ns ± 4%       122ns ± 8%     +3.52%  (p=0.002 n=19+19)
IndexByte4M-48              172µs ±13%       188µs ±12%     +9.49%  (p=0.000 n=20+20)
IndexByte64M-48            8.00ms ±14%      8.05ms ±23%       ~     (p=0.799 n=20+20)
IndexBytePortable32-48     41.7ns ±15%      42.5ns ±12%       ~     (p=0.372 n=20+20)
IndexBytePortable4K-48     3.08µs ±16%      3.26µs ±10%     +5.77%  (p=0.018 n=20+20)
IndexBytePortable4M-48     3.12ms ±17%      3.20ms ±10%       ~     (p=0.157 n=20+20)
IndexBytePortable64M-48    54.0ms ±14%      55.3ms ±14%       ~     (p=0.640 n=20+20)
Index32-48                  230ns ±12%        46ns ± 6%    -79.87%  (p=0.000 n=20+19)
Index4K-48                 43.2µs ± 9%       3.2µs ±12%    -92.58%  (p=0.000 n=19+20)
Index4M-48                 44.4ms ± 7%       3.3ms ±13%    -92.59%  (p=0.000 n=19+20)
Index64M-48                 714ms ±10%        56ms ± 8%    -92.22%  (p=0.000 n=19+19)
IndexEasy32-48             52.7ns ±10%      31.0ns ±11%    -41.21%  (p=0.000 n=20+20)
IndexEasy4K-48              139ns ± 5%      1598ns ± 6%  +1046.37%  (p=0.000 n=19+19)
IndexEasy4M-48              179µs ± 8%      1674µs ±10%   +834.31%  (p=0.000 n=19+20)
IndexEasy64M-48            8.56ms ±10%     27.82ms ±16%   +225.14%  (p=0.000 n=19+20)

name                     old speed      new speed        delta
IndexByte32-48           3.52GB/s ± 7%    3.35GB/s ±11%     -4.99%  (p=0.001 n=20+20)
IndexByte4K-48           34.5GB/s ± 7%    33.2GB/s ±10%     -3.67%  (p=0.002 n=20+20)
IndexByte4M-48           24.6GB/s ±14%    22.4GB/s ±14%     -8.73%  (p=0.000 n=20+20)
IndexByte64M-48          8.42GB/s ±16%    8.42GB/s ±19%       ~     (p=0.799 n=20+20)
IndexBytePortable32-48    770MB/s ±13%     756MB/s ±11%       ~     (p=0.383 n=20+20)
IndexBytePortable4K-48   1.34GB/s ±14%    1.26GB/s ±10%     -5.76%  (p=0.018 n=20+20)
IndexBytePortable4M-48   1.35GB/s ±15%    1.31GB/s ±11%       ~     (p=0.157 n=20+20)
IndexBytePortable64M-48  1.25GB/s ±16%    1.22GB/s ±13%       ~     (p=0.640 n=20+20)
Index32-48                138MB/s ± 8%     687MB/s ± 8%   +398.57%  (p=0.000 n=19+20)
Index4K-48               94.9MB/s ± 9%  1280.5MB/s ±11%  +1249.11%  (p=0.000 n=19+20)
Index4M-48               94.6MB/s ± 7%  1278.5MB/s ±12%  +1250.99%  (p=0.000 n=19+20)
Index64M-48              94.2MB/s ±10%  1210.9MB/s ± 8%  +1185.04%  (p=0.000 n=19+19)
IndexEasy32-48            608MB/s ±10%    1035MB/s ±10%    +70.15%  (p=0.000 n=20+20)
IndexEasy4K-48           29.3GB/s ± 6%     2.6GB/s ± 6%    -91.24%  (p=0.000 n=19+19)
IndexEasy4M-48           23.3GB/s ±10%     2.5GB/s ± 9%    -89.23%  (p=0.000 n=20+20)
IndexEasy64M-48          7.86GB/s ±11%    2.42GB/s ±14%    -69.18%  (p=0.000 n=19+20)

Change-Id: Ia191f0a6ca80e113397d9ed98d25f195768b65bc
Reviewed-on: https://go-review.googlesource.com/22550
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-09-01 18:05:50 +00:00