go/src
Ben Shi 78ea9a7129 cmd/compile: optimize MOVBS/MOVBU/MOVHS/MOVHU on ARMv6 and ARMv7
On ARMv6 and ARMv7, each of MOVBS/MOVBU/MOVHS/MOVHU can be implemented
with a single instruction instead of a pair of left/right shifts.
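
As a rough illustration (not code from this change), the Go sketch below
models the two lowerings in portable code. The shift-pair functions mimic
the two-instruction sequence, and the conversion functions compute what a
single extend instruction would produce. The function names are
hypothetical, and the mapping to the ARMv6 extend instructions
(SXTB/UXTB/SXTH/UXTH) is an assumption, since the commit message does not
name the instructions it emits.

```go
package main

import "fmt"

// Shift-pair forms: model the old two-instruction lowering.
// Sign extension uses a left shift followed by an arithmetic right shift.
func signExtend8Shifts(x uint32) int32  { return int32(x<<24) >> 24 }
func signExtend16Shifts(x uint32) int32 { return int32(x<<16) >> 16 }

// Zero extension uses a left shift followed by a logical right shift.
func zeroExtend8Shifts(x uint32) uint32  { return x << 24 >> 24 }
func zeroExtend16Shifts(x uint32) uint32 { return x << 16 >> 16 }

// Direct forms: model the result of a single extend instruction
// (assumed here to correspond to SXTB, SXTH, UXTB, UXTH).
func signExtend8(x uint32) int32  { return int32(int8(x)) }
func signExtend16(x uint32) int32 { return int32(int16(x)) }
func zeroExtend8(x uint32) uint32  { return uint32(uint8(x)) }
func zeroExtend16(x uint32) uint32 { return uint32(uint16(x)) }

// equivalent reports whether both lowerings agree for the given input.
func equivalent(x uint32) bool {
	return signExtend8Shifts(x) == signExtend8(x) &&
		signExtend16Shifts(x) == signExtend16(x) &&
		zeroExtend8Shifts(x) == zeroExtend8(x) &&
		zeroExtend16Shifts(x) == zeroExtend16(x)
}

func main() {
	// A few boundary values around the 8-bit and 16-bit sign bits.
	inputs := []uint32{0x00, 0x7f, 0x80, 0xff, 0x7fff, 0x8000, 0xffff8001}
	for _, x := range inputs {
		fmt.Printf("%#08x: byte=%d half=%d same=%v\n",
			x, signExtend8(x), signExtend16(x), equivalent(x))
	}
}
```

Both forms give the same result for every input, so the change affects only
instruction count, not semantics.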

Benchmarks show a large improvement in a special case and a small
improvement overall.

1. A special case improves by about 29%.
name                     old time/op    new time/op    delta
TypePro-4                  3.81ms ± 1%    2.71ms ± 1%  -28.97%  (p=0.000 n=26+25)
The source code of this case can be found at
https://github.com/benshi001/ugo1/blob/master/typepromotion_test.go

2. The go1 benchmarks improve slightly overall, excluding noise.
name                     old time/op    new time/op    delta
BinaryTree17-4              42.1s ± 3%     42.1s ± 2%    ~     (p=0.883 n=28+30)
Fannkuch11-4                24.3s ± 4%     24.7s ± 7%  +1.64%  (p=0.026 n=30+30)
FmtFprintfEmpty-4           833ns ± 2%     835ns ± 2%    ~     (p=0.371 n=26+28)
FmtFprintfString-4         1.36µs ± 3%    1.35µs ± 1%    ~     (p=0.202 n=26+23)
FmtFprintfInt-4            1.42µs ± 3%    1.43µs ± 1%  +0.66%  (p=0.000 n=26+27)
FmtFprintfIntInt-4         2.10µs ± 1%    2.10µs ± 2%    ~     (p=0.104 n=25+26)
FmtFprintfPrefixedInt-4    2.37µs ± 2%    2.33µs ± 1%  -1.75%  (p=0.000 n=25+28)
FmtFprintfFloat-4          4.50µs ± 0%    4.37µs ± 1%  -2.81%  (p=0.000 n=23+25)
FmtManyArgs-4              8.08µs ± 0%    8.13µs ± 3%    ~     (p=0.160 n=23+26)
GobDecode-4                 102ms ± 4%     103ms ± 4%  +1.08%  (p=0.001 n=28+26)
GobEncode-4                96.0ms ± 2%    95.2ms ± 3%  -0.81%  (p=0.000 n=24+25)
Gzip-4                      4.17s ± 3%     4.11s ± 2%  -1.45%  (p=0.000 n=25+25)
Gunzip-4                    597ms ± 2%     594ms ± 2%  -0.57%  (p=0.000 n=24+26)
HTTPClientServer-4          708µs ± 4%     708µs ± 4%    ~     (p=0.852 n=28+28)
JSONEncode-4                241ms ± 1%     245ms ± 3%  +1.62%  (p=0.000 n=27+28)
JSONDecode-4                906ms ± 3%     889ms ± 3%  -1.85%  (p=0.000 n=23+24)
Mandelbrot200-4            41.8ms ± 1%    41.8ms ± 1%    ~     (p=0.929 n=25+24)
GoParse-4                  47.1ms ± 2%    45.3ms ± 4%  -3.80%  (p=0.000 n=28+24)
RegexpMatchEasy0_32-4      1.27µs ± 2%    1.28µs ± 1%  +0.77%  (p=0.000 n=26+28)
RegexpMatchEasy0_1K-4      8.08µs ± 9%    7.83µs ±10%  -3.10%  (p=0.012 n=26+26)
RegexpMatchEasy1_32-4      1.29µs ± 5%    1.29µs ± 2%    ~     (p=0.301 n=26+29)
RegexpMatchEasy1_1K-4      10.5µs ± 4%    10.3µs ± 5%  -1.95%  (p=0.003 n=26+26)
RegexpMatchMedium_32-4     1.94µs ± 1%    1.95µs ± 1%    ~     (p=0.251 n=24+27)
RegexpMatchMedium_1K-4      502µs ± 2%     502µs ± 2%    ~     (p=0.336 n=25+28)
RegexpMatchHard_32-4       26.7µs ± 1%    26.6µs ± 3%    ~     (p=0.454 n=27+26)
RegexpMatchHard_1K-4        801µs ± 3%     799µs ± 2%    ~     (p=0.097 n=24+26)
Revcomp-4                  73.5ms ± 5%    73.2ms ± 3%    ~     (p=0.240 n=26+26)
Template-4                  1.07s ± 2%     1.05s ± 1%  -2.39%  (p=0.000 n=26+24)
TimeParse-4                6.87µs ± 1%    6.85µs ± 1%    ~     (p=0.094 n=28+23)
TimeFormat-4               13.4µs ± 1%    13.4µs ± 1%    ~     (p=0.664 n=25+29)
[Geo mean]                  717µs          713µs       -0.54%

name                     old speed      new speed      delta
GobDecode-4              7.52MB/s ± 4%  7.44MB/s ± 4%  -1.10%  (p=0.001 n=28+26)
GobEncode-4              7.99MB/s ± 2%  8.06MB/s ± 3%  +0.81%  (p=0.000 n=24+25)
Gzip-4                   4.66MB/s ± 3%  4.72MB/s ± 2%  +1.43%  (p=0.000 n=25+25)
Gunzip-4                 32.5MB/s ± 2%  32.7MB/s ± 2%  +0.56%  (p=0.001 n=24+26)
JSONEncode-4             8.04MB/s ± 1%  7.92MB/s ± 3%  -1.59%  (p=0.000 n=27+28)
JSONDecode-4             2.14MB/s ± 3%  2.18MB/s ± 3%  +1.90%  (p=0.000 n=23+24)
GoParse-4                1.23MB/s ± 3%  1.28MB/s ± 4%  +4.23%  (p=0.000 n=30+24)
RegexpMatchEasy0_32-4    25.2MB/s ± 2%  25.0MB/s ± 1%  -0.76%  (p=0.000 n=26+28)
RegexpMatchEasy0_1K-4     127MB/s ± 8%   131MB/s ± 9%  +3.29%  (p=0.012 n=26+26)
RegexpMatchEasy1_32-4    24.8MB/s ± 5%  24.8MB/s ± 2%    ~     (p=0.339 n=26+29)
RegexpMatchEasy1_1K-4    97.9MB/s ± 4%  99.8MB/s ± 5%  +1.98%  (p=0.004 n=26+26)
RegexpMatchMedium_32-4    514kB/s ± 3%   515kB/s ± 3%    ~     (p=0.391 n=28+28)
RegexpMatchMedium_1K-4   2.04MB/s ± 2%  2.04MB/s ± 2%    ~     (p=0.517 n=25+28)
RegexpMatchHard_32-4     1.20MB/s ± 3%  1.20MB/s ± 3%    ~     (p=0.203 n=28+28)
RegexpMatchHard_1K-4     1.28MB/s ± 3%  1.28MB/s ± 2%    ~     (p=0.499 n=24+26)
Revcomp-4                34.6MB/s ± 4%  34.7MB/s ± 3%    ~     (p=0.245 n=26+26)
Template-4               1.81MB/s ± 2%  1.85MB/s ± 3%  +2.30%  (p=0.000 n=26+25)
[Geo mean]               6.82MB/s       6.88MB/s       +0.84%

Fixes #20653

Change-Id: Ief0d6e726e517e51ae511325b21ee72598e759ff
Reviewed-on: https://go-review.googlesource.com/71992
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-10-26 12:34:02 +00:00
archive archive/zip: restrict UTF-8 detection for comment and name fields 2017-10-25 22:16:46 +00:00
bufio
builtin
bytes bytes: add examples of Equal and IndexByte 2017-10-16 03:34:28 +00:00
cmd cmd/compile: optimize MOVBS/MOVBU/MOVHS/MOVHU on ARMv6 and ARMv7 2017-10-26 12:34:02 +00:00
compress compress/bzip2: fix checksum mismatch on empty reads 2017-09-25 23:05:58 +00:00
container container/ring: add examples for various Ring functions 2017-10-25 13:34:06 +00:00
context context: fix references to "d" in WithDeadline docs 2017-09-21 03:00:51 +00:00
crypto crypto/elliptic: don't unmarshal invalid encoded points 2017-10-15 02:24:19 +00:00
database/sql database/sql: scan into *time.Time without reflection 2017-10-25 19:29:16 +00:00
debug debug/dwarf: clarify StructField.ByteSize doc 2017-10-18 21:45:30 +00:00
encoding encoding/csv: forbid certain Comma and Comment runes 2017-10-25 01:43:46 +00:00
errors
expvar expvar: make (*Map).Init clear existing keys 2017-09-11 21:31:51 +00:00
flag flag: simplify switch-case in isZeroValue 2017-10-17 20:23:14 +00:00
fmt fmt: clarify wording of * flag 2017-10-15 06:03:34 +00:00
go go/types: improved documentation for WriteExpr and ExprString 2017-10-23 18:10:06 +00:00
hash
html all: revert "all: prefer strings.IndexByte over strings.Index" 2017-10-05 23:19:10 +00:00
image image/draw: reduce drawPaletted allocations for special source cases 2017-10-25 23:43:27 +00:00
index/suffixarray
internal os: add deadline methods for File type 2017-10-25 18:27:06 +00:00
io io: flatten MultiWriter writers 2017-10-25 21:48:50 +00:00
log log: Remove unnecessary else 2017-10-25 05:02:37 +00:00
math math/big: implement Lehmer's GCD algorithm 2017-10-24 22:42:43 +00:00
mime mime/multipart: permit empty file name 2017-10-24 20:21:03 +00:00
net net/smtp: added Noop to Client 2017-10-25 20:13:18 +00:00
os os: add deadline methods for File type 2017-10-25 18:27:06 +00:00
path all: revert "all: prefer strings.LastIndexByte over strings.LastIndex" 2017-10-05 23:19:42 +00:00
plugin runtime, plugin: error not throw on duplicate open 2017-09-09 16:26:33 +00:00
reflect reflect: allow Copy to a byte array or byte slice from a string 2017-10-13 02:35:56 +00:00
regexp all: revert "all: prefer strings.IndexByte over strings.Index" 2017-10-05 23:19:10 +00:00
runtime runtime: avoid monotonic time zero on systems with low-res timers 2017-10-25 17:10:20 +00:00
sort sort: update main example to use Slice along with Sort 2017-09-24 14:40:37 +00:00
strconv unicode: update to Unicode 10.0.0 2017-10-24 12:42:35 +00:00
strings strings: improve readability of IndexAny and LastIndexAny functions. 2017-09-25 18:23:11 +00:00
sync sync/atomic: add memory barriers to Load/StoreInt32 on darwin/arm 2017-10-02 09:57:23 +00:00
syscall syscall: correct type for timeout argument to Select on linux/{arm64,mips64x} 2017-10-13 14:01:17 +00:00
testing testing/iotest: fix NewReadLogger documentation typo 2017-10-19 15:59:21 +00:00
text text/template: add break, continue actions in ranges 2017-10-17 02:06:15 +00:00
time all: revert "all: prefer strings.LastIndexByte over strings.LastIndex" 2017-10-05 23:19:42 +00:00
unicode unicode: update to Unicode 10.0.0 2017-10-24 12:42:35 +00:00
unsafe
vendor/golang_org/x unicode: update to Unicode 10.0.0 2017-10-24 12:42:35 +00:00
Make.dist
all.bash
all.bat
all.rc
androidtest.bash
bootstrap.bash
buildall.bash
clean.bash
clean.bat
clean.rc
cmp.bash
iostest.bash misc/ios,src/iostest.bash: support GOIOS_DEVICE_ID 2017-08-28 16:37:25 +00:00
make.bash build: move final steps of make.bash, make.bat, make.rc into cmd/dist 2017-10-25 01:13:01 +00:00
make.bat build: move final steps of make.bash, make.bat, make.rc into cmd/dist 2017-10-25 01:13:01 +00:00
make.rc build: move final steps of make.bash, make.bat, make.rc into cmd/dist 2017-10-25 01:13:01 +00:00
naclmake.bash
nacltest.bash
race.bash
race.bat
run.bash
run.bat
run.rc