fanzha02
d31ee64186
internal/bytealg: add optimized Compare for arm64
...
Use LDP instructions to load 16 bytes per loop when the source length is long. Specially
process the 8 bytes length, 4 bytes length and 2 bytes length to get a better performance.
Benchmark result:
name old time/op new time/op delta
BytesCompare/1-8 21.0ns ± 0% 10.5ns ± 0% ~ (p=0.079 n=4+5)
BytesCompare/2-8 11.5ns ± 0% 10.5ns ± 0% -8.70% (p=0.008 n=5+5)
BytesCompare/4-8 13.5ns ± 0% 10.0ns ± 0% -25.93% (p=0.008 n=5+5)
BytesCompare/8-8 28.8ns ± 0% 9.5ns ± 0% ~ (p=0.079 n=4+5)
BytesCompare/16-8 40.5ns ± 0% 10.5ns ± 0% -74.07% (p=0.008 n=5+5)
BytesCompare/32-8 64.6ns ± 0% 12.5ns ± 0% -80.65% (p=0.008 n=5+5)
BytesCompare/64-8 112ns ± 0% 16ns ± 0% -85.27% (p=0.008 n=5+5)
BytesCompare/128-8 208ns ± 0% 24ns ± 0% -88.22% (p=0.008 n=5+5)
BytesCompare/256-8 400ns ± 0% 50ns ± 0% -87.62% (p=0.008 n=5+5)
BytesCompare/512-8 785ns ± 0% 82ns ± 0% -89.61% (p=0.008 n=5+5)
BytesCompare/1024-8 1.55µs ± 0% 0.14µs ± 0% ~ (p=0.079 n=4+5)
BytesCompare/2048-8 3.09µs ± 0% 0.27µs ± 0% ~ (p=0.079 n=4+5)
CompareBytesEqual-8 39.0ns ± 0% 12.0ns ± 0% -69.23% (p=0.008 n=5+5)
CompareBytesToNil-8 8.57ns ± 5% 8.23ns ± 2% -3.99% (p=0.016 n=5+5)
CompareBytesEmpty-8 7.37ns ± 0% 7.36ns ± 4% ~ (p=0.690 n=5+5)
CompareBytesIdentical-8 7.39ns ± 0% 7.46ns ± 2% ~ (p=0.667 n=5+5)
CompareBytesSameLength-8 17.0ns ± 0% 10.5ns ± 0% -38.24% (p=0.008 n=5+5)
CompareBytesDifferentLength-8 17.0ns ± 0% 10.5ns ± 0% -38.24% (p=0.008 n=5+5)
CompareBytesBigUnaligned-8 1.58ms ± 0% 0.19ms ± 0% -88.31% (p=0.016 n=4+5)
CompareBytesBig-8 1.59ms ± 0% 0.19ms ± 0% -88.27% (p=0.016 n=5+4)
CompareBytesBigIdentical-8 7.01ns ± 0% 6.60ns ± 3% -5.91% (p=0.008 n=5+5)
name old speed new speed delta
CompareBytesBigUnaligned-8 662MB/s ± 0% 5660MB/s ± 0% +755.15% (p=0.016 n=4+5)
CompareBytesBig-8 661MB/s ± 0% 5636MB/s ± 0% +752.57% (p=0.016 n=5+4)
CompareBytesBigIdentical-8 150TB/s ± 0% 159TB/s ± 3% +6.27% (p=0.008 n=5+5)
This is resubmit of CL90175.
Change-Id: Ie841daedb3123a68dd2554f27ebef0b3f8a855c2
Reviewed-on: https://go-review.googlesource.com/101635
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-04-11 14:54:59 +00:00
Ilya Tocar
f8b28e28f8
internal/bytealg: remove dependency on runtime·support_avx2
...
Use internal/cpu instead.
Change-Id: I8670440389cbd88951fee61e352c4a10ac7eee6e
Reviewed-on: https://go-review.googlesource.com/102737
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-28 15:39:16 +00:00
Cherry Zhang
e22d24131c
Revert "bytes: add optimized Compare for arm64"
...
This reverts commit bfa8b6f8ff .
Reason for revert: This depends on another CL which is not yet submitted.
Change-Id: I50e7594f1473c911a2079fe910849a6694ac6c07
Reviewed-on: https://go-review.googlesource.com/101496
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-03-20 00:10:24 +00:00
fanzha02
bfa8b6f8ff
bytes: add optimized Compare for arm64
...
Use LDP instructions to load 16 bytes per loop when the source length is long. Specially
process the 8 bytes length, 4 bytes length and 2 bytes length to get a better performance.
Benchmark result:
name old time/op new time/op delta
BytesCompare/1-8 21.0ns ± 0% 10.5ns ± 0% ~ (p=0.079 n=4+5)
BytesCompare/2-8 11.5ns ± 0% 10.5ns ± 0% -8.70% (p=0.008 n=5+5)
BytesCompare/4-8 13.5ns ± 0% 10.0ns ± 0% -25.93% (p=0.008 n=5+5)
BytesCompare/8-8 28.8ns ± 0% 9.5ns ± 0% ~ (p=0.079 n=4+5)
BytesCompare/16-8 40.5ns ± 0% 10.5ns ± 0% -74.07% (p=0.008 n=5+5)
BytesCompare/32-8 64.6ns ± 0% 12.5ns ± 0% -80.65% (p=0.008 n=5+5)
BytesCompare/64-8 112ns ± 0% 16ns ± 0% -85.27% (p=0.008 n=5+5)
BytesCompare/128-8 208ns ± 0% 24ns ± 0% -88.22% (p=0.008 n=5+5)
BytesCompare/256-8 400ns ± 0% 50ns ± 0% -87.62% (p=0.008 n=5+5)
BytesCompare/512-8 785ns ± 0% 82ns ± 0% -89.61% (p=0.008 n=5+5)
BytesCompare/1024-8 1.55µs ± 0% 0.14µs ± 0% ~ (p=0.079 n=4+5)
BytesCompare/2048-8 3.09µs ± 0% 0.27µs ± 0% ~ (p=0.079 n=4+5)
CompareBytesEqual-8 39.0ns ± 0% 12.0ns ± 0% -69.23% (p=0.008 n=5+5)
CompareBytesToNil-8 8.57ns ± 5% 8.23ns ± 2% -3.99% (p=0.016 n=5+5)
CompareBytesEmpty-8 7.37ns ± 0% 7.36ns ± 4% ~ (p=0.690 n=5+5)
CompareBytesIdentical-8 7.39ns ± 0% 7.46ns ± 2% ~ (p=0.667 n=5+5)
CompareBytesSameLength-8 17.0ns ± 0% 10.5ns ± 0% -38.24% (p=0.008 n=5+5)
CompareBytesDifferentLength-8 17.0ns ± 0% 10.5ns ± 0% -38.24% (p=0.008 n=5+5)
CompareBytesBigUnaligned-8 1.58ms ± 0% 0.19ms ± 0% -88.31% (p=0.016 n=4+5)
CompareBytesBig-8 1.59ms ± 0% 0.19ms ± 0% -88.27% (p=0.016 n=5+4)
CompareBytesBigIdentical-8 7.01ns ± 0% 6.60ns ± 3% -5.91% (p=0.008 n=5+5)
name old speed new speed delta
CompareBytesBigUnaligned-8 662MB/s ± 0% 5660MB/s ± 0% +755.15% (p=0.016 n=4+5)
CompareBytesBig-8 661MB/s ± 0% 5636MB/s ± 0% +752.57% (p=0.016 n=5+4)
CompareBytesBigIdentical-8 150TB/s ± 0% 159TB/s ± 3% +6.27% (p=0.008 n=5+5)
Change-Id: I70332de06f873df3bc12c4a5af1028307b670046
Reviewed-on: https://go-review.googlesource.com/90175
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-03-20 00:06:34 +00:00
Keith Randall
63bcabed49
internal/bytealg: fix arm64 Index function
...
Missed removing the argument loading from the indexbody function.
Change-Id: Ia1391231fc99771d00410a09fe80a09f08ceed02
Reviewed-on: https://go-review.googlesource.com/98575
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Elias Naur <elias.naur@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-03-05 15:38:29 +00:00
Keith Randall
ee58eccc56
internal/bytealg: move short string Index implementations into bytealg
...
Also move the arm64 CountByte implementation while we're here.
Fixes #19792
Change-Id: I1e0fdf1e03e3135af84150a2703b58dad1b0d57e
Reviewed-on: https://go-review.googlesource.com/98518
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-04 19:49:44 +00:00
Keith Randall
f6332bb84a
internal/bytealg: move compare functions to bytealg
...
Move bytes.Compare and runtime·cmpstring to bytealg.
Update #19792
Change-Id: I139e6d7c59686bef7a3017e3dec99eba5fd10447
Reviewed-on: https://go-review.googlesource.com/98515
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-04 17:49:39 +00:00
Keith Randall
45964e4f9c
internal/bytealg: move Count to bytealg
...
Move bytes.Count and strings.Count to bytealg.
Update #19792
Change-Id: I3e4e14b504a0b71758885bb131e5656e342cf8cb
Reviewed-on: https://go-review.googlesource.com/98495
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-04 17:49:25 +00:00
Keith Randall
1dfa380e3d
internal/bytealg: move equal functions to bytealg
...
Move bytes.Equal, runtime.memequal, and runtime.memequal_varlen
to the bytealg package.
Update #19792
Change-Id: Ic4175e952936016ea0bda6c7c3dbb33afdc8e4ac
Reviewed-on: https://go-review.googlesource.com/98355
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-03 04:18:27 +00:00
Keith Randall
403ab0f221
internal/bytealg: move IndexByte asssembly to the new bytealg package
...
Move the IndexByte function from the runtime to a new bytealg package.
The new package will eventually hold all the optimized assembly for
groveling through byte slices and strings. It seems a better home for
this code than randomly keeping it in runtime.
Once this is in, the next step is to move the other functions
(Compare, Equal, ...).
Update #19792
This change seems complicated enough that we might just declare
"not worth it" and abandon. Opinions welcome.
The core assembly is all unchanged, except minor modifications where
the code reads cpu feature bits.
The wrapper functions have been cleaned up as they are now actually
checked by vet.
Change-Id: I9fa75bee5d85db3a65b3fd3b7997e60367523796
Reviewed-on: https://go-review.googlesource.com/98016
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-02 22:46:15 +00:00