Move the IndexByte function from the runtime to a new bytealg package.
The new package will eventually hold all the optimized assembly for
groveling through byte slices and strings. It seems a better home for
this code than randomly keeping it in runtime.
Once this is in, the next step is to move the other functions
(Compare, Equal, ...).
Update #19792
This change seems complicated enough that we might just declare
"not worth it" and abandon. Opinions welcome.
The core assembly is all unchanged, except minor modifications where
the code reads cpu feature bits.
The wrapper functions have been cleaned up as they are now actually
checked by vet.
Change-Id: I9fa75bee5d85db3a65b3fd3b7997e60367523796
Reviewed-on: https://go-review.googlesource.com/98016
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Despite the existing test that locks in the allocation behavior, people
really want a benchmark. So:
BenchmarkBuildString_Builder/1Write_NoGrow-4 20000000 60.4 ns/op 48 B/op 1 allocs/op
BenchmarkBuildString_Builder/3Write_NoGrow-4 10000000 230 ns/op 336 B/op 3 allocs/op
BenchmarkBuildString_Builder/3Write_Grow-4 20000000 102 ns/op 112 B/op 1 allocs/op
BenchmarkBuildString_ByteBuffer/1Write_NoGrow-4 10000000 125 ns/op 160 B/op 2 allocs/op
BenchmarkBuildString_ByteBuffer/3Write_NoGrow-4 5000000 339 ns/op 400 B/op 3 allocs/op
BenchmarkBuildString_ByteBuffer/3Write_Grow-4 5000000 316 ns/op 336 B/op 3 allocs/op
I don't think these allocate-as-fast-as-you-can benchmarks are very
interesting because they're effectively just GC benchmarks, but sure.
If one wants to see that there's 1 fewer allocation, there it is. The
ns/op and B/op numbers will change as the built string size changes.
Updates #18990
Change-Id: Ifccf535bd396217434a0e6989e195105f90132ae
Reviewed-on: https://go-review.googlesource.com/96980
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alan Donovan <adonovan@google.com>
All credit and blame goes to Ian for this suggestion, copied from the
runtime.
Fixes#23382
Updates #7921
Change-Id: I3d5a9ee4ab730c87e0f3feff3e7fceff9bcf9e18
Reviewed-on: https://go-review.googlesource.com/86976
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The Builder's ReadFrom method allows the underlying unsafe slice to
escape, and for callers to subsequently modify memory that had been
unsafely converted into an immutable string.
In the original proposal for Builder (#18990), I'd noted there should
be no Read methods:
> There would be no Reset or Bytes or Truncate or Read methods.
> Nothing that could mutate the []byte once it was unsafely converted
> to a string.
And in my prototype (https://golang.org/cl/37767), I handled ReadFrom
properly, but when https://golang.org/cl/74931 arrived, I missed that
it had a ReadFrom method and approved it.
Because we're so close to the Go 1.10 release, just remove the
ReadFrom method rather than think about possible fixes. It has
marginal utility in a Builder anyway.
Also, fix a separate bug that also allowed mutation of a slice's
backing array after it had been converted into a slice by disallowing
copies of the Builder by value.
Updates #18990Fixes#23083Fixes#23084
Change-Id: Id1f860f8a4f5f88b32213cf85108ebc609acb95f
Reviewed-on: https://go-review.googlesource.com/83255
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
CL 65851 (bytes) and CL 65910 (strings) “improve[d] readability”
by removing the special case that bypassed the whole function body
when chars == "". In doing so, yes, the function was unindented a
level, which is nice, but the runtime of that case went from O(1) to O(n)
where n = len(s).
I don't know if anyone's code depends on the O(1) behavior in this case,
but quite possibly someone's does.
This CL adds the special case back, with a comment to prevent future
deletions, and without reindenting each function body in full.
Change-Id: I5aba33922b304dd1b8657e6d51d6c937a7f95c81
Reviewed-on: https://go-review.googlesource.com/78112
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This is like a write-only subset of bytes.Buffer with an
allocation-free String method.
Fixes#18990.
Change-Id: Icdf7240f4309a52924dc3af04a39ecd737a210f4
Reviewed-on: https://go-review.googlesource.com/74931
Run-TryBot: Caleb Spare <cespare@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Change-Id: Ifa0384722dd879af7f5edb7b7aaac5ede3cff46d
Reviewed-on: https://go-review.googlesource.com/74690
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This change removes the check of len(chars) > 0 inside the Index and
IndexAny functions which was redundant.
Change-Id: Iffbc0f2b3332c6e31c7514b5f644b6fe7bdcfe0d
Reviewed-on: https://go-review.googlesource.com/65910
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
We initialize fieldStart to 0, then set it to i without ever reading
0, so we might as well just initialize it to i.
Change-Id: I17905b25d54a62b6bc76f915353756ed5eb6972b
Reviewed-on: https://go-review.googlesource.com/52933
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
Reviewed-by: Avelino <t@avelino.xxx>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Change-Id: I6bfe5b914cf11be1cd1f8e61d557cc718725f0be
Reviewed-on: https://go-review.googlesource.com/49013
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Add a example for string.Compare that return the three possible results.
Change-Id: I103cf39327c1868fb249538d9e22b11865ba4b70
Reviewed-on: https://go-review.googlesource.com/49011
Reviewed-by: Heschi Kreinick <heschi@google.com>
Change-Id: Ib6a59735381ce744553f1ac96eeb65a194c8da10
Reviewed-on: https://go-review.googlesource.com/48860
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Change-Id: I69d1359d8868d4c5b173e4d831e38cea7dfeb713
Reviewed-on: https://go-review.googlesource.com/48859
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Apparently people get confused by the fact that
Split("", ",")
returns []{""} instead of []{}.
This is actually just a consequence of the fact that if the separator
sep (2nd argument) is not found the string s (1st argument), then the
Split* functions return a length 1 slice with the string s in it.
Document the general case: if sep is not in s, what you get is a len 1
slice with s in it; unless both s and sep are "", in that case you get
an empty slice of length 0.
Fixes#19726
Change-Id: I64c8220b91acd1e5aa1cc1829199e0cd8c47c404
Reviewed-on: https://go-review.googlesource.com/44950
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Implements detection of x86 cpu features that
are used in the go standard library.
Changes all standard library packages to use the new cpu package
instead of using runtime internal variables to check x86 cpu features.
Updates: #15403
Change-Id: I2999a10cb4d9ec4863ffbed72f4e021a1dbc4bb9
Reviewed-on: https://go-review.googlesource.com/41476
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The new Map implementation introduced in golang.org/cl/33201
did not differentiate if an invalid UTF-8 sequence was decoded
or the RuneError rune. It would therefore always advance by
3 bytes (which is the length of the RuneError rune) instead
of 1 for an invalid sequences. This cl adds a check to correctly
determine the length of bytes needed to advance to the next rune.
Fixes#19330.
Change-Id: I1e7f9333f3ef6068ffc64015bb0a9f32b0b7111d
Reviewed-on: https://go-review.googlesource.com/37597
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Using 'sep' as parameter name for strings functions that take a
separator argument is fine, but for functions like Index or Count that
look for a substring it's better to use 'substr' (like Contains
already does).
Fixes#19039
Change-Id: Idd557409c8fea64ce830ab0e3fec37d3d56a79f0
Reviewed-on: https://go-review.googlesource.com/36874
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Updates the s390x-specific files in these packages with the changes
to the amd64-specific files made during the review of CL 31690. I'd
like to keep these files in sync unless there is a reason to
diverge.
Change-Id: Id83e5ce11a45f877bdcc991d02b14416d1a2d8d2
Reviewed-on: https://go-review.googlesource.com/32574
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
In a large codebase within Google, there are thousands of uses of:
ContainsAny|IndexAny|LastIndexAny|Trim|TrimLeft|TrimRight
An analysis of their usage shows that over 97% of them only use character
sets consisting of only ASCII symbols.
Uses of ContainsAny|IndexAny|LastIndexAny:
6% are 1 character (e.g., "\n" or " ")
58% are 2-4 characters (e.g., "<>" or "\r\n\t ")
24% are 5-9 characters (e.g., "()[]*^$")
10% are 10+ characters (e.g., "+-=&|><!(){}[]^\"~*?:\\/ ")
We optimize for ASCII sets, which are commonly used to search for
"control" characters in some string. We don't optimize for the
single character scenario since IndexRune or IndexByte could be used.
Uses of Trim|TrimLeft|TrimRight:
71% are 1 character (e.g., "\n" or " ")
14% are 2 characters (e.g., "\r\n")
10% are 3-4 characters (e.g., " \t\r\n")
5% are 10+ characters (e.g., "0123456789abcdefABCDEF")
We optimize for the single character case with a simple closured function
that only checks for that character's value. We optimize for the medium
and larger sets using a 16-byte bit-map representing a set of ASCII characters.
The benchmarks below have the following suffix name "%d:%d" where the first
number is the length of the input and the second number is the length
of the charset.
== bytes package ==
benchmark old ns/op new ns/op delta
BenchmarkIndexAnyASCII/1:1-4 5.09 5.23 +2.75%
BenchmarkIndexAnyASCII/1:2-4 5.81 5.85 +0.69%
BenchmarkIndexAnyASCII/1:4-4 7.22 7.50 +3.88%
BenchmarkIndexAnyASCII/1:8-4 11.0 11.1 +0.91%
BenchmarkIndexAnyASCII/1:16-4 17.5 17.8 +1.71%
BenchmarkIndexAnyASCII/16:1-4 36.0 34.0 -5.56%
BenchmarkIndexAnyASCII/16:2-4 46.6 36.5 -21.67%
BenchmarkIndexAnyASCII/16:4-4 78.0 40.4 -48.21%
BenchmarkIndexAnyASCII/16:8-4 136 47.4 -65.15%
BenchmarkIndexAnyASCII/16:16-4 254 61.5 -75.79%
BenchmarkIndexAnyASCII/256:1-4 542 388 -28.41%
BenchmarkIndexAnyASCII/256:2-4 705 382 -45.82%
BenchmarkIndexAnyASCII/256:4-4 1089 386 -64.55%
BenchmarkIndexAnyASCII/256:8-4 1994 394 -80.24%
BenchmarkIndexAnyASCII/256:16-4 3843 411 -89.31%
BenchmarkIndexAnyASCII/4096:1-4 8522 5873 -31.08%
BenchmarkIndexAnyASCII/4096:2-4 11253 5861 -47.92%
BenchmarkIndexAnyASCII/4096:4-4 17824 5883 -66.99%
BenchmarkIndexAnyASCII/4096:8-4 32053 5871 -81.68%
BenchmarkIndexAnyASCII/4096:16-4 60512 5888 -90.27%
BenchmarkTrimASCII/1:1-4 79.5 70.8 -10.94%
BenchmarkTrimASCII/1:2-4 79.0 105 +32.91%
BenchmarkTrimASCII/1:4-4 79.6 109 +36.93%
BenchmarkTrimASCII/1:8-4 78.8 118 +49.75%
BenchmarkTrimASCII/1:16-4 80.2 132 +64.59%
BenchmarkTrimASCII/16:1-4 243 116 -52.26%
BenchmarkTrimASCII/16:2-4 243 171 -29.63%
BenchmarkTrimASCII/16:4-4 243 176 -27.57%
BenchmarkTrimASCII/16:8-4 241 184 -23.65%
BenchmarkTrimASCII/16:16-4 238 199 -16.39%
BenchmarkTrimASCII/256:1-4 2580 840 -67.44%
BenchmarkTrimASCII/256:2-4 2603 1175 -54.86%
BenchmarkTrimASCII/256:4-4 2572 1188 -53.81%
BenchmarkTrimASCII/256:8-4 2550 1191 -53.29%
BenchmarkTrimASCII/256:16-4 2585 1208 -53.27%
BenchmarkTrimASCII/4096:1-4 39773 12181 -69.37%
BenchmarkTrimASCII/4096:2-4 39946 17231 -56.86%
BenchmarkTrimASCII/4096:4-4 39641 17179 -56.66%
BenchmarkTrimASCII/4096:8-4 39835 17175 -56.88%
BenchmarkTrimASCII/4096:16-4 40229 17215 -57.21%
== strings package ==
benchmark old ns/op new ns/op delta
BenchmarkIndexAnyASCII/1:1-4 5.94 4.97 -16.33%
BenchmarkIndexAnyASCII/1:2-4 5.94 5.55 -6.57%
BenchmarkIndexAnyASCII/1:4-4 7.45 7.21 -3.22%
BenchmarkIndexAnyASCII/1:8-4 10.8 10.6 -1.85%
BenchmarkIndexAnyASCII/1:16-4 17.4 17.2 -1.15%
BenchmarkIndexAnyASCII/16:1-4 36.4 32.2 -11.54%
BenchmarkIndexAnyASCII/16:2-4 49.6 34.6 -30.24%
BenchmarkIndexAnyASCII/16:4-4 77.5 37.9 -51.10%
BenchmarkIndexAnyASCII/16:8-4 138 45.5 -67.03%
BenchmarkIndexAnyASCII/16:16-4 241 59.1 -75.48%
BenchmarkIndexAnyASCII/256:1-4 509 378 -25.74%
BenchmarkIndexAnyASCII/256:2-4 720 381 -47.08%
BenchmarkIndexAnyASCII/256:4-4 1142 384 -66.37%
BenchmarkIndexAnyASCII/256:8-4 1999 391 -80.44%
BenchmarkIndexAnyASCII/256:16-4 3735 403 -89.21%
BenchmarkIndexAnyASCII/4096:1-4 7973 5824 -26.95%
BenchmarkIndexAnyASCII/4096:2-4 11432 5809 -49.19%
BenchmarkIndexAnyASCII/4096:4-4 18327 5819 -68.25%
BenchmarkIndexAnyASCII/4096:8-4 33059 5828 -82.37%
BenchmarkIndexAnyASCII/4096:16-4 59703 5817 -90.26%
BenchmarkTrimASCII/1:1-4 71.9 71.8 -0.14%
BenchmarkTrimASCII/1:2-4 73.3 103 +40.52%
BenchmarkTrimASCII/1:4-4 71.8 106 +47.63%
BenchmarkTrimASCII/1:8-4 71.2 113 +58.71%
BenchmarkTrimASCII/1:16-4 71.6 128 +78.77%
BenchmarkTrimASCII/16:1-4 152 116 -23.68%
BenchmarkTrimASCII/16:2-4 160 168 +5.00%
BenchmarkTrimASCII/16:4-4 172 170 -1.16%
BenchmarkTrimASCII/16:8-4 200 177 -11.50%
BenchmarkTrimASCII/16:16-4 254 193 -24.02%
BenchmarkTrimASCII/256:1-4 1438 864 -39.92%
BenchmarkTrimASCII/256:2-4 1551 1195 -22.95%
BenchmarkTrimASCII/256:4-4 1770 1200 -32.20%
BenchmarkTrimASCII/256:8-4 2195 1216 -44.60%
BenchmarkTrimASCII/256:16-4 3054 1224 -59.92%
BenchmarkTrimASCII/4096:1-4 21726 12557 -42.20%
BenchmarkTrimASCII/4096:2-4 23586 17508 -25.77%
BenchmarkTrimASCII/4096:4-4 26898 17510 -34.90%
BenchmarkTrimASCII/4096:8-4 33714 17595 -47.81%
BenchmarkTrimASCII/4096:16-4 47429 17700 -62.68%
The benchmarks added test the worst case. For IndexAny, that is when the
charset matches none of the input. For Trim, it is when the charset matches
all of the input.
Change-Id: I970874d101a96b33528fc99b165379abe58cf6ea
Reviewed-on: https://go-review.googlesource.com/31593
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Martin Möhrmann <martisch@uos.de>
In all previous versions of Go, the behavior of IndexRune(s, r)
where r was utf.RuneError was that it would effectively return the
index of any invalid UTF-8 byte sequence (include RuneError).
Optimizations made in http://golang.org/cl/28537 and
http://golang.org/cl/28546 altered this undocumented behavior such
that RuneError would only match on the RuneError rune itself.
Although, the new behavior is arguably reasonable, it did break code
that depended on the previous behavior. Thus, we add special checks
to ensure that we preserve the old behavior.
There is a slight performance hit for correctness:
benchmark old ns/op new ns/op delta
BenchmarkIndexRune/10-4 19.3 21.6 +11.92%
BenchmarkIndexRune/32-4 33.6 35.2 +4.76%
This only occurs on small strings. The performance hit for larger strings
is neglible and not shown.
Fixes#17611
Change-Id: I1d863a741213d46c40b2e1724c41245df52502a5
Reviewed-on: https://go-review.googlesource.com/32123
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Panic if Repeat is given a negative count or
if the value of (len(*) * count) is detected
to overflow.
We panic because we cannot change the
signature of Repeat to return an error.
Fixes#16237
Change-Id: I9f5ba031a5b8533db0582d7a672ffb715143f3fb
Reviewed-on: https://go-review.googlesource.com/29954
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
IndexHard4-4 1.50ms ± 2% 0.71ms ± 0% -52.36% (p=0.000 n=20+19)
This also fixes a bug, that caused a string of length 16 to use
two 8-byte comparisons instead of one 16-byte. And adds a test for
cases when partial_match fails.
Change-Id: I1ee8fc4e068bb36c95c45de78f067c822c0d9df0
Reviewed-on: https://go-review.googlesource.com/22551
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
We already had special cases for 0 and 1. Add 2 and 3 for now too.
To be removed if the compiler is improved later (#6714).
This halves the number of allocations and total bytes allocated via
common filepath.Join calls, improving filepath.Walk performance.
Noticed as part of investigating filepath.Walk in #16399.
Change-Id: If7b1bb85606d4720f3ebdf8de7b1e12ad165079d
Reviewed-on: https://go-review.googlesource.com/25005
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
The 17-31 byte code is broken. Disabled it.
Added a bunch of tests to at least cover the cases
in indexShortStr. I'll channel Brad and wonder why
this CL ever got in without any tests.
Fixes#15679
Change-Id: I84a7b283a74107db865b9586c955dcf5f2d60161
Reviewed-on: https://go-review.googlesource.com/23106
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
CL/19862 (f79b50b8d5) recently introduced the constants
SeekStart, SeekCurrent, and SeekEnd to the io package. We should use these constants
consistently throughout the code base.
Updates #15269
Change-Id: If7fcaca7676e4a51f588528f5ced28220d9639a2
Reviewed-on: https://go-review.googlesource.com/22097
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Joe Tsai <joetsai@digital-static.net>
TryBot-Result: Gobot Gobot <gobot@golang.org>