Commit Graph

29 Commits

Author SHA1 Message Date
Andy Balholm 0ce1ffc49d regexp/syntax: append patchLists in constant time
By keeping a tail pointer, we can append to a patchList in constant
time, rather than in time proportional to the length of the list. This
gets rid of the quadratic compile times we were seeing for long series
of alternations.

This is basically the same change as
e9d517989f.

Fixes #39542.

Change-Id: Ib4ca0ca9c55abd1594df1984653c7d311ccf7572
Reviewed-on: https://go-review.googlesource.com/c/go/+/238079
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2020-06-18 00:40:11 +00:00
Russ Cox 4d9ecde30a regexp/syntax: fix comment on p.literal and simplify
p.literal's doc comment said it returned a value but it doesn't.
While we're here, p.newLiteral is only called from p.literal,
so simplify the code by merging the two.

Change-Id: Ia357937a99f4e7473f0f1ec837113a39eaeb83d4
Reviewed-on: https://go-review.googlesource.com/c/go/+/222659
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2020-04-17 22:12:02 +00:00
Keegan Carruthers-Smith 648c7b592a regexp/syntax: exclude full range from String negation case
If the char class is 0x0-0x10ffff we mistakenly would String that to `[^]`,
which is not a valid regex.

Fixes #31807

Change-Id: I9ceeaddc28b67b8e1de12b6703bcb124cc784556
Reviewed-on: https://go-review.googlesource.com/c/go/+/175679
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-05-22 04:43:25 +00:00
Brad Fitzpatrick 3813edf26e all: use "reports whether" consistently in the few places that didn't
Go documentation style for boolean funcs is to say:

    // Foo reports whether ...
    func Foo() bool

(rather than "returns true if")

This CL also replaces 4 uses of "iff" with the same "reports whether"
wording, which doesn't lose any meaning, and will prevent people from
sending typo fixes when they don't realize it's "if and only if". In
the past I think we've had the typo CLs updated to just say "reports
whether". So do them all at once.

(Inspired by the addition of another "returns true if" in CL 146938
in fd_plan9.go)

Created with:

$ perl -i -npe 's/returns true if/reports whether/' $(git grep -l "returns true iff" | grep -v vendor)
$ perl -i -npe 's/returns true if/reports whether/' $(git grep -l "returns true if" | grep -v vendor)

Change-Id: Ided502237f5ab0d25cb625dbab12529c361a8b9f
Reviewed-on: https://go-review.googlesource.com/c/147037
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-11-02 22:47:58 +00:00
Ian Lance Taylor 3396034155 regexp/syntax: don't do both linear and binary sesarch in MatchRunePos
MatchRunePos is a significant element of regexp performance, so some
attention to optimization is appropriate. Before this CL, a
non-matching rune would do both a linear search in the first four
entries, and a binary search over all the entries. Change the code to
optimize for the common case of two runes, to only do a linear search
when there are up to four entries, and to only do a binary search when
there are more than four entries.

Updates #26623

name                             old time/op    new time/op    delta
Find-12                             260ns ± 1%     275ns ± 7%   +5.84%  (p=0.000 n=8+10)
FindAllNoMatches-12                 144ns ± 9%     143ns ±12%     ~     (p=0.187 n=10+10)
FindString-12                       256ns ± 4%     254ns ± 1%     ~     (p=0.357 n=9+8)
FindSubmatch-12                     587ns ±12%     593ns ±11%     ~     (p=0.516 n=10+10)
FindStringSubmatch-12               534ns ±12%     525ns ±14%     ~     (p=0.565 n=10+10)
Literal-12                          104ns ±14%     106ns ±11%     ~     (p=0.145 n=10+10)
NotLiteral-12                      1.51µs ± 8%    1.47µs ± 2%     ~     (p=0.508 n=10+9)
MatchClass-12                      2.47µs ± 1%    2.26µs ± 6%   -8.55%  (p=0.000 n=8+10)
MatchClass_InRange-12              2.18µs ± 5%    2.25µs ±11%   +2.85%  (p=0.009 n=9+10)
ReplaceAll-12                      2.35µs ± 6%    2.08µs ±23%  -11.59%  (p=0.010 n=9+10)
AnchoredLiteralShortNonMatch-12    93.2ns ± 9%    93.2ns ±11%     ~     (p=0.716 n=10+10)
AnchoredLiteralLongNonMatch-12      118ns ±10%     117ns ± 9%     ~     (p=0.802 n=10+10)
AnchoredShortMatch-12               142ns ± 1%     141ns ± 1%   -0.53%  (p=0.007 n=8+8)
AnchoredLongMatch-12                303ns ± 9%     304ns ± 6%     ~     (p=0.724 n=10+10)
OnePassShortA-12                    620ns ± 1%     618ns ± 9%     ~     (p=0.162 n=8+10)
NotOnePassShortA-12                 599ns ± 8%     568ns ± 1%   -5.21%  (p=0.000 n=10+8)
OnePassShortB-12                    525ns ± 7%     489ns ± 1%   -6.93%  (p=0.000 n=10+8)
NotOnePassShortB-12                 449ns ± 9%     431ns ±11%   -4.05%  (p=0.033 n=10+10)
OnePassLongPrefix-12                119ns ± 6%     114ns ± 0%   -3.88%  (p=0.006 n=10+9)
OnePassLongNotPrefix-12             420ns ± 9%     410ns ± 7%     ~     (p=0.645 n=10+9)
MatchParallelShared-12              376ns ± 0%     375ns ± 0%   -0.45%  (p=0.003 n=8+10)
MatchParallelCopied-12             39.4ns ± 1%    39.1ns ± 0%   -0.55%  (p=0.004 n=10+9)
QuoteMetaAll-12                     139ns ± 7%     142ns ± 7%     ~     (p=0.445 n=10+10)
QuoteMetaNone-12                   56.7ns ± 0%    61.3ns ± 7%   +8.03%  (p=0.001 n=8+10)
Match/Easy0/32-12                  83.4ns ± 7%    83.1ns ± 8%     ~     (p=0.541 n=10+10)
Match/Easy0/1K-12                   417ns ± 8%     394ns ± 6%     ~     (p=0.059 n=10+9)
Match/Easy0/32K-12                 7.05µs ± 8%    7.30µs ± 9%     ~     (p=0.190 n=10+10)
Match/Easy0/1M-12                   291µs ±17%     284µs ±10%     ~     (p=0.481 n=10+10)
Match/Easy0/32M-12                 9.89ms ± 4%   10.27ms ± 8%     ~     (p=0.315 n=10+10)
Match/Easy0i/32-12                 1.13µs ± 1%    1.14µs ± 1%   +1.51%  (p=0.000 n=8+8)
Match/Easy0i/1K-12                 35.7µs ±11%    36.8µs ±10%     ~     (p=0.143 n=10+10)
Match/Easy0i/32K-12                1.70ms ± 7%    1.72ms ± 7%     ~     (p=0.776 n=9+6)

name                             old alloc/op   new alloc/op   delta
Find-12                             0.00B          0.00B          ~     (all equal)
FindAllNoMatches-12                 0.00B          0.00B          ~     (all equal)
FindString-12                       0.00B          0.00B          ~     (all equal)
FindSubmatch-12                     48.0B ± 0%     48.0B ± 0%     ~     (all equal)
FindStringSubmatch-12               32.0B ± 0%     32.0B ± 0%     ~     (all equal)

name                             old allocs/op  new allocs/op  delta
Find-12                              0.00           0.00          ~     (all equal)
FindAllNoMatches-12                  0.00           0.00          ~     (all equal)
FindString-12                        0.00           0.00          ~     (all equal)
FindSubmatch-12                      1.00 ± 0%      1.00 ± 0%     ~     (all equal)
FindStringSubmatch-12                1.00 ± 0%      1.00 ± 0%     ~     (all equal)

name                             old speed      new speed      delta
QuoteMetaAll-12                   101MB/s ± 8%    99MB/s ± 7%     ~     (p=0.529 n=10+10)
QuoteMetaNone-12                  458MB/s ± 0%   425MB/s ± 8%   -7.22%  (p=0.003 n=8+10)
Match/Easy0/32-12                 385MB/s ± 7%   386MB/s ± 7%     ~     (p=0.579 n=10+10)
Match/Easy0/1K-12                2.46GB/s ± 8%  2.60GB/s ± 6%     ~     (p=0.065 n=10+9)
Match/Easy0/32K-12               4.66GB/s ± 7%  4.50GB/s ±10%     ~     (p=0.190 n=10+10)
Match/Easy0/1M-12                3.63GB/s ±15%  3.70GB/s ± 9%     ~     (p=0.481 n=10+10)
Match/Easy0/32M-12               3.40GB/s ± 4%  3.28GB/s ± 8%     ~     (p=0.315 n=10+10)
Match/Easy0i/32-12               28.4MB/s ± 1%  28.0MB/s ± 1%   -1.50%  (p=0.000 n=8+8)
Match/Easy0i/1K-12               28.8MB/s ±10%  27.9MB/s ±11%     ~     (p=0.143 n=10+10)
Match/Easy0i/32K-12              19.0MB/s ±14%  19.1MB/s ± 8%     ~     (p=1.000 n=10+6)

Change-Id: I238a451b36ad84b0f5534ff0af5c077a0d52d73a
Reviewed-on: https://go-review.googlesource.com/130417
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-08-22 17:11:57 +00:00
Tim Cooper 161874da2a all: update comment URLs from HTTP to HTTPS, where possible
Each URL was manually verified to ensure it did not serve up incorrect
content.

Change-Id: I4dc846227af95a73ee9a3074d0c379ff0fa955df
Reviewed-on: https://go-review.googlesource.com/115798
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
2018-06-01 21:52:00 +00:00
Ian Lance Taylor 9967582f77 regexp/syntax: update perl script to preserve \s behavior
Incorporate https://code-review.googlesource.com/#/c/re2/+/3050/ from
the re2 repository. Description of that change:

    Preserve the original behaviour of \s.

    Prior to Perl 5.18, \s did not match vertical tab. Bake that into
    make_perl_groups.pl as an override so that perl_groups.cc retains
    its current definitions when rebuilt with newer versions of Perl.

This fixes make_perl_groups.pl to generate an unchanged perl_groups.go
with perl versions 5.18 and later.

Fixes #22057

Change-Id: I9a56e9660092ed6c1ff1045b4a3847de355441a7
Reviewed-on: https://go-review.googlesource.com/103517
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-29 22:31:57 +00:00
Brad Fitzpatrick 48db2c01b4 all: use strings.Builder instead of bytes.Buffer where appropriate
I grepped for "bytes.Buffer" and "buf.String" and mostly ignored test
files. I skipped a few on purpose and probably missed a few others,
but otherwise I think this should be most of them.

Updates #18990

Change-Id: I5a6ae4296b87b416d8da02d7bfaf981d8cc14774
Reviewed-on: https://go-review.googlesource.com/102479
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-03-26 23:05:53 +00:00
Daniel Martí 8da180f6ca all: remove some unused return parameters
As found by unparam. Picked the low-hanging fruit, consisting only of
errors that were always nil and results that were never used. Left out
those that were useful for consistency with other func signatures.

Change-Id: I06b52bbd3541f8a5d66659c909bd93cb3e172018
Reviewed-on: https://go-review.googlesource.com/102418
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-24 19:44:47 +00:00
Daniel Martí d4b2168b23 regexp/syntax: make Op an fmt.Stringer
Using stringer.

Fixes #22684.

Change-Id: I62fbde5dcb337cf269686615616bd39a27491ac1
Reviewed-on: https://go-review.googlesource.com/95355
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-02-20 15:33:50 +00:00
Joe Tsai b53088a634 Revert "go/printer: forbid empty line before first comment in block"
This reverts commit 08f19bbde1.

Reason for revert:
The changed transformation takes effect on a larger set
of code snippets than expected.

For example, this:
    func foo() {

        // Comment
        bar()

    }
becomes:
    func foo() {
        // Comment
        bar()

    }

This is an unintended consequence.

Change-Id: Ifca88d6267dab8a8170791f7205124712bf8ace8
Reviewed-on: https://go-review.googlesource.com/81335
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Joe Tsai <joetsai@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-12-01 01:12:26 +00:00
Joe Tsai 08f19bbde1 go/printer: forbid empty line before first comment in block
To improve readability when exported fields are removed,
forbid the printer from emitting an empty line before the first comment
in a const, var, or type block.
Also, when printing the "Has filtered or unexported fields." message,
add an empty line before it to separate the message from the struct
or interfact contents.

Before the change:
<<<
type NamedArg struct {

        // Name is the name of the parameter placeholder.
        //
        // If empty, the ordinal position in the argument list will be
        // used.
        //
        // Name must omit any symbol prefix.
        Name string

        // Value is the value of the parameter.
        // It may be assigned the same value types as the query
        // arguments.
        Value interface{}
        // contains filtered or unexported fields
}
>>>

After the change:
<<<
type NamedArg struct {
        // Name is the name of the parameter placeholder.
        //
        // If empty, the ordinal position in the argument list will be
        // used.
        //
        // Name must omit any symbol prefix.
        Name string

        // Value is the value of the parameter.
        // It may be assigned the same value types as the query
        // arguments.
        Value interface{}

        // contains filtered or unexported fields
}
>>>

Fixes #18264

Change-Id: I9fe17ca39cf92fcdfea55064bd2eaa784ce48c88
Reviewed-on: https://go-review.googlesource.com/71990
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2017-11-02 18:17:22 +00:00
Sylvain Zimmer 67da597312 regexp: Remove duplicated function wordRune()
Fixes #21742

Change-Id: Ib56b092c490c27a4ba7ebdb6391f1511794710b8
Reviewed-on: https://go-review.googlesource.com/61034
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
2017-09-08 23:39:37 +00:00
Daniel Martí ad74e450ca regexp/syntax: remove unused flags parameter
Found by github.com/mvdan/unparam.

Change-Id: I186d2afd067e97eb05d65c4599119b347f82867d
Reviewed-on: https://go-review.googlesource.com/37840
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-03-06 19:11:09 +00:00
Marcel van Lohuizen a2a4db7375 unicode: upgrade to version 9.0.0
Changes beyond generated tables:
- Now supports aliases to handle deprecated
  property classes.
- Some Mongolian letters are now modifiers.

Other changes:
- strconv: newly generated table to be in sync
- regexp/syntax: updated maxFold

Fixes #16191

Change-Id: I56bdf21ee2f775f2a82d0465b3772faf5c24cb61
Reviewed-on: https://go-review.googlesource.com/24496
Run-TryBot: Marcel van Lohuizen <mpvl@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-06-28 15:08:11 +00:00
Brad Fitzpatrick cdcb8271a4 regexp/syntax: clarify that \Z means Perl's \Z
Fixes #14793

Change-Id: I408056d096cd6a999fa5e349704b5ea8e26d2e4e
Reviewed-on: https://go-review.googlesource.com/23201
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-18 04:43:32 +00:00
Emmanuel Odeke 53fd522c0d all: make copyright headers consistent with one space after period
Follows suit with https://go-review.googlesource.com/#/c/20111.

Generated by running
$ grep -R 'Go Authors.  All' * | cut -d":" -f1 | while read F;do perl -pi -e 's/Go
Authors.  All/Go Authors. All/g' $F;done

The code in cmd/internal/unvendor wasn't changed.

Fixes #15213

Change-Id: I4f235cee0a62ec435f9e8540a1ec08ae03b1a75f
Reviewed-on: https://go-review.googlesource.com/21819
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-02 13:43:18 +00:00
Brad Fitzpatrick 5fea2ccc77 all: single space after period.
The tree's pretty inconsistent about single space vs double space
after a period in documentation. Make it consistently a single space,
per earlier decisions. This means contributors won't be confused by
misleading precedence.

This CL doesn't use go/doc to parse. It only addresses // comments.
It was generated with:

$ perl -i -npe 's,^(\s*// .+[a-z]\.)  +([A-Z]),$1 $2,' $(git grep -l -E '^\s*//(.+\.)  +([A-Z])')
$ go test go/doc -update

Change-Id: Iccdb99c37c797ef1f804a94b22ba5ee4b500c4f7
Reviewed-on: https://go-review.googlesource.com/20022
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Dave Day <djd@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-03-02 00:13:47 +00:00
Brad Fitzpatrick 519474451a all: make copyright headers consistent with one space after period
This is a subset of https://golang.org/cl/20022 with only the copyright
header lines, so the next CL will be smaller and more reviewable.

Go policy has been single space after periods in comments for some time.

The copyright header template at:

    https://golang.org/doc/contribute.html#copyright

also uses a single space.

Make them all consistent.

Change-Id: Icc26c6b8495c3820da6b171ca96a74701b4a01b0
Reviewed-on: https://go-review.googlesource.com/20111
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-03-01 23:34:33 +00:00
Nathan VanBenschoten b04f3b06ec all: replace strings.Index with strings.Contains where possible
Change-Id: Ia613f1c37bfce800ece0533a5326fca91d99a66a
Reviewed-on: https://go-review.googlesource.com/18120
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org>
2016-02-19 01:06:05 +00:00
Paul Wankadia 5ccaf0255b regexp/syntax: fix factoring of common prefixes in alternations
In the past, `a.*?c|a.*?b` was factored to `a.*?[bc]`. Thus, given
"abc" as its input string, the automaton would consume "ab" and
then stop (when unanchored) whereas it should consume all of "abc"
as per leftmost semantics.

Fixes #13812.

Change-Id: I67ac0a353d7793b3d0c9c4aaf22d157621dfe784
Reviewed-on: https://go-review.googlesource.com/18357
Reviewed-by: Russ Cox <rsc@golang.org>
2016-01-08 16:41:46 +00:00
Russ Cox 0680e9c0c1 regexp/syntax: fix handling of \Q...\E
It's not a group: must handle the inside as a sequence of literal chars,
not a single literal string.

That is, \Qab\E+ is the same as ab+, not (ab)+.

Fixes #11187.

Change-Id: I5406d05ccf7efff3a7f15395bdb0cfb2bd23a8ed
Reviewed-on: https://go-review.googlesource.com/17233
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2015-12-01 22:45:12 +00:00
Tamir Duberstein 3aa755b8fb regexp/syntax: correctly print `^` BOL and `$` EOL
Fixes #12980.

Change-Id: I936db2f57f7c4dc80bb8ec32715c4c6b7bf0d708
Reviewed-on: https://go-review.googlesource.com/16112
Reviewed-by: Russ Cox <rsc@golang.org>
2015-11-25 17:25:22 +00:00
Josh Bleecher Snyder 2adc4e8927 all: use "reports whether" in place of "returns true if(f)"
Comment changes only.

Change-Id: I56848814564c4aa0988b451df18bebdfc88d6d94
Reviewed-on: https://go-review.googlesource.com/7721
Reviewed-by: Rob Pike <r@golang.org>
2015-03-18 15:14:06 +00:00
Robin Eklind 04c7b68b4a regexp/syntax: Clarify comment of OpAnyCharNotNL.
LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews
https://golang.org/cl/171560043
2014-11-11 18:52:07 -08:00
Ian Lance Taylor 0f022fdd52 regexp/syntax: fix validity testing of zero repeats
This is already tested by TestRE2Exhaustive, but the build has
not broken because that test is not run when using -test.short.

LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/155580043
2014-10-20 08:12:45 -07:00
Russ Cox 85fd0fd7c4 regexp/syntax: regenerate doc.go from re2 syntax
Generated using re2/doc/mksyntaxgo.

Fixes #8505.

LGTM=iant
R=r, iant
CC=golang-codereviews
https://golang.org/cl/155890043
2014-10-06 15:32:11 -04:00
Russ Cox 9b2b0c8c16 regexp/syntax: reject large repetitions created by nesting small ones
Fixes #7609.

LGTM=r
R=r
CC=golang-codereviews
https://golang.org/cl/150270043
2014-09-30 12:08:09 -04:00
Russ Cox c007ce824d build: move package sources from src/pkg to src
Preparation was in CL 134570043.
This CL contains only the effect of 'hg mv src/pkg/* src'.
For more about the move, see golang.org/s/go14nopkg.
2014-09-08 00:08:51 -04:00