mirror of https://github.com/golang/go.git
66 Commits
| Author | SHA1 | Message | Date |
|---|---|---|---|
|
|
bc92fe8b34 |
compress/flate: improve deflate performance by register allocating the index
Use local index variable to help the go compiler use a register to for the hash index instead of continuous memory read and write operations. compress/flate: Encode/Digits/Huffman/1e4-4 35.3µs ± 1% 32.7µs ± 0% -7.48% (p=0.000 n=17+19) Encode/Digits/Huffman/1e5-4 330µs ± 0% 312µs ± 0% -5.55% (p=0.000 n=17+18) Encode/Digits/Huffman/1e6-4 3.30ms ± 0% 3.12ms ± 0% -5.64% (p=0.000 n=18+19) Encode/Digits/Speed/1e4-4 157µs ± 0% 156µs ± 0% -0.41% (p=0.000 n=17+19) Encode/Digits/Speed/1e5-4 1.46ms ± 0% 1.46ms ± 1% ~ (p=0.478 n=20+19) Encode/Digits/Speed/1e6-4 14.4ms ± 0% 14.4ms ± 0% ~ (p=0.835 n=19+20) Encode/Digits/Default/1e4-4 309µs ± 0% 310µs ± 0% +0.23% (p=0.000 n=19+17) Encode/Digits/Default/1e5-4 4.76ms ± 0% 4.76ms ± 0% ~ (p=0.297 n=19+19) Encode/Digits/Default/1e6-4 51.0ms ± 0% 51.0ms ± 1% ~ (p=0.233 n=18+19) Encode/Digits/Compression/1e4-4 309µs ± 0% 310µs ± 0% +0.21% (p=0.000 n=17+20) Encode/Digits/Compression/1e5-4 4.76ms ± 0% 4.76ms ± 0% ~ (p=0.749 n=20+19) Encode/Digits/Compression/1e6-4 50.9ms ± 0% 50.9ms ± 0% ~ (p=0.499 n=18+19) Encode/Newton/Huffman/1e4-4 51.9µs ± 0% 48.0µs ± 0% -7.61% (p=0.000 n=19+19) Encode/Newton/Huffman/1e5-4 396µs ± 0% 377µs ± 0% -4.79% (p=0.000 n=18+19) Encode/Newton/Huffman/1e6-4 3.95ms ± 0% 3.74ms ± 0% -5.21% (p=0.000 n=20+17) Encode/Newton/Speed/1e4-4 155µs ± 0% 154µs ± 0% -0.67% (p=0.000 n=17+18) Encode/Newton/Speed/1e5-4 1.17ms ± 0% 1.16ms ± 0% -0.64% (p=0.000 n=20+16) Encode/Newton/Speed/1e6-4 11.6ms ± 0% 11.5ms ± 0% -0.63% (p=0.000 n=19+20) Encode/Newton/Default/1e4-4 347µs ± 0% 347µs ± 0% ~ (p=0.744 n=20+19) Encode/Newton/Default/1e5-4 5.06ms ± 0% 5.02ms ± 0% -0.77% (p=0.000 n=20+19) Encode/Newton/Default/1e6-4 53.3ms ± 1% 52.8ms ± 0% -0.91% (p=0.000 n=18+16) Encode/Newton/Compression/1e4-4 351µs ± 0% 351µs ± 0% ~ (p=0.277 n=20+20) Encode/Newton/Compression/1e5-4 6.90ms ± 0% 6.85ms ± 0% -0.61% (p=0.000 n=19+18) Encode/Newton/Compression/1e6-4 73.2ms ± 0% 72.8ms ± 0% -0.52% (p=0.000 n=18+18) name old speed new speed delta Encode/Digits/Huffman/1e4-4 283MB/s ± 1% 306MB/s ± 0% +8.09% (p=0.000 n=17+19) Encode/Digits/Huffman/1e5-4 303MB/s ± 0% 321MB/s ± 0% +5.87% (p=0.000 n=18+18) Encode/Digits/Huffman/1e6-4 303MB/s ± 0% 321MB/s ± 0% +5.98% (p=0.000 n=18+19) Encode/Digits/Speed/1e4-4 63.9MB/s ± 0% 64.2MB/s ± 0% +0.41% (p=0.000 n=17+19) Encode/Digits/Speed/1e5-4 68.5MB/s ± 0% 68.4MB/s ± 1% ~ (p=0.481 n=20+19) Encode/Digits/Speed/1e6-4 69.4MB/s ± 0% 69.3MB/s ± 0% ~ (p=0.712 n=19+20) Encode/Digits/Default/1e4-4 32.3MB/s ± 0% 32.3MB/s ± 0% -0.23% (p=0.000 n=19+17) Encode/Digits/Default/1e5-4 21.0MB/s ± 0% 21.0MB/s ± 0% ~ (p=0.460 n=19+19) Encode/Digits/Default/1e6-4 19.6MB/s ± 0% 19.6MB/s ± 1% ~ (p=0.180 n=18+19) Encode/Digits/Compression/1e4-4 32.3MB/s ± 0% 32.3MB/s ± 0% -0.21% (p=0.000 n=17+20) Encode/Digits/Compression/1e5-4 21.0MB/s ± 0% 21.0MB/s ± 0% ~ (p=0.700 n=20+19) Encode/Digits/Compression/1e6-4 19.6MB/s ± 0% 19.6MB/s ± 0% ~ (p=0.486 n=18+19) Encode/Newton/Huffman/1e4-4 193MB/s ± 0% 208MB/s ± 0% +8.23% (p=0.000 n=19+19) Encode/Newton/Huffman/1e5-4 252MB/s ± 0% 265MB/s ± 0% +5.04% (p=0.000 n=18+19) Encode/Newton/Huffman/1e6-4 253MB/s ± 0% 267MB/s ± 0% +5.49% (p=0.000 n=20+17) Encode/Newton/Speed/1e4-4 64.5MB/s ± 0% 65.0MB/s ± 0% +0.67% (p=0.000 n=17+18) Encode/Newton/Speed/1e5-4 85.7MB/s ± 0% 86.3MB/s ± 0% +0.65% (p=0.000 n=20+16) Encode/Newton/Speed/1e6-4 86.2MB/s ± 0% 86.7MB/s ± 0% +0.63% (p=0.000 n=19+20) Encode/Newton/Default/1e4-4 28.9MB/s ± 0% 28.9MB/s ± 0% ~ (p=0.840 n=20+19) Encode/Newton/Default/1e5-4 19.8MB/s ± 0% 19.9MB/s ± 0% +0.78% (p=0.000 n=20+19) Encode/Newton/Default/1e6-4 18.8MB/s ± 1% 18.9MB/s ± 0% +0.93% (p=0.000 n=18+16) Encode/Newton/Compression/1e4-4 28.5MB/s ± 0% 28.5MB/s ± 0% ~ (p=0.244 n=20+20) Encode/Newton/Compression/1e5-4 14.5MB/s ± 0% 14.6MB/s ± 0% +0.61% (p=0.000 n=19+18) Encode/Newton/Compression/1e6-4 13.7MB/s ± 0% 13.7MB/s ± 0% +0.53% (p=0.000 n=18+18) image/png: name old time/op new time/op delta EncodeGray-4 2.16ms ± 1% 1.85ms ± 1% -14.17% (p=0.000 n=86+91) EncodeGrayWithBufferPool-4 1.99ms ± 0% 1.69ms ± 0% -15.09% (p=0.000 n=97+94) EncodeNRGBOpaque-4 6.51ms ± 1% 5.62ms ± 1% -13.66% (p=0.000 n=90+92) EncodeNRGBA-4 7.33ms ± 1% 6.12ms ± 1% -16.49% (p=0.000 n=89+90) EncodePaletted-4 5.10ms ± 1% 4.96ms ± 1% -2.76% (p=0.000 n=90+87) EncodeRGBOpaque-4 6.51ms ± 1% 5.63ms ± 1% -13.49% (p=0.000 n=94+87) EncodeRGBA-4 24.3ms ± 2% 23.0ms ± 0% -5.23% (p=0.000 n=91+89) name old speed new speed delta EncodeGray-4 142MB/s ± 1% 166MB/s ± 1% +16.50% (p=0.000 n=86+91) EncodeGrayWithBufferPool-4 154MB/s ± 0% 182MB/s ± 0% +17.78% (p=0.000 n=97+94) EncodeNRGBOpaque-4 189MB/s ± 1% 219MB/s ± 1% +15.82% (p=0.000 n=90+93) EncodeNRGBA-4 168MB/s ± 1% 201MB/s ± 1% +19.75% (p=0.000 n=89+90) EncodePaletted-4 60.3MB/s ± 1% 62.0MB/s ± 1% +2.84% (p=0.000 n=90+87) EncodeRGBOpaque-4 189MB/s ± 1% 218MB/s ± 1% +15.60% (p=0.000 n=94+87) EncodeRGBA-4 50.6MB/s ± 2% 53.4MB/s ± 0% +5.51% (p=0.000 n=91+89) Change-Id: Ifed4486a7ba19a26abe5cbf2142f15cc7464e84f Reviewed-on: https://go-review.googlesource.com/c/go/+/187837 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com> |
|
|
|
06b0babf31 |
all: shorten some tests
Shorten some of the longest tests that run during all.bash. Removes 7r 50u 21s from all.bash. After this change, all.bash is under 5 minutes again on my laptop. For #26473. Change-Id: Ie0460aa935808d65460408feaed210fbaa1d5d79 Reviewed-on: https://go-review.googlesource.com/c/go/+/177559 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> |
|
|
|
e6ad619ad6 |
cmd/go: further reduce init work
The first biggest offender was crypto/des.init at ~1%. It's cryptographically broken and the init function is relatively expensive, which is unfortunate as both crypto/tls and crypto/x509 (and by extension, cmd/go) import it. Hide the work behind sync.Once. The second biggest offender was flag.sortFlags at just under 1%, used by the Visit flagset methods. It allocated two slices, which made a difference as cmd/go iterates over multiple flagsets during init. Use a single slice with a direct sort.Interface implementation. Another big offender is initializing global maps. Reducing this work in cmd/go/internal/imports and net/textproto gives us close to another whole 1% in saved work. The former can use map literals, and the latter can hide the work behind sync.Once. Finally, compress/flate used newHuffmanBitWriter as part of init, which allocates many objects and slices. Yet it only used one of the slice fields. Allocating just that slice saves a surprising ~0.3%, since we generated a lot of unnecessary garbage. All in all, these little pieces amount to just over 3% saved CPU time. name old time/op new time/op delta ExecGoEnv-8 3.61ms ± 1% 3.50ms ± 0% -3.02% (p=0.000 n=10+10) Updates #26775. Updates #29382. Change-Id: I915416e88a874c63235ba512617c8aef35c0ca8b Reviewed-on: https://go-review.googlesource.com/c/go/+/166459 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> |
|
|
|
fae44a2be3 |
src, misc: apply gofmt
This applies the new gofmt literal normalizations to the library. Change-Id: I8c1e8ef62eb556fc568872c9f77a31ef236348e7 Reviewed-on: https://go-review.googlesource.com/c/162539 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> |
|
|
|
1e708337b2 |
compress/flate: fix the old url for the flate algorithm
Change-Id: I84b74bc96516033bbf4a01f9aa81fe60d5a41355 Reviewed-on: https://go-review.googlesource.com/c/155317 Reviewed-by: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> |
|
|
|
f90e89e675 |
all: fix a bunch of misspellings
Change-Id: If2954bdfc551515403706b2cd0dde94e45936e08
GitHub-Last-Rev:
|
|
|
|
43cd907017 |
Revert "compress: move benchmark text from src/testdata to src/compress/testdata"
This reverts commit
|
|
|
|
067bb443af |
compress: move benchmark text from src/testdata to src/compress/testdata
This text is used mainly for benchmark compression testing, and in one net test. The text was prevoiusly in a src/testdata directory, but since that directory would only include one file, the text is moved to the existing src/compression/testdata directory. This does not cause any change to the benchmark results. Updates #27151 Change-Id: I38ab5089dfe744189a970947d15be50ef1d48517 Reviewed-on: https://go-review.googlesource.com/138495 Run-TryBot: Katie Hockman <katie@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> |
|
|
|
fdceb2a11b |
compress: reduce copies of new text for compression testing
The previous book was 387 KiB decompressed and 119 KiB compressed, the new book is 567 KiB decompressed and 132 KiB compressed. Overall, this change will reduce the release binary size by 196 KiB. The new book will allow for slightly more extensive compression testing with a larger text. Command to run the benchmark tests used with benchstat: `../bin/go test -run='^$' -count=4 -bench=. compress/bzip2 compress/flate` When running the benchmarks locally, changed "Newton" to "Twain" and filtered the tests with the -bench flag to include only those which were relevant to these changes. benchstat results below: name old time/op new time/op delta DecodeTwain-8 19.6ms ± 2% 24.1ms ± 1% +23.04% (p=0.029 n=4+4) Decode/Twain/Huffman/1e4-8 140µs ± 3% 139µs ± 5% ~ (p=0.886 n=4+4) Decode/Twain/Huffman/1e5-8 1.27ms ± 3% 1.26ms ± 1% ~ (p=1.000 n=4+4) Decode/Twain/Huffman/1e6-8 12.4ms ± 0% 13.2ms ± 1% +6.42% (p=0.029 n=4+4) Decode/Twain/Speed/1e4-8 133µs ± 1% 123µs ± 1% -7.35% (p=0.029 n=4+4) Decode/Twain/Speed/1e5-8 1.20ms ± 0% 1.02ms ± 3% -15.32% (p=0.029 n=4+4) Decode/Twain/Speed/1e6-8 12.0ms ± 2% 10.1ms ± 3% -15.89% (p=0.029 n=4+4) Decode/Twain/Default/1e4-8 131µs ± 6% 108µs ± 5% -17.84% (p=0.029 n=4+4) Decode/Twain/Default/1e5-8 1.06ms ± 2% 0.80ms ± 1% -24.97% (p=0.029 n=4+4) Decode/Twain/Default/1e6-8 10.0ms ± 3% 8.0ms ± 3% -20.06% (p=0.029 n=4+4) Decode/Twain/Compression/1e4-8 128µs ± 4% 115µs ± 4% -9.70% (p=0.029 n=4+4) Decode/Twain/Compression/1e5-8 1.04ms ± 2% 0.83ms ± 4% -20.37% (p=0.029 n=4+4) Decode/Twain/Compression/1e6-8 10.4ms ± 4% 8.1ms ± 5% -22.25% (p=0.029 n=4+4) Encode/Twain/Huffman/1e4-8 55.7µs ± 2% 55.6µs ± 1% ~ (p=1.000 n=4+4) Encode/Twain/Huffman/1e5-8 441µs ± 0% 435µs ± 2% ~ (p=0.343 n=4+4) Encode/Twain/Huffman/1e6-8 4.31ms ± 4% 4.30ms ± 4% ~ (p=0.886 n=4+4) Encode/Twain/Speed/1e4-8 193µs ± 1% 166µs ± 2% -14.09% (p=0.029 n=4+4) Encode/Twain/Speed/1e5-8 1.54ms ± 1% 1.22ms ± 1% -20.53% (p=0.029 n=4+4) Encode/Twain/Speed/1e6-8 15.3ms ± 1% 12.2ms ± 3% -20.62% (p=0.029 n=4+4) Encode/Twain/Default/1e4-8 393µs ± 1% 390µs ± 1% ~ (p=0.114 n=4+4) Encode/Twain/Default/1e5-8 6.12ms ± 4% 6.02ms ± 5% ~ (p=0.486 n=4+4) Encode/Twain/Default/1e6-8 69.4ms ± 5% 59.0ms ± 4% -15.07% (p=0.029 n=4+4) Encode/Twain/Compression/1e4-8 423µs ± 2% 379µs ± 2% -10.34% (p=0.029 n=4+4) Encode/Twain/Compression/1e5-8 7.00ms ± 1% 7.88ms ± 3% +12.49% (p=0.029 n=4+4) Encode/Twain/Compression/1e6-8 76.6ms ± 5% 80.9ms ± 3% ~ (p=0.114 n=4+4) name old speed new speed delta DecodeTwain-8 19.8MB/s ± 2% 23.6MB/s ± 1% +18.84% (p=0.029 n=4+4) Decode/Twain/Huffman/1e4-8 71.7MB/s ± 3% 72.1MB/s ± 6% ~ (p=0.943 n=4+4) Decode/Twain/Huffman/1e5-8 78.8MB/s ± 3% 79.5MB/s ± 1% ~ (p=1.000 n=4+4) Decode/Twain/Huffman/1e6-8 80.5MB/s ± 0% 75.6MB/s ± 1% -6.03% (p=0.029 n=4+4) Decode/Twain/Speed/1e4-8 75.2MB/s ± 1% 81.2MB/s ± 1% +7.93% (p=0.029 n=4+4) Decode/Twain/Speed/1e5-8 83.4MB/s ± 0% 98.6MB/s ± 3% +18.16% (p=0.029 n=4+4) Decode/Twain/Speed/1e6-8 83.6MB/s ± 2% 99.5MB/s ± 3% +18.91% (p=0.029 n=4+4) Decode/Twain/Default/1e4-8 76.3MB/s ± 6% 92.8MB/s ± 4% +21.62% (p=0.029 n=4+4) Decode/Twain/Default/1e5-8 94.4MB/s ± 3% 125.7MB/s ± 1% +33.24% (p=0.029 n=4+4) Decode/Twain/Default/1e6-8 100MB/s ± 3% 125MB/s ± 3% +25.12% (p=0.029 n=4+4) Decode/Twain/Compression/1e4-8 78.4MB/s ± 4% 86.8MB/s ± 4% +10.73% (p=0.029 n=4+4) Decode/Twain/Compression/1e5-8 95.7MB/s ± 2% 120.3MB/s ± 4% +25.65% (p=0.029 n=4+4) Decode/Twain/Compression/1e6-8 96.4MB/s ± 4% 124.0MB/s ± 5% +28.64% (p=0.029 n=4+4) Encode/Twain/Huffman/1e4-8 179MB/s ± 2% 180MB/s ± 1% ~ (p=1.000 n=4+4) Encode/Twain/Huffman/1e5-8 227MB/s ± 0% 230MB/s ± 2% ~ (p=0.343 n=4+4) Encode/Twain/Huffman/1e6-8 232MB/s ± 4% 233MB/s ± 4% ~ (p=0.886 n=4+4) Encode/Twain/Speed/1e4-8 51.8MB/s ± 1% 60.4MB/s ± 2% +16.43% (p=0.029 n=4+4) Encode/Twain/Speed/1e5-8 65.1MB/s ± 1% 81.9MB/s ± 1% +25.83% (p=0.029 n=4+4) Encode/Twain/Speed/1e6-8 65.2MB/s ± 1% 82.2MB/s ± 3% +26.00% (p=0.029 n=4+4) Encode/Twain/Default/1e4-8 25.4MB/s ± 1% 25.6MB/s ± 1% ~ (p=0.114 n=4+4) Encode/Twain/Default/1e5-8 16.4MB/s ± 4% 16.6MB/s ± 5% ~ (p=0.486 n=4+4) Encode/Twain/Default/1e6-8 14.4MB/s ± 6% 17.0MB/s ± 4% +17.67% (p=0.029 n=4+4) Encode/Twain/Compression/1e4-8 23.6MB/s ± 2% 26.4MB/s ± 2% +11.54% (p=0.029 n=4+4) Encode/Twain/Compression/1e5-8 14.3MB/s ± 1% 12.7MB/s ± 3% -11.08% (p=0.029 n=4+4) Encode/Twain/Compression/1e6-8 13.1MB/s ± 4% 12.4MB/s ± 3% ~ (p=0.114 n=4+4) name old alloc/op new alloc/op delta DecodeTwain-8 3.63MB ± 0% 3.63MB ± 0% +0.15% (p=0.029 n=4+4) Decode/Twain/Huffman/1e4-8 42.0kB ± 0% 41.3kB ± 0% -1.62% (p=0.029 n=4+4) Decode/Twain/Huffman/1e5-8 43.5kB ± 0% 45.1kB ± 0% +3.74% (p=0.029 n=4+4) Decode/Twain/Huffman/1e6-8 71.7kB ± 0% 80.0kB ± 0% +11.55% (p=0.029 n=4+4) Decode/Twain/Speed/1e4-8 41.2kB ± 0% 41.3kB ± 0% ~ (p=0.286 n=4+4) Decode/Twain/Speed/1e5-8 45.1kB ± 0% 43.9kB ± 0% -2.80% (p=0.029 n=4+4) Decode/Twain/Speed/1e6-8 72.8kB ± 0% 81.3kB ± 0% +11.72% (p=0.029 n=4+4) Decode/Twain/Default/1e4-8 41.2kB ± 0% 41.2kB ± 0% -0.22% (p=0.029 n=4+4) Decode/Twain/Default/1e5-8 44.4kB ± 0% 43.0kB ± 0% -3.02% (p=0.029 n=4+4) Decode/Twain/Default/1e6-8 71.0kB ± 0% 61.8kB ± 0% -13.00% (p=0.029 n=4+4) Decode/Twain/Compression/1e4-8 41.3kB ± 0% 41.2kB ± 0% -0.29% (p=0.029 n=4+4) Decode/Twain/Compression/1e5-8 43.3kB ± 0% 43.0kB ± 0% -0.72% (p=0.029 n=4+4) Decode/Twain/Compression/1e6-8 69.1kB ± 0% 63.7kB ± 0% -7.90% (p=0.029 n=4+4) name old allocs/op new allocs/op delta DecodeTwain-8 51.0 ± 0% 51.2 ± 1% ~ (p=1.000 n=4+4) Decode/Twain/Huffman/1e4-8 15.0 ± 0% 14.0 ± 0% -6.67% (p=0.029 n=4+4) Decode/Twain/Huffman/1e5-8 20.0 ± 0% 23.0 ± 0% +15.00% (p=0.029 n=4+4) Decode/Twain/Huffman/1e6-8 134 ± 0% 161 ± 0% +20.15% (p=0.029 n=4+4) Decode/Twain/Speed/1e4-8 17.0 ± 0% 18.0 ± 0% +5.88% (p=0.029 n=4+4) Decode/Twain/Speed/1e5-8 30.0 ± 0% 31.0 ± 0% +3.33% (p=0.029 n=4+4) Decode/Twain/Speed/1e6-8 193 ± 0% 228 ± 0% +18.13% (p=0.029 n=4+4) Decode/Twain/Default/1e4-8 17.0 ± 0% 15.0 ± 0% -11.76% (p=0.029 n=4+4) Decode/Twain/Default/1e5-8 28.0 ± 0% 32.0 ± 0% +14.29% (p=0.029 n=4+4) Decode/Twain/Default/1e6-8 199 ± 0% 158 ± 0% -20.60% (p=0.029 n=4+4) Decode/Twain/Compression/1e4-8 17.0 ± 0% 15.0 ± 0% -11.76% (p=0.029 n=4+4) Decode/Twain/Compression/1e5-8 28.0 ± 0% 32.0 ± 0% +14.29% (p=0.029 n=4+4) Decode/Twain/Compression/1e6-8 196 ± 0% 150 ± 0% -23.47% (p=0.029 n=4+4) Updates #27151 Change-Id: I6c439694ed16a33bb4c63fbfb8570c7de46b4f2d Reviewed-on: https://go-review.googlesource.com/135495 Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org> Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com> |
|
|
|
161874da2a |
all: update comment URLs from HTTP to HTTPS, where possible
Each URL was manually verified to ensure it did not serve up incorrect content. Change-Id: I4dc846227af95a73ee9a3074d0c379ff0fa955df Reviewed-on: https://go-review.googlesource.com/115798 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> |
|
|
|
d474f582fe |
compress/flate: do not rename math/bits import
Makes compress/flate work better with cmd/dist bootstrap. Change-Id: Ifc7d74027367008e82c1d14ec77141830583ba82 Reviewed-on: https://go-review.googlesource.com/111815 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> |
|
|
|
4984d843d9 |
compress/flate: optimize huffSym
By using local variables and assigning them back to decompressor at the end of huffSym, we allow compiler to keep them in registers and avoid reloading/storing them repeatedly. To archive this, moreBits was inlined and specialized to work with local variables. Also move EOF error conversion to helper function, to make inlined part of moreBits more readable. Together this results in nice speed-up: name old time/op new time/op delta Decode/Digits/Huffman/1e4-6 278µs ± 1% 240µs ± 2% -13.72% (p=0.000 n=10+10) Decode/Digits/Huffman/1e5-6 2.38ms ± 1% 2.05ms ± 1% -14.12% (p=0.000 n=10+10) Decode/Digits/Huffman/1e6-6 23.4ms ± 1% 19.9ms ± 0% -14.69% (p=0.000 n=9+9) Decode/Digits/Speed/1e4-6 280µs ± 2% 254µs ± 1% -9.28% (p=0.000 n=10+9) Decode/Digits/Speed/1e5-6 2.53ms ± 1% 2.35ms ± 1% -7.17% (p=0.000 n=10+10) Decode/Digits/Speed/1e6-6 24.8ms ± 1% 23.0ms ± 1% -7.22% (p=0.000 n=10+10) Decode/Digits/Default/1e4-6 281µs ± 2% 259µs ± 3% -8.03% (p=0.000 n=10+10) Decode/Digits/Default/1e5-6 2.45ms ± 1% 2.30ms ± 1% -6.15% (p=0.000 n=10+10) Decode/Digits/Default/1e6-6 24.1ms ± 1% 22.6ms ± 0% -6.31% (p=0.000 n=9+9) Decode/Digits/Compression/1e4-6 279µs ± 2% 261µs ± 2% -6.53% (p=0.000 n=8+9) Decode/Digits/Compression/1e5-6 2.44ms ± 1% 2.30ms ± 1% -5.72% (p=0.000 n=10+9) Decode/Digits/Compression/1e6-6 24.0ms ± 1% 22.6ms ± 0% -6.10% (p=0.000 n=10+9) Decode/Twain/Huffman/1e4-6 316µs ± 2% 267µs ± 3% -15.30% (p=0.000 n=9+10) Decode/Twain/Huffman/1e5-6 2.62ms ± 0% 2.22ms ± 0% -15.24% (p=0.000 n=10+10) Decode/Twain/Huffman/1e6-6 25.7ms ± 1% 21.8ms ± 0% -15.19% (p=0.000 n=10+10) Decode/Twain/Speed/1e4-6 290µs ± 1% 264µs ± 2% -9.17% (p=0.000 n=9+10) Decode/Twain/Speed/1e5-6 2.35ms ± 1% 2.13ms ± 1% -9.74% (p=0.000 n=9+10) Decode/Twain/Speed/1e6-6 22.9ms ± 0% 20.7ms ± 0% -9.68% (p=0.000 n=10+9) Decode/Twain/Default/1e4-6 270µs ± 2% 252µs ± 2% -6.67% (p=0.000 n=9+10) Decode/Twain/Default/1e5-6 2.02ms ± 1% 1.84ms ± 1% -8.85% (p=0.000 n=10+10) Decode/Twain/Default/1e6-6 19.1ms ± 0% 17.5ms ± 1% -8.73% (p=0.000 n=9+10) Decode/Twain/Compression/1e4-6 272µs ± 1% 250µs ± 4% -8.20% (p=0.000 n=9+10) Decode/Twain/Compression/1e5-6 2.01ms ± 0% 1.84ms ± 1% -8.57% (p=0.000 n=9+10) Decode/Twain/Compression/1e6-6 19.1ms ± 0% 17.4ms ± 1% -8.75% (p=0.000 n=9+10) name old speed new speed delta Decode/Digits/Huffman/1e4-6 35.9MB/s ± 1% 41.7MB/s ± 2% +15.91% (p=0.000 n=10+10) Decode/Digits/Huffman/1e5-6 41.9MB/s ± 1% 48.8MB/s ± 1% +16.44% (p=0.000 n=10+10) Decode/Digits/Huffman/1e6-6 42.8MB/s ± 1% 50.2MB/s ± 0% +17.22% (p=0.000 n=9+9) Decode/Digits/Speed/1e4-6 35.7MB/s ± 2% 39.4MB/s ± 1% +10.22% (p=0.000 n=10+9) Decode/Digits/Speed/1e5-6 39.6MB/s ± 1% 42.6MB/s ± 1% +7.73% (p=0.000 n=10+10) Decode/Digits/Speed/1e6-6 40.3MB/s ± 1% 43.4MB/s ± 1% +7.78% (p=0.000 n=10+10) Decode/Digits/Default/1e4-6 35.6MB/s ± 2% 38.7MB/s ± 2% +8.74% (p=0.000 n=10+10) Decode/Digits/Default/1e5-6 40.9MB/s ± 1% 43.6MB/s ± 1% +6.55% (p=0.000 n=10+10) Decode/Digits/Default/1e6-6 41.5MB/s ± 1% 44.3MB/s ± 0% +6.73% (p=0.000 n=9+9) Decode/Digits/Compression/1e4-6 35.8MB/s ± 2% 38.3MB/s ± 2% +6.99% (p=0.000 n=8+9) Decode/Digits/Compression/1e5-6 40.9MB/s ± 1% 43.4MB/s ± 1% +6.07% (p=0.000 n=10+9) Decode/Digits/Compression/1e6-6 41.6MB/s ± 1% 44.3MB/s ± 0% +6.49% (p=0.000 n=10+9) Decode/Twain/Huffman/1e4-6 31.7MB/s ± 2% 37.4MB/s ± 3% +18.08% (p=0.000 n=9+10) Decode/Twain/Huffman/1e5-6 38.2MB/s ± 0% 45.0MB/s ± 0% +17.97% (p=0.000 n=10+10) Decode/Twain/Huffman/1e6-6 38.9MB/s ± 1% 45.9MB/s ± 0% +17.90% (p=0.000 n=10+10) Decode/Twain/Speed/1e4-6 34.5MB/s ± 1% 38.0MB/s ± 2% +10.11% (p=0.000 n=9+10) Decode/Twain/Speed/1e5-6 42.5MB/s ± 1% 47.0MB/s ± 1% +10.79% (p=0.000 n=9+10) Decode/Twain/Speed/1e6-6 43.7MB/s ± 0% 48.3MB/s ± 0% +10.72% (p=0.000 n=10+9) Decode/Twain/Default/1e4-6 37.1MB/s ± 2% 39.8MB/s ± 2% +7.15% (p=0.000 n=9+10) Decode/Twain/Default/1e5-6 49.5MB/s ± 1% 54.3MB/s ± 1% +9.71% (p=0.000 n=10+10) Decode/Twain/Default/1e6-6 52.3MB/s ± 0% 57.3MB/s ± 1% +9.57% (p=0.000 n=9+10) Decode/Twain/Compression/1e4-6 36.7MB/s ± 1% 40.0MB/s ± 4% +8.96% (p=0.000 n=9+10) Decode/Twain/Compression/1e5-6 49.8MB/s ± 0% 54.5MB/s ± 1% +9.38% (p=0.000 n=9+10) Decode/Twain/Compression/1e6-6 52.3MB/s ± 0% 57.3MB/s ± 1% +9.58% (p=0.000 n=9+10) Change-Id: Iabfd285535ddb210f7f48f33317c6463b5532400 Reviewed-on: https://go-review.googlesource.com/102235 Run-TryBot: Ilya Tocar <ilya.tocar@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> |
|
|
|
d39649087b |
compress/flate: remove non-standard extensions to flate
Some constants were added to flate that seem to be an experimental attempt at increasing the window size. However, according to RFC1951, the largest window size is 32KiB, so these constants are non-standard. Delete them. Fixes #18458 Change-Id: Ia94989637ca031a56bce2548624fa48044caa7b9 Reviewed-on: https://go-review.googlesource.com/60490 Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com> |
|
|
|
bca0320641 |
compress/flate: use math/bits.Reverse8/16 instead of local implementation
No measurable impact on performance (specifically, no degradation). Reverse is used in Huffman en/de-coding. For completeness, here are all the speed-related benchmark results: name old time/op new time/op delta Decode/Digits/Huffman/1e4-8 181µs ± 0% 178µs ± 1% ~ (p=0.100 n=3+3) Decode/Digits/Huffman/1e5-8 1.60ms ± 3% 1.56ms ± 3% ~ (p=0.400 n=3+3) Decode/Digits/Huffman/1e6-8 15.7ms ± 1% 15.3ms ± 3% ~ (p=0.700 n=3+3) Decode/Digits/Speed/1e4-8 179µs ± 0% 180µs ± 0% ~ (p=0.200 n=3+3) Decode/Digits/Speed/1e5-8 1.68ms ± 0% 1.66ms ± 3% ~ (p=0.700 n=3+3) Decode/Digits/Speed/1e6-8 16.6ms ± 2% 16.6ms ± 5% ~ (p=0.700 n=3+3) Decode/Digits/Default/1e4-8 179µs ± 1% 178µs ± 1% ~ (p=0.700 n=3+3) Decode/Digits/Default/1e5-8 1.62ms ± 3% 1.62ms ± 4% ~ (p=1.000 n=3+3) Decode/Digits/Default/1e6-8 16.0ms ± 2% 16.0ms ± 3% ~ (p=1.000 n=3+3) Decode/Digits/Compression/1e4-8 179µs ± 1% 179µs ± 0% ~ (p=0.200 n=3+3) Decode/Digits/Compression/1e5-8 1.62ms ± 2% 1.62ms ± 3% ~ (p=1.000 n=3+3) Decode/Digits/Compression/1e6-8 16.1ms ± 3% 16.0ms ± 3% ~ (p=1.000 n=3+3) Decode/Twain/Huffman/1e4-8 205µs ± 2% 207µs ± 1% ~ (p=1.000 n=3+3) Decode/Twain/Huffman/1e5-8 1.77ms ± 2% 1.77ms ± 4% ~ (p=0.700 n=3+3) Decode/Twain/Huffman/1e6-8 17.4ms ± 2% 17.4ms ± 3% ~ (p=1.000 n=3+3) Decode/Twain/Speed/1e4-8 186µs ± 1% 186µs ± 1% ~ (p=0.400 n=3+3) Decode/Twain/Speed/1e5-8 1.53ms ± 2% 1.52ms ± 0% ~ (p=0.700 n=3+3) Decode/Twain/Speed/1e6-8 14.9ms ± 1% 14.8ms ± 1% ~ (p=1.000 n=3+3) Decode/Twain/Default/1e4-8 176µs ± 1% 174µs ± 0% ~ (p=0.200 n=3+3) Decode/Twain/Default/1e5-8 1.30ms ± 2% 1.31ms ± 1% ~ (p=0.700 n=3+3) Decode/Twain/Default/1e6-8 12.6ms ± 3% 12.5ms ± 0% ~ (p=0.700 n=3+3) Decode/Twain/Compression/1e4-8 177µs ± 0% 174µs ± 1% ~ (p=0.100 n=3+3) Decode/Twain/Compression/1e5-8 1.30ms ± 1% 1.31ms ± 0% ~ (p=0.700 n=3+3) Decode/Twain/Compression/1e6-8 12.5ms ± 1% 12.5ms ± 1% ~ (p=1.000 n=3+3) Encode/Digits/Huffman/1e4-8 47.4µs ± 1% 46.5µs ± 0% ~ (p=0.100 n=3+3) Encode/Digits/Huffman/1e5-8 453µs ± 2% 446µs ± 1% ~ (p=0.700 n=3+3) Encode/Digits/Huffman/1e6-8 4.44ms ± 3% 4.39ms ± 0% ~ (p=1.000 n=3+3) Encode/Digits/Speed/1e4-8 190µs ± 4% 185µs ± 0% ~ (p=0.100 n=3+3) Encode/Digits/Speed/1e5-8 1.78ms ± 5% 1.75ms ± 1% ~ (p=1.000 n=3+3) Encode/Digits/Speed/1e6-8 17.9ms ± 7% 17.3ms ± 1% ~ (p=0.400 n=3+3) Encode/Digits/Default/1e4-8 366µs ± 1% 361µs ± 0% ~ (p=0.200 n=3+3) Encode/Digits/Default/1e5-8 5.58ms ± 5% 5.44ms ± 1% ~ (p=0.400 n=3+3) Encode/Digits/Default/1e6-8 59.0ms ± 3% 58.2ms ± 1% ~ (p=0.700 n=3+3) Encode/Digits/Compression/1e4-8 369µs ± 3% 362µs ± 0% ~ (p=0.100 n=3+3) Encode/Digits/Compression/1e5-8 5.50ms ± 2% 5.47ms ± 1% ~ (p=1.000 n=3+3) Encode/Digits/Compression/1e6-8 59.4ms ± 2% 58.5ms ± 1% ~ (p=0.400 n=3+3) Encode/Twain/Huffman/1e4-8 64.4µs ± 3% 64.7µs ± 1% ~ (p=0.700 n=3+3) Encode/Twain/Huffman/1e5-8 526µs ± 1% 526µs ± 2% ~ (p=1.000 n=3+3) Encode/Twain/Huffman/1e6-8 5.18ms ± 2% 5.17ms ± 1% ~ (p=0.700 n=3+3) Encode/Twain/Speed/1e4-8 206µs ± 1% 204µs ± 0% ~ (p=0.100 n=3+3) Encode/Twain/Speed/1e5-8 1.73ms ± 2% 1.70ms ± 0% ~ (p=0.100 n=3+3) Encode/Twain/Speed/1e6-8 16.7ms ± 0% 16.7ms ± 1% ~ (p=1.000 n=3+3) Encode/Twain/Default/1e4-8 423µs ± 3% 418µs ± 1% ~ (p=1.000 n=3+3) Encode/Twain/Default/1e5-8 6.34ms ± 4% 6.23ms ± 0% ~ (p=1.000 n=3+3) Encode/Twain/Default/1e6-8 68.0ms ± 3% 67.5ms ± 0% ~ (p=0.700 n=3+3) Encode/Twain/Compression/1e4-8 435µs ± 3% 424µs ± 0% ~ (p=0.700 n=3+3) Encode/Twain/Compression/1e5-8 7.01ms ± 1% 6.92ms ± 0% ~ (p=0.100 n=3+3) Encode/Twain/Compression/1e6-8 77.1ms ± 4% 75.5ms ± 1% ~ (p=0.400 n=3+3) name old speed new speed delta Decode/Digits/Huffman/1e4-8 55.2MB/s ± 0% 56.2MB/s ± 1% ~ (p=0.100 n=3+3) Decode/Digits/Huffman/1e5-8 62.4MB/s ± 3% 64.1MB/s ± 3% ~ (p=0.400 n=3+3) Decode/Digits/Huffman/1e6-8 63.8MB/s ± 1% 65.3MB/s ± 3% ~ (p=0.700 n=3+3) Decode/Digits/Speed/1e4-8 55.8MB/s ± 0% 55.4MB/s ± 0% ~ (p=0.200 n=3+3) Decode/Digits/Speed/1e5-8 59.6MB/s ± 0% 60.3MB/s ± 3% ~ (p=0.700 n=3+3) Decode/Digits/Speed/1e6-8 60.1MB/s ± 2% 60.3MB/s ± 4% ~ (p=0.700 n=3+3) Decode/Digits/Default/1e4-8 55.8MB/s ± 1% 56.1MB/s ± 1% ~ (p=0.700 n=3+3) Decode/Digits/Default/1e5-8 61.8MB/s ± 3% 61.7MB/s ± 4% ~ (p=1.000 n=3+3) Decode/Digits/Default/1e6-8 62.4MB/s ± 2% 62.4MB/s ± 3% ~ (p=1.000 n=3+3) Decode/Digits/Compression/1e4-8 55.7MB/s ± 1% 56.0MB/s ± 0% ~ (p=0.300 n=3+3) Decode/Digits/Compression/1e5-8 61.7MB/s ± 2% 61.9MB/s ± 3% ~ (p=1.000 n=3+3) Decode/Digits/Compression/1e6-8 62.2MB/s ± 3% 62.6MB/s ± 3% ~ (p=1.000 n=3+3) Decode/Twain/Huffman/1e4-8 48.8MB/s ± 2% 48.4MB/s ± 1% ~ (p=1.000 n=3+3) Decode/Twain/Huffman/1e5-8 56.4MB/s ± 2% 56.6MB/s ± 4% ~ (p=0.700 n=3+3) Decode/Twain/Huffman/1e6-8 57.6MB/s ± 2% 57.5MB/s ± 3% ~ (p=1.000 n=3+3) Decode/Twain/Speed/1e4-8 53.7MB/s ± 1% 53.9MB/s ± 1% ~ (p=0.400 n=3+3) Decode/Twain/Speed/1e5-8 65.5MB/s ± 2% 65.6MB/s ± 0% ~ (p=0.700 n=3+3) Decode/Twain/Speed/1e6-8 66.9MB/s ± 1% 67.4MB/s ± 1% ~ (p=1.000 n=3+3) Decode/Twain/Default/1e4-8 56.9MB/s ± 1% 57.3MB/s ± 0% ~ (p=0.200 n=3+3) Decode/Twain/Default/1e5-8 77.2MB/s ± 2% 76.6MB/s ± 1% ~ (p=0.700 n=3+3) Decode/Twain/Default/1e6-8 79.3MB/s ± 3% 80.0MB/s ± 0% ~ (p=0.700 n=3+3) Decode/Twain/Compression/1e4-8 56.4MB/s ± 0% 57.5MB/s ± 1% ~ (p=0.100 n=3+3) Decode/Twain/Compression/1e5-8 76.8MB/s ± 1% 76.5MB/s ± 0% ~ (p=0.700 n=3+3) Decode/Twain/Compression/1e6-8 80.1MB/s ± 1% 79.8MB/s ± 1% ~ (p=1.000 n=3+3) Encode/Digits/Huffman/1e4-8 211MB/s ± 1% 215MB/s ± 0% ~ (p=0.100 n=3+3) Encode/Digits/Huffman/1e5-8 221MB/s ± 2% 224MB/s ± 1% ~ (p=0.700 n=3+3) Encode/Digits/Huffman/1e6-8 225MB/s ± 3% 228MB/s ± 0% ~ (p=1.000 n=3+3) Encode/Digits/Speed/1e4-8 52.8MB/s ± 4% 54.1MB/s ± 0% ~ (p=0.100 n=3+3) Encode/Digits/Speed/1e5-8 56.2MB/s ± 5% 57.0MB/s ± 1% ~ (p=1.000 n=3+3) Encode/Digits/Speed/1e6-8 56.0MB/s ± 6% 57.7MB/s ± 1% ~ (p=0.400 n=3+3) Encode/Digits/Default/1e4-8 27.3MB/s ± 1% 27.7MB/s ± 0% ~ (p=0.200 n=3+3) Encode/Digits/Default/1e5-8 17.9MB/s ± 4% 18.4MB/s ± 1% ~ (p=0.400 n=3+3) Encode/Digits/Default/1e6-8 17.0MB/s ± 3% 17.2MB/s ± 1% ~ (p=0.500 n=3+3) Encode/Digits/Compression/1e4-8 27.1MB/s ± 3% 27.6MB/s ± 0% ~ (p=0.100 n=3+3) Encode/Digits/Compression/1e5-8 18.2MB/s ± 2% 18.3MB/s ± 1% ~ (p=1.000 n=3+3) Encode/Digits/Compression/1e6-8 16.9MB/s ± 2% 17.1MB/s ± 1% ~ (p=0.400 n=3+3) Encode/Twain/Huffman/1e4-8 155MB/s ± 3% 155MB/s ± 1% ~ (p=0.700 n=3+3) Encode/Twain/Huffman/1e5-8 190MB/s ± 1% 190MB/s ± 2% ~ (p=1.000 n=3+3) Encode/Twain/Huffman/1e6-8 193MB/s ± 2% 193MB/s ± 1% ~ (p=0.700 n=3+3) Encode/Twain/Speed/1e4-8 48.5MB/s ± 1% 49.1MB/s ± 0% ~ (p=0.100 n=3+3) Encode/Twain/Speed/1e5-8 57.7MB/s ± 2% 59.0MB/s ± 0% ~ (p=0.100 n=3+3) Encode/Twain/Speed/1e6-8 59.7MB/s ± 0% 59.7MB/s ± 1% ~ (p=1.000 n=3+3) Encode/Twain/Default/1e4-8 23.6MB/s ± 3% 23.9MB/s ± 1% ~ (p=1.000 n=3+3) Encode/Twain/Default/1e5-8 15.8MB/s ± 4% 16.1MB/s ± 0% ~ (p=1.000 n=3+3) Encode/Twain/Default/1e6-8 14.7MB/s ± 3% 14.8MB/s ± 0% ~ (p=0.700 n=3+3) Encode/Twain/Compression/1e4-8 23.0MB/s ± 3% 23.6MB/s ± 0% ~ (p=0.700 n=3+3) Encode/Twain/Compression/1e5-8 14.3MB/s ± 1% 14.5MB/s ± 0% ~ (p=0.100 n=3+3) Encode/Twain/Compression/1e6-8 13.0MB/s ± 4% 13.2MB/s ± 1% ~ (p=0.400 n=3+3) Measured on a "quiet" (no browser running) 2.3 GHz Intel Core i7, running macOS 10.12.3. See also #19279. Change-Id: Ice759eb34eb37442b543957447c264e0aadc1fa9 Reviewed-on: https://go-review.googlesource.com/37460 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> |
|
|
|
9c3630f578 |
compress/flate: avoid large stack growth in fillDeflate
Ranging over an array causes the array to be copied over to the stack, which cause large re-growths. Instead, we should iterate over slices of the array. Also, assigning a large struct literal uses the stack even though the actual fields being populated are small in comparison to the entirety of the struct (see #18636). Fixing the stack growth does not alter CPU-time performance much since the stack-growth and copying was such a tiny portion of the compression work: name old time/op new time/op delta Encode/Digits/Default/1e4-8 332µs ± 1% 332µs ± 1% ~ (p=0.796 n=10+10) Encode/Digits/Default/1e5-8 5.07ms ± 2% 5.05ms ± 1% ~ (p=0.815 n=9+8) Encode/Digits/Default/1e6-8 53.7ms ± 1% 53.9ms ± 1% ~ (p=0.075 n=10+10) Encode/Twain/Default/1e4-8 380µs ± 1% 380µs ± 1% ~ (p=0.684 n=10+10) Encode/Twain/Default/1e5-8 5.79ms ± 2% 5.79ms ± 1% ~ (p=0.497 n=9+10) Encode/Twain/Default/1e6-8 61.5ms ± 1% 61.8ms ± 1% ~ (p=0.247 n=10+10) name old speed new speed delta Encode/Digits/Default/1e4-8 30.1MB/s ± 1% 30.1MB/s ± 1% ~ (p=0.753 n=10+10) Encode/Digits/Default/1e5-8 19.7MB/s ± 2% 19.8MB/s ± 1% ~ (p=0.795 n=9+8) Encode/Digits/Default/1e6-8 18.6MB/s ± 1% 18.5MB/s ± 1% ~ (p=0.072 n=10+10) Encode/Twain/Default/1e4-8 26.3MB/s ± 1% 26.3MB/s ± 1% ~ (p=0.616 n=10+10) Encode/Twain/Default/1e5-8 17.3MB/s ± 2% 17.3MB/s ± 1% ~ (p=0.484 n=9+10) Encode/Twain/Default/1e6-8 16.3MB/s ± 1% 16.2MB/s ± 1% ~ (p=0.238 n=10+10) Updates #18636 Fixes #18625 Change-Id: I471b20339bf675f63dc56d38b3acdd824fe23328 Reviewed-on: https://go-review.googlesource.com/35122 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> |
|
|
|
091ba60bd8 |
compress/flate: add examples
Updates #16360 Change-Id: I66ff23e0501363f58fe891d5e95806422071f93b Reviewed-on: https://go-review.googlesource.com/30162 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> |
|
|
|
2341631506 |
all: sprinkle t.Parallel on some slow tests
I used the slowtests.go tool as described in https://golang.org/cl/32684 on packages that stood out. go test -short std drops from ~56 to ~52 seconds. This isn't a huge win, but it was mostly an exercise. Updates #17751 Change-Id: I9f3402e36a038d71e662d06ce2c1d52f6c4b674d Reviewed-on: https://go-review.googlesource.com/32751 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> |
|
|
|
590fce4884 |
compress/flate: tighten the BestSpeed max match offset bound.
Previously, we were off by one. Also fix a comment typo. Change-Id: Ib94d23acc56d5fccd44144f71655481f98803ac8 Reviewed-on: https://go-review.googlesource.com/32149 Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com> Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> |
|
|
|
461adfd817 |
compress/flate: make compression level 0 consistent
Tests for determinism was not working as intended since io.Copybuffer uses the io.WriterTo if available. This exposed that level 0 (no compression) changed output based on the number of writes and buffers given to the writer. Previously, Write would emit a new raw block (BTYPE=00) for every non-empty call to Write. This CL fixes it such that a raw block is only emitted upon the following conditions: * A full window is obtained (every 65535 bytes) * Flush is called * Close is called Change-Id: I807f866d97e2db7820f11febab30a96266a6cbf1 Reviewed-on: https://go-review.googlesource.com/31174 Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com> |
|
|
|
2e196b15b9 |
compress/flate: level 1 (best speed) match across blocks
This change makes deflate level 1 (best speed) match across block boundaries. This comes at a small speed penalty, but improves compression on almost all output. Sample numbers on various content types: enwik9: 391052014 -> 382578469 bytes, 77.59 -> 74.28 MB/s adresser.001: 57269799 -> 47756095 bytes, 287.84 -> 357.86 MB/s 10gb: 5233055166 -> 5198328382 bytes, 105.85 -> 96.99 MB/s rawstudio-mint14: 3972329211 -> 3927423364 bytes, 100.07 -> 94.22 MB/s sites: 165556800 -> 163178702 bytes, 72.31 -> 70.15 MB/s objectfiles: 115962472 -> 111649524 bytes, 132.60 -> 128.05 MB/s sharnd.out: 200015283 -> 200015283 bytes, 221.50 -> 218.83 MB/s Change-Id: I62a139e5c06976e803439a4268acede5139b8cfc Reviewed-on: https://go-review.googlesource.com/31640 Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com> Reviewed-by: Nigel Tao <nigeltao@golang.org> |
|
|
|
c1cd64d0ac |
compress/flate: use correct table for size estimation
The incorrect table was used for estimating output size. This can give suboptimal selection of entropy encoder in rare cases. Change-Id: I8b358200f2d1f9a3f9b79a44269d7be704e1d2d9 Reviewed-on: https://go-review.googlesource.com/31172 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> |
|
|
|
960016eca2 |
compress/flate: clarify the behavior of Writer.Flush
Fixes #16068 Change-Id: I04e80a181c0b7356996f7a1158ea4895ff9e1e39 Reviewed-on: https://go-review.googlesource.com/28477 Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> |
|
|
|
d309246462 |
compress/flate: always return uncompressed data in the event of error
In the event of an unexpected error, we should always flush available decompressed data to the user. Fixes #16924 Change-Id: I0bc0824c3201f3149e84e6a26e3dbcba72a1aae5 Reviewed-on: https://go-review.googlesource.com/28216 Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com> |
|
|
|
8cd04da762 |
compress/flate: make huffmanBitWriter errors persistent
For persistent error handling, the methods of huffmanBitWriter have to be consistent about how they check errors. It must either consistently check error *before* every operation OR immediately *after* every operation. Since most of the current logic uses the previous approach, we apply the same style of error checking to writeBits and all calls to Write such that they only operate if w.err is already nil going into them. The error handling approach is brittle and easily broken by future commits to the code. In the near future, we should switch the logic to use panic at the lowest levels and a recover at the edge of the public API to ensure that errors are always persistent. Fixes #16749 Change-Id: Ie1d83e4ed8842f6911a31e23311cd3cbf38abe8c Reviewed-on: https://go-review.googlesource.com/27200 Reviewed-by: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> |
|
|
|
d0256118de |
compress/flate: document HuffmanOnly
Fixes #16489 Change-Id: I13e2ed6de59102f977566de637d8d09b4e541980 Reviewed-on: https://go-review.googlesource.com/25200 Reviewed-by: Andrew Gerrand <adg@golang.org> Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> |
|
|
|
0ce100dc96 |
compress/flate: don't ignore dict in Reader.Reset
Fixes #16162. Change-Id: I6f4ae906630079ef5fc29ee5f70e2e3d1c962170 Reviewed-on: https://go-review.googlesource.com/24390 Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> |
|
|
|
7cd6cae6a6 |
compress/flate: use seperate const block for exported constants
As rendered on https://tip.golang.org/pkg/compress/flate/, there is an extra new-line because of the unexported constants in the same block. <<< const ( NoCompression = 0 BestSpeed = 1 BestCompression = 9 DefaultCompression = -1 HuffmanOnly = -2 // Disables match search and only does Huffman entropy reduction. ) >>> Instead, seperate the exported compression level constants into its own const block. This is both more readable and also fixes the issue. Change-Id: I60b7966c83fb53356c02e4640d05f55a3bee35b7 Reviewed-on: https://go-review.googlesource.com/23557 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> |
|
|
|
d2aa5f95cc |
compress/flate: simplify using subtests and sub-benchmarks
This causes the large files to be loaded only once per benchmark. This CL also serves as an example use case of sub(tests|-benchmarks). This CL ensures that names are identical to the original except for an added slashes. Things could be simplified further if this restriction were dropped. Change-Id: I45e303e158e3152e33d0d751adfef784713bf997 Reviewed-on: https://go-review.googlesource.com/23420 Reviewed-by: Russ Cox <rsc@golang.org> Run-TryBot: Marcel van Lohuizen <mpvl@golang.org> Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> |
|
|
|
27d0d849fe |
compress/flate: distinguish between base and min match length.
Change-Id: I93db5cd86e3fb568e4444cad95268ba4a02ce8a0 Reviewed-on: https://go-review.googlesource.com/22787 Reviewed-by: Nigel Tao <nigeltao@golang.org> |
|
|
|
1fb4e4de26 |
compress/flate: use a constant hash table size for Best Speed.
This makes compress/flate's version of Snappy diverge from the upstream golang/snappy version, but the latter has a goal of matching C++ snappy output byte-for-byte. Both C++ and the asm version of golang/snappy can use a smaller N for the O(N) zero-initialization of the hash table when the input is small, even if the pure Go golang/snappy algorithm cannot: "var table [tableSize]uint16" zeroes all tableSize elements. For this package, we don't have the match-C++-snappy goal, so we can use a different (constant) hash table size. This is a small win, in terms of throughput and output size, but it also enables us to re-use the (constant size) hash table between encodeBestSpeed calls, avoiding the cost of zero-initializing the hash table altogether. This will be implemented in follow-up commits. This package's benchmarks: name old speed new speed delta EncodeDigitsSpeed1e4-8 72.8MB/s ± 1% 73.5MB/s ± 1% +0.86% (p=0.000 n=10+10) EncodeDigitsSpeed1e5-8 77.5MB/s ± 1% 78.0MB/s ± 0% +0.69% (p=0.000 n=10+10) EncodeDigitsSpeed1e6-8 82.0MB/s ± 1% 82.7MB/s ± 1% +0.85% (p=0.000 n=10+9) EncodeTwainSpeed1e4-8 65.1MB/s ± 1% 65.6MB/s ± 0% +0.78% (p=0.000 n=10+9) EncodeTwainSpeed1e5-8 80.0MB/s ± 0% 80.6MB/s ± 1% +0.66% (p=0.000 n=9+10) EncodeTwainSpeed1e6-8 81.6MB/s ± 1% 82.1MB/s ± 1% +0.55% (p=0.017 n=10+10) Input size in bytes, output size (and time taken) before and after on some larger files: 1073741824 57269781 ( 3183ms) 57269781 ( 3177ms) adresser.001 1000000000 391052000 ( 11071ms) 391051996 ( 11067ms) enwik9 1911399616 378679516 ( 13450ms) 378679514 ( 13079ms) gob-stream 8558382592 3972329193 ( 99962ms) 3972329193 ( 91290ms) rawstudio-mint14.tar 200000000 200015265 ( 776ms) 200015265 ( 774ms) sharnd.out Thanks to Klaus Post for the original suggestion on cl/21021. Change-Id: Ia4c63a8d1b92c67e1765ec5c3c8c69d289d9a6ce Reviewed-on: https://go-review.googlesource.com/22604 Reviewed-by: Russ Cox <rsc@golang.org> |
|
|
|
d8b7bd6a1f |
compress/flate: replace "Best Speed" with specialized version
This encoding algorithm, which prioritizes speed over output size, is
based on Snappy's LZ77-style encoder: github.com/golang/snappy
This commit keeps the diff between this package's encodeBestSpeed
function and and Snappy's encodeBlock function as small as possible (see
the diff below). Follow-up commits will improve this package's
performance and output size.
This package's speed benchmarks:
name old speed new speed delta
EncodeDigitsSpeed1e4-8 40.7MB/s ± 0% 73.0MB/s ± 0% +79.18% (p=0.008 n=5+5)
EncodeDigitsSpeed1e5-8 33.0MB/s ± 0% 77.3MB/s ± 1% +134.04% (p=0.008 n=5+5)
EncodeDigitsSpeed1e6-8 32.1MB/s ± 0% 82.1MB/s ± 0% +156.18% (p=0.008 n=5+5)
EncodeTwainSpeed1e4-8 42.1MB/s ± 0% 65.0MB/s ± 0% +54.61% (p=0.008 n=5+5)
EncodeTwainSpeed1e5-8 46.3MB/s ± 0% 80.0MB/s ± 0% +72.81% (p=0.008 n=5+5)
EncodeTwainSpeed1e6-8 47.3MB/s ± 0% 81.7MB/s ± 0% +72.86% (p=0.008 n=5+5)
Here's the milliseconds taken, before and after this commit, to compress
a number of test files:
Go's src/compress/testdata files:
4 1 e.txt
8 4 Mark.Twain-Tom.Sawyer.txt
github.com/golang/snappy's benchmark files:
3 1 alice29.txt
12 3 asyoulik.txt
6 1 fireworks.jpeg
1 1 geo.protodata
1 0 html
2 2 html_x_4
6 3 kppkn.gtb
11 4 lcet10.txt
5 1 paper-100k.pdf
14 6 plrabn12.txt
17 6 urls.10K
Larger files linked to from
https://docs.google.com/spreadsheets/d/1VLxi-ac0BAtf735HyH3c1xRulbkYYUkFecKdLPH7NIQ/edit#gid=166102500
2409 3182 adresser.001
16757 11027 enwik9
13764 12946 gob-stream
153978 74317 rawstudio-mint14.tar
4371 770 sharnd.out
Output size is larger. In the table below, the first column is the input
size, the second column is the output size prior to this commit, the
third column is the output size after this commit.
100003 47707 50006 e.txt
387851 172707 182930 Mark.Twain-Tom.Sawyer.txt
152089 62457 66705 alice29.txt
125179 54503 57274 asyoulik.txt
123093 122827 123108 fireworks.jpeg
118588 18574 20558 geo.protodata
102400 16601 17305 html
409600 65506 70313 html_x_4
184320 49007 50944 kppkn.gtb
426754 166957 179355 lcet10.txt
102400 82126 84937 paper-100k.pdf
481861 218617 231988 plrabn12.txt
702087 241774 258020 urls.10K
1073741824 43074110 57269781 adresser.001
1000000000 365772256 391052000 enwik9
1911399616 340364558 378679516 gob-stream
8558382592 3807229562 3972329193 rawstudio-mint14.tar
200000000 200061040 200015265 sharnd.out
The diff between github.com/golang/snappy's encodeBlock function and
this commit's encodeBestSpeed function:
1c1,7
< func encodeBlock(dst, src []byte) (d int) {
---
> func encodeBestSpeed(dst []token, src []byte) []token {
> // This check isn't in the Snappy implementation, but there, the caller
> // instead of the callee handles this case.
> if len(src) < minNonLiteralBlockSize {
> return emitLiteral(dst, src)
> }
>
4c10
< // and len(src) <= maxBlockSize and maxBlockSize == 65536.
---
> // and len(src) <= maxStoreBlockSize and maxStoreBlockSize == 65535.
65c71
< if load32(src, s) == load32(src, candidate) {
---
> if s-candidate < maxOffset && load32(src, s) == load32(src, candidate) {
73c79
< d += emitLiteral(dst[d:], src[nextEmit:s])
---
> dst = emitLiteral(dst, src[nextEmit:s])
90c96
< // This is an inlined version of:
---
> // This is an inlined version of Snappy's:
93c99,103
< for i := candidate + 4; s < len(src) && src[i] == src[s]; i, s = i+1, s+1 {
---
> s1 := base + maxMatchLength
> if s1 > len(src) {
> s1 = len(src)
> }
> for i := candidate + 4; s < s1 && src[i] == src[s]; i, s = i+1, s+1 {
96c106,107
< d += emitCopy(dst[d:], base-candidate, s-base)
---
> // matchToken is flate's equivalent of Snappy's emitCopy.
> dst = append(dst, matchToken(uint32(s-base-3), uint32(base-candidate-minOffsetSize)))
114c125
< if uint32(x>>8) != load32(src, candidate) {
---
> if s-candidate >= maxOffset || uint32(x>>8) != load32(src, candidate) {
124c135
< d += emitLiteral(dst[d:], src[nextEmit:])
---
> dst = emitLiteral(dst, src[nextEmit:])
126c137
< return d
---
> return dst
This change is based on https://go-review.googlesource.com/#/c/21021/ by
Klaus Post, but it is a separate changelist as cl/21021 seems to have
stalled in code review, and the Go 1.7 feature freeze approaches.
Golang-dev discussion:
https://groups.google.com/d/topic/golang-dev/XYgHX9p8IOk/discussion and
of course cl/21021.
Change-Id: Ib662439417b3bd0b61c2977c12c658db3e44d164
Reviewed-on: https://go-review.googlesource.com/22370
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
|
|
6ec481b06c |
compress/flate: use uncompressed if dynamic encoding is larger
This adds size calculation to "dynamic" writes. This ensures that if dynamic Huffman encoding is bigger, or only slightly smaller than raw data, the block is written uncompressed. To minimize the code duplication of this function, the size calculation has been moved to separate functions. Since I was modifying these calculations, I changed "int64" size calculations to "int". Blocks are of very limited size, so there is not any risk of overflows. This should mainly improve 32 bit performance, but amd64 also gets a slight boost: name old time/op new time/op delta EncodeDigitsHuffman1e4-8 49.9µs ± 1% 49.3µs ± 1% -1.21% (p=0.000 n=10+10) EncodeDigitsHuffman1e5-8 476µs ± 1% 471µs ± 3% ~ (p=0.218 n=10+10) EncodeDigitsHuffman1e6-8 4.80ms ± 2% 4.75ms ± 2% ~ (p=0.243 n=10+9) EncodeDigitsSpeed1e4-8 305µs ± 3% 300µs ± 1% -1.86% (p=0.005 n=10+10) EncodeDigitsSpeed1e5-8 3.67ms ± 2% 3.58ms ± 1% -2.29% (p=0.000 n=9+8) EncodeDigitsSpeed1e6-8 38.3ms ± 2% 37.0ms ± 1% -3.45% (p=0.000 n=9+9) EncodeDigitsDefault1e4-8 361µs ± 2% 353µs ± 1% -2.21% (p=0.000 n=10+10) EncodeDigitsDefault1e5-8 5.24ms ± 2% 5.19ms ± 2% ~ (p=0.105 n=10+10) EncodeDigitsDefault1e6-8 56.5ms ± 3% 55.1ms ± 1% -2.42% (p=0.001 n=10+10) EncodeDigitsCompress1e4-8 362µs ± 2% 358µs ± 2% ~ (p=0.123 n=10+10) EncodeDigitsCompress1e5-8 5.26ms ± 3% 5.20ms ± 1% ~ (p=0.089 n=10+10) EncodeDigitsCompress1e6-8 56.0ms ± 4% 55.0ms ± 1% ~ (p=0.065 n=10+9) EncodeTwainHuffman1e4-8 70.9µs ± 3% 67.6µs ± 2% -4.59% (p=0.000 n=10+10) EncodeTwainHuffman1e5-8 556µs ± 2% 533µs ± 1% -4.20% (p=0.000 n=10+10) EncodeTwainHuffman1e6-8 5.54ms ± 3% 5.29ms ± 1% -4.37% (p=0.000 n=10+9) EncodeTwainSpeed1e4-8 294µs ± 3% 293µs ± 1% ~ (p=0.965 n=10+8) EncodeTwainSpeed1e5-8 2.59ms ± 2% 2.56ms ± 1% ~ (p=0.353 n=10+10) EncodeTwainSpeed1e6-8 25.6ms ± 1% 24.9ms ± 1% -2.62% (p=0.000 n=9+10) EncodeTwainDefault1e4-8 419µs ± 2% 417µs ± 1% ~ (p=0.780 n=10+9) EncodeTwainDefault1e5-8 6.23ms ± 4% 6.16ms ± 1% ~ (p=0.218 n=10+10) EncodeTwainDefault1e6-8 66.2ms ± 2% 65.7ms ± 1% ~ (p=0.529 n=10+10) EncodeTwainCompress1e4-8 426µs ± 1% 428µs ± 2% ~ (p=0.549 n=9+10) EncodeTwainCompress1e5-8 6.80ms ± 1% 6.85ms ± 3% ~ (p=0.156 n=9+10) EncodeTwainCompress1e6-8 74.6ms ± 3% 73.8ms ± 2% ~ (p=0.280 n=10+10) name old speed new speed delta EncodeDigitsHuffman1e4-8 200MB/s ± 1% 203MB/s ± 1% +1.23% (p=0.000 n=10+10) EncodeDigitsHuffman1e5-8 210MB/s ± 1% 212MB/s ± 3% ~ (p=0.356 n=10+9) EncodeDigitsHuffman1e6-8 208MB/s ± 2% 210MB/s ± 2% ~ (p=0.243 n=10+9) EncodeDigitsSpeed1e4-8 32.8MB/s ± 3% 33.4MB/s ± 1% +1.88% (p=0.005 n=10+10) EncodeDigitsSpeed1e5-8 27.2MB/s ± 2% 27.9MB/s ± 1% +2.60% (p=0.000 n=10+8) EncodeDigitsSpeed1e6-8 26.1MB/s ± 2% 27.0MB/s ± 1% +3.56% (p=0.000 n=9+9) EncodeDigitsDefault1e4-8 27.7MB/s ± 2% 28.4MB/s ± 1% +2.24% (p=0.000 n=10+10) EncodeDigitsDefault1e5-8 19.1MB/s ± 2% 19.3MB/s ± 2% ~ (p=0.101 n=10+10) EncodeDigitsDefault1e6-8 17.7MB/s ± 3% 18.1MB/s ± 1% +2.46% (p=0.001 n=10+10) EncodeDigitsCompress1e4-8 27.6MB/s ± 2% 27.9MB/s ± 2% ~ (p=0.119 n=10+10) EncodeDigitsCompress1e5-8 19.0MB/s ± 3% 19.2MB/s ± 1% ~ (p=0.085 n=10+10) EncodeDigitsCompress1e6-8 17.9MB/s ± 4% 18.1MB/s ± 3% ~ (p=0.110 n=10+10) EncodeTwainHuffman1e4-8 141MB/s ± 3% 148MB/s ± 2% +4.79% (p=0.000 n=10+10) EncodeTwainHuffman1e5-8 180MB/s ± 2% 188MB/s ± 1% +4.38% (p=0.000 n=10+10) EncodeTwainHuffman1e6-8 181MB/s ± 3% 189MB/s ± 1% +4.54% (p=0.000 n=10+9) EncodeTwainSpeed1e4-8 34.0MB/s ± 3% 34.1MB/s ± 1% ~ (p=0.948 n=10+8) EncodeTwainSpeed1e5-8 38.7MB/s ± 2% 39.0MB/s ± 1% ~ (p=0.353 n=10+10) EncodeTwainSpeed1e6-8 39.1MB/s ± 1% 40.1MB/s ± 1% +2.68% (p=0.000 n=9+10) EncodeTwainDefault1e4-8 23.9MB/s ± 2% 24.0MB/s ± 1% ~ (p=0.734 n=10+9) EncodeTwainDefault1e5-8 16.0MB/s ± 4% 16.2MB/s ± 1% ~ (p=0.210 n=10+10) EncodeTwainDefault1e6-8 15.1MB/s ± 2% 15.2MB/s ± 1% ~ (p=0.515 n=10+10) EncodeTwainCompress1e4-8 23.5MB/s ± 1% 23.4MB/s ± 2% ~ (p=0.536 n=9+10) EncodeTwainCompress1e5-8 14.7MB/s ± 1% 14.6MB/s ± 3% ~ (p=0.138 n=9+10) EncodeTwainCompress1e6-8 13.4MB/s ± 3% 13.5MB/s ± 2% ~ (p=0.239 n=10+10) This improves "random input" to the dynamic writer, which is why the test data is updated. The output size goes from 1051 to 1005 bytes. Change-Id: I3ee11d2d2511b277d2dd16734aeea07c98bca450 Reviewed-on: https://go-review.googlesource.com/21757 Reviewed-by: Joe Tsai <joetsai@digital-static.net> Run-TryBot: Joe Tsai <joetsai@digital-static.net> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Nigel Tao <nigeltao@golang.org> |
|
|
|
0da4dbe232 |
all: remove unnecessary type conversions
cmd and runtime were handled separately, and I'm intentionally skipped syscall. This is the rest of the standard library. CL generated mechanically with github.com/mdempsky/unconvert. Change-Id: I9e0eff886974dedc37adb93f602064b83e469122 Reviewed-on: https://go-review.googlesource.com/22104 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> |
|
|
|
80e7dddffa |
compress/flate: fix a fmt.Fprintf style nit in a test.
It's not a big deal (the for loop drops from 130-ish to 120-ish milliseconds for me) but it's not a big change either. Change-Id: I161a49caab5cae5a2b87866ed1dfb93627be8013 Reviewed-on: https://go-review.googlesource.com/22110 Reviewed-by: Klaus Post <klauspost@gmail.com> Reviewed-by: Nigel Tao <nigeltao@golang.org> |
|
|
|
a56f5a0322 |
compress/flate: improve short writer error test
This improves the short version of the writer test. First of all, it has a much quicker setup. Previously that could take up towards 0.5 second. Secondly, it will test all compression levels in short mode as well. Execution time is 1.7s/0.03s for normal/short mode. Change-Id: I275a21f712daff6f7125cc6a493415e86439cb19 Reviewed-on: https://go-review.googlesource.com/21800 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> |
|
|
|
f20b1809f2 |
compress/flate: eliminate most common bounds checks
This uses the SSA compiler to eliminate various unneeded bounds checks in loops and various lookups. This fixes the low hanging fruit, without any major code changes. name old time/op new time/op delta EncodeDigitsHuffman1e4-8 49.9µs ± 1% 48.1µs ± 1% -3.74% (p=0.000 n=10+9) EncodeDigitsHuffman1e5-8 476µs ± 1% 458µs ± 1% -3.58% (p=0.000 n=10+10) EncodeDigitsHuffman1e6-8 4.80ms ± 2% 4.56ms ± 1% -5.07% (p=0.000 n=10+9) EncodeDigitsSpeed1e4-8 305µs ± 3% 290µs ± 2% -5.03% (p=0.000 n=10+9) EncodeDigitsSpeed1e5-8 3.67ms ± 2% 3.49ms ± 2% -4.78% (p=0.000 n=9+10) EncodeDigitsSpeed1e6-8 38.3ms ± 2% 35.8ms ± 1% -6.58% (p=0.000 n=9+10) EncodeDigitsDefault1e4-8 361µs ± 2% 346µs ± 3% -4.12% (p=0.000 n=10+9) EncodeDigitsDefault1e5-8 5.24ms ± 2% 4.96ms ± 3% -5.38% (p=0.000 n=10+10) EncodeDigitsDefault1e6-8 56.5ms ± 3% 52.2ms ± 2% -7.68% (p=0.000 n=10+10) EncodeDigitsCompress1e4-8 362µs ± 2% 343µs ± 1% -5.20% (p=0.000 n=10+9) EncodeDigitsCompress1e5-8 5.26ms ± 3% 4.98ms ± 2% -5.48% (p=0.000 n=10+10) EncodeDigitsCompress1e6-8 56.0ms ± 4% 52.1ms ± 1% -7.01% (p=0.000 n=10+10) EncodeTwainHuffman1e4-8 70.9µs ± 3% 64.7µs ± 1% -8.68% (p=0.000 n=10+9) EncodeTwainHuffman1e5-8 556µs ± 2% 524µs ± 2% -5.84% (p=0.000 n=10+10) EncodeTwainHuffman1e6-8 5.54ms ± 3% 5.22ms ± 2% -5.70% (p=0.000 n=10+10) EncodeTwainSpeed1e4-8 294µs ± 3% 284µs ± 1% -3.71% (p=0.000 n=10+10) EncodeTwainSpeed1e5-8 2.59ms ± 2% 2.48ms ± 1% -4.14% (p=0.000 n=10+9) EncodeTwainSpeed1e6-8 25.6ms ± 1% 24.3ms ± 1% -5.28% (p=0.000 n=9+10) EncodeTwainDefault1e4-8 419µs ± 2% 396µs ± 1% -5.59% (p=0.000 n=10+9) EncodeTwainDefault1e5-8 6.23ms ± 4% 5.75ms ± 1% -7.83% (p=0.000 n=10+9) EncodeTwainDefault1e6-8 66.2ms ± 2% 61.4ms ± 1% -7.22% (p=0.000 n=10+10) EncodeTwainCompress1e4-8 426µs ± 1% 405µs ± 1% -4.97% (p=0.000 n=9+10) EncodeTwainCompress1e5-8 6.80ms ± 1% 6.32ms ± 1% -6.97% (p=0.000 n=9+10) EncodeTwainCompress1e6-8 74.6ms ± 3% 68.7ms ± 1% -7.90% (p=0.000 n=10+9) name old speed new speed delta EncodeDigitsHuffman1e4-8 200MB/s ± 1% 208MB/s ± 1% +3.88% (p=0.000 n=10+9) EncodeDigitsHuffman1e5-8 210MB/s ± 1% 218MB/s ± 1% +3.71% (p=0.000 n=10+10) EncodeDigitsHuffman1e6-8 208MB/s ± 2% 219MB/s ± 1% +5.32% (p=0.000 n=10+9) EncodeDigitsSpeed1e4-8 32.8MB/s ± 3% 34.5MB/s ± 2% +5.29% (p=0.000 n=10+9) EncodeDigitsSpeed1e5-8 27.2MB/s ± 2% 28.6MB/s ± 2% +5.29% (p=0.000 n=10+10) EncodeDigitsSpeed1e6-8 26.1MB/s ± 2% 27.9MB/s ± 1% +7.02% (p=0.000 n=9+10) EncodeDigitsDefault1e4-8 27.7MB/s ± 2% 28.9MB/s ± 3% +4.30% (p=0.000 n=10+9) EncodeDigitsDefault1e5-8 19.1MB/s ± 2% 20.2MB/s ± 3% +5.69% (p=0.000 n=10+10) EncodeDigitsDefault1e6-8 17.7MB/s ± 3% 19.2MB/s ± 2% +8.31% (p=0.000 n=10+10) EncodeDigitsCompress1e4-8 27.6MB/s ± 2% 29.1MB/s ± 1% +5.47% (p=0.000 n=10+9) EncodeDigitsCompress1e5-8 19.0MB/s ± 3% 20.1MB/s ± 2% +5.78% (p=0.000 n=10+10) EncodeDigitsCompress1e6-8 17.9MB/s ± 4% 19.2MB/s ± 1% +7.50% (p=0.000 n=10+10) EncodeTwainHuffman1e4-8 141MB/s ± 3% 154MB/s ± 1% +9.46% (p=0.000 n=10+9) EncodeTwainHuffman1e5-8 180MB/s ± 2% 191MB/s ± 2% +6.19% (p=0.000 n=10+10) EncodeTwainHuffman1e6-8 181MB/s ± 3% 192MB/s ± 2% +6.02% (p=0.000 n=10+10) EncodeTwainSpeed1e4-8 34.0MB/s ± 3% 35.3MB/s ± 1% +3.84% (p=0.000 n=10+10) EncodeTwainSpeed1e5-8 38.7MB/s ± 2% 40.3MB/s ± 1% +4.30% (p=0.000 n=10+9) EncodeTwainSpeed1e6-8 39.1MB/s ± 1% 41.2MB/s ± 1% +5.57% (p=0.000 n=9+10) EncodeTwainDefault1e4-8 23.9MB/s ± 2% 25.3MB/s ± 1% +5.91% (p=0.000 n=10+9) EncodeTwainDefault1e5-8 16.0MB/s ± 4% 17.4MB/s ± 1% +8.47% (p=0.000 n=10+9) EncodeTwainDefault1e6-8 15.1MB/s ± 2% 16.3MB/s ± 1% +7.76% (p=0.000 n=10+10) EncodeTwainCompress1e4-8 23.5MB/s ± 1% 24.7MB/s ± 1% +5.24% (p=0.000 n=9+10) EncodeTwainCompress1e5-8 14.7MB/s ± 1% 15.8MB/s ± 1% +7.50% (p=0.000 n=9+10) EncodeTwainCompress1e6-8 13.4MB/s ± 3% 14.6MB/s ± 1% +8.57% (p=0.000 n=10+9) Change-Id: I5c7e84c2f9ea4d38a2115995705eebb93387e22f Reviewed-on: https://go-review.googlesource.com/21759 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> |
|
|
|
1cb3044c9f |
all: use bytes.Equal, bytes.Contains and strings.Contains
Change-Id: Iba82a5bd3846f7ab038cc10ec72ff6bcd2c0b484 Reviewed-on: https://go-review.googlesource.com/21377 Run-TryBot: Dave Cheney <dave@cheney.net> Reviewed-by: Dave Cheney <dave@cheney.net> |
|
|
|
c27efce66b |
compress/flate: make Reader.Read return io.EOF eagerly
Rather than checking the block final bit on the next invocation of nextBlock, we check it at the termination of the current block. This ensures that we return (n, io.EOF) instead of (0, io.EOF) more frequently for most streams. However, there are certain situations where an eager io.EOF is not done: 1) We previously returned from Read because the write buffer of the internal dictionary was full, and it just so happens that there is no more data remaining in the stream. 2) There exists a [non-final, empty, raw block] after all blocks that actually contain uncompressed data. We cannot return io.EOF eagerly here since it would break flushing semantics. Both situations happen infrequently, but it is still important to note that this change does *not* guarantee that flate will *always* return (n, io.EOF). Furthermore, this CL makes no changes to the pattern of ReadByte calls to the underlying io.ByteReader. Below is the motivation for this change, pulling the text from @bradfitz's CL/21290: net/http and other things work better when io.Reader implementations return (n, io.EOF) at the end, instead of (n, nil) followed by (0, io.EOF). Both are legal, but the standard library has been moving towards n+io.EOF. An investigation of net/http connection re-use in https://github.com/google/go-github/pull/317 revealed that with gzip compression + http/1.1 chunking, the net/http package was not automatically reusing the underlying TCP connections when the final EOF bytes were already read off the wire. The net/http package only reuses the connection if the underlying Readers (many of them nested in this case) all eagerly return io.EOF. Previous related CLs: https://golang.org/cl/76400046 - tls.Reader https://golang.org/cl/58240043 - http chunked reader In addition to net/http, this behavior also helps things like ioutil.ReadAll (see comments about performance improvements in https://codereview.appspot.com/49570044) Updates #14867 Updates google/go-github#317 Change-Id: I637c45552efb561d34b13ed918b73c660f668378 Reviewed-on: https://go-review.googlesource.com/21302 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> |
|
|
|
42ad1dc01e |
compress/flate: add pure huffman deflater
Add a "HuffmanOnly" compression level, where the input is only entropy encoded. The output is fully inflate compatible. Typical compression is reduction is about 50% of typical level 1 compression, however the compression time is very stable, and does not vary as much as nearly as much level 1 compression (or Snappy). This mode is useful for: * HTTP compression in a CPU limited environment. * Entropy encoding Snappy compressed data, for archiving, etc. * Compression where compression time needs to be predictable. * Fast network transfer. Snappy "usually" performs inbetween this and level 1 compression-wise, but at the same speed as "Huffman", so this is not a replacement, but a good supplement for Snappy, since it usually can compress Snappy output further. This is implemented as level -2, since this would be too much of a compression reduction to replace level 1. >go test -bench=Encode -cpu=1 BenchmarkEncodeDigitsHuffman1e4 30000 52334 ns/op 191.08 MB/s BenchmarkEncodeDigitsHuffman1e5 3000 518343 ns/op 192.92 MB/s BenchmarkEncodeDigitsHuffman1e6 300 5356884 ns/op 186.68 MB/s BenchmarkEncodeDigitsSpeed1e4 5000 324214 ns/op 30.84 MB/s BenchmarkEncodeDigitsSpeed1e5 500 3952614 ns/op 25.30 MB/s BenchmarkEncodeDigitsSpeed1e6 30 40760350 ns/op 24.53 MB/s BenchmarkEncodeDigitsDefault1e4 5000 387056 ns/op 25.84 MB/s BenchmarkEncodeDigitsDefault1e5 300 5950614 ns/op 16.80 MB/s BenchmarkEncodeDigitsDefault1e6 20 63842195 ns/op 15.66 MB/s BenchmarkEncodeDigitsCompress1e4 5000 391859 ns/op 25.52 MB/s BenchmarkEncodeDigitsCompress1e5 300 5707112 ns/op 17.52 MB/s BenchmarkEncodeDigitsCompress1e6 20 59839465 ns/op 16.71 MB/s BenchmarkEncodeTwainHuffman1e4 20000 73498 ns/op 136.06 MB/s BenchmarkEncodeTwainHuffman1e5 2000 595892 ns/op 167.82 MB/s BenchmarkEncodeTwainHuffman1e6 200 6059016 ns/op 165.04 MB/s BenchmarkEncodeTwainSpeed1e4 5000 321212 ns/op 31.13 MB/s BenchmarkEncodeTwainSpeed1e5 500 2823873 ns/op 35.41 MB/s BenchmarkEncodeTwainSpeed1e6 50 27237864 ns/op 36.71 MB/s BenchmarkEncodeTwainDefault1e4 3000 454634 ns/op 22.00 MB/s BenchmarkEncodeTwainDefault1e5 200 6859537 ns/op 14.58 MB/s BenchmarkEncodeTwainDefault1e6 20 71547405 ns/op 13.98 MB/s BenchmarkEncodeTwainCompress1e4 3000 462307 ns/op 21.63 MB/s BenchmarkEncodeTwainCompress1e5 200 7534992 ns/op 13.27 MB/s BenchmarkEncodeTwainCompress1e6 20 80353365 ns/op 12.45 MB/s PASS ok compress/flate 55.333s Change-Id: I8e12ad13220e50d4cf7ddba6f292333efad61b0c Reviewed-on: https://go-review.googlesource.com/20982 Reviewed-by: Joe Tsai <joetsai@digital-static.net> Reviewed-by: Nigel Tao <nigeltao@golang.org> |
|
|
|
fdba5a7544 |
all: delete dead non-test code
This change removes a lot of dead code. Some of the code has never been used, not even when it was first commited. The rest shouldn't have survived refactors. This change doesn't remove unused routines helpful for debugging, nor does it remove code that's used in commented out blocks of code that are only unused temporarily. Furthermore, unused constants weren't removed when they were part of a set of constants from specifications. One noteworthy omission from this CL are about 1000 lines of unused code in cmd/fix, 700 lines of which are the typechecker, which hasn't been used ever since the pre-Go 1 fixes have been removed. I wasn't sure if this code should stick around for future uses of cmd/fix or be culled as well. Change-Id: Ib714bc7e487edc11ad23ba1c3222d1fd02e4a549 Reviewed-on: https://go-review.googlesource.com/20926 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> |
|
|
|
53efe1e121 |
compress/flate: rework matching algorithm
This changes how matching is done in deflate algorithm. The major change is that we do not look for matches that are only 3 bytes in length, matches must be 4 bytes at least. Contrary to what you would expect this actually improves the compresion ratio, since 3 literal bytes will often be shorter than a match after huffman encoding. This varies a bit by source, but is most often the case when the source is "easy" to compress. Second of all, a "stronger" hash is used. The hash is similar to the hashing function used by Snappy. Overall, the speed impact is biggest on higher compression levels. I intend to replace the "speed" compression level, which can be seen in CL 21021. The built-in benchmark using "digits" is slower at level 1. I see this as an exception, since "digits" is a special type of data, where you have low entropy (numbers 0->9), but no significant matches. Again, CL 20021 fixes that case. NewWriterDict is also made considerably faster, by not running data through the entire encoder. This is not reflected by the benchmark. Overall, the speed impact is biggest on higher compression levels. I intend to replace the "speed" compression level. COMPARED to tip/master: name old time/op new time/op delta EncodeDigitsSpeed1e4-4 401µs ± 1% 345µs ± 2% -13.95% EncodeDigitsSpeed1e5-4 3.19ms ± 1% 4.27ms ± 3% +33.96% EncodeDigitsSpeed1e6-4 27.7ms ± 4% 43.8ms ± 3% +58.00% EncodeDigitsDefault1e4-4 641µs ± 0% 403µs ± 1% -37.15% EncodeDigitsDefault1e5-4 13.8ms ± 1% 6.4ms ± 3% -53.73% EncodeDigitsDefault1e6-4 162ms ± 1% 64ms ± 2% -60.51% EncodeDigitsCompress1e4-4 627µs ± 1% 405µs ± 2% -35.45% EncodeDigitsCompress1e5-4 13.9ms ± 0% 6.3ms ± 2% -54.46% EncodeDigitsCompress1e6-4 159ms ± 1% 64ms ± 0% -59.91% EncodeTwainSpeed1e4-4 433µs ± 4% 331µs ± 1% -23.53% EncodeTwainSpeed1e5-4 2.82ms ± 1% 3.08ms ± 0% +9.10% EncodeTwainSpeed1e6-4 28.1ms ± 2% 28.8ms ± 0% +2.82% EncodeTwainDefault1e4-4 695µs ± 4% 474µs ± 1% -31.78% EncodeTwainDefault1e5-4 11.8ms ± 0% 7.4ms ± 0% -37.31% EncodeTwainDefault1e6-4 128ms ± 0% 75ms ± 0% -40.93% EncodeTwainCompress1e4-4 719µs ± 3% 480µs ± 0% -33.27% EncodeTwainCompress1e5-4 15.0ms ± 3% 8.2ms ± 2% -45.55% EncodeTwainCompress1e6-4 170ms ± 0% 85ms ± 1% -49.99% name old speed new speed delta EncodeDigitsSpeed1e4-4 25.0MB/s ± 1% 29.0MB/s ± 2% +16.24% EncodeDigitsSpeed1e5-4 31.4MB/s ± 1% 23.4MB/s ± 3% -25.34% EncodeDigitsSpeed1e6-4 36.1MB/s ± 4% 22.8MB/s ± 3% -36.74% EncodeDigitsDefault1e4-4 15.6MB/s ± 0% 24.8MB/s ± 1% +59.11% EncodeDigitsDefault1e5-4 7.27MB/s ± 1% 15.72MB/s ± 3% +116.23% EncodeDigitsDefault1e6-4 6.16MB/s ± 0% 15.60MB/s ± 2% +153.25% EncodeDigitsCompress1e4-4 15.9MB/s ± 1% 24.7MB/s ± 2% +54.97% EncodeDigitsCompress1e5-4 7.19MB/s ± 0% 15.78MB/s ± 2% +119.62% EncodeDigitsCompress1e6-4 6.27MB/s ± 1% 15.65MB/s ± 0% +149.52% EncodeTwainSpeed1e4-4 23.1MB/s ± 4% 30.2MB/s ± 1% +30.68% EncodeTwainSpeed1e5-4 35.4MB/s ± 1% 32.5MB/s ± 0% -8.34% EncodeTwainSpeed1e6-4 35.6MB/s ± 2% 34.7MB/s ± 0% -2.77% EncodeTwainDefault1e4-4 14.4MB/s ± 4% 21.1MB/s ± 1% +46.48% EncodeTwainDefault1e5-4 8.49MB/s ± 0% 13.55MB/s ± 0% +59.50% EncodeTwainDefault1e6-4 7.83MB/s ± 0% 13.25MB/s ± 0% +69.19% EncodeTwainCompress1e4-4 13.9MB/s ± 3% 20.8MB/s ± 0% +49.83% EncodeTwainCompress1e5-4 6.65MB/s ± 3% 12.20MB/s ± 2% +83.51% EncodeTwainCompress1e6-4 5.88MB/s ± 0% 11.76MB/s ± 1% +100.06% Change-Id: I724e33c1dd3e3a6a1b0a68e094baa959352baf32 Reviewed-on: https://go-review.googlesource.com/20929 Run-TryBot: Nigel Tao <nigeltao@golang.org> Reviewed-by: Nigel Tao <nigeltao@golang.org> |
|
|
|
53984e5be2 |
compress/flate: optimize huffman bit encoder
Part 1 of optimizing the deflater. This optimizes the bitwriter by: * Removing allocations. * Storing compound values for bit codes instead of 2 separate tables. * Accumulate 48 bits between writes instead of 24. * Inline bit flushing. This also contains code that will be used in later CL's (writeBlockDynamic, writeBlockHuff). Tests for Huffman bit writer encoding regressions has been added. name old speed new speed delta EncodeDigitsSpeed1e4-4 19.3MB/s ± 1% 21.6MB/s ± 1% +11.77% EncodeDigitsSpeed1e5-4 25.0MB/s ± 6% 30.7MB/s ± 1% +22.70% EncodeDigitsSpeed1e6-4 28.2MB/s ± 1% 32.3MB/s ± 1% +14.64% EncodeDigitsDefault1e4-4 13.3MB/s ± 0% 14.2MB/s ± 1% +7.07% EncodeDigitsDefault1e5-4 6.43MB/s ± 1% 6.64MB/s ± 1% +3.27% EncodeDigitsDefault1e6-4 5.81MB/s ± 0% 5.85MB/s ± 1% +0.69% EncodeDigitsCompress1e4-4 13.2MB/s ± 0% 14.4MB/s ± 0% +9.10% EncodeDigitsCompress1e5-4 6.40MB/s ± 1% 6.61MB/s ± 0% +3.20% EncodeDigitsCompress1e6-4 5.80MB/s ± 1% 5.90MB/s ± 1% +1.64% EncodeTwainSpeed1e4-4 18.4MB/s ± 1% 20.7MB/s ± 1% +12.72% EncodeTwainSpeed1e5-4 27.7MB/s ± 1% 31.0MB/s ± 1% +11.78% EncodeTwainSpeed1e6-4 29.1MB/s ± 0% 32.9MB/s ± 2% +13.25% EncodeTwainDefault1e4-4 12.4MB/s ± 0% 13.1MB/s ± 1% +5.88% EncodeTwainDefault1e5-4 7.52MB/s ± 1% 7.83MB/s ± 0% +4.19% EncodeTwainDefault1e6-4 7.08MB/s ± 1% 7.26MB/s ± 0% +2.54% EncodeTwainCompress1e4-4 12.0MB/s ± 1% 12.8MB/s ± 1% +6.70% EncodeTwainCompress1e5-4 5.96MB/s ± 1% 6.16MB/s ± 0% +3.27% EncodeTwainCompress1e6-4 5.37MB/s ± 0% 5.39MB/s ± 1% +0.47% >Allocations: benchmark old allocs new allocs delta BenchmarkEncodeDigitsSpeed1e4-4 50 0 -100.00% BenchmarkEncodeDigitsSpeed1e5-4 110 0 -100.00% BenchmarkEncodeDigitsSpeed1e6-4 1032 0 -100.00% BenchmarkEncodeDigitsDefault1e4-4 56 0 -100.00% BenchmarkEncodeDigitsDefault1e5-4 120 0 -100.00% BenchmarkEncodeDigitsDefault1e6-4 966 0 -100.00% BenchmarkEncodeDigitsCompress1e4-4 56 0 -100.00% BenchmarkEncodeDigitsCompress1e5-4 120 0 -100.00% BenchmarkEncodeDigitsCompress1e6-4 966 0 -100.00% BenchmarkEncodeTwainSpeed1e4-4 58 0 -100.00% BenchmarkEncodeTwainSpeed1e5-4 132 0 -100.00% BenchmarkEncodeTwainSpeed1e6-4 1082 0 -100.00% BenchmarkEncodeTwainDefault1e4-4 52 0 -100.00% BenchmarkEncodeTwainDefault1e5-4 126 0 -100.00% BenchmarkEncodeTwainDefault1e6-4 886 0 -100.00% BenchmarkEncodeTwainCompress1e4-4 52 0 -100.00% BenchmarkEncodeTwainCompress1e5-4 120 0 -100.00% BenchmarkEncodeTwainCompress1e6-4 880 0 -100.00% benchmark old bytes new bytes delta BenchmarkEncodeDigitsSpeed1e4-4 4288 2 -99.95% BenchmarkEncodeDigitsSpeed1e5-4 8896 15 -99.83% BenchmarkEncodeDigitsSpeed1e6-4 84098 153 -99.82% BenchmarkEncodeDigitsDefault1e4-4 4480 3 -99.93% BenchmarkEncodeDigitsDefault1e5-4 9216 76 -99.18% BenchmarkEncodeDigitsDefault1e6-4 73920 768 -98.96% BenchmarkEncodeDigitsCompress1e4-4 4480 3 -99.93% BenchmarkEncodeDigitsCompress1e5-4 9216 76 -99.18% BenchmarkEncodeDigitsCompress1e6-4 73920 768 -98.96% BenchmarkEncodeTwainSpeed1e4-4 4544 2 -99.96% BenchmarkEncodeTwainSpeed1e5-4 9600 15 -99.84% BenchmarkEncodeTwainSpeed1e6-4 77633 153 -99.80% BenchmarkEncodeTwainDefault1e4-4 4352 3 -99.93% BenchmarkEncodeTwainDefault1e5-4 9408 76 -99.19% BenchmarkEncodeTwainDefault1e6-4 65984 768 -98.84% BenchmarkEncodeTwainCompress1e4-4 4352 3 -99.93% BenchmarkEncodeTwainCompress1e5-4 9216 76 -99.18% BenchmarkEncodeTwainCompress1e6-4 65792 768 -98.83% Updates #14258 Change-Id: Ibaa97b9619743ad623094727228eb2ada1ec7f1f Reviewed-on: https://go-review.googlesource.com/19336 Reviewed-by: Nigel Tao <nigeltao@golang.org> Reviewed-by: Joe Tsai <joetsai@digital-static.net> Run-TryBot: Joe Tsai <joetsai@digital-static.net> TryBot-Result: Gobot Gobot <gobot@golang.org> |
|
|
|
afdb8cff3e |
compress/flate: test if results are deterministic
This will test if deflate output is deterministic between two runs of the deflater, when write sizes differ. The deflater makes no official promises that results are deterministic between runs, but this is a good test to determine unintentional randomness. Note that this does not guarantee that results are deterministic across platforms nor that results will be deterministic between Go versions. This is also not guarantees we should imply. Change-Id: Id7dd89fe276060fd83a43d0b34ac35d50fcd32d9 Reviewed-on: https://go-review.googlesource.com/20573 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> |
|
|
|
53900cea1b |
compress/flate: forward upstream Writer errors
If the upstream writer has returned an error, it may not be returned by subsequent calls. This makes sure that if an error has been returned, the Writer will keep returning an error on all subsequent calls, and not silently "swallow" them. Change-Id: I2c9f614df72e1f4786705bf94e119b66c62abe5e Reviewed-on: https://go-review.googlesource.com/20515 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com> |
|
|
|
c52cb1fe9e |
compress/flate: take NewWriter out of the benchmark loop.
This helps follow-up CLs ensure that the encoding's core computation does not allocate. It is a separate CL because it has a non-trivial effect on the benchmark numbers, even if it's purely an accounting change and not a change to the underlying performance: BenchmarkEncodeDigitsSpeed1e4-4 5.65 19.31 3.42x BenchmarkEncodeDigitsSpeed1e5-4 17.23 26.79 1.55x BenchmarkEncodeDigitsSpeed1e6-4 26.85 27.51 1.02x BenchmarkEncodeDigitsDefault1e4-4 4.41 13.21 3.00x BenchmarkEncodeDigitsDefault1e5-4 5.64 6.28 1.11x BenchmarkEncodeDigitsDefault1e6-4 5.54 5.65 1.02x BenchmarkEncodeDigitsCompress1e4-4 4.31 13.15 3.05x BenchmarkEncodeDigitsCompress1e5-4 5.52 5.91 1.07x BenchmarkEncodeDigitsCompress1e6-4 5.38 5.63 1.05x BenchmarkEncodeTwainSpeed1e4-4 5.45 19.06 3.50x BenchmarkEncodeTwainSpeed1e5-4 17.30 29.25 1.69x BenchmarkEncodeTwainSpeed1e6-4 28.06 30.86 1.10x BenchmarkEncodeTwainDefault1e4-4 4.06 12.36 3.04x BenchmarkEncodeTwainDefault1e5-4 6.15 7.62 1.24x BenchmarkEncodeTwainDefault1e6-4 6.84 6.99 1.02x BenchmarkEncodeTwainCompress1e4-4 4.06 12.27 3.02x BenchmarkEncodeTwainCompress1e5-4 5.29 5.92 1.12x BenchmarkEncodeTwainCompress1e6-4 5.24 5.29 1.01x Change-Id: I7d32866b7e2d478b0154332c1edeefe339af9a28 Reviewed-on: https://go-review.googlesource.com/20467 Reviewed-by: David Symonds <dsymonds@golang.org> |
|
|
|
ed8116989d |
compress/flate: remove unused woffset field.
Change-Id: Id0a12c76b0a6925f2926d38a1931157f9ef5f650 Reviewed-on: https://go-review.googlesource.com/20280 Reviewed-by: Joe Tsai <joetsai@digital-static.net> Run-TryBot: Joe Tsai <joetsai@digital-static.net> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> |
|
|
|
5fea2ccc77 |
all: single space after period.
The tree's pretty inconsistent about single space vs double space after a period in documentation. Make it consistently a single space, per earlier decisions. This means contributors won't be confused by misleading precedence. This CL doesn't use go/doc to parse. It only addresses // comments. It was generated with: $ perl -i -npe 's,^(\s*// .+[a-z]\.) +([A-Z]),$1 $2,' $(git grep -l -E '^\s*//(.+\.) +([A-Z])') $ go test go/doc -update Change-Id: Iccdb99c37c797ef1f804a94b22ba5ee4b500c4f7 Reviewed-on: https://go-review.googlesource.com/20022 Reviewed-by: Rob Pike <r@golang.org> Reviewed-by: Dave Day <djd@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> |
|
|
|
5fc4decd10 |
compress/flate: extract LZ77 dictionary logic into seperate struct
The LZ77 portion of DEFLATE is relatively self-contained. For the decompression side of things, we extract this logic out for the following reasons: * It is easier to test just the LZ77 portion of the logic. * It reduces the noise in the inflate.go Also, we adjust the way that callbacks are handled in the inflate. Instead of using functions to abstract the logical componets of huffmanBlock(), use goto statements to jump between the necessary sections. This is faster since it avoids a function call and is arguably more readable. benchmark old MB/s new MB/s speedup BenchmarkDecodeDigitsSpeed1e4-4 53.62 60.11 1.12x BenchmarkDecodeDigitsSpeed1e5-4 61.90 69.07 1.12x BenchmarkDecodeDigitsSpeed1e6-4 63.24 70.58 1.12x BenchmarkDecodeDigitsDefault1e4-4 54.10 59.00 1.09x BenchmarkDecodeDigitsDefault1e5-4 69.50 74.07 1.07x BenchmarkDecodeDigitsDefault1e6-4 71.54 75.85 1.06x BenchmarkDecodeDigitsCompress1e4-4 54.39 58.94 1.08x BenchmarkDecodeDigitsCompress1e5-4 69.21 73.96 1.07x BenchmarkDecodeDigitsCompress1e6-4 71.14 75.75 1.06x BenchmarkDecodeTwainSpeed1e4-4 53.15 58.13 1.09x BenchmarkDecodeTwainSpeed1e5-4 66.56 72.29 1.09x BenchmarkDecodeTwainSpeed1e6-4 69.13 75.11 1.09x BenchmarkDecodeTwainDefault1e4-4 56.00 60.23 1.08x BenchmarkDecodeTwainDefault1e5-4 77.84 82.27 1.06x BenchmarkDecodeTwainDefault1e6-4 82.07 86.85 1.06x BenchmarkDecodeTwainCompress1e4-4 56.13 60.38 1.08x BenchmarkDecodeTwainCompress1e5-4 78.23 82.62 1.06x BenchmarkDecodeTwainCompress1e6-4 82.38 86.73 1.05x Change-Id: I8c6ae0e6bed652dd0570fc113c999977f5e71636 Reviewed-on: https://go-review.googlesource.com/16528 Reviewed-by: Matthew Dempsky <mdempsky@google.com> Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> |
|
|
|
04d732b4c2 |
build: shorten a few packages with long tests
Takes 3% off my all.bash run time. For #10571. Change-Id: I8f00f523d6919e87182d35722a669b0b96b8218b Reviewed-on: https://go-review.googlesource.com/18087 Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> |
|
|
|
b717090e01 |
compress/flate: tweak offsetCode so that it can be inlined
Functions with switches (#13071) cannot be inlined. Functions with consts (#7655) cannot be inlined. benchmark old MB/s new MB/s speedup BenchmarkEncodeDigitsSpeed1e4-4 10.25 10.20 1.00x BenchmarkEncodeDigitsSpeed1e5-4 26.44 27.22 1.03x BenchmarkEncodeDigitsSpeed1e6-4 32.28 33.51 1.04x BenchmarkEncodeDigitsDefault1e4-4 8.61 8.74 1.02x BenchmarkEncodeDigitsDefault1e5-4 7.03 6.98 0.99x BenchmarkEncodeDigitsDefault1e6-4 6.47 6.46 1.00x BenchmarkEncodeDigitsCompress1e4-4 8.62 8.73 1.01x BenchmarkEncodeDigitsCompress1e5-4 7.01 6.98 1.00x BenchmarkEncodeDigitsCompress1e6-4 6.43 6.53 1.02x BenchmarkEncodeTwainSpeed1e4-4 9.67 10.16 1.05x BenchmarkEncodeTwainSpeed1e5-4 26.46 26.94 1.02x BenchmarkEncodeTwainSpeed1e6-4 33.19 34.02 1.03x BenchmarkEncodeTwainDefault1e4-4 8.12 8.37 1.03x BenchmarkEncodeTwainDefault1e5-4 8.22 8.21 1.00x BenchmarkEncodeTwainDefault1e6-4 8.10 8.13 1.00x BenchmarkEncodeTwainCompress1e4-4 8.24 8.39 1.02x BenchmarkEncodeTwainCompress1e5-4 6.51 6.58 1.01x BenchmarkEncodeTwainCompress1e6-4 6.16 6.13 1.00x Change-Id: Ibafa5e3e2de0529853b5b3180e6fd6cb7090b76f Reviewed-on: https://go-review.googlesource.com/17171 Reviewed-by: Russ Cox <rsc@golang.org> |