Commit Graph

14 Commits

Author SHA1 Message Date
Katie Hockman fdceb2a11b compress: reduce copies of new text for compression testing
The previous book was 387 KiB decompressed and 119 KiB compressed; the
new book is 567 KiB decompressed and 132 KiB compressed. Overall, this
change will reduce the release binary size by 196 KiB. The new book will
allow for slightly more extensive compression testing with a larger
text.

Command used to run the benchmark tests for comparison with benchstat:
`../bin/go test -run='^$' -count=4 -bench=. compress/bzip2 compress/flate`

When running the benchmarks locally, I changed "Newton" to "Twain" and
filtered the tests with the -bench flag to include only those relevant
to these changes.

benchstat results below:

name                            old time/op    new time/op     delta
DecodeTwain-8                     19.6ms ± 2%     24.1ms ± 1%  +23.04%  (p=0.029 n=4+4)
Decode/Twain/Huffman/1e4-8         140µs ± 3%      139µs ± 5%     ~     (p=0.886 n=4+4)
Decode/Twain/Huffman/1e5-8        1.27ms ± 3%     1.26ms ± 1%     ~     (p=1.000 n=4+4)
Decode/Twain/Huffman/1e6-8        12.4ms ± 0%     13.2ms ± 1%   +6.42%  (p=0.029 n=4+4)
Decode/Twain/Speed/1e4-8           133µs ± 1%      123µs ± 1%   -7.35%  (p=0.029 n=4+4)
Decode/Twain/Speed/1e5-8          1.20ms ± 0%     1.02ms ± 3%  -15.32%  (p=0.029 n=4+4)
Decode/Twain/Speed/1e6-8          12.0ms ± 2%     10.1ms ± 3%  -15.89%  (p=0.029 n=4+4)
Decode/Twain/Default/1e4-8         131µs ± 6%      108µs ± 5%  -17.84%  (p=0.029 n=4+4)
Decode/Twain/Default/1e5-8        1.06ms ± 2%     0.80ms ± 1%  -24.97%  (p=0.029 n=4+4)
Decode/Twain/Default/1e6-8        10.0ms ± 3%      8.0ms ± 3%  -20.06%  (p=0.029 n=4+4)
Decode/Twain/Compression/1e4-8     128µs ± 4%      115µs ± 4%   -9.70%  (p=0.029 n=4+4)
Decode/Twain/Compression/1e5-8    1.04ms ± 2%     0.83ms ± 4%  -20.37%  (p=0.029 n=4+4)
Decode/Twain/Compression/1e6-8    10.4ms ± 4%      8.1ms ± 5%  -22.25%  (p=0.029 n=4+4)
Encode/Twain/Huffman/1e4-8        55.7µs ± 2%     55.6µs ± 1%     ~     (p=1.000 n=4+4)
Encode/Twain/Huffman/1e5-8         441µs ± 0%      435µs ± 2%     ~     (p=0.343 n=4+4)
Encode/Twain/Huffman/1e6-8        4.31ms ± 4%     4.30ms ± 4%     ~     (p=0.886 n=4+4)
Encode/Twain/Speed/1e4-8           193µs ± 1%      166µs ± 2%  -14.09%  (p=0.029 n=4+4)
Encode/Twain/Speed/1e5-8          1.54ms ± 1%     1.22ms ± 1%  -20.53%  (p=0.029 n=4+4)
Encode/Twain/Speed/1e6-8          15.3ms ± 1%     12.2ms ± 3%  -20.62%  (p=0.029 n=4+4)
Encode/Twain/Default/1e4-8         393µs ± 1%      390µs ± 1%     ~     (p=0.114 n=4+4)
Encode/Twain/Default/1e5-8        6.12ms ± 4%     6.02ms ± 5%     ~     (p=0.486 n=4+4)
Encode/Twain/Default/1e6-8        69.4ms ± 5%     59.0ms ± 4%  -15.07%  (p=0.029 n=4+4)
Encode/Twain/Compression/1e4-8     423µs ± 2%      379µs ± 2%  -10.34%  (p=0.029 n=4+4)
Encode/Twain/Compression/1e5-8    7.00ms ± 1%     7.88ms ± 3%  +12.49%  (p=0.029 n=4+4)
Encode/Twain/Compression/1e6-8    76.6ms ± 5%     80.9ms ± 3%     ~     (p=0.114 n=4+4)

name                            old speed      new speed       delta
DecodeTwain-8                   19.8MB/s ± 2%   23.6MB/s ± 1%  +18.84%  (p=0.029 n=4+4)
Decode/Twain/Huffman/1e4-8      71.7MB/s ± 3%   72.1MB/s ± 6%     ~     (p=0.943 n=4+4)
Decode/Twain/Huffman/1e5-8      78.8MB/s ± 3%   79.5MB/s ± 1%     ~     (p=1.000 n=4+4)
Decode/Twain/Huffman/1e6-8      80.5MB/s ± 0%   75.6MB/s ± 1%   -6.03%  (p=0.029 n=4+4)
Decode/Twain/Speed/1e4-8        75.2MB/s ± 1%   81.2MB/s ± 1%   +7.93%  (p=0.029 n=4+4)
Decode/Twain/Speed/1e5-8        83.4MB/s ± 0%   98.6MB/s ± 3%  +18.16%  (p=0.029 n=4+4)
Decode/Twain/Speed/1e6-8        83.6MB/s ± 2%   99.5MB/s ± 3%  +18.91%  (p=0.029 n=4+4)
Decode/Twain/Default/1e4-8      76.3MB/s ± 6%   92.8MB/s ± 4%  +21.62%  (p=0.029 n=4+4)
Decode/Twain/Default/1e5-8      94.4MB/s ± 3%  125.7MB/s ± 1%  +33.24%  (p=0.029 n=4+4)
Decode/Twain/Default/1e6-8       100MB/s ± 3%    125MB/s ± 3%  +25.12%  (p=0.029 n=4+4)
Decode/Twain/Compression/1e4-8  78.4MB/s ± 4%   86.8MB/s ± 4%  +10.73%  (p=0.029 n=4+4)
Decode/Twain/Compression/1e5-8  95.7MB/s ± 2%  120.3MB/s ± 4%  +25.65%  (p=0.029 n=4+4)
Decode/Twain/Compression/1e6-8  96.4MB/s ± 4%  124.0MB/s ± 5%  +28.64%  (p=0.029 n=4+4)
Encode/Twain/Huffman/1e4-8       179MB/s ± 2%    180MB/s ± 1%     ~     (p=1.000 n=4+4)
Encode/Twain/Huffman/1e5-8       227MB/s ± 0%    230MB/s ± 2%     ~     (p=0.343 n=4+4)
Encode/Twain/Huffman/1e6-8       232MB/s ± 4%    233MB/s ± 4%     ~     (p=0.886 n=4+4)
Encode/Twain/Speed/1e4-8        51.8MB/s ± 1%   60.4MB/s ± 2%  +16.43%  (p=0.029 n=4+4)
Encode/Twain/Speed/1e5-8        65.1MB/s ± 1%   81.9MB/s ± 1%  +25.83%  (p=0.029 n=4+4)
Encode/Twain/Speed/1e6-8        65.2MB/s ± 1%   82.2MB/s ± 3%  +26.00%  (p=0.029 n=4+4)
Encode/Twain/Default/1e4-8      25.4MB/s ± 1%   25.6MB/s ± 1%     ~     (p=0.114 n=4+4)
Encode/Twain/Default/1e5-8      16.4MB/s ± 4%   16.6MB/s ± 5%     ~     (p=0.486 n=4+4)
Encode/Twain/Default/1e6-8      14.4MB/s ± 6%   17.0MB/s ± 4%  +17.67%  (p=0.029 n=4+4)
Encode/Twain/Compression/1e4-8  23.6MB/s ± 2%   26.4MB/s ± 2%  +11.54%  (p=0.029 n=4+4)
Encode/Twain/Compression/1e5-8  14.3MB/s ± 1%   12.7MB/s ± 3%  -11.08%  (p=0.029 n=4+4)
Encode/Twain/Compression/1e6-8  13.1MB/s ± 4%   12.4MB/s ± 3%     ~     (p=0.114 n=4+4)

name                            old alloc/op   new alloc/op    delta
DecodeTwain-8                     3.63MB ± 0%     3.63MB ± 0%   +0.15%  (p=0.029 n=4+4)
Decode/Twain/Huffman/1e4-8        42.0kB ± 0%     41.3kB ± 0%   -1.62%  (p=0.029 n=4+4)
Decode/Twain/Huffman/1e5-8        43.5kB ± 0%     45.1kB ± 0%   +3.74%  (p=0.029 n=4+4)
Decode/Twain/Huffman/1e6-8        71.7kB ± 0%     80.0kB ± 0%  +11.55%  (p=0.029 n=4+4)
Decode/Twain/Speed/1e4-8          41.2kB ± 0%     41.3kB ± 0%     ~     (p=0.286 n=4+4)
Decode/Twain/Speed/1e5-8          45.1kB ± 0%     43.9kB ± 0%   -2.80%  (p=0.029 n=4+4)
Decode/Twain/Speed/1e6-8          72.8kB ± 0%     81.3kB ± 0%  +11.72%  (p=0.029 n=4+4)
Decode/Twain/Default/1e4-8        41.2kB ± 0%     41.2kB ± 0%   -0.22%  (p=0.029 n=4+4)
Decode/Twain/Default/1e5-8        44.4kB ± 0%     43.0kB ± 0%   -3.02%  (p=0.029 n=4+4)
Decode/Twain/Default/1e6-8        71.0kB ± 0%     61.8kB ± 0%  -13.00%  (p=0.029 n=4+4)
Decode/Twain/Compression/1e4-8    41.3kB ± 0%     41.2kB ± 0%   -0.29%  (p=0.029 n=4+4)
Decode/Twain/Compression/1e5-8    43.3kB ± 0%     43.0kB ± 0%   -0.72%  (p=0.029 n=4+4)
Decode/Twain/Compression/1e6-8    69.1kB ± 0%     63.7kB ± 0%   -7.90%  (p=0.029 n=4+4)

name                            old allocs/op  new allocs/op   delta
DecodeTwain-8                       51.0 ± 0%       51.2 ± 1%     ~     (p=1.000 n=4+4)
Decode/Twain/Huffman/1e4-8          15.0 ± 0%       14.0 ± 0%   -6.67%  (p=0.029 n=4+4)
Decode/Twain/Huffman/1e5-8          20.0 ± 0%       23.0 ± 0%  +15.00%  (p=0.029 n=4+4)
Decode/Twain/Huffman/1e6-8           134 ± 0%        161 ± 0%  +20.15%  (p=0.029 n=4+4)
Decode/Twain/Speed/1e4-8            17.0 ± 0%       18.0 ± 0%   +5.88%  (p=0.029 n=4+4)
Decode/Twain/Speed/1e5-8            30.0 ± 0%       31.0 ± 0%   +3.33%  (p=0.029 n=4+4)
Decode/Twain/Speed/1e6-8             193 ± 0%        228 ± 0%  +18.13%  (p=0.029 n=4+4)
Decode/Twain/Default/1e4-8          17.0 ± 0%       15.0 ± 0%  -11.76%  (p=0.029 n=4+4)
Decode/Twain/Default/1e5-8          28.0 ± 0%       32.0 ± 0%  +14.29%  (p=0.029 n=4+4)
Decode/Twain/Default/1e6-8           199 ± 0%        158 ± 0%  -20.60%  (p=0.029 n=4+4)
Decode/Twain/Compression/1e4-8      17.0 ± 0%       15.0 ± 0%  -11.76%  (p=0.029 n=4+4)
Decode/Twain/Compression/1e5-8      28.0 ± 0%       32.0 ± 0%  +14.29%  (p=0.029 n=4+4)
Decode/Twain/Compression/1e6-8       196 ± 0%        150 ± 0%  -23.47%  (p=0.029 n=4+4)

Updates #27151

Change-Id: I6c439694ed16a33bb4c63fbfb8570c7de46b4f2d
Reviewed-on: https://go-review.googlesource.com/135495
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2018-09-24 18:26:02 +00:00
Joe Tsai 9c3630f578 compress/flate: avoid large stack growth in fillDeflate
Ranging over an array causes the array to be copied onto the
stack, which causes large stack re-growths. Instead, we should iterate
over slices of the array.
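
For reference, a minimal sketch of the pattern being described (the
type and field below are invented for illustration, not the real
compressor fields):

package sketch

type window struct {
	hist [1 << 16]uint32 // a large array field, like those on compressor
}

// rangeArray ranges over the array value itself. The range expression
// is evaluated once, so the whole 256 KiB array is copied, typically
// onto the goroutine stack, which can force a stack re-growth.
func (w *window) rangeArray() (n uint32) {
	for _, v := range w.hist {
		n += v
	}
	return n
}

// rangeSlice ranges over a slice of the same array. Only a slice
// header is created; the array itself is not copied.
func (w *window) rangeSlice() (n uint32) {
	for _, v := range w.hist[:] {
		n += v
	}
	return n
}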

Also, assigning a large struct literal uses the stack even
though the actual fields being populated are small in comparison
to the entirety of the struct (see #18636).

Fixing the stack growth does not alter CPU-time performance much,
since the stack growth and copying were such a tiny portion of the
compression work:

name                         old time/op    new time/op    delta
Encode/Digits/Default/1e4-8     332µs ± 1%     332µs ± 1%   ~     (p=0.796 n=10+10)
Encode/Digits/Default/1e5-8    5.07ms ± 2%    5.05ms ± 1%   ~       (p=0.815 n=9+8)
Encode/Digits/Default/1e6-8    53.7ms ± 1%    53.9ms ± 1%   ~     (p=0.075 n=10+10)
Encode/Twain/Default/1e4-8      380µs ± 1%     380µs ± 1%   ~     (p=0.684 n=10+10)
Encode/Twain/Default/1e5-8     5.79ms ± 2%    5.79ms ± 1%   ~      (p=0.497 n=9+10)
Encode/Twain/Default/1e6-8     61.5ms ± 1%    61.8ms ± 1%   ~     (p=0.247 n=10+10)

name                         old speed      new speed      delta
Encode/Digits/Default/1e4-8  30.1MB/s ± 1%  30.1MB/s ± 1%   ~     (p=0.753 n=10+10)
Encode/Digits/Default/1e5-8  19.7MB/s ± 2%  19.8MB/s ± 1%   ~       (p=0.795 n=9+8)
Encode/Digits/Default/1e6-8  18.6MB/s ± 1%  18.5MB/s ± 1%   ~     (p=0.072 n=10+10)
Encode/Twain/Default/1e4-8   26.3MB/s ± 1%  26.3MB/s ± 1%   ~     (p=0.616 n=10+10)
Encode/Twain/Default/1e5-8   17.3MB/s ± 2%  17.3MB/s ± 1%   ~      (p=0.484 n=9+10)
Encode/Twain/Default/1e6-8   16.3MB/s ± 1%  16.2MB/s ± 1%   ~     (p=0.238 n=10+10)

Updates #18636
Fixes #18625

Change-Id: I471b20339bf675f63dc56d38b3acdd824fe23328
Reviewed-on: https://go-review.googlesource.com/35122
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-01-12 19:15:57 +00:00
Brad Fitzpatrick 2341631506 all: sprinkle t.Parallel on some slow tests
I used the slowtests.go tool as described in
https://golang.org/cl/32684 on packages that stood out.

go test -short std drops from ~56 to ~52 seconds.

This isn't a huge win, but it was mostly an exercise.
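
For reference, the pattern being sprinkled looks like this (a generic
sketch, not one of the tests actually touched by this CL):

package flate

import (
	"testing"
	"time"
)

func TestSomethingSlow(t *testing.T) {
	t.Parallel()                       // run alongside other parallel tests instead of serially
	time.Sleep(100 * time.Millisecond) // stand-in for the slow work
}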

Updates #17751

Change-Id: I9f3402e36a038d71e662d06ce2c1d52f6c4b674d
Reviewed-on: https://go-review.googlesource.com/32751
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-11-04 16:56:57 +00:00
Nigel Tao 590fce4884 compress/flate: tighten the BestSpeed max match offset bound.
Previously, we were off by one.

Also fix a comment typo.

Change-Id: Ib94d23acc56d5fccd44144f71655481f98803ac8
Reviewed-on: https://go-review.googlesource.com/32149
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-10-30 22:52:14 +00:00
Klaus Post 2e196b15b9 compress/flate: level 1 (best speed) match across blocks
This change makes deflate level 1 (best speed) match across
block boundaries. This comes at a small speed penalty,
but improves compression on almost all output.

Sample numbers on various content types:

enwik9:            391052014 ->  382578469 bytes, 77.59 -> 74.28 MB/s
adresser.001:       57269799 ->   47756095 bytes, 287.84 -> 357.86 MB/s
10gb:             5233055166 -> 5198328382 bytes, 105.85 -> 96.99 MB/s
rawstudio-mint14: 3972329211 -> 3927423364 bytes, 100.07 -> 94.22 MB/s
sites:             165556800 ->  163178702 bytes, 72.31 -> 70.15 MB/s
objectfiles:       115962472 ->  111649524 bytes, 132.60 -> 128.05 MB/s
sharnd.out:        200015283 ->  200015283 bytes, 221.50 -> 218.83 MB/s

Change-Id: I62a139e5c06976e803439a4268acede5139b8cfc
Reviewed-on: https://go-review.googlesource.com/31640
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
2016-10-27 00:50:06 +00:00
Joe Tsai 8cd04da762 compress/flate: make huffmanBitWriter errors persistent
For persistent error handling, the methods of huffmanBitWriter have to be
consistent about how they check errors. They must either consistently
check the error *before* every operation OR immediately *after* every
operation. Since most of the current logic uses the former approach,
we apply the same style of error checking to writeBits and all calls
to Write, so that they only operate if w.err is already nil going
into them.
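
A simplified sketch of that style (not the real huffmanBitWriter code):

package sketch

import "io"

type bitWriter struct {
	w   io.Writer
	err error
}

// write only performs work if no earlier operation has failed, so the
// first error encountered sticks and is the one eventually reported.
func (b *bitWriter) write(p []byte) {
	if b.err != nil {
		return
	}
	_, b.err = b.w.Write(p)
}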

The error handling approach is brittle and easily broken by future commits to
the code. In the near future, we should switch the logic to use panic at the
lowest levels and a recover at the edge of the public API to ensure
that errors are always persistent.

Fixes #16749

Change-Id: Ie1d83e4ed8842f6911a31e23311cd3cbf38abe8c
Reviewed-on: https://go-review.googlesource.com/27200
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-08-20 00:36:40 +00:00
Nigel Tao d8b7bd6a1f compress/flate: replace "Best Speed" with specialized version
This encoding algorithm, which prioritizes speed over output size, is
based on Snappy's LZ77-style encoder: github.com/golang/snappy

This commit keeps the diff between this package's encodeBestSpeed
function and Snappy's encodeBlock function as small as possible (see
the diff below). Follow-up commits will improve this package's
performance and output size.

This package's speed benchmarks:

name                    old speed      new speed      delta
EncodeDigitsSpeed1e4-8  40.7MB/s ± 0%  73.0MB/s ± 0%   +79.18%  (p=0.008 n=5+5)
EncodeDigitsSpeed1e5-8  33.0MB/s ± 0%  77.3MB/s ± 1%  +134.04%  (p=0.008 n=5+5)
EncodeDigitsSpeed1e6-8  32.1MB/s ± 0%  82.1MB/s ± 0%  +156.18%  (p=0.008 n=5+5)
EncodeTwainSpeed1e4-8   42.1MB/s ± 0%  65.0MB/s ± 0%   +54.61%  (p=0.008 n=5+5)
EncodeTwainSpeed1e5-8   46.3MB/s ± 0%  80.0MB/s ± 0%   +72.81%  (p=0.008 n=5+5)
EncodeTwainSpeed1e6-8   47.3MB/s ± 0%  81.7MB/s ± 0%   +72.86%  (p=0.008 n=5+5)

Here are the milliseconds taken, before and after this commit, to
compress a number of test files:

Go's src/compress/testdata files:

     4          1 e.txt
     8          4 Mark.Twain-Tom.Sawyer.txt

github.com/golang/snappy's benchmark files:

     3          1 alice29.txt
    12          3 asyoulik.txt
     6          1 fireworks.jpeg
     1          1 geo.protodata
     1          0 html
     2          2 html_x_4
     6          3 kppkn.gtb
    11          4 lcet10.txt
     5          1 paper-100k.pdf
    14          6 plrabn12.txt
    17          6 urls.10K

Larger files linked to from
https://docs.google.com/spreadsheets/d/1VLxi-ac0BAtf735HyH3c1xRulbkYYUkFecKdLPH7NIQ/edit#gid=166102500

  2409       3182 adresser.001
 16757      11027 enwik9
 13764      12946 gob-stream
153978      74317 rawstudio-mint14.tar
  4371        770 sharnd.out

Output size is larger. In the table below, the first column is the input
size, the second column is the output size prior to this commit, the
third column is the output size after this commit.

    100003      47707      50006 e.txt
    387851     172707     182930 Mark.Twain-Tom.Sawyer.txt
    152089      62457      66705 alice29.txt
    125179      54503      57274 asyoulik.txt
    123093     122827     123108 fireworks.jpeg
    118588      18574      20558 geo.protodata
    102400      16601      17305 html
    409600      65506      70313 html_x_4
    184320      49007      50944 kppkn.gtb
    426754     166957     179355 lcet10.txt
    102400      82126      84937 paper-100k.pdf
    481861     218617     231988 plrabn12.txt
    702087     241774     258020 urls.10K
1073741824   43074110   57269781 adresser.001
1000000000  365772256  391052000 enwik9
1911399616  340364558  378679516 gob-stream
8558382592 3807229562 3972329193 rawstudio-mint14.tar
 200000000  200061040  200015265 sharnd.out

The diff between github.com/golang/snappy's encodeBlock function and
this commit's encodeBestSpeed function:

1c1,7
< func encodeBlock(dst, src []byte) (d int) {
---
> func encodeBestSpeed(dst []token, src []byte) []token {
> 	// This check isn't in the Snappy implementation, but there, the caller
> 	// instead of the callee handles this case.
> 	if len(src) < minNonLiteralBlockSize {
> 		return emitLiteral(dst, src)
> 	}
>
4c10
< 	// and len(src) <= maxBlockSize and maxBlockSize == 65536.
---
> 	// and len(src) <= maxStoreBlockSize and maxStoreBlockSize == 65535.
65c71
< 			if load32(src, s) == load32(src, candidate) {
---
> 			if s-candidate < maxOffset && load32(src, s) == load32(src, candidate) {
73c79
< 		d += emitLiteral(dst[d:], src[nextEmit:s])
---
> 		dst = emitLiteral(dst, src[nextEmit:s])
90c96
< 			// This is an inlined version of:
---
> 			// This is an inlined version of Snappy's:
93c99,103
< 			for i := candidate + 4; s < len(src) && src[i] == src[s]; i, s = i+1, s+1 {
---
> 			s1 := base + maxMatchLength
> 			if s1 > len(src) {
> 				s1 = len(src)
> 			}
> 			for i := candidate + 4; s < s1 && src[i] == src[s]; i, s = i+1, s+1 {
96c106,107
< 			d += emitCopy(dst[d:], base-candidate, s-base)
---
> 			// matchToken is flate's equivalent of Snappy's emitCopy.
> 			dst = append(dst, matchToken(uint32(s-base-3), uint32(base-candidate-minOffsetSize)))
114c125
< 			if uint32(x>>8) != load32(src, candidate) {
---
> 			if s-candidate >= maxOffset || uint32(x>>8) != load32(src, candidate) {
124c135
< 		d += emitLiteral(dst[d:], src[nextEmit:])
---
> 		dst = emitLiteral(dst, src[nextEmit:])
126c137
< 	return d
---
> 	return dst

This change is based on https://go-review.googlesource.com/#/c/21021/ by
Klaus Post, but it is a separate changelist as cl/21021 seems to have
stalled in code review, and the Go 1.7 feature freeze approaches.

Golang-dev discussion:
https://groups.google.com/d/topic/golang-dev/XYgHX9p8IOk/discussion and
of course cl/21021.

Change-Id: Ib662439417b3bd0b61c2977c12c658db3e44d164
Reviewed-on: https://go-review.googlesource.com/22370
Reviewed-by: Russ Cox <rsc@golang.org>
2016-04-29 06:58:42 +00:00
Klaus Post 42ad1dc01e compress/flate: add pure huffman deflater
Add a "HuffmanOnly" compression level, where the input is
only entropy encoded.

The output is fully inflate compatible. Typical compression
reduction is about 50% of typical level 1 compression; however,
the compression time is very stable and does not vary nearly as
much as level 1 compression (or Snappy).

This mode is useful for:
 * HTTP compression in a CPU limited environment.
 * Entropy encoding Snappy compressed data, for archiving, etc.
 * Compression where compression time needs to be predictable.
 * Fast network transfer.

Snappy "usually" performs inbetween this and level 1 compression-wise,
but at the same speed as "Huffman", so this is not a replacement,
but a good supplement for Snappy, since it usually can compress
Snappy output further.

This is implemented as level -2, since this would be too much of a
compression reduction to replace level 1.
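
For context, a minimal usage sketch, assuming the level is exposed as
flate.HuffmanOnly (the exported name in the flate package for the -2
described above):

package main

import (
	"bytes"
	"compress/flate"
	"log"
)

func main() {
	var buf bytes.Buffer
	// HuffmanOnly: entropy coding only, no LZ77 matching.
	zw, err := flate.NewWriter(&buf, flate.HuffmanOnly)
	if err != nil {
		log.Fatal(err)
	}
	if _, err := zw.Write([]byte("hello hello hello")); err != nil {
		log.Fatal(err)
	}
	if err := zw.Close(); err != nil {
		log.Fatal(err)
	}
	// buf now holds ordinary DEFLATE data that flate.NewReader can inflate.
}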

>go test -bench=Encode -cpu=1
BenchmarkEncodeDigitsHuffman1e4            30000             52334 ns/op         191.08 MB/s
BenchmarkEncodeDigitsHuffman1e5             3000            518343 ns/op         192.92 MB/s
BenchmarkEncodeDigitsHuffman1e6              300           5356884 ns/op         186.68 MB/s
BenchmarkEncodeDigitsSpeed1e4               5000            324214 ns/op          30.84 MB/s
BenchmarkEncodeDigitsSpeed1e5                500           3952614 ns/op          25.30 MB/s
BenchmarkEncodeDigitsSpeed1e6                 30          40760350 ns/op          24.53 MB/s
BenchmarkEncodeDigitsDefault1e4             5000            387056 ns/op          25.84 MB/s
BenchmarkEncodeDigitsDefault1e5              300           5950614 ns/op          16.80 MB/s
BenchmarkEncodeDigitsDefault1e6               20          63842195 ns/op          15.66 MB/s
BenchmarkEncodeDigitsCompress1e4            5000            391859 ns/op          25.52 MB/s
BenchmarkEncodeDigitsCompress1e5             300           5707112 ns/op          17.52 MB/s
BenchmarkEncodeDigitsCompress1e6              20          59839465 ns/op          16.71 MB/s
BenchmarkEncodeTwainHuffman1e4             20000             73498 ns/op         136.06 MB/s
BenchmarkEncodeTwainHuffman1e5              2000            595892 ns/op         167.82 MB/s
BenchmarkEncodeTwainHuffman1e6               200           6059016 ns/op         165.04 MB/s
BenchmarkEncodeTwainSpeed1e4                5000            321212 ns/op          31.13 MB/s
BenchmarkEncodeTwainSpeed1e5                 500           2823873 ns/op          35.41 MB/s
BenchmarkEncodeTwainSpeed1e6                  50          27237864 ns/op          36.71 MB/s
BenchmarkEncodeTwainDefault1e4              3000            454634 ns/op          22.00 MB/s
BenchmarkEncodeTwainDefault1e5               200           6859537 ns/op          14.58 MB/s
BenchmarkEncodeTwainDefault1e6                20          71547405 ns/op          13.98 MB/s
BenchmarkEncodeTwainCompress1e4             3000            462307 ns/op          21.63 MB/s
BenchmarkEncodeTwainCompress1e5              200           7534992 ns/op          13.27 MB/s
BenchmarkEncodeTwainCompress1e6               20          80353365 ns/op          12.45 MB/s
PASS
ok      compress/flate  55.333s

Change-Id: I8e12ad13220e50d4cf7ddba6f292333efad61b0c
Reviewed-on: https://go-review.googlesource.com/20982
Reviewed-by: Joe Tsai <joetsai@digital-static.net>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
2016-03-29 09:34:52 +00:00
Klaus Post 53efe1e121 compress/flate: rework matching algorithm
This changes how matching is done in the deflate algorithm.

The major change is that we no longer look for matches that are only
3 bytes in length; matches must be at least 4 bytes.
Contrary to what you would expect, this actually improves the
compression ratio, since 3 literal bytes will often be shorter
than a match after Huffman encoding.
This varies a bit by source, but is most often the case when the
source is "easy" to compress.

Second of all, a "stronger" hash is used. The hash is similar to
the hashing function used by Snappy.
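
To make that concrete, an illustrative Snappy-style hash of the next
four input bytes (the multiplier and table size here are examples, not
necessarily the exact values this CL uses):

package sketch

const hashBits = 15 // example: a table of 1<<15 buckets

// hash4 mixes the next four bytes with a multiply-and-shift so that all
// four bytes of a potential match influence the bucket index.
func hash4(b []byte) uint32 {
	x := uint32(b[0]) | uint32(b[1])<<8 | uint32(b[2])<<16 | uint32(b[3])<<24
	return (x * 0x1e35a7bd) >> (32 - hashBits)
}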

Overall, the speed impact is biggest on higher compression levels.
I intend to replace the "speed" compression level, which can be
seen in CL 21021.

The built-in benchmark using "digits" is slower at level 1.
I see this as an exception, since "digits" is a special type
of data, where you have low entropy (numbers 0->9), but no
significant matches. Again, CL 21021 fixes that case.

NewWriterDict is also made considerably faster, by not running data
through the entire encoder. This is not reflected by the benchmark.

COMPARED to tip/master:
name                       old time/op    new time/op     delta
EncodeDigitsSpeed1e4-4        401µs ± 1%      345µs ± 2%   -13.95%
EncodeDigitsSpeed1e5-4       3.19ms ± 1%     4.27ms ± 3%   +33.96%
EncodeDigitsSpeed1e6-4       27.7ms ± 4%     43.8ms ± 3%   +58.00%
EncodeDigitsDefault1e4-4      641µs ± 0%      403µs ± 1%   -37.15%
EncodeDigitsDefault1e5-4     13.8ms ± 1%      6.4ms ± 3%   -53.73%
EncodeDigitsDefault1e6-4      162ms ± 1%       64ms ± 2%   -60.51%
EncodeDigitsCompress1e4-4     627µs ± 1%      405µs ± 2%   -35.45%
EncodeDigitsCompress1e5-4    13.9ms ± 0%      6.3ms ± 2%   -54.46%
EncodeDigitsCompress1e6-4     159ms ± 1%       64ms ± 0%   -59.91%
EncodeTwainSpeed1e4-4         433µs ± 4%      331µs ± 1%   -23.53%
EncodeTwainSpeed1e5-4        2.82ms ± 1%     3.08ms ± 0%    +9.10%
EncodeTwainSpeed1e6-4        28.1ms ± 2%     28.8ms ± 0%    +2.82%
EncodeTwainDefault1e4-4       695µs ± 4%      474µs ± 1%   -31.78%
EncodeTwainDefault1e5-4      11.8ms ± 0%      7.4ms ± 0%   -37.31%
EncodeTwainDefault1e6-4       128ms ± 0%       75ms ± 0%   -40.93%
EncodeTwainCompress1e4-4      719µs ± 3%      480µs ± 0%   -33.27%
EncodeTwainCompress1e5-4     15.0ms ± 3%      8.2ms ± 2%   -45.55%
EncodeTwainCompress1e6-4      170ms ± 0%       85ms ± 1%   -49.99%

name                       old speed      new speed       delta
EncodeDigitsSpeed1e4-4     25.0MB/s ± 1%   29.0MB/s ± 2%   +16.24%
EncodeDigitsSpeed1e5-4     31.4MB/s ± 1%   23.4MB/s ± 3%   -25.34%
EncodeDigitsSpeed1e6-4     36.1MB/s ± 4%   22.8MB/s ± 3%   -36.74%
EncodeDigitsDefault1e4-4   15.6MB/s ± 0%   24.8MB/s ± 1%   +59.11%
EncodeDigitsDefault1e5-4   7.27MB/s ± 1%  15.72MB/s ± 3%  +116.23%
EncodeDigitsDefault1e6-4   6.16MB/s ± 0%  15.60MB/s ± 2%  +153.25%
EncodeDigitsCompress1e4-4  15.9MB/s ± 1%   24.7MB/s ± 2%   +54.97%
EncodeDigitsCompress1e5-4  7.19MB/s ± 0%  15.78MB/s ± 2%  +119.62%
EncodeDigitsCompress1e6-4  6.27MB/s ± 1%  15.65MB/s ± 0%  +149.52%
EncodeTwainSpeed1e4-4      23.1MB/s ± 4%   30.2MB/s ± 1%   +30.68%
EncodeTwainSpeed1e5-4      35.4MB/s ± 1%   32.5MB/s ± 0%    -8.34%
EncodeTwainSpeed1e6-4      35.6MB/s ± 2%   34.7MB/s ± 0%    -2.77%
EncodeTwainDefault1e4-4    14.4MB/s ± 4%   21.1MB/s ± 1%   +46.48%
EncodeTwainDefault1e5-4    8.49MB/s ± 0%  13.55MB/s ± 0%   +59.50%
EncodeTwainDefault1e6-4    7.83MB/s ± 0%  13.25MB/s ± 0%   +69.19%
EncodeTwainCompress1e4-4   13.9MB/s ± 3%   20.8MB/s ± 0%   +49.83%
EncodeTwainCompress1e5-4   6.65MB/s ± 3%  12.20MB/s ± 2%   +83.51%
EncodeTwainCompress1e6-4   5.88MB/s ± 0%  11.76MB/s ± 1%  +100.06%

Change-Id: I724e33c1dd3e3a6a1b0a68e094baa959352baf32
Reviewed-on: https://go-review.googlesource.com/20929
Run-TryBot: Nigel Tao <nigeltao@golang.org>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
2016-03-23 11:33:29 +00:00
Brad Fitzpatrick 5fea2ccc77 all: single space after period.
The tree's pretty inconsistent about single space vs double space
after a period in documentation. Make it consistently a single space,
per earlier decisions. This means contributors won't be confused by
misleading precedence.

This CL doesn't use go/doc to parse. It only addresses // comments.
It was generated with:

$ perl -i -npe 's,^(\s*// .+[a-z]\.)  +([A-Z]),$1 $2,' $(git grep -l -E '^\s*//(.+\.)  +([A-Z])')
$ go test go/doc -update

Change-Id: Iccdb99c37c797ef1f804a94b22ba5ee4b500c4f7
Reviewed-on: https://go-review.googlesource.com/20022
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Dave Day <djd@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-03-02 00:13:47 +00:00
Russ Cox 04d732b4c2 build: shorten a few packages with long tests
Takes 3% off my all.bash run time.

For #10571.

Change-Id: I8f00f523d6919e87182d35722a669b0b96b8218b
Reviewed-on: https://go-review.googlesource.com/18087
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2015-12-29 15:46:44 +00:00
Brad Fitzpatrick 2ae77376f7 all: link to https instead of http
The one in misc/makerelease/makerelease.go is particularly bad and
probably warrants rotating our keys.

I didn't update old weekly notes, and reverted some changes involving
test code for now, since we're late in the Go 1.5 freeze. Otherwise,
the rest are all auto-generated changes, and all manually reviewed.

Change-Id: Ia2753576ab5d64826a167d259f48a2f50508792d
Reviewed-on: https://go-review.googlesource.com/12048
Reviewed-by: Rob Pike <r@golang.org>
2015-07-11 14:36:33 +00:00
Péter Surányi 9b6ccb1323 all: don't refer to code.google.com/p/go{,-wiki}/
Only documentation / comment changes. Update references to
point to golang.org permalinks or go.googlesource.com/go.
References in historical release notes under doc are left as is.

Change-Id: Icfc14e4998723e2c2d48f9877a91c5abef6794ea
Reviewed-on: https://go-review.googlesource.com/4060
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2015-02-06 14:41:47 +00:00
Russ Cox c007ce824d build: move package sources from src/pkg to src
Preparation was in CL 134570043.
This CL contains only the effect of 'hg mv src/pkg/* src'.
For more about the move, see golang.org/s/go14nopkg.
2014-09-08 00:08:51 -04:00