Commit Graph

19 Commits

Author SHA1 Message Date
Ilya Tocar 9eb219480e compress/bzip2: remove bit-tricks
Since compiler is now able to generate conditional moves, we can replace
bit-tricks with simple if/else. This even results in slightly better performance:

name            old time/op    new time/op    delta
DecodeDigits-6    13.4ms ± 4%    13.0ms ± 2%  -2.63%  (p=0.003 n=10+10)
DecodeTwain-6     37.5ms ± 1%    36.3ms ± 1%  -3.03%  (p=0.000 n=10+9)
DecodeRand-6      4.23ms ± 1%    4.07ms ± 1%  -3.67%  (p=0.000 n=10+9)

name            old speed      new speed      delta
DecodeDigits-6  7.47MB/s ± 4%  7.67MB/s ± 2%  +2.69%  (p=0.002 n=10+10)
DecodeTwain-6   10.4MB/s ± 1%  10.7MB/s ± 1%  +3.25%  (p=0.000 n=10+8)
DecodeRand-6    3.87MB/s ± 1%  4.03MB/s ± 2%  +4.08%  (p=0.000 n=10+10)
diff --git a/src/compress/bzip2/huffman.go b/src/compress/bzip2/huffman.go

Change-Id: Ie96ef1a9e07013b07e78f22cdccd531f3341caca
Reviewed-on: https://go-review.googlesource.com/102015
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Joe Tsai <joetsai@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-03-21 21:57:15 +00:00
Joe Kyo b26a4a1401 compress/bzip2: use sort.Slice in huffman.go
Change-Id: Ie4d23cdb81473a4c989a977a127479cf825084dc
Reviewed-on: https://go-review.googlesource.com/77850
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-02-17 00:34:26 +00:00
Joe Tsai 57f7bc3a05 compress/bzip2: fix checksum mismatch on empty reads
Previously, the read method checked whether the current block
was fully consumed or not based on whether the buffer could be filled
with a non-zero number of bytes. This check is problematic because
zero bytes could be read if the provided buffer is empty.

We fix this case by simply checking for whether the input buffer
provided by the user was empty or not. If empty, we assume that
we could not read any bytes because the buffer was too small,
rather than indicating that the current block was fully exhausted.

This check causes bzip2.Reader to be unable to make progress
on the next block unless a non-empty buffer is provided.
However, that is an entirely reasonable expectation since a
non-empty buffer needs to be provided eventually anyways to
read the actual contents of subsequent blocks.

Fixes #22028

Change-Id: I2bb1b2d54e78567baf2bf7b490a272c0853d7bfe
Reviewed-on: https://go-review.googlesource.com/66110
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-09-25 23:05:58 +00:00
Joe Tsai e0e4891827 compress/bzip2: remove dead code in huffman.go
The logic performs a series of shifts, which are useless given
that they are followed by an assignment that overrides the
value of the previous computation.

I suspect (but cannot prove) that this is leftover logic from an
original approach that attempted to store both the Huffman code
and the length within the same variable instead of using two
different variables as it currently does now.

Fixes #17949

Change-Id: Ibf6c807c6cef3b28bfdaf2b68d9bc13503ac21b2
Reviewed-on: https://go-review.googlesource.com/44091
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-05-24 21:56:48 +00:00
Elias Naur 2b780af08e Revert "all: test adjustments for the iOS builder"
This reverts commit 467109bf56.

Replaced by a improved strategy later in the CL relation chain.

Change-Id: Ib90813b5a6c4716b563c8496013d2d57f9c022b8
Reviewed-on: https://go-review.googlesource.com/36066
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Run-TryBot: Elias Naur <elias.naur@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-03-04 00:01:14 +00:00
David Crawshaw 467109bf56 all: test adjustments for the iOS builder
The working directory is now adjusted to match the typical Go test
working directory in main, as the old trick for adjusting earlier
stopped working with the latest version of LLDB bugs.

That means the small number of places where testdata files are
read before main is called no longer work. This CL adjusts those
reads to happen after main is called. (This has the bonus effect of
not reading some benchmark testdata files in all.bash.)

Fixes compress/bzip2, go/doc, go/parser, os, and time package
tests on the iOS builder.

Change-Id: If60f026aa7848b37511c36ac5e3985469ec25209
Reviewed-on: https://go-review.googlesource.com/35255
Run-TryBot: David Crawshaw <crawshaw@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-01-14 03:27:53 +00:00
Matthew Dempsky 0da4dbe232 all: remove unnecessary type conversions
cmd and runtime were handled separately, and I'm intentionally skipped
syscall. This is the rest of the standard library.

CL generated mechanically with github.com/mdempsky/unconvert.

Change-Id: I9e0eff886974dedc37adb93f602064b83e469122
Reviewed-on: https://go-review.googlesource.com/22104
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-15 07:31:45 +00:00
Dominik Honnef fdba5a7544 all: delete dead non-test code
This change removes a lot of dead code. Some of the code has never been
used, not even when it was first commited. The rest shouldn't have
survived refactors.

This change doesn't remove unused routines helpful for debugging, nor
does it remove code that's used in commented out blocks of code that are
only unused temporarily. Furthermore, unused constants weren't removed
when they were part of a set of constants from specifications.

One noteworthy omission from this CL are about 1000 lines of unused code
in cmd/fix, 700 lines of which are the typechecker, which hasn't been
used ever since the pre-Go 1 fixes have been removed. I wasn't sure if
this code should stick around for future uses of cmd/fix or be culled as
well.

Change-Id: Ib714bc7e487edc11ad23ba1c3222d1fd02e4a549
Reviewed-on: https://go-review.googlesource.com/20926
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-03-25 06:28:13 +00:00
Joe Tsai 8a83d36b8e compress/bzip2: refactor unit tests
Over the years as more bugs were discovered with the bzip2 library,
new Tests were appended the unit tests and the tests became gnarly.

Clean up the tests to be more consistent with modern Go style in
addition to coalescing common tests into a general version that
iterates over a list of input/output pairs. This has the advantage that
the input, output, and test code are all in the same area, rather than
being sprawled around the test file.

There is no loss of test coverage.

Change-Id: I377ed89378f0b89763d4a56ffc37b22d9c2a369e
Reviewed-on: https://go-review.googlesource.com/20133
Run-TryBot: Joe Tsai <joetsai@digital-static.net>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2016-03-03 02:53:34 +00:00
Joe Tsai 8b360d5fda compress/bzip2: prevent zero-length Huffman codes
Unlike RFC 1951 (DEFLATE), bzip2 does not use zero-length Huffman codes
to indicate that the symbol is missing. Instead, bzip2 uses a sparse
bitmap to indicate which symbols are present. Thus, it is undefined what
happens when a length of zero is used. Thus, fix the parsing logic so that
the length cannot ever go below 1-bit similar to how the C logic does things.

To confirm that the C bzip2 utility chokes on this data:
	$ echo "425a6836314159265359b1f7404b000000400040002000217d184682ee48
	a70a12163ee80960" | xxd -r -p | bzip2 -d

	bzip2: Data integrity error when decompressing

For reference see:
	bzip2-1.0.6/decompress.c:320

Change-Id: Ic1568f8e7f80cdea51d887b4d712cc239c2fe85e
Reviewed-on: https://go-review.googlesource.com/20119
Run-TryBot: Joe Tsai <joetsai@digital-static.net>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2016-03-02 16:32:46 +00:00
Joe Tsai ff27421067 compress/bzip2: fix benchmark to actually measure decompression rate
Motivation:
* Previously, the size of the compressed data was used for metrics,
rather than the uncompressed size. This causes the library to appear
to perform poorly relative to C or other implementation. Switch it
to use the uncompressed size so that it matches how decompression
benchmarks are usually done (like in compress/flate). This also makes
it easier to compare bzip2 rates to other algorithms since they measure
performance in this way.
* Also, reset the timer after doing initialization work.

Change-Id: I32112c2ee8e7391e658c9cf31039f70a689d9b9d
Reviewed-on: https://go-review.googlesource.com/17611
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-02-28 11:06:40 +00:00
Joe Tsai 47379929f1 compress/bzip2: use correct block size
The bzip2 block size is a multiple of 100*1000 not 100*1024.
Thus, the bzip2 decoder would incorrectly decode files with larger
block sizes when it should have otherwise failed.
Fortunately, we can correct this in a backwards compatible way since
Go has no implementation of a bzip2 encoder to produce bad blocks :)

To confirm that the C bzip2 utlity chokes on this data:
	$ echo "425a683131415926535936dc55330063ffc0006000200020a40830008b00
	08b8bb9229c28481b6e2a998" | xxd -r -p | bzip2 -d

	bzip2: Data integrity error when decompressing.

Fixes #13941

Change-Id: I2402e8829a8027ef94dd4fac050b200440a3d4e4
Reviewed-on: https://go-review.googlesource.com/20011
Run-TryBot: Joe Tsai <joetsai@digital-static.net>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-02-28 10:54:03 +00:00
Joe Tsai 0ea1c1f671 compress/bzip2/testdata: make Mark.Twain-Tom.Sawyer.txt free
Commit 7a1fb95d50 strips non-free license
from Mark.Twain-Tom.Sawyer.txt, but forgot to remove it from the compressed
version of the file.

Update #13216

Change-Id: I60f53275d56ba5baa6898db47b1d41f85e985c00
Reviewed-on: https://go-review.googlesource.com/17264
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2015-12-02 02:40:38 +00:00
Jakub Čajka 7a1fb95d50 compress: make Mark.Twain-Tom.Sawyer.txt licensed under non-free license free again
This change strips non-free license from Mark.Twain-Tom.Sawyer.txt along with all reference to Project Gutenberg in the file and the whole source tree. Making the file public domain again.

Fixes #13216

Change-Id: I2f41b0de225f627dde152efe93c006a4c24be668
Reviewed-on: https://go-review.googlesource.com/17196
Reviewed-by: Andrew Gerrand <adg@golang.org>
2015-11-24 21:50:01 +00:00
Alberto Donizetti 6403c957e0 compress/bzip2: make decoding faster
Issue 6754 reports that Go bzip2 Decode function is much slower
(about 2.5x in go1.5) than the Python equivalent (which is
actually just a wrapper around the usual C library) on random data.

Profiling the code shows that half a dozen of CMP instructions in a
tight loop are responsibile for most of the execution time.

This patch reduces the number of branches of the loop, greatly
improving performance on random data and speeding up decoding of
real data.

name            old time/op    new time/op    delta
DecodeDigits-4    9.28ms ± 1%    8.05ms ± 1%  -13.18%  (p=0.000 n=15+14)
DecodeTwain-4     28.9ms ± 2%    26.4ms ± 1%   -8.57%  (p=0.000 n=15+14)
DecodeRand-4      3.94ms ± 1%    3.06ms ± 1%  -22.45%  (p=0.000 n=15+14)

name            old speed      new speed      delta
DecodeDigits-4  4.65MB/s ± 1%  5.36MB/s ± 1%  +15.21%  (p=0.000 n=13+14)
DecodeTwain-4   4.32MB/s ± 2%  4.72MB/s ± 1%   +9.36%  (p=0.000 n=15+14)
DecodeRand-4    4.27MB/s ± 1%  5.51MB/s ± 1%  +28.86%  (p=0.000 n=15+14)

I've run some benchmark comparing Go bzip2 implementation with the
usual Linux bzip2 command (which is written in C). On my machine
this patch brings go1.5
  from ~2.26x to ~1.50x of bzip2 time (on 64MB  random data)
  from ~1.70x to ~1.50x of bzip2 time (on 100MB english text)
  from ~2.00x to ~1.88x of bzip2 time (on 64MB  /dev/zero data)

Fixes #6754

Change-Id: I3cb12d2c0c2243c1617edef1edc88f05f91d26d1
Reviewed-on: https://go-review.googlesource.com/13853
Reviewed-by: Nigel Tao <nigeltao@golang.org>
2015-08-28 04:20:56 +00:00
Péter Surányi 9b6ccb1323 all: don't refer to code.google.com/p/go{,-wiki}/
Only documentation / comment changes. Update references to
point to golang.org permalinks or go.googlesource.com/go.
References in historical release notes under doc are left as is.

Change-Id: Icfc14e4998723e2c2d48f9877a91c5abef6794ea
Reviewed-on: https://go-review.googlesource.com/4060
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2015-02-06 14:41:47 +00:00
David Crawshaw 895e4b8550 compress/bzip2: s/repeat_power/repeatPower/
Change-Id: I64c8c247acd5d134b2f17ed7aab0a035d7710679
Reviewed-on: https://go-review.googlesource.com/1804
Reviewed-by: Minux Ma <minux@golang.org>
2014-12-19 01:29:00 +00:00
Russ Cox a6abe22eb6 compress/*: note that NewReader may introduce buffering
Fixes #8309.

LGTM=r
R=golang-codereviews, r
CC=golang-codereviews, iant
https://golang.org/cl/147380043
2014-09-30 12:31:18 -04:00
Russ Cox c007ce824d build: move package sources from src/pkg to src
Preparation was in CL 134570043.
This CL contains only the effect of 'hg mv src/pkg/* src'.
For more about the move, see golang.org/s/go14nopkg.
2014-09-08 00:08:51 -04:00