Commit Graph

46 Commits

Author SHA1 Message Date
yangwenmai d553c0144d bits: use same expression with system bit size
Change-Id: Ibce07f8f36f7c64f7022ce656f8efbec5dff3f82
Reviewed-on: https://go-review.googlesource.com/c/go/+/313829
Reviewed-by: Robert Griesemer <gri@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Trust: Robert Griesemer <gri@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
2021-04-27 16:25:40 +00:00
Meng Zhuo 119d76d98e math/bits: folded reverse tables by using const string
linux/mips64le

name            old time/op  new time/op  delta
Reverse         2.31ns ± 1%  2.27ns ± 1%  -1.53%  (p=0.001 n=10+10)
Reverse8        0.65ns ± 1%  0.65ns ± 1%  -1.19%  (p=0.000 n=9+10)
Reverse16       1.15ns ± 2%  1.14ns ± 2%    ~     (p=0.062 n=9+10)
Reverse32       1.96ns ± 1%  1.94ns ± 1%  -1.16%  (p=0.000 n=10+9)
Reverse64       2.29ns ± 1%  2.26ns ± 0%  -0.94%  (p=0.000 n=9+9)
ReverseBytes    0.66ns ± 3%  0.65ns ± 1%  -1.58%  (p=0.006 n=9+10)
ReverseBytes16  0.66ns ± 2%  0.65ns ± 1%  -2.05%  (p=0.000 n=10+9)
ReverseBytes32  0.41ns ± 1%  0.40ns ± 0%  -1.68%  (p=0.000 n=10+10)
ReverseBytes64  0.66ns ± 1%  0.65ns ± 1%  -1.50%  (p=0.000 n=10+9)

cpu=1 benchtime=100ms count=100

name            old time/op  new time/op  delta
Reverse         28.0ns ± 3%  27.7ns ± 3%   -0.80%  (p=0.000 n=100+98)
Reverse8        2.24ns ± 1%  2.24ns ± 1%     ~     (p=0.142 n=98+100)
Reverse16       4.07ns ± 3%  4.05ns ± 3%   -0.66%  (p=0.000 n=99+99)
Reverse32       11.3ns ± 0%  11.3ns ± 0%     ~     (p=0.283 n=94+97)
Reverse64       12.6ns ± 0%  12.6ns ± 0%   +0.60%  (p=0.000 n=100+98)
ReverseBytes    5.25ns ± 1%  5.24ns ± 1%   -0.18%  (p=0.000 n=100+100)
ReverseBytes16  2.00ns ± 0%  2.21ns ± 3%  +10.07%  (p=0.000 n=88+100)
ReverseBytes32  4.08ns ± 2%  4.13ns ± 2%   +1.39%  (p=0.000 n=99+99)
ReverseBytes64  5.48ns ± 1%  5.45ns ± 1%   -0.50%  (p=0.000 n=98+99)

Update #43403

Change-Id: I7e7e00bb17608739d9f6b927c6dfef2580493a0e
Reviewed-on: https://go-review.googlesource.com/c/go/+/280645
Trust: Meng Zhuo <mzh@golangcn.org>
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
2021-03-17 03:18:12 +00:00
Russ Cox d4b2638234 all: go fmt std cmd (but revert vendor)
Make all our package sources use Go 1.17 gofmt format
(adding //go:build lines).

Part of //go:build change (#41184).
See https://golang.org/design/draft-gobuild

Change-Id: Ia0534360e4957e58cd9a18429c39d0e32a6addb4
Reviewed-on: https://go-review.googlesource.com/c/go/+/294430
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2021-02-20 03:54:50 +00:00
Russ Cox 4f1b0a44cb all: update to use os.ReadFile, os.WriteFile, os.CreateTemp, os.MkdirTemp
As part of #42026, these helpers from io/ioutil were moved to os.
(ioutil.TempFile and TempDir became os.CreateTemp and MkdirTemp.)

Update the Go tree to use the preferred names.

As usual, code compiled with the Go 1.4 bootstrap toolchain
and code vendored from other sources is excluded.

ReadDir changes are in a separate CL, because they are not a
simple search and replace.

For #42026.

Change-Id: If318df0216d57e95ea0c4093b89f65e5b0ababb3
Reviewed-on: https://go-review.googlesource.com/c/go/+/266365
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2020-12-09 19:12:23 +00:00
Michael Munday ab7a65f283 cmd/compile: clean up codegen for branch-on-carry on s390x
This CL optimizes code that uses a carry from a function such as
bits.Add64 as the condition in an if statement. For example:

    x, c := bits.Add64(a, b, 0)
    if c != 0 {
        panic("overflow")
    }

Rather than converting the carry into a 0 or a 1 value and using
that as an input to a comparison instruction the carry flag is now
used as the input to a conditional branch directly. This typically
removes an ADD LOGICAL WITH CARRY instruction when user code is
doing overflow detection and is closer to the code that a user
would expect to generate.

Change-Id: I950431270955ab72f1b5c6db873b6abe769be0da
Reviewed-on: https://go-review.googlesource.com/c/go/+/219757
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-04-22 20:11:06 +00:00
Alberto Donizetti 57c63e0fb2 math/bits: add Rem, Rem32, Rem64
The Div functions in math/bits (Div, Div32, and Div64) compute both
quotients and remainders, but they panic if the quotients do not not
fit a 32/64 uint.

Since, on the other hand, the remainder will always fit the size of
the divisor, it is useful to have Div variants that only compute the
remainder, and don't panic on a quotient overflow.

This change adds to the math/bits package three new functions:

  Rem(hi, lo, y uint) uint
  Rem32(hi, lo, y uint32) uint32
  Rem64(hi, lo, y uint64) uint64

which can be used to compute (hi,lo)%y even when the quotient
overflows the uint size.

Fixes #28970

Change-Id: I119948429f737670c5e5ceb8756121e6a738dbdc
Reviewed-on: https://go-review.googlesource.com/c/go/+/197838
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2019-10-18 13:47:36 +00:00
Filippo Valsorda 41329c07f9 math/bits: document that Add, Sub, Mul, RotateLeft, ReverseBytes are constant time
Fixes #31267

Change-Id: I91e4aa8cf9d797689cb9612d0fe3bf1bb3ad15a6
Reviewed-on: https://go-review.googlesource.com/c/go/+/178177
Reviewed-by: Keith Randall <khr@golang.org>
2019-05-21 20:15:52 +00:00
adarsh ravichandran 776e1709e5 math/bits: add example for OnesCount function
Change-Id: Id87db9bed5e8715d554c1bf95c063d7d0a03c3e9
Reviewed-on: https://go-review.googlesource.com/c/go/+/178117
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-05-20 18:16:09 +00:00
smasher164 5ca44dc403 math/bits: make Add and Sub fallbacks constant time
Make the extended precision add-with-carry and sub-with-carry operations
take a constant amount of time to execute, regardless of input.

name             old time/op  new time/op  delta
Add-4            1.16ns ±11%  1.51ns ± 5%  +30.52%  (p=0.008 n=5+5)
Add32-4          1.08ns ± 0%  1.03ns ± 1%   -4.86%  (p=0.029 n=4+4)
Add64-4          1.09ns ± 1%  1.95ns ± 3%  +79.23%  (p=0.008 n=5+5)
Add64multiple-4  4.03ns ± 1%  4.55ns ±11%  +13.07%  (p=0.008 n=5+5)
Sub-4            1.08ns ± 1%  1.50ns ± 0%  +38.17%  (p=0.016 n=5+4)
Sub32-4          1.09ns ± 2%  1.53ns ±10%  +40.26%  (p=0.008 n=5+5)
Sub64-4          1.10ns ± 1%  1.47ns ± 1%  +33.39%  (p=0.008 n=5+5)
Sub64multiple-4  4.30ns ± 2%  4.08ns ± 4%   -5.07%  (p=0.032 n=5+5)

Fixes #31267

Change-Id: I1824b1b3ab8f09902ce8b5fef84ce2fdb8847ed9
Reviewed-on: https://go-review.googlesource.com/c/go/+/170758
Reviewed-by: Filippo Valsorda <filippo@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Filippo Valsorda <filippo@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-05-20 01:13:27 +00:00
Than McIntosh bead358611 math/bits: add gccgo-friendly code for compiler bootstrap
When building as part of the bootstrap process, avoid
use of "go:linkname" applied to variables, since this
feature is ill-defined/unsupported for gccgo.

Updates #30771.

Change-Id: Id44d01b5c98d292702e5075674117518cb59e2d0
Reviewed-on: https://go-review.googlesource.com/c/go/+/170737
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-04-04 18:17:30 +00:00
erifan01 d0cbf9bf53 cmd/compile: follow up intrinsifying math/bits.Add64 for arm64
This CL deals with the additional comments of CL 159017.

Change-Id: I4ad3c60c834646d58dc0c544c741b92bfe83fb8b
Reviewed-on: https://go-review.googlesource.com/c/go/+/168857
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-03-22 15:09:47 +00:00
erifan01 5714c91b53 cmd/compile: intrinsify math/bits.Add64 for arm64
This CL instrinsifies Add64 with arm64 instruction sequence ADDS, ADCS
and ADC, and optimzes the case of carry chains.The CL also changes the
test code so that the intrinsic implementation can be tested.

Benchmarks:
name               old time/op       new time/op       delta
Add-224            2.500000ns +- 0%  2.090000ns +- 4%  -16.40%  (p=0.000 n=9+10)
Add32-224          2.500000ns +- 0%  2.500000ns +- 0%     ~     (all equal)
Add64-224          2.500000ns +- 0%  1.577778ns +- 2%  -36.89%  (p=0.000 n=10+9)
Add64multiple-224  6.000000ns +- 0%  2.000000ns +- 0%  -66.67%  (p=0.000 n=10+10)

Change-Id: I6ee91c9a85c16cc72ade5fd94868c579f16c7615
Reviewed-on: https://go-review.googlesource.com/c/go/+/159017
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2019-03-20 05:39:49 +00:00
Michael Munday 42a82ce1a7 math/bits: optimize Reverse32 and Reverse64
Use ReverseBytes32 and ReverseBytes64 to speed up these functions.
The byte reversal functions are intrinsics on most platforms and
generally compile to a single instruction.

name       old time/op  new time/op  delta
Reverse32  2.41ns ± 1%  1.94ns ± 3%  -19.60%  (p=0.000 n=20+19)
Reverse64  3.85ns ± 1%  2.56ns ± 1%  -33.32%  (p=0.000 n=17+19)

Change-Id: I160bf59a0c7bd5db94114803ec5a59fae448f096
Reviewed-on: https://go-review.googlesource.com/c/159358
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-02-26 17:52:08 +00:00
Alberto Donizetti 11ce6eabd6 math/bits: remove named return in TrailingZeros16
TrailingZeros16 is the only one of the TrailingZeros functions with a
named return value in the signature. This creates a sligthly
unpleasant effect in the godoc listing:

  func TrailingZeros(x uint) int
  func TrailingZeros16(x uint16) (n int)
  func TrailingZeros32(x uint32) int
  func TrailingZeros64(x uint64) int
  func TrailingZeros8(x uint8) int

Since the named return value is not even used, remove it.

Change-Id: I15c5aedb6157003911b6e0685c357ce56e466c0e
Reviewed-on: https://go-review.googlesource.com/c/153340
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-12-09 14:27:56 +00:00
Brian Kessler 979d9027ae math/bits: define Div to panic when y<=hi
Div panics when y<=hi because either the quotient overflows
the size of the output or division by zero occurs when y==0.
This provides a uniform behavior for all implementations.

Fixes #28316

Change-Id: If23aeb10e0709ee1a60b7d614afc9103d674a980
Reviewed-on: https://go-review.googlesource.com/c/149517
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-11-27 05:04:33 +00:00
Brian Kessler ead5d1e316 math/bits: panic when y<=hi in Div
Explicitly check for divide-by-zero/overflow and panic with the appropriate
runtime error.  The additional checks have basically no effect on performance
since the branch is easily predicted.

name     old time/op  new time/op  delta
Div-4    53.9ns ± 1%  53.0ns ± 1%  -1.59%  (p=0.016 n=4+5)
Div32-4  17.9ns ± 0%  18.4ns ± 0%  +2.56%  (p=0.008 n=5+5)
Div64-4  53.5ns ± 0%  53.3ns ± 0%    ~     (p=0.095 n=5+5)

Updates #28316

Change-Id: I36297ee9946cbbc57fefb44d1730283b049ecf57
Reviewed-on: https://go-review.googlesource.com/c/144377
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-11-27 05:04:14 +00:00
Keith Randall 899f3a2892 cmd/compile: intrinsify math/bits.Add on amd64
name             old time/op  new time/op  delta
Add-8            1.11ns ± 0%  1.18ns ± 0%   +6.31%  (p=0.029 n=4+4)
Add32-8          1.02ns ± 0%  1.02ns ± 1%     ~     (p=0.333 n=4+5)
Add64-8          1.11ns ± 1%  1.17ns ± 0%   +5.79%  (p=0.008 n=5+5)
Add64multiple-8  4.35ns ± 1%  0.86ns ± 0%  -80.22%  (p=0.000 n=5+4)

The individual ops are a bit slower (but still very fast).
Using the ops in carry chains is very fast.

Update #28273

Change-Id: Id975f76df2b930abf0e412911d327b6c5b1befe5
Reviewed-on: https://go-review.googlesource.com/c/144257
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-10-25 19:47:00 +00:00
Brian Kessler 127c51e48c math/bits: correct BenchmarkSub64
Previously, the benchmark was measuring Add64 instead of Sub64.

Change-Id: I0cf30935c8a4728bead9868834377aae0b34f008
Reviewed-on: https://go-review.googlesource.com/c/144380
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-10-24 14:53:19 +00:00
Brian Kessler 13de5e7f7f math/bits: add extended precision Add, Sub, Mul, Div
Port math/big pure go versions of add-with-carry, subtract-with-borrow,
full-width multiply, and full-width divide.

Updates #24813

Change-Id: Ifae5d2f6ee4237137c9dcba931f69c91b80a4b1c
Reviewed-on: https://go-review.googlesource.com/123157
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-09-11 17:54:20 +00:00
Martin Möhrmann 8c4170b2c9 math/bits: move tests into their own package
This makes math/bits not have any explicit imports even
when compiling tests and thereby avoids import cycles when
dependencies of testing want to import math/bits.

Change-Id: I95eccae2f5c4310e9b18124abfa85212dfbd9daa
Reviewed-on: https://go-review.googlesource.com/110479
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-05-01 15:33:01 +00:00
Tobias Klauser 89bcbf40b8 math/bits: add examples for right rotation
Right rotation is achieved using negative k in RotateLeft*(x, k). Add
examples demonstrating that functionality.

Change-Id: I15dab159accd2937cb18d3fa8ca32da8501567d3
Reviewed-on: https://go-review.googlesource.com/75371
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2017-11-03 20:12:07 +00:00
griesemer 2ddd07138d math/bits: complete examples
Change-Id: Icbe6885ffd3aa4e77441ab03a2b9a04a9276d5eb
Reviewed-on: https://go-review.googlesource.com/68311
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
2017-10-06 16:58:03 +00:00
romanyx 92cfd07a6c math/bits: examples generator
Change-Id: Icdd0566d3b7dbc034256e16f8a6b6f1af07069b3
Reviewed-on: https://go-review.googlesource.com/54350
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-08-11 11:05:01 +00:00
Wembley G. Leach, Jr 762a0bae06 math/bits: Add examples for Reverse functions
Change-Id: I30563d31f6acea594cc853cc6b672ec664f90d48
Reviewed-on: https://go-review.googlesource.com/53636
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-08-09 18:02:36 +00:00
romanyx fa155066c4 math/bits: some regular examples for functions
Change-Id: Iee1b3e116b4dcc4071d6512abc5241eabedaeb5c
Reviewed-on: https://go-review.googlesource.com/53850
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-08-09 13:25:29 +00:00
Francesc Campoy Flores 3e3da54633 math/bits: fix example for OnesCount64
Erroneously called OnesCount instead of OnesCount64

Change-Id: Ie877e43f213253e45d31f64931c4a15915849586
Reviewed-on: https://go-review.googlesource.com/53410
Reviewed-by: Chris Broadfoot <cbro@golang.org>
2017-08-05 00:20:37 +00:00
Francesc Campoy 9b1e7cf2ac math/bits: add examples for OnesCount functions
Change-Id: Ie673f9665825a40281c2584d478ba1260f725856
Reviewed-on: https://go-review.googlesource.com/53357
Run-TryBot: Chris Broadfoot <cbro@golang.org>
Reviewed-by: Chris Broadfoot <cbro@golang.org>
2017-08-04 23:24:07 +00:00
Dylan Waits 5f7b3fabe1 math/bits: add examples for leading zero methods
Change-Id: Ib491d144387a7675af370f7b925fe6e62440d153
Reviewed-on: https://go-review.googlesource.com/48966
Run-TryBot: Kevin Burke <kev@inburke.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Kevin Burke <kev@inburke.com>
2017-07-15 21:55:58 +00:00
Lynn Boger 8304d10763 cmd/compile: ppc64x intrinsics for math/bits
This adds math/bits intrinsics for OnesCount, Len, TrailingZeros on
ppc64x.

benchmark                       old ns/op     new ns/op     delta
BenchmarkLeadingZeros-16        4.26          1.71          -59.86%
BenchmarkLeadingZeros16-16      3.04          1.83          -39.80%
BenchmarkLeadingZeros32-16      3.31          1.82          -45.02%
BenchmarkLeadingZeros64-16      3.69          1.71          -53.66%
BenchmarkTrailingZeros-16       2.55          1.62          -36.47%
BenchmarkTrailingZeros32-16     2.55          1.77          -30.59%
BenchmarkTrailingZeros64-16     2.78          1.62          -41.73%
BenchmarkOnesCount-16           3.19          0.93          -70.85%
BenchmarkOnesCount32-16         2.55          1.18          -53.73%
BenchmarkOnesCount64-16         3.22          0.93          -71.12%

Update #18616

I also made a change to bits_test.go because when debugging some failures
the output was not quite providing the right argument information.

Change-Id: Ia58d31d1777cf4582a4505f85b11a1202ca07d3e
Reviewed-on: https://go-review.googlesource.com/41630
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
Reviewed-by: Keith Randall <khr@golang.org>
2017-05-10 12:10:56 +00:00
Robert Griesemer 9d01def597 math/bits: support negative rotation count and remove RotateRight
For details see the discussion on the issue below.

RotateLeft functions can now be inlined because the don't panic
anymore for negative rotation counts.

name            old time/op  new time/op  delta
RotateLeft-8    6.72ns ± 2%  1.86ns ± 0%  -72.33%  (p=0.016 n=5+4)
RotateLeft8-8   4.41ns ± 2%  1.67ns ± 1%  -62.15%  (p=0.008 n=5+5)
RotateLeft16-8  4.46ns ± 6%  1.65ns ± 0%  -63.06%  (p=0.008 n=5+5)
RotateLeft32-8  4.50ns ± 5%  1.67ns ± 1%  -62.86%  (p=0.008 n=5+5)
RotateLeft64-8  4.54ns ± 1%  1.85ns ± 1%  -59.32%  (p=0.008 n=5+5)

https://perf.golang.org/search?q=upload:20170411.4

(Measured on 2.3 GHz Intel Core i7 running macOS 10.12.3.)

For #18616.

Change-Id: I0828d80d54ec24f8d44954a57b3d6aeedb69c686
Reviewed-on: https://go-review.googlesource.com/40394
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-04-11 23:57:24 +00:00
Robert Griesemer 32b41c8dc7 math/bits: move left-over functionality from bits_impl.go to bits.go
Removes an extra function call for TrailingZeroes and thus may
increase chances for inlining.

Change-Id: Iefd8d4402dc89b64baf4e5c865eb3dadade623af
Reviewed-on: https://go-review.googlesource.com/37613
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-02-28 23:50:47 +00:00
Robert Griesemer 83bc4a2fee math/bits: faster LeadingZeros and Len functions
benchmark                     old ns/op     new ns/op     delta
BenchmarkLeadingZeros-8       8.43          3.10          -63.23%
BenchmarkLeadingZeros8-8      8.13          1.33          -83.64%
BenchmarkLeadingZeros16-8     7.34          2.07          -71.80%
BenchmarkLeadingZeros32-8     7.99          2.87          -64.08%
BenchmarkLeadingZeros64-8     8.13          2.96          -63.59%

Measured on 2.3 GHz Intel Core i7 running macOS 10.12.3.

Change-Id: Id343531b408d42ac45f10c76f60e85bdb977f91e
Reviewed-on: https://go-review.googlesource.com/37582
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-02-28 20:55:13 +00:00
Robert Griesemer 9515cb511a math/bits: faster TrailingZeroes8
For sizes > 8, the existing code is faster.

benchmark                     old ns/op     new ns/op     delta
BenchmarkTrailingZeros8-8     1.95          1.29          -33.85%

Measured on 2.3 GHz Intel Core i7 running macOS 10.12.3.

Change-Id: I6f3a33ec633a2c544ec29693c141f2f99335c745
Reviewed-on: https://go-review.googlesource.com/37581
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-02-28 20:55:01 +00:00
Robert Griesemer d7a659b11b math/bits: faster OnesCount using table lookups for sizes 8,16,32
For uint64, the existing algorithm is faster.

benchmark                  old ns/op     new ns/op     delta
BenchmarkOnesCount8-8      1.95          0.97          -50.26%
BenchmarkOnesCount16-8     2.54          1.39          -45.28%
BenchmarkOnesCount32-8     2.61          1.96          -24.90%

Measured on 2.3 GHz Intel Core i7 running macOS 10.12.3.

Change-Id: I6cc42882fef3d24694720464039161e339a9ae99
Reviewed-on: https://go-review.googlesource.com/37580
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-02-28 20:54:49 +00:00
Robert Griesemer e18adbf88d math/bits: faster Reverse8/16 functions using table lookups
Measured on 2.3 GHz Intel Core i7, running macOS 10.12.3:

benchmark                old ns/op     new ns/op     delta
BenchmarkReverse8-8      1.70          0.99          -41.76%
BenchmarkReverse16-8     2.24          1.32          -41.07%

Fixes #19279.

Change-Id: I398cf8a3513b7fa63c130efc7846a7c5353999d4
Reviewed-on: https://go-review.googlesource.com/37459
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-02-25 22:18:58 +00:00
Robert Griesemer ac91a514ff math/bits: fix incorrect doc strings for TrailingZeros functions
Change-Id: I3e40018ab1903d3b9ada7ad7812ba71ea2a428e7
Reviewed-on: https://go-review.googlesource.com/37456
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-02-25 00:58:25 +00:00
Robert Griesemer 177dfba112 math/bits: faster OnesCount
Using some additional suggestions per "Hacker's Delight".
Added documentation and extra tests.

Measured on 1.7 GHz Intel Core i7, running macOS 10.12.3.

benchmark                  old ns/op     new ns/op     delta
BenchmarkOnesCount-4       7.34          5.38          -26.70%
BenchmarkOnesCount8-4      2.03          1.98          -2.46%
BenchmarkOnesCount16-4     2.56          2.50          -2.34%
BenchmarkOnesCount32-4     2.98          2.39          -19.80%
BenchmarkOnesCount64-4     4.22          2.96          -29.86%

Change-Id: I566b0ef766e55cf5776b1662b6016024ebe5d878
Reviewed-on: https://go-review.googlesource.com/37223
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-02-19 18:50:48 +00:00
Robert Griesemer a4a3d63dbe math/bits: added benchmarks for Leading/TrailingZeros
BenchmarkLeadingZeros-8      	200000000	         8.80 ns/op
BenchmarkLeadingZeros8-8     	200000000	         8.21 ns/op
BenchmarkLeadingZeros16-8    	200000000	         7.49 ns/op
BenchmarkLeadingZeros32-8    	200000000	         7.80 ns/op
BenchmarkLeadingZeros64-8    	200000000	         8.67 ns/op

BenchmarkTrailingZeros-8     	1000000000	         2.05 ns/op
BenchmarkTrailingZeros8-8    	2000000000	         1.94 ns/op
BenchmarkTrailingZeros16-8   	2000000000	         1.94 ns/op
BenchmarkTrailingZeros32-8   	2000000000	         1.92 ns/op
BenchmarkTrailingZeros64-8   	2000000000	         2.03 ns/op

Change-Id: I45497bf2d6369ba6cfc88ded05aa735908af8908
Reviewed-on: https://go-review.googlesource.com/37220
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-02-17 23:41:16 +00:00
Robert Griesemer 19028bdd18 math/bits: faster Rotate functions, added respective benchmarks
Measured on 2.3 GHz Intel Core i7, running maxOS 10.12.3.

benchmark                    old ns/op     new ns/op     delta
BenchmarkRotateLeft-8        7.87          7.00          -11.05%
BenchmarkRotateLeft8-8       8.41          4.52          -46.25%
BenchmarkRotateLeft16-8      8.07          4.55          -43.62%
BenchmarkRotateLeft32-8      8.36          4.73          -43.42%
BenchmarkRotateLeft64-8      7.93          4.78          -39.72%

BenchmarkRotateRight-8       8.23          6.72          -18.35%
BenchmarkRotateRight8-8      8.76          4.39          -49.89%
BenchmarkRotateRight16-8     9.07          4.44          -51.05%
BenchmarkRotateRight32-8     8.85          4.46          -49.60%
BenchmarkRotateRight64-8     8.11          4.43          -45.38%

Change-Id: I79ea1e9e6fc65f95794a91f860a911efed3aa8a1
Reviewed-on: https://go-review.googlesource.com/37219
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-02-17 23:40:45 +00:00
Robert Griesemer a12edb8db6 math/bits: faster OnesCount, added respective benchmarks
Also: Changed Reverse/ReverseBytes implementations to use
the same (smaller) masks as OnesCount.

BenchmarkOnesCount-8          37.0          6.26          -83.08%
BenchmarkOnesCount8-8         7.24          1.99          -72.51%
BenchmarkOnesCount16-8        11.3          2.47          -78.14%
BenchmarkOnesCount32-8        18.4          3.02          -83.59%
BenchmarkOnesCount64-8        40.0          3.78          -90.55%
BenchmarkReverse-8            6.69          6.22          -7.03%
BenchmarkReverse8-8           1.64          1.64          +0.00%
BenchmarkReverse16-8          2.26          2.18          -3.54%
BenchmarkReverse32-8          2.88          2.87          -0.35%
BenchmarkReverse64-8          5.64          4.34          -23.05%
BenchmarkReverseBytes-8       2.48          2.17          -12.50%
BenchmarkReverseBytes16-8     0.63          0.95          +50.79%
BenchmarkReverseBytes32-8     1.13          1.24          +9.73%
BenchmarkReverseBytes64-8     2.50          2.16          -13.60%

OnesCount-8       37.0ns ± 0%   6.3ns ± 0%   ~             (p=1.000 n=1+1)
OnesCount8-8      7.24ns ± 0%  1.99ns ± 0%   ~             (p=1.000 n=1+1)
OnesCount16-8     11.3ns ± 0%   2.5ns ± 0%   ~             (p=1.000 n=1+1)
OnesCount32-8     18.4ns ± 0%   3.0ns ± 0%   ~             (p=1.000 n=1+1)
OnesCount64-8     40.0ns ± 0%   3.8ns ± 0%   ~             (p=1.000 n=1+1)
Reverse-8         6.69ns ± 0%  6.22ns ± 0%   ~             (p=1.000 n=1+1)
Reverse8-8        1.64ns ± 0%  1.64ns ± 0%   ~     (all samples are equal)
Reverse16-8       2.26ns ± 0%  2.18ns ± 0%   ~             (p=1.000 n=1+1)
Reverse32-8       2.88ns ± 0%  2.87ns ± 0%   ~             (p=1.000 n=1+1)
Reverse64-8       5.64ns ± 0%  4.34ns ± 0%   ~             (p=1.000 n=1+1)
ReverseBytes-8    2.48ns ± 0%  2.17ns ± 0%   ~             (p=1.000 n=1+1)
ReverseBytes16-8  0.63ns ± 0%  0.95ns ± 0%   ~             (p=1.000 n=1+1)
ReverseBytes32-8  1.13ns ± 0%  1.24ns ± 0%   ~             (p=1.000 n=1+1)
ReverseBytes64-8  2.50ns ± 0%  2.16ns ± 0%   ~             (p=1.000 n=1+1)

Change-Id: I591b0ffc83fc3a42828256b6e5030f32c64f9497
Reviewed-on: https://go-review.googlesource.com/37218
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-02-17 23:40:10 +00:00
Robert Griesemer 4498b68390 math/bits: faster Reverse, ReverseBytes
- moved from: x&m>>k | x&^m<<k to: x&m>>k | x<<k&m
  This permits use of the same constant m twice (*) which may be
  better for machines that can't use large immediate constants
  directly with an AND instruction and have to load them explicitly.
  *) CPUs don't usually have a &^ instruction, so x&^m becomes x&(^m)

- simplified returns
  This improves the generated code because the compiler recognizes
  x>>k | x<<k as ROT when k is the bitsize of x.

The 8-bit versions of these instructions can be significantly faster
still if they are replaced with table lookups, as long as the table
is in cache. If the table is not in cache, table-lookup is probably
slower, hence the choice of an explicit register-only implementation
for now.

BenchmarkReverse-8            8.50          6.86          -19.29%
BenchmarkReverse8-8           2.17          1.74          -19.82%
BenchmarkReverse16-8          2.89          2.34          -19.03%
BenchmarkReverse32-8          3.55          2.95          -16.90%
BenchmarkReverse64-8          6.81          5.57          -18.21%
BenchmarkReverseBytes-8       3.49          2.48          -28.94%
BenchmarkReverseBytes16-8     0.93          0.62          -33.33%
BenchmarkReverseBytes32-8     1.55          1.13          -27.10%
BenchmarkReverseBytes64-8     2.47          2.47          +0.00%

Reverse-8         8.50ns ± 0%  6.86ns ± 0%   ~             (p=1.000 n=1+1)
Reverse8-8        2.17ns ± 0%  1.74ns ± 0%   ~             (p=1.000 n=1+1)
Reverse16-8       2.89ns ± 0%  2.34ns ± 0%   ~             (p=1.000 n=1+1)
Reverse32-8       3.55ns ± 0%  2.95ns ± 0%   ~             (p=1.000 n=1+1)
Reverse64-8       6.81ns ± 0%  5.57ns ± 0%   ~             (p=1.000 n=1+1)
ReverseBytes-8    3.49ns ± 0%  2.48ns ± 0%   ~             (p=1.000 n=1+1)
ReverseBytes16-8  0.93ns ± 0%  0.62ns ± 0%   ~             (p=1.000 n=1+1)
ReverseBytes32-8  1.55ns ± 0%  1.13ns ± 0%   ~             (p=1.000 n=1+1)
ReverseBytes64-8  2.47ns ± 0%  2.47ns ± 0%   ~     (all samples are equal)

Change-Id: I0064de8c7e0e568ca7885d6f7064344bef91a06d
Reviewed-on: https://go-review.googlesource.com/37215
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-02-17 22:20:28 +00:00
Robert Griesemer 3a239a6ae4 math/bits: fix benchmarks (make sure calls don't get optimized away)
Sum up function results and store them in an exported (global)
variable. This prevents the compiler from optimizing away the
otherwise side-effect free function calls.

We now have more realistic set of benchmark numbers...

Measured on 2.3 GHz Intel Core i7, running maxOS 10.12.3.

Note: These measurements are based on the same "old"
implementation as the prior measurements (commit 7d5c003).

benchmark                     old ns/op     new ns/op     delta
BenchmarkReverse-8            72.9          8.50          -88.34%
BenchmarkReverse8-8           13.2          2.17          -83.56%
BenchmarkReverse16-8          21.2          2.89          -86.37%
BenchmarkReverse32-8          36.3          3.55          -90.22%
BenchmarkReverse64-8          71.3          6.81          -90.45%
BenchmarkReverseBytes-8       11.2          3.49          -68.84%
BenchmarkReverseBytes16-8     6.24          0.93          -85.10%
BenchmarkReverseBytes32-8     7.40          1.55          -79.05%
BenchmarkReverseBytes64-8     10.5          2.47          -76.48%

Reverse-8         72.9ns ± 0%   8.5ns ± 0%   ~     (p=1.000 n=1+1)
Reverse8-8        13.2ns ± 0%   2.2ns ± 0%   ~     (p=1.000 n=1+1)
Reverse16-8       21.2ns ± 0%   2.9ns ± 0%   ~     (p=1.000 n=1+1)
Reverse32-8       36.3ns ± 0%   3.5ns ± 0%   ~     (p=1.000 n=1+1)
Reverse64-8       71.3ns ± 0%   6.8ns ± 0%   ~     (p=1.000 n=1+1)
ReverseBytes-8    11.2ns ± 0%   3.5ns ± 0%   ~     (p=1.000 n=1+1)
ReverseBytes16-8  6.24ns ± 0%  0.93ns ± 0%   ~     (p=1.000 n=1+1)
ReverseBytes32-8  7.40ns ± 0%  1.55ns ± 0%   ~     (p=1.000 n=1+1)
ReverseBytes64-8  10.5ns ± 0%   2.5ns ± 0%   ~     (p=1.000 n=1+1)

Change-Id: I8aef1334b84f6cafd25edccad7e6868b37969efb
Reviewed-on: https://go-review.googlesource.com/37213
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-02-17 20:58:12 +00:00
Robert Griesemer ddb15cea4a math/bits: much faster ReverseBytes, added respective benchmarks
Measured on 2.3 GHz Intel Core i7, running maxOS 10.12.3.

benchmark                     old ns/op     new ns/op     delta
BenchmarkReverseBytes-8       11.4          3.51          -69.21%
BenchmarkReverseBytes16-8     6.87          0.64          -90.68%
BenchmarkReverseBytes32-8     7.79          0.65          -91.66%
BenchmarkReverseBytes64-8     11.6          0.64          -94.48%

name              old time/op  new time/op  delta
ReverseBytes-8    11.4ns ± 0%   3.5ns ± 0%   ~     (p=1.000 n=1+1)
ReverseBytes16-8  6.87ns ± 0%  0.64ns ± 0%   ~     (p=1.000 n=1+1)
ReverseBytes32-8  7.79ns ± 0%  0.65ns ± 0%   ~     (p=1.000 n=1+1)
ReverseBytes64-8  11.6ns ± 0%   0.6ns ± 0%   ~     (p=1.000 n=1+1)

Change-Id: I67b529652b3b613c61687e9e185e8d4ee40c51a2
Reviewed-on: https://go-review.googlesource.com/37211
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-02-17 19:38:26 +00:00
Robert Griesemer 7d5c003a3a math/bits: much faster Reverse, added respective benchmarks
Measured on 2.3 GHz Intel Core i7, running maxOS 10.12.3.

name         old time/op  new time/op  delta
Reverse-8    76.6ns ± 0%   8.1ns ± 0%   ~     (p=1.000 n=1+1)
Reverse8-8   12.6ns ± 0%   0.6ns ± 0%   ~     (p=1.000 n=1+1)
Reverse16-8  20.8ns ± 0%   0.6ns ± 0%   ~     (p=1.000 n=1+1)
Reverse32-8  36.5ns ± 0%   0.6ns ± 0%   ~     (p=1.000 n=1+1)
Reverse64-8  74.0ns ± 0%   6.4ns ± 0%   ~     (p=1.000 n=1+1)

benchmark                old ns/op     new ns/op     delta
BenchmarkReverse-8       76.6          8.07          -89.46%
BenchmarkReverse8-8      12.6          0.64          -94.92%
BenchmarkReverse16-8     20.8          0.64          -96.92%
BenchmarkReverse32-8     36.5          0.64          -98.25%
BenchmarkReverse64-8     74.0          6.38          -91.38%

Change-Id: I6b99b10cee2f2babfe79342b50ee36a45a34da30
Reviewed-on: https://go-review.googlesource.com/37149
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-02-17 19:38:13 +00:00
Robert Griesemer 81acd308a4 math/bits: expand doc strings for all functions
Follow-up on https://go-review.googlesource.com/36315.
No functionality change.

For #18616.

Change-Id: Id4df34dd7d0381be06eea483a11bf92f4a01f604
Reviewed-on: https://go-review.googlesource.com/37140
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-02-17 19:02:56 +00:00
Robert Griesemer 661e2179e5 math/bits: added package for bit-level counting and manipulation
Initial platform-independent implementation.

For #18616.

Change-Id: I4585c55b963101af9059c06c1b8a866cb384754c
Reviewed-on: https://go-review.googlesource.com/36315
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2017-02-16 21:54:59 +00:00