go/src
Josh Bleecher Snyder 5353cde080 runtime, cmd/internal/obj/arm: improve arm function prologue
When stack growth is not needed, as it usually is not,
execute only a single conditional branch
rather than three conditional instructions.
This adds 4 bytes to every function,
but might speed up execution in the common case.
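
As a rough check of the size and fast-path claims, the sketch below recomputes both from the addresses in the sample disassembly further down in this message. The constant and helper names are illustrative only and are not part of the change; the only assumption is that every ARM instruction (and the trailing literal-pool word) occupies 4 bytes.

```go
package main

import "fmt"

// Addresses copied from the sample disassembly in this commit message.
// All names here are hypothetical and exist only for this sketch.
const (
	entry          = 0x2000 // first instruction of main.f in both layouts
	oldFrameStore  = 0x201c // MOVW.W R14, -0x84(R13) in the old layout
	newFrameStore  = 0x2014 // the same instruction in the new layout
	oldLastWord    = 0x2034 // trailing literal-pool word, old layout
	newLastWord    = 0x2038 // trailing literal-pool word, new layout
	bytesPerInstr  = 4      // assumption: fixed-width 32-bit ARM encoding
)

// wordsBetween reports how many 4-byte words lie in [start, end).
func wordsBetween(start, end uint32) uint32 {
	return (end - start) / bytesPerInstr
}

// funcSize reports the size in bytes of a function that starts at start
// and whose final 4-byte word sits at lastWord.
func funcSize(start, lastWord uint32) uint32 {
	return lastWord + bytesPerInstr - start
}

func main() {
	// Words the CPU steps through before reaching the frame store when
	// no stack growth is needed. In the old layout this includes three
	// conditional instructions whose condition fails.
	fmt.Println("old fast path words:", wordsBetween(entry, oldFrameStore))
	fmt.Println("new fast path words:", wordsBetween(entry, newFrameStore))

	oldSize := funcSize(entry, oldLastWord)
	newSize := funcSize(entry, newLastWord)
	fmt.Println("old size:", oldSize, "new size:", newSize, "growth:", newSize-oldSize)
}
```

For this sample the fast path shrinks from 7 words to 5, while the function grows from 56 to 60 bytes.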

Sample disassembly for

func f() {
	_ = [128]byte{}
}

Before:

TEXT main.f(SB) x.go
	x.go:3	0x2000	e59a1008	MOVW 0x8(R10), R1
	x.go:3	0x2004	e59fb028	MOVW 0x28(R15), R11
	x.go:3	0x2008	e08d200b	ADD R11, R13, R2
	x.go:3	0x200c	e1520001	CMP R1, R2
	x.go:3	0x2010	91a0300e	MOVW.LS R14, R3
	x.go:3	0x2014	9b0118a9	BL.LS runtime.morestack_noctxt(SB)
	x.go:3	0x2018	9afffff8	B.LS main.f(SB)
	x.go:3	0x201c	e52de084	MOVW.W R14, -0x84(R13)
	x.go:4	0x2020	e28d1004	ADD $4, R13, R1
	x.go:4	0x2024	e3a00000	MOVW $0, R0
	x.go:4	0x2028	eb012255	BL 0x4a984
	x.go:5	0x202c	e49df084	RET #132
	x.go:5	0x2030	eafffffe	B 0x2030
	x.go:5	0x2034	ffffff7c	?

After:

TEXT main.f(SB) x.go
	x.go:3	0x2000	e59a1008	MOVW 0x8(R10), R1
	x.go:3	0x2004	e59fb02c	MOVW 0x2c(R15), R11
	x.go:3	0x2008	e08d200b	ADD R11, R13, R2
	x.go:3	0x200c	e1520001	CMP R1, R2
	x.go:3	0x2010	9a000004	B.LS 0x2028
	x.go:3	0x2014	e52de084	MOVW.W R14, -0x84(R13)
	x.go:4	0x2018	e28d1004	ADD $4, R13, R1
	x.go:4	0x201c	e3a00000	MOVW $0, R0
	x.go:4	0x2020	eb0124dc	BL 0x4b398
	x.go:5	0x2024	e49df084	RET #132
	x.go:5	0x2028	e1a0300e	MOVW R14, R3
	x.go:5	0x202c	eb011b0d	BL runtime.morestack_noctxt(SB)
	x.go:5	0x2030	eafffff2	B main.f(SB)
	x.go:5	0x2034	eafffffe	B 0x2034
	x.go:5	0x2038	ffffff7c	?
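
The branch targets in the listings above can be checked by hand from the raw encodings. The sketch below decodes an ARM B/BL instruction: the low 24 bits are a signed word offset, applied to a PC that reads as the instruction's own address plus 8. The helper name branchTarget is illustrative and is not something the toolchain provides.

```go
package main

import "fmt"

// branchTarget decodes the destination of an ARM (A32) B or BL
// instruction located at addr with raw encoding word.
// This is a hypothetical helper written for this sketch.
func branchTarget(addr, word uint32) uint32 {
	imm24 := word & 0x00ffffff
	// Move the 24-bit field to the top of the word so its sign bit lands
	// in bit 31, then shift back arithmetically. Shifting right by 6
	// instead of 8 also multiplies by 4 (words to bytes).
	off := int32(imm24<<8) >> 6
	// PC reads as the instruction address plus 8 on ARM.
	return addr + 8 + uint32(off)
}

func main() {
	// New layout: the single conditional branch to the morestack call.
	fmt.Printf("%#x\n", branchTarget(0x2010, 0x9a000004)) // 0x2028
	// New layout: branch back to the function entry after morestack.
	fmt.Printf("%#x\n", branchTarget(0x2030, 0xeafffff2)) // 0x2000
	// Old layout: conditional branch back to the function entry.
	fmt.Printf("%#x\n", branchTarget(0x2018, 0x9afffff8)) // 0x2000
	// New layout: the trailing self-branch.
	fmt.Printf("%#x\n", branchTarget(0x2034, 0xeafffffe)) // 0x2034
}
```

The same PC-plus-8 rule explains the second instruction: MOVW 0x2c(R15), R11 at 0x2004 loads from 0x2004 + 8 + 0x2c = 0x2038, the trailing word ffffff7c, which is -132 (the negated frame size) as a 32-bit value.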

Updates #10587.

package sort benchmarks on an iPhone 6:

name            old time/op  new time/op  delta
SortString1K     569µs ± 0%   565µs ± 1%  -0.75%  (p=0.000 n=23+24)
StableString1K   872µs ± 1%   870µs ± 1%  -0.16%  (p=0.009 n=23+24)
SortInt1K        317µs ± 2%   316µs ± 2%    ~     (p=0.410 n=26+26)
StableInt1K      343µs ± 1%   339µs ± 1%  -1.07%  (p=0.000 n=22+23)
SortInt64K      30.0ms ± 1%  30.0ms ± 1%    ~     (p=0.091 n=25+24)
StableInt64K    30.2ms ± 0%  30.0ms ± 0%  -0.69%  (p=0.000 n=22+22)
Sort1e2          147µs ± 1%   146µs ± 0%  -0.48%  (p=0.000 n=25+24)
Stable1e2        290µs ± 1%   286µs ± 1%  -1.30%  (p=0.000 n=23+24)
Sort1e4         29.5ms ± 2%  29.7ms ± 1%  +0.71%  (p=0.000 n=23+23)
Stable1e4       88.7ms ± 4%  88.6ms ± 8%  -0.07%  (p=0.022 n=26+26)
Sort1e6          4.81s ± 7%   4.83s ± 7%    ~     (p=0.192 n=26+26)
Stable1e6        18.3s ± 1%   18.1s ± 1%  -0.76%  (p=0.000 n=25+23)
SearchWrappers   318ns ± 1%   344ns ± 1%  +8.14%  (p=0.000 n=23+26)

package sort benchmarks on a first generation rpi:

name            old time/op  new time/op  delta
SearchWrappers  4.13µs ± 0%  3.95µs ± 0%   -4.42%  (p=0.000 n=15+13)
SortString1K    5.81ms ± 1%  5.82ms ± 2%     ~     (p=0.400 n=14+15)
StableString1K  9.69ms ± 1%  9.73ms ± 0%     ~     (p=0.121 n=15+11)
SortInt1K       3.30ms ± 2%  3.66ms ±19%  +10.82%  (p=0.000 n=15+14)
StableInt1K     5.97ms ±15%  4.17ms ± 8%  -30.05%  (p=0.000 n=15+15)
SortInt64K       319ms ± 1%   295ms ± 1%   -7.65%  (p=0.000 n=15+15)
StableInt64K     343ms ± 0%   332ms ± 0%   -3.26%  (p=0.000 n=12+13)
Sort1e2         3.36ms ± 2%  3.22ms ± 4%   -4.10%  (p=0.000 n=15+15)
Stable1e2       6.74ms ± 1%  6.43ms ± 2%   -4.67%  (p=0.000 n=15+15)
Sort1e4          247ms ± 1%   247ms ± 1%     ~     (p=0.331 n=15+14)
Stable1e4        864ms ± 0%   820ms ± 0%   -5.15%  (p=0.000 n=14+15)
Sort1e6          41.2s ± 0%   41.2s ± 0%   +0.15%  (p=0.000 n=13+14)
Stable1e6         192s ± 0%    182s ± 0%   -5.07%  (p=0.000 n=14+14)
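
For reading the delta columns above: a delta is the relative change from old to new time, so negative values are speedups. The sketch below recomputes it from the displayed values. The helper name pctDelta is illustrative. Because the tables show rounded means, and the delta column was presumably computed from unrounded ones, recomputed figures can differ from the printed ones in the second decimal place.

```go
package main

import "fmt"

// pctDelta returns the relative change from oldT to newT in percent.
// Negative results mean the new code is faster.
// This is a hypothetical helper written for this sketch.
func pctDelta(oldT, newT float64) float64 {
	return (newT - oldT) / oldT * 100
}

func main() {
	// SearchWrappers on the iPhone 6: 318ns -> 344ns (table says +8.14%).
	fmt.Printf("%+.2f%%\n", pctDelta(318, 344))
	// SearchWrappers on the rpi: 4.13µs -> 3.95µs (table says -4.42%).
	fmt.Printf("%+.2f%%\n", pctDelta(4.13, 3.95))
}
```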

Change-Id: I8a9db77e1d4ea1956575895893bc9d04bd81204b
Reviewed-on: https://go-review.googlesource.com/10497
Reviewed-by: Russ Cox <rsc@golang.org>
2015-06-04 16:35:12 +00:00
archive archive/tar: terminate when reading malformed sparse files 2015-05-28 23:54:54 +00:00
bufio
builtin
bytes bytes, strings: add LastIndexByte 2015-04-30 07:13:18 +00:00
cmd runtime, cmd/internal/obj/arm: improve arm function prologue 2015-06-04 16:35:12 +00:00
compress
container
crypto crypto/x509: be strict about trailing data. 2015-04-30 03:49:36 +00:00
database/sql
debug all: build and use go tool compile, go tool link 2015-05-21 17:32:03 +00:00
encoding encoding/xml: Reset the parent stack before printing a chardata or comment field in a struct 2015-06-04 07:16:25 +00:00
errors
expvar
flag flag: Fix up a package comment a bit. 2015-05-19 02:18:40 +00:00
fmt fmt: fix buffer underflow for negative integers 2015-06-02 13:55:40 +00:00
go cmd/go: make test.bash pass again 2015-06-03 20:33:30 +00:00
hash hash/crc32: move reverse representation docs to an example 2015-05-04 00:19:22 +00:00
html html/template: prevent panic when escaping actions involving chain nodes 2015-06-01 20:52:04 +00:00
image image/gif: allow encoding a single-frame image whose top-left corner 2015-05-06 01:00:58 +00:00
index/suffixarray
internal internal/syscall/windows/registry: fix read overrun in GetStringsValue 2015-05-15 03:25:41 +00:00
io io: minor improvements to doc comment on WriteString. 2015-05-29 04:33:15 +00:00
log
math math/big: turn off debug mode 2015-06-03 22:08:17 +00:00
mime mime: fix names of examples 2015-06-01 22:20:58 +00:00
net net/http: set nosniff header when serving Error 2015-06-02 18:29:45 +00:00
os os: eradicate smallpox after test 2015-05-06 17:38:57 +00:00
path path: fix a typo in documentation of Split 2015-05-31 22:08:38 +00:00
reflect reflect: make PtrTo(FuncOf(...)) not crash 2015-05-16 00:51:05 +00:00
regexp regexp: suggest go doc, not godoc 2015-06-01 20:16:31 +00:00
runtime runtime, cmd/internal/obj/arm: improve arm function prologue 2015-06-04 16:35:12 +00:00
sort
strconv strconv: minor internal comment fix 2015-05-27 22:02:02 +00:00
strings strings: mention UTF-8 in the package comment. 2015-06-03 19:28:41 +00:00
sync
syscall syscall: don't run fcntl child process test on iOS 2015-05-15 16:41:12 +00:00
testing testing: fix typo 2015-05-12 23:39:00 +00:00
text text/template: refactor code to accomodate bi-state requirement for templates 2015-06-03 20:10:54 +00:00
time time: document that not all Unix time can be represented 2015-05-19 06:19:33 +00:00
unicode
unsafe
Make.dist
all.bash
all.bat
all.rc
androidtest.bash androidtest.bash: clean up stale GOROOT 2015-05-25 20:53:26 +00:00
bootstrap.bash
buildall.bash buildall.bash: exit 1 when make.bash fails 2015-05-17 01:40:33 +00:00
clean.bash
clean.bat
clean.rc
iostest.bash
make.bash
make.bat
make.rc
nacltest.bash nacltest.bash: remove syscall/fstest_nacl.go after test 2015-05-02 02:48:32 +00:00
race.bash
race.bat
run.bash build: correct quoting of args in run.bash 2015-05-09 04:23:47 +00:00
run.bat
run.rc