mirror/go - go - Git Fam. Sieh

Commit Graph

Author	SHA1	Message	Date
Cherry Zhang	f96b62be2e	cmd/internal/objabi, runtime: compact FUNCDATA indices As we deleted register maps, move FUNCDATA indices of stack objects, inline trees, and open-coded defers earlier. Change-Id: If73797b8c11fd207655c9498802fca9f6f9ac338 Reviewed-on: https://go-review.googlesource.com/c/go/+/265761 Trust: Cherry Zhang <cherryyz@google.com> Reviewed-by: Austin Clements <austin@google.com>	2020-10-30 21:14:09 +00:00
Cherry Zhang	8414b1a5a4	runtime: remove go115ReduceLiveness and go115RestartSeq Make them always true. Delete code that is only executed when they are false. Change-Id: I6194fa00de23486c2b0a0c9075fe3a09d9c52762 Reviewed-on: https://go-review.googlesource.com/c/go/+/264339 Trust: Cherry Zhang <cherryyz@google.com> Reviewed-by: Austin Clements <austin@google.com>	2020-10-30 21:13:24 +00:00
Austin Clements	afba990169	runtime/internal/atomic: drop package prefixes This drops package prefixes from the assembly code on 386 and arm. In addition to just being nicer, this allows the assembler to automatically pick up the argument stack map from the Go signatures of these functions. This doesn't matter right now because these functions never call back out to Go, but prepares us for the next CL. Change-Id: I90fed7d4dd63ad49274529c62804211b6390e2e9 Reviewed-on: https://go-review.googlesource.com/c/go/+/262777 Trust: Austin Clements <austin@google.com> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com>	2020-10-16 17:31:16 +00:00
Dan Scales	be64a19d99	cmd/compile, cmd/link, runtime: make defers low-cost through inline code and extra funcdata Generate inline code at defer time to save the args of defer calls to unique (autotmp) stack slots, and generate inline code at exit time to check which defer calls were made and make the associated function/method/interface calls. We remember that a particular defer statement was reached by storing in the deferBits variable (always stored on the stack). At exit time, we check the bits of the deferBits variable to determine which defer function calls to make (in reverse order). These low-cost defers are only used for functions where no defers appear in loops. In addition, we don't do these low-cost defers if there are too many defer statements or too many exits in a function (to limit code increase). When a function uses open-coded defers, we produce extra FUNCDATA_OpenCodedDeferInfo information that specifies the number of defers, and for each defer, the stack slots where the closure and associated args have been stored. The funcdata also includes the location of the deferBits variable. Therefore, for panics, we can use this funcdata to determine exactly which defers are active, and call the appropriate functions/methods/closures with the correct arguments for each active defer. In order to unwind the stack correctly after a recover(), we need to add an extra code segment to functions with open-coded defers that simply calls deferreturn() and returns. This segment is not reachable by the normal function, but is returned to by the runtime during recovery. We set the liveness information of this deferreturn() to be the same as the liveness at the first function call during the last defer exit code (so all return values and all stack slots needed by the defer calls will be live). I needed to increase the stackguard constant from 880 to 896, because of a small amount of new code in deferreturn(). The -N flag disables open-coded defers. 
'-d defer' prints out the kind of defer being used at each defer statement (heap-allocated, stack-allocated, or open-coded). Cost of defer statement [ go test -run NONE -bench BenchmarkDefer$ runtime ] With normal (stack-allocated) defers only: 35.4 ns/op With open-coded defers: 5.6 ns/op Cost of function call alone (remove defer keyword): 4.4 ns/op Text size increase (including funcdata) for go binary without/with open-coded defers: 0.09% The average size increase (including funcdata) for only the functions that use open-coded defers is 1.1%. The cost of a panic followed by a recover got noticeably slower, since panic processing now requires a scan of the stack for open-coded defer frames. This scan is required, even if no frames are using open-coded defers: Cost of panic and recover [ go test -run NONE -bench BenchmarkPanicRecover runtime ] Without open-coded defers: 62.0 ns/op With open-coded defers: 255 ns/op A CGO Go-to-C-to-Go benchmark got noticeably faster because of open-coded defers: CGO Go-to-C-to-Go benchmark [cd misc/cgo/test; go test -run NONE -bench BenchmarkCGoCallback ] Without open-coded defers: 443 ns/op With open-coded defers: 347 ns/op Updates #14939 (defer performance) Updates #34481 (design doc) Change-Id: I63b1a60d1ebf28126f55ee9fd7ecffe9cb23d1ff Reviewed-on: https://go-review.googlesource.com/c/go/+/202340 Reviewed-by: Austin Clements <austin@google.com>	2019-10-24 13:54:11 +00:00
Bryan C. Mills	b76e6f8825	Revert "cmd/compile, cmd/link, runtime: make defers low-cost through inline code and extra funcdata" This reverts CL 190098. Reason for revert: broke several builders. Change-Id: I69161352f9ded02537d8815f259c4d391edd9220 Reviewed-on: https://go-review.googlesource.com/c/go/+/201519 Run-TryBot: Bryan C. Mills <bcmills@google.com> Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Dan Scales <danscales@google.com>	2019-10-16 20:59:53 +00:00
Dan Scales	dad616375f	cmd/compile, cmd/link, runtime: make defers low-cost through inline code and extra funcdata Generate inline code at defer time to save the args of defer calls to unique (autotmp) stack slots, and generate inline code at exit time to check which defer calls were made and make the associated function/method/interface calls. We remember that a particular defer statement was reached by storing in the deferBits variable (always stored on the stack). At exit time, we check the bits of the deferBits variable to determine which defer function calls to make (in reverse order). These low-cost defers are only used for functions where no defers appear in loops. In addition, we don't do these low-cost defers if there are too many defer statements or too many exits in a function (to limit code increase). When a function uses open-coded defers, we produce extra FUNCDATA_OpenCodedDeferInfo information that specifies the number of defers, and for each defer, the stack slots where the closure and associated args have been stored. The funcdata also includes the location of the deferBits variable. Therefore, for panics, we can use this funcdata to determine exactly which defers are active, and call the appropriate functions/methods/closures with the correct arguments for each active defer. In order to unwind the stack correctly after a recover(), we need to add an extra code segment to functions with open-coded defers that simply calls deferreturn() and returns. This segment is not reachable by the normal function, but is returned to by the runtime during recovery. We set the liveness information of this deferreturn() to be the same as the liveness at the first function call during the last defer exit code (so all return values and all stack slots needed by the defer calls will be live). I needed to increase the stackguard constant from 880 to 896, because of a small amount of new code in deferreturn(). The -N flag disables open-coded defers. 
'-d defer' prints out the kind of defer being used at each defer statement (heap-allocated, stack-allocated, or open-coded). Cost of defer statement [ go test -run NONE -bench BenchmarkDefer$ runtime ] With normal (stack-allocated) defers only: 35.4 ns/op With open-coded defers: 5.6 ns/op Cost of function call alone (remove defer keyword): 4.4 ns/op Text size increase (including funcdata) for go cmd without/with open-coded defers: 0.09% The average size increase (including funcdata) for only the functions that use open-coded defers is 1.1%. The cost of a panic followed by a recover got noticeably slower, since panic processing now requires a scan of the stack for open-coded defer frames. This scan is required, even if no frames are using open-coded defers: Cost of panic and recover [ go test -run NONE -bench BenchmarkPanicRecover runtime ] Without open-coded defers: 62.0 ns/op With open-coded defers: 255 ns/op A CGO Go-to-C-to-Go benchmark got noticeably faster because of open-coded defers: CGO Go-to-C-to-Go benchmark [cd misc/cgo/test; go test -run NONE -bench BenchmarkCGoCallback ] Without open-coded defers: 443 ns/op With open-coded defers: 347 ns/op Updates #14939 (defer performance) Updates #34481 (design doc) Change-Id: I51a389860b9676cfa1b84722f5fb84d3c4ee9e28 Reviewed-on: https://go-review.googlesource.com/c/go/+/190098 Reviewed-by: Austin Clements <austin@google.com>	2019-10-16 18:27:16 +00:00
Josh Bleecher Snyder	4aeac68c92	runtime, cmd/compile: re-order PCDATA and FUNCDATA indices The pclntab encoding supports writing only some PCDATA and FUNCDATA values. However, the encoding is dense: The max index in use determines the space used. We should thus choose a numbering in which frequently used indices are smaller. This change re-orders the PCDATA and FUNCDATA indices using that principle, using a quick and dirty instrumentation to measure index frequency. It shrinks binaries by about 0.5%. Updates #6853 file before after Δ % go 14745044 14671316 -73728 -0.500% addr2line 4305128 4280552 -24576 -0.571% api 6095800 6058936 -36864 -0.605% asm 4930928 4906352 -24576 -0.498% buildid 2881520 2861040 -20480 -0.711% cgo 4896584 4867912 -28672 -0.586% compile 25868408 25770104 -98304 -0.380% cover 5319656 5286888 -32768 -0.616% dist 3654528 3634048 -20480 -0.560% doc 4719672 4691000 -28672 -0.607% fix 3418312 3393736 -24576 -0.719% link 6137952 6109280 -28672 -0.467% nm 4250536 4225960 -24576 -0.578% objdump 4665192 4636520 -28672 -0.615% pack 2297488 2285200 -12288 -0.535% pprof 14735332 14657508 -77824 -0.528% test2json 2834952 2818568 -16384 -0.578% trace 11679964 11618524 -61440 -0.526% vet 8452696 8403544 -49152 -0.581% Change-Id: I30665dce57ec7a52e7d3c6718560b3aa5b83dd0b Reviewed-on: https://go-review.googlesource.com/c/go/+/171760 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2019-04-19 15:40:42 +00:00
Keith Randall	cbafcc55e8	cmd/compile,runtime: implement stack objects Rework how the compiler+runtime handles stack-allocated variables whose address is taken. Direct references to such variables work as before. References through pointers, however, use a new mechanism. The new mechanism is more precise than the old "ambiguously live" mechanism. It computes liveness at runtime based on the actual references among objects on the stack. Each function records all of its address-taken objects in a FUNCDATA. These are called "stack objects". The runtime then uses that information while scanning a stack to find all of the stack objects on a stack. It then does a mark phase on the stack objects, using all the pointers found on the stack (and ancillary structures, like defer records) as the root set. Only stack objects which are found to be live during this mark phase will be scanned and thus retain any heap objects they point to. A subsequent CL will remove all the "ambiguously live" logic from the compiler, so that the stack object tracing will be required. For this CL, the stack tracing is all redundant with the current ambiguously live logic. Update #22350 Change-Id: Ide19f1f71a5b6ec8c4d54f8f66f0e9a98344772f Reviewed-on: https://go-review.googlesource.com/c/134155 Reviewed-by: Austin Clements <austin@google.com>	2018-10-03 19:52:49 +00:00
Xia Bin	c19f86fbfb	runtime: fix reference to funcdata.go in comment Change-Id: I6c8699cd71b41cf8d178a0af3a745a19dcf60905 Reviewed-on: https://go-review.googlesource.com/123536 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-07-12 15:13:10 +00:00
Austin Clements	9f95c9db23	cmd/compile, cmd/internal/obj: record register maps in binary This adds FUNCDATA and PCDATA that records the register maps much like the existing live arguments maps and live locals maps. The register map is indexed independently from the argument and locals maps since changes in register liveness tend not to correlate with changes to argument and local liveness. This is the final CL toward adding safe-points everywhere. The following CLs will optimize liveness analysis to bring down the cost. The effect of this CL is: name old time/op new time/op delta Template 195ms ± 2% 197ms ± 1% ~ (p=0.136 n=9+9) Unicode 98.4ms ± 2% 99.7ms ± 1% +1.39% (p=0.004 n=10+10) GoTypes 685ms ± 1% 700ms ± 1% +2.06% (p=0.000 n=9+9) Compiler 3.28s ± 1% 3.34s ± 0% +1.71% (p=0.000 n=9+8) SSA 7.79s ± 1% 7.91s ± 1% +1.55% (p=0.000 n=10+9) Flate 133ms ± 2% 133ms ± 2% ~ (p=0.190 n=10+10) GoParser 161ms ± 2% 164ms ± 3% +1.83% (p=0.015 n=10+10) Reflect 450ms ± 1% 457ms ± 1% +1.62% (p=0.000 n=10+10) Tar 183ms ± 2% 185ms ± 1% +0.91% (p=0.008 n=9+10) XML 234ms ± 1% 238ms ± 1% +1.60% (p=0.000 n=9+9) [Geo mean] 411ms 417ms +1.40% name old exe-bytes new exe-bytes delta HelloSize 1.47M ± 0% 1.51M ± 0% +2.79% (p=0.000 n=10+10) Compared to just before "cmd/internal/obj: consolidate emitting entry stack map", the cumulative effect of adding stack maps everywhere and register maps is: name old time/op new time/op delta Template 185ms ± 2% 197ms ± 1% +6.42% (p=0.000 n=10+9) Unicode 96.3ms ± 3% 99.7ms ± 1% +3.60% (p=0.000 n=10+10) GoTypes 658ms ± 0% 700ms ± 1% +6.37% (p=0.000 n=10+9) Compiler 3.14s ± 1% 3.34s ± 0% +6.53% (p=0.000 n=9+8) SSA 7.41s ± 2% 7.91s ± 1% +6.71% (p=0.000 n=9+9) Flate 126ms ± 1% 133ms ± 2% +6.15% (p=0.000 n=10+10) GoParser 153ms ± 1% 164ms ± 3% +6.89% (p=0.000 n=10+10) Reflect 437ms ± 1% 457ms ± 1% +4.59% (p=0.000 n=10+10) Tar 178ms ± 1% 185ms ± 1% +4.18% (p=0.000 n=10+10) XML 223ms ± 1% 238ms ± 1% +6.39% (p=0.000 n=10+9) [Geo mean] 394ms 417ms +5.78% name old 
alloc/op new alloc/op delta Template 34.5MB ± 0% 38.0MB ± 0% +10.19% (p=0.000 n=10+10) Unicode 29.3MB ± 0% 30.3MB ± 0% +3.56% (p=0.000 n=8+9) GoTypes 113MB ± 0% 125MB ± 0% +10.89% (p=0.000 n=10+10) Compiler 510MB ± 0% 575MB ± 0% +12.79% (p=0.000 n=10+10) SSA 1.46GB ± 0% 1.64GB ± 0% +12.40% (p=0.000 n=10+10) Flate 23.9MB ± 0% 25.9MB ± 0% +8.56% (p=0.000 n=10+10) GoParser 28.0MB ± 0% 30.8MB ± 0% +10.08% (p=0.000 n=10+10) Reflect 77.6MB ± 0% 84.3MB ± 0% +8.63% (p=0.000 n=10+10) Tar 34.1MB ± 0% 37.0MB ± 0% +8.44% (p=0.000 n=10+10) XML 42.7MB ± 0% 47.2MB ± 0% +10.75% (p=0.000 n=10+10) [Geo mean] 76.0MB 83.3MB +9.60% name old allocs/op new allocs/op delta Template 321k ± 0% 337k ± 0% +4.98% (p=0.000 n=10+10) Unicode 337k ± 0% 340k ± 0% +1.04% (p=0.000 n=10+9) GoTypes 1.13M ± 0% 1.18M ± 0% +4.85% (p=0.000 n=10+10) Compiler 4.67M ± 0% 4.96M ± 0% +6.25% (p=0.000 n=10+10) SSA 11.7M ± 0% 12.3M ± 0% +5.69% (p=0.000 n=10+10) Flate 216k ± 0% 226k ± 0% +4.52% (p=0.000 n=10+9) GoParser 271k ± 0% 283k ± 0% +4.52% (p=0.000 n=10+10) Reflect 927k ± 0% 972k ± 0% +4.78% (p=0.000 n=10+10) Tar 318k ± 0% 333k ± 0% +4.56% (p=0.000 n=10+10) XML 376k ± 0% 395k ± 0% +5.04% (p=0.000 n=10+10) [Geo mean] 730k 764k +4.61% name old exe-bytes new exe-bytes delta HelloSize 1.46M ± 0% 1.51M ± 0% +3.66% (p=0.000 n=10+10) For #24543. Change-Id: I91e003dc64151916b384274884bf02a2d6862547 Reviewed-on: https://go-review.googlesource.com/109353 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-05-22 15:55:03 +00:00
David Lazar	699175a11a	cmd/compile,link: generate PC-value tables with inlining information In order to generate accurate tracebacks, the runtime needs to know the inlined call stack for a given PC. This creates two tables per function for this purpose. The first table is the inlining tree (stored in the function's funcdata), which has a node containing the file, line, and function name for every inlined call. The second table is a PC-value table that maps each PC to a node in the inlining tree (or -1 if the PC is not the result of inlining). To give the appearance that inlining hasn't happened, the runtime also needs the original source position information of inlined AST nodes. Previously the compiler plastered over the line numbers of inlined AST nodes with the line number of the call. This meant that the PC-line table mapped each PC to line number of the outermost call in its inlined call stack, with no way to access the innermost line number. Now the compiler retains line numbers of inlined AST nodes and writes the innermost source position information to the PC-line and PC-file tables. Some tools and tests expect to see outermost line numbers, so we provide the OutermostLine function for displaying line info. To keep track of the inlined call stack for an AST node, we extend the src.PosBase type with an index into a global inlining tree. Every time the compiler inlines a call, it creates a node in the global inlining tree for the call, and writes its index to the PosBase of every inlined AST node. The parent of this node is the inlining tree index of the call. -1 signifies no parent. For each function, the compiler creates a local inlining tree and a PC-value table mapping each PC to an index in the local tree. These are written to an object file, which is read by the linker. The linker re-encodes these tables compactly by deduplicating function names and file names. This change increases the size of binaries by 4-5%. 
For example, this is how the go1 benchmark binary is impacted by this change: section old bytes new bytes delta .text 3.49M ± 0% 3.49M ± 0% +0.06% .rodata 1.12M ± 0% 1.21M ± 0% +8.21% .gopclntab 1.50M ± 0% 1.68M ± 0% +11.89% .debug_line 338k ± 0% 435k ± 0% +28.78% Total 9.21M ± 0% 9.58M ± 0% +4.01% Updates #19348. Change-Id: Ic4f180c3b516018138236b0c35e0218270d957d3 Reviewed-on: https://go-review.googlesource.com/37231 Run-TryBot: David Lazar <lazard@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2017-03-03 21:29:30 +00:00
Austin Clements	bab191042b	cmd/internal/obj, runtime: update funcdata comments The comments in cmd/internal/obj/funcdata.go are identical to the comments in runtime/funcdata.h, but the majority of the definitions they refer to don't apply to Go sources and have been stripped out of funcdata.go. Remove these stale comments from funcdata.go and clean up the references to other copies of the PCDATA and FUNCDATA indexes. Change-Id: I5d6e49a6e586cc9aecd7c3ce1567679f2a605884 Reviewed-on: https://go-review.googlesource.com/37330 Reviewed-by: Keith Randall <khr@golang.org>	2017-02-27 22:29:28 +00:00
Richard Miller	8a2d6e9f6f	runtime: fix a typo in assembly macro GO_RESULTS_INITIALIZED Fixes #14772 Change-Id: I32f2b6b74de28be406b1306364bc07620a453962 Reviewed-on: https://go-review.googlesource.com/20680 Reviewed-by: David du Colombier <0intro@gmail.com> Reviewed-by: Minux Ma <minux@golang.org> Run-TryBot: Minux Ma <minux@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-03-14 14:53:29 +00:00
Brad Fitzpatrick	519474451a	all: make copyright headers consistent with one space after period This is a subset of https://golang.org/cl/20022 with only the copyright header lines, so the next CL will be smaller and more reviewable. Go policy has been single space after periods in comments for some time. The copyright header template at: https://golang.org/doc/contribute.html#copyright also uses a single space. Make them all consistent. Change-Id: Icc26c6b8495c3820da6b171ca96a74701b4a01b0 Reviewed-on: https://go-review.googlesource.com/20111 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-03-01 23:34:33 +00:00
Michael Hudson-Doyle	0388d4303f	runtime: remove unused FUNCDATA_DeadValueMaps Change-Id: Iccb0221bd9aef062d20798b952eaa09d9e60b902 Reviewed-on: https://go-review.googlesource.com/14345 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-09-07 21:02:11 +00:00
Russ Cox	fee9e47559	[dev.cc] runtime: convert header files to Go The conversion was done with an automated tool and then modified only as necessary to make it compile and run. [This CL is part of the removal of C code from package runtime. See golang.org/s/dev.cc for an overview.] LGTM=r R=r, austin CC=dvyukov, golang-codereviews, iant, khr https://golang.org/cl/167550043	2014-11-11 17:05:19 -05:00
Russ Cox	202bf8d94d	doc/asm: explain coordination with garbage collector Also a few other minor changes. Fixes #8712. LGTM=r R=r CC=golang-codereviews https://golang.org/cl/164150043	2014-10-28 15:51:06 -04:00
Russ Cox	68c1c6afa0	cmd/cc, cmd/gc: stop generating 'argsize' PCDATA The argsize PCDATA was specifying the number of bytes passed to a function call, so that if the function did not specify its argument count, the garbage collector could use the call site information to scan those bytes conservatively. We don't do that anymore, so stop generating the information. LGTM=khr R=khr CC=golang-codereviews https://golang.org/cl/139530043	2014-09-12 07:51:00 -04:00
Russ Cox	99f7df0598	cmd/gc: turn Go prototypes into ptr liveness maps for assembly functions The goal here is to allow assembly functions to appear in the middle of a Go stack (having called other code) and still record enough information about their pointers so that stack copying and garbage collection can handle them precisely. Today, these frames are handled only conservatively. If you write func myfunc(x float64) (y int) (with no body, an 'extern' declaration), then the Go compiler now emits a liveness bitmap for use from the assembly definition of myfunc. The bitmap symbol is myfunc.args_stackmap and it contains two bitmaps. The first bitmap, in effect at function entry, marks all inputs as live. The second bitmap, not in effect at function entry, marks the outputs live as well. In funcdata.h, define new assembly macros: GO_ARGS opts in to using the Go compiler-generated liveness bitmap for the current function. GO_RESULTS_INITIALIZED indicates that the results have been initialized and need to be kept live for the remainder of the function; it causes a switch to the second generated bitmap for the assembly code that follows. NO_LOCAL_POINTERS indicates that there are no pointers in the local variables being stored in the function's stack frame. LGTM=khr R=khr CC=golang-codereviews https://golang.org/cl/137520043	2014-09-12 00:18:20 -04:00
Russ Cox	c007ce824d	build: move package sources from src/pkg to src Preparation was in CL 134570043. This CL contains only the effect of 'hg mv src/pkg/* src'. For more about the move, see golang.org/s/go14nopkg.	2014-09-08 00:08:51 -04:00

20 Commits
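
The open-coded defer scheme described in commits dad616375f and be64a19d99 (save each defer's arguments to a stack slot, set a bit in deferBits when the defer statement is reached, then test the bits in reverse order at exit) can be imitated by hand. The following is only a rough sketch of that idea in ordinary Go, not the code the compiler actually emits: the function name runOpenCoded, the variables arg0 and arg1, and the logCall helper are all hypothetical, and clearing each bit before the call is an assumption about how a defer is marked as already run.

```go
package main

import "fmt"

// runOpenCoded hand-expands a function shaped like:
//
//	func f(cond bool) {
//		defer logCall("first")
//		if cond {
//			defer logCall("second")
//		}
//	}
//
// It returns the order in which the deferred calls ran.
// All names here are hypothetical and exist only for illustration.
func runOpenCoded(cond bool) []string {
	var calls []string
	logCall := func(s string) { calls = append(calls, s) }

	var deferBits uint8   // one bit per defer statement in the function
	var arg0, arg1 string // stand-ins for the per-defer argument stack slots

	// "defer logCall("first")": save the argument, record that we got here.
	arg0 = "first"
	deferBits |= 1 << 0

	if cond {
		// "defer logCall("second")": only recorded when this branch runs.
		arg1 = "second"
		deferBits |= 1 << 1
	}

	// Exit code: test the bits in reverse order of the defer statements.
	// Each bit is cleared before the call (an assumption of this sketch)
	// so that the defer is not treated as still pending afterwards.
	if deferBits&(1<<1) != 0 {
		deferBits &^= 1 << 1
		logCall(arg1)
	}
	if deferBits&(1<<0) != 0 {
		deferBits &^= 1 << 0
		logCall(arg0)
	}
	return calls
}

func main() {
	fmt.Println(runOpenCoded(false)) // prints [first]
	fmt.Println(runOpenCoded(true))  // prints [second first]
}
```

Because a fixed-width bit set needs one bit per defer statement, a scheme like this only works when the number of defers is known statically, which fits the commit message's restriction to functions with no defers in loops.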