Commit Graph

96 Commits

Author SHA1 Message Date
Keith Randall 1665b006a5 runtime: grow stack by copying
On stack overflow, if all frames on the stack are
copyable, we copy the frames to a new stack twice
as large as the old one.  During GC, if a G is using
less than 1/4 of its stack, copy the stack to a stack
half its size.

TODO
- Do something about C frames.  When a C frame is in the
  stack segment, it isn't copyable.  We allocate a new segment
  in this case.
  - For idempotent C code, we can abort it, copy the stack,
    then retry.  I'm working on a separate CL for this.
  - For other C code, we can raise the stackguard
    to the lowest Go frame so the next call that Go frame
    makes triggers a copy, which will then succeed.
- Pick a starting stack size?

The plan is that eventually we reach a point where the
stack contains only copyable frames.

LGTM=rsc
R=dvyukov, rsc
CC=golang-codereviews
https://golang.org/cl/54650044
2014-02-26 23:28:44 -08:00
Keith Randall 3b5278fca6 runtime: get rid of the settype buffer and lock.
MCaches	now hold a MSpan for each sizeclass which they have
exclusive access to allocate from, so no lock is needed.

Modifying the heap bitmaps also no longer requires a cas.

runtime.free gets more expensive.  But we don't use it
much any more.

It's not much faster on 1 processor, but it's a lot
faster on multiple processors.

benchmark                 old ns/op    new ns/op    delta
BenchmarkSetTypeNoPtr1           24           23   -0.42%
BenchmarkSetTypeNoPtr2           33           34   +0.89%
BenchmarkSetTypePtr1             51           49   -3.72%
BenchmarkSetTypePtr2             55           54   -1.98%

benchmark                old ns/op    new ns/op    delta
BenchmarkAllocation          52739        50770   -3.73%
BenchmarkAllocation-2        33957        34141   +0.54%
BenchmarkAllocation-3        33326        29015  -12.94%
BenchmarkAllocation-4        38105        25795  -32.31%
BenchmarkAllocation-5        68055        24409  -64.13%
BenchmarkAllocation-6        71544        23488  -67.17%
BenchmarkAllocation-7        68374        23041  -66.30%
BenchmarkAllocation-8        70117        20758  -70.40%

LGTM=rsc, dvyukov
R=dvyukov, bradfitz, khr, rsc
CC=golang-codereviews
https://golang.org/cl/46810043
2014-02-26 15:52:58 -08:00
Russ Cox 67c83db60d runtime: use goc2c as much as possible
Package runtime's C functions written to be called from Go
started out written in C using carefully constructed argument
lists and the FLUSH macro to write a result back to memory.

For some functions, the appropriate parameter list ended up
being architecture-dependent due to differences in alignment,
so we added 'goc2c', which takes a .goc file containing Go func
declarations but C bodies, rewrites the Go func declaration to
equivalent C declarations for the target architecture, adds the
needed FLUSH statements, and writes out an equivalent C file.
That C file is compiled as part of package runtime.

Native Client's x86-64 support introduces the most complex
alignment rules yet, breaking many functions that could until
now be portably written in C. Using goc2c for those avoids the
breakage.

Separately, Keith's work on emitting stack information from
the C compiler would require the hand-written functions
to add #pragmas specifying how many arguments are result
parameters. Using goc2c for those avoids maintaining #pragmas.

For both reasons, use goc2c for as many Go-called C functions
as possible.

This CL is a replay of the bulk of CL 15400047 and CL 15790043,
both of which were reviewed as part of the NaCl port and are
checked in to the NaCl branch. This CL is part of bringing the
NaCl code into the main tree.

No new code here, just reformatting and occasional movement
into .h files.

LGTM=r
R=dave, alex.brainman, r
CC=golang-codereviews
https://golang.org/cl/65220044
2014-02-20 15:58:47 -05:00
Russ Cox e8ecd9f67a runtime: update malloc comment for MSpan.needzero
Missed this suggestion in CL 57680046.

LGTM=iant
R=iant
CC=golang-codereviews
https://golang.org/cl/63390043
2014-02-13 14:31:48 -05:00
Russ Cox 86e3cb8da5 runtime: introduce MSpan.needzero instead of writing to span data
This cleans up the code significantly, and it avoids any
possible problems with madvise zeroing out some but
not all of the data.

Fixes #6400.

LGTM=dave
R=dvyukov, dave
CC=golang-codereviews
https://golang.org/cl/57680046
2014-02-13 11:10:31 -05:00
Dmitriy Vyukov 3c3be62201 runtime: concurrent GC sweep
Moves sweep phase out of stoptheworld by adding
background sweeper goroutine and lazy on-demand sweeping.

It turned out to be somewhat trickier than I expected,
because there is no point in time when we know size of live heap
nor consistent number of mallocs and frees.
So everything related to next_gc, mprof, memstats, etc becomes trickier.

At the end of GC next_gc is conservatively set to heap_alloc*GOGC,
which is much larger than real value. But after every sweep
next_gc is decremented by freed*GOGC. So when everything is swept
next_gc becomes what it should be.

For mprof I had to introduce 3-generation scheme (allocs, revent_allocs, prev_allocs),
because by the end of GC we know number of frees for the *previous* GC.

Significant caution is required to not cross yet-unknown real value of next_gc.
This is achieved by 2 means:
1. Whenever I allocate a span from MCentral, I sweep a span in that MCentral.
2. Whenever I allocate N pages from MHeap, I sweep until at least N pages are
returned to heap.
This provides quite strong guarantees that heap does not grow when it should now.

http-1
allocated                    7036         7033      -0.04%
allocs                         60           60      +0.00%
cputime                     51050        46700      -8.52%
gc-pause-one             34060569      1777993     -94.78%
gc-pause-total               2554          133     -94.79%
latency-50                 178448       170926      -4.22%
latency-95                 284350       198294     -30.26%
latency-99                 345191       220652     -36.08%
rss                     101564416    101007360      -0.55%
sys-gc                    6606832      6541296      -0.99%
sys-heap                 88801280     87752704      -1.18%
sys-other                 7334208      7405928      +0.98%
sys-stack                  524288       524288      +0.00%
sys-total               103266608    102224216      -1.01%
time                        50339        46533      -7.56%
virtual-mem             292990976    293728256      +0.25%

garbage-1
allocated                 2983818      2990889      +0.24%
allocs                      62880        62902      +0.03%
cputime                  16480000     16190000      -1.76%
gc-pause-one            828462467    487875135     -41.11%
gc-pause-total            4142312      2439375     -41.11%
rss                    1151709184   1153712128      +0.17%
sys-gc                   66068352     66068352      +0.00%
sys-heap               1039728640   1039728640      +0.00%
sys-other                37776064     40770176      +7.93%
sys-stack                 8781824      8781824      +0.00%
sys-total              1152354880   1155348992      +0.26%
time                     16496998     16199876      -1.80%
virtual-mem            1409564672   1402281984      -0.52%

LGTM=rsc
R=golang-codereviews, sameer, rsc, iant, jeremyjackins, gobot
CC=golang-codereviews, khr
https://golang.org/cl/46430043
2014-02-12 22:16:42 +04:00
Dmitriy Vyukov e48751e217 runtime: increase page size to 8K
Tcmalloc uses 8K, 32K and 64K pages, and in custom setups 256K pages.
Only Chromium uses 4K pages today (in "slow but small" configuration).
The general tendency is to increase page size, because it reduces
metadata size and DTLB pressure.
This change reduces GC pause by ~10% and slightly improves other metrics.

json-1
allocated                 8037492      8038689      +0.01%
allocs                     105762       105573      -0.18%
cputime                 158400000    155800000      -1.64%
gc-pause-one              4412234      4135702      -6.27%
gc-pause-total            2647340      2398707      -9.39%
rss                      54923264     54525952      -0.72%
sys-gc                    3952624      3928048      -0.62%
sys-heap                 46399488     46006272      -0.85%
sys-other                 5597504      5290304      -5.49%
sys-stack                  393216       393216      +0.00%
sys-total                56342832     55617840      -1.29%
time                    158478890    156046916      -1.53%
virtual-mem             256548864    256593920      +0.02%

garbage-1
allocated                 2991113      2986259      -0.16%
allocs                      62844        62652      -0.31%
cputime                  16330000     15860000      -2.88%
gc-pause-one            789108229    725555211      -8.05%
gc-pause-total            3945541      3627776      -8.05%
rss                    1143660544   1132253184      -1.00%
sys-gc                   65609600     65806208      +0.30%
sys-heap               1032388608   1035599872      +0.31%
sys-other                37501632     22777664     -39.26%
sys-stack                 8650752      8781824      +1.52%
sys-total              1144150592   1132965568      -0.98%
time                     16364602     15891994      -2.89%
virtual-mem            1327296512   1313746944      -1.02%

This is the exact reincarnation of already LGTMed:
https://golang.org/cl/45770044
which must not break darwin/freebsd after:
https://golang.org/cl/56630043
TBR=iant

LGTM=khr, iant
R=iant, khr
CC=golang-codereviews
https://golang.org/cl/58230043
2014-01-30 13:28:19 +04:00
Dmitriy Vyukov 86a3a54284 runtime: fix windows build
Currently windows crashes because early allocs in schedinit
try to allocate tiny memory blocks, but m->p is not yet setup.
I've considered calling procresize(1) earlier in schedinit,
but this refactoring is better and must fix the issue as well.
Fixes #7218.

R=golang-codereviews, r
CC=golang-codereviews
https://golang.org/cl/54570045
2014-01-28 00:26:56 +04:00
Dmitriy Vyukov bace9523ee runtime: smarter slice grow
When growing slice take into account size of the allocated memory block.
Also apply the same optimization to string->[]byte conversion.
Fixes #6307.

benchmark                    old ns/op    new ns/op    delta
BenchmarkAppendGrowByte        4541036      4434108   -2.35%
BenchmarkAppendGrowString     59885673     44813604  -25.17%

LGTM=khr
R=khr
CC=golang-codereviews, iant, rsc
https://golang.org/cl/53340044
2014-01-27 15:11:12 +04:00
Dmitriy Vyukov 1fa7029425 runtime: combine small NoScan allocations
Combine NoScan allocations < 16 bytes into a single memory block.
Reduces number of allocations on json/garbage benchmarks by 10+%.

json-1
allocated                 8039872      7949194      -1.13%
allocs                     105774        93776     -11.34%
cputime                 156200000    100700000     -35.53%
gc-pause-one              4908873      3814853     -22.29%
gc-pause-total            2748969      2899288      +5.47%
rss                      52674560     43560960     -17.30%
sys-gc                    3796976      3256304     -14.24%
sys-heap                 43843584     35192832     -19.73%
sys-other                 5589312      5310784      -4.98%
sys-stack                  393216       393216      +0.00%
sys-total                53623088     44153136     -17.66%
time                    156193436    100886714     -35.41%
virtual-mem             256548864    256540672      -0.00%

garbage-1
allocated                 2996885      2932982      -2.13%
allocs                      62904        55200     -12.25%
cputime                  17470000     17400000      -0.40%
gc-pause-one            932757485    925806143      -0.75%
gc-pause-total            4663787      4629030      -0.75%
rss                    1151074304   1133670400      -1.51%
sys-gc                   66068352     65085312      -1.49%
sys-heap               1039728640   1024065536      -1.51%
sys-other                38038208     37485248      -1.45%
sys-stack                 8650752      8781824      +1.52%
sys-total              1152485952   1135417920      -1.48%
time                     17478088     17418005      -0.34%
virtual-mem            1343709184   1324204032      -1.45%

LGTM=iant, bradfitz
R=golang-codereviews, dave, iant, rsc, bradfitz
CC=golang-codereviews, khr
https://golang.org/cl/38750047
2014-01-24 22:35:11 +04:00
Dmitriy Vyukov 8371b0142e undo CL 45770044 / d795425bfa18
Breaks darwin and freebsd.

««« original CL description
runtime: increase page size to 8K
Tcmalloc uses 8K, 32K and 64K pages, and in custom setups 256K pages.
Only Chromium uses 4K pages today (in "slow but small" configuration).
The general tendency is to increase page size, because it reduces
metadata size and DTLB pressure.
This change reduces GC pause by ~10% and slightly improves other metrics.

json-1
allocated                 8037492      8038689      +0.01%
allocs                     105762       105573      -0.18%
cputime                 158400000    155800000      -1.64%
gc-pause-one              4412234      4135702      -6.27%
gc-pause-total            2647340      2398707      -9.39%
rss                      54923264     54525952      -0.72%
sys-gc                    3952624      3928048      -0.62%
sys-heap                 46399488     46006272      -0.85%
sys-other                 5597504      5290304      -5.49%
sys-stack                  393216       393216      +0.00%
sys-total                56342832     55617840      -1.29%
time                    158478890    156046916      -1.53%
virtual-mem             256548864    256593920      +0.02%

garbage-1
allocated                 2991113      2986259      -0.16%
allocs                      62844        62652      -0.31%
cputime                  16330000     15860000      -2.88%
gc-pause-one            789108229    725555211      -8.05%
gc-pause-total            3945541      3627776      -8.05%
rss                    1143660544   1132253184      -1.00%
sys-gc                   65609600     65806208      +0.30%
sys-heap               1032388608   1035599872      +0.31%
sys-other                37501632     22777664     -39.26%
sys-stack                 8650752      8781824      +1.52%
sys-total              1144150592   1132965568      -0.98%
time                     16364602     15891994      -2.89%
virtual-mem            1327296512   1313746944      -1.02%

R=golang-codereviews, dave, khr, rsc, khr
CC=golang-codereviews
https://golang.org/cl/45770044
»»»

R=golang-codereviews
CC=golang-codereviews
https://golang.org/cl/56060043
2014-01-23 19:56:59 +04:00
Dmitriy Vyukov 6d603af6dc runtime: increase page size to 8K
Tcmalloc uses 8K, 32K and 64K pages, and in custom setups 256K pages.
Only Chromium uses 4K pages today (in "slow but small" configuration).
The general tendency is to increase page size, because it reduces
metadata size and DTLB pressure.
This change reduces GC pause by ~10% and slightly improves other metrics.

json-1
allocated                 8037492      8038689      +0.01%
allocs                     105762       105573      -0.18%
cputime                 158400000    155800000      -1.64%
gc-pause-one              4412234      4135702      -6.27%
gc-pause-total            2647340      2398707      -9.39%
rss                      54923264     54525952      -0.72%
sys-gc                    3952624      3928048      -0.62%
sys-heap                 46399488     46006272      -0.85%
sys-other                 5597504      5290304      -5.49%
sys-stack                  393216       393216      +0.00%
sys-total                56342832     55617840      -1.29%
time                    158478890    156046916      -1.53%
virtual-mem             256548864    256593920      +0.02%

garbage-1
allocated                 2991113      2986259      -0.16%
allocs                      62844        62652      -0.31%
cputime                  16330000     15860000      -2.88%
gc-pause-one            789108229    725555211      -8.05%
gc-pause-total            3945541      3627776      -8.05%
rss                    1143660544   1132253184      -1.00%
sys-gc                   65609600     65806208      +0.30%
sys-heap               1032388608   1035599872      +0.31%
sys-other                37501632     22777664     -39.26%
sys-stack                 8650752      8781824      +1.52%
sys-total              1144150592   1132965568      -0.98%
time                     16364602     15891994      -2.89%
virtual-mem            1327296512   1313746944      -1.02%

R=golang-codereviews, dave, khr, rsc, khr
CC=golang-codereviews
https://golang.org/cl/45770044
2014-01-23 18:59:43 +04:00
Russ Cox 3ec60c253d runtime: emit collection stacks in GODEBUG=allocfreetrace mode
R=khr, dvyukov
CC=golang-codereviews
https://golang.org/cl/51830043
2014-01-14 10:39:50 -05:00
Keith Randall 020b39c3f3 runtime: use special records hung off the MSpan to
record finalizers and heap profile info.  Enables
removing the special bit from the heap bitmap.  Also
provides a generic mechanism for annotating occasional
heap objects.

finalizers
        overhead      per obj
old	680 B	      80 B avg
new	16 B/span     48 B

profile
        overhead      per obj
old	32KB	      24 B + hash tables
new	16 B/span     24 B

R=cshapiro, khr, dvyukov, gobot
CC=golang-codereviews
https://golang.org/cl/13314053
2014-01-07 13:45:50 -08:00
Keith Randall 2d20c0d625 runtime: mark objects in free lists as allocated and unscannable.
On the plus side, we don't need to change the bits when mallocing
pointerless objects.  On the other hand, we need to mark objects in the
free lists during GC.  But the free lists are small at GC time, so it
should be a net win.

benchmark                    old ns/op    new ns/op    delta
BenchmarkMalloc8                    40           33  -17.65%
BenchmarkMalloc16                   45           38  -15.72%
BenchmarkMallocTypeInfo8            58           59   +0.85%
BenchmarkMallocTypeInfo16           63           64   +1.10%

R=golang-dev, rsc, dvyukov
CC=cshapiro, golang-dev
https://golang.org/cl/41040043
2013-12-18 17:13:59 -08:00
Carl Shapiro 48279bd567 runtime: add an allocation and free tracing for gc debugging
Output for an allocation and free (sweep) follows

MProf_Malloc(p=0xc2100210a0, size=0x50, type=0x0 <single object>)
        #0 0x46ee15 runtime.mallocgc /usr/local/google/home/cshapiro/go/src/pkg/runtime/malloc.goc:141
        #1 0x47004f runtime.settype_flush /usr/local/google/home/cshapiro/go/src/pkg/runtime/malloc.goc:612
        #2 0x45f92c gc /usr/local/google/home/cshapiro/go/src/pkg/runtime/mgc0.c:2071
        #3 0x45f89e mgc /usr/local/google/home/cshapiro/go/src/pkg/runtime/mgc0.c:2050
        #4 0x45258b runtime.mcall /usr/local/google/home/cshapiro/go/src/pkg/runtime/asm_amd64.s:179

MProf_Free(p=0xc2100210a0, size=0x50)
        #0 0x46ee15 runtime.mallocgc /usr/local/google/home/cshapiro/go/src/pkg/runtime/malloc.goc:141
        #1 0x47004f runtime.settype_flush /usr/local/google/home/cshapiro/go/src/pkg/runtime/malloc.goc:612
        #2 0x45f92c gc /usr/local/google/home/cshapiro/go/src/pkg/runtime/mgc0.c:2071
        #3 0x45f89e mgc /usr/local/google/home/cshapiro/go/src/pkg/runtime/mgc0.c:2050
        #4 0x45258b runtime.mcall /usr/local/google/home/cshapiro/go/src/pkg/runtime/asm_amd64.s:179

R=golang-dev, dvyukov, rsc, cshapiro
CC=golang-dev
https://golang.org/cl/21990045
2013-12-03 14:42:38 -08:00
Dmitriy Vyukov a33ef8d11b runtime: account for all sys memory in MemStats
Currently lots of sys allocations are not accounted in any of XxxSys,
including GC bitmap, spans table, GC roots blocks, GC finalizer blocks,
iface table, netpoll descriptors and more. Up to ~20% can unaccounted.
This change introduces 2 new stats: GCSys and OtherSys for GC metadata
and all other misc allocations, respectively.
Also ensures that all XxxSys indeed sum up to Sys. All sys memory allocation
functions require the stat for accounting, so that it's impossible to miss something.
Also fix updating of mcache_sys/inuse, they were not updated after deallocation.

test/bench/garbage/parser before:
Sys		670064344
HeapSys		610271232
StackSys	65536
MSpanSys	14204928
MCacheSys	16384
BuckHashSys	1439992

after:
Sys		670064344
HeapSys		610271232
StackSys	65536
MSpanSys	14188544
MCacheSys	16384
BuckHashSys	3194304
GCSys		39198688
OtherSys	3129656

Fixes #5799.

R=rsc, dave, alex.brainman
CC=golang-dev
https://golang.org/cl/12946043
2013-09-06 16:55:40 -04:00
Keith Randall fb376021be runtime: record type information for hashtable internal structures.
Remove all hashtable-specific GC code.

Fixes bug 6119.

R=cshapiro, dvyukov, khr
CC=golang-dev
https://golang.org/cl/13078044
2013-08-31 09:09:50 -07:00
Keith Randall d0dd420a24 runtime: rename FlagNoPointers to FlagNoScan. Better represents
the use of the flag, especially for objects which actually do have
pointers but we don't want the GC to scan them.

R=golang-dev, cshapiro
CC=golang-dev
https://golang.org/cl/13181045
2013-08-23 17:28:47 -07:00
Russ Cox 5822e7848a runtime: make SetFinalizer(x, f) accept any f for which f(x) is valid
Originally the requirement was f(x) where f's argument is
exactly x's type.

CL 11858043 relaxed the requirement in a non-standard
way: f's argument must be exactly x's type or interface{}.

If we're going to relax the requirement, it should be done
in a way consistent with the rest of Go. This CL allows f's
argument to have any type for which x is assignable;
that's the same requirement the compiler would impose
if compiling f(x) directly.

Fixes #5368.

R=dvyukov, bradfitz, pieter
CC=golang-dev
https://golang.org/cl/12895043
2013-08-14 14:54:31 -04:00
Dmitriy Vyukov 4e76abbc60 runtime: implement SysUnused on windows
Fixes #5584.

R=golang-dev, chaishushan, alex.brainman
CC=golang-dev
https://golang.org/cl/12720043
2013-08-14 21:54:07 +04:00
Dmitriy Vyukov 0a904a3f2e runtime: remove dead code
Remove dead code related to allocation of type metadata with SysAlloc.

R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/12311045
2013-08-04 23:32:06 +04:00
Pieter Droogendijk 6350e45892 runtime: allow SetFinalizer with a func(interface{})
Fixes #5368.

R=golang-dev, dvyukov
CC=golang-dev, rsc
https://golang.org/cl/11858043
2013-07-29 19:43:08 +04:00
Dmitriy Vyukov f8a850b250 runtime: refactor mallocgc
Make it accept type, combine flags.
Several reasons for the change:
1. mallocgc and settype must be atomic wrt GC
2. settype is called from only one place now
3. it will help performance (eventually settype
functionality must be combined with markallocated)
4. flags are easier to read now (no mallocgc(sz, 0, 1, 0) anymore)

R=golang-dev, iant, nightlyone, rsc, dave, khr, bradfitz, r
CC=golang-dev
https://golang.org/cl/10136043
2013-07-26 21:17:24 +04:00
Dmitriy Vyukov 4f514e8691 runtime: use persistentalloc instead of SysAlloc in FixAlloc
Also reduce FixAlloc allocation granulatiry from 128k to 16k,
small programs do not need that much memory for MCache's and MSpan's.

R=golang-dev, khr
CC=golang-dev
https://golang.org/cl/10140044
2013-06-10 09:20:27 +04:00
Dmitriy Vyukov 5d637b83a9 runtime: speedup malloc stats collection
Count only number of frees, everything else is derivable
and does not need to be counted on every malloc.
benchmark                    old ns/op    new ns/op    delta
BenchmarkMalloc8                    68           66   -3.07%
BenchmarkMalloc16                   75           70   -6.48%
BenchmarkMallocTypeInfo8           102           97   -4.80%
BenchmarkMallocTypeInfo16          108          105   -2.78%

R=golang-dev, dave, rsc
CC=golang-dev
https://golang.org/cl/9776043
2013-06-06 14:56:50 +04:00
Anthony Martin cdfbe00d91 runtime: fix description of SysAlloc
R=golang-dev, iant
CC=golang-dev
https://golang.org/cl/10010046
2013-06-04 17:12:29 -07:00
Dmitriy Vyukov 86da989ee5 runtime: introduce helper persistentalloc() function
It is a caching wrapper around SysAlloc() that can allocate small chunks.
Use it for symtab allocations. Reduces number of symtab walks from 4 to 3
(reduces buildfuncs time from 10ms to 7.5ms on a large binary,
reduces initial heap size by 680K on the same binary).
Also can be used for type info allocation, itab allocation.
There are also several places in GC where we do the same thing,
they can be changed to use persistentalloc().
Also can be used in FixAlloc, because each instance of FixAlloc allocates
in 128K regions, which is too eager.
Reincarnation of committed and rolled back https://golang.org/cl/9805043
The latent bugs that it revealed are fixed:
https://golang.org/cl/9837049
https://golang.org/cl/9778048

R=golang-dev, khr
CC=golang-dev
https://golang.org/cl/9778049
2013-05-31 10:42:30 +04:00
Dmitriy Vyukov e17281b397 runtime: rename mheap.maps to mheap.spans
as was dicussed in cl/9791044

R=golang-dev, r
CC=golang-dev
https://golang.org/cl/9853046
2013-05-30 17:09:58 +04:00
Dmitriy Vyukov 8bbb08533d runtime: make mheap statically allocated again
This depends on: 9791044: runtime: allocate page table lazily
Once page table is moved out of heap, the heap becomes small.
This removes unnecessary dereferences during heap access.
No logical changes.

R=golang-dev, khr
CC=golang-dev
https://golang.org/cl/9802043
2013-05-28 22:14:47 +04:00
Dmitriy Vyukov 671814b904 runtime: allocate page table lazily
This removes the 256MB memory allocation at startup,
which conflicts with ulimit.
Also will allow to eliminate an unnecessary memory dereference in GC,
because the page table is usually mapped at known address.
Update #5049.
Update #5236.

R=golang-dev, khr, r, khr, rsc
CC=golang-dev
https://golang.org/cl/9791044
2013-05-28 22:04:34 +04:00
Dmitriy Vyukov 828c68f8d8 undo CL 9805043 / 776aba85ece8
multiple failures on amd64

««« original CL description
runtime: introduce helper persistentalloc() function
It is a caching wrapper around SysAlloc() that can allocate small chunks.
Use it for symtab allocations. Reduces number of symtab walks from 4 to 3
(reduces buildfuncs time from 10ms to 7.5ms on a large binary,
reduces initial heap size by 680K on the same binary).
Also can be used for type info allocation, itab allocation.
There are also several places in GC where we do the same thing,
they can be changed to use persistentalloc().
Also can be used in FixAlloc, because each instance of FixAlloc allocates
in 128K regions, which is too eager.

R=golang-dev, daniel.morsing, khr
CC=golang-dev
https://golang.org/cl/9805043
»»»

R=golang-dev
CC=golang-dev
https://golang.org/cl/9822043
2013-05-28 11:14:39 +04:00
Dmitriy Vyukov 5166013f75 runtime: inline MCache_Alloc() into mallocgc()
benchmark                    old ns/op    new ns/op    delta
BenchmarkMalloc8                    68           62   -8.63%
BenchmarkMalloc16                   75           69   -7.94%
BenchmarkMallocTypeInfo8           102           98   -3.73%
BenchmarkMallocTypeInfo16          108          103   -4.63%

R=golang-dev, dave, khr
CC=golang-dev
https://golang.org/cl/9790043
2013-05-28 11:05:55 +04:00
Dmitriy Vyukov 47e0a3d7b1 runtime: introduce helper persistentalloc() function
It is a caching wrapper around SysAlloc() that can allocate small chunks.
Use it for symtab allocations. Reduces number of symtab walks from 4 to 3
(reduces buildfuncs time from 10ms to 7.5ms on a large binary,
reduces initial heap size by 680K on the same binary).
Also can be used for type info allocation, itab allocation.
There are also several places in GC where we do the same thing,
they can be changed to use persistentalloc().
Also can be used in FixAlloc, because each instance of FixAlloc allocates
in 128K regions, which is too eager.

R=golang-dev, daniel.morsing, khr
CC=golang-dev
https://golang.org/cl/9805043
2013-05-28 10:47:35 +04:00
Dmitriy Vyukov 5782f4117d runtime: introduce cnewarray() to simplify allocation of typed arrays
R=golang-dev, dsymonds
CC=golang-dev
https://golang.org/cl/9648044
2013-05-27 11:29:11 +04:00
Dmitriy Vyukov c075d82cca runtime: fix and speedup malloc stats
Currently per-sizeclass stats are lost for destroyed MCache's. This patch fixes this.
Also, only update mstats.heap_alloc on heap operations, because that's the only
stat that needs to be promptly updated. Everything else needs to be up-to-date only in ReadMemStats().

R=golang-dev, remyoudompheng, dave, iant
CC=golang-dev
https://golang.org/cl/9207047
2013-05-22 22:22:57 +04:00
Dmitriy Vyukov c4cfef075e runtime: simplify MCache
The nlistmin/size thresholds are copied from tcmalloc,
but are unnecesary for Go malloc. We do not do explicit
frees into MCache. For sparse cases when we do (mainly hashmap),
simpler logic will do.

R=rsc, dave, iant
CC=gobot, golang-dev, r, remyoudompheng
https://golang.org/cl/9373043
2013-05-22 13:29:17 +04:00
Dmitriy Vyukov 23ad563119 runtime: transfer whole span from MCentral to MCache
Finer-grained transfers were relevant with per-M caches,
with per-P caches they are not relevant and harmful for performance.
For few small size classes where it makes difference,
it's fine to grab the whole span (4K).

benchmark          old ns/op    new ns/op    delta
BenchmarkMalloc           42           40   -4.45%

R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/9374043
2013-05-15 18:35:05 +04:00
Dmitriy Vyukov 5a89b35bca runtime: inline size to class conversion in malloc()
Also change table type from int32[] to int8[] to save space in L1$.

benchmark          old ns/op    new ns/op    delta
BenchmarkMalloc           42           40   -4.68%

R=golang-dev, bradfitz, r
CC=golang-dev
https://golang.org/cl/9199044
2013-05-15 11:02:33 +04:00
Shenghou Ma b3b1efd882 runtime: reduce max arena size on windows/amd64 to 32 GiB
Update #5236
Update #5402
This CL reduces gofmt's committed memory from 545864 KiB to 139568 KiB.
Note: Go 1.0.3 uses about 70MiB.

R=golang-dev, r, iant, nightlyone
CC=golang-dev
https://golang.org/cl/9245043
2013-05-07 06:53:02 +08:00
Dmitriy Vyukov cfe336770b runtime: replace union in MHeap with a struct
Unions break precise GC.
Update #5193.

R=golang-dev, iant
CC=golang-dev
https://golang.org/cl/8368044
2013-04-06 20:02:03 -07:00
Jan Ziak a656f82071 runtime: precise garbage collection of channels
This changeset adds a mostly-precise garbage collection of channels.
The garbage collection support code in the linker isn't recognizing
channel types yet.

Fixes issue http://stackoverflow.com/questions/14712586/memory-consumption-skyrocket

R=dvyukov, rsc, bradfitz
CC=dave, golang-dev, minux.ma, remyoudompheng
https://golang.org/cl/7307086
2013-02-25 15:58:23 -05:00
Russ Cox 1903ad7189 cmd/gc, reflect, runtime: switch to indirect func value representation
Step 1 of http://golang.org/s/go11func.

R=golang-dev, r, daniel.morsing, remyoudompheng
CC=golang-dev
https://golang.org/cl/7393045
2013-02-21 17:01:13 -05:00
Russ Cox 8a6ff3ab34 runtime: allocate heap metadata at run time
Before, the mheap structure was in the bss,
but it's quite large (today, 256 MB, much of
which is never actually paged in), and it makes
Go binaries run afoul of exec-time bss size
limits on some BSD systems.

Fixes #4447.

R=golang-dev, dave, minux.ma, remyoudompheng, iant
CC=golang-dev
https://golang.org/cl/7307122
2013-02-15 14:27:03 -05:00
Russ Cox 472354f81e runtime/debug: add controls for garbage collector
Fixes #4090.

R=golang-dev, iant, bradfitz, dsymonds
CC=golang-dev
https://golang.org/cl/7229070
2013-02-04 00:00:55 -05:00
Shenghou Ma 4019d0e424 runtime: avoid defining the same variable in more than one translation unit
For gccgo runtime and Darwin where -fno-common is the default.

R=iant, dave
CC=golang-dev
https://golang.org/cl/7094061
2013-01-26 09:57:06 +08:00
Jan Ziak 9204eb4d3c runtime: interpret type information during garbage collection
R=rsc, dvyukov, remyoudompheng, dave, minux.ma, bradfitz
CC=golang-dev
https://golang.org/cl/6945069
2013-01-10 15:45:46 -05:00
Russ Cox 9799a5a4fd runtime: allow up to 128 GB of allocated memory
Incorporates code from CL 6828055.

Fixes #2142.

R=golang-dev, iant, devon.odell
CC=golang-dev
https://golang.org/cl/6826088
2012-11-13 12:45:08 -05:00
Jan Ziak e0c9d04aec runtime: add memorydump() debugging function
R=golang-dev
CC=golang-dev, remyoudompheng, rsc
https://golang.org/cl/6780059
2012-11-01 12:56:25 -04:00
Jingcheng Zhang 5d05c7800e runtime: sizeclass in MSpan should be int32.
R=golang-dev, minux.ma, dave, rsc
CC=golang-dev
https://golang.org/cl/6643046
2012-10-21 20:32:43 -04:00