runtime: add safe arena support to the runtime

This change adds an API to the runtime for arenas. A later CL can
potentially export it as an experimental API, but for now, just the
runtime implementation will suffice.

The purpose of arenas is to improve efficiency, primarily by allowing
an application to manually free memory, thereby delaying garbage
collection. Arenas also bring other potential performance benefits,
such as better locality, a better allocation strategy, and better
handling of interior pointers by the GC.

This implementation is based on one by danscales@google.com with a few
significant differences:
* The implementation lives entirely in the runtime (all layers).
* Arena chunks are the minimum of 8 MiB and the heap arena size. This
  choice is made because in practice 64 MiB appears to be far too large
  an area for most real-world use cases.
* Arena chunks are not unmapped; instead, they're placed on an evacuation
  list, and once no pointers are left pointing into them, they're allowed
  to be reused.
* Reusing partially-used arena chunks no longer tries to find one used
  by the same P first; it just takes the first one available.
* In order to ensure worst-case fragmentation is never worse than 25%,
  only types and slice backing stores whose sizes are 1/4th the size of
  a chunk or less may be used. Previously larger sizes, up to the size
  of the chunk, were allowed.
* ASAN, MSAN, and the race detector are fully supported.
* Arena chunks whose faulting was deferred because the GC was active are
  set to fault at the end of mark termination (a non-public patch once
  did this; I don't see a reason not to continue that).
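Roughly, the runtime-internal API added here is exercised like this (a
sketch using the test-only wrappers from export_test.go below; T stands
for an arbitrary element type):

    a := NewUserArena()
    var x any = (*T)(nil)
    a.New(&x)        // x now holds a *T allocated in the arena's active chunk
    var sl []T
    a.Slice(&sl, 10) // slice backing store for 10 elements, also in the arena
    a.Free()         // chunks are quarantined or placed on the reuse list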

For #51317.

Change-Id: I83b1693a17302554cb36b6daa4e9249a81b1644f
Reviewed-on: https://go-review.googlesource.com/c/go/+/423359
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Michael Anthony Knyszek 2022-08-12 21:40:46 +00:00 committed by Michael Knyszek
parent 4c383951b9
commit 7866538d25
16 changed files with 1595 additions and 104 deletions

src/runtime/arena.go

@@ -0,0 +1,905 @@
// Copyright 2022 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// Implementation of (safe) user arenas.
//
// This file contains the implementation of user arenas wherein Go values can
// be manually allocated and freed in bulk. The act of manually freeing memory,
// potentially before a GC cycle, means that a garbage collection cycle can be
// delayed, improving efficiency by reducing GC cycle frequency. There are other
// potential efficiency benefits, such as improved locality and access to a more
// efficient allocation strategy.
//
// What makes the arenas here safe is that once they are freed, accessing the
// arena's memory will cause an explicit program fault, and the arena's address
// space will not be reused until no more pointers into it are found. There's one
// exception to this: if an arena's active chunk isn't exhausted when the arena is
// freed, that chunk is placed back into a pool for reuse instead of being set to
// fault. This means that a crash is not always guaranteed.
//
// While this may seem unsafe, it still prevents memory corruption, and is in fact
// necessary in order to make new(T) a valid implementation of arenas. Such a property
// is desirable to allow for a trivial implementation. (It also avoids complexities
// that arise from synchronization with the GC when trying to set the arena chunks to
// fault while the GC is active.)
//
// The implementation works in layers. At the bottom, arenas are managed in chunks.
// Each chunk must be a multiple of the heap arena size, or the heap arena size
// must be a multiple of the arena chunk size. The address space for each chunk, and
// each corresponding heapArena for that address space, are eternally reserved for use as
// arena chunks. That is, they can never be used for the general heap. Each chunk
// is also represented by a single mspan, and is modeled as a single large heap
// allocation. It must be, because each chunk contains ordinary Go values that may
// point into the heap, so it must be scanned just like any other object. Any
// pointer into a chunk will therefore always cause the whole chunk to be scanned
// while its corresponding arena is still live.
//
// Chunks may be allocated either from new memory mapped by the OS on our behalf,
// or by reusing old freed chunks. When chunks are freed, their underlying memory
// is returned to the OS, set to fault on access, and may not be reused until the
// program doesn't point into the chunk anymore (the code refers to this state as
// "quarantined"), a property checked by the GC.
//
// The sweeper handles moving chunks out of this quarantine state to be ready for
// reuse. When the chunk is placed into the quarantine state, its corresponding
// span is marked as noscan so that the GC doesn't try to scan memory that would
// cause a fault.
//
// At the next layer are the user arenas themselves. They consist of a single
// active chunk which new Go values are bump-allocated into and a list of chunks
// that were exhausted when allocating into the arena. Once the arena is freed,
// it frees all full chunks it references, and places the active one onto a reuse
// list for a future arena to use. Each arena keeps its list of referenced chunks
// explicitly live until it is freed. Each user arena also maps to an object which
// has a finalizer attached that ensures the arena's chunks are all freed even if
// the arena itself is never explicitly freed.
//
// Pointer-ful memory is bump-allocated from low addresses to high addresses in each
// chunk, while pointer-free memory is bump-allocated from high addresses to low
// addresses. The reason for this is to take advantage of a GC optimization wherein
// the GC will stop scanning an object when there are no more pointers in it, which
// also allows us to elide clearing the heap bitmap for pointer-free Go values
// allocated into arenas.
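//
// For example, in a fresh chunk spanning [base, limit), the two bump pointers
// move toward one another (an illustrative sketch of the scheme above):
//
//	base                                                limit
//	| pointer-ful objects --->        <--- pointer-free |
//
// so the heap bitmap only ever needs to describe the pointer-ful prefix at
// the low end of the chunk.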
//
// Note that arenas are not safe to use concurrently.
//
// In summary, there are 2 resources: arenas, and arena chunks. They exist in the
// following lifecycle:
//
// (1) A new arena is created via newArena.
// (2) Chunks are allocated to hold memory allocated into the arena with new or slice.
// (a) Chunks are first allocated from the reuse list of partially-used chunks.
// (b) If there are no such chunks, then chunks on the ready list are taken.
// (c) Failing all the above, memory for a new chunk is mapped.
// (3) The arena is freed, or all references to it are dropped, triggering its finalizer.
// (a) If the GC is not active, exhausted chunks are set to fault and placed on a
// quarantine list.
// (b) If the GC is active, exhausted chunks are placed on a fault list and will
// go through step (a) at a later point in time.
// (c) Any remaining partially-used chunk is placed on a reuse list.
// (4) Once no more pointers are found into quarantined arena chunks, the sweeper
// takes these chunks out of quarantine and places them on the ready list.
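//
// In code terms (an illustrative mapping, not additional API): step (1) is
// newUserArena, step (2) is (*userArena).alloc refilling via (*userArena).refill,
// step (3) is (*userArena).free and freeUserArenaChunk, and step (4) is the
// sweeper moving chunks from mheap_.userArena.quarantineList to the ready list.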
package runtime
import (
"internal/goarch"
"runtime/internal/atomic"
"runtime/internal/math"
"unsafe"
)
const (
// userArenaChunkBytes is the size of a user arena chunk.
userArenaChunkBytesMax = 8 << 20
userArenaChunkBytes = uintptr(int64(userArenaChunkBytesMax-heapArenaBytes)&(int64(userArenaChunkBytesMax-heapArenaBytes)>>63) + heapArenaBytes) // min(userArenaChunkBytesMax, heapArenaBytes)
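// The expression above is a branchless min: when userArenaChunkBytesMax is
// smaller than heapArenaBytes, the difference is negative, so shifting it
// right by 63 yields all ones; the masked difference then survives the AND,
// and adding heapArenaBytes back gives userArenaChunkBytesMax. Otherwise the
// mask is zero and the result is just heapArenaBytes. For example, on
// platforms where heapArenaBytes is 64 MiB the chunk size works out to
// 8 MiB, and where heapArenaBytes is 4 MiB it works out to 4 MiB.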
// userArenaChunkPages is the number of pages a user arena chunk uses.
userArenaChunkPages = userArenaChunkBytes / pageSize
// userArenaChunkMaxAllocBytes is the maximum size of an object that can
// be allocated from an arena. This number is chosen to cap worst-case
// fragmentation of user arenas to 25%. Larger allocations are redirected
// to the heap.
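// The 25% bound follows from how chunks are retired: a chunk is only moved
// to an arena's full list when an allocation fails to fit, and no arena
// allocation is larger than a quarter of a chunk, so at most a quarter of a
// chunk can be left unused when it is retired (see the corresponding check
// in refill).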
userArenaChunkMaxAllocBytes = userArenaChunkBytes / 4
)
func init() {
if userArenaChunkPages*pageSize != userArenaChunkBytes {
throw("user arena chunk size is not a mutliple of the page size")
}
if userArenaChunkBytes%physPageSize != 0 {
throw("user arena chunk size is not a mutliple of the physical page size")
}
if userArenaChunkBytes < heapArenaBytes {
if heapArenaBytes%userArenaChunkBytes != 0 {
throw("user arena chunk size is smaller than a heap arena, but doesn't divide it")
}
} else {
if userArenaChunkBytes%heapArenaBytes != 0 {
throw("user arena chunks size is larger than a heap arena, but not a multiple")
}
}
lockInit(&userArenaState.lock, lockRankUserArenaState)
}
type userArena struct {
// full is a list of full chunks, i.e. chunks without enough free memory left, and
// that we'll free once this user arena is freed.
//
// Can't use mSpanList here because it's not-in-heap.
fullList *mspan
// active is the user arena chunk we're currently allocating into.
active *mspan
// refs is a set of references to the arena chunks so that they're kept alive.
//
// The last reference in the list always refers to active, while the rest of
// them correspond to fullList. Specifically, the head of fullList is the
// second-to-last one, fullList.next is the third-to-last, and so on.
//
// In other words, every time a new chunk becomes active, it's appended to this
// list.
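//
// For example (illustrative), after two exhausted chunks have been retired:
//
//	refs:     [c0, c1, c2]    // c2 is the active chunk
//	fullList: c1 -> c0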
refs []unsafe.Pointer
// defunct is true if free has been called on this arena.
//
// This is just a best-effort way to discover a concurrent allocation
// and free. Also used to detect a double-free.
defunct atomic.Bool
}
// newUserArena creates a new userArena ready to be used.
func newUserArena() *userArena {
a := new(userArena)
SetFinalizer(a, func(a *userArena) {
// If the arena handle is dropped without being freed, then call
// free on the arena, so the arena chunks are not leaked: the
// garbage collector never reclaims them on its own.
a.free()
})
a.refill()
return a
}
// new allocates a new object of the provided type into the arena, and returns
// its pointer.
//
// This operation is not safe to call concurrently with other operations on the
// same arena.
func (a *userArena) new(typ *_type) unsafe.Pointer {
return a.alloc(typ, -1)
}
// slice allocates a new slice backing store. slice must be a pointer to a slice
// (i.e. *[]T), because userArenaSlice will update the slice directly.
//
// cap determines the capacity of the slice backing store and must be non-negative.
//
// This operation is not safe to call concurrently with other operations on the
// same arena.
func (a *userArena) slice(sl any, cap int) {
if cap < 0 {
panic("userArena.slice: negative cap")
}
i := efaceOf(&sl)
typ := i._type
if typ.kind&kindMask != kindPtr {
panic("slice result of non-ptr type")
}
typ = (*ptrtype)(unsafe.Pointer(typ)).elem
if typ.kind&kindMask != kindSlice {
panic("slice of non-ptr-to-slice type")
}
typ = (*slicetype)(unsafe.Pointer(typ)).elem
// typ is now the element type of the slice we want to allocate.
*((*slice)(i.data)) = slice{a.alloc(typ, cap), cap, cap}
}
// free returns the userArena's chunks back to mheap and marks it as defunct.
//
// Must be called at most once for any given arena.
//
// This operation is not safe to call concurrently with other operations on the
// same arena.
func (a *userArena) free() {
// Check for a double-free.
if a.defunct.Load() {
panic("arena double free")
}
// Mark ourselves as defunct.
a.defunct.Store(true)
SetFinalizer(a, nil)
// Free all the full chunks.
//
// fullList is ordered from most recently filled to least, so the matching
// refs entries are walked backwards starting at the second-to-last one.
s := a.fullList
i := len(a.refs) - 2
for s != nil {
a.fullList = s.next
s.next = nil
freeUserArenaChunk(s, a.refs[i])
s = a.fullList
i--
}
if a.fullList != nil || i >= 0 {
// There's still something left on the full list, or we
// failed to actually iterate over the entire refs list.
throw("full list doesn't match refs list in length")
}
// Put the active chunk onto the reuse list.
//
// Note that active's reference is always the last reference in refs.
s = a.active
if s != nil {
if raceenabled || msanenabled || asanenabled {
// Don't reuse arenas with sanitizers enabled. We want to catch
// any use-after-free errors aggressively.
freeUserArenaChunk(s, a.refs[len(a.refs)-1])
} else {
lock(&userArenaState.lock)
userArenaState.reuse = append(userArenaState.reuse, liveUserArenaChunk{s, a.refs[len(a.refs)-1]})
unlock(&userArenaState.lock)
}
}
// nil out a.active so that a race with freeing will more likely cause a crash.
a.active = nil
a.refs = nil
}
// alloc reserves space in the current chunk or calls refill and reserves space
// in a new chunk. If cap is negative, the type will be taken literally, otherwise
// it will be considered as an element type for a slice backing store with capacity
// cap.
func (a *userArena) alloc(typ *_type, cap int) unsafe.Pointer {
s := a.active
var x unsafe.Pointer
for {
x = s.userArenaNextFree(typ, cap)
if x != nil {
break
}
s = a.refill()
}
return x
}
// refill inserts the current arena chunk onto the full list and obtains a new
// one, either from the global reuse list of partially-used chunks or, failing
// that, by allocating a new chunk from mheap.
func (a *userArena) refill() *mspan {
// If there's an active chunk, assume it's full.
s := a.active
if s != nil {
if s.userArenaChunkFree.size() > userArenaChunkMaxAllocBytes {
// It's difficult to tell when we're actually out of memory
// in a chunk because the allocation that failed may still leave
// some free space available. However, that amount of free space
// should never exceed the maximum allocation size.
throw("wasted too much memory in an arena chunk")
}
s.next = a.fullList
a.fullList = s
a.active = nil
s = nil
}
var x unsafe.Pointer
// Check the partially-used list.
lock(&userArenaState.lock)
if len(userArenaState.reuse) > 0 {
// Pick off the last arena chunk from the list.
n := len(userArenaState.reuse) - 1
x = userArenaState.reuse[n].x
s = userArenaState.reuse[n].mspan
userArenaState.reuse[n].x = nil
userArenaState.reuse[n].mspan = nil
userArenaState.reuse = userArenaState.reuse[:n]
}
unlock(&userArenaState.lock)
if s == nil {
// Allocate a new one.
x, s = newUserArenaChunk()
if s == nil {
throw("out of memory")
}
}
a.refs = append(a.refs, x)
a.active = s
return s
}
type liveUserArenaChunk struct {
*mspan // Must represent a user arena chunk.
// Reference to mspan.base() to keep the chunk alive.
x unsafe.Pointer
}
var userArenaState struct {
lock mutex
// reuse contains a list of partially-used and already-live
// user arena chunks that can be quickly reused for another
// arena.
//
// Protected by lock.
reuse []liveUserArenaChunk
// fault contains full user arena chunks that need to be faulted.
//
// Protected by lock.
fault []liveUserArenaChunk
}
// userArenaNextFree reserves space in the user arena for an item of the specified
// type. If cap is not -1, this is for an array of cap elements of type t.
func (s *mspan) userArenaNextFree(typ *_type, cap int) unsafe.Pointer {
size := typ.size
if cap > 0 {
if size > ^uintptr(0)/uintptr(cap) {
// Overflow.
throw("out of memory")
}
size *= uintptr(cap)
}
if size == 0 || cap == 0 {
return unsafe.Pointer(&zerobase)
}
if size > userArenaChunkMaxAllocBytes {
// Redirect allocations that don't fit well into a chunk directly
// to the heap.
if cap >= 0 {
return newarray(typ, cap)
}
return newobject(typ)
}
// Prevent preemption as we set up the space for a new object.
//
// Act like we're allocating.
mp := acquirem()
if mp.mallocing != 0 {
throw("malloc deadlock")
}
if mp.gsignal == getg() {
throw("malloc during signal")
}
mp.mallocing = 1
var ptr unsafe.Pointer
if typ.ptrdata == 0 {
// Allocate pointer-less objects from the tail end of the chunk.
v, ok := s.userArenaChunkFree.takeFromBack(size, typ.align)
if ok {
ptr = unsafe.Pointer(v)
}
} else {
v, ok := s.userArenaChunkFree.takeFromFront(size, typ.align)
if ok {
ptr = unsafe.Pointer(v)
}
}
if ptr == nil {
// Failed to allocate.
mp.mallocing = 0
releasem(mp)
return nil
}
if s.needzero != 0 {
throw("arena chunk needs zeroing, but should already be zeroed")
}
// Set up heap bitmap and do extra accounting.
if typ.ptrdata != 0 {
if cap >= 0 {
userArenaHeapBitsSetSliceType(typ, cap, ptr, s.base())
} else {
userArenaHeapBitsSetType(typ, ptr, s.base())
}
c := getMCache(mp)
if c == nil {
throw("mallocgc called without a P or outside bootstrapping")
}
if cap > 0 {
c.scanAlloc += size - (typ.size - typ.ptrdata)
} else {
c.scanAlloc += typ.ptrdata
}
}
// Ensure that the stores above that initialize the allocated memory
// to type-safe values and set the heap bits occur before the caller
// can make ptr observable to the garbage collector. Otherwise, on
// weakly ordered machines, the garbage collector could follow ptr,
// but see uninitialized memory or stale heap bits.
publicationBarrier()
mp.mallocing = 0
releasem(mp)
return ptr
}
// userArenaHeapBitsSetType is the equivalent of heapBitsSetType but for
// non-slice-backing-store Go values allocated in a user arena chunk. It
// sets up the heap bitmap for the value with type typ allocated at address ptr.
// base is the base address of the arena chunk.
func userArenaHeapBitsSetType(typ *_type, ptr unsafe.Pointer, base uintptr) {
h := writeHeapBitsForAddr(uintptr(ptr))
// Our last allocation might have ended right at a noMorePtrs mark,
// which we would not have erased. We need to erase that mark here,
// because we're going to start adding new heap bitmap bits.
// We only need to clear one mark, because below we make sure to
// pad out the bits with zeroes and only write one noMorePtrs bit
// for each new object.
// (This is only necessary at noMorePtrs boundaries, as noMorePtrs
// marks within an object allocated with newAt will be erased by
// the normal writeHeapBitsForAddr mechanism.)
//
// Note that we skip this if this is the first allocation in the
// arena because there's definitely no previous noMorePtrs mark
// (in fact, we *must* skip it, because backing up a pointer from the
// first allocation would take us outside the arena chunk).
if uintptr(ptr)%(8*goarch.PtrSize*goarch.PtrSize) == 0 && uintptr(ptr) != base {
// Back up one pointer and rewrite that pointer. That will
// cause the writeHeapBits implementation to clear the
// noMorePtrs bit we need to clear.
r := heapBitsForAddr(uintptr(ptr)-goarch.PtrSize, goarch.PtrSize)
_, p := r.next()
b := uintptr(0)
if p == uintptr(ptr)-goarch.PtrSize {
b = 1
}
h = writeHeapBitsForAddr(uintptr(ptr) - goarch.PtrSize)
h = h.write(b, 1)
}
p := typ.gcdata // start of 1-bit pointer mask (or GC program)
var gcProgBits uintptr
if typ.kind&kindGCProg != 0 {
// Expand gc program, using the object itself for storage.
gcProgBits = runGCProg(addb(p, 4), (*byte)(ptr))
p = (*byte)(ptr)
}
nb := typ.ptrdata / goarch.PtrSize
for i := uintptr(0); i < nb; i += ptrBits {
k := nb - i
if k > ptrBits {
k = ptrBits
}
h = h.write(readUintptr(addb(p, i/8)), k)
}
// Note: we call pad here to ensure we emit explicit 0 bits
// for the pointerless tail of the object. This ensures that
// there's only a single noMorePtrs mark for the next object
// to clear. We don't need to do this to clear stale noMorePtrs
// markers from previous uses because arena chunk pointer bitmaps
// are always fully cleared when reused.
h = h.pad(typ.size - typ.ptrdata)
h.flush(uintptr(ptr), typ.size)
if typ.kind&kindGCProg != 0 {
// Zero out temporary ptrmask buffer inside object.
memclrNoHeapPointers(ptr, (gcProgBits+7)/8)
}
// Double-check that the bitmap was written out correctly.
//
// Derived from heapBitsSetType.
const doubleCheck = false
if doubleCheck {
size := typ.size
x := uintptr(ptr)
h := heapBitsForAddr(x, size)
for i := uintptr(0); i < size; i += goarch.PtrSize {
// Compute the pointer bit we want at offset i.
want := false
off := i % typ.size
if off < typ.ptrdata {
j := off / goarch.PtrSize
want = *addb(typ.gcdata, j/8)>>(j%8)&1 != 0
}
if want {
var addr uintptr
h, addr = h.next()
if addr != x+i {
throw("userArenaHeapBitsSetType: pointer entry not correct")
}
}
}
if _, addr := h.next(); addr != 0 {
throw("userArenaHeapBitsSetType: extra pointer")
}
}
}
// userArenaHeapBitsSetSliceType is the equivalent of heapBitsSetType but for
// Go slice backing store values allocated in a user arena chunk. It sets up the
// heap bitmap for n consecutive values with type typ allocated at address ptr.
func userArenaHeapBitsSetSliceType(typ *_type, n int, ptr unsafe.Pointer, base uintptr) {
mem, overflow := math.MulUintptr(typ.size, uintptr(n))
if overflow || n < 0 || mem > maxAlloc {
panic(plainError("runtime: allocation size out of range"))
}
for i := 0; i < n; i++ {
userArenaHeapBitsSetType(typ, add(ptr, uintptr(i)*typ.size), base)
}
}
// newUserArenaChunk allocates a user arena chunk, which maps to a single
// heap arena and single span. Returns a pointer to the base of the chunk
// (this is really important: we need to keep the chunk alive) and the span.
func newUserArenaChunk() (unsafe.Pointer, *mspan) {
if gcphase == _GCmarktermination {
throw("newUserArenaChunk called with gcphase == _GCmarktermination")
}
// Deduct assist credit. Because user arena chunks are modeled as one
// giant heap object which counts toward heapLive, we're obligated to
// assist the GC proportionally (and it's worth noting that the arena
// does represent additional work for the GC, but we also have no idea
// what that looks like until we actually allocate things into the
// arena).
deductAssistCredit(userArenaChunkBytes)
// Set mp.mallocing to keep from being preempted by GC.
mp := acquirem()
if mp.mallocing != 0 {
throw("malloc deadlock")
}
if mp.gsignal == getg() {
throw("malloc during signal")
}
mp.mallocing = 1
// Allocate a new user arena.
var span *mspan
systemstack(func() {
span = mheap_.allocUserArenaChunk()
})
if span == nil {
throw("out of memory")
}
x := unsafe.Pointer(span.base())
// Allocate black during GC.
// All slots hold nil so no scanning is needed.
// This may be racing with GC so do it atomically if there can be
// a race marking the bit.
if gcphase != _GCoff {
gcmarknewobject(span, span.base(), span.elemsize)
}
if raceenabled {
// TODO(mknyszek): Track individual objects.
racemalloc(unsafe.Pointer(span.base()), span.elemsize)
}
if msanenabled {
// TODO(mknyszek): Track individual objects.
msanmalloc(unsafe.Pointer(span.base()), span.elemsize)
}
if asanenabled {
// TODO(mknyszek): Track individual objects.
rzSize := computeRZlog(span.elemsize)
span.elemsize -= rzSize
span.limit -= rzSize
span.userArenaChunkFree = makeAddrRange(span.base(), span.limit)
asanpoison(unsafe.Pointer(span.limit), span.npages*pageSize-span.elemsize)
asanunpoison(unsafe.Pointer(span.base()), span.elemsize)
}
if rate := MemProfileRate; rate > 0 {
c := getMCache(mp)
if c == nil {
throw("newUserArenaChunk called without a P or outside bootstrapping")
}
// Note cache c only valid while m acquired; see #47302
if rate != 1 && userArenaChunkBytes < c.nextSample {
c.nextSample -= userArenaChunkBytes
} else {
profilealloc(mp, unsafe.Pointer(span.base()), userArenaChunkBytes)
}
}
mp.mallocing = 0
releasem(mp)
// Again, because this chunk counts toward heapLive, potentially trigger a GC.
if t := (gcTrigger{kind: gcTriggerHeap}); t.test() {
gcStart(t)
}
if debug.malloc {
if debug.allocfreetrace != 0 {
tracealloc(unsafe.Pointer(span.base()), userArenaChunkBytes, nil)
}
if inittrace.active && inittrace.id == getg().goid {
// Init functions are executed sequentially in a single goroutine.
inittrace.bytes += uint64(userArenaChunkBytes)
}
}
// Double-check it's aligned to the physical page size. Based on the current
// implementation this is trivially true, but it need not be in the future.
// However, if it's not aligned to the physical page size then we can't properly
// set it to fault later.
if uintptr(x)%physPageSize != 0 {
throw("user arena chunk is not aligned to the physical page size")
}
return x, span
}
// isUnusedUserArenaChunk indicates that the arena chunk has been set to fault
// and doesn't contain any scannable memory anymore. However, it might still be
// mSpanInUse as it sits on the quarantine list, since it needs to be swept.
//
// This is not safe to execute unless the caller has ownership of the mspan or
// the world is stopped (preemption is prevented while the relevant state changes).
//
// This is really only meant to be used by accounting tests in the runtime to
// distinguish when a span shouldn't be counted (since mSpanInUse might not be
// enough).
func (s *mspan) isUnusedUserArenaChunk() bool {
return s.isUserArenaChunk && s.spanclass == makeSpanClass(0, true)
}
// setUserArenaChunkToFault sets the address space for the user arena chunk to fault
// and releases any underlying memory resources.
//
// Must be in a non-preemptible state to ensure the consistency of statistics
// exported to MemStats.
func (s *mspan) setUserArenaChunkToFault() {
if !s.isUserArenaChunk {
throw("invalid span in heapArena for user arena")
}
if s.npages*pageSize != userArenaChunkBytes {
throw("span on userArena.faultList has invalid size")
}
// Update the span class to be noscan. What we want to happen is that
// any pointer into the span keeps it from getting recycled, so we want
// the mark bit to get set, but we're about to set the address space to fault,
// so we have to prevent the GC from scanning this memory.
//
// It's OK to set it here because (1) a GC isn't in progress, so the scanning code
// won't make a bad decision, (2) we're currently non-preemptible and in the runtime,
// so a GC is blocked from starting. We might race with sweeping, which could
// put it on the "wrong" sweep list, but really don't care because the chunk is
// treated as a large object span and there's no meaningful difference between scan
// and noscan large objects in the sweeper. The STW at the start of the GC acts as a
// barrier for this update.
s.spanclass = makeSpanClass(0, true)
// Actually set the arena chunk to fault, so we'll get dangling pointer errors.
// sysFault currently uses a method on each OS that forces it to evacuate all
// memory backing the chunk.
sysFault(unsafe.Pointer(s.base()), s.npages*pageSize)
// Everything on the list is counted as in-use, however sysFault transitions to
// Reserved, not Prepared, so we skip updating heapFree or heapReleased and just
// remove the memory from the total altogether; it's just address space now.
gcController.heapInUse.add(-int64(s.npages * pageSize))
// Count this as a free of an object right now as opposed to when
// the span gets off the quarantine list. The main reason is so that the
// amount of bytes allocated doesn't exceed how much is counted as
// "mapped ready," which could cause a deadlock in the pacer.
gcController.totalFree.Add(int64(s.npages * pageSize))
// Update consistent stats to match.
//
// We're non-preemptible, so it's safe to update consistent stats (our P
// won't change out from under us).
stats := memstats.heapStats.acquire()
atomic.Xaddint64(&stats.committed, -int64(s.npages*pageSize))
atomic.Xaddint64(&stats.inHeap, -int64(s.npages*pageSize))
atomic.Xadd64(&stats.largeFreeCount, 1)
atomic.Xadd64(&stats.largeFree, int64(s.npages*pageSize))
memstats.heapStats.release()
// This counts as a free, so update heapLive.
gcController.update(-int64(s.npages*pageSize), 0)
// Mark it as free for the race detector.
if raceenabled {
racefree(unsafe.Pointer(s.base()), s.elemsize)
}
systemstack(func() {
// Add the user arena to the quarantine list.
lock(&mheap_.lock)
mheap_.userArena.quarantineList.insert(s)
unlock(&mheap_.lock)
})
}
// inUserArenaChunk returns true if p points to a user arena chunk.
func inUserArenaChunk(p uintptr) bool {
s := spanOf(p)
if s == nil {
return false
}
return s.isUserArenaChunk
}
// freeUserArenaChunk releases the user arena represented by s back to the runtime.
//
// x must be a live pointer within s.
//
// The runtime will set the user arena to fault once it's safe (the GC is no longer running)
// and then once the user arena is no longer referenced by the application, will allow it to
// be reused.
func freeUserArenaChunk(s *mspan, x unsafe.Pointer) {
if !s.isUserArenaChunk {
throw("span is not for a user arena")
}
if s.npages*pageSize != userArenaChunkBytes {
throw("invalid user arena span size")
}
// Mark the region as free to various sanitizers immediately instead
// of handling them at sweep time.
if raceenabled {
racefree(unsafe.Pointer(s.base()), s.elemsize)
}
if msanenabled {
msanfree(unsafe.Pointer(s.base()), s.elemsize)
}
if asanenabled {
asanpoison(unsafe.Pointer(s.base()), s.elemsize)
}
// Make ourselves non-preemptible as we manipulate state and statistics.
//
// Also required by setUserArenaChunkToFault.
mp := acquirem()
// We can only set user arenas to fault if we're in the _GCoff phase.
if gcphase == _GCoff {
lock(&userArenaState.lock)
faultList := userArenaState.fault
userArenaState.fault = nil
unlock(&userArenaState.lock)
s.setUserArenaChunkToFault()
for _, lc := range faultList {
lc.mspan.setUserArenaChunkToFault()
}
// Until the chunks are set to fault, keep them alive via the fault list.
KeepAlive(x)
KeepAlive(faultList)
} else {
// Put the user arena on the fault list.
lock(&userArenaState.lock)
userArenaState.fault = append(userArenaState.fault, liveUserArenaChunk{s, x})
unlock(&userArenaState.lock)
}
releasem(mp)
}
// allocUserArenaChunk returns a span representing a user arena chunk,
// reusing a chunk from the ready list if possible and mapping in new
// memory otherwise.
//
// Must be in a non-preemptible state to ensure the consistency of statistics
// exported to MemStats.
//
// Acquires the heap lock. Must run on the system stack for that reason.
//
//go:systemstack
func (h *mheap) allocUserArenaChunk() *mspan {
var s *mspan
var base uintptr
// First check the free list.
lock(&h.lock)
if !h.userArena.readyList.isEmpty() {
s = h.userArena.readyList.first
h.userArena.readyList.remove(s)
base = s.base()
} else {
// Free list was empty, so allocate a new arena.
hintList := &h.userArena.arenaHints
if raceenabled {
// In race mode just use the regular heap hints. We might fragment
// the address space, but the race detector requires that the heap
// is mapped contiguously.
hintList = &h.arenaHints
}
v, size := h.sysAlloc(userArenaChunkBytes, hintList, false)
if size%userArenaChunkBytes != 0 {
throw("sysAlloc size is not divisible by userArenaChunkBytes")
}
if size > userArenaChunkBytes {
// We got more than we asked for. This can happen if
// heapArenaBytes > userArenaChunkBytes, or if sysAlloc just returns
// some extra as a result of trying to find an aligned region.
//
// Divide it up and put it on the ready list.
for i := uintptr(userArenaChunkBytes); i < size; i += userArenaChunkBytes {
s := h.allocMSpanLocked()
s.init(uintptr(v)+i, userArenaChunkPages)
h.userArena.readyList.insertBack(s)
}
size = userArenaChunkBytes
}
base = uintptr(v)
if base == 0 {
// Out of memory.
unlock(&h.lock)
return nil
}
s = h.allocMSpanLocked()
}
unlock(&h.lock)
// sysAlloc returns Reserved address space, and any span we're
// reusing is set to fault (so, also Reserved), so transition
// it to Prepared and then Ready.
//
// Unlike (*mheap).grow, just map in everything that we
// asked for. We're likely going to use it all.
sysMap(unsafe.Pointer(base), userArenaChunkBytes, &gcController.heapReleased)
sysUsed(unsafe.Pointer(base), userArenaChunkBytes, userArenaChunkBytes)
// Model the user arena as a heap span for a large object.
spc := makeSpanClass(0, false)
h.initSpan(s, spanAllocHeap, spc, base, userArenaChunkPages)
s.isUserArenaChunk = true
// Account for this new arena chunk memory.
gcController.heapInUse.add(int64(userArenaChunkBytes))
gcController.heapReleased.add(-int64(userArenaChunkBytes))
stats := memstats.heapStats.acquire()
atomic.Xaddint64(&stats.inHeap, int64(userArenaChunkBytes))
atomic.Xaddint64(&stats.committed, int64(userArenaChunkBytes))
// Model the arena as a single large malloc.
atomic.Xadd64(&stats.largeAlloc, int64(userArenaChunkBytes))
atomic.Xadd64(&stats.largeAllocCount, 1)
memstats.heapStats.release()
// Count the alloc in inconsistent, internal stats.
gcController.totalAlloc.Add(int64(userArenaChunkBytes))
// Update heapLive.
gcController.update(int64(userArenaChunkBytes), 0)
// Put the large span in the mcentral swept list so that it's
// visible to the background sweeper.
h.central[spc].mcentral.fullSwept(h.sweepgen).push(s)
s.limit = s.base() + userArenaChunkBytes
s.freeindex = 1
s.allocCount = 1
// This must clear the entire heap bitmap so that it's safe
// to allocate noscan data without writing anything out.
s.initHeapBits(true)
// Clear the span preemptively. It's an arena chunk, so let's assume
// everything is going to be used.
//
// This also seems to make a massive difference as to whether or
// not Linux decides to back this memory with transparent huge
// pages. There's latency involved in this zeroing, but the hugepage
// gains are almost always worth it. Note: it's important that we
// clear the memory even when it's freshly mapped and already zero,
// because the write itself is the critical signal that causes the
// kernel to back it with huge pages.
memclrNoHeapPointers(unsafe.Pointer(s.base()), s.elemsize)
s.needzero = 0
// Set up the range for allocation.
s.userArenaChunkFree = makeAddrRange(base, s.limit)
return s
}

src/runtime/arena_test.go

@@ -0,0 +1,377 @@
// Copyright 2022 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package runtime_test
import (
"internal/goarch"
"reflect"
. "runtime"
"runtime/debug"
"runtime/internal/atomic"
"testing"
"time"
"unsafe"
)
type smallScalar struct {
X uintptr
}
type smallPointer struct {
X *smallPointer
}
type smallPointerMix struct {
A *smallPointer
B byte
C *smallPointer
D [11]byte
}
type mediumScalarEven [8192]byte
type mediumScalarOdd [3321]byte
type mediumPointerEven [1024]*smallPointer
type mediumPointerOdd [1023]*smallPointer
type largeScalar [UserArenaChunkBytes + 1]byte
type largePointer [UserArenaChunkBytes/unsafe.Sizeof(&smallPointer{}) + 1]*smallPointer
func TestUserArena(t *testing.T) {
// Set GOMAXPROCS to 2 so we don't run too many of these
// tests in parallel.
defer GOMAXPROCS(GOMAXPROCS(2))
// Start a subtest so that we can clean up after any parallel tests within.
t.Run("Alloc", func(t *testing.T) {
ss := &smallScalar{5}
runSubTestUserArenaNew(t, ss, true)
sp := &smallPointer{new(smallPointer)}
runSubTestUserArenaNew(t, sp, true)
spm := &smallPointerMix{sp, 5, nil, [11]byte{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11}}
runSubTestUserArenaNew(t, spm, true)
mse := new(mediumScalarEven)
for i := range mse {
mse[i] = 121
}
runSubTestUserArenaNew(t, mse, true)
mso := new(mediumScalarOdd)
for i := range mso {
mso[i] = 122
}
runSubTestUserArenaNew(t, mso, true)
mpe := new(mediumPointerEven)
for i := range mpe {
mpe[i] = sp
}
runSubTestUserArenaNew(t, mpe, true)
mpo := new(mediumPointerOdd)
for i := range mpo {
mpo[i] = sp
}
runSubTestUserArenaNew(t, mpo, true)
ls := new(largeScalar)
for i := range ls {
ls[i] = 123
}
// Not in parallel because we don't want to hold this large allocation live.
runSubTestUserArenaNew(t, ls, false)
lp := new(largePointer)
for i := range lp {
lp[i] = sp
}
// Not in parallel because we don't want to hold this large allocation live.
runSubTestUserArenaNew(t, lp, false)
sss := make([]smallScalar, 25)
for i := range sss {
sss[i] = smallScalar{12}
}
runSubTestUserArenaSlice(t, sss, true)
mpos := make([]mediumPointerOdd, 5)
for i := range mpos {
mpos[i] = *mpo
}
runSubTestUserArenaSlice(t, mpos, true)
sps := make([]smallPointer, UserArenaChunkBytes/unsafe.Sizeof(smallPointer{})+1)
for i := range sps {
sps[i] = *sp
}
// Not in parallel because we don't want to hold this large allocation live.
runSubTestUserArenaSlice(t, sps, false)
// Test zero-sized types.
t.Run("struct{}", func(t *testing.T) {
arena := NewUserArena()
var x any
x = (*struct{})(nil)
arena.New(&x)
if v := unsafe.Pointer(x.(*struct{})); v != ZeroBase {
t.Errorf("expected zero-sized type to be allocated as zerobase: got %x, want %x", v, ZeroBase)
}
arena.Free()
})
t.Run("[]struct{}", func(t *testing.T) {
arena := NewUserArena()
var sl []struct{}
arena.Slice(&sl, 10)
if v := unsafe.Pointer(&sl[0]); v != ZeroBase {
t.Errorf("expected zero-sized type to be allocated as zerobase: got %x, want %x", v, ZeroBase)
}
arena.Free()
})
t.Run("[]int (cap 0)", func(t *testing.T) {
arena := NewUserArena()
var sl []int
arena.Slice(&sl, 0)
if len(sl) != 0 {
t.Errorf("expected requested zero-sized slice to still have zero length: got %x, want 0", len(sl))
}
arena.Free()
})
})
// Run a GC cycle to get any arenas off the quarantine list.
GC()
if n := GlobalWaitingArenaChunks(); n != 0 {
t.Errorf("expected zero waiting arena chunks, found %d", n)
}
}
func runSubTestUserArenaNew[S comparable](t *testing.T, value *S, parallel bool) {
t.Run(reflect.TypeOf(value).Elem().Name(), func(t *testing.T) {
if parallel {
t.Parallel()
}
// Allocate and write data, enough to exhaust the arena.
//
// This is an underestimate, likely leaving some space in the arena. That's a good thing,
// because it gives us coverage of boundary cases.
n := int(UserArenaChunkBytes / unsafe.Sizeof(*value))
if n == 0 {
n = 1
}
// Create a new arena and do a bunch of operations on it.
arena := NewUserArena()
arenaValues := make([]*S, 0, n)
for j := 0; j < n; j++ {
var x any
x = (*S)(nil)
arena.New(&x)
s := x.(*S)
*s = *value
arenaValues = append(arenaValues, s)
}
// Check integrity of allocated data.
for _, s := range arenaValues {
if *s != *value {
t.Errorf("failed integrity check: got %#v, want %#v", *s, *value)
}
}
// Release the arena.
arena.Free()
})
}
func runSubTestUserArenaSlice[S comparable](t *testing.T, value []S, parallel bool) {
t.Run("[]"+reflect.TypeOf(value).Elem().Name(), func(t *testing.T) {
if parallel {
t.Parallel()
}
// Allocate and write data, enough to exhaust the arena.
//
// This is an underestimate, likely leaving some space in the arena. That's a good thing,
// because it gives us coverage of boundary cases.
n := int(UserArenaChunkBytes / (unsafe.Sizeof(*new(S)) * uintptr(cap(value))))
if n == 0 {
n = 1
}
// Create a new arena and do a bunch of operations on it.
arena := NewUserArena()
arenaValues := make([][]S, 0, n)
for j := 0; j < n; j++ {
var sl []S
arena.Slice(&sl, cap(value))
copy(sl, value)
arenaValues = append(arenaValues, sl)
}
// Check integrity of allocated data.
for _, sl := range arenaValues {
for i := range sl {
got := sl[i]
want := value[i]
if got != want {
t.Errorf("failed integrity check: got %#v, want %#v at index %d", got, want, i)
}
}
}
// Release the arena.
arena.Free()
})
}
func TestUserArenaLiveness(t *testing.T) {
t.Run("Free", func(t *testing.T) {
testUserArenaLiveness(t, false)
})
t.Run("Finalizer", func(t *testing.T) {
testUserArenaLiveness(t, true)
})
}
func testUserArenaLiveness(t *testing.T, useArenaFinalizer bool) {
// Disable the GC so that there's zero chance we try doing anything arena related *during*
// a mark phase, since otherwise a bunch of arenas could end up on the fault list.
defer debug.SetGCPercent(debug.SetGCPercent(-1))
// Defensively ensure that any full arena chunks leftover from previous tests have been cleared.
GC()
GC()
arena := NewUserArena()
// Allocate a few pointer-ful but uninitialized objects so that later we can
// place a reference to a heap object at a more interesting location.
for i := 0; i < 3; i++ {
var x any
x = (*mediumPointerOdd)(nil)
arena.New(&x)
}
var x any
x = (*smallPointerMix)(nil)
arena.New(&x)
v := x.(*smallPointerMix)
var safeToFinalize atomic.Bool
var finalized atomic.Bool
v.C = new(smallPointer)
SetFinalizer(v.C, func(_ *smallPointer) {
if !safeToFinalize.Load() {
t.Error("finalized arena-referenced object unexpectedly")
}
finalized.Store(true)
})
// Make sure it stays alive.
GC()
GC()
// In order to ensure the object can be freed, we now need to make sure to use
// the entire arena. Exhaust the rest of the arena.
for i := 0; i < int(UserArenaChunkBytes/unsafe.Sizeof(mediumScalarEven{})); i++ {
var x any
x = (*mediumScalarEven)(nil)
arena.New(&x)
}
// Make sure it stays alive again.
GC()
GC()
v = nil
safeToFinalize.Store(true)
if useArenaFinalizer {
arena = nil
// Try to queue the arena finalizer.
GC()
GC()
// In order for the finalizer we actually want to run to execute,
// we need to make sure this one runs first.
if !BlockUntilEmptyFinalizerQueue(int64(2 * time.Second)) {
t.Fatal("finalizer queue was never emptied")
}
} else {
// Free the arena explicitly.
arena.Free()
}
// Try to queue the object's finalizer that we set earlier.
GC()
GC()
if !BlockUntilEmptyFinalizerQueue(int64(2 * time.Second)) {
t.Fatal("finalizer queue was never emptied")
}
if !finalized.Load() {
t.Error("expected arena-referenced object to be finalized")
}
}
func TestUserArenaClearsPointerBits(t *testing.T) {
// This is a regression test for a serious issue wherein if pointer bits
// aren't properly cleared, it's possible to allocate scalar data down
// into a previously pointer-ful area, causing misinterpretation by the GC.
// Create a large object, grab a pointer into it, and free it.
x := new([8 << 20]byte)
xp := uintptr(unsafe.Pointer(&x[124]))
var finalized atomic.Bool
SetFinalizer(x, func(_ *[8 << 20]byte) {
finalized.Store(true)
})
// Write three chunks worth of pointer data. Three gives us a
// high likelihood that when we write 2 later, we'll get the behavior
// we want.
a := NewUserArena()
for i := 0; i < int(UserArenaChunkBytes/goarch.PtrSize*3); i++ {
var x any
x = (*smallPointer)(nil)
a.New(&x)
}
a.Free()
// Recycle the arena chunks.
GC()
GC()
a = NewUserArena()
for i := 0; i < int(UserArenaChunkBytes/goarch.PtrSize*2); i++ {
var x any
x = (*smallScalar)(nil)
a.New(&x)
v := x.(*smallScalar)
// Write a pointer that should not keep x alive.
*v = smallScalar{xp}
}
KeepAlive(x)
x = nil
// Try to free x.
GC()
GC()
if !BlockUntilEmptyFinalizerQueue(int64(2 * time.Second)) {
t.Fatal("finalizer queue was never emptied")
}
if !finalized.Load() {
t.Fatal("heap allocation kept alive through non-pointer reference")
}
// Clean up the arena.
a.Free()
GC()
GC()
}

src/runtime/export_test.go

@@ -362,6 +362,9 @@ func ReadMemStatsSlow() (base, slow MemStats) {
if s.state.get() != mSpanInUse {
continue
}
if s.isUnusedUserArenaChunk() {
continue
}
if sizeclass := s.spanclass.sizeclass(); sizeclass == 0 {
slow.Mallocs++
slow.Alloc += uint64(s.elemsize)
@@ -1625,3 +1628,67 @@ func (s *ScavengeIndex) Clear(ci ChunkIdx) {
}
const GTrackingPeriod = gTrackingPeriod
var ZeroBase = unsafe.Pointer(&zerobase)
const UserArenaChunkBytes = userArenaChunkBytes
type UserArena struct {
arena *userArena
}
func NewUserArena() *UserArena {
return &UserArena{newUserArena()}
}
func (a *UserArena) New(out *any) {
i := efaceOf(out)
typ := i._type
if typ.kind&kindMask != kindPtr {
panic("new result of non-ptr type")
}
typ = (*ptrtype)(unsafe.Pointer(typ)).elem
i.data = a.arena.new(typ)
}
func (a *UserArena) Slice(sl any, cap int) {
a.arena.slice(sl, cap)
}
func (a *UserArena) Free() {
a.arena.free()
}
func GlobalWaitingArenaChunks() int {
n := 0
systemstack(func() {
lock(&mheap_.lock)
for s := mheap_.userArena.quarantineList.first; s != nil; s = s.next {
n++
}
unlock(&mheap_.lock)
})
return n
}
var AlignUp = alignUp
// BlockUntilEmptyFinalizerQueue blocks until either the finalizer
// queue is emptied (and the finalizers have executed) or the timeout
// is reached. Returns true if the finalizer queue was emptied.
func BlockUntilEmptyFinalizerQueue(timeout int64) bool {
start := nanotime()
for nanotime()-start < timeout {
lock(&finlock)
// We know the queue has been drained when both finq is nil
// and the finalizer g has stopped executing.
empty := finq == nil
empty = empty && readgstatus(fing) == _Gwaiting && fing.waitreason == waitReasonFinalizerWait
unlock(&finlock)
if empty {
return true
}
Gosched()
}
return false
}

src/runtime/lockrank.go

@@ -33,6 +33,7 @@ const (
lockRankRoot
lockRankItab
lockRankReflectOffs
lockRankUserArenaState
// TRACEGLOBAL
lockRankTraceBuf
lockRankTraceStrings
@@ -69,50 +70,51 @@ const lockRankLeafRank lockRank = 1000
// lockNames gives the names associated with each of the above ranks.
var lockNames = []string{
lockRankSysmon: "sysmon",
lockRankScavenge: "scavenge",
lockRankForcegc: "forcegc",
lockRankDefer: "defer",
lockRankSweepWaiters: "sweepWaiters",
lockRankAssistQueue: "assistQueue",
lockRankSweep: "sweep",
lockRankPollDesc: "pollDesc",
lockRankCpuprof: "cpuprof",
lockRankSched: "sched",
lockRankAllg: "allg",
lockRankAllp: "allp",
lockRankTimers: "timers",
lockRankNetpollInit: "netpollInit",
lockRankHchan: "hchan",
lockRankNotifyList: "notifyList",
lockRankSudog: "sudog",
lockRankRwmutexW: "rwmutexW",
lockRankRwmutexR: "rwmutexR",
lockRankRoot: "root",
lockRankItab: "itab",
lockRankReflectOffs: "reflectOffs",
lockRankTraceBuf: "traceBuf",
lockRankTraceStrings: "traceStrings",
lockRankFin: "fin",
lockRankGcBitsArenas: "gcBitsArenas",
lockRankMheapSpecial: "mheapSpecial",
lockRankMspanSpecial: "mspanSpecial",
lockRankSpanSetSpine: "spanSetSpine",
lockRankProfInsert: "profInsert",
lockRankProfBlock: "profBlock",
lockRankProfMemActive: "profMemActive",
lockRankProfMemFuture: "profMemFuture",
lockRankGscan: "gscan",
lockRankStackpool: "stackpool",
lockRankStackLarge: "stackLarge",
lockRankHchanLeaf: "hchanLeaf",
lockRankWbufSpans: "wbufSpans",
lockRankMheap: "mheap",
lockRankGlobalAlloc: "globalAlloc",
lockRankTrace: "trace",
lockRankTraceStackTab: "traceStackTab",
lockRankPanic: "panic",
lockRankDeadlock: "deadlock",
lockRankSysmon: "sysmon",
lockRankScavenge: "scavenge",
lockRankForcegc: "forcegc",
lockRankDefer: "defer",
lockRankSweepWaiters: "sweepWaiters",
lockRankAssistQueue: "assistQueue",
lockRankSweep: "sweep",
lockRankPollDesc: "pollDesc",
lockRankCpuprof: "cpuprof",
lockRankSched: "sched",
lockRankAllg: "allg",
lockRankAllp: "allp",
lockRankTimers: "timers",
lockRankNetpollInit: "netpollInit",
lockRankHchan: "hchan",
lockRankNotifyList: "notifyList",
lockRankSudog: "sudog",
lockRankRwmutexW: "rwmutexW",
lockRankRwmutexR: "rwmutexR",
lockRankRoot: "root",
lockRankItab: "itab",
lockRankReflectOffs: "reflectOffs",
lockRankUserArenaState: "userArenaState",
lockRankTraceBuf: "traceBuf",
lockRankTraceStrings: "traceStrings",
lockRankFin: "fin",
lockRankGcBitsArenas: "gcBitsArenas",
lockRankMheapSpecial: "mheapSpecial",
lockRankMspanSpecial: "mspanSpecial",
lockRankSpanSetSpine: "spanSetSpine",
lockRankProfInsert: "profInsert",
lockRankProfBlock: "profBlock",
lockRankProfMemActive: "profMemActive",
lockRankProfMemFuture: "profMemFuture",
lockRankGscan: "gscan",
lockRankStackpool: "stackpool",
lockRankStackLarge: "stackLarge",
lockRankHchanLeaf: "hchanLeaf",
lockRankWbufSpans: "wbufSpans",
lockRankMheap: "mheap",
lockRankGlobalAlloc: "globalAlloc",
lockRankTrace: "trace",
lockRankTraceStackTab: "traceStackTab",
lockRankPanic: "panic",
lockRankDeadlock: "deadlock",
}
func (rank lockRank) String() string {
@@ -134,48 +136,49 @@ func (rank lockRank) String() string {
//
// Lock ranks that allow self-cycles list themselves.
var lockPartialOrder [][]lockRank = [][]lockRank{
lockRankSysmon: {},
lockRankScavenge: {lockRankSysmon},
lockRankForcegc: {lockRankSysmon},
lockRankDefer: {},
lockRankSweepWaiters: {},
lockRankAssistQueue: {},
lockRankSweep: {},
lockRankPollDesc: {},
lockRankCpuprof: {},
lockRankSched: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof},
lockRankAllg: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched},
lockRankAllp: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched},
lockRankTimers: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllp, lockRankTimers},
lockRankNetpollInit: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllp, lockRankTimers},
lockRankHchan: {lockRankSysmon, lockRankScavenge, lockRankSweep, lockRankHchan},
lockRankNotifyList: {},
lockRankSudog: {lockRankSysmon, lockRankScavenge, lockRankSweep, lockRankHchan, lockRankNotifyList},
lockRankRwmutexW: {},
lockRankRwmutexR: {lockRankSysmon, lockRankRwmutexW},
lockRankRoot: {},
lockRankItab: {},
lockRankReflectOffs: {lockRankItab},
lockRankTraceBuf: {lockRankSysmon, lockRankScavenge},
lockRankTraceStrings: {lockRankSysmon, lockRankScavenge, lockRankTraceBuf},
lockRankFin: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankHchan, lockRankNotifyList, lockRankItab, lockRankReflectOffs, lockRankTraceBuf, lockRankTraceStrings},
lockRankGcBitsArenas: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankHchan, lockRankNotifyList, lockRankItab, lockRankReflectOffs, lockRankTraceBuf, lockRankTraceStrings},
lockRankMheapSpecial: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankHchan, lockRankNotifyList, lockRankItab, lockRankReflectOffs, lockRankTraceBuf, lockRankTraceStrings},
lockRankMspanSpecial: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankHchan, lockRankNotifyList, lockRankItab, lockRankReflectOffs, lockRankTraceBuf, lockRankTraceStrings},
lockRankSpanSetSpine: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankHchan, lockRankNotifyList, lockRankItab, lockRankReflectOffs, lockRankTraceBuf, lockRankTraceStrings},
lockRankProfInsert: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankHchan, lockRankNotifyList, lockRankItab, lockRankReflectOffs, lockRankTraceBuf, lockRankTraceStrings},
lockRankProfBlock: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankHchan, lockRankNotifyList, lockRankItab, lockRankReflectOffs, lockRankTraceBuf, lockRankTraceStrings},
lockRankProfMemActive: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankHchan, lockRankNotifyList, lockRankItab, lockRankReflectOffs, lockRankTraceBuf, lockRankTraceStrings},
lockRankProfMemFuture: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankHchan, lockRankNotifyList, lockRankItab, lockRankReflectOffs, lockRankTraceBuf, lockRankTraceStrings, lockRankProfMemActive},
lockRankGscan: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankNetpollInit, lockRankHchan, lockRankNotifyList, lockRankRoot, lockRankItab, lockRankReflectOffs, lockRankTraceBuf, lockRankTraceStrings, lockRankFin, lockRankGcBitsArenas, lockRankSpanSetSpine, lockRankProfInsert, lockRankProfBlock, lockRankProfMemActive, lockRankProfMemFuture},
lockRankStackpool: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankNetpollInit, lockRankHchan, lockRankNotifyList, lockRankRwmutexW, lockRankRwmutexR, lockRankRoot, lockRankItab, lockRankReflectOffs, lockRankTraceBuf, lockRankTraceStrings, lockRankFin, lockRankGcBitsArenas, lockRankSpanSetSpine, lockRankProfInsert, lockRankProfBlock, lockRankProfMemActive, lockRankProfMemFuture, lockRankGscan},
lockRankStackLarge: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankNetpollInit, lockRankHchan, lockRankNotifyList, lockRankRoot, lockRankItab, lockRankReflectOffs, lockRankTraceBuf, lockRankTraceStrings, lockRankFin, lockRankGcBitsArenas, lockRankSpanSetSpine, lockRankProfInsert, lockRankProfBlock, lockRankProfMemActive, lockRankProfMemFuture, lockRankGscan},
lockRankHchanLeaf: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankNetpollInit, lockRankHchan, lockRankNotifyList, lockRankRoot, lockRankItab, lockRankReflectOffs, lockRankTraceBuf, lockRankTraceStrings, lockRankFin, lockRankGcBitsArenas, lockRankSpanSetSpine, lockRankProfInsert, lockRankProfBlock, lockRankProfMemActive, lockRankProfMemFuture, lockRankGscan, lockRankHchanLeaf},
lockRankWbufSpans: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankDefer, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankNetpollInit, lockRankHchan, lockRankNotifyList, lockRankSudog, lockRankRoot, lockRankItab, lockRankReflectOffs, lockRankTraceBuf, lockRankTraceStrings, lockRankFin, lockRankGcBitsArenas, lockRankMspanSpecial, lockRankSpanSetSpine, lockRankProfInsert, lockRankProfBlock, lockRankProfMemActive, lockRankProfMemFuture, lockRankGscan},
lockRankMheap: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankDefer, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankNetpollInit, lockRankHchan, lockRankNotifyList, lockRankSudog, lockRankRwmutexW, lockRankRwmutexR, lockRankRoot, lockRankItab, lockRankReflectOffs, lockRankTraceBuf, lockRankTraceStrings, lockRankFin, lockRankGcBitsArenas, lockRankMspanSpecial, lockRankSpanSetSpine, lockRankProfInsert, lockRankProfBlock, lockRankProfMemActive, lockRankProfMemFuture, lockRankGscan, lockRankStackpool, lockRankStackLarge, lockRankWbufSpans},
lockRankGlobalAlloc: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankDefer, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankNetpollInit, lockRankHchan, lockRankNotifyList, lockRankSudog, lockRankRwmutexW, lockRankRwmutexR, lockRankRoot, lockRankItab, lockRankReflectOffs, lockRankTraceBuf, lockRankTraceStrings, lockRankFin, lockRankGcBitsArenas, lockRankMheapSpecial, lockRankMspanSpecial, lockRankSpanSetSpine, lockRankProfInsert, lockRankProfBlock, lockRankProfMemActive, lockRankProfMemFuture, lockRankGscan, lockRankStackpool, lockRankStackLarge, lockRankWbufSpans, lockRankMheap},
lockRankTrace: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankDefer, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankNetpollInit, lockRankHchan, lockRankNotifyList, lockRankSudog, lockRankRwmutexW, lockRankRwmutexR, lockRankRoot, lockRankItab, lockRankReflectOffs, lockRankTraceBuf, lockRankTraceStrings, lockRankFin, lockRankGcBitsArenas, lockRankMspanSpecial, lockRankSpanSetSpine, lockRankProfInsert, lockRankProfBlock, lockRankProfMemActive, lockRankProfMemFuture, lockRankGscan, lockRankStackpool, lockRankStackLarge, lockRankWbufSpans, lockRankMheap},
lockRankTraceStackTab: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankDefer, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankNetpollInit, lockRankHchan, lockRankNotifyList, lockRankSudog, lockRankRwmutexW, lockRankRwmutexR, lockRankRoot, lockRankItab, lockRankReflectOffs, lockRankTraceBuf, lockRankTraceStrings, lockRankFin, lockRankGcBitsArenas, lockRankMspanSpecial, lockRankSpanSetSpine, lockRankProfInsert, lockRankProfBlock, lockRankProfMemActive, lockRankProfMemFuture, lockRankGscan, lockRankStackpool, lockRankStackLarge, lockRankWbufSpans, lockRankMheap, lockRankTrace},
lockRankPanic: {},
lockRankDeadlock: {lockRankPanic, lockRankDeadlock},
lockRankSysmon: {},
lockRankScavenge: {lockRankSysmon},
lockRankForcegc: {lockRankSysmon},
lockRankDefer: {},
lockRankSweepWaiters: {},
lockRankAssistQueue: {},
lockRankSweep: {},
lockRankPollDesc: {},
lockRankCpuprof: {},
lockRankSched: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof},
lockRankAllg: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched},
lockRankAllp: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched},
lockRankTimers: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllp, lockRankTimers},
lockRankNetpollInit: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllp, lockRankTimers},
lockRankHchan: {lockRankSysmon, lockRankScavenge, lockRankSweep, lockRankHchan},
lockRankNotifyList: {},
lockRankSudog: {lockRankSysmon, lockRankScavenge, lockRankSweep, lockRankHchan, lockRankNotifyList},
lockRankRwmutexW: {},
lockRankRwmutexR: {lockRankSysmon, lockRankRwmutexW},
lockRankRoot: {},
lockRankItab: {},
lockRankReflectOffs: {lockRankItab},
lockRankUserArenaState: {},
lockRankTraceBuf: {lockRankSysmon, lockRankScavenge},
lockRankTraceStrings: {lockRankSysmon, lockRankScavenge, lockRankTraceBuf},
lockRankFin: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankHchan, lockRankNotifyList, lockRankItab, lockRankReflectOffs, lockRankUserArenaState, lockRankTraceBuf, lockRankTraceStrings},
lockRankGcBitsArenas: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankHchan, lockRankNotifyList, lockRankItab, lockRankReflectOffs, lockRankUserArenaState, lockRankTraceBuf, lockRankTraceStrings},
lockRankMheapSpecial: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankHchan, lockRankNotifyList, lockRankItab, lockRankReflectOffs, lockRankUserArenaState, lockRankTraceBuf, lockRankTraceStrings},
lockRankMspanSpecial: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankHchan, lockRankNotifyList, lockRankItab, lockRankReflectOffs, lockRankUserArenaState, lockRankTraceBuf, lockRankTraceStrings},
lockRankSpanSetSpine: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankHchan, lockRankNotifyList, lockRankItab, lockRankReflectOffs, lockRankUserArenaState, lockRankTraceBuf, lockRankTraceStrings},
lockRankProfInsert: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankHchan, lockRankNotifyList, lockRankItab, lockRankReflectOffs, lockRankUserArenaState, lockRankTraceBuf, lockRankTraceStrings},
lockRankProfBlock: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankHchan, lockRankNotifyList, lockRankItab, lockRankReflectOffs, lockRankUserArenaState, lockRankTraceBuf, lockRankTraceStrings},
lockRankProfMemActive: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankHchan, lockRankNotifyList, lockRankItab, lockRankReflectOffs, lockRankUserArenaState, lockRankTraceBuf, lockRankTraceStrings},
lockRankProfMemFuture: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankHchan, lockRankNotifyList, lockRankItab, lockRankReflectOffs, lockRankUserArenaState, lockRankTraceBuf, lockRankTraceStrings, lockRankProfMemActive},
lockRankGscan: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankNetpollInit, lockRankHchan, lockRankNotifyList, lockRankRoot, lockRankItab, lockRankReflectOffs, lockRankUserArenaState, lockRankTraceBuf, lockRankTraceStrings, lockRankFin, lockRankGcBitsArenas, lockRankSpanSetSpine, lockRankProfInsert, lockRankProfBlock, lockRankProfMemActive, lockRankProfMemFuture},
lockRankStackpool: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankNetpollInit, lockRankHchan, lockRankNotifyList, lockRankRwmutexW, lockRankRwmutexR, lockRankRoot, lockRankItab, lockRankReflectOffs, lockRankUserArenaState, lockRankTraceBuf, lockRankTraceStrings, lockRankFin, lockRankGcBitsArenas, lockRankSpanSetSpine, lockRankProfInsert, lockRankProfBlock, lockRankProfMemActive, lockRankProfMemFuture, lockRankGscan},
lockRankStackLarge: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankNetpollInit, lockRankHchan, lockRankNotifyList, lockRankRoot, lockRankItab, lockRankReflectOffs, lockRankUserArenaState, lockRankTraceBuf, lockRankTraceStrings, lockRankFin, lockRankGcBitsArenas, lockRankSpanSetSpine, lockRankProfInsert, lockRankProfBlock, lockRankProfMemActive, lockRankProfMemFuture, lockRankGscan},
lockRankHchanLeaf: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankNetpollInit, lockRankHchan, lockRankNotifyList, lockRankRoot, lockRankItab, lockRankReflectOffs, lockRankUserArenaState, lockRankTraceBuf, lockRankTraceStrings, lockRankFin, lockRankGcBitsArenas, lockRankSpanSetSpine, lockRankProfInsert, lockRankProfBlock, lockRankProfMemActive, lockRankProfMemFuture, lockRankGscan, lockRankHchanLeaf},
lockRankWbufSpans: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankDefer, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankNetpollInit, lockRankHchan, lockRankNotifyList, lockRankSudog, lockRankRoot, lockRankItab, lockRankReflectOffs, lockRankUserArenaState, lockRankTraceBuf, lockRankTraceStrings, lockRankFin, lockRankGcBitsArenas, lockRankMspanSpecial, lockRankSpanSetSpine, lockRankProfInsert, lockRankProfBlock, lockRankProfMemActive, lockRankProfMemFuture, lockRankGscan},
lockRankMheap: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankDefer, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankNetpollInit, lockRankHchan, lockRankNotifyList, lockRankSudog, lockRankRwmutexW, lockRankRwmutexR, lockRankRoot, lockRankItab, lockRankReflectOffs, lockRankUserArenaState, lockRankTraceBuf, lockRankTraceStrings, lockRankFin, lockRankGcBitsArenas, lockRankMspanSpecial, lockRankSpanSetSpine, lockRankProfInsert, lockRankProfBlock, lockRankProfMemActive, lockRankProfMemFuture, lockRankGscan, lockRankStackpool, lockRankStackLarge, lockRankWbufSpans},
lockRankGlobalAlloc: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankDefer, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankNetpollInit, lockRankHchan, lockRankNotifyList, lockRankSudog, lockRankRwmutexW, lockRankRwmutexR, lockRankRoot, lockRankItab, lockRankReflectOffs, lockRankUserArenaState, lockRankTraceBuf, lockRankTraceStrings, lockRankFin, lockRankGcBitsArenas, lockRankMheapSpecial, lockRankMspanSpecial, lockRankSpanSetSpine, lockRankProfInsert, lockRankProfBlock, lockRankProfMemActive, lockRankProfMemFuture, lockRankGscan, lockRankStackpool, lockRankStackLarge, lockRankWbufSpans, lockRankMheap},
lockRankTrace: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankDefer, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankNetpollInit, lockRankHchan, lockRankNotifyList, lockRankSudog, lockRankRwmutexW, lockRankRwmutexR, lockRankRoot, lockRankItab, lockRankReflectOffs, lockRankUserArenaState, lockRankTraceBuf, lockRankTraceStrings, lockRankFin, lockRankGcBitsArenas, lockRankMspanSpecial, lockRankSpanSetSpine, lockRankProfInsert, lockRankProfBlock, lockRankProfMemActive, lockRankProfMemFuture, lockRankGscan, lockRankStackpool, lockRankStackLarge, lockRankWbufSpans, lockRankMheap},
lockRankTraceStackTab: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankDefer, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankPollDesc, lockRankCpuprof, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankNetpollInit, lockRankHchan, lockRankNotifyList, lockRankSudog, lockRankRwmutexW, lockRankRwmutexR, lockRankRoot, lockRankItab, lockRankReflectOffs, lockRankUserArenaState, lockRankTraceBuf, lockRankTraceStrings, lockRankFin, lockRankGcBitsArenas, lockRankMspanSpecial, lockRankSpanSetSpine, lockRankProfInsert, lockRankProfBlock, lockRankProfMemActive, lockRankProfMemFuture, lockRankGscan, lockRankStackpool, lockRankStackLarge, lockRankWbufSpans, lockRankMheap, lockRankTrace},
lockRankPanic: {},
lockRankDeadlock: {lockRankPanic, lockRankDeadlock},
}


@ -452,6 +452,14 @@ func mallocinit() {
//
// On AIX, mmaps start at 0x0A00000000000000 for 64-bit
// processes.
//
// Space mapped for user arenas comes immediately after the range
// originally reserved for the regular heap when race mode is not
// enabled because user arena chunks can never be used for regular heap
// allocations and we want to avoid fragmenting the address space.
//
// In race mode we have no choice but to just use the same hints because
// the race detector requires that the heap be mapped contiguously.
for i := 0x7f; i >= 0; i-- {
var p uintptr
switch {
@ -477,9 +485,16 @@ func mallocinit() {
default:
p = uintptr(i)<<40 | uintptrMask&(0x00c0<<32)
}
// Switch to generating hints for user arenas if we've gone
// through about half the hints. In race mode, take only about
// a quarter; we don't have very much space to work with.
hintList := &mheap_.arenaHints
if (!raceenabled && i > 0x3f) || (raceenabled && i > 0x5f) {
hintList = &mheap_.userArena.arenaHints
}
hint := (*arenaHint)(mheap_.arenaHintAlloc.alloc())
hint.addr = p
hint.next, mheap_.arenaHints = mheap_.arenaHints, hint
hint.next, *hintList = *hintList, hint
}
} else {
// On a 32-bit machine, we're much more concerned
@ -547,6 +562,14 @@ func mallocinit() {
hint := (*arenaHint)(mheap_.arenaHintAlloc.alloc())
hint.addr = p
hint.next, mheap_.arenaHints = mheap_.arenaHints, hint
// Place the hint for user arenas just after the large reservation.
//
// While this potentially competes with the hint above, in practice we probably
// aren't going to be getting this far anyway on 32-bit platforms.
userArenaHint := (*arenaHint)(mheap_.arenaHintAlloc.alloc())
userArenaHint.addr = p
userArenaHint.next, mheap_.userArena.arenaHints = mheap_.userArena.arenaHints, userArenaHint
}
}
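
As a rough illustration of the two hint lists being built above: the runtime keeps singly linked lists of candidate mapping addresses and always prepends new hints. The sketch below is a standalone restatement under simplified assumptions (the arenaHint shape, the addHint helper, and the address formula are illustrative, not the runtime's); it only shows how the index range decides which list a hint lands on.

	// Simplified stand-in for the runtime's arenaHint; illustrative only.
	type arenaHint struct {
		addr uintptr
		next *arenaHint
	}

	// addHint prepends a hint, mirroring "hint.next, *hintList = *hintList, hint".
	func addHint(list **arenaHint, addr uintptr) {
		h := &arenaHint{addr: addr}
		h.next, *list = *list, h
	}

	// populateHints routes the upper half of the hint indices to the user
	// arena list and the rest to the regular heap list (the non-race split
	// described in the hunk above). The address formula here is only roughly
	// the shape of the real, platform-dependent one.
	func populateHints() (heapHints, userArenaHints *arenaHint) {
		for i := 0x7f; i >= 0; i-- {
			list := &heapHints
			if i > 0x3f {
				list = &userArenaHints
			}
			addHint(list, uintptr(i)<<40)
		}
		return
	}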
@ -755,8 +778,6 @@ retry:
case p == 0:
return nil, 0
case p&(align-1) == 0:
// We got lucky and got an aligned region, so we can
// use the whole thing.
return unsafe.Pointer(p), size + align
case GOOS == "windows":
// On Windows we can't release pieces of a


@ -718,9 +718,9 @@ func typeBitsBulkBarrier(typ *_type, dst, src, size uintptr) {
// initHeapBits initializes the heap bitmap for a span.
// If this is a span of single pointer allocations, it initializes all
// words to pointer.
func (s *mspan) initHeapBits() {
if s.spanclass.noscan() {
// words to pointer. If forceClear is true, clears all bits.
func (s *mspan) initHeapBits(forceClear bool) {
if forceClear || s.spanclass.noscan() {
// Set all the pointer bits to zero. We do this once
// when the span is allocated so we don't have to do it
// for each object allocation.


@ -252,7 +252,7 @@ func (c *mcache) allocLarge(size uintptr, noscan bool) *mspan {
// visible to the background sweeper.
mheap_.central[spc].mcentral.fullSwept(mheap_.sweepgen).push(s)
s.limit = s.base() + size
s.initHeapBits()
s.initHeapBits(false)
return s
}


@ -252,6 +252,6 @@ func (c *mcentral) grow() *mspan {
// n := (npages << _PageShift) / size
n := s.divideByElemSize(npages << _PageShift)
s.limit = s.base() + size*n
s.initHeapBits()
s.initHeapBits(false)
return s
}


@ -1162,6 +1162,15 @@ func gcMarkTermination() {
printunlock()
}
// Set any arena chunks that were deferred to fault.
lock(&userArenaState.lock)
faultList := userArenaState.fault
userArenaState.fault = nil
unlock(&userArenaState.lock)
for _, lc := range faultList {
lc.mspan.setUserArenaChunkToFault()
}
semrelease(&worldsema)
semrelease(&gcsema)
// Careful: another GC cycle may start now.

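The gcMarkTermination hunk above uses a detach-then-process pattern: hold the lock only long enough to take ownership of the whole deferred list, then set each chunk to fault with the lock released. A hedged, standalone restatement with illustrative types (chunk, setToFault, and faultState stand in for the runtime's mspan, setUserArenaChunkToFault, and userArenaState):

	import "sync"

	// Illustrative stand-ins; not the runtime's types.
	type chunk struct{}

	func (c *chunk) setToFault() { /* stand-in for setUserArenaChunkToFault */ }

	var faultState struct {
		lock  sync.Mutex
		fault []*chunk
	}

	// drainFaultList detaches the deferred list under the lock and then
	// faults the chunks without holding it, so the lock is never held
	// across the comparatively slow page-protection work.
	func drainFaultList() {
		faultState.lock.Lock()
		list := faultState.fault
		faultState.fault = nil
		faultState.lock.Unlock()

		for _, c := range list {
			c.setToFault()
		}
	}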

@ -602,13 +602,14 @@ func (sl *sweepLocked) sweep(preserve bool) bool {
if debug.clobberfree != 0 {
clobberfree(unsafe.Pointer(x), size)
}
if raceenabled {
// User arenas are handled on explicit free.
if raceenabled && !s.isUserArenaChunk {
racefree(unsafe.Pointer(x), size)
}
if msanenabled {
if msanenabled && !s.isUserArenaChunk {
msanfree(unsafe.Pointer(x), size)
}
if asanenabled {
if asanenabled && !s.isUserArenaChunk {
asanpoison(unsafe.Pointer(x), size)
}
}
@ -682,6 +683,41 @@ func (sl *sweepLocked) sweep(preserve bool) bool {
// to go so release the span.
atomic.Store(&s.sweepgen, sweepgen)
if s.isUserArenaChunk {
if preserve {
// This is a case that should never be handled by a sweeper that
// preserves the span for reuse.
throw("sweep: tried to preserve a user arena span")
}
if nalloc > 0 {
// There still exist pointers into the span or the span hasn't been
// freed yet. It's not ready to be reused. Put it back on the
// full swept list for the next cycle.
mheap_.central[spc].mcentral.fullSwept(sweepgen).push(s)
return false
}
// It's only at this point that the sweeper doesn't actually need to look
// at this arena anymore, so subtract from pagesInUse now.
mheap_.pagesInUse.Add(-s.npages)
s.state.set(mSpanDead)
// The arena is ready to be recycled. Remove it from the quarantine list
// and place it on the ready list. Don't add it back to any sweep lists.
systemstack(func() {
// It's the arena code's responsibility to get the chunk on the quarantine
// list by the time all references to the chunk are gone.
if s.list != &mheap_.userArena.quarantineList {
throw("user arena span is on the wrong list")
}
lock(&mheap_.lock)
mheap_.userArena.quarantineList.remove(s)
mheap_.userArena.readyList.insert(s)
unlock(&mheap_.lock)
})
return false
}
if spc.sizeclass() != 0 {
// Handle spans for small objects.
if nfreed > 0 {

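Read together with the quarantineList and readyList fields introduced in the mheap hunk below, the sweep logic above implies a small per-chunk state machine: a freed chunk is quarantined and set to fault, stays quarantined while the GC still finds pointers into it, and becomes reusable only once none remain. A hedged, standalone sketch of that lifecycle (names are illustrative, not the runtime's):

	type chunkState int

	const (
		chunkInUse       chunkState = iota // handed out by an arena
		chunkQuarantined                   // freed and set to fault; pointers may remain
		chunkReady                         // quarantine over; address space reusable
	)

	// nextState mirrors the sweep decision above: a quarantined chunk is
	// re-queued for the next cycle while pointers into it survive, and is
	// moved to the ready list once the sweeper sees none (nalloc == 0).
	func nextState(s chunkState, freed, pointersRemain bool) chunkState {
		switch {
		case s == chunkInUse && freed:
			return chunkQuarantined
		case s == chunkQuarantined && !pointersRemain:
			return chunkReady
		default:
			return s
		}
	}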

@ -203,6 +203,25 @@ type mheap struct {
speciallock mutex // lock for special record allocators.
arenaHintAlloc fixalloc // allocator for arenaHints
// User arena state.
//
// Protected by mheap_.lock.
userArena struct {
// arenaHints is a list of addresses at which to attempt to
// add more heap arenas for user arena chunks. This is initially
// populated with a set of general hint addresses, and grown with
// the bounds of actual heap arena ranges.
arenaHints *arenaHint
// quarantineList is a list of user arena spans that have been set to fault, but
// are waiting for all pointers into them to go away. Sweeping handles
// identifying when this is true, and moves the span to the ready list.
quarantineList mSpanList
// readyList is a list of empty user arena spans that are ready for reuse.
readyList mSpanList
}
unused *specialfinalizer // never set, just here to force the specialfinalizer type into DWARF
}
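
To make the relationship between the three fields above concrete, here is a hedged, standalone sketch of how chunk acquisition plausibly consults them: reuse a chunk from the ready list when one exists, otherwise reserve fresh address space from the user arena hint list. A plain slice stands in for mSpanList, and the types and the nextChunk name are illustrative, not the runtime's.

	// Illustrative stand-ins for the runtime's structures.
	type arenaChunk struct{ base uintptr }

	type userArenaLists struct {
		ready []*arenaChunk // counterpart of readyList
		hints []uintptr     // counterpart of arenaHints: next addresses to try
	}

	// nextChunk prefers a recycled chunk whose quarantine has ended and
	// only consumes a hint (that is, maps new address space) when none is
	// available. Quarantined chunks are never touched here; sweeping is
	// what eventually moves them onto the ready list.
	func nextChunk(ua *userArenaLists) *arenaChunk {
		if n := len(ua.ready); n > 0 {
			c := ua.ready[n-1]
			ua.ready = ua.ready[:n-1]
			return c
		}
		if len(ua.hints) == 0 {
			return nil
		}
		addr := ua.hints[0]
		ua.hints = ua.hints[1:]
		return &arenaChunk{base: addr} // stand-in for mapping a new chunk at addr
	}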
@ -352,7 +371,6 @@ var mSpanStateNames = []string{
"mSpanDead",
"mSpanInUse",
"mSpanManual",
"mSpanFree",
}
// mSpanStateBox holds an atomic.Uint8 to provide atomic operations on
@ -462,11 +480,13 @@ type mspan struct {
spanclass spanClass // size class and noscan (uint8)
state mSpanStateBox // mSpanInUse etc; accessed atomically (get/set methods)
needzero uint8 // needs to be zeroed before allocation
isUserArenaChunk bool // whether or not this span represents a user arena
allocCountBeforeCache uint16 // a copy of allocCount that is stored just before this span is cached
elemsize uintptr // computed from sizeclass or from npages
limit uintptr // end of data in span
speciallock mutex // guards specials list
specials *special // linked list of special records sorted by offset.
userArenaChunkFree addrRange // interval for managing chunk allocation
}
func (s *mspan) base() uintptr {
@ -1206,6 +1226,7 @@ func (h *mheap) allocSpan(npages uintptr, typ spanAllocType, spanclass spanClass
base = alignUp(base, physPageSize)
scav = h.pages.allocRange(base, npages)
}
if base == 0 {
// Try to acquire a base address.
base, scav = h.pages.alloc(npages)
@ -1550,6 +1571,9 @@ func (h *mheap) freeSpanLocked(s *mspan, typ spanAllocType) {
throw("mheap.freeSpanLocked - invalid stack free")
}
case mSpanInUse:
if s.isUserArenaChunk {
throw("mheap.freeSpanLocked - invalid free of user arena chunk")
}
if s.allocCount != 0 || s.sweepgen != h.sweepgen {
print("mheap.freeSpanLocked - span ", s, " ptr ", hex(s.base()), " allocCount ", s.allocCount, " sweepgen ", s.sweepgen, "/", h.sweepgen, "\n")
throw("mheap.freeSpanLocked - invalid free")


@ -83,6 +83,9 @@ NONE
< itab
< reflectOffs;
# User arena state
NONE < userArenaState;
# Tracing without a P uses a global trace buffer.
scavenge
# Above TRACEGLOBAL can emit a trace event without a P.
@ -100,7 +103,8 @@ allg,
notifyList,
reflectOffs,
timers,
traceStrings
traceStrings,
userArenaState
# Above MALLOC are things that can allocate memory.
< MALLOC
# Below MALLOC is the malloc implementation.


@ -70,6 +70,30 @@ func (a addrRange) subtract(b addrRange) addrRange {
return a
}
// takeFromFront takes len bytes from the front of the address range, aligning
// the base to align first. On success, returns the aligned start of the region
// taken and true.
func (a *addrRange) takeFromFront(len uintptr, align uint8) (uintptr, bool) {
base := alignUp(a.base.addr(), uintptr(align)) + len
if base > a.limit.addr() {
return 0, false
}
a.base = offAddr{base}
return base - len, true
}
// takeFromBack takes len bytes from the end of the address range, aligning
// the limit to align after subtracting len. On success, returns the aligned
// start of the region taken and true.
func (a *addrRange) takeFromBack(len uintptr, align uint8) (uintptr, bool) {
limit := alignDown(a.limit.addr()-len, uintptr(align))
if a.base.addr() > limit {
return 0, false
}
a.limit = offAddr{limit}
return limit, true
}
// removeGreaterEqual removes all addresses in a greater than or equal
// to addr and returns the new range.
func (a addrRange) removeGreaterEqual(addr uintptr) addrRange {

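Because the alignment arithmetic in takeFromFront is easy to misread, here is a hedged, runnable restatement on plain uintptrs (the runtime's addrRange, offAddr, and alignUp are replaced by illustrative equivalents), together with one worked case:

	package main

	import "fmt"

	// alignUp rounds n up to a multiple of align (a power of two),
	// matching the helper used in the hunk above.
	func alignUp(n, align uintptr) uintptr {
		return (n + align - 1) &^ (align - 1)
	}

	// takeFromFront restates the logic above: align the base up, check
	// that len bytes still fit below the limit, advance the base past the
	// region, and return the region's aligned start.
	func takeFromFront(base, limit *uintptr, length, align uintptr) (uintptr, bool) {
		start := alignUp(*base, align)
		if start+length > *limit {
			return 0, false
		}
		*base = start + length
		return start, true
	}

	func main() {
		base, limit := uintptr(0x1003), uintptr(0x2000)
		p, ok := takeFromFront(&base, &limit, 64, 16)
		fmt.Printf("%#x %v %#x\n", p, ok, base) // prints 0x1010 true 0x1050
	}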

@ -99,7 +99,14 @@ func sigpanic() {
if gp.paniconfault {
panicmemAddr(gp.sigcode1)
}
print("unexpected fault address ", hex(gp.sigcode1), "\n")
if inUserArenaChunk(gp.sigcode1) {
// We could check that the arena chunk is explicitly set to fault,
// but the fact that we faulted on accessing it is enough to prove
// that it is.
print("accessed data from freed user arena ", hex(gp.sigcode1), "\n")
} else {
print("unexpected fault address ", hex(gp.sigcode1), "\n")
}
throw("fault")
case _SIGTRAP:
if gp.paniconfault {


@ -840,7 +840,14 @@ func sigpanic() {
if gp.paniconfault {
panicmemAddr(gp.sigcode1)
}
print("unexpected fault address ", hex(gp.sigcode1), "\n")
if inUserArenaChunk(gp.sigcode1) {
// We could check that the arena chunk is explicitly set to fault,
// but the fact that we faulted on accessing it is enough to prove
// that it is.
print("accessed data from freed user arena ", hex(gp.sigcode1), "\n")
} else {
print("unexpected fault address ", hex(gp.sigcode1), "\n")
}
throw("fault")
case _SIGFPE:
switch gp.sigcode0 {


@ -259,7 +259,14 @@ func sigpanic() {
if gp.paniconfault {
panicmemAddr(gp.sigcode1)
}
print("unexpected fault address ", hex(gp.sigcode1), "\n")
if inUserArenaChunk(gp.sigcode1) {
// We could check that the arena chunk is explicitly set to fault,
// but the fact that we faulted on accessing it is enough to prove
// that it is.
print("accessed data from freed user arena ", hex(gp.sigcode1), "\n")
} else {
print("unexpected fault address ", hex(gp.sigcode1), "\n")
}
throw("fault")
case _EXCEPTION_INT_DIVIDE_BY_ZERO:
panicdivide()