runtime: simplify and optimize typedslicecopy

Currently, typedslicecopy meticulously performs a typedmemmove on
every element of the slice. This probably used to be necessary because
we only had an individual element's type, but now we use the heap
bitmap, so we only need to know whether the type has any pointers and
how big it is. Hence, this CL rewrites typedslicecopy to simply
perform one bulk barrier and one memmove.

This also has a side-effect of eliminating two unnecessary write
barriers per slice element that were coming from updates to dstp and
srcp, which were stored in the parent stack frame. However, most of
the win comes from eliminating the loops.

name                 old time/op  new time/op  delta
BulkWriteBarrier-12  7.83ns ±10%  7.33ns ± 6%  -6.45%  (p=0.000 n=20+20)

Updates #22460.

Change-Id: Id3450e9f36cc8e0892f268319b136f0d8f5464b8
Reviewed-on: https://go-review.googlesource.com/73831
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
This commit is contained in:
Austin Clements 2017-10-27 15:30:19 -04:00
parent f96b95bcd1
commit 6a5f1e58ed
1 changed files with 6 additions and 34 deletions

View File

@ -341,41 +341,13 @@ func typedslicecopy(typ *_type, dst, src slice) int {
// compiler only emits calls to typedslicecopy for types with pointers,
// and growslice and reflect_typedslicecopy check for pointers
// before calling typedslicecopy.
if !writeBarrier.needed {
memmove(dstp, srcp, uintptr(n)*typ.size)
return n
size := uintptr(n) * typ.size
if writeBarrier.needed {
bulkBarrierPreWrite(uintptr(dstp), uintptr(srcp), size)
}
systemstack(func() {
if uintptr(srcp) < uintptr(dstp) && uintptr(srcp)+uintptr(n)*typ.size > uintptr(dstp) {
// Overlap with src before dst.
// Copy backward, being careful not to move dstp/srcp
// out of the array they point into.
dstp = add(dstp, uintptr(n-1)*typ.size)
srcp = add(srcp, uintptr(n-1)*typ.size)
i := 0
for {
typedmemmove(typ, dstp, srcp)
if i++; i >= n {
break
}
dstp = add(dstp, -typ.size)
srcp = add(srcp, -typ.size)
}
} else {
// Copy forward, being careful not to move dstp/srcp
// out of the array they point into.
i := 0
for {
typedmemmove(typ, dstp, srcp)
if i++; i >= n {
break
}
dstp = add(dstp, typ.size)
srcp = add(srcp, typ.size)
}
}
})
// See typedmemmove for a discussion of the race between the
// barrier and memmove.
memmove(dstp, srcp, size)
return n
}