runtime: skip TestCPUMetricsSleep as flaky

This test is fundamentally flaky because of a mismatch between how
internal idle time is calculated and how the test expects it to be
calculated. It's unclear how to resolve this mismatch, given that it's
perfectly valid for a goroutine to remain asleep while background
goroutines (e.g. the scavenger) run. In practice, we might be able to
set some generous lower-bound, but until we can confirm that on the
affected platforms, skip the test as flaky unconditionally.

For #60276.
For #60376.

Change-Id: Iffd5c4be10cf8ae8a6c285b61fcc9173235fbb2a
Reviewed-on: https://go-review.googlesource.com/c/go/+/497876
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
This commit is contained in:
Michael Anthony Knyszek 2023-05-24 13:51:33 +00:00 committed by Michael Knyszek
parent 227e712685
commit 220c396ea8
1 changed files with 18 additions and 2 deletions

View File

@ -5,6 +5,7 @@
package runtime_test
import (
"internal/testenv"
"reflect"
"runtime"
"runtime/debug"
@ -645,6 +646,18 @@ func TestCPUMetricsSleep(t *testing.T) {
// test is basically meaningless on this platform.
t.Skip("wasip1 currently busy-waits in idle time; test not applicable")
}
// Unconditionally skip this test as flaky.
//
// There's a fundamental issue with this test, which is that there's no
// guarantee the application will go idle; background goroutines and
// time spent in the scheduler going to sleep can always erode idle time
// sufficiently such that minimum idle time (or maximum user time) stays
// within some threshold.
//
// Leave this as skipped while we figure out a better way to check this.
testenv.SkipFlaky(t, 60376)
names := []string{
"/cpu/classes/idle:cpu-seconds",
@ -681,8 +694,11 @@ func TestCPUMetricsSleep(t *testing.T) {
metrics.Read(m2)
// If the bug we expect is happening, then the Sleep CPU time will be accounted for
// as user time rather than idle time. Because we're doing this on one core, the
// maximum amount of time that can be attributed to user time is the time spent asleep.
// as user time rather than idle time.
//
// TODO(mknyszek): This number here is wrong. Background goroutines and just slow
// platforms spending a non-trivial amount of time in the scheduler doing things
// could easily erode idle time beyond this minimum.
minIdleCPUSeconds := dur.Seconds() * float64(runtime.GOMAXPROCS(-1))
if dt := m2[0].Value.Float64() - m1[0].Value.Float64(); dt < minIdleCPUSeconds {