diff --git a/doc/diagnostics.html b/doc/diagnostics.html new file mode 100644 index 0000000000..74768ce750 --- /dev/null +++ b/doc/diagnostics.html @@ -0,0 +1,439 @@ + + +
+The Go ecosystem provides a large suite of APIs and tools to +diagnose logic and performance problems in Go programs. This page +summarizes the available tools and helps Go users pick the right one +for their specific problem. +
+ ++Diagnostics solutions can be categorized into the following groups: +
+ ++Note: Some diagnostics tools may interfere with each other. For example, precise +memory profiling skews CPU profiles and goroutine blocking profiling affects scheduler +trace. Use tools in isolation to get more precise info. +
+ +
+Profiling is useful for identifying expensive or frequently called sections
+of code. The Go runtime provides
+profiling data in the format expected by the
+pprof visualization tool.
+The profiling data can be collected during testing
+via go test or endpoints made available from the
+net/http/pprof package. Users need to collect the profiling data and use pprof tools to filter
+and visualize the top code paths.
+
Predefined profiles provided by the runtime/pprof package:
+ +runtime.SetBlockProfileRate to enable it.
+runtime.SetMutexProfileFraction to enable it.
+What other profilers can I use to profile Go programs?
+On Linux, perf tools +can be used to profile Go programs. Perf can profile +and unwind cgo/SWIG code and the kernel, so it can be useful to get insights into +native/kernel performance bottlenecks. On macOS, the +Instruments +suite can be used to profile Go programs. +
+ +Can I profile my production services?
+Yes. It is safe to profile programs in production, but enabling +some profiles (e.g. the CPU profile) adds cost. You should expect to +see some performance degradation. The performance penalty can be estimated +by measuring the overhead of the profiler before turning it on in +production. +
+You may want to periodically profile your production services. +Especially in a system with many replicas of a single process, selecting +a random replica periodically is a safe option. +Select a production process, profile it for +X seconds every Y seconds, and save the results for visualization and +analysis; then repeat periodically. Results may be reviewed manually and/or +automatically to find problems. +Collection of profiles can interfere with each other, +so it is recommended to collect only a single profile at a time. +
+ ++What are the best ways to visualize the profiling data? +
+ +
+The Go tools provide text, graph, and callgrind
+visualization of the profile data via
+go tool pprof.
+Read Profiling Go programs
+to see them in action.
+
+
+
+Listing of the most expensive calls as text.
+
+
+
+Visualization of the most expensive calls as a graph.
+
Weblist view displays the expensive parts of the source line by line in
+an HTML page. In the following example, 530ms is spent in the
+runtime.concatstrings and the cost of each line is presented
+in the listing.
+
+
+Visualization of the most expensive calls as weblist.
+
+Another way to visualize profile data is a flame graph. +Flame graphs allow you to move along a specific ancestry path, so you can zoom +in and out of specific sections of code more easily. +
+ +
+
+
+Flame graphs offer a visualization to spot the most expensive code paths. +
+
Am I restricted to the built-in profiles?
+In addition to what is provided by the runtime, Go users can create +custom profiles via pprof.Profile +and use the existing tools to examine them. +
+ +Can I serve the profiler handlers (/debug/pprof/...) on a different path and port?
+ +
+Yes. The net/http/pprof package registers its handlers to the default
+mux by default, but you can also register them yourself by using the handlers
+exported from the package.
+
+For example, the following will serve the pprof.Profile +handler on :7777 at /custom_debug_path/profile: +
+ ++
+mux := http.NewServeMux()
+mux.HandleFunc("/custom_debug_path/profile", pprof.Profile)
+http.ListenAndServe(":7777", mux)
+
+
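With the server above running, the profile can then be fetched over HTTP; the duration parameter is illustrative:

```shell
# Collect a 5-second CPU profile from the custom endpoint and open it in pprof.
curl -o cpu.prof "http://localhost:7777/custom_debug_path/profile?seconds=5"
go tool pprof cpu.prof
```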
+
+Tracing is a way to instrument code to analyze latency throughout the +lifecycle of a chain of calls. Go provides the +golang.org/x/net/trace +package as a minimal tracing backend per Go node and a minimal +instrumentation library with a simple dashboard. Go also provides +an execution tracer to trace the runtime events within an interval. +
+ +Tracing enables us to:
+ ++In monolithic systems, it's relatively easy to collect diagnostic data +from the building blocks of a program. All modules live within one +process and share common resources to report logs, errors, and other +diagnostic information. Once your system grows beyond a single process and +starts to become distributed, it becomes harder to follow a call starting +from the front-end web server to all of its back-ends until a response is +returned back to the user. This is where distributed tracing plays a big +role to instrument and analyze your production systems. +
+ ++Distributed tracing is a way to instrument code to analyze latency throughout +the lifecycle of a user request. When a system is distributed and when +conventional profiling and debugging tools don’t scale, you might want +to use distributed tracing tools to analyze the performance of your user +requests and RPCs. +
+ +Distributed tracing enables us to:
+ +The Go ecosystem provides various distributed tracing libraries per tracing system +and backend-agnostic ones.
+ + +Is there a way to automatically intercept each function call and create traces?
+ ++Go doesn’t provide a way to automatically intercept every function call and create +trace spans. You need to manually instrument your code to create, end, and annotate spans. +
+ +How should I propagate trace headers in Go libraries?
+ +
+You can propagate trace identifiers and tags in the context.Context.
+There is no canonical trace key or common representation of trace headers
+in the industry yet. Each tracing provider is responsible for providing propagation
+utilities in their Go libraries.
+
+What other low-level events from the standard library or +runtime can be included in a trace? +
+The standard library and runtime are trying to expose several additional APIs +that notify of low-level internal events. For example, httptrace.ClientTrace +provides APIs to follow low-level events in the life cycle of an outgoing request. +There is an ongoing effort to retrieve low-level runtime events from +the runtime execution tracer and allow users to define and record their own user events. +
+ ++Debugging is the process of identifying why a program misbehaves. +Debuggers allow us to understand a program’s execution flow and current state. +There are several styles of debugging; this section will only focus on attaching +a debugger to a program and core dump debugging. +
+ +Go users mostly use the following debuggers:
+ +How well do debuggers work with Go programs?
+As of Go 1.9, the DWARF info generated by the gc compiler is not complete +and sometimes makes debugging harder. There is an ongoing effort to improve the +DWARF information to help the debuggers display more accurate information. +Until those improvements are in, you may prefer to disable compiler +optimizations during development for more accuracy. To disable optimizations, +use the "-N -l" compiler flags. For example, the following command builds +a package with no compiler optimizations: +
+
+$ go build -gcflags="-N -l" ++ + +
+As of Go 1.10, the Go binaries will have the required DWARF information +for accurate debugging. To enable the DWARF improvements, use the following +compiler flags and use GDB until Delve supports location lists: +
+ ++
+$ go build -gcflags="-dwarflocationlists=true" ++ + +
What’s the recommended debugger user interface?
+Even though both delve and gdb provide CLIs, most editor integrations +and IDEs provide debugging-specific user interfaces. Please refer to +the editors guide to see the options +with debugger UI support. +
+ +Is it possible to do postmortem debugging with Go programs?
+A core dump file is a file that contains the memory dump of a running +process and its process status. It is primarily used for post-mortem +debugging of a program, as well as to understand its state +while it is still running. These two cases make core dump +debugging a good diagnostic aid for postmortem analysis of production +services. It is possible to obtain core files from Go programs and +use delve or gdb to debug them; see the +core dump debugging +page for a step-by-step guide. +
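On Linux, a sketch of the setup for collecting a core dump from a Go program; the binary name ./myprog is illustrative:

```shell
ulimit -c unlimited        # allow the OS to write core files
export GOTRACEBACK=crash   # make the Go runtime abort and dump core on a fatal error
./myprog                   # a crash now leaves a core file
dlv core ./myprog core     # open the dump with Delve (or: gdb ./myprog core)
```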
+ ++The runtime provides stats and reporting of internal events for +users to diagnose performance and utilization problems at the +runtime level. +
+ ++Users can monitor these stats to better understand the overall +health and performance of Go programs. +Some frequently monitored stats and states: +
+ +runtime.ReadMemStats
+reports the metrics related to heap
+allocation and garbage collection. Memory stats are useful for
+monitoring how much memory resources a process is consuming,
+whether the process can utilize memory well, and to catch
+memory leaks.debug.ReadGCStats
+reads statistics about garbage collection.
+It is useful to see how much of the resources are spent on GC pauses.
+It also reports a timeline of garbage collector pauses and pause time percentiles.debug.Stack
+returns the current stack trace. Stack trace
+is useful to see how many goroutines are currently running,
+what they are doing, and whether they are blocked or not.debug.WriteHeapDump
+suspends the execution of all goroutines
+and allows you to dump the heap to a file. A heap dump is a
+snapshot of a Go process' memory at a given time. It contains all
+allocated objects as well as goroutines, finalizers, and more.runtime.NumGoroutine
+returns the number of current goroutines.
+The value can be monitored to see whether enough goroutines are +utilized or to detect goroutine leaks.
+utilized or to detect the goroutine leaks.Go comes with a runtime execution tracer to capture a wide range +of runtime events. Scheduling, syscall, garbage collections, +heap size, and other events are collected by runtime and available +for visualization by the go tool trace. Execution tracer is a tool +to detect latency and utilization problems. You can examine how well +the CPU is utilized, and when networking or syscalls are a cause of +preemption for the goroutines.
+ +Tracer is useful to:
+However, it is not great for identifying hot spots, such as +analyzing the cause of excessive memory or CPU usage. +Use profiling tools first to address those problems.
+ +
+
+
Above, the go tool trace visualization shows the execution started +fine, and then it became serialized. It suggests that there might +be lock contention for a shared resource that creates a bottleneck.
+ +See go tool trace
+to collect and analyze runtime traces.
+
The runtime also emits events and information if the +GODEBUG +environment variable is set accordingly.
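For example, the gctrace and schedtrace settings print runtime internals to standard error; the binary name ./myprog is illustrative:

```shell
# Print a summary line to stderr after every garbage collection.
GODEBUG=gctrace=1 ./myprog

# Print scheduler state every 1000 ms.
GODEBUG=schedtrace=1000 ./myprog
```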
+ ++Summarizes tools and methodologies to diagnose problems in Go programs. +