Method Tracing
#1497
@apangin, thanks for this great post and the work behind it! A question about this note:
As an async-profiler user, how do I understand that this particular profiling session needs this option? Should I notice some signs of its applicability, or maybe just use it if the defaults don't work?
Background
Async-profiler is often described as a low-overhead sampling profiler for Java. Sampling means the profiler collects statistical information about program execution by periodically interrupting it to get stack traces. While this approach has minimal performance impact on the monitored application, the main trade-off is its inability to get an exact count of method invocations or the duration of each invocation.
If a method is missing from a sampling profile, it does not mean the method was never executed. In cases where it is important to know the exact number of calls, or to discover whether a particular code path is ever taken, instrumenting profilers come to the rescue. In this post, I will showcase the method tracing capabilities of async-profiler, including the recently added latency profiling feature.
Basic instrumentation
Async-profiler has had instrumentation capabilities since version 1.7, released back in 2020. You can specify a Java method to instrument with the `-e` option, and all invocations of the given method will be recorded. This may be either a JDK internal method or any application method, public or private.

Let's say we are chasing the cause of mysterious Full GC events in the logs. It looks as if someone is calling `System.gc`, and async-profiler can tell exactly where it is called from:
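A minimal sketch of such a command, assuming the `asprof` launcher from async-profiler 3.x; the PID and the 60-second duration are placeholders:

```
# Trace every System.gc invocation for 60 seconds and write a flame graph
asprof -d 60 -e java.lang.System.gc -f gc-callers.html <pid>
```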
This command collects stack traces of all `System.gc` calls and summarizes them on a flame graph.

Surprisingly, `System.gc` is called right in the OpenJDK core library under certain out-of-memory conditions. By the way, to trace the source of such out-of-memory errors, you can use the same async-profiler feature: just instrument the `OutOfMemoryError` constructor:
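A sketch of what this could look like, assuming constructors are addressed by their JVM name `<init>` (the PID is again a placeholder):

```
# Record every OutOfMemoryError construction with Java and native frames interleaved
asprof -d 60 -e 'java.lang.OutOfMemoryError.<init>' --cstack vmx -f oom.html <pid>
```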
Note the `--cstack vmx` option: it enables the most detailed stack walking mode, showing all Java and native frames interleaved. In this example, `OutOfMemoryError` is thrown from the native implementation of the JDK-internal `map0` method.

You can use the `*` wildcard to profile more than one method at a time. For instance, `-e java.io.FileInputStream.read*` will match the `read`, `readNBytes` and `readAllBytes` methods. More generally, `-e java.io.FileInputStream.*` will instrument all methods of the `FileInputStream` class. You can find out what other types of instrumentation async-profiler supports in my Devoxx UK talk.
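For example, a sketch of tracing all `read` variants for 30 seconds (the quotes protect the wildcard from the shell; the PID is a placeholder):

```
# Instrument read, readNBytes and readAllBytes in one session
asprof -d 30 -e 'java.io.FileInputStream.read*' -f reads.html <pid>
```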
Latency tracing
Although the Java method profiling feature has existed for a long time, it could only trace method invocations, not the duration of calls. The `latency` option, introduced in #1421, closes this gap. It is already available in nightly builds.

As before, the `-e` argument (or `event` when starting the profiler as an agent) selects a method to instrument. Now, with the `latency` option, async-profiler will record the duration of each call. When combined with the `total` option, the flame graph accumulates total duration instead of the invocation count. Here is an example of how to count the total time spent inside `ByteBuffer.allocateDirect`:
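A sketch in agent form; `event` and `file` are standard agent options, while the bare `latency` and `total` flags simply mirror the option names described above, so the exact spelling may differ in the nightly build (MyApp is a placeholder):

```
# Accumulate total time (not call counts) spent in ByteBuffer.allocateDirect
java -agentpath:/path/to/libasyncProfiler.so=start,event=java.nio.ByteBuffer.allocateDirect,latency,total,file=alloc.html MyApp
```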
To get the most out of this feature, use latency profiling with the JFR output format. In this case, async-profiler records every single invocation of the given method along with the event timestamp, duration and a stack trace. You can then analyze the recording in JDK Mission Control (JMC) on the Event Browser tab - look for Method Trace events.
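If you prefer the command line over JMC, the `jfr` tool shipped with the JDK can dump the same events; this assumes the recording uses the standard `jdk.MethodTrace` event, as discussed later in the post:

```
# Print recorded method trace events from the JFR file
jfr print --events jdk.MethodTrace profile.jfr
```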
But what if the method of interest is called millions of times? To simplify analysis and, importantly, to reduce the performance overhead of method tracing, you can filter only those method calls that run longer than a specified amount of time - just pass the duration threshold as the `latency` argument.

For example, we want to find usages of `ArrayList.contains`, which is generally discouraged because of its linear complexity. However, there is no problem with this method at all when a list contains only a few elements. Our goal will be to find `ArrayList.contains` calls that take longer than 50 microseconds:
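Another sketch in agent form; the `latency=50us` threshold syntax is an assumption based on the description above, and the `.jfr` file extension selects JFR output (MyApp is a placeholder):

```
# Record only ArrayList.contains calls slower than 50 microseconds into a JFR file
java -agentpath:/path/to/libasyncProfiler.so=start,event=java.util.ArrayList.contains,latency=50us,file=contains.jfr MyApp
```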
Opening the profile in JMC, we see a handful of Method Trace events and can quickly dive deep into the longest ones.
In this screenshot, the slowest `ArrayList.contains` appears to be called from `AbstractSet.removeAll`. This is not a unique problem, actually: depending on the relative size of the collections, the `removeAll` implementation may iterate either over the original set or over the specified collection. If you are unlucky and the list is larger than the set, the algorithm complexity degrades to O(N*M). If you have ever experienced the same issue, the fix is to replace `set.removeAll(list)` with `list.forEach(set::remove)`.

Performance implications
Now that you've seen the capabilities of Method Tracing, I anticipate the question: what about overhead? Is it safe to enable tracing in production?
In short, Method Tracing was designed with high performance in mind. The feature relies on bytecode instrumentation: for every traced method, async-profiler inserts a timestamp call at the beginning of the method and another call at every exit point. Such instrumentation does not prevent the method from being JIT-compiled, and it typically adds just a few nanoseconds. When the method duration exceeds the latency threshold, a JNI call is made to collect a stack trace and record the event. This is the most expensive part of tracing. Hence the rule: choose the `latency` parameter wisely to avoid excessive tracing of short-running methods. For an average application, recording about 100 calls per second does not cause any noticeable performance impact.

Another approach to reduce the overhead is to set a non-zero `interval`. E.g., `interval=20` means async-profiler will record only one of every 20 method calls. This can be useful to collect a latency histogram without paying a penalty on every invocation.
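As a sketch of the agent form, assuming `interval` combines with the other method-tracing options as described (MyApp is a placeholder):

```
# Record the latency of every 20th FileInputStream.read call
java -agentpath:/path/to/libasyncProfiler.so=start,event=java.io.FileInputStream.read,latency,interval=20,file=reads.jfr MyApp
```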
Method Tracing in JDK

Async-profiler supports method tracing with all JDK versions starting from 8. In the meantime, JDK 25 comes with a new built-in feature that looks very similar at first glance. JEP 520: JFR Method Timing & Tracing adds two JFR events: `jdk.MethodTrace` and `jdk.MethodTiming`. In fact, async-profiler adopted the same event for method traces, aiming to become compatible with external tools once they support this new feature. There are several important differences, though.
In JEP 520, the stack trace of the `jdk.MethodTrace` event starts from the caller method. I deem it an unfortunate design flaw to omit the traced method itself, and here is why.

Besides just the method ID, `StackFrame` carries a bunch of additional details, such as the frame type. At different times, the traced method can be interpreted, compiled or inlined. Discovering its compilation status is useful for understanding a possible cause of latency outliers, which may be related, for instance, to deoptimization issues.

If the target method has multiple exits, it is good to know exactly which one is taken. For example, this allows comparing the latency of the "cache hit" path with the "cache miss" branch. Again, `method` alone is not enough to do so, as opposed to `StackFrame`, which has `lineNumber` and `bytecodeIndex` fields.

With the traced method included in the stack trace, the Flame Graph visualization works out of the box. Without it, tool developers need to special-case the `jdk.MethodTrace` event and append a synthetic frame with the traced method themselves.

For the above reasons, we decided to include the target method in async-profiler's stack traces, so that the additional information becomes immediately visible in JMC.
Like many other JFR events, `jdk.MethodTrace` has a few settings that can be tuned via the command line or in a .jfc configuration file. JEP 520 mentions `threshold`, intended to trace only methods taking longer than the specified amount of time. In practice, however, changing `threshold` has no effect, indicating the setting has not been implemented yet. This means JFR Method Tracing is not currently suitable for high-volume events.
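For reference, a sketch of how these settings would be passed on the command line; the `jdk.MethodTrace#filter` syntax follows JEP 520, but the filter value and the `50us` threshold are illustrative (MyApp is a placeholder):

```
# Trace ArrayList.contains with JFR Method Tracing (JDK 25)
# Per the observation above, the threshold setting is accepted but currently has no effect
java -XX:StartFlightRecording:jdk.MethodTrace#filter=java.util.ArrayList::contains,jdk.MethodTrace#threshold=50us,filename=trace.jfr MyApp
```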
There is one more interesting peculiarity of JDK's method traces: they treat all `throw` instructions as exit points, even if the exception is caught inside the same method. Consequently, all invocations of such methods are counted twice. I submitted OpenJDK issues for both problems: JDK-8367948 and JDK-8367949.
Notwithstanding this, I'm excited to see OpenJDK development moving in the same direction. It only emphasizes the relevance of Java observability tooling that combines both sampling and instrumenting profilers.
Conclusion
Method Tracing in async-profiler complements traditional sampling by providing exact invocation counts, call paths, and latency insights, all with carefully designed performance safeguards. With some caveats, JDK Flight Recorder also offers a Method Timing & Tracing feature since JDK 25.