5 Advanced Java Debugging Techniques Every Developer Should Know About
Production debugging is hard, and it’s getting harder. With
architectures becoming more distributed and code more asynchronous, pinpointing
and resolving errors that happen in production is no child's play. In this
article we’ll talk about some advanced techniques that can help you get to the
root cause of painful bugs in production more quickly, without adding material
overhead to your already busy production environment.
Better jstacks. jstack has been around for a long time
(about as long as the JVM’s been around) and even to this day remains a crucial
tool in every developer’s arsenal. Whenever you’re staring at a Java process
that’s hung or not responding, it’s your go-to tool to see the stack trace of
each thread. Even so, jstack has a couple of disadvantages that detract from
its ability to help you debug complex issues in production. The first is that
while it tells you what each thread is doing through its stack trace, it
doesn't tell you why it's doing it (the kind of information usually only
available through a debugger); and it doesn’t tell you when it started doing
it.
Fortunately, there's a great way to fix that, and make a good tool even better, by injecting dynamic variable state into each thread's name. The key lies in one of the most unexpected places. You see, at no point in its execution does jstack query the JVM or the application code for variable state to present. There is, however, one important exception which we can leverage to turn lemons into lemonade, and that is the thread's name (set via Thread.setName()), whose value is injected into the stack dump. By setting it correctly you can move away from uninformative jstack thread printouts that look like this:
"pool-1-thread-1" #17 prio=5 os_prio=31 tid=0x00007f9d620c9800 nid=0x6d03 in Object.wait() [0x000000013ebcc000]
Compare that with the following thread printout, which contains a description of the actual work being done by the thread, the input parameters passed to it, and the time at which it started processing the request:
"pool-1-thread-1" #17: Queue: ACTIVE_PROD, MessageID: AB5CAD, type: Analyze, TransactionID: 56578956, Start Time: 10/8/2014 18:34
Here's an example of how to set a stateful thread name:

private void processMessage(Message message) { // an entry point into your code
    String name = Thread.currentThread().getName();
    try {
        // encode the task's context (transaction, message type/ID, start time) into the thread name
        Thread.currentThread().setName(prettyFormat(name, getCurrTransactionID(),
                message.getMsgType(), message.getMsgID(), getCurrentTime()));
        doProcessMessage(message);
    } finally {
        Thread.currentThread().setName(name); // restore the original name
    }
}
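The prettyFormat helper isn't shown in the article; here's a minimal, purely illustrative sketch of one way it might look, simply packing the task's context into the thread name:

private static String prettyFormat(String originalName, String transactionId,
                                   String msgType, String msgId, String startTime) {
    // keep it cheap: plain concatenation, no allocation-heavy formatting
    return originalName + ": MessageID: " + msgId
            + ", type: " + msgType
            + ", TransactionID: " + transactionId
            + ", Start Time: " + startTime;
}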
In this example, where the thread is processing messages out
of a queue, we see the target queue from which the thread is dequeuing
messages, as well as the ID of the message being processed, and the transaction
to which it is related (which is critical for reproducing the issue locally), and last, but far from least, the time at which the processing of this message began.
This last bit of information enables you to look at a server jstack with
upwards of a hundred worker threads, and see which ones started first and are
most likely causing an application server to hang.
An example of how an enhanced jstack shows dynamic variable
state for each thread in the dump. Thread start time is marked as TS.
This capability works just as well when you're using a profiler, a commercial monitoring tool, a JMX console, or even Java Mission Control. In all of these cases, your ability to look at the live thread
state, or a historic thread dump, and see exactly what each thread is doing and
when it started is materially enhanced by having stateful thread contexts.
This thread variable state will also be shown by any JDK or
commercial debugger or profiler.
But the value of thread names doesn't stop there. They play an even bigger role in production debugging, even if you don't use jstack at all. One instance is the global Thread.UncaughtExceptionHandler callback (set via Thread.setDefaultUncaughtExceptionHandler), which serves as a last line of defense before an uncaught exception terminates a thread (or the thread is returned to the thread pool).
By the time an uncaught exception handler is reached, code execution has stopped and both the frames and their variable state have already been rolled back. The only state that remains from which you can log the task the thread was processing, its parameters and its start time is captured by - you guessed it - a stateful thread name (and any additional thread-local (TLS) variables attached to it).
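Here's a minimal sketch of wiring up such a handler (the class name and the System.err logging are illustrative; use whatever logging framework you already have):

public class UncaughtExceptionLogging {
    public static void install() {
        Thread.setDefaultUncaughtExceptionHandler((thread, error) -> {
            // by now the stack has unwound; the stateful thread name is the
            // last remaining description of the task that was being processed
            System.err.println("Uncaught exception in " + thread.getName());
            error.printStackTrace();
        });
    }
}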
It's important to keep in mind that a threading framework might implicitly catch exceptions without you knowing it. A good example is ThreadPoolExecutor, which catches any exception thrown by your Runnable and delegates it to its afterExecute method, which you can override to display something meaningful. So whatever framework you use, be mindful that if a thread fails you still have a chance to log the exception and the thread's state, so that tasks don't simply disappear into the ether.
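For example, here's a minimal sketch of overriding afterExecute to log failed tasks (the pool sizing and the System.err logging are illustrative):

import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class LoggingThreadPool extends ThreadPoolExecutor {
    public LoggingThreadPool(int poolSize) {
        super(poolSize, poolSize, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
    }

    @Override
    protected void afterExecute(Runnable task, Throwable thrown) {
        super.afterExecute(task, thrown);
        // 'thrown' is non-null only for directly submitted Runnables that threw;
        // tasks submitted as Callables/Futures need their Future inspected instead
        if (thrown != null) {
            System.err.println("Task failed on " + Thread.currentThread().getName() + ": " + thrown);
        }
    }
}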
Throughput and deadlock jstacks. Another disadvantage of
tools like jstack or Mission Control is that they need to be activated manually
on a target JVM which is experiencing issues. This reduces their effectiveness
in production where 99% of the time when issues occur you’re not there to
debug.
Happily enough there’s a way by which you can activate
jstack programmatically when your application’s throughput falls under a
specific threshold or meets a specific set of conditions. You can even
automatically activate jstack when your JVM deadlocks to see exactly which
threads are deadlocking and what all the other threads are doing (coupled of
course with dynamic variable state for each one, courtesy of stateful thread
names). This can be invaluable, as deadlocks and concurrency issues are sporadic for the most part and notoriously hard to reproduce. In this case, a thread dump captured automatically at the moment of deadlock, one which also contains the stateful information for each thread, can be a huge catalyst in your ability to reproduce and solve these kinds of bugs.
Click here to learn more about how to automatically detect
deadlocks from within your application.
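As a rough sketch of the idea, the JDK's ThreadMXBean can drive such a check from within the application itself (the class name, polling interval and System.err logging here are illustrative):

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class DeadlockWatchdog {
    public static void start() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "deadlock-watchdog");
            t.setDaemon(true);
            return t;
        }).scheduleAtFixedRate(() -> {
            long[] deadlocked = threads.findDeadlockedThreads();
            if (deadlocked != null) {
                // dump every thread, not just the deadlocked ones, so the stateful
                // thread names of all workers are captured alongside the deadlock
                for (ThreadInfo info : threads.dumpAllThreads(true, true)) {
                    System.err.println(info);
                }
            }
        }, 10, 10, TimeUnit.SECONDS);
    }
}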
Capturing live variables. We’ve talked about ways of
capturing state from the JVM through thread contexts. This approach, however
effective, is restricted to variables that you had to format into the thread
name in advance. Ideally, we want to be able to go in and get the value of any
variable from any point in the code from a live JVM, without attaching a
debugger or redeploying code. A great tool that’s been around for a while, but
hasn’t got the recognition it deserves, lets you do just that; and that tool is
BTrace.
BTrace is a helpful tool that lets you run Java-like scripts
on top of a live JVM to capture or aggregate any form of variable state without
restarting the JVM or deploying new code. This enables you to do pretty powerful things like printing the stack traces of threads writing to a specific file, or printing the number of items in any queue or connection pool, and many more.
This is done using BTrace scripting, a Java-like syntax in
which you write functions that are injected into the code in locations of your
choice through bytecode transformation (a process we’ll touch on below). The
best way to try out the tool is to attach one of its sample scripts to a live application. Usage is very straightforward: from your command line, simply enter btrace <JVM pid> <script name>. There's no need to restart your JVM.
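As a taste of the syntax, here's a minimal script in the style of BTrace's bundled samples, printing a line whenever any thread is started (package and utility-class names vary slightly between BTrace releases):

import com.sun.btrace.annotations.BTrace;
import com.sun.btrace.annotations.OnMethod;
import static com.sun.btrace.BTraceUtils.println;

@BTrace
public class ThreadStartTracker {
    // injected at every call to Thread.start() in the target JVM
    @OnMethod(clazz = "java.lang.Thread", method = "start")
    public static void onThreadStart() {
        println("about to start a new thread");
    }
}

Saving this as ThreadStartTracker.java and running btrace <JVM pid> ThreadStartTracker.java attaches it to the running process.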
BTrace is very well documented and comes with many sample
scripts (see below) to cover various common debugging scenarios around IO,
memory allocation and class loading. Here are a few examples of powerful things you can do very easily with BTrace:
NewArray.java: Print whenever a new char[] is allocated, and add your own conditions based on its value. Pretty handy for selective memory profiling.
FileTracker.java: Print whenever the application writes to a specific file location. Great for pinpointing the cause of excessive IO operations.
Classload.java: React whenever a target class is loaded into the JVM. Very useful for debugging "jar-hell" situations.
BTrace was designed as a non-intrusive tool, which means it
cannot alter application state or flow control. That’s a good thing, as it
reduces the chances of us negatively interfering with the execution of live
code and makes its use in production much more acceptable. But this capability
comes with some heavy restrictions - you can't create objects (not even Strings!), call into your own or 3rd-party code (to perform actions such as logging), or even do simple things such as looping, for fear of creating an infinite loop. To do those things you'll have to use the next technique - writing your own Java agent.
A Java agent is a jar file through which the JVM gives you access to an Instrumentation object, enabling you to modify bytecode that has already been loaded into the JVM and alter its behaviour. This essentially lets you
“rewrite” code that is already loaded and compiled by the JVM, without
restarting the application or changing the .class file on disk. Think about it
like BTrace on steroids - you can essentially inject new code anywhere in your
app into both your code and 3rd party code to capture any piece of information
you want.
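Here's a minimal sketch of the moving parts of such an agent (the target class name and the instrumentWithAsm helper are placeholders for your own transformation logic; the agent is wired up via the Premain-Class manifest entry and the -javaagent flag):

import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

public class StateCaptureAgent {
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new ClassFileTransformer() {
            @Override
            public byte[] transform(ClassLoader loader, String className,
                                    Class<?> classBeingRedefined,
                                    ProtectionDomain protectionDomain,
                                    byte[] classfileBuffer) {
                if (!"com/example/MessageProcessor".equals(className)) {
                    return null; // returning null leaves the class untouched
                }
                // hand the original bytes to a bytecode library (ASM, Javassist, ...)
                // and return the rewritten class here
                return instrumentWithAsm(classfileBuffer);
            }
        });
    }

    private static byte[] instrumentWithAsm(byte[] originalBytes) {
        return originalBytes; // placeholder - see the ASM sketch below
    }
}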
The biggest downside to writing your own agent is that
unlike BTrace, which lets you write Java-like scripts to capture state, Java
agents operate at the bytecode level. This means that if you want to inject
code into an application you'll have to produce the right bytecode. This can be tricky, because bytecode is hard to write and read, following an operand-stack syntax which is similar in many ways to assembly language. And to make things harder, any injected bytecode that doesn't fit the location into which it is placed will be rejected, without much fanfare, by the JVM's verifier.
To assist with this, many bytecode-generation libraries have been written over the years, such as Javassist and ASM (which is my personal favorite). A great hack I find myself using quite a lot relies on the ASM bytecode outline IDE plugin, which lets you type any Java code in your editor and automatically generates the ASM code that would produce the equivalent bytecode, which you can then copy and paste into your agent.
Click here for a real-world example of a sample agent we used to detect sporadic memory leaks coming from 3rd-party code on our server, correlating them to application state.
Dynamically generating bytecode generation scripts from your
IDE using the ASM bytecode plugin.
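To give a flavor of what such ASM code looks like, here's a small, self-contained sketch that rewrites a class so every method prints its own name on entry (the class and method names are illustrative; in a real agent this would be called from the transform method shown earlier):

import org.objectweb.asm.ClassReader;
import org.objectweb.asm.ClassVisitor;
import org.objectweb.asm.ClassWriter;
import org.objectweb.asm.MethodVisitor;
import org.objectweb.asm.Opcodes;

public class EntryLoggingTransformer {
    public static byte[] addEntryLogging(byte[] classBytes) {
        ClassReader reader = new ClassReader(classBytes);
        ClassWriter writer = new ClassWriter(reader, ClassWriter.COMPUTE_MAXS);
        reader.accept(new ClassVisitor(Opcodes.ASM9, writer) {
            @Override
            public MethodVisitor visitMethod(int access, String name, String descriptor,
                                             String signature, String[] exceptions) {
                MethodVisitor mv = super.visitMethod(access, name, descriptor, signature, exceptions);
                return new MethodVisitor(Opcodes.ASM9, mv) {
                    @Override
                    public void visitCode() {
                        super.visitCode();
                        // equivalent to: System.out.println("entered <method name>");
                        visitFieldInsn(Opcodes.GETSTATIC, "java/lang/System", "out",
                                "Ljava/io/PrintStream;");
                        visitLdcInsn("entered " + name);
                        visitMethodInsn(Opcodes.INVOKEVIRTUAL, "java/io/PrintStream",
                                "println", "(Ljava/lang/String;)V", false);
                    }
                };
            }
        }, 0);
        return writer.toByteArray();
    }
}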
The last technique I’d like to touch on briefly is building
native JVM agents. This approach uses the JVMTI C/C++ API layer, which gives you unprecedented control over, and access into, the internals of the JVM. This includes things like getting callbacks whenever GC starts and stops, whenever new threads are spawned, whenever monitors are acquired, and many more low-level events.
This is by far the most powerful approach to acquire state from running code,
as you are essentially running at the JVM level.
But with great power comes great responsibility, and some
pretty complex challenges make this approach relatively harder to implement.
The first is that since you're running at the JVM level you're no longer writing in cross-platform Java, but in low-level, platform-dependent C++. The
second disadvantage is that the APIs themselves, while extremely powerful, are
hard to use and can impact performance significantly, depending on the specific
set of capabilities you’re consuming.
On the plus side, if used correctly, this layer provides
terrific access to parts of the JVM which would otherwise be closed to us in
our search for the root cause of production bugs. When we began writing Takipi for production debugging, I'm not sure we knew the extent to which JVMTI would play a crucial role in our ability to build the tool. The reason is that through this layer you're able to detect exceptions, observe calls into the OS, and map application code without requiring manual user instrumentation. If you have the time to take a look at this API layer, I highly recommend it, as it opens a unique window into the JVM that not many of us know about.

[Source] https://www.infoq.com/articles/Advanced-Java-Debugging-Techniques/