Saturday, November 09, 2013

JVM OutOfMemory troubleshooting - Eclipse MAT to rescue Talend jobs

I will illustrate with an example how to use Eclipse MAT to debug OutOfMemory issues in standalone Talend jobs.

Talend Open Studio for Data Integration generates a shell script that wraps a java command. When that script fails with OutOfMemory errors, we need to proceed exactly as we would when troubleshooting OutOfMemory errors in any other JVM process: generate a heap dump file (*.hprof) and analyze it with a tool to find out whether we are holding more objects in memory than actually needed.

The first thing we need to do is to narrow the OutOfMemory error down to a single command line:
job/job/job_run.sh --context_param ...
Then we need to add the necessary flags to the java command in the shell script to get the heap dump file:
java -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath="/tmp/dumps"
Now we run the script again and we notice the message:
java.lang.OutOfMemoryError: Java heap space
Dumping heap to /tmp/dumps/java_pid5394.hprof ...
Note that HeapDumpPath is not necessarily a directory: when running JDK 8, if /tmp/dumps does not already exist as a directory, the JVM treats it as the name of the generated file, so you will see a different message indicating that /tmp/dumps is itself your hprof file:
java.lang.OutOfMemoryError: Java heap space
Dumping heap to /tmp/dumps ...
In that case you will need to append the .hprof extension to the file in order to load it in Eclipse MAT, or simply specify the file name directly in the JVM flag:
java -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath="/tmp/dump.hprof"
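If you want to confirm that the flags really reached the JVM, you can ask the JVM itself. The following is a minimal sketch and not part of the code Talend generates: the class name HeapDumpFlagCheck is made up for illustration, and it assumes a HotSpot JVM, where the com.sun.management.HotSpotDiagnosticMXBean API is available.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper class, for illustration only.
public class HeapDumpFlagCheck {

    // Returns the current value of a HotSpot VM option as a string,
    // for example "true"/"false" for boolean flags.
    static String vmOption(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        // Print the two options used in this post.
        System.out.println("HeapDumpOnOutOfMemoryError = "
                + vmOption("HeapDumpOnOutOfMemoryError"));
        System.out.println("HeapDumpPath = " + vmOption("HeapDumpPath"));
    }
}
```

Run it with the same flags you added to the job's java command and compare the printed values with what you expect.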
Our file has been generated, so it is time to load the heap dump file (*.hprof) into Eclipse Memory Analyzer (MAT).

Just go to "Menu|File|Open File". Once the file is loaded, select "Leak Suspects Report". The pie chart should identify the major problem suspect(s), and you can drill down by scrolling down that page.



Click on "Details" for each problem suspect.



Notice how, in the case of Talend, practically all of the consumption happens in the main() method, since Talend just produces a huge Java class with a main method. Drill into "Accumulated Objects by Class", available towards the bottom of the page.



As you can see, dom4j is used for parsing what is most likely big XML content (instead of SAX, for example). Clicking on the link for the objects lets you navigate through their hierarchy. With Object Query Language (OQL) you can inspect literally anything in the hierarchy. Locate the OQL icon, click on it and get ready to type queries. If you just "select * from" the object, the result will be a hierarchy similar to the one you get when inspecting individual objects, but with OQL you can drill down further and get hints about the actual loaded data. All you need is to know the fields of the object, which you can find through the hierarchy inspection or even directly in the Javadocs.



The solution is most likely to use a different configuration for the faulty component or, if the component itself is deficient, to look for an alternative. BTW, if your heap is too large you can always limit the memory available to the JVM to reduce the size of the hprof file. Most likely the leak will still be revealed with a smaller memory footprint by an excessive usage of certain classes of objects.
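To make the dom4j-versus-SAX remark above more concrete, here is a minimal sketch of streaming XML with the SAX parser that ships with the JDK. It is not what Talend generates and not necessarily the right fix for your component: the class name SaxRowCounter and the element name "row" are assumptions made for the example. The point is only that a SAX handler sees one event at a time and never builds the whole document tree in memory.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class, for illustration only.
public class SaxRowCounter {

    // Counts elements with the given name while streaming the input.
    // No document tree is kept; only the counter lives on the heap.
    static int countElements(byte[] xml, String elementName) {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                if (qName.equals(elementName)) {
                    count[0]++;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new ByteArrayInputStream(xml), handler);
        } catch (Exception e) {
            // Wrap parser and I/O failures to keep the sketch short.
            throw new IllegalStateException("XML parsing failed", e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        // Made-up sample data; a real job would stream from a file.
        byte[] xml = "<rows><row id=\"1\"/><row id=\"2\"/></rows>"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(countElements(xml, "row")); // prints 2
    }
}
```

In a real job you would pass a file stream instead of a byte array, so that memory usage stays roughly constant regardless of the size of the XML.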

Profile the application before getting a surprising OutOfMemory

You can generate heap dumps from running programs at any time. You just need the pid of the running process, for example:
$ jps
2017 Bootstrap
13667 Jps
13650 talend_sample
$ jmap -dump:format=b,file=/tmp/talend_sample.hprof 13650
Dumping heap to /tmp/talend_sample.hprof  ...
Heap dump file created
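If you would rather trigger the dump from inside the program than from the command line, HotSpot JVMs expose the same operation through the HotSpotDiagnosticMXBean. The following is a minimal sketch under that assumption: the class name HeapDumper and the output path are made up for the example, and nothing like this exists in the Talend-generated job unless you add it yourself.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;

// Hypothetical helper class, for illustration only.
public class HeapDumper {

    // Writes a heap dump of the current JVM to the given path.
    // liveOnly=true dumps only objects that are still reachable.
    // The target file must not exist yet; keep the .hprof suffix,
    // which newer JDKs require.
    static File dump(String path, boolean liveOnly) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            bean.dumpHeap(path, liveOnly);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new File(path);
    }

    public static void main(String[] args) {
        // Example path only; pick any writable location.
        File f = dump("/tmp/talend_sample_manual.hprof", true);
        System.out.println("Heap dump written to " + f.getAbsolutePath());
    }
}
```

The resulting file can be opened in Eclipse MAT exactly like the ones produced by jmap or by the HeapDumpOnOutOfMemoryError flag.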
