Troubleshoot JVM crashes of WebLogic: CompilerThread problem

by Tony van Esch on June 5, 2012 · 8 comments

Our WebLogic server running Enterprise Manager 11g occasionally dies and is restarted by NodeManager. The automatic restart is of course very nice, but I’d rather see the WebLogic server keep running.

The first thing to do is investigate the WebLogic server logfiles and the JVM logfiles:

cd $ORACLE_HOME/../../gc_inst/user_projects/domains/GCDomain/servers/EMGC_OMS1/logs/
 WLS logs: EMGC_OMS1.log*
 JVM logs: EMGC_OMS1.out*

The WLS server logs give no clue about what happened, but the JVM standard output (stdout) shows that the JVM crashed and dumped core.

 # A fatal error has been detected by the Java Runtime Environment:
 #
 #  SIGSEGV (0xb) at pc=0x00007fea60a26a8a, pid=6442, tid=1100994880
 #
 # JRE version: 6.0_18-b07
 # Java VM: Java HotSpot(TM) 64-Bit Server VM (16.0-b13 mixed mode linux-amd64 )
 # Problematic frame:
 # V  [libjvm.so+0x248a8a]
 #
 # An error report file with more information is saved as:
 # /u01/app/oracle/product/gc_inst/user_projects/domains/GCDomain/hs_err_pid6442.log
 #
 # If you would like to submit a bug report, please visit:
 #   http://java.sun.com/webapps/bugreport/crash.jsp
 #
 /u01/app/oracle/product/gc_inst/user_projects/domains/GCDomain/bin/startWebLogic.sh: line 193:  6442 Aborted                 (core dumped) ${JAVA_HOME}/bin/java ${JAVA_VM} ${MEM_ARGS} -Dweblogic.Name=${SERVER_NAME} -Djava.security.policy=${WL_HOME}/server/lib/weblogic.policy ${JAVA_OPTIONS} ${PROXY_SETTINGS} ${SERVER_CLASS}
 <Jun 5, 2012 10:33:53 AM> <FINEST> <NodeManager> <Waiting for the process to die: 6392>
 <Jun 5, 2012 10:33:53 AM> <INFO> <NodeManager> <Server failed so attempting to restart (restart count = 1)>

Let’s look into logfile: /u01/app/oracle/product/gc_inst/user_projects/domains/GCDomain/hs_err_pid6442.log.

There are three more of these hs_err logfiles and the same number of Linux core dumps. The hs_err logfile contains enough information to pinpoint the root cause. I’ve trimmed the output to keep it readable for this specific case. The clues to follow:

  1. Thread section: SIGSEGV on CompilerThread0. The JIT compiler is giving us trouble.
  2. Compile task: oracle.sysman.emo.perf.metric.rt.DbAshRollupMetric._getData. This is the method that was being compiled when the compiler choked.
  3. Process section: all other listed threads are in _thread_blocked state; only CompilerThread0 is in _thread_in_native state.

---------------  T H R E A D  ---------------
Current thread (0x00007fea16e59800):  JavaThread "CompilerThread0" daemon [_thread_in_native, id=6468, stack(0x00000000418fd000,0x00000000419fe000)]

siginfo:si_signo=SIGSEGV: si_errno=0, si_code=1 (SEGV_MAPERR), si_addr=0x0000000000000000
...snipped...
Current CompileTask:
C2:188% !   oracle.sysman.emo.perf.metric.rt.DbAshRollupMetric._getData(ZLjava/util/Date;Ljava/lang/Object;J)Loracle/sysman/emSDK/emd/dtd/MetricResult; @ 165 (1478 bytes)
...snipped...
---------------  P R O C E S S  ---------------

Java Threads: ( => current thread )
  0x00007fea16e5b000 JavaThread "Low Memory Detector" daemon [_thread_blocked, id=6470, stack(0x00000000410d3000,0x00000000411d4000)]
  0x00007fea164ef800 JavaThread "CompilerThread1" daemon [_thread_blocked, id=6469, stack(0x00000000420d5000,0x00000000421d6000)]
=>0x00007fea16e59800 JavaThread "CompilerThread0" daemon [_thread_in_native, id=6468, stack(0x00000000418fd000,0x00000000419fe000)]
  0x00007fea16eb1800 JavaThread "MultiThreadedHttpConnectionManager cleanup" daemon [_thread_blocked, id=6467, stack(0x00000000417fc000,0x00000000418fd000)]
  0x00007fea16e51000 JavaThread "AD Thread Pool-Global1" daemon [_thread_blocked, id=6466, stack(0x0000000041ad6000,0x0000000041bd7000)]
  0x00007fea16e4f000 JavaThread "AD Thread Pool-Global0" daemon [_thread_blocked, id=6465, stack(0x0000000040616000,0x0000000040717000)]
  0x00007fea16df5000 JavaThread "AD Thread-Metric Reporter0" daemon [_thread_blocked, id=6464, stack(0x0000000040252000,0x0000000040353000)]
  0x00007fea16df4800 JavaThread "AD Thread-Config Poller" daemon [_thread_blocked, id=6463, stack(0x00000000416fb000,0x00000000417fc000)]
  0x00000000412a4000 JavaThread "Thread-0" daemon [_thread_blocked, id=6454, stack(0x000000004010a000,0x000000004020b000)]
  0x00000000412b6000 JavaThread "Signal Dispatcher" daemon [_thread_blocked, id=6451, stack(0x000000004050b000,0x000000004060c000)]
  0x0000000041297800 JavaThread "Finalizer" daemon [_thread_blocked, id=6450, stack(0x0000000040800000,0x0000000040901000)]
  0x0000000041290000 JavaThread "Reference Handler" daemon [_thread_blocked, id=6449, stack(0x0000000041ea9000,0x0000000041faa000)]
  0x0000000041208800 JavaThread "main" [_thread_blocked, id=6443, stack(0x0000000041c95000,0x0000000041d96000)]
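
The three clues can also be pulled out of an error report mechanically. The following is a minimal sketch, assuming the hs_err layout shown above (a "Current thread" line, a "siginfo" line, and a "Current CompileTask:" header with the task on the next line). hs_err_summary is a hypothetical helper name, not something shipped with the JDK.

```shell
# Print the crashing thread, the signal info, and the method that was being
# compiled from one HotSpot error report (hs_err_pid<PID>.log).
# Assumes the report layout shown in the excerpt above.
hs_err_summary() {
    awk '
        # Thread that received the signal
        /^Current thread/       { print }
        # Signal name and faulting address
        /^siginfo/              { print }
        # The compile task itself is on the line after this header
        /^Current CompileTask:/ { if ((getline task) > 0) print task }
    ' "$1"
}
```

For example, hs_err_summary hs_err_pid6442.log would print the three lines discussed above, provided the report contains a compile task at all.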

Scanning through the other hs_err logfiles, I see that every crash happened while compiling the same method!

cd /u01/app/oracle/product/gc_inst/user_projects/domains/GCDomain
grep 'C2:' hs_err_*.log
hs_err_pid17546.log:C2:175% !   oracle.sysman.emo.perf.metric.rt.DbAshRollupMetric._getData(ZLjava/util/Date;Ljava/lang/Object;J)Loracle/sysman/emSDK/emd/dtd/MetricResult; @ 165 (1478 bytes)
hs_err_pid25557.log:C2:187% !   oracle.sysman.emo.perf.metric.rt.DbAshRollupMetric._getData(ZLjava/util/Date;Ljava/lang/Object;J)Loracle/sysman/emSDK/emd/dtd/MetricResult; @ 165 (1478 bytes)
hs_err_pid6013.log:C2:198% !   oracle.sysman.emo.perf.metric.rt.DbAshRollupMetric._getData(ZLjava/util/Date;Ljava/lang/Object;J)Loracle/sysman/emSDK/emd/dtd/MetricResult; @ 165 (1478 bytes)
hs_err_pid6442.log:C2:188% !   oracle.sysman.emo.perf.metric.rt.DbAshRollupMetric._getData(ZLjava/util/Date;Ljava/lang/Object;J)Loracle/sysman/emSDK/emd/dtd/MetricResult; @ 165 (1478 bytes)
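
With more than a handful of crash reports, a count per method is easier to read than the raw grep output. This is a rough sketch, assuming the compile task line starts with C1: or C2: and that the method is the first field containing an opening parenthesis, as in the lines above. crash_methods is a hypothetical helper name.

```shell
# Count how often each method was the current compile task across a set of
# hs_err logfiles. Assumes task lines look like the "C2:188% ! pkg.Class.method(...)"
# lines shown above.
crash_methods() {
    grep -h '^C[12]:' "$@" |
        awk '{
            # The method is the first field that contains "(";
            # strip the signature so only pkg.Class.method remains.
            for (i = 1; i <= NF; i++) {
                if ($i ~ /\(/) { sub(/\(.*/, "", $i); print $i; break }
            }
        }' |
        sort | uniq -c | sort -rn
}
```

Run from the domain directory as crash_methods hs_err_pid*.log. A single method with a count equal to the number of logfiles points at the same compiler problem each time.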

MOS note 1009131.1 describes what happens in a compiler crash: the JVM fails while compiling bytecode into native code (instead of interpreting it). That leaves us three options to work around the problem.

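  1. use -client instead of -server (which is the default in JDK 1.6); note that 64-bit JDKs ignore -client and always run the server VM, so this does not help on a linux-amd64 JVM like the one above;
  2. use -Xint to disable native compilation completely (for the whole JVM);
  3. use a .hotspot_compiler file to selectively prevent methods from being compiled to native code.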

As we have determined which method is giving us problems, we can implement option 3. The .hotspot_compiler file has to be placed in the working directory of the JVM. We can determine the working directory with the following Java utility.

/u01/app/oracle/product/jdk/bin/jinfo <PID of JVM>|grep user.dir
Attaching to process ID 3069, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 16.0-b13
user.dir = /u01/app/oracle/product/gc_inst/user_projects/domains/GCDomain
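
On Linux the working directory of a running process can also be read without attaching a debugger. This is a small sketch, assuming a Linux /proc filesystem and that user.dir was not overridden with -Duser.dir at startup (in which case it may differ from the process working directory). jvm_cwd is a hypothetical helper name.

```shell
# Print the current working directory of a running process on Linux.
# $1: PID of the JVM
jvm_cwd() {
    # /proc/<PID>/cwd is a symbolic link to the process working directory
    readlink "/proc/$1/cwd"
}
```

Reading the link requires running as the process owner or root, for example jvm_cwd 3069 as the oracle user.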

Now we create the .hotspot_compiler file in the user.dir with the following contents:

exclude oracle/sysman/emo/perf/metric/rt/DbAshRollupMetric._getData
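
Note the format: the package separators are slashes, while the class and method are still joined by a dot. The sketch below converts the dotted name from the compile task line into such an exclude rule. to_exclude_line is a hypothetical helper name, and the sketch assumes the input contains at least one dot.

```shell
# Turn "pkg.sub.Class.method" into "exclude pkg/sub/Class.method".
# $1: fully qualified method name as printed in the compile task line,
#     without the signature
to_exclude_line() {
    fq_method=$1
    fq_class=${fq_method%.*}     # everything before the last dot
    method_name=${fq_method##*.} # everything after the last dot
    printf 'exclude %s.%s\n' "$(printf '%s' "$fq_class" | tr . /)" "$method_name"
}
```

Redirecting the output into .hotspot_compiler in the JVM working directory produces the file described above. HotSpot also accepts the same rule on the command line as -XX:CompileCommand=exclude,<class>.<method>, or from an explicitly named file via -XX:CompileCommandFile=<path>. Whether a given JDK release still reads .hotspot_compiler implicitly is worth verifying before relying on it.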

Now we can restart the JVM and check whether the configuration is successful. The fastest way to restart the WebLogic JVM of Enterprise Manager is to send it a SIGKILL; NodeManager will kick in and restart the WebLogic server.
Checking standard out (EMGC_OMS1.out) of the WebLogic JVM shows that we succeeded.

<snipped>
starting weblogic with Java version:
CompilerOracle: exclude oracle/sysman/emo/perf/metric/rt/DbAshRollupMetric._getData
java version "1.6.0_18"
Java(TM) SE Runtime Environment (build 1.6.0_18-b07)
Java HotSpot(TM) 64-Bit Server VM (build 16.0-b13, mixed mode)
Starting WLS with line:
<etcetera>
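
The same check can be scripted, for instance after every restart. This is a minimal sketch, assuming the JVM prints a "CompilerOracle: exclude ..." line to standard out as in the excerpt above. exclusion_loaded is a hypothetical helper name.

```shell
# Succeed (exit status 0) if the JVM reported loading the given exclude rule.
# $1: JVM standard out file, e.g. EMGC_OMS1.out
# $2: class/method pattern exactly as written in .hotspot_compiler
exclusion_loaded() {
    # -F: match the pattern literally, -q: only report via the exit status
    grep -F -q "CompilerOracle: exclude $2" "$1"
}
```

For example: exclusion_loaded EMGC_OMS1.out oracle/sysman/emo/perf/metric/rt/DbAshRollupMetric._getData && echo "exclude rule active".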

Hope this helps…

8 comments

Eric Darchis October 17, 2012 at 4:39 pm

The -Xint parameter allowed me to see an error message that was not visible when it was crashing:

I removed the -Xint and set the sysctl.conf with:
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216

It is now working exactly as expected without -Xint or any other workaround. May not work for everyone but worth checking anyhow.

Tony van Esch October 23, 2012 at 10:26 am

Hi Eric,

thanks for the information. Will use it to investigate further issues.

regards,
Tony

Chinmay Anand December 12, 2012 at 7:44 am

Hi Eric,
How did you arrive at the number “16777216”?
Did the option “-Xint” give you a hint that this is the value to set as the maximum OS receive buffer size and maximum OS send buffer size?

This will help us to analyze the same issue occurring in our environment.
Thanks for the information.

Regards,
Chinmay

Chinmay Anand December 12, 2012 at 7:54 am

Hi,
Please ignore my ignorance, as I don’t have any experience in system administration, but trying to recover a random JVM crash from an application using JDK build 1.5.0_14-b03.

Any suggestion will be appreciated.
Regards,
Chinmay

Tony van Esch December 12, 2012 at 9:55 am

Hi Chinmay,
identify logfiles (on linux I use ls -l /proc//fd/) and get clues about your intermittent crashes. Furthermore you can use VisualVM to troubleshoot/profile your application: http://visualvm.java.net/

hope this helps,
Tony

kalyan March 31, 2015 at 11:52 am

Hello,
I have a question. We are using a Java EE application running on WebLogic Server 11g.
How can we check which JVM we are using?
I have read somewhere that WebLogic Server is a Java software process that executes on a JVM.
so tell me .pls reply me solution to “nlkalyan.434@gmail.com”
Thanks in advance!!

ramesh May 5, 2016 at 8:31 am

Hi All,
This is a P1 for my team.
Please reply with a solution ASAP.
The server does not start; it gives this error:
There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 2147483648 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /opt/nokiacif/cwap/epm/instance/wl/epmadm/dom/dom-v1.0.1/epmadmdomain/hs_err_pid21426.log

It is also not creating a log file.

Thanks in advance,
ramesh.

ramesh May 5, 2016 at 8:34 am

Hi all,

I’m getting the error below while starting the managed server.
It is a P1 for me; could anyone help me ASAP?

There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 2147483648 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /opt/nokiacif/cwap/epm/instance/wl/epmadm/dom/dom-v1.0.1/epmadmdomain/hs_err_pid21426.log

Thanks in advance,
Ramesh
