We just migrated a part of a 9.2.0.8 database to 11.2.0.3 and it is running really well. Ofcourse at some point errors will show up that need to be addressed. Some errors are easier to handle than others. There are certain kinds of errors that are often hard to diagnose. Think of memory errors in the ORA-0403x range. The case needed to be investigated concerns a ORA-04030. How do we approach such cases.
This is the complete error description:
ORA-04030: out of process memory when trying to allocate 104 bytes (callheap,kpokgtda)
Enterprise manager Grid Control has a special section called the support workbench, where incidents and problems can be managed. In the main screen of the database instance we have a list of open alerts.
Drilling into the specific incident, we will see some basic information related to this specific incident. Supposedly there was a problem with the callheap ‘kpokgtda’.
At this point I normally do a search on My Oracle Support (MOS) to see if I get some hits pertaining to this memory structure. In this case I get two hits:
- Bug 9127263 – Session spin compiling a form library using dblinks and SYNONYM [ID 9127263.8]
- Compiling a PLSQL Procedure Over a Database Link From 9i To 10g Hangs [ID 987507.1]
These hits relate to older versions of the database, although there are some database links present in the current setup. We can investigate this further on the commandline. All information that was presented in Enterprise Manager can be obtained through the ADRCI commandline utility aswell.
Let’s see what incidents we have:
adrci ADRCI: Release 11.2.0.3.0 - Production on Fri Apr 27 10:41:46 2012 Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved. ADR base = "/u01/app/oracle" adrci> show INCIDENT ADR Home = /u01/app/oracle/diag/rdbms/PROD/PROD: ************************************************************************* INCIDENT_ID PROBLEM_KEY CREATE_TIME -------------------- ----------------------------------------------------------- ---------------------------------------- 24185 ORA 4031 2012-04-19 17:08:56.945000 +02:00 91410 ORA 4030 2012-04-21 19:49:42.090000 +02:00 91411 ORA 4030 2012-04-21 19:50:28.889000 +02:00 91562 ORA 4030 2012-04-21 19:53:07.754000 +02:00 91498 ORA 4030 2012-04-21 19:53:12.735000 +02:00 91530 ORA 4030 2012-04-21 19:53:37.473000 +02:00 131799 ORA 4030 2012-04-26 00:20:38.729000 +02:00 131800 ORA 4030 2012-04-26 00:20:54.722000 +02:00 134407 ORA 603 2012-04-26 00:21:14.538000 +02:00 9 rows fetched
We want the details of incident 131799, as that’s the one we want to diagnose.
adrci> show incident -mode detail -p "incident_id=131799" ********************************************************** INCIDENT INFO RECORD 1 ********************************************************** INCIDENT_ID 131799 STATUS ready CREATE_TIME 2012-04-26 00:20:38.729000 +02:00 PROBLEM_ID 2 CLOSE_TIME <NULL> FLOOD_CONTROLLED none ERROR_FACILITY ORA ERROR_NUMBER 4030 ERROR_ARG1 104 ERROR_ARG2 callheap ERROR_ARG3 kpokgtda ERROR_ARG4 <NULL> <sniped> IMPACT4 0 KEY_NAME Client ProcId KEY_VALUE oracle@server1 (TNS V1-V3).21266_47614664629808 KEY_NAME SID KEY_VALUE 406.4823 KEY_NAME ProcId KEY_VALUE 274.47 OWNER_ID 1 INCIDENT_FILE /u01/app/oracle/diag/rdbms/PROD/PROD/incident/incdir_131799/PROD_ora_21266_i131799.trc OWNER_ID 1 INCIDENT_FILE /u01/app/oracle/diag/rdbms/PROD_1/PROD/trace/PROD_ora_21266.trc
To find out what was being done in the session at crash time, open up the incident file created through adrci. This tracefile is more readable than the original tracefile generated by the database server as it has been processed by the ADRCI framework.
adrci> show tracefile -i 131799
diag/rdbms/PROD/PROD/incident/incdir_131799/PROD_ora_21266_i131799.trc
adrci> view PROD_ora_21266_i131799.trc
The header information gives us the first clue. The session came from a remote server.
*** 2012-04-26 00:20:38.748 *** SESSION ID:(406.4823) 2012-04-26 00:20:38.748 *** CLIENT ID:() 2012-04-26 00:20:38.748 *** SERVICE NAME:(whrocks) 2012-04-26 00:20:38.748 *** MODULE NAME:(oracle@remoteServer1(TNS V1-V3)) 2012-04-26 00:20:38.748 *** ACTION NAME:() 2012-04-26 00:20:38.748
Let’s dive deeper into the tracefile (geek mode). Following is an overview of memory usage by the session. Do not forget that I’m just using common sense to get to this point. It seems this session is (trying to or) consuming way to much memory (see highlights).
========= Dump for incident 131799 (ORA 4030) ========
----- Beginning of Customized Incident Dump(s) -----
=======================================
TOP 10 MEMORY USES FOR THIS PROCESS
---------------------------------------
*** 2012-04-26 00:20:44.218
65% 2592 MB, 25369976 chunks: "kpokgtda "
callheap ds=0xb2ae8c0 dsprt=0xb2af7e0
35% 1368 MB, 357321 chunks: "permanent memory "
Alloc server h ds=0x2b4e2780cc10 dsprt=0x2b4e2780e090
<sniped>
=======================================
PRIVATE MEMORY SUMMARY FOR THIS PROCESS
---------------------------------------
*** 2012-04-26 00:20:45.097
******************************************************
PRIVATE HEAP SUMMARY DUMP
4087 MB total:
4087 MB commented, 166 KB permanent
114 KB free (0 KB in empty extents),
2691 MB, 2 heaps: "callheap "
1395 MB, 1 heap: "Alloc environm " 19 KB free held
*** 2012-04-26 00:20:51.550
------------------------------------------------------
Summary of subheaps at depth 1
3976 MB total:
3974 MB commented, 74 KB permanent
1706 KB free (0 KB in empty extents),
2594 MB, 25369976 chunks: "kpokgtda " 1514 KB free held
================================
PER-PROCESS PRIVATE MEMORY USAGE
--------------------------------
Private memory usage per Oracle process
-------------------------
Top 10 processes:
-------------------------
(percentage is of 4599 MB total allocated memory)
89% pid 274: 4087 MB used of 4089 MB allocated <= CURRENT PROC
1% pid 213: 41 MB used of 49 MB allocated (4672 KB freeable)
1% pid 24: 40 MB used of 46 MB allocated (4352 KB freeable)
1% pid 21: 40 MB used of 43 MB allocated (1088 KB freeable)
1% pid 23: 40 MB used of 43 MB allocated (1088 KB freeable)
0% pid 34: 19 MB used of 21 MB allocated
0% pid 22: 17 MB used of 18 MB allocated
0% pid 13: 11 MB used of 13 MB allocated (1216 KB freeable)
0% pid 17: 1988 KB used of 12 MB allocated (10 MB freeable)
0% pid 157: 8045 KB used of 11 MB allocated (960 KB freeable)
Now let’s find out which cursor or object was accessed by this session. Start at the beginning of the tracefile and search for ObjectName. The first one found gives me a clue to what was initially called. I know the package is being called from a 9.2.0.8 database through a database link (I changed the real schema and package name btw). This seems to indicate that we might be hitting a bug related to the findings in MOS (database links, synonyms and packages) I mentioned earlier.
SO: 0x4531866d8, type: 79, owner: 0x4484dd2b8, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
proc=0x4503c9fe8, name=LIBRARY OBJECT PIN, file=kgl.h LINE:8551, pg=0
LibraryObjectPin: Address=0x4531866d8 Handle=0x455cee788 Mode=S Lock=0x4531865d8 User=0x4484dd2b8 Session=0x4484dd2b8 Count=1 Mask=0001 Flags=[00] SavepointNum=0x766
LibraryHandle: Address=0x455cee788 Hash=917efa9a LockMode=S PinMode=S LoadLockMode=0 Status=VALD
ObjectName: Name=WH.ROCKS
FullHashValue=9b915f44c837ed0b4f5adc6e917efa9a Namespace=TABLE/PROCEDURE(01) Type=PACKAGE(09) Identifier=68426 OwnerIdn=729
Statistics: InvalidationCount=0 ExecutionCount=811523 LoadCount=6 ActiveLocks=14 TotalLockCount=9534 TotalPinCount=102697
Counters: BrokenCount=1 RevocablePointer=1 KeepDependency=0 BucketInUse=10879 HandleInUse=10879 HandleReferenceCount=0
So now we actually know where something went wrong. Next step is packaging the incident and open a Service Request with Oracle to help further diagnose this issue. Let’s do this through the Enterprise Manager interface.
Generate a package and exclude those incidents and files which are not relevant (or in my case way to big (think Gigabytes)). After packaging you can create a Service Request and upload the package with all relevant tracefile and logs to MOS.
As you can see the ADRCI framework helped in identifying the possible root cause of our ORA-04030 issue. Especially the processing/reformatting of the initial tracefile into an adrci incident tracefile gives us more information for diagnosing these kind of problems without being an Oracle internals expert. The adrci utility can ofcourse also be used to investigate and diagnose other tough (internal) errors.
Have fun troubleshooting with ADRCI
Troubleshooting ORA-04030 process memory errors with the ADR Command Interpreter (ADRCI),



Whitehorses is specialized in succesfully implementing Oracle SOA solutions: BPEL, OSB, WebLogic & BPM