Follow Us on Twitter

Troubleshooting ORA-04030 process memory errors with the ADR Command Interpreter (ADRCI)

by Tony van Esch on April 27, 2012 · 0 comments

We just migrated a part of a 9.2.0.8 database to 11.2.0.3 and it is running really well. Ofcourse at some point errors will show up that need to be addressed. Some errors are easier to handle than others. There are certain kinds of errors that are often hard to diagnose. Think of memory errors in the ORA-0403x range. The case needed to be investigated concerns a ORA-04030. How do we approach such cases.

This is the complete error description:

ORA-04030: out of process memory when trying to allocate 104 bytes (callheap,kpokgtda)

Enterprise manager Grid Control has a special section called the support workbench, where incidents and problems can be managed. In the main screen of the database instance we have a list of open alerts.

Oracle Enterprise Manager Grid Control Support Workbench

Oracle Enterprise Manager Grid Control Support Workbench

Drilling into the specific incident, we will see some basic information related to this specific incident. Supposedly there was a problem with the callheap ‘kpokgtda’.

Examining ORA-4030 Incident Details

Examining ORA-4030 Incident Details

At this point I normally do a search on My Oracle Support (MOS) to see if I get some hits pertaining to this memory structure. In this case I get two hits:

  1. Bug 9127263 – Session spin compiling a form library using dblinks and SYNONYM [ID 9127263.8]
  2. Compiling a PLSQL Procedure Over a Database Link From 9i To 10g Hangs [ID 987507.1]

These hits relate to older versions of the database, although there are some database links present in the current setup. We can investigate this further on the commandline. All information that was presented in Enterprise Manager can be obtained through the ADRCI commandline utility aswell.

Let’s see what incidents we have:

adrci
ADRCI: Release 11.2.0.3.0 - Production on Fri Apr 27 10:41:46 2012
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
ADR base = "/u01/app/oracle"
adrci> show INCIDENT
ADR Home = /u01/app/oracle/diag/rdbms/PROD/PROD:
*************************************************************************
INCIDENT_ID          PROBLEM_KEY                                                 CREATE_TIME
-------------------- ----------------------------------------------------------- ----------------------------------------
24185                ORA 4031                                                    2012-04-19 17:08:56.945000 +02:00
91410                ORA 4030                                                    2012-04-21 19:49:42.090000 +02:00
91411                ORA 4030                                                    2012-04-21 19:50:28.889000 +02:00
91562                ORA 4030                                                    2012-04-21 19:53:07.754000 +02:00
91498                ORA 4030                                                    2012-04-21 19:53:12.735000 +02:00
91530                ORA 4030                                                    2012-04-21 19:53:37.473000 +02:00
131799               ORA 4030                                                    2012-04-26 00:20:38.729000 +02:00
131800               ORA 4030                                                    2012-04-26 00:20:54.722000 +02:00
134407               ORA 603                                                     2012-04-26 00:21:14.538000 +02:00
9 rows fetched

We want the details of incident 131799, as that’s the one we want to diagnose.

adrci> show incident -mode detail -p "incident_id=131799"
**********************************************************
INCIDENT INFO RECORD 1
**********************************************************
   INCIDENT_ID                   131799
   STATUS                        ready
   CREATE_TIME                   2012-04-26 00:20:38.729000 +02:00
   PROBLEM_ID                    2
   CLOSE_TIME                    <NULL>
   FLOOD_CONTROLLED              none
   ERROR_FACILITY                ORA
   ERROR_NUMBER                  4030
   ERROR_ARG1                    104
   ERROR_ARG2                    callheap
   ERROR_ARG3                    kpokgtda
   ERROR_ARG4                    <NULL>
 <sniped>
   IMPACT4                       0
   KEY_NAME                      Client ProcId
   KEY_VALUE                     oracle@server1 (TNS V1-V3).21266_47614664629808
   KEY_NAME                      SID
   KEY_VALUE                     406.4823
   KEY_NAME                      ProcId
   KEY_VALUE                     274.47
   OWNER_ID                      1
   INCIDENT_FILE                 /u01/app/oracle/diag/rdbms/PROD/PROD/incident/incdir_131799/PROD_ora_21266_i131799.trc
   OWNER_ID                      1
   INCIDENT_FILE                 /u01/app/oracle/diag/rdbms/PROD_1/PROD/trace/PROD_ora_21266.trc

To find out what was being done in the session at crash time, open up the incident file created through adrci. This tracefile is more readable than the original tracefile generated by the database server as it has been processed by the ADRCI framework.

adrci> show tracefile -i 131799
     diag/rdbms/PROD/PROD/incident/incdir_131799/PROD_ora_21266_i131799.trc
adrci> view PROD_ora_21266_i131799.trc

The header information gives us the first clue. The session came from a remote server.

*** 2012-04-26 00:20:38.748
*** SESSION ID:(406.4823) 2012-04-26 00:20:38.748
*** CLIENT ID:() 2012-04-26 00:20:38.748
*** SERVICE NAME:(whrocks) 2012-04-26 00:20:38.748
*** MODULE NAME:(oracle@remoteServer1(TNS V1-V3)) 2012-04-26 00:20:38.748
*** ACTION NAME:() 2012-04-26 00:20:38.748

Let’s dive deeper into the tracefile (geek mode). Following is an overview of memory usage by the session. Do not forget that I’m just using common sense to get to this point. It seems this session is (trying to or) consuming way to much memory (see highlights).

========= Dump for incident 131799 (ORA 4030) ========
----- Beginning of Customized Incident Dump(s) -----
=======================================
TOP 10 MEMORY USES FOR THIS PROCESS
---------------------------------------
*** 2012-04-26 00:20:44.218
65% 2592 MB, 25369976 chunks: "kpokgtda                  "
         callheap        ds=0xb2ae8c0  dsprt=0xb2af7e0
35% 1368 MB, 357321 chunks: "permanent memory          "
         Alloc server h  ds=0x2b4e2780cc10  dsprt=0x2b4e2780e090
<sniped>
=======================================
PRIVATE MEMORY SUMMARY FOR THIS PROCESS
---------------------------------------
*** 2012-04-26 00:20:45.097
******************************************************
PRIVATE HEAP SUMMARY DUMP
4087 MB total:
  4087 MB commented, 166 KB permanent
   114 KB free (0 KB in empty extents),
    2691 MB,   2 heaps:   "callheap       "
    1395 MB,   1 heap:    "Alloc environm "            19 KB free held
*** 2012-04-26 00:20:51.550
------------------------------------------------------
Summary of subheaps at depth 1
3976 MB total:
  3974 MB commented, 74 KB permanent
  1706 KB free (0 KB in empty extents),
    2594 MB, 25369976 chunks:  "kpokgtda                  " 1514 KB free held
================================
PER-PROCESS PRIVATE MEMORY USAGE
--------------------------------
Private memory usage per Oracle process
-------------------------
Top 10 processes:
-------------------------
(percentage is of 4599 MB total allocated memory)
89% pid 274: 4087 MB used of 4089 MB allocated  <= CURRENT PROC
 1% pid 213: 41 MB used of 49 MB allocated (4672 KB freeable)
 1% pid 24: 40 MB used of 46 MB allocated (4352 KB freeable)
 1% pid 21: 40 MB used of 43 MB allocated (1088 KB freeable)
 1% pid 23: 40 MB used of 43 MB allocated (1088 KB freeable)
 0% pid 34: 19 MB used of 21 MB allocated
 0% pid 22: 17 MB used of 18 MB allocated
 0% pid 13: 11 MB used of 13 MB allocated (1216 KB freeable)
 0% pid 17: 1988 KB used of 12 MB allocated (10 MB freeable)
 0% pid 157: 8045 KB used of 11 MB allocated (960 KB freeable)

Now let’s find out which cursor or object was accessed by this session. Start at the beginning of the tracefile and search for ObjectName. The first one found gives me a clue to what was initially called. I know the package is being called from a 9.2.0.8 database through a database link (I changed the real schema and package name btw). This seems to indicate that we might be hitting a bug related to the findings in MOS (database links, synonyms and packages) I mentioned earlier.

      SO: 0x4531866d8, type: 79, owner: 0x4484dd2b8, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
       proc=0x4503c9fe8, name=LIBRARY OBJECT PIN, file=kgl.h LINE:8551, pg=0
      LibraryObjectPin: Address=0x4531866d8 Handle=0x455cee788 Mode=S Lock=0x4531865d8 User=0x4484dd2b8 Session=0x4484dd2b8 Count=1 Mask=0001 Flags=[00] SavepointNum=0x766
      LibraryHandle:  Address=0x455cee788 Hash=917efa9a LockMode=S PinMode=S LoadLockMode=0 Status=VALD
        ObjectName:  Name=WH.ROCKS
          FullHashValue=9b915f44c837ed0b4f5adc6e917efa9a Namespace=TABLE/PROCEDURE(01) Type=PACKAGE(09) Identifier=68426 OwnerIdn=729
        Statistics:  InvalidationCount=0 ExecutionCount=811523 LoadCount=6 ActiveLocks=14 TotalLockCount=9534 TotalPinCount=102697
        Counters:  BrokenCount=1 RevocablePointer=1 KeepDependency=0 BucketInUse=10879 HandleInUse=10879 HandleReferenceCount=0

So now we actually know where something went wrong. Next step is packaging the incident and open a Service Request with Oracle to help further diagnose this issue. Let’s do this through the Enterprise Manager interface.

Generate a package and exclude those incidents and files which are not relevant (or in my case way to big (think Gigabytes)). After packaging you can create a Service Request and upload the package with all relevant tracefile and logs to MOS.

Details of the Service Request package with relevant tracefile and logs

Details of the Service Request package with relevant tracefile and logs

As you can see the ADRCI framework helped in identifying the possible root cause of our ORA-04030 issue. Especially the processing/reformatting of the initial tracefile into an adrci incident tracefile gives us more information for diagnosing these kind of problems without being an Oracle internals expert. The adrci utility can ofcourse also be used to investigate and diagnose other tough (internal) errors.

Have fun troubleshooting with ADRCI

Troubleshooting ORA-04030 process memory errors with the ADR Command Interpreter (ADRCI), 4.4 out of 5 based on 5 ratings

Ratings:
VN:F [1.9.22_1171]
Rating: 4.4/5 (5 votes cast)

Leave a Reply

Your email address will not be published. Required fields are marked *

*

* Copy This Password *

* Type Or Paste Password Here *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

 

Previous post:

Next post:

About Whitehorses
Company profile
Services
Technology

Whitehorses website

Home page
Whitebooks
Jobs

Follow us
Blog post RSS
Comment RSS
Twitter