Wednesday, March 25, 2009

Journey from 9.0.1 to 9.2.0.8: Part-IV

The Action Plan for the Oracle R1 (9.0.1) to Oracle R2 (9.2.0.8) migration was in place, and the scheduled full export dumps of the production instance were generating a dump file of around 9 GB. I coordinated with the IT Admins and arranged for a Test Machine, where I could carry out a series of migration tests with both the Manual Upgrade process and the Database Upgrade Assistant (DBUA).
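
For context, the scheduled full export was along these lines (a sketch; the credentials and paths are illustrative, not the actual job):

exp system/******@PROD FULL=Y DIRECT=Y FILE=D:\oracle\backup\prod_full.dmp LOG=D:\oracle\backup\prod_full.log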

I had to set up an environment similar to that of the Production Box; only then could I simulate the migration. So, my Action Plan for the simulated migration/upgrade on the test machine was as follows:

  1. Firstly, to install Oracle 9iR1 [9.0.1] base home on the test machine,
  2. Then, to create a Test Database similar to the Production Database, by using the latest export backup of the production instance,
  3. Next, to install Oracle 9iR2 [9.2.0] base home in a separate location on the same test machine,
  4. Then, to patch the Oracle 9iR2 [9.2.0] base home with 9.2.0.8 patchset,
  5. Before beginning the upgrade process, to ensure a cold backup of Oracle 9iR1 [9.0.1] Test Database is taken,
  6. Make sure the following environment variables point to the new Release directories (see the sketch after this list):
    a. ORACLE_HOME
    b. PATH
    c. ORA_NLS33
    d. ORACLE_BASE
    e. LD_LIBRARY_PATH
    f. ORACLE_PATH
  7. Then, using either DBUA or Manual process, to upgrade the Oracle 9iR1 [9.0.1] test database to Oracle 9iR2 [9.2.0.8],
  8. If the upgrade is successful, then to take a post-upgrade cold backup,
  9. If any major issues are encountered during the upgrade process, then to restore the pre-upgrade cold backup, point the database to the old Oracle 9iR1 [9.0.1] home, and restart the Migrate/Upgrade process.
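
For item 6, the repointing on Windows looks roughly like this (a minimal sketch; the 9.2.0 home path matches the one used later in this post, and LD_LIBRARY_PATH applies only on Unix platforms):

REM Point the environment at the new 9.2.0 Release home before the upgrade
set ORACLE_HOME=C:\oracle\product\9.2.0\db_1
set ORACLE_BASE=C:\oracle
set PATH=%ORACLE_HOME%\bin;%PATH%
set ORA_NLS33=%ORACLE_HOME%\ocommon\nls\admin\data
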
Depending on the test results, I would have to choose between Manual Upgrade and DBUA for carrying out the upgrade on the production box.

With DBUA, you had to do very little, as you were relying completely on the Upgrade Assistant to do the job; but if the upgrade failed midway, you would have a hard time tracing and fixing the issues and then continuing from where it failed. With the Manual Upgrade process, on the other hand, though it was a tedious and cumbersome method, you had complete control over your upgrade: if you hit an error at any point, you could fix the issue and proceed further. In addition, by the time you completed the Manual Upgrade, you would have gained full technical insight into the migration from one version to another. I also came across Metalink Note 406037.1, to check for issues when applying the patchset without shutting down the Database or the Listener.

To begin with, I initiated the migration testing using the Manual Upgrade process, following Metalink Note 159657.1 [Complete Upgrade Checklist for Manual Upgrades from 8.X / 9.0.1 to Oracle9iR2 (9.2.0)].

Below is the output of the prerequisite checks performed on the Test database:


SQL> select max(bytes) from dba_free_space
  2  where tablespace_name='SYSTEM';

MAX(BYTES)
----------
 242745344

SQL> SELECT COUNT(*) FROM dba_objects;

  COUNT(*)
----------
     67952

SQL> select substr(owner,1,12) owner,
  2         substr(object_name,1,30) object,
  3         substr(object_type,1,30) type,
  4         status
  5  from dba_objects
  6  where status <>'VALID';

OWNER        OBJECT                         TYPE               STATUS
------------ ------------------------------ ------------------ -------
[11 non-SYSTEM objects not shown here]

11 rows selected.

SQL> @C:\ORACLE\PRODUCT\9.0.1\DB_1\RDBMS\ADMIN\UTLRP

PL/SQL procedure successfully completed.

SQL> select substr(owner,1,12) owner,
  2         substr(object_name,1,30) object,
  3         substr(object_type,1,30) type,
  4         status
  5  from dba_objects
  6  where status <>'VALID';

no rows selected

SQL> SELECT * FROM V$RECOVER_FILE;

no rows selected

SQL> SELECT * FROM V$BACKUP WHERE STATUS <> 'NOT ACTIVE';

no rows selected

SQL> SELECT * FROM DBA_2PC_PENDING;

no rows selected

SQL> ALTER SYSTEM SET JOB_QUEUE_PROCESSES=0 SCOPE=BOTH;

System altered.

SQL> ED
Wrote file afiedt.buf

  1  SELECT USERNAME,
  2         DEFAULT_TABLESPACE
  3  FROM DBA_USERS
  4* WHERE USERNAME IN ('SYS', 'SYSTEM')
SQL> /

USERNAME                       DEFAULT_TABLESPACE
------------------------------ ------------------------------
SYSTEM                         SYSTEM
SYS                            SYSTEM

2 rows selected.

SQL> ED
Wrote file afiedt.buf

  1  SELECT TABLESPACE_NAME
  2  FROM DBA_TABLES
  3* WHERE TABLE_NAME='AUD$'
SQL> /

TABLESPACE_NAME
------------------------------
SYSTEM

1 row selected.

SQL> SELECT * FROM V$CONTROLFILE;

STATUS  NAME
------- ----------------------------------------------------------------------
        C:\ORACLE\ORADATA\DBDEV\CONTROL01.CTL
        C:\ORACLE\ORADATA\DBDEV\CONTROL02.CTL
        C:\ORACLE\ORADATA\DBDEV\CONTROL03.CTL

3 rows selected.

SQL> SELECT * FROM V$PWFILE_USERS;

USERNAME                       SYSDB SYSOP
------------------------------ ----- -----
INTERNAL                       TRUE  TRUE
SYS                            TRUE  TRUE

2 rows selected.

SQL> SHOW PARAMETER MAX_ENAB

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
max_enabled_roles                    integer     148

SQL> SHOW PARAMETER POOL_SIZE

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
global_context_pool_size             string
java_pool_size                       string      25165824
large_pool_size                      string      41943040
shared_pool_size                     big integer 100663296

SQL> ALTER SYSTEM SET SHARED_POOL_SIZE=150M SCOPE=SPFILE;

System altered.

SQL> ALTER SYSTEM SET LARGE_POOL_SIZE=157286400 SCOPE=SPFILE;

System altered.

SQL> SHOW PARAMETER REMOTE

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
remote_archive_enable                boolean     TRUE
remote_dependencies_mode             string      TIMESTAMP
remote_listener                      string
remote_login_passwordfile            string      EXCLUSIVE
remote_os_authent                    boolean     FALSE
remote_os_roles                      boolean     FALSE

SQL> SHUT IMMEDIATE
Database closed.
Database dismounted.
ORACLE instance shut down.

SQL> HOST ORADIM -DELETE -SID DBDEV

SQL> HOST ORADIM -NEW -SID DBDEV -INTPWD DBDEV -STARTMODE AUTO -PFILE C:\oracle\admin\DBDEV\pfile\INIT.ORA



Once the prerequisite checks are satisfactory, start up the database in migrate mode.

SQL> startup migrate
ORACLE instance started.

Total System Global Area 472457268 bytes
Fixed Size 454708 bytes
Variable Size 369098752 bytes
Database Buffers 100663296 bytes
Redo Buffers 2240512 bytes
Database mounted.
Database opened.
To start the Migrate/Upgrade process, run the following script:

SQL> @C:\oracle\product\9.2.0\db_1\rdbms\admin\U0900010

SQL> SELECT COMP_NAME, VERSION, STATUS
  2  FROM DBA_REGISTRY;

COMP_NAME                           VERSION                        STATUS
----------------------------------- ------------------------------ -----------
Oracle9i Catalog Views              9.2.0.8.0                      VALID
Oracle9i Packages and Types         9.2.0.8.0                      VALID
JServer JAVA Virtual Machine        9.0.1                          LOADED
Java Packages                       9.0.1                          LOADED
Oracle XDK for Java                 9.0.1                          LOADED
Oracle Text                         9.0.1                          LOADED
Oracle Workspace Manager            9.0.1.0.0                      LOADED
Oracle interMedia                   9.0.1.0.1                      LOADED

8 rows selected.
To Upgrade the Components, run the following script:

SQL> @C:\oracle\product\9.2.0\db_1\rdbms\admin\CMPDBMIG

SQL> SELECT COMP_NAME, VERSION, STATUS
  2  FROM DBA_REGISTRY;

COMP_NAME                           VERSION                        STATUS
----------------------------------- ------------------------------ -----------
Oracle9i Catalog Views              9.2.0.8.0                      VALID
Oracle9i Packages and Types         9.2.0.8.0                      VALID
JServer JAVA Virtual Machine        9.2.0.8.0                      VALID
Oracle9i Java Packages              9.2.0.8.0                      VALID
Oracle XDK for Java                 9.2.0.10.0                     UPGRADED
Oracle Text                         9.0.1                          LOADED
Oracle Workspace Manager            9.2.0.1.0                      VALID
Oracle interMedia                   9.0.1.0.1                      LOADED

8 rows selected.

SQL> shut immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
We now need to recompile the objects invalidated by the Migrate/Upgrade process.

 SQL> startup restrict
ORACLE instance started.

Total System Global Area 472457268 bytes
Fixed Size 454708 bytes
Variable Size 369098752 bytes
Database Buffers 100663296 bytes
Redo Buffers 2240512 bytes
Database mounted.
Database opened.

SQL> @C:\oracle\product\9.2.0\db_1\rdbms\admin\UTLRP

PL/SQL procedure successfully completed.

Table created.

Table created.

Table created.

Index created.

Table created.

Table created.

View created.

View created.

Package created.

No errors.

Package body created.

No errors.

PL/SQL procedure successfully completed.

PL/SQL procedure successfully completed.

SQL>
SQL> shut immediate
Database closed.
Database dismounted.
ORACLE instance shut down.

SQL> startup
ORACLE instance started.

Total System Global Area 472457268 bytes
Fixed Size 454708 bytes
Variable Size 369098752 bytes
Database Buffers 100663296 bytes
Redo Buffers 2240512 bytes
Database mounted.
Database opened.

SQL> SELECT COMP_NAME, VERSION, STATUS
  2  FROM DBA_REGISTRY;

COMP_NAME                           VERSION                        STATUS
----------------------------------- ------------------------------ -----------
Oracle9i Catalog Views              9.2.0.8.0                      VALID
Oracle9i Packages and Types         9.2.0.8.0                      VALID
JServer JAVA Virtual Machine        9.2.0.8.0                      VALID
Oracle9i Java Packages              9.2.0.8.0                      VALID
Oracle XDK for Java                 9.2.0.10.0                     VALID
Oracle Text                         9.0.1                          LOADED
Oracle Workspace Manager            9.2.0.1.0                      VALID
Oracle interMedia                   9.0.1.0.1                      LOADED

8 rows selected.
To upgrade “Oracle Text” and “Oracle interMedia”, I referred to the following Metalink Notes (a verification query follows the list):
  • 275689.1 Manual installation, deinstallation of Oracle Text 9.2, and
  • 337415.1 Where to Find the Information to Install, Upgrade, Downgrade and Deinstall interMedia/Oracle Multimedia?
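
After following the two notes, the components can be verified with the same registry view used above (a sketch):

SQL> SELECT comp_id, comp_name, version, status
  2  FROM dba_registry
  3  WHERE comp_id IN ('CONTEXT', 'ORDIM');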

Now my test database was successfully upgraded to Oracle 9iR2 (9.2.0.8). Similarly, I set up another test database, and this time carried out the simulated migration using DBUA. Using the Database Upgrade Assistant for the test database migration was really smooth. Just sit back and watch! But you do need to watch for error messages.

In parallel with the test-run activities, we received the good news that the Production Server’s RAM could now be upgraded from 1.5 GB to 4 GB, as RAM for our server model had become available. This really helped the production migration activity run smoothly. Without any further delay, we planned an appropriate maintenance window and upgraded the server's RAM.

Subsequently, I carried out the Oracle SGA optimization, sizing the SGA components appropriately to increase the SGA from 930 MB to 1.4 GB. The remaining 500-600 MB of the 2 GB process limit was configured for the user sessions’ memory. As we were running Windows 2000 Server Standard Edition, we were not in a position to enable Address Windowing Extensions [Metalink Note: 225349.1], so the remaining 2 GB of RAM stayed unutilized.
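
The resizing itself went through the spfile, along these lines (a sketch with illustrative values; the actual figures came from the sizing exercise):

SQL> ALTER SYSTEM SET DB_CACHE_SIZE=800M SCOPE=SPFILE;
SQL> ALTER SYSTEM SET SHARED_POOL_SIZE=300M SCOPE=SPFILE;
SQL> ALTER SYSTEM SET LARGE_POOL_SIZE=150M SCOPE=SPFILE;
SQL> SHUTDOWN IMMEDIATE
SQL> STARTUP
SQL> SHOW SGA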

By now, I was quite comfortable with both upgrade processes, and I decided to carry out the production migration from Oracle 9iR1 (9.0.1) to Oracle 9iR2 (9.2.0.8) using DBUA. I planned an ample maintenance window and, with the required technical assistance from the IT Admins, upgraded the production database using the Database Upgrade Assistant. During the upgrade, I received the following errors:


Step Execution Information:
Oracle Database

Error ignored: ORA-00904: "TYPE": invalid identifier
Error ignored: ORA-24032: object AQ$_AQ_EVENT_TABLE_T exists, index could not be created for queue table AQ_EVENT_TABLE ORA-06512: at line 6
Error ignored: ORA-24002: QUEUE_TABLE SYS.AQ_EVENT_TABLE does not exist ORA-06512: at line 6
Error ignored: ORA-24010: QUEUE SYS.AQ_EVENT_TABLE_Q does not exist ORA-06512: at "SYS.DBMS_AQADM_SYS", line 3676 ORA-06512: at "SYS.DBMS_AQADM", line 217 ORA-06512: at line 2
Error ignored: ORA-24032: object AQ$_AQ_SRVNTFN_TABLE_T exists, index could not be created for queue table AQ_SRVNTFN_TABLE ORA-06512: at line 6
Error ignored: ORA-24002: QUEUE_TABLE SYS.AQ_SRVNTFN_TABLE does not exist ORA-06512: at line 6
Error ignored: ORA-24010: QUEUE SYS.AQ_SRVNTFN_TABLE_Q does not exist ORA-06512: at "SYS.DBMS_AQADM_SYS", line 3676 ORA-06512: at "SYS.DBMS_AQADM", line 217 ORA-06512: at line 2
Error ignored: ORA-24006: cannot create QUEUE, SYSTEM.AQ$_DEF$_AQCALL_E already exists ORA-06512: at line 9
Error ignored: ORA-24002: QUEUE_TABLE SYSTEM.DEF$_AQCALL does not exist ORA-06512: at line 19
Error ignored: ORA-24006: cannot create QUEUE, SYSTEM.AQ$_DEF$_AQERROR_E already exists ORA-06512: at line 9
Error ignored: ORA-24002: QUEUE_TABLE SYSTEM.DEF$_AQERROR does not exist ORA-06512: at line 19
Error ignored: ORA-04063: view "SYS.DBA_LOGSTDBY_PROGRESS" has errors



As most of the errors were related to Advanced Queuing/Streams, and we were not using Streams, I ignored them, completed the upgrade using DBUA, and checked the health of the production instance.

SQL>
SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle9i Release 9.2.0.8.0 - Production
PL/SQL Release 9.2.0.8.0 - Production
CORE 9.2.0.8.0 Production
TNS for 32-bit Windows: Version 9.2.0.8.0 - Production
NLSRTL Version 9.2.0.8.0 - Production

SQL>
SQL> select substr(comp_id,1,12) comp_id, status, substr(version,1,10) version,
  2         substr(comp_name,1,40) comp_name from dba_registry order by 1;

COMP_ID      STATUS      VERSION    COMP_NAME
------------ ----------- ---------- ----------------------------------------
AMD          OPTION OFF  9.0.1.0.0  OLAP Catalog
CATALOG      INVALID     9.2.0.8.0  Oracle9i Catalog Views
CATJAVA      VALID       9.2.0.8.0  Oracle9i Java Packages
CATPROC      INVALID     9.2.0.8.0  Oracle9i Packages and Types
CONTEXT      VALID       9.2.0.8.0  Oracle Text
JAVAVM       VALID       9.2.0.8.0  JServer JAVA Virtual Machine
ORDIM        VALID       9.2.0.8.0  Oracle interMedia
OWM          VALID       9.2.0.1.0  Oracle Workspace Manager
RAC          INVALID     9.2.0.8.0  Oracle9i Real Application Clusters
SDO          VALID       9.2.0.8.0  Spatial
WK           LOADED      9.0.1.0.0  Ultrasearch
XML          VALID       9.2.0.10.0 Oracle XDK for Java

12 rows selected.

SQL> SELECT SUBSTR(OWNER,1,12) OWNER, SUBSTR(OBJECT_NAME,1,30) OBJECT,
  2         SUBSTR(OBJECT_TYPE,1,30) TYPE, STATUS
  3  FROM DBA_OBJECTS
  4  WHERE STATUS <> 'VALID';

OWNER        OBJECT                         TYPE                           STATUS
------------ ------------------------------ ------------------------------ -------
SYS          AQ$_AQ_EVENT_TABLE_E           QUEUE                          INVALID
SYS          AQ_EVENT_TABLE_Q               QUEUE                          INVALID
SYS          CATALOG_TYP                    TYPE                           INVALID
SYS          CATALOG_TYP                    TYPE BODY                      INVALID
SYS          CATINDEXMETHODS                TYPE                           INVALID
SYS          DBA_LOGSTDBY_PROGRESS          VIEW                           INVALID
SYS          DBMS_INTERNAL_LOGSTDBY         PACKAGE BODY                   INVALID
SYS          DBMS_LOGSTDBY                  PACKAGE BODY                   INVALID
SYS          RULEINDEXMETHODS               TYPE                           INVALID
SYS          TEXTOPTSTATS                   TYPE                           INVALID
QS_ADM       QS_APPLICATIONS                PACKAGE BODY                   INVALID

11 rows selected.

SQL> SELECT * FROM NLS_DATABASE_PARAMETERS WHERE PARAMETER='NLS_NCHAR_CHARACTERSET';

PARAMETER                      VALUE
------------------------------ ----------------------------------------
NLS_NCHAR_CHARACTERSET         AL16UTF16

Despite all the errors, the ERP Application was running fine. Still, I had to fix the invalid database components and the invalid AQ objects, so I immediately raised a Priority Service Request (7020306.994) with Oracle Support for the invalid components.

The invalid objects were of two types:

  1. Invalid AQ objects
    Support first suggested dropping the AQ objects, as we didn’t use them. I suggested instead that Support fix the objects rather than delete them (spin-off SR: 19434118.6). I was then told that we had metadata corruption. To resolve the invalid AQ objects, I referred to Metalink Note 203225.1 (To Manually Cleanup Advanced Queuing Tables); see the identification query after this list.
  2. Invalid Log Miner related objects
    To resolve the invalid objects pertaining to Log Miner, I was redirected to Metalink Note 309814.1 (Invalid SYS.DBMS_INTERNAL_LOGSTDBY Package Before or After Migrate)
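
As referenced in item 1, before the cleanup the AQ-related invalid objects can be listed with a query along these lines (a sketch; the LIKE pattern is my assumption, not taken from the note):

SQL> SELECT owner, object_name, object_type, status
  2  FROM dba_objects
  3  WHERE status <> 'VALID'
  4  AND object_name LIKE 'AQ%';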

SQL> select substr(comp_id,1,12) comp_id, status, substr(version,1,10) version,
  2         substr(comp_name,1,40) comp_name from dba_registry order by 1;

COMP_ID      STATUS      VERSION    COMP_NAME
------------ ----------- ---------- ----------------------------------------
AMD          OPTION OFF  9.0.1.0.0  OLAP Catalog
CATALOG      VALID       9.2.0.8.0  Oracle9i Catalog Views
CATJAVA      VALID       9.2.0.8.0  Oracle9i Java Packages
CATPROC      VALID       9.2.0.8.0  Oracle9i Packages and Types
CONTEXT      VALID       9.2.0.8.0  Oracle Text
JAVAVM       VALID       9.2.0.8.0  JServer JAVA Virtual Machine
ORDIM        VALID       9.2.0.8.0  Oracle interMedia
OWM          VALID       9.2.0.1.0  Oracle Workspace Manager
RAC          INVALID     9.2.0.8.0  Oracle9i Real Application Clusters
SDO          VALID       9.2.0.8.0  Spatial
WK           LOADED      9.0.1.0.0  Ultrasearch
XML          VALID       9.2.0.10.0 Oracle XDK for Java

12 rows selected.

To upgrade Ultra Search manually, Oracle Support suggested the following steps:

The simplest method to make Ultra Search VALID is to manually reinstall it.

Step 1: Deinstall Ultra Search. Refer to Note 177203.1 (How to Manually Install and Uninstall Ultra Search 9.0.1 Server Components) and follow "MANUAL DEINSTALLATION OF ULTRA SEARCH DATABASE OBJECTS" from that document.

Step 2: Install Ultra Search. Refer to Note 222517.1 (How to Manually Install and Uninstall Ultra Search 9.2.0.x Server Components) and follow "MANUAL INSTALL STEPS" from that document.

Now check the component status and the WKSYS invalid objects:

SQL> select comp_id, comp_name, status, version from dba_registry;
SQL> select object_name, object_type from dba_objects where status='INVALID' and owner='WKSYS';


To resolve the invalid RAC component issue, Oracle Support redirected me to Metalink Note 312071.1 (RAC Option Invalid after Migration). Here we didn’t have to do anything but leave it as it is. Why? Refer to the RCA and solution below, from Metalink Note 312071.1:

Previous versions (8.1.7/9.0.1) didn’t use a registry for components.

During migration to 9.2 or 10g the registry (DBA_REGISTRY) was created with information from database.

For RAC, the entry in the registry is created if view V$PING exists in the database at migration time.

Unfortunately this view (V$PING) was created by catparr.sql, a script which could be used although RAC hadn't been installed.

Therefore, if V$PING exists during migration to 9.2, 10g or 10gR2, an entry for RAC is created in DBA_REGISTRY. Later, the cmpdbmig script checks DBA_REGISTRY and tries to upgrade all components, including RAC.

As RAC is not really installed, the component becomes invalid.

Solution:
If you want to eliminate the line from the registry you can:
  - downgrade to 8.1.7
  - remove view V$PING
  - upgrade to 9.2, 10g or 10gR2 again
But the best solution is to ignore this line if RAC is not installed in the database.
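
Given this RCA, a quick sanity check (a sketch) is to confirm that the V$PING pieces exist in the dictionary, which is what caused the RAC registry entry to be created:

SQL> SELECT object_name, object_type
  2  FROM dba_objects
  3  WHERE object_name IN ('V$PING', 'V_$PING');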


After resolving all the issues, I ran one final health check before pronouncing the production database migration from Oracle 9iR1 (9.0.1) to 9iR2 (9.2.0.8) as “Successful”:

SQL> select comp_id, comp_name, status, version from dba_registry;

COMP_ID      COMP_NAME                                STATUS      VERSION
------------ ---------------------------------------- ----------- ------------
CATALOG      Oracle9i Catalog Views                   VALID       9.2.0.8.0
CATPROC      Oracle9i Packages and Types              VALID       9.2.0.8.0
JAVAVM       JServer JAVA Virtual Machine             VALID       9.2.0.8.0
CATJAVA      Oracle9i Java Packages                   VALID       9.2.0.8.0
XML          Oracle XDK for Java                      VALID       9.2.0.10.0
CONTEXT      Oracle Text                              VALID       9.2.0.8.0
RAC          Oracle9i Real Application Clusters       INVALID     9.2.0.8.0
OWM          Oracle Workspace Manager                 VALID       9.2.0.1.0
ORDIM        Oracle interMedia                        VALID       9.2.0.8.0
SDO          Spatial                                  VALID       9.2.0.8.0
AMD          OLAP Catalog                             OPTION OFF  9.0.1.0.0
WK           Oracle Ultra Search                      VALID       9.2.0.8.0

12 rows selected.

SQL> ed
Wrote file afiedt.buf

1* select object_name, object_type from dba_objects where status='INVALID'
SQL> /

no rows selected

Now, I was in a position to pronounce the production database migration from Oracle 9iR1 (9.0.1) to 9iR2 (9.2.0.8) as “Successful”.

It’s been more than 10 months now, and we haven’t encountered any ORA-600 [25012] or ORA-600 [12209]. The systems are running smoothly, user performance complaints are negligible, and I have more control over the whole ERP Database.

Post-upgrade, I was able to successfully implement Oracle RMAN Backup & Recovery. Despite the delay in procuring new hardware (due to the global economic recession), we are now relatively comfortable migrating the production database from Oracle 9iR2 (9.2.0.8) to either Oracle 10gR2 or Oracle 11gR1.

This is my “Journey from 9.0.1 to 9.2.0.8”. What I’ve learnt from it is that a successful implementation definitely needs technical and functional skills, but, as in my case, strong people skills help you overcome obstacles by finding workarounds you would never think of in a routine migration. I sincerely hope that “Journey from 9.0.1 to 9.2.0.8” will be of some use to fellow Oracle wayfarers, and I look forward to sharing more accomplishments in the near future.

Thursday, March 19, 2009

Journey from 9.0.1 to 9.2.0.8: Part-III

Within my first few weeks at the company, it was brought to my attention that system crashes were quite frequent and that I had to resolve them at the earliest to prevent downtime with business impact. I inquired and gathered enough non-technical information regarding the crashes from the perspectives of the Developers, the IT Admins, and a few of the users. I also had a glimpse of the alert.log and found a peculiar error whose frequent occurrence was notable. Then it happened: users started calling in that their sessions had hung and they were not able to log in or open new sessions. Even the developers could not do anything. I logged in using Toad with my sysdba account, and even that session was frozen.

I remember rushing to the Datacenter and coordinating with the IT Admins, as I did not have any remote access to control the database from my end. To my surprise, it was a pretty casual scene there. We logged in to the server, and I noticed that the CPU utilization was at 99-100%. I tried to connect using sysdba privileges and could not log in. Eventually, we decided to restart the server, and the system was back to normal.

I had to find a way to prevent the crashes from happening, or at least avoid restarting the server every time. Two questions were haunting me, and I was desperately seeking answers to them:

  1. What could have caused the server CPU utilization to shoot up to 99-100%?
  2. Why wasn’t I able to log in using my sysdba privileges, despite being part of the ORA_DBA group?
A detailed study of the alert.log revealed that the instance had raised a series of “ORA-600 [25012]” errors, immediately after which the instance crashed or the server was restarted. Further investigation revealed that the error had been occurring since 2005, with occurrences gradually increasing. A few weeks later, another similar incident occurred. This time we could see ORA-600 [12209] and ORA-600 [17281] in the alert.log and the trace files. I immediately raised a Service Request [6434255.992] with Oracle Support.

Oracle Support pointed out that we were hitting Bug 2622690. There are many reasons it can occur; in our case, the call stack and circumstances matched. And they could not do anything more, as we were running the database on a desupported version of Oracle; the last supported release was Oracle 9iR2 [9.2.0.8.0]. Below is the update from Oracle Support on the issue:

You are hitting bug 2622690.

Details:

ORA-600 [12209] can occur when using shared servers (MTS) if the client connection aborts.

In this case the database crashed because PMON had problems cleaning up the Shared server process. There is a fix created on top of 9.0.1.4 for ms windows. The patch should be located under 3183731. However, you must first install patchset 9.0.1.4

The bug is fixed in 10g, but I could find no occurrences of the bug in 9.2.0 either. Therefore, my recommendation is to install and upgrade to a supported release, which is 9.2.0.8 or 10g. The bug is fixed in 10g and should not reoccur in 9.2.0.8.

However should it reoccur, we can request a backport.

A new note was created, which can be accessed via Metalink [452099.1].

With all the information in hand, my analysis of the first question was that when the bug was hit and CPU utilization shot up to 99-100%, Oracle kept generating a dump file until the server was restarted; the file size usually varied from 25 MB to 300 MB (based on the last 3-4 known cases).

Later, I found that the MAX_DUMP_FILE_SIZE parameter was set to UNLIMITED. I changed it to 100 MB and monitored the effect of the change on the ORA-600 errors encountered. Based on the findings, I further reduced MAX_DUMP_FILE_SIZE to 10 MB. After this change, I rarely had to restart the server; I could restart the database via remote access. Call it a temporary workaround for the ORA-600 errors being encountered.
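
The workaround itself is a one-liner. A minimal sketch of the check and the change (the 'M' string syntax assumes an spfile-managed instance):

SQL> SHOW PARAMETER MAX_DUMP_FILE_SIZE
SQL> ALTER SYSTEM SET MAX_DUMP_FILE_SIZE='10M' SCOPE=BOTH;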

With respect to the second question, I suppose the login failures were caused by the CPU utilization rising to 99-100% and Oracle prioritizing the dump-file creation, leaving insufficient resources to establish new connections (be it SYS or any other account).

To fix this issue permanently, we had to migrate the database to the latest supported release, i.e. Oracle 10gR2. But due to resource constraints on the server, especially the available memory (1.5 GB) and to some extent the storage, the only way we could migrate to 10gR2 was to move to a new server with an optimal configuration. To fix the issue on the existing server, we had to at least apply patch 3183731 on top of Oracle 9iR1 (9.0.1.4), as suggested by Oracle Support, or migrate to Oracle 9iR2 (9.2.0.8). Oracle Support recommended moving to Oracle 9iR2 [9.2.0.8].

I prepared a Root Cause Analysis report on the incident and a Management Summary depicting the average loss (in terms of cost) the company was incurring due to the downtime resulting from these incidents, and built a case to migrate to Oracle 9iR2 [9.2.0.8].

What I had to do next was to draft the action plan for the migration from Oracle 9iR1 [9.0.1] to Oracle 9iR2 [9.2.0.8], and then carry out a series of tests to tackle all migration-related issues beforehand, especially bearing in mind that we had an OLTP ERP database and that the ORA-600 nightmares must not recur post-migration. In parallel, I had to somehow upgrade the RAM on the server, both for a smooth migration and to address the performance issues caused by resource constraints.

In a nutshell, my Action Plan for the migration/upgrade on the same server was as follows:

  1. Firstly, to install Oracle 9iR2 [9.2.0] base home in a separate location on the database server,
  2. Then, to patch the Oracle 9iR2 [9.2.0] base home with 9.2.0.8 patchset,
  3. Before beginning the upgrade process, to ensure a cold backup of Oracle 9iR1 [9.0.1] Database is taken,
  4. Then, using either DBUA or Manual process, to upgrade the Oracle 9iR1 [9.0.1] database to Oracle 9iR2 [9.2.0.8],
  5. If everything is perfect, then to take a post-upgrade cold backup,
  6. If any major issues are encountered during the upgrade process, then to restore the cold backup prior to upgrade and point the database to the old home Oracle 9iR1 [9.0.1].

In the final part of the "Journey from 9.0.1 to 9.2.0.8" series, I will share my experience of the Oracle 9iR2 [9.2.0.8] production upgrade and how I tackled the post-upgrade issues.

Wednesday, March 18, 2009

SAVE 25% on all Oracle9i and 10g Exams

I received an email from Oracle University, so I thought of sharing the discount information on various Oracle exams with Oracle colleagues and aspirants in India.

Hope it helps.

SAVE 25% on all Oracle9i and 10g Exams

HURRY, BOOK YOUR SEATS NOW! (Limited seats per week per centre)

Exam Dates: 21st, 28th March 2009
Hurry, Register Now!

Please contact the Oracle Representatives below for exam details.
Location Contact us
Bangalore 080-41088235
Chennai 044-66346114
Delhi 011-46509015
Hyderabad 040-66397157
Kolkata 033-66162000
Mumbai 022-67711214
Ahmedabad 079-40024246
Pune 020-66321113 / 9326727519
Nagpur 08041088036
Chandigarh 080 41088036
Coimbatore 080 41084657
Jaipur 080 41088036

To learn more about Oracle Certification, call 080-41088012 or +91-9900521221
email Susheela

To plan your Oracle Database 10g and 11g training: email Supriya or call 080-41084656 for the latest course schedule.

Tuesday, March 17, 2009

Oracle APEX: At Oracle 6i Developers' Mercy

I had heard about Oracle APEX from a lot of people, and that it was really a gem for designing Oracle web applications. I assumed that, like Oracle iAS, we would need to know a lot of Java to do the same. But, to my surprise, Oracle Application Express turned out to be a true "Oracle SQL, PL/SQL" web-based development tool. Everything you design, develop and deploy comes from the database.

You need an ADMIN schema for managing Oracle APEX, and then you can have separate schemas for your developments, where you will "actually" store all your forms, reports, code, SQL and PL/SQL objects, etc. When you run your application through a web server, all the elements of the page come from the database. That's remarkable. Don't believe me? Then try it yourself.

In the next post I will show you how I set up the APEX environment for our developers, for them to evaluate APEX for internal developments.

Tuesday, March 3, 2009

Investigating Hiccups in RMAN Implementation for Production Database

My RMAN implementation is stuck just short of "Implemented".

  1. I have configured the Production Database in Archivelog Mode.
  2. I have created a Recovery Catalog.
  3. I have registered the Production Database.
  4. I have set all the configuration required for the Disk Backup to a shared SAN location.
But the RMAN backup to the shared SAN location is painfully slow: it takes around 8-13 hours to complete a full RMAN backup (full backup size: 18 GB).
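
For reference, the disk-backup configuration from step 4 was along these lines (a sketch; the SAN share path is the one that appears in the test output below):

RMAN> CONFIGURE DEFAULT DEVICE TYPE TO DISK;
RMAN> CONFIGURE CONTROLFILE AUTOBACKUP ON;
RMAN> CONFIGURE CHANNEL DEVICE TYPE DISK FORMAT '\\blade5\mis\backup\RMAN\%U';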

So, I raised a Service Request with Oracle Support and have been following up with them since last week. The IT Admins claim the network is not a bottleneck, as they push 70 GB of backups in 4 hours. And going by the series of findings I submitted to Oracle Support, RMAN seems to be doing its job perfectly. To a shared location on the Test Machine, the same backup completes in 35-40 minutes.

Today, I worked with Oracle Support and ran a series of tests to see how much time it takes to back up from the production instance to 3 different locations.

I connected RMAN to the production instance on the Production Server as target, using the control file (rather than the recovery catalog) as the repository, and carried out the tests.

For the test, I used a 1.36 GB tablespace with a single datafile (ORION_TEST.DBF), backed up to three destinations: local disk (D:\oracle\backup), a shared folder on the test machine (\\testdb\orcl), and the shared SAN location (\\blade5\mis\backup\RMAN).
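
The three test backups were along these lines (a sketch; the tablespace name is inferred from the datafile name in the output):

RMAN> BACKUP TABLESPACE orion_test FORMAT 'D:\oracle\backup\%U';           # local disk
RMAN> BACKUP TABLESPACE orion_test FORMAT '\\testdb\orcl\%U';              # shared folder on test machine
RMAN> BACKUP TABLESPACE orion_test FORMAT '\\blade5\mis\backup\RMAN\%U';   # shared SAN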

Below is the spooled output of the test cases:

Spooling started in log file: \\Testdb\orcl\orcl_TEST_bkp.log
Recovery Manager: Release 9.2.0.8.0 - Production

RMAN>

List of Backups
===============
Key     TY LV S Device Type Completion Time #Pieces #Copies Tag
------- -- -- - ----------- --------------- ------- ------- ---
71      B  F  A DISK        02-MAR-09       9       1       TAG20090302T114822
72      B  A  A DISK        02-MAR-09       1       1       TAG20090302T122345
73      B  F  A DISK        02-MAR-09       1       1
74      B  F  A DISK        03-MAR-09       9       1       TAG20090303T123712
75      B  A  A DISK        03-MAR-09       1       1       TAG20090303T131419
76      B  F  A DISK        03-MAR-09       1       1

RMAN>
RMAN>
Starting backup at 03-MAR-09
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=65 devtype=DISK
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
input datafile fno=00016 name=D:\ORACLE\ORADATA\ORCL\ORION_TEST.DBF
channel ORA_DISK_1: starting piece 1 at 03-MAR-09
channel ORA_DISK_1: finished piece 1 at 03-MAR-09
piece handle=D:\ORACLE\BACKUP\ORIONTEST_37K90DHC_1_1_20090303 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:04:55
Finished backup at 03-MAR-09

Starting Control File and SPFILE Autobackup at 03-MAR-09
piece handle=D:\ORACLE\ORA92\DATABASE\C-1032853409-20090303-02 comment=NONE
Finished Control File and SPFILE Autobackup at 03-MAR-09

RMAN>
RMAN>
Starting backup at 03-MAR-09
using channel ORA_DISK_1
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
input datafile fno=00016 name=D:\ORACLE\ORADATA\ORCL\ORION_TEST.DBF
channel ORA_DISK_1: starting piece 1 at 03-MAR-09
channel ORA_DISK_1: finished piece 1 at 03-MAR-09
piece handle=\\TESTDB\ORCL\ORIONTEST_39K90DTF_1_1_20090303 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:02:25
Finished backup at 03-MAR-09

Starting Control File and SPFILE Autobackup at 03-MAR-09
piece handle=D:\ORACLE\ORA92\DATABASE\C-1032853409-20090303-03 comment=NONE
Finished Control File and SPFILE Autobackup at 03-MAR-09

RMAN>
RMAN>
Starting backup at 03-MAR-09
using channel ORA_DISK_1
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
input datafile fno=00016 name=D:\ORACLE\ORADATA\ORCL\ORION_TEST.DBF
channel ORA_DISK_1: starting piece 1 at 03-MAR-09
channel ORA_DISK_1: finished piece 1 at 03-MAR-09
piece handle=\\BLADE5\MIS\BACKUP\RMAN\ORIONTEST_3BK90E8G_1_1_20090303 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:26:25
Finished backup at 03-MAR-09

Starting Control File and SPFILE Autobackup at 03-MAR-09
piece handle=D:\ORACLE\ORA92\DATABASE\C-1032853409-20090303-04 comment=NONE
Finished Control File and SPFILE Autobackup at 03-MAR-09

RMAN>
List of Backups
===============
Key     TY LV S Device Type Completion Time #Pieces #Copies Tag
------- -- -- - ----------- --------------- ------- ------- ---
71      B  F  A DISK        02-MAR-09       9       1       TAG20090302T114822
72      B  A  A DISK        02-MAR-09       1       1       TAG20090302T122345
73      B  F  A DISK        02-MAR-09       1       1
74      B  F  A DISK        03-MAR-09       9       1       TAG20090303T123712
75      B  A  A DISK        03-MAR-09       1       1       TAG20090303T131419
76      B  F  A DISK        03-MAR-09       1       1
79      B  F  A DISK        03-MAR-09       1       1       TAG20090303T144812
80      B  F  A DISK        03-MAR-09       1       1
81      B  F  A DISK        03-MAR-09       1       1       TAG20090303T145439
82      B  F  A DISK        03-MAR-09       1       1
83      B  F  A DISK        03-MAR-09       1       1       TAG20090303T150031
84      B  F  A DISK        03-MAR-09       1       1
RMAN>


Here is the summary of the test:

  • Backup tablespace size: 1.36 GB, with one datafile.
  • Backup piece size in all 3 test cases: 1.1 GB each.
  • RMAN backup time at the 3 locations:
    - Local disk (D:\oracle\backup): 04:55 minutes
    - Shared folder on test machine (\\testdb\orcl): 02:25 minutes
    - Shared SAN (\\blade5\mis\backup\RMAN): 26:25 minutes

The results clearly indicate that backing up a 1.36 GB tablespace to the SAN location is 5-6 times slower than backing it up locally (26:25 vs. 04:55 minutes). This could be because of a network issue, high CPU utilization, or heavy I/O. I have submitted the results to Oracle Support. Let's see what they have to say.

Until the issue is resolved, I am taking RMAN backups to the shared location (\\testdb\orcl).
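
The interim full backup to the share is simply (a sketch, relying on the control file autobackup already configured):

RMAN> BACKUP DATABASE FORMAT '\\testdb\orcl\%U' TAG 'interim_full';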

I will keep you posted on further developments.