August 2007

Oracle 10g – What You Should Know Before You Upgrade

A New Version

Every so often the large software vendors come out with a new version of a flagship product. Companies like Microsoft, IBM, Oracle, Adobe, etc. each come out with new versions promising the newest bells and whistles, whether we need them or not. Some of these new versions are very easy to install (or upgrade themselves) and some of them require months of planning and testing. When you start talking about upgrading an operating system or a database environment, you are suddenly talking about a completely different kind of effort for installing the latest PDF reader or compression tool. The potential impact to an organization for an upgrade gone wrong could cost hard dollars and a disruption in an ongoing production environment.

There is a difference between a report not being delivered on time and a customer having to wait a couple of seconds longer for a response, especially in the E-commerce world. How often would you go back to a web store or web travel site if it didn’t deliver a quick and accurate response to a purchase or flight inquiry? A poorly performing system or a flawed migration will cost you not only hard dollars, but can also cause damage to your image within and outside the organization.

Upgrading to a new version of a tool should only be done if several requirements are met:

  • Does the new version provide required functionality not currently available?
  • Will support for the old version be discontinued?
  • Do I have the correct, well-trained staff to support the new version?
  • Can the current continuity of the production system be guaranteed during and after the migration?

A lot of organizations want to stay on top of technology by implementing and migrating to the newest version as soon as possible; other organizations prefer to wait for the first service pack so that the initial kinks are worked out. That’s not any different for an RDBMS upgrade:

  • Microsoft just released their SQL Server 2005 about 18 months ago and you see a slow migration going to this new version from SQL Server 6.5/7/2000.
  • IBM just released version 9 of their DB2 UDB RDBMS, but most customers still rely on version 8 and its 14 fix packs.
  • Oracle released version 10g Release 2 in July of 2005 and will drop support for version 8i in December of 2007. They will drop maintenance support for Oracle 9iR2 in July of 2007 (i.e. no more patches and fixes will be provided; special extended assistance support ends July 2010), and they just announced the launch of Oracle 11g Release 1 on July 11th, 2007. Based on their release and support schedule for the last couple of releases (8iR3, 9iR1, 9iR2, 10gR1 and 10g2) they will release a new version every 18-24 months.

The Decision

Knowing that Oracle will drop support on 9iR2, should you upgrade to Oracle 10g or wait for Oracle 11g? Again, this depends on whether you are an early adopter that takes the occasional bug for granted or if you want to wait to migrate to a more stable platform. If you review the list of requirements for upgrading, it seems evident that there is a need to start preparing for at least an upgrade to Oracle 10g. Whether you want to take it up another level to Oracle 11g depends on whether or not it brings you additional features that 10g currently lacks.

A common mistake is upgrading to improve performance. As Oracle guru Tom Kyte likes to say, any change to a system can have one of three outcomes: improved performance, degraded performance, or neither. If you currently have a well tuned system, you might see an improvement by upgrading to Oracle 10g -- “might,” if you don’t implement every new feature Oracle has implemented in its newest release. A lot of new features involve security, auditing, monitoring, etc. which causes a slight overhead over normal processing. On the other side, “normal” features see a definite performance improvement. For example, the PL/SQL compiler generates more efficient code, which can significantly reduce CPU usage in certain applications. Also, the updated performance tuning and monitoring tools provide new information and insight into your database.

A system that is not well tuned will most likely continue to perform poorly after an upgrade, and in some cases will perform worse. Features like Automatic Storage Management and Automatic Memory Management lower the involvement (and level of knowledge) required from a DBA to support it, but it also makes troubleshooting problems a lot more difficult. The database acts more like a black box since more tuning is performed internally by the database itself.

In order to support a new DBMS like Oracle 10g or 11g, it should be a requirement that the Database Administrators (DBAs) are well trained before they perform the upgrade and support the environment. A DBA should not just be responsible for creating objects, managing tablespaces, and backing up/restoring databases; they need to also know the ins and outs of the new tool as well as the architecture of the DBMS and how to troubleshoot performance issues. Each organization is different, but in many cases the system administrators and architects become more responsible for overall architecture instead of the DBAs who have the knowledge and expertise for efficient architectural design.

The Infrastructure

So, you’ve decided to upgrade? This opens up a new myriad of choices because often an upgrade of Oracle goes hand-in-hand with an upgrade of the supporting infrastructure. The number of changes you make to your infrastructure will, of course, increase the complexity of your migration since you will be introducing a new set of variables to the mix. There are seldom cut and dry answers in these areas that can be found without analyzing your specific situation but the following questions will be helpful:

Should we stay on the existing Operating System?

A lot of organizations are looking into Linux as the supporting platform right now and are moving away from Solaris, AIX, and other brands of UNIX for a variety of reasons. One of the main determining factors relates to the experience and knowledge of support personnel and the extent that specific OS features are being utilized in the existing architecture. What works fine on AIX might cause an unknown performance problem on Linux due to missing OS patches or specific code in the Oracle kernel to work on a certain OS/processor. Oracle patches are specific for Oracle versions, processor types and OS versions and patch levels. In some cases it might be required that you change out your infrastructure due to the incompatibility of certain Oracle versions and Oracle tools with certain server architectures. For example, Oracle 10gR2 is supported on HP-TRU64, but Oracle Warehouse Builder for HP-TRU64 is only supported up until Oracle 10gR1. Since the HP-TRU64 was being removed from the Oracle compatibility matrix as a supported platform, one customer had to switch out their whole infrastructure to HP-UX to get the required functionality. Additionally, if there are any other tools running on the same server then each of those must also be examined for compatibility.

Should we utilize the existing storage solution?

A storage solution is almost always another bottleneck. Too often we see an Oracle installation on a RAID-5 storage system, which is fine for a read-only system but detrimental for a high-update, high-performance environment. A general statement is that the more disks you have in your storage solution that can be assigned to your Oracle environment in RAID-1+0 configurations, the better it will be for your performance. The newest storage solutions provide really high throughput using the latest technologies and self-balancing controllers. These tools spread the load over all disks avoiding so-called hot spots at the operating or storage level, and relieving the Oracle DBA or Architect from having to manually configure a layout over all disks and mount points. If you are currently in a RAID 1+0 configuration and your system analysts have determined that you are experiencing I/O latency then you will most likely continue to have this issue after the Oracle upgrade. One potential and relatively inexpensive option is to increase the amount of memory in the server itself: One client increased the memory from 2 GB per CPU to 4 GB per CPU and realized a 20% reduction in load time.

Which edition should we use?

There are really 2 editions of Oracle for corporations (Standard and Enterprise) and each of these can be placed in a Real Application Cluster (RAC) configuration. (For a complete comparison between editions visit Oracle’s site at http://www.oracle.com/database/product_editions.html.)

If you are running a medium to large data warehouse (> 1 TB) then the Enterprise version is your best candidate as it contains many features directly related to warehousing (Compression, Parallelism, OLAP, and Partitioning). The Standard edition is appropriate for OLTP applications and even for smaller staging areas in a warehouse environment.

Many people ask the question about implementing Real Application Cluster (RAC). RAC is significantly better in 10g and is now a viable option. RAC provides several advantages such as high availability and many times increased performance. However, the configuration and maintenance is more complex making it critical to consider internal resource knowledge and training before taking this approach. Additionally, unless you are moving to smaller but more numerous machines (such as several Linux boxes vs. a mid/main frame machine) then the costs may also actually be higher. One thing to keep in mind is that typically a minimum of three machines is required for this configuration. While this does provide failover capability, it does not reduce the need for a good backup/recovery plan since a fatal failure on the SAN will affect this configuration as well.

The Planning

Once the decision has been made to upgrade, a project plan needs to be put in place. A well performing system requires the involvement of both the Infrastructure group and the Database group. One group cannot do a successful migration without the other. Therefore, during the planning phase it is a requirement to involve both parties and incorporate their input, insights, and the assignments of tasks.

Depending on the infrastructure changes, resource availability, and the type of migration (in-place, or new environment), a number of tasks will need to be assigned. Here is a short list of tasks that need to be defined and assigned:

• Identify all Oracle instances to be upgraded
• Identify all applications running on these Oracle instances
• Train DBA staff on new features
• Involve resources from those applications to set up test plans
• Capture statspack results from controlled tests
• Capture statspack results from production load run
• Implement new ordered/assigned hardware
• Implement Oracle up to the latest available officially released patch level
• Verify current Oracle support level
• Backup databases from 8/9i environment, restore on 10g
• Run controlled application tests
• Capture Automated Workload Repository (AWR) information from the controlled tests (AWR is the 10g version of statspack)
• Identify issues, resolve and tune environment
• Open Service Request with Oracle for scenarios not resolved internally
• Complete testing
• Create fall-back scenario
• Migrate environment(s)
• Run production (in parallel if possible)
• Capture AWR information from production load run
• Identify issues, resolve and tune environment
• Open Service Requests for Level 1 issues.
• Decommission old environment/servers

This list is far from complete, but shows some of the activities and tasks required before upgrading. An Oracle migration is not like installing a new desktop application. The number of resources involved in an Oracle 10g upgrade should never be limited to just the DBA group. In order to guarantee a successful migration, end users need to be involved, as well as network administrators, system/OS administrators, SAN/storage administrators, vendors and Oracle support.

Resolving a perceived performance problem by end users also requires the interaction between all of these groups since it could be caused by a number of things: The application could have some non-optimized code; the database connectivity software could be not optimized for 10g (ODBC, OCI, .Net adapters, JDBC, etc); the network could have problems at the time; or the problem could be in the storage switch, SAN router, faulty disk, non- optimized RAID configuration, bad SQL query, Operating System patch missing, 32/64 bit mismatch between hardware, OS and RDBMS, etc. There are too many causes for a poorly performing system to rely on a magic solution from Oracle that will improve speed. Sometimes it will, often it won’t, and at times it can become worse.

In our experience, it is best to create a separate environment for testing, and if resources allow for it, production. We have seen cases where just switching over a database from Oracle 8/9 to an exact copy of the database on Oracle 10g using optimized 10g parameters will make an application stop working or give wrong results. In one case we noted that during an upgrade from 9i to 10g, certain queries gave incorrect results just by setting the optimizer mode from 8.1.7 to 10.2.0 or v.v. The reason for the incorrect data was the new optimizer that Oracle implemented in 9i that focuses on minimizing I/O, instead of minimizing CPU, as in older versions. The way the optimizer creates an access path translating a SQL Statement to the internal Oracle code works differently between Oracle 8 and Oracle 9/10, which in extreme rare cases causes different result sets for the same query.

In order to support an Oracle 10g environment, it should be a requirement for any size organization to train its staff on the new features of Oracle 10g, performing a migration, identifying, troubleshooting and tuning an environment. The size of your organization determines if this needs to be done by one person or a team of people. Wherever outside resources are required to participate in a migration, they should provide the technical advice and expertise to the internal resources for them to execute a migration. Depending exclusively on outside resources to execute a migration will leave an organization vulnerable when it comes time to support the new environment.

In our next article we will discuss in detail some successful migrations and which parameters and settings can cause problems or work miracles when upgrading to 10g.