Oracle 10g Optimizer Behavior for JDE EnterpriseOne

by Roger Carter, DB Connect Solutions

Background

As of Oracle 10g the Rule-Based Optimizer (RBO) has moved to an unsupported state. Although the RBO is not completely deprecated as of version 10g, its days are clearly numbered and there is one and only one path forward – the Cost-Based Optimizer (CBO).

As many battle-weary Oracle DBAs can attest, the move toward the CBO has been a long and painful process for many applications and application vendors. On the one hand the traditional RBO was extremely naïve in its approach to SQL query optimization, yet it yielded very stable and predictable results. Conversely the CBO had intricate algorithms written and rewritten over the years with the end goal being an implied silver bullet toward SQL query optimization. The implicit promise from the propeller brains at Oracle was an internal engine that could and would make intelligent decisions based on data distribution, data volume, and even the speed of the underlying hardware subsystems.

Nice goal indeed.

What many application vendors struggled with over these intervening years was that despite the advertisements to the contrary, Oracle’s CBO was inherently unstable. For many good reasons that went largely uncommunicated to us mere mortals who actually supported production systems, Oracle struggled internally with exactly how to make the damned thing work as advertised.

Think about it -- How could any algorithm yield perfection (or near perfection) when it wasn’t based on a set of explicitly-defined rules such as those present in the traditional RBO? True, it might yield fabulous results 98% of the time, but those other 2% -- wow could they be painful, if not excruciatingly so! It didn’t matter that an ERP system using the CBO ran perfectly most of the time. What mattered is that ERP systems relying on the CBO didn’t perform well some times!

Many ERP vendors resisted the push toward the CBO for this very reason. It was better to adjust to “acceptable” performance using the RBO approach when results were predictable; it was not sufficient to accept even minimal performance problems using the CBO approach. As all DBAs know quite well, end users don’t particularly like to wait 36-48 hours for month-end reports, yet they can adjust to it if the results are predictable. The worst possible scenario, one which leads normally slack-jawed quiet users to mutate into homicidal DBA killers, is a set of standard reports that run in 5 minutes one month and 5 hours the next month. When the DBA says “But the database is fine!” that’s um, sort of a lie. And it makes excuses for instability introduced by the CBO.

Yet the move toward the CBO wasn’t and still isn’t a matter of debate. It’s sort of like a Microsoft edict with a database twist from Sir Larry Ellison himself:

Thou shalt embrace the Cost-Based Optimizer or die trying!

This author is fairly certain a number of DBAs have gone ahead and died, perhaps by their own Oracle OCP hands, when they couldn’t get the cost-based optimizer to well, optimize. The purpose of this whitepaper is to help prevent the same fate for a few of my JDE DBA comrades. For the non-JDE DBAs, well, you are what you eat I guess.

Cost-Based Optimizer: Oracle 9i per chance to 10g?

In anticipation of the move to a universal CBO paradigm, most major packages, including JDE EnterpriseOne, have transitioned to the Cost-Based Optimizer (CBO) as of Oracle 9i. This required that Oracle statistics be kept up-to-date so the CBO could produce optimal (and consistent) execution paths. If the CBO did not have accurate statistics, performance would suffer and the JDE DBA would have to go into hiding or do a little work.

This meant the DBA had to run scripts to periodically refresh statistics. The problem with this process is that the statistics gathering process was typically an “all-or-nothing” approach, meaning that statistics were refreshed for tables whether they needed to be refreshed or not. Some DBAs took advantage of a new feature in 9i called table monitoring that allowed the Oracle database to track which table statistics needed to be refreshed, but this required even more scripting to accomplish. More work (job security) for the Oracle DBA.

As of Oracle 10g, however, the process of identifying which statistics to refresh as well as the statistics refresh process itself, is completely automated (as long as the init.ora parameter STATISTICS_LEVEL is set to either TYPICAL or ALL). The internal Oracle Scheduler comes predefined with a job named GATHER_STATS_JOB which automatically refreshes statistics on tables whose statistics have gone stale. Less work (job insecurity) for the Oracle DBA.

The CBO in Oracle 9i was a small step forward in the new optimization paradigm because the default behavior of statistics collection was not to create histograms (a mechanism for profiling the data distribution within a table). This meant that the CBO in Oracle 9i was a little like the RBO with a twist of CBO thrown in.

But Oracle 10g became a CBO leap of faith. As of Oracle 10g, the default behavior is to collect histograms when table statistics are generated. (This is controlled by the method_opt parameter of the statistics collection mechanism.) Histograms provide the CBO with more detailed information about the data distribution within the columns of data in each table, and therefore allow the CBO to make an even more thorough analysis regarding the proper execution path for a particular query.

However, therein lies the CBO rub. When the CBO has detailed data distribution statistics available, it can do really stupid things to JDE Oracle queries.

This is because the presence of histograms can lead to performance instability for certain queries that use bind variables against columns containing skewed data. To understand this anomaly and to formulate a strategy to address it, we must delve deeper into the CBO and understand the concept of bind variable peeking.

Peekaboo! Your Bind Variables Have Been Peeked

In Oracle 9i the CBO introduced the concept of bind peeking. This is essentially a step performed during the hard parse of a SQL statement. (A hard parse occurs when the SQL statement does not exist in the shared pool.)

When parsing the SQL statement, the CBO “peeks” at the actual values of the bind variables as the SQL statement is loaded into the shared pool. The CBO uses these actual bind variable values to make a decision on the execution path to utilize for the statement. If the values within the column(s) corresponding to the bind variables are uniformly distributed, the result of the bind peeking doesn’t particularly matter regardless of the actual bind variable value. (Think of a statement such as SELECT * FROM EMPLOYEE WHERE EMPID = :1; the CBO will always use the primary key index on the EMPID column regardless of which employee id is used in the first execution of the statement.)

However, if the values within the column(s) corresponding to the bind variables are not uniformly distributed, the result of the bind peeking can lead to interesting and sometimes unexpected results. (Think of a statement such as SELECT * FROM EMPLOYEE WHERE STATE_CD = :1 where nearly all employees live in New Jersey. If there’s an index on STATE_CD and the very first execution of the statement is specified as SELECT * FROM EMPLOYEE WHERE STATE_CD = ‘AK’ the bind peeking would lead the CBO to choose the index, and the result would be an efficient execution for the first execution. After the first execution, the SQL statement and the execution path are now cached in the shared pool. If a subsequent execution of the statement is specified as SELECT * FROM EMPLOYEE WHERE STATE_CD = ‘NJ’ the index would be used again, even though a full table scan would be much more efficient.)

Within JDE, this bind peeking performance anomaly is seen quite frequently. It is typically seen on large tables, especially the F0911, F42119, and sometimes the F4211. It can impact UBEs, especially jobs that process the entire table a set of data at a time (for instance, processing the entire F0911 by a column that has a relatively small set of values). It can also be seen in interactive queries depending on the search criteria and the specific values chosen by the end user. At times it can seem to “come and go” depending on how long the offending SQL statement is cached in the shared pool. (This depends on the size of the shared pool, how frequently the SQL statement is run, and how much dynamic/un-shareable SQL is issued to the database.)

This issue was not as prevalent in Oracle 9i because the creation of histograms was not the default behavior. Because the CBO did not have detailed knowledge of data distribution in 9i, the risk of bind peeking anomalies was much lower. At the same time, the CBO was not as intelligent in Oracle 9i.

Recommendations for running JDE with the CBO

  • Do not turn off the Cost-Based Optimizer because you just can’t fight City Hall.
  • Although it is possible to disable bind peeking with a hidden init.ora parameter, this is not yet recommended. There may be some future guidance on this issue from Oracle or other know-it-alls like this author.
  • Do not turn off histograms collection across the board. Overall, the presence of histograms helps more than harms performance. Individual performance anomalies must be addressed instead.
  • It may be beneficial, however, to test the removal of histograms on individual tables, for example the F4211 and F42119. (If histograms are removed manually they will be subsequently recreated via the GATHER_STATS_JOB unless this is overridden via exec dbms_stats.set_param(‘method_opt’,’FOR ALL COLUMNS SIZE REPEAT’
  • Use the default GATHER_STATS_JOB provided in Oracle 10g. However, do not use the default sample size of DBMS_STATS.AUTO_SAMPLE_SIZE which can lead to poor histograms on skewed columns. It is strongly recommended that at least 30% be used for the default sample size. This can be achieved by setting the default via the command exec dbms_stats.set_param('estimate_percent',30)
  • In Oracle 10g the recalculation of statistics using DBMS_STATS will no longer automatically invalidate dependent cursors as happened in Oracle 9i. (This is because the no_invalidate parameter of DBMS_STATS defaults to DBMS_STATS.AUTO_INVALIDATE.) This means that Oracle will decide when to invalidate cursors so this behavior must be kept in mind when trying to force a new execution plan by recalculating statistics.
  • Because statistics collection is drastically different in Oracle 10g versus Oracle 9i it may be beneficial to transition to the new statistics mechanisms after upgrading to Oracle 10g. This is discussed in detail in Metalink Note 465787.1 “Managing CBO Stats during an upgrade from 9i to 10g”.
  • Individual queries that suffer from unstable execution plans can be stabilized via the creation of stored outlines or SQL profiles.
  • Despite all best efforts, occasionally new Oracle statistics will cause the CBO to choose an entirely new execution path, resulting in poor performance. Oracle has provided a mechanism to “rollback” changes made to statistics via the DBMS_STATS.RESTORE_TABLE_STATS package.

Summary

From an Oracle database perspective, the Cost-Based Optimizer is here to stay. Despite this author’s many rantings about its instability, it is possible to manage the CBO quite effectively in an Oracle JDE setting. This involves an understanding of the changes to the CBO between Oracle 9i and Oracle 10g, and it likewise involves the Oracle DBA understanding the performance profile of the underlying application. This is all the more important as the size of Oracle JDE databases scales over time. As Oracle has demonstrated time and again, nothing ever stays the same.

About the Author:

Roger Carter of DBConnect Solutions, Inc. has over 20 years of experience in the Information Technology arena working as an Oracle DBA for nearly all of his career. He brings a world-class approach to customer service that is lacking in the Oracle DBA arena in general and more specifically within the realm of ERP implementations. Roger is highly-regarded as a subject-matter expert in the area of Oracle tuning for JDE, with many years of experience managing and tuning multi-terabyte Oracle databases.
Roger is the founder and past president of the Iowa Oracle Users Group and has recruited, trained, and mentored dozens of Oracle DBAs over the course of his career, many of whom are now highly-regarded in their own right.

 del.icio.us  Stumbleupon  Technorati  Digg 

 

What did you think of this article?




Trackbacks
  • No trackbacks exist for this entry.
Comments
  • No comments exist for this entry.
Leave a comment

Submitted comments will be subject to moderation before being displayed.

 Enter the above security code (required)

 Name

 Email (will not be published)

 Website

Your comment is 0 characters limited to 3000 characters.