HomeOracle Page 4 - Developing and Implementing Applications
Go Atomic - Oracle
This article, the first of three parts, focuses on the design and creation of applications that use the database. It is excerpted from chapter five of the book Oracle Database 10g DBA Handbook, written by Kevin Loney and Bob Bryla (McGraw-Hill/Osborne, 2005; ISBN: 0072231459).
You can use SQL to combine many steps into one large query. In some cases, this may benefit your application—you can create stored procedures and reuse the code and thus reduce the number of database trips performed. However, you can take this too far, creating large queries that fail to complete quickly enough. These queries commonly include multiple sets of grouping operations, inline views, and complex multirow calculations against millions of rows.
If you are performing batch operations, you may be able to break such a query into its atomic components, creating temporary tables to store the data from each step. If you have an operation that takes hours to complete, you almost always can find a way to break it into smaller component parts. Divide and conquer the performance problem.
For example, a batch operation may combine data from multiple tables, perform joins and sorts, and then insert the result into a table. On a small scale, this may perform satisfactorily. On a large scale, you may have to divide this operation into multiple steps:
Create a work table. Insert rows into it from one of the source tables for the query, selecting only those rows and columns that you care about later in the process.
Create a second work table for the columns and rows from the second table.
Create any needed indexes on the work tables. Note that all the steps to this point can be parallelized—the inserts, the queries of the source tables, and the creation of the indexes.
Perform the join, again parallelized. The join output may go into another work table.
Perform any sorts needed. Sort as little data as possible.
Insert the data into the target table.
Why go through all these steps? Because you can tune them individually, you may be able to tune them to complete much faster individually than Oracle can complete them as a single command. For batch operations, you should consider making the steps as simple as possible. You will need to manage the space allocated for the work tables, but this approach can generate significant benefits to your batch-processing performance.
Eliminate Unnecessary Sorts
As part of the example in the preceding section, the sort operation was performed last. In general, sort operations are inappropriate for OLTP applications. Sort operations do not return any rows to the user until the entire set of rows is sorted. Row operations, on the other hand, return rows to the user as soon as those rows are available.
Consider the following simple test: Perform a full table scan of a large table. As soon as the query starts to execute, the first rows are displayed. Now, perform the same full table scan but add an order by clause on an unindexed column. No rows will be displayed until all the rows have been sorted. Why does this happen? Because for the second query Oracle performs a SORT ORDER BY operation on the results of the full table scan. Because it is a set operation, the set must be completed before the next operation is performed.
Now, imagine an application in which there are many queries executed within a procedure. Each of the queries has an order by clause. This turns into a series of nested sorts—no operation can start until the one before it completes.
Notethat union operations perform sorts. If it is appropriate for the business logic, use a union all operation in place of a union, because a union all does not perform a sort (because it does not eliminate duplicates).
A union all operation does not eliminate duplicate rows from the result set, so it may generate different results than a union.
Eliminate the Need to Query Undo Segments
When performing a query, Oracle will need to maintain a read-consistent image of the rows queried. If a row is modified by another user, the database will need to query the undo segment to see the row as it existed at the time your query began. Application designs that call for queries to frequently access data that others may be changing at the same time force the database to do more work—it has to look in multiple locations for one piece of data. Again, this is a design issue. DBAs may be able to configure the undo segment areas to reduce the possibility of queries encountering errors, but correcting the fundamental problem requires a change to the application design.
Tell the Database What It Needs to Know
Oracle’s optimizer relies on statistics when it evaluates the thousands of possible paths to take during the execution of a query. How you manage those statistics can significantly impact the performance of your queries.
Keep Your Statistics Updated
How often should you gather statistics? With each major change to the data in your tables, you should reanalyze the tables. If you have partitioned the tables, you can analyze them on a partition-by-partition basis. As of Oracle Database 10g, you can use the Automatic Statistics Gathering feature to automate the collection of statistics. By default, that process gathers statistics during a maintenance window from 10 P.M to 6 A.M. each night and all day on weekends.
Because the analysis job is usually a batch operation performed after hours, you can tune it by improving sort and full table scan performance at the session level. If you are performing the analysis manually, increase the settings for the DB_FILE_MULTIBLOCK_READ_COUNT and PGA_AGGREGATE_TARGET parameters within your session prior to gathering the statistics. If you are not using PGA_AGGREGATE_TARGET, increase SORT_AREA_SIZE instead. The result will be enhanced performance for the sorts and full table scans the analysis performs.
Hint Where Needed
In most cases, the cost-based optimizer (CBO) selects the most efficient execution path for queries. However, you may have information about a better path. You may give Oracle a hint to influence the join operations, the overall query goal, the specific indexes used, or the parallelism of the query.
Maximize the Throughput in the Environment
In an ideal environment, there is never a need to query information outside the buffer cache; all of the data stays in memory all of the time. Unless you are working with a very small database, however, this is not a realistic approach. In this section, you will see guidelines for maximizing the throughput of the environment.
Use Disk Caching
If Oracle cannot find the data it needs in the buffer cache or PGA, it performs a physical read. But how many of the physical reads actually reach the disk? If you use disk caching, you may be able to prevent 90 percent or more of the access requests for the most-needed blocks. If the database buffer cache hit ratio is 90 percent, you are accessing the disks 10 percent of the time—and if the disk cache prevents 90 percent of those requests from reaching the disk, your effective hit ratio is 99 percent. Oracle’s internal statistics do not reflect this improvement; you will need to work with your disk administrators to configure and monitor the disk cache.
Use a Larger Database Block Size
There is only one reason not to use the largest block size available in your environment for a new database: if you cannot support a greater number of users performing updates and inserts against a single block. Other than that, increasing the database block size should improve the performance of almost everything in your application. Larger database block sizes help keep indexes from splitting levels and help keep more data in memory longer. If you are experiencing buffer busy waits during inserts, increase the settings for the freelists parameter setting at the object level (if you are using Automatic Segment Space Management, the freelists parameter does not apply).