HomeOracle Page 5 - Developing and Implementing Applications
Store Data Efficiently at the Block Level - Oracle
This article, the first of three parts, focuses on the design and creation of applications that use the database. It is excerpted from chapter five of the book Oracle Database 10g DBA Handbook, written by Kevin Loney and Bob Bryla (McGraw-Hill/Osborne, 2005; ISBN: 0072231459).
Oracle stores blocks of data in memory. It is in your best interest to make sure those blocks are as densely packed with data as possible. If your data storage is inefficient at the block level, you will not gain as much benefit as you can from the caches available in the database.
By default, the pctused parameter is set to 40 for all database blocks, and pctfree is set to 10. If you use the defaults, then as rows are added to the table, they will also be added to a block until that block is 90 percent full; at that point the block will be removed from the “freelist” and all new inserts will use other blocks in the table. Updates of the rows in the block will use the space reserved by the pctfree setting. Rows may then be deleted from the block, but the block will not be added back to the freelist until the space usage within the block drops below the pctused setting. This means that in applications that feature many delete and insert operations of rows, it is common to find many blocks using just slightly above the pctused value of each block. In that case, each block is just over 40 percent used, so each block in the buffer cache is only that full—resulting in a significant increase in the number of blocks requested to complete each command. If your application performs many delete and insert operations, you should consider increasing pctused so the block will be re-added to the freelist as quickly as possible.
NOTE
If you use Automatic Segment Space Management, the pctused setting is managed by the database.
If the pctfree setting is too low, updates may force Oracle to move the row (called a migrated row). In some cases, row chaining is inevitable, such as when your row length is greater than your database block size or when you have more than 254 columns in a table. When row chaining and migration occur, each access of a row will require accessing multiple blocks, thus impacting the number of logical reads required for each command. You can detect row chaining by analyzing the table and then checking its statistics via USER_TABLES.
Design to Throughput, Not Disk Space
Take an application that is running on eight 9GB disks and move it to a single 72GB disk. Will the application run faster or slower? In general, it will run slower because the throughput of the single disk is unlikely to be equal to the combined throughput of the eight separate disks. Rather than designing your disk layout based on the space available (a common method), design it based on the throughput of the disks available. You may decide to use only part of each disk. The remaining space on the disk will not be used by the production application unless the throughput available for that disk improves.
Avoid the Use of the Temporary Segments
Whenever possible, perform all sorts in memory. Any operation that writes to the temporary segments is potentially wasting resources. Oracle uses temporary segments when the SORT_ AREA_SIZE parameter (or PGA_AGGREGATE_TARGET, if it is used) does not allocate enough memory to support the sorting requirements of operations. Sorting operations include index creations, order by clauses, statistics gathering, group by operations, and some joins. As noted earlier in this chapter, you should strive to sort as few rows as possible. When performing the sorts that remain, perform them in memory.
Favor Fewer, Faster Processors
Given the choice, use a small number of fast processors in place of a larger number of slower processors. The operating system will have fewer processing queues to manage and will generally perform better.
Divide and Conquer Your Data
If you cannot avoid performing expensive operations on your database, you can attempt to split the work into more manageable chunks. Often you can severely limit the number of rows acted on by your operations, substantially improving performance.
Use Partitions
Partitions can benefit end users, DBAs, and application support personnel. For end users, there are two potential benefits: improved query performance and improved availability for the database. Query performance may improve because of partitionelimination. The optimizer knows what partitions may contain the data requested by a query. As a result, the partitions that will not participate are eliminated from the query process. Because fewer logical and physical reads are needed, the query should complete faster.
NOTE
The Partitioning Option is an extra-cost option for the Enterprise Edition of the database software.
The availability improves because of the benefits partitions generate for DBAs and application support personnel. Many administrative functions can be performed on single partitions, allowing the rest of the table to be unaffected. For example, you can truncate a single partition of a table. You can split a partition, move it to a different tablespace, or switch it with an existing table (so that the previously independent table is then considered a partition). You can gather statistics on one partition at a time. All these capabilities narrow the scope of administrative functions, reducing their impact on the availability of the database as a whole.
Use Materialized Views
You can use materialized views to divide the types of operations users perform against your tables. When you create a materialized view, you can direct users to query the materialized view directly or you can rely on Oracle’s query rewrite capability to redirect queries to the materialized view. As a result, you will have two copies of the data—one that services the input of new transactional data, and a second (the materialized view) that services queries. As a result, you can take one of them offline for maintenance without affecting the availability of the other. Also, the materialized view can pre-join tables and pre-generate aggregations so user queries perform as little work as possible.
Use Parallelism
Almost every major operation can be parallelized—including queries, inserts, object creations, and data loads. The parallel options allow you to involve multiple processors in the execution of a single command, effectively dividing the command into multiple smaller coordinated commands. As a result, the command may perform better. You can specify a degree of parallelism at the object level and can override it via hints in your queries.