Tuning Your Database for Availability

When dealing with databases, uptime is not the only issue; if the database is up but slow, the end user will still be unhappy. Fortunately, there are ways to tune an Oracle database to deal with this issue, which is the subject of this article. It is excerpted from chapter three of Oracle Database 10g High Availability with RAC, Flashback and Data Guard, written by Matthew Hart and Scott Jesse (McGraw-Hill/Osborne, 2004; ISBN: 0072254289).

When we talk about high availability, we must take into account more than just the basic question of  “is the database up?” Much more than just that simple question is at stake. What really matters is the perceptions of the users. If the database is up but responsiveness of the database is poor, then from the end user perspective, uptime is affected. This is not a revelation to anyone, but it bears repeating, as this is the premise of this chapter, and the reason for its existence within the covers of a high-availability book. Oracle Database 10g has added numerous tuning enhancements, as well as manageability enhancements that will make it easier to detect bottlenecks in the database, and to alleviate those bottlenecks, sometimes automatically.

In this chapter, we will discuss some of these key features, including a better understanding of what they are, and how to set them up to enable the HA DBA to take full advantage of them. Since manageability is a key component of this as well, we will also base many of our examples on the Oracle Database 10g Enterprise Manager. This latest version of EM is web-based, and is greatly enhanced to make identifying and fixing bottlenecks easier and more intuitive than ever before.

Intelligent Infrastructure

Oracle Database 10g has made great strides in utilizing the information that the database knows about its own inner workings, and turning that knowledge into an infrastructure that will allow the HA DBA to easily and automatically take advantage of that information. This intelligent infrastructure, as termed by Oracle, begins with the Automatic Workload Repository. The Workload Repository gathers information on activity in the database at regular intervals. Each run of the Workload Repository is followed by an automatic run of The Automatic Database Diagnostic Monitor (ADDM), which interprets the information supplied by AWR and makes recommendations. One can dig deeper into these recommendations by then manually running one of several advisors that Oracle has provided. In addition, an active session history, or ASH, is maintained to track all events that sessions are waiting for, to allow real-time information gathering. Altogether this is referred to as the intelligent infrastructure. Combined, these tools give the HA DBA more immediate access than ever before into the inner workings of the Oracle RDBMS.

MMON Background Process

The MMON background process is a new background process introduced in Oracle Database 10g, responsible for the automatic monitoring that goes on within the database. MMON is responsible for kicking off the Automatic Database Diagnostic Monitor at the end of each AWR run, to check for bottlenecks or problems identified within the report. In addition, MMON monitors the ALERT_QUE, a queue owned by sys, once a minute, for any new information/alerts. MMON can be used to either automatically take a corrective action or to send an alert via the ALERT_QUE. These alerts are then displayed on the database home page in Enterprise Manager. Additional subscribers (such as third-party or home-grown tools) can be added to the ALERT_ QUE using the DBMS_AQADM.ADD_SUBSCRIBER procedure, allowing alerts to be sent to multiple locations. If the ALERT_QUE is unavailable for some reason, alerts will go to the alert.log for the instance.

AWR: Automatic Workload Repository

We begin by looking at one of the key components of these enhancements in Oracle Database 10g—the Automatic Workload Repository, or AWR. For those of you who are familiar with StatsPack from previous releases, AWR will have a familiar look and feel to it. For those who are new to Oracle, or perhaps did not take advantage of StatsPack in previous releases, we will begin with an overview of how it works. Later in this section, we will talk about what is new between AWR and the older versions of StatsPack.

What Is AWR?

AWR is essentially a job-based scheduled collection of statistics, gathered and stored in the Oracle database, containing information about the database itself. This metadata repository by default is kept in a new tablespace in Oracle Database 10g, called SYSAUX. The script catawr.sql creates the Workload Repository, with objects owned by sys. Workload Repository object names are prefaced with WRH$, WRI$, or WRM$. When creating a new database in Oracle Database 10g, the Workload Repository schema and the job to gather statistics are automatically created as part of the database creation. By default, statistics collection will run every 60 minutes, gathering information from within the database on stats such as I/O waits and other wait events, CPU used per session, sorts, I/O rates on various datafiles, and so forth. In order for this to occur, the parameter STATISTICS_LEVEL must be set to ALL or TYPICAL (TYPICAL being the default). At the end of a run, the repository is updated with current information for that period. Reports can then be generated, using begin and end values corresponding to previous Workload Repository runs, to determine what was happening within the database during a given period of time.

Since AWR runs are scheduled by default when the database is created, nothing special need be done to enable this functionality. As mentioned above, AWR runs at 60-minute intervals, collecting stats and storing that information. However, as you can imagine, this can lead to a rather large repository over time, so by default, the data will be purged after seven days. Should you wish to change these defaults—either the frequency of the runs, or how soon the repository is purged—it is easy enough to do so using Enterprise Manager. Should you wish to disable AWR altogether, you may do that as well; however, we strongly recommend that you do not do this. The overhead for AWR is minimal, and the gains achieved by allowing the statistic gathering can come in handy at the most unexpected times.

Command-Line API for AWR

As mentioned in the previous section, the functionality of the Workload Repository can best be utilized within Enterprise Manager (EM). Thus, the majority of examples in this chapter will be EM-centric. However, for those of you who are command-line junkies, the API for the Workload Repository is a new one, aptly named DBMS_ WORKLOAD_REPOSITORY. This can be used to create snapshots, drop snapshots, build baselines, and so forth. We will not be delving into the details of the DBMS_ WORKLOAD_REPOSITORY package, but for those of you who are so inclined (or are desperately in need of some heavy reading), please refer to the Oracle Database 10g Release 1 Guide titled  “PL/SQL Packages and Types Reference,” which is part of the Oracle documentation set.

Viewing an AWR Report

OK, so the Workload Repository is populated every 60 minutes with statistics, and this happens automatically. That’s nice. Now what? What do I do with this information? The answer is that you define times outlined by AWR runs, in which to turn this information into a report. When creating an AWR report, you define the beginning and ending interval for the reports using snapshot IDs from the Workload Repository. Your report will begin at one snapshot and end at a later snapshot.

An AWR report can be viewed in Enterprise Manager, using the administration page. From there, select Automatic Workload Repository, from under the Workload section, and then click on the underlined number next to Snapshots. Note that here you can also change the interval for AWR—we will cover these steps in more detail in an HA Workshop later in this section. As mentioned previously, you can determine the interval of the reports, with the beginning of the interval going back as far as the repository keeps the data. Reports can be viewed using a full report view in HTML format, or you can view the data with pertinent information summarized in the details view. If you prefer the old text look of the reports, as in the Oracle9i and Oracle8i StatsPack Reports, then a text report must be generated using the AWR_REPORT_ TEXT function of the DBMS_WORLOAD_REPOSITORY command-line API.

Interpreting the Workload Repository Report Output

The report, when viewed in its full form, will begin with a heading, which provides the DB information such as DB_Name, DBID, instance_name, and so on. The begin and end snap times will also be shown, with the total elapsed time for the report displayed in minutes. This interval should not be too long, or else the results will be diluted. The default interval of 60 minutes is usually a good starting point, but when troubleshooting a specific issue, you may want to lower that interval to 30 minutes to get more granularity in your data. After the heading information, the report is then broken out into several sections, covering various different types of statistics of value.

When reviewing the report, the Report Summary section at the top will give a quick overview of the database performance. The ending cache sizes will be shown, as well as the load profile, instance efficiency percentages, shared pool statistics, and the top five wait events. This report summary gives a quick look at the system, allowing an immediate review of the most common sources of bottlenecks. The Instance Efficiency section gives a quick look at the buffer and library cache hit ratios, sorts, parsing information, and so on, and allows you to immediately see if these areas of memory need to be increased. The load profile gives you an overall picture of how much of certain types of activity is going on in the database, such as physical reads and writes, sorts, parses, logons, and so forth. The top five timed events tell us where we are spending most of our time waiting, and are also categorized now into wait classes, which we will discuss later in this chapter. The wait classes give the HA DBA another opportunity for a quick look at the situation. For example, if the top five wait events are all I/O related, it can be made readily apparent by showing that all five events are part of the same wait class.

RAC Statistics

If you are running in a RAC environment, the next section will contain a quick look at RAC statistics to give you an idea of the overall performance of your cluster. This section provides information on the workload characteristics of your cluster, and gives insight into how much traffic is going across the interconnect, how frequently instances need to read from the remote cache vs. the local cache, and what the average times are for these operations.

Main Report

After the top sections, the main report body will follow. Here you will have links to various other sections of the report that are of interest, including sections on SQL statements ordered by various stats such as buffer gets, reads, executions, file and tablespace I/O stats, SGA stats, and so forth. These sections will help pinpoint specific areas that are common bottlenecks, by highlighting poorly tuned or frequently executed SQL, and/or by bringing to light I/O hot spots where disk response time is not acceptable. Each report contains a virtual goldmine of information on what is happening within the database, and what may be causing slowdowns at various stages of operation. Our purpose here is not to discuss each individual section, but rather to provide an overview of how to gather and review the reports. Automating the interpretation of these reports via ADDM will be discussed later in this chapter.

{mospagebreak title=Creating Baselines for Comparing the Workload}

Evaluating the reports is also made easier if there is a comparison point. We may suspect, as we evaluate sections of the AWR report, that something does not look right—the top SQL may show SQL statements with an excessive number of buffer gets, as an example. It would be invaluable to be able to compare the report to a similar point in time in the past, when similar work was being done and performance was better. For example, perhaps the same SQL was run at points in the past, but the buffer gets for that same statement were much lower. This might indicate that an index has been dropped, or the data distribution of the tables queried has somehow changed.

A baseline (also called a Preserved Snapshot Set) is used to have a comparison point to refer back to if and when performance suffers. The basic idea is that on a Monday morning (or any other time), when out of the blue the HA DBA is barraged with users advising that the database performance is less than exemplary, the first thing that he or she can do is compare the current performance (or, say, the performance over the past hour) to a baseline that was previously set up for that time frame. This will give the HA DBA an immediate idea of what is different from last week (or whatever point in the past you are comparing to) and a head start on where to look for the solution to the problem.

Periods for Baseline Creation

As you can probably tell already, it is necessary to anticipate periods when you might expect to have performance problems. For this, you must have an understanding of the nuances of your particular business. For example, you may experience different peaks at different times of the day—a busy period may occur between 8:30 A.M. and 1:00 P.M., and another peak from 2:30 P.M. to 4:30 P.M. Fortunately, it is possible to create multiple baselines, each covering different points in time. Therefore, you may decide to create a different, separate baseline for each of those periods.

On the other hand, it is difficult to create a baseline for every possible period. If performance problems occur with no appropriate baseline having been created, don’t fret. Depending on how long the retention time is for the Workload Repository, you can go back and create the baseline from a similar time frame in the past. For that reason, we recommend that you increase the workload retention time from the default of seven days. You may want to consider two weeks, or even a value of just over a month, depending on your storage capacity and the overall impact this has on your system. Keep in mind that this information will be stored in the SYSAUX tablespace, so size the SYSAUX tablespace accordingly.

Consider our scenario at the beginning of this section. Let’s say the DBA did not have a baseline for the early Monday morning period when performance was bad. Let’s assume that the problem began at 5:30 A.M. In this shop’s line of work, Monday mornings are rather unique, but it is unusual for the system to be loaded this early. The DBA would like to compare this morning’s workload to last week at the same time. However, with the default retention time of seven days, stats from last Monday at 5:30 A.M. would have already been purged from the database. If the retention time were set to eight days, or fourteen days, the DBA could go back in time in the Workload Repository, create a quick baseline covering last Monday morning, and then compare the current day’s snapshot to last week at the same time.

Comparing Current Stats to Baseline 

The advantage of creating baselines is twofold. First, the baseline data remains in the repository indefinitely, until it is explicitly deleted, so it is not subject to being aged out like the rest of the snapshots. This means that the data will still be there in a month to refer back to and compare. Hence the term “Preserved Snapshot Set.” The second advantage is the ability to quickly run comparisons of a snapshot to a baseline.

It is easiest to do the comparison using EM. From the Workload Repository section, you can select the option to Compare Timelines. This will allow you to first select the time frame that you wish to compare (presumably a period of time with poor performance), and then you can select a preserved snapshot set (or any other period of time within your retention range) to compare that to. The output will give you such fodder as comparison of block change rates for each period, physical reads, parse time, user commits, and so on. The display will show the timelines side by side, in graphical format, allowing you to quickly see the differentiators between the two timelines for the given periods.

———————————————————————-
HA Workshop: Exploring the Workload Repository

Workshop Notes

The Workload Repository is best understood by using Enterprise Manager, giving a graphical view of the snapshots available and masking the complexity of the APIs used. This example walks through using EM to view reports, edit the retention policies, and create baselines for comparison.

Step 1.  Log in to EM using the following URL, substituting in the hostname of the database server for <server_name>:

http://<server_name>:5500/em

If using Enterprise Manager Grid Control (discussed in Chapter 5), use the machine name for the management repository database, with port 7777, as such:

http://<repository_server>:7777/em

Navigate to the Workload Repository by clicking on the Administration link for your particular database, and then, under Workload, click the Automatic Workload Repository link.

Step 2.  Note the Snapshot Retention time (in days), the Snapshot Interval, and the Collection Level, under the General section. Click on Edit to change these. Set the Retention Period value to 30 days, and the System Snapshot Interval to 30 minutes, as shown in Figure 3-1, and then click OK.

Step 3.  Back on the main Workload Repository page, in the Snapshots section, click on the underlined number next to Snapshots. This link will propel you into the Snapshots screen, where you can view the times of all AWR snapshots that currently reside in the repository.

Step 4.  To view a report, navigate to the box titled Actions and change the selection to View Report. Click on the radio button next to an existing snap ID (not the last one) and then click Go. The next page will display all snapshots created


Figure 3-1.  Changing default snapshot retention time 

after the beginning snap ID that was previously selected. Select the one created directly after the beginning snap ID (this will give us a 30-minute interval), and then click OK. After a few seconds, you will see the Snapshot Details screen, with the ability to see in report format or details format. Save it by clicking on the Save to File button in the Report screen.

Step 5.  Next, to create a baseline, return to the Snapshots screen by clicking on the Snapshots link across the top. For this exercise, we will create a baseline for the time period of 9:00 A.M. to 12:00 P.M., on a weekday, so select the radio button next to an existing snapshot with a capture time as near as possible to 9:00 A.M. on a weekday. Ensure that Create Preserved Snapshot Set is selected in the Actions box, and choose Go.

Step 6.  On the Create Preserved Snapshot Set page, give the Snapshot Set a name of Weekday Morning and then select the Ending Snapshot to be the snapshot ID closest to 12 P.M. on the same day. Click OK to create the baseline (preserved snapshot set).

Step 7.  Now, to compare a report interval to the baseline you have just created, return to the Snapshots screen by clicking on the Workload Repository link and then the Snapshots link again (see Step 3). In the Actions box, select Compare Timelines and, for the beginning snap ID, choose a snapshot from 9:00 A.M. on a weekday other than the baseline you just created.

Step 8.  Choose the ending snapshot of 12:00 P.M. on that same day and choose Go.

Step 9.  On the Second Timeline Start screen, choose the Select a Preserved Snapshot Set option and then select the weekday morning baseline that was created in Step 6. Click Next, and then click Finish to get the Compare Timelines: Results screen. This screen will give you a graphical comparison between the baseline (second timeline) and the date that you had selected for comparison.


NOTE

You do not have to have a preserved snapshot set to do the comparison—you can compare intervals between any two periods for which you still have snapshots in the repository, based on your snapshot retention period.


ADDM (Automatic Database Diagnostic Monitor)

Well, after going through all of the information available in a Workload Repository report, it can seem a bit daunting to interpret. Further, you may ask yourself, geez—do I need to read through these reports every 60 minutes? Once a day? Once a week? The answer to that depends on your social life. However, the good news is that you do not have to read these reports on an hourly basis. The Automatic Database Diagnostic Monitor will do that for you.

ADDM is constantly running in the background, monitoring information that is collected in the Workload Repository. ADDM uses this data, as well as data from the active session history (ASH), and automatically analyzes this information to provide proactive recommendations on tuning and performance issues. The home screen in Enterprise Manager has an Advice section, with links to ADDM Findings, as well as an ADDM Analysis section where this information is displayed.

ADDM essentially can be used in two separate modes—what we would refer to as a proactive and a reactive mode. The proactive mode is defined by the automatic diagnosis mentioned above. Aside from these automatic ADDM checks, additional ADDM tasks can be created to look for specific problems, or to go back in time and look at a period in time when a problem was observed or reported. This is the reactive mode usage of ADDM.

{mospagebreak title=Viewing the ADDM Reports}

The ADDM runs are scheduled to run automatically at the end of each Workload Repository snapshot run—so each time a snapshot is gathered, ADDM runs automatically behind the scenes, gathering information on the last 60 minutes of activity (as defined by the system snapshot interval). From the command line, the DBMS_ADVISOR API can be run to get the stored advice from ADDM. For example, the last/most recent report can be viewed directly from SQL*Plus with the following query:

SQL> set long 1000000
SQL> set pagesize 50000
SQL> column get_clob format a80
SQL>select dbms_advisor.get_task_report(task_name) as ADDM_report
    from  dba_advisor_tasks
    where task_id = (
      select max(t.task_id)
      from dba_advisor_tasks t, dba_advisor_log l
      where t.task_id = l.task_id
       
and t.advisor_name = ‘ADDM’
        and l.status = ‘COMPLETED’);

The output of this query will look something like this:

ADDM_REPORT
——————————————–
          DETAILED ADDM REPORT FOR TASK ‘ADDM:1037082046_1_198′ WITH ID 310
          ———————————-
             Analysis Period: 10-DEC-2003 from 20:30:51 to 21:00:36
         Database ID/Instance: 1037082046/1
               Snapshot Range: from 197 to 198
                Database Time: 1119 seconds
        Average Database Load: .6 active sessions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ FINDING 1: 100% impact (7136 seconds)
————————————-
SQL statements were found waiting for row lock waits.
RECOMMENDATION 1: Application Analysis, 100% benefit (7136 seconds)
  ACTION: Trace the cause of row contention in the application logic. Use
     given blocked SQL to identify the database objects involved.
     Investigate application logic involving DML on these objects.
  RATIONALE: The SQL statement with SQL_ID 18n2c85u5p8zf was blocked on
     row locks.
     RELEVANT OBJECT: SQL statement with SQL_ID 18n2c85u5p8zf
     update emp set ename = ename
  RATIONALE: The SQL statement with SQL_ID 275zcmg41cx02 was blocked on
     row locks.
     RELEVANT OBJECT: SQL statement with SQL_ID 275zcmg41cx02
  RATIONALE: The SQL statement with SQL_ID 5dhfmtds8m5qs was blocked on
     row locks.
     RELEVANT OBJECT: SQL statement with SQL_ID 5dhfmtds8m5qs
 
RATIONALE: The SQL statement with SQL_ID 872zuyfy72zs5 was blocked on
     row locks.
     RELEVANT OBJECT: SQL statement with SQL_ID 872zuyfy72zs5
SYMPTOMS THAT LED TO THE FINDING:
  Wait class “Application” was consuming significant database time. (100%
impact [7136 seconds])
FINDING 2: 100% impact (1790 seconds)
————————————-
PL/SQL execution consumed significant database time.
 
RECOMMENDATION 1: SQL Tuning, 100% benefit (1790 seconds)
   
ACTION: Tune the PL/SQL block with SQL_ID fjxa1vp3yhtmr. Refer to the
     
“Tuning PL/SQL Applications” chapter of Oracle’s “PL/SQL User’s Guide
      and Reference”
     
RELEVANT OBJECT: SQL statement with SQL_ID fjxa1vp3yhtmr
     
BEGIN EMD_NOTIFICATION.QUEUE_READY(:1, :2, :3); END;

ADDM and Enterprise Manager

While you can do this from the command line, as we have noted previously, it is far easier to view the latest run (and any past runs of ADDM) by using Enterprise Manager. These findings can be viewed by clicking on the Advisor Central link from the database or instance home page. This link will take you directly to the Advisors page in EM, where you will see the task name associated with the last ADDM run (if you follow the link from the database home page of a cluster database, you will see the last ADDM run for each instance). To view the same output as the above query showed, ensure that the radio button next to the task is selected, and then click View Result. On the next screen, any findings will be summarized under the section Information Findings. To view the actual report, click on the View Report button.

Viewing Past Advisor Runs 

By default, the reports generated as we see above are stored in the database for a period of 30 days. Therefore, if you need (or want) to review this information reactively, you can simply go back in time to view a past ADDM run, and see the advice that was dispensed along with this report. From the command line, it is possible to get this information from the DBA_ADVISOR_TASKS and DBA_ADVISOR_LOGS views, but it is a bit more difficult than the above query. The above query simply used the max(task_id) to get the latest run. If you want to get a specific ADDM run from the past, you need to know the specific task_id. Therefore, we again recommend using Enterprise Manager for this.

Again, from the EM database home page, choose Advisor Central from the Related Links section at the bottom of the page. Under Advisory Type, choose All Types, and under Advisor Runs, choose All, and then click Go. This will pull up a summary page of all advisory runs still stored in the database, with a description, the task name, and the start and end time of each run. You can now go back to a specific point in time, and view the advice dispensed by ADDM for that time period by simply clicking on the task’s Name link, as shown in Figure 3-2.


Figure 3-2.  ADDM report as viewed in EM

What Drives ADDM?

So, what is the force behind the Automatic Database Diagnostic Monitor? What is it that makes ADDM generate a finding? Is it based on the mood that ADDM happens to be in when awakened after a 60-minute slumber? Is a “finding” precipitated by a congressional investigation, or triggered by an unfounded report on a fringe web site? Fortunately, ADDM is a bit more structured than that.

ADDM uses a combination of sources for its findings and recommendations: wait events for individual sessions; the DB time model to determine how much DB time is spent in database calls, with a focus on reducing the overall database time spent on operations; wait classes, which are high-level groupings of the more fine-grained wait events; and operating system and database metrics. Using this information stored in the Workload Repository and the active session history (ASH) lays the groundwork for ADDM to make the call to report a finding, and then come up with a subsequent recommendation.

Wait Events and Active Session History

Wait events are essentially the lifeblood of diagnosing a performance problem. In past releases, if the DBA could capture a problem while it was occurring, they could go into the V$SESSION_WAIT view and determine what it was that sessions were waiting on. This would give him or her the impetus to then move forward in resolving the issue. If we missed the event, however, we would have to either rerun it (somehow) or rely on information from StatsPack (or AWR) on what the top wait events were for that period, and make the leap of faith that these waits were corresponding to what our particular problem was. With Oracle Database 10g, there is a new view introduced called V$ACTIVE_SESSION_HISTORY, which maintains data from active sessions, capturing the wait events and their wait times, the SQL_ID, and session information for the waiting session. This allows us to then go back to the past and view this detailed information as it existed in the past. ADDM can use this for proactive and reactive analysis purposes. In addition, the view V$SESSION_WAIT_HISTORY will provide the last 10 wait events for an active session.

Wait Classes 

With Oracle Database 10g, there are now over 700 wait events, primarily due to the fact that many previous wait events have been broken down into more granular events to make diagnostics more precise. To make these wait events easier to interpret, they are categorized into wait classes, with each wait class pointing to a high-level category of problem for a particular wait event. Each wait event is assigned to a specific wait class, with the wait class pointing toward the problem and solution. The most common wait classes and their descriptions are listed here:

  1. Administrative  Commands issued by a privileged user (that is, DBA) that result in other users waiting—an index rebuild, for example.
     
  2. Application  Generally related to application design, this category includes lock waits caused by row-level locking and explicit or implicit lock commands (implicit lock commands such as those generated by a DDL statement).
  3. Cluster  Global cache, global enqueue, and global messaging–related wait events in a RAC environment.
  4. Commit  Currently only includes a wait event for redo log sync confirmation after a commit.
  5. Concurrency  Generally, waits involving concurrent parsing and buffer cache latch and lock contention; indicative of many sessions going after the same resources.
     
  6. Configuration  Wait events in this category can generally be resolved by tuning; includes undersized log buffer space, logfile sizes, buffer cache size, shared pool size, ITL allocation, HW enqueue contention, ST enqueue contention, and so on.
  7. Idle  Session is inactive.
  8. Other  Catchall for wait events not associated with one of the predefined classes.
  9. Network  Waits for data to be sent over the network—specifically waits in the networking layer or on hardware.
  10. Scheduler  Waits due to Resource Manager prioritization.
  11. System I/O  Waits on I/O done by background processes (except for MMON and SMON).
  12. User I/O  Wait for blocks to be read off disk by foreground process, SMON, or MMON.

{mospagebreak title=The DB Time Model}

The DB time model is a way of looking at performance based on the overall time spent in the database. Time-based statistics are kept in two new views in Oracle Database 10g—V$SYS_TIME_MODEL and V$SESS_TIME_MODEL. While these views contain various different time-related stats, the most significant of these is DB time. The goal of any tuning activity should be to reduce that overall time. If the overall time spent in the database, in database calls, is reduced, then the tuning has been effective. ADDM will use the DB time model as a basis for findings and recommendations. In addition, automatic tuning of the SGA (discussed later in this chapter, in “Memory Advisor and ASMM”) will take DB time into consideration when deciding where to allocate SGA resources, with the goal being to reduce overall DB time.

Advisor Central

Now that we are on the topic of the advisors, let’s go back and take a more detailed look at each of the advisors in the Advisor Central page. This page offers links to the following advisors, along with the ability to create a task for these advisors:


SQL Tuning Advisor

The SQL Tuning Advisor gives recommendations on changing SQL statements, to be rewritten so that they are more efficient. The SQL statements are referenced by a new column that is part of V$ACTIVE_SESSION_HISTORY, called SQL_ID. ADDM will detect statements that appear to be inefficient, and/or are leading to a disproportionate number of waits, and may suggest that you run the SQL Tuning Advisor against a given SQL statement. The SQL Tuning Advisor, when accessed from Advisor Central, will allow you to look at the top SQL for a specific period of time, create SQL tuning sets, or choose to view/create snapshots and/or baselines.

If a high-usage SQL statement shows up in an ADDM run, that statement will be displayed in the ADDM Analysis section of the EM database home page. Clicking on the statement will take you into a slightly different look, in the ADDM Finding Details view. From here, you will be able to run the SQL Tuning Advisor against the particular SQL statement.

Top SQL and Period SQL

The Top SQL link shows us by default the Spot SQL view, which shows five-minute intervals over the past hour. The highlighted section can be moved to display any five-minute interval over the past hour. The top SQL statements will be ordered by activity percentage, CPU percentage, and wait percentage, respectively, with the SQL_ID being the identifier for a given statement. The Period SQL link will allow a look at SQL statements over a longer period of time (a week), with the highlighted section giving you a look at a given 24-hour period within that week.

Clicking on the SQL_ID will give you an in-depth look at that particular statement, including the actual statement itself, the execution plan, current statistics, and execution history. The Current Statistics link has valuable information on the amount of space the statement takes in the shared_pool, as well as loads (parses) and invalidations, referring to the number of times this particular statement has been aged out of the shared_pool. A statement with a high number of loads and/or invalidations may be a candidate for pinning in the shared_pool.

Creating SQL Tuning Sets

SQL tuning sets can be created from within this page to allow a group of statements, either by type or by time period, to be analyzed. You may even want to create a tuning set based on just a single statement. Once the set is created, you can view the SQL statements by cost, buffer gets, disk reads, and so on. You can then choose to run either the SQL Tuning Advisor or the SQL Access Advisor against the tuning set. When choosing the criteria to create a tuning set, you might want to look at all of the statements in a given period of time, you may look at all SELECT statements within a period of time, or you may want to just grab the top one or two statements to create a tuning set.

SQL Access Advisor

The SQL Tuning Advisor will not give suggestions on adding or removing indexes, or making other physical changes. Changes like this would change the access method of a statement without changing the actual statement itself. For those types of recommendations, we refer you to the SQL Access Advisor. The SQL Access Advisor gives recommendations on either creating materialized views or creating indexes, or both, in order to give more efficient access for SQL statements. As input, you can use current and/or recent SQL activity that is still cached in memory. In addition, you can use a workload from the SQL Repository (requires that a SQL tuning set be created), or you can get a user-defined workload from a table. In addition, you can create a hypothetical workload, given certain inputs.

Memory Advisor and ASMM

In Oracle 9i, the concept of a dynamically resizable SGA was introduced, giving a DBA the ability to adjust one SGA parameter downward and then increase another value upward in its place. The Memory Advisor in Oracle Database 10g allows shared memory management to be enabled, which essentially automates this same process—hence the term automatic shared memory management, or ASMM. When shared memory management is enabled, a new parameter, SGA_TARGET, is set to a numeric value that becomes the ceiling on the total SGA size. Once shared memory management is enabled, you are essentially allowing Oracle to automatically adjust the values that define the total SGA size, on the fly, up to the total for SGA_TARGET.

ASMM Ceilings and Floors

As an oversimplified example, consider if your total SGA size is 1GB, divided equally between db_cache, large_pool, shared_pool, and java_pool. This would essentially give you values of around 250MB for each of these parameters. With shared memory management enabled, Oracle will monitor the stats associated with these SGA values, and may decide to reduce one parameter in favor of increasing another. For example, suppose Oracle detects that during a certain interval, the buffer cache hit ratio is lower than 80 percent, while at the same time the amount of free memory in the large_pool is relatively high. Oracle may decide to reduce the large_pool_size by 50 percent (to 125MB), and then dynamically increase the db_cache_size by 50 percent to 375MB. The overall memory used for the SGA is still the same, meaning that we have not exceeded our ceiling (as defined by SGA_ TARGET). However, the distribution of that memory has been altered, with the overall effect of increasing performance by using that memory where it is most needed.

When ASMM is enabled (by setting the SGA_TARGET), you should explicitly set the automatically managed pieces of the SGA to 0. In Oracle Database 10g Release 1, this includes the four parameters mentioned above: shared_pool_size, large_pool_size, java_pool_size, and db_cache_size (only the default db_cache_ size is automatically tuned). All other SGA-related parameters should still be manually set. If you set one of the above parameters, defined as auto-tunable, to an explicit value in the spfile or init file, that value will be used as the floor for that parameter—meaning that while Oracle may be allowed to increase the amount of memory allocated to that particular piece of the SGA, it will not be allowed to drop below the explicitly set value. So, continuing with our example, if we set SGA_TARGET to 1GB, and then set each of the above parameters to an explicit value of 250MB, no automatic tuning can be done because the floor for each parameter adds up to the total of SGA_TARGET, which is our ceiling. This leaves no room to increase or decrease anything. For this reason, when SGA_TARGET is set via the Memory Advisor in Enterprise Manager, Enterprise Manager will automatically unset all of the other SGA parameters (that is, set them to 0 values), thus giving Oracle the freedom to modify all parameters on the fly as it sees fit (whether to higher values or lower values). You can view the current settings that Oracle has arrived at for each of the SGA components by querying the view V$SGA_DYNAMIC_COMPONENTS, and these values can also be viewed in the Memory Advisor on EM.

Non-ASMM Tunable SGA Components 

The above is an oversimplified example, as the SGA is, of course, comprised of more than just those four parameters. SGA_ TARGET must actually account for the entire SGA, including a new parameter in Oracle Database 10g: STREAMS_POOL_SIZE. Aside from that, SGA_TARGET must account for values for the LOG_BUFFER, any KEEP pools, the RECYCLE pool, and any nondefault buffer caches. However, the KEEP pools, the RECYCLE pool, nondefault buffer caches and Streams_Pool values are not currently auto-managed, so these parameters must be set explicitly, even while the other main components of the SGA should be set to 0.


NOTE

Oracle cannot increase the total SGA beyond the value for SGA_MAX_SIZE, which by default is set exactly equal to the total SGA size. If you attempt to increase the total SGA size (using the Memory Advisor Wizard) to a value greater than SGA_MAX_SIZE, you will receive an error.


Memory Advice Parameters

If you choose not to enable shared memory management, the Memory Advisor will still be able to give advice on sizes for parameters within the SGA, such as DB_ CACHE_SIZE and SHARED_POOL_SIZE, as well as provide advice on the setting of PGA_AGGREGATE_TARGET. This advice is enabled automatically if STATISTICS_ LEVEL is set to either TYPICAL or ALL. For advice on the buffer cache, you must also have the parameter DB_CACHE_ADVICE=ON (this will default to ON if the STATISTICS_LEVEL parameter is set to TYPICAL or ALL).

The following views are populated when the above parameters are set to their default values:

v$db_cache_advice
v$shared_pool_advice
v$java_pool_advice
v$mttr_target_advice
v$pga_target_advice

This information is formatted and available in graphical format via the Memory Advisor on Enterprise Manager. By clicking on the Advice button next to the Buffer Cache values in the Memory Advisor, you will see calculations of what the difference would be in number of block reads for various values of DB_CACHE_SIZE. Similarly, the Advice button next to the Shared Pool value (see Figure 3-3) will give you estimates of parse times under different values of SHARED_POOL_SIZE.

By the same token, within the Memory Advisor, you can view estimates on the PGA workload by going to the PGA link (see Figure 3-3 again) and clicking on the Advice button there. This value, of course, depends on PGA_AGGREGATE_TARGET being set to a nonzero value in the init.ora (or spfile).


NOTE

Many of these parameters can be changed dynamically, both in memory and for future startups, if using an spfile. For example, SGA_TARGET, DB_CACHE_ADVICE, and STATISTICS_LEVEL can all be changed on the fly, and SGA parameters such as DB_CACHE_SIZE and SHARED_POOL_SIZE can all be lowered (set to 0) dynamically via an ALTER SYSTEM command.



Figure 3-3.  Memory Advisor

{mospagebreak title=Additional Advisors}

The previous advisors merited sections on their own due to the impact that they potentially have and the new features that they bring with them to Oracle Database 10g. In addition to the advisors mentioned, Advisor Central provides links to other new advisors that can offer useful insights into the performance and layout of your database.

MTTR Advisor

The MTTR Advisor allows making changes to fast_start_mttr_target, which essentially controls how long we want instance recovery to take, should we have a crash. Setting a target value here will cause Oracle to make internal changes to how frequently data is flushed to disk, or checkpointed, to ensure that if a crash recovery occurs, the time to roll forward will be close to the targeted value. This means the crash recovery will complete within x number of seconds, and the database will then be opened and made available. The view V$MTTR_TARGET_ ADVICE is used to provide an idea of what additional I/O might be incurred if the MTTR is reduced. This advisor also allows for the setting up of a flash recovery area, and enabling database archivelog mode. This is not tuning related, but of course is intrinsic to high availability, so these topics are covered more appropriately in other chapters.

Segment Advisor

The Segment Advisor essentially looks at space usage and advises on whether you can free up space. It also checks for fragmentation, and determines whether reorganizing individual segments and/or making changes to entire tablespaces can gain efficiencies. If it finds segments that can be reorganized, it will highlight them and provide a recommendation to do a shrink of the segment, and offer the ability to schedule a job to do so immediately or schedule the run for some future time. In Oracle Database 10g, a table or index segment can be compacted via the alter <object> <object_name> shrink … command. This command can be done while the table or index is online, without impacting user access to the segments and without the need to rebuild the indexes afterward. However, row movement must be enabled on the table in question first. Note in Figure 3-4 that the segment in question has generated advice, but the advice cannot be executed because row movement is not enabled. Thus, the selection is grayed out. In order to make any changes to this table, you will have to enable row movement (alter table PROD_ EMP enable Row Movement) and then rerun the Segment Advisor. Then you will be able to select the option under the Recommendations section to shrink the segment.


FIGURE 3-4.  Segment Advisor

Undo Management Advisor

The Undo Management Advisor first takes you into a review of the undo management parameters and allows you to make or change settings for enabling automatic undo management, which will automatically tune the UNDO_RETENTION time for you based on longest running queries and available space in the UNDO tablespace. In addition, the Undo Management Advisor allows for making changes to the UNDO tablespace, or actually changing the undo to a different tablespace altogether. From here, you can actually launch into the Undo Advisor itself, and Oracle will give advice on what the undo retention should be             (in seconds) to avoid Snapshot Too Old errors (ORA-1555). In addition, the Undo Advisor will estimate the size that the UNDO tablespace needs to be in order to retain that amount of undo. Since Oracle Database 10g provides you with the opportunity to guarantee the retention period, the Undo Advisor will take data from the past seven days (by default), analyze that data, and determine how much undo has been generated during that period. The Undo Advisor will then provide a graph, showing you how large the UNDO tablespace needs to be to achieve the desired UNDO_RETENTION time without running out of space in the UNDO_TABLESPACE.

Automatic Storage Management (ASM)

Automatic Storage Management, or ASM, is another new Oracle Database 10g feature that revolutionizes the way Oracle and the HA DBA manage database files. ASM combines volume management with the concept of Oracle managed files to allow the HA DBA to create a database comprised of datafiles that are not only self-managed, but also the I/O is automatically balanced among available disks. ASM combines the ability to simplify management of files with the ability to automatically self-tune, while at the same time providing a level of redundancy and availability that is absolutely imperative for the storage grid.

The implementation of ASM involves the creation of a normal Oracle instance with the parameter INSTANCE_TYPE=ASM set to a value of ASM on a node where a database or databases reside. This instance does not have an associated database, but rather is used to manage the disks that are accessed by your database(s). As such, an ASM instance is never opened—it is only mounted. Mounting an ASM instance involves mounting the disk groups associated with the ASM instance, so that the disk groups and files are then accessible from the other instances. We will discuss ASM in various sections throughout the remainder of the book, but we will take the time here to discuss the concepts behind ASM, how to implement ASM in your environment, and how to manage an ASM environment once you are up and running.

ASM Concepts

The underlying concept behind ASM is that it is a file system created specifically for Oracle datafiles, on top of RAW or block devices. This file system is kept and maintained by the Oracle kernel, so Oracle knows where file extents are and automatically manages the placement of these extents for maximum performance and availability of your database. You, as the HA DBA, will not know or care where Oracle is placing extents on disk. Oracle will do all of that management for you through ASM. No volume management software is needed, and no file system is needed.

ASM Disk Group

At its highest level, within ASM you will create ASM disk groups, comprised of one or more disks (usually RAW, but certified NFS storage will work as well). Oracle will take that disk group as the location for creating files, and will lay down files in 1MB extents across however many disks are available. The more disks that are used within a disk group, the more flexibility you will give Oracle to spread the I/O out among disks, resulting in better performance and improved redundancy. ASM disk groups can be used for all Oracle files, including the spfile, the controlfile, the online redo logs, and all datafiles. In addition, you can use an ASM disk group for your flashback recovery area (discussed in Chapter 8), as a location for all RMAN backups, flashback logs, and archived logs. Bear in mind, however, that ASM was created specifically for Oracle, so it cannot be used as a general purpose file system. As such, files in an ASM disk group are not visible at the OS, and files such as Oracle binaries and Oracle trace files must be kept on a regular file system (such as UFS or NTFS).


NOTE

We mentioned that extents are written out in 1MB sizes, and this is true for all files except controlfiles and logfiles. Redo logs, controlfiles, and flashback logs use fine-grained striping, by default, which results in extents of 128K, rather than 1MB. This allows large I/Os to be split into smaller chunks and processed by more disks, resulting in better performance for these types of files.


Stripe and Mirror Everything (SAME)

ASM adheres to the same philosophy, which recommends to stripe and mirror everything. This is handled in ASM by allowing the setting of redundancy levels during the creation of a disk group. Normal redundancy implies that you have at least two disks, because every allocation unit (or extent) will be written twice, to two different disks within the disk group. High redundancy implies three-way mirroring, meaning every allocation unit (or extent) will be written to three separate disks within the disk group. This mirroring is not the traditional type of mirroring that you may be used to, however—this is done at the extent level. For example, let’s assume that we are mirroring with normal redundancy (two-way mirroring), and that we have five disks in a disk group. If we then create a 10MB file on that disk group, the first 1MB extent may be mirrored across disks 3 and 5, the next 1MB extent may be mirrored across disks 2 and 4, the next extent across disks 1 and 3, and so on. When all is said and done, every extent has been mirrored, but no two disks will contain identical data. If you choose external redundancy when creating a disk group, this is perfectly acceptable, but it implies that all mirroring is handled at the hardware level.

By the same token, ASM achieves striping by spreading the extents, aka allocation units, for a given file across all available disks in a disk group. So, your TEMP tablespace may be 4GB in size, but if you have a disk group with 10 disks in it, you will not care how the tablespace is laid out—Oracle with ASM will automatically spread the extents for this file across the disks, seeking to balance out the I/O and avoid hot spots on disk. If Oracle detects that a particular disk is getting too much I/O, it will attempt to read the mirrored copy of an extent from a different disk, if it is available. The same is true for all files, including redo logs.


NOTE

Mirroring is actually performed to what are known as “partner” disks. Within an ASM disk group, any given disk can have a maximum of eight partners. This means that the extents written to a disk can be mirrored to any one of the eight partners defined for that disk. In our simple example, where we have only five disks, any disk can be the partner of another disk because we have not exceeded this limit. However, in a disk group with more than eight disks (say, hundreds or even thousands of disks), it is important to realize that each disk will be limited in the number of partners that can participate in the mirroring for that disk. This is done intentionally, as limiting the number of partners minimizes the possibility that a double disk failure could lead to data loss—this could only happen if the two disks that fail also happen to be partners. Utilizing high redundancy (triple-mirroring) reduces this likelihood even further. An ASM disk group will theoretically support up to 10,000 disks with a single ASM instance, spread across as many as 63 disk groups. ASM also supports up to 1 million files in a disk group. In Oracle Database 10g Release 1, only one ASM instance is allowed per node.


{mospagebreak title=Failure Groups} 

A failure group allows you to take the redundancy of disks to the next level, by creating a group containing disks from multiple controllers. As such, if a controller fails, and all of the disks associated with that controller are inaccessible, other disks within the disk group will still be accessible as long as they are connected to a different controller. By creating a failure group within the disk group, Oracle and ASM will mirror writes to different disks, but will also mirror writes to disks within different failure groups, so that the loss of a controller will not impact access to your data.

File Size Limits on ASM

As we have discussed, ASM disk groups support a variety of different file types, including online redo logs, controlfiles, datafiles, archived redo logs, RMAN backup sets, and Flashback Logs. In Oracle Database 10g Release 1, ASM imposes a maximum file size on any file in an ASM disk group, regardless of the file type. That maximum size depends on the redundancy level of the disk group itself, as shown here:

Max File Size             Redundancy Level

300GB                        External redundancy

150GB                        Normal redundancy

100GB                        High redundancy

These maximum values do not affect the maximum values imposed by other limitations, such as a maximum size for database files themselves. In Oracle Database 10g Release 1, for example, a database file is limited to approximately 4 million blocks. This limit applies irrespective of the underlying storage mechanism, file system, or platform. As you can see, for most block sizes this will not be an issue. However, some platforms (such as Tru64) support a db_block_size of up to 32k. As such, the normal maximum database file size with a 32K block would be 128GB. However, if you are on a high-redundancy ASM disk group (3-way mirroring), the maximum file size would actually be 100GB.

In addition, Oracle Database 10g includes a new feature: the ability to create a tablespace using the BIGFILE syntax. When creating a BIGFILE tablespace, you are only allowed a single datafile in that tablespace, but the limit on the number of blocks is increased to approximately 4 billion blocks (from 4 million). The theoretical maximum file size for a datafile in a BIGFILE tablespace would then be in the Terabytes—but as you can see, ASM will limit the datafile size to 300GB or lower, based on the Redundancy Level of the disk group. Expect this limitation to be removed in subsequent patches or releases. See Metalink Note 265659.1 for details.

Rebalancing Operations

Inherent to ASM is the ability to add and remove disks from a disk group on the fly without impacting the overall availability of the disk group itself, or of the database. This, again, is one of the precepts of grid computing. ASM handles this by initiating a rebalance operation any time a disk is added or removed. If a disk is removed from the disk group, either due to a failure or excess capacity in the group, the rebalance operation will remirror the extents that had been mirrored to that disk and redistribute the extents among the remaining disks in the group. If a new disk is added to the group, the rebalance will do the same, ensuring that each disk in the group has a relatively equal number of extents.

Because of the way the allocation units are striped, a rebalance operation only requires that a small percentage of extents be relocated, minimizing the impact of this operation. Nevertheless, you can control the rebalance operation by using the parameter ASM_POWER_LIMIT, which is a parameter specific to the ASM instance. By default, this is set to 1, meaning that any time a disk is added or removed, a rebalance operation will begin—using a single slave. By setting this value to 0 for a disk group, you can defer the operation until later (say overnight), at which time you can set the ASM_POWER_LIMIT to as high as 11. This will generate 11 slave processes to do the work of rebalancing. This can be accomplished via the alter system command:

alter system set asm_power_limit=0;
alter system set asm_power_limit=11;

Background Processes for ASM 

An ASM instance introduces two new types of background processes—the RBAL process and the ARBn processes. The RBAL process within the ASM instance actually determines when a rebalance needs to be done and estimates how long it will take. RBAL then invokes the ARB processes to do the actual work. The number of ARB processes invoked depends on the ASM_ POWER_LIMIT setting. If this is set to the max of 11, then an ASM instance would have 11 ARB background processes, starting with ARB0 and ending with ARBA. In addition, a regular database instance will have an RBAL and an ASMB process, but the RBAL process in a database instance is used for making global calls to open the disks in a disk group. The ASMB process communicates with the CSS daemon on the node and receives file extent map information from the ASM instance. ASMB is also responsible for providing I/O stats to the ASM instance.

ASM and RAC

Because it is managed by Oracle, ASM environments are particularly well-suited for a RAC installation. Using a shared disk array with ASM disk groups for file locations can greatly simplify the storage of your datafiles. ASM eliminates the need to configure RAW devices for each file, simplifying the file layout and configuration. ASM also eliminates the need to use a cluster file system, as ASM takes over all file management duties for the database files. However, you can still use a cluster file system if you want to install your ORACLE_HOME on a shared drive (on those platforms that support the ORACLE_HOME on a cluster file system). We discuss ASM instance configuration in a RAC environment in the next section, as well as in Chapter 4.

Implementing ASM

Conceptually, as we mentioned, ASM requires a separate instance be created on each node/server where any Oracle instances reside. On the surface, this instance is just like any other Oracle instance, with an init file, init parameters, and so forth, except that this instance never opens a database. The major difference between an ASM instance and a regular instance lies in a few parameters:

  1. INSTANCE_TYPE = ASM (mandatory for an ASM instance)
  2. ASM_DISKSTRING = /dev/raw/raw* (path to look for candidate disks)
  3. ASM_DISKGROUPS = ASM_DISK (defines disk groups to mount at startup)

Aside from these parameters, the ASM instance requires an SGA of around 100MB, leaving the total footprint for the ASM instance at around 130MB. Remember—no controlfiles, datafiles, or redo logs are needed for an ASM instance. The ASM instance is used strictly to mount disk groups. A single ASM instance can manage disk groups used by multiple Oracle databases on a single server. However, in a RAC environment, each separate node/server must have its own ASM instance (we will discuss this in more detail in Chapter 4).

Creating the ASM Instance

If you are going to create a database using ASM for the datafiles, you must first create the ASM instance and disk groups to be used. It is possible to do this via the command line by simply creating a separate init file, with INSTANCE_TYPE = ASM. Disk groups can be created or modified using SQL commands such as CREATE DISK GROUP, ALTER DISK GROUP, DROP DISK GROUP, and so on. The instance name for your ASM instance should be +ASM, with the + actually being part of the instance name. In a RAC environment, the instance_number will be appended, so ASM instances will be named +ASM1, +ASM2, +ASM3, and so forth.

However, as in the past, we recommend using the GUI tools such as DBCA. The simplest way to get an ASM instance up and running is to create your database by using the DBCA. If there is not currently an ASM instance on the machine, the DBCA will create an ASM instance for you, in addition to your database instance. If you are using the DBCA to create a RAC database, then an ASM instance will be created on each node selected as part of your cluster. After creation of the ASM instance, you will be able to create the disk groups directly from within the DBCA, which is created by providing disks in the form of RAW slices. The ASM instance will then mount the disk groups before proceeding with the rest of the database creation. Figure 3-5 shows the option you will have when creating the database, to choose the type of storage on the Storage Options screen. After selecting this option, you will be presented with the ASM Disk Groups screen, which will show you the available disk groups. On a new installation, this screen will be blank, as there will be no disk groups. So, at this point, on a new installation, you would choose the Create New option.


FIGURE 3-5.  Choosing ASM on the DBCA Storage Options screen

Creating ASM Disk Groups Using the DBCA

In the Create Disk Group screen (see Figure 3-6), ASM searches for disks based on the disk discovery string defined in the ASM_DISKSTRING parameter. Different platforms have different default values for the disk string. On Linux, the default will be /dev/raw/*, and on Solaris, the default will be /dev/rdsk/*. This value can be modified from within the DBCA, as shown in Figure 3-6, by clicking the Change Disk Discovery Path button. Once you have the correct path, you should end up with a list of possible candidate disks. Note in Figure 3-6 that the Show Candidates radio button is selected. This will only display disks in the discovery string that are not already part of another disk group. Note also that two of the disks have FORMER under the Header Status. This is because these two disks were at one time part of a different disk group, but that group was dropped. This status might also show up if a disk is removed from an existing disk group—nevertheless, these disks are available to be added to this group, and are considered valid candidates. In Figure 3-6, you can see that we have selected these two disks to make up our new disk group.


FIGURE 3-6.  Creating the ASM disk group


NOTE

On most Linux platforms, Oracle provides a special ASM library to simplify the ASM integration with the operating system, and to ease the disk discovery process. At this time, the library is not available for all platforms, but you can check OTN periodically for the availability of a library for your platform at http://otn.oracle.com/software/tech/linux/asmlib/ index.html

 


 

At this point, the ASM instance has already been created by the DBCA, and the disk group will be mounted by the ASM instance. If this is a RAC environment, the ASM instance will have been created on all nodes selected as participating in the database creation. (ASM mounts the disk groups in a manner similar to how it mounts the database, but rather than via an ALTER DATABASE MOUNT; command, the ALTER DISKGROUPS MOUNT; command is used instead.) Once the disk group is mounted, you can proceed to select the group for your database files and continue on with the creation of the actual database. You will be asked if you want to use Oracle managed files, and you will also be prompted to set up a flashback recovery area. If you choose to set up a flashback recovery area, we recommend that you set up a separate ASM disk group for the flashback recovery area. You will be given an opportunity by the DBCA to create that separate disk group. If you choose not to do so during creation time, you can create additional disk groups later on, either manually or using Enterprise Manager.


NOTE

ASM requires that the ocssd daemon be running (cluster synchronization service) even in a single instance. (On Windows, ocssd manifests itself as a service called OracleCSService.) For this reason, Oracle will install/create the ocssd daemon (or OracleCSService) automatically on a new Oracle Database 10g install, even in a single-instance environment.


{mospagebreak title=Managing ASM Environments with EM}

Using Enterprise Manager to help you manage your ASM environment is most likely the simplest way to keep on top of the situation. In order to do so, we recommend that you configure Enterprise Manager Grid Control (as described in Chapter 5). This is a particular necessity when running in a RAC environment, as Grid Control allows you to manage all instances in the cluster, as well as all ASM instances from one central location.

Navigating Through EM

Enterprise Manager provides a graphical interface for such ASM operations as adding or removing disks from a disk group, dropping and creating new disk groups, mounting and dismounting disk groups, and rebalancing within a disk group. While you cannot create the instance itself through EM, you can modify parameters such as ASM_DISKSTRING or ASM_POWER_LIMT. Additionally, while we mentioned that files in an ASM disk group are not visible at the OS, Enterprise Manager will allow you to look at the file and directory structure used by ASM to manage these files. To end this section, we will go through a HA Workshop that will walk you through the steps needed to navigate ASM through Enterprise Manager.

——-HA Workshop: Managing ASM Through Enterprise Manager

Workshop Notes

This workshop assumes that Enterprise Manager Grid Control has already been configured in your environment, and also assumes that an ASM instance is running. If EM Grid Control has not been configured, please refer to Chapter 5 for an overview of Grid Control configuration, or refer to the Oracle Database 10g Release 1 Oracle Enterprise Manager Advanced Configuration guide.

Step 1.  Log on as SYSMAN to the Enterprise Manager Grid Control screen using the host name of the management repository machine, and port 7777. In this example, rmsclnxclu1 is a node in our cluster, but it also happens to be the host for the EM management repository:

http://rmsclnxclu1:7777/em

Step 2.  Click on the Targets tab across the top, and then choose All Targets from the blue bar. This will list all instances, including ASM instances.

Step 3.  Find the ASM instance from the list of targets. The name will be something along the lines of +ASM1_rmsclnxclu1.us .oracle.com, where rmsclnxclu1.us .oracle.com is the host name where the ASM instance is running. Click on the link to the ASM instance. This will bring you to the home page for the ASM instance. Here, you can see a list of disk groups, the databases serviced by the disk groups, and a graphical depiction of the amount of disk space each database is using. In addition, you will see any alerts related to your ASM instance.

Step 4.  Click on the Performance link for the ASM instance. (You may be prompted to log in—if so, provide the sysdba password for the ASM instance.) Here you will see a graphical depiction of the throughput, disk I/O per second, and disk I/O response times for all disk groups managed by ASM. Clicking on the Expand All option at the bottom of the page will allow you to see the cumulative statistics for each disk in the disk group.

Step 5.  Next click on the Configuration link across the top. Here you can modify the values for the ASM_DISKSTRING, ASM_DISKGROUPS, and ASM_POWER_ LIMITS parameters.

Step 6.  Now click on the Administration link. This will provide you a link to each ASM disk group managed by the ASM instance, as shown in Figure 3-7. If there are no disk groups, you can create them from here, or create additional groups by clicking on the Create button. Disk groups can also be mounted or dismounted from here—if you are in a RAC environment, you will be prompted to select which instances you want to dismount the group from. By selecting the options from the drop-down list, you can initiate an immediate rebalance operation as well, as shown in Figure 3-7.

Step 7.  Click on the disk group itself now, to again see the disks in the group. This will take you into the General description page for that disk group.


FIGURE 3-7.  ASM disk groups in Enterprise Manager

Here you can check the disks’ integrity, add disks to the disk group, or delete disks from the group. You can also drill down further to see performance stats for the individual disks.

Step 8.  Next click on the Files link to view the files on the ASM disk group. Choose the Expand All option to view all the files within all of the ASM directories. In Figure 3-8, we have displayed a partial expansion of the directory list. As you can see, Oracle creates a directory structure within the ASM disk group, which it maintains internally. In this example, Oracle managed files are in use, meaning that the parameter DB_CREATE_FILE_DEST is set to +ASM_DISK. Oracle automatically created an ASM directory with the database name, and then within that directory


FIGURE 3-8.  Viewing ASM files from within Enterprise Manager

created additional subdirectories for controlfiles, logfiles, datafiles, and so forth. When using Oracle managed files, it is not necessary to specify either the name or the location of the file. Oracle will determine the location based on the ASM file type, and will then assign a filename based on the type of file, the file number, and the version number.

Step 9.  Lastly, click on a single file to get a description of how the file is mirrored, the block size and number of blocks in the file, the file type, the creation date, and the striping (either coarse or fine). The output should look something like this:

SYSAUX.270.3: Properties
Name SYSAUX.270.3
Type DATAFILE
Redundancy MIRROR
Block Size (bytes) 8192
Blocks 64001
Logical Size (KB) 512008 KB
Striped COARSE
Creation Date 15-FEB-2004 02:43:32

——————————————–

ASM Metadata

As we have discussed, an ASM instance has no physical storage component associated with it—the ASM instance is purely a logical incarnation, in memory. However, there are physical components to an ASM disk group, essentially stored on disk on each ASM disk. When a disk is made part of an ASM disk group, the header of each disk is updated to reflect information, including the disk group name, the physical size of all disks in the group, the allocation unit size, and so on. The header also contains information relating specifically to that disk, including the size, the failure group, the disk name, and so forth. In addition, metadata is stored in ASM files on the disks themselves, using file numbers below 256. For this reason, when creating a new database on an ASM disk group, the system datafile will generally be file 256, and the rest of the files in the database are numbered upward from there—because file 255 and below are reserved for ASM metadata. The ASM metadata is always mirrored across three disks (if available), even when the external redundancy option is chosen.

[gp-comments width="770" linklove="off" ]

chat