Oracle Database Backup and Recovery Practices

Database failures are often a matter of when, not if. This three-part article series walks you through the best practices for backing up and recovering Oracle Database XE. It is excerpted from chapter 30 of the book Beginning PHP and Oracle: From Novice to Professional, written by W. Jason Gilmore, Bob Bryla (Apress; ISBN: 1590597702).

Ensuring database availability is a critical skill you need even if your Oracle Database XE instance is used by a small group of developers in your department. Many types of database failures are beyond your control as a DBA, such as disk failures, network failures, and user errors. This emphasizes the need to prepare in advance for all of these potential failures after assessing the cost of database downtime versus the effort required to harden your database against failure. Many of these failures, as you might expect, require you to work closely with the server system administrators and network administrators to minimize the impact. You need to promptly receive notification when failures occur, or a warning when they are about to occur.

In this chapter, we start by presenting you with Oracle’s recommended best practices for ensuring the recoverability of your database when, not if, you have a database failure. If your database is a production database that must be available continuously, these requirements are mandatory. On the other hand, if your database is for development, an occasional backup may suffice. However, by using Oracle’s best practices, your downtime will be minimal in the event of a failure, giving you more time to focus on PHP application development instead of data recovery.

Next, we show you how to back up your database, using the Oracle Database XE scripts. Once you have backed up your database, you will need to know how to recover the database from a media failure such as a missing or corrupted datafile.

Backup and Recovery Best Practices

Oracle recommends several techniques you can use to ensure database availability and recoverability. Many of these techniques are automatically implemented when you install Oracle Database XE. However, there are a couple of places where you can tweak the default configuration to improve the recoverability further. We discuss these tweaks in the sections that follow.

Before we dig in to the recoverability and availability techniques, it is important to know the types of failures you may encounter in your database so that you may respond appropriately when they occur. Database failures fall into two broad categories: media failures and nonmedia failures.

Media failures occur when a server disk or a disk controller fails and makes one or more of your database’s datafiles unusable (see Chapter 28 for an overview of Oracle Database XE’s storage structures). After the hardware error is resolved (e.g., the server administrator replaces the disk drive), it is your responsibility to restore the corrupted or destroyed datafiles from a disk or tape backup. As the price of disk space falls, the added level of convenience and speed of disk makes tape backups less desirable except for archival purposes.

Nonmedia failures include all other types of failures. Here are the most common types of nonmedia failures and how you will deal with them:

  1. Statement failure: Your SQL statement fails because of a syntax problem, or your permissions do not allow you to execute the statement. The recovery process for fixing this error is relatively easy: use the correct syntax or obtain permissions on the objects in the SQL statement. 
     
  2. Instance failure: The entire database fails due to a power failure, server hardware failure, or a bug in the Oracle software. Recovery from this type of failure is automatic: once the server hardware failure is fixed or the power is restored, Oracle Database XE uses the online redo log files to ensure that all committed transactions are recorded in the database’s datafiles. In the case of a possible Oracle software bug, your next step after restarting the database is to investigate whether there is a patch file or a workaround for the software bug. 
     
  3. Process failure: A user may be disconnected from the database due to a network connection failure or an exceeded resource limit (such as too much CPU time). The Oracle Database XE background processes automatically clean up by freeing the memory used by the user connection and roll back any uncommitted transactions started during the user’s session. 
     
  4. User error: A user may drop a table or delete rows from a table unintentionally.

{mospagebreak title=Multiplexing Redo Log Files}

As you remember from Chapter 28, the online redo log files are a key component required to recover from both instance failure and media failure. By default, Oracle Database XE creates the minimum number of redo log files (two). When the first redo log file fills with committed transactions, subsequent transactions are written to the other redo log file. Whether you have two, three, or more redo log files, Oracle writes to the log files in a circular fashion. Thus, if you have ARCHIVELOG mode enabled, Oracle can write new transactions to the next redo log file while Oracle archives the previous online log file. (We show you how to enable ARCHIVELOG mode in a coming section.)


Note  The terms redo log file and online redo log file are often used interchangeably. However, the distinction is important when you are comparing online redo log files to archived (offline) redo log files.


To prevent loss of data if you lose one of the online redo log files, you can multiplex, or mirror, the redo log files. In other words, each redo log file, whether there are two, three, or more, has one or more identical copies. These copies are maintained automatically by Oracle processes. Writing to a specific log file occurs in parallel with all other log files in the group. While there is a very slight performance hit when the Oracle processes must write to two copies of the redo log file instead of just one, the slight overhead is easy to justify compared to the recovery time (including lost committed transactions) if a nonmultiplexed online log file is lost due to a hardware failure or other error. To see the current status of the online redo log files, start at the Oracle Database XE home page and navigate to Administration ➤ Storage. In the Tasks section on the right side of the page, click View Logging Status and you will see the names and status of the online redo log files. By default, Oracle Database XE creates two online redo log files, as you can see in Figure 40-1.

Notice the directory path for the redo log files:

/usr/lib/oracle/xe/app/oracle/flash_recovery_area

 
Figure 40-1.  Online redo log file status

This area, as you might surmise, is known as the Flash Recovery Area. The Flash Recovery Area automates the management for backups of all types of database objects such as multiplexed copies of the control file and online redo log files, archived redo log files, and datafiles. You specify the location of the Flash Recovery Area along with a maximum size, and Recovery Manager (RMAN) manages files within this area. You define the location and size of the Flash Recovery Area with two initialization parameters. From the SQL command-line prompt run this command:

show parameter db_recov

You will see all parameters in the database that begin with db_recov :  

NAME db_recovery_file_dest

TYPE

string

VALUE

/usr/lib/oracle/xe/app/oracle/

 

 

flash_recovery_area

db_recovery_file_dest_size

big integer 10G

You can also view these parameters from the Home ➤ Administration ➤ About page in the Oracle Database XE GUI. To ensure prompt and easy recovery of any database object, your Flash Recovery Area should be large enough to hold at least one copy of all datafiles, incremental backups, online redo log files, control files, and any archived redo log files required to restore a database from the last full or incremental backup to the point in time of a media failure. You can check the status of the Flash Recovery Area by querying the dynamic performance view V$RECOVERY_FILE_DEST :

select name, space_limit, space_used from v$recovery_file_dest;  

NAME SPACE_LIMIT SPACE_USED

 

/usr/lib/oracle/xe/app/oracle/flash_recovery_area 10737418240 851753472

Of the 10GB of space available in the Flash Recovery Area, less than 900MB is used.

Multiplexing these redo log files is easy; the only catch is that there is no GUI interface available for this operation—you must use a couple of SQL commands. You will put the multiplexed redo log files in the directory /u01/app/oracle/onlinelog . This file system is on a separate disk drive and a separate controller from the redo log files shown earlier in Figure 40-1. Connect as a user with SYSDBA privileges, and use these SQL statements:

alter database add logfile member
    
‘/u01/app/oracle/onlinelog/g1m2.log ‘
to group 1;
alter database add logfile member

     ‘/u01/app/oracle/onlinelog/g2m2.log’
to group 2;

Notice that you do not need to specify a size for the new redo log file group members; all files within the same redo log file group must have the same size, so Oracle automatically uses the file size of the files within the existing group. After you run these statements, you revisit the Database Logging page shown earlier in Figure 40-1, and you now see the same log file groups but with each having a multiplexed member, as shown in Figure 40-2.


Please check back for the next part of the series. Figure 40-2.  Multiplexed online redo log files

[gp-comments width="770" linklove="off" ]

chat