Home arrow Oracle arrow Page 7 - Oracle and Availability: Illustrated Downtime Scenarios

Waiting for the File to Restore from Tape - Oracle

Ever have to recover your servers from a flooded basement? Move a database to an new platform and need the weekend to test it but users need the database today? We will illustrate various database problems and which piece of technology could be employed to prevent the outage or to recover from it quickly. (From the book Oracle Database 10g High Availability with RAC, Flashback & Data Guard by Matthew Hart and Scott Jesse, ISBN: 0072254289, McGraw-Hill/Osborne, 2004.)

TABLE OF CONTENTS:
  1. Oracle and Availability: Illustrated Downtime Scenarios
  2. Horatio's Woodscrews
  3. User-Defined Availability
  4. Cyclical Database Resource Requirements
  5. Out of Space in the Woodscrew Tablespace
  6. Restarting Long-Running Transactions
  7. Waiting for the File to Restore from Tape
  8. The Dropped Table
  9. Complete and Total Disaster
By: McGraw-Hill/Osborne
Rating: starstarstarstarstar / 6
July 20, 2004

print this article
SEARCH DEV SHED

TOOLS YOU CAN USE

advertisement

For our poor DBA at Horatio's Woodscrews, it only gets worse. He needs to restore the file WS_APP_DATA01.DBF from backup and then perform media recovery using archivelogs. He contacts the backup administrator and asks that the file be restored from tape.

After some shuffling around, the backup administrator finds the last backup and begins the tape restore. Because of the size of the file, this will take a few hours to get the file queued over the network from the tape jukebox to the database server. The DBA asks if there are any backups of the file on disk, knowing that there is plenty of unused space on the new SAN that could be used for disk backup of important files. The backup administrator just runs the tape servers, though, and knows nothing about the SAN. That's a different department.

Avoid Long Tape Restores with RMAN Flashback Recovery Area

Using new RMAN functionality, you can create backup jobs that always store at least the last full copy of the database on a disk location called the flashback recovery area. Then, when you take the next backup, RMAN will automatically age out the last backup, or you can set it up to move the old backups to tape from the disk area. When you go to initiate the restore from RMAN, it knows where the last best copy is. The restore can be instantaneous, as we can switch out the bad file with the new file and begin applying archivelogs immediately. We discuss this in Chapter 8.

 

Protect Against Node Failure with a Robust Archive Strategy

While RAC does provide a higher degree of protection against outages, you have to be careful how you configure the database so that you don't invent single points of failure. In this case, the archivelogs from Node2 are required for recovery, but Node2 is off the network temporarily. For help with archive strategies with RAC, see Chapter 5, in the section “Redo Logs and Media Recovery.” Also see Chapter 8, particularly Figure 8-8.

RAC and the Single Point of Failure

The DBA for Horatio's Woodscrews is no fool. He knows that he needs to transition the data to a RAC cluster. As part of the preparation, he has configured the Sales Department's smaller database on a RAC cluster with two nodes. He set it up quickly, with minimal configuration, and it's been running smoothly for a few weeks.

However, the network configuration issues that affected the production database had also affected the Sales system. One of the nodes in the cluster was not available on the network. The DBA wasn't worried, however, because the other node was still available.

But one of the files was asking for media recovery, so the DBA issued the recovery command and it asked for an archivelog that was required for recovery. The DBA pressed ENTER to accept the default archivelog. But the media recovery session failed. He looked a little closer, and noted that the requested archivelog was on the unavailable node. Suddenly, his RAC cluster was not operational as he waited for the other node to become available so he could perform recovery.

Rewinding the Database

Perhaps the most difficult outage situations are those tricky logical errors introduced by the users themselves—when a user updates the wrong table, or updates the wrong values. These types of errors are tough to overcome because they are not perceived by the database as errors, but just another transaction. Typically, user errors do not occur in a vacuum; an erroneous update can occur alongside hundreds of correct updates. Pretty soon, the bad data is buried by thousands of additional updates. How can you fish just one of those transactions out? Can you “rewind” the database back to a previous point in time?

Use Flashback Table to Restore a Table to a Previous State

In Oracle Database 10g, Oracle introduced the ability to rewind a table to a previous state without performing point-in-time recovery. This is called Flashback Table, and it's part of the Flashback Technologies discussed in Chapter 9.

You can also use LogMiner to review transactions in the archived redo logs in order to determine where exactly the bad data was entered, as well as to retrieve the good transactions entered after the bad transaction. LogMiner is discussed in Chapter 2.

At Horatio's Woodscrews, the problem was reported by the Accounting team. As they went through their month-end processing, they began to realize that the data seemed incorrect. All the roll-up values seemed to be impossibly high, even during a good month. They could not figure out what had happened, but the data had been incorrectly entered at some point in the morning. Now, in the afternoon, they came to the DBA and asked that he “start over” all of their tables as they had looked in the morning.

The options for restoring just a few tables are limited, without rolling the entire database back to a previous point-in-time. Oracle has provided for a tablespace point-in-time recovery (TSPITR), where just a single tablespace is restored to a previous point. But that is labor-intensive, and the smallest unit that can be restored is the entire tablespace. The Accounting group does not want to redo all their work for the day, just the work in a few tables.

This chapter is from Oracle Database 10g High Availability with RAC, Flashback & Data Guard, by Hart and Jesse. (McGraw-Hill/Osborne, 2004, ISBN: 0072254289). Check it out at your favorite bookstore today. Buy this book now.



 
 
>>> More Oracle Articles          >>> More By McGraw-Hill/Osborne
 

blog comments powered by Disqus
escort Bursa Bursa escort Antalya eskort
   

ORACLE ARTICLES

- Oracle Java Security Woes Continue
- Oracle's New IaaS Cloud Option: There's a Ca...
- Oracle Acquires Eloqua to Boost Cloud Presen...
- Choosing Innovation: Oracle Survey Insights
- Oracle Fixes Privilege Escalation Bug
- Oracle`s Communications Service Availability...
- Oracle Releases Exalytics, Taleo Plans
- Oracle Releases Communications Network Integ...
- Oracle Releases Communications Data Model 11...
- Oracle Releases PeopleSoft PeopleTools 8.52
- Oracle Integrates Cloudera Apache Distro, My...
- Oracle Releases MySQL 5.5.18
- Oracle Announces NoSQL Database Availability
- Sorting Database Columns With the SELECT Sta...
- Retrieving Table Data with the LIKE Operator

Developer Shed Affiliates

 


Dev Shed Tutorial Topics: