OCA/OCP Oracle Database 11g All-in-One Exam Guide

CHAPTER 16
Restore and Recover with RMAN

Exam Objectives
In this chapter you will learn to
• 052.16.1 Describe the Data Recovery Advisor
• 052.16.1 Use the Data Recovery Advisor to Perform Recovery (Controlfile, Redo Log File and Datafile)
• 053.7.1 Perform Complete Recovery from a Critical or Noncritical Datafile Loss Using RMAN
• 053.7.2 Perform Incomplete Recovery Using RMAN
• 053.7.3 Recover Using Incrementally Updated Backups
• 053.7.4 Switch to Image Copies for Fast Recovery
• 053.7.6 Recover Using a Backup Controlfile
• 053.13.3 Perform Block Media Recovery

In principle, restore and recovery (Oracle uses these terms very precisely) following a failure are simple. To restore a damaged file is to extract it from a previously made backup; to recover it is to apply change vectors extracted from the redo stream, thus bringing it up to date. Recovery can be complete (meaning no loss of data) or incomplete (meaning that you do, deliberately, lose data). However, there are many variations, depending on the nature of the damage and the downtime that can be tolerated. Using Recovery Manager automates the process, which can make restore and recovery operations much faster than using manual techniques and also eliminates the possibility of making mistakes. There is also a Data Recovery Advisor Wizard, which can diagnose problems and recommend the best action to take.

The Data Recovery Advisor
The Data Recovery Advisor (the DRA) is a facility for diagnosing and repairing problems with a database. There are two interfaces: the RMAN executable and Enterprise Manager. The DRA is capable of generating scripts to repair damage to datafiles and (in some circumstances) the controlfile: it does not advise on problems with the spfile or with the online redo log files. It is dependent on the Automatic Diagnostic Repository (the ADR) and the Health Monitor.
The information the Health Monitor gathers and the advice the DRA gives follow the same diagnosis and repair methods that the DBA would follow without them, but they make the process quicker and less prone to error.

The Health Monitor and the ADR
The Health Monitor is a set of checks that run automatically when certain error conditions arise, or manually in response to the DBA’s instructions. The results of the checks are not stored in the database, but in the file system. This is because the nature of some errors is such that the database is not available: it is therefore essential to have an external repository for the Health Monitor results. This repository is the Automatic Diagnostic Repository (the ADR), which is located in the directory specified by the DIAGNOSTIC_DEST instance parameter.

Which Health Monitor checks can run depends on the state of the instance:
• In nomount mode, only the “DB Structure Integrity” check can run, and it can only check the integrity of the controlfile.
• In mount mode, the “DB Structure Integrity” check will check the integrity of the controlfile, and of the online redo log files and the datafile headers. The “Redo Integrity Check” can also run, which will check the online and archive log files for accessibility and corruption.
• In open mode, it is possible to run checks that will scan every data block for corruption, and check the integrity of the data dictionary and the undo segments.

The interfaces that will allow manual running of Health Monitor checks are available only when the database is open. There are two interfaces: using SQL*Plus to invoke procedures in the DBMS_HM PL/SQL package, and Database Control. Figure 16-1 shows the Database Control interface. To reach this window, from the database home page take the Advisor Central link in the Related Links section, and then the Checkers tab.
From the window shown in Figure 16-1, you can see the results of all Health Monitor runs (runs in reaction to errors and manual runs) and also run checks on demand.

Figure 16-1 The Database Control interface to the Health Monitor

The Capabilities and Limitations of the DRA
The DRA can do nothing unless the instance is in nomount mode, or higher. It follows that it cannot assist if there is a problem with the initialization file. In nomount mode, it can diagnose problems with the controlfile and generate scripts to restore it, either by using an existing valid copy or (if none is available) by extracting a copy from a backup set, provided it can find one. Once the database reaches mount mode, the DRA can diagnose problems with missing or damaged datafiles and missing online log file groups, and generate repair scripts.

The DRA (in the current release) only supports single-instance databases. If a fault brings down a RAC database, you can mount it in single-instance mode, use the DRA to repair the damage, and then shut it down and reopen it in RAC mode. This technique may not be able to repair damage that is local to one instance. The DRA cannot repair failures on a primary database by using blocks or files from a standby database, and neither can it repair failures on a standby database.

EXAM TIP The DRA will function only for a single-instance database. It cannot work with a RAC clustered database, nor with a Data Guard standby database.

Exercise 16-1: Use the DRA to Diagnose and Advise Upon Problems
In this exercise, you will cause a problem with the database, and use the DRA to report on it.
1. From an operating system prompt, launch the RMAN executable:
   rman target /
2. Confirm that there is a full backup of the SYSAUX tablespace:
   list backup of tablespace sysaux;
   If this does not return at least one backup set of type FULL, create one:
   backup as backupset tablespace sysaux;
3. Shut down the instance and exit from RMAN:
   shutdown immediate;
   exit;
4. Using an operating system utility, delete the datafile(s) for the SYSAUX tablespace that were listed in Step 2. If using Windows, you may have to stop the Windows service under which the instance is running to release the Windows file lock before the deletion is possible.
5. Connect to the database with SQL*Plus, and attempt a startup:
   startup;
   This will stop in mount mode, with an error regarding the missing file. If using Windows, make sure the service has been started.
6. Launch the RMAN executable and connect, as in Step 1.
7. Diagnose the problem:
   list failure;
   This will return a message to the effect that one or more non-system datafiles are missing.
8. Generate advice on the failure:
   advise failure;
   This will suggest that you should restore and recover the datafile, and generate a repair script. Open the script with any operating system editor, and study its contents.

Using the Data Recovery Advisor
The Data Recovery Advisor makes use of information gathered by the Health Monitor to find problems, and then it constructs RMAN scripts to repair them. As with any RMAN-based utility, the instance must be started. To start an instance in nomount mode, all that is required is a parameter file. RMAN is in fact capable of starting an instance without a parameter file, using the ORACLE_SID environment variable as a default for the one parameter for which there is no default value: the DB_NAME parameter. This ability may mean that it is possible to bootstrap a restore and recovery operation from nothing.
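As an illustration of bootstrapping from nothing, the following RMAN commands are a minimal sketch rather than a definitive procedure. It assumes that ORACLE_SID is set, that RMAN was launched with rman target /, and that a controlfile autobackup containing the spfile exists in a location RMAN will search. The DBID shown is a hypothetical placeholder.

```rman
# Sketch only: start an instance with no parameter file, then retrieve the spfile.
set dbid 1234567890;              # hypothetical DBID; substitute the real one for your database
startup force nomount;            # no parameter file is found, so RMAN starts the instance with defaults
restore spfile from autobackup;   # assumes an autobackup is available in the searched location
startup force nomount;            # restart the instance, this time using the restored spfile
```

Once the instance is running in nomount mode with its real parameters, the controlfile and datafiles can be dealt with in the usual way.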
The flow for using the DRA is as follows:
• Assess data failures: The Health Monitor, running reactively or on demand, will write error details to the ADR.
• List failures: The DRA will list all failures, classified according to severity.
• Advise on repair: The DRA will generate RMAN scripts to repair the damage.
• Execute repair: Run the scripts.

The commands can be run from the RMAN executable, or through Database Control. The advice will only be generated for errors previously listed and still open. No advice will be generated for additional errors that have occurred since the listing, or for errors fixed since the listing.

TIP If one or more failures exist, then you should typically use LIST FAILURE to show information about the failures and then use ADVISE FAILURE in the same RMAN session to obtain a report of your repair options.

Figure 16-2 shows a DRA session, launched from the RMAN executable. The situation is that the instance started and mounted the database, but failed to open.

Figure 16-2 A DRA session, using the Recovery Manager

The first command in the figure launches the RMAN executable, from an operating system prompt. The connection succeeds, but RMAN reports that the database is not open. The second command lists all current failures: there is one nonsystem datafile missing. If this step were omitted, the next step would not return anything. The third command generates advice on fixing the failure. The first suggestion is that some error by the system administrators could be responsible for the problem and could be fixed manually. Then there is an automatic repair involving restore and recovery. This is in the form of an RMAN script.
The contents of the script (not shown in the figure) were
   # restore and recover datafile
   restore datafile 4;
   recover datafile 4;
To run the script, the command would be
   repair failure;
Following this, the database can be opened.

TIP The DRA works, but you can often do better. For example, it does not generate scripts that will minimize downtime by opening the database before doing the restore and recovery (which would be possible in the example).

On connecting with Database Control to a damaged database, there will always be a button named PERFORM RECOVERY. Figure 16-3 shows the window this will produce for the same situation shown in Figure 16-2.

Figure 16-3 The Database Control interface to the DRA

The Information section seen in Figure 16-3 shows that there is one failure, and that the database is mounted. The ADVISE AND RECOVER button will launch a wizard that will list details of the failure, generate the repair script, submit it as a job to the Enterprise Manager job system, and finally prompt you to open the database.

EXAM TIP The DRA will not generate any advice if you have not first asked it to list failures. Any failures occurring since the last listing, or fixed since the last listing, will not be advised upon.

The DRA can generate scripts to restore a missing or damaged controlfile copy, to rebuild a missing online log file group, and to restore and recover missing or damaged datafiles. It will not take any action if a member of a multiplexed log file group is damaged.

Exercise 16-2: Repair a Fault with the DRA
In this exercise, you will diagnose and repair the problem caused in Exercise 16-1 using Database Control.
1. Using a browser, attempt to connect to Database Control. This will present a window stating that the database is mounted, with buttons for STARTUP and PERFORM RECOVERY.
2. Click STARTUP.
Enter operating system and database credentials and follow the prompts to open the database. This will fail, so click PERFORM RECOVERY.
3. In the Perform Recovery window, click ADVISE AND RECOVER to enter the DRA Wizard.
4. In the View And Manage Failures window, click ADVISE.
5. In the Manual Actions window, click CONTINUE WITH ADVICE.
6. In the Recovery Advice window, observe the script and click CONTINUE.
7. In the Review window, click SUBMIT RECOVERY JOB.
8. When the job completes, use either Database Control or SQL*Plus to open the database.

It is possible that Database Control will have become confused as a result of this exercise, and may have trouble determining what state the database is in. If this appears to be the case, close the browser and restart the Database Control processes from an operating system prompt with the commands:
   emctl stop dbconsole
   emctl start dbconsole
Reconnect with the browser, and confirm that the database is now open.

Database Restore and Recovery
Some files are critical. Damage to a critical file will mean that the database instance will terminate if it is open, and cannot be reopened until the damage is repaired. Other files are noncritical: if these are damaged, the database can remain open or be opened if it is closed. In either case, there is no reason to lose data: you should be able to perform a complete recovery from any form of damage, provided that you have a backup and the necessary archivelog files. The one exception to this rule is if you lose all copies of the current online log files.
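For a noncritical datafile, the repair can be done with the database open, which keeps downtime to a minimum. The following RMAN commands are a minimal sketch of that approach, not a script generated by the DRA. It assumes the database is mounted and running in archivelog mode, that a backup of the file and all necessary archive logs are available, and that the damaged file is datafile 4 (the number is borrowed from the Figure 16-2 example).

```rman
# Sketch only: complete recovery of a noncritical datafile with the database open.
sql 'alter database datafile 4 offline';   # take the damaged file offline so that the open can succeed
alter database open;                       # users can work with the undamaged parts of the database
restore datafile 4;                        # extract the file from the most recent backup
recover datafile 4;                        # apply redo to bring the file up to date
sql 'alter database datafile 4 online';    # make the recovered file available again
```

This technique is not available for critical files, because the database cannot be opened while they are damaged.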
The critical files are
• Any copy of the controlfile
• A datafile that is part of the SYSTEM tablespace
• A datafile that is part of the current undo tablespace

Noncritical files are
• Multiplexed online log files
• Tempfiles
• Datafiles that are not part of the SYSTEM or current undo tablespaces

As a general rule, damage to any number of datafiles should be repaired with a complete recovery: no loss of data. Restore the damaged file(s), and apply redo to bring them right up to date. Incomplete recovery means to restore the database (the entire database) and apply redo only up to a certain point. All work done after that point will be lost. Why would one do this? Usually for one reason only: user error. If a mistake is serious enough, it will be necessary to take the whole database back in time to before the error was made so that the work can be redone, correctly. A second reason for incomplete recovery is that a complete recovery was attempted, but failed. This will happen if archive log files are missing, or if all copies of the current online log file group are lost.

There are four steps for complete recovery:
• Take the damaged datafile(s) offline.
• Restore the damaged file(s).
• Recover the restored file(s).
• Bring the recovered file(s) online.

Complete Recovery from Data File Loss Using RMAN
Media failure resulting in damage to one or more datafiles requires use of restore and recover routines: a backup of the datafile must be restored, and then archive redo logs applied to it to synchronize it with the rest of the database. There are various options available, depending on whether the database is in archivelog mode or not, and whether the damaged file is critical to Oracle’s ongoing operation or is a noncritical file containing “only” user data.

Recovery of Datafiles in Noarchivelog Mode
There is no supported technique for recovery when in noarchivelog mode, because the archive log files needed for recovery do not exist.
Therefore, only a restore can be done. But if a restored datafile is not synchronized with the rest of the database by application of archive redo log files, it cannot be opened. The only option when in noarchivelog mode is therefore to restore the whole database: all the datafiles and the controlfile. Provided that all these files are restored from a whole offline backup, after the restore you will have a database where all these files are synchronized, and thus a database that can be opened. But you will have lost all the work done since the backup was taken.

Once the full restore has been done, the database will still be missing its online redo log files, because they were never backed up. For this reason, the post-restore startup will fail, with the database being left in mount mode. While in mount mode, issue ALTER DATABASE CLEAR LOGFILE GROUP <group number> commands to recreate all the log file groups. Then open the database. If you do the restore through the Database Control interface to RMAN, this process will be fully automatic.

In noarchivelog mode, loss of any one of possibly hundreds of datafiles can be corrected only by a complete restore of the last backup. The whole database must be taken back in time, with the loss of users’ work. Furthermore, that last backup must have been a whole, offline backup, which will have entailed downtime. It should by now be apparent that the decision to operate your database in noarchivelog mode should not be taken lightly.

TIP Virtually all databases (including test and development systems) should run in archivelog mode. Even if the service level agreement says that data can be lost, if any work is ever lost, your users will not be happy.

EXAM TIP If in noarchivelog mode, your only options following loss of any datafile are either a whole database restore, or to drop the relevant tablespace. There can be no recovery.
The RMAN commands to restore a database in noarchivelog mode are
   shutdown abort;
   startup mount;
   restore database;
   alter database open resetlogs;
The first command will terminate the instance. This will not be necessary if the damaged file is part of either the SYSTEM or currently active undo tablespaces, because in that case it would have aborted already. The second command brings the database up in mount mode; this can be done only if all copies of the controlfile are available. If they are not, the controlfile must first be restored (as described in Chapter 18). The third command will restore all datafiles from the most recent full, or incremental level 0, backup. The fourth command will recreate the online log file members, set the log sequence number to 1, and open the database.

If you are using incremental backups, there is a minor variation in restoring a database in noarchivelog mode: the RECOVER command is needed, to apply the incremental backups. After the restore and before opening the database, run this command:
   recover database noredo;
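Putting the restore commands and the incremental variation together, the complete sequence might look like the following. This is a sketch under stated assumptions: the database runs in noarchivelog mode, all controlfile copies are intact, and there is a consistent level 0 backup with one or more later incremental backups taken while the database was closed.

```rman
# Sketch only: whole-database restore in noarchivelog mode, using incremental backups.
shutdown abort;                  # unnecessary if the instance has already terminated
startup mount;                   # requires every controlfile copy to be available
restore database;                # restores all datafiles from the level 0 (or full) backup
recover database noredo;         # applies the incremental backups; no redo is applied
alter database open resetlogs;   # recreates the online logs and resets the log sequence number to 1
```

All work done since the last incremental backup is lost, which is the price of running in noarchivelog mode.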