Thursday, September 06, 2007

Flashback Recovery Area full

Ok, I promise I will take more time to read things like manuals and other documents when it comes to new features in the database.

Take the Flashback Recovery Area (FRA). Oracle does store archive logs in there, which is cool as it speeds up the recovery. The bad thing is that the Oracle Database behaves like a small child. When you do not look it will do strange things.

So I was busy setting up the online backup of an infrastructure database. This one is still in 10.1.x while all the others in this environment are on 10.2.x .
For the RMAN I have a catalog (also 10.2.x) so I had some problems with the versions as the 10.1.x database refused to get into the 10.2.x RMAN catalog. I thought that I have plenty of space for the archivelogs in the ASM (everything was RAC of course).

I turned my attention to some other issues (aka not watching your kid for a moment). Then during an important demo for the client a developer told me that he cannot use the OID.

Of course I checked, but the OID was running, and the database was running as well (ps -ef | grep smon) . Ok, a quick look at the alert.log showed that the database was struggeling with ORA-19815. Apparently the archive log destination was full.
I checked with the ASM which I had cleaned up a day before. Still 30% free?!?!

Metalink revealed that the FRA was full. Now Oracle has done everything to shield the FRA from direct access. This means that I had to get rid of the archivelogs in the FRA with RMAN.
However I still wanted to use the available catalog. Finally I decided to have one backup with RMAN (10.1.x) without the catalog. As the database was already in limbo I went to some drastic measures:

- shutdown abort
- startup
- shutdown immediate
- startup mount
- alter database flashback off;
- alter database open;

Then I tried to set the db_recovery_files_dest_size to a lower value in order to trigger the cleanup process.

- alter system set db_recovery_files_dest_size=1G scope=both;

I also changed the db_flashback_retention_target from 1440 minutes t 120 minutes.

- alter system set db_flashback_retention_target=120 scope=both;


Obviously this did not help because the v$recovery_file_dest still showed that I had 0 bytes of reclaimable_space and 267 files in the FRA.

So now I took the non-catalog RMAN with a deletion of the archive logs. When doing this the FRA is cleaned up as well.

So everything was working again and nobody in the demo was aware of this (the infra is just used for some specific logins that were not used).

Conclusion: Databases behave like little kids, always be aware what they are doing. Fortunately there are manuals for the Database ;-)

No comments: