Backup Server:
Sun 280 Solaris 2.9
901 RMAN Catalog running on an 8.1.7 EE database
Veritas Netbackup DataCenter 4.5 MP 3

Production Server:
P660-6H1 AIX 4.3.3 ML09
Primary Site for an HACMP Cluster and DataGuard
9.0.1.3 EE Database, 24x7.
Veritas NB Oracle DB Extension

This system has been working flawlessly for the last 1.5 years, but a couple of weeks ago I started receiving the following intermittent RMAN-00600 error when backing up the primary site for DataGuard and HA:

RMAN-00600: internal error, arguments [13014] [26] [] [] []

A trace file would get created a couple of minutes after the failure, but with virtually no text in it. I opened a TAR with Oracle and gave them the trace. They could not find anything in their docs regarding this error and said I was the only one encountering it. Before they soft closed the TAR, it happened again with the same symptoms.

It happened once again late last week, so in my RMAN script I turned debug on and turned tracing on in the tape channels. After turning these on, two traces were produced. I sent those off to Oracle along with the RMAN output, and I re-opened the TAR, but so far support has no additional information to give me. Here is the script:

connect target sys/XXXXXXX@PROD1
connect catalog rman901/XXXXXXX@PRODCAT1
sql 'alter system archive log current';
debug on ;
run {
# Hot database level 1 cumulative incremental backup
allocate channel t1 type 'SBT_TAPE' trace=1
  parms="ENV=(NB_ORA_CLASS=emsdb1_PROD1, NB_ORA_SCHED=dailyincremental)";
allocate channel t2 type 'SBT_TAPE' trace=1
  parms="ENV=(NB_ORA_CLASS=emsdb1_PROD1, NB_ORA_SCHED=dailyincremental)";
backup current controlfile for standby;
backup
  incremental level 1
  cumulative
  skip offline
  skip readonly
  skip inaccessible
  tag hot_PROD1_bk_level1_cm
  filesperset 20
  format 'bk_PROD1_hot_cm_lv1_%U_%t_%s'
  (database);
}
debug off ;
resync catalog;

I checked the Veritas logs, and it looks as though the control file gets backed up, then the database backup starts and fails.

Does anyone know of any other tracing I could do? Would sql 'alter session set events '10046.......'' trace help determine this problem?

TIA,
Pete's
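Because the error is intermittent, it can help to pull the argument list out of each saved RMAN log and compare it across runs. The following is only a sketch: the function name `rman600_args` and the idea of keeping one log file per run are assumptions, not something described in the post. It relies solely on the error line format shown above.

```shell
#!/bin/sh
# Print the non-empty arguments of every RMAN-00600 line in a log file.
# Expected line format (taken from the post):
#   RMAN-00600: internal error, arguments [13014] [26] [] [] []
# rman600_args is a hypothetical helper name.
rman600_args() {
    grep 'RMAN-00600' "$1" |
        sed -e 's/.*arguments//' -e 's/\[\]//g' -e 's/[][]//g' |
        tr -s ' ' |
        sed -e 's/^ //' -e 's/ $//'
}
```

Calling `rman600_args /path/to/backup.log` on a log containing the line above prints `13014 26`; a log without the error prints nothing.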
| |||
Hello Pete,

Can you test if backup to disk (not tape) is working?

Regards,

Ron
DBA Infopower
http://www.dbainfopower.com
Standard disclaimer:
http://www.dbainfopower.com/dbaip_ad...isclaimer.html
| |||
"Ron" <support@dbainfopower.com> wrote in message news:<vMCdnSmpW6p6JazdRVn_iw@comcast.com>...
> Hello Pete,
>
> Can you test if backup to disk (not tape) is working?
>
> Regards,
>
> Ron
> DBA Infopower
> http://www.dbainfopower.com
> Standard disclaimer:
> http://www.dbainfopower.com/dbaip_ad...isclaimer.html

Yeah, I was planning on trying that next.

Thanks,
Pete's
| |||
"Ron" <support@dbainfopower.com> wrote in message news:<vMCdnSmpW6p6JazdRVn_iw@comcast.com>...
> Hello Pete,
>
> Can you test if backup to disk (not tape) is working?
>
> Regards,
>

The same error occurs when allocating disk channels.

Pete's
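For anyone who wants to repeat the disk test, here is a rough sketch of how the level 1 backup from the original script might be pointed at a disk channel instead of SBT_TAPE. The channel name `d1`, the backup directory, the command file path, and the format string are all assumptions; Pete did not post the disk variant he actually ran.

```shell
#!/bin/sh
# Write a disk-only variant of the level 1 backup to an RMAN command file.
# BACKUP_DIR and CMDFILE are hypothetical; pick a filesystem with enough space.
BACKUP_DIR="${BACKUP_DIR:-/backup/PROD1}"
CMDFILE="${CMDFILE:-/tmp/disk_test.rman}"

cat > "$CMDFILE" <<EOF
run {
  # Same backup options as the tape job, but through a disk channel
  allocate channel d1 type disk format '${BACKUP_DIR}/bk_PROD1_disk_%U';
  backup
    incremental level 1
    cumulative
    skip offline
    skip readonly
    skip inaccessible
    filesperset 20
    (database);
  release channel d1;
}
EOF

echo "Command file written to $CMDFILE"
echo "Run it with: rman target <user>@PROD1 catalog <user>@PRODCAT1 cmdfile=$CMDFILE"
```

The script only writes the command file and prints how to run it, so it can be reviewed before anything touches the database. If the disk run fails the same way, as it did here, the media manager is unlikely to be the cause.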
| |||
Pete's wrote:
> "Ron" <support@dbainfopower.com> wrote in message news:<vMCdnSmpW6p6JazdRVn_iw@comcast.com>...
>
>> Hello Pete,
>>
>> Can you test if backup to disk (not tape) is working?
>>
>> Regards,
>>
>
> Same error occurs when allocating disk channels.
>
> Pete's

If what you have is ...

RMAN-00600: internal error, arguments [2560] [] [] [] []

... it means that some string literal in the RMAN script is too long.

--
Daniel Morgan
http://www.outreach.washington.edu/e...ad/oad_crs.asp
http://www.outreach.washington.edu/e...oa/aoa_crs.asp
damorgan@x.washington.edu
(replace 'x' with a 'u' to reply)
| |||
Hello Pete,

Have there been any recent changes on the system (upgrades, etc.)?

Can you check with the SA whether any system errors are showing up?

Another thing: if possible, can you run dbv to verify that the database datafiles are okay?

Regards,

Ron
DBA Infopower
http://www.dbainfopower.com
Standard disclaimer:
http://www.dbainfopower.com/dbaip_ad...isclaimer.html

"Pete's" <empete2000@yahoo.com> wrote in message news:6724a51f.0402170842.7b9d3394@posting.google.com...
> "Ron" <support@dbainfopower.com> wrote in message news:<vMCdnSmpW6p6JazdRVn_iw@comcast.com>...
> > Hello Pete,
> >
> > Can you test if backup to disk (not tape) is working?
> >
> > Regards,
> >
>
> Same error occurs when allocating disk channels.
>
> Pete's
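A dbv check of the kind suggested here might be prepared along these lines. This is a sketch only: the helper name `dbv_commands`, the datafile paths, and the 8192 block size are hypothetical, and the real values would have to come from the database itself. The function just prints the dbv command lines so they can be reviewed before being run.

```shell
#!/bin/sh
# Print one dbv command per datafile so the list can be reviewed first.
# Usage: dbv_commands BLOCKSIZE FILE...
# dbv_commands is a hypothetical helper name.
dbv_commands() {
    blocksize="$1"
    shift
    for f in "$@"; do
        printf 'dbv file=%s blocksize=%s logfile=%s.dbv.log\n' \
            "$f" "$blocksize" "$(basename "$f")"
    done
}

# Hypothetical datafile paths and block size
dbv_commands 8192 /u01/oradata/PROD1/system01.dbf /u01/oradata/PROD1/users01.dbf
```

Each printed line uses only the `file`, `blocksize`, and `logfile` parameters of dbv; writing a separate log per datafile makes it easier to see which file, if any, reports corrupt blocks.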
| |||
"Ron" <support@dbainfopower.com> wrote in message news:<ctydndXf0eTb7q_dRVn-vg@comcast.com>...
> Hello Pete,
>
> Any recent changes on system (upgrades, etc)?
>
> Can you check with SA is any system errors are showing up?
>
> Another thing: If possible, can you run dbv to verify that database
> datafiles are okay?

Responding to Daniel Morgan:

The information you posted about string literals is more than support has told me, which has not been much of anything. Yesterday I found that the Veritas logs were complaining about string literals when the backup errored. After finding that, I cut all string literals down to fewer than 10 characters. I still received the same errors writing to both disk and tape.

Responding to Ron:

No changes have been made to this system in the last 2 to 3 months, either to the OS or to the DB software.

BTW, this is fixed. The solution? I didn't think of it right away, but I manually ran a resync on the catalog, reran the backup, and it worked. It looks like the recovery catalog got mucked up somehow; I don't know how, but I'd sure like to. The reason I didn't think of resyncing the catalog earlier is that I write all my RMAN scripts (the backup script included) so that they finish with a resync of the catalog, and I would have thought that would have caught it. I guess I've learned an important step in troubleshooting RMAN: no matter what the error is, always run a resync, even if you think it's not going to help.

Thanks for your suggestions, gentlemen.

Hope this helps someone else.
Pete's
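The fix described above, a resync run on its own before the backup is retried, could be scripted roughly as follows. The file path, the `needs_resync` helper, and the idea of checking the previous log for RMAN-00600 are assumptions added for illustration; the post only says that a manual resync followed by a rerun of the backup worked.

```shell
#!/bin/sh
# Write an RMAN command file that only resyncs the recovery catalog,
# to be run by itself before retrying a failed backup.
# RESYNC_CMDFILE is a hypothetical path.
RESYNC_CMDFILE="${RESYNC_CMDFILE:-/tmp/resync_only.rman}"

cat > "$RESYNC_CMDFILE" <<'EOF'
# Manual resync, run on its own before the backup is retried
resync catalog;
EOF

# needs_resync LOGFILE: succeeds if the log shows an RMAN-00600 internal error.
# Hypothetical helper; the log path depends on how the backup job is launched.
needs_resync() {
    grep -q 'RMAN-00600' "$1"
}

echo "Resync command file: $RESYNC_CMDFILE"
```

A wrapper could then call `needs_resync` on the log of the failed run and, if it succeeds, run `rman target <user>@PROD1 catalog <user>@PRODCAT1 cmdfile=$RESYNC_CMDFILE` before starting the backup again. Why the resync at the end of the regular backup script did not have the same effect is left unexplained in the thread.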
| ||||
Hello Pete,

You can send it to Oracle support as well :-)

Regards,

Ron
DBA Infopower
http://www.dbainfopower.com
Standard disclaimer:
http://www.dbainfopower.com/dbaip_ad...isclaimer.html