MSDB Keeps going "Recovery Pending"

  • Question

  • Hi everyone.  First time posting.

    I have a small issue with a production AG cluster.  On one of the nodes, the MSDB database keeps going "recovery pending".  When this happens, everything fails over to another node, so there is no loss of service.  But it's a pain to have to restore these databases once in a while.

    My current theory is that Windows updates are rebooting the active node without a proper failover.  In response to my theory, I set up Cluster-Aware Updating yesterday.

    The odd thing is, I don't have MSDB added to the AG.  And in the logs, it's telling me the reason for "recovery pending" is that the log file is corrupt.

    Can someone advise where I can look to determine the root cause? Does my theory of Windows Update rebooting the server sound plausible?

    Wednesday, August 24, 2016 8:29 PM

Answers

  • The odd thing is, I don't have MSDB added to the AG.  And in the logs, it's telling me the reason for "recovery pending" is that the log file is corrupt.

    >>System databases cannot be added to an AG; each instance has its own system databases.

    As for MSDB going recovery pending, there can be many reasons. Check your SQL Server error log and SQL Agent logs, and also consider the following:

    a) Ensure there is sufficient free space on the drive and that autogrowth is not capped for the MSDB data/log files.

    b) Disk I/O subsystem issues.

    c) Your storage, or outdated NICs.

    d) Ensure you have applied the latest service packs for SQL Server.

    e) File activation failure.

    f) A corrupt log file, and a previous improper shutdown of SQL Server.
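
    The space and autogrowth check in item a) can be sketched in T-SQL roughly as follows. This is only an illustration, assuming you can connect to the affected instance and that it runs a version where sys.dm_os_volume_stats is available (SQL Server 2008 R2 SP1 or later); the column aliases are made up for readability.

    ```sql
    -- Current state of msdb (ONLINE, RECOVERY_PENDING, SUSPECT, ...)
    SELECT name, state_desc, recovery_model_desc
    FROM sys.databases
    WHERE name = N'msdb';

    -- Size, autogrowth settings and free volume space for the msdb files
    SELECT mf.name AS logical_name,
           mf.physical_name,
           mf.type_desc,
           CAST(mf.size AS bigint) * 8 / 1024 AS size_mb,   -- size is stored in 8 KB pages
           CASE WHEN mf.is_percent_growth = 1
                THEN CAST(mf.growth AS varchar(20)) + ' %'
                ELSE CAST(CAST(mf.growth AS bigint) * 8 / 1024 AS varchar(20)) + ' MB'
           END AS growth_setting,                           -- 0 means fixed size
           CASE mf.max_size
                WHEN -1 THEN 'unlimited'
                WHEN 0  THEN 'no growth'
                ELSE CAST(CAST(mf.max_size AS bigint) * 8 / 1024 AS varchar(20)) + ' MB'
           END AS max_size_setting,
           vs.volume_mount_point,
           vs.available_bytes / 1048576 AS volume_free_mb
    FROM sys.master_files AS mf
    CROSS APPLY sys.dm_os_volume_stats(mf.database_id, mf.file_id) AS vs
    WHERE mf.database_id = DB_ID(N'msdb');
    ```

    A capped max_size or a nearly full volume would point towards a), while a healthy result here suggests looking at the other items.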

    Please also check the complete event logs, and see whether a disk defrag or any other scan is running.

    Also check the cluster logs to see where the issues occur. Were any recent changes made?

    My current theory is that Windows updates are rebooting the active node without a proper failover.  In response to my theory, I set up Cluster-Aware Updating yesterday.

    >>Of course, you need to look in the SQL error log first to see how recovery of each database proceeds, along with any errors.

    You need to be strict about this whenever the OS team performs Windows updates; it may also be inadvisable to schedule automatic updates this way.


    Regards, S_NO "_"



    Thursday, August 25, 2016 3:35 AM

All replies

  • Hi 0 Day,

    Could you please share the log that tells you the msdb log file has been corrupted? I suspect the disk holding the msdb log file may be corrupted, which caused a critical server error and then triggered the failover. I suggest running CHKDSK on this disk to see if it finds any errors.

    If you have any other questions, please let me know.

    Regards,
    Lin
    Thursday, August 25, 2016 2:35 AM
  • You need to check the SQL error log to find the root cause. If it says that SQL Server couldn't open the mdf because of OS error 32 (sharing violation), it means that some other process has the file locked. OS error 112, by contrast, means there is not enough space on the disk.
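
    One way to search the error log for msdb-related entries is sketched below. It assumes sysadmin-level access and uses sp_readerrorlog, which is widely used but not officially documented, so treat the parameter meanings in the comments as commonly observed behaviour rather than a guaranteed contract.

    ```sql
    -- Current SQL Server error log (log 0, type 1 = engine log), filtered on 'msdb'
    EXEC sp_readerrorlog 0, 1, N'msdb';

    -- Previous error log (log 1), useful when the instance restarted after the failure
    EXEC sp_readerrorlog 1, 1, N'msdb';

    -- Current SQL Server Agent log (type 2), unfiltered
    EXEC sp_readerrorlog 0, 2;
    ```

    Look for the "Operating system error" number in the messages around the time msdb went recovery pending.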
    Friday, August 26, 2016 6:15 AM

  • Hi 0 Day,

    I’m writing to follow up with you on this post. Was the issue resolved? If your issue has been resolved, I’d like to mark this thread as "Answered". Please also feel free to unmark it if you have any new findings or concerns.

    If you have any other questions, please let me know.

    Regards,
    Lin
    Tuesday, August 30, 2016 4:42 AM
  • Can you post the actual error message?  Also, it would help to understand a little more about your setup.  What versions of SQL Server and Windows are you running?


    --Burt King

    Tuesday, August 30, 2016 11:02 AM