Job failure caused by ongoing recovery point creation ID: 100

  • Question

  • On my DPM server I am finding failed job log entries for one server whose SQL databases are being backed up. The error is as follows:

    Job failure on replica of <server>\msdb on <server.domain.lan> caused by ongoing recovery point creation. (ID 100)

    The error can occur for any database on this server, and if I rerun the recovery point job, it completes successfully. How do I stop this from happening, and what is actually causing it? My guess is that another job is taking too long.

    I am also wondering whether this has something to do with the way the protection group is set up. Right now I have all of my servers in one protection group. This includes SQL databases, Hyper-V servers, file directories and system states. The protection group is set up to create a recovery point at 8 AM, 12 PM and 6 PM each day, with synchronizations happening every 4 hours. Could the protection group be my issue? Should I split my backups across two or more protection groups according to the type of data being backed up?
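
    To make the schedule question concrete, here is a minimal sketch (plain Python, not a DPM API) that lists which recovery point times land on the same hour as a synchronization. The sync start hour is an assumption, since I do not know the exact offset DPM uses; `schedule_collisions` and `sync_start_hour` are hypothetical names used only for illustration.

```python
from datetime import time

def schedule_collisions(recovery_points, sync_interval_hours, sync_start_hour=0):
    """Return the recovery point times that coincide with a synchronization start.

    sync_start_hour is an ASSUMPTION: the post only says "every 4 hours",
    not when the first synchronization of the day begins.
    """
    first_sync = sync_start_hour % sync_interval_hours
    sync_hours = set(range(first_sync, 24, sync_interval_hours))
    return [rp for rp in recovery_points if rp.minute == 0 and rp.hour in sync_hours]

# Schedule from the post: recovery points at 8 AM, 12 PM and 6 PM, sync every 4 hours.
recovery_points = [time(8, 0), time(12, 0), time(18, 0)]
collisions = schedule_collisions(recovery_points, sync_interval_hours=4)
print([t.strftime("%H:%M") for t in collisions])  # ['08:00', '12:00']
```

    If the synchronizations really do start at midnight, two of the three recovery points share a start time with a sync. With a different offset (for example `sync_start_hour=2`) the 6 PM recovery point would be the one that collides instead.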

    Thanks for any advice you can offer.

    Wednesday, February 27, 2013 5:40 PM

All replies

  • Anyone have an idea on this?

    Thursday, February 28, 2013 1:12 PM
  • Hi Gilead,

    I think your issue is related to the Hyper-V (VM) backups and the SQL log backups.

    Take a look at this link: http://social.technet.microsoft.com/Forums/en-US/dpmhypervbackup/thread/a1eabaff-49ad-4724-8b88-5d933118ab73

    You should split your protection group when you are protecting different workloads.

    By definition, all members of a protection group share recovery goals—that is, all data sources of a type in a protection group must have the same retention range and data loss tolerance.

    http://technet.microsoft.com/en-us/library/hh757821.aspx

    Stephane



    Thursday, February 28, 2013 2:06 PM
  • Hi Stephane,

    I went ahead and split the protection group into two groups, which now run at different times: one for my VMs and server system states, and the other for my databases and flat files. The error still persists. I also went through the Windows Server logs as well as the SQL logs, but nothing looks suspicious. Only my databases are affected, and then only 1 to 3 of them at a time. The databases that are affected change daily, and rerunning the job manually fixes the issue.

    Any other thoughts?

    Tuesday, March 5, 2013 5:41 PM
  • Hi Gilead,

    Based on the setup you described, all of your servers are in the same protection group. As I understand it, the SQL backups are failing because another backup job is still running.

    When this error occurs, have you checked whether any other backup jobs are running on the DPM server for that SQL server?

    Also, have you configured any SQL maintenance plans for these databases? If so, please disable them.

    Ideally, protection groups should be designed carefully, keeping in mind how long the backups will take to finish. For example:

    1) All SQL servers in one protection group.
    2) All Hyper-V servers in one protection group.
    3) All system state backups in one protection group.
    4) File backups in one protection group.

    With that said, you will have to give each protection group its own express full backup times and recovery point creation times.
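
    As a rough planning aid, the sketch below checks whether the backup windows of the four groups overlap. It is plain Python rather than anything DPM provides, and the start times and durations are made-up placeholders; you would substitute the durations you actually observe in your job history. `overlapping_groups` is a hypothetical helper name.

```python
from itertools import combinations

def overlapping_groups(windows):
    """Return pairs of group names whose backup windows overlap.

    windows maps a group name to (start_minute_of_day, duration_minutes).
    Windows are assumed to end before midnight; wrap-around is not handled.
    """
    pairs = []
    for (a, (start_a, dur_a)), (b, (start_b, dur_b)) in combinations(windows.items(), 2):
        # Two intervals overlap when each one starts before the other ends.
        if start_a < start_b + dur_b and start_b < start_a + dur_a:
            pairs.append((a, b))
    return pairs

# HYPOTHETICAL schedule: the times and durations are placeholders, not measured values.
windows = {
    "SQL": (20 * 60, 60),            # 20:00, 1 hour
    "Hyper-V": (20 * 60 + 30, 180),  # 20:30, 3 hours
    "System state": (1 * 60, 60),    # 01:00, 1 hour
    "Files": (22 * 60, 60),          # 22:00, 1 hour
}
print(overlapping_groups(windows))  # [('SQL', 'Hyper-V'), ('Hyper-V', 'Files')]
```

    In this made-up schedule the long Hyper-V window collides with both the SQL and the file backups, so it is the one to move.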

    Let me know the outcome.


    Regards, AJ (MSFT)

    • Proposed as answer by ANUJ_MSFT Tuesday, March 5, 2013 11:07 PM
    Tuesday, March 5, 2013 11:07 PM
  • Hi Anuj,

    I checked the databases that were getting the error, and some maintenance plans were running on them, but the plans ran nowhere near the times of the failures. I disabled the plans anyway to see whether that would clear up the issue. I also split my protection group into two groups: one for databases and flat files, which I view as more critical, and one for system states and VMs, which only get backed up once per day at a totally different time.

    Lastly, I just upgraded my DPM installation to SP1. I am now rerunning consistency checks and plan to let them run through the night to see how the jobs do. I will let you know what happens.

    Wednesday, March 6, 2013 5:08 PM