Cannot backup Exchange 2010 SP1 DB with DPM 2010

    Question

  • I am really confused by this issue. I have been backing up an Exchange 2010 SP1 DB for months without any problems and suddenly DPM stopped protecting it, the status shows "Replica is inconsistent". Synchronization with consistency check does not help, neither does restarting the DPM and/or Exchange server. The error message in DPM is:

    Data consistency verification check failed for LOGS of Exchange Mailbox Database Mailbox Database on EXMBX.mydomain.com. (ID 30146 Details: Unknown error (0xfffffe0b) (0xFFFFFE0B)). Either the database files are corrupt or the proper versions of the Eseutil.exe and Ese.dll files are missing. If you have recently upgraded your Exchange server, copy them from that server to the DPM server. Contact your Exchange server administrator, and then verify the issue. You can recover the last known good backup to address the corrupted state.

    I verified the eseutil and ese.dll files on the DPM server - the version is the same as on the Mailbox server. I've also tried removing the replica and reinstalling the protection agent - it doesn't help. Even the first synchronization does not succeed. The event log on the Mailbox server does not seem to be giving out any useful information - only errors like this:

    Exchange VSS Writer (instance 77945763-4904-43a3-8b9d-7ba1e344b764:1) has completed the backup of database 'Mailbox Database' with errors. The backup did not complete successfully, and no log files were truncated for this database.

    I am out of ideas about what to try next. Maybe the logs are indeed corrupted, but can I start a manual consistency check somehow, even if the database has to be brought offline? Despite these problems, the mailbox server is functioning properly and no user is experiencing any problem accessing e-mail, calendar, etc. Any clue? Thanks in advance.

    Tuesday, May 03, 2011 5:50 PM

Answers

  • Yordan,

    Have you tried running the backup after unchecking the Eseutil option in DPM as a test? If this succeeds, then most likely there is something wrong with the DB or the logs. I have not seen any mention of running eseutil against the DB or the logs. If the DB is fine, then run eseutil /ml against the checkpoint file. If you are not comfortable or familiar with eseutil, then please contact our Exchange support to assist with the tests. This will require the DBs to be dismounted unless you are doing an offline test.

    Andy


    Regards, A.Nadar, This posting is provided "AS IS" with no warranties, and confers no rights.
    Thursday, May 12, 2011 1:55 AM

All replies

  • Hi Yordan

    First, verify the VSS writers for Exchange. Open an elevated command prompt and type vssadmin list writers to list the VSS writers on your Exchange server. As far as I understand, you have multiple Exchange servers, a DAG perhaps?

    Have you applied the latest QFE for DPM? http://www.microsoft.com/downloads/en/details.aspx?FamilyID=d3fabb18-1adb-4c87-a95d-d3c3826d5bfb

    If you look in the logs on the Exchange server, do you see any VSS-related errors? Are there any errors logged on the DPM server? Have you updated your Exchange server lately (patches, etc.)?

    Regarding your error: "Exchange VSS Writer (instance 77945763-4904-43a3-8b9d-7ba1e344b764:1) has completed the backup of database 'Mailbox Database' with errors. The backup did not complete successfully, and no log files were truncated for this database." This seems to indicate that the VSS writer on the Exchange server is faulty.


    Best Regards

    Robert Hedblom

    MVP DPM


    Check out my DPM blog @ http://robertanddpm.blogspot.com

    Tuesday, May 03, 2011 7:29 PM
  • Hi, Robert and thanks for your reply. I collected some information about the VSS writers and they all seem to be OK according to the output of the command above. Specifically for the Exchange VSS Writer:

     Writer name: 'Microsoft Exchange Writer'
       Writer Id: {76fe1ac4-15f7-4bcd-987e-8e1acb462fb7}
       Writer Instance Id: {48b90b2a-0d0f-4085-a5cf-17adb057875f}
       State: [1] Stable
       Last error: No error
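
On a server with many writers, the same check can be scripted. The following is a minimal Python sketch, not part of the original thread, that parses text in the vssadmin list writers format shown above and reports any writer whose state is not Stable or whose last error is set. The function names are hypothetical, and the sample text is the Exchange writer entry quoted above.

```python
import re


def parse_vss_writers(text):
    """Parse `vssadmin list writers` style output into a list of dicts."""
    writers = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = re.match(r"Writer name: '(.*)'$", line)
        if match:
            # A new writer entry starts at every "Writer name:" line.
            current = {"name": match.group(1)}
            writers.append(current)
            continue
        if current is None or ":" not in line:
            continue
        key, _, value = line.partition(":")
        current[key.strip().lower()] = value.strip()
    return writers


def unhealthy_writers(writers):
    """Return writers that are not Stable or that report a last error."""
    return [
        w for w in writers
        if "Stable" not in w.get("state", "")
        or w.get("last error") != "No error"
    ]


SAMPLE = """\
Writer name: 'Microsoft Exchange Writer'
   Writer Id: {76fe1ac4-15f7-4bcd-987e-8e1acb462fb7}
   Writer Instance Id: {48b90b2a-0d0f-4085-a5cf-17adb057875f}
   State: [1] Stable
   Last error: No error
"""

writers = parse_vss_writers(SAMPLE)
print(unhealthy_writers(writers))  # an empty list means every writer looks healthy
```

A stable writer between backups does not rule out a failure during a backup, so it is worth capturing the output again right after a failed job.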

    As far as the updates are concerned, Windows Server 2008 R2 has all the updates installed (except SP1). I haven't applied Update Rollup 2 for Exchange 2010 SP1 yet. Is there a chance that this would help? Same situation with DPM, the QFE is not installed either.


    The Exchange infrastructure is fairly simple: no DAGs, just one Hub Transport with Client Access server and one Mailbox server. On the Mailbox server there are some warnings with source ESE that began appearing in the event log when the backups stopped working:

    Information Store (3844) Mailbox Database: A request to write to the file "D:\IFD Engineering Mailbox Database\Logs\E00.log" at offset 212480 (0x0000000000033e00) for 512 (0x00000200) bytes succeeded, but took an abnormally long time (143 seconds) to be serviced by the OS. This problem is likely due to faulty hardware. Please contact your hardware vendor for further assistance diagnosing the problem.

    I know that the storage is not super fast, but I haven't experienced any problems with other applications so far. And this error does not appear when a backup is started but randomly.
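
To see how often and how badly these stalls occur, the warning text can be parsed once the events have been exported as text. This is a minimal Python sketch, not from the original thread; it assumes the message wording is exactly as quoted above, and the function name is hypothetical.

```python
import re

# Pattern built from the wording of the ESE warning quoted above.
SLOW_WRITE = re.compile(
    r'A request to write to the file "(?P<path>[^"]+)" '
    r"at offset (?P<offset>\d+) \(0x[0-9a-fA-F]+\) "
    r"for (?P<size>\d+) \(0x[0-9a-fA-F]+\) bytes succeeded, "
    r"but took an abnormally long time \((?P<seconds>\d+) seconds\)"
)


def parse_slow_write(message):
    """Return the file, offset, size and duration of a slow write, or None."""
    match = SLOW_WRITE.search(message)
    if match is None:
        return None
    return {
        "path": match.group("path"),
        "offset": int(match.group("offset")),
        "size": int(match.group("size")),
        "seconds": int(match.group("seconds")),
    }


SAMPLE = (
    r'Information Store (3844) Mailbox Database: A request to write to the file '
    r'"D:\IFD Engineering Mailbox Database\Logs\E00.log" at offset 212480 '
    r"(0x0000000000033e00) for 512 (0x00000200) bytes succeeded, but took an "
    r"abnormally long time (143 seconds) to be serviced by the OS."
)

info = parse_slow_write(SAMPLE)
print(info["path"], info["seconds"])
```

Collecting the seconds values over a few days shows whether the stalls cluster around backup windows or really are random.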

    Should I apply the updates or maybe check the database for consistency issues first? The DPM server claims that it is an issue with the log files. I can't figure out whether the problem is on the DPM side or Exchange side. There are multiple SQL Server databases on the same storage that are backed up by DPM without problems... If it was a hardware issue, I think the scope of the problem would be much bigger.


    Wednesday, May 04, 2011 9:43 PM
  • Hi, As mentioned in the DPM error message, the verification check is failing for the mailbox database logs. This can happen if the eseutil versions on the DPM server and the Exchange mailbox server are incompatible, or if there is corruption on the Exchange server, on the wire, or on the DPM replica. Can you please copy eseutil.exe and ese.dll from the Exchange server to the bin folder under the DPM install location and try a consistency check? If it still fails with the same error, then there might be corruption and we need to investigate it.
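
One way to confirm that the two copies really match, beyond comparing version numbers, is to compare file hashes. This is a minimal Python sketch, not part of the original reply; the function names are hypothetical, and the two directories are whatever locations hold the files on the Exchange server and under the DPM bin folder.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_binaries(exchange_dir, dpm_dir, names=("eseutil.exe", "ese.dll")):
    """Map each file name to True when both copies exist and are identical."""
    result = {}
    for name in names:
        source = Path(exchange_dir) / name
        copy = Path(dpm_dir) / name
        result[name] = (
            source.is_file()
            and copy.is_file()
            and sha256_of(source) == sha256_of(copy)
        )
    return result


# Example call (both paths are placeholders, not taken from the thread):
# print(compare_binaries(r"\\EXMBX\share\ExchangeBin", r"C:\path\to\DPM\bin"))
```

A False entry means the file is missing on one side or the copies differ, which would be worth fixing before looking for corruption.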
    Nagesh[MSFT] This posting is provided "AS IS" with no warranties, and confers no rights.
    Thursday, May 05, 2011 5:37 AM
  • OK, according to your output your VSS writer is in a stable state. That's a good start.

    Apply both the November 2010 QFE (http://www.microsoft.com/downloads/en/details.aspx?FamilyID=f399fbfa-5c8b-4eb6-bda2-ea997745919a) and the March 2011 QFE (http://www.microsoft.com/downloads/en/details.aspx?FamilyID=d3fabb18-1adb-4c87-a95d-d3c3826d5bfb). Neither of them addresses any Exchange problem you are having, but this is the first thing Microsoft Support will ask you about if you open a case. Apply those, don't forget to update your agents, and see if that solves the problem.

    One thing that seems odd is the new error you posted: Information Store (3844) Mailbox Database: A request to write to the file "D:\IFD Engineering Mailbox Database\Logs\E00.log" at offset 212480 (0x0000000000033e00) for 512 (0x00000200) bytes succeeded, but took an abnormally long time (143 seconds) to be serviced by the OS. This problem is likely due to faulty hardware. Please contact your hardware vendor for further assistance diagnosing the problem.

    You mentioned that your Exchange log files and databases are on the same storage. Can the storage solution handle the workload that your SQL databases and Exchange data generate? How is the RAID level configured: RAID 10, 6, 5, 1 or 0?

    Is there any chance that you could move the Exchange files to other storage or to local disks, to determine whether hardware latency is the cause of this error?

    Please set up performance counters for the disk subsystem and verify that you don't have any problems there.


    Best Regards

    Robert Hedblom

    MVP DPM


    Check out my DPM blog @ http://robertanddpm.blogspot.com

    Thursday, May 05, 2011 7:32 AM
  • Sorry for the delayed answer. I managed to apply the updates today. I applied the two QFEs for DPM and also Update Rollup 2 for Exchange 2010 SP1 on the hub transport and mailbox servers. Before applying the updates I removed the protection agent from the mailbox server and then installed it again. Unfortunately this did not fix the problem. The replica is still inconsistent and the replica creation fails with the same error after ESEUTIL runs.

    Nageshbk, Update Rollup 2 updated the version of Eseutil and ese.dll and I copied them to DPM before adding the mailbox database to the protection group but it seems that this was not the source of the problem (before applying the updates the versions were equal too).

    Robert, I could try to move the logs to a fixed disk if this could help. The storage does not seem overloaded at all - the SQL databases there are only DEV DBs and no production ones, so the load is not very high. This is a RAID 5 configuration - I know that it is not recommended, but for an environment not that busy it shouldn't be a concern I think. Anyway, I suppose the logs may be the root of the issue as DPM claims. Currently the Logs folder is filled with untruncated logs since the last successful backup.



    Friday, May 06, 2011 7:35 PM
  • Yordan,

    Have you tried running the backup after unchecking the Eseutil option in DPM as a test? If this succeeds, then most likely there is something wrong with the DB or the logs. I have not seen any mention of running eseutil against the DB or the logs. If the DB is fine, then run eseutil /ml against the checkpoint file. If you are not comfortable or familiar with eseutil, then please contact our Exchange support to assist with the tests. This will require the DBs to be dismounted unless you are doing an offline test.

    Andy


    Regards, A.Nadar, This posting is provided "AS IS" with no warranties, and confers no rights.
    Thursday, May 12, 2011 1:55 AM
  • Hi, Andy. Thanks for your suggestion. I tried protecting the DB without the Eseutil option and the initial express full backup succeeded. Subsequent incrementals were successful too. I see in the Logs folder that the logs have been truncated. The question is whether I need to take any further steps now. Maybe I should try re-enabling the Eseutil option and triggering a consistency check to see if it succeeds. If it does, would that mean that the problem is resolved?
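
Log truncation can also be confirmed by counting the closed transaction log files before and after a backup. This is a minimal Python sketch, not part of the original post. It assumes the log prefix is E00 (as in the E00.log path quoted earlier in the thread) and that closed logs are named with the prefix followed by eight hexadecimal digits; the function name is hypothetical.

```python
import re
from pathlib import Path

# Assumed naming: prefix E00 followed by an 8-digit hex generation number.
# The active log (E00.log), E00tmp.log and the checkpoint file do not match.
LOG_NAME = re.compile(r"^E00[0-9A-F]{8}\.log$", re.IGNORECASE)


def closed_logs(log_dir):
    """Return the sorted names of closed transaction logs in log_dir."""
    return sorted(
        entry.name
        for entry in Path(log_dir).iterdir()
        if entry.is_file() and LOG_NAME.match(entry.name)
    )


# Example call, using the log folder named in the ESE warning above:
# logs = closed_logs(r"D:\IFD Engineering Mailbox Database\Logs")
# print(len(logs), logs[:1], logs[-1:])
```

If the count drops sharply after a successful express full backup, truncation is working; a count that only ever grows suggests the backups are still completing with errors.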
    Thursday, May 12, 2011 2:15 PM
  • Yordan,

    At this point you are not using any integrity checks when doing your backup, which is probably not what you want. The next step is to validate the integrity of the production Exchange DB and logs by running eseutil against both, to ensure that there is nothing wrong. If those come back OK, then follow Robert's suggestions and start looking at hardware. Thank you
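
As an illustration of the eseutil runs mentioned above, this minimal Python sketch only builds the command lines and does not execute anything. It is not from the original reply. The switch meanings are as I understand them and should be confirmed with eseutil /? on the server: /k verifies database checksums, /ml verifies the log files, and /mk dumps the checkpoint file (so /ml is normally pointed at the log prefix rather than at the checkpoint file). The database file name in the example is hypothetical, and the checkpoint file is assumed to sit in the log folder.

```python
import subprocess
from pathlib import PureWindowsPath


def eseutil_checks(db_file, log_dir, log_prefix="E00"):
    """Build eseutil command lines for DB, log and checkpoint checks."""
    logs = PureWindowsPath(log_dir)
    return [
        # Checksum verification of the (dismounted) database file.
        ["eseutil", "/k", str(PureWindowsPath(db_file))],
        # Integrity check of the log stream with the given prefix.
        ["eseutil", "/ml", str(logs / log_prefix)],
        # Dump of the checkpoint file, assumed to live with the logs.
        ["eseutil", "/mk", str(logs / f"{log_prefix}.chk")],
    ]


commands = eseutil_checks(
    r"D:\IFD Engineering Mailbox Database\Mailbox Database.edb",  # hypothetical
    r"D:\IFD Engineering Mailbox Database\Logs",
)
for command in commands:
    # list2cmdline adds the quoting needed for paths containing spaces.
    print(subprocess.list2cmdline(command))
```

The printed lines can be pasted into an elevated command prompt on the Mailbox server once the database is dismounted.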

    Andy


    Regards, A.Nadar, This posting is provided "AS IS" with no warranties, and confers no rights.
    Thursday, May 12, 2011 4:02 PM
  • So, incremental backups keep succeeding, and I have also re-enabled consistency checks for the protection group. I ran a manual consistency check and it seems to check the DB and report no problems. I haven't actually run eseutil manually against the DB. Do you think that is necessary, or is the problem resolved? There are no more errors either from DPM or in the Exchange logs. Maybe it was a temporary problem in the logs, and now that the logs have been truncated the problem is gone. If this is possible at all. :)

    Wednesday, May 18, 2011 7:34 PM
  • I had problems backing up an Exchange database in a DAG cluster with 2 servers. Here is the site I used.

    http://itonline4free.com/index.php?p=/discussion/8/cannot-backup-exchange-2010-dag


    • Edited by Mapalo Wednesday, February 08, 2012 2:56 AM
    Wednesday, February 08, 2012 2:37 AM