Unable to connect to the database because of a fatal error SQL error code 64, other errors DPM 2016 / sql error on 2019 upgrade

  • Question

  • The other day I decided to run a few updates on our server, which was already up to date DPM-wise (a Server 2016 machine that hosts DPM 2016 and the SQL instance for DPM).

    I also ran updates on 3 of my Hyper-V servers, which were long overdue (some 2012, some 2016/2019) and which have data in the protection group on this server.

    Now I'm getting that error periodically. If I hit close and reopen DPM, I can often go 20 minutes without another instance of the error, but other times it's constant.

    Also, some of the protected items, such as SQL or Hyper-V containers, are throwing strange errors, like "storage involving the current operation could not be read from or written to" (ID 40003).

    Any ideas what the cause might be or how to remedy it?

    I was going to do a 2019 upgrade, but I'm thinking it may be best to fix this issue first.

    Thanks in advance


    Tech, the Universe, Everything: http://tech-stew.com Just Plane Crazy http://flight-stew.com


    • Edited by techfun89 Friday, November 8, 2019 4:22 PM
    Monday, November 4, 2019 9:52 PM

All replies

  • Hi,

    Here's a KB article about this error:
    BMR backup fails in Data Protection Manager or Azure Backup Server


    Best regards,
    Leon


    Blog: https://thesystemcenterblog.com LinkedIn:

    Monday, November 4, 2019 10:04 PM
  • Thanks, however I'm not seeing those errors on the host servers.

    If I delete, say, the Hyper-V container that is throwing one of these errors from the protection group and then re-add it, it then seems to work.

    Which has me wondering whether the database error on the DPM server is related, and how to fix that first.


    Tuesday, November 5, 2019 4:23 PM
  • The event id 40003 should be seen on the DPM server, as the KB article states.

    Removing the workload temporarily from the protection group usually solves many problems :-)

    Matej also blogged about this issue here:

    http://www.matej.guru/2018/06/05/dpm-2016-the-storage-involving-the-current-operation-could-not-be-read-from-or-written-to-id-40003/


    Tuesday, November 5, 2019 4:46 PM
  • It might be worth noting that the storage partition the backups are written to on the DPM server is a ReFS partition.


    Tuesday, November 5, 2019 9:09 PM
  • Do you mean the DPM storage pool? DPM has used ReFS for its storage since DPM 2016.

    Tuesday, November 5, 2019 9:43 PM
  • Please check Mike Jaquet's reply over here:

    https://social.technet.microsoft.com/Forums/en-US/93163b67-4c70-41c9-be00-e0b453998f12/dpm-2016?forum=dataprotectionmanager


    Blog: https://thesystemcenterblog.com LinkedIn:

    We are on Update Rollup 7, I think, the latest from September.

    Yeah, the DPM storage pool is ReFS.

    I've tried the remove recovery point options in that link, with no cure.


    Wednesday, November 6, 2019 8:22 PM
  • Someone suggested creating VSS shadow copies on the target server; I'm unsure if that's a real solution. I'm not sure why anything would have changed after rebooting and updating these servers (and why does it work if you remove and re-add the protected item?).

    Wednesday, November 6, 2019 8:23 PM
  • I started the defrag process on the ReFS volume, but it's still at 66% the next day, moving very slowly. I'm still unsure if this will help, or why a defrag would still be needed these days.

    Wednesday, November 6, 2019 8:35 PM
  • I checked with VSS, and the VSS shadow copy does exist, at least on the one target I checked. I'm not seeing a parameter to make it unbounded.

    Wednesday, November 6, 2019 8:48 PM
  • Assuming there really is a SQL DB error, how can I recover from it (there don't appear to be recent backups locally, maybe one from April)? Could the SQL DB connection error be causing the other issue?

    Thursday, November 7, 2019 4:41 PM
  • DPM relies fully on its database, so the connection must be healthy or you will indeed encounter issues.

    Which build of DPM are you running?


    Thursday, November 7, 2019 8:28 PM
  • 2016 rollup 7

    Thursday, November 7, 2019 9:02 PM
  • Which updates were applied on the DPM server/instance?

    Is the SQL instance for DPM located locally, on the same server as DPM?


    Thursday, November 7, 2019 9:07 PM
  • I'm not sure which updates were run; there were quite a few, on both ends.

    I did manage to restore the SQL DB back to April and did the reallocate replica.

    Despite this, I'm still getting "storage involving the current operation could not be read from or written to" (ID 40003) when I try to create a recovery point.



    • Edited by techfun89 Thursday, November 7, 2019 10:11 PM
    Thursday, November 7, 2019 10:11 PM
  • If the restore was completed, you could mark the replies that were helpful as answers in your other thread:
    DPM 2016 recreate sql database if corrupt, lose protection group?

    Just so we don't mix two different topics in this thread :-)

    Back to this thread's topic, are you receiving the same error when trying to create a recovery point for any protected workload, or just specific workloads?


    Thursday, November 7, 2019 10:20 PM
  • It causes the error on any recovery point. The only exception is if I remove an item from disk (stop protection) and re-add it; then it works.

    However, just now I'm suddenly back to getting the "SQL database is corrupted" error.


    Friday, November 8, 2019 2:04 AM
  • What happens if I just delete the DB? Do I lose the protection group and all its items? I guess that's probably a given. I'm not sure why the April backup would fail too.

    Friday, November 8, 2019 2:05 AM
  • At this point I'm wondering whether there would be any benefit if I ran the upgrade to DPM 2019, or whether things would get worse or the problem would just follow along.

    edit: when I try to update the SQL part using the 2019 installer, it complains that the DB doesn't match the installed DPM version. I guess that's because the DB was from April, before rollup 8.

    edit: I reran Update Rollup 8 and it's still throwing that error.

    I'm seeing this in the logs:

    InstalledDpmVersion from MsiEnumRelatedProducts is 5.0.457.0
    [11/7/2019 10:34:22 PM] Information : DatabaseVersion property from tbl_DLS_GlobalSetting is 5.0.158.0
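
    As a rough illustration, the two version values in log lines like these can be pulled out and compared with a short script. This is only a minimal Python sketch: it assumes the line format shown above, and `parse_versions` is a hypothetical helper, not part of DPM or its installer. A difference between the two values only shows that they differ; it does not by itself say whether anything is wrong.

```python
import re

# Matches one of the two property names seen in the log excerpt, followed
# later on the same line by "is <dotted version>", e.g. "is 5.0.457.0".
VERSION_RE = re.compile(
    r"\b(InstalledDpmVersion|DatabaseVersion)\b.*?\bis\s+(\d+(?:\.\d+)+)"
)


def parse_versions(log_text):
    """Return {property name: version as a tuple of ints} for matching lines."""
    versions = {}
    for line in log_text.splitlines():
        match = VERSION_RE.search(line)
        if match:
            name, dotted = match.groups()
            versions[name] = tuple(int(part) for part in dotted.split("."))
    return versions


# The two lines quoted in this post.
LOG = (
    "InstalledDpmVersion from MsiEnumRelatedProducts is 5.0.457.0\n"
    "[11/7/2019 10:34:22 PM] Information : DatabaseVersion property "
    "from tbl_DLS_GlobalSetting is 5.0.158.0\n"
)

versions = parse_versions(LOG)
for name, version in versions.items():
    print(name, ".".join(str(part) for part in version))
print("versions match:", versions["InstalledDpmVersion"] == versions["DatabaseVersion"])
```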


    • Edited by techfun89 Friday, November 8, 2019 3:40 AM
    Friday, November 8, 2019 2:32 AM
  • So it appears the SQL database version is the RTM one, not the rollup's? I'm not sure how that's possible.

    Friday, November 8, 2019 3:41 AM
  • What happens if i just delete the db, do i lose the protection group and all its items? I guess thats probably a given.  Not sure why the april backup would fail too

    >> You will lose everything, because everything is stored in the DPM database.

    At this point i'm wondering if i run the upgrade for dpm 2019 if there would be any benefit, or would things get worse or the problem just follow along.

    edit: when i try to update the sql part doing the 2019 installer, it complains saying the db doesnt match the installed dpm version.  i guess because the db was from april before rollup 8 

    >> I would not recommend upgrading unless the DPM environment is healthy. If there are a lot of problems, troubleshooting them will be time-consuming, so you might want to consider starting from scratch (if that's an option), in other words a clean DPM 2019 installation.

    DPM 2016 is pretty strict with the SQL Server version being used, as you can see over here:


    Friday, November 8, 2019 7:15 AM
  • The version on the DPM server is SQL Server 2016; I don't think that's an issue. The log says the "database version" is DPM 2016 RTM, while DPM is at Rollup 8. That's where the problem is, though I have no idea how to resolve it.


    Friday, November 8, 2019 3:35 PM
  • InstalledDpmVersion from MsiEnumRelatedProducts is 5.0.457.0
    [11/7/2019 10:34:22 PM] Information : DatabaseVersion property from tbl_DLS_GlobalSetting is 5.0.158.0

    This is actually OK, the DatabaseVersion will show the RTM version, it's normal behavior.

    Here's from my lab environment:


    Friday, November 8, 2019 3:49 PM
  • OK, well, I guess that doesn't explain the random error 64 SQL crashes, nor the fact that the 2019 installer complains that the database version doesn't match what's installed.

    Not sure what to check next. Things still fail and SQL still crashes randomly.

    

    



    • Edited by techfun89 Friday, November 8, 2019 4:22 PM
    Friday, November 8, 2019 4:17 PM
  • Is reinstalling DPM an option? Are any backups running at all right now or are they all failing?

    Friday, November 8, 2019 5:11 PM
  • It could be an option. However, do you mean starting with a fresh database? The issue here is that we have tapes that we might need to retrieve data from; I think if the DB gets removed and started fresh, we can no longer retrieve from them?

    An example: I try to stop protection on a group of items under a server and immediately get SQL error code 64, database corrupted.

    It's been about 5 days, and only a handful of the 200 or so recovery points have been working each day; most of those are ones I was able to stop/restart protection on.

    I've tried multiple times to resync and then create recovery points on some of these, and they always fail with "storage involving the current operation could not be read from or written to" (ID 40003).



    • Edited by techfun89 Friday, November 8, 2019 6:44 PM
    Friday, November 8, 2019 6:44 PM
  • Hi,

    Just checking to see if you have any update on your issue?

    If your issue was resolved, may I ask you to mark all the answers that helped you? This way it will also help others in the future who face the same challenge. Many thanks in advance!


    Best regards,
    Leon

    Wednesday, December 11, 2019 9:36 PM