DPM 2010 not filling tapes?

  • Question

  • Hi guys,

    This is a new DPM 2010 installation on a Windows 2008 R2 server with a SCSI-connected HP Ultrium 1840.  Backups to this drive and these tapes were running fine with BackupExec.

    There is one backup drive attached (there was an Ultrium 460 at one point, which I removed) and one protection group, with file and Hyper-V data selected.  I ran a long-term backup to tape. A few Hyper-V VMs backed up fine, then it started on the file data from one of the servers, got 1.2 TB into that and asked for another tape.  No problem: that made about 1.5 TB on a single tape, so I was happy with that.  However, the second tape was only in for a few minutes before DPM asked for another one, with only 50-60 GB written to it!  Both tapes were Short Erased before being used.

    So I figured it might just be old tapes, put in a brand new one and started a new backup.  Again a few Hyper-V images went on, but when it got to the same file data it only reached 600 GB before asking for another tape.  I put another brand new tape in, and 300 GB later it is now asking for a third tape!

    So before I cancel this job or enter another tape, what troubleshooting can I do here?  Is there any way of seeing WHY DPM wants another tape?

     

     

    Thanks,

    Philip

     

     

     

     

     

    Monday, August 16, 2010 9:27 AM

Answers

  • MichaelC,

    What may be occurring is that one or more PGs have a different long-term retention range.  DPM will only co-locate protection groups that have the same retention range, so more tapes would be used under that condition.

     

    Also, a tape gets flagged as Offsite Ready according to a combination of variables.

     

    Rotating Tapes Offsite
    ======================

    DPM Administrator Console indicates when a tape in the library should be removed and stored in your archive location by displaying a green icon in the Offsite Ready column. You can also view all tapes ready to be stored offsite in the Tape Management Report. The Tape Management Report lists tapes that will be due for offsite storage in the upcoming period of time selected for the report.

    When the data on a tape expires, return the tape to the tape library. Expired tapes not returned to the tape library will be marked as "overdue" in the Tape Management Report. Overdue tapes expired during an earlier reporting period. Expired tapes should be returned to the tape library for reuse.


    DPM 2007 - http://technet.microsoft.com/en-us/library/cc964296.aspx
    DPM 2010 - http://technet.microsoft.com/en-us/library/ff399230.aspx

    When tape colocation is enabled, a tape onto which an offsite backup is written will not be shown as Offsite Ready unless one of the following conditions is met:

     - The tape is full.
     - One of the datasets has expired.
     - Write-period ratio has been crossed. (By default, this is first backup time + 15 per cent of retention range.)

     

     

    TapeWritePeriodRatio is not a registry value. It is a global DPM property that must be set using a DPM CLI command. Here is the command to set it:

        Set-DPMGlobalProperty -DPMServerName <dpm server name> -TapeWritePeriodRatio <fraction>

     

    WritePeriodRatio:

    TapeWritePeriodRatio is a property that can be set when colocation is enabled on tape.  For a non-co-located protection group, the tape is marked as “offsite ready” as soon as it is written to.
     
    Colocation is enabled for PGs with identical retention periods.  However, the times at which data is backed up to tape may vary for each co-located PG. In such cases, this parameter lets the user decide how long to keep writing to a tape before sending it offsite, even though it might not be full.

    WritePeriodRatio indicates the number of days for which data can be written on to a tape as a ratio of the retention period of the first data set written to the tape.

    The WritePeriodRatio value can be between 0.0 and 1.0.

    Default value is 0.15 (i.e. 15%)


    For example, if the retention period for the first data set written to a tape is 100 days, and you set the Write-Period ratio as 0.15, data will be written to the tape for 15 days from the day on which the first data set was written to the tape. After this period, the tape will be marked as offsite ready and no more data will be written to that tape until all recovery points have expired on that tape.

    Additionally - if OptimizeTapeUsage property is set to true and if CurrentTime > First Dataset’s CreationTime + (WritePeriodRatio * RetentionRange of First Dataset) then no data will be written on to the tape till all datasets on tape are expired.


    TapeExpiryTolerance:

    Indicates the number of days over the furthest expiry date among all datasets on the tape that the tape will continue to be used.

    The default value is 10 percent.

    These registry values are located under HKLM\Software\Microsoft\Microsoft Data Protection Manager\1.0\Colocation.

     


    The OptimizeTapeUsage parameter should be set to True to set TapeWritePeriodRatio for a DPM server.
    The value can range from 0 - 1.

     

    ******************************************************************************************
    ExpiryToleranceRange indicates the time window within which the expiry date of the dataset to be written to the tape must fall. It is expressed as a percentage. The default value is 17 percent.


    This is a DWORD type registry value located under HKLM\Software\Microsoft\Microsoft Data Protection Manager\1.0\Colocation. DPM does not create this key automatically. You must manually create this key if you want to set the ExpiryToleranceRange.

     

    For ExpiryToleranceRange:


    It is used in the calculation of the time window as follows:

    Let FurthExpDate = the furthest expiry date among the expiry dates of all the datasets already on the tape.

    Time Window:

    Lower bound = FurthExpDate - TapeExpiryTolerance * (FurthExpDate - today's date)
    Upper bound = FurthExpDate + TapeExpiryTolerance * (FurthExpDate - today's date)

    So the current dataset will be co-located on the given tape only if its expiry date falls within the Time Window (both bounds inclusive).
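
As a rough illustration of the time window described above, here is a minimal Python sketch. The helper names `expiry_window` and `fits_on_tape` and the sample dates are made up for the example, and it assumes the tolerance is applied as a plain fraction (0.17 for 17 percent). It only mirrors the formula in this post and is not DPM code.

```python
from datetime import datetime, timedelta


def expiry_window(furthest_expiry, today, tolerance):
    """Return (lower, upper) bounds of the co-location time window.

    Hypothetical helper mirroring the formula in the post:
    FurthExpDate -/+ tolerance * (FurthExpDate - today).
    """
    margin = (furthest_expiry - today) * tolerance
    return furthest_expiry - margin, furthest_expiry + margin


def fits_on_tape(dataset_expiry, furthest_expiry, today, tolerance):
    """True if the dataset's expiry date falls inside the window (bounds inclusive)."""
    lower, upper = expiry_window(furthest_expiry, today, tolerance)
    return lower <= dataset_expiry <= upper


# Sample values (assumptions): furthest expiry is 100 days away, tolerance 17%.
today = datetime(2010, 8, 25)
furthest = today + timedelta(days=100)
lower, upper = expiry_window(furthest, today, 0.17)
print("window:", lower, "to", upper)
print(fits_on_tape(today + timedelta(days=90), furthest, today, 0.17))
print(fits_on_tape(today + timedelta(days=60), furthest, today, 0.17))
```

With these sample numbers the window spans roughly 17 days either side of the furthest expiry date, so a dataset expiring in 90 days would qualify for the tape and one expiring in 60 days would not.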

     

    For TapeWritePeriodRatio:

    This parameter governs the time period for which DPM is able to go on adding datasets to a co-located tape. It is expressed as a fraction in the range [0, 1]. It is used in the calculation of the Time Limit as follows:

    Let FirstBkUpTime = the earliest backup time among the datasets on the media.

    Time Limit = FirstBkUpTime + WritePeriodRatio * RetentionRangeOfFirstDataset

    So a dataset will go onto a co-located tape only if the current time is less than the Time Limit as defined above.
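
The Time Limit rule can be sketched in a few lines of Python. This is only an illustration of the formula in this post: the function names and the sample dates are assumptions, and the retention range is taken to be in days.

```python
from datetime import datetime, timedelta


def write_time_limit(first_backup_time, retention_days, write_period_ratio=0.15):
    """Time Limit = FirstBkUpTime + WritePeriodRatio * RetentionRangeOfFirstDataset."""
    if not 0.0 <= write_period_ratio <= 1.0:
        raise ValueError("write_period_ratio must be between 0.0 and 1.0")
    return first_backup_time + timedelta(days=retention_days * write_period_ratio)


def can_still_write(now, first_backup_time, retention_days, write_period_ratio=0.15):
    """True while the current time is still before the Time Limit."""
    return now < write_time_limit(first_backup_time, retention_days, write_period_ratio)


# Sample values (assumptions): first dataset written 1 Aug 2010, 100-day retention.
first_backup = datetime(2010, 8, 1, 18, 0)
limit = write_time_limit(first_backup, retention_days=100)
print("tape accepts new datasets until:", limit)
print(can_still_write(datetime(2010, 8, 10), first_backup, 100))
print(can_still_write(datetime(2010, 8, 20), first_backup, 100))
```

With a 100-day retention range and the default ratio of 0.15, the tape stops accepting new datasets 15 days after the first one was written, which matches the worked example earlier in this post.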

    ********************************************************************************************************

     

    Hope this helps.

     


    Regards, Mike J [MSFT] This posting is provided "AS IS" with no warranties, and confers no rights.
    Wednesday, August 25, 2010 9:54 PM

All replies

  • Just wanted to add the alert that comes up asking for a tape, in case I'm misunderstanding which tape it is asking for or something like that!


    Affected area: E:\
    Occurred since: 16/08/2010 10:17:14
    Description: Backup to Tape job is paused because the required tape Free Tape is not available in the library Stand Alone Drive Tape Library Hewlett Packard LTO Ultrium-4 drive. (ID 3315)
    More information
    Recommended action:
    Insert the tape with label Free Tape in the library through the I/E port or by unlocking the library door. Click "Continue Job" in the Details section of this alert, if you add the required tape without using the Add tape (I/E port) or Unlock Door actions on the Library tab in the Management task area.
    Continue job...
    Resolution: To dismiss the alert, click below
    Inactivate alert

    Monday, August 16, 2010 10:31 AM
  • It sounds like that tape drive is experiencing some I/O errors. The default behavior is for DPM to treat any I/O error after more than 30GB is written to tape as an “end of media” command, and thus ask for a new tape.

    DPM reads the following reg key

     [HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft Data Protection Manager\Agent]
    "TapeSize"=dword:00030000

    If the driver sends an IO_DEVICE_ERROR and the amount of data written by DPM is more than the TapeSize value (in MB), DPM automatically converts IO_DEVICE_ERROR to END_OF_TAPE_REACHED and spans to the next media without any issues.
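
The rule just described can be modelled as a small decision function. This is a simplified Python sketch of the behaviour as explained in this post, not DPM's actual code. The function name is hypothetical, and the numeric codes are the Windows error numbers for an I/O device error (1117) and end of tape (1100) that appear later in this thread.

```python
IO_DEVICE_ERROR = 1117      # "The request could not be performed because of an I/O device error."
END_OF_TAPE_REACHED = 1100  # "The physical end of the tape has been reached."


def effective_status(error_code, written_mb, tape_size_mb):
    """Model of the rule: an I/O error past the TapeSize threshold is treated as end of tape."""
    if error_code == IO_DEVICE_ERROR and written_mb > tape_size_mb:
        return END_OF_TAPE_REACHED
    return error_code


# An I/O error after 345 GB, checked against two hypothetical thresholds (in MB).
written_mb = 345 * 1024
print(effective_status(IO_DEVICE_ERROR, written_mb, tape_size_mb=30 * 1024))
print(effective_status(IO_DEVICE_ERROR, written_mb, tape_size_mb=900000))
```

With a low threshold the I/O error is silently turned into a request for the next tape. With a threshold above the amount written, the same error would surface as a real I/O failure, which is the point of raising the value as a test.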

    To confirm that this is the cause, you can try increasing the registry value to, say, 900000 (900 GB) and see whether we span after that or whether we get real I/O errors that need to be fixed.
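
One thing to watch when changing this value: in .reg file syntax, the digits after `dword:` are hexadecimal, while regedit lets you pick hex or decimal when editing. The Python sketch below (helper names are made up for illustration) converts between the two so you can check which number you are actually setting.

```python
def reg_dword_to_int(text):
    """Convert a .reg style value such as 'dword:00030000' to an integer (digits are hex)."""
    prefix, digits = text.split(":", 1)
    if prefix.lower() != "dword":
        raise ValueError("expected a value of the form dword:XXXXXXXX")
    return int(digits, 16)


def int_to_reg_dword(value):
    """Convert an integer to .reg style 'dword:XXXXXXXX' notation."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("a DWORD must fit in 32 bits")
    return "dword:%08x" % value


print(reg_dword_to_int("dword:00030000"))  # value as a decimal number
print(int_to_reg_dword(900000))            # 900000 decimal in .reg notation
```

Read as hex, `dword:00030000` is 196608, so it is worth confirming in regedit which base is displayed before comparing it with the figures quoted here. If you import the change from a .reg file, 900000 decimal would be written as `dword:000dbba0`.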

     


    Regards, Mike J [MSFT] This posting is provided "AS IS" with no warranties, and confers no rights.
    Monday, August 16, 2010 2:14 PM
  • Hi Mike, thanks for the reply!

     

    OK, so I stopped the running job; it had put 345 GB onto that brand new 800 GB/1.6 TB tape.  I went into Libraries and marked the tape as Free, then did an Erase, after which the tape had a Tape Label of Free(with data).  Not sure if that means it was going to overwrite the data or not?

    I then made the registry change and rebooted the server.  I then started another backup, which ran for 480 GB before stopping and asking for another "Free Tape".

    Next steps?

     

    Monday, August 16, 2010 5:20 PM
  • Hi Philip,

     

    The other possibility is that the tape drive itself is reporting END_OF_TAPE_REACHED before it's actually full.  I'm not really sure how to troubleshoot that, but you can download the DPMErasetape.Zip file from the link below, then use mct.exe and mytape.exe to mount a tape, write until the tape is full, and see what the last block number is that we write to.  If you run it multiple times and get different results each time, then that's the problem.

    http://cid-b03306b628ab886f.office.live.com/self.aspx/.Public/DPMeraseTape.zip?wa=wsignin1.0&sa=801749705#resId/B03306B628AB886F!822


    The utilities are not that user friendly, but here are the basics.

    Always Stop DPMLA Service prior to running MCT.EXE Commands.

      NET STOP DPMLA


    The MCT utility has the -m [MOVE] command to move media around inside the library.

    -m [ElemType-T] Transport# [ElemType-Source] S_lot#/D_rive# [ElemType-Destination] S_lot#/D_rive#

    To view the command syntax for the -m (move) command for changer 0:

    C:\tools\mct>mct-x64 0 -m

    Opening changer \\.\Changer0
    MoveMedium : mct -m t N s\d N s\d N   [Where s/d means Slot or Drive and N is ZERO based].

     

    Some Examples:

    mct-x64 -m t 0 s 0 d 0    (Using transport-0, move media from slot-0  to drive-0)
    mct-x64 -m t 0 d 0 s 0    (Using transport-0, move media from drive-0 to slot-0)
    mct-x64 -m t 0 s 0 s 100  (Using transport-0, move media from slot-0  to slot-100)
    mct-x64 -m t 0 d 0 d 1    (Using transport-0, move media from drive-0 to drive-1)
    mct-x64 -m t 0 s 0 ie 0   (Using transport-0, move media from slot-0  to IEPort 0)

     

    Once you have moved a tape into a drive, use the mytape commands loadtape, taperewind and locktape, then disable hardware compression (setdriveinfo), set the block size to 65536 (64K) (settapeinfo), and run writeforspanning.

    You need the symbolic name of the tape drive you loaded the media into. In the DPM console, click the tape drive and look in the details for \\.\tape########.  Use that in the following command.

     

    Mytape.exe \\.\Tape2147483638

    Status: Getting the handle for \\.\Tape2147483638...Success

    \\.\Tape2147483638>TapeConsole_1.0>loadtape

    Status: Loading the tape...Success

    \\.\Tape2147483638>TapeConsole_1.0>taperewind

    Status: Rewinding Tape ...Success

    \\.\Tape2147483638>TapeConsole_1.0>locktape

    Status: Locking the tape...Success

    \\.\Tape2147483638>TapeConsole_1.0>setdriveinfo

    Hardware error correction  [y]-Enable / [n] Disable : y
    Hardware data compression  [y]-Enable / [n] Disable : N   (BE SURE TO DISABLE)
    Data padding  [y]-Enable / [n] Disable : n
    Setmark reporting   [y]-Enable / [n] Disable : n
    Number of bytes between the end-of-tape warning and the physical end of the tape: 0
    Status: Setting Drive Information...Success


    \\.\Tape2147483638>TapeConsole_1.0>settapeinfo 65536

    Enter BlockSize: Status: Setting Tape Information...Success

    \\.\Tape2147483638>TapeConsole_1.0>writeforspanning

    Status: Writing onto tape...Failed !!!
    Error_ID reported: 1100                 (net helpmsg 1100 = The physical end of the tape has been reached.)
    Number of bytes written: 983040     (Ignore bytes written, we'll get physical tape position later)
    Giving up
    Time taken: 15788ms

    \\.\Tape2147483638>TapeConsole_1.0>gettapeposition

    Status: Getting Tape Position...Success
    ------------------------------------------------------------------
                          Tape Position
    ------------------------------------------------------------------
    Absolute Position:
    Current tape partition: 0
    Current tape position: 0,32015    (Note these numbers, they should not change between tests)
    Logical Position:
    Current tape partition: 0
    Current tape position: 0,32015

    \\.\Tape2147483638>TapeConsole_1.0>taperewind

    Status: Rewinding Tape ...Success

     

    REPEAT \\.\Tape2147483638>TapeConsole_1.0>writeforspanning

    When finished testing, erase the tape:

    \\.\Tape2147483638>TapeConsole_1.0>erasetape s

    Short erase / Long Erase [s/l]:Status: Erasing the tape...Success

     

    let me know what you find.


    Regards, Mike J [MSFT] This posting is provided "AS IS" with no warranties, and confers no rights.
    Monday, August 16, 2010 6:58 PM
  • OK, so I've upgraded the firmware on the drive, confirmed that the latest Windows driver is installed, and am using new tape media.  Output from the first run of writeforspanning is below. I'm running it again, but I think the error might be of use.

     

    Status: Getting the handle for \\.\Tape2147483646...Success
    
    \\.\Tape2147483646>TapeConsole_1.0>loadtape
    
    Status: Loading the tape...Success
    
    \\.\Tape2147483646>TapeConsole_1.0>taperewind
    
    Status: Rewinding Tape ...Success
    
    \\.\Tape2147483646>TapeConsole_1.0>locktape
    
    Status: Locking the tape...Success
    
    \\.\Tape2147483646>TapeConsole_1.0>setdriveinfo
    
    Hardware error correction [y]-Enable / [n] Disable : y
    Hardware data compression [y]-Enable / [n] Disable : n
    Data padding [y]-Enable / [n] Disable : n
    Setmark reporting [y]-Enable / [n] Disable : n
    Number of bytes between the end-of-tape warning and the physical end of the tape: 0
    Status: Setting Drive Information...Success
    
    \\.\Tape2147483646>TapeConsole_1.0>gettapeinfo
    
    Status: Getting Tape Information...Success
    ------------------------------------------------------------------
          Media Information
    ------------------------------------------------------------------
    Total number of bytes on the current tape partition: 195,1579155456
    Number of bytes between the current position and the end of the current tape partition: 195,1579155456
    Number of bytes per block: 65536
    Number of partitions on the tape: 0
    Write protect status: Not write protected
    
    \\.\Tape2147483646>TapeConsole_1.0>settapeinfo 65536
    
    Enter BlockSize: Status: Setting Tape Information...Success
    
    \\.\Tape2147483646>TapeConsole_1.0>writeforspanning
    
    Status: Writing onto tape...Failed !!!
    Error_ID reported: 1117
    Number of bytes written: 0
    Giving up
    Time taken: 2115872ms

    \\.\Tape2147483646>TapeConsole_1.0>gettapeposition

    Status: Getting Tape Position...Success
    ------------------------------------------------------------------
    Tape Position
    ------------------------------------------------------------------
    Absolute Position:
    Current tape partition: 0
    Current tape position: 0,3645996
    Logical Position:
    Current tape partition: 0
    Current tape position: 0,3645996
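
A note on reading the `a,b` pairs in this output. The sketch below assumes (this is a guess about how the utility prints 64-bit values, not something documented in this thread) that the two numbers are the high and low 32-bit halves of one value. The helper name is made up for the example.

```python
def combine_high_low(text):
    """Combine a 'high,low' pair of 32-bit halves into a single 64-bit number (assumed format)."""
    high, low = (int(part) for part in text.split(","))
    return high * 2**32 + low


capacity_bytes = combine_high_low("195,1579155456")  # from gettapeinfo above
position_blocks = combine_high_low("0,3645996")      # from gettapeposition above
print(capacity_bytes, "bytes reported for the partition")
print(position_blocks, "blocks at the point of failure")
```

Under that assumption the partition size reported by gettapeinfo works out to about 839 GB, which is in line with an 800 GB LTO-4 tape, and the position is simply the block number, since the high half is 0.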

     

    Thursday, August 19, 2010 3:11 PM
  • Exactly - that error is: ErrorCode: 0x45D (1117):

    The request could not be performed because of an I/O device error.

     

    C:\>net helpmsg 1117

    The request could not be performed because of an I/O device error.

     

    Try cleaning the drive and try again.

    You need this test to fail with an error 1100 which is "The physical end of the tape has been reached" - until you fix that, DPM will not utilize the tape properly.

     


    Regards, Mike J [MSFT] This posting is provided "AS IS" with no warranties, and confers no rights.
    Thursday, August 19, 2010 3:49 PM
  • One more run for reference -

     

     

    \\.\Tape2147483646>TapeConsole_1.0>taperewind
    
    Status: Rewinding Tape ...Success
    
    \\.\Tape2147483646>TapeConsole_1.0>writeforspanning
    
    Status: Writing onto tape...Failed !!!
    Error_ID reported: 1117
    Number of bytes written: 0
    Giving up
    Time taken: 1875163ms
    \\.\Tape2147483646>TapeConsole_1.0>gettapeposition
    
    Status: Getting Tape Position...Success
    ------------------------------------------------------------------
          Tape Position
    ------------------------------------------------------------------
    Absolute Position:
    Current tape partition: 0
    Current tape position: 0,3236096
    Logical Position:
    Current tape partition: 0
    Current tape position: 0,3236096
    
    \\.\Tape2147483646>TapeConsole_1.0>

     

     

     

    It's unfortunate that DPM is reporting I/O errors as "end of tape"; it's quite misleading and would worry you a bit about the data that has actually ended up on the tape.  This error indicates a CRC failure, so is the failed block re-transmitted to the next tape or not when the job is continued?  The reason I ask is that a number of people on the Internet are reporting similar issues and just assuming that this behaviour of needing extra tapes is "ok", if a bit odd, and so ignoring it.  Maybe this could be queried with the dev team, Mike?  I'm happy to troubleshoot anything they need me to in the next day or so to try and get this right.

     

    Thanks for your help so far.

    Philip

     

     

    Thursday, August 19, 2010 4:08 PM
  • The other strange thing is that BackupExec worked fine with this drive, there are no I/O errors in the Windows Event Logs, and a full HP LTT test (2 hours) passed the other day.  I'm running an HP test overnight again now, a 13-hour write test, and will see how this goes.
    Thursday, August 19, 2010 4:36 PM
  • Hi Philip,

     

    We had to change the design of "end-of-tape" processing because some tape drive manufacturers report an I/O error instead of the proper end-of-tape error when EOT is reached.  If we failed the job for this bogus I/O error, we could never perform a backup to those tape drives.  As a compromise, we selected this behavior.  The data already written to the tape is valid and we span correctly, but in your case above, we're actually getting a real I/O error before end of tape is reached.

    The 1st test failed at block 3645996 * 65536 = 238,943,993,856 bytes
    The 2nd test failed at block 3236096 * 65536 = 212,080,787,456 bytes

    Notice the inconsistency - those are real I/O errors.
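
For anyone repeating this test, the arithmetic above is just block position times block size. A small Python sketch (variable and function names are illustrative only) that converts the positions from the two runs and shows how far apart they are:

```python
BLOCK_SIZE = 65536  # block size set with settapeinfo in the tests above


def failure_offset_bytes(block_position, block_size=BLOCK_SIZE):
    """Byte offset on tape corresponding to a block position from gettapeposition."""
    return block_position * block_size


# Block positions reported after the two writeforspanning runs in this thread.
runs = {"1st test": 3645996, "2nd test": 3236096}
offsets = {name: failure_offset_bytes(block) for name, block in runs.items()}

for name, offset in offsets.items():
    print(f"{name}: {offset:,} bytes")

spread = max(offsets.values()) - min(offsets.values())
print(f"difference between runs: {spread:,} bytes")
```

A drive that genuinely hits end of tape should stop at nearly the same offset on every run of the same tape, so a gap of tens of gigabytes between runs points to an error somewhere in the write path rather than to the tape being full.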

     


    Regards, Mike J [MSFT] This posting is provided "AS IS" with no warranties, and confers no rights.
    Thursday, August 19, 2010 4:47 PM
  • Hi Mike,

    That makes sense alright.  What I'm getting at is that other users who are experiencing I/O errors may be ignoring them and just putting it down to DPM craziness asking for extra tapes, and sure if you've got a loader it doesn't matter anyway as it will just move onto another tape.  However what happens on one of these errors, is the failed block resent to the next tape?

    What manufacturers are we talking about here?  A whitelist on this rule for manufacturers that are reporting correct EOT might be an idea?  I could be very wrong, but I'd assume HP/Dell etc. report correctly and represent a healthy portion of the market.

     

    Should I not be seeing I/O errors in the Windows Event Viewer?

    Thursday, August 19, 2010 4:56 PM
  • Hi Mike,

    I've replaced the SCSI cable, still same error at tape position 0,4491696.  Before I contact HP is there anything else I should try from the DPM side of things?  There are a couple of reasons I ask -

     

    1. I understood I should see I/O errors in the Windows event logs in case of an error like this, I don't.  The drive also has no errors in its internal logs.

    2. I ran ALL the "HP StorageWorks Library and Tape Tools" on the drive and tape, one of which was writing a full tape 5 times overnight.  All tests passed.

    3. This drive and server were backing up disk-disk-tape using BackupExec until a couple of weeks ago when I rebuilt it and installed DPM, so unless BackupExec can 'continue on error' would I not have seen issues then?

     

     

    Thanks for your replies so far!

     

    Friday, August 20, 2010 1:37 PM
  • Hi Philip,

     

    It would be up to the tape driver to report the I/O error in the event log. I'm puzzled why that isn't being logged.

    Instead of using the command writeforspanning, issue the command WriteTillEnd. This will be slower because it outputs to the console each time it writes a buffer, but let's see whether it fails at around the same spot or gives different results.  By chance, have you used different tapes with the same results?

    Concerning BackupExec, do they install their own tape drivers, or do they use the same drivers that DPM is currently using?


    Regards, Mike J [MSFT] This posting is provided "AS IS" with no warranties, and confers no rights.
    Friday, August 20, 2010 3:02 PM
  • I've only used the one new tape while doing the writeforspanning runs, but DPM has been stopping mid-tape of all tapes I've used so far (4-5).

    Yes BackupExec does tend to use its own tape drivers (it's an install time option).

    Below are two results from writetillend.  There were a whole load of 1100 errors before this, but I can't scroll up any further to see what was before them.  These two finished very close to each other, both close to the 800 GB point (800/1600 GB tapes).  I'm going to run a few more and will post in the morning.

     

    Retrying ...Failed !!!
    Error_ID reported: 1100
    Number of bytes written: 655360
    Retrying ...Failed !!!
    Error_ID reported: 1100
    Number of bytes written: 655360
    Retrying ...Failed !!!
    Error_ID reported: 1100
    Number of bytes written: 655360
    Retrying ...Failed !!!
    Error_ID reported: 1129
    Number of bytes written: 0
    Retrying ...Failed !!!
    Error_ID reported: 1129
    Number of bytes written: 0
    Retrying ...Failed !!!
    Error_ID reported: 1129
    Number of bytes written: 0
    Retrying ...Failed !!!
    Error_ID reported: 1129
    Number of bytes written: 0
    Giving up
    Time taken: 7653830ms
    \\.\Tape2147483646>TapeConsole_1.0>gettapeposition
    
    Status: Getting Tape Position...Success
    ------------------------------------------------------------------
       Tape Position
    ------------------------------------------------------------------
    Absolute Position:
    Current tape partition: 0
    Current tape position: 0,12903420
    Logical Position:
    Current tape partition: 0
    Current tape position: 0,12903420

     

     

     

     

     

     

     

    Number of bytes written: 655360
    Retrying ...Failed !!!
    Error_ID reported: 1100
    Number of bytes written: 655360
    Retrying ...Failed !!!
    Error_ID reported: 1100
    Number of bytes written: 655360
    Retrying ...Failed !!!
    Error_ID reported: 1129
    Number of bytes written: 0
    Retrying ...Failed !!!
    Error_ID reported: 1129
    Number of bytes written: 0
    Retrying ...Failed !!!
    Error_ID reported: 1129
    Number of bytes written: 0
    Retrying ...Failed !!!
    Error_ID reported: 1129
    Number of bytes written: 0
    Retrying ...Failed !!!
    Error_ID reported: 1129
    Number of bytes written: 0
    Giving up
    Time taken: 7671474ms
    \\.\Tape2147483646>TapeConsole_1.0>gettapeposition
    
    Status: Getting Tape Position...Success
    ------------------------------------------------------------------
          Tape Position
    ------------------------------------------------------------------
    Absolute Position:
    Current tape partition: 0
    Current tape position: 0,12904880
    Logical Position:
    Current tape partition: 0
    Current tape position: 0,12904880

     

     

     

     

    Saturday, August 21, 2010 8:50 PM
  • Hi,

    Yes, I would expect to see these errors at the end of the tape.

    C:\>net helpmsg 1100

    The physical end of the tape has been reached.


    C:\>net helpmsg 1129

    Physical end of tape encountered.

    12903420 * 65536 = 845,638,533,120 (845 GB)

    I guess I'm puzzled why writeforspanning is failing consistently around ~200 GB with an 1117 I/O error, but writetillend seems to run to completion and doesn't get the 1117 I/O error.  When I get into the office on Monday, I'll try the same test with an 800 GB tape.
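
If you want to compare several runs quickly, a short Python sketch like the one below can pull the Error_ID values out of pasted mytape output and say whether a run ended with end-of-tape codes or something else. The code-to-name table uses the standard Windows error names for these three numbers. The function name and the sample text are made up for the example.

```python
import re

# Windows system error codes seen in this thread.
WIN32_TAPE_ERRORS = {
    1100: "ERROR_END_OF_MEDIA",   # The physical end of the tape has been reached.
    1117: "ERROR_IO_DEVICE",      # The request could not be performed because of an I/O device error.
    1129: "ERROR_EOM_OVERFLOW",   # Physical end of tape encountered.
}
END_OF_TAPE_CODES = {1100, 1129}


def summarize_run(transcript):
    """Collect Error_ID values from mytape-style output and classify the run."""
    codes = [int(c) for c in re.findall(r"Error_ID reported:\s*(\d+)", transcript)]
    names = [WIN32_TAPE_ERRORS.get(c, "UNKNOWN") for c in codes]
    ended_at_eot = bool(codes) and all(c in END_OF_TAPE_CODES for c in codes)
    return {"codes": codes, "names": names, "ended_at_end_of_tape": ended_at_eot}


sample = """Retrying ...Failed !!!
Error_ID reported: 1100
Number of bytes written: 655360
Retrying ...Failed !!!
Error_ID reported: 1129
Number of bytes written: 0
Giving up"""
print(summarize_run(sample))
print(summarize_run("Error_ID reported: 1117\nNumber of bytes written: 0"))
```

A run that only reports 1100 and 1129 reached the real end of the tape, while any 1117 in the output means the write stopped on an I/O device error.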

     


    Regards, Mike J [MSFT] This posting is provided "AS IS" with no warranties, and confers no rights.
    Sunday, August 22, 2010 2:03 PM
  • Thanks Mike, again I appreciate the help.  I'll keep running these and another couple of writeforspanning and post results.

     

    1. writetillend test -

    Error_ID reported: 1129
    Number of bytes written: 0
    Giving up
    Time taken: 7652863ms
    \\.\Tape2147483646>TapeConsole_1.0>gettapeposition

    Status: Getting Tape Position...Success
    ------------------------------------------------------------------
    Tape Position
    ------------------------------------------------------------------
    Absolute Position:
    Current tape partition: 0
    Current tape position: 0,12903320
    Logical Position:
    Current tape partition: 0
    Current tape position: 0,12903320

     

     

    2. writeforspanning test -

    \\.\Tape2147483646>TapeConsole_1.0>taperewind

    Status: Rewinding Tape ...Success

    \\.\Tape2147483646>TapeConsole_1.0>writeforspanning

    Status: Writing onto tape...Failed !!!
    Error_ID reported: 1117
    Number of bytes written: 0
    Giving up
    Time taken: 66316ms
    \\.\Tape2147483646>TapeConsole_1.0>gettapeposition

    Status: Getting Tape Position...Success
    ------------------------------------------------------------------
    Tape Position
    ------------------------------------------------------------------
    Absolute Position:
    Current tape partition: 0
    Current tape position: 0,114396
    Logical Position:
    Current tape partition: 0
    Current tape position: 0,114396

     

     

     

     

    3. writetillend test -

    Retrying ...Failed !!!
    Error_ID reported: 1129
    Number of bytes written: 0
    Giving up
    Time taken: 7683237ms
    \\.\Tape2147483646>TapeConsole_1.0>gettapeposition

    Status: Getting Tape Position...Success
    ------------------------------------------------------------------
    Tape Position
    ------------------------------------------------------------------
    Absolute Position:
    Current tape partition: 0
    Current tape position: 0,12906170
    Logical Position:
    Current tape partition: 0
    Current tape position: 0,12906170

     

     

     

    4. writetillend test -

    Error_ID reported: 1129
    Number of bytes written: 0
    Giving up
    Time taken: 7679508ms
    \\.\Tape2147483646>TapeConsole_1.0>gettapeposition

    Status: Getting Tape Position...Success
    ------------------------------------------------------------------
    Tape Position
    ------------------------------------------------------------------
    Absolute Position:
    Current tape partition: 0
    Current tape position: 0,12907180
    Logical Position:
    Current tape partition: 0
    Current tape position: 0,12907180

     

    Sunday, August 22, 2010 4:12 PM
  • Hi,

     

    So, your tape drive does not like the writeforspanning command. I need to get details from the DPM dev group on what the difference is in the IOCTLs we send to the tape for each of those commands.

     

    FYI - here are the results of a writeforspanning test with a 400 GB tape on my system.  Note the single error 1100 (The physical end of the tape has been reached): the tape was successfully filled to capacity, so we know the mytape utility works and doesn't have any problems using the writeforspanning command.

    <snip>
    \\.\Tape2147483646>TapeConsole_1.0>writeforspanning

    Status: Writing onto tape...Failed !!!
    Error_ID reported: 1100
    Number of bytes written: 2293760
    Giving up
    Time taken: 3189862ms
    \\.\Tape2147483646>TapeConsole_1.0>gettapeposition

    Status: Getting Tape Position...Success
    ------------------------------------------------------------------
                          Tape Position
    ------------------------------------------------------------------
    Absolute Position:
    Current tape partition: 0
    Current tape position: 0,6552835
    Logical Position:
    Current tape partition: 0
    Current tape position: 0,6552835
    >snip<

    Where block 6552835*65536 = 429,446,594,560 bytes.

     


    Regards, Mike J [MSFT] This posting is provided "AS IS" with no warranties, and confers no rights.
    Monday, August 23, 2010 5:19 PM
  • Hi,

     

    This is a very interesting thread, and I would also like to provide feedback if it will help Microsoft. I have the following issue. I have a Spectralogic T120, fibre attached, with 2x LTO5 drives, although I use LTO4 tapes. Data Protection Manager 2010 is backing up short-term data to a fibre-attached SAN.

    I have 6 protection groups

    File Servers

    Domain Controllers

    Hyper-V saved state

    Hyper-V non saved state

    SQL - 2000 and 2005

    Exchange - 2003

     

    I have co-location enabled and data is certainly getting co-located, but I am getting wildly different amounts of data on a tape

    On a tape with SQL and Hyper-V data I get as little as 100 GB; at most I get 400 GB, on a tape with Domain Controller, Hyper-V saved state and Exchange data.

    On a tape that only has File Servers I get up to 1 TB of data.

    I have told each protection group to use 2 drives. It also looks like the File Server Protection Group isn't Co-Locating or it just happens to do all the other protection groups first. All the protection groups are told to start a tape backup at 6pm.

     

    Any insight into this would be most welcome. I am using 11 tapes and it looks like I should be using about 7-8.

     

    Best wishes

     

    Michael

     

    Wednesday, August 25, 2010 5:46 PM
  • MichaelC,

    What may be occurring is that one or more PGs have a different long-term retention range.  DPM will only co-locate protection groups that have the same retention range, so more tapes would be used under that condition.

     

    Also, a tape gets flagged as Offsite Ready according to a combination of variables.

     

    Rotating Tapes Offsite
    ======================

    DPM Administrator Console indicates when a tape in the library should be removed and stored in your archive location by displaying a green icon in the Offsite Ready column. You can also view all tapes ready to be stored offsite in the Tape Management Report. The Tape Management Report lists tapes that will be due for offsite storage in the upcoming period of time selected for the report.

    When the data on a tape expires, return the tape to the tape library. Expired tapes not returned to the tape library will be marked as "overdue" in the Tape Management Report. Overdue tapes expired during an earlier reporting period. Expired tapes should be returned to the tape library for reuse.


    DPM 2007 - http://technet.microsoft.com/en-us/library/cc964296.aspx
    DPM 2010 - http://technet.microsoft.com/en-us/library/ff399230.aspx

    When tape colocation is enabled, a tape on to which offsite backup is written will not be shown as Offsite Ready unless one of the following conditions is met:

     - The tape is full.
     - One of the datasets has expired.
     - Write-period ratio has been crossed. (By default, this is first backup time + 15 per cent of retention range.)

     

     

    TapeWritePeriodRatio is not a registry value. It is a global property of the DPM server and must be set with a DPM Management Shell cmdlet. Here is the command to set it:

        Set-DPMGlobalProperty -DPMServerName <dpm server name> -TapeWritePeriodRatio <fraction>

     

    WritePeriodRatio:

    TapeWritePeriodRatio is a property that can be set when colocation is enabled on tape.  In case of non-colocated ProtectionGroup, the tape becomes marked as “offsite ready” as and when it is written on to.
     
    Colocation is enabled for PGs with identical retention periods.  However, the times at which data is backed up to tape may vary for each co-located PG. In such cases, this parameter lets the user decide how long to keep writing to a tape before sending it offsite, even though it might not be full.

    WritePeriodRatio indicates the number of days for which data can be written on to a tape as a ratio of the retention period of the first data set written to the tape.

    WritePeriodRatio value can be between 0.0 and 1.0

    Default value is 0.15 (i.e. 15%)


    For example, if the retention period for the first data set written to a tape is 100 days, and you set the Write-Period ratio as 0.15, data will be written to the tape for 15 days from the day on which the first data set was written to the tape. After this period, the tape will be marked as offsite ready and no more data will be written to that tape until all recovery points have expired on that tape.

    Additionally - if OptimizeTapeUsage property is set to true and if CurrentTime > First Dataset’s CreationTime + (WritePeriodRatio * RetentionRange of First Dataset) then no data will be written on to the tape till all datasets on tape are expired.


    TapeExpiryTolerance:

    Indicates the number of days beyond the furthest expiry date among all datasets on the tape for which the tape will continue to be used.

    The default value is 10 percent.

    These registry values are located under HKLM\Software\Microsoft\Microsoft Data Protection Manager\1.0\Colocation.

     


    The OptimizeTapeUsage parameter should be set to True to set TapeWritePeriodRatio for a DPM server.
    The value can range from 0 - 1.

     

    ******************************************************************************************
    ExpiryToleranceRange indicates the time window within which the expiry date of the dataset to be written to the tape must fall. It is expressed as a percentage. The default value is 17 percent.


    This is a DWORD type registry value located under HKLM\Software\Microsoft\Microsoft Data Protection Manager\1.0\Colocation. DPM does not create this key automatically. You must manually create this key if you want to set the ExpiryToleranceRange.

     

    For ExpiryToleranceRange:


     It is used in the calculation of the time window as follows:

    Let, Furthest expiry date among the expiry dates of all the Datasets already on the tape = FurthExpDate

    Time Window =

    FurthExpDate - TapeExpiryTolerance * (FurthExpDate - today’s date) (Lower Bound)
    FurthExpDate + TapeExpiryTolerance * (FurthExpDate - today’s date) (Upper Bound)

    So, the current dataset will be co-located on the given tape only if its expiry date falls within Time Window (both bounds inclusive)
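
    A minimal Python sketch of the window check described above. The function and parameter names are illustrative only and are not DPM APIs; the 0.17 default is the 17 percent figure quoted above, and the dates in the example are made up.

```python
from datetime import datetime

def colocation_window(furthest_expiry, today, tolerance=0.17):
    """Return the (lower, upper) expiry bounds for co-locating on a tape.

    furthest_expiry: furthest expiry date among datasets already on the tape
    tolerance: expiry tolerance as a fraction (assumed default: 17 percent)
    """
    margin = (furthest_expiry - today) * tolerance
    return furthest_expiry - margin, furthest_expiry + margin

def can_colocate(dataset_expiry, furthest_expiry, today, tolerance=0.17):
    """True if the new dataset's expiry falls inside the window (inclusive)."""
    lower, upper = colocation_window(furthest_expiry, today, tolerance)
    return lower <= dataset_expiry <= upper

if __name__ == "__main__":
    today = datetime(2010, 8, 23)
    furthest = datetime(2010, 12, 1)  # 100 days after "today"
    print(colocation_window(furthest, today))
    print(can_colocate(datetime(2010, 11, 20), furthest, today))
```

    Because the margin is a fraction of the time remaining until FurthExpDate, the window narrows as the datasets already on the tape get closer to expiry.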

     

    For TapeWritePeriodRatio:

    This parameter governs the time period for which DPM is able to go on adding Datasets to a co-located tape. It is expressed as a fraction between [0-1]. It is used in the calculation of the Time Limit as follows:

    Let, Earliest backup time among any of the datasets on the media = FirstBkUpTime

    Time Limit = FirstBkUpTime + WritePeriodRatio * RetentionRangeOfFirstDataset

    So, a dataset will go in a co-located tape only if the current time is less than the Time Limit as defined above.
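
    The Time Limit rule above can be sketched in Python as follows. The names are hypothetical and chosen for illustration; this is only the arithmetic from the formula, not how DPM implements it.

```python
from datetime import datetime, timedelta

def write_time_limit(first_backup_time, retention_range, write_period_ratio=0.15):
    """Time Limit = FirstBkUpTime + WritePeriodRatio * RetentionRangeOfFirstDataset."""
    return first_backup_time + retention_range * write_period_ratio

def tape_accepts_new_datasets(now, first_backup_time, retention_range,
                              write_period_ratio=0.15):
    """True while the current time is still before the Time Limit."""
    limit = write_time_limit(first_backup_time, retention_range, write_period_ratio)
    return now < limit

if __name__ == "__main__":
    first_backup = datetime(2010, 8, 1, 18, 0)   # made-up first backup time
    retention = timedelta(days=100)              # retention of the first dataset
    print(write_time_limit(first_backup, retention))
    print(tape_accepts_new_datasets(datetime(2010, 8, 10), first_backup, retention))
```

    With a 100-day retention range and the default ratio of 0.15, the limit falls 15 days after the first backup. With a one-week retention range, the same formula gives a write window of only about 25 hours.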

    ********************************************************************************************************

     

    Hope this helps.

     


    Regards, Mike J [MSFT] This posting is provided "AS IS" with no warranties, and confers no rights.
    Wednesday, August 25, 2010 9:54 PM
  • Hi Mike,

     

    Thanks for the reply. All protection groups have the same retention period for short term and long term. Long term is one week as we are still testing. I am on holiday today, so I will read the above later, but I wanted to get this back to you ASAP.

    Best wishes

     

    Michael,

    Thursday, August 26, 2010 12:32 PM
  • Hi Mike,

     

    Any feedback from the dev team?  I'm just anxious to get this sorted as the customer cannot presently take offsite backups.

     

    Regards,

    Philip Harrison

    Monday, August 30, 2010 4:52 PM
  • Hi Philip,

    I did not have any problems with the writeforspanning command on my test library, so I did not pursue further details on the differences between the commands.

    I think the correct action on your part would be to contact the OEM manufacturer of the tape drive, supply the same test tools to them and let them debug their driver / firmware problem or, if they cannot duplicate the problem, see if they will replace the drive under warranty.


    Regards, Mike J [MSFT] This posting is provided "AS IS" with no warranties, and confers no rights.
    Monday, August 30, 2010 5:12 PM
  • Hi Mike,

     

    Thanks for the reply.  I have logged a call with HP and they are looking into it, however they have already come back to say that there are no errors logged on the drive so in their opinion there is no issue with it.  I would also point out that writetillend and BackupExec are able to use this drive without issue.  So I wonder if you could at least ask the dev team what the differences might be between writetillend and writeforspanning so I can at least pass this on to HP for them to look further into the issue.  As soon as I say to them that BackupExec/writetillend are working they'll (understandably) say this is an inconsistency in the software and not something they can fix.

    I just want to avoid the customer or me getting stuck in the middle of an issue that we can no longer do anything else to rectify.  If this is not the correct forum to troubleshoot this further by all means let me know who I should contact.

     

    Again thanks for your help so far.

    Regards.

    Tuesday, August 31, 2010 5:02 PM
  • Hi Mike,

     

    I got an all-time low today of 21,719MB on a tape. I have put a screenshot of my tape usage here - http://dl.dropbox.com/u/262417/tape_usage.png

    The tape usage is widely different for all PGs except file servers. The retention times are the same for long and short term. As you can see from the screenshot, the only PG that doesn't seem to want to colocate is Exchange, but even so the tape usage goes from 21GB to 1TB. I don't believe I should be using this many tapes.

     

    Best wishes

     

    Michael


    Wednesday, September 1, 2010 2:38 PM
  • Hi MichaelC

    Can you check which of these three may be the cause for offsite ready being set.

     - The tape is full.  (or we got I/O error and caused backup to span tapes as noted earlier in the thread)
     - One of the datasets has expired.
     - Write-period ratio has been crossed. (By default, this is first backup time + 15 percent of retention range.)


    Regards, Mike J [MSFT] This posting is provided "AS IS" with no warranties, and confers no rights.
    Wednesday, September 1, 2010 2:48 PM
  • Hi,

     

    I have just got off the phone with Spectralogic tech support and they tell me they're seeing issues with LTO4 media in IBM LTO5 drives. Some residue is coming off the tapes and getting on the drives and this is leading to excessive clean requests. My log files confirmed this.

    IBM is looking into this. HP have a little brush before the head that stops this. The IBM drives don't.

     

    Best wishes

     

    Michael

     

    Thursday, September 9, 2010 4:20 PM
  • Hi Mike,

    Bad news - HP sent me a new tape drive.  Results below, still the same issue.  So just to summarise - exact same server and hardware was backing up with BackupExec for 2-3 years.  Rebuilt with 2008 Server and DPM 2010.  DPM is showing errors doing tape backups and writeforspanning command.  writetillend and HP test tools all work perfectly.  Tried new SCSI cable, tape drive and multiple tapes.  Let me know how I should proceed here as I need to get this customer backing up ASAP or they will be going back to BackupExec pronto!

     

    \\.\Tape2147483644>TapeConsole_1.0>writeforspanning
    
    Status: Writing onto tape...Failed !!!
    Error_ID reported: 1117
    Number of bytes written: 0
    Giving up
    Time taken: 4856623ms
    \\.\Tape2147483644>TapeConsole_1.0>gettapeposition
    
    Status: Getting Tape Position...Success
    ------------------------------------------------------------------
               Tape Position
    ------------------------------------------------------------------
    Absolute Position:
    Current tape partition: 0
    Current tape position: 0,8427196
    Logical Position:
    Current tape partition: 0
    Current tape position: 0,8427196
    
    \\.\Tape2147483644>TapeConsole_1.0>taperewind
    
    Status: Rewinding Tape ...Success
    
    \\.\Tape2147483644>TapeConsole_1.0>writeforspanning
    
    Status: Writing onto tape...Failed !!!
    Error_ID reported: 1117
    Number of bytes written: 0
    Giving up
    Time taken: 951918ms
    \\.\Tape2147483644>TapeConsole_1.0>gettapeposition
    
    Status: Getting Tape Position...Success
    ------------------------------------------------------------------
               Tape Position
    ------------------------------------------------------------------
    Absolute Position:
    Current tape partition: 0
    Current tape position: 0,1642396
    Logical Position:
    Current tape partition: 0
    Current tape position: 0,1642396
    
    \\.\Tape2147483644>TapeConsole_1.0>unlocktape
    
    Status: Unlocking the tape...Success
    
    \\.\Tape2147483644>TapeConsole_1.0>loadtape
    
    Status: Loading the tape...Success
    
    \\.\Tape2147483644>TapeConsole_1.0>rewindtape
    
    Unrecognized command: rewindtape
    
    \\.\Tape2147483644>TapeConsole_1.0>taperewind
    
    Status: Rewinding Tape ...Success
    
    \\.\Tape2147483644>TapeConsole_1.0>gettapeposition
    
    Status: Getting Tape Position...Success
    ------------------------------------------------------------------
               Tape Position
    ------------------------------------------------------------------
    Absolute Position:
    Current tape partition: 0
    Current tape position: 0,0
    Logical Position:
    Current tape partition: 0
    Current tape position: 0,0
    
    \\.\Tape2147483644>TapeConsole_1.0>writeforspanning
    
    Status: Writing onto tape...Failed !!!
    Error_ID reported: 1117
    Number of bytes written: 0
    Giving up
    Time taken: 466474ms
    \\.\Tape2147483644>TapeConsole_1.0>gettapeposition
    
    Status: Getting Tape Position...Success
    ------------------------------------------------------------------
               Tape Position
    ------------------------------------------------------------------
    Absolute Position:
    Current tape partition: 0
    Current tape position: 0,797996
    Logical Position:
    Current tape partition: 0
    Current tape position: 0,797996
    
    \\.\Tape2147483644>TapeConsole_1.0>

     

     

    Tuesday, September 14, 2010 11:43 AM
  • Hi,

    I think you need to work with HP (send them the tape test tools) so they can reproduce and debug the problem in-house.  If they cannot reproduce the problem, then there is still something wrong with the customer's system, SCSI controller, or terminator.


    Regards, Mike J [MSFT] This posting is provided "AS IS" with no warranties, and confers no rights.
    Tuesday, September 14, 2010 5:53 PM
  • Hi Mike,

    I already sent HP the LTT results; they saw nothing wrong, but because I was hassling them so much they said they'd send out a drive to try.

    Could you speak to the dev team and find out from them what is the difference between the writetillend and writeforspanning command?  The scenario here is that BackupExec works fine, your own writetillend tool works fine, but your writeforspanning does not - now although I agree this could be a hardware issue I cannot start debugging writeforspanning to try and find out what it does differently and work back from there, and you surely can't expect HP to replace a server when all tools and software (even one of your own) work perfectly except one?  If I need to log a call with PSS instead of dealing with it through the forum then fine, but the customer and us being stuck between HP and MS is a bit disappointing.

    Philip

    Wednesday, September 15, 2010 7:49 AM
  • Hi Philip,

    I've run the tool myself on three different libraries, both physical and virtual, and had no problems: the writeforspanning command works perfectly and fills tapes to full capacity. I truly believe you have some other problem on that server.

    Please have HP download the tools themselves from the link below:

    http://cid-b03306b628ab886f.office.live.com/self.aspx/.Public/DPMeraseTape.zip?wa=wsignin1.0&sa=801749705#resId/B03306B628AB886F!822

     

    Then ask them to try to reproduce the problem using your exact steps. If they experience the same issue, they can diagnose it in-house and, if they need to, open a case with Microsoft for collaboration.

    If they cannot reproduce the problem, that puts them on the hook to start replacing other components.

    There is another tool called irptracker.exe (bing it) that can prove useful. It shows all the IOCTL commands sent to the drive and the status of each command, which should reveal the combination of commands the tape drive is not processing properly. Unfortunately, there is only an x86 version, so you would need to move the tape library to an x86 system before you can use it.

     


    Regards, Mike J [MSFT] This posting is provided "AS IS" with no warranties, and confers no rights.
    Wednesday, September 15, 2010 1:27 PM
  • Just a quick update on this.  I spoke to HP again; they strongly denied that the problem was theirs, claimed DPM incompatibilities with the drive, etc.  They finally had me try their Data Protector; it was showing errors after a few gigabytes of backing up to the tape!  Suddenly they suggested I try creating the registry key

    [HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Enum\SCSI\Sequential&Ven_HP&Prod_Ultrium_4-SCSI\6&c0db181&0&000300\Device Parameters\Storport]
    "BusyRetryCount"=dword:0000004b

    Rebooted the server and ran 'writeforspanning' again, now it gets to tape position 12420080, which is 758Gb which I imagine is correct for a 800/1600Gb tape.  Running it again now to be sure, but it's never got that far before so that might be it!
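
    As a rough cross-check of that figure, the sketch below converts a reported tape position into a capacity. It assumes the position counts 64 KiB blocks; that block size is not stated anywhere in the tool output and is only inferred because it reproduces the 758 figure, so treat it as an assumption.

```python
# Assumed block size: 64 KiB per tape position unit (not confirmed by the tool).
ASSUMED_BLOCK_SIZE_BYTES = 64 * 1024

def position_to_gib(position, block_size=ASSUMED_BLOCK_SIZE_BYTES):
    """Convert a tape block position to GiB under the assumed block size."""
    return position * block_size / 1024 ** 3

if __name__ == "__main__":
    # Positions reported by gettapeposition in this thread.
    for position in (797996, 1642396, 8427196, 12420080):
        print(position, round(position_to_gib(position), 1), "GiB")
```

    Under the same assumption, the failed runs earlier in the thread (positions 8427196, 1642396 and 797996) stopped at roughly 514, 100 and 49 GiB.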

     

     

     

    • Proposed as answer by NeighborGeek Tuesday, January 11, 2011 7:43 PM
    Friday, September 17, 2010 5:37 PM
  • Tape position 12420080 again on this test, so looks like this may have sorted it!  I'll run a DPM backup over the weekend and see how it goes.

    Not sure if this is a HP/MS issue in the end, might be worth passing it to the dev team for something to look out for in the Library Agent in future...

     

    thanks for your help Mike!

    Saturday, September 18, 2010 7:03 PM
  • Hi - thanks for confirming the solution.

    We have had other customers report problems with HP tape drives where the registry setting seems to fix the problem.  However, they usually see the I/O error as part of a failed job; yours was hidden from view, which made it less obvious and harder to diagnose. Thanks for sticking with me.

    Also DPM is not the only backup application that fails without that key in place.

    http://www.symantec.com/connect/forums/backup-exec-10d-101-and-hp-18-autoloader-g2


     http://social.technet.microsoft.com/Forums/en-US/dataprotectionmanager/thread/de6b2569-55f1-4027-86bc-0037c8c6a10e

    http://social.technet.microsoft.com/Forums/en-US/dpmtapebackuprecovery/thread/afe319ec-0df6-4eea-acdd-ca5f920e3dab

    So - for any other users of HP tape drives / libraries, please refer to the HP tech support recommendation:

    http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?lang=en&cc=us&taskId=110&prodSeriesId=254838&prodTypeId=12169&prodSeriesId=254838&objectID=c01655150

     


    Regards, Mike J [MSFT] This posting is provided "AS IS" with no warranties, and confers no rights.
    Saturday, September 18, 2010 7:47 PM
  • Just a final update on this.  I've just run a DPM backup to tape, it is working as expected now, filling tapes.

    Other than this issue I have to say I am VERY impressed with DPM!  Compared to the Symantec/HP equivalent offerings there really is no comparison.

     

    Thanks again for your help Mike.

    Monday, September 20, 2010 11:34 AM
  • Just a quick update on this.  I spoke to HP again; they strongly denied that the problem was theirs, claimed DPM incompatibilities with the drive, etc.  They finally had me try their Data Protector; it was showing errors after a few gigabytes of backing up to the tape!  Suddenly they suggested I try creating the registry key

     

    [HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Enum\SCSI\Sequential&Ven_HP&Prod_Ultrium_4-SCSI\6&c0db181&0&000300\Device Parameters\Storport]
    "BusyRetryCount"=dword:0000004b

     

    Rebooted the server and ran 'writeforspanning' again, now it gets to tape position 12420080, which is 758Gb which I imagine is correct for a 800/1600Gb tape.  Running it again now to be sure, but it's never got that far before so that might be it!

     

     

     

    This registry key fixed issues with 3 of my clients with HP Tape libraries, thanks very much. As a bonus we are now only using 2 tapes in DPM with colocation enabled (up to 1.2Tb per tape) instead of spreading across 4 or 5 tapes.
    Wednesday, December 1, 2010 11:03 AM
  • I'm having the same problem with my Quantum Superloader3. Any idea what the equivalent key would be for that drive?

     

    I'm thinking it's probably:

    [HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Enum\SCSI\Sequential&Ven_QUANTUM&Prod_ULTRIUM_3\6&4f6a661&0&000100\Device Parameters\MediaChangeNotification\]

     

    but I'm not sure.

    Monday, January 16, 2012 8:07 PM
  • Hi,

    Under the [HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Enum\SCSI\Sequential&Ven_QUANTUM&Prod_ULTRIUM_3\6&4f6a661&0&000100\Device Parameters] key, make a new Storport key.  Under Storport, make a new DWORD value called BusyRetryCount and assign it a value of 000000FA hex.
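
    For anyone who prefers to import the setting, here is a small Python sketch that builds the text of a .reg file for the steps above. The helper name is made up for illustration, and the device instance path is the one quoted in this thread; it will differ on other machines, so check the path under Enum\SCSI before importing anything.

```python
# Device instance key quoted in this thread; yours will almost certainly differ.
QUANTUM_DEVICE_KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Enum\SCSI"
    r"\Sequential&Ven_QUANTUM&Prod_ULTRIUM_3\6&4f6a661&0&000100"
)

def storport_reg_fragment(device_instance_key, busy_retry_count):
    """Build .reg text that sets BusyRetryCount under Device Parameters\\Storport."""
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        f"[{device_instance_key}\\Device Parameters\\Storport]\n"
        f'"BusyRetryCount"=dword:{busy_retry_count:08x}\n'
    )

if __name__ == "__main__":
    # 0xFA (250) is the value suggested above; the HP post used 0x4B (75).
    print(storport_reg_fragment(QUANTUM_DEVICE_KEY, 0xFA))
```

    The original poster rebooted the server after adding the key before re-running the test.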


    Regards, Mike J. [MSFT] This posting is provided "AS IS" with no warranties, and confers no rights.
    Tuesday, January 24, 2012 5:12 PM
  • Hi Technobeam,

    Did you resolve this issue with your superloader3?

    cheers,

    Monday, May 13, 2013 3:08 PM