Problem with tape and event id 15 and 11

  • Question

  • Hello,

    I use DPM 2007 on a Dell DP600 server running Windows Storage Server 2003. For long-term storage we have now bought a Tandberg Data StorageLoader LTO4, which is listed as compatible with DPM 2007. It is connected to the DPM server via SCSI LVD 320 and an Adaptec SCSI controller card. Sometimes the protection group with long-term tape protection completes successfully, but mostly it fails with the following error:

    Library drive Hewlett Packard LTO Ultrium-4 drive in Tandberg Data StorageLoader/StorageLibrary is not functioning and library jobs may fail until the drive is repaired. The drive is not functioning for the following reason:
    
    (ID: 3303)
    
    DPM encountered a critical error while performing an I/O operation on the tape DC02FUE-LT-1Days-Copy0-00000014 (Barcode - ) in Drive Hewlett Packard LTO Ultrium-4 drive. (ID: 30101)

    In the system event log I get the following errors:

    event id 15

    The device, \Device\Scsi\adpu3201, is not ready for access yet.

    event id 11

    The driver detected a controller error on \Device\TapeDrive0.

    I have moved the adapter card to a different slot, updated the firmware of the components, installed the latest driver and so on, but none of this solved the problem. The HP diagnostic tool passed successfully without any of these errors appearing in the event log of Windows Storage Server. I have read the Microsoft TechCenter article on these errors, and because the diagnostic tools run successfully, I really think that the SCSI bus is configured well and terminated correctly.

    The version of DPM 2007 is the latest, I think: DPM 2007 version 2.0.8861.0

    Is there a known problem with DPM 2007 working over SCSI? Any other ideas to solve this problem?

     

    Monday, August 2, 2010 5:50 AM

All replies

  • Hello, DPM seems to be the victim of a hardware-related problem. The errors reported in the event messages seem to indicate that the device is dropping offline for a brief moment. Please check your termination and cabling; you may also need to experiment with the transfer speed setting in the BIOS of the SCSI controller. It has been my experience that OEM hardware diagnostic utilities rarely report problems, because they sometimes mask soft errors and have lots of retry logic. But if you run the diagnostic over and over while also doing heavy disk I/O, it may bubble the problem up and show a failure.
    Regards, Mike J [MSFT] This posting is provided "AS IS" with no warranties, and confers no rights.
    Monday, August 2, 2010 3:33 PM
    Moderator
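
The suggestion above, running the vendor diagnostic repeatedly while the disks are busy, needs some source of sustained disk load. A minimal Python sketch of one possible load generator follows. It is not a DPM or HP tool; the function name `disk_io_load`, its parameters, and the default sizes are assumptions made for illustration. Read-backs may be served from the operating system cache, so the write-and-sync phase is the part that mainly stresses the disk.

```python
import os
import tempfile
import time


def disk_io_load(directory, file_size_mb=64, duration_s=60.0, chunk_kb=1024):
    """Write and read back a scratch file until duration_s has elapsed.

    Hypothetical helper for illustration only. Always completes at least
    one write/read pass and returns the total number of bytes moved.
    """
    chunk = os.urandom(chunk_kb * 1024)
    chunks_per_file = max(1, (file_size_mb * 1024) // chunk_kb)
    deadline = time.monotonic() + duration_s
    total = 0

    # Scratch file on the volume that should be kept busy.
    fd, path = tempfile.mkstemp(dir=directory, suffix=".ioload")
    os.close(fd)
    try:
        while True:
            # Write pass: push the data to the device, not just the cache.
            with open(path, "wb") as f:
                for _ in range(chunks_per_file):
                    f.write(chunk)
                f.flush()
                os.fsync(f.fileno())
            total += chunks_per_file * len(chunk)

            # Read pass: read the file back chunk by chunk.
            with open(path, "rb") as f:
                while True:
                    data = f.read(len(chunk))
                    if not data:
                        break
                    total += len(data)

            if time.monotonic() >= deadline:
                break
    finally:
        os.remove(path)
    return total


# Example call (commented out so nothing runs on import):
# disk_io_load(r"D:\scratch", file_size_mb=256, duration_s=600)
```

The example path `D:\scratch` is a placeholder. Start the load in one window and the tape diagnostic in another, then watch the system event log for event ids 11 and 15.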
  • Hello Mike,

    Thanks a lot for answering. I have run the HP diagnostic tools many times over several days without any failure. I have also reduced the transfer speed in the BIOS of my SCSI controller, but nothing has solved the problem yet. Is it possible that the Microsoft service "RSM" (Removable Storage Manager) is causing this problem?

    http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?lang=en&cc=us&taskId=110&prodSeriesId=1143953&prodTypeId=12169&objectID=c00023443

    I'm working on this to test for some results.

     

     

    EDIT: I have deactivated the RSM; it doesn't solve the problem!

    Tuesday, August 3, 2010 5:21 AM
  • Hi,

    Does the tape backup run for a while before getting the I/O error?

    Try adding the following registry value:

    [HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft Data Protection Manager\Agent]
    "BufferQueueSize"=dword:00000002

    Without the registry value, the default buffer queue size is 10. Every time a buffer is written, the current queue size is compared to this value. When it is set to 1, MTFLIB_S_QUEUE_FULL is returned, which avoids putting the buffer in a queue that is flushed later.

    A setting of 1 flushes the data to tape for every buffer, which in turn keeps the tape header information updated more frequently.

     

    If that does not help, try a different SCSI controller.

     


    Regards, Mike J [MSFT] This posting is provided "AS IS" with no warranties, and confers no rights.
    Tuesday, August 3, 2010 3:10 PM
    Moderator
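
For anyone who prefers to import the registry value from a file rather than type it into the registry editor, here is a minimal Python sketch that renders the same key and value as `.reg` file text. The key path, the value name `BufferQueueSize`, and the value 2 are taken from the reply above; the helper name `render_reg_dword` and the constant `DPM_AGENT_KEY` are assumptions for illustration. The script only builds text and does not touch the registry.

```python
# Key path as quoted in the reply above.
DPM_AGENT_KEY = (
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft"
    r"\Microsoft Data Protection Manager\Agent"
)


def render_reg_dword(key_path, value_name, value):
    """Return .reg file text that sets one DWORD value under key_path.

    Hypothetical helper for illustration; it does not modify the registry.
    """
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("a DWORD must fit in 32 unsigned bits")
    return (
        "Windows Registry Editor Version 5.00\r\n"
        "\r\n"
        f"[{key_path}]\r\n"
        # .reg files store DWORDs as eight hexadecimal digits.
        f'"{value_name}"=dword:{value:08x}\r\n'
    )


reg_text = render_reg_dword(DPM_AGENT_KEY, "BufferQueueSize", 2)
print(reg_text)
```

Save the printed text as a `.reg` file and review it before importing. Deleting the value again returns DPM to the default queue size described above.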
  • Hi,

    Sometimes the backup runs successfully, but normally it fails. Sometimes it fails after transferring 14 GB, sometimes after transferring 33 GB or more.

    I will try the registry key method you described and see if it solves the problem.

    Wednesday, August 4, 2010 6:21 AM
  • Hello,

    Now I have a new storage loader with a new SCSI cable and a new SCSI terminator. I have connected the storage loader, and now DPM 2007 doesn't detect a library, only an HP standalone drive. However often I rescan for the library, it is not found. In the system both devices are installed and seem to run well.

    When the system boots, the SCSI adapter detects both the drive and the library, both with ID 0 and different LUNs (1 and 0).

    Any ideas?

    Monday, August 9, 2010 4:01 PM
  • Try running the DPMDriveMapping.exe utility.

    See http://technet.microsoft.com/en-us/library/bb795782.aspx for running DPMDriveMapping.exe from the <DPM Install>\Bin folder.


    Regards, Mike J [MSFT] This posting is provided "AS IS" with no warranties, and confers no rights.
    Monday, August 9, 2010 11:45 PM
    Moderator
  • Thanks a lot, now it works. I hope without any errors!
    Tuesday, August 10, 2010 8:05 AM
  • Hello again,

    After 8 tapes ran successfully while protecting one of three hard disks, I configured the protection group to protect all three disks. Now I again have the problem with the verification of the data. In the event viewer there are errors with ID 11 and ID 15:

    event id 15

    The device, \Device\Scsi\adpu3201, is not ready for access yet.

    event id 11

    The driver detected a controller error on \Device\TapeDrive0.
    DPM reports error ID 2001 for the computer state protection and for hard disk C:\
    Thursday, August 12, 2010 7:04 AM
  • Please work with Dell to resolve this problem; DPM is the victim of a hardware problem.
    Regards, Mike J [MSFT] This posting is provided "AS IS" with no warranties, and confers no rights.
    Thursday, August 12, 2010 6:49 PM
    Moderator
  • Hello again,

    I don't think that it is really a hardware problem, because I have now tested the backup without data verification and it works without these errors. I have also tested the SCSI connection with another backup tool and it works fine.

    DPM protects three hard disks and writes them to tape, but when verification starts, the tape is ejected by DPM and inserted again. Then the ID 11 and ID 15 errors appear.

    Thursday, August 26, 2010 7:54 AM
  • Does anyone have another idea? I will describe the whole situation again, because I don't think that it is a hardware problem. Tandberg itself doesn't know how to solve this problem and also doesn't think that it is a hardware problem.

    I have one protection group for a domain controller. The domain controller has three hard disks, and these hard disks and the computer state are protected by DPM 2007. When the backup job runs to tape without verification, all jobs complete without any errors. Only when I switch verification on do the errors appear, with ID 11 and ID 15 in the event viewer (system) and with error ID 2001 in DPM.

    It is like the problem posted here: http://www.eggheadcafe.com/software/aspnet/31509230/tape-verification-failure.aspx

    But there isn't any answer there...

    Thursday, September 9, 2010 8:20 AM
  • Hi,

    You mention that the errors only occur if you enable the verify option for the tape backups. What happens if you don't verify and then, after the backup completes, restore the whole tape to an alternate location? Does it restore without errors?

    The easiest way to restore the whole tape is to right-click the tape in the DPM library, select View Details, click the checkbox next to the data sets, then click Copy, choose a location to write the data to, and click OK. This should be pretty much equivalent to a tape verify, since we need to read each block on the tape to restore the data.


    Regards, Mike J [MSFT] This posting is provided "AS IS" with no warranties, and confers no rights.
    Thursday, September 9, 2010 11:06 PM
    Moderator
  • I have disabled verification and restored the whole tape without any error. But it isn't really optimal to make backups without verification, I think.
    Friday, September 10, 2010 6:01 AM
  • Actually, I should have asked you to leave verification on to see if the tape reports errors, and then try to restore that tape and see if it was successful. Can you retry the test? Also, can you tell me whether the errors are reported at the beginning, middle, or end of the verify, or does it vary, or can't you tell?
    Regards, Mike J [MSFT] This posting is provided "AS IS" with no warranties, and confers no rights.
    Friday, September 10, 2010 2:40 PM
    Moderator
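
One way to answer the beginning/middle/end question above is to note the start and end time of the verify job and compare them with the timestamps of the id 11 and id 15 events in the system log. The Python sketch below shows that comparison. The function name `classify_position`, the split into thirds, and the sample timestamps are all assumptions for illustration; the times are made up and are not taken from this thread.

```python
from datetime import datetime


def classify_position(job_start, job_end, event_time):
    """Place an event in the first, middle, or last third of a job window.

    Hypothetical helper for illustration. Returns "outside" when the
    event falls before the job started or after it ended.
    """
    if job_end <= job_start:
        raise ValueError("job_end must be later than job_start")
    if event_time < job_start or event_time > job_end:
        return "outside"
    # Dividing two timedeltas yields a float between 0.0 and 1.0.
    fraction = (event_time - job_start) / (job_end - job_start)
    if fraction < 1 / 3:
        return "beginning"
    if fraction < 2 / 3:
        return "middle"
    return "end"


# Made-up sample values: a three-hour verify job and three event times.
verify_start = datetime(2010, 9, 10, 20, 0)
verify_end = datetime(2010, 9, 10, 23, 0)
for stamp in (
    datetime(2010, 9, 10, 20, 15),
    datetime(2010, 9, 10, 21, 30),
    datetime(2010, 9, 10, 22, 50),
):
    print(stamp.isoformat(), classify_position(verify_start, verify_end, stamp))
```

If the events always land in the same third of the job, that points to a specific phase such as the tape reload; if they are scattered, the failure is more likely intermittent.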
  • I don't know if this is related or not, but we own a Tandberg T24 library and use it with DPM 2007. It works flawlessly...now. I spent quite some time going back and forth with Tandberg trying to get it working, and the final resolution ended up being to use the generic LTO.sys driver that comes with Windows instead of the manufacturer's driver. Once I switched, all of the many problems disappeared.
    Friday, September 17, 2010 5:36 PM
  • Oh, and also make sure that there are no other applications that use the Windows Removable Storage service. DPM needs this service stopped. In my case DPM was running on an HP storage server that came with a local backup application. The app kept enabling Removable Storage, and as a result I was getting all kinds of oddities, such as the drive and the library showing up as separate entities.

    Check in Services and make sure Removable Storage is set to Manual and stopped. Remove any other installed backup software.

    Friday, September 17, 2010 5:39 PM
  • Please add this registry entry and reboot, then try new backups with verification.

     

    http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?lang=en&cc=us&taskId=110&prodSeriesId=254838&prodTypeId=12169&prodSeriesId=254838&objectID=c01655150

    Let us know the results either way.


    Regards, Mike J [MSFT] This posting is provided "AS IS" with no warranties, and confers no rights.
    Tuesday, September 21, 2010 12:02 AM
    Moderator
  • Thanks for answering, but none of the suggestions has solved the problem. The backup server is a standalone server without any other local backup applications.
    Wednesday, September 29, 2010 8:33 AM
  • I have now solved the main problem by reinstalling the whole system three times. The problem was something to do with the RSM. By the way, I tried to use the settings you linked, but they didn't work. After reinstalling the whole system three times, it suddenly works, at least for the important functions.

    But verification still doesn't really work. There I still get these IDs 11 and 15. The whole problem is caused by ejecting the tape and reloading it into the drive.

    If you have any ideas to solve the verification problem, that would be great.
    Monday, November 15, 2010 8:13 AM