DPM 2010 - Questions on Long Term Data Retention

  • Question

  • Good people in the DPM community-

    We are currently testing DPM 2010 in our production environment and nearly ready to go live with it.  I'm a little unsure about a few concepts with DPM still and what better place to come to chat about those concepts than the DPM community, where I've learned a vast amount about DPM already.  For that, I want to say thank you.

     

    DPM 2010 Server Configuration:

    Windows Server 2008 R2 Enterprise SP1 64-bit (12 GB RAM, 4 x 2.26 GHz processors; the OS and DPM reside on a 100 GB disk [C:])

    2TB of disk space has been added to the Storage Pool

    1.5TB of disk space has been allocated to the server to hold our virtual tape backups.

    DPM 2010 has been installed and patched up to date

    Cristalink's Firestreamer has been installed to function as our virtual tape library

     

    DPM 2010 Short/Long Term Goals:

    Backup all Executive Desktop/Laptop PCs by allocating 100GB of space per PC.  (We have about 20 PCs that will need to be backed up)

    Short term protection - backup to disk for 30 days

    Long term protection - backup to tape (virtual) for 6 months

     

    As it stands right now, the DPM server is functional and is successfully backing up/restoring data.  The questions that I have are as follows:

    1)  I created a protection group with a 30 day retention period for short term protection to disk and a 6 month retention period for long term protection to tape (virtual).  Whenever I manually initiate a recovery point creation to "Long term tape protection", it seems like the DPM Server will communicate directly with the agent on the client PC to perform the backup.  Why doesn't DPM use the last short term recovery point to disk as the source of the long term backup job that I just initiated?  In other words, why does the DPM server need to communicate with the agent on the client PC in order to perform the tape backup?  (that's another backup job over the network/usage of resources on the client PC, in addition to the normal backup to disk jobs)

    2)  Assuming my protection group is set up for 30 days of short term protection to disk and 6 months of long term protection to tape (VTL), does DPM only use the tapes for data that is older than 30 days?  What happens to data that has aged past 30 days: does it get backed up to tape for the 6 month retention period I set?  What happens to the data once it reaches 181 days old: is it truncated to make room for new backups?

    3)  Seeing as my short term and long term backups are ultimately going to "disk", would it be a more efficient use of my available storage space/DPM server to only use "long term" protection?  (If the goal is to keep the data for 6 months, I'm not sure where the benefit is in keeping one data set for 30 days and then another for 6 months.)

    4)  I would like to prevent client PCs from synchronizing/creating recovery points with the DPM server while my users are out of the office & working over a VPN connection, for obvious reasons.  While I have not seen that happen with my laptop, I'd like to curb this issue before it even comes up.  I've looked at the "throttle computer.." settings and while that's helpful to a degree, there doesn't seem to be a way to configure the server or client to only communicate when the client PC is physically connected to the LAN.  Is there a way to prevent client PCs from synchronizing over a VPN connection?

    5)  When I allocate 100GB of space to each PC in the PG, more than 100GB is allocated: 100GB for the replica volume, and what seems to be a random number of GB for the recovery point volume.  For example, my laptop has 100GB allocated for the replica volume and another 66.91GB allocated for the recovery point volume.  It seems as though my laptop is using 166.91GB of disk space on the DPM Server, when I only want to allocate 100GB total.  As you can imagine, this throws off my numbers when planning for how much disk space I'm going to need.  Am I missing something here?

    Any insight or help here would be greatly appreciated folks.

    Many thanks and TGIF!!

    Cheers,

    Jim P.

    Friday, September 16, 2011 6:45 PM

Answers

  • Hi,

    I will address each of your questions.

    Q1)  I created a protection group with a 30 day retention period for short term protection to disk and a 6 month retention period for long term protection to tape (virtual).  Whenever I manually initiate a recovery point creation to "Long term tape protection", it seems like the DPM Server will communicate directly with the agent on the client PC to perform the backup.  Why doesn't DPM use the last short term recovery point to disk as the source of the long term backup job that I just initiated?  In other words, why does the DPM server need to communicate with the agent on the client PC in order to perform the tape backup?  (that's another backup job over the network/usage of resources on the client PC, in addition to the normal backup to disk jobs)

    A1)  The documentation you found is accurate.  For D2D2T (disk-to-disk-to-tape) protection, ALL long term tape backups are created by mounting the latest disk-based recovery point and copying it to tape.  The tape backup jobs appear to show that we're backing up the client, which we are, but they give no indication of where the data is being read from.  For D2D2T backups, that is always the storage pool disk.

    Q2)  Assuming my protection group is set up for 30 days of short term protection to disk and 6 months of long term protection to tape (VTL), does DPM only use the tapes for data that is older than 30 days?  What happens to data that has aged past 30 days: does it get backed up to tape for the 6 month retention period I set?  What happens to the data once it reaches 181 days old: is it truncated to make room for new backups?

    A2)  Long term retention is the total time a recovery point is maintained on tape, starting with the 1st backup.  So for 30 days you will have duplicate recovery points for the same data, one on disk and one on tape.  After 30 days, the 1st recovery point is deleted from disk but is maintained on tape, so you now have only one recovery point for day 1, and that's on tape.  After 180 days, the 1st recovery point on tape expires, so you no longer have any backup of that data source for that day.  Eventually all the recovery points on that tape will expire, and at that point DPM can reuse the tape.  Use the tape management report to see when tapes are DUE back so they can be reused.  There is up to a 4 week window, so you can plan on bringing tapes back into rotation.
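    The retention timeline in A2 can be sketched in a few lines of Python.  This is only an illustration, not DPM code: the function name is made up, "6 months" is taken as 180 days, and it assumes a tape backup was actually made for the day in question (which depends on the tape backup frequency).

```python
from typing import List

SHORT_TERM_DAYS = 30    # disk retention from the protection group in the question
LONG_TERM_DAYS = 180    # tape retention; "6 months" taken as 180 days (assumption)


def copies_available(age_days: int) -> List[str]:
    """Return where a recovery point of the given age can still be found.

    Hypothetical helper: assumes both a disk and a tape recovery point
    were created on day 0, and that each expires once its retention
    range has passed.
    """
    locations = []
    if age_days <= SHORT_TERM_DAYS:
        locations.append("disk")
    if age_days <= LONG_TERM_DAYS:
        locations.append("tape")
    return locations


if __name__ == "__main__":
    for age in (10, 31, 181):
        print(age, copies_available(age))
```

    Under these assumptions, a 10 day old recovery point exists on both disk and tape, a 31 day old one exists on tape only, and a 181 day old one is gone from both.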

    Q3)  Seeing as my short term and long term backups are ultimately going to "disk", would it be a more efficient use of my available storage space/DPM server to only use "long term" protection?  (If the goal is to keep the data for 6 months, I'm not sure where the benefit is in keeping one data set for 30 days and then another for 6 months.)

    A3)  Short term protection on disk is more efficient, since it holds "incremental" block-level changes for 30 days.  Long term tape backups are always FULL backups, so you will use more disk space.  I don't know the frequency of your tape backups: are they daily or weekly for 6 months?
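    To make the difference in A3 concrete, here is a rough back-of-the-envelope comparison in Python.  The function names, the 2% daily change rate, and the weekly tape schedule are all assumptions for illustration; the thread does not give the real change rate or tape frequency, and the model ignores any compression on the virtual tapes.

```python
def disk_short_term_gb(data_gb, daily_change_pct, retention_days):
    """Rough disk use: one replica plus block-level changes kept per day.

    Simplified model (assumption): every day adds daily_change_pct percent
    of the protected data as changed blocks.
    """
    return data_gb + data_gb * daily_change_pct / 100 * retention_days


def tape_long_term_gb(data_gb, backups_retained):
    """Rough virtual tape use: every retained tape backup is a FULL copy."""
    return data_gb * backups_retained


if __name__ == "__main__":
    data_gb = 100  # per-PC figure from the question
    print("disk, 30 days:", disk_short_term_gb(data_gb, 2, 30), "GB")
    print("tape, weekly for 6 months:", tape_long_term_gb(data_gb, 26), "GB")
    print("tape, daily for 6 months:", tape_long_term_gb(data_gb, 180), "GB")
```

    With these made-up inputs, 30 days on disk costs about 160 GB per PC, while 26 weekly full backups cost 2,600 GB on the virtual tape disk, which is why the tape frequency matters so much for the 1.5TB you have set aside.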

    Q4)  I would like to prevent client PCs from synchronizing/creating recovery points with the DPM server while my users are out of the office & working over a VPN connection, for obvious reasons.  While I have not seen that happen with my laptop, I'd like to curb this issue before it even comes up.  I've looked at the "throttle computer.." settings and while that's helpful to a degree, there doesn't seem to be a way to configure the server or client to only communicate when the client PC is physically connected to the LAN.  Is there a way to prevent client PCs from synchronizing over a VPN connection?

    A4)  We don't have any provision for detecting which network a client is connected to and preventing synchronizations.  The best you can do is throttle bandwidth after normal working hours.

    Q5)  When I allocate 100GB of space to each PC in the PG, more than 100GB is allocated: 100GB for the replica volume, and what seems to be a random number of GB for the recovery point volume.  For example, my laptop has 100GB allocated for the replica volume and another 66.91GB allocated for the recovery point volume.  It seems as though my laptop is using 166.91GB of disk space on the DPM Server, when I only want to allocate 100GB total.  As you can imagine, this throws off my numbers when planning for how much disk space I'm going to need.  Am I missing something here?

    A5)  DPM is asking how much "data" you intend to protect on each laptop; this is used to calculate the replica volume size.  The recovery point volume size is calculated by DPM automatically, based on replica size and retention period.  The allocated space per PC is NOT a quota, so you can specify a smaller number; most likely some PCs will use less space and some will use more, and it will all average out.  By default we co-locate 10 PCs per replica, so specify a smaller number per PC and overall you can get the total disk utilization down to the 100GB you want.  The downside is that you need to start protection over (stop protection for all PCs and delete the replicas, then reprotect using the smaller number).
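    For planning purposes, the numbers in Q5 can be plugged into a small Python sketch.  The figures (20 PCs, 100GB replica, 66.91GB recovery point volume) come from the question; the function names are made up, and the idea that the recovery point volume scales in proportion to the replica size is only an assumption, since DPM calculates it from replica size and retention period.  Co-location is ignored here.

```python
PC_COUNT = 20             # from the question
REPLICA_GB = 100.0        # replica volume observed for one laptop
RECOVERY_POINT_GB = 66.91  # recovery point volume observed for the same laptop


def total_allocation_gb(pc_count, replica_gb, recovery_point_gb):
    """Total storage pool space if every PC gets the same two volumes."""
    return pc_count * (replica_gb + recovery_point_gb)


def replica_size_for_budget(budget_gb, rp_ratio):
    """Replica size that keeps replica + recovery point within budget_gb.

    ASSUMPTION: the recovery point volume stays proportional to the
    replica size (rp_ratio = recovery point GB / replica GB).
    """
    return budget_gb / (1 + rp_ratio)


if __name__ == "__main__":
    total = total_allocation_gb(PC_COUNT, REPLICA_GB, RECOVERY_POINT_GB)
    print(f"Current settings need about {total:.1f} GB for {PC_COUNT} PCs")

    ratio = RECOVERY_POINT_GB / REPLICA_GB
    smaller = replica_size_for_budget(100.0, ratio)
    print(f"Replica size for a 100 GB total per PC: about {smaller:.1f} GB")
```

    With the observed figures, 20 PCs would need roughly 3,338 GB, well over a 2TB storage pool, and a data size of roughly 60GB per PC would bring the combined allocation near 100GB if the proportionality assumption holds.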

    Hope these help you and welcome to the DPM forum.


    Regards, Mike J. [MSFT] This posting is provided "AS IS" with no warranties, and confers no rights.
    Thursday, October 6, 2011 7:02 PM
    Moderator

All replies

  • 'Afternoon folks.  I've noticed lots of views, but no actual replies - doh!

     

    I've been continuing to research how the DPM product functions and within the DPM Operations Guide, I came across the following statements, with regards to the Long Term Backups to Tape:

    Page 229:

    "When DPM backs up data from the replica to tape, there is no network traffic and therefore no performance impact on the protected computer."

    "When DPM backs up data from the protected computer directly to tape, there will be an impact on the disk resources and performance on the protected computer.  The impact on performance is less when backing up file data than when backing up application data."

     

    This in effect addresses my 1st question above.  However, when I look in the Monitoring tab and browse the "completed jobs", I have never seen the DPM Server back up the replicas of protected PCs.  In fact, the only "tape backups" that seem to be occurring are happening by the DPM Server communicating directly with the protected PC.  This presents a few concerns..  performance issue while the user is using their PC, and failed jobs due to laptops being out of the office. 

    To build on my 1st question, how can I ensure that my DPM Server is using the last replica as the long term backup-to-tape source, instead of communicating directly with the protected computer?

     

    Any insight would be most helpful.  Any thoughts?

     

    Thanks,

     

    Jim P.



    Tuesday, September 27, 2011 5:03 PM
  • Good Afternoon Folks,

     

    I've been hoping for at least 1 reply from the DPM community on my questions, but alas, no replies.  I'm relatively new to DPM and I've been tasked with getting this product up and running for my company.

    Would anyone be able to shed some light on ANY of my questions?!

     

    Please feel free to let me know if my questions are unclear or if I should potentially be looking at this from a different angle.

     

    Any help would be much appreciated folks!

     

    Thank you.

    Jim P.

    Thursday, October 6, 2011 5:35 PM