DPM agent with multiple IPs via IPSec tunnel fails after attaching to DPM server

  • Question

  • The machine running our DPM agent has more than one IP address. One IP is associated with an IPSec tunnel and the other is not. We're using split DNS to get everything to communicate over the VPN. Ping, file sharing, and the DPM attach all worked without issue, but after a period of time the agent becomes unavailable in the DPM server console.

    Traffic monitoring indicates the DPM server is trying to communicate via RPC with the IP that is not associated with the IPSec tunnel. It uses both IPs but appears to prefer the one that is NOT associated with the VPN tunnel. The DPM server shouldn't know about this IP, since split DNS makes sure it only ever resolves the IPSec tunnel IP, so I'm assuming the address is coming back from the DPM agent inside the RPC traffic?

    Is there a way to tell the DPM agent to use a specific IP? 

    The Add-BackupNetworkAddress cmdlet works for the DPM server side, but there doesn't seem to be an equivalent for the agent side. I tried searching Google but didn't have any luck.

    Is there a way, via the DPM agent or the registry, to force the DPM agent to use one specific IP out of all the IP addresses on the agent machine?
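One way to confirm what split DNS is actually handing out is to resolve the agent's name from the DPM server and compare the result with the tunnel address. The following is a minimal Python sketch of that check. The helper names `resolve_ipv4` and `unexpected_addresses` are illustrative, and the host name and addresses in the usage comment are placeholders, not values from this thread. It only reports what the local resolver returns; it cannot show addresses that the agent advertises inside RPC/DCOM traffic.

```python
import socket


def resolve_ipv4(hostname):
    """Return the sorted, de-duplicated IPv4 addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def unexpected_addresses(resolved, allowed):
    """Return the addresses in `resolved` that are not in the `allowed` list."""
    allowed_set = set(allowed)
    return [addr for addr in resolved if addr not in allowed_set]


# Hypothetical usage on the DPM server (name and IP are placeholders):
#   addrs = resolve_ipv4("agent01.example.local")
#   print(unexpected_addresses(addrs, ["10.0.0.5"]))
# An empty list would suggest DNS itself is not leaking the non-tunnel IP.
```

If the list comes back empty but the DPM server still contacts the other IP, that would point away from name resolution and toward something the agent sends back.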

     


    Rob
    • Moved by Larry Yin Wednesday, March 28, 2012 11:11 PM (From:DPM 2012 Beta - Locked)
    Sunday, November 20, 2011 3:17 PM

Answers

  • I finally found the logs on the agent. There were errors in the log related to the VSS writers on the machine. We found an underlying issue with the VSS writers, which we were able to fix. Once that was done, the DPM backup across the VPN tunnel worked fine.

    We're still having issues with another machine across the tunnel but believe it is almost entirely due to latency on the link.
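For anyone checking VSS writer health on a protected machine, the output of `vssadmin list writers` can be scanned for writers that are not stable. The sketch below is an assumption-laden illustration in Python: the helper names are made up for this example, and the sample text in the comment is invented, not taken from the logs in this thread (the thread does not say which writers were failing).

```python
def parse_vss_writers(text):
    """Parse `vssadmin list writers` output into a list of dicts (name, state, last_error)."""
    writers = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Writer name:"):
            name = line.split(":", 1)[1].strip().strip("'")
            current = {"name": name, "state": None, "last_error": None}
            writers.append(current)
        elif current is not None and line.startswith("State:"):
            current["state"] = line.split(":", 1)[1].strip()
        elif current is not None and line.startswith("Last error:"):
            current["last_error"] = line.split(":", 1)[1].strip()
    return writers


def unhealthy_writers(writers):
    """Return writers whose state is not Stable or whose last error is not 'No error'."""
    return [
        w for w in writers
        if w["state"] is None
        or "Stable" not in w["state"]
        or w["last_error"] != "No error"
    ]


# Illustrative input (invented for this example):
#   Writer name: 'System Writer'
#      State: [1] Stable
#      Last error: No error
#   Writer name: 'Example Writer'
#      State: [8] Failed
#      Last error: Retryable error
```

On the agent, the text would typically come from running `vssadmin list writers` in an elevated command prompt and passing its output to `parse_vss_writers`.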

     


    Rob
    • Marked as answer by ip-rob Wednesday, November 23, 2011 4:18 PM
    Wednesday, November 23, 2011 4:18 PM

All replies

  • Could this be the issue?

    "A common problem in applications using address translation is a source address located in the data payload of a packet. This occurs because the DCOM servers embed the source address deep in the packet in a non-specific location so it is not possible to translate the address. Because clients respond to this source address, they attempt to respond directly to nodes, rather than to virtual servers (VIPs)."

    Is it possible to force the DPM agent DCOM application to use a specific IP on the NIC?


    Rob
    Sunday, November 20, 2011 5:40 PM
  • I don't think the issue is related to DCOM. Although DCOM may attempt to use all the IPs, it seems to work fine as long as one of the local IPs is routable on the interface.

    I verified this by setting up a separate tunnel to another network with the same configuration.

    The issue with the first network appears to be a high latency problem.  The remote server is on a satellite link and latency is typically 700ms to 1200ms.  I think this is creating problems with the traffic.

    Although I can attach the agent fine, it seems to fail when I refresh the agent status and DPM verifies connectivity to the remote computer.

    Anyone have advice on RPC with high latency?
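To put numbers on the link before tuning anything, it can help to time TCP connections to the agent rather than rely on ping alone. The Python sketch below times the TCP handshake to a given port; TCP 135 is the RPC endpoint mapper port, so it is a natural first target. The function names and the host in the usage comment are hypothetical, and this measures connection setup only, not how DPM's RPC calls behave under that latency.

```python
import socket
import time


def tcp_connect_ms(host, port, timeout=5.0):
    """Return the time in milliseconds to complete a TCP handshake, or None on failure."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - start) * 1000.0
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return None


def summarize(samples):
    """Summarize a list of timings (None marks a failed attempt)."""
    ok = [s for s in samples if s is not None]
    if not ok:
        return {"ok": 0, "failed": len(samples),
                "min_ms": None, "max_ms": None, "avg_ms": None}
    return {"ok": len(ok), "failed": len(samples) - len(ok),
            "min_ms": min(ok), "max_ms": max(ok), "avg_ms": sum(ok) / len(ok)}


# Hypothetical usage from the DPM server (host name is a placeholder):
#   samples = [tcp_connect_ms("agent01.example.local", 135) for _ in range(10)]
#   print(summarize(samples))
```

On a satellite link with 700ms to 1200ms round trips, the handshake times should land in roughly that range; a high count of failed attempts would suggest drops rather than latency alone.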


    Rob
    Sunday, November 20, 2011 8:45 PM
  • I'm still having some issues running DPM against machines on a remote network. I can get the agents attached to DPM, and all traffic appears to be flowing properly based on the firewall logs. Besides the issue above, I'm also having problems adding an attached agent to an existing protection group.

    I only get the hourglass when I go to add the machine and click on the "+" to expand the available data sources.

    RPC traffic is constantly going between the machines but the status never changes.  Is there a log somewhere that I can use to diagnose potential issues?


    Rob
    Monday, November 21, 2011 2:20 AM
  • The DPM error log shows the following in the scenario where I get the hourglass trying to add a machine to the protection group:

     

    17D0 16E8 11/21 14:14:11.607 02 EventManager.cs(95) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL Publishing event from TaskInstance.cs(406): TaskStart, [TaskID=a49c919a-4f3a-4237-aca9-b9a89e31f7fc]

    17D0 16E8 11/21 14:14:11.609 01 TaskInstance.cs(442) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL Task: Initial transition enqueued, startState='Init State'

    17D0 16E8 11/21 14:14:11.614 68 BranchedVolumeInquiry.cs(67) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL BranchedVolumeInquiry::TriggerVolumeInquiry:: CMTE triggered Volume Inquiry on server [MYSERVERNAME]

    17D0 2A60 11/21 14:14:11.619 68 RunInquiryCommonTypes.cs(1353) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL Adding inquiry workitem a49c919a-4f3a-4237-aca9-b9a89e31f7fc of server MYSERVERNAME to RunningAcInquiries HashTable

    17D0 16E8 11/21 14:14:15.206 68 RunInquiry.cs(467) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL Agent Inquiry started on server MYSERVERNAME

    17D0 16E8 11/21 14:14:16.674 68 ProcessDelegate.cs(103) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL ==>UpdateIntentWithInquiryResults

    17D0 16E8 11/21 14:14:16.674 68 InquiryManager.cs(56) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL ==>UpdateDataSourceFromInquiryResult

    17D0 16E8 11/21 14:14:16.695 17 InquiryManager.cs(148) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL IntentManager::UpdateDataSourceFromInquiryResult:: Updating File System Datasources completed

    17D0 16E8 11/21 14:14:16.695 17 InquiryManager.cs(156) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL IntentManager::UpdateDataSourceFromInquiryResult:: Updating Application Datasources completed

    17D0 16E8 11/21 14:14:16.696 68 InquiryManager.cs(56) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL <--UpdateDataSourceFromInquiryResult

    17D0 16E8 11/21 14:14:16.696 68 ProcessDelegate.cs(103) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL <--UpdateIntentWithInquiryResults

    17D0 16E8 11/21 14:14:16.701 68 InquiryContext.cs(415) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL Adding recoverable error 38, to recoverable errors list

    17D0 16E8 11/21 14:14:16.701 68 InquiryContext.cs(415) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL Adding recoverable error 38, to recoverable errors list

    17D0 16E8 11/21 14:14:16.701 68 InquiryContext.cs(415) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL Adding recoverable error 38, to recoverable errors list

    17D0 16E8 11/21 14:14:16.701 68 InquiryContext.cs(415) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL Adding recoverable error 38, to recoverable errors list

    17D0 16E8 11/21 14:14:16.702 68 InquiryContext.cs(415) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL Adding recoverable error 38, to recoverable errors list

    17D0 16E8 11/21 14:14:16.702 68 InquiryContext.cs(415) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL Adding recoverable error 38, to recoverable errors list

    17D0 16E8 11/21 14:14:16.702 68 InquiryContext.cs(415) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL Adding recoverable error 38, to recoverable errors list

    17D0 16E8 11/21 14:14:16.702 68 InquiryContext.cs(415) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL Adding recoverable error 38, to recoverable errors list

    17D0 16E8 11/21 14:14:16.702 68 InquiryContext.cs(415) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL Adding recoverable error 38, to recoverable errors list

    17D0 16E8 11/21 14:14:16.702 68 InquiryContext.cs(415) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL Adding recoverable error 38, to recoverable errors list

    17D0 16E8 11/21 14:14:16.702 68 InquiryContext.cs(415) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL Adding recoverable error 38, to recoverable errors list

    17D0 16E8 11/21 14:14:16.702 68 InquiryContext.cs(415) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL Adding recoverable error 38, to recoverable errors list

    17D0 16E8 11/21 14:14:16.702 68 InquiryContext.cs(415) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL Adding recoverable error 38, to recoverable errors list

    17D0 16E8 11/21 14:14:16.702 68 InquiryContext.cs(415) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL Adding recoverable error 38, to recoverable errors list

    17D0 16E8 11/21 14:14:16.702 68 InquiryContext.cs(415) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL Adding recoverable error 38, to recoverable errors list

    17D0 16E8 11/21 14:14:16.702 68 InquiryContext.cs(415) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL Adding recoverable error 38, to recoverable errors list

    17D0 16E8 11/21 14:14:16.702 68 InquiryContext.cs(415) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL Adding recoverable error 38, to recoverable errors list

    17D0 16E8 11/21 14:14:16.702 68 InquiryContext.cs(415) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL Adding recoverable error 38, to recoverable errors list

    17D0 16E8 11/21 14:14:16.702 68 InquiryContext.cs(415) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL Adding recoverable error 38, to recoverable errors list

    17D0 16E8 11/21 14:14:17.341 68 RunInquiry.cs(492) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL Agent Inquiry Finished on server MYSERVERNAME

    17D0 16E8 11/21 14:14:17.341 68 RunInquiryCommonTypes.cs(1360) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL Deleting inquiry workitem a49c919a-4f3a-4237-aca9-b9a89e31f7fc from RunningAcInquiries HashTable

    17D0 16E8 11/21 14:14:17.342 68 BranchedVolumeInquiry.cs(136) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL BranchedVolumeInquiry::VolumeInquirySuccess:: CMTE finished Volume Inquiry on server [MYSERVERNAME]

    17D0 16E8 11/21 14:14:17.343 68 ShareInquiryBlock.cs(67) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL TriggerShareInquiry:: CMTE triggered Share Inquiry on server [MYSERVERNAME]

    17D0 16E8 11/21 14:14:17.349 68 RunInquiryCommonTypes.cs(1353) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL Adding inquiry workitem a49c919a-4f3a-4237-aca9-b9a89e31f7fc of server MYSERVERNAME to RunningAcInquiries HashTable

    17D0 2410 11/21 14:14:18.495 68 RunInquiry.cs(467) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL Agent Inquiry started on server MYSERVERNAME

    17D0 2410 11/21 14:14:19.673 68 ProcessDelegate.cs(269) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL Enterring ProcessDelegate::ProcessShares. 

    17D0 2410 11/21 14:14:19.673 68 ProcessDelegate.cs(269) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL            Current server context = StartNodeServer = MYSERVERNAME | ActiveNode = MYSERVERNAME | Cluster =  | PhysicalNodeCount = 0

    17D0 2A60 11/21 14:14:20.339 68 RunInquiry.cs(492) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL Agent Inquiry Finished on server MYSERVERNAME

    17D0 2A60 11/21 14:14:20.339 68 RunInquiryCommonTypes.cs(1360) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL Deleting inquiry workitem a49c919a-4f3a-4237-aca9-b9a89e31f7fc from RunningAcInquiries HashTable

    17D0 2410 11/21 14:14:20.340 68 ShareInquiryBlock.cs(132) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL ShareInquirySuccess:: CMTE finished Share Inquiry on server [MYSERVERNAME]

    17D0 1068 11/21 14:14:20.341 68 ApplicationInquiryBlock.cs(83) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL ApplicationInquiry::StartApplicationInquiry::Entering CMTE triggered Application Inquiry on server [MYSERVERNAME]

    17D0 1C18 11/21 14:14:20.347 68 RunInquiryCommonTypes.cs(1353) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL Adding inquiry workitem a49c919a-4f3a-4237-aca9-b9a89e31f7fc of server MYSERVERNAME to RunningAcInquiries HashTable

    17D0 16E8 11/21 14:14:21.496 68 RunInquiry.cs(467) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL Agent Inquiry started on server MYSERVERNAME

    17D0 12F0 11/21 14:14:57.293 16 jobmanagerservices.cpp(401) A49C919A-4F3A-4237-ACA9-B9A89E31F7FC NORMAL CJobManagerServices::CancelJob(). jobID={C28A857E-3ED0-4D00-9A5C-C339830EDB51}

    17D0 12F0 11/21 14:14:57.293 16 JobManager.cs(483) NORMAL Cancel job:c28a857e-3ed0-4d00-9a5c-c339830edb51

    17D0 12F0 11/21 14:14:57.293 16 ActiveJob.cs(424) NORMAL Cancelling job:'c28a857e-3ed0-4d00-9a5c-c339830edb51' due to:'UserRequest'

    17D0 12F0 11/21 14:14:57.293 16 ActiveJob.cs(1081) NORMAL Changing state of job: 'c28a857e-3ed0-4d00-9a5c-c339830edb51' from 'Execute' -> 'Cancel'
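In the excerpt above, "recoverable error 38" is logged 19 times, and the last "Agent Inquiry started" entry (14:14:21.496) is followed by nothing until the job is cancelled at 14:14:57.293 due to 'UserRequest', a gap of roughly 36 seconds. A small Python sketch like the one below can surface that kind of pattern in a longer log. The field names are guesses from the layout (the first two hex columns look like process and thread IDs), the log carries no year so one has to be supplied, and the helper names are made up for this example.

```python
import re
from collections import Counter
from datetime import datetime

LINE_RE = re.compile(
    r"^\s*(?P<pid>[0-9A-F]+)\s+(?P<tid>[0-9A-F]+)\s+"
    r"(?P<date>\d{2}/\d{2})\s+(?P<time>\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<code>\d+)\s+(?P<source>\S+\(\d+\))\s+"
    # The task GUID is absent on some lines, so it is optional.
    r"(?:(?P<task>[0-9A-Fa-f]{8}(?:-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12})\s+)?"
    r"(?P<level>[A-Z]+)\s+(?P<msg>.*)$"
)


def parse_line(line, year):
    """Parse one log line into a dict, or return None if it does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    entry = m.groupdict()
    stamp = f"{year}/{entry['date']} {entry['time']}"
    entry["when"] = datetime.strptime(stamp, "%Y/%m/%d %H:%M:%S.%f")
    return entry


def recoverable_error_counts(entries):
    """Count 'recoverable error N' messages by error number."""
    counts = Counter()
    for e in entries:
        m = re.search(r"recoverable error (\d+)", e["msg"])
        if m:
            counts[int(m.group(1))] += 1
    return counts


def largest_gap(entries):
    """Return (seconds, before, after) for the longest pause between consecutive entries."""
    best = (0.0, None, None)
    for before, after in zip(entries, entries[1:]):
        gap = (after["when"] - before["when"]).total_seconds()
        if gap > best[0]:
            best = (gap, before, after)
    return best
```

Feeding the log file through `parse_line` (skipping lines that return `None`) and then calling `largest_gap` shows where the server stopped logging progress, which is a reasonable place to start looking at the agent-side logs.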


    Rob
    Monday, November 21, 2011 2:22 PM