Slow Online Recovery

  • Question

  • Hello all,

    I'm testing the online recovery performance of DPM (or, more precisely, the MARS agent). I tested two scenarios:

    1) Transfer from the recovery vault to an Azure VM in the same region (Standard A4_v2, with a standard disk as the target)

    2) Transfer from the recovery vault to an on-prem Hyper-V VM (1 Gbps link, to a volume capable of sustaining 250 MB/sec)

    The job has two stages:

    a) online recovery

    b) restore from staging area (staging folder and target folder are on the same disk).

    In both tests there appears to be some kind of bottleneck in the first stage of recovery. It runs at 10-15 MB/sec at most.

    It's not a CPU or RAM bottleneck.

    It's not a network bottleneck: the Azure VM should have 1 Gbps (scenario 1), and the on-prem VM has a 1 Gbps internet link (scenario 2).

    It's not a disk bottleneck: the Azure VM can do 60 MB/sec on a single drive, and the numbers stay the same when I create a Storage Space out of 4 disks (scenario 1); the on-prem storage can do 250 MB/sec, which is actually visible during the second stage of the process (scenario 2).

    When I checked disk activity during the first stage, cbengine.exe performed read operations in the MARS Scratch folder (using a single thread!) and write operations in the Staging folder (using multiple threads). In my sample the read/write ratio is 80/20. During the process I see very low IOPS on the target disk, around 1.5. Do I understand correctly that the first stage of recovery runs 80% of its operations on a single thread? That would explain why I'm not using the full capabilities of the disk: without any outstanding IO, my transfer rate is effectively limited by disk latency.
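
    To illustrate the latency argument, here is a rough back-of-the-envelope sketch. It only applies the general relation throughput = outstanding IOs x IO size / per-IO latency. The 64 KiB IO size and 5 ms latency are hypothetical values chosen for illustration, not measurements from my tests, and the function name is made up for this example.

```python
def estimated_throughput_mb_s(io_size_bytes: int,
                              latency_s: float,
                              outstanding_ios: int = 1) -> float:
    """Estimate sustained throughput in MB/s (1 MB = 1,000,000 bytes).

    Each outstanding IO completes once per `latency_s`, so the bytes moved
    per second are outstanding_ios * io_size_bytes / latency_s.
    """
    if io_size_bytes <= 0 or latency_s <= 0 or outstanding_ios < 1:
        raise ValueError("io_size_bytes and latency_s must be positive, "
                         "outstanding_ios must be at least 1")
    return outstanding_ios * io_size_bytes / latency_s / 1_000_000


# Hypothetical values (assumptions, not measured): 64 KiB IOs, 5 ms per IO.
IO_SIZE = 64 * 1024
LATENCY = 0.005

# One thread issuing one IO at a time versus eight IOs kept in flight.
single = estimated_throughput_mb_s(IO_SIZE, LATENCY, outstanding_ios=1)
parallel = estimated_throughput_mb_s(IO_SIZE, LATENCY, outstanding_ios=8)

print(f"1 outstanding IO : {single:.1f} MB/s")
print(f"8 outstanding IOs: {parallel:.1f} MB/s")
```

    Under these assumed numbers, a single reader with one IO in flight would land in the same 10-15 MB/sec range I observe, regardless of how fast the disk is under parallel load.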

    Thank you in advance for your support.

    Wednesday, October 10, 2018 9:25 AM

All replies