Automl stuck on "Run status: preparing" until timeout

  • Question

  • I'm trying to get the automl-forecasting-energydemand sample working, but it gets stuck on "Run status: Preparing". I also tried auto-ml-regression and the same thing happens: the run stays stuck for a few hours and then fails with a timeout error.

    I created another workspace, but the problem is the same there. I also tried increasing the timeout from 0.3 to 2 hours and decreasing the horizon from 48 to 6, but it didn't help. I talked to a support chat agent, who suggested I ask here. There is an error message:

    "ParentRunId: AutoML_9d33e9b4-9caa-48ed-a85e-e1e7336c3ead; ParentRunUuid: 48287786-51da-4840-880f-e58807ce3a2c; RunStatus: Failed; ErrorCode: UserError/ResourceExhausted/Timeout/ExperimentTimeout; FailureReason: usererror "

    Friday, May 22, 2020 12:12 AM
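The error message quoted in the question is a flat string of `Key: value` pairs separated by semicolons. As a minimal sketch (the helper name `parse_run_error` is hypothetical and not part of any Azure SDK), it can be split into fields to make the failure easier to read. The `ErrorCode` value appears to be a path that runs from general to specific, so its last segment, `ExperimentTimeout`, is the most precise description: the experiment reached its configured timeout.

```python
def parse_run_error(message: str) -> dict:
    """Split a 'Key: value; Key: value' failure message into a dict."""
    fields = {}
    for part in message.strip().strip('"').split(";"):
        key, sep, value = part.partition(":")
        if sep:  # ignore fragments that contain no 'Key: value' pair
            fields[key.strip()] = value.strip()
    return fields


# The message exactly as quoted in the question.
message = (
    "ParentRunId: AutoML_9d33e9b4-9caa-48ed-a85e-e1e7336c3ead; "
    "ParentRunUuid: 48287786-51da-4840-880f-e58807ce3a2c; "
    "RunStatus: Failed; "
    "ErrorCode: UserError/ResourceExhausted/Timeout/ExperimentTimeout; "
    "FailureReason: usererror "
)

error = parse_run_error(message)
error_path = error["ErrorCode"].split("/")  # general -> specific
print(error["RunStatus"], error_path[-1])
```

The `ParentRunId` field extracted this way is the run ID that the first reply asks to be shared through the feedback option.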

All replies

  • Hello,

    Apologies for the degraded experience you are facing while running AutoML experiments. 

    Are you using the following sample from GitHub to run this experiment? Is the cluster using the default number of nodes, as in the example?

    It also looks like you have tried to run your experiment in a new workspace and saw a similar error. Is it possible to increase the number of nodes and check whether the experiment succeeds?

    If a particular run still fails, you can share the run ID securely through the Azure ML portal (ml.azure.com) using the Send feedback option (the smiley in the top right-hand corner), so that our service team can look up the run and suggest how to fix it. Please select the option "Microsoft can email you about your feedback" so that we can contact you directly about the issue.


    Friday, May 22, 2020 7:01 AM
  • Hi! Sorry for the delay, it was the weekend. I think the code is the same as on GitHub. I actually took it from the Jupyter AzureML Sample menu (image below). I have a free account for testing and "Virtual machine size: STANDARD_NC6", as used in the example. I think this is the maximum number of nodes I'm allowed. Step 4 says:

    "Found existing cluster, use it.

    AmlCompute wait for completion finished

    Minimum number of nodes requested have been provisioned"

    I just ran it again with "max_horizon = 4" and "experiment_timeout_hours=2", and it's preparing. I will send the ID using feedback as you suggest.

    Ah, I'm not allowed to attach images. Anyway, the examples I'm trying to use (auto-ml-regression and auto-ml-forecasting-energy-demand) are found at:

    • Edited by mletonsa Monday, May 25, 2020 9:13 AM
    Monday, May 25, 2020 9:13 AM
  • Can someone check whether those examples (auto-ml-forecasting-energy-demand and auto-ml-regression) work with STANDARD_NC6? And how long should they take? That is, is it just my account, or is something wrong with the implementation?

    (It has now been stuck for 3 hours in "Run status: Preparing" with "max_horizon = 4" and "experiment_timeout_hours=2".)

    • Edited by mletonsa Monday, May 25, 2020 12:07 PM
    Monday, May 25, 2020 12:06 PM
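For a run that sits in "Preparing" for hours, as described above, one option is to stop waiting after a fixed budget instead of watching it until the experiment timeout. The following is only a sketch under stated assumptions: `wait_until_not_preparing` is a hypothetical helper, and `get_status` is any zero-argument callable that returns the current status string.

```python
import time
from typing import Callable


def wait_until_not_preparing(
    get_status: Callable[[], str],
    max_wait_seconds: float,
    poll_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until it stops returning 'Preparing' or time runs out.

    Returns the last status seen; 'Preparing' means the wait budget was spent.
    """
    deadline = time.monotonic() + max_wait_seconds
    status = get_status()
    while status == "Preparing" and time.monotonic() < deadline:
        sleep(poll_seconds)
        status = get_status()
    return status


# Stand-in for a real run: reports 'Preparing' twice, then 'Running'.
fake_statuses = iter(["Preparing", "Preparing", "Running"])
result = wait_until_not_preparing(
    lambda: next(fake_statuses),
    max_wait_seconds=60,
    sleep=lambda seconds: None,  # skip real sleeping in this demonstration
)
print(result)
```

With the Azure ML Python SDK, the callable would presumably be the submitted run's `get_status` method, and a run that is still "Preparing" when the helper returns could then be cancelled. That wiring is an assumption here and is not taken from the sample notebooks.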
  • And somehow it has apparently used up all of my $200 test credit, just by trying, unsuccessfully, to run a couple of sample experiments?

    "Thanks for exploring Azure with your $200 credit. Because you’ve used up the credit, your account and services have been disabled."

    Meters show:

    LRS Write Operations - Files: 0.93 / 1 10K (unlikely to exceed)
    Disk Operations - Standard HDD Managed Disks: 45.9 / 200 10K (unlikely to exceed)
    Read Operations - Files: 0.9 / 4 10K (unlikely to exceed)
    Protocol Operations - Files: 0.81 / 4 10K (unlikely to exceed)
    Hot LRS Write Operations - Tiered Block Blob: 0.06 / 1 10K (unlikely to exceed)
    Hot Read Operations - Tiered Block Blob: 0.06 / 2 10K (unlikely to exceed)


    • Edited by mletonsa Tuesday, May 26, 2020 10:25 AM
    Tuesday, May 26, 2020 10:15 AM