Conflicts Between High- and Low-Level Parallelization

  • Question

  • I am very interested in using the Microsoft R Open (MRO) distribution, but I am concerned that it could conflict with the way I have been parallelizing my code.  I usually create nodes using parallel::makePSOCKcluster(), then parallelize tasks at a high level using foreach() from the package of the same name.  By "high level" I mean things like running a different cross-validation trial on each node, where each node fits a model on the training sample and evaluates the fit on the left-out sample, so a big chunk of code is farmed out to each worker.
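
    For reference, here is a minimal sketch of the kind of setup I mean. The simulated data, the lm() model, the fold count and the two-worker cluster are placeholders for illustration only, and I am assuming doParallel as the foreach backend; my real code is larger but has the same shape.

```r
library(parallel)
library(foreach)
library(doParallel)  # assumed backend for foreach

# Placeholder data: a simple linear relationship with noise.
set.seed(1)
n <- 200
dat <- data.frame(x = rnorm(n))
dat$y <- 2 * dat$x + rnorm(n)

# Assign every row to one of k cross-validation folds.
k <- 4
fold <- sample(rep(seq_len(k), length.out = n))

# Create the PSOCK workers and register them with foreach.
cl <- makePSOCKcluster(2)
registerDoParallel(cl)

# Each task fits on the training folds and scores the left-out fold.
cv_mse <- foreach(i = seq_len(k), .combine = c) %dopar% {
  train <- dat[fold != i, ]
  test  <- dat[fold == i, ]
  fit   <- lm(y ~ x, data = train)
  mean((test$y - predict(fit, newdata = test))^2)
}

stopCluster(cl)
```

    Each element of cv_mse is the out-of-sample mean squared error for one fold, computed entirely on one worker.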

    My concern is that if I am already parallelizing at this higher level and the MRO functions also try to parallelize tasks at a low level from within each running worker session, the two levels will compete for the same cores and actually increase overall compute time.

    Should I be concerned about this, or is there some mechanism that would keep this from happening?  Should I somehow "manually" disable the MRO parallelization in the workers?
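
    To make the last question concrete, this is roughly what I imagine "manually" disabling would look like. It is only a sketch: I am assuming the workers can load MRO's RevoUtilsMath package and that its setMKLthreads() function is the right way to cap the math library's threads, and I have not confirmed that this is the recommended approach. On plain R, where that package is absent, the code does nothing.

```r
library(parallel)

cl <- makePSOCKcluster(2)

# On each worker, cap the math library at one thread, but only if
# RevoUtilsMath is installed there (assumption: MRO with MKL).
limited <- clusterEvalQ(cl, {
  if (requireNamespace("RevoUtilsMath", quietly = TRUE)) {
    RevoUtilsMath::setMKLthreads(1)
    TRUE
  } else {
    FALSE  # plain R: no MKL thread setting to change
  }
})

stopCluster(cl)
```

    The list in limited has one entry per worker and records whether the thread cap was applied there.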



    Friday, June 24, 2016 6:13 PM