In-bag observations in rxDForest

  • Question

  • Hi,

    I use an R package called forestFloor to interpret feature effects in a random forest. This package takes randomForest objects as input. However, I would like to use it on random forests built with rxDForest.

    I tried converting rxDForest objects to randomForest objects using the as.randomForest() function. The conversion works fine; however, I found that rxDForest() does not save the information forestFloor needs: the in-bag observations.

    In randomForest() there is an option called keep.inbag, which records which observations are in-bag for each tree (including how often each one was drawn when sampling with replacement).
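
For reference, here is a minimal sketch of what keep.inbag produces in randomForest. The iris data, the seed, and ntree = 25 are illustrative choices only and have nothing to do with my real data.

```r
library(randomForest)

set.seed(1)  # arbitrary seed, only to make the sketch repeatable

# keep.inbag = TRUE asks randomForest to store the in-bag information
rf <- randomForest(Species ~ ., data = iris, ntree = 25, keep.inbag = TRUE)

# The in-bag matrix has one row per observation and one column per tree
dim(rf$inbag)

# A 0 in column j means that observation was out-of-bag for tree j
head(rf$inbag[, 1:5])
```

It is this rf$inbag component that I cannot find after converting an rxDForest object.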

    I am looking for a way to save this information when using rxDForest. Is this data already available in rxDForest objects? Or would it be easy to add this feature to rxDForest?

    Thanks for your help!



    Friday, October 12, 2018 12:09 PM


  • Because rxDForest is intended for use on extremely large datasets, it is designed to return only those components that do not grow with the size of the data. The in-bag matrix grows as the data grows, so it is not possible to include it in the rxDForest return object.
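
To give a feel for the growth, here is a back-of-the-envelope sketch. It assumes a dense matrix with one row per observation and one column per tree, stored as 4-byte R integers, which is how randomForest lays it out; the helper name inbag_bytes and the row counts are hypothetical, and this says nothing about rxDForest internals.

```r
# Approximate bytes needed for a dense n_obs-by-n_tree integer matrix
# (R integers take 4 bytes each; object overhead is ignored)
inbag_bytes <- function(n_obs, n_tree) n_obs * n_tree * 4

inbag_bytes(1e4, 500) / 2^20  # in MiB: about 19 for 10,000 rows
inbag_bytes(1e9, 500) / 2^40  # in TiB: about 1.8 for one billion rows
```

With 500 trees the matrix is a few megabytes for a small data set but runs to terabytes at a billion rows, far more than the forest itself.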


    Rich Calaway

    Machine Learning Server Team

    • Marked as answer by Thomas1873 Friday, October 19, 2018 8:56 AM
    Friday, October 12, 2018 4:42 PM