How efficient is it to have the Stored Proc on the machine learning server query a remote server. RRS feed

  • Question

  • I am trying to figure out how to put all the R Code for pulling\processing the data in a stored procedure on my machine learning server without using a linked server. All the tutorials I have gone through have assumed you have the data on the machine learning server. 

    I am trying to avoid;

    • The performance hit of a linked server query not getting a good execution plan
    • Staging the data on the ML server. 



    Wednesday, March 21, 2018 11:09 AM

All replies

  • Phil,

    If you're willing to turn off the default firewall rule (see post for details) then your SQL Server instance that's running Machine Learning Services should be able to make standard queries from the R process to other SQL Server databases using ODBC. You'll need to connect to the other SQL Server databases using SQL authentication (username/passoword). Windows authentication will not work because the R process is running as one of the local R users (e.g. MSSQLSERVER01).

    The code would need to look something like:

    sqlConnString <- "Driver=SQL Server;Server=OtherServer; Database=MyDatabase;Uid=user_name;Pwd=password"
    sqlDataSource <- RxSqlServerData(connectionString = sqlConnString,
       query = "SELECT * FROM myTable",
       rowsPerRead = sqlRowsPerRead)
    myDataSet <- rxImport(sqlDataSource) # you could also load this into a .xdf file if it's very large
    I've found that performance for pulling data from SQL on the same server is generally comparable to pulling data from a different SQL Server instance (when using ODBC to move the data). In my setup where all of the machines are located in the same data center, performance is bottlenecked on single-CPU speed, not network performance.

    Good luck!


    • Proposed as answer by nielsbMVP Thursday, March 22, 2018 3:42 AM
    • Marked as answer by Developer_46038 Thursday, March 22, 2018 8:30 PM
    • Unmarked as answer by Developer_46038 Tuesday, May 8, 2018 3:04 PM
    Wednesday, March 21, 2018 1:36 PM
  • How can I run the code if stored procedure calls are not supported?

    Create SQL Server data sources

    The following functions let you define a  SQL Server data source. A data source object is a container that specifies a connection string together with the set of data that you want, defined either as a table, view, or query. Stored procedure calls are not supported.

    Tuesday, May 8, 2018 3:04 PM
  • How do you get the OtherServer to be targeted when this is used within R-Services(in-database)?

    I'm trying to deploy some code(as a stored procedure, using sp_execute_external_script) that pulls data from a remote SQL server (w\o staging). My script looks very similar to yours, but no matter what I've tried, it always tries to access the local server not the remote server. It acts as if its overriding the "server" parameter I specify. 
    Two days of banging my head trying to figure out a way around this. 

    Any thoughts?

    Thursday, January 23, 2020 8:10 PM
  • You can use the R package 'mrsdeploy' to establish remote server connections and deploy a web service.

    You can find some information here:

    This should do what you want.

    Thursday, January 30, 2020 12:28 AM