Suggestion - support pandas dataframes with datetime columns and better error handling

  • General discussion

  • Hi,

    Currently, when I return a pandas dataframe with a datetime column from a function published in ML Server v9.2.1, the call fails with the error shown after the function:

        def service_fn():
            import numpy as np
            import pandas as pd
            now = np.datetime64('now')
            return pd.DataFrame({'test': [now]})
    

    {"Link":"https://go.microsoft.com/fwlink/?linkid=830136","Message":"Value cannot be null.\r\nParameter name: Value cannot be null.\r\nParameter name: outputParameters","ExceptionType":"ArgumentNullException"}

    This message gives no clue about the actual problem. Under the hood, the stdout file contains the proper error message:

    {"console_output": "", "error_message": "Error Encoding Value Timestamp('2018-02-25 16:11:33') is not JSON serializable", "success": false}

    1. I believe the "Value cannot be null" exception is raised in the Microsoft.MLServer.ComputeNode.ShellManagement.ExecutionResult constructor:

        public ExecutionResult(bool success, string errorMessage, Dictionary<string, object> outputParameters, string consoleOutput = null, string[] changedFiles = null)
        {
          if (outputParameters == null)
            throw new ArgumentNullException(nameof (outputParameters));
          this.Success = success;
          this.ErrorMessage = errorMessage;
          this.OutputParameters = outputParameters;
          this.ConsoleOutput = consoleOutput;
          this.ChangedFiles = changedFiles;
        }

    I think it would be better to raise a custom exception carrying the text of errorMessage, when it is present, instead of a generic ArgumentNullException. Unless, of course, this was done on purpose so as not to expose the Python error message, which could reveal sensitive information.

    2. The NumpyAndPandasEncoder() JSON encoder does not support datetime values or pandas dataframes with datetime columns. If encoding were done with the df.to_json() method, the datetime values would be encoded in ISO or epoch format. It would be great if the encoder recognized datetime columns and encoded them, for example, as ISO strings.
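
    To illustrate suggestion 2, here is a minimal sketch of an encoder that turns datetime values into ISO strings. The class name IsoDatetimeEncoder and the column-oriented output layout are assumptions made for this example only; this is not the actual NumpyAndPandasEncoder and does not reflect how ML Server lays out its output.

```python
import datetime
import json

import numpy as np
import pandas as pd


class IsoDatetimeEncoder(json.JSONEncoder):
    """Hypothetical encoder for illustration; not the ML Server encoder."""

    def default(self, o):
        if isinstance(o, pd.DataFrame):
            # Column-oriented dict of plain lists. Datetime columns yield
            # pandas Timestamp objects, which json passes back to default().
            return {str(col): o[col].tolist() for col in o.columns}
        if isinstance(o, np.datetime64):
            return pd.Timestamp(o).isoformat()
        if isinstance(o, (datetime.datetime, datetime.date)):
            # pandas.Timestamp subclasses datetime.datetime, so it lands here.
            return o.isoformat()
        return super().default(o)


df = pd.DataFrame({'test': [np.datetime64('2018-02-25T16:11:33')]})
encoded = json.dumps(df, cls=IsoDatetimeEncoder)
print(encoded)
```

    Missing values (NaT, NaN) are not handled in this sketch and would need their own rule.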

    Kind regards,

    Oleh Khoma


    Sunday, February 25, 2018 4:49 PM

All replies

  • Hi Oleh,

    Your snippet of Python code (service_fn()) works for me in both ML Server 9.2.1 and our latest builds. Therefore, I think there must be a problem with your install of ML Server, or the problem is specific to your machine.

    Stephen Weller

    ML Server Team

    Wednesday, February 28, 2018 1:12 AM
  • Hi Stephen,

    Here are two tests that fail for me on 9.3.0:

    import sys
    import json
    import numpy as np
    import pandas as pd
    from azureml.deploy import DeployClient
    from azureml.deploy.server import MLServer
    
    
    def test_service_with_timestamp_column():
        def run():
            import numpy as np
            import pandas as pd
            now = np.datetime64('now')
            return pd.DataFrame({'test': [now]})
    
        mls = DeployClient('http://localhost:12800', use=MLServer, auth=('admin', '<password>'))
        service = mls.deploy_service('test_service', version='1.0', code_fn=run, outputs={'result': pd.DataFrame})
        try:
            df = service.run()
            print(df)
        finally:
            mls.delete_service('test_service', version='1.0')
    
    
    def test_dataframe_with_timestamp_json_encoding():
        df = pd.DataFrame({'test': [np.datetime64('now')]})
        print(df.to_json())
    
        prev_sys_path = list(sys.path)  # copy: insert() below would otherwise mutate the saved list too
        sys.path.insert(0, 'c:/Program Files/Microsoft/ML Server/R_SERVER/o16n/Microsoft.MLServer.ComputeNode/Python')
        try:
            from setup import NumpyAndPandasEncoder
            result = json.dumps(df, cls=NumpyAndPandasEncoder)
            print(result)
        finally:
            sys.path = prev_sys_path
    
    
    
    

    The first test creates a service that returns a simple data frame with a timestamp column. It results in the original error that I described in the first post.

    The second test shows where I believe the error occurs.
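
    The same class of failure can be reproduced without ML Server. The sketch below is only an illustration: it uses the stock json module rather than the ML Server encoder, and the helper name try_dumps is made up for this example. It shows that json.dumps rejects a pandas Timestamp, while DataFrame.to_json accepts it.

```python
import json

import pandas as pd

ts = pd.Timestamp('2018-02-25 16:11:33')


def try_dumps(value):
    """Return the JSON text, or a description of the TypeError raised."""
    try:
        return json.dumps(value)
    except TypeError as exc:
        return 'TypeError: {}'.format(exc)


# The stock encoder has no rule for Timestamp, so this reports a TypeError.
stock_result = try_dumps({'test': [ts]})
print(stock_result)

# pandas' own serializer handles the datetime column.
frame = pd.DataFrame({'test': [ts]})
pandas_result = frame.to_json(date_format='iso')
print(pandas_result)
```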

    Both tests fail on a fresh DSVM with manually reinstalled ML Server 9.3.0. 

    Please let me know if I am doing anything wrong.

    Kind regards,

    Oleh Khoma

    Monday, April 30, 2018 9:31 AM