Power Query - Bug in OData paging

  • Question

  • Hi,

    I am currently running Power Query 1.5.3296.2082 in Excel 2013 x64

    I have an OData service with an advanced paging mechanism. The service uses server-driven paging with a non-fixed page size. The page size is limited by a combination of an ever-increasing page size factor AND time, so the actual number of rows returned is not constant from page to page. The OData service also supports client paging, as used by Power Query.

    Power Query is trying to be clever and bases its client page size on the number of items actually returned from a request. Power Query creates an increasing number of threads doing client-paged parallel requests. If any of the client-paged requests returns fewer items than the current client page size (the OData $top parameter), Power Query adjusts its page size down for the next set of requests. In these scenarios the server has restricted the number of returned items because of the time limit in its server-driven paging strategy, and the payload contains a nextLink with a new $skiptoken AND a modified $top parameter for getting the rest of the requested client page. Power Query does NOT use the new $skiptoken; it continues with the old one and adjusts the $top and $skip parameters instead.
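    To illustrate what following the continuation would look like, here is a minimal sketch in Python. It is a toy model, not the real service or Power Query's implementation: `serve`, `fetch_page`, and the `cap` values are hypothetical names, and a plain (skip, top) pair stands in for the nextLink with its $skiptoken.

```python
def serve(rows, skip, top, cap):
    """Toy server: return up to `top` rows starting at `skip`, never more than `cap`.

    If the cap cuts the page short, also return a continuation (skip, top)
    describing the rest of the requested page. This stands in for the nextLink.
    """
    wanted = rows[skip:skip + top]
    page = wanted[:cap]
    remaining = len(wanted) - len(page)
    continuation = (skip + len(page), remaining) if remaining else None
    return page, continuation


def fetch_page(rows, skip, top, caps):
    """Toy client: follow the continuation until the requested page is complete."""
    received = []
    request = (skip, top)
    while request is not None:
        # Each response may be cut short by a different, server-chosen cap.
        page, request = serve(rows, *request, cap=next(caps))
        received.extend(page)
    return received


rows = list(range(20000))
# The first response is cut short at 4305 rows, as in the captured trace.
caps = iter([4305, 5000])
page = fetch_page(rows, skip=5378, top=5378, caps=caps)

# Every requested row arrives exactly once, in order.
assert page == rows[5378:5378 + 5378]
```

    Under these assumptions the client never has to recompute an offset itself, so a short response cannot cause rows to be requested twice.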

    The bug is in the calculation of the new $skip parameter. It does not skip far enough, resulting in some entities/rows being requested twice. It seems that the amount it is off by equals the new $top returned in the nextLink.

    Here are some examples captured in Fiddler:

    1st request: Files?$select=ID&$skiptoken='2745400',102400,null,null,null&%24skip=5378&%24top=5378

    It returns only 4305 rows, so the response contains a new nextLink:

    Files?$select=ID&$top=1073&$skiptoken='2755223',5378,null,null,null

    The $top in this nextLink shows that 1073 entries are still missing from the full request (5378 - 4305 = 1073); this is server-driven paging kicking in. The nextLink is NOT used by Power Query, as the thread doing the following request sends this:
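    As a small sketch of how the paging parameters can be read out of these captured URLs, the following Python uses only the standard library. The helper name `paging_params` is made up for illustration; the URLs are copied from the trace above.

```python
from urllib.parse import parse_qs, urlsplit


def paging_params(link):
    """Extract $skiptoken, $skip and $top from a (relative) OData URL."""
    # parse_qs also decodes percent-encoded keys, so %24skip becomes $skip.
    params = parse_qs(urlsplit(link).query)

    def first(name):
        values = params.get(name)
        return values[0] if values else None

    def as_int(value):
        return int(value) if value is not None else None

    return {
        "skiptoken": first("$skiptoken"),
        "skip": as_int(first("$skip")),
        "top": as_int(first("$top")),
    }


first_request = "Files?$select=ID&$skiptoken='2745400',102400,null,null,null&%24skip=5378&%24top=5378"
next_link = "Files?$select=ID&$top=1073&$skiptoken='2755223',5378,null,null,null"

requested = paging_params(first_request)
continuation = paging_params(next_link)

# The nextLink carries a new $skiptoken and asks only for the missing rows.
assert continuation["skiptoken"] != requested["skiptoken"]
assert continuation["top"] == requested["top"] - 4305
```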

    2nd request: Files?$select=ID&$skiptoken='2745400',102400,null,null,null&$skip=8610&$top=4305

    Here you can see that Power Query uses the old $skiptoken and a newly calculated $skip value. The value 8610 is wrong: Power Query has already received everything up to offset 5378 + 5378 - 1073 = 9683, so it should have specified $skip=9683. The value is off by 1073.
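    The arithmetic can be checked with a few lines of Python. The numbers are taken from the two captured requests; the function name `expected_next_skip` is only illustrative.

```python
# Values taken from the two captured requests above.
first_skip = 5378      # $skip of the 1st request
first_top = 5378       # $top of the 1st request
returned = 4305        # rows the server actually returned
remaining_top = 1073   # $top in the nextLink (rows still owed)
observed_skip = 8610   # $skip Power Query sent in the 2nd request


def expected_next_skip(skip, top, remaining):
    """Offset of the first row not yet received: skip + top - remaining."""
    return skip + top - remaining


expected = expected_next_skip(first_skip, first_top, remaining_top)
overlap = expected - observed_skip

assert first_top - returned == remaining_top  # 5378 - 4305 = 1073
assert expected == 9683
assert overlap == 1073                        # rows requested a second time
```

    Note that 8610 is exactly 2 × 4305. That would be consistent with the $skip being recomputed as a multiple of the reduced page size rather than from the rows actually received, but that is only a guess from this one trace.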

    Could you please have a look at this?

    Another question: Could we have an option to disable client-side parallel paging? It does not necessarily perform better than a single thread relying purely on server-driven paging.

    Regards

    Uffe

    I have just upgraded to Power Query version 2.8.3443.101 and I can confirm the bug is in that version as well.




    • Edited by Uffe Lauesen Tuesday, October 15, 2013 7:48 PM Upgraded to latest PQ version
    Tuesday, October 15, 2013 6:58 PM

Answers

  • Hi,

    Thanks for reporting the issue. We are investigating a fix for this issue, and it should be available in a future release of Power Query.

    Thanks


    Zafar Abbas

    Friday, October 18, 2013 11:40 PM

All replies

  • You can disable client-side parallel paging in your call to OData.Feed:

    OData.Feed("http://some/odata/url", null, false)

    Setting the third parameter to false should force serial execution.

    Wednesday, October 23, 2013 10:37 PM