locked
Querying web applications RRS feed

  • Question

  • Power Query could be a great tool to query the web and especially web applications.

    Many web applications used form-based authentication and often also require to navigate of a specify state before being able to reach data you want to query. 

    To reach that, Power Query when querying the web would need to provide a way to let the user visually record that path to reach their data.

    A specify behavior could be implemented to handle userid and password and also multi-pages navigation when your tabular data is displayed using several pages (previous/next...)



    Olivier Hault (Level IT - www.level-it.be)


    • Edited by ohault Thursday, August 22, 2013 10:36 AM
    Thursday, August 22, 2013 10:32 AM

Answers

All replies

  • Thanks Olivier for your suggestion. We will keep it in mind for future improvements and feature additions on this area.

    Regards,
    M.

    Wednesday, September 4, 2013 1:43 AM
  • Anyone found a way to post forms data with a PowerQuery so data behind a login screen can be accessed?

    This is a huge limitation of PowerQuery.

    Saturday, January 18, 2014 10:11 AM
  • Your best strategy will be to create your own server-side proxy for whatever you need to access which can't be done with HTTP headers ( which are supported by PowerQuery now).

    What is currently missing is some federated identity goodness. How about supporting Microsoft's own Windows Identity Framework?

    Thursday, January 23, 2014 3:01 AM
  • This sounds like a promising solution. Could you provide more details please? I can get around this problem by logging into the desired website using the Excel browser provided by the web query builder under the Data ribbon. Then subsequent VBA webtable queries to this website are treated as if Excel is logged into the site. I cannot find a way to log into a site in a similar way that allows for a web powerquery to be created.
    Sunday, January 26, 2014 8:45 PM
  • Certainly using a web query is an alternative. You could then create a table from the results and use PowerPivot to perform actions on the data. What I was suggesting is more involved as you have to create your own proxy web service.
    Sunday, January 26, 2014 9:25 PM
  • I need to use the functionality of PowerQuery.  Can you explain how I would configure a proxy server to login to the site?


    Can you recommend a proxy server to use?


    • Edited by Euge67 Monday, January 27, 2014 7:34 AM
    Monday, January 27, 2014 7:34 AM
  • What is the web site you need to query?  If they have an API, I can investigate creating a proxy which would work with Power Query.
    Tuesday, January 28, 2014 12:56 AM
  • The website I need to query is a standard login webpage with a HTTPS form that has one input box for username and one input box for password and the form needs to be sent using the POST in order to gain access.
    Saturday, February 1, 2014 8:34 AM
  • If it's possible to perform a single POST with the username and password and get the desired data as a response, then this can be done with Power Query directly -- though it's not for the faint of heart :).

    You essentially have to build the body of the HTTP Post manually (or hard code it). There aren't any helpers in M to do things like URL-encoding, either. And the password will be right there in plain text in your M code. The result ends up looking something like this:

    let
        Body = "username=me&password=secret"
        Source = Web.Contents("http://some.url/", [Content=Text.ToBinary(Body), Headers=[#"Content-Type"="application/x-www-form-urlencoded"]]))
    in
        Source

    The easiest way to figure out the "Body" part is to run a tool like Fiddler to watch your HTTP/HTTPS traffic, and then log into the website manually. You can then look at the data being sent with the POST, and possibly just paste it directly into the "Body" string.

    If additional clicks are required after logging in, then Power Query does not support this today.

    Saturday, February 1, 2014 2:17 PM