Why does Web.Contents make multiple requests?

  • Question

  • I'm accessing this page

    https://community-fund.aviva.co.uk/voting/project/view/21/

    and, using Fiddler, it appears that the mashup engine is getting the page and then making calls to other referenced pages (including Google Maps), which is seriously impacting the performance of the query.

    Why is it following links/content?


    twitter - @simon_sabin blog - http://www.sqlblogcasts.com/blogs/simons SQLBits - Largest SQL Server Conference in Europe and its free

    Monday, May 4, 2015 8:12 PM

All replies

  • Because you're using Web.Page. Early on, we found that this function wasn't very useful if all it did was scrape the static HTML at the supplied site. For better or worse, a lot of sites today load their data dynamically with AJAX-like techniques. The only way for us to extract data from those sites is to load them into a browser so that the embedded JavaScript can run.

    I can assure you that I wish there were a better way... :(.

    Monday, May 4, 2015 9:07 PM
  • Wow, shame. It would be great not to have that.

    Is there no workaround to just get the HTML content?


    Tuesday, May 5, 2015 12:23 PM
  • One way you may be able to do this is by placing a Text.FromBinary call between Web.Page and Web.Contents:

    = Web.Page(Text.FromBinary(Web.Contents("http://www.bing.com")))

    In this case, Web.Page should still run the JavaScript, but as if it were a local page, and should not allow AJAX requests to remote services.
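
    Written out step by step against the URL from the question, the suggestion above might look like the sketch below. The step names (Url, RawBytes, Html, Parsed) are illustrative, and the UTF-8 encoding is an assumption about the page. Whether Web.Page really suppresses the remote requests in this form is the behaviour described above, not something this sketch verifies.

```powerquery
let
    // URL taken from the original question
    Url = "https://community-fund.aviva.co.uk/voting/project/view/21/",
    // Download only the raw response body as binary
    RawBytes = Web.Contents(Url),
    // Decode the bytes to text; UTF-8 is an assumption about the page encoding
    Html = Text.FromBinary(RawBytes, TextEncoding.Utf8),
    // Hand the already-downloaded markup to Web.Page as text rather than as a web response
    Parsed = Web.Page(Html)
in
    Parsed
```

    If only the raw HTML text is needed, ending the query at the Html step avoids Web.Page altogether.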

    Let us know how that works for you.

    Tuesday, May 5, 2015 8:17 PM
    Moderator