Trying to download multiple web pages with same URL format

  • Question

  • Hi,

    I am trying to scrape multiple pages from the same website. Is this possible, and how do I go about it?

    I am trying to download these pages:

    http://www.oddsportal.com/matches/soccer/20171020/

    http://www.oddsportal.com/matches/soccer/20171021/

    http://www.oddsportal.com/matches/soccer/20171022/

    etc.

    Thanks!

    Monday, October 23, 2017 9:34 PM

Answers

  • Hi Miguel,

    I actually managed to get the data I needed off of oddsportal.com using Power Query. But I'm just not sure how to get the data from each webpage at the same time.

    So it is possible to get the data I need, I just need to work out how to get multiple web pages.

    Thanks

    Joe

    Nice! Would you mind posting your code? If the table structure is the same across all of those pages, then all you need to change is the first step, which should be called Source, so that it points to the other webpage; the query will then return the data from that page. You can create as many queries as you want and append them if needed.

    If you're trying to do some sort of pagination, then the video posted by Chaskel might help.
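
    As a rough sketch of that idea (not Joe's actual query, which wasn't posted), the Source step can be wrapped in a function that takes the date, so one query covers every page instead of one query per page. The function name GetDay, the MatchDate column, and the assumption that the table you want is the first one returned by Web.Page are all made up for illustration; adjust the {0} index to whichever table holds the odds on your side.

    ```powerquery
    let
        // Hypothetical helper: build the URL for one day and return one table from that page.
        GetDay = (day as date) as table =>
            let
                Url = "http://www.oddsportal.com/matches/soccer/" & Date.ToText(day, "yyyyMMdd") & "/",
                Source = Web.Page(Web.Contents(Url)),
                // Assumption: the wanted table is the first one found on the page.
                Data = Source{0}[Data]
            in
                // Tag each row with its date so the days stay distinguishable after appending.
                Table.AddColumn(Data, "MatchDate", each day, type date),

        // Three consecutive days starting 20 October 2017 (the dates from the question).
        Days = List.Dates(#date(2017, 10, 20), 3, #duration(1, 0, 0, 0)),
        Pages = List.Transform(Days, GetDay),
        // Same effect as appending one query per page by hand.
        Combined = Table.Combine(Pages)
    in
        Combined
    ```

    Changing the start date or the count passed to List.Dates widens the range. This only works if Power Query can actually see the table on each page, which may not be the case for data loaded by JavaScript.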

    Wednesday, October 25, 2017 5:05 PM

All replies

  • Hey Joe,

    What data from that website are you trying to get? Could you please explain a bit more what you're trying to accomplish?

    Tuesday, October 24, 2017 3:49 AM
  • Hi Miguel,

    I am trying to get the soccer odds data for each day.

    Then I can collate all this data and have each day in the same spreadsheet.

    Thanks

    J

    Tuesday, October 24, 2017 9:15 AM
  • Hey Joe,

    It appears that the site you're trying to get the info from uses a combination of Ajax and JavaScript: it first renders the HTML part as a bare canvas and then loads the data with the Ajax+JS combo. Power Query doesn't execute this Ajax+JS combo, probably as a by-design security feature, so you're unable to see the data that you want inside Power Query. You only see the part that rendered first, which is just a few tables and the rest of the HTML.

    There are a few workarounds to make this work:

    1. Analyze the behavior of the table created by the JS code and try to find the direct link to the data that the JavaScript is fetching. I tried this, but I couldn't make it work.
    2. Save the webpage as an HTML document and then read the final HTML document locally. Since you're saving the HTML document after the Ajax+JS have kicked in, you'll be getting an HTML document with all the data that you need.
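
    Workaround 2 could look something like this in M, assuming the page has already been saved from the browser to a local file. The file path is hypothetical, and the {0} index assumes the table you want is the first one Web.Page finds in the saved file.

    ```powerquery
    let
        // Hypothetical path to a page saved from the browser after it finished loading.
        Source = Web.Page(File.Contents("C:\Temp\oddsportal-20171020.html")),
        // Assumption: the wanted table is the first one in the saved document.
        Data = Source{0}[Data]
    in
        Data
    ```

    Expanding the Source step in the query editor shows every table found in the file, which helps when picking the right index.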

    Hope this helps.

    Tuesday, October 24, 2017 3:03 PM
  • Hi Miguel,

    I actually managed to get the data I needed off of oddsportal.com using Power Query. But I'm just not sure how to get the data from each webpage at the same time.

    So it is possible to get the data I need, I just need to work out how to get multiple web pages.

    Thanks

    Joe

    Wednesday, October 25, 2017 12:12 PM
  • As I'm new to Power Query, I'm not sure if this is helpful, but while trying to solve a question I currently have open here on the forum, I came across this video, which may help you: https://www.youtube.com/watch?v=vhr4w5G8bRA&feature=youtu.be

    In my case I'm not sure how to make it work, but hopefully it may assist your specific query needs.

    Wednesday, October 25, 2017 4:36 PM