none
Power Query inconsistent table detection (ads) RRS feed

  • Question

  • I have a power query function which grabs Table 0 from a webpage (Yahoo Fantasy football). The pages only display 25 rows at a time, so I have to repeat the function appending the "Count" portion of the URL by 25. Here is the code for my function:

    (count as number) as table =>
    let
        Source = Web.Page(Web.Contents("https://football.fantasysports.yahoo.com/f1/303401/players?status=ALL&pos=O&cut_type=9&stat1=S_S_2016&myteam=0&sort=PTS&sdir=1&count=" & Number.ToText(count))),
        Data0 = Source{0}[Data],
        #"Changed Type" = Table.TransformColumnTypes(Data0,{{"", type text}, {"Offense", type text}, {"2", type text}, {"Owner", type text}, {"GP*", Int64.Type}, {"Bye", Int64.Type}, {"Fantasy Fan Pts", type number}, {"Fantasy % Owned", type text}, {"Rankings Proj", Int64.Type}, {"Rankings Actual", Int64.Type}, {"Passing Yds", Int64.Type}, {"Passing TD", Int64.Type}, {"Passing Int", Int64.Type}, {"Rushing Att*", Int64.Type}, {"Rushing Yds", Int64.Type}, {"Rushing TD", Int64.Type}, {"Receiving Tgt*", Int64.Type}, {"Receiving Rec*", Int64.Type}, {"Receiving Yds", Int64.Type}, {"Receiving TD", Int64.Type}, {"Ret TD", Int64.Type}, {"Misc 2PT", Int64.Type}, {"Fum Lost", Int64.Type}, {"3", type text}}),
        #"Trimmed Text" = Table.TransformColumns(#"Changed Type",{{"Offense", Text.Trim}}),
        #"Cleaned Text" = Table.TransformColumns(#"Trimmed Text",{{"Offense", Text.Clean}}),
        #"Split Column by Delimiter" = Table.SplitColumn(#"Cleaned Text","Offense",Splitter.SplitTextByDelimiter(" - ", QuoteStyle.Csv),{"Offense.1", "Offense.2"}),
        #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Offense.1", type text}, {"Offense.2", type text}}),
        #"Split Column by Delimiter1" = Table.SplitColumn(#"Changed Type1","Offense.2",Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.Csv, false),{"Offense.2.1", "Offense.2.2"}),
        #"Changed Type2" = Table.TransformColumnTypes(#"Split Column by Delimiter1",{{"Offense.2.1", type text}, {"Offense.2.2", type text}}),
        #"Split Column by Delimiter2" = Table.SplitColumn(#"Changed Type2","Offense.1",Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.Csv, true),{"Offense.1.1", "Offense.1.2"}),
        #"Changed Type3" = Table.TransformColumnTypes(#"Split Column by Delimiter2",{{"Offense.1.1", type text}, {"Offense.1.2", type text}}),
        #"Split Column by Delimiter3" = Table.SplitColumn(#"Changed Type3","Offense.1.1",Splitter.SplitTextByEachDelimiter({"  "}, QuoteStyle.Csv, true),{"Offense.1.1.1", "Offense.1.1.2"}),
        #"Changed Type4" = Table.TransformColumnTypes(#"Split Column by Delimiter3",{{"Offense.1.1.1", type text}, {"Offense.1.1.2", type text}}),
        #"Removed Columns" = Table.RemoveColumns(#"Changed Type4",{"", "Offense.1.1.1", "Offense.2.2", "2", "3"}),
        #"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Offense.1.1.2", "Player"}, {"Offense.1.2", "Team"}, {"Offense.2.1", "Position"}}),
        #"Replaced Errors" = Table.ReplaceErrorValues(#"Renamed Columns", {{"GP*", 0}, {"Bye", 0}, {"Fantasy Fan Pts", 0}, {"Fantasy % Owned", 0}, {"Rankings Proj", 0}, {"Rankings Actual", 0}, {"Passing Yds", 0}, {"Passing TD", 0}, {"Passing Int", 0}, {"Rushing Att*", 0}, {"Rushing Yds", 0}, {"Rushing TD", 0}, {"Receiving Tgt*", 0}, {"Receiving Rec*", 0}, {"Receiving Yds", 0}, {"Receiving TD", 0}, {"Ret TD", 0}, {"Misc 2PT", 0}, {"Fum Lost", 0}})
    in
        #"Replaced Errors"

    The problem is that the webpage occasionally loads a banner ad that power query detects as a table. So sometimes the data I want is in Table 0, and sometimes it is in Table 1. I can get the above to work one page at a time (ie. invoke 0,25,50... separately), then change Source{0} to Source{1} for those that have failed - but this is far from ideal and requires far too much manual intervention.

    Is there a way to change my source based on the contents of the table? or possibly load a version of the site without ads so that I can consistently get the right table?

    Any help is much appreciated.



    Dan Garland


    Thursday, December 22, 2016 2:50 PM

Answers

  • See if the following works (using the Query Editor):

    1) Remove the Data0 step
    2) Select the source step, and add a new step to filter the ClassName column so that only the name "Table Ta-start Fz-xs Table-mid Table-px-sm Table-interactive" is selected. By default, the new step added after Source is #"Filtered Rows."
    3) Select the #Filtered Rows" step, and add a new step to remove the first column. At this point, you should have a table with a single column and a single row.
    4) Select the new #"Removed Column 1" step, and add a new step to expand the table, making sure that the box labeled "Use original column name as prefix" is cleared.

    There should be no changes required in the subsequent steps, which start at #"Changed Type." You may need to reorder some columns.

    If you want to return the entire table (all pages), you can create a table with values 0, 25, 50, etc., then add a custom column with the function query name, which uses the column values as a parameter.

    • Marked as answer by Daniel Garland Thursday, December 22, 2016 8:48 PM
    Thursday, December 22, 2016 7:42 PM

All replies

  • See if the following works (using the Query Editor):

    1) Remove the Data0 step
    2) Select the source step, and add a new step to filter the ClassName column so that only the name "Table Ta-start Fz-xs Table-mid Table-px-sm Table-interactive" is selected. By default, the new step added after Source is #"Filtered Rows."
    3) Select the #Filtered Rows" step, and add a new step to remove the first column. At this point, you should have a table with a single column and a single row.
    4) Select the new #"Removed Column 1" step, and add a new step to expand the table, making sure that the box labeled "Use original column name as prefix" is cleared.

    There should be no changes required in the subsequent steps, which start at #"Changed Type." You may need to reorder some columns.

    If you want to return the entire table (all pages), you can create a table with values 0, 25, 50, etc., then add a custom column with the function query name, which uses the column values as a parameter.

    • Marked as answer by Daniel Garland Thursday, December 22, 2016 8:48 PM
    Thursday, December 22, 2016 7:42 PM
  • THANK YOU

    That solved it. 


    Dan Garland

    Thursday, December 22, 2016 8:48 PM