none
Automatically pull data from web, copy values into notepad and save in specific location RRS feed

  • Question

  • Hello everybody, I was told power query could help me.  I need to access a website with a table (can be 1 to 50 rows, 10 columns).  I want to copy the values from columns B and E for each row, and add text before and after each value.  From there I want to copy all of that as a notepad file and save it in a specific folder.  I need this to automatically happen at specific times during the day... any ideas?  Powershell?  Power Query?
    Saturday, March 7, 2015 8:39 PM

Answers

  • Your answer consist of three steps :

    Step 1 - Manually create the desired query

    The first step is to create the query via the UI (this cannot be done automatically).

    Here is an example of an M expression that loads a table from IMDB.com and extract the cast of Game of Thrones. The query picks two columns (Column2 and Column4) and adds prefix and suffix to the original strings.

    (In the expression below the last Table.TransformColumns is the function you will need to add the text)

    let
        Source = Web.Page(Web.Contents("http://www.imdb.com/title/tt0944947/?ref_=nv_sr_1")),
        Data1 = Source{1}[Data],
        #"Removed Top Rows" = Table.Skip(Data1,1),
        #"Changed Type" = Table.TransformColumnTypes(#"Removed Top Rows",{{"Column1", type text}, {"Column2", type text}, {"Column3", type text}, {"Column4", type text}}),
        #"Removed Other Columns" = Table.SelectColumns(#"Changed Type",{"Column2", "Column4"}),
        #"Transform Columns" = Table.TransformColumns(#"Removed Other Columns", {{"Column2", each "prefix1_" & _ & "_suffix1"},{"Column4", each "prefix2_" & _ & "_suffix2" }})
    
    in
        #"Transform Columns"

    Now you can load the query to your worksheet.

    Step 2 - Connection Settings

    Power Query will generate a Workbook Connection. You should edit it from the Data-->Connections dialog, and ensure that the refresh is not done at the background, and that the connection will be included in RefreshAll.

    Note that this part can be also automated, but since you will probably need to do it only once, it is not important for your scenario.

    Step 3 - Write a Powershell Script that converts XLSX to CSV

    After you edited the connection and saved the workbook as XLSX, you should prepare a Powershell script that saves the file as a CSV.

    $xl = New-Object -ComObject Excel.Application; 
    $xl.DisplayAlerts = $false; 
    $xl.Visible = $false;
    
    $workbook_name = "[here is the workbook full path and name (without the extension)]";
    
    $wb = $xl.Workbooks.Open($workbook_name + ".xlsx");
    $wb.RefreshAll();
    $wb.SaveAs($workbook_name + ".csv", 6);
    $wb.Close($True);
    $xl.Quit();
    [System.Runtime.Interopservices.Marshal]::ReleaseComObject($xl) | Out-Null;

    Hope it helps,

    Gil







    Sunday, March 8, 2015 7:45 AM