none
Power Query Update doesn't keep customn columns after a refresh RRS feed

  • Question

  • The update description (http://blogs.msdn.com/b/powerbi/archive/2014/05/06/new-updates-for-power-query.aspx) mentions that customn columns that are created in the Excel sheet are maintained after a refresh. 

    But if I 

    1) Load a Power Query result to a table

    2) Rightlick in the table and create a custom column within it

    3) refresh the query

    The column is again removed. What am I doing wrong? 

    Thursday, May 8, 2014 3:48 PM

Answers

All replies

  • It only works with columns added at the end.
    • Marked as answer by bsjut Thursday, May 8, 2014 6:25 PM
    Thursday, May 8, 2014 4:05 PM
  • Thanks, Curt!
    Thursday, May 8, 2014 6:26 PM
  • If the column is at the end, how does the table preserve the data?  Is it row related to cell content related?


    For example, power query created - add column which I manually add a phone number and contact date against.  When I refresh does this inserted column stay with the line item in the query or the row line (which changes) with every refresh?

    Monday, June 23, 2014 11:18 AM
  • We don't try to infer anything about the manually-added data. As such, this only really works if the data is relatively stable or if the value is calculated based on other values in the same row. The latter, however, is a pretty common scenario.

    If you have a list of data that you are maintaining and combining with other data, I would recommend you use Power Query itself to do the join. One approach could go like this:

    1. Copy your existing table (consisting of both the output from Power Query and your added columns) into a new sheet named "Editable" in the workbook.
    2. Define a new query for the Editable sheet. This should just return the contents of the sheet as a table with appropriate column names.
    3. Now, create another new query by doing an appropriate merge/join between the existing query and the one you created in the previous step. Make sure the columns have the same structure as those in the data that you copied in step 1.
    4. Copy the output of this new query into the "Editable" sheet.

    Now, all the changes you make should be to "Editable". When you want to do a refresh, Power Query will merge the existing Editable data with your other sources and produce new output. You must then remember to copy this output back into the Editable sheet so that it is updated with the new data.

    Monday, June 23, 2014 4:14 PM
  • To keep the custom column, data and format, you have to go to Data -> Properties and uncheck the Adjust column width and check the Preserve column sort/filter/layout. but the side effect is whatever new format / new column you made in query editor will all go the end of the table after loading them up. then you have to re arrange again.
    Wednesday, April 26, 2017 7:51 PM
  • This is the only solution I found to preserve the row of data in custom columns i.e Joining the custom column data using an existing key in the original query data. However you don't need to have separate sources (if I understand the response) you can:

     

    1. Add custom columns for manual input of data (preferably at the end but not critical)
    2. Create a new 'connection only' query called myTable* (call it what you want obviously) referencing the whole excel table including all the custom columns. We will then reference this later. Don't load it to a data table just 'connection only'.
    3. At the end of the main query (original query for the table) add the following steps to join and reorder the custom columns as follows:

     

    selectCustomColumns = Table.SelectColumns(myTable*, {"custom column", "custom column2", "key value column"}) //select all your custom columns & key value column from the myTable* connection only query table. We need the key value column to join the data back onto the new query data...


    renameKey = Table.RenameColumns(...// rename the key in selectCustomColumns (to avoid conflict in the following join)

    join =  Table.Join(lastStepOfOriginalQuery, "original key column", renameKey,  "renamed keycolumn",  JoinKind.LeftOuter) // Using LeftOuter lets you keep the new query rows without any associate custom column data

    deleteRenamedKey = Table.RemoveColumns(... //delete the duplicated key leaving only the original

    Reorder = Table.ReorderColumns(... // to whatever order you want so they don't have to be at the end.

    in

    Reorder

     

    Note: when you first run this the custom columns that are now part of the query will be duplicated in the table with a number appended at the end of the header name - just delete the original columns, you can manually edit the new 'custom' columns that have come in with the query result. In effect the original custom columns just seed the query the first time it runs.

     

    Note: it is of course possible to directly reference the table with custom columns using Excel.CurrentWorkbook(){[Name="myTable"]}[Content], however, it will be treated as a separate data source by the formula firewall and if you have thousands of rows it might fail. Therefore best to reference a staged 'connection only' query instead...








    • Edited by Joe Hobbs Tuesday, December 17, 2019 10:39 AM
    Wednesday, December 19, 2018 12:28 PM
  • Hi Joe Hobbs

    I have been using this process for a long time. It works perfectly.

    Can you please explain what you mean by

    2. Create a 'connection only' query myTable* referencing
     the whole excel table including all the custom columns
    I do not understand the difference from the  Excel.CurrentWorkbook(){[Name="myTable"]}[Content] that you suggest to avoid in your note.

    • Edited by anthony34 Wednesday, December 19, 2018 4:22 PM
    Wednesday, December 19, 2018 4:22 PM
  • Hi Anthony yes the Excel.CurrentWorkbook is often fine but I had a query often with 1000 + rows referencing  CSV files from a directory (i.e. 'the source') and as soon as I introduces another Excel.CurrentWorkbook source the query failed. After discussing the problem with Microsoft they suggested that for queries with 1000+ rows only have a single direct data source , all other data integrations should come from existing queries in a 'staged' process. It might not be a problem anymore but that's just how I'm working now!
    Wednesday, December 19, 2018 5:01 PM
  • Would you have an example of code that you could share here, because I still do not understand what means "single direct data source"
    Thursday, December 20, 2018 8:47 AM