locked
How can I use power query to strip a list of noise words out of separate list? RRS feed

  • Question

  • Good morning/afternoon/evening,

    First, let me say that I think Power Query is fascinating.  It's really a huge step toward putting sophisticated data transformation tasks into the hands of power users.

    So I have a list of company names and I want to remove common noise words from these names.

    For the sake of argument there are a lot of noise words ie inc. corp. llc., etc. so I want the list of noise words to come from a range.

    So assume I have two tables companies with just a single name column and noiseword with just a name column.

    What's the best way to go about this with Power Query???

    Thanks!

    Alan

    PS.  Someone needs to write a book on Power Query because there is a lot to it.  Maybe a set of advanced examples.  Chris Webb has a pretty cool wordpress blog...

    Wednesday, August 21, 2013 6:33 PM

Answers

  • Given table "companies" with single column "name" and table "noiseword" with single column "word".

    For example,

    name
    Microsoft Inc
    Rare, LLC
    Apple Corporation
     
    word
    Corporation
    Incorporated
    Inc
    LLC
    ,


    1. Reference companies table

    2. Add custom column: List.Accumulate( noiseword[word], [name], (state, current) => Text.Replace(state, current, ""))

    This is just find, replace... it will not work if your company is named PLLClub... (see it contains LLC)

    name Custom
    Microsoft Inc Microsoft 
    Rare, LLC Rare 
    Apple Corporation Apple 

    Friday, August 23, 2013 8:36 PM