none
Powershell Excel, find duplicate value in column 1, compare column 3 values, write line with lowerest value to another file RRS feed

  • Question

  • I'm try to use Powershell, Excel, or SQL to parse a CSV to find duplicate values in column 1, compare column 3 values in those lines, and write the line with the lowest column 3 value to another file.  It will then have to avoid doing the same thing when it comes across another column 1 value it has already worked on.  It also should copy lines without duplicates to the new file.  The end goal is to have a list of Any feed back is appreciated. 
     
    Saturday, August 1, 2015 8:48 PM

Answers

  • Something to get you started:

    $csv = @"
    "col1","col2","col3","col4"
    "a","alpha","1","one"
    "a","alpha","2","two"
    "b","bravo","3","three"
    "b","bravo","4","four"
    "b","bravo","5","five"
    "c","charlie","6","six"
    "c","charlie","6","six"
    "d","delta","7","seven"
    "@
    
    $data = ConvertFrom-Csv $csv
    
    $data | group col1 | where Count -gt 1 | foreach {
        $lowest = ($_.Group | measure col3 -min).Minimum
        $_.Group | where col3 -eq $lowest | select -first 1
    } | ConvertTo-Csv -NoTypeInformation

    Output:

    "col1","col2","col3","col4"
    "a","alpha","1","one"
    "b","bravo","3","three"
    "c","charlie","6","six"


    Saturday, August 1, 2015 10:26 PM

All replies

  • CSVs are easiest in PowerShell.  Please post your script with a specific question.


    \_(ツ)_/

    Saturday, August 1, 2015 9:53 PM
  • Something to get you started:

    $csv = @"
    "col1","col2","col3","col4"
    "a","alpha","1","one"
    "a","alpha","2","two"
    "b","bravo","3","three"
    "b","bravo","4","four"
    "b","bravo","5","five"
    "c","charlie","6","six"
    "c","charlie","6","six"
    "d","delta","7","seven"
    "@
    
    $data = ConvertFrom-Csv $csv
    
    $data | group col1 | where Count -gt 1 | foreach {
        $lowest = ($_.Group | measure col3 -min).Minimum
        $_.Group | where col3 -eq $lowest | select -first 1
    } | ConvertTo-Csv -NoTypeInformation

    Output:

    "col1","col2","col3","col4"
    "a","alpha","1","one"
    "b","bravo","3","three"
    "c","charlie","6","six"


    Saturday, August 1, 2015 10:26 PM
  • This is great Leif-Arne !  Thanks! 
    Monday, August 3, 2015 1:17 AM
  • This is what I came up with using your suggestions, and it's working well.  Thanks again!

    ==================
    # ALL: Gets entries with the lowest value of Inactivity Days for each Device_ID regardless of how many entries there are for the same Device_ID
    
    $file = Import-Csv "E:\Report-TEST.csv" | group device_id
    
    foreach($group in $file){$group.group | 
        where 'Inactivity Days' -eq ($group.group | measure 'Inactivity Days' -Minimum).Minimum | 
        Export-Csv "E:\Report-TEST-ALL.csv" -NoTypeInformation -Append
        }
    
    ==================
    # DUPS: Gets entries with the lowest value of Inactivity Days for each Device_ID when there are more than one entry for the same Device_ID
    
    $file = Import-Csv "E:\Report-TEST.csv" | group device_id 
    
    foreach($group in $file | ?{$_.Count -gt 1}){
        $group.group | 
        where 'Inactivity Days' -eq ($group.group | measure 'Inactivity Days' -Minimum).Minimum | 
        Export-Csv "E:\Report-TEST-DUPS.csv" -NoTypeInformation -Append
        }
    
    =====================
    # SINGLE: Gets only single Device_ID entries
    
    $file = Import-Csv "E:\Report-TEST.csv" | group device_id 
    
    foreach($group in $file | ?{$_.Count -lt 2}){
        $group.group | 
        Export-Csv "E:\Report Updated-TEST-SINGLES.csv" -NoTypeInformation -Append
        }
    
    =====================

    Monday, August 3, 2015 5:18 AM
  • i need this what you have all but in excel !!! HOW?
    Thursday, September 8, 2016 1:53 PM
  • You can compare 2 columns easily in Excel, but  the compare may only be with in the same row.  I'm not sure, and I'm not sure how to do the rest of the steps.  My focus was on powershell.  To compare columns, Highlight the 2 columns you want to compare .  go to Home tab > Condistional Formatting button > Highlight Cells Rules > Duplicate Values.
    Thursday, September 8, 2016 2:37 PM
  • This is not a "how do I use Excel" forum.

    -- Bill Stewart [Bill_Stewart]

    Thursday, September 8, 2016 2:43 PM
    Moderator