Answered by:
Combining duplicate entries

Question
-
I have a .csv file that lists customer email addresses, and their associated customer ID. Some of the customers have ended up with multiple Customer ID's related to their email address though, which are currently listed on separate lines as follows.
customer1@email.com ID001
customer2@email.com ID002
customer2@email.com ID003
customer3@email.com ID004I want to import this info into powershell, then combine all the IDs from duplicated email addresses onto one line, so they look like the following. Preferrably, I would like the output to have the IDs in separate cells once opened in Excel, as opposed to all in the same cell.
customer1@email.com ID001
customer2@email.com ID002, ID003
customer3@email.com ID004Obviously the first step will be to import the csv to a variable, but from there I'm not sure how to go about this. Specifically, I'm not sure how to have Powershell look for a duplicate for each entry, and then combine the ID field. There's also over 300k records in this csv file, so there's really no way to do this manually.
I'm not asking for this to be written for me, but if someone could point me in the right direction (cmdlets to look at, etc), that should get me started.
Thanks,
Andrew
- Edited by musik779 Tuesday, February 13, 2018 7:05 PM Formatting
Tuesday, February 13, 2018 7:04 PM
Answers
-
wrong tool,
use excel und pivot
or
first you must import data in powershell
get-content or import-csv
next you should sort your data
then you need a foreach
remember you email
if It same as last add ID to a variable
Chris
- Edited by -- Chris -- Tuesday, February 13, 2018 7:48 PM
- Marked as answer by musik779 Wednesday, February 14, 2018 7:00 PM
Tuesday, February 13, 2018 7:37 PM
All replies
-
wrong tool,
use excel und pivot
or
first you must import data in powershell
get-content or import-csv
next you should sort your data
then you need a foreach
remember you email
if It same as last add ID to a variable
Chris
- Edited by -- Chris -- Tuesday, February 13, 2018 7:48 PM
- Marked as answer by musik779 Wednesday, February 14, 2018 7:00 PM
Tuesday, February 13, 2018 7:37 PM -
Maybe there's some way with group-object?
Tuesday, February 13, 2018 10:52 PM -
Import-CSv <your csv> | group email | where{$_.Count -gt 1}
\_(ツ)_/
Wednesday, February 14, 2018 1:47 AM -
Maybe like this.
import-csv group.csv | group email | select name,@{n='id';e={$_.group.id}}
Name id---- --customer1@email.com ID001customer2@email.com {ID002, ID003}customer3@email.com ID004- Edited by JS2010 Wednesday, February 14, 2018 4:39 AM
- Proposed as answer by Albert LingMicrosoft contingent staff Wednesday, February 14, 2018 5:12 AM
Wednesday, February 14, 2018 3:53 AM -
cool solution. If I test it I became a wrong output
email;id
one;a1
two;b1
two;b2
two;b3
three;c1
three;c1
four;d1what is wrong? your column name is "Name" and you Group by "email"?
PS H:\> import-csv c:\temp\group.csv | group email | select name,@{n='id';e={$_.group.id}}
Name id
---- --
{$null, $null, $null, $null...}
Chris
- Edited by -- Chris -- Thursday, February 15, 2018 8:14 AM
Thursday, February 15, 2018 8:13 AM -
The default delimiter in a csv is a comma, not a semicolon.
- Edited by JS2010 Thursday, February 15, 2018 12:41 PM
- Proposed as answer by -- Chris -- Thursday, February 15, 2018 1:36 PM
Thursday, February 15, 2018 12:40 PM -
import-csv c:\temp\group.csv -delim ';' | group email | select name,@{n='id';e={$_.group.id}}
\_(ツ)_/
- Proposed as answer by -- Chris -- Thursday, February 15, 2018 1:36 PM
Thursday, February 15, 2018 12:46 PM -
thxs all
newbee question,
the property Name for the first colume is "email" but the select name is "Name"? why
Chris
Thursday, February 15, 2018 1:36 PM -
The group (group-object) cmdlet outputs count, name and group properties. If you run each cmdlet seperately, you can see what they do.
PS C:\Users\me> import-csv group.csv -delim ';' | group emailCount Name Group----- ---- -----1 one {@{email=one; id=a1}}3 two {@{email=two; id=b1}, @{email=two; id=b2}, @{email=two; id=b3}}2 three {@{email=three; id=c1}, @{email=three; id=c1}}1 four {@{email=four; id=d1}}
- Edited by JS2010 Thursday, February 15, 2018 4:04 PM
Thursday, February 15, 2018 1:49 PM