During an assignment, we came across a situation where we need to fix data issues in SharePoint Lists.

One of the issues that we found was a presence of duplicate data. In order to fix that problem, we developed a PowerShell Script to find out duplicate data based on a specific or a group of columns.

For the sake of the demo, weI have added a SharePoint List with some duplicate records in it as shown below:

1

Now let’s look into the code to understand implementation details:

In Step 1 we are getting references to the Site and Web where the SharePoint List resides.

In Step 2 we are splitting the list of columns based on which we want to find out the duplicate data.

We can see there are two input variables “ColumnToValidate” and “ColumnToDisplay”“ColumnToValidate” provides columns based on which duplicity needs to be checked while “ColumnToDisplay” contains the list of columns that need to be the part of data export.

In Step 3 we are creating the export folder that will hold the CSV files exported with duplicate records

In Step 4 we are creating the list object that will give the handle on the list which needs to be validated

2

In Step 5 we are getting list of Items from SharePoint List and grouping them based on the validation columns

In Step 6 we are creating the directory for export files

In Step 7 we are exporting all the groups which are having item count greater than 1 (this logic identifies the duplicate items)

3

That is all for the code.

Now we will see the variation in outputs depending on the columns specified for duplicate check

In Step 8 we specify the validation and display columns, for the first execution we will check duplicate values in “Title” column

In Step 9 we are calling the “Get-DuplicateListItems” function to find the duplicate values

4

After the function executed successfully we can see the following output.

In Step 10 we can see the output of this execution and can see six items found which duplicate in Title Column

5

In Step 11 we can see the CSV file that is exported by the execution considering “Title” Column to be validated.

6

In Step 12 we can see the output file and can notice duplicate values in “Title” Column

7

In Step 13 we have changed the list of columns to be validated. In this second execution, we have added another column “Role”.

Now the list will be validated for duplicity based on the combination of “Title & Role” Columns

8

In Step 14 we can see the output of this execution and can see four items found which duplicate in “Title & Role” Columns

9

In Step 15 we can see the CSV file that is exported by the execution considering “Title & Role” Column to be validated

10

In Step 16 we can see the output file and can notice duplicate values in “Title & Role” Column

11

In Step 17 we have changed the list of columns to be validated. In this second execution, we have added another column “Location”.

Now the list will be validated for duplicity based on the combination of “Title & Role & Location” Columns

12

In Step 18 we can see the output of this execution and can see two items found which duplicate in “Title & Role & Location” Columns

13

In Step 19 we can see the CSV file that is exported by the execution considering “Title & Role & Location” Column to be validated

14

In Step 20 we can see the output file and can notice duplicate values in “Title & Role & Location” Column

15

This is a very simple technique that can be used to fix one of the issues with SharePoint List data.

Hope you find it helpful.