Is excel the right tool to remove partially duplicate records
-
Friday, June 22, 2012 6:30 PMI have a huge spreadsheet with 1000's of records and the rows look like:
z:\images\12346
z:\images\12346\09132010_1151
09/13/2010 12:46 PM 14,934 12346_09132010_1151_5_i.bmp
09/13/2010 12:46 PM 14,934 12346_09132010_1151_6_i.bmp
z:\images\505722
z:\images\505722\10252011_0849
10/25/2011 11:36 AM 56,200 505722_10252011_0849_35_0.d
10/25/2011 11:36 AM 2,032,930 505722_10252011_0849_21_0.d
10/25/2011 11:36 AM 14,934 505722_10252011_0849_12_i.b
z:\images\8921
z:\images\8921\04192011_1022
04/19/2011 01:36 PM 14,934 8921_04192011_1022_92_i.bmp
Is there a way to remove the duplicate records programmatically with excel , so to make the records look like below?
z:\images\12346\09132010_1151
09/13/2010 12:46 PM 14,934 12346_09132010_1151_5_i.bmp
09/13/2010 12:46 PM 14,934 12346_09132010_1151_6_i.bmp
z:\images\505722\10252011_0849
10/25/2011 11:36 AM 56,200 505722_10252011_0849_35_0.d
10/25/2011 11:36 AM 2,032,930 505722_10252011_0849_21_0.d
10/25/2011 11:36 AM 14,934 505722_10252011_0849_12_i.b
z:\images\8921\04192011_1022
04/19/2011 01:36 PM 14,934 8921_04192011_1022_92_i.bmp
Thanks in advance.Jay Tabatabai
All Replies
-
Saturday, June 23, 2012 1:34 AM
On Fri, 22 Jun 2012 18:30:08 +0000, jaytaba wrote:>I have a huge spreadsheet with 1000's of records and the rows look like:>>z:\images\12346>z:\images\12346\09132010_1151>09/13/2010 12:46 PM 14,934 12346_09132010_1151_5_i.bmp>09/13/2010 12:46 PM 14,934 12346_09132010_1151_6_i.bmp>z:\images\505722>z:\images\505722\10252011_0849>10/25/2011 11:36 AM 56,200 505722_10252011_0849_35_0.d>10/25/2011 11:36 AM 2,032,930 505722_10252011_0849_21_0.d>10/25/2011 11:36 AM 14,934 505722_10252011_0849_12_i.b>z:\images\8921>z:\images\8921\04192011_1022>04/19/2011 01:36 PM 14,934 8921_04192011_1022_92_i.bmp>>Is there a way to remove the duplicate records programmatically with excel , so to make the records look like below?>>z:\images\12346\09132010_1151>09/13/2010 12:46 PM 14,934 12346_09132010_1151_5_i.bmp>09/13/2010 12:46 PM 14,934 12346_09132010_1151_6_i.bmp>z:\images\505722\10252011_0849>10/25/2011 11:36 AM 56,200 505722_10252011_0849_35_0.d>10/25/2011 11:36 AM 2,032,930 505722_10252011_0849_21_0.d>10/25/2011 11:36 AM 14,934 505722_10252011_0849_12_i.b>z:\images\8921\04192011_1022>04/19/2011 01:36 PM 14,934 8921_04192011_1022_92_i.bmp>>Thanks in advance.>>>Jay TabatabaiThis macro might work. It assumes your data is in column A, and that everything in column a will be processed.One caveat: this technique can only specify exact and partial matches. It cannot require, for example, that the match starts at the beginning of the string.For example:abc 123\456zyx123456abc 123\4567890123Since the first string is included in the second, it will count as a "partial match" and be deleted. If you need to also specify that the partial matches begin at the start of the strings being compared, then we need to use a different technique. I do not present that initially because it takes longer to examine each cell than it does to use the .Find method. And whether it is necessary depends on your data and desired results.
I also assumed that ABC = abc for the sake of comparisons, but if that is not the case, make the obvious change in the matchcase parameter.
To enter this Macro (Sub), <alt-F11> opens the Visual Basic Editor.Ensure your project is highlighted in the Project Explorer window.Then, from the top menu, select Insert/Module andpaste the code below into the window that opens.To use this Macro (Sub), <alt-F8> opens the macro dialog box. Select the macro by name, and <RUN>.============================Option ExplicitSub DeletePartialDups()Dim rSrc As RangeDim c As Range, r As RangeDim vRes() As RangeDim sFirstAdr As StringDim i As LongSet rSrc = Range("A1", Cells(Rows.Count, "A").End(xlUp))ReDim vRes(0)'Get list of cells to deleteFor Each c In rSrcsFirstAdr = c.AddressSet r = rSrc.Find(what:=c, after:=c, LookIn:=xlValues, _lookat:=xlPart, MatchCase:=False)If r.Address <> sFirstAdr ThenSet vRes(UBound(vRes)) = cReDim Preserve vRes(0 To UBound(vRes) + 1)End IfNext cApplication.ScreenUpdating = FalseFor i = UBound(vRes) - 1 To 0 Step -1vRes(i).Delete xlShiftUpNext iApplication.ScreenUpdating = TrueEnd Sub==========================
Ron- Edited by Ron Rosenfeld Saturday, June 23, 2012 1:36 AM
- Marked As Answer by jaytaba Saturday, June 23, 2012 3:30 AM
-
Saturday, June 23, 2012 3:34 AM
Thank you Ron appreciate your help.
I ran your macro and then
Sub main()
Dim StrPath
For Row = 1 To Range("A1").CurrentRegion.Rows.Count
If Left(Range("A" & Row).Value, 1) = "z" Then
StrPath = Range("A" & Row).Value
Else
Range("B" & Row).Value = StrPath & " " & Range("A" & Row).Value
End If
Next RowEnd Sub
And I can now import the results of "DIR z:\*.* /s" of my file system into a database for reporting & comparison.
Regards
Jay
Jay Tabatabai
-
Saturday, June 23, 2012 10:32 AMGlad to help. Thanks for the feedback.
Ron
-
Monday, June 25, 2012 1:30 AMModerator
-
Monday, July 09, 2012 4:19 AM
I also learned about the command below, which is worth adding to this post, as it eliminates the need for some of the macros discussed here. (please note that there is no space between "/a:" & "-D")
dir *.* /a:-D /b /s /n > Listing_results.txt
Thanks
Jay
Jay Tabatabai

