VB Script Remove Duplicates

    Question

  • Hi, I have a text file - INPUT.txt that contains -

    156.345.123.1 SERVER1_NIC1
    156.345.123.2 SERVER2_NIC1
    156.345.123.3 SERVER1_NIC2
    156.345.123.4 SERVER2_NIC2

    I have a script that loops through the file, reads in just the server name (stripping off everything from the "_" onward), and then creates an output file (OUTPUT.txt) with just the server names, which looks something like -

    SERVER1
    SERVER2

    So far so good: I've extracted just the server names, then deduped them (since I only need the first occurrence of each). However, when I write the names to the output file, I still need the rest of the line that goes with the first occurrence of each server name, so I'd like OUTPUT.txt to contain the first occurrence of each server and its IP address, something like -

    156.345.123.1 SERVER1

    156.345.123.2 SERVER2

    I can only get either just the server names in the output file or all of the lines. I tried adding the line ThisStrName3 = Trim(Mid(Line,1,14)) &" "&mid(ThisStrName1,1,ThisStrName2) to the following code, but when I do, the dedupe stops working and every occurrence ends up in my OUTPUT.txt. Any ideas? Thanks!

     

    Set objDictionaryHost = CreateObject("Scripting.Dictionary")
    Set objDupHost = CreateObject("Scripting.FileSystemObject")
     Set objDupHostFile = objDupHost.OpenTextFile("c:\temp\INPUT.txt", 1)

    Set objWriteH = CreateObject("Scripting.FileSystemObject")
    Set objWriteHHostFile = objWriteH.CreateTextFile("C:\temp\OUTPUT.txt",2)

    Do
        Line = Trim(objDupHostFile.ReadLine)
        ThisStrName1 = Trim(Mid(Line,15,20))
        ThisStrName2 = InStr(1,ThisStrName1, "_") - 1
        ThisStrName3 = Trim(Mid(Line,1,14)) &" "&mid(ThisStrName1,1,ThisStrName2)

        If Not objDictionaryHost.Exists(ThisStrName3) Then
            objDictionaryHost.Add ThisStrName3,ThisStrName3
        End If
    Loop while not objDupHostFile.AtEndOfStream or ThisStrName3 = True

    For Each ThisStrName3 in objDictionaryHost.Keys
    wscript.echo ThisStrName3
     objWriteHHostFile.WriteLine Trim(ThisStrName3)
    Next

    objWriteHHostFile.Close
     objDupHostFile.Close

    Thursday, April 22, 2010 1:02 AM

Answers

  • Sorry, I didn't test the script I posted. I see now that I changed the names of some variables. I was trying to select names that meant something and confused myself. I tested the version below with your example input file and it worked for me. I used "Option Explicit" and declared all variables. That's how I found the problems. I also used only one reference to the FileSystemObject. The same object reference can be used to open both files.

     

    Option Explicit
    Dim objDictionaryHost, objFSO, objDupHostFile
    Dim objWriteHHostFile, strLine, strAddress, strServer, strItem
    
    Set objDictionaryHost = CreateObject("Scripting.Dictionary")
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    Set objDupHostFile = objFSO.OpenTextFile("c:\temp\INPUT.txt", 1)
    
    ' The second argument to CreateTextFile is the Boolean overwrite flag.
    Set objWriteHHostFile = objFSO.CreateTextFile("C:\temp\OUTPUT.txt", True)
    
    Do Until objDupHostFile.AtEndOfStream
      strLine = Trim(objDupHostFile.ReadLine)
      If (strLine <> "") Then
        ' The line to output ends with the character before "_".
        strAddress = Mid(strLine, 1, InStr(strLine, "_") - 1)
        ' The server name starts with the character after the first space.
        strServer = Mid(strAddress, InStr(strAddress, " ") + 1)
    
        If Not objDictionaryHost.Exists(strServer) Then
          objDictionaryHost.Add strServer, strAddress
        End If
      End If
    Loop
    
    For Each strItem in objDictionaryHost.Items
      wscript.echo strItem
      objWriteHHostFile.WriteLine Trim(strItem)
    Next
    
    objWriteHHostFile.Close
    objDupHostFile.Close

    I also coded to avoid an error if the input file has a blank line. The script will still raise an error if any line lacks either a "_" or space character.
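
    One way to avoid that remaining error is to check the positions returned by InStr before calling Left or Mid. The sketch below is only an illustration: TryParseLine is a hypothetical helper name, and the second sample line is made up to show the malformed case.

    ```vbscript
    Option Explicit

    ' Hypothetical helper: returns True and fills strAddress/strServer
    ' only when the line contains a space followed later by "_".
    Function TryParseLine(strLine, strAddress, strServer)
      Dim intSpace, intUnderscore
      TryParseLine = False
      intSpace = InStr(strLine, " ")
      If intSpace = 0 Then Exit Function
      ' Search for "_" only after the space.
      intUnderscore = InStr(intSpace + 1, strLine, "_")
      If intUnderscore = 0 Then Exit Function
      strAddress = Left(strLine, intUnderscore - 1)
      strServer = Mid(strAddress, intSpace + 1)
      TryParseLine = True
    End Function

    Dim strAddress, strServer

    ' Well-formed line from the example input.
    If TryParseLine("156.345.123.1 SERVER1_NIC1", strAddress, strServer) Then
      WScript.Echo strServer & " = " & strAddress
    End If

    ' Made-up line with no "_": skipped instead of raising an error.
    If Not TryParseLine("156.345.123.9 SERVERX", strAddress, strServer) Then
      WScript.Echo "skipped malformed line"
    End If
    ```

    Inside the read loop, the dictionary would then only be updated when the helper returns True.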

    Richard Mueller


    MVP ADSI
    Thursday, April 22, 2010 2:53 AM
    Moderator
  • Sorry, I'm confused about the requirements. In your INPUT.txt example above there are no duplicate entries, since the IP addresses make every line unique. Are you saying that your INPUT.txt will contain duplicate IP addresses? If so, I can understand the use of the dictionary object. The reason Richard's code is not working is that it uses an "Item" variable that is never assigned a value anywhere in the script. So make the dictionary key the IP address and then check for the existence of that.

    'REPLACE THIS:
    
     ' The server name starts with the character after the first space.
     strServer = Mid(Item, InStr(strAddress, " ") + 1)
    
     If Not objDictionaryHost.Exists(strServer) Then
      objDictionaryHost.Add strServer, strAddress
     End If
    
    'WITH THIS:
    
     ' Return everything left of the space before the server name.
     strIP = Left(strAddress, InStr(strAddress, " ") - 1)
    
     If Not objDictionaryHost.Exists(strIP) Then
      objDictionaryHost.Add strIP, strAddress
     End If
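
    To see why an unassigned Item leaves only one entry in the dictionary, here is a minimal sketch. It assumes the script runs without Option Explicit, as in the original post, so Item is simply Empty. The two hard-coded strings stand in for lines that have already been cut off at the "_".

    ```vbscript
    ' No Option Explicit here on purpose: Item is never assigned, so it is Empty.
    Dim dict, strAddress, strServer
    Set dict = CreateObject("Scripting.Dictionary")

    strAddress = "156.345.123.1 SERVER1"
    ' Mid on an Empty value yields a zero-length string, whatever the start position.
    strServer = Mid(Item, InStr(strAddress, " ") + 1)
    WScript.Echo "key for line 1: [" & strServer & "]"
    dict.Add strServer, strAddress

    strAddress = "156.345.123.2 SERVER2"
    strServer = Mid(Item, InStr(strAddress, " ") + 1)
    WScript.Echo "key for line 2: [" & strServer & "]"

    ' The key is the same zero-length string, so the second line looks like a duplicate.
    If dict.Exists(strServer) Then
      WScript.Echo "line 2 skipped, dictionary holds " & dict.Count & " item(s)"
    End If
    ```

    Every line produces the same empty key, which would explain why only the first line reaches the output.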

     


    v/r LikeToCode....Mark the best replies as answers.
    Thursday, April 22, 2010 3:07 AM
    Moderator

All replies

  • When you add items to the dictionary object, make the key value the server name (which must be unique), but make the item value the string you want to output. Then when you enumerate the dictionary object instead of .Keys use .Items. For example:

    Do 
      strLine = Trim(objDupHostFile.ReadLine)
      ' The line to output ends with the character before "_".
      strAddress = Mid(Line, 1, InStr(Line, "_") - 1)
      ' The server name starts with the character after the first space.
      strServer = Mid(Item, InStr(strAddress, " ") + 1)
    
      If Not objDictionaryHost.Exists(strServer) Then
        objDictionaryHost.Add strServer, strAddress
      End If
    Loop while not objDupHostFile.AtEndOfStream
    
    For Each strItem in objDictionaryHost.Items
      wscript.echo strItem
      objWriteHHostFile.WriteLine Trim(strItem)
    Next
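
    As a self-contained illustration of the key/item idea, the sketch below holds the sample lines from the question in an array instead of reading INPUT.txt (that substitution, and the variable names, are assumptions made for the example). The server name is the key and the trimmed line is the item.

    ```vbscript
    Option Explicit
    Dim dict, lines, strLine, strAddress, strServer, strKey

    ' Sample data from the question, held in an array instead of INPUT.txt.
    lines = Array("156.345.123.1 SERVER1_NIC1", "156.345.123.2 SERVER2_NIC1", _
                  "156.345.123.3 SERVER1_NIC2", "156.345.123.4 SERVER2_NIC2")

    Set dict = CreateObject("Scripting.Dictionary")

    For Each strLine In lines
      ' Everything before "_" is the text to output.
      strAddress = Left(strLine, InStr(strLine, "_") - 1)
      ' Everything after the first space is the server name, used as the key.
      strServer = Mid(strAddress, InStr(strAddress, " ") + 1)
      If Not dict.Exists(strServer) Then
        dict.Add strServer, strAddress
      End If
    Next

    ' Keys are the unique server names; each item is the line to write out.
    For Each strKey In dict.Keys
      WScript.Echo strKey & " -> " & dict(strKey)
    Next
    ```

    With the four sample lines, the dictionary ends up with two entries, one for SERVER1 and one for SERVER2, each holding the first line seen for that server.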
    
    
    Richard Mueller
    MVP ADSI
    Thursday, April 22, 2010 1:21 AM
    Moderator
  • Richard, thanks so much. I reran the script with your updates, and it does return the line in the format I need, but for some reason it only returns the first instance, even though I can see the loop should go through the entire INPUT file. What am I not seeing here? For example, when I run the script now, I just get...

    C:\temp>cscript dedupe_hostnames.vbs
    Microsoft (R) Windows Script Host Version 5.7
    Copyright (C) Microsoft Corporation. All rights reserved.

    156.345.123.1 SERVER1

    Thursday, April 22, 2010 1:31 AM
  • I'm used to looping as follows:

    Do Until objDupHostFile.AtEndOfStream
     strLine = Trim(objDupHostFile.ReadLine)
     ' The line to output ends with the character before "_".
     strAddress = Mid(Line, 1, InStr(Line, "_") - 1)
     ' The server name starts with the character after the first space.
     strServer = Mid(Item, InStr(strAddress, " ") + 1)
    
     If Not objDictionaryHost.Exists(strServer) Then
      objDictionaryHost.Add strServer, strAddress
     End If
    Loop

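    Do Until tests AtEndOfStream before each read, so it also copes with an empty file. Your Do ... Loop While Not objDupHostFile.AtEndOfStream runs the body once before testing and then keeps looping while the stream has more lines, so on a four-line file both forms should read every line.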
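
    For comparison, here is a small sketch of how the two loop forms differ. It uses a plain counter in place of the file (an assumption made to keep the example self-contained), starting from zero remaining items to mimic an empty input.

    ```vbscript
    Option Explicit
    Dim intRemaining, intPreTest, intPostTest

    ' Pre-test loop: the condition is checked before the body runs.
    intRemaining = 0
    intPreTest = 0
    Do Until intRemaining = 0
      intPreTest = intPreTest + 1
      intRemaining = intRemaining - 1
    Loop

    ' Post-test loop: the body runs once before the condition is checked.
    intRemaining = 0
    intPostTest = 0
    Do
      intPostTest = intPostTest + 1
      intRemaining = intRemaining - 1
    Loop While intRemaining > 0

    WScript.Echo "pre-test iterations: " & intPreTest
    WScript.Echo "post-test iterations: " & intPostTest
    ```

    With nothing to process, the pre-test loop never runs its body while the post-test loop runs it once. With a file, that one extra pass is a ReadLine at end of stream, which raises an error. When there is at least one item, both forms visit every item.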

    Richard Mueller


    MVP ADSI
    Thursday, April 22, 2010 2:17 AM
    Moderator
  • Will this work? I removed the dictionary object and used the InStr function to find the number of characters up to and including the "_", and then used the Left function to return a string of only those characters before the "_". Then I wrote it line by line to the OUTPUT.txt file.

    Set objDupHost = CreateObject("Scripting.FileSystemObject")
    Set objDupHostFile = objDupHost.OpenTextFile("c:\scripts\INPUT.txt", 1)
    
    Set objWriteH = CreateObject("Scripting.FileSystemObject")
    Set objWriteHHostFile = objWriteH.CreateTextFile("C:\scripts\OUTPUT.txt",2)
    
    Do Until objDupHostFile.AtEndOfStream
    	strLine = Trim(objDupHostFile.ReadLine)
    	' get the count of characters to the "_" and then subtract 1 for the "_" itself.
    	intCount = InStr(strLine, "_") - 1
    	' Return everything to the left of the "_"
    	objWriteHHostFile.WriteLine Left(strLine, intCount)
    Loop 
    objWriteHHostFile.Close
    objDupHostFile.Close

    v/r LikeToCode....Mark the best replies as answers.
    Thursday, April 22, 2010 2:25 AM
    Moderator
  • Richard, this is exactly what I have, and it is still returning just the first instance of:  156.345.123.1 SERVER1

     

    Do Until objDupHostFile.AtEndOfStream
      Line = Trim(objDupHostFile.ReadLine)
      ' The line to output ends with the character before "_".
      strAddress = Mid(Line, 1, InStr(Line, "_") - 1)
      ' The server name starts with the character after the first space.
      strServer = Mid(Item, InStr(strAddress, " ") + 1)

      If Not objDictionaryHost.Exists(strServer) Then
        objDictionaryHost.Add strServer, strAddress
      End If
    Loop

    Thursday, April 22, 2010 2:30 AM