locked
Powershell email RRS feed

  • Question

  • I have two folders one is a payroll.cvs, which has names and emails(gmail.com,yahoo.com,tonkin.com,gooseegee.com) and a fianl.txt file that has only emails of the workers at the company so they end in(tonkin.com,gooseegee.com,ecs), both files have over 1000 different emails. I need to find out all the work emails from payroll.csv that match the ones in final.txt, i dont need the gmail, yahoo, emails.(Gmail,Yahoo) would be consider external emails, and the company emails(tonkin.com,gooseegee.com) would be internal emails. I don't wana have to go one by one is there a way I can automate this in powershell, I have been trying but cant seem to get it to work.
    • Moved by Andy DavidMVP Saturday, December 1, 2018 4:26 PM PS Question
    Friday, November 30, 2018 8:19 PM

All replies

  • I've never used SQL in Powershell to work with CSV files so I thought I'd give it a try. This works with a small set of data.

    I'd suggest that you "final.txt" should be a CSV file, though. Just add the appropriate header line to it (e.g. "Email"). You'll have to adjust the SQL query to return the columns you want, and the name of the columns that hold the email addresses in both files.

    $csv1 = "c:\temp\payroll.csv"
    $csv2 = "c:\temp\fianl.csv"
    
    # both CSV files (i.e. "tables") must be in the same directory (i.e. "database)
    $path1 = split-path $csv1
    $path2 = split-path $csv2
    if ($path1 -ne $path2)
    {
        write-host "ERROR: Both CSV files must be in the directory" -ForegroundColor Red
        return
    }
    
    # All the code between here and the next comment of all "#" is just to allow the CSV files
    # to use a delimiter that's not a comma, or to use a CSV file without any headers
    $Csv1FirstRowColumnNames = "Yes"
    $Csv1Delimiter = ","
    $Csv2FirstRowColumnNames = "No"
    $Csv2Delimiter = ","
    
    # If the delimiter isn't a comma, create a schema.ini file to specify the delimiter
    # If the CSV has no header, create a schema.ini file to specify the lack of headers
    $createSchema = $True   # should use Set-Content to create a new file
    
    if ($Csv1FirstRowColumnNames -eq "No" -or $Csv1Delimiter -ne ",") 
    {
        $filename1 = Split-Path $csv1 –leaf
        $schemaFile = $path1 + "\schema.ini"
        Set-Content -Path $schemaFile -Value "[$filename1]"
        if ($Csv1Delimiter -ne ",") 
        {
            Add-Content -Path $schemaFile -Value "Format=Delimited($Csv1Delimiter)"
        }
        if ($Csv1FirstRowColimnNames -eq "No")
        {
            Add-Content -Path $schemaFile -Value "ColNameHeader=False"
       }
       $createSchema = $false     # don't overwrite the schema just created if 2nd file also needs attention
    }
    if ($Csv2FirstRowColumnNames -eq "No" -or $Csv2Delimiter -ne ",") 
    {
        $filename2 = Split-Path $csv2 –leaf
        $schemaFile = $path2 + "\schema.ini"
        if ($createSchema) 
        {
            Set-Content -Path $schemafile -Value "[$filename2]"
        }
        else {
            Add-Content -Path $schemafile -Value "[$filename2]"
        }
        if ($Csv2Delimiter -ne ",")
        {
            Add-Content -Path $schemafile -Value "Format=Delimited($Csv2Delimiter)"
        }
        if ($Csv2FirstRowColumnNames -eq "No")
        {
            Add-Content -Path $schemafile -Value "ColNameHeader=False"
        }
    }
    #######################################################################################
    # end of delimiter and header specialization
    
    # just pick the 1st provider (there are two on my system, one for MS Office 12 and one for 15)
    $provider = (New-Object System.Data.OleDb.OleDbEnumerator).GetElements() | Where-Object { $_.SOURCES_NAME -like "Microsoft.ACE.OLEDB.*" } 
    if ($provider -is [system.array]) { $provider = $provider[0].SOURCES_NAME } else {  $provider = $provider.SOURCES_NAME }
    
    $connstring = "Provider=$provider;Data Source=$(Split-Path $csv1);Extended Properties='text;';"
    
    $tablename1 = (Split-Path $csv1 -leaf).Replace(".","#")
    $tablename2 = (Split-Path $csv2 -leaf).Replace(".","#")
    
    # table "P" is the payroll table (with mixed e-mail addresses)
    # table "F" is the "fianl" file  (with company e-mail addresses)
    
    # SQL query if BOTH csv files have headers
    #$sql = "SELECT p.FirstName, p.LastName, p.Email as PayrollEmail, f.Email as FianlEmail from $tablename1 as p, $tablename2 as f WHERE p.Email = f.Email"
    
    # SQL query for 1st CSV file WITH headers and 2nd one WITHOUT headers
    # 2nd file (fianl.csv) has no headers so the column name "F1" is auto generated by the provider software
    $sql = "SELECT p.FirstName, p.LastName, p.Email as PayrollEmail, f.F1 as FianlEmail from $tablename1 as p, $tablename2 as f WHERE p.Email = f.F1"
    
    # Setup connection and command
    $conn = New-Object System.Data.OleDb.OleDbconnection
    $conn.ConnectionString = $connstring
    $conn.Open()
    
    $cmd = New-Object System.Data.OleDB.OleDBCommand
    $cmd.Connection = $conn
    $cmd.CommandText = $sql
    
    # Load into datatable
    $dt = New-Object System.Data.DataTable
    $dt.Load($cmd.ExecuteReader("CloseConnection"))
    # Clean up
    $cmd.dispose | Out-Null
    $conn.dispose | Out-Null
    
    # Output results to console
    $dt | Format-Table -AutoSize
    # Output results to a new CSV
    $dt | Export-Csv -Path C:\temp\MatchedEmail.csv -NoTypeInformation
    
    ###################################################################################
    # Contents of test payroll.csv and fianl.csv files
    # if you want to create them
    $payroll = @"
    FirstName,LastName,Email
    Donna,Cameron,dc@heroffice.org
    Sean,Burkholter,burky@aol.com
    Jessica,Biel,JesseBiel@bogus.org
    Heather,Lyons,Hlyons@nospam.com
    foo,bar,blan@blah.com.au
    Bogus,Test,Email
    "@
    
    $fianl = @"
    Email
    dc@heroffice.org
    JesseBiel@bogus.org
    blan@blah.com.au
    "@



    --- Rich Matheisen MCSE&I, Exchange Ex-MVP (16 years)


    Sunday, December 2, 2018 3:40 AM
  • This is much easier and can do almoat anything with two CSV files:

    #
    #   Easily join multiope CSV tables - VERY FAST
    #    Requires ACE drivers which can be installed form here: 
    #        https://www.microsoft.com/en-us/download/details.aspx?id=13255
    #
    
    Function Get-CsvReader {
    	Param (
    		$csvPath=$pwd,
    		[Parameter(Mandatory)]
    		$CommandText
    	)
    	
    	$tmpl = 'Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties="Text;HDR=Yes;FORMAT=Delimited"'
    	$connStr = $tmpl -f $csvPath
    	Write-Verbose $connStr	
    	
    	$csvConnection = New-Object System.Data.OleDb.OleDbConnection($connStr)
    	$csvConnection.Open()
    	
    	$cmd = $csvConnection.CreateCommand()
    	$cmd.CommandText = $CommandText
    	
    	$dt = New-Object System.Data.DataTable
    	$rdr = $cmd.ExecuteReader()
    	$dt.Load($rdr)
    	$rdr.Close()
    	$dt
    }
    
    # Test code
    $csvPath = 'd:\scripts'
    $sql = @'
    	SELECT 
    		F1.ID as ID,
    		F1.Name as Name, 
    		F1.Status as Status1, 
    		F2.Status as Status2 
    	FROM 
    		[test1.csv] as F1,
    		[test2.csv] As F2 
    	WHERE 
    		F1.ID = F2.ID
    '@
    
    Get-CsvReader -csvPath $csvPath -CommandText $sql
    
    
    <#
      # test data
    
    # TEST1,CSV
    "id","Name","Status"
    1,"AdobeARMservice","OK"
    2,"AJRouter","OK"
    3,"ALG","ERROR"
    4,"AppHostSvc","OK"
    
    # TEST2.CSV
    "id","Name","Status"
    1,"AdobeARMservice","OK"
    2,"AJRouter","OK"
    3,"ALG","OK"
    4,"AppHostSvc","OK"
    
    #>

    Download link: https://1drv.ms/u/s!AjiiPtIUqzK_gs0UobI_Zrj9tslXKw

    A non "ACE" method to manipulate a CSV relation:

    #	
    #  Here is an an example of joining two tables in a DataViewRelation with disconnected data.
    #   [of couse this is much esier with Linq}
    #
    #
    
    #region Control Helper Functions
    function ConvertTo-DataTable {
    	[OutputType([System.Data.DataTable])]
    	param (
    		[ValidateNotNull()]
    		$InputObject,
    		[ValidateNotNull()]
    		[System.Data.DataTable]$Table,
    		[switch]$RetainColumns,
    		[switch]$FilterWMIProperties)
    	
    	if ($null -eq $Table) {
    		$Table = New-Object System.Data.DataTable
    	}
    	
    	if ($InputObject -is [System.Data.DataTable]) {
    		$Table = $InputObject
    	} elseif ($InputObject -is [System.Data.DataSet] -and $InputObject.Tables.Count -gt 0) {
    		$Table = $InputObject.Tables[0]
    	} else {
    		if (-not $RetainColumns -or $Table.Columns.Count -eq 0) {
    			#Clear out the Table Contents
    			$Table.Clear()
    			
    			if ($null -eq $InputObject) { return } #Empty Data
    			
    			$object = $null
    			#find the first non null value
    			foreach ($item in $InputObject) {
    				if ($null -ne $item) {
    					$object = $item
    					break
    				}
    			}
    			
    			if ($null -eq $object) { return } #All null then empty
    			
    			#Get all the properties in order to create the columns
    			foreach ($prop in $object.PSObject.Get_Properties()) {
    				if (-not $FilterWMIProperties -or -not $prop.Name.StartsWith('__')) #filter out WMI properties
    {
    					#Get the type from the Definition string
    					$type = $null
    					
    					if ($null -ne $prop.Value) {
    						try { $type = $prop.Value.GetType() } catch { Out-Null }
    					}
    					
    					if ($null -ne $type) # -and [System.Type]::GetTypeCode($type) -ne 'Object')
    {
    						[void]$table.Columns.Add($prop.Name, $type)
    					} else #Type info not found
    {
    						[void]$table.Columns.Add($prop.Name)
    					}
    				}
    			}
    			
    			if ($object -is [System.Data.DataRow]) {
    				foreach ($item in $InputObject) {
    					$Table.Rows.Add($item)
    				}
    				return @( ,$Table)
    			}
    		} else {
    			$Table.Rows.Clear()
    		}
    		
    		foreach ($item in $InputObject) {
    			$row = $table.NewRow()
    			
    			if ($item) {
    				foreach ($prop in $item.PSObject.Get_Properties()) {
    					if ($table.Columns.Contains($prop.Name)) {
    						$row.Item($prop.Name) = $prop.Value
    					}
    				}
    			}
    			[void]$table.Rows.Add($row)
    		}
    	}
    	
    	return @( ,$Table)
    }
    #endregion
    
    $csv1 = Import-Csv testjoin1.csv
    $dt1 = ConvertTo-DataTable $csv1
    $dt1.PrimaryKey = $dt1.Columns[0]
    
    $csv2 = Import-Csv testjoin2.csv
    $dt2 = ConvertTo-DataTable $csv2
    $dt2.PrimaryKey = $dt2.Columns[0]
    
    $dt3 = [System.Data.DataTable]::new()
    $dt3.Merge($dt1)
    $dt3.Merge($dt2)
    $dt3
    
    $ds = [System.Data.DataSet]::new('DemoView')
    $ds.Tables.AddRange(@($dt1,$dt2))
    
    #get new table based on releations and view
    $dtjoined = $ds.Select(

    Download link: https://1drv.ms/u/s!AjiiPtIUqzK_haRLPKIyViAaN9JBTg

    You can use either of these techniques to join filter and manipulate CSV files in almost any way imaginable.  It can manage as many CSV files as necessary.  These examples use two CSV files but you can write SQL for three, four or more files.

    Also remember the SQL can do sorting, grouping, filtering and lookups.  Just write the simple SQL text and all else is automatic.




    \_(ツ)_/



    Sunday, December 2, 2018 4:07 AM