locked
read 2 text files and based on key load datatable RRS feed

  • Question

  •  I have 2 text files that are bar (|) delimited with no headers in either file. I want to pull different data elements from
    each file based on the key to each file and load a datatable. The first 5 fields are how the 2 text files are joined(keys).

    hdr.txt
    exr|200|2017|05|899a2|tex|2017-06-05

    dtl.txt
    exr|200|2017|05|899a2|900|9999|glue

    In the example above since first 5 fields in hdr\dtl match I then want fields 6 and 7 from hdr and fields 6,7,8 from the dtl.
    I want that put into a datatable to be able to load a mssql table. The hdr table with have 10,000 lines and dtl should
    have 40,000. If I don't find a match from Hdr to the Dtl write key from hdr to log file for review. I have seen example reading a single table, but not 2 inputs. The files are different formats so can't merge.

    Thanks.

    Friday, June 16, 2017 7:45 PM

Answers

  • The best method is the following:

    #
    # Easily join multiope CSV tables - VERY FAST
    #    Requires ACE drivers which can be installed form # here: https://www.microsoft.com/en-us/download/details.aspx?id=13255
    #
    
    Function Get-CsvReader {
    	Param (
    		$csvPath=$pwd,
    		$CommandText
    	)
    	
    	$tmpl = 'Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties="Text;HDR=Yes;FORMAT=Delimited"'
    	$connStr = $tmpl -f $csvPath
    	
    	$csvConnection = New-Object System.Data.OleDb.OleDbConnection($connStr)
    	$csvConnection.Open()
    	
    	$cmd = $csvConnection.CreateCommand()
    	$cmd.CommandText = $CommandText
    	
    	$dt = New-Object System.Data.DataTable
    	$adapt = New-Object System.Data.OleDb.OleDbDataAdapter($cmd)
    	[void]$adapt.Fill($dt)
    	$dt
    }
    $sql = @'
    	SELECT 
    		F1.ID as ID,
    		F1.Name as Name, 
    		F1.Status as Status, 
    		F2.Status as Status2 
    	FROM 
    		[test1.csv] as F1,
    		[test2.csv] As F2 
    	WHERE 
    		F1.ID = F2.ID
    '@
    
    Get-CsvReader -csvPath c:\scripts -CommandText $sql
    


    \_(ツ)_/


    • Edited by jrv Sunday, June 18, 2017 12:36 AM
    • Proposed as answer by Hello_2018 Monday, June 19, 2017 8:20 AM
    • Marked as answer by hart60 Tuesday, June 20, 2017 5:40 PM
    Sunday, June 18, 2017 12:34 AM
  • Here is a quick and simple way to inner join two tables (two CSV files) without needing any external drivers or code:

    $csv1 = Import-Csv testjoin1.csv
    $dt1 = ConvertTo-DataTable $csv1
    $dt1.PrimaryKey = $dt1.Columns[0]
    
    $csv2 = Import-Csv testjoin2.csv
    $dt2 = ConvertTo-DataTable $csv2
    $dt2.PrimaryKey = $dt2.Columns[0]
    
    $dt3 = [System.Data.DataTable]::new()
    $dt3.Merge($dt1)
    $dt3.Merge($dt2)
    $dt3
    
    The PrimaryKey can be multiple columns.


    \_(ツ)_/

    • Marked as answer by hart60 Thursday, June 22, 2017 8:04 PM
    Thursday, June 22, 2017 1:17 PM

All replies

  • What have you tried.  I recommend that you start by converting these files to legitimate CSV files by importing with a declared header and exporting to a true CSV.  From there we can use a umber of methods to join the two files.  The easiest being to use a data connection.  The second is to convert to a data table and add t a dataset then create a view that joins the data.  You can also just use a brute force parse of one file against the second file but non-unique keys can make that difficult.


    \_(ツ)_/

    Friday, June 16, 2017 8:32 PM
  •  I have the code to create the csv.

    Import-Csv -path 'c:\compress_in\hdr.txt' -Delimiter '|' -Header comp,div,YY,MM,Inv,Channel,Invdate | Export-Csv -Path 'c:\compress_in\hdr.csv' -NoTypeInformation
    Import-Csv -path 'c:\compress_in\dtl.txt' -Delimiter '|' -Header comp,div,YY,MM,Inv,Qty,Pline,Desc | Export-Csv -Path 'c:\compress_in\dtl.csv' -NoTypeInformation

     How do I use the data connection to:
    In the example above since first 5 fields in hdr\dtl match I then want fields 6 and 7 from hdr and fields 6,7,8 from the dtl.
    I want that put into a datatable to be able to load a mssql table. The hdr table with have 10,000 lines and dtl should
    have 40,000. If I don't find a match from Hdr to the Dtl write key from hdr to log file for review. I have seen example reading a single table, but not 2 inputs. The files are different formats so can't merge.

     Thanks.

    Sunday, June 18, 2017 12:32 AM
  • The best method is the following:

    #
    # Easily join multiope CSV tables - VERY FAST
    #    Requires ACE drivers which can be installed form # here: https://www.microsoft.com/en-us/download/details.aspx?id=13255
    #
    
    Function Get-CsvReader {
    	Param (
    		$csvPath=$pwd,
    		$CommandText
    	)
    	
    	$tmpl = 'Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties="Text;HDR=Yes;FORMAT=Delimited"'
    	$connStr = $tmpl -f $csvPath
    	
    	$csvConnection = New-Object System.Data.OleDb.OleDbConnection($connStr)
    	$csvConnection.Open()
    	
    	$cmd = $csvConnection.CreateCommand()
    	$cmd.CommandText = $CommandText
    	
    	$dt = New-Object System.Data.DataTable
    	$adapt = New-Object System.Data.OleDb.OleDbDataAdapter($cmd)
    	[void]$adapt.Fill($dt)
    	$dt
    }
    $sql = @'
    	SELECT 
    		F1.ID as ID,
    		F1.Name as Name, 
    		F1.Status as Status, 
    		F2.Status as Status2 
    	FROM 
    		[test1.csv] as F1,
    		[test2.csv] As F2 
    	WHERE 
    		F1.ID = F2.ID
    '@
    
    Get-CsvReader -csvPath c:\scripts -CommandText $sql
    


    \_(ツ)_/


    • Edited by jrv Sunday, June 18, 2017 12:36 AM
    • Proposed as answer by Hello_2018 Monday, June 19, 2017 8:20 AM
    • Marked as answer by hart60 Tuesday, June 20, 2017 5:40 PM
    Sunday, June 18, 2017 12:34 AM
  •  This works, but is there another option that doesn't require an install of other software?

     Thanks again.

    Monday, June 19, 2017 1:10 PM
  • You would have to enumerate one file and match it to the second file then output the required fields.

    \_(ツ)_/

    Monday, June 19, 2017 6:18 PM
  •  Do you have an example I could work from?

     Thanks.

    Monday, June 19, 2017 10:35 PM
  •  I meant an example without use ACE driver.

     Thanks.

    Tuesday, June 20, 2017 1:30 PM
  •  Thanks again for your suggestions and ACE example.

    Tuesday, June 20, 2017 5:40 PM
  • Here is a quick and simple way to inner join two tables (two CSV files) without needing any external drivers or code:

    $csv1 = Import-Csv testjoin1.csv
    $dt1 = ConvertTo-DataTable $csv1
    $dt1.PrimaryKey = $dt1.Columns[0]
    
    $csv2 = Import-Csv testjoin2.csv
    $dt2 = ConvertTo-DataTable $csv2
    $dt2.PrimaryKey = $dt2.Columns[0]
    
    $dt3 = [System.Data.DataTable]::new()
    $dt3.Merge($dt1)
    $dt3.Merge($dt2)
    $dt3
    
    The PrimaryKey can be multiple columns.


    \_(ツ)_/

    • Marked as answer by hart60 Thursday, June 22, 2017 8:04 PM
    Thursday, June 22, 2017 1:17 PM
  • Great Thanks!!!!!
    Thursday, June 22, 2017 8:04 PM