locked
Importing data from word document(Table format) RRS feed

  • Question

  • Hello,

    Could any one let me know how to import data from word document into sql server table in SSIS.

    Any help would be greatly appreciated.

    Regards,

    Vinay s

    Tuesday, August 4, 2015 11:40 AM

Answers

  • Then it can't be done.

    A table within a word document is an embedded object inside another object(s) (the pages) so therefore it is not in a recognised structure that SQL Server can cope with it so therefore it is not possible.

    There are two workarounds that I can think of: -

    Tackle this at source and alter the processes so that data is surrendered in a format where you can set up an automated process to carry out the data imports.  Excel, xml and csv file formats are three examples.

    OR

    Use these links to try and engineer a process to import your data into MS Access and then from Access into SQL Server.

    http://stackoverflow.com/questions/3970488/import-ms-word-form-fields-into-ms-access

    http://www.access-programmers.co.uk/forums/showthread.php?t=199989


    Please click "Mark As Answer" if my post helped. Tony C.




    Tuesday, August 4, 2015 3:00 PM
  • Hummmmm, I've never heard of someone going from MS Word stright to SQL Server via SSIS.  How about going form Word tables to Excel, and then from Excel to SQL Server?  There are LOTS of ways to do that.  Please try this concept, to begin with . . . .

    I've got about 350 tables in Word. Each has 2 columns and 22 rows. I need 
    to import them into excel with 22 columns and 350 rows. The left column in 
    each Word table contains what will become the column title in excel. The 
    right column in each Word table contains what will become the data for each 
    record. Any help would be appreciated. Thanks
    
    
    
    Sub GetTables() 
    
    FName = Application _ 
    .GetOpenFilename("Word Files (*.doc), *.doc") 
    
    Set WordObject = GetObject(FName) 
    
    First = True 
    RowCount = 2 
    For Each Tble In WordObject.tables 
    For i = 1 To 22 
    If First = True Then 
    Data = Tble.Rows(i).Cells(1).Range 
    'Remove cell markers 
    Cells(1, i) = Left(Data, Len(Data) - 2) 
    End If 
    Data = Tble.Rows(i).Cells(2).Range.Text 
    'Remove cell markers 
    Cells(RowCount, i) = Left(Data, Len(Data) - 2) 
    Next i 
    RowCount = RowCount + 1 
    First = False 
    Next Tble 
    WordObject.Close savechanges = False 
    End Sub
    


    Knowledge is the only thing that I can give you, and still retain, and we are both better off for it.

    Saturday, August 8, 2015 2:29 AM
  • Now, push the data form Excel to SQL Server . . .

    Sub InsertInto() 'Declare some variables Dim cnn As adodb.Connection Dim cmd As adodb.Command Dim strSQL As String 'Create a new Connection object Set cnn = New adodb.Connection 'Set the connection string cnn.ConnectionString = "Provider=SQLOLEDB.1;Integrated Security=SSPI;Persist Security Info=True;Initial Catalog=Northwind;Data Source=Excel-PC\SQLEXPRESS" 'cnn.ConnectionString = "DRIVER=SQL Server;SERVER=Excel-PC\SQLEXPRESS;DATABASE=Northwind;Trusted_Connection=Yes" 'Create a new Command object Set cmd = New adodb.Command 'Open the Connection to the database cnn.Open 'Associate the command with the connection cmd.ActiveConnection = cnn 'Tell the Command we are giving it a bit of SQL to run, not a stored procedure cmd.CommandType = adCmdText 'Create the SQL strSQL = "UPDATE TBL SET JOIN_DT = '2013-01-22' WHERE EMPID = 2" 'Pass the SQL to the Command object cmd.CommandText = strSQL 'Execute the bit of SQL to update the database cmd.Execute 'Close the connection again cnn.Close 'Remove the objects Set cmd = Nothing Set cnn = Nothing

    End Sub



    Knowledge is the only thing that I can give you, and still retain, and we are both better off for it.

    Saturday, August 8, 2015 2:32 AM
  • Also . . . .

    Sub Rectangle1_Click()
    'TRUSTED CONNECTION
        On Error GoTo errH
       
        Dim con As New ADODB.Connection
        Dim rs As New ADODB.Recordset
        Dim strPath As String
        Dim intImportRow As Integer
        Dim strFirstName, strLastName As String
       
        Dim server, username, password, table, database As String
       
       
        With Sheets("Sheet1")
               
                server = .TextBox1.Text
                table = .TextBox4.Text
                database = .TextBox5.Text
               
               
                If con.State <> 1 Then
           
                    con.Open "Provider=SQLOLEDB;Data Source=" & server & ";Initial Catalog=" & database & ";Integrated Security=SSPI;"
                    'con.Open
           
                End If
                'this is the TRUSTED connection string
               
                Set rs.ActiveConnection = con
               
                'delete all records first if checkbox checked
                If .CheckBox1 Then
                    con.Execute "delete from tbl_demo"
                End If
           
                'set first row with records to import
                'you could also just loop thru a range if you want.
                intImportRow = 10
               
                Do Until .Cells(intImportRow, 1) = ""
                    strFirstName = .Cells(intImportRow, 1)
                    strLastName = .Cells(intImportRow, 2)
                   
                    'insert row into database
                    con.Execute "insert into tbl_demo (firstname, lastname) values ('" & strFirstName & "', '" & strLastName & "')"
                   
                    intImportRow = intImportRow + 1
                Loop
               
                MsgBox "Done importing", vbInformation
               
                con.Close
                Set con = Nothing
       
        End With
       
    Exit Sub
    
    errH:
        MsgBox Err.Description
    End Sub
    

    Finally . . . please see this link . . . .

    http://www.excel-sql-server.com/excel-sql-server-import-export-using-vba.htm#Introduction


    Knowledge is the only thing that I can give you, and still retain, and we are both better off for it.

    Saturday, August 8, 2015 2:34 AM

All replies

  • Hi vinay,

    If I understand correctly you want to import the data in the word document to Sql Server table. There is no direct way I can think of to achieve this. But you can try this.

    1. Export the data into Microsoft Excel or Microsoft Access.

    2. Create a database in Microsoft Sql Server 2005 using Microsoft SQL Server Management studio.

    3. Right click on the database you created from the Management studio object explorer.

    4. Select Tasks-> Import Data.

    5. It will launch Import/ Export wizard. There you can select your data source to the Microsoft Excel document or Microsoft Access database.

    From there you can import the data from the word document to the SQL Server database.


    Please Dont forget to mark as answer. It helps others to find relevant posts to the same question. Milan Das

    Tuesday, August 4, 2015 11:48 AM
  • See if this helps... https://technet.microsoft.com/en-us/library/ms141235(v=sql.105).aspx

    But As far as I know Word documents are not structured files. The normal way to extract unstructured data using SSIS is by using code. You can download Primary Interop Assemblies for Office that enable you to interact with Word documents using dotnet code - hence you can use them in a script task/component.


    Hope this helps ! ------------------------------------------------------Please Mark This As Answer if it solved your issue. Please Vote This As Helpful if it helps to solve your issue

    Tuesday, August 4, 2015 11:51 AM
  • Hello Milan\Jinu

    Thanks for your suggestion.

    But the steps on what you have mentioned has to be done manually.I want this process to happen dynamically.

    As I'm new in .net coding, could you please help me out in coding part in script task?

    Regards,
    Vinay s
    Tuesday, August 4, 2015 12:02 PM
  • Data cannot be imported from a MS word document as a word document is split across pages.

    Copy the data into an Excel spreadsheet and import the Spreadsheet.


    Please click "Mark As Answer" if my post helped. Tony C.

    Tuesday, August 4, 2015 12:03 PM
  • Hi Vinay,

    Can you check the following article

    http://www.codeproject.com/Articles/635206/Extracting-Table-Data-from-Word-Document-using-Asp


    Please Dont forget to mark as answer. It helps others to find relevant posts to the same question. Milan Das

    Tuesday, August 4, 2015 12:14 PM
  • Hi Milan,

    I didn't find useful information on that link which you have provided.

    Please let me know the alternate steps or method.

    Regards,

    Vinay s

    Tuesday, August 4, 2015 1:09 PM
  • Hi Milan,

    I didn't find useful information on that link which you have provided.

    Please let me know the alternate steps or method.

    Regards,

    Vinay s

    Hello Vinay

    Please refer to my response to this post earlier; you need to copy the data into an Excel spreadsheet and then import the Spreadsheet data into the Database.


    Please click "Mark As Answer" if my post helped. Tony C.

    Tuesday, August 4, 2015 1:13 PM
  • Hello Milan,

    This has to be done dynamic, I cannot copy the data into an excel spreadsheet manually.

    I'm receiving the docx files directly from our customers daily around 10,00 files per day.

    So this has to be made dynamic to read the docx files and store in database.

    Regards,
    Vinay s
    Tuesday, August 4, 2015 1:19 PM
  • Hello Tony,

    This has to be done dynamic, I cannot copy the data into an excel spreadsheet manually.

    I'm receiving the docx files directly from our customers daily around 10,00 files per day.

    So this has to be made dynamic to read the docx files and store in database.

    Regards,
    Vinay s
    Tuesday, August 4, 2015 1:19 PM
  • Then it can't be done.

    A table within a word document is an embedded object inside another object(s) (the pages) so therefore it is not in a recognised structure that SQL Server can cope with it so therefore it is not possible.

    There are two workarounds that I can think of: -

    Tackle this at source and alter the processes so that data is surrendered in a format where you can set up an automated process to carry out the data imports.  Excel, xml and csv file formats are three examples.

    OR

    Use these links to try and engineer a process to import your data into MS Access and then from Access into SQL Server.

    http://stackoverflow.com/questions/3970488/import-ms-word-form-fields-into-ms-access

    http://www.access-programmers.co.uk/forums/showthread.php?t=199989


    Please click "Mark As Answer" if my post helped. Tony C.




    Tuesday, August 4, 2015 3:00 PM
  • Hummmmm, I've never heard of someone going from MS Word stright to SQL Server via SSIS.  How about going form Word tables to Excel, and then from Excel to SQL Server?  There are LOTS of ways to do that.  Please try this concept, to begin with . . . .

    I've got about 350 tables in Word. Each has 2 columns and 22 rows. I need 
    to import them into excel with 22 columns and 350 rows. The left column in 
    each Word table contains what will become the column title in excel. The 
    right column in each Word table contains what will become the data for each 
    record. Any help would be appreciated. Thanks
    
    
    
    Sub GetTables() 
    
    FName = Application _ 
    .GetOpenFilename("Word Files (*.doc), *.doc") 
    
    Set WordObject = GetObject(FName) 
    
    First = True 
    RowCount = 2 
    For Each Tble In WordObject.tables 
    For i = 1 To 22 
    If First = True Then 
    Data = Tble.Rows(i).Cells(1).Range 
    'Remove cell markers 
    Cells(1, i) = Left(Data, Len(Data) - 2) 
    End If 
    Data = Tble.Rows(i).Cells(2).Range.Text 
    'Remove cell markers 
    Cells(RowCount, i) = Left(Data, Len(Data) - 2) 
    Next i 
    RowCount = RowCount + 1 
    First = False 
    Next Tble 
    WordObject.Close savechanges = False 
    End Sub
    


    Knowledge is the only thing that I can give you, and still retain, and we are both better off for it.

    Saturday, August 8, 2015 2:29 AM
  • Sub GetWordDocContents()
      Dim oWord As Object
      Dim vFiles
      Dim iFile As Integer
      Dim r As Range
      Dim LastRow As Long
       
      vFiles = Application.GetOpenFilename("Word files (*.doc*),*.doc*", Title:="Please select the files you want to copy from", MultiSelect:=True)
      If TypeName(vFiles) = "Boolean" Then Exit Sub ' Cancelled
      Set oWord = CreateObject("Word.Application")
      oWord.Visible = True
      Set r = ActiveSheet.Range("A1")
      For iFile = LBound(vFiles) To UBound(vFiles)
        oWord.Documents.Open vFiles(iFile)
        oWord.ActiveDocument.Select
        oWord.Selection.Copy
        ActiveSheet.Paste r
       
        r.Offset(0, 6).Value = oWord.ActiveDocument.Name
        
        Set r = Cells(ActiveSheet.UsedRange.Rows.Count + 1, 1)
        oWord.ActiveDocument.Close False
    
      Next
      oWord.Quit
      Set oWord = Nothing
      ActiveSheet.Columns.AutoFit
    
        ' Delete row if blank
        Columns("A:A").Select
        For LastRow = Cells(Rows.Count, 1).End(xlUp).Row To 1 Step -1
           If Cells(LastRow, 1) = "" Then Rows(LastRow).Delete
        Next LastRow
        
    End Sub
    


    Knowledge is the only thing that I can give you, and still retain, and we are both better off for it.

    Saturday, August 8, 2015 2:30 AM
  • Sub importFromWord()
    
    'Activate Word Object Library
    
    'Dim WordApp As Word.Application    ///  error arises here
    Dim WordDoc As Word.Document
    
    Set WordApp = CreateObject("word.application") ' Open Word session
    
    WordApp.Visible = False 'keep word invisible
    Set WordDoc = WordApp.Documents.Open("C:\Documents and Settings\Ryan Shuell\Desktop\Briefcase\Modeling\Excel Files\Word & Excel\Word.doc") ' open Word file
    
    'copy third row of first Word table
    WordDoc.Tables(1).Rows(3).Range.Copy
    
    'paste in Excel
    Range("A1").PasteSpecial xlPasteValues
    
    WordDoc.Close 'close Word doc
    WordApp.Quit ' close Word
    
    End Sub
    
    

    Send Word Data to Excel:
    
    
    
    Sub CreateNewExcelWB()
    ' to test this code, paste it into a Word module
    ' add a reference to the Excel-library
    ' create a new folder named C:\Foldername or edit the filnames in the code
    Dim xlApp As Excel.Application
    Dim xlWB As Excel.Workbook
    Dim i As Integer
        Set xlApp = CreateObject("Excel.Application")
        xlApp.Visible = True
        Set xlWB = xlApp.Workbooks.Add ' create a new workbook
        ' or
        'Set xlWB = xlApp.Workbooks.Open("C:\Foldername\Filename.xls")
        ' open an existing workbook
        ' example excel operations
                'Selection.WholeStory
                'Selection.Copy
        
        With xlWB.Worksheets(1)
                'Selection.Paste
    
            For i = 1 To 100
                .Cells(i, 1).Formula = "Here is a example test line #" & i
            Next i
            .SaveAs ("C:\TEMP\MyNewExcelWB.xls")
        End With
        xlWB.Close False ' close the workbook without saving
        xlApp.Quit ' close the Excel application
        Set xlWB = Nothing
        Set xlApp = Nothing
    End Sub
    



    Knowledge is the only thing that I can give you, and still retain, and we are both better off for it.

    Saturday, August 8, 2015 2:31 AM
  • Now, push the data form Excel to SQL Server . . .

    Sub InsertInto() 'Declare some variables Dim cnn As adodb.Connection Dim cmd As adodb.Command Dim strSQL As String 'Create a new Connection object Set cnn = New adodb.Connection 'Set the connection string cnn.ConnectionString = "Provider=SQLOLEDB.1;Integrated Security=SSPI;Persist Security Info=True;Initial Catalog=Northwind;Data Source=Excel-PC\SQLEXPRESS" 'cnn.ConnectionString = "DRIVER=SQL Server;SERVER=Excel-PC\SQLEXPRESS;DATABASE=Northwind;Trusted_Connection=Yes" 'Create a new Command object Set cmd = New adodb.Command 'Open the Connection to the database cnn.Open 'Associate the command with the connection cmd.ActiveConnection = cnn 'Tell the Command we are giving it a bit of SQL to run, not a stored procedure cmd.CommandType = adCmdText 'Create the SQL strSQL = "UPDATE TBL SET JOIN_DT = '2013-01-22' WHERE EMPID = 2" 'Pass the SQL to the Command object cmd.CommandText = strSQL 'Execute the bit of SQL to update the database cmd.Execute 'Close the connection again cnn.Close 'Remove the objects Set cmd = Nothing Set cnn = Nothing

    End Sub



    Knowledge is the only thing that I can give you, and still retain, and we are both better off for it.

    Saturday, August 8, 2015 2:32 AM
  • Also . . . .

    Sub Rectangle1_Click()
    'TRUSTED CONNECTION
        On Error GoTo errH
       
        Dim con As New ADODB.Connection
        Dim rs As New ADODB.Recordset
        Dim strPath As String
        Dim intImportRow As Integer
        Dim strFirstName, strLastName As String
       
        Dim server, username, password, table, database As String
       
       
        With Sheets("Sheet1")
               
                server = .TextBox1.Text
                table = .TextBox4.Text
                database = .TextBox5.Text
               
               
                If con.State <> 1 Then
           
                    con.Open "Provider=SQLOLEDB;Data Source=" & server & ";Initial Catalog=" & database & ";Integrated Security=SSPI;"
                    'con.Open
           
                End If
                'this is the TRUSTED connection string
               
                Set rs.ActiveConnection = con
               
                'delete all records first if checkbox checked
                If .CheckBox1 Then
                    con.Execute "delete from tbl_demo"
                End If
           
                'set first row with records to import
                'you could also just loop thru a range if you want.
                intImportRow = 10
               
                Do Until .Cells(intImportRow, 1) = ""
                    strFirstName = .Cells(intImportRow, 1)
                    strLastName = .Cells(intImportRow, 2)
                   
                    'insert row into database
                    con.Execute "insert into tbl_demo (firstname, lastname) values ('" & strFirstName & "', '" & strLastName & "')"
                   
                    intImportRow = intImportRow + 1
                Loop
               
                MsgBox "Done importing", vbInformation
               
                con.Close
                Set con = Nothing
       
        End With
       
    Exit Sub
    
    errH:
        MsgBox Err.Description
    End Sub
    

    Finally . . . please see this link . . . .

    http://www.excel-sql-server.com/excel-sql-server-import-export-using-vba.htm#Introduction


    Knowledge is the only thing that I can give you, and still retain, and we are both better off for it.

    Saturday, August 8, 2015 2:34 AM