Read a Document Library and Update table and database
-
2012년 6월 14일 목요일 오후 1:55
Hi All,
I would like to create and update a SQL document Library using SSIS. I am not sure where and how to do this. I have been searching, researching, asking etc. etc.
I would appreciate any help!
This is what I would like to do:
- Read a Windows folder (Document Labrary) and update a table with the document name; date created; date modified; document type; document path
- Update a SQL database with the ACTUAL documents contained in the document library
Thanks in advance!
Corne le Roux
모든 응답
-
2012년 6월 14일 목요일 오후 3:03
Hi Corne ,
1. Define package variables - DocumentName, CreateDate,ModifyDate,DocumentType,DocumentPath.
2. Use a script component, then pass these variables into script component and set them can writable.
3. In script use C# code to get the file list in the windows folder, C# can get the properties of file, set values to package variables.
4. Compose a insert SQL statement, then the values in SQL statement are from these variables.
Actually, step 3 and step 4 should be included in the loop container, you should know how to read all files in specified folder.
Please vote if it's helpful and mark it as an answer!
-
2012년 6월 15일 금요일 오전 6:53
Hi Simon,
Do you perhaps have an idea where I could get hold of a step-by-step tutorial re the creation of a document library using SSIS?
I am a greenhorn at SSIS! Or, could you perhaps give me some guidelines as to what the items are called in the SSIS / BIDS toolbox that I should use?
Thanks for your help!
Corne le Roux
-
2012년 6월 15일 금요일 오전 7:35
Hi Corne,
First, you need to know how to use C# to get all files in specified folder, then loop files and get file properties of files. System.IO.File, you may need in C#.
Second, above codes will be used in Script Component.
This work is a little complicated, but it's not very hard to implement.
You will use : Script component, Foreach loop container, Excecute SQL Task, Package Variables, C# IO read..
Please refer to http://simonlv.blogspot.com/2012/05/ssis-step-by-step-002-output-data-to.html to know how to use foreach loop container.
Can you provide your table schema where the document information inside .
Please vote if it's helpful and mark it as an answer!
- 편집됨 Simon Lv 2012년 6월 15일 금요일 오전 8:30
-
2012년 6월 15일 금요일 오전 7:54
Hi Simon,
Thanks for the feedback. It would be great if you could assist me. I am going to check out your For Each Loop Container post now, and will then make contact with you.
Corne le Roux
-
2012년 6월 15일 금요일 오전 8:24
If you have time, you also can try this codes in script component:
1. Define a package variable - Directory , value is your folder address
2. Add a script component in control flow, pass package variable - Directory.
3. In scripts, import System.IO.File
string directory = Dts.Variables["User::Directory"].Value.ToString(); string[] fileNames = Directory.GetFiles(directory); MessageBox.Show("File - "+fileNames[0]); FileInfo file = new FileInfo(fileNames[0]); MessageBox.Show("File Created Time - "+file.CreationTime.ToString() + '\n'+ "File Last Write Time - "+file.LastWriteTime.ToString() + '\n'+ "File Name - "+file.Name + '\n'+ "File Type - "+file.Extension + '\n'+ "File Full Path - "+file.FullName);Just a test for the core C# codes, result shows please refer to this figure below -
Above steps don't meet your requirements, but you can follow this way.
1. First define a package variable - object type, just like what I did in my blog .
2. Add a script component, pass the package variable [Directory] into script, you only need to get the file names under your directory, and save file names into your package object-type variable.
string directory = Dts.Variables["User::Directory"].Value.ToString();
string[] fileNames = Directory.GetFiles(directory);
// Then save fileNames to Object Variable.3. Add a loop foreach container, loop this object item get the file name save as a package variable [FilePath]
Add a new script component in foreach loop container, pass the [FilePath] into script and other package file variables like name, createdDate, extension..
Then use the filePath to build a File object,
FileInfo file = new FileInfo(//[FilePath] Dts.Variable["User::CreationTime"].Value = file.CreationTime.ToString();
// do the same thing.
Each loop, it will create a File object in script, and get all detail information by this object, assign these values to your package variables
4. Within the loop container and after the script component, you can add a Execute SQL Task to insert these information to table.
Some file extension will be .sql .txt, if you want to save information like 'Microsoft SQL Server Query File', 'Text document File', you can build a mapping table to map these values. But I think such extension is enought to let users know what is the type of the file.
5. I don't know what's your logic to update existing document information. Does it mean if this document name exist, you will use the newest file information to update the old one, if this document name doesn't exist, then add a new record for this new document ?
If you need to do this, you can use a look up to search if the file name exist in table, if exist, then update with new informaiton, if doesn't exist , then insert new records. About how to use look up, just made a little reasearch by Google and MSDN tutorial, it's very easy to understand the logic.
Hope above can help you.
Please vote if it's helpful and mark it as an answer!
-
2012년 6월 15일 금요일 오전 8:30
Or you can use Merge into output in SQL to instead of look up, please refer to my example in - http://simonlv.blogspot.com/2012/06/merge-operation-and-output-clause.html
I guess above all is enough to solve your problems.
Please vote if it's helpful and mark it as an answer!
-
2012년 6월 15일 금요일 오후 1:55
Thanks for your help!
I have worked through both your tutorials on SSIS - great posts!
My reason for wanting to create this package is, to update a SQL Document Library.
This is what I am trying to do (in plain language):
- Read through this document directory - D:\SYNERGEE IMS (New)
- Update this table - [ChessDB_Aveng_Trident]. [dbo].[document_tbl] with the following information retrieved from step 1(column names): document_name; document_path; date_opened; document_type; date_last_modified; folder1; folder2; folder3 etc. This process should update documents where the Date Last Modified is newer, where new documents were added etc. Like a simple document management system.
- The ACTUAL documents contained in this directory - D:\SYNERGEE IMS (New) - must be copied to a database called: [ChessDB_DocLib].[dbo].[document_storage_tbl]. The information and columns for this table is: document_pk; document_date; document_name; document_type; document_desc; current_status. Once again, this process should update documents where the Date Last Modified is newer, where new documents were added etc.
Any ideas?
Corne le Roux
-
2012년 6월 15일 금요일 오후 2:29
Thanks for your help!
I have worked through both your tutorials on SSIS - great posts!
My reason for wanting to create this package is, to update a SQL Document Library.
This is what I am trying to do (in plain language):
- Read through this document directory - D:\SYNERGEE IMS (New)
- Update this table - [ChessDB_Aveng_Trident]. [dbo].[document_tbl] with the following information retrieved from step 1(column names): document_name; document_path; date_opened; document_type; date_last_modified; folder1; folder2; folder3 etc. This process should update documents where the Date Last Modified is newer, where new documents were added etc. Like a simple document management system.
- The ACTUAL documents contained in this directory - D:\SYNERGEE IMS (New) - must be copied to a database called: [ChessDB_DocLib].[dbo].[document_storage_tbl]. The information and columns for this table is: document_pk; document_date; document_name; document_type; document_desc; current_status. Once again, this process should update documents where the Date Last Modified is newer, where new documents were added etc.
Any ideas?
Corne le Roux
I guess you can try my answer above which has figures and codes, it covers your step 1 to step 3.
Please vote if it's helpful and mark it as an answer!
-
2012년 6월 16일 토요일 오전 11:25
1. First define a package variable - object type, just like what I did in my blog .
2. Add a script component, pass the package variable [Directory] into script, you only need to get the file names under your directory, and save file names into your package object-type variable.
string directory = Dts.Variables["User::Directory"].Value.ToString();
string[] fileNames = Directory.GetFiles(directory);
// Then save fileNames to Object Variable.3. Add a loop foreach container, loop this object item get the file name save as a package variable [FilePath]
Add a new script component in foreach loop container, pass the [FilePath] into script and other package file variables like name, createdDate, extension..
Then use the filePath to build a File object,
FileInfo file = new FileInfo(//[FilePath] Dts.Variable["User::CreationTime"].Value = file.CreationTime.ToString();
// do the same thing.
Each loop, it will create a File object in script, and get all detail information by this object, assign these values to your package variables
4. Within the loop container and after the script component, you can add a Execute SQL Task to insert these information to table.
I am still a bit in the dark.
I am unsure about viriables and how to link the variable in the different toolbox items and how to pull the information through the different toolbox items to eventually get it into the tables.
Is it possible for you to elaborate on the solution above? Perhaps clarify the steps. Remember, I am very new to SSIS! I really appreciate all your help.
Thanks
Corne le Roux
-
2012년 6월 16일 토요일 오전 11:37
Hi Corne,
Does my descriptions above meet your requirements ? If these descriptions are correct, I can show you how to design it later.
Please vote if it's helpful and mark it as an answer!
-
2012년 6월 17일 일요일 오전 7:33
Hi Corne,
Please refer to my new post here - http://simonlv.blogspot.com/2012/06/ssis-step-by-step-003-sync-document.html
Please vote if it's helpful and mark it as an answer!
-
2012년 6월 18일 월요일 오전 7:35
Hi Simon,
You are legendary! I just got to office now, so will work through this during the course of the day.
Thank you for going above and beyond!
Corne le Roux
-
2012년 6월 18일 월요일 오전 10:05
You're welcome, Corne. My example won't fully meet your requirements, but you can get some useful solution and skills in my post I guess. You have to do more research on it to solve the rest problem by yourself - I hope ~_~!
You can try to understand the components you used, find the relationship of components and know what're input and output columns of components, good luck to you!
Please vote if it's helpful and mark it as an answer!
- 답변으로 표시됨 onmyway365 2012년 6월 19일 화요일 오후 1:25
-
2012년 6월 19일 화요일 오후 1:27
Hi Simon,
I build your solution and it works great! I now only have to create / add a process to copy the physical documents to a database, and I am almost there!
Corne le Roux

