Wednesday, May 16, 2012 6:06
I'm planning an upgrade from SQL Server 2008 R2 to SQL Server 2012. Our data warehouse is about 4TB in size. Because of filegroup/files and fragmentation issues, I will have to create the empty tables on the 2012 instance, and then "move" the data from the 2008 R2 instance into 2012.
Most of the target tables on the 2012 instance will be partitioned and indexed.
I'd like some suggestions on the quickest way to move the data. Should I use bulk export/import? SSIS export/import packages? Linked servers? Also, should I disable all the indexes on the target tables first?
Any suggestions would be much appreciated. I have a 12-hour window to move that data.
Marius van Schalkwyk
All replies
Wednesday, May 16, 2012 14:20
IMHO, your bottleneck in this case could be network limitations rather than database limitations (assuming these databases are on two different servers). SSIS packages are probably a good way to go and should provide comparable performance to anything else. They would provide a good amount of flexibility as well, especially since you are using SQL Server on both sides of the source/target equation. I would not use bulk export/import or linked servers. I would definitely drop the indexes on your target tables before loading them.
It goes without saying: if you have a chance to do this in a dev/test environment beforehand, do so. Even if you only do a portion of some of your larger tables, it will give you a good idea of the speed you should expect. You may even be able to load historical data before your maintenance window begins; then, during the 12 hours, you would only have to load the more recent data. That may or may not be an option for you.
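Dropping the indexes before the load can be scripted rather than done by hand. A minimal sketch, with a hypothetical table and index name (adjust to your own schema):

```sql
-- Hypothetical names (dbo.FactSales, IX_FactSales_DateKey) for illustration.
-- Disable nonclustered indexes before the bulk load:
ALTER INDEX IX_FactSales_DateKey ON dbo.FactSales DISABLE;

-- ... run the load ...

-- REBUILD both re-enables and rebuilds a disabled index:
ALTER INDEX IX_FactSales_DateKey ON dbo.FactSales REBUILD;
```

A disabled nonclustered index keeps its metadata but drops its row data, so the rebuild is effectively a fresh build against the fully loaded table.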
Hope this helps.
Wednesday, May 16, 2012 14:42
I would recommend the following:
1) Use SSIS to transfer the data (OLE DB Source, SQL Server Destination).
2) Set the database to simple recovery and pre-size the data and log files.
3) Create index-less, but partitioned, tables (heaps).
4) Run the data flows in the package in parallel. It depends on the number of cores, but generally, number of cores / 2 = number of parallel threads.
5) Build indexes as the last step.
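Steps 2) and 3) can be sketched in T-SQL. This is a minimal illustration with assumed database and object names (DW2012, pfMonth, psMonth, dbo.FactSales) and example monthly boundaries — not your actual schema:

```sql
-- Assumed names for illustration only.
ALTER DATABASE DW2012 SET RECOVERY SIMPLE;

-- Partition on a date column, e.g. monthly boundaries:
CREATE PARTITION FUNCTION pfMonth (date)
    AS RANGE RIGHT FOR VALUES ('2012-01-01', '2012-02-01', '2012-03-01');

CREATE PARTITION SCHEME psMonth
    AS PARTITION pfMonth ALL TO ([PRIMARY]);

-- Index-less but partitioned target table (a heap on the partition scheme):
CREATE TABLE dbo.FactSales
(
    SaleDate    date  NOT NULL,
    CustomerKey int   NOT NULL,
    Amount      money NOT NULL
) ON psMonth (SaleDate);
```

Loading into a heap like this keeps the inserts minimally logged (with TABLOCK), and the indexes are then built once, partition-aligned, as the last step.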
Your deadline of 12 hours for the data movement is pretty challenging.
But there are good examples showing it is doable; see the whitepaper "We Loaded 1TB in 30 Minutes with SSIS, and So Can You".
It really depends on your hardware capacity.
So, I agree with Gene that it is better to set up partial loads and sync the remaining data as the final step of the migration.
- Marked as answer by MariusBIDW, Thursday, May 17, 2012 6:26
Thursday, May 17, 2012 0:46
You need to write ETL packages that use minimally logged insert operations (BCP, BULK INSERT, SELECT INTO) and run the tasks in parallel based on your hardware; for instance, if you have 16 logical cores, you can put 16 tasks in parallel. The following are good articles on minimally logged operations for you.
I recommend not creating the indexes on the tables during the load; add the indexes after the data load.
If you want to load the data with the indexes in place, then you need to use the ORDER hint with BULK INSERT. Please have a look at BULK INSERT in the following article:
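As a sketch (the file path and table name are hypothetical), a minimally logged BULK INSERT on a heap uses TABLOCK, and the ORDER hint applies when the clustered index is already in place:

```sql
-- Hypothetical path/table. TABLOCK allows minimal logging on a heap under
-- the simple or bulk-logged recovery model; ORDER should match the clustered
-- index key if you load with that index in place.
BULK INSERT dbo.FactSales
FROM '\\fileserver\export\FactSales.dat'
WITH (
    DATAFILETYPE = 'native',
    TABLOCK,
    ORDER (SaleDate ASC),
    BATCHSIZE = 1000000
);
```

If the incoming data really is sorted on the key, the ORDER hint lets the engine skip an extra sort during the load.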
Tuesday, May 29, 2012 15:48
You'll have to test it and see.
Almost certainly, removing the indexes for the transfer is the way to go, and re-adding them could be the most time-consuming part.
The other issue is the network. Copying via SSIS can be an issue; a simpler option is to bcp the data out to files, copy the files, then bcp them in. It is easy to generate bcp statements for every table, and it also makes testing easy: create the files once, then you can run multiple tests copying them and inserting into the destination. You can compress the files for the transfer if needed.
Test the various possibilities, but I suspect that for that amount of data the extract and insert won't be a problem; the transfer and indexing could be.
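A sketch of the bcp-out/copy/bcp-in approach; the server, database, and table names here are assumptions, and -n (native format) with -T (trusted connection) are the usual choices for SQL-Server-to-SQL-Server moves:

```shell
# Hypothetical server/table names. Export in native format from the source:
bcp DW2008.dbo.FactSales out FactSales.dat -n -S OldServer -T

# Copy (optionally compressed) FactSales.dat to the new server, then import,
# batching rows and taking a table lock for minimal logging on the heap:
bcp DW2012.dbo.FactSales in FactSales.dat -n -S NewServer -T -b 1000000 -h "TABLOCK"
```

Generating one such pair of commands per table from sys.tables makes it straightforward to script, parallelize, and re-run the whole transfer during testing.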