none
loading 250 million rows from a staging table to production table with 500 columns RRS feed

  • Question

  • Hi all,

    As the title says, we've created a view to convert the data type on the 500 columns and then use SSIS to fastload into the production table.  This table has a columnstore index on it.  The problem is it took WAY TOO long to load (because of the datatype conversion in the view).  Is there a better way to load faster even with datatype conversion?

    thanks in advance

    Sunday, September 8, 2019 12:33 AM

All replies

  • I strongly recommend Azure Data Factory.

    https://azure.microsoft.com/en-gb/services/data-factory/

    Best regards,
    Pete

     

     


    #PEJL
    Got any nice code? If you invest time in coding an elegant, novel or impressive answer on MSDN forums, why not copy it over to TechNet Wiki, for future generations to benefit from! You'll never get archived again, and you could win weekly awards!

    Have you got what it takes to become this month's TechNet Technical Guru? Join a long list of well known community big hitters, show your knowledge and prowess in your favoured technologies!

    Sunday, September 8, 2019 2:21 AM
    Moderator
  • I assume you create index AFTER you load the data?


    Sunday, September 8, 2019 3:28 AM
  • Hi 

    Although I am not clear about the level/intensity of transformation performed in your view - Can you have a try with the persisted view rather than a normal which will hopefully improve your performance from the SSIS stand point.

    Sample query to convert normal view to Materialized/Persisted View is as follows (Just create a Unique Clustered Index on top of view)

    CREATE VIEW Sales.vOrders
       WITH SCHEMABINDING
       AS  
          SELECT SUM(UnitPrice*OrderQty*(1.00-UnitPriceDiscount)) AS Revenue,
             OrderDate, ProductID, COUNT_BIG(*) AS COUNT
          FROM Sales.SalesOrderDetail AS od, Sales.SalesOrderHeader AS o
          WHERE od.SalesOrderID = o.SalesOrderID
          GROUP BY OrderDate, ProductID;
    GO
    --Create an index on the view.
    CREATE UNIQUE CLUSTERED INDEX IDX_V1
       ON Sales.vOrders (OrderDate, ProductID);
    GO

    Reference link with full details.Let me know how it goes.

    Thanks


    If this post helps to resolve your issue, please click the "Mark as Answer" of that post and/or click Answered "Vote as helpful" button of that post. By marking a post as Answered and/or Helpful, you help others find the answer faster.


    Sunday, September 8, 2019 5:50 AM
  • Hi ChanUot,

    There are several ways you could try to improve the performance.

    Please refer to Top 10 Methods to Improve ETL Performance Using SSIS.

    Hope it helps.

    Regards,

    Zoe


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com

    Monday, September 9, 2019 6:51 AM
  • Hi

    I am writing to follow up this thread with you. Have you solved your issue? If you have solved your issue, please  mark helpful replies as answers. By doing so, it will benefit all community members who are having this similar issue. 

    Thank you


    If this post helps to resolve your issue, please click the "Mark as Answer" of that post and/or click Answered "Vote as helpful" button of that post. By marking a post as Answered and/or Helpful, you help others find the answer faster.

    Thursday, September 12, 2019 4:41 AM