none
Implementing Best Practices is Increasing Execution Time

    Question

  • I have inherited some SSIS packages that are used to populate a data warhouse using SAP SQL Server DB as a source.  I am fairly new to SSIS so I've been doing a lot of research on best practices as part of my education.

    One of the data flow tasks pulls material related data and looked like a prime target for optimization as it was utilizing 6 separate OLE DB Source tasks each with their own SSIS Sort all merged within SSIS.  I was able to duplicate the logic with a single query and incorporate server-side sorting.  This should have, from all I've learned, increased performance and reduced the execution time.  Instead the execution time went from 1:02.200 to 2:35.580.

    Confused I tried another tact and just removed the Sort Tasks from the original package by moving the sort to the OLE DB Sources.  Execution time was better than the attempt at consolidation but still greater than the original at 1:32.259.

    Does anyone have any idea what's going on here and why it is behaving contrary to expectations?  The total output is only about 6500 rows.  If there is any additional info needed to make a prognosis let me know I just didn't want to bloat this post any more that it already is.

    Thanks

    Sunday, July 28, 2013 8:33 PM

Answers

  • A lot of joins, so if your tables are not properly indexed you could run into some issues. Statistics may be out of date as well. Not much we can do without a query plan. Look for table/index scans instead of index seeks and look out for bookmark lookups as well.

    MCSE SQL Server 2012 - Please mark posts as answered where appropriate.

    Monday, July 29, 2013 6:57 PM

All replies

  • Seems you have a very slow source. 6500 rows is not much and fits easily into memory, so sorting and joining inside the SSIS data flow will be very fast.

    The best practices you find in SSIS are meant to make your packages more scalable. Sorting and merging within the SSIS data flow is fast as long as everything fits nicely into memory. If you have to transfer millions of rows, this data flow will spill to disk and probably be slower than the single source query on the server.

    No idea why though the source is so slow. Bad indexing? Outdated statistics? You'd have to take a look at the execution plan for that query.


    MCSE SQL Server 2012 - Please mark posts as answered where appropriate.

    Sunday, July 28, 2013 8:50 PM
  • I agree with Koen. Sorting 6500 rows with the Sort Transformation is very fast because it easily fit's in memory.

    A good way to indicate whether performance tuning within SSIS is useful, is to get a baseline: Remove all transformations from your Data Flow Task, but leave the source(s). Now run the Data Flow Task and check how close it is to the original. If it is then you just have a slow source/network...

    Further more performance tuning a package that runs only 1 minute, 2 seconds and 200 milliseconds isn't very useful, unless you have a couple of hundreds of them.


    Please mark the post as answered if it answers your question | My SSIS Blog: http://microsoft-ssis.blogspot.com | Twitter

    Monday, July 29, 2013 5:39 AM
    Moderator
  • 6500 rows is taking 1 hr 02 mins or 1 Mins 02 secs?

    If this is taking 1 hr+ serious issue.


    Thanks and Regards, Prajesh Please use Marked as Answer if my post solved your problem and use Vote As Helpful if a post was useful.


    • Edited by Prajesh Monday, July 29, 2013 9:34 AM better
    Monday, July 29, 2013 9:34 AM
  • 6500 rows is taking 1 hr 02 mins or 1 Mins 02 secs?

    If this is taking 1 hr+ serious issue.


    Thanks and Regards, Prajesh Please use Marked as Answer if my post solved your problem and use Vote As Helpful if a post was useful.


    1hour for 6500 rows? What on earth is wrong with your source database?
    Try to do a baseline check as instructed by Joost. Just read the data and see how long it takes.

    edit: realized later on that this reply was not made by the OP. So ignore this :)


    MCSE SQL Server 2012 - Please mark posts as answered where appropriate.


    • Edited by Koen Verbeeck Monday, July 29, 2013 6:54 PM screw the guy that made this mandotory
    Monday, July 29, 2013 9:43 AM
  • I agree that optimizing a task that takes one minute as part of a overall process that is taking 12 hours and consists of 50 separate dataflow operations is not overly significant.  In fact, the portion I asked about is only about a fourth of one of those sub-packages that I have separated out for analysis.  I just found it odd that I could execute a "query" in SSIS with all of its separate queries and sorts faster than I could execute it in SSMS.  Below is the equivalent query:

    SELECT M.MATNR AS materialNumber,
    MGroup1.VTWEG AS DistributionChannel,
    MGroup1.VKORG AS Organization,
    MGroup1.DWERK AS Plant,
    ISNULL(MGroup1.MVGR1, MGroup2.MVGR1) AS GroupOneNumber,
    MGroup1.VMSTA AS DChainStatus,
    MDesc.MAKTX AS [Description],
    '0' + M.EAN11 AS UPC,
    M.ERSDA AS AddedDate,
    M.SPART AS DivisionNumber,
    Division.VTEXT AS DivisionDesc,
    M.MATKL AS GroupNumber,
    MValue.LBKUM AS BOH,
    MValue.SALK3 AS BOHDollars,
    T023T.WGBEZ as GroupDesc,
    (MValue.VERPR / MValue.PEINH) AS LandedEachCost,
    M.MTART AS MaterialType,
    M.BRGEW AS EaGrossWeight,
    M.NTGEW AS EaNetWeight,
    M.GEWEI AS EaWeightUnit, 
    M.VOLUM AS EaVolume,
    M.VOLEH AS EaVolumeUnit,
    M.LAENG AS EaLength,
    M.BREIT AS EaWidth,
    M.HOEHE AS EaHeight,
    M.MEABM AS EaUnitOfDimension,
    M.MSTAV AS CrossDistChainMatStatus,
    M.MTPOS_MARA AS GeneralItemCategoryGroup,
    MMeasure_IP.UMREZ AS IP,
    MMeasure_IP.LAENG AS [Length],
    MMeasure_IP.BREIT AS Width,
    MMeasure_IP.HOEHE AS Height, 
        MMeasure_IP.MEABM AS UnitOfDimension,
        MMeasure_IP.VOLUM AS Volume,
        MMeasure_IP.VOLEH AS VolumeUnit,
        MMeasure_IP.BRGEW AS GrossWeight,
        MMeasure_IP.GEWEI AS WeightUnit,
        MMeasure_MC.UMREZ AS MC,
        MMeasure_MC.LAENG AS MCLength,
        MMeasure_MC.BREIT AS MCWidth,
        MMeasure_MC.HOEHE AS MCHeight, 
    MMeasure_MC.MEABM AS MCUnitOfDimension,
    MMeasure_MC.VOLUM AS MCVolume,
    MMeasure_MC.VOLEH AS MCVolumeUnit,
    MMeasure_MC.BRGEW AS MCGrossWeight,
    MMeasure_MC.GEWEI AS MCWeightUnit
    FROM ecp.MARA AS M
    INNER JOIN ecp.MAKT AS MDesc
    ON M.MATNR = MDesc.MATNR
    INNER JOIN ecp.T023T
    ON M.MATKL = T023T.MATKL
    LEFT JOIN ecp.MBEW MValue
    ON M.MATNR = MValue.MATNR
    AND MValue.BWKEY = 'BP01'
    LEFT OUTER JOIN ecp.MARM AS MMeasure_IP
    ON M.MATNR = MMeasure_IP.MATNR
    AND MMeasure_IP.MEINH = 'IP'
    LEFT OUTER JOIN ecp.MARM AS MMeasure_MC
    ON M.MATNR = MMeasure_MC.MATNR
    AND MMeasure_MC.MEINH = 'MC'
    LEFT OUTER JOIN ecp.MVKE AS MGroup1
    ON M.MATNR = MGroup1.MATNR
    AND MGroup1.VTWEG = '01'
    LEFT OUTER JOIN ecp.MVKE AS MGroup2
    ON M.MATNR = MGroup2.MATNR
    AND MGroup2.VTWEG = '02'
    LEFT OUTER JOIN ecp.TSPAT AS Division
    ON M.SPART = Division.SPART
    AND Division.MANDT = 400
    AND Division.SPRAS = 'E'

    Monday, July 29, 2013 4:48 PM
  • Prajesh,

    I think you missed the units in Time the requestor put up: 1:02.200 equals to mins:secs.millisecs


    Thanks, hsbal

    Monday, July 29, 2013 5:12 PM
  • Please share/check the exec plan of this query in SSMS. It will provide useful info.

    Thanks, hsbal

    Monday, July 29, 2013 5:15 PM
  • A lot of joins, so if your tables are not properly indexed you could run into some issues. Statistics may be out of date as well. Not much we can do without a query plan. Look for table/index scans instead of index seeks and look out for bookmark lookups as well.

    MCSE SQL Server 2012 - Please mark posts as answered where appropriate.

    Monday, July 29, 2013 6:57 PM