none
PowerPivot is very slow importing large datasets from OData feed. Are there other types of data feeds which will import large datasets more quickly?

    General discussion

  • I am evaluating the performance of PowerPivot consuming OData feeds with large datasets as we are looking at providing OData feeds to our business analysts so they can work with large amounts of data in PowerPivot.

    I was shocked at how long it was taking to load 600,000 records and gave up at 100,000 loaded.  This was with an OData feed service I wrote going against one of our SQL databases.  I added paging of 25 records per page.

    I tried a public OData feed.  It took 1.5 hours to load 230,000 rows from the Badges table in this feed and it was still still running when I stopped the import.  Based on Fiddler it appeared to load about 50 records per page.  Also Fiddler showed about 4 request errors between every good request.

    http://data.stackexchange.com/superuser/atom

    My laptop specs
    Windows 7 Enterprise Service Pack 1 64 bit
    4 GB RAM
    Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz [4 processors]

    Excel and PowerPivot specs
    MS Excel Version 14.0.6106.5005 (32-bit)
    MS SQL Server 2012 PowerPivot for Excel 32-bit Version 11.1.3129.0

    Is the performance I saw normal?  What steps can I take to make this a better experience for our business analysts?  I'm fairly certain we are not going to allow connecting directly to the database or a flat file for security reasons.  Is there a faster web service I can write that supports the Atom standard for PowerPivot?  As you can see the public data feed's performance above was poor in regards to large datasets.

    Thursday, June 6, 2013 8:38 PM

All replies

  • I have done some additional testing against a DB table that has 1.5 million records.  If I connect via DB connections string, POWERPIVOT imports all data from table in less than 1 minute.  The same holds true if I pull the 1.5 million records from a quoted CSV file.  In general is there any kind of data feed I can write that will pull 1.5 million records into POWERPIVOT in minutes rather than hours?  In testing it took over 30 minutes just to bring in first 10,000 records (still have over 1 million to load).  The data feed will need to support the Atom standard based on what I've read on POWERPIVOT.  So far I've worked with OData which I think is powerful, but not suited for POWERPIVOT with large datasets.  I have tried both with and without paging in code.  We need to use a service mechanism to provide data to our POWERPIVOT users for security reasons.  The load times I'm seeing now are going to be unacceptable to a POWERPIVOT user.
    Friday, June 14, 2013 9:18 PM
  • At this point I'm guessing there is no other kind of data feed that is going to perform any faster with large data sets.  I think once we have the caching and compression working in an optimal fashion that 1 million records is still going to be too much to import into POWERPIVOT in a reasonable time after a user has seen how quickly it will import from flat file or direct connection to the DB.  My guess is that we should probably set the user's expectations to more like 10,000 to 50,000 records as a maximum based on how fast we can get the OData service working with caching and compression.

    Is my assumption correct that we're going to have to get the user to filter the dataset via OData filter syntax prior to importing into POWERPIVOT or possibly use Excel's DATA EXPLORER to filter the dataset prior importing into Excel?  If we use DATA EXPLORER, we'll have to warn the user not to use things like Remove Duplicates since OData does not support distinct.

    Thursday, June 20, 2013 6:21 PM
  • Is this still an issue?

    Thanks!


    Ed Price, SQL Server Customer Program Manager (Blog, Small Basic, Wiki Ninjas, Wiki)

    Answer an interesting question? Create a wiki article about it!

    Thursday, August 22, 2013 6:42 PM
    Owner