Splitting data out into 100s of worksheets RRS feed

  • Question

  • Hi there,

    I have been asked by someone in my organisation to try and improve an excel file they use for managing account transactions.

    They basic format is this:

    1. We generate a tab delimited TXT file from our banking portal which lists all of the transactions made for the month. 
    2. This is then pasted into a worksheet called "All Transactions". 
    3. We have approx. 160 other worksheets (1 per account) which uses VLOOKUPs to get transactions only related to that account.
    4. The starting balance of the account is entered on each worksheet and then each transaction is added/subtracted from this balance, giving the new balance.

    The performance is very poor on this and it can cause the users machine to lock up. Calculations take up to 5 minutes when new data is pasted in. I have tried replacing VLOOKUPS with INDEX and MATCH but have noticed no improvement.

    Can excel handle this in a better way? I feel like pivot tables could be useful here but I'm not sure how you could handle subtraction and addition to the account balance using them.

    Any advice would be greatly appreciated.

    Tuesday, May 29, 2018 9:01 AM

All replies

  • Is the starting balance taken from the txt file, or somewhere else?

    I'm pretty sure you should be able to automate it with PowerQuery, maybe adding PowerPivot for the balance calculation

    EDIT - spelling

    Tuesday, May 29, 2018 4:38 PM
  • I would see if there is a way to replace the number of lookups performed. If you can run a Match once (or not many times) per account and use the result to populate the rest of that worksheet, that will be better than running a vlookup for each value on that sheet.

    This is probably out of your control, but I would suggest that you don't use Excel. Excel is really not designed to handle this quantity of data and is bad at lookups. In addition, it does a poor job of ensuring that data is correct and does not get accidently edited. A database program would be better.


    Ethan Strauss

    Tuesday, May 29, 2018 7:22 PM
  • based on step 2 it seems that you have less than 1mln rows, PowerQuery is able to handle that easily
    can you paste the sample data?
    alternatively - have you tried calculating the balance in "All Transactions" directly?
    Tuesday, May 29, 2018 11:49 PM
  • First, rather than 160 other worksheets I would suggest making just one "Account" worksheet, and use a DV list (of all accounts) so that you can select an account to see the relevant data by referencing that DV cell in your formulas. In addition, if you need to print those individual account sheets, it is trivial to write a few lines of VBA code to simply loop through every option on that DV list, print the sheet, move to the next one, print again, and so on. We've now reduced your number of calculations by over 100x because you don't need individual account sheets anymore

    Next, I would sort the raw data by account (I'm assuming that some accounts will have multiple transactions) You can creatively use the different options on MATCH (-1,0, or 1) to identify the first row of that account and the first row of the next account (subtract one to get the relevant range of rows for the account you actually want)- then any additional calculations you are doing for that account will be limited to a fraction of the overall rows in the raw data

    Lastly, sometimes Excel thinks that the last used row is different than what you would expect. When you look at the "All Transactions" sheet, use the right scroll bar to identify where Excel thinks the data stops by clicking and dragging the scrollbar (not the arrow at the bottom of the scrollbar). If it stops at the bottom of your data, great. If not- if it scrolls to the bottom of the whole worksheet- then select all those blank rows by selecting the first blank row below your data, and press CTRL-SHIFT-DownArrow, then right-click on any row number and select 'Delete'. Then re-save the file (Excel won't reset the last used row counter until you save). If your vlookups or other formulas were selecting entire columns (C:F) instead of just a range (C2:F10000) then this may speed up calculations (and as a bonus, may shrink the saved workbook size as well)


    Thursday, May 31, 2018 11:12 AM