none
Row calculations ETL vs Computed Column vs Cube

    Question

  • Hi all,

    I'm after some general advice on performance. I am using SQL 2008R2.

    One of my fact tables has multiple columns that make up a measure e.g. basic charge, freight charge, fuel charge = total charge. Most end user analysis is performed on the total of these 'constituent measures'. I had previously left this to an MDX calculation, however, I am now moving these calculations to my relational database - (this is because I've recently added a "shell dimension" populated with calculated dimension members for time calculations - it appears I would need to scope calculated measures individually, so I figured I should create "real measures" where I can).

    The advice I'm after is where this should be done for the best performance:

    - ETL Source Query

    - Calculated Column on relational data warehouse

    - View on relational data warehouse (note I do not currently use views at all and probably should)

    - Other

    Thanks for reading - I appreciate your advice.

    Regards, Clay

    Friday, September 13, 2013 6:52 AM

Answers

  • computed persisted column will get ignored in some scenarios (http://connect.microsoft.com/SQLServer/feedback/details/725361/base-columns-used-instead-of-persisted-computed-column-by-engine) so i'd avoid those if you plan on indexing it

    I'd persist it as a 'normal' column in the fact table but that's more personal preference than anything else; it's available to all consumers @ the 'base' data level & removes ambiguity about the logic behind it.


    Adelaide, Australia

    Wednesday, September 18, 2013 12:41 AM

All replies

  • This question has been up a few a days - I guess it too open a question for these forums.

    Maybe all options are valid - I thought there may have been issues with

    1. persisted computed columns in data warehouses (e.g. can I bulk load, partition etc.)

    2. row level calculations in cube (better suited to relational db?)

    3. calculating in the source means a wider data flow = less rows etc.

    Sole BI developer here - appreciate any advice on best practices etc.

    Cheers.


    Tuesday, September 17, 2013 11:02 PM
  • computed persisted column will get ignored in some scenarios (http://connect.microsoft.com/SQLServer/feedback/details/725361/base-columns-used-instead-of-persisted-computed-column-by-engine) so i'd avoid those if you plan on indexing it

    I'd persist it as a 'normal' column in the fact table but that's more personal preference than anything else; it's available to all consumers @ the 'base' data level & removes ambiguity about the logic behind it.


    Adelaide, Australia

    Wednesday, September 18, 2013 12:41 AM