none
data warehouse design and how to make relation between specific tables

    Question

  • Dear all

    I am working on my data warehouse with start schema structure it is retial sales

    I have my Fact table is “Fact Sales”

    I have a target Header and Target details in my OLTP database.

    I want to add the targets tables to my warehouse database

    So my idea of thinking is the following:

    ·         I will create a new table “Dim Targets” and I will relate this table to “Fact Sales”

    ·         I will create a new table “DimTargetDetails” and I will relate this table to “Dim Targets”

    As far as my understanding that this structure is called snowflake design?

    Is there any other way that I can avoid the snowflake design?

    Is this a good design or bad one?

    Thanks,


    Ramzy N.Ebeid
    Thursday, September 15, 2011 12:33 PM

Answers

  • OK, It's good to have some feedback.
    If you have found any of my posts helpful then please vote them as helpful. Kieran Patrick Wood MCTS BI, PGD SoftDev (Open), MBCS, MCC http://uk.linkedin.com/in/kieranpatrickwood
    • Marked as answer by RamzyEbeid Monday, September 26, 2011 2:27 PM
    Monday, September 19, 2011 3:05 PM

All replies

  • Hi,

    A star schema has typically one fact table (e.g. Fact Sales and one or more dimensions where all these dimensions (e.g. Dim Targets and Dim Target Details) are directly connected to the fact table.

    A common reason why the snow flake design exists is that the process of de-normalisation has not been completed. For example in a Sales Star Schema you have have a Fact Sales table, a Sales Order Header Dimension table and a Sales Order detail Dimension table. To avoid having the Sales Order detail table connecting to the Sales Fact Sales table via the Sales Order Header these two dimensions can be de-normalised into a single dimension.

    Dimensions are connected to fact tables by surrogate keys.

    Star schemas are good practice however due to deadlines, etc. I often find that I end up working with snow flake schemas more than star schemas.

    As always with these sorts of questions I recommend; -

    The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling ; - http://www.amazon.com/Data-Warehouse-Toolkit-Complete-Dimensional/dp/0471200247/ref=sr_1_1?s=books&ie=UTF8&qid=1316092346&sr=1-1

     

    MCTS Self-Paced Training Kit (Exam 70-448): Microsoft® SQL Server® 2008 Business Intelligence Development and Maintenance (Self-Paced Training Kits) ; - http://www.amazon.com/MCTS-Self-Paced-Training-Exam-70-448/dp/0735626367/ref=sr_1_1?s=books&ie=UTF8&qid=1316092399&sr=1-1

    Further links; - http://en.wikipedia.org/wiki/Star_schema

    http://en.wikipedia.org/wiki/Surrogate_key

    http://en.wikipedia.org/wiki/Snowflake

    http://en.wikipedia.org/wiki/Denormalization

     


    If you have found any of my posts helpful then please vote them as helpful. Kieran Patrick Wood MCTS BI, PGD SoftDev (Open), MBCS, MCC http://uk.linkedin.com/in/kieranpatrickwood



    Thursday, September 15, 2011 1:03 PM
  • Let me ask you something else
    I have “DimSite” table and this table is linked to “Fact sales” table
    I have another table “DimTarget” which have a column “SiteID”
    Do I need to make any relation between the 2 tables “DimSite” and “DimTarget” ?
    Or I will leave it like this?
    “SiteID” column here in table “DimTarget” will be equivalent to “SiteKey” or “SiteAlternateKey” in table “DimSite”?

    Ramzy N.Ebeid
    Thursday, September 15, 2011 1:24 PM
  • So I understand you will have a SiteID as the primary key of your DimSite table (where I hope SiteID is a surrogate key which is different to the key in your data source, hence the name surrogate).

    This SiteID will be held as a foreign key within your Fact Sales table.

    For this schema to be a star schema as in your original question you would need a direct relationship (link) between your Fact Sales and your Dim Target. To help you define this relationship I believe I need to have the structure of your Fact Sales table and your Dim Target, plus give me a more detailed business objective of these tables. I suspect that the relationship between your Fact Sales and Dim Target will need to be more granular than SiteId.

    If you are relating (linking) two dimensions together directly (not via a fact table ) then you are building a snow flake schema not a star schema. To encap one of the main business objectives of having a star schema is you are focusing on simplicity of your queries and speed of query response, you are less concerned with updates and storage space.

    How have you found my links? Do you manage to read them?


    If you have found any of my posts helpful then please vote them as helpful. Kieran Patrick Wood MCTS BI, PGD SoftDev (Open), MBCS, MCC http://uk.linkedin.com/in/kieranpatrickwood
    Thursday, September 15, 2011 1:44 PM
  • go to this link: https://skydrive.live.com/#!/?cid=e967bdf9dcb139b0&sc=documents&uc=1&id=E967BDF9DCB139B0%21137

    document name: Sales Data Warehouse

     


    Ramzy N.Ebeid
    Thursday, September 15, 2011 1:51 PM
  • Hi,

    Your diagram looks to me almost a valid star schema. DimTarget joins directly to FactSFASales and DimSite joins directly to FactSFASales which is compatible with a star schema.

    However the dimensions DimRoute and DimCustomerRoute have to be linked to another Dimension before you can link them to a Fact table which is not compatible with a star schema.

    Kind Regards,

     

    Kieran. 


    If you have found any of my posts helpful then please vote them as helpful. Kieran Patrick Wood MCTS BI, PGD SoftDev (Open), MBCS, MCC http://uk.linkedin.com/in/kieranpatrickwood
    Thursday, September 15, 2011 2:26 PM
  • what is your opnion cocnering:

    I have “DimSite” table and this table is linked to “Fact sales” table
    I have another table “DimTarget” which have a column “SiteID”
    Do I need to make any relation between the 2 tables “DimSite” and “DimTarget” ?
    Or I will leave it like this?
    “SiteID” column here in table “DimTarget” will be equivalent to “SiteKey” or “SiteAlternateKey” in table “DimSite”?


    Ramzy N.Ebeid
    Thursday, September 15, 2011 2:36 PM
  • i will add 5 Dates in the FactSales Table
    and i only have 1 DimDate table
    how can i solve this?

    Ramzy N.Ebeid
    Thursday, September 15, 2011 3:01 PM
  • please i need your advice

    thanks


    Ramzy N.Ebeid
    Thursday, September 15, 2011 3:35 PM
  • Simply have 5 joins between the Fact Table and the DimDate table. One for each date which you will have as foreign keys on your fact table.
    If you have found any of my posts helpful then please vote them as helpful. Kieran Patrick Wood MCTS BI, PGD SoftDev (Open), MBCS, MCC http://uk.linkedin.com/in/kieranpatrickwood

    Thursday, September 15, 2011 3:46 PM
  • The AdventureWorks sample demonstrates this time dimension concept very well; - http://msftdbprodsamples.codeplex.com/releases/view/55926 I do need to re-assert the recommended books I gave at the beginning of my answer. If you are under time pressure open a http://safaribooksonline.com/Corporate/Index/ acount to gain quick access to these books.

     


    If you have found any of my posts helpful then please vote them as helpful. Kieran Patrick Wood MCTS BI, PGD SoftDev (Open), MBCS, MCC http://uk.linkedin.com/in/kieranpatrickwood
    Thursday, September 15, 2011 3:56 PM
  • Just a further thought on you Sales Data Warehouse diagram; - https://skydrive.live.com/#!/?cid=e967bdf9dcb139b0&sc=documents&uc=1&id=E967BDF9DCB139B0%21137

     

    I notice that there are 4 Fact tables which I feel is a little high. There is no hard and fast rule that you should have just one fact table per star schema, business rules may dictate that you need to have more than one fact table per star schema, if for example you need to query more than one fact table in the same report. I think maybe you could consider having two more star schemas once focusing on Sales, the other focusing on Inventory.

    I prefer to have one cube per star schema, this will increase the storage space used but simplfy design and improve query optimisation.

    Other factors to consider are the number of reports which will query this data warehouse and the number of records contained within this data warehouse.

    I hope the above is helpful.

    Please tell me how you get on.

     

     

     

     


    If you have found any of my posts helpful then please vote them as helpful. Kieran Patrick Wood MCTS BI, PGD SoftDev (Open), MBCS, MCC http://uk.linkedin.com/in/kieranpatrickwood
    Friday, September 16, 2011 9:32 AM
  • i want to ask you about my design is it a good design professional one, for me as a beginner


    Ramzy N.Ebeid
    Monday, September 19, 2011 7:11 AM
  • Given the information you have provided me with your design does seem like a reasonably good one as a starting point. However how this design will be implemented in practice depends on many other factors such as your business requirements, number of reports which will query this data warehouse and the number of records contained within this data warehouse, the business rules which define these queries behind these reports, number of and types of users / stakeholders etc.

    To what extent have you applied the advice I have already given you? e.g.

    • downloading and analysing the AdventureWorks samples,
    • purchasing the 2 books I recommended,
    • usage of Date dimension,
    • reading the links I gave you.

     

     

     


    If you have found any of my posts helpful then please vote them as helpful. Kieran Patrick Wood MCTS BI, PGD SoftDev (Open), MBCS, MCC http://uk.linkedin.com/in/kieranpatrickwood
    Monday, September 19, 2011 12:27 PM
  • all of them except some links
    Ramzy N.Ebeid
    Monday, September 19, 2011 3:03 PM
  • i updated the data warehouse again, if you please take a look on it

    go to this link: https://skydrive.live.com/#!/?cid=e967bdf9dcb139b0&sc=documents&uc=1&id=E967BDF9DCB139B0%21137

    document name: Sales Data Warehouse

    thanks,


    Ramzy N.Ebeid
    Monday, September 19, 2011 3:04 PM
  • OK, It's good to have some feedback.
    If you have found any of my posts helpful then please vote them as helpful. Kieran Patrick Wood MCTS BI, PGD SoftDev (Open), MBCS, MCC http://uk.linkedin.com/in/kieranpatrickwood
    • Marked as answer by RamzyEbeid Monday, September 26, 2011 2:27 PM
    Monday, September 19, 2011 3:05 PM
  • i updated the data warehouse again, if you please take a look on it

    go to this link: https://skydrive.live.com/#!/?cid=e967bdf9dcb139b0&sc=documents&uc=1&id=E967BDF9DCB139B0%21137

    document name: Sales Data Warehouse

    thanks,


    Ramzy N.Ebeid
    Monday, September 19, 2011 3:37 PM
  • In my data warehouse as attached.

    The DimBudget table contains a field “Channel(Customer)”

    This column is exists in the DimCustomer table.

    If I want to relate these 2 tables to each other then I will create a new table “DimChannels” and then relate it to DimCustomer and to DimBudget

    Tell me how to think.


    Ramzy N.Ebeid
    Monday, September 19, 2011 5:08 PM
  • I have 2 fact tables “FastSFASales” and “FactTarget”

    I have Item Hierarchy of product (Item / brand / family)

    The FactSFASales are using Item as its granularity

    Mean while FactTarget are using Brand as its granularity

    So what I did is that I excluded the Brand and Family from DimItem to be on sperate DimFamily and DimBrand to server 2 facts tables.

    Is there any other solution that will help me avoid the snowflake design?

    Thanks,

     


    Ramzy N.Ebeid
    Monday, September 19, 2011 6:22 PM
  • Always link the dim to the fact table. If you have to join dimensions to each other because their is a many-to-many relationship or due to the circumstances of your environment then use a bridge table or fact-less fact table.

    Still Dimensions should link only to the fact tables. If they are linked to more than 1 fact table assuming they have the same level of granularity these are called conformed dimensions. Lets say the level of detail was different for a dimension that related to an additional fact table, this is when you may need to create a Bridge table.

    I used to always end up with a snowflake until about 4-5 years ago, and really focused on finding ways to keep it a star schema. I've not had to create a snowflake model since, several times I was sure I was going to need to, by either combining dimensions, utilizing Role Playing dimensions,  eliminating attributes etc... Always architected and implemented the model as a star schema.

    Faster rendering time on queries, cubes process faster. Reduced time in loading the data warehouse also. Many other benefits.

    Thursday, June 7, 2018 10:36 PM