In Staging tables, is it a good practice to have primary/foreign keys and is it a good practice to set ALL table columns to Not Null?
Wednesday, February 27, 2013 3:57 PM
our company is beginning the process of using Staging tables for some data manipulations and warehousing. there is some discussion around the pros and cons of having primary/foreign keys on staging tables. the other items for discussion are the pros and cons of setting every field in a staging table to be Not Null. and lastly, the pros and cons of having lengthy table and columns names.
is there a "best practices" paper somewhere that would define how these staging tables should be named and structured?
- Edited by hddeluxe09 Wednesday, February 27, 2013 6:27 PM
Wednesday, February 27, 2013 8:05 PM
I doubt you find a "best practice" paper for staging area. Diferent people have different experience.
What we go by is the situation.
If we know the source data has intetgrity, there would be no integrity issue by not having a primary key duplicated in a staging table unless it helps with performance when moving from staging to Data Mart.
If the source is a combination of data tables and we need integrity before the Data Mart, we add primary Keys to staging area.
The preference is usually on performance of staging to Data Mart/Warehouse.
Some tables have no primary key or index during intial load, then we add cluster and non-cluster indexes after load to speed retrieval to Data Mart.
If you know there are no possiblities for Null values from source, then there is no need for Not Null on column. Again, this is for integrity of data. Would you rather have the ETL in the staging fail or load to Data Mart fail for integrity issues? The ETL part can move bad data to a seperate staging table.
Lengthy table and column names is a preference. Some like to use long names for self-documentation while others use a data dictionary for reference. It is up to you.
The Kimball book on ETL is a good source for well implemented paths.
TheSmilingDBA Thomas LeBlanc MCITP 2008 DBA