Now we have a car renting BI project, a car can be rent by different user, a user can rent different cars, it seems that the transaction Fact table is reasonable, for the booking dimension table, we have so many transaction, it will be very huge, right? who can share some ideas how to design this business datawarehouse structure? Any comment is hightly appreciated.
- Edited by Jeddd Sunday, August 18, 2013 3:33 AM typo
You will need to provide a bit more information. The one thing I can say is that "Booking" should not be a dimension table, but a fact table (either transactional or an accumulating snapshot).
Who is involved in a booking? Customer, car, employee? What facts?
Marius (Microsoft BI Solutions Architect)
Please use Marked as Answer if my post solved your problem and use Vote As Helpful if a post was useful.
Big thanks for your comment!
For the booking, it is really a transaction event, but I don't know whether we need such a Booking dimenion.
In order to describe the scenario clearly, I will give a example: a user come to our station to rent a car, so user, car, employee will all involve in. To the car, we have different car group, different stations, all of these should belong to car dimension I think, to the user, we have different user group....all of these should belong to user dimension. We have many information need to be analyzed.
1. Count different information: eg. car, user, station,
2. Booking duration (End Time - Start Time), sometimes, user will extend or decrease the booking duration this is our key information. we need to analysis duration based on different car, different station..... and also Distance, which will tell us how many kilometers.
3. Different of Revenue (fueling, rental price, Tickets, Premium Insurance, Driver Service, Deductabiles)
4. If the car is damaged by the user, we will also have Insurance & Damages fee analysis.
if you need more information, please feel free to let me know.
I think you are asking for advice on designing your data warehouse. There are many ways to design, and you must look at your own scenario to decide. I prefer to design in a snowflake schema. I try to normalize the data to what is reasonable. It's okay to have repeating information in a table if it balances the cost of table joins and the cost of space taken up by the repeating data. Here is a starting point from the information you provided. The Dimensions would be tables that contain related attributes about a subject area. The Measures would be in transactional Fact tables that relate back to the keys of as many of the Dimensions as possible.
There may be measures and dimensions that I have missed, but you get the idea.