This is a discussion on Fact Table(s) Design Question within the SQL Server Data Warehousing forums, part of the Microsoft SQL Server category; --> Please forgive the cross-posting, I wasn't sure the best place to put this question. I'm modeling a procurement process ...
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Please forgive the cross-posting, I wasn't sure the best place to put this question. I'm modeling a procurement process as part of a data warehouse project - and in this case what we're procuring is people (filling jobs). The basic process is that the company creates a requisition, and that requisition will be to hire X number of people for a certain type of job. People then apply for those jobs until we hire enough to fill the number of openings from the requisition. In the transactional system, there are two main tables that hold the information for this process. One is Requisition, which has info like the number of openings and the min and max wage for the jobs. The second is RequisitionApplicants, which has a row for every person that applies for the job, and a flag that indicates of that person was hired or not. For the most part, the facts we're concerned with are counts - how many Requisitions there are by the age of the requisition, department, etc... and how many applicants we had vs how many hired by various dimensions. There's a disagreement among the design team about how this should be modeled. My contention is that there should be two fact tables, one based on the Requisition, another based on RequisitionPeople. This will allow us to easily do both kinds of counts that we need and keep the design clean. It would look like: FactRequsition -------------- AgeBucketKey DepartmentKey NumberOfOpenings MaxRate FactRequisitionApplicant ------------------------ PersonKey StatusKey HiredFlag Another team member says we should combine them into one table with the granularity being the applicant from RequisitionPeople - rolling everything up into one table. His design would be: FactRequisitionApplicant ----------------------- AgeBucketKey DepartmentKey PersonKey StatusKey NumberOfOpenings MaxRate HiredFlag The dimension keys from the old FactRequisition table would be repeated for every applicant to that requisition. Similarly, the facts would be repeated as well. I have a lot of problems with this. The grain feels all wrong and incoherent. It seems like we're unnecessarily repeating data and dimension keys. It also seems like the counts I need to come up with will be more difficult to obtain. For instance, if we have 5 requisitions and 22 people have applied to the various jobs from those requisitions, we'll have 22 rows in the fact table. Counting the applicants is easy, but counting the requisitions is more difficult. My question is - am I right? Which design is better? |