vBulletin Search Engine Optimization
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| I have a table that I am converting to a cube with almost 1 million rows in it. I am curious if in the dimension table, is it good design to have a 1:1 relationship with the fact table rows, or should I distinct the data in the dimension table, and point to it from the fact table? thanks!! |
| ||||
| Having a 1:1 relationship sort of defeats the purpose of having a dimension table tbh - its still a relational database after all. Your end users won't thank you for it either! You should cetainly choose to reduce the number of rows in a dimension table, having large dimension tables are the cause of some of the biggest headaches in data warehousing. How you populate the dimension table depends on the level of granularity you've decided on (ie what is represented by one row in the dimension table) which, depends on what the dimension itself represents. In the classic example of a Date dimension each row usually represents one day so you have 365 rows for each year (366 for leap years of course). You might have millions of fact table rows in a day but only one row for a Date dimension. Other classic types of dimension granularity would be a Product dimension with one row per product or an Employee dimension with one row per employee. Which dimensions you choose to create should reflect the requirements of your users as much as possible. Generally its wisest to choose a level of granularity that gives you a lot of detail - ie choose the lowest possible. That is almost always an order of magnitude smaller than your fact table. The level of detail available will obviously depend on your source data - so doing a SELECT DISTINCT on each column of source data is a good place to start, once you've worked out what dimensions you need. Dimensional modelling (which is what you're essentially asking about) is a huge topic, but one of its best proponents is Ralph Kimball - its worth looking out his 'Data Warehouse Toolkit' on amazon here: http://www.amazon.co.uk/Data-Warehou...82270&sr= 8-1. HTH -- Phil http://www.clarity-integration.com http://www.phil-austin.blogspot.com "Carl Henthorn" <CarlHenthorn@discussions.microsoft.com> wrote in message news:70017C02-BC13-4FBC-A556-1796415403D5@microsoft.com... >I have a table that I am converting to a cube with almost 1 million rows in > it. I am curious if in the dimension table, is it good design to have a > 1:1 > relationship with the fact table rows, or should I distinct the data in > the > dimension table, and point to it from the fact table? > thanks!! |