Unix Technical Forum

Smarter Table build - might split the table

This is a discussion on Smarter Table build - might split the table within the SQL Server forums, part of the Microsoft SQL Server category; --> Hello, I am using SQL 2005 and Cognos' Data Manager. It is an ETL tool for data warehousing. I ...


Go Back   Unix Technical Forum > Database Server Software > Microsoft SQL Server > SQL Server

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 02-29-2008, 08:52 PM
rcamarda
 
Posts: n/a
Default Smarter Table build - might split the table

Hello,
I am using SQL 2005 and Cognos' Data Manager. It is an ETL tool for
data warehousing.
I have a problem with time it takes to load new changes, and I am
seeking advice on a better way to manage the data.

I have a table that tracks student attendance and it contains about 13
million records. On a daily basis, there are 5,000 - 20,000 inserts and
10,000 - 50,000 updates.
The daily data comes for two different text files from my operation
system; current and historical (CLSFIL and CLSHIS).
The data is loaded into a staging area from the operational system,
where data cleansing and other fields are added to the table.

The final step is delivering the table to my target database, which is
used for reporting.
Heres the situation: I find it takes 45 minutes to do a relational
update, where only the records that changed in the last day will be
loaded. However, if I choose the native API load instead of a
Relational Load, it can load all 13M records in 7 minutes. The table is
heavly indexed
At some point, the API load will take more time than the relational
load, (the changes and new records will remain a constant, but the file
will continue to grow).
I'm seeking another solution is more efficient. I'm considering two
tables for history and current and creating a view for reporting via a
union.

This a good idea? How can I make the view effeicent to use the where
clause? Looking to bounce around ideas.

Other Ideas?
Thanks in Advance
Rob

(I maintain the key relationships in the tool, not the tables. I know
I have lots to learn and improvments)
CREATE TABLE "dbo"."F_BI_Class_Attendance_Detail"
(
"CLASS_ATTENDANCE_ID" VARCHAR(50) NULL,
"CLASSES_OFFERED_ID" VARCHAR(26) NULL,
"CLASS_CAMPUS_ID" VARCHAR(10) NULL,
"STUDENT_ID" CHAR(20) NULL,
"FULL_CLASS_ID" CHAR(15) NOT NULL,
"SESSION_ID" CHAR(10) NULL,
"SECTION_ID" VARCHAR(5) NULL,
"MEET_DT" DATETIME NULL,
"MEETING" SMALLINT NULL,
"PRESENT" CHAR(2) NOT NULL,
"SESSION_SKEY" BIGINT NULL,
"STUDENT_SKEY" BIGINT NULL,
"CLASS_CAMPUS_SKEY" BIGINT NULL,
"CLASSES_OFFERED_SKEY" BIGINT NULL,
"LOAD_DT" DATETIME NULL,
"COMPUTED_DT" DATETIME NULL
)
;

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 02-29-2008, 08:53 PM
rcamarda
 
Posts: n/a
Default Re: Smarter Table build - might split the table

I have not partitioned the table. Here are the indexes:
CREATE INDEX CLASS_ATTENDANCE_ID ON
"dbo"."F_BI_Class_Attendance_Detail" ( "CLASS_ATTENDANCE_ID" );
CREATE INDEX CLASS_CAMPUS_ID ON "dbo"."F_BI_Class_Attendance_Detail" (
"CLASS_CAMPUS_ID" );
CREATE INDEX STUDENT_ID ON "dbo"."F_BI_Class_Attendance_Detail" (
"STUDENT_ID" );
CREATE INDEX FULL_CLASS_ID ON "dbo"."F_BI_Class_Attendance_Detail" (
"FULL_CLASS_ID" );
CREATE INDEX SESSION_ID ON "dbo"."F_BI_Class_Attendance_Detail" (
"SESSION_ID" );
CREATE INDEX MEETING ON "dbo"."F_BI_Class_Attendance_Detail" (
"MEETING" );
CREATE INDEX PRESENT ON "dbo"."F_BI_Class_Attendance_Detail" (
"PRESENT" );
CREATE INDEX SESSION_SKEY ON "dbo"."F_BI_Class_Attendance_Detail" (
"SESSION_SKEY" );
CREATE INDEX STUDENT_SKEY ON "dbo"."F_BI_Class_Attendance_Detail" (
"STUDENT_SKEY" );
CREATE INDEX CLASS_CAMPUS_SKEY ON "dbo"."F_BI_Class_Attendance_Detail"
( "CLASS_CAMPUS_SKEY" );

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 02-29-2008, 08:53 PM
rcamarda
 
Posts: n/a
Default Re: Smarter Table build - might split the table

Upon more research in this news group, it looks like I should look at
horizontal partitioning.
I am using SQL 2005 Enterprise, so I should be able to use partitioned
data and indexes.
If I partition on year, what should I consider to make this as
maintenance free as possible (what happens where a new year comes?)
TIA
Rob

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 02-29-2008, 08:53 PM
Hugo Kornelis
 
Posts: n/a
Default Re: Smarter Table build - might split the table

On 10 Jul 2006 07:31:10 -0700, rcamarda wrote:

(snip)
>Heres the situation: I find it takes 45 minutes to do a relational
>update, where only the records that changed in the last day will be
>loaded.


Hi Rob,

What column(s) is/are used to relate rows in the staging tables to the
matching rows in the target table? Are there any indexes on the staging
table? Can you post the query you use for this relational update?

--
Hugo Kornelis, SQL Server MVP
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 02:44 PM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com