Unix Technical Forum

SQL Server storing large amounts of data in multiple tables

This is a discussion on SQL Server storing large amounts of data in multiple tables within the SQL Server forums, part of the Microsoft SQL Server category; --> Hello, Currently we have a database, and it is our desire for it to be able to store millions ...


Go Back   Unix Technical Forum > Database Server Software > Microsoft SQL Server > SQL Server

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 02-28-2008, 07:10 PM
jeff brubaker
 
Posts: n/a
Default SQL Server storing large amounts of data in multiple tables

Hello,
Currently we have a database, and it is our desire for it to be able
to store millions of records. The data in the table can be divided up
by client, and it stores nothing but about 7 integers.
| table |
| id | clientId | int1 | int2 | int 3 | ... |

Right now, our benchmarks indicate a drastic increase in performance
if we divide the data into different tables. For example,
table_clientA, table_clientB, table_clientC, despite the fact the
tables contain the exact same columns. This however does not seem very
clean or elegant to me, and rather illogical since a database exists
as a single file on the harddrive.

| table_clientA |
| id | clientId | int1 | int2 | int 3 | ...

| table_clientB |
| id | clientId | int1 | int2 | int 3 | ...

| table_clientC |
| id | clientId | int1 | int2 | int 3 | ...

Is there anyway to duplicate this increase in database performance
gained by splitting the table, perhaps by using a certain type of
index?

Thanks,
Jeff Brubaker
Software Developer
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 02-28-2008, 07:11 PM
Chuck Urwiler
 
Posts: n/a
Default Re: SQL Server storing large amounts of data in multiple tables

Why not create a view that will combine the separate tables back into your
original format. You could even place the different base tables in different
locations (i.e., different servers) - the whole distributed partitioned view
concept.

Other than this, what do you have indexed on this table? What kind of query
is showing a drastic improvement when you separate the table like this?

-Chuck Urwiler, MCSD, MCDBA

"jeff brubaker" <jeff@priva.com> wrote in message
news:b7387660.0310021523.f2fb5e6@posting.google.co m...
> Hello,
> Currently we have a database, and it is our desire for it to be able
> to store millions of records. The data in the table can be divided up
> by client, and it stores nothing but about 7 integers.
> | table |
> | id | clientId | int1 | int2 | int 3 | ... |
>
> Right now, our benchmarks indicate a drastic increase in performance
> if we divide the data into different tables. For example,
> table_clientA, table_clientB, table_clientC, despite the fact the
> tables contain the exact same columns. This however does not seem very
> clean or elegant to me, and rather illogical since a database exists
> as a single file on the harddrive.
>
> | table_clientA |
> | id | clientId | int1 | int2 | int 3 | ...
>
> | table_clientB |
> | id | clientId | int1 | int2 | int 3 | ...
>
> | table_clientC |
> | id | clientId | int1 | int2 | int 3 | ...
>
> Is there anyway to duplicate this increase in database performance
> gained by splitting the table, perhaps by using a certain type of
> index?
>
> Thanks,
> Jeff Brubaker
> Software Developer



Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 02-28-2008, 07:12 PM
Erland Sommarskog
 
Posts: n/a
Default Re: SQL Server storing large amounts of data in multiple tables

[posted and mailed, please reply in news]

jeff brubaker (jeff@priva.com) writes:
> Currently we have a database, and it is our desire for it to be able
> to store millions of records. The data in the table can be divided up
> by client, and it stores nothing but about 7 integers.
>| table |
>| id | clientId | int1 | int2 | int 3 | ... |
>
> Right now, our benchmarks indicate a drastic increase in performance
> if we divide the data into different tables. For example,
> table_clientA, table_clientB, table_clientC, despite the fact the
> tables contain the exact same columns. This however does not seem very
> clean or elegant to me, and rather illogical since a database exists
> as a single file on the harddrive.
>...
>
> Is there anyway to duplicate this increase in database performance
> gained by splitting the table, perhaps by using a certain type of
> index?


It is not implausible, but with out further knowledge of your tables
and the benchmark queries, it is impossible to tell.

You could get a more informative answer, if you posted:

o The CREATE TABLE statements (both for the unpartitioned table,
and the partitioned table).
o Any indexes on the tables.
o The queries you use for the benchmark.
o If you have scripts that generates data for the benchmarks, that
would extremely useful. (Provided that they reasonably small.)

Which client did you use for the benchmark? Query Analyzer?

--
Erland Sommarskog, SQL Server MVP, sommar@algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 02-28-2008, 07:20 PM
jeff brubaker
 
Posts: n/a
Default Re: SQL Server storing large amounts of data in multiple tables

Erland Sommarskog <sommar@algonet.se> wrote in message news:<Xns940AEEEF4BC5EYazorman@127.0.0.1>...
> [posted and mailed, please reply in news]
>
> jeff brubaker (jeff@priva.com) writes:
> > Currently we have a database, and it is our desire for it to be able
> > to store millions of records. The data in the table can be divided up
> > by client, and it stores nothing but about 7 integers.
> >| table |
> >| id | clientId | int1 | int2 | int 3 | ... |
> >
> > Right now, our benchmarks indicate a drastic increase in performance
> > if we divide the data into different tables. For example,
> > table_clientA, table_clientB, table_clientC, despite the fact the
> > tables contain the exact same columns. This however does not seem very
> > clean or elegant to me, and rather illogical since a database exists
> > as a single file on the harddrive.
> >...
> >
> > Is there anyway to duplicate this increase in database performance
> > gained by splitting the table, perhaps by using a certain type of
> > index?

>
> It is not implausible, but with out further knowledge of your tables
> and the benchmark queries, it is impossible to tell.
>
> You could get a more informative answer, if you posted:
>
> o The CREATE TABLE statements (both for the unpartitioned table,
> and the partitioned table).
> o Any indexes on the tables.
> o The queries you use for the benchmark.
> o If you have scripts that generates data for the benchmarks, that
> would extremely useful. (Provided that they reasonably small.)
>
> Which client did you use for the benchmark? Query Analyzer?



Okay, sorry for the delay. Here is a SQL Script to setup the
experiment. Basically it builts one table with 100,000 records, and 10
small tables with 10,000 records. To select all the records from
table_5 is substantially faster than selecting all the records from
bigTable where clientID = 5



SET NOCOUNT ON
/* Drop any tables that might exist */

if exists (select * from dbo.sysobjects where id =
object_id(N'[dbo].[bigTable]') and OBJECTPROPERTY(id, N'IsUserTable')
= 1)
drop table [dbo].[bigTable]
if exists (select * from dbo.sysobjects where id =
object_id(N'[dbo].[table_0]') and OBJECTPROPERTY(id, N'IsUserTable') =
1)
drop table [dbo].[table_0]
if exists (select * from dbo.sysobjects where id =
object_id(N'[dbo].[table_1]') and OBJECTPROPERTY(id, N'IsUserTable') =
1)
drop table [dbo].[table_1]
if exists (select * from dbo.sysobjects where id =
object_id(N'[dbo].[table_2]') and OBJECTPROPERTY(id, N'IsUserTable') =
1)
drop table [dbo].[table_2]
if exists (select * from dbo.sysobjects where id =
object_id(N'[dbo].[table_3]') and OBJECTPROPERTY(id, N'IsUserTable') =
1)
drop table [dbo].[table_3]
if exists (select * from dbo.sysobjects where id =
object_id(N'[dbo].[table_4]') and OBJECTPROPERTY(id, N'IsUserTable') =
1)
drop table [dbo].[table_4]
if exists (select * from dbo.sysobjects where id =
object_id(N'[dbo].[table_5]') and OBJECTPROPERTY(id, N'IsUserTable') =
1)
drop table [dbo].[table_5]
if exists (select * from dbo.sysobjects where id =
object_id(N'[dbo].[table_6]') and OBJECTPROPERTY(id, N'IsUserTable') =
1)
drop table [dbo].[table_6]
if exists (select * from dbo.sysobjects where id =
object_id(N'[dbo].[table_7]') and OBJECTPROPERTY(id, N'IsUserTable') =
1)
drop table [dbo].[table_7]
if exists (select * from dbo.sysobjects where id =
object_id(N'[dbo].[table_8]') and OBJECTPROPERTY(id, N'IsUserTable') =
1)
drop table [dbo].[table_8]
if exists (select * from dbo.sysobjects where id =
object_id(N'[dbo].[table_9]') and OBJECTPROPERTY(id, N'IsUserTable') =
1)
drop table [dbo].[table_9]


/* Create the tables */

CREATE TABLE [dbo].[bigTable] (
[id1] [int] NULL ,
[id2] [int] NULL ,
[id3] [int] NULL ,
[id4] [int] NULL ,
[id5] [int] NULL ,
[id6] [int] NULL ,
[id7] [int] NULL ,
[id8] [int] NULL ,
[id9] [int] NULL ,
[id10] [int] NULL ,
[id11] [int] NULL ,
[id12] [int] NULL ,
[id13] [int] NULL ,
[id14] [int] NULL ,
[id15] [int] NULL ,
[id16] [int] NULL ,
[id17] [int] NULL ,
[id18] [int] NULL ,
[id19] [int] NULL ,
[id110] [int] NULL ,
[clientid] [int] NULL
) ON [PRIMARY]

CREATE TABLE table_1 (
[id1] [int] NULL ,
[id2] [int] NULL ,
[id3] [int] NULL ,
[id4] [int] NULL ,
[id5] [int] NULL ,
[id6] [int] NULL ,
[id7] [int] NULL ,
[id8] [int] NULL ,
[id9] [int] NULL ,
[id10] [int] NULL ,
[id11] [int] NULL ,
[id12] [int] NULL ,
[id13] [int] NULL ,
[id14] [int] NULL ,
[id15] [int] NULL ,
[id16] [int] NULL ,
[id17] [int] NULL ,
[id18] [int] NULL ,
[id19] [int] NULL ,
[id20] [int] NULL ,
) ON [PRIMARY]

CREATE TABLE table_2 (
[id1] [int] NULL ,
[id2] [int] NULL ,
[id3] [int] NULL ,
[id4] [int] NULL ,
[id5] [int] NULL ,
[id6] [int] NULL ,
[id7] [int] NULL ,
[id8] [int] NULL ,
[id9] [int] NULL ,
[id10] [int] NULL ,
[id11] [int] NULL ,
[id12] [int] NULL ,
[id13] [int] NULL ,
[id14] [int] NULL ,
[id15] [int] NULL ,
[id16] [int] NULL ,
[id17] [int] NULL ,
[id18] [int] NULL ,
[id19] [int] NULL ,
[id20] [int] NULL ,
) ON [PRIMARY]


CREATE TABLE table_3 (
[id1] [int] NULL ,
[id2] [int] NULL ,
[id3] [int] NULL ,
[id4] [int] NULL ,
[id5] [int] NULL ,
[id6] [int] NULL ,
[id7] [int] NULL ,
[id8] [int] NULL ,
[id9] [int] NULL ,
[id10] [int] NULL ,
[id11] [int] NULL ,
[id12] [int] NULL ,
[id13] [int] NULL ,
[id14] [int] NULL ,
[id15] [int] NULL ,
[id16] [int] NULL ,
[id17] [int] NULL ,
[id18] [int] NULL ,
[id19] [int] NULL ,
[id20] [int] NULL ,
) ON [PRIMARY]


CREATE TABLE table_4 (
[id1] [int] NULL ,
[id2] [int] NULL ,
[id3] [int] NULL ,
[id4] [int] NULL ,
[id5] [int] NULL ,
[id6] [int] NULL ,
[id7] [int] NULL ,
[id8] [int] NULL ,
[id9] [int] NULL ,
[id10] [int] NULL ,
[id11] [int] NULL ,
[id12] [int] NULL ,
[id13] [int] NULL ,
[id14] [int] NULL ,
[id15] [int] NULL ,
[id16] [int] NULL ,
[id17] [int] NULL ,
[id18] [int] NULL ,
[id19] [int] NULL ,
[id20] [int] NULL ,
) ON [PRIMARY]


CREATE TABLE table_5 (
[id1] [int] NULL ,
[id2] [int] NULL ,
[id3] [int] NULL ,
[id4] [int] NULL ,
[id5] [int] NULL ,
[id6] [int] NULL ,
[id7] [int] NULL ,
[id8] [int] NULL ,
[id9] [int] NULL ,
[id10] [int] NULL ,
[id11] [int] NULL ,
[id12] [int] NULL ,
[id13] [int] NULL ,
[id14] [int] NULL ,
[id15] [int] NULL ,
[id16] [int] NULL ,
[id17] [int] NULL ,
[id18] [int] NULL ,
[id19] [int] NULL ,
[id20] [int] NULL ,
) ON [PRIMARY]


CREATE TABLE table_6 (
[id1] [int] NULL ,
[id2] [int] NULL ,
[id3] [int] NULL ,
[id4] [int] NULL ,
[id5] [int] NULL ,
[id6] [int] NULL ,
[id7] [int] NULL ,
[id8] [int] NULL ,
[id9] [int] NULL ,
[id10] [int] NULL ,
[id11] [int] NULL ,
[id12] [int] NULL ,
[id13] [int] NULL ,
[id14] [int] NULL ,
[id15] [int] NULL ,
[id16] [int] NULL ,
[id17] [int] NULL ,
[id18] [int] NULL ,
[id19] [int] NULL ,
[id20] [int] NULL ,
) ON [PRIMARY]


CREATE TABLE table_7 (
[id1] [int] NULL ,
[id2] [int] NULL ,
[id3] [int] NULL ,
[id4] [int] NULL ,
[id5] [int] NULL ,
[id6] [int] NULL ,
[id7] [int] NULL ,
[id8] [int] NULL ,
[id9] [int] NULL ,
[id10] [int] NULL ,
[id11] [int] NULL ,
[id12] [int] NULL ,
[id13] [int] NULL ,
[id14] [int] NULL ,
[id15] [int] NULL ,
[id16] [int] NULL ,
[id17] [int] NULL ,
[id18] [int] NULL ,
[id19] [int] NULL ,
[id20] [int] NULL ,
) ON [PRIMARY]


CREATE TABLE table_8 (
[id1] [int] NULL ,
[id2] [int] NULL ,
[id3] [int] NULL ,
[id4] [int] NULL ,
[id5] [int] NULL ,
[id6] [int] NULL ,
[id7] [int] NULL ,
[id8] [int] NULL ,
[id9] [int] NULL ,
[id10] [int] NULL ,
[id11] [int] NULL ,
[id12] [int] NULL ,
[id13] [int] NULL ,
[id14] [int] NULL ,
[id15] [int] NULL ,
[id16] [int] NULL ,
[id17] [int] NULL ,
[id18] [int] NULL ,
[id19] [int] NULL ,
[id20] [int] NULL ,
) ON [PRIMARY]


CREATE TABLE table_9 (
[id1] [int] NULL ,
[id2] [int] NULL ,
[id3] [int] NULL ,
[id4] [int] NULL ,
[id5] [int] NULL ,
[id6] [int] NULL ,
[id7] [int] NULL ,
[id8] [int] NULL ,
[id9] [int] NULL ,
[id10] [int] NULL ,
[id11] [int] NULL ,
[id12] [int] NULL ,
[id13] [int] NULL ,
[id14] [int] NULL ,
[id15] [int] NULL ,
[id16] [int] NULL ,
[id17] [int] NULL ,
[id18] [int] NULL ,
[id19] [int] NULL ,
[id20] [int] NULL ,
) ON [PRIMARY]


CREATE TABLE table_0 (
[id1] [int] NULL ,
[id2] [int] NULL ,
[id3] [int] NULL ,
[id4] [int] NULL ,
[id5] [int] NULL ,
[id6] [int] NULL ,
[id7] [int] NULL ,
[id8] [int] NULL ,
[id9] [int] NULL ,
[id10] [int] NULL ,
[id11] [int] NULL ,
[id12] [int] NULL ,
[id13] [int] NULL ,
[id14] [int] NULL ,
[id15] [int] NULL ,
[id16] [int] NULL ,
[id17] [int] NULL ,
[id18] [int] NULL ,
[id19] [int] NULL ,
[id20] [int] NULL ,
) ON [PRIMARY]
DECLARE @countPerClient int
SET @countPerClient = 10000
DECLARE @counter int
SET @counter = 1

/* Fill the big table with the 10 clients */

WHILE (@counter <= @countPerClient)
BEGIN
INSERT bigTable (clientId) VALUES (0)
SET @counter = @counter + 1
END
SET @counter=1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT bigTable (clientId) VALUES (1)
SET @counter = @counter + 1
END
SET @counter=1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT bigTable (clientId) VALUES (2)
SET @counter = @counter + 1
END
SET @counter=1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT bigTable (clientId) VALUES (3)
SET @counter = @counter + 1
END
SET @counter=1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT bigTable (clientId) VALUES (4)
SET @counter = @counter + 1
END
SET @counter=1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT bigTable (clientId) VALUES (5)
SET @counter = @counter + 1
END
SET @counter=1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT bigTable (clientId) VALUES (6)
SET @counter = @counter + 1
END
SET @counter=1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT bigTable (clientId) VALUES (7)
SET @counter = @counter + 1
END
SET @counter=1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT bigTable (clientId) VALUES (8)
SET @counter = @counter + 1
END
SET @counter=1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT bigTable (clientId) VALUES (9)
SET @counter = @counter + 1
END


/* Fill each of the table with 1 clients */


SET @counter = 1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT table_1 DEFAULT VALUES
SET @counter = @counter + 1
END

SET @counter = 1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT table_2 DEFAULT VALUES
SET @counter = @counter + 1
END
SET @counter = 1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT table_3 DEFAULT VALUES
SET @counter = @counter + 1
END
SET @counter = 1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT table_4 DEFAULT VALUES
SET @counter = @counter + 1
END
SET @counter = 1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT table_5 DEFAULT VALUES
SET @counter = @counter + 1
END
SET @counter = 1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT table_6 DEFAULT VALUES
SET @counter = @counter + 1
END
SET @counter = 1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT table_7 DEFAULT VALUES
SET @counter = @counter + 1
END
SET @counter = 1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT table_8 DEFAULT VALUES
SET @counter = @counter + 1
END

SET @counter = 1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT table_9 DEFAULT VALUES
SET @counter = @counter + 1
END
SET @counter = 1
WHILE (@counter <= @countPerClient)
BEGIN
INSERT table_0 DEFAULT VALUES
SET @counter = @counter + 1
END
GO
/* Now time for the queries */
DECLARE @x datetime


SELECT @x = GetDate()
select count(*) from table_5
SELECT 'Split Tables' as label,DateDiff(millisecond, @x, GetDate())

SELECT @x = GetDate()
select count(*) from bigTable where clientId=5
SELECT 'Big Table' as label,DateDiff(millisecond, @x, GetDate())
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 02-28-2008, 07:22 PM
Erland Sommarskog
 
Posts: n/a
Default Re: SQL Server storing large amounts of data in multiple tables

[posted and mailed, please reply in news]

jeff brubaker (jeff@priva.com) writes:
> Okay, sorry for the delay. Here is a SQL Script to setup the
> experiment. Basically it builts one table with 100,000 records, and 10
> small tables with 10,000 records. To select all the records from
> table_5 is substantially faster than selecting all the records from
> bigTable where clientID = 5


Yes, since there is no index at all at your tables, this is not
strange.

First, I had to increase the number of rows per client to 100000 to
get a significant difference.

When I had run the first test, I ran these two statements:

CREATE CLUSTERED INDEX clientid_ix on bigTable (clientId)
go
DBCC DROPCLEANBUFFERS

The first statement builds an index on bigTable.clientId. The second
just cleans out the cache, so that all data will be read from disk.
(Don't do this on a production machine!)

I then ran the benchmarks. table_5 was still faster with 450 ms, where
as the SELECT COUNT(*) from bigTable needed 563 ms. However, on
successive runs, table_5 took 110 ms, where as the read from bigTable
was 16 ms.

Before you consider advanced techniques like splitting tables, you should
make sure that you have a sound index strategy. A clustered index has
the data as its leaf pages, so after adding the clustered index,
bigTable is really like table_1 to table_9 glued together.
--
Erland Sommarskog, SQL Server MVP, sommar@algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 03:00 PM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com