Unix Technical Forum

database design and query problem

This is a discussion on database design and query problem within the MySQL forums, part of the Database Server Software category; --> Hi, I'm looking at creating a database which will contain over 1.2 million telephone numbers, however after putting them ...


Go Back   Unix Technical Forum > Database Server Software > MySQL

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 02-28-2008, 08:14 AM
Adnan Barakat
 
Posts: n/a
Default database design and query problem

Hi,

I'm looking at creating a database which will contain over 1.2 million
telephone numbers, however after putting them all in one table the time
it takes to retrieve particular numbers takes quite a while (approx 5.2
secs per request), so i tried splitting up the data into 8 tables (named
by prefix of the number), this has worked out much quicker as only the
tables where the user has a number is being searched. if anyone has any
suggestions on a better design please let me know.

This leads me onto the main problem, how can I select from multiple
tables but get all the results in 1 query. I am looking to do this so I
can let MySQL process the LIMIT and the OFFSET rather than let PHP do so.

I've tried this but it doesnt seem to work

SELECT
*
FROM
`0870`,
`0800`,
`0702`
WHERE
`0870`.`owner` = '0'
LIMIT
30


Many thanks
Adnan
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 02-28-2008, 08:14 AM
Giuseppe Maxia
 
Posts: n/a
Default Re: database design and query problem

Adnan Barakat wrote:
> Hi,
>
> I'm looking at creating a database which will contain over 1.2 million
> telephone numbers, however after putting them all in one table the time
> it takes to retrieve particular numbers takes quite a while (approx 5.2
> secs per request), so i tried splitting up the data into 8 tables (named
> by prefix of the number), this has worked out much quicker as only the
> tables where the user has a number is being searched. if anyone has any
> suggestions on a better design please let me know.
>
> This leads me onto the main problem, how can I select from multiple
> tables but get all the results in 1 query. I am looking to do this so I
> can let MySQL process the LIMIT and the OFFSET rather than let PHP do so.
>
> I've tried this but it doesnt seem to work
>
> SELECT
> *
> FROM
> `0870`,
> `0800`,
> `0702`
> WHERE
> `0870`.`owner` = '0'
> LIMIT
> 30
>
>
> Many thanks
> Adnan


I have a similar table with over 5 million records and the answer
is always immediate.

Is your table indexed?

Can you post the structure of your table?

SHOW CREATE TABLE your_table_name\G

ciao
gmax

--
_ _ _ _
(_|| | |(_|>< The Data Charmer
_|
http://datacharmer.blogspot.com/
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 02-28-2008, 08:14 AM
Adnan Barakat
 
Posts: n/a
Default Re: database design and query problem

thanks for your quick response gmax,

as you can see the table isn't indexed, which I take it isn't a good
idea, how would I go about indexing it (as in which columns would i need
to other than `uid`)

+----------+------------------------------------------+
| Table | Create Table |
+----------+------------------------------------------+
| numbers | CREATE TABLE `numbers` (
`uid` varchar(11) NOT NULL default '',
`band` smallint(2) NOT NULL default '0',
`owner` int(11) NOT NULL default '0',
`status` smallint(2) NOT NULL default '0',
`dest` varchar(250) NOT NULL default '',
PRIMARY KEY (`uid`)
) TYPE=MyISAM |
+----------+------------------------------------------+


many thanks

Adnan




Giuseppe Maxia wrote:
> Adnan Barakat wrote:
>
>> Hi,
>>
>> I'm looking at creating a database which will contain over 1.2 million
>> telephone numbers, however after putting them all in one table the
>> time it takes to retrieve particular numbers takes quite a while
>> (approx 5.2 secs per request), so i tried splitting up the data into 8
>> tables (named by prefix of the number), this has worked out much
>> quicker as only the tables where the user has a number is being
>> searched. if anyone has any suggestions on a better design please let
>> me know.
>>
>> This leads me onto the main problem, how can I select from multiple
>> tables but get all the results in 1 query. I am looking to do this so
>> I can let MySQL process the LIMIT and the OFFSET rather than let PHP
>> do so.
>>
>> I've tried this but it doesnt seem to work
>>
>> SELECT
>> *
>> FROM
>> `0870`,
>> `0800`,
>> `0702`
>> WHERE
>> `0870`.`owner` = '0'
>> LIMIT
>> 30
>>
>>
>> Many thanks
>> Adnan

>
>
> I have a similar table with over 5 million records and the answer
> is always immediate.
>
> Is your table indexed?
>
> Can you post the structure of your table?
>
> SHOW CREATE TABLE your_table_name\G
>
> ciao
> gmax
>

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 02-28-2008, 08:14 AM
Giuseppe Maxia
 
Posts: n/a
Default Re: database design and query problem

Adnan Barakat wrote:
> thanks for your quick response gmax,
>
> as you can see the table isn't indexed, which I take it isn't a good
> idea, how would I go about indexing it (as in which columns would i need
> to other than `uid`)
>
> +----------+------------------------------------------+
> | Table | Create Table |
> +----------+------------------------------------------+
> | numbers | CREATE TABLE `numbers` (
> `uid` varchar(11) NOT NULL default '',
> `band` smallint(2) NOT NULL default '0',
> `owner` int(11) NOT NULL default '0',
> `status` smallint(2) NOT NULL default '0',
> `dest` varchar(250) NOT NULL default '',
> PRIMARY KEY (`uid`)
> ) TYPE=MyISAM |
> +----------+------------------------------------------+
>
>
> many thanks
>
> Adnan
>


I can't see form the names the meaning of each field.
Assuming that your phone number is in `owner`, then you should do

ALTER TABLE phones ADD UNIQUE KEY (owner);

After that, if you run a query with
WHERE owner = somenumber

it will return a result immediately.

Depending on your most frequent queries, you could also add
a compound index. For example, if you search for status, band, and dest,
then you could add

ALTER TABLE phones ADD UNIQUE KEY (owner), ADD KEY (status, band, dest);

However, it depends on what you are asking for, and how often you are issuing such queries.

Try adding a KEY on the phone number field (if numbers are supposed to be without
duplicates, add a UNIQUE KEY) and see if the situation improves.

ciao
gmax

P.S. You should answer AFTER the quoted message, not before.

--
_ _ _ _
(_|| | |(_|>< The Data Charmer
_|
http://datacharmer.blogspot.com/
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 02-28-2008, 08:14 AM
Dikkie Dik
 
Posts: n/a
Default Re: database design and query problem

Please keep responding before the quote, So I won't have to scroll all
the way down to see the answer.

In fact, the splitting of the tables is a bit what an index does, but in
a far better fashion. By default, a textual index is 4 characters long,
meaning that the 4 first characters can be used in a binary search. If
all numbers start with the same 4 characters, this does not make much
sense. You can therefore supply a key length for the index as well. So
if, say, 80% of your phone numbers start with '0870', the index would
not be very effective and you could gain a lot more speed by setting the
index length to a larger value. Supposing that 'uid' contains the phone
number (it really pays off to choose descriptive names), you could try
to enlarge the key to 6 characters with:

ALTER TABLE phones DROP PRIMARY KEY;
ALTER TABLE phones ADD PRIMARY KEY(uid(6));

(I could not find a modification command for an index)

Best regards

Giuseppe Maxia wrote:
> Adnan Barakat wrote:
>
>> thanks for your quick response gmax,
>>
>> as you can see the table isn't indexed, which I take it isn't a good
>> idea, how would I go about indexing it (as in which columns would i
>> need to other than `uid`)
>>
>> +----------+------------------------------------------+
>> | Table | Create Table |
>> +----------+------------------------------------------+
>> | numbers | CREATE TABLE `numbers` (
>> `uid` varchar(11) NOT NULL default '',
>> `band` smallint(2) NOT NULL default '0',
>> `owner` int(11) NOT NULL default '0',
>> `status` smallint(2) NOT NULL default '0',
>> `dest` varchar(250) NOT NULL default '',
>> PRIMARY KEY (`uid`)
>> ) TYPE=MyISAM |
>> +----------+------------------------------------------+
>>
>>
>> many thanks
>>
>> Adnan
>>

>
> I can't see form the names the meaning of each field.
> Assuming that your phone number is in `owner`, then you should do
>
> ALTER TABLE phones ADD UNIQUE KEY (owner);
>
> After that, if you run a query with
> WHERE owner = somenumber
>
> it will return a result immediately.
>
> Depending on your most frequent queries, you could also add
> a compound index. For example, if you search for status, band, and dest,
> then you could add
>
> ALTER TABLE phones ADD UNIQUE KEY (owner), ADD KEY (status, band, dest);
>
> However, it depends on what you are asking for, and how often you are
> issuing such queries.
>
> Try adding a KEY on the phone number field (if numbers are supposed to
> be without
> duplicates, add a UNIQUE KEY) and see if the situation improves.
>
> ciao
> gmax
>
> P.S. You should answer AFTER the quoted message, not before.
>

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 02-28-2008, 08:14 AM
Jerry Stuckle
 
Posts: n/a
Default Re: database design and query problem

Dikkie Dik wrote:
> Please keep responding before the quote, So I won't have to scroll all
> the way down to see the answer.
>


Please don't top post. The standards for this group are to respond
inline (like this) or after the quoted message.

> In fact, the splitting of the tables is a bit what an index does, but in
> a far better fashion. By default, a textual index is 4 characters long,
> meaning that the 4 first characters can be used in a binary search. If
> all numbers start with the same 4 characters, this does not make much
> sense. You can therefore supply a key length for the index as well. So
> if, say, 80% of your phone numbers start with '0870', the index would
> not be very effective and you could gain a lot more speed by setting the
> index length to a larger value. Supposing that 'uid' contains the phone
> number (it really pays off to choose descriptive names), you could try
> to enlarge the key to 6 characters with:
>


Actually, an index does not split the table at all - any more than the
index to a book splits the book. An index only contains the key and a
pointer to the record.

As for the first 4 characters of the text search - where did you find
this information? I haven't found anything like this in the MySQL doc,
and my experience has been that the entire key is in the table.

Additionally, the key would not be able to enforce uniqueness if it
didn't contain the entire key.

> ALTER TABLE phones DROP PRIMARY KEY;
> ALTER TABLE phones ADD PRIMARY KEY(uid(6));
>
> (I could not find a modification command for an index)
>


Yes, but the doc indicates this can be used to SHORTEN the amount of
space used by the index. There is no place I can find where it states
only the first 4 characters of a text field are used for searches.

> Best regards
>



--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex@attglobal.net
==================
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #7 (permalink)  
Old 02-28-2008, 08:14 AM
Giuseppe Maxia
 
Posts: n/a
Default Re: database design and query problem

Dikkie Dik wrote:
> Please keep responding before the quote, So I won't have to scroll all
> the way down to see the answer.
>


In newsgroups it is customary to post AFTER the quote.

To quote an old joke:

A: Top posting!
Q: What is the most irritating thing on Usenet?

http://www.parashift.com/c++-faq-lit...t.html#faq-5.4

> In fact, the splitting of the tables is a bit what an index does, but in
> a far better fashion.


???
Splitting a table is against normalization, and it should be done only if
there are really compelling reasons for it. (Like creating a data warehouse)
It might be more efficient, but it goes against relational principles,
thus making queries much more difficult.

> By default, a textual index is 4 characters long,


I can't find any proof of this. Care to show where did you get that info?
A far as I know, a textual index is as long as the column to which it refers,
unless a limit is explicitly given.

> meaning that the 4 first characters can be used in a binary search. If
> all numbers start with the same 4 characters, this does not make much
> sense. You can therefore supply a key length for the index as well. So
> if, say, 80% of your phone numbers start with '0870', the index would
> not be very effective and you could gain a lot more speed by setting the
> index length to a larger value.


If you need to find an exact telephone number, 4 characters are not enough. However,
if your phone numbers have an area code, it may be a good idea to split the
column into two, and to create a compound index.

> Supposing that 'uid' contains the phone
> number (it really pays off to choose descriptive names), you could try
> to enlarge the key to 6 characters with:
>
> ALTER TABLE phones DROP PRIMARY KEY;
> ALTER TABLE phones ADD PRIMARY KEY(uid(6));
>
> (I could not find a modification command for an index)
>


All in one command:
ALTER TABLE phones DROP KEY old_index, ADD KEY new_index.

[SNIP]

ciao
gmax

--
_ _ _ _
(_|| | |(_|>< The Data Charmer
_|
http://datacharmer.blogspot.com/
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #8 (permalink)  
Old 02-28-2008, 08:14 AM
Adnan Barakat
 
Posts: n/a
Default Re: database design and query problem

Giuseppe Maxia wrote:
> Adnan Barakat wrote:
>
>> thanks for your quick response gmax,
>>
>> as you can see the table isn't indexed, which I take it isn't a good
>> idea, how would I go about indexing it (as in which columns would i
>> need to other than `uid`)
>>
>> +----------+------------------------------------------+
>> | Table | Create Table |
>> +----------+------------------------------------------+
>> | numbers | CREATE TABLE `numbers` (
>> `uid` varchar(11) NOT NULL default '',
>> `band` smallint(2) NOT NULL default '0',
>> `owner` int(11) NOT NULL default '0',
>> `status` smallint(2) NOT NULL default '0',
>> `dest` varchar(250) NOT NULL default '',
>> PRIMARY KEY (`uid`)
>> ) TYPE=MyISAM |
>> +----------+------------------------------------------+
>>
>>
>> many thanks
>>
>> Adnan
>>

>
> I can't see form the names the meaning of each field.
> Assuming that your phone number is in `owner`, then you should do
>
> ALTER TABLE phones ADD UNIQUE KEY (owner);
>
> After that, if you run a query with
> WHERE owner = somenumber
>
> it will return a result immediately.
>
> Depending on your most frequent queries, you could also add
> a compound index. For example, if you search for status, band, and dest,
> then you could add
>
> ALTER TABLE phones ADD UNIQUE KEY (owner), ADD KEY (status, band, dest);
>
> However, it depends on what you are asking for, and how often you are
> issuing such queries.
>
> Try adding a KEY on the phone number field (if numbers are supposed to
> be without
> duplicates, add a UNIQUE KEY) and see if the situation improves.
>
> ciao
> gmax
>
> P.S. You should answer AFTER the quoted message, not before.
>


hi gmax,

`uid` is the actual telephone number as there cant be any duplicates (it
is varchar because the number starts with a zero so int didn't work). I
will change the name of the column to `number` to make it more obvious!
`owner` is the uid of the user who owns the number.
`band` is the type of number (a number which relates to the value of the
number).
`status` is the type of service provided to that number.
`dest` is the destination of the number.

I have altered the table and the time has been reduced a quite a bit.

I ran ALTER TABLE numbers ADD UNIQUE KEY (uid);

many thanks
Adnan
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #9 (permalink)  
Old 02-28-2008, 08:14 AM
Chander Ganesan
 
Posts: n/a
Default Re: database design and query problem

While there are a number of good reasons to split data into separate
tables, the reason you mention above isn't one of them. Splitting a
table to speed indexes might be useful if you plan to query a member
table directly (so query a subset - such as the numbers that were
entered in a particular quarter of the year - which are all stored in a
member table), but not if you are always accessing the parent.

Indexes (by default) are built on the entire field - if the field is 4
characters long, then the index will index all 4 characters. If the
field is 32 characters, the index will index all 32 characters.

MySQL uses a B-TREE method for creating the index. I don't think
you'll see much of a performance difference if you split the area code
into a separate field and use a compound index. The only time that
would be useful is if you sometimes search for a phone number *without*
an area code - in which case you could order the compound index with
the 'number' and then the 'area code'. Of course, that means that the
'area code' indexed fields wouldn't be accessible directly - but the
cardinality of an area code is probably low anyways...

It is sometimes useful to perform partial indexing (in MySQL this is
done by specifying the length to index), if the cardinality of the
partially indexed field is close enough to the cardinality of the
entire field, or if you have other business requirements (such as
always perform matches of a subpart of the field) that require this.
You can count the cardinality of a partial index using a select unique
combined with a substring statement.

I think the previous answer is the optimal one, use a UNIQUE index on
the whole field, or create a PRIMARY KEY index (if nulls are allowed
you'll have to use UNIQUE).

P.S. - Posting through Google Groups here , which I don't think even
includes the original message in the response ...

--
Chander Ganesan
Open Technology Group, Inc.
Phone: 877-258-8987/919-463-0999
http://www.otg-nc.com
Expert MySQL training

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #10 (permalink)  
Old 02-28-2008, 08:14 AM
Adnan Barakat
 
Posts: n/a
Default Re: database design and query problem

Chander Ganesan wrote:
> While there are a number of good reasons to split data into separate
> tables, the reason you mention above isn't one of them. Splitting a
> table to speed indexes might be useful if you plan to query a member
> table directly (so query a subset - such as the numbers that were
> entered in a particular quarter of the year - which are all stored in a
> member table), but not if you are always accessing the parent.
>
> Indexes (by default) are built on the entire field - if the field is 4
> characters long, then the index will index all 4 characters. If the
> field is 32 characters, the index will index all 32 characters.
>
> MySQL uses a B-TREE method for creating the index. I don't think
> you'll see much of a performance difference if you split the area code
> into a separate field and use a compound index. The only time that
> would be useful is if you sometimes search for a phone number *without*
> an area code - in which case you could order the compound index with
> the 'number' and then the 'area code'. Of course, that means that the
> 'area code' indexed fields wouldn't be accessible directly - but the
> cardinality of an area code is probably low anyways...
>
> It is sometimes useful to perform partial indexing (in MySQL this is
> done by specifying the length to index), if the cardinality of the
> partially indexed field is close enough to the cardinality of the
> entire field, or if you have other business requirements (such as
> always perform matches of a subpart of the field) that require this.
> You can count the cardinality of a partial index using a select unique
> combined with a substring statement.
>
> I think the previous answer is the optimal one, use a UNIQUE index on
> the whole field, or create a PRIMARY KEY index (if nulls are allowed
> you'll have to use UNIQUE).
>
> P.S. - Posting through Google Groups here , which I don't think even
> includes the original message in the response ...
>
> --
> Chander Ganesan
> Open Technology Group, Inc.
> Phone: 877-258-8987/919-463-0999
> http://www.otg-nc.com
> Expert MySQL training
>


Hi Chander,

Thank you very much for your detailed response, it has cleared things up
for me

Adnan
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 02:55 PM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com