Unix Technical Forum

indexing

This is a discussion on indexing within the SQL Server forums, part of the Microsoft SQL Server category; --> Hello, I need some help understanding why my indexes do not seem to be affecting my searches. I would ...


Go Back   Unix Technical Forum > Database Server Software > Microsoft SQL Server > SQL Server

Register FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 02-29-2008, 03:17 AM
superfly2
 
Posts: n/a
Default indexing

Hello,

I need some help understanding why my indexes do not seem to be affecting my
searches. I would really appreciate help understanding what indexes I need
to make this query run faster. I realize that I use wildcards when searching
for g1.gene_name, but is there anything I can do to make that less of a
problem? I ran EXPLAIN on the search I wanted to optimize and got the
following:

EXPLAIN SELECT c1.SFID FROM Gene g1, cDNA c1, Transcript t1, Refseq r1 WHERE
(c1.SFID = t1.cDNA_SFID AND t1.gene_SFID = g1.SFID AND (g1.gene_sym = 'hh'
OR g1.genbank_acc = 'hh' OR g1.gene_name LIKE '%hh%')) OR (c1.genbank_acc =
'hh' OR c1.SUID = 'hh') OR (c1.SFID = t1.cDNA_SFID AND t1.gene_SFID =
g1.SFID AND g1.locuslink_id = r1.locuslink_id AND (r1.mRNA_acc = 'hh'));
+-------+-------+--------------------------+------+---------+------+--------
+-------------------------+
| table | type | possible_keys | key | key_len | ref | rows
| Extra |
+-------+-------+--------------------------+------+---------+------+--------
+-------------------------+
| r1 | index | mRNA_acc,llid,rma,rllid | rma | 25 | NULL | 20093
| Using index |
| g1 | ALL | PRIMARY,llid,ggs,gga,gll | NULL | NULL | NULL | 190475
| |
| c1 | ALL | PRIMARY,cga,cs | NULL | NULL | NULL | 43714
| where used |
| t1 | index | gene_SFID,gS,cS,tg,tc | gS | 4 | NULL | 47238
| where used; Using index |
+-------+-------+--------------------------+------+---------+------+--------
+-------------------------+



I have the following indexes (which were all added after the database was
populated):

ALTER TABLE cDNA ADD INDEX cga
(genbank_acc, SFID);

ALTER TABLE cDNA ADD INDEX co
(organism, SFID);

ALTER TABLE cDNA ADD INDEX cs
(SUID, SFID);


ALTER TABLE Gene ADD INDEX ggs
(gene_sym, SFID);

ALTER TABLE Gene ADD INDEX gga
(genbank_acc, SFID);

ALTER TABLE Gene ADD INDEX ggn
(gene_name, SFID);

ALTER TABLE Gene ADD INDEX go
(organism, SFID);

ALTER TABLE Gene ADD INDEX gll
(locuslink_id, SFID);

ALTER TABLE Gene ADD INDEX gui
(unigene_id, SFID);


ALTER TABLE Transcript ADD INDEX tg
(gene_SFID, cDNA_SFID);

ALTER TABLE Transcript ADD INDEX tc
(cDNA_SFID);


ALTER TABLE Refseq ADD INDEX rma
(mRNA_acc, locuslink_id);

ALTER TABLE Refseq ADD INDEX rllid
(locuslink_id);


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 02-29-2008, 03:17 AM
Simon Hayes
 
Posts: n/a
Default Re: indexing


"superfly2" <darius_fatakia@yahoo.com> wrote in message
news:cb030a$11i$1@news.Stanford.EDU...
> Hello,
>
> I need some help understanding why my indexes do not seem to be affecting

my
> searches. I would really appreciate help understanding what indexes I need
> to make this query run faster. I realize that I use wildcards when

searching
> for g1.gene_name, but is there anything I can do to make that less of a
> problem? I ran EXPLAIN on the search I wanted to optimize and got the
> following:
>
> EXPLAIN SELECT c1.SFID FROM Gene g1, cDNA c1, Transcript t1, Refseq r1

WHERE
> (c1.SFID = t1.cDNA_SFID AND t1.gene_SFID = g1.SFID AND (g1.gene_sym = 'hh'
> OR g1.genbank_acc = 'hh' OR g1.gene_name LIKE '%hh%')) OR (c1.genbank_acc

=
> 'hh' OR c1.SUID = 'hh') OR (c1.SFID = t1.cDNA_SFID AND t1.gene_SFID =
> g1.SFID AND g1.locuslink_id = r1.locuslink_id AND (r1.mRNA_acc = 'hh'));
>

+-------+-------+--------------------------+------+---------+------+--------
> +-------------------------+
> | table | type | possible_keys | key | key_len | ref | rows
> | Extra |
>

+-------+-------+--------------------------+------+---------+------+--------
> +-------------------------+
> | r1 | index | mRNA_acc,llid,rma,rllid | rma | 25 | NULL |

20093
> | Using index |
> | g1 | ALL | PRIMARY,llid,ggs,gga,gll | NULL | NULL | NULL |

190475
> | |
> | c1 | ALL | PRIMARY,cga,cs | NULL | NULL | NULL |

43714
> | where used |
> | t1 | index | gene_SFID,gS,cS,tg,tc | gS | 4 | NULL |

47238
> | where used; Using index |
>

+-------+-------+--------------------------+------+---------+------+--------
> +-------------------------+
>
>
>
> I have the following indexes (which were all added after the database was
> populated):
>
> ALTER TABLE cDNA ADD INDEX cga
> (genbank_acc, SFID);
>
> ALTER TABLE cDNA ADD INDEX co
> (organism, SFID);
>
> ALTER TABLE cDNA ADD INDEX cs
> (SUID, SFID);
>
>
> ALTER TABLE Gene ADD INDEX ggs
> (gene_sym, SFID);
>
> ALTER TABLE Gene ADD INDEX gga
> (genbank_acc, SFID);
>
> ALTER TABLE Gene ADD INDEX ggn
> (gene_name, SFID);
>
> ALTER TABLE Gene ADD INDEX go
> (organism, SFID);
>
> ALTER TABLE Gene ADD INDEX gll
> (locuslink_id, SFID);
>
> ALTER TABLE Gene ADD INDEX gui
> (unigene_id, SFID);
>
>
> ALTER TABLE Transcript ADD INDEX tg
> (gene_SFID, cDNA_SFID);
>
> ALTER TABLE Transcript ADD INDEX tc
> (cDNA_SFID);
>
>
> ALTER TABLE Refseq ADD INDEX rma
> (mRNA_acc, locuslink_id);
>
> ALTER TABLE Refseq ADD INDEX rllid
> (locuslink_id);
>
>


I believe that EXPLAIN is a MySQL statement - this is a Microsoft SQL Server
nesgroup, so I guess you'll get a better answer in a MySQL forum.

Simon


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 02-29-2008, 03:17 AM
Erland Sommarskog
 
Posts: n/a
Default Re: indexing

[posted and mailed, please reply in news]

superfly2 (darius_fatakia@yahoo.com) writes:
> EXPLAIN SELECT c1.SFID FROM Gene g1, cDNA c1, Transcript t1, Refseq r1
> WHERE (c1.SFID = t1.cDNA_SFID AND t1.gene_SFID = g1.SFID AND
> (g1.gene_sym = 'hh' OR g1.genbank_acc = 'hh' OR g1.gene_name LIKE
> '%hh%')) OR (c1.genbank_acc = 'hh' OR c1.SUID = 'hh') OR (c1.SFID =
> t1.cDNA_SFID AND t1.gene_SFID = g1.SFID AND g1.locuslink_id =
> r1.locuslink_id AND (r1.mRNA_acc = 'hh'));


Obviously you are in the wrong newsgroup, beause there is no EXPLAIN
command in SQL Server, and you don't use ALTER TABLE to add indexes.

Nevertheless, I might be able to give some input. You have this
condition:

g1.gene_name LIKE '%hh%'

To resolve this condition, the DB engine cannot use the index for a
quick look up. Compare with looking through the index of a book and
try to find all keywords with 'hh' in them. You would have to scan
the entire index.

I don't know about your DBMS, but MS SQL Server could use the index
for a scan, and it would do it, if the index includes all columns in the
table that aer referred to. If I see right, you would have to add
locuslink_id to

ALTER TABLE Gene ADD INDEX ggs
(gene_sym, SFID);

Not that I know if this would help on your engine, but most DBMS's are
fond of using covering indexes.

Then again, I have no idea how your DBMS handle all those OR.

--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 03:57 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com