Unix Technical Forum

Strange optimizer behavior

A discussion in the MySQL General forum, part of the MySQL category.


#1  Sharon, 02-28-2008, 07:02 AM

Hello all,
Given this table:
DROP TABLE IF EXISTS `maprimary`.`tbl_locales_ip2l`;
CREATE TABLE `maprimary`.`tbl_locales_ip2l` (
`ipStart` int(10) unsigned zerofill NOT NULL default '0000000000',
`ipEnd` int(10) unsigned zerofill NOT NULL default '0000000000',
`countryCode` varchar(2) default NULL,
`country` varchar(100) default NULL,
`state` varchar(100) default NULL,
`city` varchar(120) default NULL,
`lat` float NOT NULL default '0',
`lon` float NOT NULL default '0',
`zipCode` varchar(10) NOT NULL default '0',
`timeZone` int(10) NOT NULL default '0',
PRIMARY KEY USING BTREE (`ipStart`,`ipEnd`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

When I use this query:
SELECT * FROM `tbl_locales_ip2l` WHERE `ipStart` <= 3741319167 AND
`ipEnd` >= 3741319167;
I can see that the primary key is not used and the query takes about 3 sec.
But when I use this query:
SELECT * FROM `tbl_locales_ip2l` WHERE `ipStart` <= 374131916 AND
`ipEnd` >= 374131916;
The primary key is used.
The table contains about 3M rows.
Can anyone explain?
Thanks, Sharon.
#2  Baron Schwartz, 02-28-2008, 07:02 AM

Hi,

On Dec 20, 2007 2:15 AM, Sharon <talsharon@hotmail.com> wrote:
> When I use this query:
> SELECT * FROM `tbl_locales_ip2l` WHERE `ipStart` <= 3741319167 AND
> `ipEnd` >= 3741319167;
> I can see that the primary key is not used and the query takes about 3 sec.
> But when I use this query:
> SELECT * FROM `tbl_locales_ip2l` WHERE `ipStart` <= 374131916 AND
> `ipEnd` >= 374131916;
> The primary key is used.
> The table contains about 3M rows.
> Can anyone explain?
> Thanks, Sharon.


If the query will access more than a certain number of rows, the index won't
be used. There is a set of heuristics for this; the actual number
varies, but people often say a full scan is about as much work as an
index scan that retrieves 30% of the rows. That's not quite the way
the optimizer works, but it gives you an idea.

If you think it really will be faster, use USE INDEX or FORCE INDEX and see.

Baron
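
To see which plan the optimizer picks, the two variants can be compared with EXPLAIN. This is a minimal sketch using the table and value from the original post; FORCE INDEX (PRIMARY) is standard MySQL index-hint syntax. With a composite key on (ipStart, ipEnd) and a range condition on ipStart, only the ipStart part can bound the index range. For a value near the top of the IPv4 space, such as 3741319167, the condition on ipStart probably covers most of the table (an assumption about how the data is distributed), which would explain why the full scan is preferred.

```sql
-- Plan the optimizer chooses on its own.
EXPLAIN SELECT * FROM `tbl_locales_ip2l`
WHERE `ipStart` <= 3741319167 AND `ipEnd` >= 3741319167;

-- Same query with the primary key forced, for comparison.
EXPLAIN SELECT * FROM `tbl_locales_ip2l` FORCE INDEX (PRIMARY)
WHERE `ipStart` <= 3741319167 AND `ipEnd` >= 3741319167;
```

The key and rows columns of the EXPLAIN output show which index is used and how many rows the optimizer expects to examine in each case.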
#3  Sharon, 02-28-2008, 07:02 AM

Baron Schwartz wrote:
> If you think it really will be faster, use USE INDEX or FORCE INDEX and see.


You're right, forcing the index results in a 47-second query.
Any idea how to optimize this table?
A 3-second query (without forcing the index) is way too slow.
#4  Baron Schwartz, 02-28-2008, 07:02 AM

On Dec 20, 2007 7:16 AM, Sharon <talsharon@hotmail.com> wrote:
> You're right, forcing the index results in a 47-second query.
> Any idea how to optimize this table?
> A 3-second query (without forcing the index) is way too slow.


Try InnoDB with the same primary key. This will cluster the rows
together physically and *might* be faster, but it depends on your
queries.

Side note: be careful of making the varchar columns larger than you
need, as any operations that use an in-memory temporary table (shown
by "Using temporary" in EXPLAIN) will use the full length of the
column, even if only a few characters are used. (The Memory storage
engine doesn't support variable-length rows.)
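
Both suggestions above can be tried without touching the original table. This is a rough sketch: `tbl_locales_ip2l_innodb` is a hypothetical name for the copy, and whether the InnoDB version is actually faster has to be measured on the real data.

```sql
-- Build an InnoDB copy with the same columns and primary key,
-- leaving the MyISAM table in place for comparison.
CREATE TABLE `tbl_locales_ip2l_innodb` LIKE `tbl_locales_ip2l`;
ALTER TABLE `tbl_locales_ip2l_innodb` ENGINE=InnoDB;
INSERT INTO `tbl_locales_ip2l_innodb` SELECT * FROM `tbl_locales_ip2l`;

-- Check how long the varchar columns really need to be
-- before shrinking their declared lengths.
SELECT MAX(CHAR_LENGTH(`country`)) AS max_country,
       MAX(CHAR_LENGTH(`state`))   AS max_state,
       MAX(CHAR_LENGTH(`city`))    AS max_city
FROM `tbl_locales_ip2l`;
```

Running the original lookup against both tables then gives a direct timing comparison.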
#5  Sharon, 02-28-2008, 07:02 AM

Baron Schwartz wrote:
> Try InnoDB with the same primary key. This will cluster the rows
> together physically and *might* be faster, but it depends on your
> queries.
>
> Side note: be careful of making the varchar columns larger than you
> need, as any operations that use an in-memory temporary table (shown
> by "Using temporary" in EXPLAIN) will use the full length of the
> column, even if only a few characters are used. (The Memory storage
> engine doesn't support variable-length rows.)


9 sec. using InnoDB, 3 times slower than the MyISAM query without the forced index.
By the way, the query will always return 1 row.
Thanks for the tip, I really need to cut those varchars.
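
Since the query always returns one row, one commonly used rewrite for this kind of range lookup is to fetch only the row with the largest ipStart not above the address, then check ipEnd on that single row. This is a sketch rather than something proposed in the thread, and it assumes the ranges in the table do not overlap; `candidate` is an arbitrary alias.

```sql
-- Inner query: walk the primary key backwards from the address
-- and stop at the first row (largest ipStart <= address).
-- Outer query: keep that row only if the address falls inside its range.
SELECT candidate.*
FROM (
    SELECT *
    FROM `tbl_locales_ip2l`
    WHERE `ipStart` <= 3741319167
    ORDER BY `ipStart` DESC
    LIMIT 1
) AS candidate
WHERE candidate.`ipEnd` >= 3741319167;
```

Running EXPLAIN on the inner query should show whether the primary key is used for it. Timing the rewrite against the 3-second full scan would confirm whether it helps on this data.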