Unix Technical Forum

Using two indexes on the same table

This is a discussion on Using two indexes on the same table within the MySQL forums, part of the Database Server Software category; --> jfgroff@gmail.com wrote: > Hi Axel, > > Sorry for messing up your name; it was late ;-) > > ...


Go Back   Unix Technical Forum > Database Server Software > MySQL

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #11 (permalink)  
Old 02-28-2008, 07:37 AM
Jerry Stuckle
 
Posts: n/a
Default Re: Using two indexes on the same table

jfgroff@gmail.com wrote:
> Hi Axel,
>
> Sorry for messing up your name; it was late ;-)
>
> I did a few tests today, and here are my results. From a fresh database
> loaded with about 4 million records, I always perform the same query
> yielding 34 results near København:
> SELECT id, name, lat, lon FROM Places
> WHERE lat > 55.5 AND lat < 55.7
> AND lon > 12.0 AND lon < 12.1;
>
>
> The query cache is disabled for these tests, although it would be
> enabled in production.
>
> 1) Using InnoDB
> - with no index: 6.50 seconds
> - with indexes on lat and lon (range of about 8000 rows using ixlon):
> 1.25 seconds
> - repeating the same query: 0.85 seconds
> - forcing use of ixlat (range of about 16000 rows): 1.21 seconds
> - repeating the same query: 1.01 seconds
> - with a combined index on (lat,lon), forcing its use (range of about
> 20000 rows): 0.91 seconds
> - repeating the same query: 0.92 seconds
>
>
> 2) Using MyISAM
> - with no index: 4.41 seconds
> - with indexes on lat and lon (range of about 4000 rows using ixlon):
> 1.54 seconds
> - repeating the same query: 0.07 seconds
> - forcing use of ixlat (range of about 8000 rows): 0.28 seconds
> - repeating the same query: 0.22 seconds
> - with a combined index on (lat,lon), forcing its use (range of about
> 8000 rows): 0.21 seconds
> - similar queries on random locations take between 0.11 and 0.19
> seconds (letting the query planner choose its preferred index)
> - when forced to use the combined index, they take between 0.11 and
> 0.28 seconds
>
> Conclusions:
> - Use of a combined index does not bring any significant performance
> improvement, and it consumes a lot of extra disk space: abandoned.
> - MyISAM is about 4 times faster than InnoDB for this situation.
> - But with a sustained rate of 5 requests per second, we still can't
> put this in production... Our goal is 100 rps.
> - MyISAM is more efficient at re-using data from previous queries,
> essentially thanks to the built-in key cache.
> - So let's try to tune the key cache!
>
> 3) MyISAM with pre-loaded key cache
> - set global key_buffer_size=200*1000*1024; (200 MB can accomodate
> our full indexes in the key cache)
> - LOAD INDEX INTO CACHE Places;
> - now the usual query takes 0.12 seconds
> - similar queries on random locations take between 0.07 and 0.13
> seconds
>
> We are now standing at 10 requests per second with a moderate memory
> expense; good progress but still not enough.
>
> 4) Mapping (lat,lon) to a single integer key
> - we add an INT column called quadrant calculated as INT(lat*10) *
> 10000 + INT((lon + 180) * 10)
> - each quadrant maps to a 0.1 x 0.1 degrees (lat,lon) rectangle,
> precise enough for our needs
> - our standard query becomes:
> SELECT id, name, lat, lon FROM Places
> WHERE quadrant = 5561920;
> - it returns 25 results in 0.00 seconds, not exactly the same results
> due to rounding of positions, but that's easy to take into account in
> the client application
> - we can extend the query to the neighbouring quadrants, e.g. for a
> 0.3 x 0.2 degrees rectangle:
> SELECT id, name, lat, lon FROM Places
> WHERE quadrant = 5551920 OR quadrant = 5561920 OR quadrant = 5571920
> OR quadrant = 5551921 OR quadrant = 5561921 OR quadrant = 5571921;
> - this query returns 128 results in 0.00 seconds
> - random queries anywhere take between 0.00 and 0.01 seconds
> - we need more precise time measurement but it looks like this is the
> way to get the desired performance
> - we haven't tuned the key cache and query cache yet...
>
> Conclusion: optimizing the queries is nice, optimizing the caches is
> nicer, optimizing the data is nicest ;-)
>
> As a side note, I didn't think we should move back to MyISAM, but this
> looks good for infrequently-updated tables which need fast selects and
> don't hold critical customer data.
>
> Thanks again for your help.
>
> JFG
>


You also might find converting lat and long to an int will improve performance.
Integer comparisons are always the fastest.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex@attglobal.net
==================
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #12 (permalink)  
Old 02-28-2008, 07:37 AM
onedbguru@firstdbasource.com
 
Posts: n/a
Default Re: Using two indexes on the same table

You said that using the spatial was "over-kill" - why would you assume
that anything dealing with spatial data - as you are - would be
over-kill? While you are testing, why not test to see if the spatial
option will give you what you really need - a way to efficiently select
spatial data.

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #13 (permalink)  
Old 02-28-2008, 07:37 AM
jfgroff@gmail.com
 
Posts: n/a
Default Re: Using two indexes on the same table

I wanted to test it, but the documentation says that spatial tables
currently can not be populated with LOAD DATA INFILE from a text
format, so it would be difficult to import my existing geo data in
there for proper testing. However, if you have practical experience
with using spatial tables in a similar context, I'd be glad to read
about it. Our needs are: a few million places defined as simple lat/lon
points, typical query is to find places in a 0.1x0.1 degrees area, must
support about 100 requests per second on a single machine.

Besides, I have now measured that the mapping of lat and lon to a
single integer quadrant can deliver the required performance with
minimal hassle, so I'm going to use this for the first production
release, but I remain open to more sophisticated solutions for the
future.

JFG

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 09:26 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com