vBulletin Search Engine Optimization
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Hello- I currently have a large table - 100 million rows, create as such: CREATE TABLE `result` ( `id` int(11) NOT NULL auto_increment, `hostname` varchar(75) default NULL, `ip` varchar(15) default NULL, PRIMARY KEY (`id`), KEY `ip` (`ip`) ) ENGINE=MyISAM AUTO_INCREMENT=1 DEFAULT CHARSET=latin1; I have 2 questions. My query is like this: SELECT hostname,ip FROM result WHERE ip = '192.168.1.1' ORDER BY hostnam e LIMIT 0,100; 1. Would it be better to have the IP address split in to 4 int(3) fields and index each of those? The search seems pretty quick right now, but I may make more complex queries on those fields in future. For example, I may want to search a netblock as such; SELECT hostname,ip FROM result WHERE ip1 = 192 AND ip2 = 168 AND ip3 IN (1,2,3) ORDER BY hostname; 2. What is the best way to index hostname? Usually when I search by IP, the hostnames are unique enough in just the first 15 characters. I tried creating a partial index on hostname(15), but when running EXPLAIN the query still uses filesort, and the query still takes forever when sorting. I found that if I create a full index on hostname, it worked well, but that index takes forever to create. Would it be better to create another field, like hostnameindex, and insert into that field the first 15 chars of each hostname? Then build an index on that and sort by that? Thanks for any help. I am not exactly a pro at database design. |
| ||||
| On Fri, 07 Dec 2007 16:10:34 -0600, dis <dis@nomail.com> wrote: >Hello- > >I currently have a large table - 100 million rows, create as such: > >CREATE TABLE `result` ( > `id` int(11) NOT NULL auto_increment, > `hostname` varchar(75) default NULL, > `ip` varchar(15) default NULL, > PRIMARY KEY (`id`), > KEY `ip` (`ip`) >) ENGINE=MyISAM AUTO_INCREMENT=1 DEFAULT CHARSET=latin1; > > >I have 2 questions. My query is like this: > >SELECT hostname,ip FROM result WHERE ip = '192.168.1.1' ORDER BY hostnam >e LIMIT 0,100; > >1. Would it be better to have the IP address split in to 4 int(3) fields >and index each of those? The search seems pretty quick right now, but I >may make more complex queries on those fields in future. > >For example, I may want to search a netblock as such; > >SELECT hostname,ip FROM result WHERE ip1 = 192 AND ip2 = 168 AND ip3 IN >(1,2,3) ORDER BY hostname; In general it would be better to convert IP addresses into a single integer with INET_ATON('ip1.ip2.ip3.ip4') and store that in the database (INT UNSIGNED). Every time you look for a specific address you have to convert your arguments to integers as well. INET_NTOA() converts back to a 'ip1.ip2.ip3.ip4' string. >2. What is the best way to index hostname? Usually when I search by IP, >the hostnames are unique enough in just the first 15 characters. I tried >creating a partial index on hostname(15), but when running EXPLAIN the >query still uses filesort, and the query still takes forever when >sorting. I found that if I create a full index on hostname, it worked >well, but that index takes forever to create. > >Would it be better to create another field, like hostnameindex, and >insert into that field the first 15 chars of each hostname? Then build >an index on that and sort by that? I have no ideas about that right now. >Thanks for any help. I am not exactly a pro at database design. Neither am I HTH -- ( Kees ) c[_] Build a fire for a man and he will be warm for a day, but set fire to that man and he will be warm forever! (Sun Tzu) (#443) |