vBulletin Search Engine Optimization
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Hi, I have two tables: Customer: objectid, lastname, fk_address Address: objectid, city I want to select all customers with a name >= some_name and living in a city >= some_city, all comparisons case insensitive Below is what I actually have. Given the fact that it takes forever to get a result (> 6 seconds) , there must be something wrong with my solution or my expectation. Can anyone tell what I should do to make this query go faster ( or convince me to wait for the result ;-()? SELECT customers.objectid FROM prototype.customers,prototype.addresses WHERE customers.contactAddress = addresses.objectId AND ( TRIM(UPPER(lastName)) >= TRIM(UPPER('some_name')) AND TRIM(UPPER(city)) >= TRIM(UPPER('some_city')) ) order by TRIM(UPPER(lastname)), TRIM(UPPER(city)) Explain analyze after a full alayse vacuum: Sort (cost=54710.68..54954.39 rows=97484 width=111) (actual time=7398.971..7680.405 rows=96041 loops=1) Sort Key: btrim(upper(customers.lastname)), btrim(upper(addresses.city)) -> Hash Join (cost=14341.12..46632.73 rows=97484 width=111) (actual time=1068.862..5472.788 rows=96041 loops=1) Hash Cond: ("outer".contactaddress = "inner".objectid) -> Seq Scan on customers (cost=0.00..24094.01 rows=227197 width=116) (actual time=0.018..1902.646 rows=223990 loops=1) Filter: (btrim(upper(lastname)) >= 'JANSEN'::text) -> Hash (cost=13944.94..13944.94 rows=158473 width=75) (actual time=1068.467..1068.467 rows=158003 loops=1) -> Bitmap Heap Scan on addresses (cost=1189.66..13944.94 rows=158473 width=75) (actual time=71.259..530.986 rows=158003 loops=1) Recheck Cond: (btrim(upper(city)) >= 'NIJMEGEN'::text) -> Bitmap Index Scan on prototype_addresses_trim_upper_city (cost=0.00..1189.66 rows=158473 width=0) (actual time=68.290..68.290 rows=158003 loops=1) Index Cond: (btrim(upper(city)) >= 'NIJMEGEN'::text) Total runtime: 7941.095 ms I have indices on : fki_customers_addresses customer.lastname (both lastname and trim(uppercase(lastname)) addresses.city (both city and trim(uppercase(city)) I -- Groeten, Joost Kraaijeveld Askesis B.V. Molukkenstraat 14 6524NB Nijmegen tel: 024-3888063 / 06-51855277 fax: 024-3608416 e-mail: J.Kraaijeveld@Askesis.nl web: www.askesis.nl |
| |||
| On 3/4/06, Joost Kraaijeveld <J.Kraaijeveld@askesis.nl> wrote: > Below is what I actually have. Given the fact that it takes forever to get > a result (> 6 seconds) , there must be something wrong with my solution or > my expectation. Can anyone tell what I should do to make this query go > faster ( or convince me to wait for the result ;-()? > Explain analyze after a full alayse vacuum: > Sort (cost=54710.68..54954.39 rows=97484 width=111) (actual > time=7398.971..7680.405 rows=96041 loops=1) > Sort Key: btrim(upper(customers.lastname)), btrim(upper(addresses.city)) > -> Hash Join (cost=14341.12..46632.73 rows=97484 width=111) (actual > time=1068.862..5472.788 rows=96041 loops=1) > Hash Cond: ("outer".contactaddress = "inner".objectid) > -> Seq Scan on customers (cost=0.00..24094.01 rows=227197 > width=116) (actual time=0.018..1902.646 rows=223990 loops=1) > Filter: (btrim(upper(lastname)) >= 'JANSEN'::text) > -> Hash (cost=13944.94..13944.94 rows=158473 width=75) (actual > time=1068.467..1068.467 rows=158003 loops=1) > -> Bitmap Heap Scan on addresses (cost=1189.66..13944.94 > rows=158473 width=75) (actual time=71.259..530.986 rows=158003 loops=1) > Recheck Cond: (btrim(upper(city)) >= 'NIJMEGEN'::text) > -> Bitmap Index Scan on > prototype_addresses_trim_upper_city (cost=0.00..1189.66 > rows=158473 width=0) (actual time=68.290..68.290 rows=158003 loops=1) > Index Cond: (btrim(upper(city)) >= > 'NIJMEGEN'::text) > Total runtime: 7941.095 ms explain clearly shows, that index is used for addresses scan, but it is not so for users. explain estimates that 227197 customers match the lastname criteria - which looks awfuly high. how many record do you have in the customers table? i would try to create index test on customers(contactAddress, trim(uppercase(lastname))); or with other ordring of fields. try this - create the index, make analyze of customers table, and recheck explain. then try the second index in the same manner. maybe this could of some help... depesz ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| |||
| Hi Hubert, On Sat, 2006-03-04 at 14:49 +0100, hubert depesz lubaczewski wrote: > > Sort (cost=54710.68..54954.39 rows=97484 width=111) (actual > > time=7398.971..7680.405 rows=96041 loops=1) > > Sort Key: btrim(upper(customers.lastname)), btrim(upper(addresses.city)) > > -> Hash Join (cost=14341.12..46632.73 rows=97484 width=111) (actual time=1068.862..5472.788 rows=96041 loops=1) > > Hash Cond: ("outer".contactaddress = "inner".objectid) > > -> Seq Scan on customers (cost=0.00..24094.01 rows=227197 width=116) (actual time=0.018..1902.646 rows=223990 loops=1) > > Filter: (btrim(upper(lastname)) >= 'JANSEN'::text) > > -> Hash (cost=13944.94..13944.94 rows=158473 width=75) (actual time=1068.467..1068.467 rows=158003 loops=1) > > -> Bitmap Heap Scan on addresses (cost=1189.66..13944.94 rows=158473 width=75) (actual time=71.259..530.986 rows=158003 loops=1) > > Recheck Cond: (btrim(upper(city)) >= 'NIJMEGEN'::text) > > -> Bitmap Index Scan on prototype_addresses_trim_upper_city (cost=0.00..1189.66 rows=158473 width=0) (actual time=68.290..68.290 rows=158003 loops=1) > > Index Cond: (btrim(upper(city)) >=> 'NIJMEGEN'::text) > > Total runtime: 7941.095 ms > > explain clearly shows, that index is used for addresses scan, but it Yes, but I do not understand why I have both a "Bitmap Index Scan" and a "Bitmap Heap Scan" on (btrim(upper(city)) >=> 'NIJMEGEN'::text)? > is not so for users. > explain estimates that 227197 customers match the lastname criteria - > which looks awfuly high. > how many record do you have in the customers table? 368915 of which 222465 actually meet the condition. >From what I understand from the mailing list, PostgreSQL prefers a table scan whenever it expects that the number of records in the resultset will be ~ > 10 % of the total number of records in the table. Which explains the table scan for customers, but than again, it does not explain why it uses the index on addresses: it has 369337 addresses of which 158003 meet the condition > i would try to create index test on customers(contactAddress, > trim(uppercase(lastname))); > or with other ordring of fields. > > try this - create the index, make analyze of customers table, and > recheck explain. > then try the second index in the same manner. Makes no difference. -- Groeten, Joost Kraaijeveld Askesis B.V. Molukkenstraat 14 6524NB Nijmegen tel: 024-3888063 / 06-51855277 fax: 024-3608416 e-mail: J.Kraaijeveld@Askesis.nl web: www.askesis.nl ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |
| |||
| On 3/4/06, Joost Kraaijeveld <J.Kraaijeveld@askesis.nl> wrote: > > how many record do you have in the customers table? > 368915 of which 222465 actually meet the condition. > >From what I understand from the mailing list, PostgreSQL prefers a table > scan whenever it expects that the number of records in the resultset > will be ~ > 10 % of the total number of records in the table. Which > explains the table scan for customers, but than again, it does not > explain why it uses the index on addresses: it has 369337 addresses of > which 158003 meet the condition bitmap index scan is faster than sequential table scan. that's all. it was introduced in 8.1 as far as i remember. basically - i doubt if you can get better performace from query when the result row-count is that high. out of curiosity though - why do you need so many rows? it's not possible to view them, nor do anything meaningful with 200 thousand rows! depesz ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |
| ||||
| On Saturday 04 March 2006 08:23, hubert depesz lubaczewski wrote: > On 3/4/06, Joost Kraaijeveld <J.Kraaijeveld@askesis.nl> wrote: > > > how many record do you have in the customers table? > > > > 368915 of which 222465 actually meet the condition. > > > > >From what I understand from the mailing list, PostgreSQL prefers a table > > > > scan whenever it expects that the number of records in the resultset > > will be ~ > 10 % of the total number of records in the table. Which > > explains the table scan for customers, but than again, it does not > > explain why it uses the index on addresses: it has 369337 addresses of > > which 158003 meet the condition > > bitmap index scan is faster than sequential table scan. that's all. it > was introduced in 8.1 as far as i remember. > basically - i doubt if you can get better performace from query when > the result row-count is that high. > > out of curiosity though - why do you need so many rows? it's not > possible to view them, nor do anything meaningful with 200 thousand > rows! > > depesz If you're just displaying, use limit and offset to grab one page at a time. If you're manipulating it would be a good idea to do something in a stored procedure. ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |