This is a discussion on Best way to use indexes for partial match at beginning within the Pgsql General forums, part of the PostgreSQL category; --> I have 10 CHARACTER columns in table mytable. Mytable is indexed by some of those columns. CREATE TABLE mytable ...
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| I have 10 CHARACTER columns in table mytable. Mytable is indexed by some of those columns. CREATE TABLE mytable ( col1 CHARACTER(10), col2 CHARACTER(10),col3 CHARACTER(10),col4 CHARACTER(10),col5 CHARACTER(10),col6 CHARACTER(10),col7 CHARACTER(10),col8 CHARACTER(10),col9 CHARACTER(10), col10 CHARACTER(10) ); CREATE INDEX i1 ON mytable(col1); CREATE INDEX i2 ON mytable(col2); I need to select records by knowing some characters from beginning. I know always 1-10 first characters of col1. So my LIKE pattern starts always with constant characters and ends with % . I can use LIKE: SELECT * FROM mytable WHERE col1 LIKE 'A%' AND col2 LIKE 'BC%' AND col3 LIKE 'DEF%' AND col4 LIKE 'G%'; or substring(): SELECT * FROM mytable WHERE substring(col1 for 1)='A' AND substring(col2 for 2)= 'BC' AND substring(col3 for 3)='DEF' AND substring(col4 for 1) ='G'; Can Postgres 8.1 use indexes to speed the queries above ? Which is the best way to to write the where clause in this case so that index is used ? Andrus. |
| |||
| Well, for starters, see if PostgreSQL is currently using any indexes via EXPLAIN. First rule of performance tuning: don't. If it's not (which is probably the case), then your best bet is to create functional indexes; ie: CREATE INDEX mytable__col1_4 ON mytable( substring( col1 for 4 ) ); You can then either SELECT ... WHERE substring( col1 for 4 ) = blah or SELECT ... WHERE substring( col1 for 4 ) LIKE 'bla%' Though that last one might not use the index; you'll have to check and see. Also, keep in mind that PostgreSQL doesn't store CHAR the same as most other databases; the internal storage is the same as what's used for VARCHAR and TEXT. On Sun, Nov 06, 2005 at 11:03:01PM +0200, Andrus Moor wrote: > I have 10 CHARACTER columns in table mytable. Mytable is indexed by some of > those columns. > > CREATE TABLE mytable ( col1 CHARACTER(10), > col2 CHARACTER(10),col3 CHARACTER(10),col4 CHARACTER(10),col5 > CHARACTER(10),col6 CHARACTER(10),col7 CHARACTER(10),col8 CHARACTER(10),col9 > CHARACTER(10), col10 CHARACTER(10) ); > > CREATE INDEX i1 ON mytable(col1); > CREATE INDEX i2 ON mytable(col2); > > I need to select records by knowing some characters from beginning. > I know always 1-10 first characters of col1. So my LIKE pattern starts > always with constant characters and ends with % . > > I can use LIKE: > > SELECT * FROM mytable > WHERE col1 LIKE 'A%' > AND col2 LIKE 'BC%' > AND col3 LIKE 'DEF%' > AND col4 LIKE 'G%'; > > or substring(): > > SELECT * FROM mytable > WHERE substring(col1 for 1)='A' > AND substring(col2 for 2)= 'BC' > AND substring(col3 for 3)='DEF' > AND substring(col4 for 1) ='G'; > > > Can Postgres 8.1 use indexes to speed the queries above ? > > Which is the best way to to write the where clause in this case so that > index is used ? > > Andrus. > > > > ---------------------------(end of broadcast)--------------------------- > TIP 9: In versions below 8.0, the planner will ignore your desire to > choose an index scan if your joining column's datatypes do not > match > -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings |
| |||
| > Well, for starters, see if PostgreSQL is currently using any indexes via > EXPLAIN. First rule of performance tuning: don't. I'm designing a new application. Data is not available yet. I'm using Postgres 8.1 in Windows. Database encoding is UTF-8 lc_ctype is Estonian_Estonia.1257. lc_collate is Estonian currently. However I can set lc_collate to C if this solves this issue. Doc says that " to allow PostgreSQL to use indexes with LIKE clauses under a non-C locale, several custom operator classes exist" I don't understand "non-C locale". Does this mean lc_collate or also some other lc_ setting ? > If it's not (which is probably the case), then your best bet is to > create functional indexes; ie: > > CREATE INDEX mytable__col1_4 ON mytable( substring( col1 for 4 ) ); > > You can then either > > SELECT ... WHERE substring( col1 for 4 ) = blah I need to optimize queries with variable number of characters in beginning like SELECT ... WHERE substring( col1 for 1 ) = 'f' SELECT ... WHERE substring( col1 for 2 ) = 'fo' SELECT ... WHERE substring( col1 for 3 ) = 'foo' etc This approach requires creating 10 indexes for each column which is unreasonable. In my current dbms, Microsoft Visual FoxPro I have a single index CREATE INDEX i1 ON mytable(col1) I can use queries: WHERE col1 BETWEEN 'f' and 'f'+CHR(255) WHERE col1 BETWEEN 'fo' and 'fo'+CHR(255) WHERE col1 BETWEEN 'foo' and 'foo'+CHR(255) All those queries can use same index automatically in all locales. CHR(255) is last character in any lc_collate sequence. CHR(255) is not used in col1 data. How to get same functionality in Postgres ? Does there exist unicode special character which is greater than all other chars ? Andrus. |
| |||
| On Wed, Nov 09, 2005 at 12:37:25PM +0200, Andrus wrote: > I'm using Postgres 8.1 in Windows. Database encoding is UTF-8 > lc_ctype is Estonian_Estonia.1257. > lc_collate is Estonian currently. However I can set lc_collate to C if this > solves this issue. > > Doc says that > " to allow PostgreSQL to use indexes with LIKE clauses under a non-C locale, > several custom operator classes exist" > > I don't understand "non-C locale". Does this mean lc_collate or also some > other lc_ setting ? lc == locale. There are several different locale settings but collation affects ordering. And Estonian is not C (obviously). > I need to optimize queries with variable number of characters in beginning > like > > SELECT ... WHERE substring( col1 for 1 ) = 'f' > SELECT ... WHERE substring( col1 for 2 ) = 'fo' > SELECT ... WHERE substring( col1 for 3 ) = 'foo' > etc If you use queries like: SELECT ... WHERE col1 LIKE 'fo%' it can use an index declared like: CREATE INDEX myindex on mytable(col1 text_pattern_ops); > In my current dbms, Microsoft Visual FoxPro I have a single index > > CREATE INDEX i1 ON mytable(col1) > > I can use queries: > > WHERE col1 BETWEEN 'f' and 'f'+CHR(255) > WHERE col1 BETWEEN 'fo' and 'fo'+CHR(255) > WHERE col1 BETWEEN 'foo' and 'foo'+CHR(255) Well, you could do that in postgresql too, you just need to use the SQL standard concatination operator. WHERE col1 BETWEEN 'f' and 'f' || chr(255); > How to get same functionality in Postgres ? > Does there exist unicode special character which is greater than all other > chars ? Umm, I don't think so. Order is defined by the locale, not the character set. My guess is that text_pattern_ops is the way to go. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.6 (GNU/Linux) Comment: For info see http://www.gnupg.org iD8DBQFDcdb8IB7bNG8LQkwRArUfAJ9Lbn21OrBbioOJjNULA2 ATEysuEQCfXOSH CpkbbZqoTFDUnlIQRpJS0bA= =sWPr -----END PGP SIGNATURE----- |
| |||
| Martijn, >> I can use queries: >> >> WHERE col1 BETWEEN 'f' and 'f'+CHR(255) >Well, you could do that in postgresql too, you just need to use the SQL >standard concatination operator. >WHERE col1 BETWEEN 'f' and 'f' || chr(255); thank you. I think it is best to use regular indexes since regular indexes since they can be used in other types of queries also. It seems that only way is to use BETWEEN comparison for this in Postgres 8.1. I tried CREATE TABLE foo ( col1 CHAR(20)); CREATE INDEX i1 ON foo(col1); INSERT INTO foo VALUES ('bar'); SELECT * FROM foo WHERE col1 BETWEEN 'b' and 'b' || chr(255); But this does not return any data. How to write index optimizable WHERE clause when only some (variable number) of characters from beginning of col1 are known ? Only way seems to use BETWEEN comparison by concatenating character greater than all other characters in locale. Since CHR(255) does not work this is not possible. So CREATE INDEX i1 ON foo(col1); cannot be used to optimize queries of type "get all rows where first n charaters of col1 are known" in Postgres. Andrus. |
| |||
| Andrus wrote: > So > > CREATE INDEX i1 ON foo(col1); > > cannot be used to optimize queries of type "get all rows where first n > charaters of col1 are known" in Postgres. Of course it will! Any btree based index will let you do that. Re-read the previous answers and make sure you pay attention to the bit about text_pattern_ops and LIKE in non-C locales. -- Richard Huxton Archonet Ltd ---------------------------(end of broadcast)--------------------------- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq |
| |||
| On 11/9/05, Andrus <eetasoft@online.ee> wrote: > Martijn, > > >> I can use queries: > >> > >> WHERE col1 BETWEEN 'f' and 'f'+CHR(255) > > >Well, you could do that in postgresql too, you just need to use the SQL > >standard concatination operator. > > >WHERE col1 BETWEEN 'f' and 'f' || chr(255); > > thank you. > > I think it is best to use regular indexes since regular indexes since they > can be used in other types of queries also. > > It seems that only way is to use BETWEEN comparison for this in Postgres > 8.1. > > I tried > > CREATE TABLE foo ( col1 CHAR(20)); > CREATE INDEX i1 ON foo(col1); > INSERT INTO foo VALUES ('bar'); > SELECT * FROM foo WHERE col1 BETWEEN 'b' and 'b' || chr(255); > > But this does not return any data. > > How to write index optimizable WHERE clause when only some (variable number) > of characters from beginning of col1 are known ? > > Only way seems to use BETWEEN comparison by concatenating character greater > than all other characters in locale. Since CHR(255) does not work this is > not possible. > > So > > CREATE INDEX i1 ON foo(col1); > > cannot be used to optimize queries of type "get all rows where first n > charaters of col1 are known" in Postgres. > > Andrus. > > you can create two indexes: CREATE INDEX myindex_lc on mytable(col1 text_pattern_ops); and CREATE INDEX myindex_normal ON foo(col1); the first one will be used when using LIKE and the other for normal comparisons . -- regards, Jaime Casanova (DBA: DataBase Aniquilator ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| |||
| >> CREATE INDEX i1 ON foo(col1); >> >> cannot be used to optimize queries of type "get all rows where first n >> charaters of col1 are known" in Postgres. > > Of course it will! Any btree based index will let you do that. Re-read the > previous answers and make sure you pay attention to the bit about > text_pattern_ops and LIKE in non-C locales. Richard, thank you. I try to formulate my problem more presicely. I have table CREATE TABLE foo ( bar CHAR(10) PRIMARY KEY); Cluster locale is non-C. Database encoding is UTF-8. Postgres vers is 8.1 PRIMARY KEY clause creates btree based index so the index exists on bar. I want to run fast queries by knowing first characters of bar like : 1. Select records from foo where first character of bar is A 2. Select records from foo where first character of bar is B 3. Select records from foo where first two characters of bar are BC 4. Select records from foo where first three characters of bar are ABC etc. Can you write sample WHERE clause which can use existing primary key index for those queries ? Andrus. |
| |||
| > you can create two indexes: > > CREATE INDEX myindex_lc on mytable(col1 text_pattern_ops); > and > CREATE INDEX myindex_normal ON foo(col1); > > the first one will be used when using LIKE and the other for normal > comparisons . Jaime, CREATE INDEX myindex_normal ON foo(col1); Creates btree structure. In other dbm system btree structure can be used for searches where only some first characters in index key are known. So I see no reason to create second index using text_pattern_ops for this purpose. I'm searching a way to use Postgres regular index for this. Andrus. |
| ||||
| On Wed, 2005-11-09 at 14:56, Andrus wrote: > > you can create two indexes: > > > > CREATE INDEX myindex_lc on mytable(col1 text_pattern_ops); > > and > > CREATE INDEX myindex_normal ON foo(col1); > > > > the first one will be used when using LIKE and the other for normal > > comparisons . > > Jaime, > > CREATE INDEX myindex_normal ON foo(col1); > > Creates btree structure. In other dbm system btree structure can be used for > searches where only some first characters in index key are known. > > So I see no reason to create second index using text_pattern_ops for this > purpose. > > I'm searching a way to use Postgres regular index for this. Easy, do what those other databases do. Setup your database to not use a locale. initdb --locale=C and you're golden. ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |
| Thread Tools | |
| Display Modes | |
|
|