This is a discussion on Re: [Fwd: Index Advisor] within the pgsql Hackers forums, part of the PostgreSQL category; --> Hi, Am 15.11.2006 um 15:34 schrieb Gurjeet Singh: > ===== > .) The SELECTs in the pg_advise are returning ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Hi, Am 15.11.2006 um 15:34 schrieb Gurjeet Singh: > ===== > .) The SELECTs in the pg_advise are returning wrong results, when > the same index is suggested twice, because of the SUM() aggregates. I don't think that this is a bug. If the same index is recommended for two different queries it will appear two times in pg_indexadvisor. So, if you want to calculate the overall benefit of this index, then you have to sum up the local benefits for each query. > .) I doubt that on a table t(a,b), for a suggestion of idx(b,a), > pg_advise will > suggest idx(a,b); ?? Not sure, if I understand you right. idx(b,a) and idx(a,b) are completely different indexes. Why should pg_advise suggest idx(a,b). But there is another bug: if there are recommendations like idx (a,b,c), idx(a,b) and idx(a) it would be a good idea to create just idx(a). I will add this to pg_advise as an optional feature. Best, Kai ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org |
| |||
| On 11/19/06, Kai-Uwe Sattler <kus@tu-ilmenau.de> wrote: > > Hi, > > .) The SELECTs in the pg_advise are returning wrong results, when > > the same index is suggested twice, because of the SUM() aggregates. > I don't think that this is a bug. If the same index is recommended > for two different queries it will appear two times in > pg_indexadvisor. So, if you want to calculate the overall benefit of > this index, then you have to sum up the local benefits for each query. If this is intended behaviour, then its okay. > .) I doubt that on a table t(a,b), for a suggestion of idx(b,a), > > pg_advise will > > suggest idx(a,b); > ?? Not sure, if I understand you right. idx(b,a) and idx(a,b) are > completely different indexes. Why should pg_advise suggest idx(a,b). I am referring to the way get_column_names() is coded. First, the SQL for the portal does not guarantee any order of the result; secondly, the 'for' loops that follow, will always output the columns in their increasing order of attribute number. Here's a small way to reproduce the bug, that I cooked up just now: Change the SQL in read_advisor_output() to: res = PQexec(conn, "DECLARE myportal CURSOR FOR " "SELECT relname," "int2vector_to_string(index_attrs) AS colids," "MAX(index_pages) AS size_in_pages," "SUM(profit) AS benefit," "SUM(profit)/MAX(index_pages) AS gain " "FROM pg_indexadvisor," "pg_class " "WHERE backend_pid = 0 " "AND rel_oid = pg_class.oid " "GROUP BY relname, colids " "ORDER BY gain DESC"); Notice the backend_pid = 0. Now insert the following into pg_indexadvisor: insert into pg_indexadvisor values( 1259, '2 1', 2, 1000, 20,0,0 ); This should prompt the advisor to generate the statement: create index advidx_1 on pg_class (relnamespace,relname); But instead, it will output this: create index advidx_1 on pg_class (relname,relnamespace); Now run the advisor with any workload, and inspect the output. $ pg_advise.exe -d postgres -h localhost -p 5432 -U gsk -o create_index.sql workload.sql We should tokenize the list of attribute numbers (column_ids variable) in get_column_names() and query them individually. But there is another bug: if there are recommendations like idx > (a,b,c), idx(a,b) and idx(a) it would be a good idea to create just > idx(a). I will add this to pg_advise as an optional feature. > I'd say it's a new feature request and not a bug why would you want to not build idx(a,b,c) in such a situation? idx(a,b,c) can be useful in places where idx(a,b) or idx(a) is required, but the same can't be said about idx(a) or idx(a,b) being useful where idx(a,b,c) is required! Best regards, -- gurjeet[.singh]@EnterpriseDB.com singh.gurjeet@{ gmail | hotmail | yahoo }.com |
| |||
| On 11/20/06, Gurjeet Singh <singh.gurjeet@gmail.com> wrote: > > We should tokenize the list of attribute numbers (column_ids variable) in > get_column_names() and query them individually. > Done and bug resolved. Regards, -- gurjeet[.singh]@EnterpriseDB.com singh.gurjeet@{ gmail | hotmail | yahoo }.com |
| ||||
| Am 20.11.2006 um 06:35 schrieb Gurjeet Singh: > But there is another bug: if there are recommendations like idx > (a,b,c), idx(a,b) and idx(a) it would be a good idea to create just > idx(a). I will add this to pg_advise as an optional feature. > > I'd say it's a new feature request and not a bug > understand why would you want to not build idx(a,b,c) in such a > situation? idx(a,b,c) can be useful in places where idx(a,b) or idx > (a) is required, but the same can't be said about idx(a) or idx > (a,b) being useful where idx(a,b,c) is required! > You are right - that's what I actually meant... Kai ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |