Unix Technical Forum

[PATCH] add CLUSTER table ORDER BY index

This is a discussion on [PATCH] add CLUSTER table ORDER BY index within the Pgsql Patches forums, part of the PostgreSQL category; --> The following table add's a new variant of the CLUSTER command. The old variants are preserved, as suggested in ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > Pgsql Patches

Register FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-18-2008, 09:47 AM
Holger Schurig
 
Posts: n/a
Default [PATCH] add CLUSTER table ORDER BY index

The following table add's a new variant of the CLUSTER command. The old
variants are preserved, as suggested in the TODO entry.

Things I changed:

* The grammar
* psql help text
* psql tab-completion, it favours now CLUSTER table ORDER BY index"
* two uses of CLUSTER in the regression, so that both the old and
new syntax get checked

Things to consider:

* not yet in the documentation
* psql should probably no longer emit "CLUSTER index ON table"


Index: src/doc/TODO
================================================== =================
*** src.orig/doc/TODO 2007-03-27 21:18:01.000000000 +0200
--- src/doc/TODO 2007-03-27 21:18:26.000000000 +0200
***************
*** 624,631 ****

o %Add VERBOSE option to report tables as they are processed,
like VACUUM VERBOSE
- o Add more logical syntax CLUSTER table ORDER BY index;
- support current syntax for backward compatibility


* COPY
--- 624,629 ----
Index: src/doc/src/FAQ/TODO.html
================================================== =================
*** src.orig/doc/src/FAQ/TODO.html 2007-03-27 21:18:01.000000000 +0200
--- src/doc/src/FAQ/TODO.html 2007-03-27 21:18:26.000000000 +0200
***************
*** 558,565 ****
</p>
</li><li>%Add VERBOSE option to report tables as they are processed,
like VACUUM VERBOSE
- </li><li>Add more logical syntax CLUSTER table ORDER BY index;
- support current syntax for backward compatibility
</li></ul>
</li><li>COPY
<ul>
--- 558,563 ----
Index: src/src/backend/parser/gram.y
================================================== =================
*** src.orig/src/backend/parser/gram.y 2007-03-27 21:18:01.000000000 +0200
--- src/src/backend/parser/gram.y 2007-03-27 21:18:26.000000000 +0200
***************
*** 209,215 ****

%type <str> relation_name copy_file_name
database_name access_method_clause access_method attr_name
! index_name name file_name

%type <list> func_name handler_name qual_Op qual_all_Op subquery_Op
opt_class opt_validator
--- 209,215 ----

%type <str> relation_name copy_file_name
database_name access_method_clause access_method attr_name
! index_name name file_name opt_cluster_order_by

%type <list> func_name handler_name qual_Op qual_all_Op subquery_Op
opt_class opt_validator
***************
*** 5327,5332 ****
--- 5327,5333 ----
*
* QUERY:
* cluster <index_name> on <qualified_name>
+ * cluster <qualified_name> ORDER BY <index_name>
* cluster <qualified_name>
* cluster
*
***************
*** 5340,5350 ****
n->indexname = $2;
$$ = (Node*)n;
}
! | CLUSTER qualified_name
{
ClusterStmt *n = makeNode(ClusterStmt);
n->relation = $2;
! n->indexname = NULL;
$$ = (Node*)n;
}
| CLUSTER
--- 5341,5351 ----
n->indexname = $2;
$$ = (Node*)n;
}
! | CLUSTER qualified_name opt_cluster_order_by
{
ClusterStmt *n = makeNode(ClusterStmt);
n->relation = $2;
! n->indexname = $3;
$$ = (Node*)n;
}
| CLUSTER
***************
*** 5356,5361 ****
--- 5357,5368 ----
}
;

+ opt_cluster_order_by:
+ ORDER BY index_name { $$ = $3; }
+ | /*EMPTY*/ { $$ = NULL; }
+ ;
+
+
/************************************************** ***************************
*
* QUERY:
Index: src/src/bin/psql/sql_help.h
================================================== =================
*** src.orig/src/bin/psql/sql_help.h 2007-03-27 21:18:01.000000000 +0200
--- src/src/bin/psql/sql_help.h 2007-03-27 21:18:26.000000000 +0200
***************
*** 119,125 ****

{ "CLUSTER",
N_("cluster a table according to an index"),
! N_("CLUSTER indexname ON tablename\nCLUSTER tablename\nCLUSTER") },

{ "COMMENT",
N_("define or change the comment of an object"),
--- 119,125 ----

{ "CLUSTER",
N_("cluster a table according to an index"),
! N_("CLUSTER indexname ON tablename\nCLUSTER tablename [ORDER BY indexname]\nCLUSTER") },

{ "COMMENT",
N_("define or change the comment of an object"),
Index: src/src/bin/psql/tab-complete.c
================================================== =================
*** src.orig/src/bin/psql/tab-complete.c 2007-03-27 21:18:01.000000000 +0200
--- src/src/bin/psql/tab-complete.c 2007-03-27 21:18:26.000000000 +0200
***************
*** 822,832 ****

COMPLETE_WITH_LIST(list_COLUMNALTER);
}
! else if (pg_strcasecmp(prev3_wd, "TABLE") == 0 &&
! pg_strcasecmp(prev_wd, "CLUSTER") == 0)
COMPLETE_WITH_CONST("ON");
else if (pg_strcasecmp(prev4_wd, "TABLE") == 0 &&
- pg_strcasecmp(prev2_wd, "CLUSTER") == 0 &&
pg_strcasecmp(prev_wd, "ON") == 0)
{
completion_info_charp = prev3_wd;
--- 822,830 ----

COMPLETE_WITH_LIST(list_COLUMNALTER);
}
! else if (pg_strcasecmp(prev3_wd, "TABLE") == 0)
COMPLETE_WITH_CONST("ON");
else if (pg_strcasecmp(prev4_wd, "TABLE") == 0 &&
pg_strcasecmp(prev_wd, "ON") == 0)
{
completion_info_charp = prev3_wd;
***************
*** 929,952 ****

/*
* If the previous word is CLUSTER and not without produce list of
! * indexes.
*/
else if (pg_strcasecmp(prev_wd, "CLUSTER") == 0 &&
pg_strcasecmp(prev2_wd, "WITHOUT") != 0)
! COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_index es, NULL);
! /* If we have CLUSTER <sth>, then add "ON" */
else if (pg_strcasecmp(prev2_wd, "CLUSTER") == 0 &&
! pg_strcasecmp(prev_wd, "ON") != 0)
! COMPLETE_WITH_CONST("ON");

/*
! * If we have CLUSTER <sth> ON, then add the correct tablename as well.
*/
! else if (pg_strcasecmp(prev3_wd, "CLUSTER") == 0 &&
! pg_strcasecmp(prev_wd, "ON") == 0)
{
! completion_info_charp = prev2_wd;
! COMPLETE_WITH_QUERY(Query_for_table_owning_index);
}

/* COMMENT */
--- 927,952 ----

/*
* If the previous word is CLUSTER and not without produce list of
! * tables
*/
else if (pg_strcasecmp(prev_wd, "CLUSTER") == 0 &&
pg_strcasecmp(prev2_wd, "WITHOUT") != 0)
! COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_table s, NULL);
! /* If we have CLUSTER <sth>, then add "ORDER BY" */
else if (pg_strcasecmp(prev2_wd, "CLUSTER") == 0 &&
! pg_strcasecmp(prev_wd, "ON") != 0) {
! COMPLETE_WITH_CONST("ORDER BY");
! }

/*
! * If we have CLUSTER <sth> ORDER BY, then add the index as well.
*/
! else if (pg_strcasecmp(prev4_wd, "CLUSTER") == 0 &&
! pg_strcasecmp(prev_wd, "BY") == 0 &&
! pg_strcasecmp(prev2_wd, "ORDER") == 0)
{
! completion_info_charp = prev3_wd;
! COMPLETE_WITH_QUERY(Query_for_index_of_table);
}

/* COMMENT */
Index: src/src/test/regress/expected/cluster.out
================================================== =================
*** src.orig/src/test/regress/expected/cluster.out 2007-03-27 21:18:01.000000000 +0200
--- src/src/test/regress/expected/cluster.out 2007-03-27 21:18:26.000000000 +0200
***************
*** 329,335 ****
CLUSTER clstr_2;
ERROR: there is no previously clustered index for table "clstr_2"
CLUSTER clstr_1_pkey ON clstr_1;
! CLUSTER clstr_2_pkey ON clstr_2;
SELECT * FROM clstr_1 UNION ALL
SELECT * FROM clstr_2 UNION ALL
SELECT * FROM clstr_3;
--- 329,335 ----
CLUSTER clstr_2;
ERROR: there is no previously clustered index for table "clstr_2"
CLUSTER clstr_1_pkey ON clstr_1;
! CLUSTER clstr_2 ORDER BY clstr_2_pkey;
SELECT * FROM clstr_1 UNION ALL
SELECT * FROM clstr_2 UNION ALL
SELECT * FROM clstr_3;
Index: src/src/test/regress/sql/cluster.sql
================================================== =================
*** src.orig/src/test/regress/sql/cluster.sql 2007-03-27 21:18:01.000000000 +0200
--- src/src/test/regress/sql/cluster.sql 2007-03-27 21:18:26.000000000 +0200
***************
*** 122,128 ****
CLUSTER clstr_2;

CLUSTER clstr_1_pkey ON clstr_1;
! CLUSTER clstr_2_pkey ON clstr_2;
SELECT * FROM clstr_1 UNION ALL
SELECT * FROM clstr_2 UNION ALL
SELECT * FROM clstr_3;
--- 122,128 ----
CLUSTER clstr_2;

CLUSTER clstr_1_pkey ON clstr_1;
! CLUSTER clstr_2 ORDER BY clstr_2_pkey;
SELECT * FROM clstr_1 UNION ALL
SELECT * FROM clstr_2 UNION ALL
SELECT * FROM clstr_3;

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-18-2008, 09:47 AM
Gregory Stark
 
Posts: n/a
Default Re: [PATCH] add CLUSTER table ORDER BY index

"Holger Schurig" <holgerschurig@gmx.de> writes:

> * psql tab-completion, it favours now CLUSTER table ORDER BY index"


It occurs to me (sorry that I didn't think of this earlier) that if we're
going to use "ORDER BY" it really ought to take a list columns. It would be
more consistent with what ORDER BY means in queries and one day we may want to
add support for ordering by arbitrary lists of columns even if no index
exists.

It may be reasonable to allow both to coexist and just have an arbitrary rule
that if an index of the specified name exists takes precedence over any
columns. I've never seen anyone name their indexes in a way that would
conflict with a column anyways.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com


---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-18-2008, 09:47 AM
Tom Lane
 
Posts: n/a
Default Re: [PATCH] add CLUSTER table ORDER BY index

Gregory Stark <stark@enterprisedb.com> writes:
> "Holger Schurig" <holgerschurig@gmx.de> writes:
>> * psql tab-completion, it favours now CLUSTER table ORDER BY index"


> It occurs to me (sorry that I didn't think of this earlier) that if we're
> going to use "ORDER BY" it really ought to take a list columns.


Surely you jest. The point is to be ordered the same as the index, no?

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-18-2008, 09:47 AM
Heikki Linnakangas
 
Posts: n/a
Default Re: [PATCH] add CLUSTER table ORDER BY index

Tom Lane wrote:
> Gregory Stark <stark@enterprisedb.com> writes:
>> "Holger Schurig" <holgerschurig@gmx.de> writes:
>>> * psql tab-completion, it favours now CLUSTER table ORDER BY index"

>
>> It occurs to me (sorry that I didn't think of this earlier) that if we're
>> going to use "ORDER BY" it really ought to take a list columns.

>
> Surely you jest. The point is to be ordered the same as the index, no?


There's some narrow corner cases where it makes sense to CLUSTER without
an index:

* You're going to build an index with the same order after clustering.
It's cheaper to sort the data first and then create index, than to build
index, sort data, and rebuild index.

* You're doing a lot of large sort + merge joins. Sorts are cheaper if
the data is already in order. One might ask, though, why don't you just
create an index then...

* You're using CLUSTER as a VACUUM FULL replacement, and there's no
handy index to sort with. (It'd be better if we had a VACUUM FULL that
rewrites the table like CLUSTER, though)

Though I don't think we're implementing "CLUSTER table ORDER BY col1,
col2" anytime soon, ORDER BY does imply that a list of columns is to
follow. How about "CLUSTER table USING index"?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 04-18-2008, 09:47 AM
Tom Lane
 
Posts: n/a
Default Re: [PATCH] add CLUSTER table ORDER BY index

Heikki Linnakangas <heikki@enterprisedb.com> writes:
> Though I don't think we're implementing "CLUSTER table ORDER BY col1,
> col2" anytime soon, ORDER BY does imply that a list of columns is to
> follow. How about "CLUSTER table USING index"?


+1 ... AFAIR there was 0 discussion of the exact syntax before,
so I don't feel wedded to ORDER BY.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 04-18-2008, 09:47 AM
Holger Schurig
 
Posts: n/a
Default Re: [PATCH] add CLUSTER table ORDER BY index

> +1 ... AFAIR there was 0 discussion of the exact syntax before,
> so I don't feel wedded to ORDER BY.


A changed patch comes with the next e-mail.

I can not create a patch for "CLUSTER table USING col1,col2,col3",
because I'm not yet deep into postgresql and don't have the time
for that. I just thought that the skill level needed for the TODO-
item was in my range :-)

---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 01:43 PM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com