vBulletin Search Engine Optimization
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Hi All, I have a table of roughly half a million records and am retrieving a list of distinct values for a column. The total number of unique values in this column happens to be 16. If I do a query of "select distinct my_column from my_table", the planner chooses a sort followed by a unique; this takes about 5 seconds to run. If I change this to "select my_column from my_table group by my_column" (which to me actually looks like I've forgotten to put in an aggregate somewhere...), the planner chooses a hash aggregate and this runs in 0.7 seconds. I remember reading somewhere that in older versions of pgsql a distinct clause always resulted in a sort and unique plan. However, I seem to recall reading that distinct may choose other methods in version 8.1 onwards (take this with a pinch of salt, as I can't find the page again...). Can anyone clarify this? It just seems slightly odd that distinct doesn't appear to take into account statistics (especially on a single column select). Regards, Jon. |