Jan Urbański wrote:
> I've been fooling around my GSoC project, and here's the first version
> I'm not actually ashamed of showing.

Oh, wow, at this speed you'll be done before the summer even starts ;-)

> There's one fundamental problem I came across while writing a typanalyze
> function for tsvectors.
> update_attstats() constructs an array that's later inserted into the
> appropriate stavaluesN for a given relation attribute. However, it
> assumes that the elements of that array will be of the same type as
> their corresponding attribute.

Yep, those stavalues fields are quite a hack...

> It is no longer true with the design that I planned to use. The
> typanalyze function for the tsvector type returns an array of
> most-frequent lexemes (cstrings actually) from the tsvectors, not an
> array of tsvectors. The question is: is this approach OK? Should
> typanalyze functions be able to communicate the type of their result to
> analyze_rel() ? I'm thinking of extending the VacAttrStats structure, so
> a typanalyze func could set the proper fields to the proper values.

Hmm. One idea is to store an array of tsvectors, with only one lexeme in
each tsvector.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

"Heikki Linnakangas" <heikki@enterprisedb.com> writes:
> Jan Urbański wrote:
>> It is no longer true with the design that I planned to use. The
>> typanalyze function for the tsvector type returns an array of
>> most-frequent lexemes (cstrings actually) from the tsvectors, not an
>> array of tsvectors. The question is: is this approach OK? Should
>> typanalyze functions be able to communicate the type of their result to
>> analyze_rel() ? I'm thinking of extending the VacAttrStats structure, so
>> a typanalyze func could set the proper fields to the proper values.

> Hmm. One idea is to store an array of tsvectors, with only one lexeme in
> each tsvector.

Jan's right: this is an oversight in the design of the VacAttrStats API.
The existing pg_statistics "slot" types all need an array of the same
datatype as the underlying column, but it's obvious when you think about
it that there could be kinds of statistics that need to be stored as an
array of some other type. I'm good with the idea of extending
VacAttrStats for the purpose.

(Whether it's actually a good idea to store the entries as cstrings is
another question...)

regards, tom lane

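The extension being discussed can be pictured with a small standalone sketch. It is not the real VacAttrStats definition: `DemoAttrStats`, `slot_elem_type`, the `demo_*` functions, the slot count, and the type identifiers below are all hypothetical stand-ins, chosen only to show each statistics slot carrying its own element type that defaults to the column's type and that a typanalyze-style routine may override.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins so the sketch compiles on its own; none of these are the
 * real PostgreSQL definitions or OID values. */
typedef uint32_t Oid;
#define DEMO_NUM_SLOTS    4     /* assumed slot count, illustration only */
#define DEMO_TSVECTOR_OID 1001  /* made-up identifier for the column type */
#define DEMO_TEXT_OID     1002  /* made-up identifier for a lexeme type */

/* Hypothetical per-attribute stats record with a per-slot element type. */
typedef struct DemoAttrStats
{
    Oid attrtypid;                       /* type of the analyzed column */
    Oid slot_elem_type[DEMO_NUM_SLOTS];  /* element type of each values array */
} DemoAttrStats;

/* Default every slot to the column's own type (the existing assumption). */
void
demo_init_stats(DemoAttrStats *stats, Oid attrtypid)
{
    stats->attrtypid = attrtypid;
    for (int i = 0; i < DEMO_NUM_SLOTS; i++)
        stats->slot_elem_type[i] = attrtypid;
}

/* A type-specific analyze routine overrides the element type of one slot. */
void
demo_set_slot_type(DemoAttrStats *stats, int slot, Oid elemtype)
{
    assert(slot >= 0 && slot < DEMO_NUM_SLOTS);
    stats->slot_elem_type[slot] = elemtype;
}

/* The code that builds the stored array asks the slot, not the column. */
Oid
demo_array_elem_type(const DemoAttrStats *stats, int slot)
{
    assert(slot >= 0 && slot < DEMO_NUM_SLOTS);
    return stats->slot_elem_type[slot];
}
```

With this shape, a tsvector analyze routine would call something like `demo_set_slot_type(&stats, 0, DEMO_TEXT_OID)` for the slot holding lexemes, while untouched slots keep the column's type.
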
Tom Lane wrote:
> Jan's right: this is an oversight in the design of the VacAttrStats API.
> The existing pg_statistics "slot" types all need an array of the same
> datatype as the underlying column, but it's obvious when you think about
> it that there could be kinds of statistics that need to be stored as an
> array of some other type. I'm good with the idea of extending
> VacAttrStats for the purpose.

Perhaps we would also want the ability to store the base element type
when the column is an array. So for a 1D int[] column, we would store a
1D array in pg_statistics instead of a 2D array. Modules like intagg may
find some use for that ability.

I point this out because it also suggests that, instead of inventing
"most common lexeme", we want to turn it into the more generic "most
common element" or something like that.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

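To make the "most common element" idea concrete, here is a minimal standalone sketch for int[] values. It is an illustration only, not PostgreSQL code: `DemoElemStats`, `demo_add_array`, `demo_most_common`, and the fixed capacity are all hypothetical, and it counts every occurrence with a simple linear scan rather than anything a real analyze routine would necessarily do. The point is that the result is a flat list of base elements with counts, not a list of whole arrays.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_DISTINCT 64    /* arbitrary capacity for the illustration */

typedef struct DemoElemCount
{
    int value;                  /* a base element seen in some array */
    int count;                  /* occurrences across all sampled arrays */
} DemoElemCount;

typedef struct DemoElemStats
{
    DemoElemCount items[DEMO_MAX_DISTINCT];
    size_t        nitems;       /* number of distinct elements tracked */
} DemoElemStats;

/* Fold one sampled int[] value into the per-element counts. */
void
demo_add_array(DemoElemStats *st, const int *elems, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        size_t j;

        /* Linear search for an existing entry for this element. */
        for (j = 0; j < st->nitems; j++)
            if (st->items[j].value == elems[i])
                break;

        /* First time this element is seen: append a new entry. */
        if (j == st->nitems)
        {
            assert(st->nitems < DEMO_MAX_DISTINCT);
            st->items[j].value = elems[i];
            st->items[j].count = 0;
            st->nitems++;
        }
        st->items[j].count++;
    }
}

/* Return the entry with the highest count, or NULL if nothing was added. */
const DemoElemCount *
demo_most_common(const DemoElemStats *st)
{
    const DemoElemCount *best = NULL;

    for (size_t j = 0; j < st->nitems; j++)
        if (best == NULL || st->items[j].count > best->count)
            best = &st->items[j];
    return best;
}
```

Feeding it the sampled values `{1,2,3}`, `{2,3}` and `{3}` leaves three tracked elements, with 3 as the most common. The same shape would apply to lexemes in a tsvector, which is what makes the generic name attractive.
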
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Tom Lane wrote:
>> Jan's right: this is an oversight in the design of the VacAttrStats API.

> Perhaps we would also want the ability to store the base element type
> when the column is an array.

Well, that would be up to the type-specific analyze routine to determine
what it wanted to do.

regards, tom lane