Unix Technical Forum

#1
05-10-2008, 02:05 PM
Heikki Linnakangas
Re: gsoc08, text search selectivity, pg_statistics holding an array of a different type

Jan Urbański wrote:
> I've been fooling around with my GSoC project, and here's the first version
> I'm not actually ashamed of showing.


Oh, wow, at this speed you'll be done before the summer even starts ;-)

> There's one fundamental problem I came across while writing a typanalyze
> function for tsvectors.
> update_attstats() constructs an array that's later inserted into the
> appropriate stavaluesN for a given relation attribute. However, it
> assumes that the elements of that array will be of the same type as
> their corresponding attribute.


Yep, those stavalues fields are quite a hack...

> It is no longer true with the design that I planned to use. The
> typanalyze function for the tsvector type returns an array of
> most-frequent lexemes (cstrings actually) from the tsvectors, not an
> array of tsvectors. The question is: is this approach OK? Should
> typanalyze functions be able to communicate the type of their result to
> analyze_rel() ? I'm thinking of extending the VacAttrStats structure, so
> a typanalyze func could set the proper fields to the proper values.


Hmm. One idea is to store an array of tsvectors, with only one lexeme in
each tsvector.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2
05-10-2008, 02:05 PM
Tom Lane
Re: gsoc08, text search selectivity, pg_statistics holding an array of a different type

"Heikki Linnakangas" <heikki@enterprisedb.com> writes:
> Jan Urbański wrote:
>> It is no longer true with the design that I planned to use. The
>> typanalyze function for the tsvector type returns an array of
>> most-frequent lexemes (cstrings actually) from the tsvectors, not an
>> array of tsvectors. The question is: is this approach OK? Should
>> typanalyze functions be able to communicate the type of their result to
>> analyze_rel() ? I'm thinking of extending the VacAttrStats structure, so
>> a typanalyze func could set the proper fields to the proper values.


> Hmm. One idea is to store an array of tsvectors, with only one lexeme in
> each tsvector.


Jan's right: this is an oversight in the design of the VacAttrStats API.
The existing pg_statistics "slot" types all need an array of the same
datatype as the underlying column, but it's obvious when you think about
it that there could be kinds of statistics that need to be stored as an
array of some other type. I'm good with the idea of extending
VacAttrStats for the purpose.
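
To make the idea concrete, here is a minimal sketch of what such an extension might look like. It is a toy model, not the real VacAttrStats: the struct, the field `slot_elem_type`, the functions, and the type identifier constants are all hypothetical names made up for illustration. The only point is that each statistics slot can carry its own element type and falls back to the column's type when the analyze routine leaves it unset.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for PostgreSQL's Oid machinery; all values are illustrative. */
typedef unsigned int Oid;
#define TOY_INVALID_OID   ((Oid) 0)
#define TOY_TSVECTOR_TYPE ((Oid) 3614)  /* illustrative id for the column type */
#define TOY_TEXT_TYPE     ((Oid) 25)    /* illustrative id for the lexeme type */
#define TOY_NUM_SLOTS     4             /* hypothetical number of stavaluesN slots */

/* Toy model of the per-attribute analysis state, NOT the real VacAttrStats. */
typedef struct ToyAttrStats
{
    Oid attrtypid;                      /* type of the column being analyzed */
    Oid slot_elem_type[TOY_NUM_SLOTS];  /* hypothetical: element type per slot */
} ToyAttrStats;

/* Start with every slot unset, meaning "same type as the column". */
void
toy_init_stats(ToyAttrStats *stats, Oid column_type)
{
    assert(stats != NULL);
    stats->attrtypid = column_type;
    for (int i = 0; i < TOY_NUM_SLOTS; i++)
        stats->slot_elem_type[i] = TOY_INVALID_OID;
}

/* Element type to use when building the array stored in a given slot. */
Oid
toy_slot_type(const ToyAttrStats *stats, int slot)
{
    assert(stats != NULL && slot >= 0 && slot < TOY_NUM_SLOTS);
    Oid t = stats->slot_elem_type[slot];

    return (t != TOY_INVALID_OID) ? t : stats->attrtypid;
}
```

With this shape, existing analyze routines would keep working unchanged, since an unset slot still resolves to the column's own type; only a routine that wants a different element type has to set the field.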

(Whether it's actually a good idea to store the entries as cstrings is
another question...)

regards, tom lane

#3
05-10-2008, 02:05 PM
Alvaro Herrera
Re: gsoc08, text search selectivity, pg_statistics holding an array of a different type

Tom Lane wrote:

> Jan's right: this is an oversight in the design of the VacAttrStats API.
> The existing pg_statistics "slot" types all need an array of the same
> datatype as the underlying column, but it's obvious when you think about
> it that there could be kinds of statistics that need to be stored as an
> array of some other type. I'm good with the idea of extending
> VacAttrStats for the purpose.


Perhaps we would also want the ability to store the base element type
when the column is an array. So for a 1D int[] column, we would store
a 1D array in pg_statistics instead of a 2D array. Modules like intagg
may find some use for that ability.

I point this out because it also suggests that, instead of inventing
"most common lexeme", we may want to turn it into the more generic
"most common element" or something like that.
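
As a rough illustration of a "most common element" statistic for a 1D int[] column, the sketch below counts element occurrences across a sample of rows and returns the most frequent one. The `IntArrayRow` type and `toy_most_common_element` function are hypothetical, and the quadratic counting loop is chosen only for brevity, not as a suggestion for how analyze code should do it.

```c
#include <assert.h>
#include <stddef.h>

/* One sampled row's array value: a pointer to its elements plus a length. */
typedef struct IntArrayRow
{
    const int *elems;
    size_t     n;
} IntArrayRow;

/*
 * Find the element with the most occurrences over all sampled rows.
 * Stores it in *result and returns its occurrence count; returns 0 and
 * leaves *result untouched if the sample holds no elements at all.
 * Ties go to the element encountered first.
 */
size_t
toy_most_common_element(const IntArrayRow *rows, size_t nrows, int *result)
{
    size_t best_count = 0;

    assert(rows != NULL && result != NULL);

    for (size_t r = 0; r < nrows; r++)
    {
        for (size_t i = 0; i < rows[r].n; i++)
        {
            int    candidate = rows[r].elems[i];
            size_t count = 0;

            /* Count every occurrence of the candidate in the whole sample. */
            for (size_t r2 = 0; r2 < nrows; r2++)
                for (size_t j = 0; j < rows[r2].n; j++)
                    if (rows[r2].elems[j] == candidate)
                        count++;

            if (count > best_count)
            {
                best_count = count;
                *result = candidate;
            }
        }
    }
    return best_count;
}
```

The stored statistic here is a plain int rather than an int[], which is the 1D-instead-of-2D point: the stored values have the column's base element type, not the column type itself.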

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#4
05-10-2008, 02:05 PM
Tom Lane
Re: gsoc08, text search selectivity, pg_statistics holding an array of a different type

Alvaro Herrera <alvherre@commandprompt.com> writes:
> Tom Lane wrote:
>> Jan's right: this is an oversight in the design of the VacAttrStats API.


> Perhaps we would also want the ability to store the base element type
> when the column is an array.


Well, that would be up to the type-specific analyze routine to determine
what it wanted to do.
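
A small sketch of that division of labor, under the assumption that each type can supply its own analyze callback: the caller only invokes the routine, and the routine decides which element type its statistics use. Everything here (`ToyStats`, `ToyAnalyzeFn`, the `toy_*` functions, and the type identifiers in the usage note) is hypothetical and not taken from the PostgreSQL sources.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int Oid;

/* Toy per-column analysis state, NOT PostgreSQL's VacAttrStats. */
typedef struct ToyStats
{
    Oid column_type;    /* type of the analyzed column */
    Oid element_type;   /* base element type, or 0 if the column is not an array */
    Oid stored_type;    /* element type the routine chose for its statistics */
} ToyStats;

/* Signature of a hypothetical type-specific analyze routine. */
typedef void (*ToyAnalyzeFn) (ToyStats *stats);

/* Default routine: statistics hold values of the column's own type. */
void
toy_std_analyze(ToyStats *stats)
{
    stats->stored_type = stats->column_type;
}

/* Array-aware routine: prefers the base element type when there is one. */
void
toy_array_analyze(ToyStats *stats)
{
    stats->stored_type = (stats->element_type != 0)
        ? stats->element_type
        : stats->column_type;
}

/* The caller stays generic: it runs the routine and uses whatever it chose. */
Oid
toy_run_analyze(ToyAnalyzeFn fn, ToyStats *stats)
{
    assert(fn != NULL && stats != NULL);
    fn(stats);
    return stats->stored_type;
}
```

For example, with a column described as `{1007, 23, 0}` (illustrative identifiers for an int[] column and its int element type), `toy_run_analyze(toy_std_analyze, &s)` yields 1007 while `toy_run_analyze(toy_array_analyze, &s)` yields 23.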

regards, tom lane

All times are GMT.

