Unix Technical Forum

Poorly designed tsearch NOTICEs




04-15-2008, 10:28 PM
Tom Lane
 

regression=# SELECT plainto_tsquery('the any');
NOTICE: query contains only stopword(s) or doesn't contain lexeme(s), ignored
plainto_tsquery
-----------------

(1 row)

regression=# select ''::tsquery;
NOTICE: tsearch query doesn't contain lexeme(s): ""
tsquery
---------

(1 row)

IMHO, it's really bad design to have this sort of NOTICE emitted by
tsquery input. Even if an application uses numnode() or querytree() or
something similar to detect bogus queries, it's going to have its logs
cluttered with these notices.
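
To make the application-side check concrete, here is a minimal sketch in Python. It does not talk to PostgreSQL at all: the stopword set and the helpers lexemes() and is_all_stopwords() are hypothetical stand-ins for what an application would learn by calling numnode() on the server, and the real stopword list comes from the configured text search dictionary.

```python
# Toy stand-in for an application-side "is this query empty?" check.
# STOPWORDS is a hypothetical, tiny list used only for illustration;
# it is not PostgreSQL's stopword list, and no stemming is done.
STOPWORDS = {"the", "any", "a", "an", "of"}


def lexemes(query_text):
    """Split on whitespace, lowercase, and drop stopwords."""
    return [w for w in query_text.lower().split() if w not in STOPWORDS]


def is_all_stopwords(query_text):
    """True when nothing searchable is left in the query text."""
    return len(lexemes(query_text)) == 0


if __name__ == "__main__":
    for text in ("the any", "", "the quick fox"):
        print(repr(text), is_all_stopwords(text))
```

An application that performs a check of this kind already knows the query is bogus, which is why a NOTICE raised at tsquery input time gives it nothing except log noise.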

I could see having the @@ operator emit the notice if the query is
actually used for searching --- though I'm not quite sure how to get it
to come out only once per query ... maybe we could put it into the index
consistent() functions somehow?

regards, tom lane

04-15-2008, 10:35 PM
Tom Lane
 

Last month I complained:
> regression=# SELECT plainto_tsquery('the any');
> NOTICE: query contains only stopword(s) or doesn't contain lexeme(s), ignored
> plainto_tsquery
> -----------------
>
> (1 row)
>
> regression=# select ''::tsquery;
> NOTICE: tsearch query doesn't contain lexeme(s): ""
> tsquery
> ---------
>
> (1 row)
>
> IMHO, it's really bad design to have this sort of NOTICE emitted by
> tsquery input. Even if an application uses numnode() or querytree() or
> something similar to detect bogus queries, it's going to have its logs
> cluttered with these notices.
>
> I could see having the @@ operator emit the notice if the query is
> actually used for searching --- though I'm not quite sure how to get it
> to come out only once per query ... maybe we could put it into the index
> consistent() functions somehow?


I experimented with this and found out that it works all right for GIN
indexes, if the NOTICE is put into gin_extract_query(); that seems to be
called just once per GIN index search. However, the only possible place
to put it in GIST tsearch support would be in the consistent() routines,
and that's no good because those will be called once per entry on the
index's root page --- so you get multiple copies of the NOTICE.
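
The difference in call counts can be illustrated with a toy model in Python. This is not PostgreSQL code: gin_search(), gist_search(), count_notices() and the number of root-page entries are all hypothetical, and the only thing the sketch borrows from the description above is that the GIN hook runs once per search while the GiST consistent() hook runs once per root-page entry.

```python
# Toy model (not PostgreSQL code): count how often a NOTICE would fire
# depending on which index support hook emits it.

def gin_search(query_lexemes, notices):
    """Model of a GIN search: the query is extracted once per search."""
    # Stand-in for gin_extract_query(): runs exactly once.
    if not query_lexemes:
        notices.append("query contains only stopwords")
    return []


def gist_search(query_lexemes, root_entries, notices):
    """Model of a GiST search: consistent() runs per root-page entry."""
    matches = []
    for entry in root_entries:
        # Stand-in for consistent(): runs once for every entry visited.
        if not query_lexemes:
            notices.append("query contains only stopwords")
            continue
        if any(lex in entry for lex in query_lexemes):
            matches.append(entry)
    return matches


def count_notices(root_entry_count):
    """Return (GIN notices, GiST notices) for one empty-query search."""
    gin_notices, gist_notices = [], []
    gin_search([], gin_notices)
    root = [{"k%d" % i} for i in range(root_entry_count)]
    gist_search([], root, gist_notices)
    return len(gin_notices), len(gist_notices)


if __name__ == "__main__":
    print(count_notices(5))  # prints (1, 5)
```

In this model the GiST notice count simply tracks the number of root-page entries, which is the duplication problem described above.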

So it seems that the practical alternatives are:

1. Leave these notices where they are. Expect complaints from people
who would rather not have their logs cluttered with 'em.

2. Remove the notices altogether. Expect complaints from people who
get no matches on queries that they don't realize are all-stopwords.

3. Remove the notices from the input routines, and put one into
gin_extract_query only. We'll still get complaints as in #2, but
only from people using GIST indexes or no index at all for searching.

None of these are really terribly attractive, but I'm kinda leaning
to #2 myself. I'm not convinced that it's the province of the DB to be
issuing messages like this. In a lot of common scenarios, NOTICEs
aren't going to be seen by the actual person entering the query anyway,
because there are layers of software between him and the DB. All they
will accomplish is to bloat some logs somewhere.

Comments?

regards, tom lane

04-15-2008, 10:35 PM
Robert Treat
 

On Tuesday 27 November 2007 19:03, Tom Lane wrote:
> Last month I complained:
> > regression=# SELECT plainto_tsquery('the any');
> > NOTICE: query contains only stopword(s) or doesn't contain lexeme(s), ignored
> > plainto_tsquery
> > -----------------
> >
> > (1 row)
> >
> > regression=# select ''::tsquery;
> > NOTICE: tsearch query doesn't contain lexeme(s): ""
> > tsquery
> > ---------
> >
> > (1 row)
> >
> > IMHO, it's really bad design to have this sort of NOTICE emitted by
> > tsquery input. Even if an application uses numnode() or querytree() or
> > something similar to detect bogus queries, it's going to have its logs
> > cluttered with these notices.
> >
> > I could see having the @@ operator emit the notice if the query is
> > actually used for searching --- though I'm not quite sure how to get it
> > to come out only once per query ... maybe we could put it into the index
> > consistent() functions somehow?
>
> I experimented with this and found out that it works all right for GIN
> indexes, if the NOTICE is put into gin_extract_query(); that seems to be
> called just once per GIN index search. However, the only possible place
> to put it in GIST tsearch support would be in the consistent() routines,
> and that's no good because those will be called once per entry on the
> index's root page --- so you get multiple copies of the NOTICE.
>
> So it seems that the practical alternatives are:
>
> 1. Leave these notices where they are. Expect complaints from people
> who would rather not have their logs cluttered with 'em.
>
> 2. Remove the notices altogether. Expect complaints from people who
> get no matches on queries that they don't realize are all-stopwords.
>
> 3. Remove the notices from the input routines, and put one into
> gin_extract_query only. We'll still get complaints as in #2, but
> only from people using GIST indexes or no index at all for searching.
>
> None of these are really terribly attractive, but I'm kinda leaning
> to #2 myself. I'm not convinced that it's the province of the DB to be
> issuing messages like this. In a lot of common scenarios, NOTICEs
> aren't going to be seen by the actual person entering the query anyway,
> because there are layers of software between him and the DB. All they
> will accomplish is to bloat some logs somewhere.
>
> Comments?


I would lean toward #1 since it seems to be closest to the behavior from
previous releases.

--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
