CVS HEAD dumps core on simple tsvector input example (pgsql Hackers forum, PostgreSQL category)

regression=# SELECT 'a very fat cat sat:4 on:5 a:6 mat:7'::tsvector;
                   tsvector
-----------------------------------------------
 'a' 'on':5 'cat' 'fat' 'mat':7 'sat':4 'very'
(1 row)

regression=# SELECT 'a very fat cat sat:4 on:5 a:6 mat:7'::tsvector;
server closed the connection unexpectedly

Notice it's the same input both times --- only the second one crashes.
The coredump happens inside repalloc, making me suspect a memory clobber
is involved.

BTW, why does the 'a':6 lexeme disappear?  To the extent that I
understand how this should work, I'd have expected 'a' and 'a':6
to merge into 'a':6 not plain 'a'.

			regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faq

Tom Lane wrote:
> BTW, why does the 'a':6 lexeme disappear?  To the extent that I
> understand how this should work, I'd have expected 'a' and 'a':6
> to merge into 'a':6 not plain 'a'.

'a':1,6 perhaps?

-- 
Alvaro Herrera    http://www.amazon.com/gp/registry/DXLWNGRJD34J
We take risks not to escape from life, but to prevent life escaping from us.

Alvaro Herrera <alvherre@commandprompt.com> writes:
> Tom Lane wrote:
>> BTW, why does the 'a':6 lexeme disappear?  To the extent that I
>> understand how this should work, I'd have expected 'a' and 'a':6
>> to merge into 'a':6 not plain 'a'.

> 'a':1,6 perhaps?

No, it would be inappropriate to add a '1' that wasn't specified.

My reasoning is that 'a':1 and 'a':6 are distinct bits of information,
hence their combination is 'a':1,6.  But 'a' doesn't give any more
information than 'a':6 so it should be dropped by the
duplicate-elimination code.  It's not clear to me whether that's what
Oleg and Teodor think, though.

Hm, just found a variant of the bug:

regression=# select 'a a:6'::tsvector;
 tsvector
----------
 'a'
(1 row)

regression=# select 'a a:6'::tsvector;
                                 tsvector
--------------------------------------------------------------------------
 'a':6,16255C,0,12C,8C,2640,0,512,0,312,12C,400C,0, 312,0,1,0,0,8448C,21,6
(1 row)

This makes it look even more like a memory-corruption issue.

			regards, tom lane
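
The merge rule Tom argues for above can be sketched in a few lines of Python. This is only an illustration of the proposed semantics, not PostgreSQL's tsvector input code: the function names (parse_token, merge_lexemes, format_lexemes) are hypothetical, weight suffixes such as "12C" are not handled, and the output order (shorter lexemes first, then alphabetical) is an assumption inferred from the psql output quoted in the thread.

```python
def parse_token(token):
    """Split 'word' or 'word:1,6' into (word, set of integer positions).

    Hypothetical helper: weights (e.g. '6A') and quoting are not handled.
    """
    word, _, pos_part = token.partition(":")
    positions = {int(p) for p in pos_part.split(",")} if pos_part else set()
    return word, positions


def merge_lexemes(text):
    """Merge duplicate lexemes by taking the union of their positions.

    A bare 'a' contributes no positions, so 'a' plus 'a:6' yields {6},
    while 'a:1' plus 'a:6' yields {1, 6}. No position is ever invented.
    """
    merged = {}
    for token in text.split():
        word, positions = parse_token(token)
        merged.setdefault(word, set()).update(positions)
    return merged


def format_lexemes(merged):
    """Render the merged lexemes in a tsvector-like text form.

    Assumed ordering: by length, then alphabetically, matching the
    output shown in the thread.
    """
    parts = []
    for word in sorted(merged, key=lambda w: (len(w), w)):
        positions = sorted(merged[word])
        if positions:
            parts.append("'%s':%s" % (word, ",".join(str(p) for p in positions)))
        else:
            parts.append("'%s'" % word)
    return " ".join(parts)


if __name__ == "__main__":
    print(format_lexemes(merge_lexemes("a a:6")))
    print(format_lexemes(merge_lexemes("a:1 a:6")))
```

Under this rule the input 'a a:6' comes out as 'a':6 rather than plain 'a', and 'a:1 a:6' comes out as 'a':1,6, which is the behaviour Tom says he would have expected.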