04-18-2008, 11:10 AM
Andrew Dunstan
 
Re: [HACKERS] like/ilike improvements



Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>
>> do { (t)++; (tlen)--} while ((*(t) & 0xC0) == 0x80 && tlen > 0)
>>

>
> The while *must* test those two conditions in the other order.
> (Don't laugh --- we've had reproducible bugs before in which the backend
> dumped core because of running off the end of memory due to this type
> of mistake.)
>
>
>> In fact, I'm wondering if that might make the other UTF8 stuff redundant
>> - the whole point of what we're doing is to avoid expensive calls to
>> NextChar;
>>

>
> +1 I think. This test will be approximately the same expense as what
> the outer loop would otherwise be (tlen > 0 and *t != firstpat), and
> doing it this way removes an entire layer of intellectual complexity.
> Even though the code is hardly different, we are no longer dealing in
> misaligned pointers anywhere in the match algorithm.
>
>
>


OK, here is a patch that I think incorporates all the ideas discussed
(including part of Mark Mielke's suggestion about optimising %_). There
is now no special treatment of UTF8 other than its use of a faster
NextChar macro.
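
To illustrate the ordering point Tom raised above, here is a minimal sketch, not the patch itself. The helper names next_char_utf8 and count_chars_utf8 are hypothetical and are written as functions rather than the macro used in the backend. It assumes the caller guarantees tlen > 0 on entry. The length check comes first in the loop condition, so the byte at t is never read once tlen has reached zero.

```c
#include <stddef.h>

/*
 * Hypothetical helper (illustration only): advance *t past one UTF-8
 * character, i.e. one lead byte plus any continuation bytes (10xxxxxx).
 * Precondition: *tlen > 0 on entry.
 */
static void
next_char_utf8(const char **t, size_t *tlen)
{
    do
    {
        (*t)++;
        (*tlen)--;
        /* Test the length first: **t is only read while *tlen > 0. */
    } while (*tlen > 0 && ((unsigned char) **t & 0xC0) == 0x80);
}

/*
 * Hypothetical usage: count the characters in a buffer of tlen bytes
 * by stepping through it one character at a time.
 */
static size_t
count_chars_utf8(const char *t, size_t tlen)
{
    size_t      n = 0;

    while (tlen > 0)
    {
        next_char_utf8(&t, &tlen);
        n++;
    }
    return n;
}
```

With the two conditions in the opposite order, the final iteration would dereference t one byte past the end of the buffer before noticing that tlen had run out.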

cheers

andrew

