vBulletin Search Engine Optimization
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Hi, I've written a small program to return a random word in a database dependent on an id. The ids in the table are 0..n, consecutive and non-repeating (I've double-checked). To implement this, I tried SELECT word FROM nouns WHERE id=FLOOR(RAND()*{table row count}); I see that there's a bug in this in any case, since RAND() returns a number 0 >= n >= 1. If it returns 1, which it is bound to do at some point, it'll return the number of rows, which doesn't exist as an id in the table. This is a small problem and isn't the subject of this post. The problem is that executing the statement often returns an unexpected number of records. Each id only appears once in the table, so I'd think that only one record would be returned for each call. Instead, most often between 0..2 rows are returned, and sometimes more. I'd be interested if someone could give me insight into this behavior, as it seems counter-intuitive to me. Thanks, Walter Gildersleeve Freiburg, Germany P.S. Here's the table definition: word varchar(255) id smallint(6) PRI 0 (indexed) The table has 27,166 records. I tried this on "Ver 14.7 Distrib 4.1.16, for Win32 (ia32)" and "Ver 14.7 Distrib 4.1.20, for redhat-linux-gnu (i686) using readline 4.3". Here's a page that demonstrates the problem: http://devhed.com/perform_query.plx ---------------------------- To e-mail me directly, simply reverse the characters in the e-mail left of the at sign. |
| |||
| Walter G. wrote: > Hi, > > I've written a small program to return a random word in a database > dependent on an id. The ids in the table are 0..n, consecutive and > non-repeating (I've double-checked). To implement this, I tried > > SELECT word FROM nouns WHERE id=FLOOR(RAND()*{table row count}); > ... > The problem is that executing the statement often returns an unexpected > number of records. ... I think the short answer is that the WHERE expression, hence RAND(), is being independently evaluated for each row. Try: SELECT word FROM nouns ORDER BY RAND() LIMIT 1; > > Thanks, > > Walter Gildersleeve > Freiburg, Germany > > P.S. Here's the table definition: > > word varchar(255) > id smallint(6) PRI 0 (indexed) > > The table has 27,166 records. I tried this on "Ver 14.7 Distrib > 4.1.16, for Win32 (ia32)" and "Ver 14.7 Distrib 4.1.20, for > redhat-linux-gnu (i686) using readline 4.3". > > Here's a page that demonstrates the problem: > > http://devhed.com/perform_query.plx > > ---------------------------- > To e-mail me directly, simply reverse the characters in the e-mail left > of the at sign. |
| |||
| Okay, that makes sense. Of course the LIMIT won't work the way I need it, since often I get no results (not to mention the performance hit I'm taking of having to create 20K rands every time I do a simple search). Oh well. Thanks, Walter toby wrote: > I think the short answer is that the WHERE expression, hence RAND(), is > being independently evaluated for each row. > > Try: > SELECT word FROM nouns ORDER BY RAND() LIMIT 1; > |
| |||
| Walter G. wrote: > Okay, that makes sense. Of course the LIMIT won't work the way I need > it, since often I get no results (not to mention the performance hit > I'm taking of having to create 20K rands every time I do a simple > search). Oh well. > > Thanks, > > Walter > > toby wrote: > > > I think the short answer is that the WHERE expression, hence RAND(), is > > being independently evaluated for each row. > > > > Try: > > SELECT word FROM nouns ORDER BY RAND() LIMIT 1; On my 5.0.28 this *always* returns a single row, which is what you want? (But I think ORDER BY RAND() may not work as expected in some older versions. What version are you using?) For speed, have you tried: SELECT COUNT(*) FROM nouns INTO @cnt; -- or keep external count SELECT FLOOR(RAND()*@cnt) INTO @id; -- or generate externally SELECT word FROM nouns WHERE id=@id; |
| |||
| toby wrote: > Walter G. wrote: > > Okay, that makes sense. Of course the LIMIT won't work the way I need > > it, since often I get no results (not to mention the performance hit > > I'm taking of having to create 20K rands every time I do a simple > > search). Oh well. > > > > Thanks, > > > > Walter > > > > toby wrote: > > > > > I think the short answer is that the WHERE expression, hence RAND(), is > > > being independently evaluated for each row. > > > > > > Try: > > > SELECT word FROM nouns ORDER BY RAND() LIMIT 1; > > On my 5.0.28 this *always* returns a single row, which is what you > want? (But I think ORDER BY RAND() may not work as expected in some > older versions. What version are you using?) Oops, I see, you said 4.1.16. According to 4.1 manual, the above should work, returning one row every time. The manual also explains why WHERE RAND() fails: http://dev.mysql.com/doc/refman/4.1/...html#id2978318 > > For speed, have you tried: > > SELECT COUNT(*) FROM nouns INTO @cnt; -- or keep external count > SELECT FLOOR(RAND()*@cnt) INTO @id; -- or generate externally > SELECT word FROM nouns WHERE id=@id; |
| ||||
| toby wrote: > Oops, I see, you said 4.1.16. According to 4.1 manual, the above should > work, returning one row every time. The manual also explains why WHERE > RAND() fails: > http://dev.mysql.com/doc/refman/4.1/...html#id2978318 I had seen that--my mistake was in misunderstanding how often the "where" clause is reevaluated. I assumed it was evaluated once per search, not once per row. > > For speed, have you tried: > > > > SELECT COUNT(*) FROM nouns INTO @cnt; -- or keep external count > > SELECT FLOOR(RAND()*@cnt) INTO @id; -- or generate externally > > SELECT word FROM nouns WHERE id=@id; I've decided to produce the random indexes externally--that seems like the fastest method and is easier than the multiple select statements necessary to get MySQL to do it for me. Thanks, Walter |