This is a discussion on MySQL search with result scoring / relevance within the MySQL forums, part of the Database Server Software category; --> Hi, Suppose i have the following sql table: my_table (col_1, col_2, col_3, col_4) and the folowing user input for ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Hi, Suppose i have the following sql table: my_table (col_1, col_2, col_3, col_4) and the folowing user input for the search: col_1:value_1 col_2:value_2 value_3 The above input means the folowing: search col_1 for value_1, col_2 for value_2 and value_3 in all columns. Besides this i need some scoring (this is used for sorting the results by their relevance). A match in col_1 will be 4 points, col_2 will be 3 points, col_3 will be 2 poits and the last one 1 point. All the results will be sorted by computing this score. For example: matching in col_1 for value_1 (+4), matching in col_2 (+3) and col_4 (+1) for value_3 will mean a +8 score for a row. I need some ideas and if it is possible some web links that might help me or even code. I'm also open to any suggestions that can offer a solution to my problem. Regards, Marius. |
| |||
| Marius, I think something like this is what you want: DROP TABLE IF EXISTS `test`.`test`; CREATE TABLE `test`.`test` ( `TestID` int(11) NOT NULL auto_increment, `Col1` varchar(255) default NULL, `Col2` varchar(255) default NULL, `Col3` varchar(255) default NULL, `Col4` varchar(255) default NULL, PRIMARY KEY (`TestID`) ) ENGINE=InnoDB ; insert Test (col1, col2, col3, col4) select 'mystring', '', '', ''; insert Test (col1, col2, col3, col4) select '', 'mystring', '', ''; insert Test (col1, col2, col3, col4) select '', '', 'mystring', ''; insert Test (col1, col2, col3, col4) select '', '', '', 'mystring'; insert Test (col1, col2, col3, col4) select 'mystring', 'mystring', 'mystring', 'mystring'; select Col1 , Col2 , Col3 , Col4 , case when ifnull(Col1,'') = 'mystring' then 4 else 0 end /* now lets add them all up */ + case when ifnull(Col2,'') = 'mystring' then 3 else 0 end + case when ifnull(Col3,'') = 'mystring' then 2 else 0 end + case when ifnull(Col4,'') = 'mystring' then 1 else 0 end MyRank from Test -- Bill <mareeus@gmail.com> wrote in message news:1174945884.429520.117870@n59g2000hsh.googlegr oups.com... > Hi, > > Suppose i have the following sql table: > my_table (col_1, col_2, col_3, col_4) > > and the folowing user input for the search: > col_1:value_1 > col_2:value_2 > value_3 > > The above input means the folowing: search col_1 for value_1, col_2 > for value_2 and value_3 in all columns. > > Besides this i need some scoring (this is used for sorting the results > by their relevance). > A match in col_1 will be 4 points, col_2 will be 3 points, col_3 will > be 2 poits and the last one 1 point. > > All the results will be sorted by computing this score. > For example: matching in col_1 for value_1 (+4), matching in col_2 > (+3) and col_4 (+1) for value_3 will mean a +8 score for a row. > > I need some ideas and if it is possible some web links that might help > me or even code. > I'm also open to any suggestions that can offer a solution to my > problem. > > Regards, > Marius. > |
| |||
| Thanks a lot Bill. Your example works very fine. select Col1 , Col2 , Col3 , Col4 , case when ifnull(Col1,'') = 'mystring' then 4 else 0 end /* now lets add them all up */ + case when ifnull(Col2,'') = 'mystring' then 3 else 0 end + case when ifnull(Col3,'') = 'mystring' then 2 else 0 end + case when ifnull(Col4,'') = 'mystring' then 1 else 0 end MyRank from Test Now i have another ouestion. Suppose the folowing user input: col_1:value_1 or (col_2:value_2 and col_3:value_3) or col_4:value_4 I need to adapt your example to work with "()", "AND", "OR", and in the case of one condition not meeting the boolean expression to exclude that row from my selection. |
| |||
| I've solved the 'or' and 'and' problem but now I'm facing a new one (a shameful syntax error). This is my query: select * , case when `col1` LIKE '%val1%' then 1 else 0 end Token1 , case when `col2` LIKE '%val2%' then 2 else 0 end Token2 , case when Token1 * Token2 > 0 then Token1 * Token2 else 0 end score from `table` HAVING score > 0 order by score DESC I get the following error: #1054 - Unknown column 'Token0' in 'field list' How can i make this work keeping Token1 and Token2 and using them to compute the score and order results by score |
| |||
| On Mar 28, 8:22 am, "mare...@gmail.com" <mare...@gmail.com> wrote: > I've solved the 'or' and 'and' problem but now I'm facing a new one (a > shameful syntax error). > This is my query: > > select * > , case when `col1` LIKE '%val1%' then 1 else 0 end Token1 > , case when `col2` LIKE '%val2%' then 2 else 0 end Token2 > , case when Token1 * Token2 > 0 then Token1 * Token2 else 0 end score > from `table` HAVING score > 0 order by score DESC > > I get the following error: > #1054 - Unknown column 'Token0' in 'field list' > > How can i make this work keeping Token1 and Token2 and using them to > compute the score and order results by score My last post attempt seemed to fail...lets try again... You cant reference aliases from within the field list. You have to re- state the formulas again. Heres my possible solution to it all... SELECT col1, col2, col3, col4, ((`col1` LIKE '%val1%') * 4) + ((`col2` LIKE '%val2%') * 3) + ((`col3` LIKE '%val3%') * 2) + ((`col4` LIKE '%val4%') * 1) score, (`col1` LIKE '%val1%') OR ((`col2` LIKE '%val2%') AND (`col3` LIKE '%val3%') ) OR (`col4` LIKE '%val4%') booleanMatch FROM test HAVING booleanMatch = 1 AND score > 0 order by score DESC |
| |||
| Thanks Tigger, Your solution is almost perfect but imagine a complex search string with 10 or 20 search values. Computing both score and booleanmatch won't be very optimized. They both mean almost the same thing. That's why i need some temporary values like Token1 and Token2 to store the values once computed and reduce the query complexity a little. So if anyone could help me to rewrite my query keeping Token1 and Token2, would be great. select * , case when `col1` LIKE '%val1%' then 1 else 0 end Token1 , case when `col2` LIKE '%val2%' then 2 else 0 end Token2 , case when Token1 * Token2 > 0 then Token1 * Token2 else 0 end score from `table` HAVING score > 0 order by score DESC #1054 - Unknown column 'Token0' in 'field list' Regards, Marius. |
| ||||
| On Mar 28, 11:44 pm, "mare...@gmail.com" <mare...@gmail.com> wrote: > Thanks Tigger, > > Your solution is almost perfect but imagine a complex search string > with 10 or 20 search values. Computing both score and booleanmatch > won't be very optimized. They both mean almost the same thing. That's > why i need some temporary values like Token1 and Token2 to store the > values once computed and reduce the query complexity a little. So if > anyone could help me to rewrite my query keeping Token1 and Token2, > would be great. > > select * > , case when `col1` LIKE '%val1%' then 1 else 0 end Token1 > , case when `col2` LIKE '%val2%' then 2 else 0 end Token2 > , case when Token1 * Token2 > 0 then Token1 * Token2 else 0 end score > from `table` HAVING score > 0 order by score DESC > > #1054 - Unknown column 'Token0' in 'field list' > > Regards, > Marius. You could play around with the order of things. Like SELECT col1, col2, col3, col4 FROM test WHERE (`col1` LIKE '%val1%') OR ((`col2` LIKE '%val2%') AND (`col3` LIKE '%val3%') ) OR (`col4` LIKE '%val4%') ORDER BY (((`col1` LIKE '%val1%') * 4) + ((`col2` LIKE '%val2%') * 3) + ((`col3` LIKE '%val3%') * 2) + ((`col4` LIKE '%val4%') * 1)) DESC This would limit the 2nd calculations to only happen for rows that match the boolean expression. Another thing to look at is doing Full-Text Searches using the MATCH AGAINST syntax... http://dev.mysql.com/doc/refman/5.0/...xt-search.html This can provide cleverer results for your searches, and I remember reading once that repeating the same MATCH statement in the field and in a where clause will only cause the one calculation, so you don't have to worry about the same efficiency issue. |