vBulletin Search Engine Optimization
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Hi all, I'm afraid I need some help with a string manipulation function in mysql. Suppose I have a column which contains a comma-separated set of unique strings e.g. a,aab,b,bc,de c,de,dd,dfr xo,ff,fr I would like to retrieve the nth element of a set in the same simple way that find_in_set allows me retrieve the index of an element within the list/set e.g. mysql> SELECT FIND_IN_SET('c','a,b,c,d'); +----------------------------+ | FIND_IN_SET('c','a,b,c,d') | +----------------------------+ | 3 | +----------------------------+ 1 row in set (0.00 sec) what I want is a "reverse find_in_set lookup" like mysql>SELECT GETELEMENT_AT_INDEX_IN_SET(3.'a,b,c,d'); .... answer would be +------------------------------------------------------+ | GETELEMENT_AT_INDEX_IN_SET(3.'a,b,c,d');| +-------------------------------------------------------+ | c | +----------------------------+ 1 row in set (0.00 sec) (thats just made up) I ' ve found away to achieve this by using various string function but I'm quite sure there must be a better and more effective and simpler way: mysql> SELECT MID('a,b,c,d', length(substring_index('a,b,c,d',",",3)), length(substring_index('a,b,c,d',",",4))- length(substring_index('a,b,c,d',",",3))-1 ) complicated; +-------------+ | complicated | +-------------+ | c | +-------------+ 1 row in set (0.00 sec) I know I could store that into a function and have my own function to accomplish this rather nicely but isn't there a built in function to achieve this? Any hints or tips or reference I didnt find? TIA Chris |
| |||
| cvh@LE wrote: > Hi all, > > I'm afraid I need some help with a string manipulation function in > mysql. > > Suppose I have a column which contains a comma-separated set of unique > strings > e.g. > > a,aab,b,bc,de > c,de,dd,dfr > xo,ff,fr > > I would like to retrieve the nth element of a set in the same simple > way that find_in_set allows me retrieve the index of an element within > the list/set > e.g. > mysql> SELECT FIND_IN_SET('c','a,b,c,d'); > +----------------------------+ > | FIND_IN_SET('c','a,b,c,d') | > +----------------------------+ > | 3 | > +----------------------------+ > 1 row in set (0.00 sec) > > what I want is a "reverse find_in_set lookup" > like > mysql>SELECT GETELEMENT_AT_INDEX_IN_SET(3.'a,b,c,d'); > ... > answer would be > +------------------------------------------------------+ > | GETELEMENT_AT_INDEX_IN_SET(3.'a,b,c,d');| > +-------------------------------------------------------+ > | c | > +----------------------------+ > 1 row in set (0.00 sec) > > (thats just made up) > > I ' ve found away to achieve this by using various string function but > I'm quite sure there must be a better and more effective and simpler > way: > > mysql> SELECT MID('a,b,c,d', > length(substring_index('a,b,c,d',",",3)), > length(substring_index('a,b,c,d',",",4))- > length(substring_index('a,b,c,d',",",3))-1 > ) complicated; > +-------------+ > | complicated | > +-------------+ > | c | > +-------------+ > 1 row in set (0.00 sec) > > I know I could store that into a function and have my own function to > accomplish this rather nicely but isn't there a built in function to > achieve this? > > Any hints or tips or reference I didnt find? > > TIA > > Chris > > > > Normalize your database. -- ================== Remove the "x" from my email address Jerry Stuckle JDS Computer Training Corp. jstucklex@attglobal.net ================== |
| |||
| On May 5, 3:17 pm, "cvh@LE" <christian.han...@cpi-service.com> wrote: > Hi all, > ... > Any hints or tips or reference I didnt find? RTFM would reveal SUBSTRING_INDEX(): http://dev.mysql.com/doc/refman/5.0/...ubstring-index > TIA > > Chris |
| |||
| On May 5, 7:13*pm, toby <t...@telegraphics.com.au> wrote: > On May 5, 3:17 pm, "cvh@LE" <christian.han...@cpi-service.com> wrote: > > > Hi all, > > ... > > Any hints or tips or reference I didnt find? > > RTFM would reveal SUBSTRING_INDEX():http://dev.mysql.com/doc/refman/5.0/....html#function... > > > TIA > > > Chris RTFM??!? If you had read what I 've written you would have realised that I am well aware of substring_index: >mysql> SELECT MID('a,b,c,d', > length(substring_index('a,b,c,d',",",3)), > length(substring_index('a,b,c,d',",",4))- >length(substring_index('a,b,c,d',",",3))-1 > ) complicated; It rather seems to me that you dont know what a SET is! So RTFM to you. If you dont have valueble input than just leave it to others to give advice. substring index is helpful function I do use to achieve the goal of returning a string As said , I solved the problem but am wondering if there does exist a built-in function like FIND_IN_SET which takes a index as parameter and not a string-element. I created my own function as follows : CREATE FUNCTION element_in_set (mystring varchar(4096),sep varchar(25), myindex int(4)) RETURNS varchar(4096) DETERMINISTIC BEGIN RETURN replace(substr(mystring,length(substring_index(mys tring,sep,myindex)) +1,length(substring_index(mystring,sep,myindex+1))- length(substring_index(mystring,sep,myindex))),sep ,""); END $$ mysql> -- Select the 5th (index=4) element from a space separated set mysql> select element_in_set('So this is a sample set separated by spaces'," ",4); +--------------------------------------------------------------------- + | element_in_set('So this is a sample set separated by spaces'," ",4) | +--------------------------------------------------------------------- + | sample | +--------------------------------------------------------------------- + 1 row in set (0.00 sec) mysql> -- finding the 7th ( index=6) element in aa comma separated set mysql> select element_in_set('this,is,a,comma,separated,set,with , 9,elements',",",6); +----------------------------------------------------------------------- + | element_in_set('this,is,a,comma,separated,set,with ,9,elements',",", 6) | +----------------------------------------------------------------------- + | with | +----------------------------------------------------------------------- + 1 row in set (0.00 sec) So is there a built-in function? @Jerry; Unfortunately this is not an option in this case. The database is to be taken as is, but I am afraid I cant provide more information on this. So just trust me on this. |
| |||
| cvh@LE wrote: > On May 5, 7:13 pm, toby <t...@telegraphics.com.au> wrote: >> On May 5, 3:17 pm, "cvh@LE" <christian.han...@cpi-service.com> wrote: >> >>> Hi all, >>> ... >>> Any hints or tips or reference I didnt find? >> RTFM would reveal SUBSTRING_INDEX():http://dev.mysql.com/doc/refman/5.0/....html#function... >> >>> TIA >>> Chris > > RTFM??!? If you had read what I 've written you would have realised > that I am well aware of substring_index: > >> mysql> SELECT MID('a,b,c,d', >> length(substring_index('a,b,c,d',",",3)), >> length(substring_index('a,b,c,d',",",4))- >> length(substring_index('a,b,c,d',",",3))-1 >> ) complicated; > > It rather seems to me that you dont know what a SET is! So RTFM to > you. If you dont have valueble input than just leave it to others to > give advice. > > substring index is helpful function I do use to achieve the goal of > returning a string > > As said , I solved the problem but am wondering if there does exist a > built-in function like FIND_IN_SET which takes a index as parameter > and not a string-element. I created my own function as follows : > > CREATE FUNCTION element_in_set > (mystring varchar(4096),sep varchar(25), myindex int(4)) > RETURNS varchar(4096) DETERMINISTIC > BEGIN > RETURN > replace(substr(mystring,length(substring_index(mys tring,sep,myindex)) > +1,length(substring_index(mystring,sep,myindex+1))- > length(substring_index(mystring,sep,myindex))),sep ,""); > END > $$ > > > > > mysql> -- Select the 5th (index=4) element from a space separated set > mysql> select element_in_set('So this is a sample set separated by > spaces'," ",4); > +--------------------------------------------------------------------- > + > | element_in_set('So this is a sample set separated by spaces'," ",4) > | > +--------------------------------------------------------------------- > + > | sample > | > +--------------------------------------------------------------------- > + > 1 row in set (0.00 sec) > > mysql> -- finding the 7th ( index=6) element in aa comma separated set > mysql> select element_in_set('this,is,a,comma,separated,set,with , > 9,elements',",",6); > +----------------------------------------------------------------------- > + > | element_in_set('this,is,a,comma,separated,set,with ,9,elements',",", > 6) | > +----------------------------------------------------------------------- > + > | > with > | > +----------------------------------------------------------------------- > + > 1 row in set (0.00 sec) > > > So is there a built-in function? > > @Jerry; Unfortunately this is not an option in this case. The database > is to be taken as is, but I am afraid I cant provide more information > on this. So just trust me on this. > Like I said, there is a way to do it. Normalize your database. Your table violates first normal form. -- ================== Remove the "x" from my email address Jerry Stuckle JDS Computer Training Corp. jstucklex@attglobal.net ================== |
| |||
| @Jerry : As said, I am neither the admin nor have I the privileges to normalize the db. Furthermore normalizing this column would create more overhead than performance gain could be won. @tobi: After reconsidering my function and rereading substring_index documentation I changed the function to CREATE FUNCTION element_in_set(mystring varchar(4096),sep varchar(25), myindex int(4)) RETURNS VARCHAR(4096) DETERMINISTIC BEGIN RETURN substring_index(substring_index(mystring,sep,myind ex),sep,-1); END So thanks for the hint. I nonetheless consider a "RTFM" and a oneliner rather rude, unhelpful and a totally unnecessary personal attack. Groups are for seeking and giving advice not for lecturing, downtalking or slagging so. off |
| |||
| cvh@LE wrote: > @Jerry : As said, I am neither the admin nor have I the privileges to > normalize the db. Furthermore normalizing this column would create > more overhead than performance gain could be won. > Not having the authority is one thing. Claiming it will create more overhead is another - and probably FALSE. Very seldom have I actually SEEN a degradation of performance when normalizing a database. And virtually never when normalizing a column with multiple values. That almost always improves performance across the board. And it definitely improves performance on this particular query. It was true that normalization hurt database performance - 25 years ago. But things have come a long ways since then, and RDBMS's typically do better with normalized data than not. Of course, there are always exceptions. But until you actually take performance benchmarks with your statements and your data, you can't say that. -- ================== Remove the "x" from my email address Jerry Stuckle JDS Computer Training Corp. jstucklex@attglobal.net ================== |
| |||
| On 6 Mai, 13:09, Jerry Stuckle <jstuck...@attglobal.net> wrote: > cvh@LE wrote: > > @Jerry : As said, I am neither the admin nor have I the privileges to > > normalize the db. Furthermore normalizing this column would create > > more overhead than performance gain could be won. > > Not having the authority is one thing. *Claiming it will create more > overhead is another - and probably FALSE. > > Very seldom have I actually SEEN a degradation of performance when > normalizing a database. *And virtually never when normalizing a column > with multiple values. *That almost always improves performance across > the board. *And it definitely improves performance on this particular query. > > It was true that normalization hurt database performance - 25 years ago. > * But things have come a long ways since then, and RDBMS's typically do > better with normalized data than not. > > Of course, there are always exceptions. *But until you actually take > performance benchmarks with your statements and your data, you can't say > that. > > -- > ================== > Remove the "x" from my email address > Jerry Stuckle > JDS Computer Training Corp. > jstuck...@attglobal.net > ================== @Jerry I absolutely agree with you on normalisation. And I, too, have seldomly (although I do have) experienced that normalization is a performance downer. However, the application-database which makes use of this column- containing-a-set problem is normalized already, except for this column which happens to contain data similar to comments. Comments mostly are formulated as sentences, which happens to be nothing more than an ordered space separated set. This column contains exactly the same. I doubt anybody would normalize a short one-sentence comment field in a table holding, let's say, additional remarks (3NF) given as part of responses (2NF) by respondents (1NF) in surveys. Anyway, the whole question of mine targeted a function allowing me to retrieve the nth element in a set (I agree, the word occurence was the wrong choice - but, hey, English isnt my native tongue). To illustrate the problem a bit: The column in question containing the set could itself be the result of an aggregate function : imagine : mysql> create table test1 (pk int(1),col1 varchar(20)); Query OK, 0 rows affected (0.00 sec) mysql> insert into test1(pk,col1) values (1,"a"),(2,"b"),(1,"c"), (2,"e"),(1,"f"),(2,"g"); Query OK, 6 rows affected (0.01 sec) Records: 6 Duplicates: 0 Warnings: 0 mysql> select * from test1; +------+------+ | pk | col1 | +------+------+ | 1 | a | | 2 | b | | 1 | c | | 2 | e | | 1 | f | | 2 | g | +------+------+ 6 rows in set (0.01 sec) mysql> select pk,group_concat(col1 separator ",") from test1 group by 1; +------+----------------------------------+ | pk | group_concat(col1 separator ",") | +------+----------------------------------+ | 1 | a,c,f | | 2 | b,e,g | +------+----------------------------------+ 2 rows in set (0.00 sec) mysql> select pk,metadaten.element_in_set(group_concat(col1 separator ","),",",2) 2ndElement from test1 group by 1; +------+------------+ | pk | 2ndElement | +------+------------+ | 1 | c | | 2 | e | +------+------------+ 2 rows in set (0.00 sec) mysql> |
| |||
| cvh@LE wrote: > On 6 Mai, 13:09, Jerry Stuckle <jstuck...@attglobal.net> wrote: >> cvh@LE wrote: >>> @Jerry : As said, I am neither the admin nor have I the privileges to >>> normalize the db. Furthermore normalizing this column would create >>> more overhead than performance gain could be won. >> Not having the authority is one thing. Claiming it will create more >> overhead is another - and probably FALSE. >> >> Very seldom have I actually SEEN a degradation of performance when >> normalizing a database. And virtually never when normalizing a column >> with multiple values. That almost always improves performance across >> the board. And it definitely improves performance on this particular query. >> >> It was true that normalization hurt database performance - 25 years ago. >> But things have come a long ways since then, and RDBMS's typically do >> better with normalized data than not. >> >> Of course, there are always exceptions. But until you actually take >> performance benchmarks with your statements and your data, you can't say >> that. >> >> -- >> ================== >> Remove the "x" from my email address >> Jerry Stuckle >> JDS Computer Training Corp. >> jstuck...@attglobal.net >> ================== > > @Jerry I absolutely agree with you on normalisation. And I, too, have > seldomly (although I do have) experienced that normalization is a > performance downer. > > However, the application-database which makes use of this column- > containing-a-set problem is normalized already, except for this column > which happens to contain data similar to comments. Comments mostly are > formulated as sentences, which happens to be nothing more than an > ordered space separated set. This column contains exactly the same. I > doubt anybody would normalize a short one-sentence comment field in a > table holding, let's say, additional remarks (3NF) given as part of > responses (2NF) by respondents (1NF) in surveys. > > Anyway, the whole question of mine targeted a function allowing me to > retrieve the nth element in a set (I agree, the word occurence was the > wrong choice - but, hey, English isnt my native tongue). > > To illustrate the problem a bit: > > The column in question containing the set could itself be the result > of an aggregate function : > > imagine : > mysql> create table test1 (pk int(1),col1 varchar(20)); > Query OK, 0 rows affected (0.00 sec) > > mysql> insert into test1(pk,col1) values (1,"a"),(2,"b"),(1,"c"), > (2,"e"),(1,"f"),(2,"g"); > Query OK, 6 rows affected (0.01 sec) > Records: 6 Duplicates: 0 Warnings: 0 > > mysql> select * from test1; > +------+------+ > | pk | col1 | > +------+------+ > | 1 | a | > | 2 | b | > | 1 | c | > | 2 | e | > | 1 | f | > | 2 | g | > +------+------+ > 6 rows in set (0.01 sec) > > mysql> select pk,group_concat(col1 separator ",") from test1 group by > 1; > +------+----------------------------------+ > | pk | group_concat(col1 separator ",") | > +------+----------------------------------+ > | 1 | a,c,f | > | 2 | b,e,g | > +------+----------------------------------+ > 2 rows in set (0.00 sec) > > mysql> select pk,metadaten.element_in_set(group_concat(col1 separator > ","),",",2) 2ndElement from test1 group by 1; > +------+------------+ > | pk | 2ndElement | > +------+------------+ > | 1 | c | > | 2 | e | > +------+------------+ > 2 rows in set (0.00 sec) > > mysql> > > > > Yes, I understand your design. But there are no functions to retrieve information form a set because the set itself violates 1NF. And if the database is normalized properly, there is no need. And yes, I would normalize such a field in such circumstances. -- ================== Remove the "x" from my email address Jerry Stuckle JDS Computer Training Corp. jstucklex@attglobal.net ================== |
| ||||
| On Wed, 07 May 2008 07:55:48 -0400, Jerry Stuckle wrote: > cvh@LE wrote: >> @Jerry I absolutely agree with you on normalisation. And I, too, have >> seldomly (although I do have) experienced that normalization is a >> performance downer. >> >> However, the application-database which makes use of this column- >> containing-a-set problem is normalized already, except for this column >> which happens to contain data similar to comments. Comments mostly are >> formulated as sentences, which happens to be nothing more than an >> ordered space separated set. This column contains exactly the same. I >> doubt anybody would normalize a short one-sentence comment field in a >> table holding, let's say, additional remarks (3NF) given as part of >> responses (2NF) by respondents (1NF) in surveys. >> > > Yes, I understand your design. But there are no functions to retrieve > information form a set because the set itself violates 1NF. And if the > database is normalized properly, there is no need. > > And yes, I would normalize such a field in such circumstances. To generalize a bit off of what I think Jerry's saying is that since your requirement to access parts of comments exists, then "comment" isn't a discrete element anymore. "Foo-part of comment" becomes then the discrete element and you must then define what each normal form looks like in light of that discrete element in order to decide whether it's normalized or not. For example, if all I care about is counts in a set of users' mailboxes, a single email is a discrete element, and tracking an ID for each email, and what user it belongs to is normalized enough for the purpose. If I need to know how many emails are in each conversation thread, and how many were sent only internally to this mail system, then it's NOT normalized, even though the data I have to work with hasn't changed at all. -- The Write Many, Read Never drive. For those people that don't know their system has a /dev/null already. -- Rik Steenwinkel, singing the praises of 8mm Exabytes |