vBulletin Search Engine Optimization
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Hi, we have a large table that contains a bitmask field: CREATE TABLE A ( ID NUMBER(15), ...... MASK NUMBER(19) ) The application then performs a query that includes BITAND(MASK, ?) > 0 condition. Is there any way to use an index for this condition. I was thinking about bitmap indexes for this but could not figure out how to use them here. Can anyone help? Thank you, Alex Kuntz. |
| |||
| kuntz.1507173@bloglines.com wrote: > we have a large table that contains a bitmask field: > MASK NUMBER(19) > > The application then performs a query that includes BITAND(MASK, ?) > 0 > condition. > > Is there any way to use an index for this condition. I was thinking > about bitmap indexes for this but could not figure out how to use them > here. Why in the world would you bundle data up into a column like that? I can't imagine why the correct solution isn't to break those bits out into individual columns with descriptive names so people can actually see what information is stored there. Then you could use bitmap indexes on them if performance dictated it (keeping in mind the problems they create). -- Richard Kuhler |
| |||
| Richard Kuhler wrote: > kuntz.1507173@bloglines.com wrote: > >> we have a large table that contains a bitmask field: >> MASK NUMBER(19) >> >> The application then performs a query that includes BITAND(MASK, ?) > 0 >> condition. >> >> Is there any way to use an index for this condition. I was thinking >> about bitmap indexes for this but could not figure out how to use them >> here. > > > Why in the world would you bundle data up into a column like that? I > can't imagine why the correct solution isn't to break those bits out > into individual columns with descriptive names so people can actually > see what information is stored there. Then you could use bitmap indexes > on them if performance dictated it (keeping in mind the problems they > create). > > -- > Richard Kuhler > Why would Oracle use this internally all over? (Check sql.bsq). For one, you can add a new mask, without adding columns to the design. -- Regards, Frank van Bortel |
| |||
| Frank van Bortel wrote: > Richard Kuhler wrote: > >> kuntz.1507173@bloglines.com wrote: >> >>> we have a large table that contains a bitmask field: >>> MASK NUMBER(19) >>> >>> The application then performs a query that includes BITAND(MASK, ?) > 0 >>> condition. >>> >>> Is there any way to use an index for this condition. I was thinking >>> about bitmap indexes for this but could not figure out how to use them >>> here. >> >> Why in the world would you bundle data up into a column like that? I >> can't imagine why the correct solution isn't to break those bits out >> into individual columns with descriptive names so people can actually >> see what information is stored there. Then you could use bitmap >> indexes on them if performance dictated it (keeping in mind the >> problems they create). >> >> -- >> Richard Kuhler > > Why would Oracle use this internally all over? (Check sql.bsq). > For one, you can add a new mask, without adding columns to > the design. No offense, but if you mimic everything I've seen done by Oracle's own developers then you're going to make many mistakes. They aren't gods. In fact, why not just have one table with one BLOB column, store everything in that and we'd never have to change the data model again? STOP! I'm not suggesting you do that! (although I've seen it done with disastrous results). The reason is that it defeats one of the major purposes of having a data model. Namely, so that data is stored in a well-described manner so a wide variety of people can access it intuitively and easily. This bitmask scenario is extremely counter to that. Besides, why is adding columns to a table to support new functionality a bad thing? Heaven forbid it might actually work to inform people that it exists. -- Richard Kuhler |
| |||
| Richard Kuhler <noone@nowhere.com> wrote: > kuntz.1507173@bloglines.com wrote: > > we have a large table that contains a bitmask field: > > MASK NUMBER(19) > > > > The application then performs a query that includes BITAND(MASK, ?) > 0 > > condition. > > > > Is there any way to use an index for this condition. I was thinking > > about bitmap indexes for this but could not figure out how to use them > > here. > > Why in the world would you bundle data up into a column like that? I > can't imagine why the correct solution isn't to break those bits out > into individual columns with descriptive names so people can actually > see what information is stored there. Then you could use bitmap indexes > on them if performance dictated it (keeping in mind the problems they > create). I don't know why the OP would bundle data up like that, but I do it because N+512 columns in a table is just too many, and N+2048 columns is way too many. (Of course, I have little hopes of getting bitmap indices to work effectively on this data anyway.) Xho -- -------------------- http://NewsReader.Com/ -------------------- Usenet Newsgroup Service $9.95/Month 30GB |
| |||
| ctcgag@hotmail.com wrote: > Richard Kuhler <noone@nowhere.com> wrote: > >>kuntz.1507173@bloglines.com wrote: >> >>>we have a large table that contains a bitmask field: >>>MASK NUMBER(19) >>> >>>The application then performs a query that includes BITAND(MASK, ?) > 0 >>>condition. >>> >>>Is there any way to use an index for this condition. I was thinking >>>about bitmap indexes for this but could not figure out how to use them >>>here. >> >>Why in the world would you bundle data up into a column like that? I >>can't imagine why the correct solution isn't to break those bits out >>into individual columns with descriptive names so people can actually >>see what information is stored there. Then you could use bitmap indexes >>on them if performance dictated it (keeping in mind the problems they >>create). > > I don't know why the OP would bundle data up like that, but I do it because > N+512 columns in a table is just too many, and N+2048 columns is way too > many. (Of course, I have little hopes of getting bitmap indices to work > effectively on this data anyway.) > > Xho Wow, you really have a tuple with 2048 distinct and meaningful attributes? What is this application if I might ask? Even if you convinced me that they absolutely can't be columns, I'd wonder why they can't be rows in a related table. -- Richard Kuhler |
| |||
| Richard Kuhler wrote: > > kuntz.1507173@bloglines.com wrote: > > we have a large table that contains a bitmask field: > > MASK NUMBER(19) > > > > The application then performs a query that includes BITAND(MASK, ?) > 0 > > condition. > > > > Is there any way to use an index for this condition. I was thinking > > about bitmap indexes for this but could not figure out how to use them > > here. > > Why in the world would you bundle data up into a column like that? I > can't imagine why the correct solution isn't to break those bits out > into individual columns with descriptive names so people can actually > see what information is stored there. Then you could use bitmap indexes > on them if performance dictated it (keeping in mind the problems they > create). > > -- > Richard Kuhler Lets says its 20 attributes, each being boolean. A b-tree index isn't going to help since the cardinality is so low. Bitmap indexes pretty much mean that the table has to be read-only. Moving the attributes to rows might not work so well since may end up with a massive lot of joins to the attribute table to handle the various AND/OR permutations. So you might indeed be up for a full scan no matter what, at which point, a bit mask in a single column may indeed be the best way to go Cheers Connor -- Connor McDonald Co-author: "Mastering Oracle PL/SQL - Practical Solutions" ISBN: 1590592174 web: http://www.oracledba.co.uk web: http://www.oaktable.net email: connor_mcdonald@yahoo.com Coming Soon! "Oracle Insight - Tales of the OakTable" "GIVE a man a fish and he will eat for a day. But TEACH him how to fish, and...he will sit in a boat and drink beer all day" ------------------------------------------------------------ |
| |||
| Connor McDonald wrote: > Richard Kuhler wrote: > >>kuntz.1507173@bloglines.com wrote: >> >>>we have a large table that contains a bitmask field: >>>MASK NUMBER(19) >>> >>>The application then performs a query that includes BITAND(MASK, ?) > 0 >>>condition. >>> >>>Is there any way to use an index for this condition. I was thinking >>>about bitmap indexes for this but could not figure out how to use them >>>here. >> >>Why in the world would you bundle data up into a column like that? I >>can't imagine why the correct solution isn't to break those bits out >>into individual columns with descriptive names so people can actually >>see what information is stored there. Then you could use bitmap indexes >>on them if performance dictated it (keeping in mind the problems they >>create). >> >>-- >>Richard Kuhler > > > Lets says its 20 attributes, each being boolean. A b-tree index isn't > going to help since the cardinality is so low. That's an over simplification. Especially with a bitmask where you are looking for rows with bits set. What if only 1 row in a 100 million row table has the attribute value you are looking for? Even if you forget about histograms, you can use a null-or-set design to create an index that will just have the rows that have the attribute set. > So you might indeed be up for a full scan no matter what, at which > point, a bit mask in a single column may indeed be the best way to go Sure, but should we design a data model purely so it gives the best performance possible? When does usability come into the equation? -- Richard Kuhler |
| |||
| Richard Kuhler <noone@nowhere.com> wrote: > ctcgag@hotmail.com wrote: > > Richard Kuhler <noone@nowhere.com> wrote: > > > >>kuntz.1507173@bloglines.com wrote: > >> > >>>we have a large table that contains a bitmask field: > >>>MASK NUMBER(19) > >>> > >>>The application then performs a query that includes BITAND(MASK, ?) > > >>>0 condition. > >>> > >>>Is there any way to use an index for this condition. I was thinking > >>>about bitmap indexes for this but could not figure out how to use them > >>>here. > >> > >>Why in the world would you bundle data up into a column like that? I > >>can't imagine why the correct solution isn't to break those bits out > >>into individual columns with descriptive names so people can actually > >>see what information is stored there. Then you could use bitmap > >>indexes on them if performance dictated it (keeping in mind the > >>problems they create). > > > > I don't know why the OP would bundle data up like that, but I do it > > because N+512 columns in a table is just too many, and N+2048 columns > > is way too many. (Of course, I have little hopes of getting bitmap > > indices to work effectively on this data anyway.) > > > > Xho > > Wow, you really have a tuple with 2048 distinct and meaningful > attributes? They certainly are distinct. I don't know about meaningful. Bit #478 being on means that the underlying object has a sub-component which hashes to 478 (or maybe to 479, I can never remember). > What is this application if I might ask? Cheminformatics (and some bioinformatics). Some queries in this field have the unfortunate property that testing a query against an object to determine if they match is an NP-complete task, and can take anywhere from 0.01 second to 10 seconds per row. But you can come up with clever ways to create bitstrings from both object and query such that, for each bit that is "on" in the query, there is a 100% chance of it also being on in a matching object while only a 50% (or so) chance of it being on in a non-matching object. If you test hundreds or thousands of bits, almost all the non-matching objects fall out, leaving you to do the slow, accurate, NP-complete check on only a handful of objects instead of millions. > Even if you > convinced me that they absolutely can't be columns, I'd wonder why they > can't be rows in a related table. In my case the entire reason for using the bitstring is for performance, and I very much doubt doing this would increase performance. Not that this has much to do with the original question any more, but you did ask Xho -- -------------------- http://NewsReader.Com/ -------------------- Usenet Newsgroup Service $9.95/Month 30GB |
| ||||
| Richard Kuhler wrote: > Frank van Bortel wrote: > >> Richard Kuhler wrote: >> >>> kuntz.1507173@bloglines.com wrote: >>> >>>> we have a large table that contains a bitmask field: >>>> MASK NUMBER(19) >>>> >>>> The application then performs a query that includes BITAND(MASK, ?) > 0 >>>> condition. >>>> >>>> Is there any way to use an index for this condition. I was thinking >>>> about bitmap indexes for this but could not figure out how to use them >>>> here. >>> >>> >>> Why in the world would you bundle data up into a column like that? I >>> can't imagine why the correct solution isn't to break those bits out >>> into individual columns with descriptive names so people can actually >>> see what information is stored there. Then you could use bitmap >>> indexes on them if performance dictated it (keeping in mind the >>> problems they create). >>> >>> -- >>> Richard Kuhler >> >> >> Why would Oracle use this internally all over? (Check sql.bsq). >> For one, you can add a new mask, without adding columns to >> the design. > > > No offense, but if you mimic everything I've seen done by Oracle's own > developers then you're going to make many mistakes. They aren't gods. > In fact, why not just have one table with one BLOB column, store > everything in that and we'd never have to change the data model again? > STOP! I'm not suggesting you do that! (although I've seen it done with > disastrous results). The reason is that it defeats one of the major > purposes of having a data model. Namely, so that data is stored in a > well-described manner so a wide variety of people can access it > intuitively and easily. This bitmask scenario is extremely counter to > that. > > Besides, why is adding columns to a table to support new functionality a > bad thing? Heaven forbid it might actually work to inform people that > it exists. > > -- > Richard Kuhler > You do need to rewrite every bit of PL/SQL and Java; bitwise operations do not require that. Apart from that - it seems quite fast. And one bitand(column) is easier to code than where col1 = 1 and col2 = 'Y' and col3=... ..... and col75... And I do not mimic Oracle code expert (wish I had the knowledge), but have not heard any technical reasons why I shouldn't use bitand. Datamodels are not the same as technical implementations. -- Regards, Frank van Bortel |