Bitwise question (SQL Server forums, Microsoft SQL Server category)
I am new to bitwise operations in MSSQL.

Let's suppose there's a table of favorite foods:

insert into fav_foods(food_name,bitwiseVal)
values('Pasta',1)

insert into fav_foods(food_name,bitwiseVal)
values('Chicken',2)

insert into fav_foods(food_name,bitwiseVal)
values('Beef',4)

insert into fav_foods(food_name,bitwiseVal)
values('Fish',8)

insert into fav_foods(food_name,bitwiseVal)
values('Pork',16)

How do I write a query to find people who selected more than one item, where every selected item comes from "Pasta, Chicken, Beef, Pork" (but not Fish)?

I hope my question is not confusing.

Bostonasian wrote:
> I am new to bitwise operations in MSSQL.
>
> Let's suppose there's a table of favorite foods:
>
> insert into fav_foods(food_name,bitwiseVal)
> values('Pasta',1)
>
> insert into fav_foods(food_name,bitwiseVal)
> values('Chicken',2)
>
> insert into fav_foods(food_name,bitwiseVal)
> values('Beef',4)
>
> insert into fav_foods(food_name,bitwiseVal)
> values('Fish',8)
>
> insert into fav_foods(food_name,bitwiseVal)
> values('Pork',16)
>
> How do I write a query to find people who selected more than one item,
> where every selected item comes from "Pasta, Chicken, Beef, Pork" (but
> not Fish)?
>
> I hope my question is not confusing.

Your question isn't confusing, but your design decision is. Why use bitwise operations on something like this? If you were to use a proper table design, this query would be trivial (and fast).

Zach

I tried to simplify the example as much as possible, which is probably why it didn't look necessary to build the table like this.

I actually have survey data. Survey answers include text, single-select multiple choice and multi-select multiple choice.

In the answer data table, I currently have a schema like the following:

customer | question_id | answer
-----------------------------------------------
John     | 1           | Pasta
John     | 1           | Beef
John     | 1           | Chicken
John     | 1           | Pork

And I've got 2.4 million customers to manage, so I thought I'd save some rows by using a bitwise value to reduce the number of rows per customer and question to one.

Bostonasian wrote:
> I tried to simplify the example as much as possible, which is probably
> why it didn't look necessary to build the table like this.
>
> I actually have survey data. Survey answers include text, single-select
> multiple choice and multi-select multiple choice.
>
> In the answer data table, I currently have a schema like the following:
>
> customer | question_id | answer
> -----------------------------------------------
> John     | 1           | Pasta
> John     | 1           | Beef
> John     | 1           | Chicken
> John     | 1           | Pork
>
> And I've got 2.4 million customers to manage, so I thought I'd save
> some rows by using a bitwise value to reduce the number of rows to one.

What you save in rows (i.e. disk space, which is cheap), you'll likely lose in readability, maintainability, performance and standardization. Search out one of Joe Celko's rants about thinking like a procedural programmer rather than a SQL/set-based programmer, because I think that's the problem here.

Zach

> And I've got 2.4 million customers to manage, so I thought it'd save
> some rows by using bitwise to reduce row numbers to one.

I'll bet that disk space is much cheaper than the cost of the time you'll spend fixing up a kludge like that :-)

Try this:

CREATE TABLE CustomerFoods
 (customer_id INTEGER NOT NULL REFERENCES Customers (customer_id),
  food INTEGER NOT NULL REFERENCES Foods (food),
  PRIMARY KEY (customer_id, food))

SELECT customer_id
 FROM CustomerFoods
 WHERE food IN (1,2,3,4,5) /* Pasta,Chicken,Beef,Pork,Fish */
 GROUP BY customer_id
 HAVING COUNT(CASE WHEN food IN (1,2,3,4) THEN 1 END) = COUNT(*)
  /* Everything except fish */

--
David Portas
SQL Server MVP
--

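The original question also asked for people who selected more than one item, which the HAVING clause above does not check. A minimal sketch of one way to add that condition follows, run through Python's built-in sqlite3 module so it can be tried without a SQL Server instance. The table name CustomerFoods, the sample rows and the food numbering (1 to 5 = Pasta, Chicken, Beef, Pork, Fish, as in the comment above) are assumptions for illustration; the SQL uses only constructs that T-SQL also accepts, but it has not been run against SQL Server here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CustomerFoods (
    customer_id INTEGER NOT NULL,
    food        INTEGER NOT NULL,   -- assumed: 1=Pasta 2=Chicken 3=Beef 4=Pork 5=Fish
    PRIMARY KEY (customer_id, food));

-- Hypothetical sample data
INSERT INTO CustomerFoods VALUES
    (1,1),(1,2),(1,3),(1,4),   -- four allowed foods
    (2,1),(2,5),               -- includes Fish
    (3,2),                     -- only one item
    (4,3),(4,4);               -- two allowed foods
""")

rows = conn.execute("""
SELECT customer_id
FROM   CustomerFoods
WHERE  food IN (1,2,3,4,5)
GROUP  BY customer_id
HAVING COUNT(CASE WHEN food IN (1,2,3,4) THEN 1 END) = COUNT(*)  -- nothing but allowed foods
   AND COUNT(*) > 1                                              -- more than one item
ORDER  BY customer_id
""").fetchall()

matches = [r[0] for r in rows]
print(matches)
```

With this sample data, customers 2 (picked Fish) and 3 (picked a single item) are filtered out.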
Bostonasian (axkixx@gmail.com) writes:
> I am new to bitwise operations in MSSQL.
>
> Let's suppose there's a table of favorite foods:
>
> insert into fav_foods(food_name,bitwiseVal)
> values('Pasta',1)
>
> insert into fav_foods(food_name,bitwiseVal)
> values('Chicken',2)
>
> insert into fav_foods(food_name,bitwiseVal)
> values('Beef',4)
>
> insert into fav_foods(food_name,bitwiseVal)
> values('Fish',8)
>
> insert into fav_foods(food_name,bitwiseVal)
> values('Pork',16)
>
> How do I write a query to find people who selected more than one item,
> where every selected item comes from "Pasta, Chicken, Beef, Pork" (but
> not Fish)?
>
> I hope my question is not confusing.

SELECT *
FROM   tbl
WHERE  fav_food & (SELECT SUM(bitwiseVal)
                   FROM   fav_foods
                   WHERE  food_name IN ('Pasta', 'Chicken', 'Beef', 'Pork')) <> 0

But as pointed out by others, this is a poor design. You may save disk space, but if you need to find everyone who selected Chicken, you will find that you cannot have an index on a single bit in an integer column, so you get awful performance.

Look at David's query, and use that instead of the above.

--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp

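For anyone who still wants to see the complete bitmask test, here is a rough sketch of both conditions from the question: no bit outside the allowed set, and more than one bit set. It runs through Python's built-in sqlite3 module rather than SQL Server. The table customer_favs, its columns and the sample rows are hypothetical; `&` and `~` behave the same way on integers in T-SQL, but this has only been checked in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fav_foods (food_name TEXT PRIMARY KEY, bitwiseVal INTEGER NOT NULL);
INSERT INTO fav_foods VALUES
    ('Pasta',1),('Chicken',2),('Beef',4),('Fish',8),('Pork',16);

-- Hypothetical table holding one bitmask per customer
CREATE TABLE customer_favs (customer TEXT PRIMARY KEY, fav_food INTEGER NOT NULL);
INSERT INTO customer_favs VALUES
    ('John', 23),  -- Pasta + Chicken + Beef + Pork
    ('Mary',  9),  -- Pasta + Fish
    ('Ann',   2),  -- Chicken only
    ('Bob',  20),  -- Beef + Pork
    ('Sue',   0);  -- nothing selected
""")

rows = conn.execute("""
SELECT c.customer
FROM   customer_favs AS c
CROSS JOIN (SELECT SUM(bitwiseVal) AS mask
            FROM   fav_foods
            WHERE  food_name IN ('Pasta','Chicken','Beef','Pork')) AS a
WHERE  (c.fav_food & ~a.mask) = 0                -- no bit outside the allowed set (so no Fish)
  AND  (c.fav_food & (c.fav_food - 1)) <> 0      -- at least two bits set
ORDER  BY c.customer
""").fetchall()

matches = [r[0] for r in rows]
print(matches)
```

The expression `x & (x - 1)` clears the lowest set bit of `x`, so it is non-zero only when at least two bits are set. Every row still has to be examined, which is the performance problem described above.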
Bostonasian wrote:
>
> I tried to simplify the example as much as possible, which is probably
> why it didn't look necessary to build the table like this.
>
> I actually have survey data. Survey answers include text, single-select
> multiple choice and multi-select multiple choice.
>
> In the answer data table, I currently have a schema like the following:
>
> customer | question_id | answer
> -----------------------------------------------
> John     | 1           | Pasta
> John     | 1           | Beef
> John     | 1           | Chicken
> John     | 1           | Pork
>
> And I've got 2.4 million customers to manage, so I thought I'd save
> some rows by using a bitwise value to reduce the number of rows to one.

Hi Bostonasian,

There is no need to denormalize or use bitwise operations for this. IMO, bitwise operations are not suitable for this problem.

The database does not have to grow very fast. If you normalize all the way through, you would get a Customers (reference) table, a Questions (reference) table and an Answers (reference) table. All these reference tables can have short keys, which you use in your CustomerAnswers (data) table.

If you have fewer than 64000 customers, fewer than 256 questions and fewer than 256 (fixed) answers per question, then each row in CustomerAnswers would be just 2+1+1 = 4 bytes (excluding the free-format text answers).

Your schema could look something like this:

CREATE TABLE Customers
(CustomerID smallint PRIMARY KEY
,Name nvarchar(100)
)

CREATE TABLE Questions
(QuestionID tinyint PRIMARY KEY
,Question nvarchar(3000)
)

CREATE TABLE Answers
(QuestionID tinyint
,AnswerID tinyint
,Answer nvarchar(200)
,PRIMARY KEY (QuestionID,AnswerID)
)

CREATE TABLE CustomerAnswers
(CustomerID smallint REFERENCES Customers
,QuestionID tinyint REFERENCES Questions
,AnswerID tinyint
,TextAnswer nvarchar(2000)
,PRIMARY KEY (CustomerID,QuestionID,AnswerID)
,FOREIGN KEY (QuestionID,AnswerID) REFERENCES Answers
)

Hope this helps,
Gert-Jan
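To make the benefit of the reference tables concrete, the sketch below builds a schema along these lines in SQLite (via Python's built-in sqlite3 module) and shows two things: the composite foreign key rejects an answer that does not exist for the question, and the original query can be written against answer names. The sample rows, the question text and the explicit column lists in the REFERENCES clauses are assumptions added for illustration. SQLite does not enforce the declared type sizes, so the byte counts above are not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE Customers (CustomerID smallint PRIMARY KEY, Name nvarchar(100));
CREATE TABLE Questions (QuestionID tinyint PRIMARY KEY, Question nvarchar(3000));
CREATE TABLE Answers (
    QuestionID tinyint, AnswerID tinyint, Answer nvarchar(200),
    PRIMARY KEY (QuestionID, AnswerID));
CREATE TABLE CustomerAnswers (
    CustomerID smallint REFERENCES Customers (CustomerID),
    QuestionID tinyint  REFERENCES Questions (QuestionID),
    AnswerID   tinyint,
    TextAnswer nvarchar(2000),
    PRIMARY KEY (CustomerID, QuestionID, AnswerID),
    FOREIGN KEY (QuestionID, AnswerID) REFERENCES Answers (QuestionID, AnswerID));

-- Hypothetical sample data
INSERT INTO Customers VALUES (1, 'John'), (2, 'Mary');
INSERT INTO Questions VALUES (1, 'Favorite foods?');
INSERT INTO Answers VALUES
    (1,1,'Pasta'),(1,2,'Chicken'),(1,3,'Beef'),(1,4,'Fish'),(1,5,'Pork');
INSERT INTO CustomerAnswers (CustomerID, QuestionID, AnswerID) VALUES
    (1,1,1),(1,1,3),(1,1,5),   -- John: Pasta, Beef, Pork
    (2,1,1),(2,1,4);           -- Mary: Pasta, Fish
""")

# An answer that is not defined for question 1 is rejected by the composite key.
try:
    conn.execute(
        "INSERT INTO CustomerAnswers (CustomerID, QuestionID, AnswerID) VALUES (2, 1, 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

rows = conn.execute("""
SELECT c.Name
FROM   CustomerAnswers AS ca
JOIN   Answers   AS a ON a.QuestionID = ca.QuestionID AND a.AnswerID = ca.AnswerID
JOIN   Customers AS c ON c.CustomerID = ca.CustomerID
WHERE  ca.QuestionID = 1
GROUP  BY c.CustomerID, c.Name
HAVING COUNT(*) > 1                                            -- more than one item
   AND SUM(CASE WHEN a.Answer = 'Fish' THEN 1 ELSE 0 END) = 0  -- Fish not selected
ORDER  BY c.Name
""").fetchall()

names = [r[0] for r in rows]
print(rejected, names)
```

A bitmask column has no equivalent of that foreign key check: any integer can be stored, including bits that correspond to no answer at all.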