vBulletin Search Engine Optimization
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| The database server and the web server are on separate machines. Table A contains a record for each user. Let's say Table B contains 'relationship' information. They can be of type 'friend' or 'family'. If a user knows another user, this relationship would be kept in this table, along with the type of relationship. Table B can get big. 10,000's or maybe 100,000's. I'm doing a query in PHP and want to end up with two arrays. One for type friend and one for type family. Which is better: (Method 1) Do ONE query for all the records that meet a certain criteria (let's say 'active'). Then use PHP to loop through the results and put each record into either the friend array or the family array. (Method 2) Do TWO queries. One just for friend. Loop through the records and put into friend array; Then do another query for family...and loop through again. Method (1) needs to evaluate an IF statement in PHP for every record. Method (2) hits the database twice, but doesn't require a PHP IF. (Should I take an extra hit on the database and use Method 2?) -James |
| |||
| James Tu wrote: > The database server and the web server are on separate machines. > Table A contains a record for each user. > Let's say Table B contains 'relationship' information. They can be of > type 'friend' or 'family'. > If a user knows another user, this relationship would be kept in this > table, along with the type of relationship. Table B can get big. > 10,000's or maybe 100,000's. > > > I'm doing a query in PHP and want to end up with two arrays. > One for type friend and one for type family. > > Which is better: > (Method 1) Do ONE query for all the records that meet a certain > criteria (let's say 'active'). Then use PHP to loop through the > results and put each record into either the friend array or the family > array. > > (Method 2) Do TWO queries. One just for friend. Loop through the > records and put into friend array; > Then do another query for family...and loop through again. > > > Method (1) needs to evaluate an IF statement in PHP for every record. > Method (2) hits the database twice, but doesn't require a PHP IF. > > (Should I take an extra hit on the database and use Method 2?) > > -James > Either way, I think you are running into a problem with just having two arrays. Keep in mind that the relationship is relative, so to speak. A person who is a friend is not an absolute friend; they are going to be a friend of somebody else. With that in mind, assuming that you just want two "absolute" arrays, here's what I would suggest (and this is a shot in the dark) Given: USER USER_ID 'more columns AND RELATIONSHIP RELATIONSHIP_ID FRIEND_A FRIEND_B $query = "SELECT USER.*,RELATIONSHIP_DESCRIPTION FROM USER LEFT JOIN RELATIONSHIPS ON (USER.USER_ID = RELATIONSHIP.FRIEND_A OR USER.USER_ID = RELATIONSHIP.FRIEND_B)"; $retval = mysql_query($query) or die(mysql_error); while ($row = mysql_fetch_array($retval, MYSQL_ASSOC)) { $array[$row["USER_ID"]; } -- The NCP Revue -- http://www.ncprevue.com/blog |
| |||
| On 5/9/07, James Tu <jtu@esidesign.com> wrote: > > The database server and the web server are on separate machines. > Table A contains a record for each user. > Let's say Table B contains 'relationship' information. They can be > of type 'friend' or 'family'. > If a user knows another user, this relationship would be kept in this > table, along with the type of relationship. Table B can get big. > 10,000's or maybe 100,000's. > > > I'm doing a query in PHP and want to end up with two arrays. > One for type friend and one for type family. > > Which is better: > (Method 1) Do ONE query for all the records that meet a certain > criteria (let's say 'active'). Then use PHP to loop through the > results and put each record into either the friend array or the > family array. > > (Method 2) Do TWO queries. One just for friend. Loop through the > records and put into friend array; > Then do another query for family...and loop through again. In general, you don't want to introduce arbitrarily large result sets into PHP. PHP is fast, but there are memory limits and speed of iteration limits. In general, you want to structure things so that MySQL returns exactly the results you need, and in the order you need. In general: a)Check your database design to be sure that the queries you are interested in are O(log N). If not, make them that way, by rethinking your database design and/or adding indexes. b)See if you can get all the data you want in one query. In the example you gave, I think the WHERE clause syntax will allow checking for certain of an enumerated type, i.e WHERE (X=3 OR X=5) ... that kind of thing. So, retrieving friends and family in one query shouldn't be a problem. Two queries should not be required. Here is what you need to remember: a)Designs that aren't O(log N) for the queries you are interested in often catch up with you as the database grows. b)There is a speed hierarchy involved. PHP is the slowest of all, so if you loop over records in PHP it needs to be a guaranteed small set. MySQL takes a one-time hit parsing the SQL statement, but after that it can operate on the database FAR faster than PHP can. In general, let MySQL do the work, because it can do the sorting, filtering, etc. FAR faster than PHP. Dave. |
| |||
| Thanks David! This the kind of answer that I was looking for (more about general PHP and MySQL performance) I think b/c of the way the tables are designed, I have to perform multiple queries, unfortunately. I think I'll have to do some performance testing at some point. But for now I will let MySQL do the work instead of filtering with an IF in PHP. -James On May 10, 2007, at 12:20 PM, David T. Ashley wrote: > On 5/9/07, James Tu <jtu@esidesign.com> wrote: The database server > and the web server are on separate machines. > Table A contains a record for each user. > Let's say Table B contains 'relationship' information. They can be > of type 'friend' or 'family'. > If a user knows another user, this relationship would be kept in this > table, along with the type of relationship. Table B can get big. > 10,000's or maybe 100,000's. > > > I'm doing a query in PHP and want to end up with two arrays. > One for type friend and one for type family. > > Which is better: > (Method 1) Do ONE query for all the records that meet a certain > criteria (let's say 'active'). Then use PHP to loop through the > results and put each record into either the friend array or the > family array. > > (Method 2) Do TWO queries. One just for friend. Loop through the > records and put into friend array; > Then do another query for family...and loop through again. > > In general, you don't want to introduce arbitrarily large result > sets into PHP. PHP is fast, but there are memory limits and speed > of iteration limits. > > In general, you want to structure things so that MySQL returns > exactly the results you need, and in the order you need. > > In general: > > a)Check your database design to be sure that the queries you are > interested in are O(log N). If not, make them that way, by > rethinking your database design and/or adding indexes. > > b)See if you can get all the data you want in one query. In the > example you gave, I think the WHERE clause syntax will allow > checking for certain of an enumerated type, i.e WHERE (X=3 OR > X=5) ... that kind of thing. So, retrieving friends and family in > one query shouldn't be a problem. Two queries should not be required. > > Here is what you need to remember: > > a)Designs that aren't O(log N) for the queries you are interested > in often catch up with you as the database grows. > > b)There is a speed hierarchy involved. PHP is the slowest of all, > so if you loop over records in PHP it needs to be a guaranteed > small set. MySQL takes a one-time hit parsing the SQL statement, > but after that it can operate on the database FAR faster than PHP > can. In general, let MySQL do the work, because it can do the > sorting, filtering, etc. FAR faster than PHP. > > Dave. > > |
| |||
| On 5/10/07, James Tu <jtu@esidesign.com> wrote: > > I think b/c of the way the tables are designed, I have to perform > multiple queries, unfortunately. Hi James, My suggestion to you would be that if you have a situation you don't believe you can handle in one query, post all the details to the MySQL list and let others take a whack at it. I've not yet encountered a situation where the database can't be designed for "one query" results. I also believe that MySQL has temporary table functionality: http://www.xaprb.com/blog/2006/03/26...ties-in-mysql/ http://www.devwebpro.com/devwebpro-3...ith-MySQL.html I don't know how this works (I guess I should read the manual), but I think this would give you the ability in many cases to have MySQL (rather than PHP) do the heavy lifting. It will be much more efficient in MySQL than in PHP. Good luck, Dave. |
| |||
| David: I definitely can get the result set using one query, but what I do with the result set has me thinking about breaking it up into two queries. Here's an example with a simple table: describe collection; +------------------+---------------------+------+----- +---------------------+----------------+ | Field | Type | Null | Key | Default | Extra | +------------------+---------------------+------+----- +---------------------+----------------+ | id | bigint(20) unsigned | | PRI | NULL | auto_increment | | receiver_id | bigint(20) unsigned | | MUL | 0 | | | set_type_id | int(2) unsigned | | | 0 | | | card_id | int(3) unsigned | | | 0 | | | completed_set_id | bigint(20) unsigned | | | 0 | | | created_on_gmt | datetime | | | 0000-00-00 00:00:00 | | +------------------+---------------------+------+----- +---------------------+----------------+ I want to end up with two PHP arrays. One for set_type_id = 22 and one for set_type_id=21. (1) one query method: SELECT * from collection WHERE set_type_id=22 OR set_type_id=21; ....do query... while( $row = $this->db->fetch_array_row() ){ if ($row['set_type_id'] == 21){ $array_a[] = $row; } else { $array_b[] = $row; } } (2) two query method: SELECT * from collection WHERE set_type_id=22; ....do query... while( $row = $this->db->fetch_array_row() ){ $array_a[] = $row; } SELECT * from collection WHERE set_type_id=21; ....do query... while( $row = $this->db->fetch_array_row() ){ $array_b[] = $row; } Which method is better? I still think that based on David's comments regarding MySQL being more performative I'm leaning towards option (2). -James On May 10, 2007, at 12:54 PM, David T. Ashley wrote: > On 5/10/07, James Tu <jtu@esidesign.com> wrote: >> >> I think b/c of the way the tables are designed, I have to perform >> multiple queries, unfortunately. > > > Hi James, > > My suggestion to you would be that if you have a situation you > don't believe > you can handle in one query, post all the details to the MySQL list > and let > others take a whack at it. > > I've not yet encountered a situation where the database can't be > designed > for "one query" results. > > I also believe that MySQL has temporary table functionality: > > http://www.xaprb.com/blog/2006/03/26...subtleties-in- > mysql/ > > http://www.devwebpro.com/devwebpro-3...y-Tables-With- > MySQL.html > > I don't know how this works (I guess I should read the manual), but > I think > this would give you the ability in many cases to have MySQL (rather > than > PHP) do the heavy lifting. It will be much more efficient in MySQL > than in > PHP. > > Good luck, Dave. |
| |||
| On 5/10/07, James Tu <jtu@esidesign.com> wrote: > David: > I definitely can get the result set using one query, but what I do > with the result set has me thinking about breaking it up into two > queries. <Technical Details Omitted> Ah, OK, I misunderstood. You want to (get two results, each of which is useful individually) rather than (issue two queries, then iterate in PHP to combine the query results). Two queries seem fine in that case. Just one caution: be aware that another process (such as a web user) can sneak in in between your two queries and modify the database and render the two sets of query results inconsistent with one another. To give you an example, suppose you issue three queries in order (I'm going to botch the syntax here): SELECT COUNT(*) FROM USERS; (call this A) SELECT COUNT(*) FROM USERS WHERE IDX <= 10; (call this B) SELECT COUNT(*) FROM USERS WHERE IDX > 10; (call this C) It is very possible (in the presence of other simultaneous database activity) that A != B + C. It depends on the application whether this is significant. Table locking is the easiest way to prevent this if it matters. Dave. |
| ||||
| If you are dong as two seperate queries, I recommend using a transactional table type setting the read isolation mode to repeatable read and doing both your queries within a single transaction. (David, sorry about the double send) - michael On 5/10/07, David T. Ashley <dashley@gmail.com> wrote: > On 5/10/07, James Tu <jtu@esidesign.com> wrote: > > > David: > > I definitely can get the result set using one query, but what I do > > with the result set has me thinking about breaking it up into two > > queries. > > > <Technical Details Omitted> > > Ah, OK, I misunderstood. You want to (get two results, each of which is > useful individually) rather than (issue two queries, then iterate in PHP to > combine the query results). > > Two queries seem fine in that case. > > Just one caution: be aware that another process (such as a web user) can > sneak in in between your two queries and modify the database and render the > two sets of query results inconsistent with one another. > > To give you an example, suppose you issue three queries in order (I'm going > to botch the syntax here): > > SELECT COUNT(*) FROM USERS; (call this A) > > SELECT COUNT(*) FROM USERS WHERE IDX <= 10; (call this B) > > SELECT COUNT(*) FROM USERS WHERE IDX > 10; (call this C) > > It is very possible (in the presence of other simultaneous database > activity) that A != B + C. > > It depends on the application whether this is significant. > > Table locking is the easiest way to prevent this if it matters. > > Dave. > -- - michael dykman - mdykman@gmail.com - All models are wrong. Some models are useful. |