This is a discussion on Count unique data in column within the MySQL forums, part of the Database Server Software category; --> Hi, I have an optimization question. Is there a way to count instances of each unique entry in a ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Hi, I have an optimization question. Is there a way to count instances of each unique entry in a column (not unique entries)? Let me explain using the column count. count 3 3 4 5 2 3 2 Question: Is there a way to return an associative array of total instances of unique entries? Desired Result: {2}->2, {3}->3, {4}->1, {5}->1 The current script requests all data (SELECT *) and sorts them in a script. I want to offset the count to the SQL server (good idea?). Kev |
| |||
| kev wrote: > Is there a way to return an associative array of total instances of > unique entries? > Desired Result: {2}->2, {3}->3, {4}->1, {5}->1 SELECT `count`, COUNT(`count`) FROM tablename GROUP BY `count` This is made a little bit confusing by the fact that your column is named `count`. Regards, Bill K. |
| |||
| Bill Karwin wrote: > kev wrote: > > Is there a way to return an associative array of total instances of > > unique entries? > > Desired Result: {2}->2, {3}->3, {4}->1, {5}->1 > > SELECT `count`, COUNT(`count`) FROM tablename GROUP BY `count` > > This is made a little bit confusing by the fact that your column is > named `count`. > > Regards, > Bill K. Query Result: 2 - 2 3 - 3 4 - 1 5 - 1 Simple! Thanks Bill. |
| |||
| I have a concern about my request on top of that since we are on the topic of optimization. What is a good way to approach SQL queries? Should I connect to the server on every query? (looks like a bad idea because you will increase server bandwidth assuming not localhost) Example: Start session On every query: Connect => Query => Disconnect End session Should I use pconnect (OR connect and save connection handler) and disconnect after the session has ended? How does the server handle the event when users are on a pconnect and the browser is closed? (would pconnect disconnect once session is over?) I do not want to leave multiple connections open to timeout. Example: Start session => Connect On every query: Query End session = Disconnect Feedback would be great. Just want to establish a standard. I am currently using the first method but I am wondering if it should worry about scalability. Thanks in advance Kev |
| |||
| kev wrote: > Should I connect to the server on every query? No, I would not recommend this. Connecting costs a large amount of overhead. It has to find the host, establish a socket, authenticate your username and password, and set up other resources. This is actually pretty expensive to do, and you should avoid doing it for every query. > Should I use pconnect (OR connect and save connection handler) and > disconnect after the session has ended? I assume you mean mysql_pconnect(), part of the PHP mysql package. http://www.php.net/function.mysql-pconnect Yes, this is a good thing to do for greater scalability. This reduces the number of times that PHP needs to create a new connection to MySQL The web server (Apache) that runs PHP manages the persisent connections and lets them be re-used by later PHP requests. However, if you use PHP in CGI mode, starting a new process for each PHP request, the persistent connections won't last anyway. > How does the server handle the > event when users are on a pconnect and the browser is closed? (would > pconnect disconnect once session is over?) According to the manual, mysql_close() does not close connections opened by mysql_pconnect(). The persistent database connections will be used by later requests for different web sessions. Also note that subsequent PHP requests are _not_ likely to use the same database connection. So if you create connection-based things like temporary tables or user variables, use them only during the same PHP request, and then you should drop that table or set that variable to NULL, or else some other web session will get to see it. > I do not want to leave multiple connections open to timeout. You can force MySQL to time out long connections using the "interactive-timeout" option in the my.cnf (my.ini) configuration file. The default for this timeout is 28800 seconds. See http://dev.mysql.com/doc/refman/5.0/...variables.html Regards, Bill K. |
| |||
| kev wrote: > I have a concern about my request on top of that since we are on the > topic of optimization. > > What is a good way to approach SQL queries? > > Should I connect to the server on every query? (looks like a bad idea > because you will increase server bandwidth assuming not localhost) > > Example: > Start session > > On every query: Connect => Query => Disconnect > > End session > > Should I use pconnect (OR connect and save connection handler) and > disconnect after the session has ended? How does the server handle the > event when users are on a pconnect and the browser is closed? (would > pconnect disconnect once session is over?) I do not want to leave > multiple connections open to timeout. > > Example: > Start session => Connect > > On every query: Query > > End session = Disconnect > > Feedback would be great. Just want to establish a standard. I am > currently using the first method but I am wondering if it should worry > about scalability. Thanks in advance > > Kev > Kev, I don't generally use pconnect() - it keeps connections open even if no one is using the website (or php code). It might be find if you're running a site with 100K hits/hr. But none of mine do that. When I need MySQL, I just connect at the start of the page and close at the end of the page. Nice, clean, no muss, no fuss. Additionally, if you're using shared hosting, persistent connections may be disabled. -- ================== Remove the "x" from my email address Jerry Stuckle JDS Computer Training Corp. jstucklex@attglobal.net ================== |
| |||
| Bill Karwin <bill@karwin.com> wrote: > kev wrote: >> Should I connect to the server on every query? > > No, I would not recommend this. Connecting costs a large amount of > overhead. It has to find the host, establish a socket, authenticate > your username and password, and set up other resources. This is > actually pretty expensive to do, and you should avoid doing it for every > query. I agree that one should not connect for every single query. OTOH MySQL connects are pretty cheap, probably cheaper than the underlying TCP connection setup in the network stack. >> Should I use pconnect (OR connect and save connection handler) and >> disconnect after the session has ended? > > I assume you mean mysql_pconnect(), part of the PHP mysql package. > http://www.php.net/function.mysql-pconnect > > Yes, this is a good thing to do for greater scalability. No, it is not. Contrary, using mysql_pconnect() is considered harmful, as it tends to hog the MySQL server with idle connections. A very detailed (but German) discussion of the topic is here: <http://kris.koehntopp.de/artikel/webtune/> In a nutshell: a) many connections: This problem arises if you use different connections (that is <host, user, password> tupels) to connect to the database. If your scripts (*all* scripts running in one Apache instance) use N different connections, PHP will use up to N * max_clients connections to the database - independent from the real webserver load. Contrary, if each of your scripts uses only M different connections at a time (in most cases M=1) and refrains from using persistent connections, you will have at most M * max_clients open connections to the database. Plus, you will only reach that limit if your web- server is fully loaded. Idle connections will eat memory and file descriptors. Not good. b) MySQL sessions have state There are lots of objects in MySQL with session scope: locks, user variables, temporary tables. Currently there is no way to reset a session, so the best way to destroy those objects is to close the connection. Additionally there is a hidden attribute - the selected database (mysql_select_db(), USE <db>). If you re-use a MySQL connection handle in PHP, you cannot know the selected database. Therefore many database abstraction layers for PHP call mysql_select_db() *mandatory* right before each mysql_query(). Forgotten locks may render your database useless. Forgotten temp. tables will eat memory and/or disk space. User variables may spread sensitive information between web applications. Very bad. Conclusion: mysql_pconnect() solves a nonexistent problem. Also it creates new problems. You should not use it. XL -- Axel Schwenke, Senior Software Developer, MySQL AB Online User Manual: http://dev.mysql.com/doc/refman/5.0/en/ MySQL User Forums: http://forums.mysql.com/ |
| |||
| Axel Schwenke wrote: > Bill Karwin <bill@karwin.com> wrote: > > kev wrote: > >> Should I connect to the server on every query? > > > > No, I would not recommend this. Connecting costs a large amount of > > overhead. It has to find the host, establish a socket, authenticate > > your username and password, and set up other resources. This is > > actually pretty expensive to do, and you should avoid doing it for every > > query. > > I agree that one should not connect for every single query. > OTOH MySQL connects are pretty cheap, probably cheaper than the > underlying TCP connection setup in the network stack. > > >> Should I use pconnect (OR connect and save connection handler) and > >> disconnect after the session has ended? > > > > I assume you mean mysql_pconnect(), part of the PHP mysql package. > > http://www.php.net/function.mysql-pconnect > > > > Yes, this is a good thing to do for greater scalability. > > No, it is not. Contrary, using mysql_pconnect() is considered harmful, > as it tends to hog the MySQL server with idle connections. A very > detailed (but German) discussion of the topic is here: > > <http://kris.koehntopp.de/artikel/webtune/> > > In a nutshell: > > a) many connections: > > This problem arises if you use different connections (that is <host, > user, password> tupels) to connect to the database. If your scripts > (*all* scripts running in one Apache instance) use N different > connections, PHP will use up to N * max_clients connections to the > database - independent from the real webserver load. > > Contrary, if each of your scripts uses only M different connections > at a time (in most cases M=1) and refrains from using persistent > connections, you will have at most M * max_clients open connections > to the database. Plus, you will only reach that limit if your web- > server is fully loaded. > > Idle connections will eat memory and file descriptors. Not good. > > > b) MySQL sessions have state > > There are lots of objects in MySQL with session scope: locks, user > variables, temporary tables. Currently there is no way to reset a > session, so the best way to destroy those objects is to close the > connection. Additionally there is a hidden attribute - the selected > database (mysql_select_db(), USE <db>). If you re-use a MySQL > connection handle in PHP, you cannot know the selected database. > Therefore many database abstraction layers for PHP call > mysql_select_db() *mandatory* right before each mysql_query(). > > Forgotten locks may render your database useless. Forgotten temp. > tables will eat memory and/or disk space. User variables may spread > sensitive information between web applications. Very bad. > > > Conclusion: mysql_pconnect() solves a nonexistent problem. Also it > creates new problems. You should not use it. > > > XL > -- > Axel Schwenke, Senior Software Developer, MySQL AB > > Online User Manual: http://dev.mysql.com/doc/refman/5.0/en/ > MySQL User Forums: http://forums.mysql.com/ Good to know. Thanks for the info. Now I have to figure out how to pass the SQL connection handle across pages/the session. |
| ||||
| kev wrote: > Axel Schwenke wrote: > >>Bill Karwin <bill@karwin.com> wrote: >> >>>kev wrote: >>> >>>>Should I connect to the server on every query? >>> >>>No, I would not recommend this. Connecting costs a large amount of >>>overhead. It has to find the host, establish a socket, authenticate >>>your username and password, and set up other resources. This is >>>actually pretty expensive to do, and you should avoid doing it for every >>>query. >> >>I agree that one should not connect for every single query. >>OTOH MySQL connects are pretty cheap, probably cheaper than the >>underlying TCP connection setup in the network stack. >> >> >>>>Should I use pconnect (OR connect and save connection handler) and >>>>disconnect after the session has ended? >>> >>>I assume you mean mysql_pconnect(), part of the PHP mysql package. >>>http://www.php.net/function.mysql-pconnect >>> >>>Yes, this is a good thing to do for greater scalability. >> >>No, it is not. Contrary, using mysql_pconnect() is considered harmful, >>as it tends to hog the MySQL server with idle connections. A very >>detailed (but German) discussion of the topic is here: >> >><http://kris.koehntopp.de/artikel/webtune/> >> >>In a nutshell: >> >>a) many connections: >> >>This problem arises if you use different connections (that is <host, >>user, password> tupels) to connect to the database. If your scripts >>(*all* scripts running in one Apache instance) use N different >>connections, PHP will use up to N * max_clients connections to the >>database - independent from the real webserver load. >> >>Contrary, if each of your scripts uses only M different connections >>at a time (in most cases M=1) and refrains from using persistent >>connections, you will have at most M * max_clients open connections >>to the database. Plus, you will only reach that limit if your web- >>server is fully loaded. >> >>Idle connections will eat memory and file descriptors. Not good. >> >> >>b) MySQL sessions have state >> >>There are lots of objects in MySQL with session scope: locks, user >>variables, temporary tables. Currently there is no way to reset a >>session, so the best way to destroy those objects is to close the >>connection. Additionally there is a hidden attribute - the selected >>database (mysql_select_db(), USE <db>). If you re-use a MySQL >>connection handle in PHP, you cannot know the selected database. >>Therefore many database abstraction layers for PHP call >>mysql_select_db() *mandatory* right before each mysql_query(). >> >>Forgotten locks may render your database useless. Forgotten temp. >>tables will eat memory and/or disk space. User variables may spread >>sensitive information between web applications. Very bad. >> >> >>Conclusion: mysql_pconnect() solves a nonexistent problem. Also it >>creates new problems. You should not use it. >> >> >>XL >>-- >>Axel Schwenke, Senior Software Developer, MySQL AB >> >>Online User Manual: http://dev.mysql.com/doc/refman/5.0/en/ >>MySQL User Forums: http://forums.mysql.com/ > > > Good to know. Thanks for the info. Now I have to figure out how to pass > the SQL connection handle across pages/the session. > Kev, You can't pass connections across pages - the connection will automatically close at the end of the PHP script if you don't close it yourself (it's a good idea to close it, though). But that's OK - you really don't want to keep the connection open while the user goes to lunch. Just connect before your first MySQL and close the connection after the last one (or at the end of the page). -- ================== Remove the "x" from my email address Jerry Stuckle JDS Computer Training Corp. jstucklex@attglobal.net ================== |