This is a discussion on Multiple logical databases within the pgsql Hackers forums, part of the PostgreSQL category; --> Mark Woodward wrote: > My issue is this, (and this is NOT a slam on PostgreSQL), I have a ...
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Mark Woodward wrote: > My issue is this, (and this is NOT a slam on PostgreSQL), I have a number > of physical databases on one machine on ports 5432, 5433, 5434. All > running the same version and in fact, installation of PostgreSQL. > > Even though they run on the same machine, run the same version of the > software, and are used by the same applications, they have NO > interoperability. For now, lets just accept that they need to be on > separate physical clusters because some need to be able to started and > stopped while others need to remain running, there are other reasons, but > one reason will suffice for the discussion. > Hmmm - do you really need to start and stop them? or are you just doing that to forbid user access whilst doing data loads etc? If so, then you might get more buy-in by requesting enhancements that work with the design of Pg a little more (or I hope they do anyway....) e.g: 1/ Enable/disable (temporarily) user access to individual databases via a simple admin command (tho 'ALTER DATABASE xxx CONNECTION LIMIT 0' will suffice if you do loads with a superuser role). 2/ Restrict certain users to certain databases via simple admin commands (editing pg_hba.conf is not always convenient or possible). 3/ Make cross db relation references a little more transparent (e.g maybe introduce SYNONYM for this). Other related possibilities come to mind, like being able to segment the buffer cache on a database level (e.g: bigdb gets 90% of the shared buffers.... not 100%, as I want to keep smalldb's tables cached always....). Cheers Mark ---------------------------(end of broadcast)--------------------------- TIP 2: Don't 'kill -9' the postmaster |
| |||
| Mark Woodward wrote: > From an administration perspective, a single point of admin would > seem like a logical and valuable objective, no? I don't understand why you are going out of your way to separate your databases (for misinformed reasons, it appears) and then want to design a way to centrally control them so they can all fail together. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| |||
| > Mark Woodward wrote: >> From an administration perspective, a single point of admin would >> seem like a logical and valuable objective, no? > > I don't understand why you are going out of your way to separate your > databases (for misinformed reasons, it appears) and then want to design > a way to centrally control them so they can all fail together. > Oh come on, "misinformed?" is that really called for? Think about a website that (and I have one) has the U.S.A. Streetmap database, the freedb CD database, and a slew of sites based on phpbb and drupal. Maybe one should put them all in one database cluster, but... The street database is typically generated and QAed in the lab. It is then uploaded to the server. It has many millions of rows and about a half dozen indexes. To dump and reload takes almost a day. Compressing the DB and uploading it into the site, uncompressing it, stoping the current postgresql process, swapping the data directory, and restarting it can be done in about an hour. One can not do this if the street map database is part of the standard database cluster. The same thing happens with the freedb database. Unless you can tell me how to insert live data and indexes to a cluster without having to reload the data and recreate the indexes, then I hardly think I am "misinformed." The ad hominem attack wasn't nessisary. I have no problem with disagreement, but I take exception to insult. If no one sees a way to manage multiple physical database clusters as one logical cluster as something worth doing, then so be it. I have a practical example of a valid reason how this would make PostgreSQL easier to work with. Yes there are work arounds. Yes it is not currently unworkable. It is just that it could be better. As I mentioned earlier, I have been dealing with this sort of problem for a number of years now, and I think this is the "cool" solution to the problem. ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |
| |||
| Mark Woodward schrieb: .... > Unless you can tell me how to insert live data and indexes to a cluster > without having to reload the data and recreate the indexes, then I hardly > think I am "misinformed." The ad hominem attack wasn't nessisary. I see you had a usecase for something like pg_diff and pg_patch .... > If no one sees a way to manage multiple physical database clusters as one > logical cluster as something worth doing, then so be it. I have a > practical example of a valid reason how this would make PostgreSQL easier > to work with. Yes there are work arounds. Yes it is not currently > unworkable. I dont see your problem, really 1) if you have very big and very workloaded databases, you often have them on different physically boxes anyway 2) you can run any number of postmasters on the same box - just put them to listen on different ip Now to the management - you say cddb and geodb are managed off host. So they are not managed on the life server and so you dont need to switch your psql console to them. And yeah, its really not a problem, to quit psql and connect to a different server anyway :-) If you dont like to type -p otherport, you can either create aliases with all the arguments or use something like pgadmin3 which enables you to easy switch from database to database, from host to host as you like. Now is there any usecase I have missed which you still would like to have addressed? Kind regards Tino Wildenhain ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org |
| |||
| Josh Berkus wrote: > Mark, > >> Even though they run on the same machine, run the same version of the >> software, and are used by the same applications, they have NO >> interoperability. For now, lets just accept that they need to be on >> separate physical clusters because some need to be able to started and >> stopped while others need to remain running, there are other reasons, >> but one reason will suffice for the discussion. > > For an immediate solution to what you are encountering, have you looked at > pgPool? I agree with Josh - pgpool sounds like the place to start with this. That's got to be the easiest place to add some sort of "listall"/"switch todb" functionality. It also means you're not *forced* to have only one version of PG, or have them all on the same machine. -- Richard Huxton Archonet Ltd ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings |
| |||
| > Mark Woodward schrieb: > ... >> Unless you can tell me how to insert live data and indexes to a cluster >> without having to reload the data and recreate the indexes, then I >> hardly >> think I am "misinformed." The ad hominem attack wasn't nessisary. > > I see you had a usecase for something like pg_diff and pg_patch > ... >> If no one sees a way to manage multiple physical database clusters as >> one >> logical cluster as something worth doing, then so be it. I have a >> practical example of a valid reason how this would make PostgreSQL >> easier >> to work with. Yes there are work arounds. Yes it is not currently >> unworkable. > > I dont see your problem, really > > 1) if you have very big and very workloaded databases, you often have > them on different physically boxes anyway > 2) you can run any number of postmasters on the same box - just put > them to listen on different ip > > Now to the management - you say cddb and geodb are managed off host. > So they are not managed on the life server and so you dont need to > switch your psql console to them. > > And yeah, its really not a problem, to quit psql and connect > to a different server anyway :-) > > If you dont like to type -p otherport, you can either create > aliases with all the arguments or use something like pgadmin3 > which enables you to easy switch from database to database, > from host to host as you like. > > Now is there any usecase I have missed which you still would > like to have addressed? I don't, as it happens, have these databases on different machines, but come to think about it, maybe it doesn't matter. The "port" aspect is troubling, it isn't really self documenting. The application isn't psql, the applications are custom code written in PHP and C/C++. Like I said, in this thread of posts, yes there are ways of doing this, and I've been doing it for years. It is just one of the rough eges that I think could be smoother. (in php) pg_connect("dbname=geo host=dbserver"); Could connect and query the dbserver, if the db is not on it, connect to a database of known servers, find geo, and use that information to connect. It sounds like a simple thing, for sure, but to be useful, there needs to be buy in from the group otherwise it is just some esoteric hack. The point is, that I have been working with this sort of "use case" for a number of years, and being able to represent multiple physical databases as one logical db server would make life easier. It was a brainstorm I had while I was setting this sort of system for the [n]th time. For my part, I have tried to maintain my own change list for PostgreSQL in the past, but it is a pain. The main source changes too frequently to keep up and in the end is just another project to maintain. Using the "/etc/hosts" file or DNS to maintain host locations for is a fairly common and well known practice, but there is no such mechanism for "ports." The problem now becomes a code issue, not a system administration issue. If one writes the code to their website to use a generic host name, say, "dbserver," then one can easily test system changes locally and push the code to a live site. The only difference is the host name. When a port is involved, there is no systemic way to represent that to the operating system, and must therefor be part of the code. As part of the code, it must reside in a place where code has access, and must NOT be pushed with the rest of the site. Having some mechanism to deal with this would be cleaner IMHO. ---------------------------(end of broadcast)--------------------------- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq |
| |||
| "Mark Woodward" <pgsql@mohawksoft.com> writes: > The point is, that I have been working with this sort of "use case" for a > number of years, and being able to represent multiple physical databases > as one logical db server would make life easier. It was a brainstorm I had > while I was setting this sort of system for the [n]th time. It sounds like all that would be needed is a kind of "smart proxy"--has a list of database clusters on the machine and the databases they contain, and speaks enough of the protocol to recognize the startup packet and reroute it internally to the right cluster. I've heard 'pgpool' mentioned here; from a quick look at the docs it looks similar but not quite what you want. So your databases would listen on 5433, 5434, etc and the proxy would listen on 5432 and route everything properly. If a particular cluster is not up, the proxy could just error out the connection. Hmm, that'd be fun to write if I ever find the time... -Doug ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org |
| |||
| On Fri, Feb 03, 2006 at 08:05:48AM -0500, Mark Woodward wrote: > Using the "/etc/hosts" file or DNS to maintain host locations for is a > fairly common and well known practice, but there is no such mechanism for > "ports." The problem now becomes a code issue, not a system administration > issue. Actually, there is, it's in /etc/services and the functions are getservbyname and getservbyport. I wonder if it'd be possible to have psql use this if you put a string in the port part of the connect string. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.1 (GNU/Linux) iD8DBQFD41vhIB7bNG8LQkwRAi0iAJ9W9L6/py/AWjForUknyxOoh5qOkwCfQnE+ FGoGTC2C7Ff5b+jxM6dkSI8= =LqFd -----END PGP SIGNATURE----- |
| |||
| On Feb 3, 2006, at 08:05, Mark Woodward wrote: > Using the "/etc/hosts" file or DNS to maintain host locations for is a > fairly common and well known practice, but there is no such > mechanism for > "ports." The problem now becomes a code issue, not a system > administration > issue. What if you assigned multiple IPs to a machine, then used ipfw (or something) to forward connections to port 5432 for each IP to the proper IP and port? You could use /etc/hosts or DNS to give each IP a host name, and use it in your code. For example (this only does forwarding for clients on localhost, but you get the idea), you could set up: Host IP -------- --------------- ----------------- db_one 127.0.1.1:5432 192.168.1.5:5432 db_two 127.0.1.2:5432 192.168.1.6:5432 db_three 127.0.1.3:5432 192.168.1.6:5433 fb_four 127.0.1.4:5432 16.51.209.8:8865 You could reconfigure the redirection by changing the ipfw configuration -- you wouldn't change your client code at all. It would continue to use a connection string of "... host=db_one", but you'd change 127.0.1.1:5432 to forward to the new IP/port. Or use pgpool. - Chris ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org |
| ||||
| > "Mark Woodward" <pgsql@mohawksoft.com> writes: > >> The point is, that I have been working with this sort of "use case" for >> a >> number of years, and being able to represent multiple physical databases >> as one logical db server would make life easier. It was a brainstorm I >> had >> while I was setting this sort of system for the [n]th time. > > It sounds like all that would be needed is a kind of "smart > proxy"--has a list of database clusters on the machine and the > databases they contain, and speaks enough of the protocol to recognize > the startup packet and reroute it internally to the right cluster. > I've heard 'pgpool' mentioned here; from a quick look at the docs it > looks similar but not quite what you want. > > So your databases would listen on 5433, 5434, etc and the proxy would > listen on 5432 and route everything properly. If a particular cluster > is not up, the proxy could just error out the connection. > > Hmm, that'd be fun to write if I ever find the time... It is similar to a proxy, yes, but that is just part of it. The setup and running of these systems should all be managed. ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |