This is a discussion on High-availability within the Pgsql General forums, part of the PostgreSQL category; --> Hi all, After realizing that 'clustering' in the PgSQL docs means multiple DBs behind one server, and NOT multple ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Hi all, After realizing that 'clustering' in the PgSQL docs means multiple DBs behind one server, and NOT multple machines, I am back at square one, feeling somewhat the fool. :P Can anyone point me to docs/websites that discuss options on replicating in (as close as possible to) realtime? Ideally with load balancing while both/all servers are up, and failover/resyncing when a member fails and is restored. Is this even possible on PostgreSQL? Being a quite small company, proprietary hardware and fancy software licenses are not possible (ie: 'use oracle' won't help). I've looked at slony, but it looks more like a way to push occasional copies to slaves, and isn't meant to be real time. Am I wrong by chance? Thanks for any help/tips/pointers! Madi ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings |
| |||
| On 6/1/07, Madison Kelly <linux@alteeve.com> wrote: > After realizing that 'clustering' in the PgSQL docs means multiple > DBs behind one server, and NOT multple machines, I am back at square > one, feeling somewhat the fool. :P I remember being similarly disappointed in this rampant co-opting of the word "cluster" back in 7.4 or so. of crows, a cluster of databases, I guess. > Can anyone point me to docs/websites that discuss options on > replicating in (as close as possible to) realtime? Ideally with load > balancing while both/all servers are up, and failover/resyncing when a > member fails and is restored. The PostgreSQL documentation gives a pretty good overview of the options: http://www.postgresql.org/docs/8.2/i...ilability.html That said, there is to my knowledge no single, integrated product that will do all you ask. None are capable of anything near real-time, automatic failover tends to be left as an exercise for the reader, and there is a lot of work to get it up and running, and requires particular care in maintenance and monitoring once it's up. There are several commercial (Mammoth Replicator comes to mind) and several open-source projects. Among the open-source ones (Slony-I, pgpool, PGCluster), I believe Slony-I is the most mature. There are a few in-progress attempts (pgpool-II, PGCluster 2, PostgreSQL-R) that are not ready for prime time yet; of these, I believe pgpool-II is the most promising. As mentioned in a different thread today, work is being done to implement WAL-based master-slave replication, which I think should prove more scalable and more transparent than the current third-party products: http://archives.postgresql.org/pgsql...3/msg00050.php > I've looked at slony, but it looks more like a way to push occasional > copies to slaves, and isn't meant to be real time. Am I wrong by chance? Slony is indeed intended for near-real-time replication; it's asynchronous, so slaves always lag behind the master. The amount of discrepancy depends on a bunch of factors -- individual node performance, network performance, and system load. Alexander. ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org/ |
| |||
| On 6/3/07, Madison Kelly <linux@alteeve.com> wrote: > > Slony is indeed intended for near-real-time replication; it's > > asynchronous, so slaves always lag behind the master. The amount of > > discrepancy depends on a bunch of factors -- individual node > > performance, network performance, and system load. > > That was *exactly* the kind of link I was trying to find. You're welcome. As a side-note, I sat up pgpool-II today, and was pleasantly surprised about how easy it all was; within two minutes I had two databases in perfect sync on my laptop. It has limitations (such as in its handling of sequences), but compared to Slony it's like a breath of fresh mountain air. Pgpool-II also supports table partitioning, where you define each database to have a subset of the data. Pgpool-II then intercepts every SQL statement and routes it to the correct server. It doesn't work with referential integrity, I think, which is a major limitation, but it's the nature of the beast. Alexander. ---------------------------(end of broadcast)--------------------------- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq |
| |||
| On 6/3/07, Alexander Staubo <alex@purefiction.net> wrote: > As a side-note, I sat up pgpool-II today, and was pleasantly surprised > about how easy it all was; within two minutes I had two databases in > perfect sync on my laptop. It has limitations (such as in its handling > of sequences), but compared to Slony it's like a breath of fresh > mountain air. Err, the setup is, I mean. Once you have Slony up and running, it's pretty smooth. Alexander. ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| |||
| Alexander Staubo wrote: > On 6/1/07, Madison Kelly <linux@alteeve.com> wrote: >> After realizing that 'clustering' in the PgSQL docs means multiple >> DBs behind one server, and NOT multple machines, I am back at square >> one, feeling somewhat the fool. :P > > I remember being similarly disappointed in this rampant co-opting of > the word "cluster" back in 7.4 or so. > of crows, a cluster of databases, I guess. > >> Can anyone point me to docs/websites that discuss options on >> replicating in (as close as possible to) realtime? Ideally with load >> balancing while both/all servers are up, and failover/resyncing when a >> member fails and is restored. > > The PostgreSQL documentation gives a pretty good overview of the options: > > http://www.postgresql.org/docs/8.2/i...ilability.html > > That said, there is to my knowledge no single, integrated product that > will do all you ask. None are capable of anything near real-time, > automatic failover tends to be left as an exercise for the reader, and > there is a lot of work to get it up and running, and requires > particular care in maintenance and monitoring once it's up. > > There are several commercial (Mammoth Replicator comes to mind) and > several open-source projects. Among the open-source ones (Slony-I, > pgpool, PGCluster), I believe Slony-I is the most mature. There are a > few in-progress attempts (pgpool-II, PGCluster 2, PostgreSQL-R) that > are not ready for prime time yet; of these, I believe pgpool-II is the > most promising. > > As mentioned in a different thread today, work is being done to > implement WAL-based master-slave replication, which I think should > prove more scalable and more transparent than the current third-party > products: > > http://archives.postgresql.org/pgsql...3/msg00050.php > >> I've looked at slony, but it looks more like a way to push occasional >> copies to slaves, and isn't meant to be real time. Am I wrong by chance? > > Slony is indeed intended for near-real-time replication; it's > asynchronous, so slaves always lag behind the master. The amount of > discrepancy depends on a bunch of factors -- individual node > performance, network performance, and system load. > > Alexander. That was *exactly* the kind of link I was trying to find. Thank you! Madi ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings |
| |||
| Alexander Staubo wrote: >> As a side-note, I sat up pgpool-II today, and was pleasantly surprised >> about how easy it all was; within two minutes I had two databases in >> perfect sync on my laptop. It has limitations (such as in its handling >> of sequences), but compared to Slony it's like a breath of fresh >> mountain air. > > Err, the setup is, I mean. Once you have Slony up and running, it's > pretty smooth. I wonder what the OP means by "real-time". The standard definition is "within a deterministic time bound". Replication implies latency. Ignoring latency or wishing it away will not help. It is possible to manage latency. One strategy is to minimize it. There are others. Also remember the ancient proverb, applicable when two or more nodes are trying to agree on what time it is: "Man with two watches never knows correct time." I think of this category of issue as the Special Relativity of information. -- Lew |
| |||
| Madison Kelly wrote: > Being a quite small company, proprietary hardware and fancy software > licenses are not possible (ie: 'use oracle' won't help). How much data do you put in the DB? Oracle has a free version, but it has size limits. (Ducking the slings and arrows of outraged PG fans: I prefer Postgre, I really do.) -- Lew |
| |||
| Lew wrote: > Madison Kelly wrote: >> Being a quite small company, proprietary hardware and fancy software >> licenses are not possible (ie: 'use oracle' won't help). > > How much data do you put in the DB? Oracle has a free version, but it > has size limits. > > (Ducking the slings and arrows of outraged PG fans: I prefer Postgre, I > really do.) > Hrm, it's hard to say as we're (hoping!) to grow. At the moment, a few hundred megs. If the company gets off the ground, possibly much more. also, we've got a few (dozen or so) side projects that each have their own DBs. I think the risk of running into a barrier like a size limit would be too much. Even if we get off the ground, the storage needs of the DB will outgrow our revenue. I'd hate to be in a position where I am dependent on a (potentially) very expensive invoice while we are still running on a shoe-string. Thanks for the suggestion though! I will poke at the free/trial version and, if I am unable to load-balance pgSQL and we run into performance problems, I will have a better idea of what options I have (ie: bigger iron vs. an oracle license). Thanks! Madi ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |
| |||
| Lew wrote: >> How much data do you put in the DB? Oracle has a free version, but it >> has size limits. >> >> (Ducking the slings and arrows of outraged PG fans: I prefer Postgre, >> I really do.) Madison Kelly wrote: > Thanks for the suggestion though! I will poke at the free/trial > version and, if I am unable to load-balance pgSQL and we run into > performance problems, I will have a better idea of what options I have > (ie: bigger iron vs. an oracle license). I only mentioned Oracle because you did. Other companies (IBM DB2 comes to mind) offer free versions of their for-pay RDBMSes also. If it were my shop, I'd stick with PG. For upsized DBs there are a host of issues common across platforms. No matter what, you will need to become friendly with RAID drive arrays, multi- vs. single-core CPUs, large RAM allocation and the like. I am not at all familiar with "load-balancing" on the DB level. Are you talking about dividing the DB into multiple servers? That seems highly suspect to me. I used to work for a startup touting a continuous-replication approach for databases, but it never seemed to me to offer benefits over the kind of master-slave replication that PG already has available. Others on this thread have mentioned Pgpool-II (but beware, Alexander Staubo warned, > It doesn't work with referential integrity, I think, ), Slony-I and others, about which I know nothing.` -- Lew |
| ||||
| On Sun, Jun 03, 2007 at 01:35:49PM -0400, Lew wrote: > How much data do you put in the DB? Oracle has a free version, but it has > size limits. AFAIK, Oracle's free version doesn't include RAC, which is what would be needed to satisfy the request anyway. A -- Andrew Sullivan | ajs@crankycanuck.ca Users never remark, "Wow, this software may be buggy and hard to use, but at least there is a lot of code underneath." --Damien Katz ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings |
| Thread Tools | |
| Display Modes | |
|
|