Unix Technical Forum

High-availability

This is a discussion on High-availability within the Pgsql General forums, part of the PostgreSQL category; --> Hi all, After realizing that 'clustering' in the PgSQL docs means multiple DBs behind one server, and NOT multple ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > Pgsql General

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-09-2008, 05:39 PM
Madison Kelly
 
Posts: n/a
Default High-availability

Hi all,

After realizing that 'clustering' in the PgSQL docs means multiple
DBs behind one server, and NOT multple machines, I am back at square
one, feeling somewhat the fool. :P

Can anyone point me to docs/websites that discuss options on
replicating in (as close as possible to) realtime? Ideally with load
balancing while both/all servers are up, and failover/resyncing when a
member fails and is restored.

Is this even possible on PostgreSQL?

Being a quite small company, proprietary hardware and fancy software
licenses are not possible (ie: 'use oracle' won't help).

I've looked at slony, but it looks more like a way to push occasional
copies to slaves, and isn't meant to be real time. Am I wrong by chance?

Thanks for any help/tips/pointers!

Madi

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-09-2008, 05:39 PM
Alexander Staubo
 
Posts: n/a
Default Re: High-availability

On 6/1/07, Madison Kelly <linux@alteeve.com> wrote:
> After realizing that 'clustering' in the PgSQL docs means multiple
> DBs behind one server, and NOT multple machines, I am back at square
> one, feeling somewhat the fool. :P


I remember being similarly disappointed in this rampant co-opting of
the word "cluster" back in 7.4 or so. A gaggle of geese, a murder
of crows, a cluster of databases, I guess.

> Can anyone point me to docs/websites that discuss options on
> replicating in (as close as possible to) realtime? Ideally with load
> balancing while both/all servers are up, and failover/resyncing when a
> member fails and is restored.


The PostgreSQL documentation gives a pretty good overview of the options:

http://www.postgresql.org/docs/8.2/i...ilability.html

That said, there is to my knowledge no single, integrated product that
will do all you ask. None are capable of anything near real-time,
automatic failover tends to be left as an exercise for the reader, and
there is a lot of work to get it up and running, and requires
particular care in maintenance and monitoring once it's up.

There are several commercial (Mammoth Replicator comes to mind) and
several open-source projects. Among the open-source ones (Slony-I,
pgpool, PGCluster), I believe Slony-I is the most mature. There are a
few in-progress attempts (pgpool-II, PGCluster 2, PostgreSQL-R) that
are not ready for prime time yet; of these, I believe pgpool-II is the
most promising.

As mentioned in a different thread today, work is being done to
implement WAL-based master-slave replication, which I think should
prove more scalable and more transparent than the current third-party
products:

http://archives.postgresql.org/pgsql...3/msg00050.php

> I've looked at slony, but it looks more like a way to push occasional
> copies to slaves, and isn't meant to be real time. Am I wrong by chance?


Slony is indeed intended for near-real-time replication; it's
asynchronous, so slaves always lag behind the master. The amount of
discrepancy depends on a bunch of factors -- individual node
performance, network performance, and system load.

Alexander.

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org/

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-09-2008, 05:40 PM
Alexander Staubo
 
Posts: n/a
Default Re: High-availability

On 6/3/07, Madison Kelly <linux@alteeve.com> wrote:
> > Slony is indeed intended for near-real-time replication; it's
> > asynchronous, so slaves always lag behind the master. The amount of
> > discrepancy depends on a bunch of factors -- individual node
> > performance, network performance, and system load.

>
> That was *exactly* the kind of link I was trying to find.


You're welcome.

As a side-note, I sat up pgpool-II today, and was pleasantly surprised
about how easy it all was; within two minutes I had two databases in
perfect sync on my laptop. It has limitations (such as in its handling
of sequences), but compared to Slony it's like a breath of fresh
mountain air.

Pgpool-II also supports table partitioning, where you define each
database to have a subset of the data. Pgpool-II then intercepts every
SQL statement and routes it to the correct server. It doesn't work
with referential integrity, I think, which is a major limitation, but
it's the nature of the beast.

Alexander.

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-09-2008, 05:40 PM
Alexander Staubo
 
Posts: n/a
Default Re: High-availability

On 6/3/07, Alexander Staubo <alex@purefiction.net> wrote:
> As a side-note, I sat up pgpool-II today, and was pleasantly surprised
> about how easy it all was; within two minutes I had two databases in
> perfect sync on my laptop. It has limitations (such as in its handling
> of sequences), but compared to Slony it's like a breath of fresh
> mountain air.


Err, the setup is, I mean. Once you have Slony up and running, it's
pretty smooth.

Alexander.

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 04-09-2008, 05:40 PM
Madison Kelly
 
Posts: n/a
Default Re: High-availability

Alexander Staubo wrote:
> On 6/1/07, Madison Kelly <linux@alteeve.com> wrote:
>> After realizing that 'clustering' in the PgSQL docs means multiple
>> DBs behind one server, and NOT multple machines, I am back at square
>> one, feeling somewhat the fool. :P

>
> I remember being similarly disappointed in this rampant co-opting of
> the word "cluster" back in 7.4 or so. A gaggle of geese, a murder
> of crows, a cluster of databases, I guess.
>
>> Can anyone point me to docs/websites that discuss options on
>> replicating in (as close as possible to) realtime? Ideally with load
>> balancing while both/all servers are up, and failover/resyncing when a
>> member fails and is restored.

>
> The PostgreSQL documentation gives a pretty good overview of the options:
>
> http://www.postgresql.org/docs/8.2/i...ilability.html
>
> That said, there is to my knowledge no single, integrated product that
> will do all you ask. None are capable of anything near real-time,
> automatic failover tends to be left as an exercise for the reader, and
> there is a lot of work to get it up and running, and requires
> particular care in maintenance and monitoring once it's up.
>
> There are several commercial (Mammoth Replicator comes to mind) and
> several open-source projects. Among the open-source ones (Slony-I,
> pgpool, PGCluster), I believe Slony-I is the most mature. There are a
> few in-progress attempts (pgpool-II, PGCluster 2, PostgreSQL-R) that
> are not ready for prime time yet; of these, I believe pgpool-II is the
> most promising.
>
> As mentioned in a different thread today, work is being done to
> implement WAL-based master-slave replication, which I think should
> prove more scalable and more transparent than the current third-party
> products:
>
> http://archives.postgresql.org/pgsql...3/msg00050.php
>
>> I've looked at slony, but it looks more like a way to push occasional
>> copies to slaves, and isn't meant to be real time. Am I wrong by chance?

>
> Slony is indeed intended for near-real-time replication; it's
> asynchronous, so slaves always lag behind the master. The amount of
> discrepancy depends on a bunch of factors -- individual node
> performance, network performance, and system load.
>
> Alexander.


That was *exactly* the kind of link I was trying to find.

Thank you!

Madi

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 04-09-2008, 05:40 PM
Lew
 
Posts: n/a
Default Re: High-availability

Alexander Staubo wrote:
>> As a side-note, I sat up pgpool-II today, and was pleasantly surprised
>> about how easy it all was; within two minutes I had two databases in
>> perfect sync on my laptop. It has limitations (such as in its handling
>> of sequences), but compared to Slony it's like a breath of fresh
>> mountain air.

>
> Err, the setup is, I mean. Once you have Slony up and running, it's
> pretty smooth.


I wonder what the OP means by "real-time". The standard definition is "within
a deterministic time bound".

Replication implies latency. Ignoring latency or wishing it away will not help.

It is possible to manage latency. One strategy is to minimize it. There are
others.

Also remember the ancient proverb, applicable when two or more nodes are
trying to agree on what time it is:
"Man with two watches never knows correct time."

I think of this category of issue as the Special Relativity of information.

--
Lew
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #7 (permalink)  
Old 04-09-2008, 05:40 PM
Lew
 
Posts: n/a
Default Re: High-availability

Madison Kelly wrote:
> Being a quite small company, proprietary hardware and fancy software
> licenses are not possible (ie: 'use oracle' won't help).


How much data do you put in the DB? Oracle has a free version, but it has
size limits.

(Ducking the slings and arrows of outraged PG fans: I prefer Postgre, I really
do.)

--
Lew
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #8 (permalink)  
Old 04-09-2008, 05:41 PM
Madison Kelly
 
Posts: n/a
Default Re: High-availability

Lew wrote:
> Madison Kelly wrote:
>> Being a quite small company, proprietary hardware and fancy software
>> licenses are not possible (ie: 'use oracle' won't help).

>
> How much data do you put in the DB? Oracle has a free version, but it
> has size limits.
>
> (Ducking the slings and arrows of outraged PG fans: I prefer Postgre, I
> really do.)
>


Hrm, it's hard to say as we're (hoping!) to grow. At the moment, a
few hundred megs. If the company gets off the ground, possibly much
more. also, we've got a few (dozen or so) side projects that each have
their own DBs.

I think the risk of running into a barrier like a size limit would be
too much. Even if we get off the ground, the storage needs of the DB
will outgrow our revenue. I'd hate to be in a position where I am
dependent on a (potentially) very expensive invoice while we are still
running on a shoe-string.

Thanks for the suggestion though! I will poke at the free/trial
version and, if I am unable to load-balance pgSQL and we run into
performance problems, I will have a better idea of what options I have
(ie: bigger iron vs. an oracle license).

Thanks!

Madi

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #9 (permalink)  
Old 04-09-2008, 05:41 PM
Lew
 
Posts: n/a
Default Re: High-availability

Lew wrote:
>> How much data do you put in the DB? Oracle has a free version, but it
>> has size limits.
>>
>> (Ducking the slings and arrows of outraged PG fans: I prefer Postgre,
>> I really do.)


Madison Kelly wrote:
> Thanks for the suggestion though! I will poke at the free/trial
> version and, if I am unable to load-balance pgSQL and we run into
> performance problems, I will have a better idea of what options I have
> (ie: bigger iron vs. an oracle license).


I only mentioned Oracle because you did. Other companies (IBM DB2 comes to
mind) offer free versions of their for-pay RDBMSes also. If it were my shop,
I'd stick with PG.

For upsized DBs there are a host of issues common across platforms. No matter
what, you will need to become friendly with RAID drive arrays, multi- vs.
single-core CPUs, large RAM allocation and the like.

I am not at all familiar with "load-balancing" on the DB level. Are you
talking about dividing the DB into multiple servers? That seems highly
suspect to me. I used to work for a startup touting a continuous-replication
approach for databases, but it never seemed to me to offer benefits over the
kind of master-slave replication that PG already has available.

Others on this thread have mentioned Pgpool-II (but beware, Alexander Staubo
warned,
> It doesn't work with referential integrity, I think,

), Slony-I and others, about which I know nothing.`

--
Lew
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #10 (permalink)  
Old 04-09-2008, 05:42 PM
Andrew Sullivan
 
Posts: n/a
Default Re: High-availability

On Sun, Jun 03, 2007 at 01:35:49PM -0400, Lew wrote:
> How much data do you put in the DB? Oracle has a free version, but it has
> size limits.


AFAIK, Oracle's free version doesn't include RAC, which is what would
be needed to satisfy the request anyway.

A

--
Andrew Sullivan | ajs@crankycanuck.ca
Users never remark, "Wow, this software may be buggy and hard
to use, but at least there is a lot of code underneath."
--Damien Katz

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 09:35 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com