This is a discussion on Postgresql replication within the Pgsql General forums, part of the PostgreSQL category; --> On Thu, 2005-08-25 at 17:45 -0700, Jeff Davis wrote: > The replicator surely is not optional, and must be ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| On Thu, 2005-08-25 at 17:45 -0700, Jeff Davis wrote: > The replicator surely is not optional, and must be centralized. >From http://pgcluster.projects.postgresql..._3/index.html: "Several replication server can be set up. When an problem occurs at the replication server, Cluster DB automatically changes connection to the standing-by replication server. When all replication servers stop, Cluster DB operates in stand-alone mode. There are two modes for stand-alone mode: read only mode which only permits references, and read write mode, which permits updating." > PGCluster is query-based, right? Yes. ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org |
| |||
| Chris Browne wrote: > I'm pretty sure that they _don't_ track balance updates for each > transaction that applies to a customer's account. You could, via one > form of trickery or another, "overdraw" your account by a fairly hefty > amount, and they probably won't notice for a day or even three. But > once they notice/estimate that the Gentle Caller has built up some > dangerously high balance, they'll warn of impending discontinuation of > service if some sum isn't paid by some date. This works for companies that have some degree of power over their customers. E.g. pay up or we disconnect your service. Return your overdrafts/pay your fees or we mess up your credit. This doesn't work if it's a small company who's account has been emptied. Sure the bank will refuse to honor the check but then that company will be hit with overdraw penalties and possible legal penalties to the payee for the bounced check. The warning threshold system is easy to implement but there will always be corner cases where the warning is not soon enough or a single payment wipes out ~ 100% of the account. Warn too often and it'll be ignored by people as a "boy crying wolf" alarm. |
| |||
| Chris Travers wrote: > I guess I am thinking along different lines than you. I was thinking > that the simplest solution would be to have master/slave replication for > *approved* transactions only and no replication for initial commits > prior to approval. This makes the assumption that a single transaction > will be committed on a single server, and that a single transaction will > not be split over multiple servers. In this way, you can commit a > pending transaction to any single server, and when it is approved, it > gets replicated via the master. See below for more. This works if you don't care that multiple servers commit transactions that force a budget or bank account to be exceeded. > Thinking about this.... The big issue is that you only want to > replicate the deltas, not the entire account. I am still thinking > master/slave, but something where the deltas are replicated in the > background or where the user, in checking his account, is actually > querying the home server. This second issue could be done via dblink or > DBI-Link and would simply require that a master table linking the > accounts with home servers be replicated (this should, I think, be > fairly low-overhead). Except what you know have is your system fails if any server fail or is inaccessible. |
| |||
| William Yu wrote: > Chris Travers wrote: > >> I guess I am thinking along different lines than you. I was thinking >> that the simplest solution would be to have master/slave replication >> for *approved* transactions only and no replication for initial >> commits prior to approval. This makes the assumption that a single >> transaction will be committed on a single server, and that a single >> transaction will not be split over multiple servers. In this way, >> you can commit a pending transaction to any single server, and when >> it is approved, it gets replicated via the master. See below for more. > > > This works if you don't care that multiple servers commit transactions > that force a budget or bank account to be exceeded. > Ok. then lets look at possibility B. (Async Multimaster Replication is out). > >> Thinking about this.... The big issue is that you only want to >> replicate the deltas, not the entire account. I am still thinking >> master/slave, but something where the deltas are replicated in the >> background or where the user, in checking his account, is actually >> querying the home server. This second issue could be done via dblink >> or DBI-Link and would simply require that a master table linking the >> accounts with home servers be replicated (this should, I think, be >> fairly low-overhead). > > > Except what you know have is your system fails if any server fail or > is inaccessible. > Ok. If you have a system where each location is authoritative for its customers and the server transparently queries that server where needed (via a view). then when any server becomes inaccessible then the customers whose accounts are on that server become inaccessible. This may not be accessible. But if this is the case, then you could treat this as a partitioned table, where each partition is authoritative on one location (see a previous post on how to do this) and then use Slony to replicate. Again this does not get around your objection above namely that it is possible to do duplicate transactions at multiple locations. For this you would need an explicit call to the authoritative server. I see no other way around that. This might allow people to view tentative balances from other branches if the home (or its connection) server is down, but they would not be able to withdraw funds. But if you do this, you have a different problem. Namely that replication will be occuring over your slower than desired links. As the number of branches grow, so will the bandwidth demands on every branch. This may not therefore be scalable. Unfortunately there is no free lunch here. And I think that at some point you are going to have to choose between: 1) Efficiency of network throughput 2) Tolerance to attempts at repeat transactions before replication (emptying an account multiple times) 3) Availability of a transaction. You can pick any two. I think that this is the way it will work with any other database system as well. Best Wishes, Chris Travers Metatron Technology Consulting > ---------------------------(end of broadcast)--------------------------- > TIP 3: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faq > > ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org |
| |||
| Chris Travers wrote: > 1) Efficiency of network throughput > 2) Tolerance to attempts at repeat transactions before replication > (emptying an account multiple times) > 3) Availability of a transaction. We ended up having to give up #1. It's possible to have our transactions routed to multiple servers before it becomes a final transaction. User1 might request a payment on ServerA. User2 then marks the payment as approved on ServerB. ServerC is authoritative and checks the bank/budget balances before posting as final. After each of these steps requires replication of the latest changes to all other servers. (In theory, the first 2 steps only require replication to the authoritative server but we do so anyways so all servers can act as backups for each other -- pending transactions still need to be restored in case of total DB failure.) There's definitely a delay in terms of getting from point A to point Z; duplicate servers. But there's guaranteed financial integrity, users can connect to any server the load balancer picks and no server requires any other server/connection to be up for individual user tranactions to occur. |
| |||
| William Yu wrote: > Chris Travers wrote: > >> 1) Efficiency of network throughput >> 2) Tolerance to attempts at repeat transactions before replication >> (emptying an account multiple times) >> 3) Availability of a transaction. > > > We ended up having to give up #1. It's possible to have our > transactions routed to multiple servers before it becomes a final > transaction. User1 might request a payment on ServerA. User2 then > marks the payment as approved on ServerB. ServerC is authoritative and > checks the bank/budget balances before posting as final. After each of > these steps requires replication of the latest changes to all other > servers. (In theory, the first 2 steps only require replication to the > authoritative server but we do so anyways so all servers can act as > backups for each other -- pending transactions still need to be > restored in case of total DB failure.) Ok. I see where you are going with this. It is an interesting problem. Multimaster Async Replication will give you a problem in that it allows the attack you are describing due to the asynchronous nature of the replication. If I were trying to prevent this sort of attack, I would try to build into this some sort of "account authority" which can manage these transactions. Origionally I was thinking of the home server as the obvious place to start if it is available. But if it is not, then you would need some infrastructure to track attempted withdrawals and handle them appropriately. Such servers could cache requests and if they see duplicates or many requests coming from many servers on the same account could flag that. One option might be to have a rotational authority (i.e. home-server, then next, then next) in a ring so that an unavailable server still allows reasonable precautions to be held against emptying the account many times. Basically, if the servers aren't talking to eachother at the time of the transaction, then they are not going to know about duplicates. You have to have some setup beyond your replication to handle this. Otherwise you will have some issues with delays causing the security risks you deem unacceptable. Your question seems to be "How do I set up multimaster async replication such that a person cannot empty his account on each server" and the answer is that this is an inherent limitation of multimaster async replication. This also means that you will have to have some sort of other verification process for such transactions beyond what is locally available on the replicants. > There's definitely a delay in terms of getting from point A to point > Z; duplicate servers. But there's guaranteed financial integrity, > users can connect to any server the load balancer picks and no server > requires any other server/connection to be up for individual user > tranactions to occur. The delay will by definition defeat any guarantee of financial integrity if you are allowing read-write operations to the replica without checking with some sort of central authority. At very least, the central authority should look for suspicious patterns. Again, it may be possible to do some failover here, but I don't think you can do without *some* sort of centralized control. (Note, here load balancing is handled by the distribution of accounts. A down server simply means that the next server in the ring will take over its remote verification role). This doesn;t make the security issue go away, but it may reduce it to an acceptable level. I.e. it is still possible for duplicates to be submitted just before and after a home server goes down, but this is a lot better than being able to have one transaction repeated on each server and then dealing with the massively overdrawn account. Best Wishes, Chris Travers Metatron Technology Consulting ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |
| |||
| Our own personal IM Chris Travers wrote: > The delay will by definition defeat any guarantee of financial integrity > if you are allowing read-write operations to the replica without > checking with some sort of central authority. At very least, the > central authority should look for suspicious patterns. Again, it may be > possible to do some failover here, but I don't think you can do without > *some* sort of centralized control. Actually this is the easy part. When the home server finally issues payments, it only issues what it knows about and what can be verified as OK. Any transactions that are currently being entered on another server will appear after the next replication cycle and it will be verified afterwards. If the previous payment issuing cycle used up all the money, the "new" requests are kept in pending until money is put in. This does allow for newer requests that happen to be executed on home servers to possibly take precendence over old requests but there is no requirement in the business process that payments must come out in any specific order. > This doesn;t make the security issue go away, but it may reduce it to an > acceptable level. I.e. it is still possible for duplicates to be > submitted just before and after a home server goes down, but this is a > lot better than being able to have one transaction repeated on each > server and then dealing with the massively overdrawn account. The "home" server going down is the trickiest issue. Because when a server disappears, is that because it went down temporarily? For good? A temporary internet problem where nobody can get access to it? Or an internet routing issue where just the connection between those two servers is severed? If it's the last, users might still be doing stuff on ServerA with ServerA is posting financials but ServerB thinks the server is down and decides to take over ServerA's duties. Of course, in ServerA's view, it's ServerB and ServerC that's down -- not itself. Maybe we can mitigate this by having more servers at more data centers around the world so everybody can monitor everybody. At some point, if you have N servers and N-1 servers say ServerA is down, it probably is down. With a high enough N, ServerA could probably decisively decide it was the server severed from the internet and refuse to post any financials until connection to the outside world was restore + some extra threshold. This last problem, which luckily occurs rarely, we do by hand right now. We're not ready to run this on full auto because we only have 2 data centers (with multiple servers within each data center). The servers do not have enough info to know which server is actually down in order to auto-promote/demote. It does require staff that's not just in 1 location though because our primary office going down w/ our local datacenter would mean nobody there could do the switchover. (Assuming major natural disaster that kept us from using our laptops at the local Starbucks to do the work.) |
| |||
| William Yu <wyu@talisys.com> writes: > Chris Browne wrote: >> I'm pretty sure that they _don't_ track balance updates for each >> transaction that applies to a customer's account. You could, via one >> form of trickery or another, "overdraw" your account by a fairly hefty >> amount, and they probably won't notice for a day or even three. But >> once they notice/estimate that the Gentle Caller has built up some >> dangerously high balance, they'll warn of impending discontinuation of >> service if some sum isn't paid by some date. > > This works for companies that have some degree of power over their > customers. E.g. pay up or we disconnect your service. Return your > overdrafts/pay your fees or we mess up your credit. > > This doesn't work if it's a small company who's account has been > emptied. Sure the bank will refuse to honor the check but then that > company will be hit with overdraw penalties and possible legal > penalties to the payee for the bounced check. > > The warning threshold system is easy to implement but there will > always be corner cases where the warning is not soon enough or a > single payment wipes out ~ 100% of the account. Warn too often and > it'll be ignored by people as a "boy crying wolf" alarm. In a context where there is no "degree of power over their customers," I would hardly think that the presence/lack of automated controls or presence/lack of balance synchronization is particularly material. In other words, if trying to apply policies is going to forcibly break, then building the data system may have been a futile exercise in the first place. And trying to distribute the system may again be a *business* mistake that admits no technical solution. If that's the case, then we can conclude that replication is no solution, and that the organization should improve connectivity for their single centralized system. That may be the case, but is the uninteresting case, as it amounts to throwing our hands up, saying "no answers; you can't have a distributed system," and giving up on trying anything further. No point to further discussion... -- let name="cbbrowne" and tld="acm.org" in String.concat "@" [name;tld];; http://cbbrowne.com/info/linux.html "There is no reason anyone would want a computer in their home". -- Ken Olson, Pres. and founder of Digital Equipment Corp. 1977 |
| |||
| On Thu, Aug 25, 2005 at 01:44:15PM +0200, Bohdan Linda wrote: > there are some other db solutions which have good performance when doing > this kind of replication across the world. Bluntly, "No." -- Andrew Sullivan | ajs@crankycanuck.ca Information security isn't a technological problem. It's an economics problem. --Bruce Schneier ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| ||||
| On Wed, Aug 24, 2005 at 11:54:42AM -0400, Chris Browne wrote: > There is a Slony-II project ongoing that is trying to construct a > more-or-less synchronous multimaster replication system (where part of > the cleverness involves trying to get as much taking place in an > asynchronous fashion as possible) that would almost certainly be of no > use to your "use case." Just to emphasise this point: assuming we ever get Slony-II to work, it is all but guaranteed to be useless for cases like the one that started this thread: it'll simply require very fast network connections to work. I've had more than one person ask me when multi-site multimaster is coming, and my answer is always, "Have you started work on it yet?" I think there might be a way to hack up Slony-I to do it -- Josh Berkus gave me a quick outline while at OSCON that made me think it possible -- but AFAIK, nobody is actually doing that work. It's worth noting that single-origin master-slave async replication is tricky, but by no means impossible. Multi-master _anything_ is hard, no question about it; and it more or less always imposes some overhead that you won't like. The question is merely whether you want to pay that price. A -- Andrew Sullivan | ajs@crankycanuck.ca A certain description of men are for getting out of debt, yet are against all taxes for raising money to pay it off. --Alexander Hamilton ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |