Tom Lane wrote:
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> > Hannu Krosing asked me about his patch to ignore transactions running
> > VACUUM LAZY in other vacuum transactions.  I attach a version of the
> > patch updated to the current sources.
>
> nonInVacuumXmin seems useless ... perhaps a vestige of some earlier
> version of the computation?

Hmm, not useless at all really -- only a bug of mine.  Turns out the
notInVacuumXmin stuff is essential, so I put it back.

I noticed something, however: in calculating the OldestXmin we always
consider all DBs, even though there is a parameter for skipping backends
not in the current DB.  This is because the Xmin we store in PGPROC is
always computed using all backends.  The allDbs parameter only allows us
to skip the Xid of a transaction running elsewhere, but this is not very
helpful because the Xmin of transactions running in the local DB will
include those foreign Xids.

In case I'm not explaining myself, the problem is that if I open a
transaction in database A and then vacuum a table in database B, those
tuples deleted after the transaction in database A started cannot be
removed.

To solve this problem, one idea is to change the new member of PGPROC to
"current database's not in vacuum Xmin", which is the minimum of the
Xmins of backends running in my database which are not executing a lazy
vacuum.  This can be used to vacuum non-shared relations.

We could either add it anew, beside nonInVacuumXmin, or replace
nonInVacuumXmin.  The difference will be whether we will have something
to be used to vacuum shared relations or not.  I think in general,
shared relations are not vacuumed much, so it shouldn't be too much of a
problem if we leave them to be vacuumed with the regular, all-databases,
include-vacuum Xmin.

The other POV is that we don't really care about long-running
transactions in other databases unless they are lazy vacuums, a case
which is appropriately covered by the patch as it currently stands.
This seems to be the POV that Hannu takes: the only long-running
transactions he cares about are lazy vacuums.

Thoughts?

-- 
Alvaro Herrera                          http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match
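A toy model may make the cross-database leak easier to see.  The sketch
below is not PostgreSQL code: `Backend`, `start_transaction`,
`oldest_xmin`, `oldest_db_xmin` and the `db_xmin` field are hypothetical
names, each backend is reduced to a database name and a few integers,
and lazy vacuum is left out entirely so that only the database dimension
of the problem remains.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    db: str        # database the backend is connected to
    xid: int       # XID of its running transaction
    xmin: int      # minimum XID over *all* backends at transaction start
    db_xmin: int   # hypothetical: minimum over same-database backends only


def start_transaction(procs, db, xid):
    """Register a new backend, recording both flavours of xmin."""
    xmin = min([xid] + [p.xid for p in procs])
    db_xmin = min([xid] + [p.xid for p in procs if p.db == db])
    backend = Backend(db, xid, xmin, db_xmin)
    procs.append(backend)
    return backend


def oldest_xmin(procs, my_db, my_xid, all_dbs):
    """Toy OldestXmin: all_dbs=False skips foreign backends, but the
    stored xmin of local backends already includes foreign XIDs."""
    result = my_xid
    for p in procs:
        if all_dbs or p.db == my_db:
            result = min(result, p.xid, p.xmin)
    return result


def oldest_db_xmin(procs, my_db, my_xid):
    """Same scan, using the hypothetical per-database xmin instead."""
    result = my_xid
    for p in procs:
        if p.db == my_db:
            result = min(result, p.xid, p.db_xmin)
    return result


def demo():
    procs = []
    start_transaction(procs, "A", 100)  # long-running transaction in database A
    start_transaction(procs, "B", 200)  # ordinary transaction in database B
    vacuum_xid = 300                    # a vacuum starting in database B
    return (
        oldest_xmin(procs, "B", vacuum_xid, all_dbs=True),
        oldest_xmin(procs, "B", vacuum_xid, all_dbs=False),
        oldest_db_xmin(procs, "B", vacuum_xid),
    )
```

In this model `demo()` returns `(100, 100, 200)`: skipping database A's
backend changes nothing, because the transaction in database B already
recorded XID 100 in its own xmin.  Only the per-database value lets the
vacuum in B use 200 as its cutoff.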
| |||
One fine day, Thu, 2006-07-27 at 19:29, Alvaro Herrera wrote:
>
> We could either add it anew, beside nonInVacuumXmin, or replace
> nonInVacuumXmin.  The difference will be whether we will have something
> to be used to vacuum shared relations or not.  I think in general,
> shared relations are not vacuumed much so it shouldn't be too much of a
> problem if we leave them to be vacuumed with the regular, all-databases,
> include-vacuum Xmin.

Yes.  I don't think that vacuuming shared relations will ever be a
significant performance concern.

> The other POV is that we don't really care about long-running
> transaction in other databases unless they are lazy vacuum, a case which
> is appropiately covered by the patch as it currently stands.  This seems
> to be the POV that Hannu takes: the only long-running transactions he
> cares about are lazy vacuums.

Yes.  The original target audience of this patch is users running 24/7
OLTP databases with big slow-changing tables and small fast-changing
tables which need to stay small even while the big ones are being
vacuumed.

The other transactions which _could_ possibly be ignored while
VACUUMING are those from ANALYSE and non-lazy VACUUMs.  I don't care
about them, because:

ANALYSE is relatively fast, even on huge tables, and thus can be
ignored.

If you do run VACUUM FULL on anything bigger than a few thousand lines
then you are not running a 24/7 OLTP database anyway.

I also can't see a use case for an OLTP database where VACUUM FREEZE is
required.

Maybe we could also start ignoring the transactions that are running the
new CONCURRENT CREATE INDEX command, as it also runs inside its own
transaction(s), which can't possibly touch the tuples in the table being
vacuumed because it locks out VACUUM on the indexed table.  That would
probably be quite easy to do by just having CONCURRENT CREATE INDEX also
mark its transactions as ignorable by VACUUM.

Maybe the variable name for that (proc->inVacuum) needs to be changed to
something like trxSafeToIgnoreByVacuum.

-- 
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com
| |||
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Tom Lane wrote:
>> nonInVacuumXmin seems useless ... perhaps a vestige of some earlier
>> version of the computation?

> Hmm, not useless at all really -- only a bug of mine.  Turns out the
> notInVacuumXmin stuff is essential, so I put it back.

Uh, why?

> I noticed something however -- in calculating the OldestXmin we always
> consider all DBs, even though there is a parameter for skipping backends
> not in the current DB -- this is because the Xmin we store in PGPROC is
> always computed using all backends.  The allDbs parameter only allows us
> to skip the Xid of a transaction running elsewhere, but this is not very
> helpful because the Xmin of transactions running in the local DB will
> include those foreign Xids.

Yeah, this has been recognized for some time.  However the overhead of
calculating local and global xmins in *every* transaction start is a
significant reason not to do it.

			regards, tom lane
| |||
Another idea Jan had today was whether we could vacuum more rows if a
long-running backend is in serializable mode, like pg_dump.

---------------------------------------------------------------------------

Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > Tom Lane wrote:
> >> nonInVacuumXmin seems useless ... perhaps a vestige of some earlier
> >> version of the computation?
>
> > Hmm, not useless at all really -- only a bug of mine.  Turns out the
> > notInVacuumXmin stuff is essential, so I put it back.
>
> Uh, why?
>
> > I noticed something however -- in calculating the OldestXmin we always
> > consider all DBs, even though there is a parameter for skipping backends
> > not in the current DB -- this is because the Xmin we store in PGPROC is
> > always computed using all backends.  The allDbs parameter only allows us
> > to skip the Xid of a transaction running elsewhere, but this is not very
> > helpful because the Xmin of transactions running in the local DB will
> > include those foreign Xids.
>
> Yeah, this has been recognized for some time.  However the overhead of
> calculating local and global xmins in *every* transaction start is a
> significant reason not to do it.
>
> 			regards, tom lane

-- 
  Bruce Momjian   bruce@momjian.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
| |||
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > Tom Lane wrote:
> >> nonInVacuumXmin seems useless ... perhaps a vestige of some earlier
> >> version of the computation?
>
> > Hmm, not useless at all really -- only a bug of mine.  Turns out the
> > notInVacuumXmin stuff is essential, so I put it back.
>
> Uh, why?

Because it's used to determine the Xmin that our vacuum will use.  If
there is a transaction whose Xmin calculation included the Xid of a
transaction running vacuum, we have gained nothing from directly
excluding said vacuum's Xid, because it will affect us anyway indirectly
via that transaction's Xmin.

-- 
Alvaro Herrera                          http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
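The indirect effect can be traced in a small simulation.  This is only a
sketch under simplifying assumptions, not the patch itself: `Proc`,
`start` and `vacuum_cutoff` are hypothetical names, and
`not_in_vacuum_xmin` stands in for the extra PGPROC member discussed in
the thread.

```python
from dataclasses import dataclass


@dataclass
class Proc:
    xid: int                 # XID of the backend's running transaction
    in_vacuum: bool          # True if the backend is running a lazy vacuum
    xmin: int                # minimum over all running XIDs at start
    not_in_vacuum_xmin: int  # minimum over XIDs of backends not in lazy vacuum


def start(procs, xid, in_vacuum=False):
    """Register a backend and record both xmin values at transaction start."""
    xmin = min([xid] + [p.xid for p in procs])
    nv_xmin = min([xid] + [p.xid for p in procs if not p.in_vacuum])
    proc = Proc(xid, in_vacuum, xmin, nv_xmin)
    procs.append(proc)
    return proc


def vacuum_cutoff(procs, my_xid, use_not_in_vacuum_xmin):
    """Cutoff for a new vacuum that directly skips backends in lazy vacuum."""
    result = my_xid
    for p in procs:
        if p.in_vacuum:
            continue  # direct exclusion of the other vacuum's own XID
        stored = p.not_in_vacuum_xmin if use_not_in_vacuum_xmin else p.xmin
        result = min(result, p.xid, stored)
    return result


def demo():
    procs = []
    start(procs, 100, in_vacuum=True)  # long lazy vacuum of a big table
    start(procs, 150)                  # ordinary transaction started later
    second_vacuum_xid = 160
    return (
        vacuum_cutoff(procs, second_vacuum_xid, use_not_in_vacuum_xmin=False),
        vacuum_cutoff(procs, second_vacuum_xid, use_not_in_vacuum_xmin=True),
    )
```

Here `demo()` returns `(100, 150)`.  Skipping the first vacuum's backend
is not enough on its own, since the ordinary transaction recorded XID
100 in its regular xmin; the cutoff only advances to 150 when the
separately stored value is consulted.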
| |||
One fine day, Thu, 2006-07-27 at 22:05, Bruce Momjian wrote:
> Another idea Jan had today was whether we could vacuum more rows if a
> long-running backend is in serializable mode, like pg_dump.

I don't see how this gives us the ability to vacuum more rows, as the
snapshot of a serializable transaction is the oldest one.

-- 
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com
| |||
Hannu Krosing wrote:
> One fine day, Thu, 2006-07-27 at 22:05, Bruce Momjian wrote:
> > Another idea Jan had today was whether we could vacuum more rows if a
> > long-running backend is in serializable mode, like pg_dump.
>
> I don't see how this gives us ability to vacuum more rows, as the
> snapshot of a serializable transaction is the oldest one.

Good question.  Imagine you have a serializable transaction like
pg_dump, and then you have lots of newer transactions.  If pg_dump is
xid=12, and all the new transactions start at xid=30, any row created
and expired between 12 and 30 can be removed because it is not visible
to any of them.  For a use case, imagine an UPDATE chain where a row was
created by xid=15 and expired by xid=19.  Right now, we don't remove
that row, though we could.

-- 
  Bruce Momjian   bruce@momjian.us
  EnterpriseDB    http://www.enterprisedb.com
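The arithmetic in this message can be checked with a deliberately
simplified visibility model.  The names `RowVersion`, `visible` and
`removable_by_current_rule` are hypothetical, every XID is assumed to
have committed, and a snapshot is reduced to a single cutoff that sees
exactly the XIDs below it.  The sketch covers visibility only; it says
nothing about whether removing such a row version would be safe for
transactions that update rows.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RowVersion:
    created_by: int            # XID that inserted this version (assumed committed)
    expired_by: Optional[int]  # XID that updated or deleted it, or None if live


def visible(version, horizon):
    """Simplified snapshot: sees exactly the committed XIDs below horizon."""
    created = version.created_by < horizon
    expired = version.expired_by is not None and version.expired_by < horizon
    return created and not expired


def removable_by_current_rule(version, oldest_xmin):
    """A version is removed only if it expired before the oldest xmin."""
    return version.expired_by is not None and version.expired_by < oldest_xmin


def demo():
    row = RowVersion(created_by=15, expired_by=19)
    return (
        visible(row, 12),                    # pg_dump's snapshot (xid=12)
        visible(row, 30),                    # the newer transactions (xid=30)
        removable_by_current_rule(row, 12),  # oldest xmin is held at 12
    )
```

`demo()` returns `(False, False, False)`: neither snapshot can see the
row version, yet the cutoff of 12 keeps it from being removed.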
| |||
Bruce Momjian <bruce@momjian.us> writes:
> Good question.  Imagine you have a serializable transaction like
> pg_dump, and then you have lots of newer transactions.  If pg_dump is
> xid=12, and all the new transactions start at xid=30, any row created
> and expired between 12 and 30 can be removed because they are not
> visible.

This reasoning is bogus.

It would probably be safe for pg_dump because it's a read-only
operation, but it fails badly if the serializable transaction is trying
to do updates.  An update needs to chase the chain of newer versions of
the row forward from the version that's visible to the xact's
serializable snapshot, to see if anyone has committed a newer version.
Your proposal would remove elements of that chain, thereby possibly
allowing the serializable xact to conclude it may update the tuple
when it should have given an error.

			regards, tom lane
| |||
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Good question.  Imagine you have a serializable transaction like
> > pg_dump, and then you have lots of newer transactions.  If pg_dump is
> > xid=12, and all the new transactions start at xid=30, any row created
> > and expired between 12 and 30 can be removed because they are not
> > visible.
>
> This reasoning is bogus.
>
> It would probably be safe for pg_dump because it's a read-only
> operation, but it fails badly if the serializable transaction is trying
> to do updates.  An update needs to chase the chain of newer versions of
> the row forward from the version that's visible to the xact's
> serializable snapshot, to see if anyone has committed a newer version.
> Your proposal would remove elements of that chain, thereby possibly
> allowing the serializable xact to conclude it may update the tuple
> when it should have given an error.

So in fact members of the chain are not visible, but vacuum doesn't have
a strong enough lock to remove parts of the chain.  What seems strange
is that vacuum can trim the chain, but only if it removes members
starting from the head.  I assume this is because you don't need to
rejoin the chain around the expired tuples.

("bogus" seems a little strong.)

-- 
  Bruce Momjian   bruce@momjian.us
  EnterpriseDB    http://www.enterprisedb.com
| ||||
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > Tom Lane wrote:
> >> Uh, why?
>
> > Because it's used to determine the Xmin that our vacuum will use.  If
> > there is a transaction whose Xmin calculation included the Xid of a
> > transaction running vacuum, we have gained nothing from directly
> > excluding said vacuum's Xid, because it will affect us anyway indirectly
> > via that transaction's Xmin.
>
> But the patch changes things so that *everyone* excludes the vacuum from
> their xmin.  Or at least I thought that was the plan.

We shouldn't do that, because that Xmin is also used to truncate
SUBTRANS.  Unless we are prepared to say that vacuum does not use
subtransactions so it doesn't matter.  This is true currently, so we
could go ahead and do it (unless I'm missing something) -- but it means
lazy vacuum will never be able to use subtransactions.

-- 
Alvaro Herrera                          http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.