Is full_page_writes=off safe in conjunction with PITR? (pgsql Hackers forum, PostgreSQL category)
While thinking about the patch I just made to allow full_page_writes to be turned off again, it struck me that this patch only fixes the problem for post-crash XLOG replay. There is still a hazard if the variable is turned off in a PITR master system. The reason is that while a base backup is being taken, the backup-taker might read an inconsistent state of a page and include that in the backup. This is not a problem if full_page_writes is ON --- it's logically equivalent to a torn page write and will be fixed on the slave by XLOG replay. But it *is* a problem if full_page_writes is OFF, for exactly the same reason that torn page writes are a problem.

I think we had originally argued that there was no problem anyway because the kernel should cause the page write to appear atomic to other processes (since we issue it in a single write() command). But that's only true if the backup-taker reads in units that are multiples of BLCKSZ. If the backup-taker reads, say, 4K at a time then it's certainly possible that it gets a later version of the second half of a page than it got of the first half. I don't know about you, but I sure don't feel comfortable making assumptions at that level about the behavior of tar or cpio.

I fear we still have to disable full_page_writes (force it ON) if XLogArchivingActive is on. Comments?

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faq
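The torn-read hazard described in the post above can be illustrated with a small, deterministic simulation. This is only a sketch under stated assumptions: the page is an in-memory buffer rather than a real data file, the server's write is assumed to land atomically between two reads, and the helper names (`make_page`, `backup_read`, `is_torn`, `overwrite_with_v2`) are hypothetical and not part of PostgreSQL or of any backup tool.

```python
BLCKSZ = 8192  # PostgreSQL's default page size in bytes


def make_page(version: int) -> bytes:
    """Return a BLCKSZ-byte page whose every byte encodes one version number."""
    return bytes([version]) * BLCKSZ


def backup_read(storage: bytearray, chunk: int, write_between=None) -> bytes:
    """Copy one page out of `storage` using reads of `chunk` bytes.

    `write_between`, if given, is called once after the first read to model
    the server replacing the whole page with a single write().
    """
    copied = bytearray()
    offset = 0
    first = True
    while offset < BLCKSZ:
        copied += storage[offset:offset + chunk]
        offset += chunk
        if first and write_between is not None:
            write_between(storage)
            first = False
    return bytes(copied)


def is_torn(page: bytes) -> bool:
    """A page is torn if it mixes bytes from more than one version."""
    return len(set(page)) > 1


def overwrite_with_v2(storage: bytearray) -> None:
    """Model the server writing version 2 of the page in one step."""
    storage[:] = make_page(2)


# 4K reads: the second half comes from a later version than the first half.
torn = backup_read(bytearray(make_page(1)), 4096, overwrite_with_v2)

# BLCKSZ-sized reads: the whole page is copied before the write lands.
whole = backup_read(bytearray(make_page(1)), BLCKSZ, overwrite_with_v2)
```

In this model `is_torn(torn)` is true and `is_torn(whole)` is false. The second result depends entirely on the assumption that the write is atomic with respect to a BLCKSZ-sized read, which is exactly the assumption questioned later in the thread.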
| |||
One fine day, Fri, 2006-04-14 at 16:40, Tom Lane wrote:
> I think we had originally argued that there was no problem anyway
> because the kernel should cause the page write to appear atomic to other
> processes (since we issue it in a single write() command). But that's
> only true if the backup-taker reads in units that are multiples of
> BLCKSZ. If the backup-taker reads, say, 4K at a time then it's
> certainly possible that it gets a later version of the second half of a
> page than it got of the first half. I don't know about you, but I sure
> don't feel comfortable making assumptions at that level about the
> behavior of tar or cpio.
>
> I fear we still have to disable full_page_writes (force it ON) if
> XLogArchivingActive is on. Comments?

Why not just tell the backup-taker to take backups using 8K pages?

---------------
Hannu
| |||
Hannu Krosing <hannu@skype.net> writes:
> One fine day, Fri, 2006-04-14 at 16:40, Tom Lane wrote:
>> If the backup-taker reads, say, 4K at a time then it's
>> certainly possible that it gets a later version of the second half of a
>> page than it got of the first half. I don't know about you, but I sure
>> don't feel comfortable making assumptions at that level about the
>> behavior of tar or cpio.
>>
>> I fear we still have to disable full_page_writes (force it ON) if
>> XLogArchivingActive is on. Comments?

> Why not just tell the backup-taker to take backups using 8K pages ?

How? (No, I don't think tar's blocksize options control this necessarily --- those indicate the blocking factor on the *tape*. And not everyone uses tar anyway.)

Even if this would work for all popular backup programs, it seems far too fragile: the consequence of forgetting the switch would be silent data corruption, which you might not notice until the slave had been in live operation for some time.

regards, tom lane
| |||
Quoting Tom Lane <tgl@sss.pgh.pa.us>:
> I fear we still have to disable full_page_writes (force it ON) if
> XLogArchivingActive is on. Comments?

Yeah - if you are enabling PITR, then you care about safety and integrity, so it makes sense (well, to me anyway).

Cheers

Mark
| |||
* Tom Lane:
> I think we had originally argued that there was no problem anyway
> because the kernel should cause the page write to appear atomic to other
> processes (since we issue it in a single write() command).

I doubt Linux makes any such guarantees. See this recent thread on linux-kernel:

<http://marc.theaimsgroup.com/?t=114489284200003>
| |||
One fine day, Fri, 2006-04-14 at 17:31, Tom Lane wrote:
> Hannu Krosing <hannu@skype.net> writes:
> > One fine day, Fri, 2006-04-14 at 16:40, Tom Lane wrote:
> >> If the backup-taker reads, say, 4K at a time then it's
> >> certainly possible that it gets a later version of the second half of a
> >> page than it got of the first half. I don't know about you, but I sure
> >> don't feel comfortable making assumptions at that level about the
> >> behavior of tar or cpio.
> >>
> >> I fear we still have to disable full_page_writes (force it ON) if
> >> XLogArchivingActive is on. Comments?
>
> > Why not just tell the backup-taker to take backups using 8K pages ?
>
> How?

Use find + dd, or whatever. I just don't want it to be made universally unavailable just because some users *might* use a file/disk-level backup solution which is incompatible.

> (No, I don't think tar's blocksize options control this
> necessarily --- those indicate the blocking factor on the *tape*.
> And not everyone uses tar anyway.)

If I'm desperate enough to get the 2x reduction of WAL writes, I may even write my own backup solution.

> Even if this would work for all popular backup programs, it seems
> far too fragile: the consequence of forgetting the switch would be
> silent data corruption, which you might not notice until the slave
> had been in live operation for some time.

We may declare only one solution to be supported by us with XLogArchivingActive, say a GNU tar modified to read in Nx8K blocks (pg_tar).

I guess that even if we can control what the operating system does, it is still possible to get a torn page using some SAN solution, where you can freeze the image for backup independent of the OS.

----------------
Hannu
| |||
Tom Lane wrote:
> Hannu Krosing <hannu@skype.net> writes:
> > One fine day, Fri, 2006-04-14 at 16:40, Tom Lane wrote:
> >> If the backup-taker reads, say, 4K at a time then it's
> >> certainly possible that it gets a later version of the second half of a
> >> page than it got of the first half. I don't know about you, but I sure
> >> don't feel comfortable making assumptions at that level about the
> >> behavior of tar or cpio.
> >>
> >> I fear we still have to disable full_page_writes (force it ON) if
> >> XLogArchivingActive is on. Comments?
>
> > Why not just tell the backup-taker to take backups using 8K pages ?
>
> How? (No, I don't think tar's blocksize options control this
> necessarily --- those indicate the blocking factor on the *tape*.
> And not everyone uses tar anyway.)
>
> Even if this would work for all popular backup programs, it seems
> far too fragile: the consequence of forgetting the switch would be
> silent data corruption, which you might not notice until the slave
> had been in live operation for some time.

Yea, it is a problem. Even a 10k read is going to read 2k into the next page. I am thinking we should throw an error on pg_start_backup() and pg_stop_backup() if full_page_writes is off. Seems archive_command and full_page_writes can still be used if we are not in the process of doing a file system backup.

In fact, could we have pg_start_backup() turn on full_page_writes and have pg_stop_backup() turn it off, if postgresql.conf has it off?

--
Bruce Momjian http://candle.pha.pa.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
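The "10k read goes 2k into the next page" arithmetic from the post above can be checked with a few lines. This is a minimal sketch, assuming the default BLCKSZ of 8192 and a backup program that reads a file sequentially in fixed-size chunks from offset 0; the function name `straddling_offsets` is hypothetical.

```python
BLCKSZ = 8192  # PostgreSQL's default page size in bytes


def straddling_offsets(read_size: int, file_size: int) -> list:
    """Return the read boundaries that fall strictly inside a page.

    Each such offset marks a page whose two parts are fetched by two
    different read() calls, so the page could be copied in a torn state.
    """
    return [
        off
        for off in range(read_size, file_size, read_size)
        if off % BLCKSZ != 0
    ]


four_pages = 4 * BLCKSZ

# 10K reads: the first boundary lands 2048 bytes into the second page.
ten_k = straddling_offsets(10 * 1024, four_pages)

# 16K reads (a multiple of BLCKSZ): no boundary falls inside a page.
sixteen_k = straddling_offsets(16 * 1024, four_pages)
```

Here `ten_k[0] % BLCKSZ` is 2048, matching the 2k figure, and `sixteen_k` is empty. Only read sizes that are multiples of BLCKSZ avoid splitting pages, and even then the result says nothing about whether the kernel makes the write appear atomic to the reader.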
| |||
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> I am thinking we should throw an error on pg_start_backup() and
> pg_stop_backup if full_page_writes is off.

No, we'll just change the test in xlog.c so that fullPageWrites is ignored if XLogArchivingActive.

> Seems archive_command and
> full_page_writes can still be used if we are not in the process of doing
> a file system backup.

Think harder: we are only safe if the first write to a given page after it's mis-copied by the archiver is a full page write. The requirement therefore continues after pg_stop_backup. Unless you want to add infrastructure to keep track for *every page* in the DB of whether it's been fully written since the last backup?

regards, tom lane
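The reasoning above (a mis-copied page is only repaired if the first WAL record replayed for it carries a full page image) can be sketched with a toy model. Everything here is an assumption made for illustration: `Page`, `WalRecord`, `use_full_page_writes`, `replay` and `restore` are hypothetical names, the page holds just an LSN in its first half and a counter in its second half, and the LSN-guarded redo rule is a heavy simplification. It is not the code in xlog.c.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Page:
    lsn: int    # kept in the first half of the page
    count: int  # kept in the second half of the page


@dataclass
class WalRecord:
    lsn: int
    image: Optional[Page] = None  # full-page image, if one was logged


def use_full_page_writes(full_page_writes: bool, archiving_active: bool) -> bool:
    """Rule proposed in the thread: the setting is ignored when archiving is on."""
    return full_page_writes or archiving_active


def replay(page: Page, rec: WalRecord) -> None:
    """Redo one record: an image overwrites the page, a delta increments it."""
    if rec.image is not None:
        page.lsn, page.count = rec.image.lsn, rec.image.count
    elif page.lsn < rec.lsn:
        page.lsn, page.count = rec.lsn, page.count + 1


def restore(fpw: bool) -> Tuple[Page, Page]:
    """Take a torn copy of a page during an update, then replay WAL onto it."""
    master = Page(lsn=1, count=10)
    copied_lsn = master.lsn                          # backup reads first half
    master.lsn, master.count = 2, master.count + 1   # master updates the page
    image = Page(master.lsn, master.count) if fpw else None
    rec = WalRecord(lsn=2, image=image)
    torn = Page(lsn=copied_lsn, count=master.count)  # backup reads second half
    replay(torn, rec)
    return master, torn


# Setting honored (off): the increment is redone on top of already-new data.
master_off, slave_off = restore(fpw=False)

# Setting overridden because archiving is active: the image repairs the page.
master_on, slave_on = restore(fpw=use_full_page_writes(False, archiving_active=True))
```

In this model `slave_off` ends up with a count of 12 while the master has 11, whereas `slave_on` equals `master_on`. The model also shows why the requirement outlives the backup itself: the repair happens only because the first record replayed for the torn page is a full image, whenever that record was written.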
| |||
Hannu Krosing <hannu@skype.net> writes:
> If I'm desperate enough to get the 2x reduction of WAL writes, I may
> even write my own backup solution.

Given Florian's concern, sounds like you might have to write your own kernel too. In which case, generating a variant build of Postgres that allows full_page_writes to be disabled is certainly not beyond your powers. But for the ordinary mortal DBA, I think this combination is just too unsafe to even consider.

regards, tom lane
| ||||
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > I am thinking we should throw an error on pg_start_backup() and
> > pg_stop_backup if full_page_writes is off.
>
> No, we'll just change the test in xlog.c so that fullPageWrites is
> ignored if XLogArchivingActive.

We should probably throw a LOG message too.

> > Seems archive_command and
> > full_page_writes can still be used if we are not in the process of doing
> > a file system backup.
>
> Think harder: we are only safe if the first write to a given page after
> it's mis-copied by the archiver is a full page write. The requirement
> therefore continues after pg_stop_backup. Unless you want to add
> infrastructure to keep track for *every page* in the DB of whether it's
> been fully written since the last backup?

Ah, yea.

--
Bruce Momjian http://candle.pha.pa.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
|
|