Unix Technical Forum

Re: [HACKERS] Full page writes improvement, code update

  #1 (permalink)  
Old 04-18-2008, 09:50 AM
Bruce Momjian
 
Posts: n/a
Default Re: [HACKERS] Full page writes improvement, code update


Your patch has been added to the PostgreSQL unapplied patches list at:

http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---------------------------------------------------------------------------


Koichi Suzuki wrote:
> Hi,
>
> Here's a patch reflecting some of Simon's comments.
>
> 1) Removed an elog call in a critical section.
>
> 2) Changed the name of the commands, pg_compresslog and pg_decompresslog.
>
> 3) Changed diff option to make a patch.
>
> --
> Koichi Suzuki


[ Attachment, skipping... ]

>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings


--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

  #2 (permalink)  
Old 04-18-2008, 09:52 AM
Simon Riggs
 
Posts: n/a
Default Re: [HACKERS] Full page writes improvement, code update

On Tue, 2007-04-03 at 19:45 +0900, Koichi Suzuki wrote:
> Bruce Momjian wrote:
> > Your patch has been added to the PostgreSQL unapplied patches list at:
> >
> > http://momjian.postgresql.org/cgi-bin/pgpatches

>
> Thank you very much for including. Attached is an update of the patch
> according to Simon Riggs's comment about GUC name.


The patch comes with its own "install kit", which is great to review
(many thanks), but hard to determine where you think code should go when
committed.

My guess based on your patch
- the patch gets applied to core :-)
- pg_compresslog *and* pg_decompresslog go to a contrib directory called
contrib/lesslog?

Can you please produce a combined patch that does all of the above, plus
edits the contrib Makefile to add all of those, as well as editing the
README so it doesn't mention the patch, just the contrib executables?

The patch looks correct to me now. I haven't tested it yet, but will be
doing so in the last week of April, which is when I'll be doing docs for
this and other stuff, since time is pressing now.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com



  #3 (permalink)  
Old 04-18-2008, 09:53 AM
Koichi Suzuki
 
Posts: n/a
Default Re: [HACKERS] Full page writes improvement, code update

Hi,

I agree to put the patch into core and the others (pg_compresslog and
pg_decompresslog) into contrib/lesslog.

I will make separate materials to go to core and contrib.

As for the patches, we have tested them against pgbench, DBT-2 and our
proprietary benchmarks, and they appeared to run correctly.

Regards;

Simon Riggs wrote:
> On Tue, 2007-04-03 at 19:45 +0900, Koichi Suzuki wrote:
>> Bruce Momjian wrote:
>>> Your patch has been added to the PostgreSQL unapplied patches list at:
>>>
>>> http://momjian.postgresql.org/cgi-bin/pgpatches

>> Thank you very much for including. Attached is an update of the patch
>> according to Simon Riggs's comment about GUC name.

>
> The patch comes with its own "install kit", which is great to review
> (many thanks), but hard to determine where you think code should go when
> committed.
>
> My guess based on your patch
> - the patch gets applied to core :-)
> - pg_compresslog *and* pg_decompresslog go to a contrib directory called
> contrib/lesslog?
>
> Can you please produce a combined patch that does all of the above, plus
> edits the contrib Makefile to add all of those, as well as editing the
> README so it doesn't mention the patch, just the contrib executables?
>
> The patch looks correct to me now. I haven't tested it yet, but will be
> doing so in the last week of April, which is when I'll be doing docs for
> this and other stuff, since time is pressing now.
>



--
Koichi Suzuki

  #4 (permalink)  
Old 04-18-2008, 09:55 AM
Koichi Suzuki
 
Posts: n/a
Default Re: [HACKERS] Full page writes improvement, code update

Hi,

Here are two patches, as asked:

1) lesslog_core.patch, the patch for core, which sets a mark on the log
records to be removed in archiving,

2) lesslog_contrib.patch, the patch for contrib/lesslog, which adds
pg_compresslog and pg_decompresslog.

I hope they work.
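
For readers following the thread, the "mark" set by the core patch can be sketched in isolation. This is only a minimal illustration, not code taken from the patch: the flag values are the ones shown in the patched xlog.h, while the function names mark_removable and is_removable are hypothetical helpers introduced here. The first mirrors the condition added to XLogInsert(), the second the IS_REMOVABLE test in pg_compresslog.c, both applied to a bare info byte.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values as they appear in the patched xlog.h. */
#define XLR_BKP_BLOCK_MASK   0x0E                    /* bits for backup blocks 1..3 */
#define XLR_SET_BKP_BLOCK(i) (0x08 >> (i))
#define XLR_BKP_REMOVABLE    XLR_SET_BKP_BLOCK(3)    /* 0x01 */

/*
 * Hypothetical helper (not in the patch): returns the info byte with the
 * removable mark set when the record carries backup blocks, both settings
 * are on, and no online backup is forcing page writes.
 */
uint8_t
mark_removable(uint8_t info, bool full_page_writes,
               bool wal_add_optimization_info, bool force_page_writes)
{
    if (full_page_writes && wal_add_optimization_info &&
        (info & XLR_BKP_BLOCK_MASK) && !force_page_writes)
        info |= XLR_BKP_REMOVABLE;
    return info;
}

/*
 * Hypothetical helper (not in the patch): same test as the IS_REMOVABLE
 * macro, i.e. the record has backup blocks and carries the mark.
 */
bool
is_removable(uint8_t info)
{
    return (info & XLR_BKP_BLOCK_MASK) && (info & XLR_BKP_REMOVABLE);
}
```

Under these assumptions, a record with backup blocks 1 and 3 (info 0x0A) becomes 0x0B when both settings are on and no online backup is running, and stays 0x0A while a backup is in progress, so pg_compresslog would leave its full page images in place.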

Regards;

Simon Riggs wrote:
> On Tue, 2007-04-03 at 19:45 +0900, Koichi Suzuki wrote:
>> Bruce Momjian wrote:
>>> Your patch has been added to the PostgreSQL unapplied patches list at:
>>>
>>> http://momjian.postgresql.org/cgi-bin/pgpatches

>> Thank you very much for including. Attached is an update of the patch
>> according to Simon Riggs's comment about GUC name.

>
> The patch comes with its own "install kit", which is great to review
> (many thanks), but hard to determine where you think code should go when
> committed.
>
> My guess based on your patch
> - the patch gets applied to core :-)
> - pg_compresslog *and* pg_decompresslog go to a contrib directory called
> contrib/lesslog?
>
> Can you please produce a combined patch that does all of the above, plus
> edits the contrib Makefile to add all of those, as well as editing the
> README so it doesn't mention the patch, just the contrib executables?
>
> The patch looks correct to me now. I haven't tested it yet, but will be
> doing so in the last week of April, which is when I'll be doing docs for
> this and other stuff, since time is pressing now.
>



--
Koichi Suzuki

diff -Ncar postgresql-8.2.1.org/src/backend/access/transam/xlog.c postgresql-8.2.1/src/backend/access/transam/xlog.c
*** postgresql-8.2.1.org/src/backend/access/transam/xlog.c 2006-12-01 03:29:11.000000000 +0900
--- postgresql-8.2.1/src/backend/access/transam/xlog.c 2007-04-06 16:54:23.000000000 +0900
***************
*** 137,142 ****
--- 137,143 ----
char *XLOG_sync_method = NULL;
const char XLOG_sync_method_default[] = DEFAULT_SYNC_METHOD_STR;
bool fullPageWrites = true;
+ bool walAddOptimizationInfo = false;

#ifdef WAL_DEBUG
bool XLOG_DEBUG = false;
***************
*** 626,632 ****
{
/* Buffer already referenced by earlier chain item */
if (dtbuf_bkp[i])
! rdt->data = NULL;
else if (rdt->data)
{
len += rdt->len;
--- 627,641 ----
{
/* Buffer already referenced by earlier chain item */
if (dtbuf_bkp[i])
! {
! if (fullPageWrites && walAddOptimizationInfo && rdt->data)
! {
! len += rdt->len;
! COMP_CRC32(rdata_crc, rdt->data, rdt->len);
! }
! else
! rdt->data = NULL;
! }
else if (rdt->data)
{
len += rdt->len;
***************
*** 642,648 ****
&(dtbuf_lsn[i]), &(dtbuf_xlg[i])))
{
dtbuf_bkp[i] = true;
! rdt->data = NULL;
}
else if (rdt->data)
{
--- 651,663 ----
&(dtbuf_lsn[i]), &(dtbuf_xlg[i])))
{
dtbuf_bkp[i] = true;
! if (fullPageWrites && walAddOptimizationInfo && rdt->data)
! {
! len += rdt->len;
! COMP_CRC32(rdata_crc, rdt->data, rdt->len);
! }
! else
! rdt->data = NULL;
}
else if (rdt->data)
{
***************
*** 908,913 ****
--- 923,941 ----
return RecPtr;
}

+ /*
+ * If online backup is not in progress and wal_add_optimization_info is on,
+ * mark backup blocks removable if any.
+ * This mark will be referenced during archiving to remove needless backup
+ * blocks in the record and compress WAL segment files.
+ * NOTE: wal_add_optimization_info is ignored when full_page_writes is off.
+ */
+ if (fullPageWrites && walAddOptimizationInfo && (info & XLR_BKP_BLOCK_MASK) &&
+ !Insert->forcePageWrites)
+ {
+ info |= XLR_BKP_REMOVABLE;
+ }
+
/* Insert record header */

record = (XLogRecord *) Insert->currpos;
***************
*** 2820,2832 ****
blk += blen;
}

! /* Check that xl_tot_len agrees with our calculation */
! if (blk != (char *) record + record->xl_tot_len)
{
! ereport(emode,
! (errmsg("incorrect total length in record at %X/%X",
! recptr.xlogid, recptr.xrecoff)));
! return false;
}

/* Finally include the record header */
--- 2848,2876 ----
blk += blen;
}

! /*
! * If physical log has not been removed, check the length to see
! * the following.
! * - No physical log existed originally,
! * - WAL record was not removable because it is generated during
! * the online backup,
! * - Cannot be removed because the physical log spanned in
! * two segments.
! * The reason why we skip the length check on the physical log removal is
! * that the flag XLR_SET_BKP_BLOCK(0..2) is reset to zero, which prevents
! * the above loop from advancing blk to the end of the record.
! */
! if (!(record->xl_info & XLR_BKP_REMOVABLE) ||
! record->xl_info & XLR_BKP_BLOCK_MASK)
{
! /* Check that xl_tot_len agrees with our calculation */
! if (blk != (char *) record + record->xl_tot_len)
! {
! ereport(emode,
! (errmsg("incorrect total length in record at %X/%X",
! recptr.xlogid, recptr.xrecoff)));
! return false;
! }
}

/* Finally include the record header */
diff -Ncar postgresql-8.2.1.org/src/backend/utils/misc/guc.c postgresql-8.2.1/src/backend/utils/misc/guc.c
*** postgresql-8.2.1.org/src/backend/utils/misc/guc.c 2006-11-29 23:50:07.000000000 +0900
--- postgresql-8.2.1/src/backend/utils/misc/guc.c 2007-04-06 16:54:23.000000000 +0900
***************
*** 97,102 ****
--- 97,103 ----
extern int CommitSiblings;
extern char *default_tablespace;
extern bool fullPageWrites;
+ extern bool walAddOptimizationInfo;

#ifdef TRACE_SORT
extern bool trace_sort;
***************
*** 546,551 ****
--- 547,560 ----
true, NULL, NULL
},
{
+ {"wal_add_optimization_info", PGC_SIGHUP, WAL_SETTINGS,
+ gettext_noop("Writes logical log corresponding to full pages in WAL record."),
+ gettext_noop("")
+ },
+ &walAddOptimizationInfo,
+ false, NULL, NULL
+ },
+ {
{"silent_mode", PGC_POSTMASTER, LOGGING_WHEN,
gettext_noop("Runs the server silently."),
gettext_noop("If this parameter is set, the server will automatically run in the "
diff -Ncar postgresql-8.2.1.org/src/backend/utils/misc/postgresql.conf.sample postgresql-8.2.1/src/backend/utils/misc/postgresql.conf.sample
*** postgresql-8.2.1.org/src/backend/utils/misc/postgresql.conf.sample 2006-11-21 10:23:37.000000000 +0900
--- postgresql-8.2.1/src/backend/utils/misc/postgresql.conf.sample 2007-04-06 16:54:23.000000000 +0900
***************
*** 154,159 ****
--- 154,161 ----
# fsync_writethrough
# open_sync
#full_page_writes = on # recover from partial page writes
+ #wal_add_optimization_info = off # write logical log corresponding to full
+ # page.
#wal_buffers = 64kB # min 32kB
# (change requires restart)
#commit_delay = 0 # range 0-100000, in microseconds
diff -Ncar postgresql-8.2.1.org/src/include/access/xlog.h postgresql-8.2.1/src/include/access/xlog.h
*** postgresql-8.2.1.org/src/include/access/xlog.h 2006-11-06 07:42:10.000000000 +0900
--- postgresql-8.2.1/src/include/access/xlog.h 2007-04-06 16:54:23.000000000 +0900
***************
*** 66,73 ****
/*
* If we backed up any disk blocks with the XLOG record, we use flag bits in
* xl_info to signal it. We support backup of up to 3 disk blocks per XLOG
! * record. (Could support 4 if we cared to dedicate all the xl_info bits for
! * this purpose; currently bit 0 of xl_info is unused and available.)
*/
#define XLR_BKP_BLOCK_MASK 0x0E /* all info bits used for bkp blocks */
#define XLR_MAX_BKP_BLOCKS 3
--- 66,74 ----
/*
* If we backed up any disk blocks with the XLOG record, we use flag bits in
* xl_info to signal it. We support backup of up to 3 disk blocks per XLOG
! * record.
! * Bit 0 of xl_info is used to represent that backup blocks are not necessary
! * in archive-log.
*/
#define XLR_BKP_BLOCK_MASK 0x0E /* all info bits used for bkp blocks */
#define XLR_MAX_BKP_BLOCKS 3
***************
*** 75,80 ****
--- 76,82 ----
#define XLR_BKP_BLOCK_1 XLR_SET_BKP_BLOCK(0) /* 0x08 */
#define XLR_BKP_BLOCK_2 XLR_SET_BKP_BLOCK(1) /* 0x04 */
#define XLR_BKP_BLOCK_3 XLR_SET_BKP_BLOCK(2) /* 0x02 */
+ #define XLR_BKP_REMOVABLE XLR_SET_BKP_BLOCK(3) /* 0x01 */

/*
* Sometimes we log records which are out of transaction control.

diff -Ncr postgresql-8.2.1.org/contrib/Makefile postgresql-8.2.1/contrib/Makefile
*** postgresql-8.2.1.org/contrib/Makefile 2006-09-09 13:07:51.000000000 +0900
--- postgresql-8.2.1/contrib/Makefile 2007-04-06 17:14:21.000000000 +0900
***************
*** 16,21 ****
--- 16,22 ----
intagg \
intarray \
isn \
+ lesslog \
lo \
ltree \
oid2name \
diff -Ncr postgresql-8.2.1.org/contrib/README postgresql-8.2.1/contrib/README
*** postgresql-8.2.1.org/contrib/README 2006-09-09 13:07:52.000000000 +0900
--- postgresql-8.2.1/contrib/README 2007-04-06 17:14:21.000000000 +0900
***************
*** 68,73 ****
--- 68,77 ----
PostgreSQL type extensions for ISBN, ISSN, ISMN, EAN13 product numbers
by Germán Méndez Bravo (Kronuz) <kronuz@hotmail.com>

+ lesslog -
+ Reduce archive log file size by removing unnecessary physical log.
+ by Koichi Suzuki <suzuki.koichi@oss.ntt.co.jp>
+
lo -
Large Object maintenance
by Peter Mount <peter@retep.org.uk>
diff -Ncr postgresql-8.2.1.org/contrib/lesslog/Makefile postgresql-8.2.1/contrib/lesslog/Makefile
*** postgresql-8.2.1.org/contrib/lesslog/Makefile 1970-01-01 09:00:00.000000000 +0900
--- postgresql-8.2.1/contrib/lesslog/Makefile 2007-04-06 17:14:21.000000000 +0900
***************
*** 0 ****
--- 1,3 ----
+ all install clean:
+ $(MAKE) -f Makefile.pg_compresslog $@
+ $(MAKE) -f Makefile.pg_decompresslog $@
diff -Ncr postgresql-8.2.1.org/contrib/lesslog/Makefile.pg_compresslog postgresql-8.2.1/contrib/lesslog/Makefile.pg_compresslog
*** postgresql-8.2.1.org/contrib/lesslog/Makefile.pg_compresslog 1970-01-01 09:00:00.000000000 +0900
--- postgresql-8.2.1/contrib/lesslog/Makefile.pg_compresslog 2007-04-06 17:14:21.000000000 +0900
***************
*** 0 ****
--- 1,19 ----
+ PROGRAM = pg_compresslog
+ OBJS = pg_compresslog.o file.o debug.o
+
+ PG_CPPFLAGS = -I$(libpq_srcdir)
+ PG_LIBS = $(libpq_pgport) $(top_builddir)/src/backend/utils/hash/pg_crc.o
+
+ DOCS = README.lesslog
+
+ ifdef USE_PGXS
+ PGXS := $(shell pg_config --pgxs)
+ include $(PGXS)
+ else
+ subdir = contrib/pg_compresslog
+ top_builddir = ../..
+ include $(top_builddir)/src/Makefile.global
+ include $(top_srcdir)/contrib/contrib-global.mk
+ endif
+
+ $(OBJS): Makefile.pg_compresslog
diff -Ncr postgresql-8.2.1.org/contrib/lesslog/Makefile.pg_decompresslog postgresql-8.2.1/contrib/lesslog/Makefile.pg_decompresslog
*** postgresql-8.2.1.org/contrib/lesslog/Makefile.pg_decompresslog 1970-01-01 09:00:00.000000000 +0900
--- postgresql-8.2.1/contrib/lesslog/Makefile.pg_decompresslog 2007-04-06 17:14:21.000000000 +0900
***************
*** 0 ****
--- 1,19 ----
+ PROGRAM = pg_decompresslog
+ OBJS = pg_decompresslog.o file.o debug.o
+
+ PG_CPPFLAGS = -I$(libpq_srcdir)
+ PG_LIBS = $(libpq_pgport) $(top_builddir)/src/backend/utils/hash/pg_crc.o
+
+ DOCS =
+
+ ifdef USE_PGXS
+ PGXS := $(shell pg_config --pgxs)
+ include $(PGXS)
+ else
+ subdir = contrib/pg_decompresslog
+ top_builddir = ../..
+ include $(top_builddir)/src/Makefile.global
+ include $(top_srcdir)/contrib/contrib-global.mk
+ endif
+
+ $(OBJS): Makefile.pg_decompresslog
diff -Ncr postgresql-8.2.1.org/contrib/lesslog/README.lesslog postgresql-8.2.1/contrib/lesslog/README.lesslog
*** postgresql-8.2.1.org/contrib/lesslog/README.lesslog 1970-01-01 09:00:00.000000000 +0900
--- postgresql-8.2.1/contrib/lesslog/README.lesslog 2007-04-06 17:14:21.000000000 +0900
***************
*** 0 ****
--- 1,71 ----
+ lesslog README 2006/04/06
+
+ ** What is lesslog?
+
+ lesslog is a set of tools to reduce the size of PostgreSQL archive log. lesslog consists of the following materials.
+
+ - pg_compresslog
+ This is a command to remove physical log records with "removable" mark.
+ This command should be specified as archive_command in postgresql.conf.
+ This command also removes page headers by changing page size from 8kB to
+ 16MB, which are restored by pg_decompresslog.
+
+ - pg_decompresslog
+ This command restores page headers and adds dummy data to make up for
+ the removed physical log records, then restores the LSN of each log record
+ and the page size to be used in archive recovery. This command should be
+ specified as restore_command in recovery.conf.
+
+
+ ** How to use lesslog
+ 1. Build and install the additional tools.
+ Move to contrib/lesslog directory, then make and make install.
+ pg_compresslog and pg_decompresslog will be installed to PostgreSQL install
+ directory.
+ 2. Edit postgresql.conf
+ Edit postgresql.conf which is copied to DB cluster by initdb and edit
+ parameters as follows.
+
+ full_page_writes = on
+ wal_add_optimization_info = on
+ archive_command = 'pg_compresslog "%p" <archive directory>/"%f"'
+
+ ** How to use pg_compresslog
+ Synopsis
+ pg_compresslog [from [to]]
+
+ Explanation
+ pg_compresslog removes physical log from the WAL segment file specified by
+ <from> and archives as <to> file name.
+
+ If <from> is omitted or specified as "-", it reads the segment file from stdin.
+ If <to> is omitted or specified as "-", it means stdout.
+
+ Physical log records removed by pg_compresslog are those written while
+ online backup is not running and both full_page_writes and
+ wal_add_optimization_info are "on".
+
+ To use the output of pg_compresslog command in archive recovery, it must be
+ restored using pg_decompresslog command.
+
+ Return value
+ pg_compresslog returns zero if no error occurs, 1 if an error occurs.
+
+ ** How to use pg_decompresslog
+ Synopsis
+ pg_decompresslog [from [to]]
+
+ Explanation
+ pg_decompresslog reads the archive log file specified by the <from> argument,
+ restores the area corresponding to the removed physical log (which restores
+ the LSN of each log record), and writes the result to the file specified by
+ the <to> argument.
+
+ If <from> is omitted or specified as "-", it reads from stdin. If <to> is
+ omitted or specified as "-", it writes to stdout.
+
+ You can specify the file written by pg_compresslog as the <from> argument.
+
+ Return value
+ It returns zero if no error occurs, 1 if error occurs.
+
diff -Ncr postgresql-8.2.1.org/contrib/lesslog/debug.c postgresql-8.2.1/contrib/lesslog/debug.c
*** postgresql-8.2.1.org/contrib/lesslog/debug.c 1970-01-01 09:00:00.000000000 +0900
--- postgresql-8.2.1/contrib/lesslog/debug.c 2007-04-06 17:14:21.000000000 +0900
***************
*** 0 ****
--- 1,125 ----
+ /*
+ * debug.c
+ * Debug dump function implementation.
+ */
+ #include <stdio.h>
+ #include <errno.h>
+
+ #include "postgres.h"
+ #include "access/xlog.h"
+ #include "access/xlog_internal.h"
+
+ void get_segment_id(const char *filename);
+ void dump_record(XLogRecPtr *ptr, size_t off, XLogRecord *precord);
+ void dump_page_header(int num, XLogPageHeader pheader);
+ void dumpXLogRecord(XLogRecPtr *ptr, size_t off, XLogRecord *record);
+
+ /* Current segment ID */
+ static uint32 segment_id;
+
+ /* List for the resource manager. */
+ static const char * const RM_names[RM_MAX_ID + 1] = {
+ "XLOG ", /* 0 */
+ "XACT ", /* 1 */
+ "SMGR ", /* 2 */
+ "CLOG ", /* 3 */
+ "DBASE", /* 4 */
+ "TBSPC", /* 5 */
+ "MXACT", /* 6 */
+ "RM 7", /* 7 */
+ "RM 8", /* 8 */
+ "RM 9", /* 9 */
+ "HEAP ", /* 10 */
+ "BTREE", /* 11 */
+ "HASH ", /* 12 */
+ "RTREE", /* 13 */
+ "GIST ", /* 14 */
+ "SEQ " /* 15 */
+ };
+
+ /*
+ * Obtain segment ID from WAL segment file name.
+ *
+ * Parameters:
+ * filename: WAL segment file name.
+ *
+ * Note: If no slash mark (path delimiter) is included in the argument, or if
+ * the argument does not follow WAL segment file name format, nothing will
+ * happen.
+ */
+ void
+ get_segment_id(const char *filename)
+ {
+ TimeLineID tli;
+ uint32 xlogid;
+ char *p;
+ p = strrchr(filename, '/');
+ if (!p)
+ return;
+ p++;
+ if (sscanf(p, "%08X%08X%08X", &tli, &xlogid, &segment_id) != 3)
+ return;
+ }
+
+ /*
+ * Dump the page header content.
+ *
+ * Parameters:
+ * num: Page number to be included in the dump output.
+ * page: Target page.
+ */
+ void
+ dump_page_header(int num, XLogPageHeader page)
+ {
+ printf("=[%04d]==================================================\n", num);
+ printf("PAGE: xlp_magic=%02X\n", page->xlp_magic);
+ printf("PAGE: xlp_info=%02X\n", page->xlp_info);
+ printf("PAGE: xlp_tli=%u\n", page->xlp_tli);
+ printf("PAGE: xlogid=%u\n", page->xlp_pageaddr.xlogid);
+ printf("PAGE: xrecoff=%u\n", page->xlp_pageaddr.xrecoff);
+ if (page->xlp_info & XLP_FIRST_IS_CONTRECORD)
+ {
+ XLogContRecord *cont =
+ (XLogContRecord *)((char *)page + XLogPageHeaderSize(page));
+ printf("PAGE: rem_len=%u\n", cont->xl_rem_len);
+ }
+ printf("=========================================================\n");
+ }
+
+ /*
+ * Dump record header content in xlogdump format.
+ *
+ * Parameters:
+ * ptr: Record position information (only log ID will be used).
+ * off: Record offset within the segment.
+ * record: Pointer to the record.
+ *
+ * Note: the source is copied and modified using xlogdump source.
+ */
+ void
+ dumpXLogRecord(XLogRecPtr *ptr, size_t off, XLogRecord *record)
+ {
+ static XLogRecPtr prevRecPtr = { 0, 0};
+
+ printf("%u/%08X: prv %u/%08X",
+ ptr->xlogid, (uint32)off + segment_id * XLOG_SEG_SIZE,
+ record->xl_prev.xlogid, record->xl_prev.xrecoff);
+
+ if (!XLByteEQ(record->xl_prev, prevRecPtr))
+ printf("(?)");
+ prevRecPtr.xlogid = ptr->xlogid;
+ prevRecPtr.xrecoff = (uint32)off + segment_id * XLOG_SEG_SIZE;
+
+ printf("; xid %u; ", record->xl_xid);
+
+ if (record->xl_rmid <= RM_MAX_ID)
+ printf("%s", RM_names[record->xl_rmid]);
+ else
+ printf("RM %2d", record->xl_rmid);
+
+ printf(" info %02X len %u tot_len %u\n", record->xl_info,
+ record->xl_len, record->xl_tot_len);
+
+ fflush(stdout);
+ }
+
diff -Ncr postgresql-8.2.1.org/contrib/lesslog/debug.h postgresql-8.2.1/contrib/lesslog/debug.h
*** postgresql-8.2.1.org/contrib/lesslog/debug.h 1970-01-01 09:00:00.000000000 +0900
--- postgresql-8.2.1/contrib/lesslog/debug.h 2007-04-06 17:14:21.000000000 +0900
***************
*** 0 ****
--- 1,28 ----
+ /*
+ * debug.h
+ * Interface for debug dump function.
+ */
+ #ifndef DEBUG_H_INCLUDED
+ #define DEBUG_H_INCLUDED
+
+ #include "access/xlog.h"
+ #include "access/xlog_internal.h"
+
+ /*
+ * In the release, debug function call itself will be eliminated.
+ */
+ #ifdef DEBUG
+
+ void get_segment_id(const char *filename);
+ void dump_page_header(int num, XLogPageHeader pheader);
+ void dumpXLogRecord(XLogRecPtr *ptr, size_t off, XLogRecord *record);
+
+ #else
+
+ #define get_segment_id(a)
+ #define dump_page_header(a, b)
+ #define dumpXLogRecord(a, b, c)
+
+ #endif
+
+ #endif
diff -Ncr postgresql-8.2.1.org/contrib/lesslog/file.c postgresql-8.2.1/contrib/lesslog/file.c
*** postgresql-8.2.1.org/contrib/lesslog/file.c 1970-01-01 09:00:00.000000000 +0900
--- postgresql-8.2.1/contrib/lesslog/file.c 2007-04-06 17:14:21.000000000 +0900
***************
*** 0 ****
--- 1,180 ----
+ /*
+ * file.c
+ * Common I/O routine implementation used in archive/restoration
+ */
+ #include <stdio.h>
+ #include <stdlib.h>
+ #include <string.h>
+ #include <sys/types.h>
+ #include <sys/stat.h>
+ #include <fcntl.h>
+ #include <unistd.h>
+ #include <errno.h>
+
+ #include "postgres.h"
+ #include "access/xlog_internal.h"
+
+ #include "file.h"
+
+ /*
+ * Read the file data and return the size actually read.
+ * The length to read is specified by an argument.
+ *
+ * Parameters:
+ * fd: File descriptor.
+ * buff: Buffer to read.
+ * len: Size to read.
+ *
+ * Note: If an error occurs, exit(1) will be called here and control will not
+ * return to the caller in this case.
+ */
+ int
+ read_buff(int fd, char *buff, size_t len)
+ {
+ int ret;
+ size_t read_len = 0;
+
+ do
+ {
+ ret = read(fd, buff + read_len, len - read_len);
+ if (ret < 0)
+ {
+ if (errno == EINTR)
+ continue;
+ fprintf(stderr, "failed to read : %s\n", strerror(errno));
+ exit(1);
+ }
+ else if (ret == 0)
+ break;
+ read_len += ret;
+ } while (read_len < len);
+
+ return read_len;
+ }
+
+ /*
+ * Write to the file and return length actually written.
+ * The length to write should be specified by an argument.
+ *
+ * Parameter:
+ * fd: File descriptor.
+ * buff: Buffer to write.
+ * len: Size to write.
+ *
+ * Note: If an error occurs, exit(1) will be called in this function and
+ * control will not return to the caller in this case.
+ */
+ void
+ write_buff(int fd, const char *buff, size_t len)
+ {
+ int ret;
+ int written_len = 0;
+
+ do
+ {
+ ret = write(fd, buff + written_len, len - written_len);
+ if (ret < 0)
+ {
+ if (errno == EINTR)
+ continue;
+ fprintf(stderr, "failed to write : %s\n", strerror(errno));
+ exit(1);
+ }
+ written_len += ret;
+ } while (written_len < len);
+ }
+
+ /*
+ * Copy the contents of the file.
+ *
+ * Parameter:
+ * from_fd: File descriptor of the file to copy from.
+ * to_fd: File descriptor of the file to copy to.
+ *
+ * Note: If an error occurs in this function, exit(1) will be called here and
+ * control will not return to the caller in this case.
+ */
+ void
+ copy_file(int from_fd, int to_fd)
+ {
+ int read_len = 0;
+ char buff[8 * 1024]; /* 8KB buffer */
+
+ while (1)
+ {
+ /* Read to the buffer. */
+ read_len = read(from_fd, buff, sizeof(buff));
+ if (read_len < 0)
+ {
+ if (errno == EINTR)
+ continue;
+ fprintf(stderr, "failed to read : %s\n", strerror(errno));
+ exit(1);
+ }
+ else if (read_len == 0)
+ break;
+ /* Write all the buffer content. */
+ write_buff(to_fd, buff, read_len);
+ }
+
+ return;
+ }
+
+ /*
+ * Validate the record by comparing CRC value.
+ *
+ * CRC value will be calculated in the following order.
+ * - Logical Log
+ * - Full page write (if exists)
+ * - Record header (excluding the CRC area)
+ *
+ * Parameters:
+ * precord: Pointer to the target record.
+ */
+ bool
+ is_valid_record(XLogRecord *precord)
+ {
+ pg_crc32 crc;
+ BkpBlock *pblk;
+ int i;
+
+ /* Calculate CRC for a logical log. */
+ INIT_CRC32(crc);
+ COMP_CRC32(crc, XLogRecGetData(precord), precord->xl_len);
+
+ /*
+ * If full page writes exist, calculate CRC for each full page write.
+ */
+ pblk = (BkpBlock *)((char *)XLogRecGetData(precord) + precord->xl_len);
+ for (i = 0; i < XLR_MAX_BKP_BLOCKS; i++)
+ {
+ uint32 blen;
+
+ if (!(precord->xl_info & XLR_SET_BKP_BLOCK(i)))
+ continue;
+
+ if (pblk->hole_offset + pblk->hole_length > BLCKSZ)
+ {
+ fprintf(stderr, "incorrect hole size in record.\n");
+ return false;
+ }
+
+ blen = sizeof(BkpBlock) + BLCKSZ - pblk->hole_length;
+ COMP_CRC32(crc, (char *)pblk, blen);
+ pblk = (BkpBlock *)((char *)pblk + blen);
+ }
+
+ /* Calculate record header CRC value. */
+ COMP_CRC32(crc, (char *)precord + sizeof(pg_crc32),
+ SizeOfXLogRecord - sizeof(pg_crc32));
+
+ /* Examine if the final CRC is the same as the value found in the record. */
+ FIN_CRC32(crc);
+ if (!EQ_CRC32(precord->xl_crc, crc))
+ {
+ fprintf(stderr, "incorrect resource manager data checksum.\n");
+ return false;
+ }
+
+ return true;
+ }
diff -Ncr postgresql-8.2.1.org/contrib/lesslog/file.h postgresql-8.2.1/contrib/lesslog/file.h
*** postgresql-8.2.1.org/contrib/lesslog/file.h 1970-01-01 09:00:00.000000000 +0900
--- postgresql-8.2.1/contrib/lesslog/file.h 2007-04-06 17:14:21.000000000 +0900
***************
*** 0 ****
--- 1,28 ----
+ /*
+ * file.h
+ * Common file I/O routines for pg_compresslog and pg_decompresslog.
+ */
+ #ifndef FILE_H_INCLUDED
+ #define FILE_H_INCLUDED
+
+ #include "postgres.h"
+ #include "access/xlog.h"
+
+ int read_buff(int fd, char *buff, size_t len);
+ void write_buff(int fd, const char *buff, size_t len);
+ void copy_file(int from_fd, int to_fd);
+ bool is_valid_record(XLogRecord *precord);
+
+ /*
+ * Check if the page header in the buffer is valid.
+ */
+ #define IS_WAL_FILE(buff) \
+ (((XLogPageHeader)(buff))->xlp_magic == XLOG_PAGE_MAGIC)
+
+ /*
+ * Check if the record is log switch WAL record.
+ */
+ #define IS_XLOG_SWITCH(rec) \
+ ((rec)->xl_rmid == RM_XLOG_ID && (rec)->xl_info == XLOG_SWITCH)
+
+ #endif /* !FILE_H_INCLUDED */
diff -Ncr postgresql-8.2.1.org/contrib/lesslog/pg_compresslog.c postgresql-8.2.1/contrib/lesslog/pg_compresslog.c
*** postgresql-8.2.1.org/contrib/lesslog/pg_compresslog.c 1970-01-01 09:00:00.000000000 +0900
--- postgresql-8.2.1/contrib/lesslog/pg_compresslog.c 2007-04-06 17:14:21.000000000 +0900
***************
*** 0 ****
--- 1,450 ----
+ /*
+ * pg_compresslog.c
+ * Implementation of the archive command (pg_compresslog).
+ */
+ #include <stdio.h>
+ #include <sys/types.h>
+ #include <sys/stat.h>
+ #include <fcntl.h>
+ #include <unistd.h>
+ #include <errno.h>
+
+ #include "postgres.h"
+ #include "access/xlog.h"
+ #include "access/xlog_internal.h"
+ #include "catalog/pg_control.h"
+
+ #include "file.h"
+ #include "debug.h"
+
+ /* =============================================================================
+ * Global variables
+ * ===========================================================================*/
+
+ /* Buffer to read WAL segment file. */
+ static char xlog_buff[XLogSegSize];
+ /* Buffer to hold archive log file image with physical log removed. */
+ static char arch_buff[XLogSegSize];
+
+ static int cont_log_size; /* Log size considering the former segment. */
+ static int logical_log_size; /* Total size of the logical log. */
+ static int physical_log_size; /* Total size of the physical log. */
+
+ /* =============================================================================
+ * Prototype declaration
+ * ===========================================================================*/
+ static void print_usage(int code);
+ static int open_xlog_file(int argc, char *argv[]);
+ static int open_arch_file(int argc, char *argv[]);
+ int create_arch_image(const char *from, char *to);
+ static bool remove_bkp_block(XLogRecord *record);
+
+ /* =============================================================================
+ * Macros
+ * ===========================================================================*/
+ /* Check if the physical log can be removed. */
+ #define IS_REMOVABLE(record) \
+ (((record)->xl_info & XLR_BKP_BLOCK_MASK) && \
+ ((record)->xl_info & XLR_BKP_REMOVABLE))
+
+ /* =============================================================================
+ * Function definitions
+ * ===========================================================================*/
+ /*
+ * Entry point of pg_compresslog command.
+ */
+ int
+ main(int argc, char *argv[])
+ {
+ int from_fd = -1;
+ int to_fd = -1;
+ size_t xlog_len;
+ size_t arch_len;
+
+ /* Error if there are more argument(s) other than from and to. */
+ if (argc > 3)
+ print_usage(1);
+
+ /* Open WAL segment file to archive. */
+ from_fd = open_xlog_file(argc, argv);
+
+ /*
+ * If input file is not stdin, check the size of the file.
+ * If the size is not 16MB, then the specified file is not a WAL
+ * segment file. Since we cannot be sure the file can be scanned for
+ * removable physical log, copy the whole file and then exit.
+ */
+ if (from_fd != fileno(stdin))
+ {
+ struct stat st;
+
+ if (fstat(from_fd, &st) < 0)
+ {
+ fprintf(stderr, "failed to stat `%s': %s\n", argv[1],
+ strerror(errno));
+ exit(1);
+ }
+ if (st.st_size != XLogSegSize)
+ {
+ to_fd = open_arch_file(argc, argv);
+ copy_file(from_fd, to_fd);
+ exit(0);
+ }
+ }
+
+ /*
+ * Read all the data from WAL segment file to archive.
+ * If the amount of the data is not sufficient (less than 16MB: XLogSegSize)
+ * or header is not valid, specified input file is not a WAL segment file.
+ * Copy the whole input to the output and then exit.
+ */
+ xlog_len = read_buff(from_fd, xlog_buff, XLogSegSize);
+ if (xlog_len != XLogSegSize || !IS_WAL_FILE(xlog_buff))
+ {
+ /* Write the checked header part and then copy the rest of the input file. */
+ to_fd = open_arch_file(argc, argv);
+ write_buff(to_fd, xlog_buff, xlog_len);
+ copy_file(from_fd, to_fd);
+ if (close(from_fd) < 0)
+ {
+ fprintf(stderr, "failed to close `%s': %s\n", argv[1],
+ strerror(errno));
+ exit(1);
+ }
+ exit(0);
+ }
+ if (close(from_fd) < 0)
+ {
+ fprintf(stderr, "failed to close `%s': %s\n", argv[1], strerror(errno));
+ exit(1);
+ }
+
+ /*
+ * Build the entire compressed output file image on the buffer,
+ * removing physical logs, then write the whole compressed file image.
+ */
+ arch_len = create_arch_image(xlog_buff, arch_buff);
+ to_fd = open_arch_file(argc, argv);
+ write_buff(to_fd, arch_buff, arch_len);
+ if (close(to_fd) < 0)
+ {
+ fprintf(stderr, "failed to close `%s': %s\n", argv[2], strerror(errno));
+ exit(1);
+ }
+
+ exit(0);
+ }
+
+ /*
+ * Show the usage of the command and then exit with the specified code.
+ */
+ static void
+ print_usage(int code)
+ {
+ printf(
+ "usage: pg_compresslog [from [to]]\n"
+ " from - Input file name (stdin if omitted or '-' is given)\n"
+ " to - Output file name (stdout if omitted or '-' is given)\n"
+ );
+ exit(code);
+ }
+
+ /*
+ * Build a 16MB-page archive log file image from the WAL segment buffer image
+ * of 8kB pages, and return the size of the archive log file image.
+ *
+ * Parameters:
+ * from: WAL segment buffer page
+ * to: Archive log file image
+ *
+ * Note: exit() will be called here when an error is detected. In the case of
+ * error, the control will not be given to the caller.
+ */
+ int
+ create_arch_image(const char *from, char *to)
+ {
+ const char *read_pos = from;
+ char *write_pos = to;
+ const char *crrpage = from;
+ XLogPageHeader page = (XLogPageHeader)from;
+ XLogRecord *rec = NULL;
+ XLogRecord *write_rec = NULL;
+
+ /*
+ * Copy the first page header of the segment to the buffer.
+ * If the first record continues from the last record of the previous segment,
+ * then copy XLogContRecord too.
+ * XLogContRecord.xl_rem_len means the total data length of the
+ * continuation record, not the length of the record in the given page.
+ * Therefore, this value is not influenced by the change of the page size.
+ */
+ read_pos = crrpage = from;
+ memcpy(write_pos, read_pos, XLogPageHeaderSize(page));
+ write_pos += XLogPageHeaderSize(page);
+ read_pos += XLogPageHeaderSize(page);
+ if (page->xlp_info & XLP_FIRST_IS_CONTRECORD)
+ {
+ memcpy(write_pos, read_pos, SizeOfXLogContRecord);
+ write_pos += SizeOfXLogContRecord;
+ }
+
+ /*
+ * Loop page by page.
+ */
+ for (crrpage = from; crrpage < from + XLogSegSize; crrpage += XLOG_BLCKSZ)
+ {
+ XLogRecPtr ptr;
+
+ /* Parse the page header. */
+ page = (XLogPageHeader)crrpage;
+ read_pos = crrpage + XLogPageHeaderSize(page);
+ ptr = page->xlp_pageaddr;
+
+ /* If there is continuation data, copy it to the write buffer. */
+ if (page->xlp_info & XLP_FIRST_IS_CONTRECORD)
+ {
+ int cont_len = ((XLogContRecord *)read_pos)->xl_rem_len;
+ int copy_len = cont_len;
+ int free_len = XLOG_BLCKSZ -
+ (read_pos + SizeOfXLogContRecord - crrpage);
+ /*
+ * Copy the continuous data within this page.
+ * xl_rem_len specifies the total remaining length of the continued record,
+ * so this may be larger than the length of the rest of this page.
+ */
+ if (copy_len > free_len)
+ copy_len = free_len;
+ memcpy(write_pos, read_pos + SizeOfXLogContRecord, copy_len);
+ read_pos += MAXALIGN(SizeOfXLogContRecord + copy_len);
+ write_pos += copy_len;
+ if (!rec)
+ cont_log_size += copy_len;
+
+ /*
+ * If the data continues to the next page and no record header
+ * exists in this page, then switch to the next page.
+ */
+ if (cont_len != copy_len)
+ continue;
+
+ /*
+ * Set the write position to the end of the current record,
+ * considering alignment.
+ */
+ write_pos = to + MAXALIGN(write_pos - to);
+
+ /*
+ * If the record should have a header in this segment (not a continuous
+ * record from the last segment), perform CRC check and check if
+ * physical log record can be removed.
+ */
+ if (write_rec)
+ {
+ /* Check if the record is valid. */
+ if (!is_valid_record(write_rec))
+ exit(1);
+
+ /*
+ * Determine if the physical log can be removed.
+ * If it can be removed, then rewind the position for the next log record
+ * to the position of the physical log (plus padding).
+ */
+ if (remove_bkp_block(write_rec))
+ write_pos = (char *)write_rec +
+ MAXALIGN(SizeOfXLogRecord + rec->xl_len);
+ }
+ }
+
+ /* Read the data within the page record by record. */
+ while(read_pos <= crrpage + XLOG_BLCKSZ - SizeOfXLogRecord)
+ {
+ int freespace = XLOG_BLCKSZ - (read_pos - crrpage);
+
+ /* Obtain the record header info. */
+ rec = (XLogRecord *)read_pos;
+ write_rec = (XLogRecord *)write_pos;
+ logical_log_size += rec->xl_len;
+ physical_log_size +=
+ rec->xl_tot_len - (SizeOfXLogRecord + rec->xl_len);
+ dumpXLogRecord(&ptr, read_pos - from, rec);
+
+ /*
+ * If the record continues to the following pages, copy only the portion
+ * in this page and then switch to the next page.
+ */
+ if (rec->xl_tot_len > freespace)
+ {
+ /* Copy the log data only in the current page. */
+ memcpy(write_pos, read_pos, freespace);
+ /* read_pos will be overwritten at the next loop. We don't need to update this here. */
+ write_pos += freespace;
+ break;
+ }
+
+ /* Copy the record data to the archive buffer. */
+ memcpy(write_pos, read_pos, rec->xl_tot_len);
+ read_pos += MAXALIGN(rec->xl_tot_len);
+ write_pos += MAXALIGN(rec->xl_tot_len);
+
+ /* Check if the record is valid using CRC in the record header. */
+ if (!is_valid_record(write_rec))
+ exit(1);
+
+ /*
+ * A log record other than a log switch must have its logical data.
+ * See the comment around the line 3065 of src/backend/access/transam/xlog.c
+ * (8.2.0).
+ */
+ if (IS_XLOG_SWITCH(write_rec))
+ {
+ if (write_rec->xl_len != 0)
+ {
+ fprintf(stderr, "invalid xlog switch record.\n");
+ exit(1);
+ }
+ }
+ else if (write_rec->xl_len == 0)
+ {
+ fprintf(stderr, "invalid record length.\n");
+ exit(1);
+ }
+
+ /* If the log record is the log switch record, then no more log records exist
+ * in the input file. Return.
+ */
+ if (IS_XLOG_SWITCH(write_rec))
+ return write_pos - to;
+
+ /*
+ * If the physical log is removable, then rewind the position of the next
+ * record to
+ * the physical log start position (and padding).
+ */
+ if (remove_bkp_block(write_rec))
+ write_pos = (char *)write_rec +
+ MAXALIGN(SizeOfXLogRecord + write_rec->xl_len);
+ else
+ write_pos = to + MAXALIGN(write_pos - to);
+
+ }
+ }
+
+ return (write_pos - to);
+ }
+
+ /*
+ * Remove the physical log which was marked `REMOVABLE'.
+ * Return true if the physical record has been removed, false otherwise.
+ */
+ static bool
+ remove_bkp_block(XLogRecord *record)
+ {
+ pg_crc32 crc;
+
+ /*
+ * If no record is specified or the physical log is not removable, just
+ * return.
+ */
+ if (!record || !IS_REMOVABLE(record))
+ return false;
+
+ /*
+ * Reset XLR_BKP_BLOCK_MASK.
+ * The flag marking the physical log as removable is left set (not reset);
+ * it is needed to restore the removed physical log with a dummy.
+ */
+ record->xl_info &= ~XLR_BKP_BLOCK_MASK;
+
+ /*
+ * The record contents change with the physical log removal, so the CRC has
+ * to be recalculated.
+ * CRC will be accumulated as follows:
+ * 1. Logical log
+ * 2. Physical log (It has been removed and we don't calculate its CRC here).
+ * 3. WAL record header excluding CRC part
+ * Please refer to the line 2817 of RecordIsValid(), src/backend/access/transam/xlog.c.
+ */
+ INIT_CRC32(crc);
+ COMP_CRC32(crc, XLogRecGetData(record), record->xl_len);
+ COMP_CRC32(crc, (char *)record + sizeof(pg_crc32),
+ SizeOfXLogRecord - sizeof(pg_crc32));
+ FIN_CRC32(crc);
+ record->xl_crc = crc;
+
+ return true;
+ }
+
+ /*
+ * Open the WAL segment file to archive and return file descriptor.
+ *
+ * The first argument of pg_compresslog will be regarded as an input file.
+ * If omitted or specified as "-", stdin will be used as an input file.
+ *
+ * Parameters:
+ * argc: Number of arguments (argument to main() will be passed as is).
+ * argv: Array of pointers to argument strings (argument to main() will be
+ * passed as is).
+ *
+ * Note: exit() will be called here if an error occurs. Will not return to the
+ * caller in this case.
+ */
+ static int
+ open_xlog_file(int argc, char *argv[])
+ {
+ int from_fd = -1;
+
+ if (argc > 1 && strcmp(argv[1], "-") != 0)
+ {
+ /* Open WAL segment file to archive. */
+ from_fd = open(argv[1], O_RDONLY, 0);
+ if (from_fd < 0)
+ {
+ fprintf(stderr, "failed to open `%s': %s\n", argv[1],
+ strerror(errno));
+ exit(1);
+ }
+
+ /* Obtain segment ID from the file name (for record dump). */
+ get_segment_id(argv[1]);
+ }
+ else
+ from_fd = fileno(stdin);
+
+ return from_fd;
+ }
+
+ /*
+ * Open the archive segment file to write the result and return file descriptor.
+ *
+ * The second argument to pg_compresslog will be regarded as an output file.
+ * If omitted or specified as "-", stdout will be used as an output file.
+ *
+ * Parameters:
+ * argc: Number of arguments (argument to main() will be passed as is).
+ * argv: Array of pointers to argument strings (main()'s argv, passed as is).
+ *
+ * Note: When an error occurs within this function, exit() will be called here
+ * and will not return to the caller in this case.
+ */
+ static int
+ open_arch_file(int argc, char *argv[])
+ {
+ int to_fd = -1;
+
+ if (argc > 2 && strcmp(argv[2], "-") != 0)
+ {
+ /* Open the archive log file. */
+ to_fd = open(argv[2], O_RDWR | O_CREAT | O_EXCL | PG_BINARY,
+ S_IRUSR | S_IWUSR);
+ if (to_fd < 0)
+ {
+ fprintf(stderr, "failed to open `%s': %s\n", argv[2],
+ strerror(errno));
+ exit(1);
+ }
+ }
+ else
+ to_fd = fileno(stdout);
+
+ return to_fd;
+ }
diff -Ncr postgresql-8.2.1.org/contrib/lesslog/pg_decompresslog.c postgresql-8.2.1/contrib/lesslog/pg_decompresslog.c
*** postgresql-8.2.1.org/contrib/lesslog/pg_decompresslog.c 1970-01-01 09:00:00.000000000 +0900
--- postgresql-8.2.1/contrib/lesslog/pg_decompresslog.c 2007-04-06 17:14:21.000000000 +0900
***************
*** 0 ****
--- 1,543 ----
+ /*
+ * file pg_decompresslog.c
+ * Implementation of the archive restore command (pg_decompresslog).
+ */
+ #include <stdio.h>
+ #include <sys/types.h>
+ #include <sys/stat.h>
+ #include <fcntl.h>
+ #include <unistd.h>
+
+ #include "postgres.h"
+ #include "access/xlog.h"
+ #include "access/xlog_internal.h"
+ #include "catalog/pg_control.h"
+
+ #include "file.h"
+ #include "debug.h"
+
+ /* =============================================================================
+ * Global variables
+ * ===========================================================================*/
+
+ /* Buffer to hold restored WAL segment file image. */
+ static char xlog_buff[XLogSegSize];
+ /* Buffer to read an archive log file. */
+ static char arch_buff[XLogSegSize];
+ /* Position to write a record data. */
+ static char *write_pos = xlog_buff;
+ /* Position to read record data in the archive log buffer. */
+ static char *read_pos = arch_buff;
+ /* This holds data in the first page header of the segment. */
+ static XLogPageHeaderData baseheader;
+
+ /* =============================================================================
+ * Prototype declaration
+ * ===========================================================================*/
+ static void print_usage(int code);
+ int create_wal_image(int arch_len);
+ static int write_record(char *record_buff, int rem_len, bool isFromPrevSeg);
+ static int get_freespace(void);
+ static void insert_XLogContRecord(char *write_pos, int rem_len);
+ static void insert_pageheader(char *write_pos, XLogPageHeader pheader,
+ bool hasContRecord);
+ static int open_arch_file(int argc, char *argv[]);
+ static int open_xlog_file(int argc, char *argv[]);
+
+ /* =============================================================================
+ * Function definitions
+ * ===========================================================================*/
+
+ /*
+ * Entry point of pg_decompresslog command.
+ */
+ int
+ main(int argc, char *argv[])
+ {
+ int from_fd = -1;
+ int to_fd = -1;
+ size_t arch_len;
+
+ /* Error if argument(s) other than <from> and <to> are given. */
+ if (argc > 3)
+ print_usage(1);
+
+ /* Open the archive log file to restore. */
+ from_fd = open_arch_file(argc, argv);
+
+ /*
+ * Read all the data in the input archive log file.
+ * If the header at the first page is not valid, it is not a WAL segment file,
+ * so copy the whole input file to the output file.
+ */
+ arch_len = read_buff(from_fd, arch_buff, XLogSegSize);
+ if (!IS_WAL_FILE(arch_buff))
+ {
+ /* Write what is read for header validation check and then copy the rest of the input file. */
+ to_fd = open_xlog_file(argc, argv);
+ write_buff(to_fd, arch_buff, arch_len);
+ copy_file(from_fd, to_fd);
+ if (close(from_fd) < 0)
+ {
+ fprintf(stderr, "failed to close `%s': %s\n", argv[1],
+ strerror(errno));
+ exit(1);
+ }
+ exit(0);
+ }
+ if (close(from_fd) < 0)
+ {
+ fprintf(stderr, "failed to close `%s': %s\n", argv[1], strerror(errno));
+ exit(1);
+ }
+
+ /*
+ * Build the restored WAL segment file image,
+ * then write the whole restored image to the output.
+ */
+ if (create_wal_image((int)arch_len))
+ {
+ fprintf(stderr, "failed to create the image of `%s'\n", argv[1]);
+ exit(1);
+ }
+ to_fd = open_xlog_file(argc, argv);
+ write_buff(to_fd, xlog_buff, XLogSegSize);
+ if (close(to_fd) < 0)
+ {
+ fprintf(stderr, "failed to close `%s': %s\n", argv[2], strerror(errno));
+ exit(1);
+ }
+
+ exit(0);
+ }
+
+ /*
+ * Show the usage of the command and exit with specified code.
+ */
+ static void
+ print_usage(int code)
+ {
+ printf(
+ "usage: pg_decompresslog [from [to]]\n"
+ " from - Input file name (stdin if omitted or specified as '-')\n"
+ " to - Output file name (stdout if omitted or specified as '-')\n"
+ );
+ exit(code);
+ }
+
+ /*
+ * Restore 8kB page WAL segment file image from 16MB page archive log built by
+ * pg_compresslog command.
+ *
+ * Parameters:
+ * arch_len: Size of the archive log file.
+ */
+ int
+ create_wal_image(int arch_len)
+ {
+ /* Buffer holding one record data. */
+ static char record_buff[XLogSegSize];
+ XLogPageHeader pheader;
+ XLogContRecord *pcontrec = NULL;
+ XLogRecord *precord = NULL;
+ char *rec_write_pos = record_buff;
+ int rec_len = 0;
+ bool isFromPrevSeg = false;
+
+ /*
+ * Copy the archive log file page header and hold info in the header.
+ * They are used to restore page headers of WAL segment file.
+ */
+ pheader = (XLogPageHeader)arch_buff;
+ if (XLogPageHeaderSize(pheader) != SizeOfXLogLongPHD)
+ {
+ fprintf(stderr, "invalid pageheader size.\n");
+ return -1;
+ }
+ memcpy(write_pos, (char *)pheader, SizeOfXLogLongPHD);
+ read_pos += SizeOfXLogLongPHD;
+ write_pos += SizeOfXLogLongPHD;
+
+ baseheader.xlp_magic = pheader->xlp_magic;
+ baseheader.xlp_info &= ~XLP_ALL_FLAGS;
+ baseheader.xlp_tli = pheader->xlp_tli;
+ baseheader.xlp_pageaddr.xlogid = pheader->xlp_pageaddr.xlogid;
+ baseheader.xlp_pageaddr.xrecoff = pheader->xlp_pageaddr.xrecoff;
+
+ /*
+ * Copy XLogContRecord and the continuous record to the record buffer, if there is
+ * a continuous record from the last segment file. Then move them to WAL segment
+ * file image buffer.
+ */
+ if (pheader->xlp_info & XLP_FIRST_IS_CONTRECORD)
+ {
+ pcontrec = (XLogContRecord *)read_pos;
+
+ /* If the size of the continuation record is not valid, it's an error. */
+ if (pcontrec->xl_rem_len == 0)
+ {
+ fprintf(stderr, "invalid continue record length : xl_rem_len = %u\n",
+ pcontrec->xl_rem_len);
+ return -1;
+ }
+
+ memcpy(write_pos, read_pos, SizeOfXLogContRecord);
+ write_pos += SizeOfXLogContRecord;
+ rec_len = pcontrec->xl_rem_len;
+ memcpy(rec_write_pos, (read_pos + SizeOfXLogContRecord), rec_len);
+ read_pos += MAXALIGN(SizeOfXLogContRecord + rec_len);
+ isFromPrevSeg = true;
+
+ /* Write the continuous data to WAL segment file image buffer. */
+ if (write_record(record_buff, rec_len, isFromPrevSeg))
+ return 0;
+ }
+ isFromPrevSeg = false;
+ dump_page_header(((write_pos - xlog_buff) / XLOG_BLCKSZ),
+ (XLogPageHeader)xlog_buff);
+
+ /*
+ * Loop record by record, and build each record image in the record buffer.
+ */
+ while ((read_pos - arch_buff) < arch_len)
+ {
+ /* Set the write position of the record data. */
+ rec_write_pos = record_buff;
+ precord = (XLogRecord *)read_pos;
+
+ /*
+ * If the record data fits in the current segment, validate the record.
+ * If the WAL record continues to the next segment, we cannot calculate CRC for
+ * the whole record, so skip the validation.
+ */
+ if ((char *)precord - arch_buff + precord->xl_tot_len <= arch_len)
+ if (!is_valid_record(precord))
+ exit(1);
+
+ /*
+ * Record other than the log switch must have corresponding logical data.
+ * Refer to the comment around the line 3056 in src/backend/access/transam/xlog.c (8.2.0).
+ */
+ if (IS_XLOG_SWITCH(precord))
+ {
+ if (precord->xl_len != 0)
+ {
+ fprintf(stderr, "invalid xlog switch record.\n");
+ exit(1);
+ }
+ }
+ else if (precord->xl_len == 0)
+ {
+ fprintf(stderr, "invalid record length.\n");
+ exit(1);
+ }
+
+ /*
+ * Copy the record header and the logical log to the record buffer.
+ * We don't move read_pos here, so as to calculate the alignment considering
+ * physical log.
+ */
+ memcpy(rec_write_pos, read_pos, (SizeOfXLogRecord + precord->xl_len));
+ rec_write_pos += (SizeOfXLogRecord + precord->xl_len);
+ rec_len = precord->xl_tot_len;
+
+ /* Copy the physical log, or restore it. */
+ if (precord->xl_tot_len > SizeOfXLogRecord + precord->xl_len)
+ {
+ /*
+ * If physical log does exist (not removed), then simply copy it.
+ * If physical log is removed, then build a dummy.
+ */
+ if (precord->xl_info & XLR_BKP_BLOCK_MASK)
+ {
+ memcpy(rec_write_pos,
+ XLogRecGetData((XLogRecord *)read_pos) + precord->xl_len,
+ precord->xl_tot_len - (SizeOfXLogRecord + precord->xl_len));
+ read_pos += MAXALIGN(precord->xl_tot_len);
+ }
+ else
+ {
+ /*
+ * Because full page write flag will be omitted during the archiving, CRC check
+ * should be performed only against the record header and the logical log.
+ * Therefore, we don't have to recalculate CRC value here.
+ */
+ memset(rec_write_pos, '\0',
+ precord->xl_tot_len - (SizeOfXLogRecord + precord->xl_len));
+ read_pos += MAXALIGN(SizeOfXLogRecord + precord->xl_len);
+ }
+ }
+ else
+ {
+ /* If there is no physical log in the original log, simply update the input position. */
+ read_pos += MAXALIGN(precord->xl_tot_len);
+ }
+
+ /*
+ * Write a record image to the WAL segment image buffer. The page size
+ * is 8kB again, as in the original WAL.
+ */
+ if (write_record(record_buff, rec_len, isFromPrevSeg))
+ break;
+
+ }
+
+ return 0;
+ }
+
+ /*
+ * Build a record image to fit in 8kB page to WAL segment image buffer.
+ *
+ * Return value:
+ * 0: One record data build complete.
+ * 1: Hit the tail of the segment.
+ */
+ static int
+ write_record(char *record_buff, int rem_len, bool isFromPrevSeg)
+ {
+ char *phead_pos = NULL;
+ char *rec_read_pos = record_buff; /* Read position in the record buffer. */
+ char *rec_head_pos = NULL;
+ int freespace; /* Size of the free space in the page. */
+ bool hasContRecord = isFromPrevSeg;
+
+ /*
+ * Hold the position of the record header (or XLogContRecord).
+ * It is needed for an alignment.
+ */
+ rec_head_pos = write_pos; /* Set the write position of the record data. */
+ if (isFromPrevSeg)
+ rec_head_pos -= SizeOfXLogContRecord;
+
+ freespace = get_freespace();
+
+ /*
+ * If the free space equals the page size, the previous record restoration
+ * filled the last page exactly, so we add a page header here.
+ * Because that record is complete in the last page, we don't need XLogContRecord.
+ * The page header at the top of the segment must already have been written by the time this function is first called,
+ * so we always add a short-format header here.
+ */
+ if (freespace == XLOG_BLCKSZ)
+ {
+ phead_pos = write_pos;
+ insert_pageheader(write_pos, &baseheader, false);
+ write_pos += SizeOfXLogShortPHD;
+ freespace = (XLOG_BLCKSZ - SizeOfXLogShortPHD);
+ rec_head_pos = write_pos;
+ dump_page_header(((write_pos - xlog_buff) / XLOG_BLCKSZ),
+ (XLogPageHeader)phead_pos);
+ }
+
+ /*
+ * Loop page by page.
+ */
+ while(1)
+ {
+ /*
+ * If the record header does not fit the page, then insert a page header to the next
+ * page and copy the record data.
+ */
+ if (!hasContRecord && freespace < SizeOfXLogRecord)
+ {
+ write_pos += freespace;
+ }
+ else if (freespace < rem_len)
+ {
+ /*
+ * If the record data does not fit the page, fill this page with the first
+ * part of the record. The rest goes to the next page, after a page header
+ * and XLogContRecord, which are inserted at the bottom of this loop before
+ * the next iteration.
+ */
+ memcpy(write_pos, rec_read_pos, freespace);
+ if (!hasContRecord)
+ dumpXLogRecord(&baseheader.xlp_pageaddr,
+ (size_t)(rec_head_pos - xlog_buff),
+ (XLogRecord *)rec_head_pos);
+ write_pos += freespace;
+ rec_read_pos += freespace;
+ rem_len -= freespace;
+ hasContRecord = true;
+ }
+ else
+ {
+ /*
+ * If the record data fits in the page, copy the whole record data to the
+ * buffer and switch to the next record.
+ */
+ int len;
+ memcpy(write_pos, rec_read_pos, rem_len);
+ if (!hasContRecord)
+ dumpXLogRecord(&baseheader.xlp_pageaddr,
+ (size_t)(rec_head_pos - xlog_buff),
+ (XLogRecord *)rec_head_pos);
+
+ /*
+ * Alignment handling.
+ * Alignment has to be adjusted for each record.
+ */
+ if (hasContRecord)
+ len = MAXALIGN(SizeOfXLogContRecord + rem_len);
+ else
+ len = MAXALIGN(rem_len);
+ write_pos = rec_head_pos + len;
+ hasContRecord = false;
+
+ break;
+ }
+
+ /*
+ * Insert a page header.
+ * If the start of the page is continuation data from the last page,
+ * insert XLogContRecord too.
+ */
+ if ((write_pos - xlog_buff) >= XLogSegSize)
+ return 1;
+ phead_pos = write_pos;
+ insert_pageheader(write_pos, &baseheader, hasContRecord);
+ write_pos += SizeOfXLogShortPHD;
+ freespace = (XLOG_BLCKSZ - SizeOfXLogShortPHD);
+ rec_head_pos = write_pos;
+ if (hasContRecord)
+ {
+ insert_XLogContRecord(write_pos, rem_len);
+ write_pos += SizeOfXLogContRecord;
+ freespace -= SizeOfXLogContRecord;
+ }
+ dump_page_header(((write_pos - xlog_buff) / XLOG_BLCKSZ),
+ (XLogPageHeader)phead_pos);
+ }
+ return 0;
+ }
+
+ /*
+ * Calculate free space size of the page.
+ */
+ static int
+ get_freespace(void)
+ {
+ return XLOG_BLCKSZ - (write_pos - xlog_buff) % XLOG_BLCKSZ;
+ }
+
+ /*
+ * Insert an XLogContRecord to the buffer.
+ *
+ * Parameters:
+ * write_pos: Write position in the buffer.
+ * rem_len: Length of the remaining record which continues from the last
+ * page.
+ */
+ static void
+ insert_XLogContRecord(char *write_pos, int rem_len)
+ {
+ XLogContRecord contrec;
+
+ contrec.xl_rem_len = rem_len;
+ memcpy(write_pos, (char *)&contrec, SizeOfXLogContRecord);
+ }
+
+ /*
+ * Insert a page header to the buffer.
+ *
+ * Parameters:
+ * write_pos: Write position in the buffer.
+ * pheader: Pointer to the structure holding header info at the first page of
+ * the segment.
+ * hasContRecord: Flag to indicate a continuous record from the last page.
+ */
+ static void
+ insert_pageheader(char *write_pos, XLogPageHeader pheader, bool hasContRecord)
+ {
+ /*
+ * Each page header is restored using the page header at the first page of
+ * the WAL segment. Magic number (xlp_magic), timeline id (xlp_tli) and
+ * XLOGID (xlogid) should not change within a segment and they are copied
+ * from the first page header. The continuation flag (xlp_info) depends on the
+ * record of a given page.
+ * xrecoff is calculated by adding XLOG_BLCKSZ to xrecoff value in the first
+ * page header.
+ */
+ pheader->xlp_info &= ~XLP_ALL_FLAGS;
+ if (hasContRecord)
+ pheader->xlp_info |= XLP_FIRST_IS_CONTRECORD;
+ pheader->xlp_pageaddr.xrecoff += XLOG_BLCKSZ;
+ memcpy(write_pos, (char *)pheader, SizeOfXLogShortPHD);
+ }
+
+ /*
+ * Open the archive log file to be restored and return file descriptor.
+ *
+ * The first argument of the command is an input file name.
+ * If omitted or specified as "-", stdin will be used as an input file.
+ *
+ * Parameters:
+ * argc: Number of arguments given to the command.
+ * argv: Array of pointers to the arguments given to the
+ * command.
+ *
+ * Note: If an error occurs within this function, the whole command will exit here
+ * using exit() and the caller will not have any chance to take care of errors.
+ */
+ static int
+ open_arch_file(int argc, char *argv[])
+ {
+ int from_fd = -1;
+
+ if (argc > 1 && strcmp(argv[1], "-") != 0)
+ {
+ /* Open archive log file to restore. */
+ from_fd = open(argv[1], O_RDONLY, 0);
+ if (from_fd < 0)
+ {
+ fprintf(stderr, "failed to open `%s': %s\n", argv[1],
+ strerror(errno));
+ exit(1);
+ }
+
+ /* Obtain the segment ID from the file name (for record dump). */
+ get_segment_id(argv[1]);
+ }
+ else
+ from_fd = fileno(stdin);
+
+ return from_fd;
+ }
+
+ /*
+ * Open the WAL segment file to write the restored data and return file
+ * descriptor.
+ *
+ * The second argument to the command will be regarded as an output file.
+ * If omitted or specified as "-", stdout will be used as the output file.
+ *
+ * Parameters:
+ * argc: Number of arguments given to the command.
+ * argv: Array of pointers to the arguments given to the
+ * command.
+ *
+ * Note: If an error occurs within this function, the whole command will exit here
+ * using exit() and the caller will not have any chance to take care of errors.
+ */
+ static int
+ open_xlog_file(int argc, char *argv[])
+ {
+ int to_fd = -1;
+
+ if (argc > 2 && strcmp(argv[2], "-") != 0)
+ {
+ /* Open the WAL segment file */
+ to_fd = open(argv[2], O_RDWR | O_CREAT | O_EXCL | PG_BINARY,
+ S_IRUSR | S_IWUSR);
+ if (to_fd < 0)
+ {
+ fprintf(stderr, "failed to open `%s': %s\n", argv[2],
+ strerror(errno));
+ exit(1);
+ }
+ }
+ else
+ to_fd = fileno(stdout);
+
+ return to_fd;
+ }
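
For readers following the patch, here is a minimal sketch of the "rewind" arithmetic that create_arch_image() applies after remove_bkp_block() succeeds. It is not part of the patch. The names sketch_maxalign and sketch_next_write_offset are hypothetical, the 8-byte alignment and the 32-byte header size are assumptions for illustration (PostgreSQL takes the real values from MAXIMUM_ALIGNOF and SizeOfXLogRecord), and rec_offset is assumed to be already aligned.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 8-byte maximum alignment (PostgreSQL uses MAXIMUM_ALIGNOF). */
#define SKETCH_MAXALIGN_OF 8
/* Assumption: illustrative header size, not the real SizeOfXLogRecord. */
#define SKETCH_REC_HDR_SIZE 32

/* Round len up to the next multiple of the alignment, like MAXALIGN(). */
static size_t
sketch_maxalign(size_t len)
{
    return (len + (SKETCH_MAXALIGN_OF - 1)) &
           ~((size_t) (SKETCH_MAXALIGN_OF - 1));
}

/*
 * Offset in the archive image at which the next record starts.
 * When the backup blocks were removed, only the header and the logical
 * data (xl_len bytes) are kept; otherwise the whole record (xl_tot_len
 * bytes) is kept.
 */
static size_t
sketch_next_write_offset(size_t rec_offset, size_t xl_len,
                         size_t xl_tot_len, int removed)
{
    size_t kept;

    /* A record is never shorter than its header plus logical data. */
    assert(xl_tot_len >= SKETCH_REC_HDR_SIZE + xl_len);

    kept = removed ? SKETCH_REC_HDR_SIZE + xl_len : xl_tot_len;
    return rec_offset + sketch_maxalign(kept);
}
```

With these assumed sizes, a record at offset 64 carrying 20 bytes of logical data and a full-page image (xl_tot_len of 8300) occupies 8304 bytes in the archive if kept and 56 bytes if its physical log is removed, which is where the space saving comes from.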

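
The page handling in write_record() relies on get_freespace(). The following is a rough sketch of that calculation, not the patch's code: the names sketch_freespace and sketch_bytes_in_page are hypothetical, and the 8192-byte page and 16MB segment are the default sizes, assumed here. Note that an offset sitting exactly on a page boundary reports a full page of free space, which is the case write_record() uses to decide that a new page header must be inserted first.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_BLCKSZ  8192                  /* assumption: default XLOG_BLCKSZ */
#define SKETCH_SEGSIZE (16 * 1024 * 1024)    /* assumption: default XLogSegSize */

/*
 * Free space left in the page that contains `offset`.
 * Returns SKETCH_BLCKSZ when offset is exactly on a page boundary.
 */
static size_t
sketch_freespace(size_t offset)
{
    assert(offset <= SKETCH_SEGSIZE);
    return SKETCH_BLCKSZ - offset % SKETCH_BLCKSZ;
}

/*
 * Number of bytes of a `len`-byte chunk that fit in the current page;
 * the remainder would have to continue on the next page.
 */
static size_t
sketch_bytes_in_page(size_t offset, size_t len)
{
    size_t free_len = sketch_freespace(offset);

    return len < free_len ? len : free_len;
}
```

For example, 500 bytes written at offset 8000 leave 192 bytes in the current page, and the remaining 308 bytes would follow a page header and an XLogContRecord on the next page.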

