Josh;

I'd like to explain again what the term "compression" in my proposal means, and to show the resource consumption comparison with cp and gzip.

My proposal is to remove unnecessary full page writes (they are needed in crash recovery from inconsistent or partial writes) when we copy WAL to archive log, and to rebuild them as dummies when we restore from archive log. The dummy is needed to maintain LSN. So it is very different from general purpose compression such as gzip, although pg_compresslog compresses archive log as a result.

As to CPU and I/O consumption, I've already evaluated it as follows:

1) Collect all the WAL segments.
2) Copy them by different means: cp, pg_compresslog and gzip.

and compared the elapsed time as well as other resource consumption.

Benchmark: DBT-2
Database size: 120WH (12.3GB)
Total WAL size: 4.2GB (after 60min. run)

Elapsed time:
  cp:             120.6sec
  gzip:           590.0sec
  pg_compresslog:  79.4sec

Resultant archive log size:
  cp:             4.2GB
  gzip:           2.2GB
  pg_compresslog: 0.3GB

Resource consumption:
  cp:             user:   0.5sec  system: 15.8sec  idle:  16.9sec  I/O wait: 87.7sec
  gzip:           user: 286.2sec  system:  8.6sec  idle: 260.5sec  I/O wait: 36.0sec
  pg_compresslog: user:   7.9sec  system:  5.5sec  idle:  37.8sec  I/O wait: 28.4sec

Because the resultant log size is considerably smaller than with cp or gzip, pg_compresslog needs much less I/O, and because the logic is much simpler than gzip, it consumes far less CPU.

The term "compress" may not be appropriate. We may call this "log optimization" instead.

So I don't see any reason why this (at least the optimization "mark" in each log record) can't be integrated.

Simon Riggs wrote:
> On Thu, 2007-03-29 at 11:45 -0700, Josh Berkus wrote:
>
>>> OK, different question:
>>> Why would anyone ever set full_page_compress = off?
>> The only reason I can see is if compression costs us CPU but gains RAM &
>> I/O. I can think of a lot of applications ... benchmarks included ...
>> which are CPU-bound but not RAM or I/O bound. For those applications,
>> compression is a bad tradeoff.
>>
>> If, however, CPU used for compression is made up elsewhere through smaller
>> file processing, then I'd agree that we don't need a switch.

As I wrote in reply to Simon's comment, I am concerned about only one thing.

Without a switch, because both full page writes and the corresponding logical log are included in WAL, this will increase WAL size slightly (maybe about five percent or so). If everybody is happy with this, we don't need a switch.

>
> Koichi-san has explained things for me now.
>
> I misunderstood what the parameter did and reading your post, ISTM you
> have as well. I do hope Koichi-san will alter the name to allow
> everybody to understand what it does.
>

Here are some candidates:
full_page_writes_optimize
full_page_writes_mark: means it marks full_page_write as "needed in crash recovery", "needed in archive recovery" and so on.

I don't insist on these names. It would be very helpful if you have any suggestion that reflects what it really means.

Regards;
--
Koichi Suzuki

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?
       http://www.postgresql.org/docs/faq
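
The comparison in the message above is easier to read as ratios. The following is a minimal sketch that recomputes them from the quoted figures; the dictionary layout and the helper name `summarize` are illustrative only and are not part of the patch or of pg_compresslog.

```python
# Figures copied from the benchmark above (DBT-2, 4.2GB of WAL after a 60min. run).
TOTAL_WAL_GB = 4.2

results = {
    # method: (elapsed seconds, resultant archive log size in GB)
    "cp": (120.6, 4.2),
    "gzip": (590.0, 2.2),
    "pg_compresslog": (79.4, 0.3),
}


def summarize(results, total_gb):
    """Return {method: (fraction of original WAL size, elapsed time relative to cp)}."""
    cp_elapsed = results["cp"][0]
    summary = {}
    for method, (elapsed, size_gb) in results.items():
        summary[method] = (size_gb / total_gb, elapsed / cp_elapsed)
    return summary


if __name__ == "__main__":
    for method, (size_frac, time_ratio) in summarize(results, TOTAL_WAL_GB).items():
        print(f"{method:15s} size: {size_frac:6.1%} of WAL   elapsed: {time_ratio:4.2f}x cp")
```

On these figures the pg_compresslog archive is roughly 7 percent of the original WAL and the gzip archive roughly 52 percent, while gzip takes almost five times as long as cp and pg_compresslog about two thirds as long.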

> Archive recovery needs the
> normal xlog record, which in some cases has been optimised
> away because the backup block is present, since the full
> block already contains the changes.

Aah, I didn't know that optimization exists.
I agree that removing that optimization is good/ok.

Andreas

On Fri, 2007-04-13 at 11:47 -0400, Tom Lane wrote:
> "Simon Riggs" <simon@2ndquadrant.com> writes:
> > On Fri, 2007-04-13 at 10:36 -0400, Tom Lane wrote:
> >> That's what bothers me about this patch, too. It will be increasing
> >> the cost of writing WAL (more data -> more CRC computation and more
> >> I/O, not to mention more contention for the WAL locks) which translates
> >> directly to a server slowdown.
>
> > I don't really understand this concern.
>
> The real objection is that a patch that's alleged to make WAL smaller
> actually does the exact opposite. Now maybe you can buy that back
> downstream of the archiver --- after yet more added-on processing ---
> but it still seems that there's a fundamental misdesign here.
>
> > Koichi-san has included a parameter setting that would prevent any
> > change at all in the way WAL is written.
>
> It bothers me that we'd need to have such a switch. That's just another
> way to shoot yourself in the foot, either by not enabling it (in which
> case applying pg_compresslog as it stands would actively break your
> WAL), or by enabling it when you weren't actually going to use
> pg_compresslog (because you misunderstood the documentation to imply
> that it'd make your WAL smaller by itself). What I want to see is a
> patch that doesn't bloat WAL at all and therefore doesn't need a switch.
> I think Andreas is correct to complain that it should be done that way.

I agree with everything you say because we already had *exactly* this discussion when the patch was already submitted, with me saying everything you just said.

After a few things have been renamed to show their correct function and impact, I am now comfortable with this patch.

Writing lots of additional code simply to remove a parameter that *might* be mis-interpreted doesn't sound useful to me, especially when bugs may leak in that way. My take is that this is simple and useful *and* we have it now; other ways don't yet exist, nor will they in time for 8.3.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

Sorry I was very late to find this.

With the DBT-2 benchmark, I've already compared the amount of WAL. The result was as follows:

Amount of WAL after 60min. run of DBT-2 benchmark
  wal_add_optimization_info = off (default)   3.13GB
  wal_add_optimization_info = on (new case)   3.17GB -> can be optimized to 0.31GB by pg_compresslog.

So the difference will be around a couple of percent. I think this is a very good figure.

For information,
  DB Size: 12.35GB (120WH)
  Checkpoint timeout: 60min. Checkpoint occurred only once in the run.

----------------

I don't think replacing LSN works fine. For full recovery to the current time, we need both archive log and WAL. Replacing LSN will make the archive log LSN inconsistent with the WAL's LSN and the recovery will not work.

Reconstruction to regular WAL is proposed as pg_decompresslog. We should be careful not to confuse the redo routines with the dummy full page writes, as Simon suggested. So far, it works fine.

Regards;

Zeugswetter Andreas ADI SD wrote:
>>> Yup, this is a good summary.
>>>
>>> You say you need to remove the optimization that avoids the logging of
>>> a new tuple because the full page image exists.
>>> I think we must already have the info in WAL which tuple inside the
>>> full page image is new (the one for which we avoided the WAL entry
>>> for).
>>>
>>> How about this:
>>> Leave current WAL as it is and only add the not removable flag to
>>> full pages.
>>> pg_compresslog then replaces the full page image with a record for the
>>> one tuple that is changed.
>>> I tend to think it is not worth the increased complexity only to save
>>> bytes in the uncompressed WAL though.
>> It is essentially what my patch proposes. My patch includes
>> flag to full page writes which "can be" removed.
>
> Ok, a flag that marks full page images that can be removed is perfect.
>
> But you also turn off the optimization that avoids writing regular
> WAL records when the info is already contained in a full-page image
> (increasing the uncompressed size of WAL).
> It was that part I questioned. As already stated, maybe I should not
> have because it would be too complex to reconstruct a regular WAL
> record from the full-page image.
> But that code would also be needed for WAL based partial replication, so
> if it were too complicated we would eventually want a switch to turn
> off the optimization anyway (at least for heap page changes).
>
>>> Another point about pg_decompresslog:
>>>
>>> Why do you need a pg_decompresslog ? Imho pg_compresslog should
>>> already do the replacing of the full_page with the dummy entry. Then
>>> pg_decompresslog could be a simple gunzip, or whatever compression was
>>> used, but no logic.
>> Just removing full page writes does not work. If we shift the rest of
>> the WAL, then LSN becomes inconsistent in compressed archive logs which
>> pg_compresslog produces. For recovery, we have to restore LSN as the
>> original WAL. Pg_decompresslog restores removed full page writes as
>> dummy records so that recovery redo functions won't be confused.
>
> Ah sorry, I needed some pgsql/src/backend/access/transam/README reading.
>
> LSN is the physical position of records in WAL. Thus your dummy record
> size is equal to what you cut out of the original record.
> What about disconnecting WAL LSN from physical WAL record position
> during replay ?
> Add simple short WAL records in pg_compresslog like: advance LSN by 8192
> bytes.
>
> Andreas
>

--
-------------
Koichi Suzuki
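
The LSN argument in this exchange can be illustrated with a toy model. This is only a sketch under stated assumptions: the `Record` type, the record kinds, `MARKER_SIZE` and all function names are hypothetical, and it does not reproduce the real WAL record format or the actual pg_compresslog and pg_decompresslog code. It treats the LSN as the byte offset of a record in the stream, as described above, and shows why simply dropping full page writes shifts later records while restoring same-size dummies does not.

```python
from dataclasses import dataclass
from typing import List

MARKER_SIZE = 16  # hypothetical bytes kept in the archive per removed full page write


@dataclass(frozen=True)
class Record:
    kind: str  # "normal", "fpw" (removable full page write), "removed" or "dummy"
    size: int  # bytes the record occupies in the original WAL stream


def lsns(records: List[Record]) -> List[int]:
    """Start offset of each record; in this model the LSN is the physical position."""
    offsets, pos = [], 0
    for rec in records:
        offsets.append(pos)
        pos += rec.size
    return offsets


def compress(records: List[Record]) -> List[Record]:
    """Toy archive step: keep only a marker that remembers the removed size."""
    return [Record("removed", r.size) if r.kind == "fpw" else r for r in records]


def archived_bytes(records: List[Record]) -> int:
    """Bytes stored in the archive: markers are small, other records are kept whole."""
    return sum(MARKER_SIZE if r.kind == "removed" else r.size for r in records)


def decompress(records: List[Record]) -> List[Record]:
    """Toy restore step: re-create each removed record as a dummy of the same size."""
    return [Record("dummy", r.size) if r.kind == "removed" else r for r in records]


def naive_strip(records: List[Record]) -> List[Record]:
    """What goes wrong: dropping full page writes outright shifts later records."""
    return [r for r in records if r.kind != "fpw"]


if __name__ == "__main__":
    wal = [Record("normal", 64), Record("fpw", 8192), Record("normal", 48)]
    restored = decompress(compress(wal))
    print("original LSNs:", lsns(wal))
    print("restored LSNs:", lsns(restored))
    print("naive strip  :", lsns(naive_strip(wal)))
    print(archived_bytes(compress(wal)), "bytes archived of", sum(r.size for r in wal))
```

In this model the restored stream has the same offsets as the original, so LSNs in the archive log stay consistent with those in the live WAL, which is the property the dummy records are meant to preserve. Andreas's alternative of an "advance LSN" record would instead change how replay computes positions, which is outside what this sketch covers.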