This is a discussion on NOLOGGING option, or ? within the pgsql Hackers forums, part of the PostgreSQL category; --> On 6/1/05, Hans-Jürgen Schönig <postgres@cybertec.at> wrote: > Personally I don't think that it is a good idea to do ...
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| On 6/1/05, Hans-Jürgen Schönig <postgres@cybertec.at> wrote: > Personally I don't think that it is a good idea to do that. > People will tend to corrupt their systems because they want speed > (sometimes without thinking about the consequences). > > I can only think of one scenario where nologging would actually make > sense: Many people use session tables to keep track of user level > information on a website. corrupting a session table (usually not very > large) would not cause a lot of problems. Well, from what I know, TEMPORARY tables are not WAL-logged, since they won't exist after "restart" (since none of the current sessions would exist). The problem with TEMPORARY tables is that they are not globally visible. I think it would be neat to be able to create "server-life-time" tables. I.e. table which is initially empty, can be used by all users, but is guaranteed to be empty (truncated) upon server restart. A place to store global temporary variables. A table type to put on a RAM-disk. Potential uses? Imagine a site which has a busy 'session' table and a relatively 'static' other tables. Most of WALs would consist on 'useless' updates to session table. And recovery using WAL files would take longer (as PostgreSQL would have to dump and restore whole a lot of session data). Having a "global temporary" table would be helpful in such a situation. And theoretically it wouldn't need to "spill to disk" at all, provided it was small enough. Regards, Dawid ---------------------------(end of broadcast)--------------------------- TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org |
| |||
| On Wed, 2005-06-01 at 10:18 -0400, Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > Tom Lane wrote: > >> Not unless you are proposing to change COPY to acquire a lock strong > >> enough to lock out other writers to the table for the duration ... > > > Well, if the table is initally empty, what harm is there in locking the > > table? > > You cannot *know* whether it is empty unless you lock the table before > you look. So your argument is circular. > > I think this only makes sense as an explicit option to COPY, one of the > effects of which would be to take a stronger lock than COPY normally does. Locking the table is OK. We're loading it for the first time, so we are expecting to be the only users at this time. Here's a more fully worked out plan for wal/concurrency: 1. Lock table, scan until we find our first live row (not deleted, insert not aborted), in which case throw an error. If no error, then ok to proceed to next phase. 2. If there are any dead row versions we truncate the file, since we do not wish to see those rows ever again. No existing transaction can have a lock on them, since we hold it, so its ok for them to go. We record that we have done this in WAL with a short message to say that that we are loading an empty table, whether or not archiving is not enabled. 3. We load the table, with all TransactionIds set as FROZEN. We do not generate WAL for each row unless archiving is enabled. 4. Optionally, we set a flag on the table showing the whole table is frozen. Anybody writing to this table subsequently will spoil this flag. If the flag is set, all forms of VACUUM will return success immediately without performing a scan (since it is already in a perfect VACUUM FULL and VACUUM FREEZE state). If the server crashes, we replay WAL. If we see a load start message, we truncate the relation and note that a load has started. If there is WAL data for the tuples, we replay it. If WAL replay ends without the load transaction having successfully committed, then we truncate the table. The above applies to both LOAD/COPY whatever-yer-call-it and in modified form for CREATE TABLE AS SELECT. For CTAS, no scan is required in (1), and no truncate is required in (2), otherwise the same. I'm expecting Alon Goldshuv to join this discussion soon to explain some other ideas, new developments and discuss the potential for a new command, so lets wait for him... Best Regards, Simon Riggs ---------------------------(end of broadcast)--------------------------- TIP 7: don't forget to increase your free space map settings |
| |||
| Hannu Krosing <hannu@skype.net> writes: > I think this should be a decision done when creating a table, just like > TEMP tables. So you always know if a certain table is or is not > safe/replicated/recoverable. > This has also the advantage of requiring no changes to actual COPY and > INSERT commands. That doesn't seem right to me; the scenario I envision is that you are willing to do the initial data loading over again (since you presumably still have the source data available). But once you've got it loaded you want full protection. Perhaps it could work to use an ALTER TABLE command to flip the state. But I'm not really seeing the point compared to treating it as a COPY option. I do not believe that anyone needs this to work on individual INSERT commands --- if you are after max speed, why aren't you using COPY? And treating it as an ALTER property opens the possibility of forgetting to ALTER the table back to normal behavior, which would be a foot-gun of large caliber indeed :-( regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq |
| |||
| Bruce Momjian <pgman@candle.pha.pa.us> writes: > Jochem van Dieten wrote: >> Why only on an empty table? What is the problem with bypassing WAL on >> any table as long as all files of that table are fsync'ed before >> commit? > Because adding rows to a table might modify existing pages, and if the > COPY fails, you have to restore those pages to a consistent state, and > make sure they are recovered for partial page writes, which we can't do > without WAL. With an initially empty table, you can just throw away the > file system file. You have also got to think about the effects on the table's indexes ... regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |
| |||
| Simon Riggs <simon@2ndquadrant.com> writes: > 4. Optionally, we set a flag on the table showing the whole table is > frozen. Anybody writing to this table subsequently will spoil this flag. > If the flag is set, all forms of VACUUM will return success immediately > without performing a scan (since it is already in a perfect VACUUM FULL > and VACUUM FREEZE state). This bit strikes me as dangerous and not related to the original proposal. I don't care for the load-already-frozen-data part at all, either. That's not just giving up WAL protection, that's an outright MVCC semantics violation, in return for which we get ... not much. Certainly not any speedup in the LOAD itself. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) |
| |||
| On 6/1/05, Bruce Momjian wrote: > Jochem van Dieten wrote: >> >> Why only on an empty table? What is the problem with bypassing WAL on >> any table as long as all files of that table are fsync'ed before >> commit? > > Because adding rows to a table might modify existing pages, and if the > COPY fails, you have to restore those pages to a consistent state, and > make sure they are recovered for partial page writes, which we can't do > without WAL. With an initially empty table, you can just throw away the > file system file. Thank you for the explanation, but I am afraid I still don't get it. COPY can either fail and do a normal rollback, in which case there is no problem because the xid never made it to the xlog. So I take it you are talking about a hard crash (pull the plug) somewhere during the actual writing to disk. In that case you have updated several pages and overwritten the free space with new tuples. But you have not overwritten live tuples, so why would you need to restore them? I mean, didn't PostgreSQL < 7.1 work without a WAL at all? Jochem ---------------------------(end of broadcast)--------------------------- TIP 4: Don't 'kill -9' the postmaster |
| |||
| On Wed, 2005-06-01 at 11:31 -0400, Tom Lane wrote: > Hannu Krosing <hannu@skype.net> writes: > > I think this should be a decision done when creating a table, just like > > TEMP tables. So you always know if a certain table is or is not > > safe/replicated/recoverable. > > This has also the advantage of requiring no changes to actual COPY and > > INSERT commands. > > That doesn't seem right to me; the scenario I envision is that you are > willing to do the initial data loading over again (since you presumably > still have the source data available). But once you've got it loaded > you want full protection. Yes, thats the scenario. Believe me, I prefer less code, but I think general feeling now is that we must provide a data safe solution to the performance challenge. > Perhaps it could work to use an ALTER TABLE command to flip the state. > But I'm not really seeing the point compared to treating it as a COPY > option. I do not believe that anyone needs this to work on individual > INSERT commands --- if you are after max speed, why aren't you using > COPY? And treating it as an ALTER property opens the possibility of > forgetting to ALTER the table back to normal behavior, which would be > a foot-gun of large caliber indeed :-( Oh no, not the foot gun again. I surrender. Best Regards, Simon Riggs ---------------------------(end of broadcast)--------------------------- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |
| |||
| On Wed, Jun 01, 2005 at 06:00:28PM +0100, Simon Riggs wrote: > On Wed, 2005-06-01 at 11:31 -0400, Tom Lane wrote: > > Perhaps it could work to use an ALTER TABLE command to flip the state. > > But I'm not really seeing the point compared to treating it as a COPY > > option. I do not believe that anyone needs this to work on individual > > INSERT commands --- if you are after max speed, why aren't you using > > COPY? And treating it as an ALTER property opens the possibility of > > forgetting to ALTER the table back to normal behavior, which would be > > a foot-gun of large caliber indeed :-( > > Oh no, not the foot gun again. I surrender. Sorry, what are you surrendering from/for/of/to? I think the proposal is doing a reasonable headways. So far we have - it is an option to COPY and CREATE TABLE AS, not GUC, not ALTER TABLE - it'd only work on empty tables - it'd only work it PITR is off - it needs to follow MVCC semantics, i.e. the tuples need to be written with the correct Xids - after the command is done, an fsync is applied to the table file(s) Another point that needs thought is what to do if the table has any indexes. Are operations on said indexes logged or not? Maybe we should just say that indexes are verbotten and the user needs to create them afterwards. -- Alvaro Herrera (<alvherre[a]surnet.cl>) "Las cosas son buenas o malas segun las hace nuestra opinión" (Lisias) ---------------------------(end of broadcast)--------------------------- TIP 8: explain analyze is your friend |
| |||
| Alvaro Herrera <alvherre@surnet.cl> writes: > - it is an option to COPY and CREATE TABLE AS, not GUC, not ALTER TABLE AFAICS it could just happen automatically for CREATE TABLE AS; there's no need for an option there, any more than there is for CREATE INDEX. The only reason it needs to be an explicitly specified option for COPY is that it would require taking a sole-writer lock on the table, which COPY does not now do (and I believe I've heard of people using parallel COPYs to load a table faster, so changing the lock type wouldn't be transparent for everyone). > Another point that needs thought is what to do if the table has any > indexes. Are operations on said indexes logged or not? Maybe we should > just say that indexes are verbotten and the user needs to create them > afterwards. That seems pretty reasonable to me. Again, that's what you'd do anyway if you are after the fastest possible load time, so why should we work much harder to support an inefficient approach? Thinking about it, maybe the user-visible option should be defined thus: LOCK Causes COPY FROM to acquire Exclusive lock on the target table, rather than RowExclusive lock as it normally does. This ensures that no other process is modifying the table while the COPY proceeds. In some cases this can allow significantly faster operation. and then the checks on PITR mode, no indexes, and empty starting table could be internal implementation details rather than part of the user-visible spec (ie, we just fall through and do it normally if any of those conditions don't hold). I like this a little better because there might be application-level reasons to want exclusive lock, independently of implementation details. Also: AFAICS the starting table need not be empty, if we arrange for all inserts done by the COPY to be done into freshly-appended blocks. The initial WAL entry could note the current table length, and instead of "truncate to 0 length" the recovery action is "truncate to noted length". So really the constraints are just "no PITR" and "no indexes". regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 4: Don't 'kill -9' the postmaster |
| ||||
| I have been working on improving the COPY command performance and as a result also came up with other thoughts that may possibly be better off implemented in a new command (i.e LOAD DATA) rather than adding them to the existing COPY. The improvements I made were in the COPY parsing logic - changing a char-by-char parsing to a multi char buffered fast parsing, while using a bytebuffer which is faster than StringInfoData and minimizing the number of loads into the buffer. The data conversion and insertion parts of COPY I left untouched. As a result the parsing performance increased by about 550%, and the overall COPY performance increased by: Around 40% for 15 column (mixed types) table. Around 90% for 1 column table. (the difference betweeen the two is caused by data conversion overhead). I will post the patch and more numbers to the list later today with more details. I'll just comment now that it is only available for delimited ASCII input data when client and server encodings are the same. CSV and encoding conversions may be added later, this is merely to show that data could be loaded much faster. Here are some things that make me think a new LOAD command is a good idea: 1) There seem to be a possibility that many COPY modifications/improvements may be problematic to incorporate in the current postgres COPY code. Further more, it may be desired to keep the COPY command as is and also have a way to run an improved COPY command for purposes of backwards compatibility. 2) A modified command syntax for introducing a direct single row error handling. By direct I mean - a row that if rejected from within the COPY command context does not throw an error and rollsback the whole transaction. Instead the error is caught and recorded elsewhere, maybe in some error table, with some more information that can later on be retrieved. The following rows continue to be processed. This way there is barely any error handling overhead. Having a recursive row isolation into smaller batches is extremely expensive for non-small data sets. It's not an option for serious users. 3) maybe have an option to indicate the EOL (end of line) format in the command syntax. Current COPY code detects the line-end according to the first data line, this is problematic is 1st data line is mal formatted, and also this doesn't allow having any CR's for example in the data file when EOL is only a linefeed. That causes extra data errors in COPY processing. A CR is a valid data character. Specifying the line end in command syntax will save all this badness. 4) Data integrity and escaping improvements. My patch changes now treats all characters as data (unless it's an escaped delim or EOL) and therefore data integrity is preserved (take for example the following valid data field "file:\new\bang" that after COPY into the database, querying for it from psql will result is wrong data -- backslashes are gone, there are 2 data lines, and a bell will ring for \b!) However, some people that already got used to the postgres COPY escaping way may want to keep it. They could do so by still using the old COPY. 5) allow an ERRORLIMIT to allow control of aborting a load after a certain number of errors (and a pre-requisite for this is point number 2 above). 6) allow LIMIT and OFFSET, for files with header rows for example (could be done in COPY too). 7) Allow the blocks to be directly written to the table, rather than via the buffer cache. 8) Allow a bulk index insertion operation at the end of the LOAD step, if the data has been loaded in sorted order. Use something like the SORTED INDEXES statement on Oracle sql*loader to specify the sort order of the incoming data, so that the index build step can bypass another external sort before loading directly into the index. 9) allow for Simon's WAL bypass. I have surely missed some problems that hide behind the idea, but these points make me believe that LOAD DATA is a good idea. Alon. On 5/31/05 7:47 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote: > Simon Riggs <simon@2ndquadrant.com> writes: >> Recent test results have shown a substantial performance improvement >> (+25%) if WAL logging is disabled for large COPY statements. > > How much of that is left after we fix the 64-bit-CRC issue? > >> Now, I would like to discuss adding an enable_logging USERSET GUC, > > [ fear and loathing ... ] > > I don't like the idea of a GUC at all, and USERSET is right out. > I think it would have to be system-wide (cf fsync) to be even > implementable let alone somewhat predictable. Even if it could > be done per-backend with reasonable semantics, random users should > not get to make that decision --- it should be the DBA's call, > which means it needs at least SUSET permissions. > > BTW, I'm sure you are the last one who needs to be reminded that > any such thing breaks PITR completely. Which is surely sufficient > reason not to let it be USERSET. > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 8: explain analyze is your friend > ---------------------------(end of broadcast)--------------------------- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |