This is a discussion on TRUNCATE TABLE within the Pgsql Performance forums, part of the PostgreSQL category; --> Recently, I have been doing extensive profiling of a version 8.1.4 Postgres DB with about 175 tables and 5 ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Recently, I have been doing extensive profiling of a version 8.1.4 Postgres DB with about 175 tables and 5 GB of data (the server running on Fedora Linux and the clients on Windows XP). Surprisingly, one of the bottlenecks is TRUNCATE TABLE and that command is really slow as compared to other operations. For example, we have operations like: TRUNCATE TABLE my_temporary_table COPY my_temporary_table ... FROM STDIN BINARY do_something where do_something is using the data in my_temporary_table to do something like a JOIN or a mass UPDATE or whatever. Now, it turns out that typically most time is lost in TRUNCATE TABLE, in fact it spoils the performance of most operations on the DB ! I read in a mailing list archive that TRUNCATE TABLE is slow since it was made transaction-safe somewhere in version 7, but for operations on a temporary table (with data coming from the outside world) that is irrelevant, at least for my application, in casu, a middleware software package. So, my questions are 1. Why is TRUNCATE TABLE so slow (even if transaction-safe) 2. Is there is way to dig up in the source code somewhere a quick-and-dirty TRUNCATE TABLE alternative for operations on temporary tables that need not be transaction-safe (because the middleware itself can easily restore anything that goes wrong there). I noticed, by the way, that removing records in general is painfully slow, but I didn't do a detailed analysis of that issue yet. As an alternative to TRUNCATE TABLE I tried to CREATE and DROP a table, but that wasn't any faster. Sincerely, Adriaan van Os ---------------------------(end of broadcast)--------------------------- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate |
| |||
| "Adriaan van Os" <postgres@microbizz.nl> writes: > Recently, I have been doing extensive profiling of a version 8.1.4 Postgres DB > with about 175 tables and 5 GB of data (the server running on Fedora Linux and > the clients on Windows XP). Surprisingly, one of the bottlenecks is TRUNCATE > TABLE and that command is really slow as compared to other operations. For > example, we have operations like: What filesystem is this? Some filesystems are notoriously slow at deleting large files. The mythtv folk who face this problem regularly recommend either JFS or XFS for this purpose. Postgres generally doesn't really need to be able to delete large files quickly. The only times files are deleted which come to mind are when you DROP or TRUNCATE or possibly when you VACUUM a table. > I noticed, by the way, that removing records in general is painfully slow, but > I didn't do a detailed analysis of that issue yet. That's strange. Deleting should be the *quickest* operation in Postgres. Do you perchance have foreign key references referencing this table? Do you have any triggers? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org |
| |||
| Gregory Stark wrote: > That's strange. Deleting should be the *quickest* operation in Postgres. Do > you perchance have foreign key references referencing this table? No. > Do you have any triggers? No. Tom Lane wrote: > Adriaan van Os <postgres@microbizz.nl> writes: >> Surprisingly, one of the bottlenecks is TRUNCATE TABLE and that >> command is really slow as compared to other operations. > > When you don't quantify that statement at all, it's hard to make an > intelligent comment on it, but TRUNCATE per se shouldn't be slow. > Are you sure you are not measuring a delay to obtain exclusive lock > on the table before it can be truncated (ie, waiting for other > transactions to finish with it)? During the tests, there is only one connection to the database server. No other transactions are running. > When you don't quantify that statement at all, it's hard to make an > intelligent comment on it, but TRUNCATE per se shouldn't be slow. Below are some timings, in milliseconds. > TRUNCATE TABLE my_temporary_table > COPY my_temporary_table ... FROM STDIN BINARY > do_something The temporary table has one INT4 field and no indices. Numrows TRUNCATE (ms) COPY (ms) SELECT (ms) 5122 80,6 16,1 51,2 3910 79,5 12,9 39,9 2745 90,4 10,7 32,4 1568 99,5 7,6 24,7 398 161,1 4,0 22,1 200 79,5 3,3 22,0 200 87,9 3,1 22,0 222368 4943,5 728,6 7659,5 222368 1685,7 512,2 2883,1 Note how fast the COPY is (which is nice). The SELECT statement uses the temporary table. Regards, Adriaan van Os ---------------------------(end of broadcast)--------------------------- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate |
| |||
| Adriaan van Os <postgres@microbizz.nl> writes: > Tom Lane wrote: >> When you don't quantify that statement at all, it's hard to make an >> intelligent comment on it, but TRUNCATE per se shouldn't be slow. > Below are some timings, in milliseconds. I can only conclude that you're using a seriously bad filesystem :-( I tried to replicate your results on a fairly old and slow HPUX box. I get a fairly repeatable time of around 40msec to truncate a table; this is presumably mostly filesystem time to create one file and delete another. I used CVS HEAD for this because the devel version of psql supports reporting \timing for \copy commands, but I'm quite sure that TRUNCATE isn't any faster than it was in 8.2: regression=# create table tab(f1 int); CREATE TABLE Time: 63.775 ms regression=# insert into tab select random()*10000 from generate_series(1,5000); INSERT 0 5000 Time: 456.011 ms regression=# \copy tab to 'tab.data' binary Time: 80.343 ms regression=# truncate table tab; TRUNCATE TABLE Time: 35.825 ms regression=# \copy tab from 'tab.data' binary Time: 391.928 ms regression=# select count(*) from tab; count ------- 5000 (1 row) Time: 21.457 ms regression=# truncate table tab; TRUNCATE TABLE Time: 47.867 ms regression=# \copy tab from 'tab.data' binary Time: 405.074 ms regression=# select count(*) from tab; count ------- 5000 (1 row) Time: 20.247 ms If I increase the test size to 200K rows, I get a proportional increase in the copy and select times, but truncate stays about the same: regression=# truncate table tab; TRUNCATE TABLE Time: 40.196 ms regression=# \copy tab from 'tab.data' binary Time: 15779.689 ms regression=# select count(*) from tab; count -------- 200000 (1 row) Time: 642.965 ms Your numbers are not making any sense to me. In particular there is no reason in the Postgres code for it to take longer to truncate a 200K-row table than a 5K-row table. (I would expect some increment at the point of having 1GB in the table, where we'd create a second table segment file, but you are nowhere near that.) The bottom line seems to be that you have a filesystem that takes a long time to delete a file, with the cost rising rapidly as the file gets bigger. Can you switch to a different filesystem? regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |
| |||
| Gregory Stark wrote: > What filesystem is this? Ext3 on Fedora Linux. > Some filesystems are notoriously slow at deleting > large files. The mythtv folk who face this problem regularly recommend either > JFS or XFS for this purpose. That's a remarkable advice, because XFS is known to be slow at creating and deleting files, see <http://en.wikipedia.org/wiki/XFS> and <http://everything2.com/index.pl?node_id=1479435>. Regards, Adriaan van Os ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org |
| |||
| Adriaan van Os a écrit : > That's a remarkable advice, because XFS is known to be slow at creating > and deleting files, see <http://en.wikipedia.org/wiki/XFS> and > <http://everything2.com/index.pl?node_id=1479435>. > date of article: Fri Jul 25 2003 ! |
| |||
| "Adriaan van Os" <postgres@microbizz.nl> writes: > That's a remarkable advice, because XFS is known to be slow at creating and > deleting files, see <http://en.wikipedia.org/wiki/XFS> and > <http://everything2.com/index.pl?node_id=1479435>. I think this is a case of "you're both right". XFS may have to do more work than other filesystems for meta-information updates. However It still only has to do a constant or nearly constant amount of work. So it may be slower at managing a large directory of thousands of small files than ext3, but it's faster at deleting a single 1G file than ext3. On mythtv the experience is that if you use ext3 and delete a large file while recording another program you can expect the new recording to lose stutter at that point. The large delete will lock out the recording from writing to the filesystem for several seconds. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---------------------------(end of broadcast)--------------------------- TIP 2: Don't 'kill -9' the postmaster |
| |||
| On 7/13/07, Jean-Max Reymond <jmreymond@gmail.com> wrote: > Adriaan van Os a écrit : > > That's a remarkable advice, because XFS is known to be slow at creating > > and deleting files, see <http://en.wikipedia.org/wiki/XFS> and > > <http://everything2.com/index.pl?node_id=1479435>. > > > > date of article: Fri Jul 25 2003 ! > Even at this date, the article end with : "More interestingly, my delete performance has actually superseded that of ext3, for both random and sequential deletes! The most major weakness of XFS has been eliminated, and my spankin' new filesystem is ready to rock. Cheers!" -- Thomas SAMSON I came, I saw, I deleted all your files. ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |
| |||
| On Fri, Jul 13, 2007 at 09:47:06AM +0200, Adriaan van Os wrote: >That's a remarkable advice, because XFS is known to be slow at creating and >deleting files, see <http://en.wikipedia.org/wiki/XFS> and ><http://everything2.com/index.pl?node_id=1479435>. xfs' slowness is proportional to the *number* rather than the *size* of the files. In postgres you'll tend to have fewer, larger, files than you would in (e.g.) a source code repository, so it is generally more important to have a filesystem that deletes large files quickly than a filesystem that deletes lots of files quickly. I'd suspect that the same is true for mythtv. Mike Stone ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings |
| ||||
| Michael Stone <mstone+postgres@mathom.us> writes: > xfs' slowness is proportional to the *number* rather than the *size* of > the files. In postgres you'll tend to have fewer, larger, files than you > would in (e.g.) a source code repository, so it is generally more > important to have a filesystem that deletes large files quickly than a > filesystem that deletes lots of files quickly. The weird thing is that the files in question were hardly "large". IIRC his test case used a single int4 column, so the rows were probably 36 bytes apiece allowing for all overhead. So the test cases with about 5K rows were less than 200K in the file, and the ones with 200K rows were still only a few megabytes. I tried the test on my Linux machine (which I couldn't do when I responded earlier because it was tied up with another test), and saw truncate times of a few milliseconds for both table sizes. This is ext3 on Fedora 6. So I'm still of the opinion that there's something broken about Adriaan's infrastructure, but maybe we have to look to an even lower level than the filesystem. Perhaps he should try getting some bonnie++ benchmark numbers to see if his disk is behaving properly. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq |