This is a discussion on Linux/PostgreSQL scalability issue - problem with 8 cores within the Pgsql Performance forums, part of the PostgreSQL category; --> Jakub Ouhrabka <kuba@comgate.cz> writes: > What does it mean? Look at src/include/storage/sinval.h and src/include/utils/syscache.h. What you seem to have ...
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Jakub Ouhrabka <kuba@comgate.cz> writes: > What does it mean? Look at src/include/storage/sinval.h and src/include/utils/syscache.h. What you seem to have here is a bunch of tuple updates in pg_class (invalidating caches 29 and 30, which in 8.2 correspond to RELNAMENSP and RELOID), followed by a bunch of SharedInvalRelcacheMsg and SharedInvalSmgrMsg. What I find interesting is that the hits are coming against nearly-successive tuple CTIDs in pg_class, eg these are all on pages 25 and 26 of pg_class: > LOG: sending inval msg 30 0 25 45 30036 4294936595 > LOG: sending inval msg 29 0 25 45 30036 2019111801 > LOG: sending inval msg 30 0 26 11 30036 4294936595 > LOG: sending inval msg 29 0 26 11 30036 2019111801 > LOG: sending inval msg 30 0 25 44 30036 4294936597 > LOG: sending inval msg 29 0 25 44 30036 3703878920 > LOG: sending inval msg 30 0 26 10 30036 4294936597 > LOG: sending inval msg 29 0 26 10 30036 3703878920 > LOG: sending inval msg 30 0 26 9 30036 4294936616 > LOG: sending inval msg 29 0 26 9 30036 3527122063 > LOG: sending inval msg 30 0 25 43 30036 4294936616 > LOG: sending inval msg 29 0 25 43 30036 3527122063 The ordering is a little strange --- not sure what's producing that. I can think of three things that might be producing this: 1. DDL operations ... but most sorts of DDL on a table would touch more catalogs than just pg_class, so this is a bit hard to credit. 2. VACUUM. 3. Some sort of direct update of pg_class. The fact that we have a bunch of catcache invals followed by relcache/smgr invals says that this all happened in one transaction, else they'd have been intermixed better. That lets VACUUM off the hook, because it processes each table in a separate transaction. I am wondering if maybe your app does one of those sneaky things like fooling with pg_class.reltriggers. If so, the problem might be soluble by just avoiding unnecessary updates. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org |
| |||
| Hi Tom, > I can think of three things that might be producing this: we've found it: TRUNCATE We'll try to eliminate use of TRUNCATE and the periodical spikes should go off. There will still be possibility of spikes because of database creation etc - we'll try to handle this by issuing trivial commands from idle backeds and in longer run to decrease the number of backends/databases running. This is the way to go, right? One more question: is it ok to do mass regexp update of pg_proc.prosrc changing TRUNCATEs to DELETEs? Anything we should be aware of? Sure we'll do our own testing but in case you see any obvious downsides... Many thanks for your guidance! Kuba ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |
| |||
| Hi > > I can think of three things that might be producing this: > > we've found it: TRUNCATE I haven't been following this thread. Can someone please explain to me why TRUNCATE causes these spikes? -- Adrian Moisey System Administrator | CareerJunction | Your Future Starts Here. Web: www.careerjunction.co.za | Email: adrian@careerjunction.co.za Phone: +27 21 686 6820 | Mobile: +27 82 858 7830 | Fax: +27 21 686 6842 ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings |
| |||
| Jakub Ouhrabka <kuba@comgate.cz> writes: > we've found it: TRUNCATE Huh. One transaction truncating a dozen tables? That would match the sinval trace all right ... > One more question: is it ok to do mass regexp update of pg_proc.prosrc > changing TRUNCATEs to DELETEs? You might be throwing the baby out with the bathwater, performance-wise. Mass DELETEs will require cleanup by VACUUM, and that'll likely eat more cycles and I/O than you save. I'd think in terms of trying to spread out the TRUNCATEs or check to see if you really need one (maybe the table's already empty), rather than this. I do plan to look at the sinval code once 8.3 is out the door, so another possibility if you can wait a few weeks/months is to leave your app alone and see if the eventual patch looks sane to back-patch. (I don't think the community would consider doing so, but you could run a locally modified Postgres with it.) regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq |
| |||
| > Huh. One transaction truncating a dozen tables? That would match the > sinval trace all right ... It should be 4 tables - the shown log looks like there were more truncates? > You might be throwing the baby out with the bathwater, > performance-wise. Yes, performance was the initial reason to use truncate instead of delete many years ago. But today the truncated tables usualy contain exactly one row - quick measurements now show that it's faster to issue delete instead of truncate in this case. Again, many thanks for your invaluable advice! Kuba ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org |
| |||
| Adrian Moisey <adrian@careerjunction.co.za> writes: >> we've found it: TRUNCATE > I haven't been following this thread. Can someone please explain to me > why TRUNCATE causes these spikes? It's not so much the TRUNCATE as the overhead of broadcasting the resultant catalog changes to the many hundreds of (mostly idle) backends he's got --- all of which respond by trying to lock the shared sinval message queue at about the same time. You could see the whole thread as an object lesson in why connection pooling is a good idea. But certainly it seems that sinval is the next bottleneck in terms of being able to scale Postgres up to very large numbers of backends. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings |
| |||
| > Okay, for a table of just a few entries I agree that DELETE is > probably better. But don't forget you're going to need to have those > tables vacuumed fairly regularly now, else they'll start to bloat. I think we'll go with DELETE also for another reason: Just after we figured out the cause of the spikes we started to investigate a long-term issue we had with PostgreSQL: pg_dump of big database was blocking some of our applications. And yes, we replaced TRUNCATE with DELETE and everything is running as expected. Looking at the docs now I see there is a new paragraph in 8.3 docs mentioning that TRUNCATE is not MVCC-safe and also the blocking issue. It's a pity that the warning wasn't there in 7.1 times :-) Thanks, Kuba Tom Lane napsal(a): > Jakub Ouhrabka <kuba@comgate.cz> writes: >>>> Huh. One transaction truncating a dozen tables? That would match the >>>> sinval trace all right ... > >> It should be 4 tables - the shown log looks like there were more truncates? > > Actually, counting up the entries, there are close to 2 dozen relations > apparently being truncated in the trace you showed. But that might be > only four tables at the user level, since each index on these tables > would appear separately, and you might have a toast table plus index > for each one too. If you want to dig down, the table OIDs are visible > in the trace, in the messages with type -1: > >>> LOG: sending inval msg -1 0 30036 0 30700 3218341912 > ^^^^^ ^^^^^ > DBOID RELOID > > so you could look into pg_class to confirm what's what. > >> Yes, performance was the initial reason to use truncate instead of >> delete many years ago. But today the truncated tables usualy contain >> exactly one row - quick measurements now show that it's faster to issue >> delete instead of truncate in this case. > > Okay, for a table of just a few entries I agree that DELETE is probably > better. But don't forget you're going to need to have those tables > vacuumed fairly regularly now, else they'll start to bloat. > > regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 2: Don't 'kill -9' the postmaster |
| ||||
| Added to TODO: * Improve performance of shared invalidation queue for multiple CPUs http://archives.postgresql.org/pgsql...1/msg00023.php --------------------------------------------------------------------------- Tom Lane wrote: > Alvaro Herrera <alvherre@commandprompt.com> writes: > > Perhaps it would make sense to try to take the "fast path" in > > SIDelExpiredDataEntries with only a shared lock rather than exclusive. > > I think the real problem here is that sinval catchup processing is well > designed to create contention :-(. Once we've decided that the message > queue is getting too full, we SIGUSR1 all the backends at once (or as > fast as the postmaster can do it anyway), then they all go off and try > to touch the sinval queue. Backends that haven't awoken even once > since the last time will have to process the entire queue contents, > and they're all trying to do that at the same time. What's worse, they > take and release the SInvalLock once for each message they take off the > queue. This isn't so horrid for one-core machines (since each process > will monopolize the CPU for probably less than one timeslice while it's > catching up) but it's pretty obvious where all the contention is coming > from on an 8-core. > > Some ideas for improving matters: > > 1. Try to avoid having all the backends hit the queue at once. Instead > of SIGUSR1'ing everybody at the same time, maybe hit only the process > with the oldest message pointer, and have him hit the next oldest after > he's done reading the queue. > > 2. Try to take more than one message off the queue per SInvalLock cycle. > (There is a tuning tradeoff here, since it would mean holding the lock > for longer at a time.) > > 3. Try to avoid having every backend run SIDelExpiredDataEntries every > time through ReceiveSharedInvalidMessages. It's not critical to delete > entries until the queue starts getting full --- maybe we could rejigger > the logic so it only happens once when somebody notices the queue is > getting full, or so that only the guy(s) who had nextMsgNum == minMsgNum > do it, or something like that? > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 7: You can help support the PostgreSQL project by donating at > > http://www.postgresql.org/about/donate -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance |