Re: database vacuum from cron hanging (pgsql Hackers)
Kevin Grittner wrote:
> I'm not sure what you mean regarding pg_config -- could you clarify?

The output of pg_config --configure

> Your email came through as I was trying to figure out where to find
> the core dump. We restarted the server with cassert, and I find this
> in the log prior to my attempt to vacuum:

It should be in $PGDATA/core (maybe with some other name depending on
settings)

> TRAP: FailedAssertion("!(buf->refcount > 0)", File: "bufmgr.c", Line: 812)
> [2005-10-12 09:10:05.695 CDT] 16602 LOG: server process (PID 16619) was terminated by signal 6
> [2005-10-12 09:10:05.695 CDT] 16602 LOG: terminating any other active server processes

Here is the culprit.

--
Alvaro Herrera http://www.PlanetPostgreSQL.org
"Always assume the user will do much worse than the stupidest thing
you can imagine." (Julien PUYDT)
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> Kevin Grittner wrote:
>> I'm not sure what you mean regarding pg_config -- could you clarify?

> The output of pg_config --configure

Actually I wanted the whole thing, not just --configure (I'm
particularly interested in the CFLAGS setting).

>> Your email came through as I was trying to figure out where to find
>> the core dump. We restarted the server with cassert, and I find this
>> in the log prior to my attempt to vacuum:

> It should be in $PGDATA/core (maybe with some other name depending on
> settings)

If my theory about a bogus increment code sequence is correct, then the
core dump will not tell us anything very interesting anyway --- the trap
will happen when the slower of the two processes tries to remove its
pin, but that's way after the bug happened.

I'm thinking that the easiest way to confirm or disprove this theory is
to examine the assembly code. Please do this:

1. cd into src/backend/storage/buffer directory of build tree.
2. rm bufmgr.o; make
3. Note gcc command issued by make to rebuild bufmgr.o. Cut and paste,
   changing -c to -S and removing "-o bufmgr.o" if present. Keep all the
   other switches the same.
4. This should produce a file bufmgr.s. Gzip and send to me (off-list
   please, it's likely to be large and boring)

Please also confirm exactly which version of bufmgr.c you are working
with --- the $PostgreSQL line near the head of the file will do.

regards, tom lane
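To make the theory above concrete, here is a minimal sketch of how a non-atomic increment could lose a pin and only trip the assertion much later. It is not PostgreSQL code: FakeBuffer, simulate_racy_pins and unpin are hypothetical names invented for illustration, and the interleaving is replayed deterministically in one process rather than reproduced with two real backends.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a shared buffer header (not PostgreSQL's own struct). */
typedef struct {
    int refcount;   /* number of pins currently held on the buffer */
} FakeBuffer;

/*
 * Replay one unlucky interleaving of two backends pinning the same buffer
 * when the increment is a separate load, add and store with no effective
 * locking. Returns the refcount left behind after both "pins".
 */
int simulate_racy_pins(FakeBuffer *buf)
{
    assert(buf != NULL);

    int reg_a = buf->refcount;   /* backend A loads the current count */
    int reg_b = buf->refcount;   /* backend B loads it before A stores */

    buf->refcount = reg_a + 1;   /* A stores its incremented value */
    buf->refcount = reg_b + 1;   /* B overwrites it with the same value: one pin is lost */

    return buf->refcount;
}

/*
 * Release one pin. Returns 1 on success, or 0 at the point where an
 * assert-enabled build would trap on !(buf->refcount > 0).
 */
int unpin(FakeBuffer *buf)
{
    if (buf->refcount <= 0)
        return 0;
    buf->refcount--;
    return 1;
}
```

Starting from a zeroed FakeBuffer, simulate_racy_pins leaves the count at 1 although two pins were taken. The first unpin then succeeds and the second one fails, which matches the point above that the trap fires in the slower process long after the damage was done, so a core dump taken at that moment would say little about the cause.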