04-12-2008, 06:44 AM
Markus Schiltknecht
 
Re: autovacuum process handling

Hi,

Alvaro Herrera wrote:
> I haven't done that yet, since the current incarnation does not need it.
> But I have considered using some signal like SIGUSR1 to mean "something
> changed in your processes, look into your shared memory". The
> autovacuum shared memory area would contain PIDs (or maybe PGPROC
> pointers?) of workers; so when the launcher goes to check that it
> notices that one worker is no longer there, meaning that it must have
> terminated its job.


Meaning the launcher must keep a list of currently known worker PIDs and
compare that to the list in shared memory. This is doable, but quite a
lot of code for something the postmaster gets for free (i.e. SIGCHLD).
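
To illustrate the bookkeeping meant here, a minimal sketch in C. It is not taken from the PostgreSQL tree: the function name, the plain arrays, and the idea of first copying the shared memory list into a local array are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical helper, not PostgreSQL code: report every PID the launcher
 * still remembers but that no longer appears in a copy of the worker list
 * taken from shared memory.  Writes at most gone_cap PIDs to gone[] and
 * returns how many were written.
 */
size_t
find_vanished_workers(const pid_t *known, size_t n_known,
                      const pid_t *shmem, size_t n_shmem,
                      pid_t *gone, size_t gone_cap)
{
    size_t n_gone = 0;

    assert(known != NULL || n_known == 0);
    assert(shmem != NULL || n_shmem == 0);
    assert(gone != NULL || gone_cap == 0);

    for (size_t i = 0; i < n_known; i++)
    {
        int found = 0;

        /* Linear scan is fine for a handful of workers. */
        for (size_t j = 0; j < n_shmem; j++)
        {
            if (shmem[j] == known[i])
            {
                found = 1;
                break;
            }
        }

        if (!found && n_gone < gone_cap)
            gone[n_gone++] = known[i];
    }

    return n_gone;
}
```

The launcher would have to run something like this on every wakeup and then update its own list, whereas the postmaster simply learns about the terminated child through SIGCHLD.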

> Sure you do -- they won't corrupt anything :-) Plus, what use are
> running backends in a multimaster environment, if they can't communicate
> with the outside? Much better would be, AFAICS, to shut everyone down
> so that the users can connect to a working node.


You are right here. I'll have to recheck my code and make sure I 'take
down' the postmaster in a decent way (i.e. make it terminate its
children immediately, so that they can't commit anymore).

>> More involved with what? It does not touch shared memory, it mainly
>> keeps track of the backends states (by getting a notice from the
>> postmaster) and does all the necessary forwarding of messages between
>> the communication system and the backends. Its main loop is similar to
>> the postmaster's, mainly consisting of a select().

>
> I meant "more complicated". And if it has to listen on a socket and
> forward messages to remote backends, it certainly is a lot more
> complicated than the current autovac launcher.


That may well be. My point was that my replication manager is so
similar to the postmaster that it is a real PITA to do that much coding
just to make it a separate process.

>> For sure, the replication manager needs to keep running during a
>> restarting cycle. And it needs to know the database's state, so as to be
>> able to decide if it can request workers or not.

>
> I think this would be pretty easy to do if you made the remote backends
> keep state in shared memory. The manager just needs to get a signal to
> know that it should check the shared memory. This can be arranged
> easily: just have the remote backends signal the postmaster, and have
> the postmaster signal the manager. Alternatively, have the manager PID
> stored in shared memory and have the remote backends signal (SIGUSR1 or
> some such) the manager. (bgwriter does this: it announces its PID in
> shared memory, and the backends signal it when they want a CHECKPOINT).


Sounds like we'll run out of signals soon. ;-)

I also have to pass around data (writesets), which is why I've come up
with that IMessage stuff. It's a per-process message queue in shared
memory, using SIGUSR1 to signal new messages. It works, but as I said, I
found myself adding messages for all the postmaster events, so that I've
really begun to question what to do in which process.
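
For readers who have not seen the patch, the general shape of such a queue might look like the sketch below. This is not the actual IMessage code: every name, the fixed slot and payload sizes, and the ring-buffer layout are assumptions for illustration, the struct merely stands in for a region of shared memory, and the locking a real multi-process queue needs is left out.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

#define QUEUE_SLOTS 8   /* hypothetical queue capacity */
#define MSG_MAX     64  /* hypothetical maximum payload size */

typedef struct
{
    size_t len;
    char   data[MSG_MAX];
} Msg;

/* One queue per receiving process; would live in shared memory. */
typedef struct
{
    size_t head;                /* next slot to read */
    size_t count;               /* number of queued messages */
    Msg    slots[QUEUE_SLOTS];
} MsgQueue;

/* Set by the handler, checked by the receiver's main loop. */
volatile sig_atomic_t got_sigusr1 = 0;

void
handle_sigusr1(int signo)
{
    (void) signo;
    got_sigusr1 = 1;            /* only set a flag inside the handler */
}

/* Append a message; returns 0 on success, -1 if too large or queue full. */
int
queue_put(MsgQueue *q, const void *data, size_t len)
{
    Msg *m;

    assert(q != NULL && data != NULL);
    if (len > MSG_MAX || q->count == QUEUE_SLOTS)
        return -1;

    m = &q->slots[(q->head + q->count) % QUEUE_SLOTS];
    m->len = len;
    memcpy(m->data, data, len);
    q->count++;
    return 0;
}

/* Remove the oldest message into *out; returns -1 if the queue is empty. */
int
queue_get(MsgQueue *q, Msg *out)
{
    assert(q != NULL && out != NULL);
    if (q->count == 0)
        return -1;

    *out = q->slots[q->head];
    q->head = (q->head + 1) % QUEUE_SLOTS;
    q->count--;
    return 0;
}
```

In this sketch a sender would call queue_put() on the recipient's queue and then kill(recipient_pid, SIGUSR1); the recipient, on seeing got_sigusr1 set in its main loop, clears the flag and calls queue_get() until it returns -1.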

Again, thanks for your inputs.

Markus


