| Dear all,

I am writing a C-language shared-object file which is dynamically linked with postgres, and uses the various SPI functions for executing queries from numerous trigger functions. My question is thus: what is the best method for a dynamically linked object to share memory with the same object running on other backends? Am I right in thinking that if I allocate memory in the "upper execution context" from SPI_palloc(), this is not shared with the other processes?

I thought of a few ways of doing this (please forgive me if these appear idiotic, as I am fairly new to postgres):

1. Change memory context to TopMemoryContext and palloc everything there. (However, I believe this still isn't shared between processes?)

2. Use the shmem functions in src/backend/storage/ipc/shmem.c to create a chunk of shared memory and use this (Although I would like to avoid writing my own memory manager to carve up the space).

3. Somehow create shared memory using the shmem functions, and set a memory context to live *inside* this shared memory, which my trigger functions can then switch to. Then use palloc() and pfree() without worrying..

Please let me know if this problem has been solved before, as I have searched through the mailing lists and through the source, but am not sure which is the best way to resolve it.

Thanks for your help.

Regards,
Richard

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |
| |||
| On Sun, Feb 05, 2006 at 02:03:59PM +0000, richard@playford.net wrote:
> 1. Change memory context to TopMemoryContext and palloc everything there.
> (However, I believe this still isn't shared between processes?)

Not shared, correct.

> 2. Use the shmem functions in src/backend/storage/ipc/shmem.c to create a
> chunk of shared memory and use this (Although I would like to avoid writing
> my own memory manager to carve up the space).

This is the generally accepted method. Please remember that when sharing structures you have to worry about concurrency. So you need locking.

> 3. Somehow create shared memory using the shmem functions, and set a memory
> context to live *inside* this shared memory, which my trigger functions can
> then switch to. Then use palloc() and pfree() without worrying..

Nope, palloc/pfree don't deal with concurrency.

> Please let me know if this problem has been solved before, as I have searched
> through the mailing lists and through the source, but am not sure which is
> the best way to resolve it. Thanks for your help.

Most people allocate chunks of shared memory and don't use palloc/pfree. What are you doing that requires such management? Most shared structures in PostgreSQL are allocated once and never freed...

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them. |
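The answer to option 1 above (memory palloc'd in TopMemoryContext is private to each backend) follows from how separate processes see memory. The sketch below illustrates that with plain POSIX calls only; it uses no PostgreSQL API, and the function names `private_memory_demo` and `shared_memory_demo` are made up for this illustration. A child process writes 42 through a pointer: with ordinary heap memory the parent never sees the write, while with a `MAP_SHARED` mapping it does.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that writes 42 into *target, wait for it to exit, and
 * return the value the parent sees afterwards (-1 if fork fails).
 */
static int value_seen_by_parent(int *target)
{
    assert(target != NULL);
    *target = 0;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        *target = 42;           /* the write happens in the child only */
        _exit(0);
    }
    waitpid(pid, NULL, 0);
    return *target;
}

/* Heap memory is private: the child's write lands in its own copy. */
int private_memory_demo(void)
{
    int *p = malloc(sizeof *p);
    if (p == NULL)
        return -1;
    int seen = value_seen_by_parent(p);
    free(p);
    return seen;
}

/* A MAP_SHARED mapping is visible to both processes. */
int shared_memory_demo(void)
{
    int *p = mmap(NULL, sizeof *p, PROT_READ | PROT_WRITE,
                  MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    int seen = value_seen_by_parent(p);
    munmap(p, sizeof *p);
    return seen;
}
```

Calling `private_memory_demo()` from a small `main` should give 0 and `shared_memory_demo()` should give 42, which is the distinction the replies draw between a memory context and a real shared-memory segment.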
| |||
| richard@playford.net writes:
> 1. Change memory context to TopMemoryContext and palloc everything there.
> (However, I believe this still isn't shared between processes?)

Nope.

> 2. Use the shmem functions in src/backend/storage/ipc/shmem.c to create a
> chunk of shared memory and use this (Although I would like to avoid writing
> my own memory manager to carve up the space).
>
> 3. Somehow create shared memory using the shmem functions, and set a memory
> context to live *inside* this shared memory, which my trigger functions can
> then switch to. Then use palloc() and pfree() without worrying..

You'd have to do one of the above, but #2 is probably out because all shared memory is allocated to various purposes at startup and there is none free at runtime (as I understand it).

For #3, how do you plan to have a memory context shared by multiple backends with no synchronization? If two backends try to do allocation or deallocation at the same time you will get corruption, as I don't think palloc() and pfree() do any locking (they currently never allocate from shared memory).

You should probably think very carefully about whether you can get along without using additional shared memory, because it's not that easy to do.

-Doug |
| |||
| On Sun February 5 2006 14:11, Martijn van Oosterhout wrote:
> This is the generally accepted method. Please remember that when
> sharing structures you have to worry about concurrency. So you need
> locking.

Of course - I have already implemented locking with semaphores (I may simply use one big lock and carefully avoid reentry).

> Nope, palloc/pfree don't deal with concurrency.

Indeed, although if I lock the shared memory then I can palloc() and pfree() without worrying. The problem I see is that new memory contexts have their memory assigned to them when they are created. I can't tell them "go here!"

> Most people allocate chunks of shared memory and don't use
> palloc/pfree. What are you doing that requires such management? Most
> shared structures in PostgreSQL are allocated once and never freed...

I have a number of functions which modify tables based on complex rules stored in script-files. I wrote a parser for these files as a separate program first before incorporating it as a shared object; subsequently it loads and executes rules from memory. As anything can be read from the files, and rules can be unloaded later, I was hoping for flexibility in allocating memory to store it all.

Another option is to load the files but store the rules within the database, which should be possible, but appears to be a slightly messy way of doing it. Then again, messing about with shared memory allocation may be messier. Asking as a fairly inexperienced postgres person, what would you suggest? |
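To make the "one big lock" idea concrete, here is a rough sketch of carving pieces out of a fixed shared region while several processes allocate at once. It is not PostgreSQL code and not the poster's implementation: the lock is a C11 `atomic_flag` spinlock standing in for the semaphores mentioned above, the region is an anonymous `mmap` rather than a segment obtained through shmem.c, and every name (`SharedArena`, `arena_alloc`, `arena_demo`) is hypothetical. The allocator only ever grows, which matches the "allocated once and never freed" pattern; supporting free and reuse is where the real complexity would start.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define ARENA_BYTES (1u << 20)
#define NO_SPACE ((size_t) -1)

typedef struct {
    atomic_flag lock;                 /* the single big lock */
    size_t used;                      /* bytes handed out so far */
    unsigned char data[ARENA_BYTES];  /* space being carved up */
} SharedArena;

/* Create the arena in memory shared with any process forked later. */
SharedArena *arena_create(void)
{
    SharedArena *a = mmap(NULL, sizeof *a, PROT_READ | PROT_WRITE,
                          MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (a == MAP_FAILED)
        return NULL;
    atomic_flag_clear(&a->lock);
    a->used = 0;
    return a;
}

/*
 * Hand out `size` bytes and return their offset into a->data, or
 * NO_SPACE when the arena is full. An offset stays meaningful even if
 * another process maps the region at a different address.
 */
size_t arena_alloc(SharedArena *a, size_t size)
{
    size_t offset = NO_SPACE;

    assert(a != NULL);
    while (atomic_flag_test_and_set_explicit(&a->lock, memory_order_acquire))
        sched_yield();                /* another process holds the lock */

    if (size <= ARENA_BYTES - a->used) {
        offset = a->used;
        a->used += size;
    }
    atomic_flag_clear_explicit(&a->lock, memory_order_release);
    return offset;
}

/* Fork `workers` processes that each allocate 16 bytes `allocs` times. */
size_t arena_demo(int workers, int allocs)
{
    SharedArena *a = arena_create();
    if (a == NULL)
        return NO_SPACE;

    for (int w = 0; w < workers; w++) {
        pid_t pid = fork();
        if (pid < 0)
            break;                    /* fewer workers than requested */
        if (pid == 0) {
            for (int i = 0; i < allocs; i++)
                arena_alloc(a, 16);
            _exit(0);
        }
    }
    while (wait(NULL) > 0)
        ;                             /* reap every child */

    size_t used = a->used;
    munmap(a, sizeof *a);
    return used;
}
```

With the lock in place, `arena_demo(4, 1000)` should report exactly 4 * 1000 * 16 bytes used. Dropping the lock would let concurrent updates of `used` overwrite each other, which is the corruption the replies warn about.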
| |||
| On Sun, Feb 05, 2006 at 02:31:23PM +0000, Richard Hills wrote:
> I have a number of functions which modify tables based on complex rules stored
> in script-files. I wrote a parser for these files as a separate program first
> before incorporating it as a shared object; subsequently it loads and
> executes rules from memory. As anything can be read from the files, and rules
> can be unloaded later, I was hoping for flexibility in allocating memory to
> store it all.

So what you load are the already processed rules? In that case you could probably use the buffer management system. Ask it to load the blocks and they'll be in the buffer cache. As long as you have the buffer pinned they'll stay there. That's pretty much a read-only approach.

If you're talking about things that don't come from disk, well, hmm... If you want you could use a file on disk as backing and mmap() it into each process's address space...

> Another option is to load the files but store the rules within the database,
> which should be possible, but appears to be a slightly messy way of doing it.
> Then again, messing about with shared memory allocation may be messier.
> Asking as a fairly inexperienced postgres person, what would you suggest?

The real question is, does it need to be shared-writable? Shared-readonly is much easier (i.e. one writer, multiple readers). Using a file as backing store for mmap() may be the easiest....

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ |
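The file-backed mmap() suggestion can be sketched roughly as follows, assuming the one-writer, many-readers case described above. One process writes the already-processed rules to a file; each reader maps that file read-only into its own address space. This is plain POSIX, not a PostgreSQL facility, and the names `publish_rules`, `map_rules`, and `reader_sees_rules`, as well as the sample rule text and temporary path, are invented for the example.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Writer side: store the processed rules in a file. Returns 0 on success. */
int publish_rules(const char *path, const void *data, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    ssize_t written = write(fd, data, len);
    close(fd);
    return (written == (ssize_t) len) ? 0 : -1;
}

/* Reader side: map the file read-only. Returns NULL on failure. */
const void *map_rules(const char *path, size_t *len)
{
    assert(len != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }
    void *p = mmap(NULL, (size_t) st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping outlives the descriptor */
    if (p == MAP_FAILED)
        return NULL;
    *len = (size_t) st.st_size;
    return p;
}

/* One writer publishes; a forked reader maps the file and checks it. */
int reader_sees_rules(void)
{
    static const char rules[] = "rule 1: hypothetical example\n";
    char path[] = "/tmp/rules-XXXXXX";

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    close(fd);
    if (publish_rules(path, rules, sizeof rules) != 0) {
        unlink(path);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        unlink(path);
        return -1;
    }
    if (pid == 0) {
        size_t len = 0;
        const void *p = map_rules(path, &len);
        int ok = p != NULL && len == sizeof rules
                 && memcmp(p, rules, len) == 0;
        _exit(ok ? 0 : 1);
    }

    int status = 0;
    waitpid(pid, &status, 0);
    unlink(path);
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

`reader_sees_rules()` returns 1 when the reader process finds the writer's bytes in its mapping. Note that raw pointers inside the mapped data would not be portable between processes, so linked structures would need to be stored as offsets or in a flat form.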
| |||
| On Sun February 5 2006 14:43, Martijn van Oosterhout wrote:
> So what you load are the already processed rules? In that case you
> could probably use the buffer management system. Ask it to load the
> blocks and they'll be in the buffer cache. As long as you have the
> buffer pinned they'll stay there. That's pretty much a read-only
> approach.
>
> If you're talking about things that don't come from disk, well, hmm...
> If you want you could use a file on disk as backing and mmap() it into
> each process's address space...

<...>

> The real question is, does it need to be shared-writable?
> Shared-readonly is much easier (i.e. one writer, multiple readers). Using
> a file as backing store for mmap() may be the easiest....

I load the rules from a script and parse them, storing them in a forest of linked malloced structures. These structures are created by one writer but then read by a number of readers, and later may be removed by the original writer.

So, as you can imagine, I could store the forest in the db, although it might be a mess. First I will look through the buffer management system, and see if that will do the job.

Thanks for your help,

Regards,
Richard |
| |||
| Martijn van Oosterhout <kleptog@svana.org> writes:
> So what you load are the already processed rules? In that case you
> could probably use the buffer management system. Ask it to load the
> blocks and they'll be in the buffer cache. As long as you have the
> buffer pinned they'll stay there.

.... until you get to the end of the transaction, where the buffer manager will barf because somebody forgot an unpin. Long-term buffer pins are really not acceptable anyway --- you'd essentially be asserting that your little facility is more important than any other use of shared buffers, and I'm sorry but that ain't so.

AFAICT the data structures you are worried about don't have any readily predictable size, which means there is no good way to keep them in shared memory --- we can't dynamically resize shared memory. So I think storing the rules in a table and loading into private memory at need is really the only reasonable solution. Storing them in a table has a lot of other advantages anyway, mainly that you can manipulate them from SQL.

You can find some prior discussion of similar issues in the archives; IIRC the idea of a shared plan cache was being kicked around for a while some years back.

regards, tom lane |
| |||
| On Sun February 5 2006 16:16, Tom Lane wrote:
> AFAICT the data structures you are worried about don't have any readily
> predictable size, which means there is no good way to keep them in
> shared memory --- we can't dynamically resize shared memory. So I think
> storing the rules in a table and loading into private memory at need is
> really the only reasonable solution. Storing them in a table has a lot
> of other advantages anyway, mainly that you can manipulate them from
> SQL.

I have come to the conclusion that storing the rules and various other bits in tables is the best solution, although this will require a much more complex db structure than I had originally planned. Trying to allocate and free memory in shared memory is fairly straightforward, but likely to become incredibly messy.

Seeing as some of the rules already include load-value-from-db-on-demand, it should be fairly straightforward to extend it to load-rule-from-db-on-demand.

Thanks for all your help,

Regards,
Richard |
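The load-rule-from-db-on-demand idea amounts to a small per-process cache in private memory that falls back to the table on a miss. The sketch below shows only that shape. The database lookup is a stub: `fetch_rule_from_db` is a hypothetical placeholder where a real extension would presumably run a query through SPI, and the cache layout, slot count, and rule text format are all assumptions made for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define CACHE_SLOTS   8
#define RULE_TEXT_MAX 64

typedef struct {
    int  id;
    int  valid;
    char text[RULE_TEXT_MAX];
} CachedRule;

static CachedRule cache[CACHE_SLOTS];   /* private to this process */
static int db_fetches = 0;              /* counts simulated table lookups */

/*
 * HYPOTHETICAL stand-in for reading one rule from a table. It just
 * fabricates a string so the example runs without a database.
 */
static int fetch_rule_from_db(int id, char *out, size_t outlen)
{
    assert(outlen > 0);
    db_fetches++;
    snprintf(out, outlen, "rule-%d", id);
    return 0;
}

/* Return the rule text, loading it into the private cache on a miss. */
const char *get_rule(int id)
{
    CachedRule *slot = &cache[(unsigned) id % CACHE_SLOTS];

    if (!slot->valid || slot->id != id) {
        if (fetch_rule_from_db(id, slot->text, sizeof slot->text) != 0) {
            slot->valid = 0;
            return NULL;
        }
        slot->id = id;
        slot->valid = 1;
    }
    return slot->text;
}

/* Drop a rule from the cache so the next use reloads it. */
void unload_rule(int id)
{
    CachedRule *slot = &cache[(unsigned) id % CACHE_SLOTS];
    if (slot->valid && slot->id == id)
        slot->valid = 0;
}

int db_fetch_count(void)
{
    return db_fetches;
}
```

Each backend would keep its own copy of such a cache, so nothing here needs locking. The open design question, which this sketch does not address, is how a backend learns that a rule changed in the table and should be unloaded.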
| |||
| On Sun, 2006-02-05 at 14:03 +0000, richard@playford.net wrote:
> 3. Somehow create shared memory using the shmem functions, and set a memory
> context to live *inside* this shared memory, which my trigger functions can
> then switch to. Then use palloc() and pfree() without worrying..

This has been done before, by the TelegraphCQ folks: they implemented a shared memory MemoryContext on top of OSSP MM[1]. The code is in the v0.2 TelegraphCQ tarball[2] -- see shmctx.c and shmset.c in src/backend/utils/mmgr/. I'm not aware of an independent distribution, but you could probably separate it out without too much pain.

(Of course, the comments elsewhere in the thread about using an alternative are probably still true...)

-Neil

[1] http://www.ossp.org/pkg/lib/mm/
[2] http://telegraph.cs.berkeley.edu/dow...hCQ-0.2.tar.gz |
| ||||
| > On Sun February 5 2006 16:16, Tom Lane wrote:
>> AFAICT the data structures you are worried about don't have any readily
>> predictable size, which means there is no good way to keep them in
>> shared memory --- we can't dynamically resize shared memory. So I think
>> storing the rules in a table and loading into private memory at need is
>> really the only reasonable solution. Storing them in a table has a lot
>> of other advantages anyway, mainly that you can manipulate them from
>> SQL.
>
> I have come to the conclusion that storing the rules and various other
> bits in tables is the best solution, although this will require a much
> more complex db structure than I had originally planned. Trying to
> allocate and free memory in shared memory is fairly straightforward, but
> likely to become incredibly messy.
>
> Seeing as some of the rules already include load-value-from-db-on-demand,
> it should be fairly straightforward to extend it to
> load-rule-from-db-on-demand.

I posted some source to a shared memory sort of thing to the group, as well as to you, I believe.

For variables and values that change very infrequently, using the DB is the right idea. PostgreSQL, like most databases, crumbles under a highly changing database. By changing, I mean a lot of UPDATEs and DELETEs. Inserts are not so bad. PostgreSQL has fairly poor (IMHO) UPDATE behaviour. Most transaction-aware databases do, but PostgreSQL seems quite bad.

For example, if you are doing a scoreboard sort of thing for a website, updating a single variable in a table 20 times a second will quickly make that simple and normally fast update/query take a very long time. You have to run VACUUM a whole lot.

The next example is a session table for a website: you may have a few hundred or a few thousand active session rows, but each row may get many updates, and you may have tens of thousands of sessions which may be inactive. Unless you vacuum very frequently, you are doing a lot of disk I/O for every session, because the query has to walk the table file to find a valid row.

A database is a BAD system to manage data like sessions in an active website. It is a good tool for most things, but if you are implementing an eBay or Yahoo, you'll swamp your DB quickly.

The issue with a shared memory system is that you don't get the data security that you do with disk storage.

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster |
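The scoreboard complaint comes from the way PostgreSQL handles updates: an UPDATE writes a new row version and leaves the old one behind until VACUUM reclaims it, so a scan has to step over the dead versions. The toy model below illustrates only that effect. It is a deliberately crude simulation, not PostgreSQL's storage format, and all names in it (`toy_update`, `toy_vacuum`, `scan_cost_after_updates`) are made up. At 20 updates a second, one minute of traffic is 1200 updates to a single logical row.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VERSIONS 4096

typedef struct {
    int value;
    int live;       /* 1 = current version, 0 = dead, awaiting vacuum */
} RowVersion;

static RowVersion heap[MAX_VERSIONS];   /* toy "table file" */
static size_t heap_count = 0;

static int append_version(int value)
{
    if (heap_count == MAX_VERSIONS)
        return -1;
    heap[heap_count].value = value;
    heap[heap_count].live = 1;
    heap_count++;
    return 0;
}

/* Index of the live version, or heap_count when there is none. */
static size_t find_live(void)
{
    for (size_t i = 0; i < heap_count; i++)
        if (heap[i].live)
            return i;
    return heap_count;
}

/* An update appends a new version and marks the old one dead. */
int toy_update(int value)
{
    size_t old = find_live();
    if (old == heap_count || append_version(value) != 0)
        return -1;
    heap[old].live = 0;
    return 0;
}

/* Vacuum drops dead versions; returns how many were removed. */
size_t toy_vacuum(void)
{
    size_t kept = 0;
    for (size_t i = 0; i < heap_count; i++)
        if (heap[i].live)
            heap[kept++] = heap[i];
    size_t removed = heap_count - kept;
    heap_count = kept;
    return removed;
}

/*
 * Insert one row, update it `updates` times, optionally vacuum, then
 * report how many versions a scan visits before reaching the live one.
 */
size_t scan_cost_after_updates(int updates, int vacuum_first)
{
    assert(updates >= 0 && updates < MAX_VERSIONS);
    heap_count = 0;
    append_version(0);
    for (int i = 1; i <= updates; i++)
        if (toy_update(i) != 0)
            break;
    if (vacuum_first)
        toy_vacuum();
    size_t live = find_live();
    return (live == heap_count) ? heap_count : live + 1;
}
```

In this model, `scan_cost_after_updates(1200, 0)` visits 1201 versions to find the single live row, while `scan_cost_after_updates(1200, 1)` visits just one. Real tables have indexes and page-level structure that change the numbers, but the direction of the effect is the one described in the post.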