This is a discussion on adding new pages bulky way within the pgsql Hackers forums, part of the PostgreSQL category; --> I need your advice. For on-disk bitmap I run a list of TIDs. TIDs are stored in pages as ...
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| I need your advice. For on-disk bitmap I run a list of TIDs. TIDs are stored in pages as an array, page's opaque data holds an array of bits, indicating whether corresponding TID has been deleted and should be skipped during the scan. Pages, that contain TIDs list, are organized in extents, each extent has 2^N pages, where N is extent's number (i.e. 2nd extent will occupy 4 pages). Given that I know number of TIDs, that fit into one page, and the TID's sequential number, I can easily calculate: - extent number TID belongs to; - page offset inside that extent, and; - TID place in the page. At the moment, I store BlockNumber of the extent's first page in the metapage and allocate all pages that belongs to that extent sequentially. I need to do so to minimize number of page reads when searching for the TID in the list; I'll need to read 1 page at most to find out TID at given position during the scan. I hope you understood the idea. This also means, that while extent's pages are being added this way, no other pages can be added to the index. And the higher is extent's number, the more time it'll take to allocate all pages. The question is: allocating pages this way is really ugly, I understand. Is there some API that would allow allocating N pages in the bulk way? Maybe this is a know problem, that has been already solved before? Any other ideas? Thanks in advance! -- Victor Y. Yegorov ---------------------------(end of broadcast)--------------------------- TIP 7: don't forget to increase your free space map settings |
| |||
| On Mon, Jun 06, 2005 at 10:59:04PM +0300, Victor Y. Yegorov wrote: > The question is: allocating pages this way is really ugly, I understand. Is > there some API that would allow allocating N pages in the bulk way? > Maybe this is a know problem, that has been already solved before? > Any other ideas? I don't understand your question. What's the problem with holding the extend lock for the index relation while you extend it? Certainly you want only a single process creating a new extent in the index, right? I guess the question is when are the extents created, and what concurrency do you expect from that operation. -- Alvaro Herrera (<alvherre[a]surnet.cl>) "La naturaleza, tan frágil, tan expuesta a la muerte... y tan viva" ---------------------------(end of broadcast)--------------------------- TIP 4: Don't 'kill -9' the postmaster |
| |||
| "Victor Y. Yegorov" <viy@mits.lv> writes: > [ scheme involving a predetermined layout of index pages ] > The question is: allocating pages this way is really ugly, I understand. Is > there some API that would allow allocating N pages in the bulk way? Why bother? Just write each page when you need to --- there's no law that says you must use P_NEW. The hash index type does something pretty similar, IIRC. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 8: explain analyze is your friend |
| |||
| "Tom Lane" <tgl@sss.pgh.pa.us> writes > > Why bother? Just write each page when you need to --- there's no law > that says you must use P_NEW. The hash index type does something pretty > similar, IIRC. > Is there any performance benefits if we have a mdextend_several_pages() function in md.c? So the relation can be extended in a bulky way. In my understanding, if we write write(fd, buffer, BLCKSZ*10) instead of for (i=0; i<10; i++) write(fd, buffer, BLCKSZ); This will reduce some costs of file system logs for journal file systems. Of course, the cost we have to pay is the longer time of holding relation extension lock. Regards, Qingqing |
| |||
| "Qingqing Zhou" <zhouqq@cs.toronto.edu> writes: > Is there any performance benefits if we have a mdextend_several_pages() > function in md.c? I very seriously doubt that there would be *any* win, and I doubt even more that it could possibly be worth the klugery you'd have to do to make it happen. Bear in mind that index access methods are two API layers away from md.c --- how will you translate this into something that makes sense in the context of bufmgr's API? regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 7: don't forget to increase your free space map settings |
| |||
| "Tom Lane" <tgl@sss.pgh.pa.us> writes > > I very seriously doubt that there would be *any* win, and I doubt even > more that it could possibly be worth the klugery you'd have to do to > make it happen. Bear in mind that index access methods are two API > layers away from md.c --- how will you translate this into something > that makes sense in the context of bufmgr's API? > Index access or heap access doesn't matter. The imaginary plan is like this: -- change 1 -- /* md.c */ mdextend() { mdextend_several_pages(); add_pages_to_FSM(); } -- change 2 -- /* * Any places hold relation extension lock */ if (needLock) LockPage(relation, 0, ExclusiveLock); /* ADD: check again here */ if (InvalidBlockNumber != GetPageWithFreeSpace()) UnlockPage(relation, 0, ExclusiveLock); /* I have to do the extension */ buffer = ReadBuffer(relation, P_NEW); Above code is quite like how we handle xlogflush() currently. |
| |||
| * Tom Lane <tgl@sss.pgh.pa.us> [07.06.2005 07:59]: > Why bother? Just write each page when you need to --- there's no law > that says you must use P_NEW. This means 2 things: 1) I cannot mix P_NEW and exact-number ReadBuffer() calls; 2) thus, I have to track next-block-number myself. Is it so? BTW, are there any differences in buffer seeking speed, if buffer block-numbers are mixed and if they're not (i.e. P_NEW is used)? -- Victor Y. Yegorov ---------------------------(end of broadcast)--------------------------- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |
| |||
| On Tue, Jun 07, 2005 at 07:52:57PM +0300, Victor Y. Yegorov wrote: > * Tom Lane <tgl@sss.pgh.pa.us> [07.06.2005 07:59]: > > Why bother? Just write each page when you need to --- there's no law > > that says you must use P_NEW. > > This means 2 things: > 1) I cannot mix P_NEW and exact-number ReadBuffer() calls; Huh, why? You need to grab the relation extension block (LockRelationForExtension in CVS tip). -- Alvaro Herrera (<alvherre[a]surnet.cl>) "[PostgreSQL] is a great group; in my opinion it is THE best open source development communities in existence anywhere." (Lamar Owen) ---------------------------(end of broadcast)--------------------------- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) |
| |||
| * Alvaro Herrera <alvherre@surnet.cl> [08.06.2005 00:39]: > Huh, why? You need to grab the relation extension block > (LockRelationForExtension in CVS tip). Really? Didn't knew that. Consider: 1) I add 2 pages to the newly-created relation using P_NEW as BlockNumber; 2) then I do LockRelationForExtension; ReadBuffer(135) and UnockRelationForExtension. What BlockNumber will be assigned to the buffer, if I call ReadBuffer(P_NEW) now? 136? BTW, is it OK to say "BlockNumber is assigned to buffer"? -- Victor Y. Yegorov ---------------------------(end of broadcast)--------------------------- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |
| ||||
| "Victor Y. Yegorov" <viy@mits.lv> writes: > * Alvaro Herrera <alvherre@surnet.cl> [08.06.2005 00:39]: >> Huh, why? You need to grab the relation extension block >> (LockRelationForExtension in CVS tip). > Really? Didn't knew that. > Consider: > 1) I add 2 pages to the newly-created relation > using P_NEW as BlockNumber; > 2) then I do LockRelationForExtension; ReadBuffer(135) and > UnockRelationForExtension. As things are set up at the moment, you really should not use P_NEW at all unless you hold the relation extension lock. (At least not for ordinary heap relations. An index access method could have its own rules about how to add blocks to the relation --- hash does for instance.) This is all pretty ugly in my view, and so I would not stand opposed to ideas about a cleaner design ... regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 8: explain analyze is your friend |