This is a discussion on Re: vacuum, performance, and MVCC within the pgsql Hackers forums, part of the PostgreSQL category; --> bruce wrote: > Tom Lane wrote: > > Bruce Momjian <bruce@momjian.us> writes: > > > I think at some ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| bruce wrote: > Tom Lane wrote: > > Bruce Momjian <bruce@momjian.us> writes: > > > I think at some point we have to admit that _polling_ the tables, which > > > is what autovacuum does, just isn't going to work well, no matter how > > > much it is tweeked, and another approach should be considered for > > > certain workload cases. > > > > Autovacuum polls in its current, first-generation implementation; > > what I said upthread was it needs to be smarter than that. I am not > > sure how you get from that to the conclusion that the very next step > > is to abandon the vacuuming approach altogether. > > I am not ready to abandon autovacuum, but as I stated later the UPDATE > with no key change case is common enought that it could be handled > better without involving autovacuum and its limitations. > > As I remember, most databases have problem with DELETE/INSERT cycles, > but we seem to be hit by UPDATE performance more than most, and more > than is wise. In an attempt to get some concrete on these ideas... ;-) I think the major complexity in doing an in-place UPDATE when no key columns change is allowing rollback on transaction abort (or backend crash), and properly handling visibility for transactions in progress. If the old and new rows are on the same heap page (perhaps a necessary limitation), you can easily update the heap item id to point to the new heap row. All indexes will then point to the new row, and sequential scans will still see both rows (which is good). That leave the rollback issue (undoing the item id change), and having index scans for current backends still see the old row. OK, I have an idea. Right now, an UPDATE where the old and new rows are on the same page have two index entries. What if we made only one index entry for both? We already have UPDATE chaining, where the old row points to the new one. If no key columns change, we set a bit in the heap that the chaining points to the old and new row (both on the same page), so an index scan uses one index entry to see the old and new row, and once the old row is no longer visible, the page index id can be updated to point to the new row and the old row can be marked free and perhaps used for a future UPDATE. (UPDATE already tries to do keep updates on the same heap page.) FYI, the reason heap cleanup is possible once you go with a single index entry for old and new rows is because there is no index cleanup (and single-row index cleanup is very expensive). -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org |
| |||
| On Sat, 24 Jun 2006, Bruce Momjian wrote: > OK, I have an idea. Right now, an UPDATE where the old and new rows are > on the same page have two index entries. What if we made only one index > entry for both? We already have UPDATE chaining, where the old row > points to the new one. If no key columns change, we set a bit in the > heap that the chaining points to the old and new row (both on the same > page), so an index scan uses one index entry to see the old and new row, > and once the old row is no longer visible, the page index id can be > updated to point to the new row and the old row can be marked free and > perhaps used for a future UPDATE. (UPDATE already tries to do keep > updates on the same heap page.) In fact, that's what I originally thought Mark was suggesting. A couple of points: Why the limitation of old and new row being on the same page? This only works if none of the updated columns are indexed. That's a bit annoying. It would be nice to be able to create new index tuples in those indexes that contain one of the changed columns, but not in others. What happens if you create a new index that contains one of the changed columns? - Heikki ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |
| |||
| Heikki Linnakangas wrote: > On Sat, 24 Jun 2006, Bruce Momjian wrote: > > > OK, I have an idea. Right now, an UPDATE where the old and new rows are > > on the same page have two index entries. What if we made only one index > > entry for both? We already have UPDATE chaining, where the old row > > points to the new one. If no key columns change, we set a bit in the > > heap that the chaining points to the old and new row (both on the same > > page), so an index scan uses one index entry to see the old and new row, > > and once the old row is no longer visible, the page index id can be > > updated to point to the new row and the old row can be marked free and > > perhaps used for a future UPDATE. (UPDATE already tries to do keep > > updates on the same heap page.) > > In fact, that's what I originally thought Mark was suggesting. A couple of > points: > > Why the limitation of old and new row being on the same page? Because having them be on the same page is the only way you can update the page item pointer so when you recycle the row, you the indexes are now pointing to the new version. Pages look like: [marker][item1][item2][item3]...[tuple1][tuple2][tuple3] and indexes only point to items, not to tuples. This allows tuples to be compacted on the page without affecting the indexes. If tuple1 is updated to tuple2, once tuple1 is no longer visible to any backends, you can modify item1 to point to tuple2, and you can mark the space used by tuple1 as reusable: [marker][item1(tuple2)][item2][item3]...[free][tuple2][tuple3] > This only works if none of the updated columns are indexed. That's a bit > annoying. It would be nice to be able to create new index tuples in those The hope is that a commonly updated tuple will eventually be on a page where there is sufficient free space for updated version to stay on there. For an active server, there might be several updated versions of rows on the same page. > indexes that contain one of the changed columns, but not in others. If you can't expire the old row because one of the indexed columns was modified, I see no reason to try to reduce the additional index entries. > What happens if you create a new index that contains one of > the changed columns? Uh, I had not thought of that. You could easily create two index entries for the old and new rows, but then the heap bit saying there is only one index row would be inaccurate for the new index. I suppose you could create new rows in all indexes and clear the heap bit. -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| |||
| On Sat, 24 Jun 2006, Bruce Momjian wrote: > Because having them be on the same page is the only way you can update > the page item pointer so when you recycle the row, you the indexes are > now pointing to the new version. Pages look like: > > [marker][item1][item2][item3]...[tuple1][tuple2][tuple3] > > and indexes only point to items, not to tuples. This allows tuples to > be compacted on the page without affecting the indexes. > > If tuple1 is updated to tuple2, once tuple1 is no longer visible to any > backends, you can modify item1 to point to tuple2, and you can mark the > space used by tuple1 as reusable: > > [marker][item1(tuple2)][item2][item3]...[free][tuple2][tuple3] Ok, now I think I get it. So the limitation of old and new tuple being on the same page is required to make it possible to remove the old tuple without touching the indexes? If you move the new tuple (logically, by modifying item pointers) on vacuum, isn't there a risk that a concurrent seqscan misses it? > If you can't expire the old row because one of the indexed columns was > modified, I see no reason to try to reduce the additional index entries. It won't enable early expiration, but it means less work to do on update. If there's a lot of indexes, not having to add so many index tuples can be a significant saving. To summarise, we have two issues related to frequent updates: 1. Index and heap bloat, requiring frequent vacuum. 2. Updates are more expensive than on other DBMSs, because we have to add a new index tuple in every index, even if none of the indexed columns are modified. Tom suggested that we just improve vacuum and autovacuum, and someone brought up the dead space map idea again. Those are all worthwhile things to do and help with vacuuming after deletes as well as updates, but they only address issue 1. Mark's suggestion (assuming that it would've worked) as well as yours address both, but only for updates. - Heikki ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org |
| |||
| Heikki Linnakangas wrote: > On Sat, 24 Jun 2006, Bruce Momjian wrote: > > > Because having them be on the same page is the only way you can update > > the page item pointer so when you recycle the row, you the indexes are > > now pointing to the new version. Pages look like: > > > > [marker][item1][item2][item3]...[tuple1][tuple2][tuple3] > > > > and indexes only point to items, not to tuples. This allows tuples to > > be compacted on the page without affecting the indexes. > > > > If tuple1 is updated to tuple2, once tuple1 is no longer visible to any > > backends, you can modify item1 to point to tuple2, and you can mark the > > space used by tuple1 as reusable: > > > > [marker][item1(tuple2)][item2][item3]...[free][tuple2][tuple3] > > Ok, now I think I get it. So the limitation of old and new tuple being on > the same page is required to make it possible to remove the old tuple > without touching the indexes? Yes, modifying the heap page item pointer is required for reuse. > If you move the new tuple (logically, by modifying item pointers) on > vacuum, isn't there a risk that a concurrent seqscan misses it? Well, you lock the page while updating the item pointer. Because the versions are on the same page, a single page lock should be fine. > > If you can't expire the old row because one of the indexed columns was > > modified, I see no reason to try to reduce the additional index entries. > > It won't enable early expiration, but it means less work to do on update. > If there's a lot of indexes, not having to add so many index tuples can be > a significant saving. Already added to TODO. * Reuse index tuples that point to heap tuples that are not visible to anyone? > To summarise, we have two issues related to frequent updates: > 1. Index and heap bloat, requiring frequent vacuum. > 2. Updates are more expensive than on other DBMSs, because we have to add > a new index tuple in every index, even if none of the indexed columns are > modified. > > Tom suggested that we just improve vacuum and autovacuum, and someone > brought up the dead space map idea again. Those are all worthwhile things > to do and help with vacuuming after deletes as well as updates, but they > only address issue 1. Mark's suggestion (assuming that it would've > worked) as well as yours address both, but only for updates. Agreed. -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |
| |||
| On Sun, Jun 25, 2006 at 09:13:48PM +0300, Heikki Linnakangas wrote: > >If you can't expire the old row because one of the indexed columns was > >modified, I see no reason to try to reduce the additional index entries. > > It won't enable early expiration, but it means less work to do on update. > If there's a lot of indexes, not having to add so many index tuples can be > a significant saving. While catching up on this thread, the following idea came to me that I think would allow for not updating an index on an UPDATE if it's key doesn't change. If I understand Bruce's SITC proposal correctly, this would differ in that SITC requires that no index keys change. My idea is that if an UPDATE places the new tuple on the same page as the old tuple, it will not create new index entries for any indexes where the key doesn't change. This means that when fetching tuples from the index, ctid would have to be followed until you found the version you wanted OR you found the first ctid that pointed to a different page (because that tuple will have it's own index entry) OR you found a tuple with a different value for the key of the index you're using (because it'd be invalid, and there'd be a different index entry for it). I believe that the behavior of the index hint bits would also have to change somewhat, as each index entry would now essentially be pointing at all the tuples in the ctid chain that exist on a page, not just single tuple. In the case of an UPDATE that needs to put the new tuple on a different page, our current behavior would be used. This means that the hint bits would still be useful in limiting the number of heap pages you hit. I also believe this means that we wouldn't suffer any additional overhead from our current code when there isn't much free space on pages. Since SITC allows for in-page space reuse without vacuuming only when no index keys change, it's most useful for very heavily updated tables such as session handlers or queue tables, because those tables typically have very few indexes, so it's pretty unlikely that an index key will change. For more general-purpose tables that have more indexes but still see a fair number of updates to a subset of rows, not having to update every index would likely be a win. I also don't see any reason why both options couldn't be used together. -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org |
| |||
| It is certainly possible to do what you are suggesting, that is have two index entries point to same chain head, and have the index access routines figure out if the index qualifications still hold, but that seems like a lot of overhead. Also, once there is only one visible row in the chain, removing old index entries seems quite complex because you have to have vacuum keep the qualifications of each row to figure out which index tuple is the valid one (seems messy). --------------------------------------------------------------------------- Jim C. Nasby wrote: > On Sun, Jun 25, 2006 at 09:13:48PM +0300, Heikki Linnakangas wrote: > > >If you can't expire the old row because one of the indexed columns was > > >modified, I see no reason to try to reduce the additional index entries. > > > > It won't enable early expiration, but it means less work to do on update. > > If there's a lot of indexes, not having to add so many index tuples can be > > a significant saving. > > While catching up on this thread, the following idea came to me that I > think would allow for not updating an index on an UPDATE if it's key > doesn't change. If I understand Bruce's SITC proposal correctly, this > would differ in that SITC requires that no index keys change. > > My idea is that if an UPDATE places the new tuple on the same page as > the old tuple, it will not create new index entries for any indexes > where the key doesn't change. This means that when fetching tuples from > the index, ctid would have to be followed until you found the version > you wanted OR you found the first ctid that pointed to a different page > (because that tuple will have it's own index entry) OR you found a tuple > with a different value for the key of the index you're using (because > it'd be invalid, and there'd be a different index entry for it). I > believe that the behavior of the index hint bits would also have to > change somewhat, as each index entry would now essentially be pointing > at all the tuples in the ctid chain that exist on a page, not just > single tuple. > > In the case of an UPDATE that needs to put the new tuple on a different > page, our current behavior would be used. This means that the hint bits > would still be useful in limiting the number of heap pages you hit. I > also believe this means that we wouldn't suffer any additional overhead > from our current code when there isn't much free space on pages. > > Since SITC allows for in-page space reuse without vacuuming only when no > index keys change, it's most useful for very heavily updated tables such > as session handlers or queue tables, because those tables typically have > very few indexes, so it's pretty unlikely that an index key will change. > For more general-purpose tables that have more indexes but still see a > fair number of updates to a subset of rows, not having to update every > index would likely be a win. I also don't see any reason why both > options couldn't be used together. > -- > Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com > Pervasive Software http://pervasive.com work: 512-231-6117 > vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 > -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings |
| |||
| On Mon, Jun 26, 2006 at 02:32:59PM -0400, Bruce Momjian wrote: > > It is certainly possible to do what you are suggesting, that is have two > index entries point to same chain head, and have the index access > routines figure out if the index qualifications still hold, but that > seems like a lot of overhead. > > Also, once there is only one visible row in the chain, removing old > index entries seems quite complex because you have to have vacuum keep > the qualifications of each row to figure out which index tuple is the > valid one (seems messy). Perhaps my point got lost... in the case where no index keys change during an update, SITC seems superior in every way to my proposal. My idea (let's call it Index Tuple Page Consolidation, ITPC) would be beneficial to UPDATEs that modify one or more index keys but still put the tuple on the same page. Where SITC would be most useful for tables that have a very heavy update rate and very few indexes, ITPC would benefit tables that have more indexes on them; where presumably it's much more likely for UPDATEs to change at least one index key (which means SITC goes out the window, if I understand it correctly). If I'm missing something and SITC can in fact deal with some index keys changing during an UPDATE, then I see no reason for ITPC. > --------------------------------------------------------------------------- > > Jim C. Nasby wrote: > > On Sun, Jun 25, 2006 at 09:13:48PM +0300, Heikki Linnakangas wrote: > > > >If you can't expire the old row because one of the indexed columns was > > > >modified, I see no reason to try to reduce the additional index entries. > > > > > > It won't enable early expiration, but it means less work to do on update. > > > If there's a lot of indexes, not having to add so many index tuples can be > > > a significant saving. > > > > While catching up on this thread, the following idea came to me that I > > think would allow for not updating an index on an UPDATE if it's key > > doesn't change. If I understand Bruce's SITC proposal correctly, this > > would differ in that SITC requires that no index keys change. > > > > My idea is that if an UPDATE places the new tuple on the same page as > > the old tuple, it will not create new index entries for any indexes > > where the key doesn't change. This means that when fetching tuples from > > the index, ctid would have to be followed until you found the version > > you wanted OR you found the first ctid that pointed to a different page > > (because that tuple will have it's own index entry) OR you found a tuple > > with a different value for the key of the index you're using (because > > it'd be invalid, and there'd be a different index entry for it). I > > believe that the behavior of the index hint bits would also have to > > change somewhat, as each index entry would now essentially be pointing > > at all the tuples in the ctid chain that exist on a page, not just > > single tuple. > > > > In the case of an UPDATE that needs to put the new tuple on a different > > page, our current behavior would be used. This means that the hint bits > > would still be useful in limiting the number of heap pages you hit. I > > also believe this means that we wouldn't suffer any additional overhead > > from our current code when there isn't much free space on pages. > > > > Since SITC allows for in-page space reuse without vacuuming only when no > > index keys change, it's most useful for very heavily updated tables such > > as session handlers or queue tables, because those tables typically have > > very few indexes, so it's pretty unlikely that an index key will change. > > For more general-purpose tables that have more indexes but still see a > > fair number of updates to a subset of rows, not having to update every > > index would likely be a win. I also don't see any reason why both > > options couldn't be used together. > > -- > > Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com > > Pervasive Software http://pervasive.com work: 512-231-6117 > > vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 > > > > -- > Bruce Momjian bruce@momjian.us > EnterpriseDB http://www.enterprisedb.com > > + If your life is a hard drive, Christ can be your backup. + > > ---------------------------(end of broadcast)--------------------------- > TIP 5: don't forget to increase your free space map settings > -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| |||
| Jim C. Nasby wrote: > On Mon, Jun 26, 2006 at 02:32:59PM -0400, Bruce Momjian wrote: > > > > It is certainly possible to do what you are suggesting, that is have two > > index entries point to same chain head, and have the index access > > routines figure out if the index qualifications still hold, but that > > seems like a lot of overhead. > > > > Also, once there is only one visible row in the chain, removing old > > index entries seems quite complex because you have to have vacuum keep > > the qualifications of each row to figure out which index tuple is the > > valid one (seems messy). > > Perhaps my point got lost... in the case where no index keys change > during an update, SITC seems superior in every way to my proposal. My > idea (let's call it Index Tuple Page Consolidation, ITPC) would be > beneficial to UPDATEs that modify one or more index keys but still put > the tuple on the same page. Where SITC would be most useful for tables > that have a very heavy update rate and very few indexes, ITPC would > benefit tables that have more indexes on them; where presumably it's > much more likely for UPDATEs to change at least one index key (which > means SITC goes out the window, if I understand it correctly). If I'm > missing something and SITC can in fact deal with some index keys > changing during an UPDATE, then I see no reason for ITPC. I understood what you had said. The question is whether we want to get that complex with this feature, and if there are enough use cases (UPDATE with index keys changing) to warrant it. -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| ||||
| Ühel kenal päeval, E, 2006-06-26 kell 23:08, kirjutas Bruce Momjian: > Jim C. Nasby wrote: > > On Mon, Jun 26, 2006 at 02:32:59PM -0400, Bruce Momjian wrote: > > > > > > It is certainly possible to do what you are suggesting, that is have two > > > index entries point to same chain head, and have the index access > > > routines figure out if the index qualifications still hold, but that > > > seems like a lot of overhead. I think Jim meant not 2 pointing to the same head, but 2 pointing into the same chain. Say we have table with (id serial, ts timestamp) where ts changes at each update and id does not. So after 3 updates on one page we have one CITC/ITPC head with pointers from both indexes and two follow-up tuples with pointers from only ts index. The problem with this setup is, that we can't reuse any of those follow-up tuples without index cleanup. > > > Also, once there is only one visible row in the chain, removing old > > > index entries seems quite complex because you have to have vacuum keep > > > the qualifications of each row to figure out which index tuple is the > > > valid one (seems messy). > > > > Perhaps my point got lost... in the case where no index keys change > > during an update, SITC seems superior in every way to my proposal. My > > idea (let's call it Index Tuple Page Consolidation, ITPC) would be > > beneficial to UPDATEs that modify one or more index keys but still put > > the tuple on the same page. Where SITC would be most useful for tables > > that have a very heavy update rate and very few indexes, ITPC would > > benefit tables that have more indexes on them; where presumably it's > > much more likely for UPDATEs to change at least one index key (which > > means SITC goes out the window, if I understand it correctly). If I'm > > missing something and SITC can in fact deal with some index keys > > changing during an UPDATE, then I see no reason for ITPC. > > I understood what you had said. The question is whether we want to get > that complex with this feature, and if there are enough use cases > (UPDATE with index keys changing) to warrant it. I'd like to think that most heavily-updated tables avoid that, but there may be still cases where this is needed. -- ---------------- Hannu Krosing Database Architect Skype Technologies OÜ Akadeemia tee 21 F, Tallinn, 12618, Estonia Skype me: callto:hkrosing Get Skype for free: http://www.skype.com ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |