vBulletin Search Engine Optimization
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Hi all, I have a large busy myisam table which has a lot of inserts and deletes acting on it (as well as selects and updates). I am sure I have read somewhere that it can be more efficient not to delete a row at all, but to just set a "deleted" flag on it, and then perhaps do a bulk delete at some quiet time. I can't find where I read it now though. Is this the case? Are there only particular use cases where it makes sense? Thanks |
| |||
| On Tue, 18 Mar 2008 12:43:00 -0700 (PDT), lister wrote: > Hi all, > > I have a large busy myisam table which has a lot of inserts and > deletes acting on it (as well as selects and updates). > > I am sure I have read somewhere that it can be more efficient not to > delete a row at all, but to just set a "deleted" flag on it, and then > perhaps do a bulk delete at some quiet time. I can't find where I read > it now though. > > Is this the case? Are there only particular use cases where it makes > sense? Well, for one thing it makes it a LOT easier to restore the row if you want it back... I'm doing both that and the "track changes by selecting the most recent of relevant records" to be able to easily undo updates for one of the sites I maintain. As for when to use such things, basically, if the people doing updates or deletes aren't people you can fire, charge for time to fix problems, or punch in the mouth when they screw up, the time spent building in these kinds of tricks is USUALLY well-spent. -- 32. I will not fly into a rage and kill a messenger who brings me bad news just to illustrate how evil I really am. Good messengers are hard to come by. --Peter Anspach's list of things to do as an Evil Overlord |
| |||
| lister wrote: > Hi all, > > I have a large busy myisam table which has a lot of inserts and > deletes acting on it (as well as selects and updates). > > I am sure I have read somewhere that it can be more efficient not to > delete a row at all, but to just set a "deleted" flag on it, and then > perhaps do a bulk delete at some quiet time. I can't find where I read > it now though. > > Is this the case? Are there only particular use cases where it makes > sense? > > Thanks It does make sense - just be aware that your applications would need to include "and rec_deleted is null" or similar in all of your other sql that operates on those tables (select, insert, update) to ensure you are not operating on data you supposedly deleted. Of course the biggest factor here is time and effort. Is it going to take you more time to add the deleted column (very easy) + modifying your code than it is really worth? Does it really take you that long to "purge" old data? |
| |||
| On Mar 18, 12:43 pm, lister <listerofsme...@hotmail.com> wrote: > Hi all, > > I have a large busy myisam table which has a lot of inserts and > deletes acting on it (as well as selects and updates). > > I am sure I have read somewhere that it can be more efficient not to > delete a row at all, but to just set a "deleted" flag on it, and then > perhaps do a bulk delete at some quiet time. I can't find where I read > it now though. > > Is this the case? Are there only particular use cases where it makes > sense? > > Thanks I tend to doubt if this saves you any processing time at all. After all, an I/O is an I/O, and the system is going to pull down a block, modify it, and put it back, no matter whether that's an update or a delete. And, although I'm not expert on database internals, I'd be willing to bet that internally the system just flags the row for deletion anyway. Then later on, as time and algorithms permit, it puts the row back into the pool of available space. So all you'd accomplish with your flag is duplicating the internal actions, and not saving hardly anything at all. Other than that -- Peter H. Coffin is 100% correct in that this method allows you to "undelete" a row, something that can be very important, depending on the application. And Michael Austin is 100% correct that possibly your biggest time constraint will be man-hours changing all the application queries to add ("and rec_deleted is null") to each and every SQL that includes this table. Bottom line: unless it's needed by the application, don't bother. Hope that helps. |
| |||
| ThanksButNo wrote: > On Mar 18, 12:43 pm, lister <listerofsme...@hotmail.com> wrote: >> Hi all, >> >> I have a large busy myisam table which has a lot of inserts and >> deletes acting on it (as well as selects and updates). >> >> I am sure I have read somewhere that it can be more efficient not to >> delete a row at all, but to just set a "deleted" flag on it, and then >> perhaps do a bulk delete at some quiet time. I can't find where I read >> it now though. >> >> Is this the case? Are there only particular use cases where it makes >> sense? >> >> Thanks > > I tend to doubt if this saves you any processing time at all. > > After all, an I/O is an I/O, and the system is going to pull down a > block, modify it, and put it back, no matter whether that's an update > or a delete. > > And, although I'm not expert on database internals, I'd be willing to > bet that internally the system just flags the row for deletion > anyway. Then later on, as time and algorithms permit, it puts the row > back into the pool of available space. So all you'd accomplish with > your flag is duplicating the internal actions, and not saving hardly > anything at all. > > Other than that -- > > Peter H. Coffin is 100% correct in that this method allows you to > "undelete" a row, something that can be very important, depending on > the application. > > And Michael Austin is 100% correct that possibly your biggest time > constraint will be man-hours changing all the application queries to > add ("and rec_deleted is null") to each and every SQL that includes > this table. > > Bottom line: unless it's needed by the application, don't bother. > > Hope that helps. > There is a slight additional overhead - updating the indexes and resorting them. However, this must be done at some time, anyway. If you delete multiple rows in one statement, then the indexes will only be updated once. But that's offset by the additional entries in the indexes every time you SELECT from the table (or even worse need to do a table scan). And then there's the extra processing rec_deleted is null on every request. If you're not doing many selects - mostly inserts and deletes, it probably would save some time to mark the row(s) as deleted and later delete all at once. But if this is like most databases where there are far more SELECTS, it probably would hurt processing in the long run. -- ================== Remove the "x" from my email address Jerry Stuckle JDS Computer Training Corp. jstucklex@attglobal.net ================== |
| |||
| On Mar 19, 1:23*am, ThanksButNo <no.no.tha...@gmail.com> wrote: > After all, an I/O is an I/O, and the system is going to pull down a > block, modify it, and put it back, no matter whether that's an update > or a delete. > > And, although I'm not expert on database internals, I'd be willing to > bet that internally the system just flags the row for deletion > anyway. *Then later on, as time and algorithms permit, it puts the row > back into the pool of available space. *So all you'd accomplish with > your flag is duplicating the internal actions, and not saving hardly > anything at all. Well after doing some more reading last night I found the following on the mysql site: "If a MyISAM table has no holes in the data file (deleted rows in the middle), inserts can be performed to add rows to the end of the table at the same time that SELECT statements are reading rows from the table." (http://dev.mysql.com/doc/refman/5.1/...t-inserts.html) This would perhaps give a potential speed increase by preventing holes due to deletions? Any new insert will automatically to the end of the table and so won't interfere with SELECTs. It does go on to say that this behaviour can be forced though with the concurrent_inserts variable. I guess the undelete option is the best argument for the practice so far. |
| ||||
| On Mar 18, 10:35*pm, Michael Austin <maus...@firstdbasource.com> wrote: > It does make sense - just be aware that your applications would need to > include "and rec_deleted is null" or similar in all of your other sql > that operates on those tables (select, insert, update) to ensure you are > not operating on data you supposedly deleted. > > Of course the biggest factor here is time and effort. Is it going to > take you more time to add the deleted column (very easy) + modifying > your code than it is really worth? *Does it really take you that long to > "purge" old data? Could I not use a view to do this? (Don't know much about views mind) |