vBulletin Search Engine Optimization
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Hi All, I have boiled my situation down to the following simple case: (postgres version 7.3) * Query 1 is doing a sequential scan over a table (courtesy of field ILIKE 'foo%') and index joins to a few others * Query 2 is doing a functional index scan over the same table (lower(field) LIKE 'foo%') and index joins to a few others * neither query has an order by clause * for the purpose of testing, both queries are designed to return the same result set Obviously Q2 is faster than Q1, but if I ever run them both at the same time (lets say I run two of Q1 and one of Q2 at the same time) then Q2 consistently returns WORSE times than Q1 (explain analyze confirms that it is using the index). My assumption is that the sequential scan is blowing the index from any cache it might live in, and simultaneously stealing all the disk IO that is needed to access the index on disk (the table has 200,000 rows). If I simplify the case to not do the index joins (ie. operate on the one table only) the situation is not as dramatic, but similar. My thoughts are: 1) kill the sequential scan - but unfortunately I don't have direct control over that code 2) change the way the server allocates/prioritizes different caches - i don't know enough about how postgres caches work to do this (if it's possible) 3) try it on postgres 7.4 - possible, but migrating the system to 7.4 in production will be hard because the above code that I am not responsible for has a lot of (slightly wacky) implicit date casts 4) ask the fine people on the mailing list for other suggestions! -- Mark Aufflick e mark@pumptheory.com w www.pumptheory.com (work) w mark.aufflick.com (personal) p +61 438 700 647 f +61 2 9436 4737 ================================================== ====================== iBurst Wireless Broadband from $34.95/month www.platformnetworks.net Forward undetected SPAM to: spam@mailsecurity.net.au ================================================== ====================== ---------------------------(end of broadcast)--------------------------- TIP 7: don't forget to increase your free space map settings |
| |||
| The world rejoiced as mark@pumptheory.com (Mark Aufflick) wrote: > Hi All, > > I have boiled my situation down to the following simple case: > (postgres version 7.3) > > * Query 1 is doing a sequential scan over a table (courtesy of field > ILIKE 'foo%') and index joins to a few others > * Query 2 is doing a functional index scan over the same table > (lower(field) LIKE 'foo%') and index joins to a few others > * neither query has an order by clause > * for the purpose of testing, both queries are designed to return the > same result set > > Obviously Q2 is faster than Q1, but if I ever run them both at the > same time (lets say I run two of Q1 and one of Q2 at the same time) > then Q2 consistently returns WORSE times than Q1 (explain analyze > confirms that it is using the index). > > My assumption is that the sequential scan is blowing the index from > any cache it might live in, and simultaneously stealing all the disk > IO that is needed to access the index on disk (the table has 200,000 > rows). There's something to be said for that... > If I simplify the case to not do the index joins (ie. operate on the > one table only) the situation is not as dramatic, but similar. > > My thoughts are: > > 1) kill the sequential scan - but unfortunately I don't have direct > control over that code This is a good choice, if plausible... > 2) change the way the server allocates/prioritizes different caches - > i don't know enough about how postgres caches work to do this (if it's > possible) That's what the 8.0 cache changes did... Patent claim issues are leading to some changes to the prioritization, which is liable to change 8.0.something and 8.1. > 3) try it on postgres 7.4 - possible, but migrating the system to 7.4 > in production will be hard because the above code that I am not > responsible for has a lot of (slightly wacky) implicit date casts Moving to 7.4 wouldn't materially change the situation; you'd have to go all the way to version 8. -- (format nil "~S@~S" "cbbrowne" "gmail.com") http://linuxdatabases.info/~cbbrowne/postgresql.html Rules of the Evil Overlord #32. "I will not fly into a rage and kill a messenger who brings me bad news just to illustrate how evil I really am. Good messengers are hard to come by." <http://www.eviloverlord.com/> |
| |||
| Mark Aufflick <mark@pumptheory.com> writes: > Obviously Q2 is faster than Q1, That's not really obvious at all. If there are lots of records being returned the index might not be faster than a sequential scan. > My assumption is that the sequential scan is blowing the index from any cache > it might live in, and simultaneously stealing all the disk IO that is needed to > access the index on disk (the table has 200,000 rows). It kind of sounds to me like you've lowered random_page_cost to reflect the fact that your indexes are nearly always completely cached. But when they're not this unrealistic random_page_cost causes indexes to be used when they're no longer faster. Perhaps you should post an "EXPLAIN ANALYZE" of your Q1 and Q2 (the latter preferable with and without enable_indexscan, but since it's a join you may not be able to get precisely the comparable plan without just that one index scan.) > 2) change the way the server allocates/prioritizes different caches - i don't > know enough about how postgres caches work to do this (if it's possible) Postgres keeps one set of shared buffers, not separate pools . Normally you only allocate a small amount of your memory for Postgres and let the OS handle disk caching. What is your shared_buffers set to and how much memory do you have? > 3) try it on postgres 7.4 - possible, but migrating the system to 7.4 in > production will be hard because the above code that I am not responsible for > has a lot of (slightly wacky) implicit date casts I can't think of any 7.4 changes that would affect this directly, but there were certainly plenty of changes that had broad effects. you never know. 8.0, on the other hand, has a new algorithm that specifically tries to protect against the shared buffers being blown out by a sequential scan. But that will only help if it's the shared buffers being thrashed that's hurting you, not the entire OS file system cache. -- greg ---------------------------(end of broadcast)--------------------------- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) |
| |||
| Tom Lane <tgl@sss.pgh.pa.us> writes: > Greg Stark <gsstark@mit.edu> writes: > > 8.0, on the other hand, has a new algorithm that specifically tries to > > protect against the shared buffers being blown out by a sequential > > scan. But that will only help if it's the shared buffers being > > thrashed that's hurting you, not the entire OS file system cache. > > Something we ought to think about sometime: what are the performance > implications of the real-world situation that we have another level of > caching sitting underneath us? It seems inevitable that Postgres will eventually eliminate that redundant layer of buffering. Since mmap is not workable, that means using O_DIRECT to read table and index data. Every other database eventually goes this direction, and for good reason. Having two layers of caching and buffering is inherently inefficient. It also makes it impossible for Postgres to offer any application-specific hints to the caching replacement algorithms. In that world you would configure Postgres much like you configure Oracle, with shared_buffers taking up as much of your memory as you can afford. And the OS file system cache is kept entirely out of the loop. > AFAIK all the theoretical studies we've looked at consider only a single > level of caching. But for example, if our buffer management algorithm > recognizes an index page as being heavily hit and therefore keeps it in > cache for a long time, then when it does fall out of cache you can be sure > it's going to need to be read from disk when it's next used, because the > OS-level buffer cache has not seen a call for that page in a long time. > Contrariwise a page that we think is only on the fringe of usefulness is > going to stay in the OS cache because we repeatedly drop it and then have to > ask for it again. Hum. Is it clear that that's bad? By the same logic it's the ones on the fringe that you're likely to have to read again anyways. The ones that are being heavily used are likely not to have to be read again anyways. -- greg ---------------------------(end of broadcast)--------------------------- TIP 7: don't forget to increase your free space map settings |
| |||
| Josh Berkus <josh@agliodbs.com> writes: > Why is mmap not workable? We can't control write order. There are other equally bad problems, but that one alone eliminates it from consideration. See past discussions. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 6: Have you searched our list archives? http://archives.postgresql.org |
| |||
| Josh Berkus <josh@agliodbs.com> writes: > Why is mmap not workable? It would require far-reaching changes to our code > -- certainly -- but I don't think it can be eliminated from consideration. Fundamentally because there is no facility for being notified by the OS before a page is written to disk. And there's no way to prevent a page from being written to disk (mlock just prevents it from being flushed from memory, not from being synced to disk). So there's no way to guarantee the WAL will be written before the buffer is synced to disk. Maybe it could be done by writing and syncing the WAL independently before the shared buffer is written to at all, but that would be a completely different model. And it would locking the shared buffer until the sync is done, and require a private copy of the shared buffer necessitating more copies than the double buffering in the first place. -- greg ---------------------------(end of broadcast)--------------------------- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |
| |||
| Hi Rod, > Any solution fixing buffers should probably not take into consideration > the method being performed (do you really want to skip caching a > sequential scan of a 2 tuple table because it didn't use an index) but > the volume of data involved as compared to the size of the cache. Yes, in fact indexes aren't so different to tables really in that regard. It sounds like version 8 may help out anyway. regards Iain ---------------------------(end of broadcast)--------------------------- TIP 7: don't forget to increase your free space map settings |
| |||
| Hi, I think there was some discussion about seq scans messing up the cache, and talk about doing something about it but I don't think it has been addressed yet. Maybe worth a troll through the archives. It is certainly true that in many situations, a seq scan is preferable to using an index. I have been testing a situation here on two versions of the same database, one of the databases is much bigger than the other (artificially bloated for testing purposes). Some of the query plans change to use seq scans on the big database, where they used indexes on the little database - but either way, in *single user* testing the performance is fine. My concern is that this kind of testing has very little relevance to the real world of multiuser processing where contention for the cache becomes an issue. It may be that, at least in the current situation, postgres is giving too much weight to seq scans based on single user, straight line performance comparisons. If your assumption is correct, then addressing that might help, though it's bound to have it's compromises too... regards Iain ----- Original Message ----- From: "Mark Aufflick" <mark@pumptheory.com> To: <pgsql-performance@postgresql.org> Sent: Tuesday, February 15, 2005 8:34 AM Subject: [PERFORM] seq scan cache vs. index cache smackdown > Hi All, > > I have boiled my situation down to the following simple case: (postgres > version 7.3) > > * Query 1 is doing a sequential scan over a table (courtesy of field ILIKE > 'foo%') and index joins to a few others > * Query 2 is doing a functional index scan over the same table > (lower(field) LIKE 'foo%') and index joins to a few others > * neither query has an order by clause > * for the purpose of testing, both queries are designed to return the same > result set > > Obviously Q2 is faster than Q1, but if I ever run them both at the same > time (lets say I run two of Q1 and one of Q2 at the same time) then Q2 > consistently returns WORSE times than Q1 (explain analyze confirms that it > is using the index). > > My assumption is that the sequential scan is blowing the index from any > cache it might live in, and simultaneously stealing all the disk IO that > is needed to access the index on disk (the table has 200,000 rows). > > If I simplify the case to not do the index joins (ie. operate on the one > table only) the situation is not as dramatic, but similar. > > My thoughts are: > > 1) kill the sequential scan - but unfortunately I don't have direct > control over that code > 2) change the way the server allocates/prioritizes different caches - i > don't know enough about how postgres caches work to do this (if it's > possible) > 3) try it on postgres 7.4 - possible, but migrating the system to 7.4 in > production will be hard because the above code that I am not responsible > for has a lot of (slightly wacky) implicit date casts > 4) ask the fine people on the mailing list for other suggestions! > -- > Mark Aufflick > e mark@pumptheory.com > w www.pumptheory.com (work) > w mark.aufflick.com (personal) > p +61 438 700 647 > f +61 2 9436 4737 > > > ================================================== ====================== > iBurst Wireless Broadband from $34.95/month www.platformnetworks.net > Forward undetected SPAM to: spam@mailsecurity.net.au > ================================================== ====================== > > > ---------------------------(end of broadcast)--------------------------- > TIP 7: don't forget to increase your free space map settings ---------------------------(end of broadcast)--------------------------- TIP 8: explain analyze is your friend |
| |||
| Greg Stark <gsstark@mit.edu> writes: > 8.0, on the other hand, has a new algorithm that specifically tries to > protect against the shared buffers being blown out by a sequential > scan. But that will only help if it's the shared buffers being > thrashed that's hurting you, not the entire OS file system cache. Something we ought to think about sometime: what are the performance implications of the real-world situation that we have another level of caching sitting underneath us? AFAIK all the theoretical studies we've looked at consider only a single level of caching. But for example, if our buffer management algorithm recognizes an index page as being heavily hit and therefore keeps it in cache for a long time, then when it does fall out of cache you can be sure it's going to need to be read from disk when it's next used, because the OS-level buffer cache has not seen a call for that page in a long time. Contrariwise a page that we think is only on the fringe of usefulness is going to stay in the OS cache because we repeatedly drop it and then have to ask for it again. I have no idea how to model this situation, but it seems like it needs some careful thought. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) |
| ||||
| Tom, Greg, Merlin, > But for example, > if our buffer management algorithm recognizes an index page as being > heavily hit and therefore keeps it in cache for a long time, then when > it does fall out of cache you can be sure it's going to need to be read > from disk when it's next used, because the OS-level buffer cache has not > seen a call for that page in a long time. Contrariwise a page that we > think is only on the fringe of usefulness is going to stay in the OS > cache because we repeatedly drop it and then have to ask for it again. Now you can see why other DBMSs don't use the OS disk cache. There's other issues as well; for example, as long as we use the OS disk cache, we can't eliminate checkpoint spikes, at least on Linux. No matter what we do with the bgwriter, fsyncing the OS disk cache causes heavy system activity. > It seems inevitable that Postgres will eventually eliminate that redundant > layer of buffering. Since mmap is not workable, that means using O_DIRECT > to read table and index data. Why is mmap not workable? It would require far-reaching changes to our code -- certainly -- but I don't think it can be eliminated from consideration. > What about going the other way and simply letting the o/s do all the > caching? *How bad (or good) would the performance really be? * Pretty bad. You can simulate this easily by turning your shared_buffers way down ... -- Josh Berkus Aglio Database Solutions San Francisco ---------------------------(end of broadcast)--------------------------- TIP 8: explain analyze is your friend |