This is a discussion on Re: [COMMITTERS] pgsql: Teach tuplesort.c about "top N" sorting, in which only the first within the pgsql Hackers forums, part of the PostgreSQL category; --> Magnus Hagander <magnus@hagander.net> writes: > On Fri, May 04, 2007 at 12:38:18PM -0400, Tom Lane wrote: >> Magnus Hagander ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Magnus Hagander <magnus@hagander.net> writes: > On Fri, May 04, 2007 at 12:38:18PM -0400, Tom Lane wrote: >> Magnus Hagander <magnus@hagander.net> writes: >>> Could we show it in EXPLAIN ANALYZE somehow? I'm thinking it would be good >>> to see at runtime (for example as a hint that if you put in a bit more >>> work_mem it might get used) >> >> I don't see that this is any more interesting than whether the sort >> spilled to disk or not; which is something we don't show in EXPLAIN >> ANALYZE either. trace_sort is the agreed API for examining that. > Now that you mention it, that'd be nice to have as well - the point being > making it available without recompile. Well, trace_sort is available by default, but... >> It's not exactly easy to do, because (a) none of this information >> is exposed outside tuplesort.c, and (b) the tuplesortstate object >> is probably gone by the time EXPLAIN ANALYZE runs, anyway. > Hmm. Ok. Don't know enough about those parts of the code to comment on > that, but I'll certainly take your word for it :-) I take back point (b) --- the state object is released at ExecutorEnd, and EXPLAIN ANALYZE examines the tree before doing that, so if we added some kind of reporting function to tuplesort.c's API it'd be doable easily enough. What do you think the output should look like? The first thought that comes to mind is to add "method=memory" (or disk or top-N) to the "actual" annotation: regression=# explain analyze select * from tenk1 order by fivethous limit 100; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------ Limit (cost=840.19..840.44 rows=100 width=244) (actual time=140.511..141.604 rows=100 loops=1) -> Sort (cost=840.19..865.19 rows=10000 width=244) (actual time=140.492..140.880 rows=100 loops=1 method=top-N) ^^^^^^^^^^^^ Sort Key: fivethous -> Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=244) (actual time=0.074..51.849 rows=10000 loops=1) Total runtime: 143.089 ms (5 rows) Another possibility, which could be wedged into explain.c slightly more easily, is to append "Method: top-N" or some such to the Sort Key line, but I'm not sure that that would look nice. Comments, ideas? regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 2: Don't 'kill -9' the postmaster |
| |||
| > >> It's not exactly easy to do, because (a) none of this information > >> is exposed outside tuplesort.c, and (b) the tuplesortstate object > >> is probably gone by the time EXPLAIN ANALYZE runs, anyway. > > > Hmm. Ok. Don't know enough about those parts of the code to comment on > > that, but I'll certainly take your word for it :-) > > I take back point (b) --- the state object is released at ExecutorEnd, > and EXPLAIN ANALYZE examines the tree before doing that, so if we added > some kind of reporting function to tuplesort.c's API it'd be doable > easily enough. > > What do you think the output should look like? The first thought that > comes to mind is to add "method=memory" (or disk or top-N) to the > "actual" annotation: > > regression=# explain analyze select * from tenk1 order by fivethous limit 100; > QUERY PLAN > ------------------------------------------------------------------------------------------------------------------------ > Limit (cost=840.19..840.44 rows=100 width=244) (actual time=140.511..141.604 rows=100 loops=1) > -> Sort (cost=840.19..865.19 rows=10000 width=244) (actual time=140.492..140.880 rows=100 loops=1 method=top-N) > ^^^^^^^^^^^^ > Sort Key: fivethous > -> Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=244) (actual time=0.074..51.849 rows=10000 loops=1) > Total runtime: 143.089 ms > (5 rows) Looks pretty good to me. Easy to find and hard to misunderstand /Magnus ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| |||
| On May 4, 2007, at 7:08 PM, Tom Lane wrote: > What do you think the output should look like? The first thought that > comes to mind is to add "method=memory" (or disk or top-N) to the > "actual" annotation: > > regression=# explain analyze select * from tenk1 order by fivethous > limit 100; > QUERY PLAN > ---------------------------------------------------------------------- > -------------------------------------------------- > Limit (cost=840.19..840.44 rows=100 width=244) (actual > time=140.511..141.604 rows=100 loops=1) > -> Sort (cost=840.19..865.19 rows=10000 width=244) (actual > time=140.492..140.880 rows=100 loops=1 method=top-N) > > ^^^^^^^^^^^^ > Sort Key: fivethous > -> Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 > width=244) (actual time=0.074..51.849 rows=10000 loops=1) > Total runtime: 143.089 ms > (5 rows) > > Another possibility, which could be wedged into explain.c slightly > more > easily, is to append "Method: top-N" or some such to the Sort Key > line, > but I'm not sure that that would look nice. If the method is disk it would be nice to know how much spilled to disk. That would tell you if it would be worth increasing work_mem, and by how much. On a related note, it would also be *really* nice if we kept stats on how many sorts or hashes had spilled to disk, perhaps along with how much had spilled. Right now the only way to monitor that in a production system is to setup a cron job to watch pgsql_tmp, which is far from elegant. I know there's concern about how much we add to the stats file, but I don't think this needs to be on a per-relation basis; per-database should be fine. -- Jim Nasby jim@nasby.net EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---------------------------(end of broadcast)--------------------------- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq |
| |||
| Jim Nasby <decibel@decibel.org> writes: > On a related note, it would also be *really* nice if we kept stats on > how many sorts or hashes had spilled to disk, perhaps along with how > much had spilled. Right now the only way to monitor that in a > production system is to setup a cron job to watch pgsql_tmp, which is > far from elegant. No, you can turn on trace_sort and track it from watching the log. If pgfouine hasn't got something for that already, I'd be surprised. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| |||
| > > What do you think the output should look like? The first thought that > > comes to mind is to add "method=memory" (or disk or top-N) to the > > "actual" annotation: Having the "disk" and "memory" would be really useful too. -- Josh Berkus PostgreSQL @ Sun San Francisco ---------------------------(end of broadcast)--------------------------- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate |
| |||
| Jim Nasby <decibel@decibel.org> writes: > If the method is disk it would be nice to know how much spilled to > disk. That would tell you if it would be worth increasing work_mem, > and by how much. Well, a more radical proposal is to add a whole 'nother line to the output, which would give us room for several bits of info. Perhaps like this: -> Sort (cost=840.19..865.19 rows=10000 width=244) (actual time=151.769..152.157 rows=100 loops=1) Sort Key: fivethous Sort Method: top-N Memory: 17KB -> Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=244) (actual or Sort Method: disk Memory: 1000KB Disk: 18482KB Not sure what other info might be useful. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 2: Don't 'kill -9' the postmaster |
| |||
| On 5/4/07, Tom Lane <tgl@sss.pgh.pa.us> wrote: > -> Sort (cost=840.19..865.19 rows=10000 width=244) (actual time=140.492..140.880 rows=100 loops=1 method=top-N) > ^^^^^^^^^^^^ > Sort Key: fivethous > Another possibility, which could be wedged into explain.c slightly more > easily, is to append "Method: top-N" or some such to the Sort Key line, > but I'm not sure that that would look nice. Is it possible to have something like Sort (disk|top-N|memory) instead of Sort? I'm really not sure it's a good idea to break the (actual time=0.074..51.849 rows=10000 loops=1) output we have for every node. It's easier to read the output when it's consistent. If not, append it at the end of the Sort Key line is better IMHO. -- Guillaume ---------------------------(end of broadcast)--------------------------- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate |
| |||
| On 5/4/07, Tom Lane <tgl@sss.pgh.pa.us> wrote: > No, you can turn on trace_sort and track it from watching the log. > If pgfouine hasn't got something for that already, I'd be surprised. Well, it hasn't. I never used trace_sort so i didn't think of implementing something to use it. I'll take a look at it for the next versions to see if I can do something useful. If anyone has suggestions/needs on this point, they are welcome. -- Guillaume ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| |||
| On 5/4/07, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Sort Method: disk Memory: 1000KB Disk: 18482KB +1 for this one. -- Guillaume ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |
| ||||
| "Guillaume Smet" <guillaume.smet@gmail.com> writes: > Is it possible to have something like Sort (disk|top-N|memory) instead > of Sort? That would be sane if the decision were fixed at plan time, but it isn't. What do you think of the add-a-line approach? regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |