On Fri, Dec 15, 2006 at 10:28:08AM +0000, Simon Riggs wrote:
> Until we work out a better solution we can fix this in two ways:
>
> 1. EXPLAIN ANALYZE [ [ WITH | WITHOUT ] TIME STATISTICS ] ...
>
> 2. enable_analyze_timer = off | on (default) (USERSET)

What exactly would this do? Only count actual rows or something? I
wrote a patch that tried statistical sampling, but the figures were too
far off for people's liking.

> A performance drop of 4x-10x is simply unacceptable when trying to tune
> queries where the current untuned time is already too high. Tying down
> production servers for hours on end when we know for certain all they
> are doing is calling gettimeofday millions of times is not good. This
> quickly leads to the view from objective people that PostgreSQL doesn't
> have a great optimizer, whatever we say in its defence. I don't want to
> leave this alone, but I don't want to spend a month fixing it either.

I think the best option is setitimer(), but it's not POSIX so
platform support is going to be patchy.

BTW, doing gettimeofday() without kernel entry is not really possible.
You could use the cycle counter but it has the problem that if you have
multiple CPUs you need to calibrate the result. If the CPU goes to
sleep, there is no way for the userspace process to know. Only the
kernel has all the relevant information about what "time" is to get a
reasonable result.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
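As a rough illustration of the overhead under discussion, the sketch below runs the same cheap per-row loop with and without a pair of clock reads around each row. It is a Python stand-in rather than backend code: `run_plan_node` and the per-row "work" are hypothetical, and the assumption that instrumentation costs two clock reads per row is taken from the thread's description, not from the PostgreSQL source. The ratio it prints depends entirely on the machine and the clock source.

```python
import time

def run_plan_node(rows, instrumented):
    """Simulate one plan node producing `rows` tuples (hypothetical helper).

    When `instrumented` is true, each row is bracketed by two clock reads,
    which is the pattern the thread blames for the slowdown.
    """
    total_timed = 0.0   # time attributed to the node by per-row timing
    checksum = 0
    started = time.perf_counter()
    for i in range(rows):
        if instrumented:
            t0 = time.perf_counter()                 # clock read on entry
        checksum += i & 7                            # stand-in for real row work
        if instrumented:
            total_timed += time.perf_counter() - t0  # clock read on exit
    wall = time.perf_counter() - started
    return wall, total_timed, checksum

plain_wall, _, _ = run_plan_node(200_000, instrumented=False)
timed_wall, attributed, _ = run_plan_node(200_000, instrumented=True)
print(f"uninstrumented: {plain_wall:.4f}s  instrumented: {timed_wall:.4f}s")
```

The cheaper the per-row work, the larger the share of the instrumented run that is spent reading the clock, which is consistent with the complaint that the slowdown varies so much between plan types.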
| |||
Hi list,

On Friday 15 December 2006 at 11:50, Martijn van Oosterhout wrote:
> BTW, doing gettimeofday() without kernel entry is not really possible.
> You could use the cycle counter but it has the problem that if you have
> multiple CPUs you need to calibrate the result. If the CPU goes to
> sleep, there is no way for the userspace process to know. Only the
> kernel has all the relevant information about what "time" is to get a
> reasonable result.

I remember having played with the Intel RDTSC (time stamp counter) for
some timing measurements, but I have just read from several sources
(including Linux kernel hackers considering its use for the
gettimeofday() implementation) that the TSC is not an accurate way to
get elapsed-time information.

Maybe some methods other than gettimeofday() are available (Lamport
timestamps, as PGDG may have to consider having a
distributed-processing-ready EA at some point in the future) that are
cheaper and still accurate?

After all, the discussion, as far as I understand it, is about having an
accurate measure of the duration of events; knowing when they occurred
in the day does not seem to be the point.

My 2¢, hoping this could be somehow helpful,
--
Dimitri Fontaine
http://www.dalibo.com/
| |||
"Martijn van Oosterhout" <kleptog@svana.org> writes:
> BTW, doing gettimeofday() without kernel entry is not really possible.

That's too strong a conclusion. Doing gettimeofday() without some help
from the kernel isn't possible, but it isn't necessary to enter the
kernel for each call.

There are various attempts at providing better timing infrastructure at
low overhead, but I'm not sure what's out there currently. I expect that
to do this, what we'll have to do is invent a pg_* abstraction that has
various implementations on different architectures. On Solaris it can
use DTrace internally; on Linux it might have something else (or, more
likely, several different options depending on the age and config
options of the kernel).

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com

---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at
http://www.postgresql.org/about/donate
| |||
On Fri, 2006-12-15 at 11:50 +0100, Martijn van Oosterhout wrote:
> On Fri, Dec 15, 2006 at 10:28:08AM +0000, Simon Riggs wrote:
> > Until we work out a better solution we can fix this in two ways:
> >
> > 1. EXPLAIN ANALYZE [ [ WITH | WITHOUT ] TIME STATISTICS ] ...
> >
> > 2. enable_analyze_timer = off | on (default) (USERSET)
>
> What exactly would this do? Only count actual rows or something?

Yes. It's better to have this than nothing at all.

> I wrote a patch that tried statistical sampling, but the figures were
> too far off for people's liking.

Well, I like your ideas, so if you have any more...

Maybe sampling every 10 rows will bring things down to an acceptable
level (after the first N). You tried less than 10, didn't you?

Maybe we can count how many real I/Os were required to perform each
particular row, so we can adjust the time per row based upon I/Os. ISTM
that sampling at too low a rate means we can't spot the effects of cache
and I/O, which can often be low frequency but high impact.

> I think the best option is setitimer(), but it's not POSIX so
> platform support is going to be patchy.

Don't understand that. I thought that was to do with alarms and signals.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
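The "time the first N rows, then every 10th row" idea can be sketched roughly as follows. This is only an illustration in Python, not Martijn's patch or any backend code: `sampled_node_time`, `warmup`, `every`, and the `work` callback are hypothetical names, and the extrapolation (average sampled row time multiplied by the number of rows after the warm-up) is one assumed way to turn samples into a total.

```python
import time

def sampled_node_time(rows, work, warmup=50, every=10):
    """Estimate total time for `rows` calls of `work` with few clock reads.

    The first `warmup` rows are timed individually; after that only every
    `every`-th row is timed and the average is extrapolated to the rest.
    Returns (estimated_seconds, number_of_clock_reads).
    """
    warm_time = 0.0      # exact time of the fully timed warm-up rows
    sampled_time = 0.0   # time of the sampled rows after the warm-up
    sampled_rows = 0
    tail_rows = 0        # all rows after the warm-up, sampled or not
    for i in range(rows):
        if i < warmup:
            t0 = time.perf_counter()
            work(i)
            warm_time += time.perf_counter() - t0
        else:
            tail_rows += 1
            if (i - warmup) % every == 0:
                t0 = time.perf_counter()
                work(i)
                sampled_time += time.perf_counter() - t0
                sampled_rows += 1
            else:
                work(i)   # untimed row: no clock reads at all
    estimate = warm_time
    if sampled_rows:
        estimate += sampled_time / sampled_rows * tail_rows
    clock_reads = 2 * (min(rows, warmup) + sampled_rows)
    return estimate, clock_reads

estimate, reads = sampled_node_time(1050, lambda i: None)
print(f"estimated {estimate:.6f}s using {reads} clock reads instead of {2 * 1050}")
```

The weakness Simon points out is visible in the structure: a rare row that is expensive because of a cache miss or an I/O is counted only if it happens to land on a sampled position, so the estimate can be far off when cost is concentrated in a few rows.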
| |||
On Fri, Dec 15, 2006 at 12:15:59PM +0000, Gregory Stark wrote:
> There are various attempts at providing better timing infrastructure at
> low overhead, but I'm not sure what's out there currently. I expect that
> to do this, what we'll have to do is invent a pg_* abstraction that has
> various implementations on different architectures. On Solaris it can
> use DTrace internally; on Linux it might have something else (or, more
> likely, several different options depending on the age and config
> options of the kernel).

I think we need to move to a sampling approach. setitimer is good,
except it doesn't tell you if signals have been lost. Given they are
most likely to be lost during high disk I/O, they're actually
significant. I'm trying to think of a way around that.

Then you don't need a cheap gettimeofday at all...

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
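A minimal sketch of what a setitimer-based sampling approach could look like, written in Python for brevity (its `signal.setitimer` wraps the same system call, and is available on Unix only). Everything here is an assumption for illustration: the plan is just a list of hypothetical node names with busy-loop durations, and `profile`, `on_tick`, and `current_node` are invented names. The idea is that a periodic signal charges one tick to whichever node is active, so no clock is read per row.

```python
import signal
import time

ticks = {}            # samples charged to each (hypothetical) plan node
current_node = None   # node the simulated executor is in right now

def on_tick(signum, frame):
    # Called on each timer expiry: charge one sample to the active node.
    if current_node is not None:
        ticks[current_node] = ticks.get(current_node, 0) + 1

def busy(seconds):
    # Stand-in for executor work: spin until the deadline passes.
    end = time.monotonic() + seconds
    while time.monotonic() < end:
        pass

def profile(plan, interval=0.001):
    """Run each (node, seconds) pair while a wall-clock interval timer samples."""
    global current_node
    ticks.clear()
    old = signal.signal(signal.SIGALRM, on_tick)
    signal.setitimer(signal.ITIMER_REAL, interval, interval)
    try:
        for node, seconds in plan:
            current_node = node
            busy(seconds)
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)   # disarm the timer
        current_node = None
        # Restore the previous handler (fall back to the default if unknown).
        signal.signal(signal.SIGALRM, old if old is not None else signal.SIG_DFL)
    return dict(ticks)

result = profile([("seqscan", 0.05), ("sort", 0.02)])
print(result)
```

The sketch does nothing about the lost-signal problem raised above: expirations that arrive while a SIGALRM is already pending are merged into one delivery, and the handler has no way of seeing that from the signal alone.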
| |||
Gregory Stark <stark@enterprisedb.com> writes:
> There are various attempts at providing better timing infrastructure at
> low overhead, but I'm not sure what's out there currently. I expect that
> to do this, what we'll have to do is invent a pg_* abstraction that has
> various implementations on different architectures.

You've got to be kidding. Surely it's glibc's responsibility, not ours,
to implement gettimeofday correctly for the hardware.

regards, tom lane
| ||||
Tom Lane <tgl@sss.pgh.pa.us> writes:
> Gregory Stark <stark@enterprisedb.com> writes:
> > There are various attempts at providing better timing infrastructure at
> > low overhead, but I'm not sure what's out there currently. I expect that
> > to do this, what we'll have to do is invent a pg_* abstraction that has
> > various implementations on different architectures.
>
> You've got to be kidding. Surely it's glibc's responsibility, not ours,
> to implement gettimeofday correctly for the hardware.

Except for two things:

a) We don't really need gettimeofday. That means we don't need something
   sensitive to adjustments made by ntp etc. In fact that would be
   actively bad. Currently, if the user runs "date" to reset his clock
   back a few days, I bet interesting things happen to a large explain
   analyze that's running.

   In fact we don't need something that represents any absolute time,
   only time elapsed since some other point we choose. That might be
   easier to implement than what glibc has to do to implement
   gettimeofday fully.

b) glibc may not want to endure an overhead on every syscall and context
   switch to make gettimeofday faster, on the assumption that
   gettimeofday is a rare call and it should pay the price rather than
   imposing an overhead on everything else.

   Postgres knows when it's running an explain analyze, and a 1% overhead
   would be entirely tolerable, especially if it affected the process
   pretty much evenly, unlike the per-gettimeofday overhead, which can
   get up as high as 100% on some types of subplans and is negligible on
   others.

   And more to the point, Postgres wouldn't have to endure this overhead
   at all when it's not needed, whereas glibc has no idea when you're
   going to need gettimeofday next.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
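Point (a) can be illustrated with a timer that only ever takes differences of a monotonic clock, so it never needs to know the time of day. The sketch below is in Python, where `time.monotonic_ns()` is documented as unaffected by system clock updates; the `ElapsedTimer` class is a hypothetical name, and in C the nearest standard equivalent would be `clock_gettime()` with `CLOCK_MONOTONIC`. It says nothing about cost, only about which clock is being read.

```python
import time

class ElapsedTimer:
    """Accumulate elapsed time from a monotonic clock (hypothetical helper).

    Only differences between readings are used, so the origin of the
    clock is irrelevant and resetting the wall clock does not matter.
    """

    def __init__(self):
        self.total_ns = 0
        self._start = None

    def start(self):
        self._start = time.monotonic_ns()

    def stop(self):
        # Add the interval since start(); the absolute value is never used.
        self.total_ns += time.monotonic_ns() - self._start
        self._start = None

timer = ElapsedTimer()
timer.start()
time.sleep(0.01)          # stand-in for the work being measured
timer.stop()
print(f"elapsed: {timer.total_ns / 1e6:.3f} ms")
```

A wall-clock reading such as `time.time()` (or gettimeofday in C) could produce a negative or wildly wrong interval in the same code if the system clock were stepped between `start()` and `stop()`.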