This is a discussion on Re: stats collector process high CPU utilization within the Pgsql Performance forums, part of the PostgreSQL category; --> Benjamin Minshall <minshall@intellicon.biz> writes: > Since upgrading to 8.2.3 yesterday, the stats collector process has had > very high ...
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Benjamin Minshall <minshall@intellicon.biz> writes: > Since upgrading to 8.2.3 yesterday, the stats collector process has had > very high CPU utilization; it is consuming roughly 80-90% of one CPU. > The server seems a lot more sluggish than it was before. Is this normal > operation for 8.2 or something I should look into correcting? What version did you update from, and what platform is this? regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings |
| |||
| Benjamin Minshall <minshall@intellicon.biz> writes: > Tom Lane wrote: >> Benjamin Minshall <minshall@intellicon.biz> writes: >>> Since upgrading to 8.2.3 yesterday, the stats collector process has had >>> very high CPU utilization; it is consuming roughly 80-90% of one CPU. >>> The server seems a lot more sluggish than it was before. Is this normal >>> operation for 8.2 or something I should look into correcting? >> What version did you update from, and what platform is this? > I upgraded from 8.1.5. The system is a dual Xeon 2.4Ghz, 4Gb RAM > running linux kernel 2.6 series. OK, I was trying to correlate it with post-8.2.0 patches but evidently that's the wrong tree to bark up. No, this isn't an expected behavior. Is there anything unusual about your database (huge numbers of tables, or some such)? Can you gather some info about what it's doing? strace'ing the stats collector might prove interesting, also if you have built it with --enable-debug then oprofile results would be helpful. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq |
| |||
| Benjamin Minshall <minshall@intellicon.biz> writes: > Tom Lane wrote: >> Can you gather some info about what it's doing? >> strace'ing the stats collector might prove interesting, also if you have >> built it with --enable-debug then oprofile results would be helpful. > I will gather some strace info later today when I have a chance to > shutdown the server. I don't see why you'd need to shut anything down. Just run strace -p stats-process-ID for a few seconds or minutes (enough to gather maybe a few thousand lines of output). regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate |
| |||
| Tom Lane wrote: > Benjamin Minshall <minshall@intellicon.biz> writes: >> Tom Lane wrote: >>> Can you gather some info about what it's doing? >>> strace'ing the stats collector might prove interesting, also if you have >>> built it with --enable-debug then oprofile results would be helpful. > >> I will gather some strace info later today when I have a chance to >> shutdown the server. > > I don't see why you'd need to shut anything down. Just run > strace -p stats-process-ID > for a few seconds or minutes (enough to gather maybe a few thousand > lines of output). > Seems the problem may be related to a huge global/pgstat.stat file. Under 8.1.5 it was about 1 MB; now it's 90 MB in 8.2.3. I ran strace for 60 seconds: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 95.71 1.119004 48652 23 rename 4.29 0.050128 0 508599 write 0.00 0.000019 0 249 22 poll 0.00 0.000000 0 23 open 0.00 0.000000 0 23 close 0.00 0.000000 0 34 getppid 0.00 0.000000 0 23 munmap 0.00 0.000000 0 23 setitimer 0.00 0.000000 0 23 22 sigreturn 0.00 0.000000 0 23 mmap2 0.00 0.000000 0 23 fstat64 0.00 0.000000 0 216 recv ------ ----------- ----------- --------- --------- ---------------- 100.00 1.169151 509282 44 total I attached an excerpt of the full strace with the many thousands of write calls filtered. -- Benjamin Minshall <minshall@intellicon.biz> Senior Developer -- Intellicon, Inc. http://www.intellicon.biz .. .. .. write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0T \274\355"..., 4096) = 4096 write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0/O\227\27\230"..., 4096) = 4096 write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\ 0\0\0"..., 4096) = 4096 write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\ 0\0\0"..., 3363) = 3363 close(3) = 0 munmap(0xacdfc000, 4096) = 0 rename("global/pgstat.tmp", "global/pgstat.stat") = 0 poll([{fd=5, events=POLLIN|POLLERR, revents=POLLIN}], 1, 2000) = 1 recv(5, "\1\0\0\0\320\3\0\0:\204\30\0\16\0\0\0\1\0\0\0\0\0 \0\0\301"..., 1000, 0) = 976 setitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={0, 500}}, NULL) = 0 poll([{fd=5, events=POLLIN|POLLERR, revents=POLLIN}], 1, 2000) = 1 recv(5, "\1\0\0\0\320\3\0\0:\204\30\0\16\0\0\0\0\0\0\0\0\0 \0\0\322"..., 1000, 0) = 976 poll([{fd=5, events=POLLIN|POLLERR, revents=POLLIN}], 1, 2000) = 1 recv(5, "\1\0\0\0H\3\0\0:\204\30\0\f\0\0\0\1\0\0\0\0\0\0\0 \301\204"..., 1000, 0) = 840 poll([{fd=5, events=POLLIN|POLLERR, revents=POLLIN}], 1, 2000) = 1 recv(5, "\1\0\0\0\320\3\0\0:\204\30\0\16\0\0\0\1\0\0\0\0\0 \0\0\301"..., 1000, 0) = 976 poll([{fd=5, events=POLLIN|POLLERR, revents=POLLIN}], 1, 2000) = 1 recv(5, "\1\0\0\0\320\3\0\0:\204\30\0\16\0\0\0\0\0\0\0\0\0 \0\000"..., 1000, 0) = 976 poll([{fd=5, events=POLLIN|POLLERR, revents=POLLIN}], 1, 2000) = 1 recv(5, "\1\0\0\0\240\0\0\0:\204\30\0\2\0\0\0\0\0\0\0\0\0\ 0\0\303"..., 1000, 0) = 160 poll([{fd=5, events=POLLIN|POLLERR}], 1, 2000) = -1 EINTR (Interrupted system call) --- SIGALRM (Alarm clock) @ 0 (0) --- sigreturn() = ? (mask now []) getppid() = 22447 open("global/pgstat.tmp", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 3 fstat64(3, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xacdfc000 write(3, "\226\274\245\1D\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0 \0\0\0"..., 4096) = 4096 write(3, "Z\n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0 \0\0\0"..., 4096) = 4096 write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\ 0\0\0"..., 4096) = 4096 .. .. .. |
| |||
| Benjamin Minshall <minshall@intellicon.biz> writes: > Seems the problem may be related to a huge global/pgstat.stat file. > Under 8.1.5 it was about 1 MB; now it's 90 MB in 8.2.3. Yoi. We didn't do anything that would bloat that file if it were storing the same information as before. What I'm betting is that it's storing info on a whole lot more tables than before. Did you decide to start running autovacuum when you updated to 8.2? How many tables are visible in the pg_stats views? regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings |
| |||
| Benjamin Minshall <minshall@intellicon.biz> writes: > Tom Lane wrote: >> How many tables are visible in the pg_stats views? > There are about 15 databases in the cluster each with around 90 tables. > A count of pg_stats yields between 500 and 800 rows in each database. Sorry, I was imprecise. The view "pg_stats" doesn't have anything to do with the stats collector; what I was interested in was the contents of the "pg_stat_xxx" and "pg_statio_xxx" views. It'd be enough to check pg_stat_all_indexes and pg_stat_all_tables, probably. Also, do you have the 8.1 installation still available to get the comparable counts there? regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| |||
| > Benjamin Minshall <minshall@intellicon.biz> writes: >> Tom Lane wrote: >>> How many tables are visible in the pg_stats views? > >> There are about 15 databases in the cluster each with around 90 tables. >> A count of pg_stats yields between 500 and 800 rows in each database. > > Sorry, I was imprecise. The view "pg_stats" doesn't have anything to do > with the stats collector; what I was interested in was the contents of > the "pg_stat_xxx" and "pg_statio_xxx" views. It'd be enough to check > pg_stat_all_indexes and pg_stat_all_tables, probably. Also, do you have > the 8.1 installation still available to get the comparable counts there? > I checked all 15 databases on both 8.1 and 8.2; they were all quite consistent: pg_stat_all_indexes has about 315 rows per database pg_stat_all_tables has about 260 rows per database The pg_statio_* views match in count to the pg_stat_* views as well. While exploring this problem, I've noticed that one of the frequent insert processes creates a few temporary tables to do post-processing. Is it possible that the stats collector is getting bloated with stats from these short-lived temporary tables? During periods of high activity it could be creating temporary tables as often as two per second. ---------------------------(end of broadcast)--------------------------- TIP 2: Don't 'kill -9' the postmaster |
| |||
| minshall@intellicon.biz writes: > While exploring this problem, I've noticed that one of the frequent insert > processes creates a few temporary tables to do post-processing. Is it > possible that the stats collector is getting bloated with stats from these > short-lived temporary tables? During periods of high activity it could be > creating temporary tables as often as two per second. Hmmm ... that's an interesting point, but offhand I don't see why it'd cause more of a problem in 8.2 than 8.1. Alvaro, any thoughts? regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |
| |||
| Tom Lane wrote: > minshall@intellicon.biz writes: > > While exploring this problem, I've noticed that one of the frequent insert > > processes creates a few temporary tables to do post-processing. Is it > > possible that the stats collector is getting bloated with stats from these > > short-lived temporary tables? During periods of high activity it could be > > creating temporary tables as often as two per second. > > Hmmm ... that's an interesting point, but offhand I don't see why it'd > cause more of a problem in 8.2 than 8.1. Alvaro, any thoughts? No idea. I do have a very crude piece of code to read a pgstat.stat file and output some info about what it finds (table OIDs basically IIRC). Maybe it can be helpful to examine what's in the bloated stat file. Regarding temp tables, I'd think that the pgstat entries should be getting dropped at some point in both releases. Maybe there's a bug preventing that in 8.2? -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |
| ||||
| Alvaro Herrera <alvherre@commandprompt.com> writes: > Regarding temp tables, I'd think that the pgstat entries should be > getting dropped at some point in both releases. Maybe there's a bug > preventing that in 8.2? Hmmm ... I did rewrite the backend-side code for that just recently for performance reasons ... could I have broken it? Anyone want to take a second look at http://archives.postgresql.org/pgsql...1/msg00171.php regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 2: Don't 'kill -9' the postmaster |