This is a discussion on Overhauling GUCS within the pgsql Hackers forums, part of the PostgreSQL category; --> Magnus and I had the fortune to share a plane flight when leaving pgCon. This gave us a chance ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Magnus and I had the fortune to share a plane flight when leaving pgCon. This gave us a chance to discuss my ideas around reforming PostgreSQL configuration at some length. Here's what we came up with (Magnus will presumably correct the parts I got wrong): Currently, PostgreSQL,conf and our set of GUC configurations suffer from 4 large problems: 1. Most people have no idea how to set these. 2. The current postgresql.conf file is a huge mess of 194 options, the vast majority of which most users will never touch. 3. GUCS lists are kept in 3 different places (guc.c, postgresql.conf, and the settings.sgml), which are only synched with each other manually. 4. We don't seem to be getting any closer to autotuning. We think all of these are solvable for 8.4. We also think that it's vitally important than any overhaul of the GUCS to be completed in one version to minimize the pain involved. Here's a list of the things we want to change. It's all a package and should make sense if you take all the changes as a whole. 1) Add several pieces of extra information to guc.c in the form of extra "gettext" commands: default value, subcategory, long description, recommendations, enum lists. 2) Incorporate this data into pg_settings 3) Delete postgresql.conf.sample 4) Add a script called pg_generate_conf to generate a postgresql.conf based on guc.c and command-line switches (rather than postgresql.conf.sample), which would be called automatically by initdb. For (4), pg_generate_conf would take the following switches: -b , --basic = short conf file, listing only the 15-18 most commonly changed options -a , --advanced = conf file listing all 196+ options -t, --terse = conf file lists only category headings and actual settings, no comments -n, --normal = conf file has category and subcategory settings, with short desc comments -v, --verbose = conf file lists full descriptions and recommendations in comments with each option -c "option = value" set specific option to specific value in the file -f "filename" = take options and values from file "filename" The default would be "-b, -n" with specific settings for shared_buffers and wal_sync_method. The current postgresql.conf is a lot more like an "-a, -v" file. This would help us in the following ways: A. there would now only be 2 places to maintain GUCS lists, the docs and guc.c. B. by having a generated postgresql.conf and an easy way to generate it, writing autoconfiguration scripts (as well as shortcuts like SET PERSISTENT) become vastly easier. C. Most users would deal only with a limited set of 15-20 configuration variables. There's obviously some refinements needed, but what do people think of the above general idea? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers |
| |||
| On Fri, 30 May 2008, Josh Berkus wrote: > 1) Add several pieces of extra information to guc.c in the form of extra > "gettext" commands: default value, subcategory, long description, > recommendations, enum lists. > 2) Incorporate this data into pg_settings When you last brought this up in February (I started on a long reply to http://archives.postgresql.org/pgsql...2/msg00759.php that I never quite finished) the thing I got stuck on was how to deal with the way people tend to comment in these files as they change things. One problem I didn't really see addressed by the improvements you're suggesting is how to handle migrating customized settings to a new version (I'm talking about 8.4->9.0 after this is in place, 8.3->8.4 is a whole different problem). It would be nice to preserve "history" of what people did like in your examples (which look exactly like what I find myself running into in the field). Now, that will get a lot easier just by virtue of having a smaller config file, but I think that adding something into pg_settings that allows saving user-added commentary would be a nice step toward some useful standardization on that side of things. It would make future automated tools aimed at parsing and generating new files, as part of things like version upgrades, a lot easier if there was a standard way such comments were handled in addition to the raw data itself. The other thing I'd like to see make its way into pg_settings, so that tools can operate on it just by querying the database, is noting what file the setting came from so that you can track things like include file usage. I think with those two additions (comments and source file tracking) it would even be concievable to clone a working facsimile of even a complicated postgresql.conf file set remotely just by reading pg_settings. While a bit outside of the part you're specifically aiming to improve here, if you could slip these two additions in I think it would be a boon to future writers of multi-server management tools as well. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers |
| |||
| On Fri, 2008-05-30 at 17:34 -0700, Josh Berkus wrote: > There's obviously some refinements needed, but what do people think of > the above general idea? My initial reaction is that this is too much change, though I agree with many of the points and understand it is well intended. We have many supporters, though 90% of them seldom touch the database from one release to the next. Many are dismayed that every time they do we've fiddled with the knobs so some don't work anymore, some are missing or renamed and there's a few new ones. So if they don't know what they're doing it is frequently because we keep moving the goalposts. We should also consider that most people with multiple databases are running multiple versions of PostgreSQL. The main reason for that is that we keep changing the behaviour of SQL between releases, so even if you had a magic superfast upgrade utility, in perhaps 50% of cases they won't use it because they have to do a full application re-test, which costs time and money. Trying to be a Postgres DBA is hard when each new release is configured differently to the last one and we have a major release about 3-5 more frequently than Oracle/SQLServer. That is probably largest source of questions and annoyance from the students on the courses I run. So my viewpoint is that we should be aggressively adding new features, yet be conservative in changing existing behaviour: provide options for behaves-like-previous-release and keep the administrative interfaces as similar as possible between releases. If you wish to make changes, I would suggest that you simply reorder some of the parameters in the .conf file, so that the ones you think are important are grouped at the top. Another suggestion would be to allow a #include style interface within postgresql.conf. We can then do this: #include standard_postgresql.conf # now we put Josh's favourite GUCs as overrides on the above .... That keeps things simple because the standard_postgresql.conf looks almost identical to what people are used to. It also provides a new feature, allowing people to include site-wide defaults for various settings to allow easy implementation of local policy. It also demonstrates a best practice approach to managing GUCs. Some other problems I see with GUCs * It's not possible to set one parameter depending upon the setting of another. * It's always unclear which GUCs can be changed, and when. That is much more infrequently understood than the meaning of them. * We should rename effective_cache_size to something that doesn't sound like it does what shared_buffers does * There is no config verification utility, so if you make a change and then try to restart and it won't, you are in trouble. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers |
| |||
| Simon Riggs wrote: > On Fri, 2008-05-30 at 17:34 -0700, Josh Berkus wrote: > We have many supporters, though 90% of them seldom touch the database > from one release to the next. Many are dismayed that every time they do > we've fiddled with the knobs so some don't work anymore, some are > missing or renamed and there's a few new ones. I certainly agree we don't want to make it more difficult. > > So if they don't know > what they're doing it is frequently because we keep moving the > goalposts. No, its because they don't read the docs. We are not talking about dumping SQL here needed cleanup for fear that some user can't be bothered to read the documentation. > We should also consider that most people with multiple databases are > running multiple versions of PostgreSQL. The main reason for that is > that we keep changing the behaviour of SQL between releases, so even if > you had a magic superfast upgrade utility, in perhaps 50% of cases they > won't use it because they have to do a full application re-test, which > costs time and money. This could certainly be a problem. > > Trying to be a Postgres DBA is hard when each new release is configured > differently to the last one and we have a major release about 3-5 more > frequently than Oracle/SQLServer. That is probably largest source of > questions and annoyance from the students on the courses I run. That is the nature of the beast though. We are still learning, improving, engineering. It's part of progress of the database. I would suggest that your students (which is what I tell mine) not upgrade every release but instead try hanging out for 3 years per release. > > So my viewpoint is that we should be aggressively adding new features, > yet be conservative in changing existing behaviour: provide options for > behaves-like-previous-release and keep the administrative interfaces as > similar as possible between releases. > I agree with this in principle but not on this argument. > If you wish to make changes, I would suggest that you simply reorder > some of the parameters in the .conf file, so that the ones you think are > important are grouped at the top. IMO, the only settings that should be in the postgresql.conf are the ones that require a restart. The rest should be gathered from pg_settings. > > Another suggestion would be to allow a #include style interface within > postgresql.conf. We can then do this: > > #include standard_postgresql.conf > > # now we put Josh's favourite GUCs as overrides on the above > ... I think that could get very messy but Apache allows this. It isn't exactly a new idea and I wouldn't fight against it. > Some other problems I see with GUCs > > * It's not possible to set one parameter depending upon the setting of > another. That is getting a bit more complicated than I think we need. Unless I am misunderstanding what you are saying. > > * It's always unclear which GUCs can be changed, and when. That is much > more infrequently understood than the meaning of them. > This is a definite problem. Even the docs are a bit difficult on this one. It should be a nice simple grid in the conf if requires a restart. Hmm, SET effective_cache_size = '5000MB'; WARNING: You must issue a reload to the database for this to take effect. > * We should rename effective_cache_size to something that doesn't sound > like it does what shared_buffers does Hmmm, I wonder if it is not the other way around. Although I know that one thing I run into is that a lot of people think effective_cache is an allocation. shared_buffer_alloc? shared_mem? > > * There is no config verification utility, so if you make a change and > then try to restart and it won't, you are in trouble. > +1 +1 Sincerely, Joshua D. Drake -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers |
| |||
| -----BEGIN PGP SIGNED MESSAGE----- Hash: RIPEMD160 > Many are dismayed that every time they do we've fiddled with the > knobs so some don't work anymore, some are missing or renamed and > there's a few new ones. .... > * We should rename effective_cache_size *raises eyebrow* - -- Greg Sabino Mullane greg@turnstep.com PGP Key: 0x14964AC8 200805311114 http://biglumber.com/x/web?pk=2529DF...9B906714964AC8 -----BEGIN PGP SIGNATURE----- iEYEAREDAAYFAkhBa3UACgkQvJuQZxSWSsg6pACgu2EJOyzj+i k79QwzdTTfi8Qj aW0AoNayz4sniFVFxnQFgnxh7dPB6QOD =Uyap -----END PGP SIGNATURE----- -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers |
| |||
| Josh Berkus <josh@agliodbs.com> writes: > Currently, PostgreSQL,conf and our set of GUC configurations suffer from > 4 large problems: > 1. Most people have no idea how to set these. > 2. The current postgresql.conf file is a huge mess of 194 options, the > vast majority of which most users will never touch. > 3. GUCS lists are kept in 3 different places (guc.c, postgresql.conf, > and the settings.sgml), which are only synched with each other manually. > 4. We don't seem to be getting any closer to autotuning. The proposal doesn't actually solve any of those problems. > Here's a list of the things we want to change. It's all a package and > should make sense if you take all the changes as a whole. > 1) Add several pieces of extra information to guc.c in the form of extra > "gettext" commands: default value, subcategory, long description, > recommendations, enum lists. > 2) Incorporate this data into pg_settings > 3) Delete postgresql.conf.sample > 4) Add a script called pg_generate_conf to generate a postgresql.conf > based on guc.c and command-line switches (rather than > postgresql.conf.sample), which would be called automatically by initdb. I disagree with doing any of this. It doesn't result in any useful reduction in maintenance effort, and what it does do is make it impossible to keep control over the detailed layout, formatting, commenting etc in a sample postgresql.conf. Nor do I think that "generate a whole config file from scratch" is going to be a useful behavior for tuning problems --- how will you merge it with what you had before? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers |
| |||
| On May 31, 2008, at 09:23, Tom Lane wrote: >> 1. Most people have no idea how to set these. >> 2. The current postgresql.conf file is a huge mess of 194 options, >> the >> vast majority of which most users will never touch. >> 3. GUCS lists are kept in 3 different places (guc.c, postgresql.conf, >> and the settings.sgml), which are only synched with each other >> manually. >> 4. We don't seem to be getting any closer to autotuning. > > The proposal doesn't actually solve any of those problems. It solves #2 at least. > I disagree with doing any of this. It doesn't result in any useful > reduction in maintenance effort, and what it does do is make it > impossible to keep control over the detailed layout, formatting, > commenting etc in a sample postgresql.conf. Nor do I think that > "generate a whole config file from scratch" is going to be a useful > behavior for tuning problems --- how will you merge it with what > you had before? I'd love to see these issues resolved. The current postgresql.conf is way over the top. Might you have a better idea? Thanks, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers |
| |||
| Simon, Tom, Greg, > Now, that will get a lot easier just by > virtue of having a smaller config file, but I think that adding something > into pg_settings that allows saving user-added commentary would be a nice > step toward some useful standardization on that side of things. It would > make future automated tools aimed at parsing and generating new files, as > part of things like version upgrades, a lot easier if there was a standard > way such comments were handled in addition to the raw data itself. Hmmm. What about a COMMENT ON SETTING? That seems like it would serve the purpose ... and make preserving user comments easier than the current file-conversion approach. > The other thing I'd like to see make its way into pg_settings, so that > tools can operate on it just by querying the database, is noting what file > the setting came from so that you can track things like include file > usage. I think with those two additions (comments and source file > tracking) it would even be concievable to clone a working facsimile of > even a complicated postgresql.conf file set remotely just by reading > pg_settings. Hmmm. Any idea how to do this? It sounds like a good idea to me. > So my viewpoint is that we should be aggressively adding new features, > yet be conservative in changing existing behaviour: provide options for > behaves-like-previous-release and keep the administrative interfaces as > similar as possible between releases. It's my viewpoint based on a lot of user feedback that the current postgresql.conf is fundamentally broken and a major roadblock to PostgreSQL adoption. This was a point with which there was pretty much universal agreement when I talked with people at pgCon. That is, I beleive that you're arguing for keeping .conf stable for the 5% of users who understand it and ignoring the 95% of users who are baffled by it. At this point, I think we are the only major SQL database without some form of basic autotuning for the most significant settings (certainly MySQL, Oracle, and SQL Server have it); we've been able to coast through that because autotuning features are new in other DBs, but it's going to start to hurt us pretty soon. Now, I do believe that we should plan any GUCS overhaul to happen in one postgresql version rather than phasing it in over multiple versions so that administrators only need to get used to a new way once. > > Currently, PostgreSQL,conf and our set of GUC configurations suffer from > > 4 large problems: > > > > 1. Most people have no idea how to set these. > > 2. The current postgresql.conf file is a huge mess of 194 options, the > > vast majority of which most users will never touch. > > 3. GUCS lists are kept in 3 different places (guc.c, postgresql.conf, > > and the settings.sgml), which are only synched with each other manually. > > 4. We don't seem to be getting any closer to autotuning. > > The proposal doesn't actually solve any of those problems. OK, let me draw some lines: 1 & 2) by not having the settings be defined in a 500-line file, new users would no longer be baffled by scores of settings which probably don't concern them, trying to find the handful of settings which do. 3) We'd consolidate the GUC lists down from 3 places to 2, which is one less area to synchronize. Magnus and I looked to see if it might be possible to generate the docs from the same list, but it's not practical. 4) By shifting from a model where postgresql.conf is document-formatted and hand-edited to one where it's machine generated, it becomes vastly easier to write simple utilities to manage these settings. Right now, the big "obstacle" to things like SET PERSISTENT is "how to we preseve the hand-edited comments in the file" -- and the answer is we *don't.* > I disagree with doing any of this. It doesn't result in any useful > reduction in maintenance effort, and what it does do is make it > impossible to keep control over the detailed layout, formatting, > commenting etc in a sample postgresql.conf. Have you *looked* at postgresql.conf.sample lately, Tom? It's a disaster. Maintenance is already difficult, and becoming more so as we add settings. Further, you and Simon seem to think that the current "narrative docs inside the conf file" format has some kind of value which makes things easier for DBAs. I don't believe it does, and I have yet to meet a *single* new PostgreSQL user who wasn't confused and intimidated by the file. > Nor do I think that > "generate a whole config file from scratch" is going to be a useful > behavior for tuning problems --- how will you merge it with what > you had before? Who's merging? I don't think you get the proposal. The whole "hand-edited" approach to postgresql.conf should go away. It's not useful, it's not educational, and it doesn't make PostgreSQL easy to manage. Further, the lack of an easy way to manage settings via port access to PostgreSQL is a significant inhibitor to adopting PostgreSQL in environments with large numbers of servers. See prior arguments by the CPANEL folks about why they don't support PostgreSQL, which is in turn a major reason why PostgreSQL web hosting is hard to find. I agree that editing the data about settings in the guc.c file is not ideal; Magnus and I discussed that mainly because we wanted to preserve the translation framework with gettext strings. If someone can think of a better way to do this part, I'm all ears. -- Josh Berkus PostgreSQL @ Sun San Francisco -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers |
| |||
| "David E. Wheeler" <david@kineticode.com> writes: > I'd love to see these issues resolved. The current postgresql.conf is way over > the top. Might you have a better idea? I don't think fiddling with surface issues like the formatting of the postgresql.conf is productive. Hiding parameters because you don't think beginners need them is only going to frustrate those people who do need to adjust them. What might be productive is picking up a group of parameters and thinking hard about what they mean in terms of user-visible real-world effects. If they can be recast in terms of behaviour the user wants instead of internal implementation details then that would translate into a massive simplification as well as being easier to explain to users. I think we do a pretty good job of this already. Witness things like effective_cache_size -- imagine if this were "nested_loop_cache_hit_rate" for example, good luck figuring out what to set it to. The vacuum cost delay factors are probably ripe for such a recast though. I think we need just one parameter "vacuum_io_bandwidth" or something like that. The bgwriter parameters might also be a candidate but I'm less certain. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support! -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers |
| ||||
| Josh Berkus wrote: > Currently, PostgreSQL,conf and our set of GUC configurations suffer from > 4 large problems: As we have talked about it before, you know that I agree that the GUC system could use some improvements. But I'm a bit surprised about some of your assessments. > 1. Most people have no idea how to set these. Could you clarify this? I can't really believe that people are incapable of editing a configuration file. > 2. The current postgresql.conf file is a huge mess of 194 options, the > vast majority of which most users will never touch. My opinion has always been that we should provide a default file with only the essential options instead of all of them. I see this as a the major problem, because people are overwhelmed and consequently don't set anything. > 3. GUCS lists are kept in 3 different places (guc.c, postgresql.conf, > and the settings.sgml), which are only synched with each other manually. While this is not ideal, I can't really see who this is a major problem, at least from the perspective of the user. > 4. We don't seem to be getting any closer to autotuning. True. But how does your proposal address this? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers |