Unix Technical Forum

Overhauling GUCS

This is a discussion on Overhauling GUCS within the pgsql Hackers forums, part of the PostgreSQL category; --> Magnus and I had the fortune to share a plane flight when leaving pgCon. This gave us a chance ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > pgsql Hackers

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 06-02-2008, 12:36 PM
Josh Berkus
 
Posts: n/a
Default Overhauling GUCS

Magnus and I had the fortune to share a plane flight when leaving pgCon.
This gave us a chance to discuss my ideas around reforming PostgreSQL
configuration at some length. Here's what we came up with (Magnus will
presumably correct the parts I got wrong):

Currently, PostgreSQL,conf and our set of GUC configurations suffer from
4 large problems:

1. Most people have no idea how to set these.
2. The current postgresql.conf file is a huge mess of 194 options, the
vast majority of which most users will never touch.
3. GUCS lists are kept in 3 different places (guc.c, postgresql.conf,
and the settings.sgml), which are only synched with each other manually.
4. We don't seem to be getting any closer to autotuning.

We think all of these are solvable for 8.4. We also think that it's
vitally important than any overhaul of the GUCS to be completed in one
version to minimize the pain involved.

Here's a list of the things we want to change. It's all a package and
should make sense if you take all the changes as a whole.

1) Add several pieces of extra information to guc.c in the form of extra
"gettext" commands: default value, subcategory, long description,
recommendations, enum lists.
2) Incorporate this data into pg_settings
3) Delete postgresql.conf.sample
4) Add a script called pg_generate_conf to generate a postgresql.conf
based on guc.c and command-line switches (rather than
postgresql.conf.sample), which would be called automatically by initdb.

For (4), pg_generate_conf would take the following switches:
-b , --basic = short conf file, listing only the 15-18 most commonly
changed options
-a , --advanced = conf file listing all 196+ options
-t, --terse = conf file lists only category headings and actual
settings, no comments
-n, --normal = conf file has category and subcategory settings, with
short desc comments
-v, --verbose = conf file lists full descriptions and recommendations in
comments with each option
-c "option = value" set specific option to specific value in the file
-f "filename" = take options and values from file "filename"

The default would be "-b, -n" with specific settings for shared_buffers
and wal_sync_method. The current postgresql.conf is a lot more like an
"-a, -v" file.

This would help us in the following ways:

A. there would now only be 2 places to maintain GUCS lists, the docs and
guc.c.
B. by having a generated postgresql.conf and an easy way to generate it,
writing autoconfiguration scripts (as well as shortcuts like SET
PERSISTENT) become vastly easier.
C. Most users would deal only with a limited set of 15-20 configuration
variables.

There's obviously some refinements needed, but what do people think of
the above general idea?

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 06-02-2008, 12:36 PM
Greg Smith
 
Posts: n/a
Default Re: Overhauling GUCS

On Fri, 30 May 2008, Josh Berkus wrote:

> 1) Add several pieces of extra information to guc.c in the form of extra
> "gettext" commands: default value, subcategory, long description,
> recommendations, enum lists.
> 2) Incorporate this data into pg_settings


When you last brought this up in February (I started on a long reply to
http://archives.postgresql.org/pgsql...2/msg00759.php that I
never quite finished) the thing I got stuck on was how to deal with the
way people tend to comment in these files as they change things.

One problem I didn't really see addressed by the improvements you're
suggesting is how to handle migrating customized settings to a new version
(I'm talking about 8.4->9.0 after this is in place, 8.3->8.4 is a whole
different problem). It would be nice to preserve "history" of what people
did like in your examples (which look exactly like what I find myself
running into in the field). Now, that will get a lot easier just by
virtue of having a smaller config file, but I think that adding something
into pg_settings that allows saving user-added commentary would be a nice
step toward some useful standardization on that side of things. It would
make future automated tools aimed at parsing and generating new files, as
part of things like version upgrades, a lot easier if there was a standard
way such comments were handled in addition to the raw data itself.

The other thing I'd like to see make its way into pg_settings, so that
tools can operate on it just by querying the database, is noting what file
the setting came from so that you can track things like include file
usage. I think with those two additions (comments and source file
tracking) it would even be concievable to clone a working facsimile of
even a complicated postgresql.conf file set remotely just by reading
pg_settings.

While a bit outside of the part you're specifically aiming to improve
here, if you could slip these two additions in I think it would be a boon
to future writers of multi-server management tools as well.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 06-02-2008, 12:36 PM
Simon Riggs
 
Posts: n/a
Default Re: Overhauling GUCS


On Fri, 2008-05-30 at 17:34 -0700, Josh Berkus wrote:

> There's obviously some refinements needed, but what do people think of
> the above general idea?


My initial reaction is that this is too much change, though I agree with
many of the points and understand it is well intended.

We have many supporters, though 90% of them seldom touch the database
from one release to the next. Many are dismayed that every time they do
we've fiddled with the knobs so some don't work anymore, some are
missing or renamed and there's a few new ones. So if they don't know
what they're doing it is frequently because we keep moving the
goalposts.

We should also consider that most people with multiple databases are
running multiple versions of PostgreSQL. The main reason for that is
that we keep changing the behaviour of SQL between releases, so even if
you had a magic superfast upgrade utility, in perhaps 50% of cases they
won't use it because they have to do a full application re-test, which
costs time and money.

Trying to be a Postgres DBA is hard when each new release is configured
differently to the last one and we have a major release about 3-5 more
frequently than Oracle/SQLServer. That is probably largest source of
questions and annoyance from the students on the courses I run.

So my viewpoint is that we should be aggressively adding new features,
yet be conservative in changing existing behaviour: provide options for
behaves-like-previous-release and keep the administrative interfaces as
similar as possible between releases.

If you wish to make changes, I would suggest that you simply reorder
some of the parameters in the .conf file, so that the ones you think are
important are grouped at the top.

Another suggestion would be to allow a #include style interface within
postgresql.conf. We can then do this:

#include standard_postgresql.conf

# now we put Josh's favourite GUCs as overrides on the above
....

That keeps things simple because the standard_postgresql.conf looks
almost identical to what people are used to. It also provides a new
feature, allowing people to include site-wide defaults for various
settings to allow easy implementation of local policy. It also
demonstrates a best practice approach to managing GUCs.

Some other problems I see with GUCs

* It's not possible to set one parameter depending upon the setting of
another.

* It's always unclear which GUCs can be changed, and when. That is much
more infrequently understood than the meaning of them.

* We should rename effective_cache_size to something that doesn't sound
like it does what shared_buffers does

* There is no config verification utility, so if you make a change and
then try to restart and it won't, you are in trouble.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 06-02-2008, 12:36 PM
Joshua D. Drake
 
Posts: n/a
Default Re: Overhauling GUCS

Simon Riggs wrote:
> On Fri, 2008-05-30 at 17:34 -0700, Josh Berkus wrote:


> We have many supporters, though 90% of them seldom touch the database
> from one release to the next. Many are dismayed that every time they do
> we've fiddled with the knobs so some don't work anymore, some are
> missing or renamed and there's a few new ones.


I certainly agree we don't want to make it more difficult.

>
> So if they don't know
> what they're doing it is frequently because we keep moving the
> goalposts.


No, its because they don't read the docs. We are not talking about
dumping SQL here I don't feel it is correct to not perform such a
needed cleanup for fear that some user can't be bothered to read the
documentation.

> We should also consider that most people with multiple databases are
> running multiple versions of PostgreSQL. The main reason for that is
> that we keep changing the behaviour of SQL between releases, so even if
> you had a magic superfast upgrade utility, in perhaps 50% of cases they
> won't use it because they have to do a full application re-test, which
> costs time and money.


This could certainly be a problem.

>
> Trying to be a Postgres DBA is hard when each new release is configured
> differently to the last one and we have a major release about 3-5 more
> frequently than Oracle/SQLServer. That is probably largest source of
> questions and annoyance from the students on the courses I run.


That is the nature of the beast though. We are still learning,
improving, engineering. It's part of progress of the database. I would
suggest that your students (which is what I tell mine) not upgrade every
release but instead try hanging out for 3 years per release.

>
> So my viewpoint is that we should be aggressively adding new features,
> yet be conservative in changing existing behaviour: provide options for
> behaves-like-previous-release and keep the administrative interfaces as
> similar as possible between releases.
>


I agree with this in principle but not on this argument.

> If you wish to make changes, I would suggest that you simply reorder
> some of the parameters in the .conf file, so that the ones you think are
> important are grouped at the top.


IMO, the only settings that should be in the postgresql.conf are the
ones that require a restart. The rest should be gathered from pg_settings.

>
> Another suggestion would be to allow a #include style interface within
> postgresql.conf. We can then do this:
>
> #include standard_postgresql.conf
>
> # now we put Josh's favourite GUCs as overrides on the above
> ...


I think that could get very messy but Apache allows this. It isn't
exactly a new idea and I wouldn't fight against it.

> Some other problems I see with GUCs
>
> * It's not possible to set one parameter depending upon the setting of
> another.


That is getting a bit more complicated than I think we need. Unless I am
misunderstanding what you are saying.

>
> * It's always unclear which GUCs can be changed, and when. That is much
> more infrequently understood than the meaning of them.
>


This is a definite problem. Even the docs are a bit difficult on this
one. It should be a nice simple grid or like I said above, only goes
in the conf if requires a restart.

Hmm,

SET effective_cache_size = '5000MB';
WARNING: You must issue a reload to the database for this to take effect.

> * We should rename effective_cache_size to something that doesn't sound
> like it does what shared_buffers does


Hmmm, I wonder if it is not the other way around. Although I know that
one thing I run into is that a lot of people think effective_cache is an
allocation.

shared_buffer_alloc?
shared_mem?

>
> * There is no config verification utility, so if you make a change and
> then try to restart and it won't, you are in trouble.
>


+1 +1

Sincerely,

Joshua D. Drake


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 06-02-2008, 12:36 PM
Greg Sabino Mullane
 
Posts: n/a
Default Re: Overhauling GUCS


-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160


> Many are dismayed that every time they do we've fiddled with the
> knobs so some don't work anymore, some are missing or renamed and
> there's a few new ones.

....
> * We should rename effective_cache_size


*raises eyebrow*

- --
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 200805311114
http://biglumber.com/x/web?pk=2529DF...9B906714964AC8

-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAkhBa3UACgkQvJuQZxSWSsg6pACgu2EJOyzj+i k79QwzdTTfi8Qj
aW0AoNayz4sniFVFxnQFgnxh7dPB6QOD
=Uyap
-----END PGP SIGNATURE-----


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 06-02-2008, 12:36 PM
Tom Lane
 
Posts: n/a
Default Re: Overhauling GUCS

Josh Berkus <josh@agliodbs.com> writes:
> Currently, PostgreSQL,conf and our set of GUC configurations suffer from
> 4 large problems:


> 1. Most people have no idea how to set these.
> 2. The current postgresql.conf file is a huge mess of 194 options, the
> vast majority of which most users will never touch.
> 3. GUCS lists are kept in 3 different places (guc.c, postgresql.conf,
> and the settings.sgml), which are only synched with each other manually.
> 4. We don't seem to be getting any closer to autotuning.


The proposal doesn't actually solve any of those problems.

> Here's a list of the things we want to change. It's all a package and
> should make sense if you take all the changes as a whole.


> 1) Add several pieces of extra information to guc.c in the form of extra
> "gettext" commands: default value, subcategory, long description,
> recommendations, enum lists.
> 2) Incorporate this data into pg_settings
> 3) Delete postgresql.conf.sample
> 4) Add a script called pg_generate_conf to generate a postgresql.conf
> based on guc.c and command-line switches (rather than
> postgresql.conf.sample), which would be called automatically by initdb.


I disagree with doing any of this. It doesn't result in any useful
reduction in maintenance effort, and what it does do is make it
impossible to keep control over the detailed layout, formatting,
commenting etc in a sample postgresql.conf. Nor do I think that
"generate a whole config file from scratch" is going to be a useful
behavior for tuning problems --- how will you merge it with what
you had before?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #7 (permalink)  
Old 06-02-2008, 12:36 PM
David E. Wheeler
 
Posts: n/a
Default Re: Overhauling GUCS

On May 31, 2008, at 09:23, Tom Lane wrote:

>> 1. Most people have no idea how to set these.
>> 2. The current postgresql.conf file is a huge mess of 194 options,
>> the
>> vast majority of which most users will never touch.
>> 3. GUCS lists are kept in 3 different places (guc.c, postgresql.conf,
>> and the settings.sgml), which are only synched with each other
>> manually.
>> 4. We don't seem to be getting any closer to autotuning.

>
> The proposal doesn't actually solve any of those problems.


It solves #2 at least.

> I disagree with doing any of this. It doesn't result in any useful
> reduction in maintenance effort, and what it does do is make it
> impossible to keep control over the detailed layout, formatting,
> commenting etc in a sample postgresql.conf. Nor do I think that
> "generate a whole config file from scratch" is going to be a useful
> behavior for tuning problems --- how will you merge it with what
> you had before?


I'd love to see these issues resolved. The current postgresql.conf is
way over the top. Might you have a better idea?

Thanks,

David


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #8 (permalink)  
Old 06-02-2008, 12:36 PM
Josh Berkus
 
Posts: n/a
Default Re: Overhauling GUCS

Simon, Tom, Greg,

> Now, that will get a lot easier just by
> virtue of having a smaller config file, but I think that adding something
> into pg_settings that allows saving user-added commentary would be a nice
> step toward some useful standardization on that side of things. It would
> make future automated tools aimed at parsing and generating new files, as
> part of things like version upgrades, a lot easier if there was a standard
> way such comments were handled in addition to the raw data itself.


Hmmm. What about a COMMENT ON SETTING? That seems like it would serve the
purpose ... and make preserving user comments easier than the current
file-conversion approach.

> The other thing I'd like to see make its way into pg_settings, so that
> tools can operate on it just by querying the database, is noting what file
> the setting came from so that you can track things like include file
> usage. I think with those two additions (comments and source file
> tracking) it would even be concievable to clone a working facsimile of
> even a complicated postgresql.conf file set remotely just by reading
> pg_settings.


Hmmm. Any idea how to do this? It sounds like a good idea to me.

> So my viewpoint is that we should be aggressively adding new features,
> yet be conservative in changing existing behaviour: provide options for
> behaves-like-previous-release and keep the administrative interfaces as
> similar as possible between releases.


It's my viewpoint based on a lot of user feedback that the current
postgresql.conf is fundamentally broken and a major roadblock to PostgreSQL
adoption. This was a point with which there was pretty much universal
agreement when I talked with people at pgCon. That is, I beleive that you're
arguing for keeping .conf stable for the 5% of users who understand it and
ignoring the 95% of users who are baffled by it.

At this point, I think we are the only major SQL database without some form of
basic autotuning for the most significant settings (certainly MySQL, Oracle,
and SQL Server have it); we've been able to coast through that because
autotuning features are new in other DBs, but it's going to start to hurt us
pretty soon.

Now, I do believe that we should plan any GUCS overhaul to happen in one
postgresql version rather than phasing it in over multiple versions so that
administrators only need to get used to a new way once.

> > Currently, PostgreSQL,conf and our set of GUC configurations suffer from
> > 4 large problems:
> >
> > 1. Most people have no idea how to set these.
> > 2. The current postgresql.conf file is a huge mess of 194 options, the
> > vast majority of which most users will never touch.
> > 3. GUCS lists are kept in 3 different places (guc.c, postgresql.conf,
> > and the settings.sgml), which are only synched with each other manually.
> > 4. We don't seem to be getting any closer to autotuning.

>
> The proposal doesn't actually solve any of those problems.


OK, let me draw some lines:
1 & 2) by not having the settings be defined in a 500-line file, new users
would no longer be baffled by scores of settings which probably don't concern
them, trying to find the handful of settings which do.

3) We'd consolidate the GUC lists down from 3 places to 2, which is one less
area to synchronize. Magnus and I looked to see if it might be possible to
generate the docs from the same list, but it's not practical.

4) By shifting from a model where postgresql.conf is document-formatted and
hand-edited to one where it's machine generated, it becomes vastly easier to
write simple utilities to manage these settings. Right now, the big
"obstacle" to things like SET PERSISTENT is "how to we preseve the
hand-edited comments in the file" -- and the answer is we *don't.*

> I disagree with doing any of this. It doesn't result in any useful
> reduction in maintenance effort, and what it does do is make it
> impossible to keep control over the detailed layout, formatting,
> commenting etc in a sample postgresql.conf.


Have you *looked* at postgresql.conf.sample lately, Tom? It's a disaster.
Maintenance is already difficult, and becoming more so as we add settings.

Further, you and Simon seem to think that the current "narrative docs inside
the conf file" format has some kind of value which makes things easier for
DBAs. I don't believe it does, and I have yet to meet a *single* new
PostgreSQL user who wasn't confused and intimidated by the file.

> Nor do I think that
> "generate a whole config file from scratch" is going to be a useful
> behavior for tuning problems --- how will you merge it with what
> you had before?


Who's merging? I don't think you get the proposal. The whole "hand-edited"
approach to postgresql.conf should go away. It's not useful, it's not
educational, and it doesn't make PostgreSQL easy to manage.

Further, the lack of an easy way to manage settings via port access to
PostgreSQL is a significant inhibitor to adopting PostgreSQL in environments
with large numbers of servers. See prior arguments by the CPANEL folks about
why they don't support PostgreSQL, which is in turn a major reason why
PostgreSQL web hosting is hard to find.

I agree that editing the data about settings in the guc.c file is not ideal;
Magnus and I discussed that mainly because we wanted to preserve the
translation framework with gettext strings. If someone can think of a better
way to do this part, I'm all ears.

--
Josh Berkus
PostgreSQL @ Sun
San Francisco

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #9 (permalink)  
Old 06-02-2008, 12:36 PM
Gregory Stark
 
Posts: n/a
Default Re: Overhauling GUCS

"David E. Wheeler" <david@kineticode.com> writes:

> I'd love to see these issues resolved. The current postgresql.conf is way over
> the top. Might you have a better idea?


I don't think fiddling with surface issues like the formatting of the
postgresql.conf is productive. Hiding parameters because you don't think
beginners need them is only going to frustrate those people who do need to
adjust them.

What might be productive is picking up a group of parameters and thinking hard
about what they mean in terms of user-visible real-world effects. If they can
be recast in terms of behaviour the user wants instead of internal
implementation details then that would translate into a massive simplification
as well as being easier to explain to users.

I think we do a pretty good job of this already. Witness things like
effective_cache_size -- imagine if this were "nested_loop_cache_hit_rate" for
example, good luck figuring out what to set it to.

The vacuum cost delay factors are probably ripe for such a recast though. I
think we need just one parameter "vacuum_io_bandwidth" or something like that.
The bgwriter parameters might also be a candidate but I'm less certain.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's PostGIS support!

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #10 (permalink)  
Old 06-02-2008, 12:36 PM
Peter Eisentraut
 
Posts: n/a
Default Re: Overhauling GUCS

Josh Berkus wrote:
> Currently, PostgreSQL,conf and our set of GUC configurations suffer from
> 4 large problems:


As we have talked about it before, you know that I agree that the GUC system
could use some improvements. But I'm a bit surprised about some of your
assessments.

> 1. Most people have no idea how to set these.


Could you clarify this? I can't really believe that people are incapable of
editing a configuration file.

> 2. The current postgresql.conf file is a huge mess of 194 options, the
> vast majority of which most users will never touch.


My opinion has always been that we should provide a default file with only the
essential options instead of all of them. I see this as a the major problem,
because people are overwhelmed and consequently don't set anything.

> 3. GUCS lists are kept in 3 different places (guc.c, postgresql.conf,
> and the settings.sgml), which are only synched with each other manually.


While this is not ideal, I can't really see who this is a major problem, at
least from the perspective of the user.

> 4. We don't seem to be getting any closer to autotuning.


True. But how does your proposal address this?

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 11:09 PM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com