Unix Technical Forum

Re: Some ideas for comment

This is a discussion on Re: Some ideas for comment within the Pgsql Performance forums, part of the PostgreSQL category; --> > Ok, there is always a lot of talk about tuning PostgreSQL on linux and > how PostgreSQL uses ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > Pgsql Performance

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-18-2008, 12:17 PM
Merlin Moncure
 
Posts: n/a
Default Re: Some ideas for comment

> Ok, there is always a lot of talk about tuning PostgreSQL on linux and
> how PostgreSQL uses the linux kernel cache to cache the tables and
> indexes.

[...]
>
> 1. Implement a partition type layout using views and rules - This
> will allow me to have one table in each view with the "active" data,
> and the inactive data stored by year in other tables.
>
> So I would have the following (for each major table):
>
> Table View as
> select * from active_table
> union all
> select * from table_2005
> union all
> select * from table_2004
> etc.


Linux does a pretty good job of deciding what to cache. I don't think
this will help much. You can always look at partial indexes too.

> 2. I am also thinking of recommending we collapse all databases in a
> cluster into one "mega" database. I can then use schema's and views
> to control database access and ensure that no customer can see another
> customers data.


hm. keep in mind views are tightly bound to the tables they are created
with (views can't 'float' over tables in different schemas). pl/pgsql
functions can, though. This is a more efficient use of server
resources, IMO, but not a windfall.

> This would mean that there are only one set of indexes being loaded
> into the cache. While they would be larger, I think in combination
> with the partition from idea 1, we would be ahead of the ball game.
> Since there would only be one set of indexes, everyone would be
> sharing them so they should always be in memory.


I would strongly consider adding more memory .

> I don't have real numbers to give you, but we know that our systems
> are hurting i/o wise and we are growing by about 2GB+ per week (net).
> We actually grow by about 5GB/week/server. However, when I run my
> weekly maintenance of vacuum full, reindex, and the vacuum analyze, we
> end up getting about 3GB back. Unfortunately, I do not have the i/o
> bandwidth to vacuum during the day as it causes major slowdowns on our
> system. Each night, I do run a vacuum analyze across all db's to try
> and help. I also have my fsm parameters set high (8000000 fsm pages,
> and 5000 fsm relations) to try and compensate.


Generally, you can reduce data turnover for the same workload by
normalizing your database. IOW, try and make your database more
efficient in the way it stores data.

> Right now, we are still on 7.3.4. However, these ideas would be
> implemented as part of an upgrade to 8.x (plus, we'll initialize the
> new clusters with a C locale).


yes, do this!

Merlin

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-18-2008, 12:17 PM
Tom Lane
 
Posts: n/a
Default Re: Some ideas for comment

"Merlin Moncure" <merlin.moncure@rcsonline.com> writes:
>> Right now, we are still on 7.3.4. However, these ideas would be
>> implemented as part of an upgrade to 8.x (plus, we'll initialize the
>> new clusters with a C locale).


> yes, do this!


Moving from 7.3 to 8.0 is alone likely to give you a noticeable
performance boost.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-18-2008, 12:17 PM
Chris Hoover
 
Posts: n/a
Default Re: Some ideas for comment

On 8/24/05, Merlin Moncure <merlin.moncure@rcsonline.com> wrote:
> > Ok, there is always a lot of talk about tuning PostgreSQL on linux and
> > how PostgreSQL uses the linux kernel cache to cache the tables and
> > indexes.

> [...]
> >
> > 1. Implement a partition type layout using views and rules - This
> > will allow me to have one table in each view with the "active" data,
> > and the inactive data stored by year in other tables.
> >
> > So I would have the following (for each major table):
> >
> > Table View as
> > select * from active_table
> > union all
> > select * from table_2005
> > union all
> > select * from table_2004
> > etc.

>
> Linux does a pretty good job of deciding what to cache. I don't think
> this will help much. You can always look at partial indexes too.
>

Yes, but won't this help create the need to store less? If I have
1,000.000 rows in a table, but only 4,000 are active, if I move those
4 to another table and link the tables via a view, should that not
help keep the 9,996,000 rows out of the kernel cache (the majority of
the time at least)?

This would mean I have more room for other objects and hopefully less
turn over in the cache, and less disk i/o.

Yes?
[...]
> I would strongly consider adding more memory .

Unfortunately, it looks like 12GB is all our Dell servers can handle.

>
> > I don't have real numbers to give you, but we know that our systems
> > are hurting i/o wise and we are growing by about 2GB+ per week (net).
> > We actually grow by about 5GB/week/server. However, when I run my
> > weekly maintenance of vacuum full, reindex, and the vacuum analyze, we
> > end up getting about 3GB back. Unfortunately, I do not have the i/o
> > bandwidth to vacuum during the day as it causes major slowdowns on our
> > system. Each night, I do run a vacuum analyze across all db's to try
> > and help. I also have my fsm parameters set high (8000000 fsm pages,
> > and 5000 fsm relations) to try and compensate.

>
> Generally, you can reduce data turnover for the same workload by
> normalizing your database. IOW, try and make your database more
> efficient in the way it stores data.
>

That's the ultimate goal, but this database structure was developed
and released into production before I started work here. I'm trying
to slowly change it into a better db, but it is a slow process.
Normalization does not make it at the top of the priority list,
unfortunately.

> > Right now, we are still on 7.3.4. However, these ideas would be
> > implemented as part of an upgrade to 8.x (plus, we'll initialize the
> > new clusters with a C locale).
> > > 2. I am also thinking of recommending we collapse all databases in a

> > cluster into one "mega" database. I can then use schema's and views
> > to control database access and ensure that no customer can see another
> > customers data.

>
> hm. keep in mind views are tightly bound to the tables they are created
> with (views can't 'float' over tables in different schemas). pl/pgsql
> functions can, though. This is a more efficient use of server
> resources, IMO, but not a windfall.


This I know. Each schema would have to have a "custom" set of views
replacing the tables with the view programmed to only return that
customers data.

I was thinking all of the tables in schema my_tables and the views all
querying the tables stored in the my_tables schema. I would add an
identifying column to each table so that I can differentiate the data.

Chris

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-18-2008, 12:17 PM
Jens-Wolfhard Schicke
 
Posts: n/a
Default Re: Some ideas for comment

--On Mittwoch, August 24, 2005 16:26:40 -0400 Chris Hoover
<revoohc@gmail.com> wrote:

> On 8/24/05, Merlin Moncure <merlin.moncure@rcsonline.com> wrote:
>> Linux does a pretty good job of deciding what to cache. I don't think
>> this will help much. You can always look at partial indexes too.
>>

> Yes, but won't this help create the need to store less? If I have
> 1,000.000 rows in a table, but only 4,000 are active, if I move those
> 4 to another table and link the tables via a view, should that not
> help keep the 9,996,000 rows out of the kernel cache (the majority of
> the time at least)?

The kernel caches per page, not per file. It is likely linux only caches
those pages which contain active rows, as long as no statement does a
seq-scan on that table.

To optimize the thing, you could consider to cluster by some index which
sorts by the "activity" of the rows first. That way pages with active rows
are likely to contain more than only 1 active row and so the cache is
utilized better.

Cluster is rather slow however and tables need to be reclustered from time
to time.


Mit freundlichem Gruß
Jens Schicke
--
Jens Schicke j.schicke@asco.de
asco GmbH http://www.asco.de
Mittelweg 7 Tel 0531/3906-127
38106 Braunschweig Fax 0531/3906-400

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 11:06 PM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com