ECPG patch to use prepare for improved performance

Forum: PostgreSQL > Pgsql Patches
#1 William Lawrance, 04-18-2008, 11:01 AM


This patch for ECPG utilizes the "PQprepare" and "PQexecPrepared"
functions to cause SQL statements from ECPG to be cached. It does
this without requiring any changes in the user's source program.

It was developed while preparing a benchmark for a large customer.
The benchmark consists of several hundred programs containing
several thousand embedded SQL statements, and it has been executed
successfully using Oracle, DB2, and PostgreSQL. In the benchmark,
PostgreSQL is by far the slowest of the three. In a three-hour run,
this patch reduced the elapsed time by approximately 30%.

The following approach is used:

Within the "execute.c" module, routines are added to manage a cache
of prepared statements. These routines are used to search, insert,
and delete entries in the cache. The key for these cache entries is
the text of the SQL statement as passed by ECPG from the application
program.

Within the same module, the "ECPGexecute" function was replaced.
This is the function that is called to execute a statement after
some preliminary housekeeping is done. The original "ECPGexecute"
function constructs an ASCII string by replacing each host variable
with its current value and then calls "PQexec". The new
"ECPGexecute" function does the following:

- Build an array of the current values of the host variables.

- Search the cache for an entry indicating that this statement
has already been prepared via "PQprepare".

- If no entry was found in the previous step, call "PQprepare"
for the statement and then insert an entry for it into the
cache. If this requires an entry to be re-used, execute a
"DEALLOCATE PREPARE.." for the previous contents.

- At this point, the SQL statement has been prepared by libpq,
either when the statement was executed in the past or in
the previous step.

- Call "PQexecPrepared", using the array of parameters built
in the first step above.
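
The search / prepare / re-use flow described above can be sketched
roughly as follows. This is only an illustration, not the code from
the patch: the names (cache_get_or_prepare, fake_prepare,
fake_deallocate), the fixed table size, the linear search, and the
round-robin replacement are all assumptions made for the example.
The two fake_* functions stand in for the "PQprepare" call and the
"DEALLOCATE PREPARE" round trip so the sketch compiles without libpq.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CACHE_SLOTS  4    /* hypothetical size, kept tiny so re-use is easy to see */
#define MAX_SQL_LEN  256
#define MAX_NAME_LEN 32

struct stmt_cache_entry {
    char sql[MAX_SQL_LEN];    /* cache key: statement text as passed by ECPG */
    char name[MAX_NAME_LEN];  /* name the statement was prepared under */
    int  in_use;
};

static struct stmt_cache_entry cache[CACHE_SLOTS];
static int next_victim = 0;       /* assumed round-robin replacement */
static int next_stmt_id = 0;
static int prepare_calls = 0;     /* counters so the behaviour can be observed */
static int deallocate_calls = 0;

/* Stand-in for PQprepare(); a real version would talk to the server. */
static void fake_prepare(const char *name, const char *sql)
{
    (void) name;
    (void) sql;
    prepare_calls++;
}

/* Stand-in for executing "DEALLOCATE PREPARE <name>". */
static void fake_deallocate(const char *name)
{
    (void) name;
    deallocate_calls++;
}

/* Linear search by statement text; returns NULL on a miss. */
static struct stmt_cache_entry *cache_search(const char *sql)
{
    for (int i = 0; i < CACHE_SLOTS; i++)
        if (cache[i].in_use && strcmp(cache[i].sql, sql) == 0)
            return &cache[i];
    return NULL;
}

/*
 * Return the prepared-statement name for sql, preparing it on a miss.
 * Returns NULL if the text is too long to cache, in which case the
 * caller would fall back to the unprepared path.
 */
static const char *cache_get_or_prepare(const char *sql)
{
    assert(sql != NULL);
    if (strlen(sql) >= MAX_SQL_LEN)
        return NULL;

    struct stmt_cache_entry *entry = cache_search(sql);
    if (entry != NULL)
        return entry->name;           /* already prepared earlier */

    entry = &cache[next_victim];
    next_victim = (next_victim + 1) % CACHE_SLOTS;
    if (entry->in_use)
        fake_deallocate(entry->name); /* slot is being re-used */

    snprintf(entry->name, sizeof entry->name, "ecpg_stmt_%d", next_stmt_id++);
    snprintf(entry->sql, sizeof entry->sql, "%s", sql);
    entry->in_use = 1;
    fake_prepare(entry->name, entry->sql);
    return entry->name;
}
```

In this sketch the returned name is what would be handed to
"PQexecPrepared" together with the parameter array. A hash lookup
would replace the linear search once the cache holds more than a
handful of statements.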









---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

#2 Michael Meskes, 04-18-2008, 11:01 AM

On Mon, May 07, 2007 at 02:46:29PM -0700, William Lawrance wrote:
> This patch for ECPG utilizes the "PQprepare" and "PQexecPrepared"
> functions to cause SQL statements from ECPG to be cached. It does
> this without requiring any changes in the user's source program.
> ...


I still do not understand why you prepare each statement. This might
help you with your test case, but I don't like to add this as a general
rule. If a user wants a prepared statement he/she should use the prepare
statement. I agree that the prepare logic has to be rewritten and this
is high on my agenda, but I will probably only do this for statements
issued with EXEC SQL PREPARE not for every single statement.

Michael
--
Michael Meskes
Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: meskes@jabber.org
Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!

#3 William Lawrance, 04-18-2008, 11:02 AM

This approach was used for several reasons--

1. No changes were required in the application source program. For
an application involving thousands of SQL statements in hundreds
of programs, this is important. This customer application has
been tuned extensively by the customer for DB2, and he is not
receptive to large changes.

2. The performance was improved by about 1 hour in the 3 hour
elapsed time of the application. This is important to the
customer in terms of accomplishing his work load in the
time that has been allotted, based on his experience with DB2.
Without this improvement, he is likely to consider it too slow.

I would like to emphasize that we aren't measuring an artificial
test program; this is a real customer's application. We loaded
7 million rows into 217 tables to run the application. I believe
it is representative of many real batch applications.


Is there a reason not to prepare each statement?

Could it be predicated upon a user-supplied option?

Other comments?


#4 Alvaro Herrera, 04-18-2008, 11:02 AM

William Lawrance wrote:
> Is there a reason not to prepare each statement?


One reason is that prepared statements have the parameters passed out of
line after the planning is done, so in certain cases the optimizer makes
a different choice which leads to worse plans.

This used to be a problem with JDBC as well, until a workaround was
added so that the "unnamed" prepared statement is not planned until the
parameters are passed. If you don't do that, it may end up being a bad
choice for applications as well.

> Other comments?


Codewise I noticed you wrote your own hashing function, which seemed odd
to me at first sight. We already have a hashing infrastructure, but I'm
not sure if it could be used in ECPG (mainly due to lack of ereport/elog
support).

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#5 Michael Meskes, 04-18-2008, 11:02 AM

On Wed, May 09, 2007 at 01:12:17PM -0700, William Lawrance wrote:
> 2. The performance was improved by about 1 hour in the 3 hour
> elapsed time of the application. This is important to the
> customer in terms of accomplishing his work load in the
> time that has been allotted, based on his experience with DB2.
> Without this improvement, he is likely to consider it too slow.


But this only holds for one customer. I don't think this will hold for
every single application. At least I do not see a reason why this
should hold every time.

> I would like to emphasize that we aren't measuring an artificial
> test program; this is a real customer's application. We loaded
> 7 million rows into 217 tables to run the application. I believe
> it is representative of many real batch applications.


But how about non-batch applications?

> Is there a reason not to prepare each statement?


I'm completely against forcing such a design decision on the programmer.
Hopefully I will be able to add a real prepare statement soon.

> Could it be predicated upon a user-supplied option?


Yes, this is fine with me. If you could rearrange the patch I will test
and commit it.

Michael
#6 William Lawrance, 04-18-2008, 11:02 AM


This updated patch for ECPG uses the current routines by
default. If an environment variable (ECPGUSEPREPARE) is set
to "yes", it uses the new routine that prepares and
caches each statement.
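
A minimal sketch of how such an opt-in switch might be read is shown
below. Only the variable name ECPGUSEPREPARE and the value "yes" come
from the description above; the function name use_prepare_enabled and
the choice of an exact, case-sensitive comparison are assumptions for
the example, not the patch's actual code.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Returns 1 when ECPGUSEPREPARE is set to exactly "yes", 0 otherwise.
 * An unset variable therefore keeps the current (unprepared) routines.
 * The exact-match rule is an assumption of this sketch.
 */
int use_prepare_enabled(void)
{
    const char *value = getenv("ECPGUSEPREPARE");

    return value != NULL && strcmp(value, "yes") == 0;
}
```

With a check like this, running a program as
"ECPGUSEPREPARE=yes ./app" would select the prepare-and-cache path,
while leaving the variable unset would keep the existing behaviour.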




#7 Andrew Dunstan, 04-18-2008, 11:02 AM


This seems like a very all or nothing approach. By contrast, the Perl
DBD::Pg driver lets you decide per statement if you want it
server-prepared or not. Is that not possible?

cheers

andrew

William Lawrance wrote:
> This updated patch for ECPG uses the current routines by
> default. If an environment variable (ECPGUSEPREPARE) is set
> to "yes", it uses the new routine that prepares and
> caches each statement.
#8 Bruce Momjian, 04-18-2008, 11:03 AM


This has been saved for the 8.4 release:

http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---------------------------------------------------------------------------

William Lawrance wrote:
>
> This updated patch for ECPG uses the current routines by
> default. If an environment variable (ECPGUSEPREPARE) is set
> to "yes", it uses the new routine that prepares and
> caches each statement.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
