Unix Technical Forum

statement caching proof of concept

This is a discussion on statement caching proof of concept within the pgsql Interfaces jdbc forums, part of the PostgreSQL category; --> This is just proof of concept. More work has to be done to make it build properly and work ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > pgsql Interfaces jdbc

Register FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-16-2008, 12:08 AM
Dave Cramer
 
Posts: n/a
Default statement caching proof of concept

This is just proof of concept. More work has to be done to make it
build properly and work properly under different jdk's

Couple of questions.

1) What to do if there are multiple concurrent requests per
connection for the same statement?
1) we could just allow it
2) we could return a non-cacheable preparedstatement
3) throw an exception

2) Is it enough to cache prepared statements or should we cache
statements too?

Note: this work is based completely on apache's dbcp statement
caching implementation and this will be noted in the final code.



---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-16-2008, 12:08 AM
Heikki Linnakangas
 
Posts: n/a
Default Re: statement caching proof of concept

On Mon, 19 Jun 2006, Dave Cramer wrote:

> This is just proof of concept. More work has to be done to make it build
> properly and work properly under different jdk's


That's neat, but isn't statement caching something that belongs to
the application server, not the JDBC driver?

Also, it seems to me that the changes to AbstractJdbc2Connection and
AbstractJdbc2Statement are completely unrelated to statement caching.

> Note: this work is based completely on apache's dbcp statement caching
> implementation and this will be noted in the final code.


What kind of licensing issues does that bring us?

- Heikki

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-16-2008, 12:08 AM
Dave Cramer
 
Posts: n/a
Default Re: statement caching proof of concept


On 19-Jun-06, at 2:07 PM, Heikki Linnakangas wrote:

> On Mon, 19 Jun 2006, Dave Cramer wrote:
>
>> This is just proof of concept. More work has to be done to make it
>> build properly and work properly under different jdk's

>
> That's neat, but isn't statement caching something that belongs to
> the application server, not the JDBC driver?


There's an interesting question. However the way it has been done it
can be very easily removed, or added.
>
> Also, it seems to me that the changes to AbstractJdbc2Connection
> and AbstractJdbc2Statement are completely unrelated to statement
> caching.
>
>> Note: this work is based completely on apache's dbcp statement
>> caching implementation and this will be noted in the final code.

>
> What kind of licensing issues does that bring us?

None, their license is more or less bsd.
>
> - Heikki
>



---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-16-2008, 12:08 AM
till toenges
 
Posts: n/a
Default Re: statement caching proof of concept

Dave Cramer wrote:
> This is just proof of concept. More work has to be done to make it
> build properly and work properly under different jdk's


Interesting. How is it intended to be used? Automatically for every
prepared statement? How does this compare to other JDBC implementations?

And, why would this be a part of the Postgres driver? It seems more like
an add-on to JDBC drivers in general. I do use a pool for prepared
statements myself, actually quite like the one you wrote. But it is part
of my general database related code library.

Don't get me wrong, if there is a clean way to boost the performance of
the driver without allocating too many resources, i'd like to see it in
the driver. But if it is complicated to use, or tries do to "clever"
things behind my back, i'm not so sure. Maybe activate it via a
parameter like preparedThreshold?

> Couple of questions.
>
> 1) What to do if there are multiple concurrent requests per
> connection for the same statement?
> 1) we could just allow it
> 2) we could return a non-cacheable preparedstatement
> 3) throw an exception


Should do whatever a normal prepared statement would do in this
situation, imho. That is, if i have code that uses normal prepared
statements, and is updated to use these cached statements, it should not
fail in any new ways. Or either the different behaviour it must be very
well documented.

> 2) Is it enough to cache prepared statements or should we cache
> statements too?


Actually, if i don't use a prepared statement, i want just this: do it
once, then forget about it. Though it could be nice to boost stupid code
that doesn't know about prepared statements; which is not really a
problem of the driver.

> Note: this work is based completely on apache's dbcp statement
> caching implementation and this will be noted in the final code.


What does that mean?


Till

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 04-16-2008, 12:08 AM
Dave Cramer
 
Posts: n/a
Default Re: statement caching proof of concept


On 19-Jun-06, at 2:57 PM, till toenges wrote:

> Dave Cramer wrote:
>> This is just proof of concept. More work has to be done to make it
>> build properly and work properly under different jdk's

>
> Interesting. How is it intended to be used? Automatically for every
> prepared statement? How does this compare to other JDBC
> implementations?

Essentially yes, however this is all good feedback. The way it's
implemented we could use it or not on a per connection basis.
>
> And, why would this be a part of the Postgres driver? It seems more
> like
> an add-on to JDBC drivers in general. I do use a pool for prepared
> statements myself, actually quite like the one you wrote. But it is
> part
> of my general database related code library.


Other drivers use statement caching namely Oracle.
>
> Don't get me wrong, if there is a clean way to boost the
> performance of
> the driver without allocating too many resources, i'd like to see
> it in
> the driver. But if it is complicated to use, or tries do to "clever"
> things behind my back, i'm not so sure. Maybe activate it via a
> parameter like preparedThreshold?


All good feedback
>
>> Couple of questions.
>>
>> 1) What to do if there are multiple concurrent requests per
>> connection for the same statement?
>> 1) we could just allow it
>> 2) we could return a non-cacheable preparedstatement
>> 3) throw an exception

>
> Should do whatever a normal prepared statement would do in this
> situation, imho. That is, if i have code that uses normal prepared
> statements, and is updated to use these cached statements, it
> should not
> fail in any new ways. Or either the different behaviour it must be
> very
> well documented.


I'd agree the driver should not fail.
>
>> 2) Is it enough to cache prepared statements or should we cache
>> statements too?

>
> Actually, if i don't use a prepared statement, i want just this: do it
> once, then forget about it. Though it could be nice to boost stupid
> code
> that doesn't know about prepared statements; which is not really a
> problem of the driver.
>
>> Note: this work is based completely on apache's dbcp statement
>> caching implementation and this will be noted in the final code.

>
> What does that mean?

Nothing much, I am just giving credit where credit is due. I borrowed
most of the concepts from them.

Their license like ours allows this.

Dave
>
>
> Till
>



---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 04-16-2008, 12:08 AM
Mark Lewis
 
Posts: n/a
Default Re: statement caching proof of concept

Hmmm. You know, an interesting alternative way to get the same kind of
performance boost: cache just the handles to the server-side prepared
statements, but not the Java PreparedStatement instances.

I'm just suggesting the idea because it seems like it would be simpler
to implement (just a pool and some ref-counting), and you wouldn't need
to worry about any special concurrency issues, because you'd be dealing
with distinct PreparedStatement object that just happen to refer to the
same server-side statement.

It seems like it would be less error-prone to only cache the expensive
part, and leave the rest of the stateful stuff in PreparedStatement
alone.

But on the other hand, I'm not in a position to offer an alternative
implementation, and working code trumps vaporware any day

-- Mark Lewis


On Mon, 2006-06-19 at 12:55 -0400, Dave Cramer wrote:
> This is just proof of concept. More work has to be done to make it
> build properly and work properly under different jdk's
>
> Couple of questions.
>
> 1) What to do if there are multiple concurrent requests per
> connection for the same statement?
> 1) we could just allow it
> 2) we could return a non-cacheable preparedstatement
> 3) throw an exception
>
> 2) Is it enough to cache prepared statements or should we cache
> statements too?
>
> Note: this work is based completely on apache's dbcp statement
> caching implementation and this will be noted in the final code.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster


---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #7 (permalink)  
Old 04-16-2008, 12:08 AM
till toenges
 
Posts: n/a
Default Re: statement caching proof of concept

Dave Cramer wrote:
> Other drivers use statement caching namely Oracle.


Are there any special things you have to do or care about to use cached
statements there? My experience with the oracle driver is limited.

>>> 1) What to do if there are multiple concurrent requests per
>>> connection for the same statement?
>>> 1) we could just allow it
>>> 2) we could return a non-cacheable preparedstatement
>>> 3) throw an exception

>> Should do whatever a normal prepared statement would do in this
>> situation, imho. That is, if i have code that uses normal prepared
>> statements, and is updated to use these cached statements, it
>> should not
>> fail in any new ways. Or either the different behaviour it must be
>> very
>> well documented.

>
> I'd agree the driver should not fail.


The current situation allows to use a prepared statement from several
threads at once, or more than once before reading the result in a single
thread. There is no way to get all the results back reliably in this
case. Since the JDBC specifications don't demand anything more (afaik),
solution 1 seems ok.

A slightly more advanced solution might be to create a map of lists of
prepared statements, instead of a map of prepared statements. If i have
a prepared statement "SELECT 1" and this is in high demand by hundreds
of threads (silly example), it would be possible to have more than one
prepared statement with the same sql. That would be more like solution
2, with better performance in concurrent applications.

I like Mark Lewis idea of caching the handles, but i have no real
knowledge about the postgres api, the overhead of creating a new
prepared statement object (could easily offset the saved memory in a
"heavy duty" application) and how well this would work; or how this
would influence the specific problem of multiple concurrent requests.
Certainly sounds a bit more difficult to implement.

How does it combine with methods like get*MetaData()?

And what happens with methods like setQueryTimeout()? They change the
behaviour of the cached statement, and don't get reset between use.
Especially interesting in the case of automatic caching of prepared
statements, where the user expects a new, "clean" statement. How does
the Oracle driver handle this, for example?

Ok, just looked at the source again. The method clearWarnings() is
called before returning the statement. There could be an extended
clearCachedStatement() method, or something like this to reset
everything neccessary.


Till


---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #8 (permalink)  
Old 04-16-2008, 12:08 AM
till toenges
 
Posts: n/a
Default Re: statement caching proof of concept

till toenges wrote:
> The current situation allows to use a prepared statement from several
> threads at once, or more than once before reading the result in a single
> thread. There is no way to get all the results back reliably in this
> case. Since the JDBC specifications don't demand anything more (afaik),
> solution 1 seems ok.


Oversight on my part: If the caching is automatic, an application that
originally used several prepared statements with the same sql and now
gets only one from the cache would very probably misbehave. Therefore
"just allowing it" is not an option.

In fact, i have no precise idea how a statement would know that it is
ready for reuse. An application could keep a reference to the statement
and run it again and again. WeakReferences and ReferenceQueues could be
used. Always interesting to play with the garbage collector ;-)

Maybe the postgres api has a way to identify individual calls and then
Mark Lewis idea of just caching the handle and creating individual
prepared statement objects upon use is the right way to go.


Till

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #9 (permalink)  
Old 04-16-2008, 12:08 AM
Oliver Jowett
 
Posts: n/a
Default Re: statement caching proof of concept

Dave Cramer wrote:
> This is just proof of concept. More work has to be done to make it build
> properly and work properly under different jdk's


Isn't there a bunch of statement state (things like fetch size, max
rows, etc) that have defined defaults and this cache implementation will
not provide?

The "wrapper" implementation approach suffers from the usual difficulty
that the "back links" such as ResultSet.getStatement() point to the
wrong object. It's actually quite a bit of work to get this right..

The cached statements are vulnerable to buggy apps that mutate the
statement after close, since there's no interception of those methods to
check whether the wrapper statement has been closed.

What exactly is the performance bottleneck you're trying to avoid by
having the statement pool? If it's the parse/plan cost, I think Mark's
suggestion of putting the cache at the protocol level may be simpler. If
it's overall statement cost, you might be better off with a generic
wrapper that is not postgresql-specific at all.

-O

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #10 (permalink)  
Old 04-16-2008, 12:08 AM
Dave Cramer
 
Posts: n/a
Default Re: statement caching proof of concept


On 19-Jun-06, at 7:15 PM, Oliver Jowett wrote:

> Dave Cramer wrote:
>> This is just proof of concept. More work has to be done to make it
>> build properly and work properly under different jdk's

>
> Isn't there a bunch of statement state (things like fetch size, max
> rows, etc) that have defined defaults and this cache implementation
> will not provide?


Good point, these would all have to be reset to their defaults as well.

>
> The "wrapper" implementation approach suffers from the usual
> difficulty that the "back links" such as ResultSet.getStatement()
> point to the wrong object. It's actually quite a bit of work to get
> this right..

Since result sets only live as long as the statement, wouldn't they
point to the statement that is still open ?
>
> The cached statements are vulnerable to buggy apps that mutate the
> statement after close, since there's no interception of those
> methods to check whether the wrapper statement has been closed.


No question, and I would certainly not make this the default
behaviour. The user would have to turn on caching.
>
> What exactly is the performance bottleneck you're trying to avoid
> by having the statement pool? If it's the parse/plan cost, I think
> Mark's suggestion of putting the cache at the protocol level may be
> simpler. If it's overall statement cost, you might be better off
> with a generic wrapper that is not postgresql-specific at all.


How does the generic wrapper solve the problems above ? I would think
they all suffer from the same problems ?
>
> -O
>



---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 03:26 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com