Unix Technical Forum

Re: ODBC driver for Windows & future...

This is a discussion on Re: ODBC driver for Windows & future... within the pgsql Interfaces odbc forums, part of the PostgreSQL category; --> Hi, I have investigated the ODBC driver behaviour, when the query result has very many rows. It seems, that ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > pgsql Interfaces odbc

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-16-2008, 02:00 AM
Marko Ristola
 
Posts: n/a
Default Re: ODBC driver for Windows & future...


Hi,

I have investigated the ODBC driver behaviour, when the query result has
very many rows.

It seems, that the whole query result is stored as such into a memory
buffer before any
further processing.
That buffer is reallocated, when needed. If the buffer is for example
50Mbytes, and
after reallocation it's size will be 100Mbytes. The malloc() or
realloc() takes a very long time.
I investigated this bottleneck on Linux ODBC driver.

The procedure without an ODBC cursor is as follows:
1. Read all query result data from the backend to the huge buffer.
(maybe sometimes restructure the buffer, if some column size on the
buffer is exceeded.)
This seems to be the bottleneck with the large malloc() operation.
2. Read (and convert) the asked results from the buffer for the given row.


Allocating huge buffers is inefficient.
Linux operating system handles allocating big files much better
than allocating big memory areas.
More efficient would be to use a temporary file:
sequential file scans are rather fast.

One way for solving the problem:

Maybe the key for solving the bottleneck is to tune the operating system
to free enough memory
beforehand: If the operating system has 100Mb unused memory, it is a lot
faster,
than if it has only 2Mb unused memory ready for fast memory allocations.

Good way for solving the problem:

The bottleneck can be avoided on the program side by using ODBC cursor.
With ODBC cursor one can fetch for example 1000 rows in one batch
from the database server. You get next 1000 rows with a new fetch.
This way there is no limit on the number of rows fetched on any database.

On large result sets, there is always a limit with the memory on 32 bit
systems.
On 64 bit systems this limit goes away, but the limit with some slowdown on
nonlocal CPU memory won't go away even on high end machines.
(NUMA machines have about 2Gbytes memory near each CPU. Other memory is
behind
a slower bus ). So memory allocations over 2Gb are not good for speed.

So, the ODBC cursor scales well for any huge query result, on any
client operating system. It works even on Java, where memory is
extremely limited.

Other ways to solve the problem?
ODBC Code: How about allocating memory in 4Mb chunks? The operating
system handles
small memory allocations more easily and frees more memory to be available
in the background while the ODBC driver fills the allocated chunk.

Marko Ristola

*Shachar Shemesh wrote:

*lso of interest is that this very same client is also interested in the
ODBC driver for a different project. We have already did some porting of
their application, and have spotted a serious performance issue with
ODBC when long query results are retrieved. It is possible (thought it
would be best not to count on it) that we will do some work in that
direction on ODBC in the foreseeable future. The reason we did not step
forward and offered ourselves as full maintainers of the code is that we
don't feel we have the resources for that. It is good to know, however,
that the facilities for sending patches and having them committed exists.


---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-16-2008, 02:00 AM
markw@mohawksoft.com
 
Posts: n/a
Default Re: ODBC driver for Windows & future...

>
> Hi,
>
> I have investigated the ODBC driver behaviour, when the query result has
> very many rows.
>
> It seems, that the whole query result is stored as such into a memory
> buffer before any
> further processing.
> That buffer is reallocated, when needed. If the buffer is for example
> 50Mbytes, and
> after reallocation it's size will be 100Mbytes. The malloc() or
> realloc() takes a very long time.
> I investigated this bottleneck on Linux ODBC driver.


There is actually a setting that makes the ODBC driver use a cursor to
only grab a chunk at a time, its in the docs somewhere as I have used it
in the past.


>
> The procedure without an ODBC cursor is as follows:
> 1. Read all query result data from the backend to the huge buffer.
> (maybe sometimes restructure the buffer, if some column size on the
> buffer is exceeded.)
> This seems to be the bottleneck with the large malloc() operation.
> 2. Read (and convert) the asked results from the buffer for the given row.
>
>
> Allocating huge buffers is inefficient.
> Linux operating system handles allocating big files much better
> than allocating big memory areas.
> More efficient would be to use a temporary file:
> sequential file scans are rather fast.
>
> One way for solving the problem:
>
> Maybe the key for solving the bottleneck is to tune the operating system
> to free enough memory
> beforehand: If the operating system has 100Mb unused memory, it is a lot
> faster,
> than if it has only 2Mb unused memory ready for fast memory allocations.
>
> Good way for solving the problem:
>
> The bottleneck can be avoided on the program side by using ODBC cursor.
> With ODBC cursor one can fetch for example 1000 rows in one batch
> from the database server. You get next 1000 rows with a new fetch.
> This way there is no limit on the number of rows fetched on any database.
>
> On large result sets, there is always a limit with the memory on 32 bit
> systems.
> On 64 bit systems this limit goes away, but the limit with some slowdown
> on
> nonlocal CPU memory won't go away even on high end machines.
> (NUMA machines have about 2Gbytes memory near each CPU. Other memory is
> behind
> a slower bus ). So memory allocations over 2Gb are not good for speed.
>
> So, the ODBC cursor scales well for any huge query result, on any
> client operating system. It works even on Java, where memory is
> extremely limited.
>
> Other ways to solve the problem?
> ODBC Code: How about allocating memory in 4Mb chunks? The operating
> system handles
> small memory allocations more easily and frees more memory to be available
> in the background while the ODBC driver fills the allocated chunk.
>
> Marko Ristola
>
> *Shachar Shemesh wrote:
>
> *lso of interest is that this very same client is also interested in the
> ODBC driver for a different project. We have already did some porting of
> their application, and have spotted a serious performance issue with
> ODBC when long query results are retrieved. It is possible (thought it
> would be best not to count on it) that we will do some work in that
> direction on ODBC in the foreseeable future. The reason we did not step
> forward and offered ourselves as full maintainers of the code is that we
> don't feel we have the resources for that. It is good to know, however,
> that the facilities for sending patches and having them committed exists.
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo@postgresql.org so that your
> message can get through to the mailing list cleanly
>



---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 04:53 PM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com