Unix Technical Forum

Re: binary protocol was Performance problem with timestamps in result sets



  #1 (permalink)  
Old 04-15-2008, 11:57 PM
mikael-aronsson
 
Posts: n/a
Default Re: binary protocol was Performance problem with timestamps in result sets

What about the actual transport cost difference between the text and binary
protocols? It may not be a big difference, though, and in many cases the text
representation can be smaller than the binary one.

I have no idea about endianness, but since the clients work fine across
different platforms I would assume that the byte order in the protocol is
fixed (but you should not assume things, so maybe I am hanging myself again
here).

I do not think using the binary protocol would gain much, though, as Java
is not very good at converting binary data back to native values unless it
is serialized or you start to mess around with nio buffers, so in the end I
do not think there would be much difference in performance.

Mikael
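
On the endianness and nio points above: the PostgreSQL protocol documentation says binary integers use network byte order (most significant byte first), which is also the default order of java.nio.ByteBuffer. The following is a minimal sketch of decoding fixed-width fields on that assumption. The class and method names are hypothetical and are not part of the JDBC driver.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper for illustration only; not a pgjdbc class. */
final class BinaryFieldDecoder {

    /** Decodes a 4-byte big-endian (network order) integer field. */
    static int decodeInt4(byte[] field) {
        return ByteBuffer.wrap(field).order(ByteOrder.BIG_ENDIAN).getInt();
    }

    /** Decodes an 8-byte big-endian integer field. */
    static long decodeInt8(byte[] field) {
        return ByteBuffer.wrap(field).order(ByteOrder.BIG_ENDIAN).getLong();
    }

    /** Decodes an 8-byte big-endian IEEE 754 double field. */
    static double decodeFloat8(byte[] field) {
        return ByteBuffer.wrap(field).order(ByteOrder.BIG_ENDIAN).getDouble();
    }

    public static void main(String[] args) {
        byte[] raw = {0x00, 0x00, 0x01, 0x02};
        System.out.println(decodeInt4(raw)); // 0x00000102 = 258
    }
}
```

Because the byte order is set explicitly, the result is the same on any client platform.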

> ----- Original Message -----
> From: "Dave Cramer" <pg@fastcrypt.com>
> To: "List" <pgsql-jdbc@postgresql.org>
> Sent: Thursday, March 09, 2006 1:11 PM
> Subject: [JDBC] binary protocol was Performance problem with timestamps in
> result sets
>
>
>> As Oliver points out, the timestamp is not a 64-bit integer, or even a
>> floating-point number. It is a textual representation of the timestamp
>> which needs to be parsed. I looked at the parsing and I was unable to
>> see anything that could be significantly optimized.
>>
>> So the option of going to the binary protocol exists. There are a number
>> of challenges with this. As Oliver points out, this is an all-or-nothing
>> proposition. In other words, you can't ask for just timestamps to be
>> returned in binary. The entire row comes back as binary. Additionally,
>> there are two possible representations of timestamps in PostgreSQL. One
>> is a 64-bit integer, the other is floating point. Added to this, there
>> may be endian issues (do we know the answer to this question?).
>>
>> A significant performance improvement exists for dates, times, and
>> timestamps; however, the advantages for the rest of the types are
>> questionable given the above assertions.
>>
>> Comments ?
>>
>> Dave
>>
>> ---------------------------(end of broadcast)---------------------------
>> TIP 1: if posting/reading through Usenet, please send an appropriate
>> subscribe-nomail command to majordomo@postgresql.org so that your
>> message can get through to the mailing list cleanly
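
To make the two timestamp representations mentioned above concrete, here is a rough sketch of decoding each one into a java.time.Instant. It assumes the 64-bit form counts microseconds since 2000-01-01 00:00:00 UTC and the floating-point form counts seconds since the same epoch, both sent big-endian; check this against the server documentation before relying on it. The class and method names are made up for this example.

```java
import java.nio.ByteBuffer;
import java.time.Instant;
import java.time.temporal.ChronoUnit;

/** Hypothetical sketch; the layouts below are assumptions, not driver code. */
final class BinaryTimestampSketch {

    /** 2000-01-01T00:00:00Z, the epoch assumed for both encodings. */
    static final Instant PG_EPOCH = Instant.parse("2000-01-01T00:00:00Z");

    /** Assumed layout: 8-byte big-endian count of microseconds since PG_EPOCH. */
    static Instant fromInt64(byte[] field) {
        long micros = ByteBuffer.wrap(field).getLong(); // ByteBuffer is big-endian by default
        return PG_EPOCH.plus(micros, ChronoUnit.MICROS);
    }

    /** Assumed layout: 8-byte big-endian IEEE 754 double, seconds since PG_EPOCH. */
    static Instant fromFloat8(byte[] field) {
        double seconds = ByteBuffer.wrap(field).getDouble();
        long whole = (long) Math.floor(seconds);
        long nanos = Math.round((seconds - whole) * 1_000_000_000d);
        return PG_EPOCH.plusSeconds(whole).plusNanos(nanos);
    }

    public static void main(String[] args) {
        byte[] oneSecond = ByteBuffer.allocate(8).putLong(1_000_000L).array();
        System.out.println(fromInt64(oneSecond));
    }
}
```

A client would have to know which of the two encodings the server uses before picking a decoder, which is part of the difficulty raised in the message above.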

  #2 (permalink)  
Old 04-15-2008, 11:57 PM
Thomas Dudziak
 
Posts: n/a
Default Re: binary protocol was Performance problem with timestamps in result sets

On 3/9/06, mikael-aronsson <mikael-aronsson@telia.com> wrote:
> What about the actual transport cost difference between the text and binary
> protocols? It may not be a big difference, though, and in many cases the text
> representation can be smaller than the binary one.


Really? I would have thought it's the other way around. E.g. a float is
usually 4 (or 8) bytes in binary, but can be a lot longer in text depending on
the format.
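
Both claims can hold depending on the value, as a quick size comparison suggests. The sketch below uses Java's own Integer.toString and Float.toString as a stand-in for a text encoding, which is an assumption: the server's text output may differ. It also ignores any per-field framing. The class name is hypothetical.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical size comparison; Java's toString stands in for the text format. */
final class WireSizeSketch {

    /** Bytes needed to send an int as ASCII text. */
    static int textSize(int value) {
        return Integer.toString(value).getBytes(StandardCharsets.US_ASCII).length;
    }

    /** Bytes needed to send a float as ASCII text. */
    static int textSize(float value) {
        return Float.toString(value).getBytes(StandardCharsets.US_ASCII).length;
    }

    public static void main(String[] args) {
        // Small values are shorter as text; extreme values are longer.
        System.out.println("int 7: text " + textSize(7) + ", binary " + Integer.BYTES);
        System.out.println("int MIN: text " + textSize(Integer.MIN_VALUE) + ", binary " + Integer.BYTES);
        System.out.println("float 0.5: text " + textSize(0.5f) + ", binary " + Float.BYTES);
        System.out.println("float MAX: text " + textSize(Float.MAX_VALUE) + ", binary " + Float.BYTES);
    }
}
```

So the answer depends on the data: columns of small integers favor text, while wide floats favor binary.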

> I do not think using the binary protocol would gain much, though, as Java
> is not very good at converting binary data back to native values unless it
> is serialized or you start to mess around with nio buffers, so in the end I
> do not think there would be much difference in performance.


Personally I would be interested in whether a binary protocol impl in
the JDBC driver would bring benefits or not for the other, simpler
types (int, string, ...). If they don't suffer, then it might actually
be worthwhile to investigate a binary impl. That is said, of course,
from a pure user perspective - I have no insight whatsoever into
the core workings of the JDBC driver.

cheers,
Tom

  #3 (permalink)  
Old 04-15-2008, 11:57 PM
Markus Schaber
 
Posts: n/a
Default Re: binary protocol was Performance problem with timestamps

Hi, Mikael,

mikael-aronsson wrote:

> I do not think using the binary protocol would gain much, though, as Java
> is not very good at converting binary data back to native values unless it
> is serialized or you start to mess around with nio
> buffers,


I think that parsing complicated text representations is not faster than
multiplying fixed-length bunches of byte values together.
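
As an illustration of that comparison, here is a small sketch of the two decoding paths for a 32-bit integer. It assumes a 4-byte big-endian binary field and an ASCII decimal text field; the class and method names are invented for the example, and it makes no claim about measured speed.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical side-by-side of binary and text decoding for one int. */
final class DecodeCostSketch {

    /** Binary path: combine four big-endian bytes with shifts and ORs. */
    static int fromBinary(byte[] b) {
        return ((b[0] & 0xFF) << 24)
             | ((b[1] & 0xFF) << 16)
             | ((b[2] & 0xFF) << 8)
             |  (b[3] & 0xFF);
    }

    /** Text path: build a String, then parse it digit by digit. */
    static int fromText(byte[] ascii) {
        return Integer.parseInt(new String(ascii, StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) {
        System.out.println(fromBinary(new byte[] {0, 0, 1, 2}));
        System.out.println(fromText("258".getBytes(StandardCharsets.US_ASCII)));
    }
}
```

The binary path is a fixed number of operations with no allocation, while the text path allocates a String and loops over its characters. For timestamps the text path also has to handle separators, fractions, and time zones.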


HTH,
Markus
--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org

  #4 (permalink)  
Old 04-15-2008, 11:57 PM
Marc Herbert
 
Posts: n/a
Default Re: binary protocol was Performance problem with timestamps in result sets

"Thomas Dudziak" <tomdzk@gmail.com> writes:

> On 3/9/06, mikael-aronsson <mikael-aronsson@telia.com> wrote:
>> What about the actual transport cost difference between the text and binary
>> protocols? It may not be a big difference, though, and in many cases the text
>> representation can be smaller than the binary one.

>
> Really? I would have thought it's the other way around. E.g. a float is
> usually 4 (or 8) bytes in binary, but can be a lot longer in text depending on
> the format.



To represent binary IEEE 754 floats (4 bytes) without loss, the
maximum required number of base-10 digits is 9. For IEEE 754 doubles
(8 bytes) it's 17. I don't know what the "average" required number
of digits is.


Of course using one byte-character per base10 digit is a waste of
space... you could gzip or BCD-encode the string :-)
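
The 9 and 17 digit figures can be checked with a short round-trip sketch: convert the exact binary value to a decimal string with that many significant digits, parse it back, and compare. This assumes finite inputs (BigDecimal rejects NaN and infinities) and a JDK whose parseFloat and parseDouble round correctly. The class and method names are hypothetical.

```java
import java.math.BigDecimal;
import java.math.MathContext;

/** Hypothetical check of the 9-digit (float) and 17-digit (double) figures. */
final class DecimalDigitsSketch {

    /** Exact value of a finite float, rounded to the given significant digits. */
    static String toDecimal(float value, int digits) {
        return new BigDecimal(value).round(new MathContext(digits)).toString();
    }

    /** Exact value of a finite double, rounded to the given significant digits. */
    static String toDecimal(double value, int digits) {
        return new BigDecimal(value).round(new MathContext(digits)).toString();
    }

    /** True if 9 significant decimal digits recover the same float. */
    static boolean roundTrips(float value) {
        return Float.parseFloat(toDecimal(value, 9)) == value;
    }

    /** True if 17 significant decimal digits recover the same double. */
    static boolean roundTrips(double value) {
        return Double.parseDouble(toDecimal(value, 17)) == value;
    }

    public static void main(String[] args) {
        System.out.println(toDecimal(0.1f, 9) + " -> " + roundTrips(0.1f));
        System.out.println(toDecimal(Math.PI, 17) + " -> " + roundTrips(Math.PI));
    }
}
```

The strings produced this way can also carry a sign, a decimal point, and an exponent such as E+38, so the character count on the wire is larger than the digit count alone.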

References:
- "What Every Computer Scientist Should Know About Floating Point
Arithmetic" 1991 - David Goldberg
- paragraph "Conversions" at:
<http://www2.hursley.ibm.com/decimal/>


  #5 (permalink)  
Old 04-15-2008, 11:57 PM
Marc Herbert
 
Posts: n/a
Default Re: binary protocol was Performance problem with timestamps in result sets

Marc Herbert <Marc.Herbert@continuent.com> writes:

> To represent binary IEEE 754 floats (4 bytes) without loss, the
> maximum required number of base-10 digits is 9. For IEEE 754 doubles
> (8 bytes) it's 17. I don't know what the "average" required number
> of digits is.


Sorry, I forgot to specify: that's just for the fraction. You need to
add representations for the sign and exponent.




