Unix Technical Forum

Reducing NUMERIC size for 8.3

This is a discussion on Reducing NUMERIC size for 8.3 within the pgsql Hackers forums, part of the PostgreSQL category; --> We've changed the on-disk database format in 8.3, so we have an opportunity to change other things also. There ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > pgsql Hackers

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-12-2008, 09:09 AM
Simon Riggs
 
Posts: n/a
Default Reducing NUMERIC size for 8.3

We've changed the on-disk database format in 8.3, so we have an
opportunity to change other things also. There is a patch thats been on
the patch queue for some time called numeric508, submitted Dec 2005;
I've updated this patch now for 8.3 to remove bit rot (an hour's work).
This is posted to pgsql-patches now and it works.

The benefit of the patch is that it reduces each NUMERIC value by 2
bytes, so will speed up things considerably. This is now especially
important if we are looking to reduce the speed of numeric division by a
factor of 4 (recent WIP patch).

The objections to applying this patch originally were:
1. it changes on-disk format (we've done this, so argument is void)
2. it would restrict number of digits to 508 and there are allegedly
some people that want to store > 508 digits.

The current patch passes all regression tests, but currently fails
numeric_big.sql since this explicitly checks for support of
numeric(1000,800).

We could:
a) accept the patch as-is and restrict NUMERIC to 508 digits
b) refine the patch somewhat to allow 1000 digits

(b) is possible in a couple of ways, both fairly quick:
- extend the patch so that one of the spare bits from the second digit
is used to represent dscale 508-1000.
- extend the patch so that if weight > 127 or dscale > 127 we would use
the first byte in the digits as an extra indicator byte holding the high
bits of both fields.
Neither change makes any difference to numbers below
1,000,000,000,000,000....(127 zeroes in total)...000 which probably
covers the vast majority of people's usage.

Objections: True, we are passed feature freeze, but this patch has been
on the queue for 14 months prior to freeze and has been waiting on disk
format changes to make patch application acceptable. We definitely want
to reduce the size of Numeric by 2 bytes at some point. The question in
my mind is: When is the best time to make this change? If we put this
off until 8.4, then it will get rejected again because we won't want to
change the disk format again. So the best time to do this is now,
otherwise we'll put it off forever.

Can I get somebody other than Tom to agree to review the patch? Clearly
waiting for Tom to review this is just going to delay release, which I
don't want to do.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com



---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-12-2008, 09:10 AM
Tom Lane
 
Posts: n/a
Default Re: Reducing NUMERIC size for 8.3

"Simon Riggs" <simon@2ndquadrant.com> writes:
> We've changed the on-disk database format in 8.3, so we have an
> opportunity to change other things also. There is a patch thats been on
> the patch queue for some time called numeric508, submitted Dec 2005;


I thought that idea had been rejected long since as being an unreasonable
reduction in the flexibility of numeric arithmetic.

> The benefit of the patch is that it reduces each NUMERIC value by 2
> bytes, so will speed up things considerably. This is now especially
> important if we are looking to reduce the speed of numeric division by a
> factor of 4 (recent WIP patch).


The only way that patch would make division faster is by making the
slowest cases (longest input numbers) impossible. Which is hardly a
solution.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-12-2008, 09:10 AM
Andreas Pflug
 
Posts: n/a
Default Re: Reducing NUMERIC size for 8.3

Simon Riggs wrote:
> The objections to applying this patch originally were:
> 2. it would restrict number of digits to 508 and there are allegedly
> some people that want to store > 508 digits.
>

If 508 digits are not enough, are1000 digits be sufficient? Both limits
appear quite arbitrary to me.

Regards,
Andreas


---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-12-2008, 09:10 AM
Simon Riggs
 
Posts: n/a
Default Re: Reducing NUMERIC size for 8.3

On Mon, 2007-06-18 at 16:56 +0200, Andreas Pflug wrote:
> Simon Riggs wrote:
> > The objections to applying this patch originally were:
> > 2. it would restrict number of digits to 508 and there are allegedly
> > some people that want to store > 508 digits.
> >

> If 508 digits are not enough, are1000 digits be sufficient? Both limits
> appear quite arbitrary to me.


Thats the current limit; I agree, but I didn't choose it. IIRC if you
don't specify a limit then you can have arbitrarily long numbers.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com



---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 04-12-2008, 09:10 AM
Michael Paesold
 
Posts: n/a
Default Re: Reducing NUMERIC size for 8.3

Andreas Pflug wrote:
> Simon Riggs wrote:
>> The objections to applying this patch originally were:
>> 2. it would restrict number of digits to 508 and there are allegedly
>> some people that want to store > 508 digits.
>>

> If 508 digits are not enough, are1000 digits be sufficient? Both limits
> appear quite arbitrary to me.


This 1000 is just a restriction on the typmod of numeric.

You can still use a much higher number of digits, if you use
unconstrained numeric:

test=> create table test (n numeric);

CREATE TABLE
test=> insert into test values (10::numeric ^ 99999);
INSERT 0 1
test=> select length(n) from test;
length
--------
100017
(1 row)

Best Regards
Michael Paesold


---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 04-12-2008, 09:10 AM
Simon Riggs
 
Posts: n/a
Default Re: Reducing NUMERIC size for 8.3

On Mon, 2007-06-18 at 10:54 -0400, Tom Lane wrote:
> "Simon Riggs" <simon@2ndquadrant.com> writes:
> > We've changed the on-disk database format in 8.3, so we have an
> > opportunity to change other things also. There is a patch thats been on
> > the patch queue for some time called numeric508, submitted Dec 2005;

>
> I thought that idea had been rejected long since as being an unreasonable
> reduction in the flexibility of numeric arithmetic.


I've re-read all of the posts from Dec 2005 and I see everyone in favour
of committing the patch, subject to review of whether limitation to 508
was acceptable. For me that was never finalised, at least I can't see a
clear decision. The important point is that the patch can be easily
modified to overcome these restrictions, a line of thought that was
never explored.

The current patch can be easily modified to allow 1000 digits, yet still
saving 2 bytes per value when storing more common values. Or we can use
the same techniques to represent arbitrary number of digits, again still
saving 2 bytes on common values.

We have the ground work done, we just need to agree further changes.

> > The benefit of the patch is that it reduces each NUMERIC value by 2
> > bytes, so will speed up things considerably. This is now especially
> > important if we are looking to reduce the speed of numeric division by a
> > factor of 4 (recent WIP patch).

>
> The only way that patch would make division faster is by making the
> slowest cases (longest input numbers) impossible. Which is hardly a
> solution.


It isn't directly related to division, but is a speed up nonetheless
which can offset the loss.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com



---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #7 (permalink)  
Old 04-12-2008, 09:10 AM
Tom Lane
 
Posts: n/a
Default Re: Reducing NUMERIC size for 8.3

Andreas Pflug <pgadmin@pse-consulting.de> writes:
> Simon Riggs wrote:
>> The objections to applying this patch originally were:
>> 2. it would restrict number of digits to 508 and there are allegedly
>> some people that want to store > 508 digits.
>>

> If 508 digits are not enough, are1000 digits be sufficient? Both limits
> appear quite arbitrary to me.


As per the recent discussion about factorial, the current limit of
numeric format is 10^131071 --- there is a whole lot of daylight between
that and 10^508.

I had a thought though: it's possible to reduce the header overhead for
typical-size numbers without giving up the ability to store large ones.
This is because the POS/NEG/NAN sign possibilities leave one unused bit
pattern. Hence:

1. Switch the positions of the n_sign_dscale and n_weight fields in the
long format, so that the sign bits are in the first word.

2. Reserve the fourth "sign" bit pattern to denote a compressed-header
format in which there's just one uint16 header word and the
NumericDigits start right after that. The header word could contain:
2 bits: "sign" distinguishing this from the two-word-header format
1 bit: actual number sign (POS or NEG, disallow NaN)
6 bits: weight, room for -32 .. 31
7 bits: dscale, room for 0 .. 127

3. When packing a NumericVar into a Numeric, use this short format when
it's not a NaN and the weight and dscale are in range, else use the long
format.

Since the weight is in base-10000 digits, this bit allocation allows a
dynamic range of about +- 10^127 which fits well with the dscale range.
But I suspect that most of the use-cases for long numerics involve large
integers, so it might be more useful to shave another bit or two from
dscale and give 'em to weight.

In any case, no capability is lost, unlike the original proposal; and
this would be much less invasive than the original patch since there's
no need to play tricks with the content of the digit array.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #8 (permalink)  
Old 04-12-2008, 09:10 AM
Andreas Pflug
 
Posts: n/a
Default Re: Reducing NUMERIC size for 8.3

Tom Lane wrote:
> Andreas Pflug <pgadmin@pse-consulting.de> writes:
>
>> Simon Riggs wrote:
>>
>>> The objections to applying this patch originally were:
>>> 2. it would restrict number of digits to 508 and there are allegedly
>>> some people that want to store > 508 digits.
>>>
>>>

>> If 508 digits are not enough, are1000 digits be sufficient? Both limits
>> appear quite arbitrary to me.
>>

>
> As per the recent discussion about factorial, the current limit of
> numeric format is 10^131071 --- there is a whole lot of daylight between
> that and 10^508.
>
> I had a thought though: it's possible to reduce the header overhead for
> typical-size numbers without giving up the ability to store large ones.
> This is because the POS/NEG/NAN sign possibilities leave one unused bit
> pattern. Hence:
>
> 1. Switch the positions of the n_sign_dscale and n_weight fields in the
> long format, so that the sign bits are in the first word.
>
> 2. Reserve the fourth "sign" bit pattern to denote a compressed-header
> format in which there's just one uint16 header word and the
> NumericDigits start right after that. The header word could contain:
> 2 bits: "sign" distinguishing this from the two-word-header format
> 1 bit: actual number sign (POS or NEG, disallow NaN)
> 6 bits: weight, room for -32 .. 31
> 7 bits: dscale, room for 0 .. 127
>
> 3. When packing a NumericVar into a Numeric, use this short format when
> it's not a NaN and the weight and dscale are in range, else use the long
> format.
>
> Since the weight is in base-10000 digits, this bit allocation allows a
> dynamic range of about +- 10^127 which fits well with the dscale range.
> But I suspect that most of the use-cases for long numerics involve large
> integers, so it might be more useful to shave another bit or two from
> dscale and give 'em to weight.
>
> In any case, no capability is lost, unlike the original proposal; and
> this would be much less invasive than the original patch since there's
> no need to play tricks with the content of the digit array.
>


I wonder if the currently waiting patch isn't Good Enough for
999.9999999999999999 % of use cases, and "all" others can use numeric
instead of numeric(1000,800) or so. Especially since there are many
patches waiting that do need further investigation and refining.

Regards,
Andreas


---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #9 (permalink)  
Old 04-12-2008, 09:10 AM
Simon Riggs
 
Posts: n/a
Default Re: Reducing NUMERIC size for 8.3

On Mon, 2007-06-18 at 11:32 -0400, Tom Lane wrote:
> Andreas Pflug <pgadmin@pse-consulting.de> writes:
> > Simon Riggs wrote:
> >> The objections to applying this patch originally were:
> >> 2. it would restrict number of digits to 508 and there are allegedly
> >> some people that want to store > 508 digits.
> >>

> > If 508 digits are not enough, are1000 digits be sufficient? Both limits
> > appear quite arbitrary to me.

>
> As per the recent discussion about factorial, the current limit of
> numeric format is 10^131071 --- there is a whole lot of daylight between
> that and 10^508.
>
> I had a thought though: it's possible to reduce the header overhead for
> typical-size numbers without giving up the ability to store large ones.
> This is because the POS/NEG/NAN sign possibilities leave one unused bit
> pattern. Hence:
>
> 1. Switch the positions of the n_sign_dscale and n_weight fields in the
> long format, so that the sign bits are in the first word.
>
> 2. Reserve the fourth "sign" bit pattern to denote a compressed-header
> format in which there's just one uint16 header word and the
> NumericDigits start right after that. The header word could contain:
> 2 bits: "sign" distinguishing this from the two-word-header format
> 1 bit: actual number sign (POS or NEG, disallow NaN)
> 6 bits: weight, room for -32 .. 31
> 7 bits: dscale, room for 0 .. 127
>
> 3. When packing a NumericVar into a Numeric, use this short format when
> it's not a NaN and the weight and dscale are in range, else use the long
> format.
>
> Since the weight is in base-10000 digits, this bit allocation allows a
> dynamic range of about +- 10^127 which fits well with the dscale range.
> But I suspect that most of the use-cases for long numerics involve large
> integers, so it might be more useful to shave another bit or two from
> dscale and give 'em to weight.
>
> In any case, no capability is lost, unlike the original proposal; and
> this would be much less invasive than the original patch since there's
> no need to play tricks with the content of the digit array.


Sounds good. I thought there'd be a way.

Since this is your idea, would you like to do this, or should I?

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com



---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #10 (permalink)  
Old 04-12-2008, 09:10 AM
Tom Lane
 
Posts: n/a
Default Re: Reducing NUMERIC size for 8.3

Andreas Pflug <pgadmin@pse-consulting.de> writes:
> Tom Lane wrote:
>> In any case, no capability is lost, unlike the original proposal; and
>> this would be much less invasive than the original patch since there's
>> no need to play tricks with the content of the digit array.


> I wonder if the currently waiting patch isn't Good Enough for
> 999.9999999999999999 % of use cases, and "all" others can use numeric
> instead of numeric(1000,800) or so.


Apparently you misunderstand that patch: it takes capability away from
unconstrained numeric too.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 01:31 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com