Unix Technical Forum

datatype advice numeric vs. varchar

This is a discussion on datatype advice numeric vs. varchar within the Pgsql General forums, part of the PostgreSQL category; --> I'm working on a voicemail application which involves storing phone numbers and other types that have long sequences of ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > Pgsql General

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-09-2008, 01:27 PM
Gene
 
Posts: n/a
Default datatype advice numeric vs. varchar

I'm working on a voicemail application which involves storing phone
numbers and other types that have long sequences of digits [0-9]. I
need to be able to do pattern matching using LIKE and map the fields
back to java String objects. I was thinking maybe it would be more
efficient to use the numeric(x,0) type instead of a varchar since it
could use less space on disk.

My calculations for disk space based off some information i found
online are ( 8 + ( 2 bytes for every four digits) ) for numeric and (
4 + number of chars ) for a utf8 varchar datatype. Are these
calculations still valid and has anyone tried using numeric for this
purpose or is this really stupid?

Thanks for any hints!
Gene

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org/

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-09-2008, 01:27 PM
Martijn van Oosterhout
 
Posts: n/a
Default Re: datatype advice numeric vs. varchar

On Thu, Jan 18, 2007 at 01:15:02AM -0500, Gene wrote:
> I'm working on a voicemail application which involves storing phone
> numbers and other types that have long sequences of digits [0-9]. I
> need to be able to do pattern matching using LIKE and map the fields
> back to java String objects. I was thinking maybe it would be more
> efficient to use the numeric(x,0) type instead of a varchar since it
> could use less space on disk.


Your calculations are reasonable, but if you do it as numeric you're
going to be fighting the system the whole way. Numerics are treated as
numbers and things like LIKE don't work on them. When you select them,
java is likely to convert them to numbers there too.

Is this not a case of premature optimisation?

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)

iD8DBQFFrxFpIB7bNG8LQkwRAvMbAKCJ/1OeO5U7sk0QGjl9YBia4sfRjwCfSFWy
TK5gVC8ybtnH2Xtdco6SGqA=
=pUUw
-----END PGP SIGNATURE-----

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-09-2008, 01:27 PM
Michael Glaesemann
 
Posts: n/a
Default Re: datatype advice numeric vs. varchar


On Jan 18, 2007, at 15:15 , Gene wrote:

> My calculations for disk space based off some information i found
> online are ( 8 + ( 2 bytes for every four digits) ) for numeric and (
> 4 + number of chars ) for a utf8 varchar datatype. Are these
> calculations still valid and has anyone tried using numeric for this
> purpose or is this really stupid?


While telephone numbers typically consist of digits, they're not
numbers: they're strings of digits. For example, a telephone number
in Tokyo is (typically) a string of 10 digits, beginning with "03".
0311111111 as numeric would have unexpected results when retrieved.
While you may not be concerned with Japanese phone numbers, I use it
as an example to show that telephone "numbers" are actually strings.

In short, use strings (text/varchar).

Michael Glaesemann
grzm seespotcode net

PS. The number of bytes used to represent characters in UTF-8 varies.
I believe digits (0-9) are all 1 byte/char.


---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-09-2008, 01:27 PM
Gene
 
Posts: n/a
Default Re: datatype advice numeric vs. varchar

you're probably right, actually using LIKE with numeric works fine but
of course i will have to contend with things like truncating zeros to
the left etc... are there any advantages to using ascii encoding as
far as performance of LIKE, REGEXES, INDEXES etc versus using UTF8?

thanks, gene

On 1/18/07, Martijn van Oosterhout <kleptog@svana.org> wrote:
> On Thu, Jan 18, 2007 at 01:15:02AM -0500, Gene wrote:
> > I'm working on a voicemail application which involves storing phone
> > numbers and other types that have long sequences of digits [0-9]. I
> > need to be able to do pattern matching using LIKE and map the fields
> > back to java String objects. I was thinking maybe it would be more
> > efficient to use the numeric(x,0) type instead of a varchar since it
> > could use less space on disk.

>
> Your calculations are reasonable, but if you do it as numeric you're
> going to be fighting the system the whole way. Numerics are treated as
> numbers and things like LIKE don't work on them. When you select them,
> java is likely to convert them to numbers there too.
>
> Is this not a case of premature optimisation?
>
> Have a nice day,
> --
> Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> > From each according to his ability. To each according to his ability to litigate.

>
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.1 (GNU/Linux)
>
> iD8DBQFFrxFpIB7bNG8LQkwRAvMbAKCJ/1OeO5U7sk0QGjl9YBia4sfRjwCfSFWy
> TK5gVC8ybtnH2Xtdco6SGqA=
> =pUUw
> -----END PGP SIGNATURE-----
>
>
>



--
Gene Hart
cell: 443-604-2679

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 04-09-2008, 01:27 PM
Tino Wildenhain
 
Posts: n/a
Default Re: datatype advice numeric vs. varchar

Gene schrieb:
> you're probably right, actually using LIKE with numeric works fine but
> of course i will have to contend with things like truncating zeros to
> the left etc... are there any advantages to using ascii encoding as
> far as performance of LIKE, REGEXES, INDEXES etc versus using UTF8?


Well if you are really concerned, write a custom datatype where you
store 2 digits of the telephone number in one byte. A nibble for
0-9+-/ und " " should be plenty (you would even have a c-like
string end marker :-) Of course you would need to write operators
for everything you want to do w/ it. IIRC, there were even plans
I read on the list to write such a datatype. Maybe you scan the
list archives first.

Regards
Tino

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 04-09-2008, 01:27 PM
Ron Johnson
 
Posts: n/a
Default Re: datatype advice numeric vs. varchar

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 01/18/07 00:22, Michael Glaesemann wrote:
>
> On Jan 18, 2007, at 15:15 , Gene wrote:
>
>> My calculations for disk space based off some information i found
>> online are ( 8 + ( 2 bytes for every four digits) ) for numeric and (
>> 4 + number of chars ) for a utf8 varchar datatype. Are these
>> calculations still valid and has anyone tried using numeric for this
>> purpose or is this really stupid?

>
> While telephone numbers typically consist of digits, they're not
> numbers: they're strings of digits. For example, a telephone number in
> Tokyo is (typically) a string of 10 digits, beginning with "03".
> 0311111111 as numeric would have unexpected results when retrieved.
> While you may not be concerned with Japanese phone numbers, I use it as
> an example to show that telephone "numbers" are actually strings.


Not only that, but since the number pad also has "*" & "#", there
/might/ be obscure times when you need to use them.

> In short, use strings (text/varchar).


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFFr3KvS9HxQb37XmcRAtWLAKCAM5hbrZtTmK4aK6N1zv 8rVALkGQCg2E0P
KMvry9tlIyCNH9LjiV+8M78=
=wY6z
-----END PGP SIGNATURE-----

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org/

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #7 (permalink)  
Old 04-09-2008, 01:27 PM
shakahshakah@gmail.com
 
Posts: n/a
Default Re: datatype advice numeric vs. varchar

On Jan 18, 1:25 am, genekh...@gmail.com (Gene) wrote:
> you're probably right, actually using LIKE with numeric works fine but
> of course i will have to contend with things like truncating zeros to
> the left etc... are there any advantages to using ascii encoding as
> far as performance of LIKE, REGEXES, INDEXES etc versus using UTF8?
>
> thanks, gene


You might want to check out the following link, particularly if your
database is UTF-8, use varchar for your TNs, and you want to look for
calls from certain NPA/NXX/block_id ranges, e.g. with "WHERE ani LIKE
'2015551%'. As I understand it, an index created with a simple "CREATE
INDEX blah ON tbl (ani)" won't get used in that situation, whereas one
created with "CREATE INDEX blah2 ON tbl (ani varchar_pattern_ops)"
will.

http://www.postgresql.org/docs/8.2/i...s-opclass.html

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 05:47 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com