Unix Technical Forum

Wide vs Long tables

This is a discussion on Wide vs Long tables within the MySQL forums, part of the Database Server Software category; --> Erick T. Barkhuis schrieb: >> In fact, it's not at all uncommon to have a table with PostalCodes and ...


Go Back   Unix Technical Forum > Database Server Software > MySQL

Register FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #21 (permalink)  
Old 04-09-2008, 04:49 AM
Thomas Mlynarczyk
 
Posts: n/a
Default Re: Wide vs Long tables

Erick T. Barkhuis schrieb:

>> In fact, it's not at all uncommon to have a table with PostalCodes and
>> Cities for lookup purposes.

>
> ...and then hope that the postal code system isn't changed nationally!


Table1: Id, Name, PostalCodeId
Table2: PostalCodeId, PostalCode, City

But then Table2 is not in 3NF... However, if the postal code system is
changed, only Table2 needs to be updated.

Greetings,
Thomas

--
Ce n'est pas parce qu'ils sont nombreux à avoir tort qu'ils ont raison!
(Coluche)
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #22 (permalink)  
Old 04-09-2008, 04:49 AM
Gordon Burditt
 
Posts: n/a
Default Re: Wide vs Long tables

>>> Normalization is ALWAYS practical.
>>
>> Id Name PostalCode City
>> ----------------------------
>> 1 Thomas 77694 Kehl
>> 2 Foo 12345 Bar
>>
>> This table is not in 3rd normal form as City depends on the non-key
>> attribute PostalCode and not on Id.


Are you SURE of this? Do you know this to be true of all countries
with postal codes, now and in the future? What prohibits a country
from ruling that the postal code boundaries and the city boundaries
may not coincide for a distance larger than one centimeter, except
along national boundaries? (This might be done because city
boundaries are in dispute, but GPS readings aren't, and it saves
on bloodshed.) Thus, the relationship between City and Postal Code
is many-to-many.

I seem to recall an odd case where 3 "cities" (actually near-ghost-towns)
shared a single (5-digit AND 9-digit) ZIP code and a single block
of a single street in the USA (Texas or Arizona, I think). They
had a combined population of about 5. Further, the boundary lines
between the cities ran through the middle of the bed of two of the
residents (someone merged two of the apartments into one), and the
building they were in was in all 3 cities. The population may have
died out by now.

I also recall at one time a large political fight over a car-pooling
database. People wanted their addresses to be stated on the list
AS THEY WROTE THEM DOWN, not edited by computer. You had a lot of
married couples, one living in Fort Worth, Texas 76126 and another
living in Benbrook, Texas 76126, with identical street addresses.
I think some of this had to do with the Post Office actually opening
a branch in Benbrook, and there were some strong feelings about
Benbrook's appearance on the postal map. The alternative to allowing
both was to shut the whole thing down as there was so much fighting
over it and so many people threatened to pull out, even a lot of
people not in that zip code. Since the car-pool-matching algorithm
used the 5-digit ZIP code in any case, it would still pair husband
and wife into the same car-pool regardless of the printed city.

Now, if you REALLY want political fights over addressing, consider
some of the fights Microsoft got involved in over about four pixels
of disputed territory between India and Pakistan on a map in one
of its educational products, arguments over whether Taiwan is a
separate country (hey, PRC, you can either claim that the USA
accidentally sent nuclear fuses to the territory of one of its
enemies, or you can claim that Taiwan is your territory, but not
both at the same time), and the existence of a country called
"Palestine".

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #23 (permalink)  
Old 04-09-2008, 04:49 AM
quakey@gmail.com
 
Posts: n/a
Default Re: Wide vs Long tables

I should clarify...when I said it takes eons, it's the not the queries
themselves, but whenever any kind of operations that alter the table
structure (adding new columns, indexes, etc).

What you proposed is a many-to-many relationship and I think is nice.

Jerry thinks I should put each event on its own row, so now I have two
possible solutions and I wonder which one will be better...hmmm.

On Apr 8, 3:24*am, Gilbert <bugger....@no.spam.com> wrote:
> I don't know how well MySQL handles tables at 1.8gb but I can tell you that
> in the Oracle world this is a very small table indeed. The reason that your
> queries take eons to run is that you almost certainly cannot use indexing
> effeciently and you will be processing the entire table for each query.
>
> I would probably set up a "Customer" table, an "Events" table and
> a "Customer/Event" table with one row per customer/event. With proper
> indexing your queries will fly.
>
> Regards


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #24 (permalink)  
Old 04-09-2008, 04:49 AM
Peter H. Coffin
 
Posts: n/a
Default Re: Wide vs Long tables

On Tue, 8 Apr 2008 18:30:46 -0700 (PDT), quakey@gmail.com wrote:
> I should clarify...when I said it takes eons, it's the not the queries
> themselves, but whenever any kind of operations that alter the table
> structure (adding new columns, indexes, etc).
>
> What you proposed is a many-to-many relationship and I think is nice.
>
> Jerry thinks I should put each event on its own row, so now I have two
> possible solutions and I wonder which one will be better...hmmm.


At the general case, they're actually the same solution. Jerry's
solution just basically omits the event detail table, leaving you with
just the person table and the corollation table with only the person and
event identifiers, but no event detail.

--
Windows is a pane in the ass....
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #25 (permalink)  
Old 04-09-2008, 04:49 AM
Jerry Stuckle
 
Posts: n/a
Default Re: Wide vs Long tables

Peter H. Coffin wrote:
> On Tue, 8 Apr 2008 18:30:46 -0700 (PDT), quakey@gmail.com wrote:
>> I should clarify...when I said it takes eons, it's the not the queries
>> themselves, but whenever any kind of operations that alter the table
>> structure (adding new columns, indexes, etc).
>>
>> What you proposed is a many-to-many relationship and I think is nice.
>>
>> Jerry thinks I should put each event on its own row, so now I have two
>> possible solutions and I wonder which one will be better...hmmm.

>
> At the general case, they're actually the same solution. Jerry's
> solution just basically omits the event detail table, leaving you with
> just the person table and the corollation table with only the person and
> event identifiers, but no event detail.
>


Not necessarily - it just wasn't specified one way or the other. It
looks like currently there is no event detail; rather just a column
referencing the event. If all they need is the event id, then they
don't need a third table. However, if they need the detail (highly
recommended), then there should be a third table containing the event id
and the description. The second table then just becomes a many-to-many
link between the original (first) and detail tables.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex@attglobal.net
==================

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #26 (permalink)  
Old 04-09-2008, 05:06 PM
Gilbert
 
Posts: n/a
Default Re: Wide vs Long tables

Jerry Stuckle wrote:

> Peter H. Coffin wrote:
>> On Tue, 8 Apr 2008 18:30:46 -0700 (PDT), quakey@gmail.com wrote:
>>> I should clarify...when I said it takes eons, it's the not the queries
>>> themselves, but whenever any kind of operations that alter the table
>>> structure (adding new columns, indexes, etc).
>>>
>>> What you proposed is a many-to-many relationship and I think is nice.
>>>
>>> Jerry thinks I should put each event on its own row, so now I have two
>>> possible solutions and I wonder which one will be better...hmmm.

>>
>> At the general case, they're actually the same solution. Jerry's
>> solution just basically omits the event detail table, leaving you with
>> just the person table and the corollation table with only the person and
>> event identifiers, but no event detail.
>>

>
> Not necessarily - it just wasn't specified one way or the other. It
> looks like currently there is no event detail; rather just a column
> referencing the event. If all they need is the event id, then they
> don't need a third table. However, if they need the detail (highly
> recommended), then there should be a third table containing the event id
> and the description. The second table then just becomes a many-to-many
> link between the original (first) and detail tables.
>


The other possible advantage of having an event table (with or without
additional details) is that it provies an integrity check - i.e a
customer/event record cannot be entered unless the event id is known to the
system.
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #27 (permalink)  
Old 04-09-2008, 10:42 PM
Thomas Mlynarczyk
 
Posts: n/a
Default Re: Wide vs Long tables

Gordon Burditt schrieb:
>>>> Normalization is ALWAYS practical.
>>> Id Name PostalCode City
>>> ----------------------------
>>> 1 Thomas 77694 Kehl
>>> 2 Foo 12345 Bar
>>>
>>> This table is not in 3rd normal form as City depends on the non-key
>>> attribute PostalCode and not on Id.

>
> Are you SURE of this? Do you know this to be true of all countries
> with postal codes, now and in the future? What prohibits a country
> from ruling that the postal code boundaries and the city boundaries
> may not coincide for a distance larger than one centimeter, except
> along national boundaries? (This might be done because city
> boundaries are in dispute, but GPS readings aren't, and it saves
> on bloodshed.) Thus, the relationship between City and Postal Code
> is many-to-many.


But Jerry just told me to normalize!? And if it's many-to-many,
shouldn't it go into a separate table anyway?

> I seem to recall an odd case where 3 "cities" (actually near-ghost-towns)
> shared a single (5-digit AND 9-digit) ZIP code and a single block
> of a single street in the USA (Texas or Arizona, I think). They
> had a combined population of about 5. Further, the boundary lines
> between the cities ran through the middle of the bed of two of the
> residents (someone merged two of the apartments into one), and the
> building they were in was in all 3 cities. The population may have
> died out by now.


So that particular problem is solved ;-) Anyway, for all practical
purposes, if you put *one* (any) of the many zip-codes a place has on
the letter you're sending, it will arrive, won't it? If there is indeed
such ambiguity, one could always choose one of the codes as the
canonical one.

Greetings,
Thomas

--
Ce n'est pas parce qu'ils sont nombreux à avoir tort qu'ils ont raison!
(Coluche)
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #28 (permalink)  
Old 04-11-2008, 12:53 AM
Gordon Burditt
 
Posts: n/a
Default Re: Wide vs Long tables

>>>>> Normalization is ALWAYS practical.
>>>> Id Name PostalCode City
>>>> ----------------------------
>>>> 1 Thomas 77694 Kehl
>>>> 2 Foo 12345 Bar
>>>>
>>>> This table is not in 3rd normal form as City depends on the non-key
>>>> attribute PostalCode and not on Id.

>>
>> Are you SURE of this? Do you know this to be true of all countries
>> with postal codes, now and in the future? What prohibits a country
>> from ruling that the postal code boundaries and the city boundaries
>> may not coincide for a distance larger than one centimeter, except
>> along national boundaries? (This might be done because city
>> boundaries are in dispute, but GPS readings aren't, and it saves
>> on bloodshed.) Thus, the relationship between City and Postal Code
>> is many-to-many.

>
>But Jerry just told me to normalize!? And if it's many-to-many,
>shouldn't it go into a separate table anyway?


If the city name depends on the individual person, not just on the
Postal Code, then it already *IS* normalized. It's the job of a
database to record and store data, not alter the data to fit the
view of someone who makes assumptions too quickly.

If you don't need to presort outgoing mail, and the only purpose
of the address is to send postal mail (not locate stores nearby,
etc.) then all you need is one address field (combining city, state,
zip, street, box number, suite number and a bunch of other stuff
that might go into an address), which may contain multiple lines
in any format the member says the post office will accept.

>> I seem to recall an odd case where 3 "cities" (actually near-ghost-towns)
>> shared a single (5-digit AND 9-digit) ZIP code and a single block
>> of a single street in the USA (Texas or Arizona, I think). They
>> had a combined population of about 5. Further, the boundary lines
>> between the cities ran through the middle of the bed of two of the
>> residents (someone merged two of the apartments into one), and the
>> building they were in was in all 3 cities. The population may have
>> died out by now.

>
>So that particular problem is solved ;-)


Yes, but how many more are there?

>Anyway, for all practical
>purposes, if you put *one* (any) of the many zip-codes a place has on
>the letter you're sending, it will arrive, won't it?


Not necessarily (and regardless of the address, there's never any guarantee
it will arrive anywhere). I did once manage to get a piece of mail to a
post office box with *ONLY* a zip code (a 9-digit zip code is sometimes
unique for post office boxes, and in this case it was) or with a zip code
and totally unrelated city and state.

>If there is indeed
>such ambiguity, one could always choose one of the codes as the
>canonical one.


Sometimes that can get you shot (consider choosing the canonical
country name "Israel" vs. "Palestine" when the customers KNOW WHERE
YOU LIVE and you are in easy rocket-launcher range.


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #29 (permalink)  
Old 04-12-2008, 01:05 AM
Thomas Mlynarczyk
 
Posts: n/a
Default Re: Wide vs Long tables

Gordon Burditt schrieb:

[Choose one postal code as the canonical one]
> Sometimes that can get you shot (consider choosing the canonical
> country name "Israel" vs. "Palestine" when the customers KNOW WHERE
> YOU LIVE and you are in easy rocket-launcher range.


So I should keep everything in a single table to safe my life? Didn't
know that normalization could be that dangerous...

Greetings,
Thomas

--
Ce n'est pas parce qu'ils sont nombreux à avoir tort qu'ils ont raison!
(Coluche)
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #30 (permalink)  
Old 04-12-2008, 01:05 AM
Gordon Burditt
 
Posts: n/a
Default Re: Wide vs Long tables

>[Choose one postal code as the canonical one]
>> Sometimes that can get you shot (consider choosing the canonical
>> country name "Israel" vs. "Palestine" when the customers KNOW WHERE
>> YOU LIVE and you are in easy rocket-launcher range.

>
>So I should keep everything in a single table to safe my life? Didn't
>know that normalization could be that dangerous...


It's not normalization if the City or Country depends not only on the
Postal Code but also on the politics of the individual person involved.
In that case the City or Country belongs in the single table.

What's wrong with choosing WHAT THE CUSTOMER TOLD YOU?
Why do you think you know his address better than he/she does?

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 06:12 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com