Unix Technical Forum

Re: Database normalization

This is a discussion on Re: Database normalization within the pgsql Sql forums, part of the PostgreSQL category; --> Hello, > > I have a fairly basic question about database design where im not sure > which approach ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > pgsql Sql

Register FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-19-2008, 05:50 PM
Sebastian Ritter
 
Posts: n/a
Default Re: Database normalization

Hello,

>
> I have a fairly basic question about database design where im not sure
> which approach is considered correct.
>
> I have two different entities: Clients and Services. Both allow users to
> add progressive updates about the two entities.
>
> The update/message format is exactly the same for both. Should I make two
> different tables:
>
> client_updates and service_updates
> or
> one table with extra columns : is_client, client_id, service_id, where
> either client_id or service_id would be null depending on the is_client
> boolean?
>
> The major problem is simply relating the foreign key in the updates table
> back to the correct entity, client or service.
>
> Regards,
> Sebastian
>
>


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-19-2008, 05:50 PM
Asko Oja
 
Posts: n/a
Default Re: Database normalization

Hi

I would create one table to log updates with type field and one id filed
that contains either client id or service id according to type. On such a
table i would forget about foreign keys (thay are better to be avoided
anyway if you have millions of records in tables).
That way you can share code that displays update history or even create some
generic components.
In my experience normalized databases have ended up with hundreds of tables
and code and screen generation that is very far from what users actually
need. Denormalizing and refactoring these databases reduces number of tables
by magnitude.

Regards,
Asko


On 8/28/07, Sebastian Ritter <ritter.sebastian@gmail.com> wrote:
>
> Hello,
>
> >
> > I have a fairly basic question about database design where im not sure
> > which approach is considered correct.
> >
> > I have two different entities: Clients and Services. Both allow users to
> > add progressive updates about the two entities.
> >
> > The update/message format is exactly the same for both. Should I make
> > two different tables:
> >
> > client_updates and service_updates
> > or
> > one table with extra columns : is_client, client_id, service_id, where
> > either client_id or service_id would be null depending on the is_client
> > boolean?
> >
> > The major problem is simply relating the foreign key in the updates
> > table back to the correct entity, client or service.
> >
> > Regards,
> > Sebastian
> >
> >

>


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-19-2008, 05:51 PM
Andrew Sullivan
 
Posts: n/a
Default Re: Database normalization

On Tue, Aug 28, 2007 at 12:47:45PM +0100, Sebastian Ritter wrote:
> > The update/message format is exactly the same for both. Should I make two
> > different tables:


> > one table with extra columns : is_client, client_id, service_id, where
> > either client_id or service_id would be null depending on the is_client
> > boolean?


Is the rest of the data the same? If so, then one table is right.
If not, then more than one table. In either case, I really hate the
idea of two columns, one of which is always null. But if you're
going to do that, make sure you add a CHECK constraint where !(col1
IS NULL and col2 IS NULL).

A

--
Andrew Sullivan | ajs@crankycanuck.ca
I remember when computers were frustrating because they *did* exactly what
you told them to. That actually seems sort of quaint now.
--J.D. Baldwin

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-19-2008, 05:51 PM
Sebastian Ritter
 
Posts: n/a
Default Re: Database normalization

Thanks for the information.

Both tables would be exactly sames apart from the foreign key relation to
clients or services. I agree that it seems strange to have one column that
is always null. Im using Django as my Object relational Mapper so im pretty
sure I can not add a constraint such as : CHECK constraint where !(col1
IS NULL and col2 IS NULL).

Another factor ive been considering is that one of the fields in this
table(s) definition(s) is free flowing text which could potentially become
very large. Should I take this in to
consideration when deciding whether to split the tables? In terms of
searching speed that is.

Kindest regards.
Sebastian

On 8/28/07, Andrew Sullivan <ajs@crankycanuck.ca> wrote:
>
> On Tue, Aug 28, 2007 at 12:47:45PM +0100, Sebastian Ritter wrote:
> > > The update/message format is exactly the same for both. Should I make

> two
> > > different tables:

>
> > > one table with extra columns : is_client, client_id, service_id,

> where
> > > either client_id or service_id would be null depending on the

> is_client
> > > boolean?

>
> Is the rest of the data the same? If so, then one table is right.
> If not, then more than one table. In either case, I really hate the
> idea of two columns, one of which is always null. But if you're
> going to do that, make sure you add a CHECK constraint where !(col1
> IS NULL and col2 IS NULL).
>
> A
>
> --
> Andrew Sullivan | ajs@crankycanuck.ca
> I remember when computers were frustrating because they *did* exactly what
> you told them to. That actually seems sort of quaint now.
> --J.D. Baldwin
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faq
>


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 04-19-2008, 05:51 PM
Erik Jones
 
Posts: n/a
Default Re: Database normalization

On Aug 28, 2007, at 6:47 AM, Sebastian Ritter wrote:

> Hello,
>
> I have a fairly basic question about database design where im not
> sure which approach is considered correct.
>
> I have two different entities: Clients and Services. Both allow
> users to add progressive updates about the two entities.
>
> The update/message format is exactly the same for both. Should I
> make two different tables:
>
> client_updates and service_updates
> or
> one table with extra columns : is_client, client_id, service_id,
> where either client_id or service_id would be null depending on the
> is_client boolean?
>
> The major problem is simply relating the foreign key in the updates
> table back to the correct entity, client or service.


Are client_id and service_id artificial ids? If so, is it an actual
requirement that they have distinct id sequences? I.e. is it
necessary that there can be both client_id=1 and service_id=1? If
not, you can use one table, say Entities, lose client_id, service_id,
and is_clent and replace them with entity_id and entity_type. Then
your foreign key in your updates table just needs to reference
entity_id, or possibly (entity_id, entity_type) if you want to be
really strict about things. If you want to make querying the table
simple for either case create Clients and Services views on the
table. This also gives you the ability to add other entity types
where you may to track whatever kind of updates these are.

Erik Jones

Software Developer | EmmaŽ
erik@myemma.com
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com



---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 04-19-2008, 05:51 PM
Andrew Sullivan
 
Posts: n/a
Default Re: Database normalization

On Tue, Aug 28, 2007 at 03:37:22PM +0100, Sebastian Ritter wrote:
> Thanks for the information.
>
> Both tables would be exactly sames apart from the foreign key relation to
> clients or services.


Hmm. Are the services or clients tables different? A useful rule of
thumb is that, to the extent you can sort things into "kinds of
data", then you should have exactly one space for each one. (I hope
that's clear.)

> Another factor ive been considering is that one of the fields in this
> table(s) definition(s) is free flowing text which could potentially become
> very large. Should I take this in to
> consideration when deciding whether to split the tables? In terms of
> searching speed that is.


I'd put it in its own table, probably, unless you're going to use it
frequently.

A

--
Andrew Sullivan | ajs@crankycanuck.ca
Everything that happens in the world happens at some place.
--Jane Jacobs

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #7 (permalink)  
Old 04-19-2008, 05:51 PM
Sebastian Ritter
 
Posts: n/a
Default Re: Database normalization

Hi,

The views option sounds best in my opinion but sadly i cant play with things
like inheritance or views as Im using Django as my ORM and since its
relatively new, it doesnt yet support such database operations.

Id like the ids for each "entity" to be serial and hence there would be
overlap between client_ids and service_ids.

Cheers,
Sebastian

On 8/28/07, Erik Jones <erik@myemma.com> wrote:
>
> On Aug 28, 2007, at 6:47 AM, Sebastian Ritter wrote:
>
> > Hello,
> >
> > I have a fairly basic question about database design where im not
> > sure which approach is considered correct.
> >
> > I have two different entities: Clients and Services. Both allow
> > users to add progressive updates about the two entities.
> >
> > The update/message format is exactly the same for both. Should I
> > make two different tables:
> >
> > client_updates and service_updates
> > or
> > one table with extra columns : is_client, client_id, service_id,
> > where either client_id or service_id would be null depending on the
> > is_client boolean?
> >
> > The major problem is simply relating the foreign key in the updates
> > table back to the correct entity, client or service.

>
> Are client_id and service_id artificial ids? If so, is it an actual
> requirement that they have distinct id sequences? I.e. is it
> necessary that there can be both client_id=1 and service_id=1? If
> not, you can use one table, say Entities, lose client_id, service_id,
> and is_clent and replace them with entity_id and entity_type. Then
> your foreign key in your updates table just needs to reference
> entity_id, or possibly (entity_id, entity_type) if you want to be
> really strict about things. If you want to make querying the table
> simple for either case create Clients and Services views on the
> table. This also gives you the ability to add other entity types
> where you may to track whatever kind of updates these are.
>
> Erik Jones
>
> Software Developer | Emma(r)
> erik@myemma.com
> 800.595.4401 or 615.292.5888
> 615.292.0777 (fax)
>
> Emma helps organizations everywhere communicate & market in style.
> Visit us online at http://www.myemma.com
>
>
>


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #8 (permalink)  
Old 04-19-2008, 05:51 PM
Sebastian Ritter
 
Posts: n/a
Default Re: Database normalization

Hi,

On 8/28/07, Andrew Sullivan <ajs@crankycanuck.ca> wrote:
>
> On Tue, Aug 28, 2007 at 03:37:22PM +0100, Sebastian Ritter wrote:
> > Thanks for the information.
> >
> > Both tables would be exactly sames apart from the foreign key relation

> to
> > clients or services.

>
> Hmm. Are the services or clients tables different? A useful rule of
> thumb is that, to the extent you can sort things into "kinds of
> data", then you should have exactly one space for each one. (I hope
> that's clear.)



The table definition is exactly the same. The only difference is whether
the
row refers to a client or service.


> Another factor ive been considering is that one of the fields in this
> > table(s) definition(s) is free flowing text which could potentially

> become
> > very large. Should I take this in to
> > consideration when deciding whether to split the tables? In terms of
> > searching speed that is.

>
> I'd put it in its own table, probably, unless you're going to use it
> frequently.



Why would frequency of use change whether or not I use one or two tables?

Sebastian

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #9 (permalink)  
Old 04-19-2008, 05:51 PM
Andrew Sullivan
 
Posts: n/a
Default Re: Database normalization

On Tue, Aug 28, 2007 at 04:15:01PM +0100, Sebastian Ritter wrote:
> > I'd put it in its own table, probably, unless you're going to use it
> > frequently.

>
> Why would frequency of use change whether or not I use one or two tables?


If you have a possibly-large field that does not get used very much,
you have to pay the I/O for it every time you look at that row, even
if it's not used. Also, it sounds like it might not be used by every
row? In that case, normalization calls for it to be pushed out too.

A

--
Andrew Sullivan | ajs@crankycanuck.ca
However important originality may be in some fields, restraint and
adherence to procedure emerge as the more significant virtues in a
great many others. --Alain de Botton

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #10 (permalink)  
Old 04-19-2008, 05:51 PM
Bart Degryse
 
Posts: n/a
Default Re: Database normalization

<quote>Im using Django as my Object relational Mapper so im pretty sure I can not add a constraint such as ...</quote>
Then you should seriously consider changing your mapper.

>>> "Sebastian Ritter" <ritter.sebastian@gmail.com> 2007-08-28 16:37 >>>

Thanks for the information.

Both tables would be exactly sames apart from the foreign key relation to clients or services. I agree that it seems strange to have one column that is always null. Im using Django as my Object relational Mapper so im pretty sure I can not add a constraint such as : CHECK constraint where !(col1
IS NULL and col2 IS NULL).

Another factor ive been considering is that one of the fields in this table(s) definition(s) is free flowing text which could potentially become very large. Should I take this in to
consideration when deciding whether to split the tables? In terms of searching speed that is.

Kindest regards.
Sebastian

On 8/28/07, Andrew Sullivan <ajs@crankycanuck.ca> wrote:

On Tue, Aug 28, 2007 at 12:47:45PM +0100, Sebastian Ritter wrote:
> > The update/message format is exactly the same for both. Should I make two
> > different tables:


> > one table with extra columns : is_client, client_id, service_id, where
> > either client_id or service_id would be null depending on the is_client
> > boolean?


Is the rest of the data the same? If so, then one table is right.
If not, then more than one table. In either case, I really hate the
idea of two columns, one of which is always null. But if you're
going to do that, make sure you add a CHECK constraint where !(col1
IS NULL and col2 IS NULL).

A

--
Andrew Sullivan | ajs@crankycanuck.ca
I remember when computers were frustrating because they *did* exactly what
you told them to. That actually seems sort of quaint now.
--J.D. Baldwin

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq



Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 03:38 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com