Unix Technical Forum

Large select, best practice question

This is a discussion on Large select, best practice question within the pgsql Novice forums, part of the PostgreSQL category; --> Hello, I have three tables (very simplified): - people (~6 million records) - addresses (~7 million records) - contacts ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > pgsql Novice

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-17-2008, 09:39 PM
Bendik Rognlien Johansen
 
Posts: n/a
Default Large select, best practice question

Hello,
I have three tables (very simplified):
- people (~6 million records)
- addresses (~7 million records)
- contacts (~10 million records)

I need to select all records from the people table and join addresses
and contacts on it. I use the result to build a Lucene index. One
document for each person (including all contacts and addresses). I am
using jdbc.

When doing it this way, i end up with many more rows than people, and
my application logic takes care of putting the correct address/
contact with each person. To do this the records have to be sorted.
(Takes a loong time) This causes a lot of very similar result rows.
Example:

John Smith has 2 phone numbers and 2 addresses

John Smith | 555-67567 | Elm street 32
John Smith | 555-83463 | Elm street 32
John Smith | 555-83463 | P.O box 55
John Smith | 555-67567 | P.O box 55


This method is quite slow and error prone.

Is there a more elegant solution to this problem?

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-17-2008, 09:39 PM
Sean Davis
 
Posts: n/a
Default Re: Large select, best practice question




On 3/1/06 11:16 AM, "Bendik Rognlien Johansen" <bendik.johansen@gmail.com>
wrote:

> Hello,
> I have three tables (very simplified):
> - people (~6 million records)
> - addresses (~7 million records)
> - contacts (~10 million records)
>
> I need to select all records from the people table and join addresses
> and contacts on it. I use the result to build a Lucene index. One
> document for each person (including all contacts and addresses). I am
> using jdbc.
>
> When doing it this way, i end up with many more rows than people, and
> my application logic takes care of putting the correct address/
> contact with each person. To do this the records have to be sorted.
> (Takes a loong time) This causes a lot of very similar result rows.
> Example:
>
> John Smith has 2 phone numbers and 2 addresses
>
> John Smith | 555-67567 | Elm street 32
> John Smith | 555-83463 | Elm street 32
> John Smith | 555-83463 | P.O box 55
> John Smith | 555-67567 | P.O box 55
>
>
> This method is quite slow and error prone.
>
> Is there a more elegant solution to this problem?


What SQL are you using to do this? Should be a simple join on the tables?
Perhaps post your SQL here.

Sean



---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 10:52 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com