Unix Technical Forum

How can I remove duplicate entries in a sql query?

This is a discussion on How can I remove duplicate entries in a sql query? within the SQL Server forums, part of the Microsoft SQL Server category; --> I have a database being populated by hits to a program on a server. The problem is each client ...


Go Back   Unix Technical Forum > Database Server Software > Microsoft SQL Server > SQL Server

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 02-29-2008, 06:34 PM
BarrySDCA
 
Posts: n/a
Default How can I remove duplicate entries in a sql query?

I have a database being populated by hits to a program on a server.
The problem is each client connection may require a few hits in a 1-2
second time frame. This is resulting in multiple database entries -
all exactly the same, except the event_id field, which is
auto-numbered.

I need a way to query the record w/out duplicates. That is, any
records exactly the same except event_id should only return one record.

Is this possible??

Thank you,

Barry

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 02-29-2008, 06:34 PM
das
 
Posts: n/a
Default Re: How can I remove duplicate entries in a sql query?

try this, not tested :-)

SELECT *
FROM table1 a,
(SELECT min(id) FROM table1) AS b
WHERE a.id = b.id


BarrySDCA wrote:
> I have a database being populated by hits to a program on a server.
> The problem is each client connection may require a few hits in a 1-2
> second time frame. This is resulting in multiple database entries -
> all exactly the same, except the event_id field, which is
> auto-numbered.
>
> I need a way to query the record w/out duplicates. That is, any
> records exactly the same except event_id should only return one record.
>
> Is this possible??
>
> Thank you,
>
> Barry


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 02-29-2008, 06:34 PM
BarrySDCA
 
Posts: n/a
Default Re: How can I remove duplicate entries in a sql query?

hrm....no luck. I ran this:


SELECT * from `playback_log` a,(SELECT min(EVENT_ID) FROM
`playback_log` ) AS b WHERE a.EVENT_ID = b.EVENT_ID


Can you see anything? I appreciate the help a bunch. thank you

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 02-29-2008, 06:34 PM
Thomas Kellerer
 
Posts: n/a
Default Re: How can I remove duplicate entries in a sql query?

Haven't tested this:

SELECT *
FROM playback_log a
WHERE a.event_id = (select min(event_id) from playback_log b
where a.field1 = b.field1)
;

You will need to list as many fields as need to match the identical rows in the
where clause of the sub-select.

Thomas


BarrySDCA wrote on 03.03.2006 00:33:
> hrm....no luck. I ran this:
>
>
> SELECT * from `playback_log` a,(SELECT min(EVENT_ID) FROM
> `playback_log` ) AS b WHERE a.EVENT_ID = b.EVENT_ID
>
>
> Can you see anything? I appreciate the help a bunch. thank you
>

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 02-29-2008, 06:34 PM
Bill Karwin
 
Posts: n/a
Default Re: How can I remove duplicate entries in a sql query?

"Thomas Kellerer" <WVIJEVPANEHT@spammotel.com> wrote in message
news:46petmFc69qnU1@individual.net...
> SELECT *
> FROM playback_log a
> WHERE a.event_id = (select min(event_id) from playback_log b
> where a.field1 = b.field1)


Here's a similar possibility, without using a correlated subquery:

SELECT a.*
FROM playback_log AS a
WHERE a.event_id IN (
SELECT MIN(b.event_id)
FROM playpack_log AS b
GROUP BY b.field1, b.field2, b.field3, ...)

What I've seen missing in the several solutions proposed is any use of GROUP
BY. You'll need to GROUP BY all the fields of the table _except_ for
event_id.

Regards,
Bill K.


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 02-29-2008, 06:35 PM
Erland Sommarskog
 
Posts: n/a
Default Re: How can I remove duplicate entries in a sql query?

Bill Karwin (bill@karwin.com) writes:
> "Thomas Kellerer" <WVIJEVPANEHT@spammotel.com> wrote in message
> news:46petmFc69qnU1@individual.net...
>> SELECT *
>> FROM playback_log a
>> WHERE a.event_id = (select min(event_id) from playback_log b
>> where a.field1 = b.field1)

>
> Here's a similar possibility, without using a correlated subquery:
>
> SELECT a.*
> FROM playback_log AS a
> WHERE a.event_id IN (
> SELECT MIN(b.event_id)
> FROM playpack_log AS b
> GROUP BY b.field1, b.field2, b.field3, ...)
>
> What I've seen missing in the several solutions proposed is any use of
> GROUP BY. You'll need to GROUP BY all the fields of the table _except_
> for event_id.


One more variation, using a derived table:

SELECT a.*
FROM playback_log AS a
JOIN (SELECT MIN(event_id)
FROM playback_log
GROUP BY field1, field2, ...) AS b ON a.event_id = b.event_id




--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #7 (permalink)  
Old 02-29-2008, 06:35 PM
figital
 
Posts: n/a
Default Re: How can I remove duplicate entries in a sql query?

I'm not sure why you need a nested query as others have suggested.

I see two options:

Select Distinct blah1, blah2, [all fields except event_id], blah99
>From LoginTable


or

Select blah1, blah2, [all fields except event_id], blah99
>From LoginTable

Group by blah1, blah2, [all fields except event_id], blah99

if you want an event_id, just for kicks, you could add an aggregator:

Select blah1, blah2, max(event_id), blah99
>From LoginTable

Group by blah1, blah2, [all fields except event_id], blah99

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #8 (permalink)  
Old 02-29-2008, 06:35 PM
Ed Prochak
 
Posts: n/a
Default Re: How can I remove duplicate entries in a sql query?


BarrySDCA wrote:
> I have a database being populated by hits to a program on a server.
> The problem is each client connection may require a few hits in a 1-2
> second time frame. This is resulting in multiple database entries -
> all exactly the same, except the event_id field, which is
> auto-numbered.
>
> I need a way to query the record w/out duplicates. That is, any
> records exactly the same except event_id should only return one record.
>
> Is this possible??
>
> Thank you,
>
> Barry


Hopefully you found the suggestions on how to filter out the duplicates
helpful.

I just want to point out that this shows the weakness of using a
pseudokey as the primary key of a table. If they really are the same
events, they shouldn't be duplicated.

Actually, I don't think you have duplicate events, since at least some
of the connection hits must have failed or timed out. Otherwise why
were there retries? So they are not really the same. Are you maybe
missing a status or result attribute?

(but the ID field is still a poor crutch for a possibly weak design.)

just some food for thought.
Ed.

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #9 (permalink)  
Old 02-29-2008, 06:35 PM
Doug
 
Posts: n/a
Default Re: How can I remove duplicate entries in a sql query?

>(but the ID field is still a poor crutch for a possibly weak design.)

hmmmm. i'd prefer to think of it as the table stores ALL hits to the
table, but this particular data consumer only wants to see ONE. The
way I think about it, your natural key would be the composite of all
fields in the table; perhaps "natural", but certainly awkword.

I like figital's solution.

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #10 (permalink)  
Old 02-29-2008, 06:35 PM
BarrySDCA
 
Posts: n/a
Default Re: How can I remove duplicate entries in a sql query?

It's exactly what I'm doing.....I'm recording hits to an advertising
engine, before they get to the media server. Some client players hit
more than once before they connect, causing a duplicate entry. I just
want to see it sorted out w/out the duplicates. I have the
distribution server logs for accurate playback reporting, I'm only
interested in seeing cleaned up hits.

I setup code to remove the duplicates, but it's not fool proof. I will
try the suggestions here and let the group know how it goes. Here is a
link to what I'm doing:

http://www.sundiegolive.com/advertising.htm notice if two duplicates
come in now and are staggered between eachother, it shows twice still.
I will try and let you all know...

I would like to say that I didn't expect such great help from the
group. I've thrown a few questions out to different groups but usually
don't have so many great replies. Thank you everyone!

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 04:58 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com