Unix Technical Forum

DISTINCT ON not working...?

  #1 (permalink)  
Old 04-19-2008, 03:07 PM
Phillip Smith
 

Hi all,

Strange one - I have a nightly export / import routine that exports from one
database and imports to another. Has been working fine for several months,
but last night it died on a unique constraint.

To cut out all the details, the code that is causing the problem:
SELECT DISTINCT ON (ean)
    code,
    CASE WHEN ean IS NULL OR valid_barcode(ean) = false
         THEN null
         ELSE ean
    END AS ean
FROM TMPTABLE
WHERE code NOT IN (SELECT code FROM stock_deleted)
    AND ean IS NOT NULL

That is the code that generates the error on the unique constraint against
the ean column.

If I play with that and run this:
SELECT DISTINCT ON (ean)
    CASE WHEN ean IS NULL OR valid_barcode(ean) = false
         THEN null
         ELSE ean
    END AS ean,
    count(*)
FROM TMPTABLE
WHERE code NOT IN (SELECT code FROM stock_deleted)
    AND ean IS NOT NULL
GROUP BY ean

I get several thousand rows returned, all with a count(*) of 1, except one
row:
3246576919422 2

DISTINCT ON should eliminate one of those rows that is making that 2 - as I
said, it's been working fine for several months, and it is still doing it
correctly for approximately 100 other rows that have duplicate ean codes.

Can anyone give me a hand to work out why this one is doubling up?!

Cheers,
~p


*******************Confidentiality and Privilege Notice*******************

The material contained in this message is privileged and confidential to
the addressee. If you are not the addressee indicated in this message or
responsible for delivery of the message to such person, you may not copy
or deliver this message to anyone, and you should destroy it and kindly
notify the sender by reply email.

Information in this message that does not relate to the official business
of Weatherbeeta must be treated as neither given nor endorsed by Weatherbeeta.
Weatherbeeta, its employees, contractors or associates shall not be liable
for direct, indirect or consequential loss arising from transmission of this
message or any attachments

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

  #2 (permalink)  
Old 04-19-2008, 03:07 PM
Tom Lane
 

"Phillip Smith" <phillip.smith@weatherbeeta.com.au> writes:
> To cut out all the details, the code that is causing the problem:
> SELECT DISTINCT ON (ean)
> code,
> CASE WHEN ean IS NULL OR valid_barcode(ean) = false THEN
> null ELSE ean END AS ean
> FROM TMPTABLE
> WHERE code NOT IN (SELECT code FROM stock_deleted)
> AND ean IS NOT NULL


Perhaps you've confused yourself by using "ean" as both an input and an
output column name? I think that the "ean" in the DISTINCT ON clause
will effectively refer to that CASE-expression, whereas the one in the
WHERE clause is just referring to the underlying column (and thus making
the IS NULL test in the CASE rather pointless).

regards, tom lane
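
A minimal sketch of the name-resolution rule described above, under stated assumptions: it uses Python's bundled sqlite3 module rather than PostgreSQL, and ORDER BY stands in for DISTINCT ON, which SQLite does not have (the PostgreSQL documentation describes DISTINCT ON expressions as being interpreted by the same rules as ORDER BY). The table contents, the codes A1..A4, and the literal 'bad' standing in for a barcode that fails valid_barcode() are all made up for illustration.

```python
import sqlite3


def alias_vs_column():
    """Show that WHERE sees the table column while ORDER BY sees the output alias."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tmptable (code TEXT, ean TEXT)")
    conn.executemany(
        "INSERT INTO tmptable VALUES (?, ?)",
        [
            ("A1", "3246576919422"),
            ("A2", "bad"),            # hypothetical invalid barcode
            ("A3", "1111111111111"),
            ("A4", None),             # filtered out by the WHERE clause
        ],
    )
    rows = conn.execute(
        """
        SELECT CASE WHEN ean = 'bad' THEN NULL ELSE ean END AS ean
          FROM tmptable
         WHERE ean IS NOT NULL      -- input column: 'bad' passes this test
         ORDER BY ean               -- output alias: the CASE result
        """
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(alias_vs_column())
```

In this sketch the 'bad' row survives the WHERE filter, because that filter tests the underlying column, and then comes back as NULL from the CASE expression. Giving the output column a different alias (for example clean_ean, a hypothetical name) would remove the ambiguity.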

  #3 (permalink)  
Old 04-19-2008, 03:07 PM
Phillip Smith
 

Removing the CASE expression altogether:
SELECT DISTINCT ON (ean)
    ean,
    count(*)
FROM TMPTABLE
WHERE code NOT IN (SELECT code FROM stock_deleted)
    AND ean IS NOT NULL
GROUP BY ean

Still gives me:
3246576919422 2



-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Tuesday, 20 February 2007 15:33
To: Phillip Smith
Cc: pgsql-sql@postgresql.org
Subject: Re: [SQL] DISTINCT ON not working...?

Perhaps you've confused yourself by using "ean" as both an input and an
output column name? I think that the "ean" in the DISTINCT ON clause
will effectively refer to that CASE-expression, whereas the one in the
WHERE clause is just referring to the underlying column (and thus making
the IS NULL test in the CASE rather pointless).

regards, tom lane


  #4 (permalink)  
Old 04-19-2008, 03:07 PM
Marcin Stępnicki

On Tue, 20 Feb 2007 15:36:32 +1100, Phillip Smith wrote:

> Removing the CASE statement all together:
> SELECT DISTINCT ON (ean)
> ean,
> count(*)
> FROM TMPTABLE
> WHERE code NOT IN (SELECT code FROM stock_deleted)
> AND ean IS NOT NULL
> GROUP BY ean
>
> Still gives me:
> 3246576919422 2


Wild guess - have you tried reindexing this table? I haven't seen
corrupted indexes since 7.1, though - it usually means subtle hardware
problems.

--
| And Do What You Will be the challenge | http://apcoln.linuxpl.org
| So be it in love that harms none | http://biznes.linux.pl
| For this is the only commandment. | http://www.juanperon.info
`---* JID: Aragorn_Vime@jabber.org *---' http://www.naszedzieci.org



  #5 (permalink)  
Old 04-19-2008, 03:07 PM
Phillip Smith
 

This is a temporary table (with no indexes) that gets created in the same
transaction block as the SELECT gets run, but I tried creating an index on
the ean column anyway with no luck:

CREATE INDEX ean_idx ON TMPTABLE USING btree (ean);
SELECT DISTINCT ON (ean)
    ean,
    count(*)
FROM TMPTABLE
WHERE code NOT IN (SELECT code FROM stock_deleted)
    AND ean IS NOT NULL
GROUP BY ean;

Still returns:
3246576919422 2
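
One way to read that count of 2: in SQL's evaluation order, GROUP BY and count(*) are computed before DISTINCT is applied, so the figure counts source rows per ean rather than rows left after de-duplication. On that reading, the diagnostic shows that two rows in the filtered table share the ean, and by itself cannot show whether DISTINCT ON dropped one of them in the original query. A minimal sketch, assuming Python's bundled sqlite3 module with plain DISTINCT in place of PostgreSQL's DISTINCT ON, and with made-up codes A1..A3:

```python
import sqlite3


def grouped_counts():
    """Return (ean, count) pairs from a query combining DISTINCT and GROUP BY."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tmptable (code TEXT, ean TEXT)")
    conn.executemany(
        "INSERT INTO tmptable VALUES (?, ?)",
        [
            ("A1", "3246576919422"),
            ("A2", "3246576919422"),  # second source row with the same ean
            ("A3", "1111111111111"),
        ],
    )
    rows = conn.execute(
        """
        SELECT DISTINCT ean, count(*)
          FROM tmptable
         WHERE ean IS NOT NULL
         GROUP BY ean               -- one output row per ean, counted first
         ORDER BY ean
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for ean, n in grouped_counts():
        print(ean, n)
```

Each ean appears once in the result, and the duplicated one still reports 2, because DISTINCT only sees the already-grouped rows.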


-----Original Message-----
From: pgsql-sql-owner@postgresql.org [mailto:pgsql-sql-owner@postgresql.org]
On Behalf Of Marcin Stępnicki
Sent: Tuesday, 20 February 2007 23:34
To: pgsql-sql@postgresql.org
Subject: Re: [SQL] DISTINCT ON not working...?

Wild guess - have you tried reindexing this table? I haven't seen
corrupted indexes since 7.1, though - it usually means subtle hardware
problems.

