View Single Post

   
  #4 (permalink)  
Old 04-18-2008, 11:05 AM
Steven Rosenstein
 
Posts: n/a
Default Re: Can the V7.3 EXPLAIN ANALYZE be trusted?





You're probably right about my being overly optimistic about the load
imposed by EXPLAIN ANALYZE. It was just that in my previous experience
with it, I'd never seen such a large runtime discrepancy before. I even
allowed for a "caching effect" by making sure the server was all but
quiescent, and then running the three queries as quickly after one another
as I could.

The server itself is an IBM x345 with dual Xeon 3ghz CPU's (hyperthreading
turned off) and 2.5gb of RAM. O/S is RHEL3 Update 4. Disks are a
ServeRAID of some flavor, I'm not sure what.

Thanks for the heads-up about the performance of IN in 7.3. We're looking
to migrate to 8.0 or 8.0.1 when they become GA, but some of our databases
are in excess of 200gb-300gb, and we need to make sure we have a good
migration plan in place (space to store the dump out of the 7.3 db) before
we start.
__________________________________________________ _________________________________

Steven Rosenstein
IT Architect/Developer | IBM Virtual Server Administration
Voice/FAX: 845-689-2064 | Cell: 646-345-6978 | Tieline: 930-6001
Text Messaging: 6463456978 @ mobile.mycingular.com
Email: srosenst @ us.ibm.com

"Learn from the mistakes of others because you can't live long enough to
make them all yourself." -- Eleanor Roosevelt



Tom Lane
<tgl@sss.pgh.pa.u
s> To
Steven Rosenstein/New
02/06/2005 05:46 York/IBM@IBMUS
PM cc
pgsql-performance@postgresql.org
Subject
Re: [PERFORM] Can the V7.3 EXPLAIN
ANALYZE be trusted?










> From: pgsql-performance-owner@postgresql.org

[mailtogsql-performance-owner@postgresql.org]On Behalf Of Steven
Rosenstein
> >> I don't think EXPLAIN ANALYZE puts that much overhead on a query.


I think you're being overly optimistic. The explain shows that the
Materialize subnode is being entered upwards of 32 million times:

-> Materialize (cost=505.06..511.38 rows=632 width=4) (actual
time=0.00..0.02 rows=43 loops=752066)

43 * 752066 = 32338838. The instrumentation overhead is basically two
gettimeofday() kernel calls per node entry. Doing the math shows that
your machine is able to do gettimeofday() in about half a microsecond,
which isn't stellar but it's not all that slow for a kernel call.
(What's the platform here, anyway?) Nonetheless it's a couple of times
larger than the actual time needed to pull a row from a materialized
array ...

The real answer to your question is "IN (subselect) sucks before PG 7.4;
get a newer release".

regards, tom lane



---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Reply With Quote