Unix Technical Forum

I/O on select count(*)

This is a discussion on I/O on select count(*) within the Pgsql Performance forums, part of the PostgreSQL category; --> I have a large table (~ 2B rows) that contains an indexed timestamp column. I am attempting to run ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > Pgsql Performance

Register FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 05-16-2008, 01:44 PM
Doug Eck
 
Posts: n/a
Default I/O on select count(*)

I have a large table (~ 2B rows) that contains an indexed timestamp column. I am attempting to run a query to determine the number of rows for a given day using something like "select count(*) from tbl1 where ts between '2008-05-12 00:00:00.000' and '2008-05-12 23:59:59.999'". Explain tells me that the query will be done using an index scan (as I would expect), and I realize that it is going to take a while. My question concerns some unusual I/O activity on the box (SUSE) when I run the query.

For the first couple of minutes I see reads only. After that vmstat shows mixed reads and writes in a ratio of about 1 block read to 5 blocks written. We have determined that files in our data and log partitions are being hit, but the file system itself is not growing during this time (it appears to be writing over the same chunk of space over and over again). Memory on the box is not being swapped while all of this is happening. I would have guessed that a "select count(*)" would not require a bunch of writes, and I can't begin to figure out why the number of blocks written are so much higher than the blocks read. If I modify the where clause to only count the rows for a given minute or two, I see the reads but I never see the unusual write behavior.

Any thoughts into what could be going on? Thanks in advance for your help.

Doug




Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 05-16-2008, 01:44 PM
Merlin Moncure
 
Posts: n/a
Default Re: I/O on select count(*)

On Wed, May 14, 2008 at 4:09 PM, Doug Eck <deck1@yahoo.com> wrote:
> I have a large table (~ 2B rows) that contains an indexed timestamp column.
> I am attempting to run a query to determine the number of rows for a given
> day using something like "select count(*) from tbl1 where ts between
> '2008-05-12 00:00:00.000' and '2008-05-12 23:59:59.999'". Explain tells me
> that the query will be done using an index scan (as I would expect), and I
> realize that it is going to take a while. My question concerns some unusual
> I/O activity on the box (SUSE) when I run the query.
>
> For the first couple of minutes I see reads only. After that vmstat shows
> mixed reads and writes in a ratio of about 1 block read to 5 blocks
> written. We have determined that files in our data and log partitions are
> being hit, but the file system itself is not growing during this time (it
> appears to be writing over the same chunk of space over and over again).
> Memory on the box is not being swapped while all of this is happening. I
> would have guessed that a "select count(*)" would not require a bunch of
> writes, and I can't begin to figure out why the number of blocks written are
> so much higher than the blocks read. If I modify the where clause to only
> count the rows for a given minute or two, I see the reads but I never see
> the unusual write behavior.
>
> Any thoughts into what could be going on? Thanks in advance for your help.


can you post the exact output of explain analyze? (or, at least,
explain if the query takes too long)

merlin

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 05-16-2008, 01:44 PM
Robert Lor
 
Posts: n/a
Default Re: I/O on select count(*)

Tom Lane wrote:
> It's certainly true that hint-bit updates cost something, but
> quantifying how much isn't easy.

Maybe we can instrument the code with DTrace probes to quantify the
actual costs. I'm not familiar with the code, but if I know where to
place the probes, I can easily do a quick test and provide the data.
> The off-the-cuff answer is to do the
> select count(*) twice and see how much cheaper the second one is.

Doesn't seem the second run is cheaper as shown in the results below.
The data came from the probes I've added recently.

*************** Run #1 **********************
SQL Statement : select count(*) from accounts;
Execution time : 1086.58 (ms)

============ Buffer Read Counts ============
Tablespace Database Table Count
1663 16384 1247 1
1663 16384 2600 1
1663 16384 2703 1
1663 16384 1255 2
1663 16384 2650 2
1663 16384 2690 3
1663 16384 2691 3
1663 16384 16397 8390

======== Dirty Buffer Write Counts =========
Tablespace Database Table Count
1663 16384 16397 2865

Total buffer cache hits : 1932
Total buffer cache misses : 6471
Average read time from cache : 5638 (ns)
Average read time from disk : 143371 (ns)
Average write time to disk : 20368 (ns)


*************** Run #2 **********************
SQL Statement : select count(*) from accounts;
Execution time : 1115.94 (ms)

============ Buffer Read Counts ============
Tablespace Database Table Count
1663 16384 16397 8390

======== Dirty Buffer Write Counts =========
Tablespace Database Table Count
1663 16384 16397 2865

Total buffer cache hits : 1931
Total buffer cache misses : 6459
Average read time from cache : 4357 (ns)
Average read time from disk : 154127 (ns)
Average write time to disk : 20368 (ns)


-Robert

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 05-18-2008, 10:04 PM
Greg Smith
 
Posts: n/a
Default Re: I/O on select count(*)

On Thu, 15 May 2008, Alvaro Herrera wrote:

> Starting a transaction does not write anything to pg_clog.


For Matt and others, some details here are in
src/backend/access/transam/README:

"pg_clog records the commit status for each transaction that has been
assigned an XID."

"Transactions and subtransactions are assigned permanent XIDs only when/if
they first do something that requires one --- typically,
insert/update/delete a tuple, though there are a few other places that
need an XID assigned."

After reading the code and that documentation a bit, the part I'm still
not sure about is whether the CLOG entry is created when the XID is
assigned and then kept current as the state changes, or whether that isn't
even in CLOG until the transaction is committed. It seems like the
latter, but there's some ambiguity in the wording and too many code paths
for me to map right now.

From there, it doesn't make its way out to disk until the internal CLOG
buffers are filled, at which point the least recently used buffer there is
evicted to permanent storage.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 05-18-2008, 10:04 PM
Simon Riggs
 
Posts: n/a
Default Re: I/O on select count(*)


On Fri, 2008-05-16 at 14:05 -0400, Greg Smith wrote:
> After reading the code and that documentation a bit, the part I'm
> still not sure about is whether the CLOG entry is created when the XID
> is assigned and then kept current as the state changes, or whether
> that isn't even in CLOG until the transaction is committed. It seems
> like the latter, but there's some ambiguity in the wording and too
> many code paths for me to map right now.


Alvaro already said this, I thought? The clog is updated only at sub or
main transaction end, thank goodness. When the transactionid is assigned
the page of the clog that contains that transactionid is checked to see
if it already exists and if not, it is initialised.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 05-18-2008, 10:04 PM
Alvaro Herrera
 
Posts: n/a
Default Re: I/O on select count(*)

Greg Smith wrote:

> After reading the code and that documentation a bit, the part I'm still
> not sure about is whether the CLOG entry is created when the XID is
> assigned and then kept current as the state changes, or whether that
> isn't even in CLOG until the transaction is committed. It seems like the
> latter, but there's some ambiguity in the wording and too many code paths
> for me to map right now.


pg_clog is allocated in pages of 8kB apiece(*). On allocation, pages are
zeroed, which is the bit pattern for "transaction in progress". So when
a transaction starts, it only needs to ensure that the pg_clog page that
corresponds to it is allocated, but it need not write anything to it.

(*) Each transaction needs 2 bits, so on a 8 kB page there is space for
4 transactions/byte * 8 pages * 1kB/page = 32k transactions.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #7 (permalink)  
Old 05-18-2008, 10:04 PM
Alvaro Herrera
 
Posts: n/a
Default Re: I/O on select count(*)

Alvaro Herrera wrote:

> pg_clog is allocated in pages of 8kB apiece(*). On allocation, pages are
> zeroed, which is the bit pattern for "transaction in progress". So when
> a transaction starts, it only needs to ensure that the pg_clog page that
> corresponds to it is allocated, but it need not write anything to it.


Of course, in 8.3 it's not when the transaction starts, but when the Xid
is assigned (i.e. when the transaction first calls a read-write
command). In previous versions it happens when the first snapshot is
taken (i.e. normally on the first command of any type, with very few
exceptions.)

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 01:44 PM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com