Unix Technical Forum

Re: [PERFORM] A Better External Sort?

#1
04-11-2008, 07:04 AM
Josh Berkus

Michael,

> > Realistically, you can't do better than about 25MB/s on a
> > single-threaded I/O on current Linux machines,
>
> What on earth gives you that idea? Did you drop a zero?


Nope, LOTS of testing, at OSDL, GreenPlum and Sun. For comparison, A
Big-Name Proprietary Database doesn't get much more than that either.

--
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

#2
04-11-2008, 07:04 AM
Josh Berkus

Jeff,

> > Nope, LOTS of testing, at OSDL, GreenPlum and Sun. For comparison, A
> > Big-Name Proprietary Database doesn't get much more than that either.
>
> I find this claim very suspicious. I get single-threaded reads in
> excess of 1GB/sec with XFS and > 250MB/sec with ext3.


Database reads? Or raw FS reads? It's not the same thing.

Also, we're talking *write speed* here, not read speed.

I also find *your* claim suspicious, since there's no way XFS is 300% faster
than ext3 for the *general* case.

--
Josh Berkus
Aglio Database Solutions
San Francisco

#3
04-11-2008, 07:04 AM
Jeffrey W. Baker

On Mon, 2005-10-03 at 14:16 -0700, Josh Berkus wrote:
> Jeff,
>
> > > Nope, LOTS of testing, at OSDL, GreenPlum and Sun. For comparison, A
> > > Big-Name Proprietary Database doesn't get much more than that either.
> >
> > I find this claim very suspicious. I get single-threaded reads in
> > excess of 1GB/sec with XFS and > 250MB/sec with ext3.
>
> Database reads? Or raw FS reads? It's not the same thing.


Just reading files off the filesystem. These are input rates I get with
a specialized sort implementation. 1GB/sec is not even especially
wonderful; I can get that on two controllers with a 24-disk stripe set.

I guess database reads are different, but I remain unconvinced that they
are *fundamentally* different. After all, a tab-delimited file (my sort
workload) is a kind of database.

> Also, we're talking *write speed* here, not read speed.


OK, I did not realize that. Still, you should see 250-300MB/sec of
single-threaded sequential output on ext3, assuming the storage can
provide that rate.
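
One way to check a figure like this on a given machine is to time a large single-threaded sequential write. The following is only a minimal sketch, not the benchmark anyone in this thread used; the function name sequential_write_mb_per_s, the 64MB file size and the 1MB block size are illustrative assumptions.

```python
import os
import tempfile
import time


def sequential_write_mb_per_s(directory, total_mb=64, block_kb=1024):
    """Write about total_mb of zeros sequentially from one thread; return MB/s.

    The file is flushed and fsync'ed before the clock stops, so the result
    includes the time to push the data out of the page cache.
    """
    block = b"\0" * (block_kb * 1024)
    blocks = (total_mb * 1024) // block_kb
    written_mb = blocks * block_kb / 1024
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            start = time.perf_counter()
            for _ in range(blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())
            elapsed = time.perf_counter() - start
    finally:
        os.unlink(path)  # remove the scratch file
    return written_mb / elapsed


if __name__ == "__main__":
    # Point this at a directory on the filesystem under test.
    rate = sequential_write_mb_per_s(tempfile.gettempdir())
    print(f"{rate:.0f} MB/s")
```

A file this small mostly measures caches and controller buffers; for a figure comparable to the ones quoted here, the file would need to be several times larger than RAM.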

> I also find *your* claim suspicious, since there's no way XFS is 300% faster
> than ext3 for the *general* case.


On a single disk you wouldn't notice, but XFS scales much better when
you throw disks at it. I get a 50MB/sec boost from the 24th disk,
whereas ext3 stops scaling after 16 disks. For writes both XFS and ext3
top out around 8 disks, but in this case XFS tops out at 500MB/sec while
ext3 can't break 350MB/sec.

I'm hopeful that in the future the work being done at ClusterFS will
put ext3 on par with XFS.

-jwb

#4
04-11-2008, 07:04 AM
Hannu Krosing

On Mon, 2005-10-03 at 14:16 -0700, Josh Berkus wrote:
> Jeff,
>
> > > Nope, LOTS of testing, at OSDL, GreenPlum and Sun. For comparison, A
> > > Big-Name Proprietary Database doesn't get much more than that either.
> >
> > I find this claim very suspicious. I get single-threaded reads in
> > excess of 1GB/sec with XFS and > 250MB/sec with ext3.
>
> Database reads? Or raw FS reads? It's not the same thing.


Just FYI, I ran a count(*) on a 15.6GB table on a lightly loaded db and
it ran in 163 sec. (Dual Opteron 2.6GHz, 6GB RAM, 6 x 74GB 15k disks in
RAID10, reiserfs). A little less than 100MB/sec.

After this I ran count(*) over a 2.4GB file from another tablespace on
another device (4 x 142GB 10k disks in RAID10) and it ran in 22.5 sec on
the first run and 12.5 sec on the second.
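
As a rough check of the arithmetic, the scan rates implied by these timings can be worked out as below. This is a sketch that assumes 1GB = 1024MB; the helper name mb_per_sec is made up for illustration.

```python
def mb_per_sec(size_gb, seconds, mb_per_gb=1024):
    """Average scan rate in MB/s for size_gb of data read in `seconds`."""
    return size_gb * mb_per_gb / seconds


runs = [
    ("15.6GB table, 163 sec", 15.6, 163.0),
    ("2.4GB, first run, 22.5 sec", 2.4, 22.5),
    ("2.4GB, second run, 12.5 sec", 2.4, 12.5),
]
for label, size_gb, seconds in runs:
    print(f"{label}: {mb_per_sec(size_gb, seconds):.0f} MB/s")
# 15.6GB table, 163 sec: 98 MB/s
# 2.4GB, first run, 22.5 sec: 109 MB/s
# 2.4GB, second run, 12.5 sec: 197 MB/s
```

With 1GB = 1000MB the figures come out at about 96, 107 and 192 MB/s instead. Either way, the second pass over the 2.4GB data is nearly twice as fast as the first, which would be consistent with part of it being served from cache.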

db=# show shared_buffers ;
shared_buffers
----------------
196608
(1 row)

db=# select version();
version
--------------------------------------------------------------------------------------------
PostgreSQL 8.0.3 on x86_64-pc-linux-gnu, compiled by GCC cc (GCC) 3.3.6
(Debian 1:3.3.6-7)
(1 row)
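
For scale, in PostgreSQL 8.0 shared_buffers is a count of buffers rather than a size in bytes. A small sketch of the conversion, assuming the default 8kB block size (the build's actual BLCKSZ is not shown in the output above, and the helper name shared_buffers_mb is made up):

```python
BLOCK_SIZE_BYTES = 8192  # assumption: default PostgreSQL block size (BLCKSZ)


def shared_buffers_mb(n_buffers, block_size=BLOCK_SIZE_BYTES):
    """Convert a shared_buffers page count into megabytes."""
    return n_buffers * block_size / (1024 * 1024)


print(shared_buffers_mb(196608))  # 1536.0
```

Under that assumption the setting is about 1.5GB, a quarter of the machine's 6GB of RAM and far smaller than the 15.6GB table.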


--
Hannu Krosing <hannu@skype.net>


#5
04-11-2008, 07:04 AM
Luke Lonergan

Hannu,

On 10/3/05 2:43 PM, "Hannu Krosing" <hannu@skype.net> wrote:

> Just FYI, I ran a count(*) on a 15.6GB table on a lightly loaded db and
> it ran in 163 sec. (Dual Opteron 2.6GHz, 6GB RAM, 6 x 74GB 15k disks in
> RAID10, reiserfs). A little less than 100MB/sec.


This confirms our findings: sequential scan is CPU-limited at about 120MB/s
per single-threaded executor. This is too slow for fast file systems like
the ones we're discussing here.

Bizgres MPP gets 250MB/s by running multiple scanners, but we still chew up
unnecessary amounts of CPU.

> After this I ran count(*) over a 2.4GB file from another tablespace on
> another device (4 x 142GB 10k disks in RAID10) and it ran in 22.5 sec on
> the first run and 12.5 sec on the second.


You're getting caching effects here.

- Luke



#6
04-11-2008, 07:04 AM
Josh Berkus

Michael,

> > Nope, LOTS of testing, at OSDL, GreenPlum and Sun. For comparison, A
> > Big-Name Proprietary Database doesn't get much more than that either.
>
> You seem to be talking about database IO, which isn't what you said.


Right, well, it was what I meant. I failed to specify, that's all.

--
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

#7
04-11-2008, 07:05 AM
Hannu Krosing

On Wed, 2005-10-05 at 05:43 -0400, Michael Stone wrote:
> On Tue, Oct 04, 2005 at 12:43:10AM +0300, Hannu Krosing wrote:
> > Just FYI, I ran a count(*) on a 15.6GB table on a lightly loaded db and
> > it ran in 163 sec. (Dual Opteron 2.6GHz, 6GB RAM, 6 x 74GB 15k disks in
> > RAID10, reiserfs). A little less than 100MB/sec.
>
> And none of that 15G table is in the 6G RAM?


I believe so, as there had been another query running for some time,
doing a select from a 50GB table.

--
Hannu Krosing <hannu@skype.net>


All times are GMT.

