Unix Technical Forum

Re: seq scan cache vs. index cache smackdown

  #1 (permalink)  
Old 04-18-2008, 11:09 AM
Merlin Moncure
 

> Magnus Hagander wrote:
> > I don't think that's correct either. Scatter/Gather I/O is used so
> > SQL Server can issue reads for several blocks from disks into its own
> > buffer cache with a single syscall even if these buffers are not
> > sequential. It did make significant performance improvements when
> > they added it, though.
> >
> > (For those not knowing - it's ReadFile/WriteFile where you pass an
> > array of "this many bytes to this address" as parameters)
>
> Isn't that like the BSD writev()/readv() that Linux supports also? Is
> that something we should be using on Unix if it is supported by the
> OS?

readv and writev are in the Single UNIX Specification...and yes, they are
basically just like the win32 versions except that they are synchronous
(and therefore better, IMO).

On some systems they might just be implemented as a loop inside the
library, or even as a macro.
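
For anyone who has not used these calls, here is a minimal sketch of a scatter read with readv(): one call, two destination buffers that are not adjacent in memory. The helper names scatter_read and scatter_demo, the temp-file template, and the 4-byte buffer sizes are assumptions made up for illustration; none of this is PostgreSQL code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Fill two separate (non-adjacent) buffers from fd with a single readv()
 * call. Returns the number of bytes read, or -1 on error. */
ssize_t scatter_read(int fd, char *a, size_t len_a, char *b, size_t len_b)
{
    struct iovec iov[2];

    assert(a != NULL && b != NULL);
    iov[0].iov_base = a;        /* the first len_a bytes read land here */
    iov[0].iov_len = len_a;
    iov[1].iov_base = b;        /* the next len_b bytes land here */
    iov[1].iov_len = len_b;
    return readv(fd, iov, 2);
}

/* Demo (hypothetical helper): put 8 bytes in a temp file, then read them
 * back as 4 + 4. first and second must each have room for 4 bytes. */
ssize_t scatter_demo(char *first, char *second)
{
    char path[] = "/tmp/readv_demo_XXXXXX";   /* illustrative temp name */
    int fd = mkstemp(path);
    ssize_t n = -1;

    if (fd < 0)
        return -1;
    unlink(path);               /* the file goes away once fd is closed */
    if (write(fd, "ABCDEFGH", 8) == 8 && lseek(fd, 0, SEEK_SET) == 0)
        n = scatter_read(fd, first, 4, second, 4);
    close(fd);
    return n;
}
```

Calling scatter_demo(a, b) with two 4-byte arrays should leave the first half of the file in a and the second half in b, with a single read syscall issued.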

http://www.opengroup.org/onlinepubs/.../sysuio.h.html

On operating systems that optimize vectored read operations, it's pretty
reasonable to assume good or even great performance gains, in addition
to (or instead of) recent changes to xlog.c to group writes together for
a file...it just takes things one step further.
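
To make "grouping writes" concrete, the sketch below hands several page-sized buffers that live at unrelated addresses to the kernel in one writev() call. It is not what xlog.c does; PAGE_SZ (8192), MAX_BATCH, and the function names are assumptions picked for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define PAGE_SZ   8192   /* assumed page size for this sketch */
#define MAX_BATCH 16     /* arbitrary cap for the sketch; must not exceed IOV_MAX */

/* Write npages page-sized buffers, which need not be adjacent in memory,
 * to fd with one writev() call. Returns bytes written or -1 on error.
 * A short write is possible and is left to the caller to handle. */
ssize_t write_pages_gathered(int fd, char *const pages[], int npages)
{
    struct iovec iov[MAX_BATCH];
    int i;

    assert(npages > 0 && npages <= MAX_BATCH);
    for (i = 0; i < npages; i++) {
        iov[i].iov_base = pages[i];
        iov[i].iov_len = PAGE_SZ;
    }
    return writev(fd, iov, npages);
}

/* Demo (hypothetical helper): three separate pages go out in one syscall.
 * Returns the resulting file size, or -1 on failure. */
off_t gather_demo(void)
{
    static char p0[PAGE_SZ], p1[PAGE_SZ], p2[PAGE_SZ];
    char *pages[3] = { p0, p1, p2 };
    char path[] = "/tmp/writev_demo_XXXXXX";   /* illustrative temp name */
    off_t size = -1;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);
    memset(p0, 'a', PAGE_SZ);
    memset(p1, 'b', PAGE_SZ);
    memset(p2, 'c', PAGE_SZ);
    if (write_pages_gathered(fd, pages, 3) == 3 * PAGE_SZ)
        size = lseek(fd, 0, SEEK_END);
    close(fd);
    return size;
}
```

The alternative is either three write() calls or copying the pages into one contiguous buffer first; whether avoiding that is a measurable win is exactly the open question.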

Is there a reason why readv/writev have not been considered in the past?

Merlin

  #2 (permalink)  
Old 04-18-2008, 11:09 AM
Tom Lane
 

"Merlin Moncure" <merlin.moncure@rcsonline.com> writes:
> Is there a reason why readv/writev have not been considered in the past?


Lack of portability, and lack of obvious usefulness that would justify
dealing with the lack of portability.

I don't think there's any value in trying to write ordinary buffers this
way; making the buffer manager able to write multiple buffers at once
sounds like a great deal of complexity and deadlock risk in return for
not much. It might be an alternative to the existing proposed patch for
writing multiple WAL buffers at once, but frankly I consider that patch
a waste of effort. In real scenarios you seldom get to write more than
one WAL page without a forced sync occurring because someone committed.
Even if *your* transaction is long, any other backend committing a small
transaction still fsyncs. On top of that, the bgwriter will be flushing
WAL in order to maintain the write-ahead rule any time it dumps a dirty
buffer. I have a personal to-do item to make the bgwriter explicitly
responsible for writing completed WAL pages as part of its duties, but
I haven't done anything about it because I think that it will write lots
of such pages without any explicit code, thanks to the bufmgr's LSN
interlock. Even if it doesn't get it done that way, the simplest answer
is to add a little bit of code to make sure bgwriter generally does the
writes, and then we don't care.

If you want to experiment with writev, feel free, but I'll want to see
demonstrable performance benefits before any such code actually goes in.
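
A rough starting point for that kind of experiment might look like the harness below, which times one write() per page against one writev() per batch of pages. Treat it as a sketch under stated assumptions: PAGE_SZ, NPAGES, BATCH, and all function names are invented here, nothing is fsync'ed (so it measures syscall overhead into the OS cache, not disk behavior), and it does not model the forced sync on commit described above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <time.h>
#include <unistd.h>

#define PAGE_SZ 8192   /* assumed page size */
#define NPAGES  1024   /* pages written per run */
#define BATCH   8      /* pages per writev() call; NPAGES is a multiple of it */

static char page[PAGE_SZ];   /* zero-filled page reused for every write */

static double now_sec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double) ts.tv_sec + (double) ts.tv_nsec / 1e9;
}

/* One write() per page. Returns elapsed seconds, or -1.0 on failure. */
double time_write_loop(int fd)
{
    double start = now_sec();
    int i;

    for (i = 0; i < NPAGES; i++)
        if (write(fd, page, PAGE_SZ) != PAGE_SZ)
            return -1.0;
    return now_sec() - start;
}

/* One writev() per BATCH pages. Returns elapsed seconds, or -1.0. */
double time_writev_batches(int fd)
{
    struct iovec iov[BATCH];
    double start;
    int i;

    for (i = 0; i < BATCH; i++) {
        iov[i].iov_base = page;
        iov[i].iov_len = PAGE_SZ;
    }
    start = now_sec();
    for (i = 0; i < NPAGES; i += BATCH)
        if (writev(fd, iov, BATCH) != (ssize_t) BATCH * PAGE_SZ)
            return -1.0;
    return now_sec() - start;
}

/* Run both variants against fresh temp files. Returns 0 if both wrote
 * exactly NPAGES pages, -1 otherwise; timings come back via the pointers. */
int bench_run(double *t_loop, double *t_vec)
{
    char p1[] = "/tmp/bench_write_XXXXXX";    /* illustrative temp names */
    char p2[] = "/tmp/bench_writev_XXXXXX";
    int fd1 = mkstemp(p1), fd2 = mkstemp(p2);
    int ok = 0;

    assert(t_loop != NULL && t_vec != NULL);
    if (fd1 >= 0 && fd2 >= 0) {
        *t_loop = time_write_loop(fd1);
        *t_vec = time_writev_batches(fd2);
        ok = *t_loop >= 0 && *t_vec >= 0
            && lseek(fd1, 0, SEEK_END) == (off_t) NPAGES * PAGE_SZ
            && lseek(fd2, 0, SEEK_END) == (off_t) NPAGES * PAGE_SZ;
    }
    if (fd1 >= 0) { unlink(p1); close(fd1); }
    if (fd2 >= 0) { unlink(p2); close(fd2); }
    return ok ? 0 : -1;
}
```

Any numbers from a harness like this would need repeating on the target OS and filesystem, with realistic sync behavior added, before they say anything about the backend.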

regards, tom lane

  #3 (permalink)  
Old 04-18-2008, 11:09 AM
Ron Mayer
 

Merlin Moncure wrote:
>
> readv and writev are in the single unix spec...and yes ...
>
> On some systems they might just be implemented as a loop inside the
> library, or even as a macro.


You sure?

Requirements like this:
http://www.opengroup.org/onlinepubs/...xsh/write.html
"Write requests of {PIPE_BUF} bytes or less will not be
interleaved with data from other processes doing writes
on the same pipe."
make me think that it couldn't be just a macro; and if it
were a loop in the library it seems it'd still have to
make sure it's done with a single write system call.

(yeah, I know that requirement is just for pipes; and I
suppose they could write a loop for normal files and a
different special case for pipes; but I'd be surprised).
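
To illustrate the concern, here is what a naive loop-based emulation could look like. This is a hypothetical fallback written for the example (writev_loop and loop_demo are made-up names), not the implementation from any particular libc.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical library-level fallback: emulate writev() with one write()
 * per iovec entry. Each write() is its own syscall, so another process
 * writing to the same pipe can slip data in between two entries. */
ssize_t writev_loop(int fd, const struct iovec *iov, int iovcnt)
{
    ssize_t total = 0;
    int i;

    assert(iov != NULL && iovcnt >= 0);
    for (i = 0; i < iovcnt; i++) {
        ssize_t n = write(fd, iov[i].iov_base, iov[i].iov_len);

        if (n < 0)
            return total > 0 ? total : -1;
        total += n;
        if ((size_t) n < iov[i].iov_len)
            break;              /* short write: stop and report progress */
    }
    return total;
}

/* Demo (hypothetical helper): with a single writer the result looks the
 * same as a real writev(). Copies what arrives on the pipe into out and
 * returns the byte count, or -1 on failure. */
ssize_t loop_demo(char *out, size_t outlen)
{
    char h[] = "hello ";
    char w[] = "world";
    struct iovec iov[2];
    int fds[2];
    ssize_t n = -1;

    if (pipe(fds) != 0)
        return -1;
    iov[0].iov_base = h; iov[0].iov_len = 6;
    iov[1].iov_base = w; iov[1].iov_len = 5;
    if (writev_loop(fds[1], iov, 2) == 11)
        n = read(fds[0], out, outlen);
    close(fds[0]);
    close(fds[1]);
    return n;
}
```

With one writer the reader cannot tell the difference; the gap between the two write() calls only matters once a second process writes to the same pipe, which is the case the {PIPE_BUF} wording is about.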