Unix Technical Forum

Re: Checkpoint cost, looks like it is WAL/CRC

#1
04-11-2008, 04:46 AM
Zeugswetter Andreas DAZ SD


>> Are you sure about that? That would probably be the normal case, but
>> are you promised that the hardware will write all of the sectors of a
>> block in order?
>
> I don't think you can possibly assume that. If the block
> crosses a cylinder boundary then it's certainly an unsafe
> assumption, and even within a cylinder (no seek required) I'm
> pretty sure that disk drives have understood "write the next
> sector that passes under the heads"
> for decades.


A lot of hardware exists that guards against partial writes
of single I/O requests (a persistent write cache for an HP RAID
controller for Intel servers costs ~$500 extra).

But the OS usually has a 4k (sometimes 8k) filesystem buffer size,
and since we do not use direct I/O for data files, the OS might
schedule the two 4k writes that make up one 8k page separately.

If you do not build pg to match your fs buffer size, you cannot
guard against partial writes with hardware :-(
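
To make the size mismatch concrete, here is a minimal sketch that compares a filesystem's reported block size with an 8k database page. It assumes a Unix-like system where Python's os.statvfs is available. The 8192-byte page size and the helper names are assumptions for illustration; f_bsize is only the filesystem's preferred block size and may not match the OS buffer size on every platform.

```python
import os

# Assumed page size: 8 kB, matching the "8k page" mentioned above.
PG_BLOCK_SIZE = 8192


def fs_block_size(path):
    """Return the filesystem's preferred block size for `path`, in bytes."""
    return os.statvfs(path).f_bsize


def writes_per_page(page_size, block_size):
    """Number of filesystem-block-sized pieces that one database page spans."""
    return -(-page_size // block_size)  # ceiling division


if __name__ == "__main__":
    bs = fs_block_size(".")
    pieces = writes_per_page(PG_BLOCK_SIZE, bs)
    print(f"filesystem block size: {bs} bytes")
    print(f"one {PG_BLOCK_SIZE}-byte page spans {pieces} filesystem block(s)")
    if pieces > 1:
        print("the OS may schedule these pieces as separate writes")
```

If the page spans more than one filesystem block, a guarantee that covers only a single I/O request does not cover the whole page.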

We could alleviate that problem with direct I/O for data files.

Andreas

#2
04-11-2008, 04:46 AM
Bruce Momjian

Zeugswetter Andreas DAZ SD wrote:
>
> >> Are you sure about that? That would probably be the normal case, but
> >> are you promised that the hardware will write all of the sectors of a
> >> block in order?
> >
> > I don't think you can possibly assume that. If the block
> > crosses a cylinder boundary then it's certainly an unsafe
> > assumption, and even within a cylinder (no seek required) I'm
> > pretty sure that disk drives have understood "write the next
> > sector that passes under the heads"
> > for decades.
>
> A lot of hardware exists, that guards against partial writes
> of single IO requests (a persistent write cache for a HP raid
> controller for intel servers costs ~500$ extra).
>
> But, the OS usually has 4k (some 8k) filesystem buffer size,
> and since we do not use direct io for datafiles, the OS might decide
> to schedule two 4k writes differently for one 8k page.
>
> If you do not build pg to match your fs buffer size you cannot
> guard against partial writes with hardware :-(
>
> We could alleviate that problem with direct io for datafiles.


Now that is an interesting analysis. I thought people who used a
battery-backed drive cache wouldn't have partial page write problems, but
I now see it is certainly possible.
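
The failure mode can be sketched as a small in-memory simulation. This is only an illustration under assumed sizes (an 8k page made of two 4k filesystem blocks); the function name and byte patterns are made up, and the CRC here stands in for any integrity check rather than describing what PostgreSQL actually does.

```python
import zlib

PAGE_SIZE = 8192   # assumed database page size
FS_BLOCK = 4096    # assumed filesystem buffer size


def torn_write(old_page, new_page, blocks_written):
    """Return the on-disk page if only the listed 4k blocks of new_page landed."""
    result = bytearray(old_page)
    for i in blocks_written:
        start = i * FS_BLOCK
        result[start:start + FS_BLOCK] = new_page[start:start + FS_BLOCK]
    return bytes(result)


old_page = bytes([0x11]) * PAGE_SIZE
new_page = bytes([0x22]) * PAGE_SIZE
expected_crc = zlib.crc32(new_page)

# Simulate a crash after the OS flushed only the second 4k block.
on_disk = torn_write(old_page, new_page, blocks_written=[1])

# Neither the old nor the new page: a mix of both halves.
is_torn = zlib.crc32(on_disk) != expected_crc

if __name__ == "__main__":
    print("page is torn:", is_torn)
```

Each 4k write is individually complete, so hardware that protects single I/O requests has done its job, yet the 8k page as a whole is inconsistent.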

--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
