Unix Technical Forum

RAID5 performance

  #1 (permalink)  
Old 01-17-2008, 06:03 PM
Mike Ruskai
 
Posts: n/a
Default No Subject

Subject: RAID5 performance
Date: Thu, 11 Dec 2003 09:45:02 GMT

I'm setting up a database server with an LSI Logic U320 RAID controller,
using the megaraid2 driver.

So far, I've only tested simple read and write performance. I'm not
particularly impressed using the defaults.

There are six Ultrastar 146Z10 drives, each capable of up to 66MB/sec
sustained throughput (near track 0 - goes down to around 40MB/sec towards
the last track).

With the six drives set up as RAID5 with the default 64KB stripe size, the
write performance seems to be around 8 to 12MB/sec. That's horribly slow. I
know writing parity will slow down writes with RAID5, but it shouldn't
slow them down to a fraction of the performance of a single drive.

Read performance is better, about 150MB/sec. However, that's an average
of only 25MB/sec per drive, which is less than half the transfer rate for
the region of the drives that the read testing was done from.

I'm hoping someone knows off hand how best to tune this, so I don't have
to spend a whole lot of time testing different stripe sizes and cache
settings.


--
- Mike

Remove 'spambegone.net' and reverse to send e-mail.


  #2 (permalink)  
Old 01-17-2008, 06:03 PM
Michael Heiming
 
Posts: n/a
Default Re: RAID5 performance

Mike Ruskai <spamten.knilhtrae@begonedynnaht.net> wrote:
> I'm setting up a database server with an LSI Logic U320 RAID controller,
> using the megaraid2 driver.

[..]

> With the six drives set up as RAID5 with the default 64KB stripe size, the
> write performance seems to be around 8 to 12MB/sec. That's horribly slow. I
> know writing parity will slow down writes with RAID5, but it shouldn't
> slow them down to a fraction of the performance of a single drive.


I can't remember for sure whether it was with a megaraid.o-powered
controller, but I've seen even lower write performance than that. The
only way to get some throughput was to run RAID 10 (1+0) on the
controller. I'd go for RAID1 on the system disk anyway.

If RAID 10 works better, I'd drop the driver's author a mail about the
problem.

--
Michael Heiming

Remove +SIGNS and www. if you expect an answer, sorry for
inconvenience, but I get tons of SPAM
  #3 (permalink)  
Old 01-17-2008, 06:04 PM
Juha Laiho
 
Posts: n/a
Default Re: RAID5 performance

"Mike Ruskai" <spamten.knilhtrae@begonedynnaht.net> said:
>There are six Ultrastar 146Z10 drives, each capable of up to 66MB/sec
>sustained throughput (near track 0 - goes down to around 40MB/sec towards
>the last track).
>
>With the six drives set up as RAID5 with the default 64KB stripe size, the
>write performance seems to be around 8 to 12MB/sec. That's horribly slow. I
>know writing parity will slow down writes with RAID5, but it shouldn't
>slow them down to a fraction of the performance of a single drive.
>
>Read performance is better, about 150MB/sec. However, that's an average
>of only 25MB/sec per drive, which is less than half the transfer rate for
>the region of the drives that the read testing was done from.


How are you measuring? This might well depend on the write size you're
using.
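
One rough way to check how much the write size matters is to time
sequential writes at a few block sizes. The sketch below is only an
illustration, not the test used in this thread: the function name
write_throughput, the block sizes, and the 8MB test length are all
assumptions, and the temporary directory stands in for a path on the
RAID volume. The fsync call flushes the operating system's cache, but a
controller's own write cache may still flatter the numbers.

```python
import os
import tempfile
import time


def write_throughput(path, total_bytes, block_size):
    """Write total_bytes to path in block_size pieces; return MB/s."""
    block = b"\0" * block_size
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        start = time.perf_counter()
        written = 0
        while written < total_bytes:
            # os.write may write fewer bytes than asked; count what it reports.
            written += os.write(fd, block[: total_bytes - written])
        os.fsync(fd)  # include the time needed to push cached data to the device
        elapsed = max(time.perf_counter() - start, 1e-9)
    finally:
        os.close(fd)
    return written / elapsed / 1e6


if __name__ == "__main__":
    # Hypothetical target: replace the temporary directory with a path on
    # the RAID volume to measure the array rather than the system disk.
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "raidtest.bin")
        for bs in (4 * 1024, 64 * 1024, 320 * 1024, 1024 * 1024):
            rate = write_throughput(target, 8 * 1024 * 1024, bs)
            print(f"{bs // 1024:5d} KB blocks: {rate:8.1f} MB/s")
```

For a meaningful figure the test length should be much larger than any
cache between the program and the disks.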

With small writes (below the stripe size, or possibly below
(ndisks-1)*stripesz; I don't remember which), the disk subsystem has to
do read-modify-calculate-write cycles to update the parity. When you
write big enough chunks at a time, the read-modify part becomes
unnecessary: the write replaces all the old data within the stripe, so
there is no need to read the old data back to calculate the new parity.
The RAID controller can just calculate the parity from the new data and
write the full stripe (including the new parity).
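
The difference between the two cases can be sketched with plain XOR
parity. This is a toy model, not how the megaraid firmware is known to
work: the function names are made up for illustration, and the chunks
are 4 bytes long only to keep the example small.

```python
from functools import reduce


def xor_blocks(*blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def full_stripe_parity(new_chunks):
    """Parity for a write that replaces every data chunk in the stripe."""
    return xor_blocks(*new_chunks)


def rmw_parity(old_parity, old_chunk, new_chunk):
    """Parity after replacing one chunk: needs the old chunk and old parity."""
    return xor_blocks(old_parity, old_chunk, new_chunk)


# Five data chunks plus one parity chunk, as in one stripe of a six-disk RAID5.
old = [bytes([i] * 4) for i in range(1, 6)]
old_parity = full_stripe_parity(old)

# Small write: only chunk 2 changes, so the old chunk and old parity are
# read back before the new parity can be computed.
new_chunk = b"\xaa\xbb\xcc\xdd"
updated = old[:2] + [new_chunk] + old[3:]
assert rmw_parity(old_parity, old[2], new_chunk) == full_stripe_parity(updated)
```

The final assertion shows both routes give the same parity; the
full-stripe route simply never touches the old contents of the disks.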
--
Wolf a.k.a. Juha Laiho Espoo, Finland
(GC 3.0) GIT d- s+: a C++ ULSH++++$ P++@ L+++ E- W+$@ N++ !K w !O !M V
PS(+) PE Y+ PGP(+) t- 5 !X R !tv b+ !DI D G e+ h---- r+++ y++++
"...cancel my subscription to the resurrection!" (Jim Morrison)
  #4 (permalink)  
Old 01-17-2008, 06:04 PM
P.T. Breuer
 
Posts: n/a
Default Re: RAID5 performance

Juha Laiho <Juha.Laiho@iki.fi> wrote:
> "Mike Ruskai" <spamten.knilhtrae@begonedynnaht.net> said:
> >There are six Ultrastar 146Z10 drives, each capable of up to 66MB/sec
> >sustained throughput (near track 0 - goes down to around 40MB/sec towards
> >the last track).
> >
> >With the six drives set up as RAID5 with the default 64KB stripe size, the
> >write performance seems to be around 8 to 12MB/sec. That's horribly slow. I
> >know writing parity will slow down writes with RAID5, but it shouldn't
> >slow them down to a fraction of the performance of a single drive.


Eh? Won't it have to read the parity disk, read the target disk, then
write the target disk, then write the parity disk (yesss, I know it's
not a disk but a stripe)?

That would make it four times as slow on write to start with. And
that's if they do it the way I just suggested. They could read all the
data disks instead of reading the parity disk :-).
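
The two ways of handling a small write can be compared by counting disk
operations. This is a back-of-the-envelope sketch; the labels
"read-modify-write" and "reconstruct-write" are the usual names for the
two approaches and are not taken from the controller's documentation.

```python
def small_write_ios(ndisks, method):
    """Disk I/Os needed to update one data chunk in an ndisks-wide RAID5 stripe."""
    if method == "read-modify-write":
        reads = 2            # old data chunk + old parity chunk
    elif method == "reconstruct-write":
        reads = ndisks - 2   # every other data chunk in the stripe
    else:
        raise ValueError(f"unknown method: {method}")
    writes = 2               # new data chunk + new parity chunk
    return reads, writes


for method in ("read-modify-write", "reconstruct-write"):
    reads, writes = small_write_ios(6, method)
    print(f"{method}: {reads} reads + {writes} writes = {reads + writes} I/Os")
```

With six disks the first method costs four operations and the second
six, so reading the parity is the cheaper choice for arrays this wide.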

> >Read performance is better, about 150MB/sec. However, that's an average


You'd expect it to be equal to the bandwidth available on your buses and
controllers. How are these arranged? Are these SCSI or IDE? If IDE, then
you're talking three buses, no? And you can only access one at a time
on each bus. So I'd expect the bandwidth to be three times 60MB/s or
so, which is what you are getting.

> >of only 25MB/sec per drive, which is less than half the transfer rate for
> >the region of the drives that the read testing was done from.


Eh?

> How are you measuring? This might well depend on the write size you're
> using.
>
> So, with small writes (below the stripe size - or was it below
> (ndisks-1)*stripesz), the disk subsystem needs to do read-modify-
> calculate-write cycles to update the parity. When you write big


Yes.

> enough chunks at a time, the read-modify -part becomes obsolete as
> what is being written overwrites all the old data within the stripe,


I don't get that! Surely he's not rewriting the same data area again
and again!

> so as there's no need to read the old data from the stripe to calculate
> the new parity, the RAID controller can just calculate the new parity
> and write the full stripe (incl. new parity).


Eh? He'd have to read the data off all disks to find the current parity
without reading the parity stripe itself. I don't get the argument. Are
you saying he can build up a picture in cache of the parity as he
writes? That's not clear to me.

Peter
  #5 (permalink)  
Old 01-17-2008, 06:04 PM
Mike Ruskai
 
Posts: n/a
Default No Subject

Subject: Re: RAID5 performance
Date: Fri, 12 Dec 2003 23:09:05 GMT

On Fri, 12 Dec 2003 20:10:14 GMT, P.T. Breuer wrote:

>> "Mike Ruskai" <spamten.knilhtrae@begonedynnaht.net> said:


>> >Read performance is better, about 150MB/sec. However, that's an average

>
>You'd expect it to be equal to the bandwidth available on your buses and
>controllers. How are these arranged? Are these scsi or IDE? If IDE, then
>you're talking three buses, no? And you can only access one at a time
>on each bus. So I'd expect the bandwidth to be three times 60MB/s or
>so, which is what you are getting.


"IDE" and "RAID" do not belong together in the same sentence.

The drives are U320 SCSI, on a single channel. At full transfer speed
they would saturate the channel, but I don't expect that to happen often.


Being SCSI, all drives can be read simultaneously.
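
The channel arithmetic behind those statements, as a quick sketch. The
per-drive rates and the measured read rate are the figures quoted earlier
in the thread; 320MB/sec is the nominal Ultra320 bus rate, and real
throughput will be somewhat lower because of protocol overhead.

```python
BUS_MB_S = 320            # nominal Ultra320 SCSI channel bandwidth
DRIVES = 6
OUTER_MB_S = 66           # per-drive rate near track 0
INNER_MB_S = 40           # per-drive rate near the last track
MEASURED_READ_MB_S = 150  # array read rate reported in the first post

for label, rate in (("outer tracks", OUTER_MB_S), ("inner tracks", INNER_MB_S)):
    aggregate = DRIVES * rate
    verdict = "exceeds" if aggregate > BUS_MB_S else "fits within"
    print(f"{label}: {aggregate} MB/s aggregate, {verdict} the {BUS_MB_S} MB/s channel")

print(f"measured: {MEASURED_READ_MB_S / DRIVES:.0f} MB/s per drive, "
      f"{MEASURED_READ_MB_S / BUS_MB_S:.0%} of the channel")
```

So the channel is a limit only near the outer tracks, and the measured
150MB/sec is under half of it.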

>> >of only 25MB/sec per drive, which is less than half the transfer rate for
>> >the region of the drives that the read testing was done from.

>
>Eh?


Hard drives do not have the same media transfer rate from one track to the
next. Track 0 is much faster than the last track, typically by about
100%. The data being read was at the beginning of the volume, meaning it
should have been stored physically near track 0 on each drive.

>I don't get that! Surely he's not rewriting the same data area again
>and again!


The (very) simple test involved copying a 1.7GB file to the volume, after
it had been read completely and still resided completely in the read
buffer of the source drive (the box has 4GB of RAM), leaving the read
speed of that drive out of the equation.

Not a scientific test, to be sure, but rough and ready.


--
- Mike

Remove 'spambegone.net' and reverse to send e-mail.


  #6 (permalink)  
Old 01-17-2008, 06:04 PM
P.T. Breuer
 
Posts: n/a
Default Re: RAID5 performance

Mike Ruskai <spamten.knilhtrae@begonedynnaht.net> wrote:
> >I don't get that! Surely he's not rewriting the same data area again
> >and again!

>
> The (very) simple test involved copying a 1.7GB file to the volume, after
> it had been read completely and still resided completely in the read
> buffer of the source drive (the box has 4GB of RAM), leaving the read


Oh, I see. I imagined the whole file could not be cached in ram (I've
never had my hands on as much ram as that!) given the figure of 17GB,
hence my puzzlement.

> speed of that drive out of the equation.


Peter
  #7 (permalink)  
Old 01-17-2008, 06:05 PM
D Smith
 
Posts: n/a
Default Re: RAID5 performance

On Thu, 11 Dec 2003 09:45:02 +0000, Mike Ruskai wrote:

> I'm setting up a database server with an LSI Logic U320 RAID controller,
> using the megaraid2 driver.
>
> So far, I've only tested simple read and write performance. I'm not
> particularly impressed using the defaults.
>
> There are six Ultrastar 146Z10 drives, each capable of up to 66MB/sec
> sustained throughput (near track 0 - goes down to around 40MB/sec towards
> the last track).
>
> With the six drives set up as RAID5 with the default 64KB stripe size, the
> write performance seems to be around 8 to 12MB/sec. That's horribly slow. I
> know writing parity will slow down writes with RAID5, but it shouldn't
> slow them down to a fraction of the performance of a single drive.
>
> Read performance is better, about 150MB/sec. However, that's an average
> of only 25MB/sec per drive, which is less than half the transfer rate for
> the region of the drives that the read testing was done from.
>
> I'm hoping someone knows off hand how best to tune this, so I don't have
> to spend a whole lot of time testing different stripe sizes and cache
> settings.


This may be too late for you, but here are my recommendations:

1. Do not use RAID 5; its write performance is slow. Use RAID 10 if
your controller supports it (it probably does). At work our software is
basically a database application, and on RAID 5 it crawls, but on RAID 10
it is much, much better.

2. Use a smaller stripe size. 16KB would be good. This is what we have
found to work best. If we are setting up the server for our clients on
UNIX/Linux we go with 8KB or 16KB for the stripe.
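
To put rough numbers on both recommendations for a six-disk array, here
is a small sketch. It assumes the controller's "stripe size" is the
amount written to each disk per stripe, which is an assumption about
this controller, and the function names are made up for illustration.

```python
def raid5_full_stripe_kb(ndisks, chunk_kb):
    """Data carried by one full RAID5 stripe (one chunk per stripe is parity)."""
    return (ndisks - 1) * chunk_kb


def data_disks(level, ndisks):
    """Number of disks' worth of capacity left for data."""
    if level == "raid5":
        return ndisks - 1
    if level == "raid10":
        return ndisks // 2
    raise ValueError(f"unknown level: {level}")


def small_write_ios(level):
    """Disk I/Os for one small write."""
    if level == "raid5":
        return 4  # read old data, read old parity, write data, write parity
    if level == "raid10":
        return 2  # write the same block to both halves of a mirror
    raise ValueError(f"unknown level: {level}")


for chunk_kb in (8, 16, 64):
    width = raid5_full_stripe_kb(6, chunk_kb)
    print(f"{chunk_kb:2d} KB per disk -> {width} KB of data per full stripe")

for level in ("raid5", "raid10"):
    print(f"{level}: {data_disks(level, 6)} data disks, "
          f"{small_write_ios(level)} I/Os per small write")
```

A smaller stripe size makes it more likely that a database write covers
a whole stripe, while RAID 10 avoids the parity update entirely at the
cost of two disks' worth of capacity in this configuration.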

Don


  #8 (permalink)  
Old 01-17-2008, 06:05 PM
Juha Laiho
 
Posts: n/a
Default Re: RAID5 performance

ptb@oboe.it.uc3m.es (P.T. Breuer) said:
>Juha Laiho <Juha.Laiho@iki.fi> wrote:
>> So, with small writes (below the stripe size - or was it below
>> (ndisks-1)*stripesz), the disk subsystem needs to do read-modify-
>> calculate-write cycles to update the parity. When you write big

>
>Yes.
>
>> enough chunks at a time, the read-modify -part becomes obsolete as
>> what is being written overwrites all the old data within the stripe,

>
>I don't get that! Surely he's not rewriting the same data area again
>and again!


No, not the same. But all data on RAID5 is arranged in stripes, so there's
a set of data stripes corresponding to a single parity stripe. Whenever
your write is large enough to completely overwrite the set of
data stripes corresponding to a single parity stripe, the parity
can be calculated purely from the written data, so the old parity
(and the rest of the old stripe) does not need to be read from the disks.
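
Whether a given write gets this benefit depends on its size and its
alignment. The sketch below splits a write into full stripes and
leftover bytes. It assumes the six-disk array and 64KB per-disk stripe
size from the original post, and the function name split_write is
hypothetical.

```python
def split_write(offset, length, ndisks=6, chunk=64 * 1024):
    """Return (full_stripes, partial_bytes) for a write at a byte offset."""
    stripe = (ndisks - 1) * chunk                    # data bytes per stripe
    end = offset + length
    first_boundary = -(-offset // stripe) * stripe   # offset rounded up
    last_boundary = end // stripe * stripe           # end rounded down
    if last_boundary <= first_boundary:
        return 0, length                             # no stripe fully covered
    full = (last_boundary - first_boundary) // stripe
    return full, length - full * stripe


KB = 1024
print(split_write(0, 320 * KB))       # aligned, exactly one stripe
print(split_write(0, 64 * KB))        # smaller than a stripe
print(split_write(4 * KB, 640 * KB))  # two stripes' worth, but misaligned
```

Only the full stripes can skip the read step; the leftover bytes at
either end still need the old data or parity to be read back.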
--
Wolf a.k.a. Juha Laiho Espoo, Finland
(GC 3.0) GIT d- s+: a C++ ULSH++++$ P++@ L+++ E- W+$@ N++ !K w !O !M V
PS(+) PE Y+ PGP(+) t- 5 !X R !tv b+ !DI D G e+ h---- r+++ y++++
"...cancel my subscription to the resurrection!" (Jim Morrison)

www.UnixAdminTalk.com