Unix Technical Forum

Re: How to improve db performance with $7K?

  #1 (permalink)  
Old 04-18-2008, 11:30 AM
Mohan, Ross
 
Posts: n/a
Default Re: How to improve db performance with $7K?

I've been doing some reading up on this, trying to keep up here,
and have found out that (experts, just yawn and cover your ears)

1) some SATA drives (just type II, I think?) have a "Phase Zero"
implementation of Tagged Command Queueing (the special sauce
for SCSI).
2) This SATA "TCQ" is called NCQ, and I believe it basically
allows the disk software itself to do the reordering
(this is called "simple" in TCQ terminology). It does not
yet allow the TCQ "head of queue" command, which lets the
current tagged request go to the head of the queue, a
simple way of expressing a "high priority" request.

3) SATA drives are not yet multi-initiator?

Largely because of 2 and 3, multi-initiator SCSI RAID'ed drives
are likely to whomp SATA II drives for a while yet (read: a
year or two) in multiuser Postgres applications.



-----Original Message-----
From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Greg Stark
Sent: Thursday, April 14, 2005 2:04 PM
To: Kevin Brown
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How to improve db performance with $7K?


Kevin Brown <kevin@sysexperts.com> writes:

> Greg Stark wrote:
>
>
> > I think you're being misled by analyzing the write case.
> >
> > Consider the read case. When a user process requests a block and
> > that read makes its way down to the driver level, the driver can't
> > just put it aside and wait until it's convenient. It has to go ahead
> > and issue the read right away.

>
> Well, strictly speaking it doesn't *have* to. It could delay for a
> couple of milliseconds to see if other requests come in, and then
> issue the read if none do. If there are already other requests being
> fulfilled, then it'll schedule the request in question just like the
> rest.


But then the cure is worse than the disease. You're basically describing exactly what does happen anyways, only you're delaying more requests than necessary. That intervening time isn't really idle, it's filled with all the requests that were delayed during the previous large seek...

> Once the first request has been fulfilled, the driver can now schedule
> the rest of the queued-up requests in disk-layout order.
>
> I really don't see how this is any different between a system that has
> tagged queueing to the disks and one that doesn't. The only
> difference is where the queueing happens.


And *when* it happens. Instead of being able to issue requests while a large seek is happening and having some of them satisfied they have to wait until that seek is finished and get acted on during the next large seek.

If my theory is correct, then I would expect bandwidth to be essentially equivalent but the latency on SATA drives to be increased by about 50% of the average seek time. I.e., while a busy SCSI drive can satisfy most requests in about 10ms, a busy SATA drive would satisfy most requests in 15ms. (Add to that that 10k RPM and 15k RPM SCSI drives have even lower seek times and no such IDE/SATA drives exist...)

In reality higher latency feeds into a system feedback loop causing your application to run slower causing bandwidth demands to be lower as well. It's often hard to distinguish root causes from symptoms when optimizing complex systems.

--
greg
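
The estimate in the quoted message can be written out as a tiny calculation. This is only a sketch of that rough model: the 10 ms average seek time is inferred from the 10 ms vs 15 ms figures rather than stated outright, and the function and parameter names are made up for illustration.

```python
def busy_latency_ms(base_ms, avg_seek_ms, extra_seek_fraction=0.0):
    """Rough per-request latency on a busy drive.

    extra_seek_fraction is the share of an average seek a request is assumed
    to wait because the drive cannot accept it while a large seek is in flight.
    """
    return base_ms + extra_seek_fraction * avg_seek_ms

# Assumed numbers: ~10 ms base latency and ~10 ms average seek time.
scsi_ms = busy_latency_ms(10.0, 10.0)        # tagged queueing: no extra wait
sata_ms = busy_latency_ms(10.0, 10.0, 0.5)   # no queueing: about half a seek extra
print(scsi_ms, sata_ms)
```

Under those assumed inputs the two cases differ by half an average seek, which is where the 10 ms vs 15 ms comparison comes from.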


---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

  #2 (permalink)  
Old 04-18-2008, 11:31 AM
Steve Poe
 
Posts: n/a
Default Re: How to improve db performance with $7K?

If SATA drives don't have the ability to replace SCSI for a multi-user
Postgres apps, but you needed to save on cost (ALWAYS an issue),
could/would you implement SATA for your logs (pg_xlog) and keep the rest
on SCSI?

Steve Poe

  #3 (permalink)  
Old 04-18-2008, 11:31 AM
Joshua D. Drake
 
Posts: n/a
Default Re: How to improve db performance with $7K?

Steve Poe wrote:

> If SATA drives don't have the ability to replace SCSI for a multi-user


I don't think it is a matter of not having the ability. SATA all in all
is fine as long as it is battery backed. It isn't as high performing as
SCSI, but who says it has to be?

There are plenty of companies running databases on SATA without issue.
Would I put it on a database that is expecting to have 500 connections
at all times? No. Then again, if you have an application with that
requirement, you have the money to buy a big fat SCSI array.

Sincerely,

Joshua D. Drake



> Postgres apps, but you needed to save on cost (ALWAYS an issue),
> could/would you implement SATA for your logs (pg_xlog) and keep the
> rest on SCSI?
>
> Steve Poe
  #4 (permalink)  
Old 04-18-2008, 11:32 AM
William Yu
 
Posts: n/a
Default Re: How to improve db performance with $7K?

Problem with this strategy: you want battery-backed write caching for
best performance & safety. (I've tried IDE for WAL before w/ write
caching off -- the DB got crippled whenever I had to copy files from/to
the drive on the WAL partition -- ended up just moving WAL back on the
same SCSI drive as the main DB.) That means in addition to a $$$ SCSI
caching controller, you also need a $$$ SATA caching controller. From my
glance at prices, advanced SATA controllers seem to cost nearly as much as
their SCSI counterparts.

This also looks to be the case for the drives themselves. Sure you can
get super cheap 7200RPM SATA drives but they absolutely suck for
database work. Believe me, I gave it a try once -- ugh. The high-end WD
10K Raptors look pretty good though -- the benchmarks @ storagereview
seem to put these drives at about 90% of SCSI 10Ks for both single-user
and multi-user. However, they're also priced like SCSIs -- here's what I
found @ Mwave (going through pricewatch to find WD740GDs):

Seagate 7200 SATA -- 80GB $59
WD 10K SATA -- 72GB $182
Seagate 10K U320 -- 72GB $289

Using the above prices for a fixed budget for RAID-10, you could get:

SATA 7200 -- 680GB per $1000
SATA 10K -- 200GB per $1000
SCSI 10K -- 125GB per $1000
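
The per-$1000 figures can be reproduced from the price list with a short calculation. A minimal sketch, assuming the numbers are usable gigabytes in RAID-10 (half of raw capacity) and that fractional drives are allowed so the result reads as a rate; the names below are illustrative.

```python
# Capacities and prices are the ones quoted in the post.
DRIVES = {
    "SATA 7200": (80, 59),    # (capacity in GB, price in USD)
    "SATA 10K": (72, 182),
    "SCSI 10K": (72, 289),
}

def raid10_per_budget(capacity_gb, price_usd, budget_usd=1000):
    """Return (drives bought, usable GB) for RAID-10 on a fixed budget.

    Fractional drives are allowed so the result is a per-budget rate;
    RAID-10 mirrors everything, so usable space is half the raw space.
    """
    drives = budget_usd / price_usd
    return drives, drives * capacity_gb / 2

for name, (gb, usd) in DRIVES.items():
    drives, usable = raid10_per_budget(gb, usd)
    print(f"{name}: {drives:.1f} drives, {usable:.0f} GB usable per $1000")
```

The drive counts matter as much as the capacities here, since the number of spindles per dollar is what the rest of the thread argues about.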

For a 99% read-only DB that required lots of disk space (say something
like Wikipedia or a blog host), using consumer level SATA probably is ok.
For anything else, I'd consider SATA 10K if (1) I do not need 15K RPM
and (2) I don't have SCSI infrastructure already.


  #5 (permalink)  
Old 04-18-2008, 11:32 AM
Greg Stark
 
Posts: n/a
Default Re: How to improve db performance with $7K?


William Yu <wyu@talisys.com> writes:

> Using the above prices for a fixed budget for RAID-10, you could get:
>
> SATA 7200 -- 680GB per $1000
> SATA 10K -- 200GB per $1000
> SCSI 10K -- 125GB per $1000


What a lot of these analyses miss is that cheaper == faster because cheaper
means you can buy more spindles for the same price. I'm assuming you picked
equal sized drives to compare so that 200GB/$1000 for SATA is almost twice as
many spindles as the 125GB/$1000. That means it would have almost double the
bandwidth. And the 7200 RPM case would have more than 5x the bandwidth.

While 10k RPM drives have lower seek times, and SCSI drives have a natural
seek time advantage, under load a RAID array with fewer spindles will start
hitting contention sooner, which results in higher latency. If the controller
works well, the larger SATA arrays above should be able to maintain their
mediocre latency much better under load than the SCSI array with fewer drives
would maintain its low latency response time despite its drives' lower average
seek time.

--
greg


  #6 (permalink)  
Old 04-18-2008, 11:32 AM
Alex Turner
 
Posts: n/a
Default Re: How to improve db performance with $7K?

This is fundamentally untrue.

A mirror is still a mirror. At most in a RAID 10 you can have two
simultaneous seeks. You are always going to be limited by the seek
time of your drives. It's a stripe, so you have to read from all
members of the stripe to get data, requiring all drives to seek.
There is no advantage to seek time in adding more drives. By adding
more drives you can increase throughput, but the max throughput of the
PCI-X bus isn't that high (I think around 400MB/sec). You can easily
get this with a six or seven drive RAID 5, or a ten drive RAID 10. At
that point you start having to factor in the cost of a bigger chassis
to hold more drives, which can be big bucks.

Alex Turner
netEconomist

  #7 (permalink)  
Old 04-18-2008, 11:32 AM
Greg Stark
 
Posts: n/a
Default Re: How to improve db performance with $7K?


Alex Turner <armtuk@gmail.com> writes:

> This is fundamentally untrue.
>
> A mirror is still a mirror. At most in a RAID 10 you can have two
> simultaneous seeks. You are always going to be limited by the seek
> time of your drives. It's a stripe, so you have to read from all
> members of the stripe to get data, requiring all drives to seek.
> There is no advantage to seek time in adding more drives.


Adding drives will not let you get lower response times than the average seek
time on your drives*. But it will let you reach that response time more often.

The actual response time for a random access to a drive is the seek time plus
the time waiting for your request to actually be handled. Under heavy load
that could be many milliseconds. The more drives you have the fewer requests
each drive has to handle.

Look at the await and svctm columns of iostat -x.

Under heavy random access load those columns can show up performance problems
more accurately than the bandwidth columns. You could be doing less bandwidth
but be having latency issues. While reorganizing data to allow for more
sequential reads is the normal way to address that, simply adding more
spindles can be surprisingly effective.
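
The "seek time plus waiting time" argument can be illustrated with a textbook queueing formula. This is a sketch under loud assumptions that are not from the thread: each random read is served by exactly one drive, load is spread evenly across drives, and each drive behaves like an M/M/1 queue (mean response = service time / (1 - utilization)). The 10 ms service time and 150 requests/s load are made-up inputs.

```python
def mean_response_ms(service_ms, total_req_per_s, n_drives):
    """Mean response time per request, modelling each drive as an M/M/1 queue.

    Assumes requests are spread evenly and each one is served by one drive.
    Returns None when a drive would be saturated (utilization >= 1).
    """
    per_drive_rate = total_req_per_s / n_drives         # requests/s per drive
    utilization = per_drive_rate * service_ms / 1000.0  # fraction of time busy
    if utilization >= 1.0:
        return None
    return service_ms / (1.0 - utilization)

# Hypothetical load: 150 random reads/s, 10 ms per request on an idle drive.
for n in (2, 4, 8, 16):
    print(n, mean_response_ms(10.0, 150.0, n))
```

In this model the response time never drops below the 10 ms service time, but it gets closer to it as drives are added, which matches the point that more spindles let you reach the drive's own response time more often.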

> By adding more drives you can increase throughput, but the max throughput of
> the PCI-X bus isn't that high (I think around 400MB/sec). You can easily get
> this with a six or seven drive RAID 5, or a ten drive RAID 10. At that point
> you start having to factor in the cost of a bigger chassis to hold more
> drives, which can be big bucks.


You could use software raid to spread the drives over multiple PCI-X cards.
But if 400MB/s isn't enough bandwidth then you're probably in the realm of
"enterprise-class" hardware anyways.

* (Actually even that's possible: you could limit yourself to a portion of the
drive surface to reduce seek time)

--
greg


  #8 (permalink)  
Old 04-18-2008, 11:32 AM
Alex Turner
 
Posts: n/a
Default Re: How to improve db performance with $7K?

[snip]
>
> Adding drives will not let you get lower response times than the average seek
> time on your drives*. But it will let you reach that response time more often.
>

[snip]

I believe your assertion is fundamentally flawed. Adding more drives
will not let you reach that response time more often. All drives are
required to fill every request in all RAID levels (except possibly
0+1, but that isn't used for enterprise applications). Most requests
in OLTP require most of the request time to seek, not to read. Only
in single large block data transfers will you get any benefit from
adding more drives, which is atypical in most database applications.
For most database applications, the only way to increase
transactions/sec is to decrease request service time, which is
generally achieved with better seek times or a better controller card,
or possibly spreading your database across multiple tablespaces on
separate partitions.

My assertion therefore is that simply adding more drives to an already
competent* configuration is about as likely to increase your database
effectiveness as swiss cheese is to make your car run faster.

Alex Turner
netEconomist

*Assertion here is that the DBA didn't simply configure all tables and
xlog on a single 7200 RPM disk, but has separate physical drives for
xlog and tablespace at least on 10k drives.

  #9 (permalink)  
Old 04-18-2008, 11:32 AM
Jacques Caron
 
Posts: n/a
Default Re: How to improve db performance with $7K?

Hi,

At 18:56 18/04/2005, Alex Turner wrote:
>All drives are required to fill every request in all RAID levels


No, this is definitely wrong. In many cases, most drives don't actually
have the data requested, how could they handle the request?

When reading one random sector, only *one* drive out of N is ever used to
service any given request, be it RAID 0, 1, 0+1, 1+0 or 5.

When writing:
- in RAID 0, 1 drive
- in RAID 1, RAID 0+1 or 1+0, 2 drives
- in RAID 5, you need to read on all drives and write on 2.

Otherwise, what would be the point of RAID 0, 0+1 or 1+0?
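
To make the read case concrete, here is a sketch of how a logical block might map to a member drive in a plain RAID 0 stripe. The layout (round-robin stripes of 16 blocks) and the function names are assumptions for illustration; real controllers and software RAID implementations differ in detail.

```python
def drive_for_block(block, n_drives, stripe_blocks=16):
    """Index of the RAID 0 member holding a logical block (simplified layout)."""
    return (block // stripe_blocks) % n_drives

def drives_for_read(first_block, n_blocks, n_drives, stripe_blocks=16):
    """Set of member drives touched by a contiguous read of n_blocks blocks."""
    return {drive_for_block(b, n_drives, stripe_blocks)
            for b in range(first_block, first_block + n_blocks)}

print(drives_for_read(1000, 1, n_drives=4))   # one random block: a single drive
print(drives_for_read(0, 64, n_drives=4))     # a long sequential read: every drive
```

Only the long sequential read touches every member; a single-block random read lands on one drive, leaving the others free to serve other requests.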

Jacques.



  #10 (permalink)  
Old 04-18-2008, 11:32 AM
Alan Stange
 
Posts: n/a
Default Re: How to improve db performance with $7K?

Alex Turner wrote:

>[snip]
>
>
>>Adding drives will not let you get lower response times than the average seek
>>time on your drives*. But it will let you reach that response time more often.
>>
>>
>>

>[snip]
>
>I believe your assertion is fundamentally flawed. Adding more drives
>will not let you reach that response time more often. All drives are
>required to fill every request in all RAID levels (except possibly
>0+1, but that isn't used for enterprise applications). Most requests
>in OLTP require most of the request time to seek, not to read. Only
>in single large block data transfers will you get any benefit from
>adding more drives, which is atypical in most database applications.
>For most database applications, the only way to increase
>transactions/sec is to decrease request service time, which is
>generally achieved with better seek times or a better controller card,
>or possibly spreading your database across multiple tablespaces on
>separate partitions.
>
>My assertion therefore is that simply adding more drives to an already
>competent* configuration is about as likely to increase your database
>effectiveness as swiss cheese is to make your car run faster.
>
>


Consider the case of a mirrored file system with a mostly read()
workload. Typical behavior is to use a round-robin method for issuing
the read operations to each mirror in turn, but one can use other
methods like a geometric algorithm that will issue the reads to the
drive with the head located closest to the desired track. Some
systems have many mirrors of the data for exactly this behavior. In
fact, one can carry this logic to the extreme and have one drive for
every cylinder in the mirror, thus removing seek latencies completely.
In fact this extreme case would also remove the rotational latency as
the cylinder will be in the disk's read cache. :-) Of course, writing
data would be a bit slow!
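
The difference between the two read policies can be seen in a toy simulation. This is a sketch only: it measures head travel in tracks (not time), assumes uniformly random reads, ignores writes and rotational latency, and all names and parameters are invented for illustration.

```python
import random

def mean_seek(policy, n_mirrors=2, n_tracks=1000, n_reads=20000, seed=1):
    """Average head travel (in tracks) per read for a set of mirrors.

    policy is "round_robin" or "nearest"; every mirror holds all tracks.
    """
    rng = random.Random(seed)
    heads = [0] * n_mirrors            # current track of each mirror's head
    total = 0
    for i in range(n_reads):
        track = rng.randrange(n_tracks)
        if policy == "round_robin":
            m = i % n_mirrors
        else:  # "nearest": pick the mirror whose head is closest to the track
            m = min(range(n_mirrors), key=lambda k: abs(heads[k] - track))
        total += abs(heads[m] - track)
        heads[m] = track               # the chosen head ends up on that track
    return total / n_reads

print(mean_seek("round_robin"), mean_seek("nearest"))
print(mean_seek("nearest", n_mirrors=4))
```

In this toy model the nearest-head policy travels less per read than round-robin, and adding mirrors shortens the travel further, which is the direction the extreme "one drive per cylinder" case pushes to its limit.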

I'm not sure I understand your assertion that "all drives are required
to fill every request in all RAID levels". After all, in mirrored
reads only one mirror needs to read any given block of data, so I don't
know what goal is achieved in making other mirrors read the same data.

My assertion (based on ample personal experience) is that one can
*always* get improved performance by adding more drives. Just limit the
drives to use the first few cylinders so that the average seek time is
greatly reduced and concatenate the drives together. One can then build
the usual RAID device out of these concatenated metadevices. Yes, one
is wasting lots of disk space, but that's life. If your goal is
performance, then you need to put your money on the table. The
system will be somewhat unreliable because of the device count,
additional SCSI buses, etc., but that too is life in the high
performance world.
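
The effect of restricting a drive to its first few cylinders can be sketched with a little arithmetic. The assumptions are mine, not Alan's: requests are uniform over the cylinders in use, and seek distance stands in for seek time (real seek time is not linear in distance, so the time saving is smaller than the distance saving). The cylinder counts are made up.

```python
def mean_seek_distance(n_cylinders):
    """Mean |a - b| for two independent uniform cylinders in 0..n_cylinders-1."""
    n = n_cylinders
    # There are 2 * (n - d) ordered pairs at distance d, out of n * n pairs.
    return sum(2 * (n - d) * d for d in range(1, n)) / (n * n)

full = mean_seek_distance(10000)    # whole drive (hypothetical cylinder count)
short = mean_seek_distance(1000)    # only the first 10% of the cylinders
print(full, short, short / full)
```

The mean distance scales roughly with the number of cylinders in use (about a third of the span), so using a tenth of the surface cuts average head travel to about a tenth, at the cost of the wasted capacity mentioned above.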

-- Alan
