vBulletin Search Engine Optimization
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| On May 12, 2008, at 11:24 PM, Francisco Reyes wrote: > Any PCI controller you have had good experience with? > How any other PCI-X/PCI-e controller that you have had good results? The LSI controllers are top-notch, and always my first choice. They have PCI-X and PCI-e versions. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance |
| |||
| PFC wrote: > PCI limits you to 133 MB/s (theoretical), actual speed being > around 100-110 MB/s. Many servers do have more than one bus. You have to process that data too so its not going to be as much of a limit as you are suggesting. It may be possible to stream a compressed data file to the server and copy in from that after decompression, which will free LAN bandwidth. Or even if you RPC blocks of compressed data and decompress in the proc and insert right there. > On your current setup with 15K drives if you need 1 fsync per > INSERT you won't do more than 250 per second, which is very limiting... Well, thats 250 physical syncs. But if you have multiple insert streams (for group commit), or can batch the rows in each insert or copy, its not necessarily as much of a problem as you seem to be implying. Particularly if you are doing the holding table trick. James -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance |
| |||
| > >> You say that like you don't mind having PCI in a server whose job is >> to perform massive query over large data sets. > > I am in my 4th week at a new job. Trying to figure what I am working > with. LOOL, ok, hehe, not exactly the time to have a "let's change everything" fit > From what I see I will likely get as much improvement from new hardware > as from re-doing some of the database design. Can't get everything done > at once, not to mention I have to redo one machine sooner rather than > later so I need to prioritize. > >> In fact for bulk IO a box with 2 SATA drives would be just as fast as >> your monster RAID, lol. > > I am working on setting up a standard test based on the type of > operations that the company does. This will give me a beter idea. > Specially I will work with the developers to make sure the queries I > create for the benchmark are representative of the workload. watching vmstat (or iostat) while running a very big seq scan query will give you information about the reading speed of your drives. Same for writes, during one of your big updates, watch vmstat, you'll know if you are CPU bound or IO bound... - one core at 100% -> CPU bound - lots of free CPU but lots of iowait -> disk bound - disk throughput decent (in your setup, 100 MB/s) -> PCI bus saturation - disk throughput miserable (< 10 MB/s) -> random IO bound (either random reads or fsync() or random writes depending on the case) > In your opinion if we get a new machine with PCI-e, at how many spindles > will the SCSI random access superiority start to be less notable? > Specially given the low number of connections we usually have running > against these machines. Sorting of random reads depends on multiple concurrent requests (which you don't have). Sorting of random writes does not depend on concurrent requests so, you'll benefit on your updates. About SCSI vs SATA vs number of spindles : can't answer this one. > We are using one single SQL statement. OK, so forget about fsync penalty, but do tune your checkpoints so they are not happening all the time... and bgwriter etc. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance |
| ||||
| PFC schrieb: > PCI limits you to 133 MB/s (theoretical), actual speed being around > 100-110 MB/s. "Current" PCI 2.1+ implementations allow 533MB/s (32bit) to 1066MB/s (64bit) since 6-7 years ago or so. > For instance here I have a box with PCI, Giga Ethernet and a > software RAID5 ; reading from the RAID5 goes to about 110 MB/s (actual > disk bandwidth is closer to 250 but it's wasted) ; however when using > the giga ethernet to copy a large file over a LAN, disk and ethernet > have to share the PCI bus, so throughput falls to 50 MB/s. Crummy, eh ? Sounds like a slow Giga Ethernet NIC... > Let me repeat this : at the current state of SATA drives, just TWO > of them is enough to saturate a PCI bus. I'm speaking desktop SATA > drives, not high-end SCSI ! (which is not necessarily faster for pure > throughput anyway). > Adding more drives will help random reads/writes but do nothing for > throughput since the tiny PCI pipe is choking. In my experience, SATA drives are very slow for typical database work (which is heavy on random writes). They often have very slow access times, bad or missing NCQ implementation (controllers / SANs as well) and while I am not very familiar with the protocol differences, they seem to add a hell of a lot more latency than even old U320 SCSI drives. Sequential transfer performance is a nice indicator, but not very useful, since most serious RAID arrays will have bottlenecks other than the theoretical cumulated transfer rate of all the drives (from controller cache speed to SCSI bus to fibre channel). Thus, lower sequential transfer rate and lower access times scale much better. >> Whether a SAN or just an external enclosure is 12disk enough to >> substain 5K inserts/updates per second on rows in the 30 to 90bytes >> territory? At 5K/second inserting/updating 100 Million records would >> take 5.5 hours. That is fairly reasonable if we can achieve. Faster >> would be better, but it depends on what it would cost to achieve. 5K/s inserts (with no indexes) are easy with PostgreSQL and typical (current) hardware. We are copying about 175K rows/s with our current server (Quad core Xeon 2.93GHz, lots of RAM, meagre performance SATA SAN with RAID-5 but 2GB writeback cache). Rows are around 570b each on average. Performance is CPU-bound with a typical number of indexes on the table and much lower than 175K/s though, for single row updates we get about 9K/s per thread (=5.6MB/s) and that's 100% CPU-bound on the server - if we had to max this out, we'd thus use several clients in parallel and/or collect inserts in text files and make bulk updates using COPY. The slow SAN isn't a problem now. Our SATA SAN suffers greatly when reads are interspersed with writes, for that you want more spindles and faster disks. To the OP I have 1 hearty recommendation: if you are using the RAID-functionality of the 2120, get rid of it. If you can wipe the disks, try using Linux software-RAID (yes, it's an admin's nightmare etc. but should give much better performance even though the 2120's plain SCSI won't be hot either) and then start tuning your PostgreSQL installation (there's much to gain here). Your setup looks decent otherwise for what you are trying to do (but you need a fast CPU) and your cheapest upgrade path would be a decent RAID controller or at least a decent non-RAID SCSI controller for software-RAID (at least 2 ports for 12 disks), although the plain PCI market is dead. -mjy -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance |
| Thread Tools | |
| Display Modes | |
|
|