This is a discussion on Constant char length is a good practice? within the MySQL forums, part of the Database Server Software category; --> Axel Schwenke wrote: > Jerry Stuckle <jstucklex@attglobal.net> wrote: > >>Axel Schwenke wrote: >> >>>Jerry, you are right. Processing variable ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Axel Schwenke wrote: > Jerry Stuckle <jstucklex@attglobal.net> wrote: > >>Axel Schwenke wrote: >> >>>Jerry, you are right. Processing variable length records is more >>>expensive than processing fixed length records. But it's only CPU >>>cycles. Real world databases are always I/O bound, never CPU bound. >> >>Not necessarily. I've seen many cases where the system is running virtually 100% cpu for the >>database. Everything is in memory, so there is no I/O work being done. > > > I wouldn't call this a /database/ then. It's more like a cache. > Sure it's a database. But one of the things the DBM, OS and hardware can do to improve access is cache the data. And if you have the memory, large caches are great. But that doesn't mean it isn't a RDB. > Also this discussion started with "there may be a severe impact on > doing table scans" - your point in > <d-2dndCiTaQXHzXZnZ2dnUVZ_vOdnZ2d@comcast.com>. > Most of my arguments are specific to that. You're no longer talking > about table scans. > And many of those operations *may* be table scans. Or they can be more complex operations. > >>the CPU must perform the transfer from the disk buffer to RAM, or, >>if DMA (Direct Memory Access) is available, the CPU must pause while >>the data bus is being used for the transfer. > > > Are we talking about servers? There always is DMA. Also memory > bandwidth is far above the assumed 20MB/s. IIRC it is somewhere in the > 1GB/s range. So a running DMA does not hog the bus completely. BTW, my > home "server" - in fact a desktop machine with some more RAM and disks > - shows about 10% cpu time spent in kernel/interrupts when the I/O > system is maxed out (rebuild of soft RAID). > No, there is not always DMA. And no, DMA doesn't hog the bus completely. Please read what I said - "the CPU must pause while the data bus is being used for the transfer". That is still accurate. > For the curious: it's an AMD Athlon64/3000+ with 2GB RAM and 4 SATA > disks @400GB each. Running Debian/Sarge x64 edition. > That's one type of server. As an aside, mainframes have much better I/O and DMA - in fact, they can DMA without pausing the processing. But they still can be either I/O or CPU bound (or both at different times). But again, not everyone's running servers like this for their databases. > >>>Another calculation. Lets stay with 200 byte records. Assume we get >>>20MB/s from the hard disks. Thats 100.000 rows per second. Then we can >>>spend 10.000 CPU cycles per row without losing speed. You don't need >>>10.000 cycles to read a few length fields from the record and add them. >>>(remember: the record is already in memory) >>> >> >>Which is a lot less than 10K instructions - probably around 2-4K >>machine instructions, maybe fewer. > > > So how many instructions will it take to read two integers (length of > record, length of empty space behind record) from the record buffer > and add them to the record pointer? 20, 50 instructions? That's less > than 1% of the available. Come on! > First you have to fetch the buffer, which will take a few hundred instructions. Then to fetch the buffer contents, you must make a call to the OS. The necessary context switch to ring 0 and back will take at least 50 instructions. You're probably talking at least 500 instructions, just to fetch the two integers. That would probably average about 1.5K to 2K machine cycles - or 15-20% of the available CPU. > >>>Also, nobody reads single rows for table scans. MyISAM reads hunks of >>>8MB (default read_buffer_size, can be increased). There should be many >>>records in this 8MB buffer. The only uncomfortability from variable >>>length records is the copying of the last, probably incomplete record >>>to the buffer start to complete it with the next read. This effect >>>will be negligible if 100 or more rows fit in the read buffer. >> >>Yes, 8M is the read buffer size. But that doesn't mean it reads 8M >>in at one time. > > > Sure it does. For TABLE SCANS. > I would be very surprised if MySQL did something so inefficient. Their developers are much smarter than that. > >>there is no reason to read 8M if all you want is 200 bytes. > > > We're doing a TABLE SCAN. We want *all* the records. > Eventually, yes. That doesn't mean you need to read all the records immediately. That's fine if you take this operation alone. But it also only makes sense if you completely ignore the fact other operations may be taking place concurrently. No other DBM I'm familiar is this inefficient, and I really doubt the MySQL developers would be that stupid. > >>Secondly, now you've just wiped any previous operations in your buffer. > > > The read buffer is per thread. A MySQL thread will only do one thing at > a time. This time it's doing a TABLE SCAN. > > Jerry, sorry to say so, but I don't see any progress is this > discussion. I'm feeling I'm wasting my time. Unless you can come up > with some hard facts, substanciating your claims - please stop it! > So you're saying that if thread 1 gets data in its buffer, then the next operation happens to get thread 2, that the data has to be read again? How inefficient! Yes, I agree there is no progress in this discussion. You have shown a complete level of ignorance of the most basic operations - even the fact that 1 machine cycle does not match 1 machine instruction. You also have no idea what's involved in a context switch - much less fetching data from the O/S. And the operations as you describe them are so inefficient that no DBM developer with any brains would employ them. You really should get more experience in the low level operations and what's really required to do some of this work. I have. In fact I used to do a significant amount of assembler programming on the PC. This was following my assembler work on mainframes while working for IBM - which included both OS and DBM programming over the years. I agree. I'm wasting my time. > Thanks. > > XL > -- > Axel Schwenke, Senior Software Developer, MySQL AB > > Online User Manual: http://dev.mysql.com/doc/refman/5.0/en/ > MySQL User Forums: http://forums.mysql.com/ -- ================== Remove the "x" from my email address Jerry Stuckle JDS Computer Training Corp. jstucklex@attglobal.net ================== |