IDS 10 erratic run times (Informix forums, Database Server Software category)
IDS 10.0FC3 on Solaris 2.9

This is a brand new server which we are preparing to migrate a v9 system to. Over the past few weeks I have run multiple dbimports from more-or-less the same data source - certainly the same schema - but the import time varies widely.

The fastest I've done it is 9h, the slowest 14. This weekend's took 13h 10m. There's never anything else going on on it. In the UPDATE STATISTICS phase the elapsed time was 3h: during the "fast" run it was just 1h. I had a look during this period: often onstat -p, onstat -m and onstat -D would show periods of perhaps 2 minutes with no activity at all: no bufreads or physreads, no checkpointing for 20 minutes (the checkpoint interval is 5 mins), the relevant thread just "sleeping forever".

I haven't changed the disk layout (a 60G database striped across three disks in a JBOD array with the standard Solaris LVM interlace of 16k, this stripe mirrored to another 3 disks identically configured attached to another controller on the same array). I have experimented with different buffer sizes: perversely, one 9h run was with BUFFERS set to just 100,000 rather than the usual 1,000,000, although there's no consistent pattern to this now I've cut the LRU MIN and MAX, and I've had another 9h run with the 2GBytes of BUFFERS since.

It's the fact that all activity just seems to stop for long periods that puzzles me. If, during such a period, I fire up dbaccess and run a sysmaster query it returns instantly, so I can't see a systemic cause. The new box is still 1.5 to twice the speed of the old in comparative testing, but this disparity in times is puzzling me and I don't want it to come back and bite us after go live.

Any suggestions or inspiration welcome!

thanks
Neil
| |||
On Mon, 19 Sep 2005 01:01:44 +0100, "Neil Truby" <neil.truby@ardenta.com> wrote:

>IDS 10.0FC3 on Solaris 2.9
>
>This is a brand new server which we are preparing to migrate a v9 system to.
>Over the past few weeks I have run multiple dbimports from more-or-less the
>same data source - certainly the same schema - but the import time varies
>widely.
>
>The fastest I've done it is 9h, the slowest 14. This weekend's took 13h
>10m. There's never anything else going on on it. In the UPDATE STATISTICS
>phase the elapsed time was 3h: during the "fast" run it was just 1h. I had
>a look during this period: often onstat -p onstat -m onstat -D would show
>periods of perhaps 2 minutes with no activity at all: no bufreads or
>physreads, no checkpointing for 20 minutes (the checkpoint interval is 5
>mins), the relevant thread just "sleeping forever".
>

Waiting for disk, perhaps? Something at the OS level?

>I haven't changed the disk layout (a 60G database striped across three disks
>in a JBOD array with the standard Solaris LVM interlace of 16k, this stripe
>mirrored to another 3 disks identically configured attached to another
>controller on the same array); I have experimented with different buffer
>sizes (perversely one 9h run was with BUFFERS set to just 100,000 rather
>than the usual 1,000,000 although there's no consistent pattern to this now
>I've cut the LRU MIN and MAX, and I've had another 9h run with the 2GBytes
>of BUFFERS since).
>
>It's the fact that all activity just seems to stop for long periods that
>puzzles me. If, during such a period, I fire up dbaccess and run a sysmaster
>query it returns instantly, so I can't see a systemic cause. The new box is
>still 1.5 to twice the speed of the old in comparative testing but this
>disparity in times is just puzzling me and I don't want it to come back and
>bite us after go live.
>

What are the oninits doing? Percentages?

Just a few thoughts . . . .

JWC
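One way to pin down when the stop-start periods happen, so they can be lined up against OS-level disk and CPU figures, is to log a cumulative counter such as bufreads at a fixed interval and then look for windows where it does not move. The following is only a minimal sketch under stated assumptions: the sample list, the `find_stalls` helper and the 60-second threshold are all hypothetical, and how the counter values are collected (for example by saving onstat -p output on a timer) is not shown.

```python
from typing import List, Optional, Tuple

# Each sample is (timestamp in seconds, cumulative counter value).
# Assumption: the counter is something like bufreads, logged at a fixed interval.
Sample = Tuple[float, int]


def find_stalls(samples: List[Sample],
                min_stall_seconds: float = 60.0) -> List[Tuple[float, float]]:
    """Return (start, end) windows in which the counter did not advance
    for at least min_stall_seconds."""
    stalls: List[Tuple[float, float]] = []
    stall_start: Optional[float] = None
    prev_ts: Optional[float] = None
    prev_val: Optional[int] = None

    for ts, val in samples:
        if prev_val is not None:
            if val == prev_val:
                # Counter is flat: the stall began at the previous sample.
                if stall_start is None:
                    stall_start = prev_ts
            else:
                # Counter moved again: close any open stall window.
                if stall_start is not None and prev_ts - stall_start >= min_stall_seconds:
                    stalls.append((stall_start, prev_ts))
                stall_start = None
        prev_ts, prev_val = ts, val

    # Close a stall that is still open at the end of the log.
    if stall_start is not None and prev_ts - stall_start >= min_stall_seconds:
        stalls.append((stall_start, prev_ts))
    return stalls


if __name__ == "__main__":
    # Made-up illustrative data, sampled every 30 seconds.
    demo = [(0, 100), (30, 200), (60, 200), (90, 200), (120, 200), (150, 350)]
    for start, end in find_stalls(demo, min_stall_seconds=60):
        print(f"no activity from t={start}s to t={end}s")
```

The windows it reports can then be compared by eye with whatever was logged at the OS level over the same period, which is one way of approaching the "waiting for disk?" question above.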
| |||
Ahem, you may want to check what exactly is slow.

In the past there were problems with SPLs (stored procedures) during import, enough to give one grey hair: SLOWWWW. The workaround was, before the export, to use dbschema to extract them all (do not forget the permissions) and drop them all, then export and import, and after that recreate them all.

So please check what the import is doing at that point.

Superboer.
| |||
Neil,

I have seen something similar when we had AUTORAID configured on a system.

Sometimes the disks ran as RAID 10 and other times as RAID 5, depending on how the AUTORAID decided to configure them.

When RAID 10, the performance was great.

When RAID 5, our performance was woeful.
| |||
"scottishpoet" <dryburghj@yahoo.com> wrote in message news:1127128089.906988.42210@g43g2000cwa.googlegroups.com...

> Neil,
>
> I have seen something similar when we had AUTORAID configured on a
> system
>
> Sometimes the disks ran as RAID 10 and other times as RAID 5 depending
> how the AUTORAID decided to configure them
>
> When RAID 10 the performance was great
>
> When RAID 5 our performance was woeful
>

Interesting. I'm sure this is a permanent RAID 0+1. And the performance is not exactly consistently slow, more stop-start. I will check this out.

At the moment the situation is too vague to raise a support call with IBM, and it has to date resisted my attempts to reproduce a smaller case. We're continuing to try to build a reproducible case, and to see if it's just a 10.0 thing (no reason to believe it is at this point).

I've had a few cases in the past couple of years when we've painstakingly built a case, then been told almost immediately by IBM that we've hit a known problem. How I wish there was still the facility of TechInfo, where we mortals can check this out for ourselves at the outset.

If anyone from Tech Support is reading, could I ask them if there are any known bugs in IDS 10.0 consistent with our symptoms?

thanks
Neil
| |||
Just like Neil, we're preparing to migrate our v9.40 servers to v10. I already converted one test instance to v10UC3R1, using the same dbspaces, the same onconfig, sqlhosts etc. The upgrade went fine, at least the log said so, but now we're constantly facing

-640 QPlan sanity failure <line-number>

exceptions. UPDATE STATISTICS doesn't help. Any ideas?

Thanks everyone.

"Neil Truby" <neil.truby@ardenta.com> wrote in message news:3p6df9F8or2fU1@individual.net...

> IDS 10.0FC3 on Solaris 2.9
>
> This is a brand new server which we are preparing to migrate a v9 system
> to.
> Over the past few weeks I have run multiple dbimports from more-or-less
> the
> same data source - certainly the same schema - but the import time varies
> widely.
| |||
What update statistics did you run?

If you didn't specify "for procedure" it would not have updated the query plans for the procedure.

Personally, I'd drop and recreate the procedure.

From the new techinfo (search for "Qplan" and check "Solve a problem"):

Error 640 running stored procedure (SP) after database server upgrade
Technote (FAQ)

Problem: When you try to run a stored procedure you get an error -640 after upgrading Informix Dynamic Server.

PROBLEM
If you try to run a stored procedure (SP) after upgrading your database server, you might get error -640.

    -640 QPlan sanity failure line-number

CAUSE
Corruption in the SP query plan causes the -640 error. This error occurs when the database server tries to read and build a SP query plan, or read the constructed query plan and store it in a binary format.

Important note: There are other possible causes for this problem. If this document does not solve your problem, search for other documents with the same problem. If you can't find any then contact your local technical support office.

SOLUTION
Run the following command for a selected database:

    UPDATE STATISTICS FOR PROCEDURE;
| |||
Well, the truth is I didn't specify "for procedure" the first time, but I did this time and the same problem still appears. I'll drop and recreate all the problematic procedures and hope it works. I also contacted my local support; maybe they'll know the solution.

"scottishpoet" <dryburghj@yahoo.com> wrote in message news:1127217937.510858.314160@z14g2000cwz.googlegroups.com...

> what update statistics did you run
>
> If you didn't specify "for procedure" it would not have updated the
> query plans for the procedure
>
> Personally, I'd drop and recreate the procedure.
| ||||
Surprised that didn't work!

There was a change to the query plans after 9.40.UC4, which means the plans all need to be updated, and the "update statistics for procedure" requirement should be documented.

Are you using ER?