This is a discussion on I don't understand this "server timeout" within the SQL Server forums, part of the Microsoft SQL Server category; --> I was doing update statements in SQL Server 2000. I have a table with over 16 million rows. It ...
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| I was doing update statements in SQL Server 2000. I have a table with over 16 million rows. It came from several hundred delimited text files, and two of the columns are file ID (int) and Line # (int) Structure is X12 (835). For those unfamiliar with that, each file has one to many BPR lines; each BPR line has zero to many CLP lines, each of those has zero to many SVC lines, each of those has zero to many CAS lines. Working with this through the Enterprise Manager MMC, a lot of things I tried got timeouts. So, I indexed File ID, Line number, and line type, and created a new table containing only the columns I knew I would need in the final output--selected fields from some of the line types mentioned, plus the line numbers and common file ID for those rows. I indexed every column in that table that I thought I might search on. I loaded it with 31 thousand rows using a select on a subset of the CAS rows. That took far less than a minute. I updated each row with the highest BPR line number not higher than the CASE line number. About a minute. Not bad, with having the worst case number of comparisons being 16 million times 31 thousand. Of course, the indexing should help plus it can be narrowed down by the "same file" and BPR # < CAS # criteria. But the next update should theoretically be faster: each row now has a BPR # and a CAS # and I am telling it to find the highest CLP number BETWEEN those two. So it should have a MUCH smaller set of to search through. Yet it thinks for about five minutes and then announces a timeout. Any suggestions? -- Wes Groleau Measure with a micrometer, mark with chalk, and cut with an axe. |
| |||
| Wes Groleau (groleau+news@freeshell.org) writes: > Working with this through the Enterprise Manager MMC, > a lot of things I tried got timeouts. > > So, I indexed File ID, Line number, and line type, and > created a new table containing only the columns I knew > I would need in the final output--selected fields from > some of the line types mentioned, plus the line numbers > and common file ID for those rows. > > I indexed every column in that table that I thought I might > search on. > > I loaded it with 31 thousand rows using a select on a > subset of the CAS rows. That took far less than a minute. > > I updated each row with the highest BPR line number not higher > than the CASE line number. About a minute. Not bad, with having > the worst case number of comparisons being 16 million times 31 thousand. > Of course, the indexing should help plus it can be narrowed down by > the "same file" and BPR # < CAS # criteria. > > But the next update should theoretically be faster: each row now has a > BPR # and a CAS # and I am telling it to find the highest CLP number > BETWEEN those two. So it should have a MUCH smaller set of to search > through. Yet it thinks for about five minutes and then announces a > timeout. Unforunately there is very little here to work from. X12 tells me nothing. And in any case you have added a number of indexes that are unknown to me. But let me point out thing: indexing single columns is far from always sufficient. Often you need composite indexes. To be able to say something more useful, I would be able to see the CREATE TABLE statements for the tables. (Or is there only one?), as well as the indexes, including keys. And of course I would need to know your UPDATE statements. And if there are any triggers, I need to see those as well. -- Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se Books Online for SQL Server 2005 at http://www.microsoft.com/technet/pro...ads/books.mspx Books Online for SQL Server 2000 at http://www.microsoft.com/sql/prodinf...ons/books.mspx |
| |||
| Erland Sommarskog wrote: > Unforunately there is very little here to work from. X12 tells me > nothing. And in any case you have added a number of indexes that are > unknown to me. But let me point out thing: indexing single columns is > far from always sufficient. Often you need composite indexes. > > To be able to say something more useful, I would be able to see > the CREATE TABLE statements for the tables. (Or is there only one?), > as well as the indexes, including keys. And of course I would need > to know your UPDATE statements. And if there are any triggers, I > need to see those as well. No triggers. I tried to script the table (actually I tried to script a similar table to save myself some typing) but the wizard saved no file and gave no error message. So I copied the table and used the GUI to strip out the fields I didn't need/add a few others All the fields referenced in the UPDATE statements are indexed. The update statements are almost identical--the difference is that in UPDATE Output SET xyz = (SELECT Max(Seg_Nbr) FROM Raw_Segs WHERE Output.FID = Raw_Segs AND Seg_Nbr BETWEEN abc and pqr) xyz, abc, & pqr are different columns, such that pqr - abc is a wider range in the one that works; narrower in the one that dies with timeout. -- Wes Groleau A pessimist says the glass is half empty. An optimist says the glass is half full. An engineer says somebody made the glass twice as big as it needed to be. |
| |||
| Erland Sommarskog wrote: > To be able to say something more useful, I would be able to see > the CREATE TABLE statements for the tables. (Or is there only one?), > as well as the indexes, including keys. And of course I would need > to know your UPDATE statements. And if there are any triggers, I > need to see those as well. Sorry for the too-soon send. As I said, I did the table design with the GUI but it would be equivalent to ( CAS_Seg int, SVC_Seg int, CLP_Seg int, BPR_Seg int, other fields ) CAS_Seg is loaded first, with an INSERT from a view of Raw_Segs. Then BPR_Seg is updated with the highest Seg_Nbr lass than CAS_Seg in the same file. Works. Then one of the updates I just sent is tried and times out. -- Wes Groleau ------ "The reason most women would rather have beauty than brains is they know that most men can see better than they can think." -- James Dobson |
| |||
| Wes Groleau (groleau+news@freeshell.org) writes: > No triggers. I tried to script the table (actually I tried to script > a similar table to save myself some typing) but the wizard saved no file > and gave no error message. So I copied the table and used the GUI to > strip out the fields I didn't need/add a few others > > All the fields referenced in the UPDATE statements are indexed. > > The update statements are almost identical--the difference is that in > > UPDATE Output SET xyz = (SELECT Max(Seg_Nbr) FROM Raw_Segs > WHERE Output.FID = Raw_Segs AND Seg_Nbr BETWEEN abc and pqr) > > xyz, abc, & pqr are different columns, such that pqr - abc is > a wider range in the one that works; narrower in the one that > dies with timeout. Again, CREATE TABLE and CREATE INDEX statements for your two tables would help. Knowing that "all fields ... are indexed" is not a very useful piece of information. I would need to know where in the index the column appears, and which index that is the clustered index, if there is any. But you could try: UPDATE Output SET xyz = R.maxseg FROM Output o JOIN (SELECT Raw_segs, maxseg = Max(Seg_Nbr) FROM Raw_Segs WHERE Seg_Nbr BETWEEN abc AND pqr) R ON Output.FID = R.Raw_Segs While this syntax is proprietary and not portable, it often yields better results than a correlated subquery. -- Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se Books Online for SQL Server 2005 at http://www.microsoft.com/technet/pro...ads/books.mspx Books Online for SQL Server 2000 at http://www.microsoft.com/sql/prodinf...ons/books.mspx |
| |||
| Erland Sommarskog wrote: > But you could try: > > UPDATE Output > SET xyz = R.maxseg > FROM Output o > JOIN (SELECT Raw_segs, maxseg = Max(Seg_Nbr) > FROM Raw_Segs > WHERE Seg_Nbr BETWEEN abc AND pqr) R > ON Output.FID = R.Raw_Segs > > While this syntax is proprietary and not portable, it often yields better > results than a correlated subquery. I think I also tried that, but maybe not. I'm going in to work today, so I'll make sure. And I'll make another try at extracting a script from that thing. -- Wes Groleau Expert, n.: Someone who comes from out of town and shows slides. |
| |||
| Wes Groleau (groleau+news@freeshell.org) writes: > I think I also tried that, but maybe not. I'm going in to work today, > so I'll make sure. And I'll make another try at extracting a script > from that thing. Note that you can also script from Query Analyzer. And if only the timeout bothers you, run the UPDATE from QA. QA does not have any timeouts. -- Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se Books Online for SQL Server 2005 at http://www.microsoft.com/technet/pro...ads/books.mspx Books Online for SQL Server 2000 at http://www.microsoft.com/sql/prodinf...ons/books.mspx |
| |||
| Erland Sommarskog wrote: > But you could try: > > UPDATE Output > SET xyz = R.maxseg > FROM Output o > JOIN (SELECT Raw_segs, maxseg = Max(Seg_Nbr) > FROM Raw_Segs > WHERE Seg_Nbr BETWEEN abc AND pqr) R > ON Output.FID = R.Raw_Segs > > While this syntax is proprietary and not portable, it often yields better > results than a correlated subquery. Or maybe UPDATE Output SET xyz = R.maxseg FROM Output o JOIN (SELECT FID, maxseg = Max(Seg_Nbr) FROM Raw_Segs WHERE Seg_Nbr BETWEEN abc AND pqr) R ON Output.FID = R.FID But unfortunately, Ent. Mgr/SQL Svr 2000 rejected this, saying that the optional FROM syntax is not supported. By some experimentation, I got THAT message to go away (even though both FROMs were still there). But none of the ten variations I tried were accepted. (By the way, page 72 and following of SQL Cookbook offers both this approach and the one that my first approach was based on. But the syntax it says will work for the first approach was also rejected by my system--though I managed to alter it enough to work in the one case. Apparently, the "optimizer" is not very smart. I eventually got the job done as follows: Load Raw_Segments Create and insert Raw_BPR, Raw_CLP, Raw_SVC, Raw_CAS, Patient_Names, etc. Index the above Create Inv_Data with indexes Crate view or table Selected_Adjustments from Raw_CAS INSERT INTO Inv_Data (FID, CAS_Seg, Adj_Group, Adj_Reason, Adj_Amount) SELECT FID, Seg_Nbr, Type, Code, CAST(Amount AS money) AS Expr1 FROM Selected_Adjustments UPDATE Inv_Data SET SVC_Seg = (SELECT MAX(Seg_Nbr) FROM Raw_SVC AS Raw WHERE Inv_Data.FID = Raw.FID AND CAS_Seg > Seg_Nbr) UPDATE Inv_Data SET Service = (SELECT Elem_01 FROM Raw_SVC AS Raw WHERE Inv_Data.FID = Raw.FID AND SVC_Seg > Seg_Nbr) UPDATE Inv_Data SET CLP_Seg = (SELECT MAX(Seg_Nbr) FROM Raw_CLP AS Raw WHERE Inv_Data.FID = Raw.FID AND CAS_Seg > Seg_Nbr) UPDATE Inv_Data SET BPR_Seg = (SELECT MAX(Seg_Nbr) FROM Raw_BPR AS Raw WHERE Inv_Data.FID = Raw.FID AND CAS_Seg > Seg_Nbr) Each update takes about ten seconds this way. No doubt there's a simpler way, but I'm new at this. -- Wes Groleau He that complies against his will is of the same opinion still. -- Samuel Butler, 1612-1680 |
| |||
| Erland Sommarskog wrote: > And if only the timeout bothers you, run the UPDATE from QA. QA does not > have any timeouts. How do you do that? When I click the icon that has the tooltip "execution mode," it acts like it's doing something for a while, and then it displays an execution plan. But the table is unchanged. Then I select/copy the SQL, paste it into the Enterprise Manager and click the exclamation point. It complains of a syntax error. This happened before--I just forgot to mention it. Removing the syntax error got the time out. Maybe the timeout is because in one case (no timeout) it first gathered part of the subquery (877 rows of 16 million) making the second part 877 comparing to 31000. But if it tries the other part first, it is checking the full 31000 rows against the original 16 million. I don't know whether that happened, but it would explain why I did the job in less than sixty seconds by doing some of the subqueries as separate extract and insert steps. -- Wes Groleau Heroes, Heritage, and History http://freepages.genealogy.rootsweb.com/~wgroleau/ |
| ||||
| Wes Groleau (groleau+news@freeshell.org) writes: > How do you do that? When I click the icon that has the tooltip > "execution mode," You click on the green arrow. (Or press F5 or CTRL/E.) Then it will run the query which is the window. I strongly encourage you to get acquianted with Query Analyzer to run your queries. What you have in Enterprise Manager is a query designer, and a fairly limited one as testified about the bogus message about the FROM clause not being supported. In Query Analyzer you are only limited by what SQL Server permits; the tool itself does not limit you. -- Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se Books Online for SQL Server 2005 at http://www.microsoft.com/technet/pro...ads/books.mspx Books Online for SQL Server 2000 at http://www.microsoft.com/sql/prodinf...ons/books.mspx |
| Thread Tools | |
| Display Modes | |
|
|