This is a discussion on Performance mystery -- SP vs script within the SQL Server forums, part of the Microsoft SQL Server category; --> I have encountred situations like this before, but this one has me stumped. I have a pretty simple SP ...
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| I have encountred situations like this before, but this one has me stumped. I have a pretty simple SP that collects information about residential properties from a large database. First step is to query on the basis of address or location, and collect a temp table of property IDs. Second step is to populate a composite table of property information by joining the IDs to a table of characterisitics. Third step is to update some fields by finding a single value from multiple candidates in large tables -- one has 275 million, another 325 million rows -- e.g., the price of the most recent sale for a property. As an SP, this takes absolutely forever, and it seems it's doing endless scans of the large tables. So to analyze it, I took the code and ran it as a script -- turned the parameter definition at the top into a DECLARE statement, set values for the variables that are the input parameters, no other changes, and go. Presto! It runs in no time flat, and the query plans reveal it's using the indexes just like it's supposed to. But the SP might take an hour to do the same thing. Any suggestions about what to look for? I believe both versions have fresh query plans -- I have recompiled (and dropped and recreated) the SP, and the plain script should have a fresh plan. Maybe it's because the parameter values are known when the script runs, but not when the SP is complied? I would really appreciate any pointers, and can provide more information as needed. Thanks, Jim Geissman |
| |||
| PS -- it's the update statements that are the problem. Even if the temp table has only one record, four parallel threads are spawned to search through a large table hunting for the ten or so records for that property -- even though thereare indexes aplenty allowing it to go directly to the appropriate records. |
| |||
| You might want to try recompiling the stored procedure. It's possible that when the stored procedure was first run and a query plan was created, there were different indexes or the tables did not hold the same data. The stored procedure might still be trying to use that same query plan even though the data and/or indexes have changed. HTH, -Tom. |
| |||
| That sure sounds reasonable, Tom, however nothing has changed with the tables for a week or more, and the SP has only existed for two days. One change I did make that seems to have led to this was: I had approximately this formerly: update t set amt=e.amt, date=e.date,name=n.name from #temp t join event e on e.propid=t.propid join name n on n.propid=e.propid and n.eventid=e.eventid where e.eventtype=... and e.date=(select max(date) from event e1 where e1.propid=t.propid and...) I split it into two separate update queries, one for the date & amount, another for the name, because the two should each be the most recent (with some conditions), but don't have to be from the same event. The joins dropped from three tables to two. That change broke it. |
| |||
| There's an index that has everything needed to locate the amount -- propid, date, eventtype, and amount, in that order. However, it is being scanned on the basis of date and event type (millions of rows) instead of going directly to the correct handful of rows by adding propid. Is there a way to change this behavior? |
| |||
| On 9 Mar 2005 15:38:30 -0800, jim_geissman@countrywide.com wrote: >PS -- it's the update statements that are the problem. >Even if the temp table has only one record, four parallel >threads are spawned to search through a large table >hunting for the ten or so records for that property -- even >though thereare indexes aplenty allowing it to go directly >to the appropriate records. Which version of SQL Server are you using? I've had problems with SQL Server 7 grossly misestimating the time to execute query plans, thinking it has exceeded the parallelism threshold, and choosing to use parallelism when it actually hurts performance significantly. Try changing your server's parallelism threshold to 5x or 10x the default value. |
| ||||
| (jim_geissman@countrywide.com) writes: > I have encountred situations like this before, but this one > has me stumped. > > I have a pretty simple SP that collects information about > residential properties from a large database. First step is to > query on the basis of address or location, and collect a temp > table of property IDs. Second step is to populate a composite > table of property information by joining the IDs to a table of > characterisitics. Third step is to update some fields by finding > a single value from multiple candidates in large tables -- one > has 275 million, another 325 million rows -- e.g., the price of > the most recent sale for a property. > > As an SP, this takes absolutely forever, and it seems it's doing > endless scans of the large tables. So to analyze it, I took the > code and ran it as a script -- turned the parameter definition at > the top into a DECLARE statement, set values for the variables > that are the input parameters, no other changes, and go. Presto! > It runs in no time flat, and the query plans reveal it's using > the indexes just like it's supposed to. But the SP might take > an hour to do the same thing. Without seeing the code, and not having information about the table etc, the best I can offer is wild guesses and standard recommendations. Since replacing the parameters with variables appeared to give effect, it seems you have a workaround. (Copy the parameters to local variables.) Then again, there are a couple of more possibilities. If the issue is indeed parameters vs. variables, then it could be a case of psrameter sniffing when you don't want it. "Parameter sniffing" is when SQL Server uses the values of the input parameters on the first invocation to build the query plan for the stored procedure. When it comes to variables, SQL Server does not know their values before and builds the plan from standard assumptions. Usually parameter sniffing is good, but say that you have: CREATE PROCEDURE sniff_sp @a datetime ... AS ... IF @a IS NULL SELECT @a = convert(char(8), @a, 112) And on first invocation, you call the procedure with @a = NULL. That's a value you don't use, so it will not be good for the plan. Another alternative is that the stored procedure was created with any of ANSI_NULLS or QUOTED_IDENTIFIER off, and the procedure involves access to an indexed view or an indexed computed columns. The two mentioned settings must be ON for such indexes to be used, and these two settings are saved with the stored procedure. You can use the Objectproperty() function to investigate this. There may be other reasons as well, as the optimizer being to optimistic about the benefits of parallelism, or too pessimistic about the usefulness of a non-clustered index. You can use OPTION (MAXDOP 1) to turn of parallelism. You can also try to force the use of a certain index, but you should be careful, because a using a non-clustered index when there are too many hits and be real disaster performancewise. -- Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se Books Online for SQL Server SP3 at http://www.microsoft.com/sql/techinf...2000/books.asp |