This is a discussion on insert statement blocked within the SQL Server forums, part of the Microsoft SQL Server category; --> Hi Shortly, I keep invoices in a table. Occasionally, someone will fire the execution of a stored procedure (SP) ...
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Hi Shortly, I keep invoices in a table. Occasionally, someone will fire the execution of a stored procedure (SP) that performs several UPDATEs against (potentially) all invoices OLDER than a date that is supplied to the SP as a parameter. The SP is usually a lengthy process (it takes at least 30 mins). The problem is that SQL server 2000 Dev Edition doesn't allow me to insert new invoices that are "younger", while the SP is executing. How should I specify to SQL Server that new invoices are "harmless"? Thanks. |
| |||
| On 25 Sep 2006 03:04:37 -0700, nano2k wrote: >Hi > >Shortly, I keep invoices in a table. >Occasionally, someone will fire the execution of a stored procedure >(SP) that performs several UPDATEs against (potentially) all invoices >OLDER than a date that is supplied to the SP as a parameter. >The SP is usually a lengthy process (it takes at least 30 mins). >The problem is that SQL server 2000 Dev Edition doesn't allow me to >insert new invoices that are "younger", while the SP is executing. >How should I specify to SQL Server that new invoices are "harmless"? > >Thanks. Hi nano2k, To answer that question, we need to know: - The design of your tables. Please post in the form of CREATE TABLE statements. Don't forget to include all constraints and properties and all indexes for the tables - they are probably a key factor in this situation. - The code executed in the stored procedure. - The code executed to insert new rows. -- Hugo Kornelis, SQL Server MVP |
| |||
| Hi Hugo Thank you for your quick reply. It's hard for me to provide all data that you need (and I fully understand the needing) because all (many) objects involved (SP, tables, indexes, triggers) are big (as number of lines and also as complexity). I reformulate my problem to make it more simple to understand and let's forget about the complexity of the objects enumerated above. I have an app that allows me to insert invoices into my invoice table. Classic. Today is sept 23rd 2006. I need to run my SP for all invoices older than current date (including invoices for 22nd of sept 2006). The SP ***may*** modify (that is, UPDATE) ***all*** invoices until 22nd (including). The SP runs 2h. This means that I have to wait 2h before the system allows me to insert new invoices for 23rd of sept 2006, even if new invoices do not affect the result of my SP. This is not accepted as the SP may be executed 2-3 times a day. I also have tens of thousands of records in my invoice table. My questions: 1. What indexes should I set for my invoice table? Currently I have an uniqueidentifier as primary key. I suppose I should set another non-unique index for the [invoicedate] column. 2. What types of transactions should I use inside the SP, so that inserting new invoices will be accepted by db engine? Thanks. |
| |||
| Without the scripts Hugo requested, we really can't provide specific recommendations. I suggest you start by identifying the longest running query in your invoicing process and post the relevant DDL for that query. To identify problem queries, run a Profiler trace with SQL statement completed and stored procedure statement completed events and include a filter on high duration (e.g. > 1000). Examine the execution plans of those queries and add indexes or tweak the SQL for efficiency. Some general observations and guidelines: > I need to run my SP for all invoices older > than current date (including invoices for 22nd of sept 2006). I assume there is other criteria besides date for these older invoices. If you have some indicator like InvoiceProcessed, and index on that column (perhaps with InvoiceDate too) might be a good candidate. > The SP runs 2h. This seems like an extraordinarily long time for even a very involved daily process. I would expect that a standard invoicing process would take no more than a minute for thousands of invoices. I suspect sub-optimal execution plans (e.g. iterative scans) and/or poor query techniques (cursors). Suboptimal trigger processing (scans) are notorious for causing concurrency problems. I don't know the purpose of the triggers but you might instead consider performing the process in stored procedures if possible. > 1. What indexes should I set for my invoice table? Currently I have an > uniqueidentifier as primary key. I suppose I should set another > non-unique index for the [invoicedate] column. The best candidates for indexes are columns in JOIN and WHERE clause predicates. I already mentioned that InvoiceDate and the theoretical InvoiceProcessed column. In fact, this may be a good candidate for the clustered index. In lieu of scripts, you might try running the workload through the Index Tuning Wizard or Database Tuning Advisor for automated suggestions. > 2. What types of transactions should I use inside the SP, so that > inserting new invoices will be accepted by db engine? The purpose of transactions are to provide data integrity per ACID rules. It's best to specify the lowest level that satisfies this requirement. READ COMMITED is most often appropriate. However, keep in mind that performance and concurrency often go hand-in-hand. Ideally, the daily process should have minimal effect on inserting new invoices and visa-versa as long as index are useful. -- Hope this helps. Dan Guzman SQL Server MVP "nano2k" <adi@ikonsoft.ro> wrote in message news:1159265897.398795.185840@k70g2000cwa.googlegr oups.com... > Hi Hugo > > Thank you for your quick reply. > > It's hard for me to provide all data that you need (and I fully > understand the needing) because all (many) objects involved (SP, > tables, indexes, triggers) are big (as number of lines and also as > complexity). > > I reformulate my problem to make it more simple to understand and let's > forget about the complexity of the objects enumerated above. > > I have an app that allows me to insert invoices into my invoice table. > Classic. > Today is sept 23rd 2006. I need to run my SP for all invoices older > than current date (including invoices for 22nd of sept 2006). The SP > ***may*** modify (that is, UPDATE) ***all*** invoices until 22nd > (including). The SP runs 2h. This means that I have to wait 2h before > the system allows me to insert new invoices for 23rd of sept 2006, even > if new invoices do not affect the result of my SP. This is not accepted > as the SP may be executed 2-3 times a day. > I also have tens of thousands of records in my invoice table. > > My questions: > 1. What indexes should I set for my invoice table? Currently I have an > uniqueidentifier as primary key. I suppose I should set another > non-unique index for the [invoicedate] column. > 2. What types of transactions should I use inside the SP, so that > inserting new invoices will be accepted by db engine? > > Thanks. > |
| |||
| On 26 Sep 2006 03:18:17 -0700, nano2k wrote: >Hi Hugo > >Thank you for your quick reply. > >It's hard for me to provide all data that you need (and I fully >understand the needing) because all (many) objects involved (SP, >tables, indexes, triggers) are big (as number of lines and also as >complexity). Hi nano2k, I concur with Dan that it's hard to help you without the data we need. In general terms: you need to ensure that "old" invoices and "new" invoices can be completely seperated. Any read or search operation on either the "old" or "new" subset must be able to use indexes in which potentially locked rows from the other subset can be bypassed, and any insert, update or delete in one of the subsets should only incur such locks that the other subset is not affected at all. > The SP >***may*** modify (that is, UPDATE) ***all*** invoices until 22nd >(including). The SP runs 2h. This is your main problem. I am 99.9% sure that this SP can be optimized to perform much faster. I'm not talking about saving 10 or 20% on execution time, but aboout bringing execution time back to a couple of minutes, at most. >My questions: >1. What indexes should I set for my invoice table? Currently I have an >uniqueidentifier as primary key. I suppose I should set another >non-unique index for the [invoicedate] column. That's not a good design, for several reasons. First, if this uniqueidentifier is your ONLY key, then you have no protection at all against duplicate rows. Imagine that the data entry typists accidentally puts a paperweight on the <Enter> key - the program will just keep on generating uniqueidentifier values and inserting the same row over and over again. Using a surrogate key (the official term for key values generated by the database) can, in some cases, be a good idea, but only IN ADDITION TO THE BUSINESS KEY. The business key is an attribute (or combination of attributes) that uniquely identifies an instance of the object/entity type within your Universe of Discourse. Second, if you have to use a surrogate key, then uniqueidentifier is in most cases the worst possible choice. In almost all cases, IDENTITY is the preferred way to generate surrogate key values. Since uniqueidentifiers are not monotonically increasing but generated in a pseudo-random pattern, and since the primary key by default results in a clustered index, insertions will often result in page splits. That is in itself already pretty bad for INSERT performance, but in cases when many rows in the database may be blocked (as in your scenario), it's an open invitation to major blocking problems. Consider what happens if an in INSERT needs to store a row in a page that happpens to be full - the page has to be split and half the rows in it have to be moved. But what if they are locked by another transaction? And what it that transaction happens to be running for 2 hoours? I don't think that this is the only cause for your locking problems, but it's definitely one of the causes! >2. What types of transactions should I use inside the SP, so that >inserting new invoices will be accepted by db engine? Use the transaction isolation level that you need to maintain the level of integrity that your application requires. Never use a lower level: if you are willing to sacrifice correctness for speed, just ditch the database and program your reports to produce random data - lots faster and cheaper! But don't set the transaction isolation level higher than you need either, becuase (as Dan already pointed out) higher isolation means lower concurrency. -- Hugo Kornelis, SQL Server MVP |
| |||
| Hugo, Dan Thanks for your patience with this topic. I understand and agree to all your advices. My struggle, at least for the moment, is to make the application as much responsive as possible while strongly looking for data security. Let's say that for the moment we don't care how much the SP needs to process - we care only to let other users work while the SP is running. The SP will be run only a few times a month, but there are 2-3 fixed days of the month when the SP needs to perform - it's all about accounting the SP runs. I need uniqueidentifiers because my app is a three-tier client-server app (client/server/db) that needs to generate unique IDs from client side. The client app may generate several records from different tables that will be sent all together in a single request to the server. The records need to be already coupled (that is, foreign keys must be already known before sending the request to the database) before sending them to the server to process. Of course, there is an issue here regarding indexing this column, as Dan mentioned. Anyway, I have set a task to reindex the database every night, so I think this will reduce the payback to using uniqueidentifiers - do you agree? Is there any other action I should schedule? Dan, my uniqueid column (named [objectid]) is the only key, and at least at the moment, this column is set as primary key (ok? not ok?). My protection to duplicate entries is handled by the client application. 99% of the business logic is concentrated in client app. The server side only minimally processes the request and deals with the database (select/update/insert/delete). I fully understand that I have to logically split my tables using other indexes. Dan, you got right in the heart of my problem when you said: "In general terms: you need to ensure that "old" invoices and "new" invoices can be completely seperated". This is my goal at the moment. Your inputs along with Hugo's inputs helped me to start my research on the right path. But today, I have discovered that the SP heavily uses CURSORS This is the other big issue I have to deal with. Check out this piece of code (one of 3 CURSORS defined in SP): DECLARE inputs CURSOR FOR SELECT i.accountid, il.doctypeid, il.objectid, (CASE WHEN ilb.objectid IS NOT NULL THEN ilb.objectid ELSE il.objectid END), il.itemid, ilb.batchid, d.updatestock * d.cogssign * (CASE WHEN ilb.objectid IS NOT NULL THEN ilb.pdocquantity ELSE il.pdocquantity END), d.updatestock * d.cogssign * il.price * il.rate * (1 - il.discount / 100) * (1 - i.discount / 100), il.currencyid, il.rate, il.warehouseid, dbo.f_date_notime(il.stockdate) FROM inventory i JOIN inventoryline il ON (il.inventoryid = i.objectid) JOIN item it ON (it.objectid = il.itemid) JOIN doctype d ON (il.doctypeid = d.objectid) JOIN status st ON (st.objectid = i.statusid) JOIN warehouse w ON (w.objectid = il.warehouseid) LEFT JOIN inventorylinebatch ilb ON (ilb.inventorylineid = il.objectid) WHERE d.cogssign = 1 AND (st.final = 1 OR st.cancel = 1) AND d.doctypekey NOT IN ('25','26') AND il.stockdate >= dbo.f_date_notime(@last_date) + 1 AND it.stockcontrol = 1 AND ISNULL(w.nocost,0) = 0 ORDER BY il.itemid, il.stockdate, d.doctypekey, i.docnumber, il.create_date, ilb.create_date OPEN inputs ....follows fetching results in a loop I want to change this in the following manner: to create an temporary table and to insert the results of the above SELECT statement into this temp table. After that, using a cursor, I intend to fetch the records from the temp table. This way, the tables involved in the SELECT stm above will be locked as short as possible. Do you think this is a good approach? Note: In the SELECT statement above: ALL tables involved have a primay key defined as uniqueidentifier and named objectid. Also, ALL foreign keys (FK) are indexed like this: CREATE INDEX [FK_inventoryline_inventoryid] ON [dbo].[inventoryline]([inventoryid]) ON [PRIMARY] GO Thanks again. |
| |||
| > Dan, my uniqueid column (named [objectid]) is the only key, and at > least at the moment, this column is set as primary key (ok? not ok?). > My protection to duplicate entries is handled by the client > application. 99% of the business logic is concentrated in client app. > The server side only minimally processes the request and deals with the > database (select/update/insert/delete). I fully understand that I have > to logically split my tables using other indexes. You probably intended to direct this question at Hugo rather than me but my $.02 is that you should also have a unique constraint on you natural key. Even if integrity is enforced in the application, the database must still be queried based on the natural key value. The unique constraint creates a unique index that will improve performance of that query and guarantee data integrity as well. > ORDER BY il.itemid, il.stockdate, d.doctypekey, i.docnumber, > il.create_date, ilb.create_date I'm not sure why the ORDER BY here. Does your current processing logic require that rows be processed in a particular sequence? > I want to change this in the following manner: to create an temporary > table and to insert the results of the above SELECT statement into this > temp table. After that, using a cursor, I intend to fetch the records > from the temp table. This way, the tables involved in the SELECT stm > above will be locked as short as possible. Do you think this is a good > approach? The temp table will probably help mitigate blocking but you ought to avoid cursors entirely, if possible. A set-based process usually performs much better, especially if cursors are nested. If you must use a cursor, specify the LOCAL FAST_FORWARD READ_ONLY when possible. -- Hope this helps. Dan Guzman SQL Server MVP "nano2k" <adi@ikonsoft.ro> wrote in message news:1159306705.223359.38470@e3g2000cwe.googlegrou ps.com... > Hugo, Dan > Thanks for your patience with this topic. > > I understand and agree to all your advices. > > My struggle, at least for the moment, is to make the application as > much responsive as possible while strongly looking for data security. > Let's say that for the moment we don't care how much the SP needs to > process - we care only to let other users work while the SP is running. > The SP will be run only a few times a month, but there are 2-3 fixed > days of the month when the SP needs to perform - it's all about > accounting > the SP runs. > > I need uniqueidentifiers because my app is a three-tier client-server > app (client/server/db) that needs to generate unique IDs from client > side. The client app may generate several records from different tables > that will be sent all together in a single request to the server. The > records need to be already coupled (that is, foreign keys must be > already known before sending the request to the database) before > sending them to the server to process. Of course, there is an issue > here regarding indexing this column, as Dan mentioned. Anyway, I have > set a task to reindex the database every night, so I think this will > reduce the payback to using uniqueidentifiers - do you agree? Is there > any other action I should schedule? > > Dan, my uniqueid column (named [objectid]) is the only key, and at > least at the moment, this column is set as primary key (ok? not ok?). > My protection to duplicate entries is handled by the client > application. 99% of the business logic is concentrated in client app. > The server side only minimally processes the request and deals with the > database (select/update/insert/delete). I fully understand that I have > to logically split my tables using other indexes. > > Dan, you got right in the heart of my problem when you said: "In > general terms: you need to ensure that "old" invoices and "new" > invoices can be completely seperated". This is my goal at the moment. > Your inputs along with Hugo's inputs helped me to start my research on > the right path. > > But today, I have discovered that the SP heavily uses CURSORS > This is the other big issue I have to deal with. Check out this piece > of code (one of 3 CURSORS defined in SP): > > DECLARE inputs CURSOR FOR > SELECT i.accountid, il.doctypeid, > il.objectid, > (CASE WHEN ilb.objectid IS NOT NULL THEN ilb.objectid ELSE il.objectid > END), > il.itemid, ilb.batchid, d.updatestock * d.cogssign * (CASE WHEN > ilb.objectid IS NOT NULL THEN ilb.pdocquantity ELSE il.pdocquantity > END), > d.updatestock * d.cogssign * il.price * il.rate * (1 - il.discount / > 100) * (1 - i.discount / 100), > il.currencyid, il.rate, > il.warehouseid, dbo.f_date_notime(il.stockdate) > FROM > inventory i > JOIN inventoryline il ON (il.inventoryid = i.objectid) > JOIN item it ON (it.objectid = il.itemid) > JOIN doctype d ON (il.doctypeid = d.objectid) > JOIN status st ON (st.objectid = i.statusid) > JOIN warehouse w ON (w.objectid = il.warehouseid) > LEFT JOIN inventorylinebatch ilb ON (ilb.inventorylineid = il.objectid) > WHERE > d.cogssign = 1 > AND (st.final = 1 OR st.cancel = 1) > AND d.doctypekey NOT IN ('25','26') > AND il.stockdate >= dbo.f_date_notime(@last_date) + 1 > AND it.stockcontrol = 1 > AND ISNULL(w.nocost,0) = 0 > ORDER BY il.itemid, il.stockdate, d.doctypekey, i.docnumber, > il.create_date, ilb.create_date > > OPEN inputs > > ...follows fetching results in a loop > > I want to change this in the following manner: to create an temporary > table and to insert the results of the above SELECT statement into this > temp table. After that, using a cursor, I intend to fetch the records > from the temp table. This way, the tables involved in the SELECT stm > above will be locked as short as possible. Do you think this is a good > approach? > > Note: In the SELECT statement above: ALL tables involved have a primay > key defined as uniqueidentifier and named objectid. Also, ALL foreign > keys (FK) are indexed like this: > > CREATE INDEX [FK_inventoryline_inventoryid] ON > [dbo].[inventoryline]([inventoryid]) ON [PRIMARY] > GO > > > Thanks again. > |
| ||||
| On 26 Sep 2006 14:38:25 -0700, nano2k wrote: >Hugo, Dan >Thanks for your patience with this topic. > >I understand and agree to all your advices. > >My struggle, at least for the moment, is to make the application as >much responsive as possible while strongly looking for data security. >Let's say that for the moment we don't care how much the SP needs to >process - we care only to let other users work while the SP is running. >The SP will be run only a few times a month, but there are 2-3 fixed >days of the month when the SP needs to perform - it's all about >accounting >the SP runs. Hi nano2k, There are several ways to achieve that: - Run the SP at a time when no inserts are done. - If the SP only does reporting - run the SP on a copy of the database, restored from a recent backup. Since your SP does updates, you can't use this option. - Make the SP run as fast as possible. With only tens of thousands of rows in the database, you should be able to get performance in terms of minutes, maybe even less. Definitely not hours. - Make sure that all the right indexes are there to enable the processes to run in parallel without blocking each other. This can only be done with access to the complete code - i.e. all tables, constraints, and indexes, the stored proc, and the code used for inserts. You've already said that the code is big (long and complex), but not how big. My definition of long and complex might differ from yours. That being said, it is definitely possible that your code is to long and complex for help via usenet groups - in that case, you either have to do it yoursself or (if you lack the skills) hire a consultant to do it for you. >I need uniqueidentifiers because my app is a three-tier client-server >app (client/server/db) that needs to generate unique IDs from client >side. The client app may generate several records from different tables >that will be sent all together in a single request to the server. The >records need to be already coupled (that is, foreign keys must be >already known before sending the request to the database) before >sending them to the server to process. You're freely mingling real end-user requirement and implementation choices here. uniqueidentifiers are never an end-user requirement. They can be the best solution. I'm not sure if they are here. The only real requirements I read here are insertion of new data from several clients, and sending multiple related rows on a single access to keep network traffic low. It's not a given that you need surrogate keys for this. They can be handy in some cases, but in other cases, using only the business key is preferable. And even if you do need surrogate keys, then you can still use identity values (with one of several standard techniques to prevent duplicates when identity values get generated at different locations, if you're in a replicated scenario [which I don't think is the case]). > Anyway, I have >set a task to reindex the database every night, so I think this will >reduce the payback to using uniqueidentifiers - do you agree? Is there >any other action I should schedule? If you choose a low fill factor, the number of page splits will go down. This may give some relief. OTOH, it will also spread the data over more pages, increasing the amount of I/O needed for any request. >Dan, my uniqueid column (named [objectid]) is the only key, and at >least at the moment, this column is set as primary key (ok? not ok?). Primary key or not is irrelevant for now - having only a generated value as key is wrong. Google for some of my prevous musings on surrogate key vs business key, and for similar (and, to be honest, also a few contradicting) opinions by others. That being said, the index created to check the primary key defaults to being clustered. You don't want to cluster on uniqueidentifier. Change this index to nonclustered, then find a more sensible column or combination of columns for the clustered key. Even if it's a nonunique index. >My protection to duplicate entries is handled by the client >application. 99% of the business logic is concentrated in client app. Bad. Constraints should always be enforced in the DB level. You can ALSO enforce them in the client, but never forget the DB - that way, if an attacker compromissed the security of your client, (s)he still is unable to bypass your basic integrity checks. >But today, I have discovered that the SP heavily uses CURSORS I'm not surprised. That is about the only way to get 2 hour performance on database with less than 100K rows. >This is the other big issue I have to deal with. Check out this piece >of code (one of 3 CURSORS defined in SP): (snip) >I want to change this in the following manner: to create an temporary >table and to insert the results of the above SELECT statement into this >temp table. After that, using a cursor, I intend to fetch the records >from the temp table. This way, the tables involved in the SELECT stm >above will be locked as short as possible. Do you think this is a good >approach? No. You're standing behind a Formula 1 racing wagon and pushing it, and you're proposing to apply some grease to the axes to go faster. You should get in, start the engine and hit the pedal. There are several ways to improve cursor performance (such as changing to FAST_FORWARD READ_ONLY, using the temp table you propose, or other techniques). Some of them will make things worse. Others might give you some performance gain. But that's just peanuts compared to what you can probably gain by rewriting the complete stored procedure to operate in a set-based manner. >Note: In the SELECT statement above: ALL tables involved have a primay >key defined as uniqueidentifier and named objectid. I would also encourage you to revise your naming standards. I'll have to point you to google again, since my post is already quite long, and it's time for me to get some sleep. -- Hugo Kornelis, SQL Server MVP |