Hi guys,

We've got a problem. We have a project handling records saved in a table on a SQL Server 2000 server (to be upgraded to 2005 soon). Around a million records are inserted every month.

Now the users have raised a request: once certain criteria are matched, the project should do some back-end handling. For example, if

    SELECT colID, fieldC, fieldD FROM dataTable WHERE fieldA = 'fieldA_valueB'

returns a recordset, then for each @fieldC, @fieldD we do some back-end trick, maybe

    UPDATE dataTable SET fieldE = 'fieldE_valueF' WHERE fieldC = @fieldC AND fieldD = @fieldD

Let's say such a rule is named rule01. I hope I'm expressing the problem clearly.

My question is: should we do this on the database side, using triggers, or by informing our .NET project to do it?

1. Since records arrive at around 1 million per month, how can we handle the performance issue?

2. For now the rules are fairly simple; it seems we could at least do it with EXEC or sp_executesql. But the rules may turn quite complex, for example audit logging or more complex issues. Should we do that in the .NET program? And how can a trigger in a SQL database inform a .NET program, a web service, or a Windows service? By executing a console program?

3. Users may raise more and more rules; how could we provide a flexible solution? I mean, we're trying to make it less hard-coded. In the database, EXEC or sp_executesql are still somewhat flexible, while in .NET doing something like eval() in JavaScript or EXEC in SQL Server is a bit troublesome. But putting all this business logic in stored procedures sounds a bit weird.

Guys, I hope I've made myself clear. Any suggestions?

Thanks very much,
athos

1. You certainly ought to do this set-based rather than row by row as your code fragment implied. What you can also do is batch the UPDATEs into X rows at a time. For example:

    SET ROWCOUNT 50000

    WHILE 1=1
    BEGIN
        UPDATE dataTable
        SET fieldE = 'fieldE_valueF'
        -- Skip rows that already carry the new value; otherwise every
        -- pass updates the same rows and @@ROWCOUNT never reaches 0.
        WHERE (fieldE <> 'fieldE_valueF' OR fieldE IS NULL)
          AND EXISTS
              (SELECT *
               FROM dataTable AS T
               WHERE T.fieldA = 'fieldA_valueB'
                 AND T.fieldC = dataTable.fieldC
                 AND T.fieldD = dataTable.fieldD);

        IF @@ROWCOUNT = 0 BREAK
    END

    SET ROWCOUNT 0

2. Why would you want to do audit logging in the front end? I don't understand the question.

3. What is your reason for wanting dynamic code rather than static? This just comes down to change management controls. Dynamically altering fundamental chunks of business logic without a release and test cycle is a great recipe for really screwing things up. Would you consider doing the same thing for C# code? I don't think so.

--
David Portas
SQL Server MVP
--

Dear David,

Thanks for your time.

1. On performance: records arrive at around 1 million per month, so by the time this project goes to production we'll have around 100 months of records loaded, while in each month fewer than 1000 triggers will fire. So every month we'll be looking for 10^3 records among 10^8, like finding a small fish in a big pond. As development is going fairly well, the users said they may want a daily update, which means searching for about 50 records among around 10^8 and updating them. I'm worried about the performance.

2. This is a finance project, so every change will be recorded for audit. The point is that the back-end trick may be quite complex, and I feel it's better to do complex calculation and handling in .NET, where there are more tools, for example RegExp. To be frank, this is an in-house project, and our users are too powerful for me, or even my HOD, to argue with. Maybe SQL 2005 will be powerful enough, but I really don't know how complex the requests will be, or how far our dear users will go. That's why I'm wondering whether we should do the business logic in .NET. In that case we'd need to invoke .NET from the database side; what is the best way to do that?

3. Good point, I'm convinced. I'll report it to the users. You know, my users are quite ad hoc about technology: using iPods, wanting everything dynamic, and never worrying about the coders' headaches. Thanks.

Thanks David, please consider issues 1 and 2. I really appreciate your help.

Yours,
athos

1. 50 rows out of millions isn't a problem in principle. Having the right indexes in place will be very important, but there's no reason to assume that performance will be poor just because you have millions of rows.

2. The database is surely the best place for your audit trail because there are obvious advantages (performance / storage / bandwidth) to keeping the audit and data in the same location. Triggers and audit table(s) are pretty much the standard solution for this. In SQL Server 2005 you can put .NET code in the database too, so there shouldn't be any technical obstacles to implementing what's required server-side.

--
David Portas
SQL Server MVP
--

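For illustration, here is a minimal sketch of the trigger-plus-audit-table approach mentioned in point 2. Only dataTable, colID and fieldE come from the thread; the audit table dataTable_audit, the trigger name and all column types are assumptions, and the sketch further assumes that colID uniquely identifies a row and is never updated.

```sql
-- Hypothetical audit table; the name and column types are assumptions.
CREATE TABLE dataTable_audit (
    auditID    INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    colID      INT         NOT NULL,  -- key of the changed row in dataTable
    old_fieldE VARCHAR(50) NULL,
    new_fieldE VARCHAR(50) NULL,
    changed_by SYSNAME     NOT NULL DEFAULT SUSER_SNAME(),
    changed_at DATETIME    NOT NULL DEFAULT GETDATE()
);
GO

CREATE TRIGGER trg_dataTable_audit_fieldE
ON dataTable
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    -- inserted/deleted hold every row touched by the statement, so one
    -- INSERT ... SELECT audits a whole batch; no cursor is needed.
    INSERT INTO dataTable_audit (colID, old_fieldE, new_fieldE)
    SELECT d.colID, d.fieldE, i.fieldE
    FROM deleted AS d
    JOIN inserted AS i ON i.colID = d.colID
    WHERE d.fieldE <> i.fieldE
       OR (d.fieldE IS NULL AND i.fieldE IS NOT NULL)
       OR (d.fieldE IS NOT NULL AND i.fieldE IS NULL);
END
GO
```

Because the trigger body is itself set-based, it should behave the same whether an UPDATE touches one row or a 50000-row batch. Rows whose fieldE did not actually change are left out of the audit table.
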
1. Man, I'm not talking about 50 in a million, but 50 in 10^8, which is 100 million. Will this be OK?

2. Good news to hear. Seems my booking the .net rock on Nov. 29 was a smart decision.

Thanks David.

"athos" <athos.liu@gmail.com> wrote in message news:1132104930.665865.220910@f14g2000cwb.googlegroups.com...
> 1. man, i'm not talking about 50 in a million, but 50 in 10^8, which is
> 100 million .. will this be OK?
>

Why shouldn't it be OK? Take a look at wintercorp.com, who recently listed a SQL Server database with 67 billion rows. With the right hardware and implementation such things are perfectly possible.

However, design and implementation of an efficient, scalable database requires skill and experience. I'd say that you should be asking these questions of the people who will design and build your database. If you don't already have that expertise available to you then start hiring.

Hope this helps.

--
David Portas
SQL Server MVP
--

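As a rough illustration of what "the right indexes" could mean for the rule01 queries from the first post, here is a sketch. The index names are hypothetical, and it assumes colID is the clustered key of dataTable (in which case it is carried in every nonclustered index automatically). The right design depends on the real schema and workload.

```sql
-- Supports: SELECT colID, fieldC, fieldD FROM dataTable WHERE fieldA = '...'
-- fieldC and fieldD are trailing key columns so the lookup can be answered
-- from the index alone (SQL Server 2000 has no INCLUDE clause).
CREATE NONCLUSTERED INDEX IX_dataTable_fieldA
    ON dataTable (fieldA, fieldC, fieldD);

-- Supports: UPDATE ... WHERE fieldC = @fieldC AND fieldD = @fieldD
CREATE NONCLUSTERED INDEX IX_dataTable_fieldC_fieldD
    ON dataTable (fieldC, fieldD);

-- Check the effect by comparing logical reads before and after.
SET STATISTICS IO ON;
SELECT colID, fieldC, fieldD FROM dataTable WHERE fieldA = 'fieldA_valueB';
SET STATISTICS IO OFF;
```

With an index seek on fieldA, the cost of finding 50 rows depends mostly on the depth of the index rather than on the 10^8 rows in the table. Every extra index also slows the monthly inserts somewhat, so it is worth measuring both sides.
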
David Portas (REMOVE_BEFORE_REPLYING_dportas@acm.org) writes:
> 1. You certainly ought to do this set-based rather than row by row as
> your code fragment implied. What you can also do is batch the UPDATEs
> into X rows at a time. For example:
>
>     SET ROWCOUNT 50000
>
>     WHILE 1=1
>     BEGIN
>         UPDATE dataTable
>         SET fieldE = 'fieldE_valueF'
>         -- skip rows already updated so the loop can terminate
>         WHERE (fieldE <> 'fieldE_valueF' OR fieldE IS NULL)
>           AND EXISTS
>               (SELECT *
>                FROM dataTable AS T
>                WHERE T.fieldA = 'fieldA_valueB'
>                  AND T.fieldC = dataTable.fieldC
>                  AND T.fieldD = dataTable.fieldD);
>
>         IF @@ROWCOUNT = 0 BREAK
>     END
>
>     SET ROWCOUNT 0

Note that in SQL 2005, this is better written as:

    WHILE 1=1
    BEGIN
        -- Parentheses around the TOP value are required in UPDATE.
        UPDATE TOP (50000) dataTable
        SET fieldE = 'fieldE_valueF'
        -- Skip rows already updated so the loop can terminate.
        WHERE (fieldE <> 'fieldE_valueF' OR fieldE IS NULL)
          AND EXISTS
              (SELECT *
               FROM dataTable AS T
               WHERE T.fieldA = 'fieldA_valueB'
                 AND T.fieldC = dataTable.fieldC
                 AND T.fieldD = dataTable.fieldD);

        IF @@ROWCOUNT < 50000 BREAK
    END

TOP is new syntax, which Microsoft recommends over SET ROWCOUNT, as it is more optimizer-friendly. The change to the check on @@ROWCOUNT is just a small optimisation.

I'd also like to add that, in my experience, for these batching operations to be meaningful you should base the batch on the clustered index of the table. Otherwise the time it takes SQL Server to locate the rows in each batch can be expensive.

--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx

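The last point, basing each batch on the clustered index, could look something like the sketch below. It assumes that colID is an INT column and the clustered index key of dataTable, which the thread does not confirm; the variable names are illustrative. It uses no TOP, so it should run on both SQL 2000 and SQL 2005.

```sql
DECLARE @batchSize INT, @fromID INT, @maxID INT;
SET @batchSize = 50000;

-- Assumption: colID is the clustered index key of dataTable.
SELECT @fromID = MIN(colID), @maxID = MAX(colID) FROM dataTable;

WHILE @fromID <= @maxID
BEGIN
    -- Each pass touches one contiguous slice of the clustered index.
    UPDATE dataTable
    SET fieldE = 'fieldE_valueF'
    WHERE colID >= @fromID
      AND colID <  @fromID + @batchSize
      AND EXISTS (SELECT *
                  FROM dataTable AS T
                  WHERE T.fieldA = 'fieldA_valueB'
                    AND T.fieldC = dataTable.fieldC
                    AND T.fieldD = dataTable.fieldD);

    SET @fromID = @fromID + @batchSize;
END
```

Here the loop advances through key ranges instead of testing @@ROWCOUNT, so it ends once the range passes the highest colID, and an empty table skips the loop entirely. A batch covers at most @batchSize key values, so it may update far fewer rows than that. The sketch also assumes colID values stay well below the INT maximum, since @fromID + @batchSize would otherwise overflow.
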