This is a discussion on PERSISTANT PREPARE (another point of view) within the pgsql Sql forums, part of the PostgreSQL category; --> Hi, We are new to Postgre, actually we are migrating from MICROSOFT DBMS technologies to...hopefully Postgre. Our company is ...
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Hi, We are new to Postgre, actually we are migrating from MICROSOFT DBMS technologies to...hopefully Postgre. Our company is involved in ERP business software in Serbia and region, currently counting over 200 clients. Some of them have DB's over 4GB in size. Reason for posting is implementation of PREPARE statements. I've read a thread "# PREPARE and stuff PFC" on pgsql-performance 2007-06 list and I do agree that it would not gain performance issues. What could we gain by introducing a kind of global prepared statement area, is SIMPLICITY of DB DEVELOPMENT AND MAINTENANCE. Here is our point of view: We have an application layer running over db layer. Application layer consists of classes and interfaces and db layer contains data and various data manipulation structures. Application layer calls SQL statements expecting some datasets as results (inventory list for instance). What it doesn't care about is HOW is query built (sorting, conditions, etc.) as long as it returns EXPECTED columns.Application simplly calls EXECUTE <statement> (<param1>,<param2>...). Developers working application layer do not interfere with developers workingon DB and queries. Plus MOST queries can be written to be reusable in various situations !!! The idea is: LETS SEPARATE SQL STATEMENTS FROM APPLICATION CODE. This way, we can introduce fine tuning to each of our clients without having to recompile our application. We can also work on improvements of queries performance and complexity without recompile of the application layer. Since one company has one set of rules PREPARED statements apply to every client connected to that database. Now, instead of preparing statements on each connection request (and we use around 900 prepared statements), why couldn't we simply prepare these statements ONCE and keep them in some global storage for future everyday usage. We use this approach for forms & reports creation where Crystal Report engine creates outlined report based on static prepared statement. This approach is probably not suitable for large db systems with tons of data, but is very efficient in 90% of small and medium business size databases. Please consider this issue when planning your WISH LIST or hopefully To-do-task-list. Ill be glad to here comments on this topic as well. Milan Oparnica MELANY SOFWARE TEAM __________________________________________________ _______________ Invite your mail contacts to join your friends list with Windows Live Spaces. It's easy! http://spaces.live.com/spacesapi.asp...aspx&mkt=en-us |
| |||
| Milan Oparnica <milan.opa@hotmail.com> writes: > [ much snipped ] > What could we gain by introducing a kind of global prepared statement area, is SIMPLICITY of DB DEVELOPMENT AND MAINTENANCE. > The idea is: LETS SEPARATE SQL STATEMENTS FROM APPLICATION CODE. Most people around this project think that the best way to do that is to push as much logic as you can into server-side stored procedures. That gives you every advantage that a persistent-prepared-statement feature would offer, and more besides: * you can push procedural logic, as well as SQL, out of the application * you can improve performance by reducing the number of network round trips needed to accomplish a multi-SQL-statement task regards, tom lane -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql |
| |||
| Tom Lane wrote: > Most people around this project think that the best way to do that is to > push as much logic as you can into server-side stored procedures. That > gives you every advantage that a persistent-prepared-statement feature > would offer, and more besides: > It's simply to complicated to return recordsets through server-side stored procedures. They are obviously designed to do complex data manipulation, returning few output variables informing the caller about final results. Returning records through sets of user-defined-types is memory and performance waste (please see my previous post as reply to Steve for more details). Plus it's hard to maintain and make improvements to such a system. I hate to see 800 user types made for every query we made as stored procedure. I don't say it couldn't be done through sp but maybe you guys could provide us with much easier (and efficient) way to organize and fetch common SQL statements. Something between VIEWS and STORED PROCEDURES, something like PARAMETERIZED VIEWS or PERSISTENT PREPARED statement. Either way, it would be a place where we could use only PURE SQL syntax. I think it's 90% of what any database application does. > * you can push procedural logic, as well as SQL, out of the application Application developing tools usually offer bigger set of functions, objects, methods etc. than any DB stored procedure language can. There is also debugging, code version control software, team development software and lots of other stuff. It's just more efficient to keep the logic in the application part of the system. Just compare the IDE editors to any DB Admin Tool. > > * you can improve performance by reducing the number of network round > trips needed to accomplish a multi-SQL-statement task > > regards, tom lane > I couldn't agree more. Such tasks are decidedly for SP's. I'm thinking about 90% of simple to mid-simple tasks (reports, logins, retriving customer and item properties, warehouse inventory and other stuff) that are simple matter of SQL or SQL's in a union with few parameters for filtering the data. I see programmers hard-coding such SQL statements in PHP, C++, Delphi or VB projects. Why? Is it to complex to have it implemented in Postgre engine? We have PREPARE statement, locally for the user. Is it possible to take it globally, for all users, and not to forget it when all connections dye? It is a way to get all of SQL statement out of the application not only "as much logic as you can". As a leader of our development team I find it HIGHLY (and I mean HIGHLY) DESIRABLE. Best regards, Milan Oparnica |
| |||
| Milan Oparnica wrote: > > It's simply to complicated to return recordsets through server-side > stored procedures. They are obviously designed to do complex data > manipulation, returning few output variables informing the caller about > final results. Returning records through sets of user-defined-types is > memory and performance waste (please see my previous post as reply to > Steve for more details). Plus it's hard to maintain and make > improvements to such a system. I hate to see 800 user types made for > every query we made as stored procedure. Is this topic completely out of scope in Postgre ? If I'm missing something too obvious or too important, please let me know what. I run over and over through internet and Postgre documentation and still found nothing. Is there a better place to communicate with Postgre developers ? Sincerely, Milan Oparnica |
| |||
| Hello 2008/7/16 Milan Oparnica <milan.opa@gmail.com>: > Milan Oparnica wrote: >> >> It's simply to complicated to return recordsets through server-side stored >> procedures. They are obviously designed to do complex data manipulation, >> returning few output variables informing the caller about final results. >> Returning records through sets of user-defined-types is memory and >> performance waste (please see my previous post as reply to Steve for more >> details). Plus it's hard to maintain and make improvements to such a system. >> I hate to see 800 user types made for every query we made as stored >> procedure. > > Is this topic completely out of scope in Postgre ? > If I'm missing something too obvious or too important, please let me know > what. > > I run over and over through internet and Postgre documentation and still > found nothing. > try to write prototype and show advantages. I am able to undestand advantages of persistent prep. stamenents, but I see some disadvatage too. Mainly you have to manage some shared memory space for stored plans. It's not easy task - MySQL develepoers can talk. Implemenation on postgresql is little bit dificult - lot of structures that lives in processed memory have to be moved to shared memory. This feature is nice, but question is - who do write it? Actually this problem is solved from outside - with pooling. Regards Pavel Stehule > Is there a better place to communicate with Postgre developers ? > > Sincerely, > > Milan Oparnica > > -- > Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-sql > -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql |
| |||
| Pavel wrote: > > try to write prototype and show advantages... Prototype of what, implementation into Postgre or just efficiency of PRESISTANT PREPARE idea ? > ...but I see some disadvatage > too. Mainly you have to manage some shared memory space for stored > plans. It's not easy task - MySQL develepoers can talk. Implemenation > on postgresql is little bit dificult - lot of structures that lives in > processed memory have to be moved to shared memory. > Is it solved in MySQL or they've just tried ? We could have only PREP STATEMENT definition stored in shared memory (probably something like stored procedures), and it could be run in local processed memory. We could even assume only fetching data would be used through PREP STATEMENTS for start, and later introduce data modification. Is there some simplified PG algorithm we could use to understand the amount of work needed for introducing such feature to PG? > This feature is nice, but question is - who do write it? With a little help form PG developers and good documentation perhaps I could put some programmers from my team on this job. They are mostly C++ programmers but we have Delphi and Java if needed. > Actually this problem is solved from outside - with pooling. > I'm very interested to learn more about this solution. Can you please send me details or some links where I could research this solution ? Thank you for your reply Pavel. |
| |||
| Richard Huxton wrote: >> Milan Oparnica wrote: >> >> It's simply to complicated to return recordsets through >>server-side stored procedures. They are obviously designed to do >>complex data manipulation ... > Richard wrote: >I'm not convinced it's always a win one way or another. >You still haven't said what's "too complicated" about defining a >function: > >CREATE FUNCTION users_at_dotcom(text) RETURNS SETOF users AS $$ > SELECT * FROM users WHERE email LIKE '%@' || $1 || '.com'; >$$ LANGUAGE SQL; > Richard Huxton > Archonet Ltd ------------------------------------------------------------------- Hi Richard, It sounds like you suggest not using PREPARED statement nor stored procedures to fetch data. What do you think is the best way ? The example you posted is the only situation where it's simple to use stored procedures to fetch data. -------------------------------------------------------------------- Try to write following simple scenario: a. Data is retrieved from two tables in INNER JOIN b. I don't need all fields, but just some of them from both tables Lets call tables Customers and Orders. Definition of tables are: Customers (CustomID INTEGER, Name TEXT(50), Adress TEXT(100)) Orders (OrderID INTEGER, CustomID INTEGER, OrderNum TEXT(10)) Now I need a list of order numbers for some customer: SELECT C.CustomID, C.Name, O.OrderNum FROM Customers C INNER JOIN Orders O ON C.CustomID=O.CustomID WHERE C.Name LIKE <some input parameter> Can you write this without defining an SETOF custom data type ? ---------------------------------------------------------------------- NOTE! THIS IS VERY SIMPLIFIED REPRESENTATION OF REAL-LIFE STRATEGY. ---------------------------------------------------------------------- We sometimes have JOINS up to 10 tables. Besides, using report engines (like Crystal Reports) forces you to avoid queries where column order of the recordset can change. If you built a report on a query having CutomID,Name,OrderNum columns adding a column (CustomID,Name,Adress,OrderNum) will require recompiling the report if you want it to give correct results. Thats one of the reasons we avoid SELECT * statements. Another is because some user roles do not have permissions to examine table structures. In such cases SELECT * returns error. I hope I managed to present what I meant by "too complicated" when using stored procedures to fetch data. PREPARED statements do not suffer from such overhead. They simply return records as if the statement was prepared in the client. I will repeat, it took 5 minutes for prepared statement to return results of the same SQL that took 16 minutes for the stored procedure to do so. SP was written to return SETOF user type. If you want, I'll send you the exact SQL and the database. Later we tested other queries and it was always better performance using prepared statements then stored procedures with SETOF user defined types. Best regards, Milan Oparnica |
| |||
| Richard Huxton wrote: >> Milan Oparnica wrote: >> >> It's simply to complicated to return recordsets through >>server-side stored procedures. They are obviously designed to do >>complex data manipulation ... > Richard wrote: >I'm not convinced it's always a win one way or another. >You still haven't said what's "too complicated" about defining a >function: > >CREATE FUNCTION users_at_dotcom(text) RETURNS SETOF users AS $$ > SELECT * FROM users WHERE email LIKE '%@' || $1 || '.com'; >$$ LANGUAGE SQL; > Richard Huxton > Archonet Ltd ------------------------------------------------------------------- Hi Richard, It sounds like you suggest not using PREPARED statement nor stored procedures to fetch data. What do you think is the best way ? The example you posted is the only situation where it's simple to use stored procedures to fetch data. -------------------------------------------------------------------- Try to write following simple scenario: a. Data is retrieved from two tables in INNER JOIN b. I don't need all fields, but just some of them from both tables Lets call tables Customers and Orders. Definition of tables are: Customers (CustomID INTEGER, Name TEXT(50), Adress TEXT(100)) Orders (OrderID INTEGER, CustomID INTEGER, OrderNum TEXT(10)) Now I need a list of order numbers for some customer: SELECT C.CustomID, C.Name, O.OrderNum FROM Customers C INNER JOIN Orders O ON C.CustomID=O.CustomID WHERE C.Name LIKE <some input parameter> Can you write this without defining an SETOF custom data type ? ---------------------------------------------------------------------- NOTE! THIS IS VERY SIMPLIFIED REPRESENTATION OF REAL-LIFE STRATEGY. ---------------------------------------------------------------------- We sometimes have JOINS up to 10 tables. Besides, using report engines (like Crystal Reports) forces you to avoid queries where column order of the recordset can change. If you built a report on a query having CutomID,Name,OrderNum columns adding a column (CustomID,Name,Adress,OrderNum) will require recompiling the report if you want it to give correct results. Thats one of the reasons we avoid SELECT * statements. Another is because some user roles do not have permissions to examine table structures. In such cases SELECT * returns error. I hope I managed to present what I meant by "too complicated" when using stored procedures to fetch data. PREPARED statements do not suffer from such overhead. They simply return records as if the statement was prepared in the client. I will repeat, it took 5 minutes for prepared statement to return results of the same SQL that took 16 minutes for the stored procedure to do so. SP was written to return SETOF user type. If you want, I'll send you the exact SQL and the database. Later we tested other queries and it was always better performance using prepared statements then stored procedures with SETOF user defined types. Best regards, Milan Oparnica |
| |||
| I found this link from IBM DB2 developers showing why PERSISTENT PREPARE is a good idea and how could it be implemented. http://www.hoadb2ug.org/Docs/Favero20606.pdf It seems that main benefit (beside efficiency) is memory. Having number of connections all with dozens of PREPARED statements consumes lot of memory. If we put these statements in global space (perhaps not even in RAM memory until needed) we could reuse optimization plans as well as physical space needed to store them. I also found articles: http://archives.postgresql.org/pgsql...4/msg00867.php and http://archives.postgresql.org/pgsql...3/msg00480.php all about users trying to explain PERSISTENT PREPARE (some refer to it as global prepare). There are also some guys who actually made some code for PERSISTENT PREPARE: http://archives.postgresql.org/pgsql...3/msg01228.php and http://archives.postgresql.org/pgsql...3/msg01219.php PEOPLE NEED THIS FEATURE !!! It is not discovering the wheel but it will simplify DB programming and even gain some performance. NONE OF POPULAR SQL DBMS (Oracle, MS SQL, MySQL, Postgre, INTERBASE, FIREBIRD) HAVE THIS FEATURE. WHY ? |
| ||||
| Milan Oparnica wrote: > I found this link from IBM DB2 developers showing why PERSISTENT PREPARE > is a good idea and how could it be implemented. [snip] > NONE OF POPULAR SQL DBMS (Oracle, MS SQL, MySQL, Postgre, INTERBASE, > FIREBIRD) HAVE THIS FEATURE. > > WHY ? I suspect that people tend to use SQL or PL/PgSQL stored procedures instead. I'm not 100% sure SQL functions cache their query plans, but I know PL/PgSQL does. Exactly what is gained by the use of persistent prepare over the use of a stored procedure? What would the interface to the feature be through database access drivers like JDBC? Explicit PREPARE GLOBAL or similar, then invocation with EXECUTE ? How would users using increasingly common layers like Hibernate/JPA use it? I'm also curious about how you'd address the possible need for periodic re-planning as the data changes, though AFAIK SQL functions suffer from the same limitation there. I guess I personally just don't understand what the point of the persistent prepare feature you describe is. However, this post that you linked to: http://archives.postgresql.org/pgsql...4/msg00867.php actually describes a query plan cache, rather than persistent prepare. The post assumes the app will explicitly manage the cache, which I'm not sure is a good idea, but I can see the point of a plan cache. There might be some heuristics Pg could use to decide what to cache and to evict (planner time cost vs memory use, frequency of use, etc) so the app doesn't have to know or care about the plan cache. However, I'm not too sure how you'd match an incoming query to a cached plan, and determine that the plan was still valid, with enough speed to really benefit from the plan cache. Then again, I don't know much about Pg's innards, so that doesn't mean much. Tom Lane responded to that post to point out some of the complexities: http://archives.postgresql.org/pgsql...4/msg00868.php -- Craig Ringer -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql |