This is a discussion on Re: Practical impediment to supporting multiple SSL libraries within the pgsql Hackers forums, part of the PostgreSQL category.
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > It's only the functional equivalent when you think all the world is a
> > Postgres app, which is just not the case.
>
> If we are dumping data into a simple memory block in a format dictated
> by libpq, then we haven't done a thing to make the app's use of that
> data independent of libpq. Furthermore, because that format has to be
> generalized (variable-length fields, etc), it will not be noticeably
> easier to use than the existing PGresult API.

The format of the structure *isn't* really dictated by libpq. The
offsets, value lengths and record size are intended to support almost
any C array structure. Variable-length fields have a max size; if a
value exceeds it, an error is returned or flagged through the indicator
array. It also gets the data into the structure quite a few
applications would like to have it in (which is certainly not PGresult).

> What I would envision as a typical use of a callback is to convert the
> data and store it in a C struct designed specifically for a particular
> query's known result structure (say, a few ints, a string of a known
> maximum length, etc). libpq can't do that, but a callback could do it
> easily.

Heh, this is exactly what I'm proposing we make libpq capable of doing,
which is a relatively simple thing to do. I agree that it's often a
goal of application developers to get the data into this structure. The
one downside is that at the moment I think the binary results from
libpq come back in network byte order instead of host byte order.
Oracle provided a way to indicate the types of the fields in the
structure and performed some conversions (such as these) for you. The
constants they used started with "SQL_" but I'm not entirely sure if
they were actually defined in the standard or not.

> The fixed-memory-block approach also falls over when considering results
> of uncertain maximum size. Lastly, it doesn't seem to me to respond at
> all to the ODBC needs that started this thread: IIUC, they want each row
> separately malloc'd so that they can free selected rows from the
> completed resultset.

Results of uncertain maximum size aren't a problem at all... The caller
can do the exact same thing libpq does (realloc), or it could allocate
another array. *Each* call to the libpq function would return the
number of elements actually populated into the memory block; the caller
would then be expected to pass in a *fresh* memory block for the next
call (which could just be a simply calculated offset into the block
they allocated, or could be a realloc'd block + offset, or a brand new
block, etc...).

I'm really not sure why there seems to be this "this won't work!"
reaction. This isn't something I came up with out of whole cloth: it's
an API that isn't unlike PQexecParams, is similar to something Oracle
does (which I've used quite a bit for doing *exactly* what's mentioned
above: I've got an array of pre-defined C structs that I know match the
query and I want that array filled in) and is really not that
complicated.

> > For one thing, it's certainly possible the callback (to do a data
> > transform like you're suggesting) would want access to the other
> > information in a given tuple. Having to store a partial tuple in a
> > temporary area which has to be built up to the full tuple before you can
> > actually process it wouldn't be all that great.
>
> So instead, you'd prefer to *always* store partial tuples in a temporary
> area, thereby making sure the independent-field-conversions case has
> performance just as bad as the dependent-conversions case.
> I can't follow that reasoning.

I haven't been ruling out providing a callback mechanism as well, but I
think it's the 10% case and the 90% case is being shoe-horned into the
10% case with a performance degradation to boot. They're also not
partial tuples, it's not a temporary area, and there's demonstrably
less copying around of the data. It seems ODBC may be in the 10% piece
here but I haven't looked at the ODBC source code yet.

Thanks,

Stephen
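To make the fixed-memory-block idea above concrete, here is a minimal sketch in C. It assumes a caller-defined record type and describes each column by offset, capacity and type, with an indicator array for per-field status. Every name in it (`ColumnBinding`, `WireValue`, `ExampleRow`, `fill_block`, the `IND_*` constants) is hypothetical and invented for illustration; nothing here is an existing libpq interface, and the "wire" values are simply handed in as an array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* ntohl: network to host byte order */

/* All names below are hypothetical; none of them exist in libpq. */
enum { IND_OK = 0, IND_TRUNCATED = 1, IND_BAD_LENGTH = 2 };

/* Where one result column lands inside a caller-defined record. */
typedef struct {
    size_t offset;    /* byte offset of the field within one record */
    size_t max_len;   /* capacity of the field in bytes (>= 1 for text) */
    int    is_int32;  /* nonzero: 4-byte integer in network byte order */
} ColumnBinding;

/* One raw column value as it might arrive: pointer plus byte count. */
typedef struct {
    const void *data;
    size_t      len;
} WireValue;

/* Example record for a query known to return (int4, short text). */
typedef struct {
    int32_t id;
    char    name[8];
} ExampleRow;

/*
 * Copy up to `capacity` rows from `vals` (row-major, ncols per row) into
 * `block`, one record every `record_size` bytes. indicators[] gets one
 * status per stored field. Returns the number of records populated, so
 * the caller can pass a fresh block (or an offset into a realloc'd one)
 * on the next call.
 */
static size_t fill_block(void *block, size_t record_size, size_t capacity,
                         const ColumnBinding *cols, int ncols,
                         const WireValue *vals, size_t nrows_available,
                         int *indicators)
{
    assert(block != NULL && cols != NULL && vals != NULL && indicators != NULL);
    size_t nrows = nrows_available < capacity ? nrows_available : capacity;

    for (size_t r = 0; r < nrows; r++) {
        char *record = (char *)block + r * record_size;
        for (int c = 0; c < ncols; c++) {
            size_t idx = r * (size_t)ncols + (size_t)c;
            const WireValue *v = &vals[idx];
            char *dst = record + cols[c].offset;

            if (cols[c].is_int32) {
                uint32_t net;
                if (v->len != sizeof net || cols[c].max_len < sizeof net) {
                    indicators[idx] = IND_BAD_LENGTH;
                    continue;
                }
                memcpy(&net, v->data, sizeof net);
                int32_t host = (int32_t)ntohl(net);   /* byte-order fixup */
                memcpy(dst, &host, sizeof host);
                indicators[idx] = IND_OK;
            } else {
                /* Text: copy what fits and always NUL-terminate. */
                size_t room = cols[c].max_len - 1;
                size_t n = v->len < room ? v->len : room;
                memcpy(dst, v->data, n);
                dst[n] = '\0';
                indicators[idx] = v->len > room ? IND_TRUNCATED : IND_OK;
            }
        }
    }
    return nrows;
}
```

A caller would bind columns with `offsetof(ExampleRow, id)` and `offsetof(ExampleRow, name)`, pass an `ExampleRow` array as the block, and call the function repeatedly with a new block or offset until fewer rows than the capacity come back. Whether overflow should be an error or only an indicator value is left open here, as it is in the discussion.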
| |||
* Greg Stark (gsstark@mit.edu) wrote:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
> > So instead, you'd prefer to *always* store partial tuples in a temporary
> > area, thereby making sure the independent-field-conversions case has
> > performance just as bad as the dependent-conversions case.
> > I can't follow that reasoning.
>
> I think there's some confusion about what problem this is aiming to solve. I
> thought the primary problem ODBC and other drivers have is just that they want
> to be able to fetch whatever records are available instead of waiting for the
> entire query results to be ready.

Honestly, I think that may be part of it, but it seems they're more
interested in storing the tuples in their own structure right away
instead of keeping a PGresult around and using it everywhere.

> All it sounded like to me was a need for a function that would wait until n
> records were available (or perhaps n bytes worth of records) then return.

I'm not sure that you'd actually want to block until a certain amount
had been returned, but that would be doable I suppose.

> You seem to be talking about a much broader set of problems to solve.

I'd like to improve the API in general to cover a set of use-cases that
I've run into quite a few times (and apparently some others have too,
as other DBs offer a similar API). I'd also like the ODBC driver to be
able to use libpq instead of having its own implementation of the
wireline protocol. I was hoping these would overlap, but it's possible
they won't, in which case it might be sensible to add two new methods
to the API (though I'm sure to get flak about that idea).

Thanks,

Stephen
| |||
Tom Lane <tgl@sss.pgh.pa.us> writes:
> Greg Stark <gsstark@mit.edu> writes:
> > I think there's some confusion about what problem this is aiming to solve. I
> > thought the primary problem ODBC and other drivers have is just that they want
> > to be able to fetch whatever records are available instead of waiting for the
> > entire query results to be ready.
>
> No, that's not what I'm thinking about at all, and I don't think Martijn
> is either. The point here is that ODBC wants to store the resultset in
> a considerably different format from what libpq natively provides, and
> we'd like to avoid the conversion overhead.

So how would you provide the data to the callback? And how does having a
callback instead of a regular downcall give you any more flexibility in how
you present the data?

--
greg

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings
| |||
* Greg Stark (gsstark@mit.edu) wrote:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
> > Greg Stark <gsstark@mit.edu> writes:
> > > I think there's some confusion about what problem this is aiming to solve. I
> > > thought the primary problem ODBC and other drivers have is just that they want
> > > to be able to fetch whatever records are available instead of waiting for the
> > > entire query results to be ready.
> >
> > No, that's not what I'm thinking about at all, and I don't think Martijn
> > is either. The point here is that ODBC wants to store the resultset in
> > a considerably different format from what libpq natively provides, and
> > we'd like to avoid the conversion overhead.
>
> So how would you provide the data to the callback? And how does having a
> callback instead of a regular downcall give you any more flexibility in how
> you present the data?

The callback can be called for each record without having to store any
more than one tuple's worth of information in libpq. I suppose you
could change things such that a call using the new interface only
processes one tuple's worth from the input stream, and just not read
any more data from the socket until there have been enough calls to
process tuples. That's really more the double-memory issue though.
There's also the double copying that's happening, and having to wait
for all the data to come in before being able to read it; of course
that last one could be handled by cursors...

Thanks,

Stephen
| |||
On Thu, Apr 13, 2006 at 03:42:44PM -0400, Stephen Frost wrote:
> > You seem to be talking about a much broader set of problems to solve.
>
> I'd like to improve the API in general to cover a set of use-cases that
> I've run into quite a few times (and apparently some others have too as
> other DBs offer a similar API). I'd also like the ODBC driver to be
> able to use libpq instead of having its own implementation of the
> wireline protocol. I was hoping these would overlap but it's possible
> they won't in which case it might be sensible to add two new methods to
> the API (though I'm sure to get flak about that idea).

Well, the psqlODBC driver apparently ran into a number of problems with
libpq that resulted in them not using it for their purpose. Given
libpq's primary purpose is to connect to PostgreSQL, it failing at that
is something that should be fixed.

The problem you're trying to solve is also important; it would be nice
to find a good solution to that. I'm just not sure if it was relevant
to the decision to bypass libpq.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.
| |||
Martijn van Oosterhout wrote:
> On Thu, Apr 13, 2006 at 03:42:44PM -0400, Stephen Frost wrote:
> > > You seem to be talking about a much broader set of problems to solve.
> >
> > I'd like to improve the API in general to cover a set of use-cases that
> > I've run into quite a few times (and apparently some others have too as
> > other DBs offer a similar API). I'd also like the ODBC driver to be
> > able to use libpq instead of having its own implementation of the
> > wireline protocol. I was hoping these would overlap but it's possible
> > they won't in which case it might be sensible to add two new methods to
> > the API (though I'm sure to get flak about that idea).
>
> Well, the psqlODBC driver apparently ran into a number of problems with
> libpq that resulted in them not using it for their purpose. Given libpq's
> primary purpose is to connect to PostgreSQL, it failing at that is
> something that should be fixed.
>
> The problem you're trying to solve is also important, it would be nice
> to find a good solution to that. I'm just not sure if it was relevant
> to the decision to bypass libpq.

I know there was a lot of confusion over parallel development of
psqlODBC, and my guess is that current CVS is the best solution at this
time. Of course, that doesn't invalidate the idea that this can be
revisited as things settle down and improvements are made.

--
Bruce Momjian http://candle.pha.pa.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
| |||
Greg Stark <gsstark@mit.edu> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> No, that's not what I'm thinking about at all, and I don't think Martijn
>> is either. The point here is that ODBC wants to store the resultset in
>> a considerably different format from what libpq natively provides, and
>> we'd like to avoid the conversion overhead.

> So how would you provide the data to the callback? And how does having a
> callback instead of a regular downcall give you any more flexibility in how
> you present the data?

You'd hand the callback the raw data coming off the wire (pointer and
byte count, probably), and then it could do whatever's appropriate.
For instance, if the callback knows this field is to be converted to
int, it could do atoi() and then store the integer. (Or if it knows
the data is transmitted in binary, ntohl() would be the thing instead.)

The basic point here is that the callback should replace all the parts
of getAnotherTuple() that are responsible for storing data into the
PGresult structure, including all of pqAddTuple. If you aren't
satisfied with the PGresult representation, that's the level of
flexibility you need, IMHO. I don't see the point of half-measures.

Probably there would need to be at least three callbacks involved:
one for setup, called just after the tuple descriptor info has been
received; one for per-field data receipt, and one for per-tuple
operations (called after all the fields of the current tuple have
been passed to the per-field callback). Maybe you'd want a shutdown
callback too, although that's probably not strictly necessary since
whatever you might need it to do could be done equally well in the
app after PQgetResult returns. (You still want to return a PGresult
to carry command success/failure info, and probably the tuple descriptor
info, even though use of the callbacks would leave it containing none of
the data.)

A useful finger exercise for validating the design would be to code up
the default callbacks, ie, code to build the current PGresult structure
using this API.

regards, tom lane
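The three-callback shape described above (setup, per-field, per-tuple) could be sketched roughly as follows. This is only an illustration under stated assumptions: `RowCallbacks`, `RawField`, `OrderSink`, `OrderRow` and `deliver_rows` are hypothetical names, and `deliver_rows` is a stand-in for the part of libpq that would walk incoming rows; no such interface exists in libpq as discussed in this thread. The per-field callback receives a pointer and byte count, and converts with `atoi()` for text or `ntohl()` for binary, as suggested.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <arpa/inet.h>   /* ntohl */

/* Hypothetical callback set; nothing here is an existing libpq API. */
typedef struct {
    void (*setup)(void *ctx, int nfields);        /* after tuple descriptor */
    void (*field)(void *ctx, int fieldno,         /* once per field */
                  const char *data, size_t len, int is_binary);
    void (*tuple_done)(void *ctx);                /* after a row's last field */
} RowCallbacks;

/* Raw field as handed over: pointer, byte count, and format flag. */
typedef struct { const char *data; size_t len; int is_binary; } RawField;

/* Application-side target for a query known to return two integers. */
typedef struct { int32_t id; int32_t qty; } OrderRow;
typedef struct {
    OrderRow rows[8];
    OrderRow current;   /* row being assembled from per-field calls */
    int      nrows;
    int      nfields;
} OrderSink;

static void order_setup(void *ctx, int nfields)
{
    OrderSink *s = ctx;
    assert(nfields == 2);
    memset(s, 0, sizeof *s);
    s->nfields = nfields;
}

static void order_field(void *ctx, int fieldno,
                        const char *data, size_t len, int is_binary)
{
    OrderSink *s = ctx;
    int32_t value;

    if (is_binary) {
        uint32_t net;
        if (len != sizeof net)
            return;                      /* malformed field: ignored in this sketch */
        memcpy(&net, data, sizeof net);
        value = (int32_t)ntohl(net);
    } else {
        /* Wire text is not NUL-terminated, so copy before atoi(). */
        char buf[16];
        size_t n = len < sizeof buf - 1 ? len : sizeof buf - 1;
        memcpy(buf, data, n);
        buf[n] = '\0';
        value = atoi(buf);
    }
    if (fieldno == 0)
        s->current.id = value;
    else if (fieldno == 1)
        s->current.qty = value;
}

static void order_tuple_done(void *ctx)
{
    OrderSink *s = ctx;
    if (s->nrows < (int)(sizeof s->rows / sizeof s->rows[0]))
        s->rows[s->nrows++] = s->current;
}

static const RowCallbacks order_callbacks = {
    order_setup, order_field, order_tuple_done
};

/* Stand-in for the library side: feeds rows to the callbacks one field at a time. */
static void deliver_rows(const RowCallbacks *cb, void *ctx,
                         const RawField *fields, int nrows, int nfields)
{
    cb->setup(ctx, nfields);
    for (int r = 0; r < nrows; r++) {
        for (int f = 0; f < nfields; f++) {
            const RawField *rf = &fields[r * nfields + f];
            cb->field(ctx, f, rf->data, rf->len, rf->is_binary);
        }
        cb->tuple_done(ctx);
    }
}
```

Only one row (`current`) is held while fields arrive, which is the property the callback approach is meant to provide; where the finished rows go is entirely up to the application.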
| |||
On Thu, Apr 13, 2006 at 09:00:10PM -0400, Tom Lane wrote:
> Probably there would need to be at least three callbacks involved:
> one for setup, called just after the tuple descriptor info has been
> received; one for per-field data receipt, and one for per-tuple
> operations (called after all the fields of the current tuple have
> been passed to the per-field callback). Maybe you'd want a shutdown
> callback too, although that's probably not strictly necessary since
> whatever you might need it to do could be done equally well in the
> app after PQgetResult returns. (You still want to return a PGresult
> to carry command success/failure info, and probably the tuple descriptor
> info, even though use of the callbacks would leave it containing none of
> the data.)

Sounds really good. The only thing now is that the main author of the
wire-protocol code in psqlODBC has not yet made any comment on any of
this. So we don't want to set anything in stone until we know it would
solve their problem...

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
| |||
On Fri, Apr 14, 2006 at 04:53:53PM +0200, Martijn van Oosterhout wrote:
> Sounds really good.

<snip>

There's a message on the pgsql-odbc mailing list[1] with some reasons
for not using libpq:

1. The driver sets some session default parameters (DateStyle,
   client_encoding etc) using start-up message.

As far as I can see it only does this when the environment variables
are set. Which IMHO is the correct behaviour. If psqlodbc doesn't
honour them, that does violate the principle of least surprise. OTOH,
the users of ODBC possibly shouldn't be affected by the environment
variables of the user, given the user of ODBC likely doesn't know (or
care) that PostgreSQL is involved.

2. You can try V2 protocol implementation when the V3 implementation
   has some bugs or performance issues.

Well, there is a point here: you can't force the version. It always
defaults to 3 if available.

3. Quote: I don't know what libraries the libpq would need in the
   future but it's quite unpleasant for me if the psqlodbc driver can't
   be loaded with the lack of needless libraries.

It's a reason, just not a good one IMHO. If the user has installed
libpq with a number of libraries, then that's what the user wants. I'm
not sure why psqlODBC is worried about that.

So while this thread has produced several good ideas (which possibly
should be implemented regardless), perhaps we should focus on these
issues also.

Have a nice day,

[1] http://archives.postgresql.org/pgsql...4/msg00052.php
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
| ||||
Martijn van Oosterhout wrote:

>On Fri, Apr 14, 2006 at 04:53:53PM +0200, Martijn van Oosterhout wrote:
>
>>Sounds really good.
>
><snip>
>
>There's a message on the pgsql-odbc mailing list[1] with some reasons
>for not using libpq:
>
>1. The driver sets some session default parameters (DateStyle,
>   client_encoding etc) using start-up message.
>
>As far as I can see it only does this when the environment variables
>are set. Which IMHO is the correct behaviour.

IMHO if libpq is to be a generic library it should first provide exactly
what can be done using the protocol. *Environment variables* are not
appropriate for per-application or per-datasource settings at all.

>3. Quote: I don't know what libraries the libpq would need in the
>future but it's quite unpleasant for me if the psqlodbc driver can't be
>loaded with the lack of needless libraries.
>
>It's a reason, just not a good one IMHO. If the user has installed
>libpq with a number of libraries, then that's what the user wants. I'm
>not sure why psqlODBC is worried about that.

It's very important to clarify what the libraries are needed for, and my
basic policy is to provide appropriate bindings (linkage) between the
libraries for the current dependency relation. As for SSL mode, it is
only a mere extra for the current enhanced driver. My main purpose was
to finish up the work I had left unfinished before 7.4, using the V3
protocol, holdable cursors etc. The current driver under Windows is
available without the existence of libpq.

regards,
Hiroshi Inoue
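One way to keep session defaults per datasource rather than per environment, sketched here as an assumption and not as what psqlODBC does, is to carry them in the connection string. The `options` keyword of a libpq conninfo string passes `-c name=value` settings to the backend at startup. The struct and function names below (`DataSourceSettings`, `build_conninfo`) are hypothetical, the host and database values are placeholders, and the helper only builds the string; it does not connect.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-datasource settings, e.g. read from an ODBC DSN entry. */
typedef struct {
    const char *host;
    const char *dbname;
    const char *datestyle;        /* e.g. "ISO" */
    const char *client_encoding;  /* e.g. "UTF8" */
} DataSourceSettings;

/*
 * Build a conninfo string whose session defaults come from the datasource,
 * not from environment variables. Returns 0 on success, -1 if the buffer
 * is too small. Assumption: values contain no spaces, quotes or
 * backslashes; a real implementation would have to escape them.
 */
static int build_conninfo(const DataSourceSettings *ds, char *out, size_t outlen)
{
    assert(ds != NULL && out != NULL);
    int n = snprintf(out, outlen,
                     "host=%s dbname=%s "
                     "options='-c datestyle=%s -c client_encoding=%s'",
                     ds->host, ds->dbname, ds->datestyle, ds->client_encoding);
    return (n >= 0 && (size_t)n < outlen) ? 0 : -1;
}
```

The resulting string would then be handed to a connection call such as `PQconnectdb`. Note that this does not address the other two points raised (forcing the protocol version, and library dependencies).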