This is a discussion on Re: PGparam proposal within the pgsql Hackers forums, part of the PostgreSQL category; --> For starters, if binary results is a feature you wish you could uninvent then we are probably dead in ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| For starters, if binary results is a feature you wish you could uninvent then we are probably dead in the water for that reason. This goes to the core of our concept. If there is no desire to synch client & server in regards to type handling, than this a waste of time. I think all of this would make libpq more powerful. With that said, my follow up: > I think a printf-style API is fundamentally a bad idea in this > context. printf only works well when the set of concepts (datatypes, > format specifiers, etc) is small and fixed; As of now, including numbers and alpha chars, there are 62 classes of which we used 11 (a class is the first char). That leaves 51 classes with 62 types per class. Where I would agree with you, is that over time things can become cryptic. There are only several ways to do this. 1. Use Oids - doesn't work because they are not publically exposed by libpq. 2. Create a function for each type. This idea was rejected because it bloated the API (I'll buy this). 3. Type aliasing schema - naturally, printf-style comes to mind but may become cryptic. I should mention that we were only trying to support built-in pgtypes in libpq. If you are looking to support all external types, then we propose the below: For #3, maybe it would be better to abandon %c or %cc type encoding and move into something more verbose. We could just spell out the type: %int4, %point, etc... Maybe for built-in types you could prepend 'pg': %pgint4, %pgpoint. This opens up the namespace and removes scalability and cryptic issues. Expanding on %pgint4 idea, 3rd party types can supply their own %typname handlers (a more moduler approach). You can install them at runtime, PQinstallTypeHandler(typname, etc..), or something like that. When a 3rd party %typname is encountered, the appropriate handler would be used. Standard pgtypes would be installed by default. PQinstallTypeHandler(... "gisbox2d" ...); PQputf(... "%gisbox2d %pgpolygon %pgint4" ...); //PQgetf would use the same %typname The only thing libpq should support by default, is the built-in pgtypes. A handler can expand on this. > I find the idea of embedding state like that into the PGconn to be > pretty horrid, as well. It makes the design non-reentrant > > You can't just randomly change the behavior of existing API functions. Okay. We initially had the PGparam as a public opaque, but changed it. We will stop piggy backing off the existing parameterized functions. Instead, we will supply a PGparam exec/send functions. typedef struct pg_param PGparam;//opaque param = PQparamCreate(conn); PQputf(param, "%pgint4 %pgtimestamptz", 62, &tstz); res = PQparamExec(conn, param, "command", resfmt); //PQparamExec will always PQparamClear(param), whether it failed or not //That means after an exec/send, the param object is ready for puts PQparamFinish(param); // free it // This causes the below sequence of function calls: // PQparamCreate, PQputf, PQexecParams("... VALUES ($1)"), PQparamFinish res = PQparamExecf(conn, resfmt, "INSERT INTO t VALUES (%pgint)", 62); > * the 8.2-to-8.3 change in the width of type money We have code comments throughout the patches, as well as documented in the release notes. At this point, we solved getf by checking PGgetlength. If its 4, read4 otherwise read8. For putf, we would have to check the server version. > * the likely future change to type timestamptz to store original > timezone explicitly We would have to change how timestamptz handles the binary format from that version forward, looks like a switch on sversion for back+forwards compatibility. > * the likely future change to type text to store encoding/collation > info explicitly Would the server do text conversion and then pass the converted text back to the client? Or, would it choose a common encoding like UTF-8 and return the text with encoding and let the client convert. How does this affect text format? In the end, some of these changes would change the text format right? That would push these changes into the API users lap, to parse and fuss with. I just think it is cleaner to synch the binary and/or text formats with the server. If you are looking for ways to change the binary/text format of types w/o having to make the most recent clients aware of this, then I think we have lost this battle. Another solution is revamping utils/adt so that it is a shared API for client & server. If you upgrade a client, you would automatically get the latest formatting functions. Just like libpq checks protocol version in several places, conn->sversion would have to be checked, or maybe have a typefmt_api_version. andrew & merlin ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |
| |||
| Andrew Chernow <ac@esilo.com> writes: > For starters, if binary results is a feature you wish you could uninvent > then we are probably dead in the water for that reason. This goes to > the core of our concept. Not really: AFAICS you could implement exactly the API you have sketched without any reliance on binary data transmission whatsoever. As long as PGparam is an opaque struct, library users would have no idea whether the values they provide are being sent as text or binary. You should probably take two steps back and think about what aspects of what you want to do are really involved with providing an easier-to-use API for PQexecParams and friends, and what parts are actually interested in binary data transmission (and why). Separating those issues in your mind might produce more clarity. > In the end, some of these changes would change the text format right? I'd consider that fairly unlikely. For instance, the money width change didn't impact the text format (except to the extent that longer values are now legal), and remembering a timestamptz's zone wouldn't impact the text representation either. Another example is that any significant re-implementation of type NUMERIC (say, as a bignum integer plus exponent instead of the current BCD-ish format) would probably change its binary representation, but there'd be no need for a text change. The bottom line to me is that binary representations are inherently a lot more fragile and version-dependent than text representations. Our attitude so far has been that client-side code that wants to use binary data transmission is taking all the risk of changes on itself. ("If it breaks, you get to keep both pieces.") It's not clear to me what we gain by making libpq subject to those risks. If we could have libpq insulate client apps from these kinds of changes, that would be one thing; but AFAICS, with these more complex types, a binary format change would usually also dictate a change in what the library exposes to clients. As far as I saw, your proposal completely glossed over the issue of exactly what data structure would be exposed to clients for anything more complex than an integer. I'm afraid that that structure would be subject to change, and then we'd just have two layers of brokenness on our hands instead of only one. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| |||
| >library users would have no idea whether > the values they provide are being sent as text or binary. The putf interface currently abstracts how it actually sends it. Although, you do put a C type rather than a string. There is a putstr "%pqtypstr", which puts the string representation of a type. > providing an easier-to-use > API for PQexecParams and friends, and what parts are actually > interested in binary data transmission (and why) There are two things happening here and two things we are trying to solve: 1. putf and PGparam - simpler interface for executing queries using parameterized functions. 2. getf - doesn't use PGparam at all. You pass it a PGresult. The goal of this is to translate either text or binary results into a uniform C type or structure. For instance, we expose the below structure for inet. /* This struct works with CIDR as well. */ typedef struct { /* convenience, value the same as first 2 bytes of sa_buf */ unsigned short sa_family; /* mask, ie. /24 */ int mask; int is_cidr; /* Cast to: sockaddr, sockaddr_in, sockaddr_in6, sockaddr_stroage */ int sa_len; char sa_buf[128]; /* avoid referencing sockaddr structs */ } PGinet; // "res" could be text or binary results, but PGinet remains the same. PGinet inet; PGgetf(res, tup_num, "%pginet", field_num &inet); connect(sock, (const struct sockaddr *)inet.sa_buf, (socklen_t)inet.sa_len); The above is simpler than examining "10.0.0.1/32" and converting it to a struct sockaddr. The same struct is used when putting a type as a "C type" rather than as a string. You can use getf without ever using putf+PGparam, and vise-versa. > your proposal completely glossed over the > issue of exactly what data structure would be exposed to clients for > anything more complex than an integer Complex types require a receiving structure on the client side, thus the types we added to libpq-fe.h: PGinet, PGpolygon, PGarray, PGtimestamp, etc... But keep in mind that these structs are the result of libpq extracting the data from text/binary formats and assigning data to struct members. It does not expose raw format, the API user can already get that via PQgetvalue(). > If we could have libpq insulate client apps from these kinds > of changes, that would be one thing; This is the goal of getf. The API user interfaces through PGinet, not through the output of PQgetvalue(). We propose that its libpq's job internally to handle changes to text or binary formats and expose consistent types/structures. > type NUMERIC (say, as a bignum integer plus > exponent instead of the current BCD-ish format) This is what we want to hide inside libpq's getf. Just expose a PGnumeric that has been translated into c types. We never expose the wire format, only the C translated version of it. > without any reliance on binary data transmission whatsoever. Yes this is possible, but at a performance & ease-of-use cost. Our performance tests didn't show much gain with strings or ints, but complex types were 10 times faster (putting/getting arrays, polygons, paths, etc...). So, the trade-off is a little more maintainence overhead on libpq. BTW, previously we mentioned a 3rd party handler api concept. This is not needed to get libpq up and going with built-in types (which is all we feel it should be responsible for). the handler API can always be added later w/o changing existing functions ... have to add a couple though. andrew & merlin ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| |||
| > your proposal completely glossed over the > issue of exactly what data structure would be exposed to clients for > anything more complex than an integer Yeah. Forgot to include the below in the proposal and the last email. Here is the lastest list for complex types. Most have been around since the last 0.5 patch. Still need numeric, date, time and a couple others. typedef struct { double x; double y; } PGpoint; typedef struct { PGpoint pts[2]; } PGlseg; typedef struct { PGpoint high; PGpoint low; } PGbox; typedef struct { PGpoint center; double radius; } PGcircle; /* When used with PQgetf, 'pts' must be freed with PQfreemem(). */ typedef struct { int npts; int closed; PGpoint *pts; } PGpath; /* When used with PQgetf, 'pts' must be freed with PQfreemem(). */ typedef struct { int npts; PGpoint *pts; } PGpolygon; /* This struct works with CIDR as well. */ typedef struct { /* here for convenience, value the same as first 2 bytes of sa_buf */ unsigned short sa_family; /* mask, ie. /24 */ int mask; int is_cidr; /* Cast to: sockaddr, sockaddr_in, sockaddr_in6, sockaddr_stroage */ int sa_len; char sa_buf[128]; } PGinet; typedef struct { int a; int b; int c; int d; int e; int f; } PGmacaddr; /* main problem with this type is that it can be a double * or int64 and that is a compile-time decision. This means * the client has a 50% chance of getting it wrong. It would * be nice if the server included an indicater or converted * the external format to one or the other. OR, make client * aware of server's timestamp encoding when it connects. */ typedef struct { /* When non-zero, this is a TIMESTAMP WITH TIME ZONE. When zero, * this is a TIMESTAMP WITHOUT TIME ZONE. When WITHOUT, gmtoff * will always be 0 and tzn will be "GMT". */ int withtz; /* binary timestamp from server (in host order). If haveint64 is * non-zero, use the 'ts.asint64' value otherwise use 'ts.asdouble'. */ int haveint64; union { struct { unsigned int a; signed int b; } asint64; double asdouble; } ts; /* microseconds */ int usec; /* GMT offset, some systems don't have this in struct tm */ int gmtoff; /* time zone name, some systems don't have this in struct tm */ char *tzn; /* broken-down time */ struct tm tm; } PGtimestamp; /* still working on this, may need access macros */ typedef struct { int dim; int lbound; } PGarraydim; typedef struct { int len; /* already in PGtype format. For instance: * (PGpolygon *)arr->items[i].data * or * printf("text=%s\n", arr->items[i].data); * * Could have a union of all types here but that * doesn't help much for 3rd party types. */ char *data; } PGarrayitem; typedef struct { Oid oid; /* type of items[].data */ int ndim; PGarraydim dims[MAXDIM]; int nitems; PGarrayitem *items; } PGarray; andrew & merlin ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings |
| ||||
| Here is our updated PGparam extension to the libpq api: http://archives.postgresql.org/pgsql...2/msg00356.php We have a patch implementing the following which we are cleaning up. We are also kicking around some ideas for arrays and possibly composite types which we may consider if the community wants to move forwards with this proposal. Tom made a number of comments some of which we have addressed: *) Separate PGparam from PGconn: we agree with this and separated them. *) Chanages to existing API functions: we agreed and moved new behavior to new functions *) 3rd party types: we now support this through a type registration interface *) Internal type changes: We think changes to binary format are fairly rare and easily addressed. *) Type confusion was removed by giving each type its own specifier. *) Objections to printf: We agreed in part: we moved to natural names, from %n4 to %pgint for example. This addressed scalability concerns and should be less cryptic to use. *) Argument passing in putf and getf is identical to the previous proposal. All we changed was the naming schema for the %spec and putf now takes a PGparam rather than a PGconn. * API INTERFACE /* opqaue */ typedef struct pg_param PGparam; PGparam *PQparamCreate(PGconn *conn); /* manually reset a param struct. This is done by * all execution functions for you. */ void PQparamReset(PQparam *param) /* free a PGparam */ void PQparamClear(PQparam *param); int PQputf( PGparam *param, const char *typeSpec, ...); int PQgetf( const PGresult *res, int tup_num, const char *fieldSpec, ...); /* PGparam Execution Functions */ PGresult *PQparamExec( PGconn *conn, PGparam *param, const char *command, int resultFormat); int PQparamSendQuery( PGconn *conn, PGparam *param, const char *command, int resultFormat); PGresult *PQparamExecPrepared PGconn *conn, PGparam *param, const char *stmtName, int resultFormat); int PQparamSendQueryPrepared PGconn *conn, PGparam *param, const char *stmtName, int resultFormat); /* All in wonder, no PGparam needed */ PGresult *PQexecParamsf( PGconn *conn, const char *commandSpec, int resultFormat, ...); /* All in wonder, no PGparam needed */ int PQsendQueryParamsf( PGconn *conn, const char *commandSpec, int resultFormat, ...); /* All in wonder, no PGparam needed */ PGresult *PQexecPreparedf( PGconn *conn, const char *stmtName, const char *typeSpec, int resultFormat, ...); /* All in wonder, no PGparam needed */ int PQsendQueryPreparedf( PGconn *conn, const char *stmtName, const char *typeSpec, int resultFormat, ...); /* gets the PGparam error message */ char *PQparamErrorMessage(const PGparam *param); * TYPE ALIAS SPECIFIERS The convention for postgresql built-in types is a % followed by the type alias. Every pgtype begins with "pg". For example: %pgint4 %pgpolygon %pgbox 3rd party types can register their own specifiers, which is discussed int the TYPE HANDLER SYSTEM section. Type aliases must be unique. * TYPE HANDLER SYSTEM typedef struct pg_typeputargs { /* The out buffer will be at least 16K. If more room is needed, * use the PQ_TYPE_SETOUT to grow the buffer. In most cases, * 16K is plenty of room. */ char *out; /* the size in bytes of the out buffer */ int outl; /* Should not use directly, see PQ_TYPE_SETOUT. For the brave, * set to 1 if you point the out buffer at memory that should be * freed after your put callback returns. */ int free_out; /* The arguments to putf. Use PQ_TYPE_NEXTARG. */ va_list *ap; /* The type's alias name, like 'pgint8'. */ const char *type_alias; /* Sets an error message. This msg shows up in * PQparamErrorMessage(). */ int (*seterr)(struct pg_typeputargs *args, const char *format, ...); } PGtypePutArgs; typedef struct pg_typegetargs { const PGresult *res; int tup_num; int field_num; /* pointer to the output of PQgetvalue for this tup+field */ char *value; /* The arguments to getf. Use PQ_TYPE_NEXTARG. NOTE: the field_num * supplied to getf has already been pulled out of the va_list and * assigned to this structs field_num member. */ va_list *ap; /* The type's alias name, like 'pgint8'. */ const char *type_alias; /* Sets an error message. This msg shows up in * PQresultErrorMessage(). */ int (*seterr)(struct pg_typegetargs *args, const char *format, ...); } PGtypeGetArgs; #define PQ_TYPE_NEXTARG(typeArgs, type) va_arg(*(typeArgs)->ap, type) /* makes sure that putArgs->out is larger enough for new_outl */ #define PQ_TYPE_SETOUT(putArgs, new_outl) do{ \ if((new_outl) > (putArgs)->outl) \ { \ (putArgs)->out = (char *)malloc(new_outl); \ if(!(putArgs)->out) \ return -1; \ *(putArgs)->out = 0; \ (putArgs)->outl = (new_outl); \ (putArgs)->free_out = 1; \ } \ } while(0) /* * Returns - the number of bytes put or -1 for error. */ typedef int (*PGtypePut)(PGtypePutArgs *args); /* * Returns - 0 for success or -1 for error. */ typedef int (*PGtypeGet)(PGtypeGetArgs *args); /* Indicates that a type handler should use text format for puts. * The default is binary. */ #define PQ_TEXTPUTFMT 0x01 /* By default, PQputf will make an internal copy of the type data. By * setting this flag, it will only maintain a pointer to the type data. * * An example of this flag is the %pgtextptr and %pgbyteaptr, both * of which can consume large amounts of memory. In most cases, it * is more effiecent to store a direct pointer rather than making a * copy. */ #define PQ_NOCOPYONPUT 0x02 /* register a type handler */ int PQregisterTypeHandler( const char *alias, Oid oid, PGtypePut put, PGtypeGet get, int flags); * TYPE HANDLER EXAMPLES Below are two examples of how to implement a libpq type handler. The postgresql int4 and text types are always available and would never need to be registered. This is just an example. PQregisterTypeHandler( "pgint4", /* The %alias used by PQputf/PQgetf, must be unique */ INT4OID, /* Oid of the type. not required - can be InvalidOid */ put_int4, /* put callback for this type */ get_int4, /* get callback for this type */ 0); /* no flags needed */ PQregisterTypeHandler( "pgtext", TEXTOID, put_text, get_text, PQ_TEXTPUTFMT); // // PQputf(param, "%pgint4", 100); // static int put_int4(PGtypePutArgs *args) { *(int *)args->out = (int)htonl((int)PQ_TYPE_NEXTARG(args, int)); return 4; } // // int n; // PQgetf(res, tup_num, "%pgint4", field_num, &n); // static int get_int4(PGtypeGetArgs *args) { int *i4p = PQ_TYPE_NEXTARG(args, int *); if(!i4p) return args->seterr(args, "%s out pointer is NULL", args->type_alias); *i4p = 0; if(PQgetisnull(args->res, args->tup_num, args->field_num)) return 0; /* text format conversion */ if(PQfformat(args->res, args->field_num) == 0) { int n; errno = 0; if((n = (int)strtol(args->value, NULL, 10)) == 0 && errno) return args->seterr(args, "%s string conversion failed: [%d] %s", args->type_alias, errno, strerror(errno)); *i4p = n; return 0; } /* binary format conversion */ *i4p = (int)ntohl(*(unsigned int *)args->value); return 0; } // // PQputf(param, "%pgtext", text); // static int put_text(PGtypePutArgs *args) { args->out = PQ_TYPE_NEXTARG(args, char *); return (args->out != NULL) ? (int)strlen(args->out) + 1 : 0; } // // char text[512]; // PQgetf(res, tup_num, "%pgtext", field_num, sizeof(text), text); // static int get_text(PGtypeGetArgs *args) { size_t textl; size_t bufl = PQ_TYPE_NEXTARG(args, size_t); char *buf = PQ_TYPE_NEXTARG(args, char *); if(!buf) return args->seterr(args, "%s out buffer is NULL", args->type_alias); *buf = 0; if(PQgetisnull(args->res, args->tup_num, args->field_num)) return 0; textl = PQgetlength(args->res, args->tup_num, args->field_num); if(bufl <= textl) return args->seterr(args, "%s buffer is too small", args->type_alias); memcpy(buf, args->value, textl + 1); return 0; } ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |