Unix Technical Forum

Re: PGparam proposal

This is a discussion on Re: PGparam proposal within the pgsql Hackers forums, part of the PostgreSQL category; --> For starters, if binary results is a feature you wish you could uninvent then we are probably dead in ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > pgsql Hackers

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-15-2008, 11:36 PM
Andrew Chernow
 
Posts: n/a
Default Re: PGparam proposal

For starters, if binary results is a feature you wish you could uninvent
then we are probably dead in the water for that reason. This goes to
the core of our concept. If there is no desire to synch client & server
in regards to type handling, than this a waste of time. I think all of
this would make libpq more powerful. With that said, my follow up:

> I think a printf-style API is fundamentally a bad idea in this
> context. printf only works well when the set of concepts (datatypes,
> format specifiers, etc) is small and fixed;


As of now, including numbers and alpha chars, there are 62 classes of
which we used 11 (a class is the first char). That leaves 51 classes
with 62 types per class. Where I would agree with you, is that over
time things can become cryptic.

There are only several ways to do this.

1. Use Oids - doesn't work because they are not publically exposed by libpq.

2. Create a function for each type. This idea was rejected because it
bloated the API (I'll buy this).

3. Type aliasing schema - naturally, printf-style comes to mind but may
become cryptic.

I should mention that we were only trying to support built-in pgtypes in
libpq. If you are looking to support all external types, then we
propose the below:

For #3, maybe it would be better to abandon %c or %cc type encoding and
move into something more verbose. We could just spell out the type:
%int4, %point, etc... Maybe for built-in types you could prepend 'pg':
%pgint4, %pgpoint. This opens up the namespace and removes scalability
and cryptic issues.

Expanding on %pgint4 idea, 3rd party types can supply their own %typname
handlers (a more moduler approach). You can install them at runtime,
PQinstallTypeHandler(typname, etc..), or something like that. When a
3rd party %typname is encountered, the appropriate handler would be
used. Standard pgtypes would be installed by default.

PQinstallTypeHandler(... "gisbox2d" ...);
PQputf(... "%gisbox2d %pgpolygon %pgint4" ...);
//PQgetf would use the same %typname

The only thing libpq should support by default, is the built-in pgtypes.
A handler can expand on this.

> I find the idea of embedding state like that into the PGconn to be
> pretty horrid, as well. It makes the design non-reentrant
>
> You can't just randomly change the behavior of existing API functions.


Okay. We initially had the PGparam as a public opaque, but changed it.
We will stop piggy backing off the existing parameterized functions.
Instead, we will supply a PGparam exec/send functions.

typedef struct pg_param PGparam;//opaque

param = PQparamCreate(conn);
PQputf(param, "%pgint4 %pgtimestamptz", 62, &tstz);
res = PQparamExec(conn, param, "command", resfmt);
//PQparamExec will always PQparamClear(param), whether it failed or not
//That means after an exec/send, the param object is ready for puts
PQparamFinish(param); // free it

// This causes the below sequence of function calls:
// PQparamCreate, PQputf, PQexecParams("... VALUES ($1)"), PQparamFinish
res = PQparamExecf(conn, resfmt, "INSERT INTO t VALUES (%pgint)", 62);

> * the 8.2-to-8.3 change in the width of type money


We have code comments throughout the patches, as well as documented in
the release notes. At this point, we solved getf by checking
PGgetlength. If its 4, read4 otherwise read8. For putf, we would have
to check the server version.

> * the likely future change to type timestamptz to store original
> timezone explicitly


We would have to change how timestamptz handles the binary format from
that version forward, looks like a switch on sversion for back+forwards
compatibility.

> * the likely future change to type text to store encoding/collation
> info explicitly


Would the server do text conversion and then pass the converted text
back to the client? Or, would it choose a common encoding like UTF-8
and return the text with encoding and let the client convert. How does
this affect text format?

In the end, some of these changes would change the text format right?
That would push these changes into the API users lap, to parse and fuss
with. I just think it is cleaner to synch the binary and/or text
formats with the server. If you are looking for ways to change the
binary/text format of types w/o having to make the most recent clients
aware of this, then I think we have lost this battle.

Another solution is revamping utils/adt so that it is a shared API for
client & server. If you upgrade a client, you would automatically get
the latest formatting functions. Just like libpq checks protocol
version in several places, conn->sversion would have to be checked, or
maybe have a typefmt_api_version.

andrew & merlin


---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-15-2008, 11:36 PM
Tom Lane
 
Posts: n/a
Default Re: PGparam proposal

Andrew Chernow <ac@esilo.com> writes:
> For starters, if binary results is a feature you wish you could uninvent
> then we are probably dead in the water for that reason. This goes to
> the core of our concept.


Not really: AFAICS you could implement exactly the API you have sketched
without any reliance on binary data transmission whatsoever. As long
as PGparam is an opaque struct, library users would have no idea whether
the values they provide are being sent as text or binary.

You should probably take two steps back and think about what aspects of
what you want to do are really involved with providing an easier-to-use
API for PQexecParams and friends, and what parts are actually interested
in binary data transmission (and why). Separating those issues in your
mind might produce more clarity.

> In the end, some of these changes would change the text format right?


I'd consider that fairly unlikely. For instance, the money width change
didn't impact the text format (except to the extent that longer values
are now legal), and remembering a timestamptz's zone wouldn't impact
the text representation either. Another example is that any significant
re-implementation of type NUMERIC (say, as a bignum integer plus
exponent instead of the current BCD-ish format) would probably change
its binary representation, but there'd be no need for a text change.

The bottom line to me is that binary representations are inherently a
lot more fragile and version-dependent than text representations.
Our attitude so far has been that client-side code that wants to use
binary data transmission is taking all the risk of changes on itself.
("If it breaks, you get to keep both pieces.") It's not clear to me
what we gain by making libpq subject to those risks. If we could
have libpq insulate client apps from these kinds of changes, that would
be one thing; but AFAICS, with these more complex types, a binary format
change would usually also dictate a change in what the library exposes
to clients. As far as I saw, your proposal completely glossed over the
issue of exactly what data structure would be exposed to clients for
anything more complex than an integer. I'm afraid that that structure
would be subject to change, and then we'd just have two layers of
brokenness on our hands instead of only one.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-15-2008, 11:36 PM
Andrew Chernow
 
Posts: n/a
Default Re: PGparam proposal

>library users would have no idea whether
> the values they provide are being sent as text or binary.


The putf interface currently abstracts how it actually sends it.
Although, you do put a C type rather than a string. There is a putstr
"%pqtypstr", which puts the string representation of a type.

> providing an easier-to-use
> API for PQexecParams and friends, and what parts are actually
> interested in binary data transmission (and why)


There are two things happening here and two things we are trying to solve:

1. putf and PGparam - simpler interface for executing queries using
parameterized functions.

2. getf - doesn't use PGparam at all. You pass it a PGresult. The goal
of this is to translate either text or binary results into a uniform C
type or structure. For instance, we expose the below structure for inet.

/* This struct works with CIDR as well. */
typedef struct
{
/* convenience, value the same as first 2 bytes of sa_buf */
unsigned short sa_family;

/* mask, ie. /24 */
int mask;
int is_cidr;

/* Cast to: sockaddr, sockaddr_in, sockaddr_in6, sockaddr_stroage */
int sa_len;
char sa_buf[128]; /* avoid referencing sockaddr structs */
} PGinet;

// "res" could be text or binary results, but PGinet remains the same.
PGinet inet;
PGgetf(res, tup_num, "%pginet", field_num &inet);
connect(sock,
(const struct sockaddr *)inet.sa_buf,
(socklen_t)inet.sa_len);

The above is simpler than examining "10.0.0.1/32" and converting it to a
struct sockaddr. The same struct is used when putting a type as a "C
type" rather than as a string. You can use getf without ever using
putf+PGparam, and vise-versa.

> your proposal completely glossed over the
> issue of exactly what data structure would be exposed to clients for
> anything more complex than an integer


Complex types require a receiving structure on the client side, thus the
types we added to libpq-fe.h: PGinet, PGpolygon, PGarray, PGtimestamp,
etc... But keep in mind that these structs are the result of libpq
extracting the data from text/binary formats and assigning data to
struct members. It does not expose raw format, the API user can already
get that via PQgetvalue().

> If we could have libpq insulate client apps from these kinds
> of changes, that would be one thing;


This is the goal of getf. The API user interfaces through PGinet, not
through the output of PQgetvalue(). We propose that its libpq's job
internally to handle changes to text or binary formats and expose
consistent types/structures.

> type NUMERIC (say, as a bignum integer plus
> exponent instead of the current BCD-ish format)


This is what we want to hide inside libpq's getf. Just expose a
PGnumeric that has been translated into c types. We never expose the
wire format, only the C translated version of it.

> without any reliance on binary data transmission whatsoever.


Yes this is possible, but at a performance & ease-of-use cost. Our
performance tests didn't show much gain with strings or ints, but
complex types were 10 times faster (putting/getting arrays, polygons,
paths, etc...). So, the trade-off is a little more maintainence
overhead on libpq.

BTW, previously we mentioned a 3rd party handler api concept. This is
not needed to get libpq up and going with built-in types (which is all
we feel it should be responsible for). the handler API can always be
added later w/o changing existing functions ... have to add a couple though.

andrew & merlin



---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-15-2008, 11:36 PM
Andrew Chernow
 
Posts: n/a
Default Re: PGparam proposal

> your proposal completely glossed over the
> issue of exactly what data structure would be exposed to clients for
> anything more complex than an integer


Yeah. Forgot to include the below in the proposal and the last email.
Here is the lastest list for complex types. Most have been around since
the last 0.5 patch. Still need numeric, date, time and a couple others.

typedef struct
{
double x;
double y;
} PGpoint;

typedef struct
{
PGpoint pts[2];
} PGlseg;

typedef struct
{
PGpoint high;
PGpoint low;
} PGbox;

typedef struct
{
PGpoint center;
double radius;
} PGcircle;

/* When used with PQgetf, 'pts' must be freed with PQfreemem(). */
typedef struct
{
int npts;
int closed;
PGpoint *pts;
} PGpath;

/* When used with PQgetf, 'pts' must be freed with PQfreemem(). */
typedef struct
{
int npts;
PGpoint *pts;
} PGpolygon;

/* This struct works with CIDR as well. */
typedef struct
{
/* here for convenience, value the same as first 2 bytes of sa_buf */
unsigned short sa_family;

/* mask, ie. /24 */
int mask;
int is_cidr;

/* Cast to: sockaddr, sockaddr_in, sockaddr_in6, sockaddr_stroage */
int sa_len;
char sa_buf[128];
} PGinet;

typedef struct
{
int a;
int b;
int c;
int d;
int e;
int f;
} PGmacaddr;

/* main problem with this type is that it can be a double
* or int64 and that is a compile-time decision. This means
* the client has a 50% chance of getting it wrong. It would
* be nice if the server included an indicater or converted
* the external format to one or the other. OR, make client
* aware of server's timestamp encoding when it connects.
*/
typedef struct
{
/* When non-zero, this is a TIMESTAMP WITH TIME ZONE. When zero,
* this is a TIMESTAMP WITHOUT TIME ZONE. When WITHOUT, gmtoff
* will always be 0 and tzn will be "GMT".
*/
int withtz;

/* binary timestamp from server (in host order). If haveint64 is
* non-zero, use the 'ts.asint64' value otherwise use 'ts.asdouble'.
*/
int haveint64;
union
{
struct {
unsigned int a;
signed int b;
} asint64;

double asdouble;
} ts;

/* microseconds */
int usec;
/* GMT offset, some systems don't have this in struct tm */
int gmtoff;
/* time zone name, some systems don't have this in struct tm */
char *tzn;
/* broken-down time */
struct tm tm;
} PGtimestamp;


/* still working on this, may need access macros */

typedef struct
{
int dim;
int lbound;
} PGarraydim;

typedef struct
{
int len;

/* already in PGtype format. For instance:
* (PGpolygon *)arr->items[i].data
* or
* printf("text=%s\n", arr->items[i].data);
*
* Could have a union of all types here but that
* doesn't help much for 3rd party types.
*/
char *data;
} PGarrayitem;

typedef struct
{
Oid oid; /* type of items[].data */
int ndim;
PGarraydim dims[MAXDIM];
int nitems;
PGarrayitem *items;
} PGarray;

andrew & merlin

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 04-15-2008, 11:37 PM
Andrew Chernow
 
Posts: n/a
Default PGparam proposal v2

Here is our updated PGparam extension to the libpq api:
http://archives.postgresql.org/pgsql...2/msg00356.php

We have a patch implementing the following which we are cleaning up. We
are also kicking around some ideas for arrays and possibly composite
types which we may consider if the community wants to move forwards with
this proposal.

Tom made a number of comments some of which we have addressed:

*) Separate PGparam from PGconn: we agree with this and separated them.

*) Chanages to existing API functions: we agreed and moved new behavior
to new functions

*) 3rd party types: we now support this through a type registration
interface

*) Internal type changes: We think changes to binary format are fairly
rare and easily addressed.

*) Type confusion was removed by giving each type its own specifier.

*) Objections to printf: We agreed in part: we moved to natural names,
from %n4 to %pgint for example. This addressed scalability concerns and
should be less cryptic to use.

*) Argument passing in putf and getf is identical to the previous
proposal. All we changed was the naming schema for the %spec and putf
now takes a PGparam rather than a PGconn.


* API INTERFACE

/* opqaue */
typedef struct pg_param PGparam;

PGparam *PQparamCreate(PGconn *conn);

/* manually reset a param struct. This is done by
* all execution functions for you.
*/
void PQparamReset(PQparam *param)

/* free a PGparam */
void PQparamClear(PQparam *param);

int PQputf(
PGparam *param,
const char *typeSpec,
...);

int PQgetf(
const PGresult *res,
int tup_num,
const char *fieldSpec,
...);

/* PGparam Execution Functions */
PGresult *PQparamExec(
PGconn *conn,
PGparam *param,
const char *command,
int resultFormat);

int PQparamSendQuery(
PGconn *conn,
PGparam *param,
const char *command,
int resultFormat);

PGresult *PQparamExecPrepared
PGconn *conn,
PGparam *param,
const char *stmtName,
int resultFormat);

int PQparamSendQueryPrepared
PGconn *conn,
PGparam *param,
const char *stmtName,
int resultFormat);

/* All in wonder, no PGparam needed */
PGresult *PQexecParamsf(
PGconn *conn,
const char *commandSpec,
int resultFormat,
...);
/* All in wonder, no PGparam needed */
int PQsendQueryParamsf(
PGconn *conn,
const char *commandSpec,
int resultFormat,
...);

/* All in wonder, no PGparam needed */
PGresult *PQexecPreparedf(
PGconn *conn,
const char *stmtName,
const char *typeSpec,
int resultFormat,
...);

/* All in wonder, no PGparam needed */
int PQsendQueryPreparedf(
PGconn *conn,
const char *stmtName,
const char *typeSpec,
int resultFormat,
...);

/* gets the PGparam error message */
char *PQparamErrorMessage(const PGparam *param);


* TYPE ALIAS SPECIFIERS

The convention for postgresql built-in types is a % followed by the
type alias. Every pgtype begins with "pg". For example:

%pgint4
%pgpolygon
%pgbox

3rd party types can register their own specifiers, which is discussed
int the TYPE HANDLER SYSTEM section. Type aliases must be unique.


* TYPE HANDLER SYSTEM

typedef struct pg_typeputargs
{
/* The out buffer will be at least 16K. If more room is needed,
* use the PQ_TYPE_SETOUT to grow the buffer. In most cases,
* 16K is plenty of room.
*/
char *out;

/* the size in bytes of the out buffer */
int outl;

/* Should not use directly, see PQ_TYPE_SETOUT. For the brave,
* set to 1 if you point the out buffer at memory that should be
* freed after your put callback returns.
*/
int free_out;

/* The arguments to putf. Use PQ_TYPE_NEXTARG. */
va_list *ap;

/* The type's alias name, like 'pgint8'. */
const char *type_alias;

/* Sets an error message. This msg shows up in
* PQparamErrorMessage().
*/
int (*seterr)(struct pg_typeputargs *args, const char *format, ...);
} PGtypePutArgs;

typedef struct pg_typegetargs
{
const PGresult *res;
int tup_num;
int field_num;

/* pointer to the output of PQgetvalue for this tup+field */
char *value;

/* The arguments to getf. Use PQ_TYPE_NEXTARG. NOTE: the field_num
* supplied to getf has already been pulled out of the va_list and
* assigned to this structs field_num member.
*/
va_list *ap;

/* The type's alias name, like 'pgint8'. */
const char *type_alias;

/* Sets an error message. This msg shows up in
* PQresultErrorMessage().
*/
int (*seterr)(struct pg_typegetargs *args, const char *format, ...);
} PGtypeGetArgs;

#define PQ_TYPE_NEXTARG(typeArgs, type) va_arg(*(typeArgs)->ap, type)

/* makes sure that putArgs->out is larger enough for new_outl */
#define PQ_TYPE_SETOUT(putArgs, new_outl) do{ \
if((new_outl) > (putArgs)->outl) \
{ \
(putArgs)->out = (char *)malloc(new_outl); \
if(!(putArgs)->out) \
return -1; \
*(putArgs)->out = 0; \
(putArgs)->outl = (new_outl); \
(putArgs)->free_out = 1; \
} \
} while(0)

/*
* Returns - the number of bytes put or -1 for error.
*/
typedef int (*PGtypePut)(PGtypePutArgs *args);

/*
* Returns - 0 for success or -1 for error.
*/
typedef int (*PGtypeGet)(PGtypeGetArgs *args);

/* Indicates that a type handler should use text format for puts.
* The default is binary.
*/
#define PQ_TEXTPUTFMT 0x01

/* By default, PQputf will make an internal copy of the type data. By
* setting this flag, it will only maintain a pointer to the type data.
*
* An example of this flag is the %pgtextptr and %pgbyteaptr, both
* of which can consume large amounts of memory. In most cases, it
* is more effiecent to store a direct pointer rather than making a
* copy.
*/
#define PQ_NOCOPYONPUT 0x02

/* register a type handler */
int PQregisterTypeHandler(
const char *alias,
Oid oid,
PGtypePut put,
PGtypeGet get,
int flags);


* TYPE HANDLER EXAMPLES

Below are two examples of how to implement a libpq type handler. The
postgresql int4 and text types are always available and would never need
to be registered. This is just an example.

PQregisterTypeHandler(
"pgint4", /* The %alias used by PQputf/PQgetf, must be unique */
INT4OID, /* Oid of the type. not required - can be InvalidOid */
put_int4, /* put callback for this type */
get_int4, /* get callback for this type */
0); /* no flags needed */

PQregisterTypeHandler(
"pgtext",
TEXTOID,
put_text,
get_text,
PQ_TEXTPUTFMT);

//
// PQputf(param, "%pgint4", 100);
//
static int put_int4(PGtypePutArgs *args)
{
*(int *)args->out = (int)htonl((int)PQ_TYPE_NEXTARG(args, int));
return 4;
}

//
// int n;
// PQgetf(res, tup_num, "%pgint4", field_num, &n);
//
static int get_int4(PGtypeGetArgs *args)
{
int *i4p = PQ_TYPE_NEXTARG(args, int *);

if(!i4p)
return args->seterr(args, "%s out pointer is NULL",
args->type_alias);

*i4p = 0;

if(PQgetisnull(args->res, args->tup_num, args->field_num))
return 0;

/* text format conversion */
if(PQfformat(args->res, args->field_num) == 0)
{
int n;

errno = 0;
if((n = (int)strtol(args->value, NULL, 10)) == 0 && errno)
return args->seterr(args, "%s string conversion failed: [%d] %s",
args->type_alias, errno, strerror(errno));

*i4p = n;
return 0;
}

/* binary format conversion */
*i4p = (int)ntohl(*(unsigned int *)args->value);
return 0;
}


//
// PQputf(param, "%pgtext", text);
//
static int put_text(PGtypePutArgs *args)
{
args->out = PQ_TYPE_NEXTARG(args, char *);
return (args->out != NULL) ? (int)strlen(args->out) + 1 : 0;
}

//
// char text[512];
// PQgetf(res, tup_num, "%pgtext", field_num, sizeof(text), text);
//
static int get_text(PGtypeGetArgs *args)
{
size_t textl;
size_t bufl = PQ_TYPE_NEXTARG(args, size_t);
char *buf = PQ_TYPE_NEXTARG(args, char *);

if(!buf)
return args->seterr(args, "%s out buffer is NULL",
args->type_alias);

*buf = 0;

if(PQgetisnull(args->res, args->tup_num, args->field_num))
return 0;

textl = PQgetlength(args->res, args->tup_num, args->field_num);
if(bufl <= textl)
return args->seterr(args, "%s buffer is too small",
args->type_alias);

memcpy(buf, args->value, textl + 1);
return 0;
}

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 12:48 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com