This is a discussion on Re: Current enums patch within the Pgsql Patches forums, part of the PostgreSQL category; --> Tom Lane wrote: > Tom Dunstan <pgsql@tomd.cc> writes: >> Here's the current version of the enums patch. > > ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Tom Lane wrote: > Tom Dunstan <pgsql@tomd.cc> writes: >> Here's the current version of the enums patch. > > Applied with revisions --- some cosmetic, some not so much. > Attached is the patch-as-applied if you care to compare; feel free > to ask questions about anything not obvious. There's a little bug: postgres=# CREATE TYPE aenum AS ENUM ('a','b','c'); CREATE TYPE postgres=# CREATE TABLE t (e aenum); CREATE TABLE postgres=# INSERT INTO t VALUES ('fooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooo ooooooooooooooooooooooooooooooooooooo'); server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Succeeded. postgres=# In the server log: TRAP: FailedAssertion("!(keylen < 64)", File: "hashfunc.c", Line: 145) -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |
| |||
| Heikki Linnakangas wrote: > Tom Lane wrote: >> Tom Dunstan <pgsql@tomd.cc> writes: >>> Here's the current version of the enums patch. >> >> Applied with revisions --- some cosmetic, some not so much. >> Attached is the patch-as-applied if you care to compare; feel free >> to ask questions about anything not obvious. > > There's a little bug: > > postgres=# CREATE TYPE aenum AS ENUM ('a','b','c'); CREATE TYPE > postgres=# CREATE TABLE t (e aenum); CREATE TABLE postgres=# INSERT > INTO t VALUES > ('fooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooo ooooooooooooooooooooooooooooooooooooo'); > > server closed the connection unexpectedly > This probably means the server terminated abnormally > before or while processing the request. > The connection to the server was lost. Attempting reset: Succeeded. > postgres=# > > In the server log: > TRAP: FailedAssertion("!(keylen < 64)", File: "hashfunc.c", Line: 145) > > Looks like we need to check the length on type creation too. I'll fix later if not beaten to it. cheers andrew ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org |
| |||
| Heikki Linnakangas <heikki@enterprisedb.com> writes: > There's a little bug: > postgres=# CREATE TYPE aenum AS ENUM ('a','b','c'); CREATE TYPE > postgres=# CREATE TABLE t (e aenum); CREATE TABLE postgres=# INSERT > INTO t VALUES > ('fooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooo ooooooooooooooooooooooooooooooooooooo'); > server closed the connection unexpectedly Hm, I suppose we should apply truncate_identifier rather than letting the strings be blindly truncated (perhaps in mid-character). Should we have it throw the truncation NOTICE, or not? First thought is to do so during CREATE TYPE but not during plain enum_in(). regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings |
| |||
| Tom Lane wrote: > Heikki Linnakangas <heikki@enterprisedb.com> writes: > >> There's a little bug: >> > > >> postgres=# CREATE TYPE aenum AS ENUM ('a','b','c'); CREATE TYPE >> postgres=# CREATE TABLE t (e aenum); CREATE TABLE postgres=# INSERT >> INTO t VALUES >> ('fooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooo ooooooooooooooooooooooooooooooooooooo'); >> server closed the connection unexpectedly >> > > Hm, I suppose we should apply truncate_identifier rather than letting > the strings be blindly truncated (perhaps in mid-character). Should we > have it throw the truncation NOTICE, or not? First thought is to do so > during CREATE TYPE but not during plain enum_in(). > I don't see much point in truncating. The patch I have so far gives the regression output shown below (yes, I should make the messages more consistent). In fact the truncation and associated NOTICE just strike me as confusing. cheers andrew + -- Name, Values too long + -- + CREATE TYPE + abcdefghijklmnopqrsatuvwxyz0123456789abcdefghijklm nopqrsatuvwxyz0123456789 + AS ENUM('a'); + NOTICE: identifier "abcdefghijklmnopqrsatuvwxyz0123456789abcdefghijkl mnopqrsatuvwxyz0123456789" will be truncated to "abcdefghijklmnopqrsatuvwxyz0123456789abcdefghijkl mnopqrsatuvwxy" + ERROR: type names must be 62 characters or less + CREATE TYPE toolong AS ENUM + ('abcdefghijklmnopqrsatuvwxyz0123456789abcdefghijk lmnopqrsatuvwxyz0123456789'); + ERROR: invalid enum label "abcdefghijklmnopqrsatuvwxyz0123456789abcdefghijkl mnopqrsatuvwxyz0123456789": must be 63 characters or less + INSERT INTO enumtest VALUES + ('abcdefghijklmnopqrsatuvwxyz0123456789abcdefghijk lmnopqrsatuvwxyz0123456789'); + ERROR: input value too long (74) for enum: "abcdefghijklmnopqrsatuvwxyz0123456789abcdefghijkl mnopqrsatuvwxyz0123456789" + -- ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings |
| |||
| >> Hm, I suppose we should apply truncate_identifier rather than letting >> the strings be blindly truncated (perhaps in mid-character). Should we >> have it throw the truncation NOTICE, or not? First thought is to do so >> during CREATE TYPE but not during plain enum_in(). >> > > I don't see much point in truncating. > > The patch I have so far gives the regression output shown below (yes, I > should make the messages more consistent). > > In fact the truncation and associated NOTICE just strike me as confusing. [snip] > + ERROR: input value too long (74) for enum: > "abcdefghijklmnopqrsatuvwxyz012345 etc etc I was about to suggest that we just truncate the value in the input function and look it up on the basis that if it's too long, the lookup will fail and we can just tell the user that it wasn't a valid value. But if there's a valid value that has the same 63 bytes as the first 63 of the too-long input string, we'll end up looking up the valid one wrongly. So we do need to test for length and die at that point. Can we just fail with the same error message as trying to input a smaller, but similarly invalid string? At create time, we should definitely throw an error... if we just truncate the value at byte 64 (with a notice or not) we might truncate in the middle of a multi-byte character. Yuk. Cheers Tom ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |
| |||
| Tom Dunstan <pgsql@tomd.cc> writes: > I was about to suggest that we just truncate the value in the input > function and look it up on the basis that if it's too long, the lookup > will fail and we can just tell the user that it wasn't a valid value. > But if there's a valid value that has the same 63 bytes as the first 63 > of the too-long input string, we'll end up looking up the valid one > wrongly. So we do need to test for length and die at that point. Can we > just fail with the same error message as trying to input a smaller, but > similarly invalid string? > At create time, we should definitely throw an error... if we just > truncate the value at byte 64 (with a notice or not) we might truncate > in the middle of a multi-byte character. Yuk. While all this reasoning is perfectly OK on its own terms, it ignores the precedents of SQL identifier handling. Maybe we should revisit the question of whether the labels are identifiers. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |
| |||
| Tom Lane wrote: > > While all this reasoning is perfectly OK on its own terms, it ignores > the precedents of SQL identifier handling. Maybe we should revisit the > question of whether the labels are identifiers. > > If we do that can we still cache the values in the syscache? My impression from what you and TomD said was that it would be at least more difficult. If it's possible I'm all in favor. (Side note: ISTM there is a pretty good case to move the truncation code from the lexer to the parser where it can be invoked according to context. Maybe something to think about next cycle.). cheers andrew ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |
| |||
| Tom Lane wrote: > While all this reasoning is perfectly OK on its own terms, it ignores > the precedents of SQL identifier handling. Maybe we should revisit the > question of whether the labels are identifiers. Implying that they shouldn't be quoted like text (or should be double-quoted if required)? Originally when discussing the patch I thought that this was a good idea, but now I'm not so sure. How does one set an enum value from e.g. a JDBC PreparedStatement in that scenario? Cheers Tom ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |
| |||
| Tom Dunstan wrote: > Tom Lane wrote: >> While all this reasoning is perfectly OK on its own terms, it ignores >> the precedents of SQL identifier handling. Maybe we should revisit the >> question of whether the labels are identifiers. > > Implying that they shouldn't be quoted like text (or should be > double-quoted if required)? Originally when discussing the patch I > thought that this was a good idea, but now I'm not so sure. How does > one set an enum value from e.g. a JDBC PreparedStatement in that > scenario? > > Heh ... I read the statement the other way, i.e. maybe we should treat them entirely as strings with no length limitation. The trouble is they are half identifiers and half not right now. We certainly can't treat them fully as identifiers unless I'm right off course - the ambiguities would be horrendous. cheers andrew ---------------------------(end of broadcast)--------------------------- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq |
| ||||
| Tom Lane wrote: > But probably making them act like identifiers is not a good idea, > because I doubt we want automatic downcasing in enum_in. People > wouldn't be happy if they had to write WHERE color = '"Red"' or > something like that to get at a mixed-case enum label. Ah, that's what you had in mind. Yeah, that's a bit verbose. > Let's > just throw the error instead. (I agree that enum_in can just fail > with "no such label", but CREATE TYPE ought to give a specific > error about it.) Agreed. Andrew, you said you had a mostly-working patch already? Cheers Tom ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |