Unix Technical Forum

Re: Oracle Style packages on postgres

This is a discussion on Re: Oracle Style packages on postgres within the pgsql Hackers forums, part of the PostgreSQL category; --> > -----Original Message----- > From: Bruce Momjian [mailto gman@candle.pha.pa.us] > Sent: Tuesday, May 10, 2005 8:43 AM > To: ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > pgsql Hackers

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-11-2008, 03:54 AM
Dave Held
 
Posts: n/a
Default Re: Oracle Style packages on postgres

> -----Original Message-----
> From: Bruce Momjian [mailtogman@candle.pha.pa.us]
> Sent: Tuesday, May 10, 2005 8:43 AM
> To: Thomas Hallgren
> Cc: Tom Lane; rmm@sqlisor.com; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Oracle Style packages on postgres
>
> [...]
> I suppose. I think we should focus on the use cases for Oracle
> packages, rather than the specific functionality it provides.
> What things do people need PostgreSQL to do that it already
> doesn't do?


Is that really the best way to go about things? Already RDBMSes
are patchwork quilts of functionality. Is merely adding another
patch the most elegant way to evolve the database? The problem is
that Oracle et al are trying to be ORDBMSes and aren't exactly sure
what the best way to go is. Instead of trying to formulate a
rational plan for what an ORDBMS should even look like, they simply
look at what would work with their existing infrastructure and tack
on features. Then Postgres plays the copycat game. Instead of
trying to play catch-up with Oracle, why not beat them at their own
game?

What packages provide is encapsulation. Hiding the data from the
user and forcing him/her to use the public interface (methods).
That is an important and admirable OO feature. Some people think
that using the DB's security model can achieve the same thing. It
can't, exactly, but there's an important lesson to be learned from
the suggestion. The problem is that OOP is a *programming* paradigm,
and a database is not a *programming language*. In a programming
language, there really is no such thing as "security". There is
only "visibility" and "accessibility". Private methods in an OOP
language do not provide *security*; they only limit *accessibility*.
Like so many other differences between the relational model and the
OOP model, there is an impedance mismatch here. However, there is
also opportunity.

In an OOPL, you can say: "Users can call this method from here, but
not from there." What you *can't* say is: "User X can call this
method, but User Y cannot." As you can see, these are orthogonal
concepts. You could call the first "accessibility by location" and
the second "accessibility by authentication". An ORDBMS should
support both. "Private" does not respect your identity, only your
calling location. An ACL does not respect your calling scope, only
your identity. A system that has both is clearly more flexible than
one that only has one or the other.

Now what you need to keep in mind is that each visibility model
serves a different purpose. The purpose of a security model is to
limit *who* can see/touch certain data because the data has intrinsic
value. The purpose of an accessibility model is to limit *where* and
*how* data can be seen/touched in order to preserve *program
invariants*. So if you have an object (or tuple!) that records the
start and stop time of some process, it is probably a logical
invariant that the stop time is greater than or equal to the start
time. For this reason, in a PL, you would encapsulate these fields
(attributes) and only provide controlled access to update them that
checks and preserves the invariant, *no matter who you are*. You
don't want a superuser violating this invariant any more than Sue
User.

Now you might object that constraints allow you to preserve
invariants as well, and indeed they do. But constraints do not
respect calling scope. Suppose there is a process that needs to
update the timestamps in a way that temporarily breaks the invariant
but restores it afterwards. The only way to effect this in a
constraint environment is to drop the constraint, perform the
operation, and restore it. However, dropping a constraint is not an
ideal solution because there may be other unprivileged processes
operating on the relation that still need the constraint to be
enforced. There is no way to say: "There is a priviledged class of
methods that is allowed to violate this constraint because they are
trusted to restore it upon completion." Note that this is different
from saying "There is a priviledged class of users that is allowed
to violate this constraint." If you try to do something like give
read-only access to everybody and only write access to one user and
define that user to be the owner of the methods that update the data,
you have to follow the convention that that user only operates
through the defined interface, and doesn't hack the data directly.
That's because user-level accessibility is not the same as scope-
level accessibility. Whereas, if you define something like a
package, and say: "Package X is allowed full and complete access
to relation Y", and stick the interface methods in X, you still have
all the user-level security you want while preserving the invariants
in the most elegant way.

So you can think of a package as a scope in a programming language.
It's like a user, but it is not a user. A user has privileges that
cut across scopes. Now, whether packages should be different from
schemas is a whole different ballgame. The purpose of a schema in
Postgres is not entirely clear to me. There's lots of different ways
to use schemas, and there is no obvious best way to use them. In
order to implement the accessibility features of packages, schemas
would have to be changed considerably. Probably a lot of users would
be unhappy if schemas were changed in that way. My guess is that
this would not be a good idea.

I think we can get some guidance from PLs. C++ is what you call a
"multi-paradigm language". You can do everything from assembly to
metaprogramming in C++. As such, it is very loose and open in some
respects. C++ has two kinds of scopes: it has classes and namespaces.
Members of a class are encapsulated and support data hiding. Members
of a namespace are only loosely grouped and do not support data hiding
explicitly. Namespaces exist primarily to avoid name collisions.

Java, on the other hand, decided that for OOP purity, everything must
be a class. That would be like making schemas into packages and
imposing accessibility rules on them. At the end of the day, I think
many PL design experts agree that making everything a class is not
necessarily the best way to go.

So schemas can be like C++ namespaces - they provide a means to
loosely group related objects and help avoid name collisions. So
the package could be like a class - they provide OOP-like
encapsulation via accessibility rules. However, that doesn't mean
that nested schemas wouldn't also be a good thing. In C++, nested
namespaces are extremely useful when one layer of scoping does not
sufficiently partition the namespace to avoid frequent name
collisions. I think the same is true of Postgres. I certainly would
like to be able to use nested schema names in several contexts.
Instead, I have to make a choice between making different schemas,
or making different name prefixes. I wouldn't even mind if nested
schemas were only allowed to contain schemas except at the leaves of
the tree. Another feature that is very useful is the "using clause".
Combined with nested namespaces, this is a very powerful way to give
programmers/dbas control over names. You can give everything the
most natural name, and just put it in the appropriate namespace,
and use the namespace that is relevant to the given task at hand.

So consider this example:

Tables:
etl.import.record
etl.export.record

As you can imagine, I don't really want to make an 'import' and
'export' schema at the top level. There's several tables in
each schema, but that should illustrate the point. Then, when
constructing queries, it would be nice to be able to do this:

USING etl.import
;
SELECT *
FROM record
JOIN header ON ...
JOIN file ON ...
;

The effect of a USING clause would be to import the schema names
into the public namespace for the duration of the transaction. If
that leads to ambiguous names, then the parser/planner should emit an
error.

__
David B. Held
Software Engineer/Array Services Group
200 14th Ave. East, Sartell, MN 56377
320.534.3637 320.253.7800 800.752.8129

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-11-2008, 03:54 AM
Bruce Momjian
 
Posts: n/a
Default Re: Oracle Style packages on postgres

Dave Held wrote:
> > -----Original Message-----
> > From: Bruce Momjian [mailtogman@candle.pha.pa.us]
> > Sent: Tuesday, May 10, 2005 8:43 AM
> > To: Thomas Hallgren
> > Cc: Tom Lane; rmm@sqlisor.com; pgsql-hackers@postgresql.org
> > Subject: Re: [HACKERS] Oracle Style packages on postgres
> >
> > [...]
> > I suppose. I think we should focus on the use cases for Oracle
> > packages, rather than the specific functionality it provides.
> > What things do people need PostgreSQL to do that it already
> > doesn't do?

>
> Is that really the best way to go about things? Already RDBMSes
> are patchwork quilts of functionality. Is merely adding another
> patch the most elegant way to evolve the database? The problem is
> that Oracle et al are trying to be ORDBMSes and aren't exactly sure
> what the best way to go is. Instead of trying to formulate a
> rational plan for what an ORDBMS should even look like, they simply
> look at what would work with their existing infrastructure and tack
> on features. Then Postgres plays the copycat game. Instead of
> trying to play catch-up with Oracle, why not beat them at their own
> game?


I was unclear. I was suggesting exactly what you posted, that we look at
what functionality we _need_ from Oracle packages, rather than the
functionality of Oracle packages themselves. My assumption is that
Oracle does some things we need, and some things we don't, and does them
in some ways we will like, and others we will not, so let's look at the
actuall use cases that we need to address.

--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-11-2008, 03:55 AM
Bruce Momjian
 
Posts: n/a
Default Re: Oracle Style packages on postgres


OK, so it seems we need:

o make private objects accessable only to objects
in the same schema
o Allow current_schema.objname to access current
schema objects
o session variables
o nested schemas?

---------------------------------------------------------------------------

Dave Held wrote:
> > -----Original Message-----
> > From: Bruce Momjian [mailtogman@candle.pha.pa.us]
> > Sent: Tuesday, May 10, 2005 8:43 AM
> > To: Thomas Hallgren
> > Cc: Tom Lane; rmm@sqlisor.com; pgsql-hackers@postgresql.org
> > Subject: Re: [HACKERS] Oracle Style packages on postgres
> >
> > [...]
> > I suppose. I think we should focus on the use cases for Oracle
> > packages, rather than the specific functionality it provides.
> > What things do people need PostgreSQL to do that it already
> > doesn't do?

>
> Is that really the best way to go about things? Already RDBMSes
> are patchwork quilts of functionality. Is merely adding another
> patch the most elegant way to evolve the database? The problem is
> that Oracle et al are trying to be ORDBMSes and aren't exactly sure
> what the best way to go is. Instead of trying to formulate a
> rational plan for what an ORDBMS should even look like, they simply
> look at what would work with their existing infrastructure and tack
> on features. Then Postgres plays the copycat game. Instead of
> trying to play catch-up with Oracle, why not beat them at their own
> game?
>
> What packages provide is encapsulation. Hiding the data from the
> user and forcing him/her to use the public interface (methods).
> That is an important and admirable OO feature. Some people think
> that using the DB's security model can achieve the same thing. It
> can't, exactly, but there's an important lesson to be learned from
> the suggestion. The problem is that OOP is a *programming* paradigm,
> and a database is not a *programming language*. In a programming
> language, there really is no such thing as "security". There is
> only "visibility" and "accessibility". Private methods in an OOP
> language do not provide *security*; they only limit *accessibility*.
> Like so many other differences between the relational model and the
> OOP model, there is an impedance mismatch here. However, there is
> also opportunity.
>
> In an OOPL, you can say: "Users can call this method from here, but
> not from there." What you *can't* say is: "User X can call this
> method, but User Y cannot." As you can see, these are orthogonal
> concepts. You could call the first "accessibility by location" and
> the second "accessibility by authentication". An ORDBMS should
> support both. "Private" does not respect your identity, only your
> calling location. An ACL does not respect your calling scope, only
> your identity. A system that has both is clearly more flexible than
> one that only has one or the other.
>
> Now what you need to keep in mind is that each visibility model
> serves a different purpose. The purpose of a security model is to
> limit *who* can see/touch certain data because the data has intrinsic
> value. The purpose of an accessibility model is to limit *where* and
> *how* data can be seen/touched in order to preserve *program
> invariants*. So if you have an object (or tuple!) that records the
> start and stop time of some process, it is probably a logical
> invariant that the stop time is greater than or equal to the start
> time. For this reason, in a PL, you would encapsulate these fields
> (attributes) and only provide controlled access to update them that
> checks and preserves the invariant, *no matter who you are*. You
> don't want a superuser violating this invariant any more than Sue
> User.
>
> Now you might object that constraints allow you to preserve
> invariants as well, and indeed they do. But constraints do not
> respect calling scope. Suppose there is a process that needs to
> update the timestamps in a way that temporarily breaks the invariant
> but restores it afterwards. The only way to effect this in a
> constraint environment is to drop the constraint, perform the
> operation, and restore it. However, dropping a constraint is not an
> ideal solution because there may be other unprivileged processes
> operating on the relation that still need the constraint to be
> enforced. There is no way to say: "There is a priviledged class of
> methods that is allowed to violate this constraint because they are
> trusted to restore it upon completion." Note that this is different
> from saying "There is a priviledged class of users that is allowed
> to violate this constraint." If you try to do something like give
> read-only access to everybody and only write access to one user and
> define that user to be the owner of the methods that update the data,
> you have to follow the convention that that user only operates
> through the defined interface, and doesn't hack the data directly.
> That's because user-level accessibility is not the same as scope-
> level accessibility. Whereas, if you define something like a
> package, and say: "Package X is allowed full and complete access
> to relation Y", and stick the interface methods in X, you still have
> all the user-level security you want while preserving the invariants
> in the most elegant way.
>
> So you can think of a package as a scope in a programming language.
> It's like a user, but it is not a user. A user has privileges that
> cut across scopes. Now, whether packages should be different from
> schemas is a whole different ballgame. The purpose of a schema in
> Postgres is not entirely clear to me. There's lots of different ways
> to use schemas, and there is no obvious best way to use them. In
> order to implement the accessibility features of packages, schemas
> would have to be changed considerably. Probably a lot of users would
> be unhappy if schemas were changed in that way. My guess is that
> this would not be a good idea.
>
> I think we can get some guidance from PLs. C++ is what you call a
> "multi-paradigm language". You can do everything from assembly to
> metaprogramming in C++. As such, it is very loose and open in some
> respects. C++ has two kinds of scopes: it has classes and namespaces.
> Members of a class are encapsulated and support data hiding. Members
> of a namespace are only loosely grouped and do not support data hiding
> explicitly. Namespaces exist primarily to avoid name collisions.
>
> Java, on the other hand, decided that for OOP purity, everything must
> be a class. That would be like making schemas into packages and
> imposing accessibility rules on them. At the end of the day, I think
> many PL design experts agree that making everything a class is not
> necessarily the best way to go.
>
> So schemas can be like C++ namespaces - they provide a means to
> loosely group related objects and help avoid name collisions. So
> the package could be like a class - they provide OOP-like
> encapsulation via accessibility rules. However, that doesn't mean
> that nested schemas wouldn't also be a good thing. In C++, nested
> namespaces are extremely useful when one layer of scoping does not
> sufficiently partition the namespace to avoid frequent name
> collisions. I think the same is true of Postgres. I certainly would
> like to be able to use nested schema names in several contexts.
> Instead, I have to make a choice between making different schemas,
> or making different name prefixes. I wouldn't even mind if nested
> schemas were only allowed to contain schemas except at the leaves of
> the tree. Another feature that is very useful is the "using clause".
> Combined with nested namespaces, this is a very powerful way to give
> programmers/dbas control over names. You can give everything the
> most natural name, and just put it in the appropriate namespace,
> and use the namespace that is relevant to the given task at hand.
>
> So consider this example:
>
> Tables:
> etl.import.record
> etl.export.record
>
> As you can imagine, I don't really want to make an 'import' and
> 'export' schema at the top level. There's several tables in
> each schema, but that should illustrate the point. Then, when
> constructing queries, it would be nice to be able to do this:
>
> USING etl.import
> ;
> SELECT *
> FROM record
> JOIN header ON ...
> JOIN file ON ...
> ;
>
> The effect of a USING clause would be to import the schema names
> into the public namespace for the duration of the transaction. If
> that leads to ambiguous names, then the parser/planner should emit an
> error.
>
> __
> David B. Held
> Software Engineer/Array Services Group
> 200 14th Ave. East, Sartell, MN 56377
> 320.534.3637 320.253.7800 800.752.8129
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>


--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-11-2008, 03:55 AM
David Fetter
 
Posts: n/a
Default Re: Oracle Style packages on postgres

On Tue, May 10, 2005 at 06:55:39PM -0400, Bruce Momjian wrote:
>
> OK, so it seems we need:
>
> o make private objects accessable only to objects
> in the same schema
> o Allow current_schema.objname to access current
> schema objects
> o session variables
> o nested schemas?


Well, some kind of nestable namespace for objects, anyhow.

I'll look over the SQL:2003 draft and see if I can find anything along
that line in there.

Cheers,
D
--
David Fetter david@fetter.org http://fetter.org/
phone: +1 510 893 6100 mobile: +1 415 235 3778

Remember to vote!

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 04-11-2008, 03:55 AM
Bruce Momjian
 
Posts: n/a
Default Re: Oracle Style packages on postgres

David Fetter wrote:
> On Tue, May 10, 2005 at 06:55:39PM -0400, Bruce Momjian wrote:
> >
> > OK, so it seems we need:
> >
> > o make private objects accessable only to objects
> > in the same schema
> > o Allow current_schema.objname to access current
> > schema objects
> > o session variables
> > o nested schemas?

>
> Well, some kind of nestable namespace for objects, anyhow.


How would nested namespaces be different from nested schemas? I thought
the two were the same.

--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 04-11-2008, 03:55 AM
David Fetter
 
Posts: n/a
Default Re: Oracle Style packages on postgres

On Tue, May 10, 2005 at 09:49:13PM -0400, Bruce Momjian wrote:
> David Fetter wrote:
> > On Tue, May 10, 2005 at 06:55:39PM -0400, Bruce Momjian wrote:
> > >
> > > OK, so it seems we need:
> > >
> > > o make private objects accessable only to objects in the same
> > > schema
> > > o Allow current_schema.objname to access current
> > > schema objects
> > > o session variables
> > > o nested schemas?

> >
> > Well, some kind of nestable namespace for objects, anyhow.

>
> How would nested namespaces be different from nested schemas? I
> thought the two were the same.


I was thinking of nested namespaces in the more limited sense of
namespaces for bundles of functions/stored procedures rather than a
full-on hierarchy where a table can have a schema which resides inside
another schema which resides...unless people really want to have it
that way.

In a slightly related situation, at least in my mind, it seems like
for full-on ORDBMS functionality, it should be possible to have a
column of type schema or setof record, &c., and be able to take these
things apart at each row.

Cheers,
D
--
David Fetter david@fetter.org http://fetter.org/
phone: +1 510 893 6100 mobile: +1 415 235 3778

Remember to vote!

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #7 (permalink)  
Old 04-11-2008, 03:55 AM
Tom Lane
 
Posts: n/a
Default Re: Oracle Style packages on postgres

Bruce Momjian <pgman@candle.pha.pa.us> writes:
> OK, so it seems we need:


> o make private objects accessable only to objects
> in the same schema
> o Allow current_schema.objname to access current
> schema objects
> o session variables
> o nested schemas?


There's been a lot of handwaving about nested schemas in this thread,
but no one has explained how they could actually *work* given the SQL
syntax rules. In general, "a" is a column from the current table
set, "a.b" is a column b in table/alias a from the current query,
"a.b.c" is a column c from table b in schema a, "a.b.c.d" is a column
d from table c in schema b in catalog a, and any more than that is
a syntax error. I do not see how to add nested schemas without creating
unworkable ambiguities, not to say outright violations of the spec.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #8 (permalink)  
Old 04-11-2008, 03:55 AM
Bruce Momjian
 
Posts: n/a
Default Re: Oracle Style packages on postgres

David Fetter wrote:
> On Tue, May 10, 2005 at 09:49:13PM -0400, Bruce Momjian wrote:
> > David Fetter wrote:
> > > On Tue, May 10, 2005 at 06:55:39PM -0400, Bruce Momjian wrote:
> > > >
> > > > OK, so it seems we need:
> > > >
> > > > o make private objects accessable only to objects in the same
> > > > schema
> > > > o Allow current_schema.objname to access current
> > > > schema objects
> > > > o session variables
> > > > o nested schemas?
> > >
> > > Well, some kind of nestable namespace for objects, anyhow.

> >
> > How would nested namespaces be different from nested schemas? I
> > thought the two were the same.

>
> I was thinking of nested namespaces in the more limited sense of
> namespaces for bundles of functions/stored procedures rather than a
> full-on hierarchy where a table can have a schema which resides inside
> another schema which resides...unless people really want to have it
> that way.


Oh, so allow only functions to sit in the sub-namespace? Yea, we could
do that, but it seems sort of limiting. However, I am unclear how we
would do sub-namespaces either.

--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #9 (permalink)  
Old 04-11-2008, 04:05 AM
Bruce Momjian
 
Posts: n/a
Default Re: Oracle Style packages on postgres


Added to TODO:

* Add the features of packages
o Make private objects accessable only to objects in the same schema
o Allow current_schema.objname to access current schema objects
o Add session variables
o Allow nested schemas


---------------------------------------------------------------------------

Bruce Momjian wrote:
>
> OK, so it seems we need:
>
> o make private objects accessable only to objects
> in the same schema
> o Allow current_schema.objname to access current
> schema objects
> o session variables
> o nested schemas?
>
> ---------------------------------------------------------------------------
>
> Dave Held wrote:
> > > -----Original Message-----
> > > From: Bruce Momjian [mailtogman@candle.pha.pa.us]
> > > Sent: Tuesday, May 10, 2005 8:43 AM
> > > To: Thomas Hallgren
> > > Cc: Tom Lane; rmm@sqlisor.com; pgsql-hackers@postgresql.org
> > > Subject: Re: [HACKERS] Oracle Style packages on postgres
> > >
> > > [...]
> > > I suppose. I think we should focus on the use cases for Oracle
> > > packages, rather than the specific functionality it provides.
> > > What things do people need PostgreSQL to do that it already
> > > doesn't do?

> >
> > Is that really the best way to go about things? Already RDBMSes
> > are patchwork quilts of functionality. Is merely adding another
> > patch the most elegant way to evolve the database? The problem is
> > that Oracle et al are trying to be ORDBMSes and aren't exactly sure
> > what the best way to go is. Instead of trying to formulate a
> > rational plan for what an ORDBMS should even look like, they simply
> > look at what would work with their existing infrastructure and tack
> > on features. Then Postgres plays the copycat game. Instead of
> > trying to play catch-up with Oracle, why not beat them at their own
> > game?
> >
> > What packages provide is encapsulation. Hiding the data from the
> > user and forcing him/her to use the public interface (methods).
> > That is an important and admirable OO feature. Some people think
> > that using the DB's security model can achieve the same thing. It
> > can't, exactly, but there's an important lesson to be learned from
> > the suggestion. The problem is that OOP is a *programming* paradigm,
> > and a database is not a *programming language*. In a programming
> > language, there really is no such thing as "security". There is
> > only "visibility" and "accessibility". Private methods in an OOP
> > language do not provide *security*; they only limit *accessibility*.
> > Like so many other differences between the relational model and the
> > OOP model, there is an impedance mismatch here. However, there is
> > also opportunity.
> >
> > In an OOPL, you can say: "Users can call this method from here, but
> > not from there." What you *can't* say is: "User X can call this
> > method, but User Y cannot." As you can see, these are orthogonal
> > concepts. You could call the first "accessibility by location" and
> > the second "accessibility by authentication". An ORDBMS should
> > support both. "Private" does not respect your identity, only your
> > calling location. An ACL does not respect your calling scope, only
> > your identity. A system that has both is clearly more flexible than
> > one that only has one or the other.
> >
> > Now what you need to keep in mind is that each visibility model
> > serves a different purpose. The purpose of a security model is to
> > limit *who* can see/touch certain data because the data has intrinsic
> > value. The purpose of an accessibility model is to limit *where* and
> > *how* data can be seen/touched in order to preserve *program
> > invariants*. So if you have an object (or tuple!) that records the
> > start and stop time of some process, it is probably a logical
> > invariant that the stop time is greater than or equal to the start
> > time. For this reason, in a PL, you would encapsulate these fields
> > (attributes) and only provide controlled access to update them that
> > checks and preserves the invariant, *no matter who you are*. You
> > don't want a superuser violating this invariant any more than Sue
> > User.
> >
> > Now you might object that constraints allow you to preserve
> > invariants as well, and indeed they do. But constraints do not
> > respect calling scope. Suppose there is a process that needs to
> > update the timestamps in a way that temporarily breaks the invariant
> > but restores it afterwards. The only way to effect this in a
> > constraint environment is to drop the constraint, perform the
> > operation, and restore it. However, dropping a constraint is not an
> > ideal solution because there may be other unprivileged processes
> > operating on the relation that still need the constraint to be
> > enforced. There is no way to say: "There is a priviledged class of
> > methods that is allowed to violate this constraint because they are
> > trusted to restore it upon completion." Note that this is different
> > from saying "There is a priviledged class of users that is allowed
> > to violate this constraint." If you try to do something like give
> > read-only access to everybody and only write access to one user and
> > define that user to be the owner of the methods that update the data,
> > you have to follow the convention that that user only operates
> > through the defined interface, and doesn't hack the data directly.
> > That's because user-level accessibility is not the same as scope-
> > level accessibility. Whereas, if you define something like a
> > package, and say: "Package X is allowed full and complete access
> > to relation Y", and stick the interface methods in X, you still have
> > all the user-level security you want while preserving the invariants
> > in the most elegant way.
> >
> > So you can think of a package as a scope in a programming language.
> > It's like a user, but it is not a user. A user has privileges that
> > cut across scopes. Now, whether packages should be different from
> > schemas is a whole different ballgame. The purpose of a schema in
> > Postgres is not entirely clear to me. There's lots of different ways
> > to use schemas, and there is no obvious best way to use them. In
> > order to implement the accessibility features of packages, schemas
> > would have to be changed considerably. Probably a lot of users would
> > be unhappy if schemas were changed in that way. My guess is that
> > this would not be a good idea.
> >
> > I think we can get some guidance from PLs. C++ is what you call a
> > "multi-paradigm language". You can do everything from assembly to
> > metaprogramming in C++. As such, it is very loose and open in some
> > respects. C++ has two kinds of scopes: it has classes and namespaces.
> > Members of a class are encapsulated and support data hiding. Members
> > of a namespace are only loosely grouped and do not support data hiding
> > explicitly. Namespaces exist primarily to avoid name collisions.
> >
> > Java, on the other hand, decided that for OOP purity, everything must
> > be a class. That would be like making schemas into packages and
> > imposing accessibility rules on them. At the end of the day, I think
> > many PL design experts agree that making everything a class is not
> > necessarily the best way to go.
> >
> > So schemas can be like C++ namespaces - they provide a means to
> > loosely group related objects and help avoid name collisions. So
> > the package could be like a class - they provide OOP-like
> > encapsulation via accessibility rules. However, that doesn't mean
> > that nested schemas wouldn't also be a good thing. In C++, nested
> > namespaces are extremely useful when one layer of scoping does not
> > sufficiently partition the namespace to avoid frequent name
> > collisions. I think the same is true of Postgres. I certainly would
> > like to be able to use nested schema names in several contexts.
> > Instead, I have to make a choice between making different schemas,
> > or making different name prefixes. I wouldn't even mind if nested
> > schemas were only allowed to contain schemas except at the leaves of
> > the tree. Another feature that is very useful is the "using clause".
> > Combined with nested namespaces, this is a very powerful way to give
> > programmers/dbas control over names. You can give everything the
> > most natural name, and just put it in the appropriate namespace,
> > and use the namespace that is relevant to the given task at hand.
> >
> > So consider this example:
> >
> > Tables:
> > etl.import.record
> > etl.export.record
> >
> > As you can imagine, I don't really want to make an 'import' and
> > 'export' schema at the top level. There's several tables in
> > each schema, but that should illustrate the point. Then, when
> > constructing queries, it would be nice to be able to do this:
> >
> > USING etl.import
> > ;
> > SELECT *
> > FROM record
> > JOIN header ON ...
> > JOIN file ON ...
> > ;
> >
> > The effect of a USING clause would be to import the schema names
> > into the public namespace for the duration of the transaction. If
> > that leads to ambiguous names, then the parser/planner should emit an
> > error.
> >
> > __
> > David B. Held
> > Software Engineer/Array Services Group
> > 200 14th Ave. East, Sartell, MN 56377
> > 320.534.3637 320.253.7800 800.752.8129
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
> >

>
> --
> Bruce Momjian | http://candle.pha.pa.us
> pgman@candle.pha.pa.us | (610) 359-1001
> + If your life is a hard drive, | 13 Roberts Road
> + Christ can be your backup. | Newtown Square, Pennsylvania 19073
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
> (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
>


--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #10 (permalink)  
Old 04-11-2008, 04:07 AM
Jonah H. Harris
 
Posts: n/a
Default Re: Oracle Style packages on postgres

Thanks for adding this Bruce!

Is anyone going to be working on this immediately? If so, I'd be glad
to work with someone. Unfortunately, I don't have the time to devote to
taking something this big on, but I think it would be a really great
thing to have. Just let me know jharris@tvi.edu OR
jonah.harris@gmail.com. Thanks!

Bruce Momjian wrote:

>Added to TODO:
>
>* Add the features of packages
> o Make private objects accessable only to objects in the same schema
> o Allow current_schema.objname to access current schema objects
> o Add session variables
> o Allow nested schemas
>
>
>---------------------------------------------------------------------------
>
>Bruce Momjian wrote:
>
>
>>OK, so it seems we need:
>>
>> o make private objects accessable only to objects
>> in the same schema
>> o Allow current_schema.objname to access current
>> schema objects
>> o session variables
>> o nested schemas?
>>
>>---------------------------------------------------------------------------
>>
>>Dave Held wrote:
>>
>>
>>>>-----Original Message-----
>>>>From: Bruce Momjian [mailtogman@candle.pha.pa.us]
>>>>Sent: Tuesday, May 10, 2005 8:43 AM
>>>>To: Thomas Hallgren
>>>>Cc: Tom Lane; rmm@sqlisor.com; pgsql-hackers@postgresql.org
>>>>Subject: Re: [HACKERS] Oracle Style packages on postgres
>>>>
>>>>[...]
>>>>I suppose. I think we should focus on the use cases for Oracle
>>>>packages, rather than the specific functionality it provides.
>>>>What things do people need PostgreSQL to do that it already
>>>>doesn't do?
>>>>
>>>>
>>>Is that really the best way to go about things? Already RDBMSes
>>>are patchwork quilts of functionality. Is merely adding another
>>>patch the most elegant way to evolve the database? The problem is
>>>that Oracle et al are trying to be ORDBMSes and aren't exactly sure
>>>what the best way to go is. Instead of trying to formulate a
>>>rational plan for what an ORDBMS should even look like, they simply
>>>look at what would work with their existing infrastructure and tack
>>>on features. Then Postgres plays the copycat game. Instead of
>>>trying to play catch-up with Oracle, why not beat them at their own
>>>game?
>>>
>>>What packages provide is encapsulation. Hiding the data from the
>>>user and forcing him/her to use the public interface (methods).
>>>That is an important and admirable OO feature. Some people think
>>>that using the DB's security model can achieve the same thing. It
>>>can't, exactly, but there's an important lesson to be learned from
>>>the suggestion. The problem is that OOP is a *programming* paradigm,
>>>and a database is not a *programming language*. In a programming
>>>language, there really is no such thing as "security". There is
>>>only "visibility" and "accessibility". Private methods in an OOP
>>>language do not provide *security*; they only limit *accessibility*.
>>>Like so many other differences between the relational model and the
>>>OOP model, there is an impedance mismatch here. However, there is
>>>also opportunity.
>>>
>>>In an OOPL, you can say: "Users can call this method from here, but
>>>not from there." What you *can't* say is: "User X can call this
>>>method, but User Y cannot." As you can see, these are orthogonal
>>>concepts. You could call the first "accessibility by location" and
>>>the second "accessibility by authentication". An ORDBMS should
>>>support both. "Private" does not respect your identity, only your
>>>calling location. An ACL does not respect your calling scope, only
>>>your identity. A system that has both is clearly more flexible than
>>>one that only has one or the other.
>>>
>>>Now what you need to keep in mind is that each visibility model
>>>serves a different purpose. The purpose of a security model is to
>>>limit *who* can see/touch certain data because the data has intrinsic
>>>value. The purpose of an accessibility model is to limit *where* and
>>>*how* data can be seen/touched in order to preserve *program
>>>invariants*. So if you have an object (or tuple!) that records the
>>>start and stop time of some process, it is probably a logical
>>>invariant that the stop time is greater than or equal to the start
>>>time. For this reason, in a PL, you would encapsulate these fields
>>>(attributes) and only provide controlled access to update them that
>>>checks and preserves the invariant, *no matter who you are*. You
>>>don't want a superuser violating this invariant any more than Sue
>>>User.
>>>
>>>Now you might object that constraints allow you to preserve
>>>invariants as well, and indeed they do. But constraints do not
>>>respect calling scope. Suppose there is a process that needs to
>>>update the timestamps in a way that temporarily breaks the invariant
>>>but restores it afterwards. The only way to effect this in a
>>>constraint environment is to drop the constraint, perform the
>>>operation, and restore it. However, dropping a constraint is not an
>>>ideal solution because there may be other unprivileged processes
>>>operating on the relation that still need the constraint to be
>>>enforced. There is no way to say: "There is a priviledged class of
>>>methods that is allowed to violate this constraint because they are
>>>trusted to restore it upon completion." Note that this is different
>>>from saying "There is a priviledged class of users that is allowed
>>>to violate this constraint." If you try to do something like give
>>>read-only access to everybody and only write access to one user and
>>>define that user to be the owner of the methods that update the data,
>>>you have to follow the convention that that user only operates
>>>through the defined interface, and doesn't hack the data directly.
>>>That's because user-level accessibility is not the same as scope-
>>>level accessibility. Whereas, if you define something like a
>>>package, and say: "Package X is allowed full and complete access
>>>to relation Y", and stick the interface methods in X, you still have
>>>all the user-level security you want while preserving the invariants
>>>in the most elegant way.
>>>
>>>So you can think of a package as a scope in a programming language.
>>>It's like a user, but it is not a user. A user has privileges that
>>>cut across scopes. Now, whether packages should be different from
>>>schemas is a whole different ballgame. The purpose of a schema in
>>>Postgres is not entirely clear to me. There's lots of different ways
>>>to use schemas, and there is no obvious best way to use them. In
>>>order to implement the accessibility features of packages, schemas
>>>would have to be changed considerably. Probably a lot of users would
>>>be unhappy if schemas were changed in that way. My guess is that
>>>this would not be a good idea.
>>>
>>>I think we can get some guidance from PLs. C++ is what you call a
>>>"multi-paradigm language". You can do everything from assembly to
>>>metaprogramming in C++. As such, it is very loose and open in some
>>>respects. C++ has two kinds of scopes: it has classes and namespaces.
>>>Members of a class are encapsulated and support data hiding. Members
>>>of a namespace are only loosely grouped and do not support data hiding
>>>explicitly. Namespaces exist primarily to avoid name collisions.
>>>
>>>Java, on the other hand, decided that for OOP purity, everything must
>>>be a class. That would be like making schemas into packages and
>>>imposing accessibility rules on them. At the end of the day, I think
>>>many PL design experts agree that making everything a class is not
>>>necessarily the best way to go.
>>>
>>>So schemas can be like C++ namespaces - they provide a means to
>>>loosely group related objects and help avoid name collisions. So
>>>the package could be like a class - they provide OOP-like
>>>encapsulation via accessibility rules. However, that doesn't mean
>>>that nested schemas wouldn't also be a good thing. In C++, nested
>>>namespaces are extremely useful when one layer of scoping does not
>>>sufficiently partition the namespace to avoid frequent name
>>>collisions. I think the same is true of Postgres. I certainly would
>>>like to be able to use nested schema names in several contexts.
>>>Instead, I have to make a choice between making different schemas,
>>>or making different name prefixes. I wouldn't even mind if nested
>>>schemas were only allowed to contain schemas except at the leaves of
>>>the tree. Another feature that is very useful is the "using clause".
>>>Combined with nested namespaces, this is a very powerful way to give
>>>programmers/dbas control over names. You can give everything the
>>>most natural name, and just put it in the appropriate namespace,
>>>and use the namespace that is relevant to the given task at hand.
>>>
>>>So consider this example:
>>>
>>>Tables:
>>> etl.import.record
>>> etl.export.record
>>>
>>>As you can imagine, I don't really want to make an 'import' and
>>>'export' schema at the top level. There's several tables in
>>>each schema, but that should illustrate the point. Then, when
>>>constructing queries, it would be nice to be able to do this:
>>>
>>> USING etl.import
>>> ;
>>>SELECT *
>>> FROM record
>>> JOIN header ON ...
>>> JOIN file ON ...
>>> ;
>>>
>>>The effect of a USING clause would be to import the schema names
>>>into the public namespace for the duration of the transaction. If
>>>that leads to ambiguous names, then the parser/planner should emit an
>>>error.
>>>
>>>__
>>>David B. Held
>>>Software Engineer/Array Services Group
>>>200 14th Ave. East, Sartell, MN 56377
>>>320.534.3637 320.253.7800 800.752.8129
>>>
>>>---------------------------(end of broadcast)---------------------------
>>>TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>>>
>>>
>>>

>>--
>> Bruce Momjian | http://candle.pha.pa.us
>> pgman@candle.pha.pa.us | (610) 359-1001
>> + If your life is a hard drive, | 13 Roberts Road
>> + Christ can be your backup. | Newtown Square, Pennsylvania 19073
>>
>>---------------------------(end of broadcast)---------------------------
>>TIP 2: you can get off all lists at once with the unregister command
>> (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
>>
>>
>>

>
>
>



---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump