Unix Technical Forum

SEO

vBulletin Search Engine Optimization


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > pgsql Hackers

Register FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-12-2008, 07:28 AM
Zoltan Boszormenyi
 
Posts: n/a
Default Is there a way to run heap_insert() AFTER ExecInsertIndexTuples()?

Hi,

I would like to be able to harden the conditions
of generating IDENTITY columns so the
events below run in this order:

- assign values for regular columns (with or without DEFAULT)
- NOT NULL checks on regular columns
- CHECK constraints whose expression contains only regular columns

- assign values for GENERATED columns
- NOT NULL checks on GENERATED columns
- CHECK constraints whose expression may contain regular
or GENERATED columns

- UNIQUE index checks that has only regular columns
- UNIQUE index checks that may have regular or GENERATED columns

- assign values for IDENTITY column
- NOT NULL on IDENTITY
- CHECK constraints on IDENTITY
- UNIQUE index checks that can reference IDENTITY column

At this point the heap tuple and the index tuple can be inserted
without further checks.

Currently tuple->t_self is required by ExecInsertIndexTuples()
and I don't see any way to make IDENTITY work the way it's
intended but to mix heap_insert()/heap_update() and
ExecInsertIndexTuples() together and use the result in
ExecInsert() and ExecUpdate().

Would it be acceptable?

Best regards,
Zoltán Böszörményi


---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-12-2008, 07:28 AM
Tom Lane
 
Posts: n/a
Default Re: Is there a way to run heap_insert() AFTER ExecInsertIndexTuples() ?

Zoltan Boszormenyi <zboszor@dunaweb.hu> writes:
> Would it be acceptable?


No, because you can't create index entries when you haven't yet got the
TID for the heap tuple. What do you propose doing, insert a dummy index
entry and then go back to fill it in later? Aside from approximately
doubling the work involved, this is fundamentally broken because no
other backend could know what to do upon encountering the dummy index
entry --- there's no way for it to check if the entry references a live
tuple or not. Not to mention that a crash here will leave a permanently
dummy index entry that there's no way to vacuum.

The other rearrangements you suggest are not any more acceptable;
we are not going to restructure the entire handling of defaults and
check constraints around a single badly-designed SQL2003 feature.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-12-2008, 07:28 AM
Zoltan Boszormenyi
 
Posts: n/a
Default Re: Is there a way to run heap_insert() AFTER ExecInsertIndexTuples()?

Hi,

Tom Lane írta:
> Zoltan Boszormenyi <zboszor@dunaweb.hu> writes:
>
>> Would it be acceptable?
>>

>
> No, because you can't create index entries when you haven't yet got the
> TID for the heap tuple. What do you propose doing, insert a dummy index
> entry and then go back to fill it in later? Aside from approximately
>


No, I was thinking about breaking up e.g. heap_insert()
to be able to mix with ExecInsertIndexTuples() so I get a
pinned buffer and have the heap_tuple's t_self set first
then doing the uniqueness checks step by step.
BTW, can I use modify_tuple() after doing
RelationGetBufferForTuple() and RelationPutHeapTuple(),
right?

> doubling the work involved, this is fundamentally broken because no
>


Well, the work wouldn't be doubled as all the unique indexes
have to be checked anyway with the current way, too, to have
the tuple accepted into the database.

> other backend could know what to do upon encountering the dummy index
> entry --- there's no way for it to check if the entry references a live
> tuple or not. Not to mention that a crash here will leave a permanently
> dummy index entry that there's no way to vacuum.
>
> The other rearrangements you suggest are not any more acceptable;
> we are not going to restructure the entire handling of defaults and
> check constraints around a single badly-designed SQL2003 feature.
>


My IDENTITY/GENERATED patch broke up the
checks currently this way (CHECK constraints are prohibited
for special case columns):

- normal columns are assigned values (maybe using DEFAULT)
- check NOT NULLs and CHECKs for normal columns

( Up to this point this works the same way as before if you don't
use neither IDENTITY nor GENERATED. )

- assign GENERATED with ther values
- check NOT NULLs for GENERATED
- assign IDENTITY with value
- check NOT NULL for IDENTITY

and

- check UNIQUE for everything

Identity would be special so it doesn't inflate the sequence
if avoidable. Currently the only way if UNIQUE fails
for any index which is still very much makes it unusable.

What I would like to achieve is for IDENTITY to skip
a sequence value and fail to be INSERTed if the IDENTITY
column's uniqe check is failed. Which pretty much means
that there is already a record with that IDENTITY value
regardless of the UNIQUE index is defined for only the IDENTITY
column or the IDENTITY column is part of a multi-column
UNIQUE index.

If I could broke up the order of events the way I described
in my first mail, I could re-enable having CHECK constraints
for both IDENTITY and GENERATED columns.

The point with GENERATED is you have to have
all other columns assigned with values BEFORE
being able to compute a GENERATED column
that reference other columns in its expression so
you _have to_ break up the current order of computing
DEFAULTs. I know a computed column could be done
either in the application or with SELECTs but compare
the amount of work: if you do it in the SELECT you have to
compute the expressions every time the SELECT is run
making it slower. Doing it on UPDATE or INSERT
makes it LESS work in a fewer INSERT/UPDATE +
heavy SELECT workload. Of course, for a heavy UPDATE
workload it makes it more work but only if you actually
use GENERATED columns. It means exatly the same
amount of work if you use IDENTITY as with SERIAL,
it's just made in different order.

The GENERATED column is an easy of use feature
with possibly having less work, whereas the IDENTITY
column is mandatory for some applications (e.g. accounting
and billing is stricter in some countries) where you simply
cannot skip a value in the sequence, the strict monotonity is
not enough.

Best regards,
Zoltán Böszörményi


---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-12-2008, 07:28 AM
Mageshwaran
 
Posts: n/a
Default Postgres Replication

Hi,

Some body help me regarding postgres replication, Give me some ideas .

Thanks in advance

Regards
J Mageshwaran


********** DISCLAIMER **********
Information contained and transmitted by this E-MAIL is proprietary to
Sify Limited and is intended for use only by the individual or entity to
which it is addressed, and may contain information that is privileged,
confidential or exempt from disclosure under applicable law. If this is a
forwarded message, the content of this E-MAIL may not have been sent with
the authority of the Company. If you are not the intended recipient, an
agent of the intended recipient or a person responsible for delivering the
information to the named recipient, you are notified that any use,
distribution, transmission, printing, copying or dissemination of this
information in any way or in any manner is strictly prohibited. If you have
received this communication in error, please delete this mail & notify us
immediately at admin@sifycorp.com


Watch the latest updates on Mumbai, with video coverage of news, events,
Bollywood, live darshan from Siddhivinayak temple and more, only on
www.mumbailive.in

Watch the hottest videos from Bollywood, Fashion, News and more only on
www.sifymax.com


---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 04-12-2008, 07:28 AM
Andrew Dunstan
 
Posts: n/a
Default Re: Postgres Replication

Mageshwaran wrote:
> Hi,
>
> Some body help me regarding postgres replication, Give me some ideas .
>
> Thanks in advance
>
>


Here are some ideas:

lose the idiotic, pointless and inaccurate email addendum, especially
the ads for bollywood etc.

do some research yourself, especially by reading the Postgres
documentation - google is also your friend

ask questions in the correct forum, which for this type of question this
is not.

do not create mail on a new subject by replying to an old email and
changing the subject line


cheers

andrew


---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 04-12-2008, 07:29 AM
Florian G. Pflug
 
Posts: n/a
Default Re: Is there a way to run heap_insert() AFTER ExecInsertIndexTuples()?

Zoltan Boszormenyi wrote:
> The GENERATED column is an easy of use feature
> with possibly having less work, whereas the IDENTITY
> column is mandatory for some applications (e.g. accounting
> and billing is stricter in some countries) where you simply
> cannot skip a value in the sequence, the strict monotonity is
> not enough.


But just postponing nextval() until after the uniqueness checks
only decreases the *probability* of non-monotonic values, and
*does not* preven them. Consindert two transactions

A: begin ;
B: Begin ;
A: insert ... -- IDENTITY generates value 1
B: insert .. -- IDENTITY generates value 2
A: rollback ;
B: commit ;

Now there is a record with IDENTITY 2, but not with 1. The *only*
way to fix this is to *not* use a sequence, but rather do
lock table t in exclusive mode ;
select max(identity)+1 from t ;
to generate the identity - but of course this prevents any concurrent
inserts, which will make this unuseable for any larger database.

Note that this is not a deficency of postgres sequences - there is no
way to guarantee stricly monotonic values while allowing concurrent
selects at the same time. (Other than lazyly assigning the values, but
this needs to be done by the application)

I agree that I'd be nice to generate the identity columns as late as
possible to prevents needless gaps, but not if price is a for more
intrusive patch, or much higher complexity.

greetings, Florian Pflug


---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #7 (permalink)  
Old 04-12-2008, 07:29 AM
Zoltan Boszormenyi
 
Posts: n/a
Default Re: Is there a way to run heap_insert() AFTER ExecInsertIndexTuples()?

Florian G. Pflug Ã*rta:
> Zoltan Boszormenyi wrote:
>> The GENERATED column is an easy of use feature
>> with possibly having less work, whereas the IDENTITY
>> column is mandatory for some applications (e.g. accounting
>> and billing is stricter in some countries) where you simply
>> cannot skip a value in the sequence, the strict monotonity is
>> not enough.

>
> But just postponing nextval() until after the uniqueness checks
> only decreases the *probability* of non-monotonic values, and
> *does not* preven them. Consindert two transactions
>
> A: begin ;
> B: Begin ;
> A: insert ... -- IDENTITY generates value 1
> B: insert .. -- IDENTITY generates value 2
> A: rollback ;
> B: commit ;


I can understand that. But your example is faulty,
you can't have transaction inside a transaction.
Checkpoints are another story. 8-)

You can have some application tricks to
have continous sequence today with regular
serials but only if don't have a unique index
that doesn't use the serial column. Inserting
a record to that table outside the transaction,
making note of the serial value.

If subsequent processing fails (because of unique,
check constraint, etc) you have to go back to the main
table and modify the record, indicating that the record
isn't representing valid data. But you must keep it with
the serial value it was assigned. I have seen systems
requiring this. My point is that with the identity
column, you will be able to define unique index
on the table that exludes the identity column.

> Now there is a record with IDENTITY 2, but not with 1. The *only*
> way to fix this is to *not* use a sequence, but rather do
> lock table t in exclusive mode ;
> select max(identity)+1 from t ;
> to generate the identity - but of course this prevents any concurrent
> inserts, which will make this unuseable for any larger database.
>
> Note that this is not a deficency of postgres sequences - there is no
> way to guarantee stricly monotonic values while allowing concurrent
> selects at the same time. (Other than lazyly assigning the values, but
> this needs to be done by the application)


Agreed.

> I agree that I'd be nice to generate the identity columns as late as
> possible to prevents needless gaps, but not if price is a for more
> intrusive patch, or much higher complexity.


Intrusive, hm? The catalog have to indicate that the column
is IDENTITY, otherwise you cannot know it.

The cost I am thinking now is an extra heap_update()
after heap_insert() without generating the identity value
and inserting index tuples to indexes that doesn't
contain the identity column.

Best regards,
Zoltán Böszörményi


---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #8 (permalink)  
Old 04-12-2008, 07:29 AM
Zoltan Boszormenyi
 
Posts: n/a
Default Re: Is there a way to run heap_insert() AFTER ExecInsertIndexTuples()?

Zoltan Boszormenyi Ã*rta:
> Florian G. Pflug Ã*rta:
>> Zoltan Boszormenyi wrote:
>>> The GENERATED column is an easy of use feature
>>> with possibly having less work, whereas the IDENTITY
>>> column is mandatory for some applications (e.g. accounting
>>> and billing is stricter in some countries) where you simply
>>> cannot skip a value in the sequence, the strict monotonity is
>>> not enough.

>>
>> But just postponing nextval() until after the uniqueness checks
>> only decreases the *probability* of non-monotonic values, and
>> *does not* preven them. Consindert two transactions
>>
>> A: begin ;
>> B: Begin ;
>> A: insert ... -- IDENTITY generates value 1
>> B: insert .. -- IDENTITY generates value 2
>> A: rollback ;
>> B: commit ;

>
> I can understand that. But your example is faulty,
> you can't have transaction inside a transaction.
> Checkpoints are another story. 8-)
>
> You can have some application tricks to
> have continous sequence today with regular
> serials but only if don't have a unique index
> that doesn't use the serial column. Inserting
> a record to that table outside the transaction,
> making note of the serial value.
>
> If subsequent processing fails (because of unique,
> check constraint, etc) you have to go back to the main
> table and modify the record, indicating that the record
> isn't representing valid data. But you must keep it with
> the serial value it was assigned. I have seen systems
> requiring this. My point is that with the identity
> column, you will be able to define unique index
> on the table that exludes the identity column.
>
>> Now there is a record with IDENTITY 2, but not with 1. The *only*
>> way to fix this is to *not* use a sequence, but rather do
>> lock table t in exclusive mode ;
>> select max(identity)+1 from t ;
>> to generate the identity - but of course this prevents any concurrent
>> inserts, which will make this unuseable for any larger database.
>>
>> Note that this is not a deficency of postgres sequences - there is no
>> way to guarantee stricly monotonic values while allowing concurrent
>> selects at the same time. (Other than lazyly assigning the values, but
>> this needs to be done by the application)

>
> Agreed.
>
>> I agree that I'd be nice to generate the identity columns as late as
>> possible to prevents needless gaps, but not if price is a for more
>> intrusive patch, or much higher complexity.

>
> Intrusive, hm? The catalog have to indicate that the column
> is IDENTITY, otherwise you cannot know it.
>
> The cost I am thinking now is an extra heap_update()
> after heap_insert() without generating the identity value
> and inserting index tuples to indexes that doesn't
> contain the identity column.


And as far as I tested the current state, there is no cost
if you don't use GENERATED or IDENTITY.
The extra heap_update() would be performed only
if you have an IDENTITY colum.

> Best regards,
> Zoltán Böszörményi
>
>



---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #9 (permalink)  
Old 04-12-2008, 07:29 AM
Florian G. Pflug
 
Posts: n/a
Default Re: Is there a way to run heap_insert() AFTER ExecInsertIndexTuples()?

Zoltan Boszormenyi wrote:
> Florian G. Pflug Ã*rta:
>> Zoltan Boszormenyi wrote:
>>> The GENERATED column is an easy of use feature
>>> with possibly having less work, whereas the IDENTITY
>>> column is mandatory for some applications (e.g. accounting
>>> and billing is stricter in some countries) where you simply
>>> cannot skip a value in the sequence, the strict monotonity is
>>> not enough.

>>
>> But just postponing nextval() until after the uniqueness checks
>> only decreases the *probability* of non-monotonic values, and
>> *does not* preven them. Consindert two transactions
>>
>> A: begin ;
>> B: Begin ;
>> A: insert ... -- IDENTITY generates value 1
>> B: insert .. -- IDENTITY generates value 2
>> A: rollback ;
>> B: commit ;

>
> I can understand that. But your example is faulty,
> you can't have transaction inside a transaction.
> Checkpoints are another story. 8-)


A: and B: are meant to denote *two* *different*
transactions running concurrently.

> You can have some application tricks to
> have continous sequence today with regular
> serials but only if don't have a unique index
> that doesn't use the serial column. Inserting
> a record to that table outside the transaction,
> making note of the serial value.
>
> If subsequent processing fails (because of unique,
> check constraint, etc) you have to go back to the main
> table and modify the record, indicating that the record
> isn't representing valid data. But you must keep it with
> the serial value it was assigned. I have seen systems
> requiring this. My point is that with the identity
> column, you will be able to define unique index
> on the table that exludes the identity column.


Yes, of course you can prevent gaps by just filling them
with garbage/invalid records of whatever. But I don't see
why this is usefull - either you want, say, your invoice
number to be continuous because it's required by law - or
you don't. But if the law required your invoice numbers to be
continous, surely just filling the gaps with fake invoices
it just as illegal as having gaps in the first place.

>> I agree that I'd be nice to generate the identity columns as late as
>> possible to prevents needless gaps, but not if price is a for more
>> intrusive patch, or much higher complexity.

>
> Intrusive, hm? The catalog have to indicate that the column
> is IDENTITY, otherwise you cannot know it.
>
> The cost I am thinking now is an extra heap_update()
> after heap_insert() without generating the identity value
> and inserting index tuples to indexes that doesn't
> contain the identity column.


I'll have to admit that I haven't actually looked at your patch -
so sorry if I missunderstood things. I got the impression that
tom's main complaint was that you are shuffling too much existing code
around in your patch, and I figured that this is partly because you
try to generate the IDENTITY value as late as possible. Since doing
this won't prevent gaps, but just reduces the probability of creating
them, I thought that a way around tom's concerns might be to drop
that requirement.

I will shut up now, at least until I have read the patch ;-)

greetings, Florian Pflug



---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #10 (permalink)  
Old 04-12-2008, 07:29 AM
Zoltan Boszormenyi
 
Posts: n/a
Default Re: Is there a way to run heap_insert() AFTER ExecInsertIndexTuples()?

Florian G. Pflug Ã*rta:
> Yes, of course you can prevent gaps by just filling them
> with garbage/invalid records of whatever. But I don't see
> why this is usefull - either you want, say, your invoice
> number to be continuous because it's required by law - or
> you don't. But if the law required your invoice numbers to be
> continous, surely just filling the gaps with fake invoices
> it just as illegal as having gaps in the first place.


Not fake invoice, "stornoed" for whatever reason.
But you have to keep the record to show you didn't delete anything.

Best regards,
Zoltán Böszörményi


---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 02:04 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
UnixAdminTalk.com

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402