"GiST concurrency commited": a discussion in the pgsql Hackers forum, part of the PostgreSQL category.
Do we have a list named something like 'test focusing for 8.1'? If it exists, then GiST concurrency and recovery testing should be added to it, especially recovery after a crash. Of course, Oleg and I are now going to begin a large test program.

While running a test with concurrent select/insert/update/delete/vacuum/vacuum full, I found that postgres sometimes crashes in index_beginscan_internal on FunctionCall3, because the structure 'procedure' becomes zeroed. As I understand it, LockRelation can invalidate part of the Relation structure, so I moved GET_REL_PROCEDURE after LockRelation. It seems to me that this patch should be backpatched, or another fix is needed. This problem occurred 2-4 times per million statements executed by 4 flows.

And there is one more problem: approximately once per 2-4 million statements, I got traps:

TRAP: FailedAssertion("!((*curpage)->offsets_used == num_tuples)", File: "vacuum.c", Line: 2766)
LOG: server process (PID 15847) was terminated by signal 6

Sorry, but I couldn't debug this trap, and my knowledge of this piece of code is very limited. Postgres didn't create a core file. I don't believe this problem is related to my GiST framework, because it is about heap pages. I suspect the trap occurs during a concurrent vacuum, but I am not sure.

PS
My concurrency testing scripts:
http://www.sigaev.ru/gist/
concur.pl - generator of SQL statements
concur.sh - simple wrapper around concur.pl which reinitializes the db and creates the db and table.

-- 
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Teodor Sigaev <teodor@sigaev.ru> writes:
> While running a test with concurrent
> select/insert/update/delete/vacuum/vacuum full, I found that postgres
> sometimes crashes in index_beginscan_internal on FunctionCall3, because
> the structure 'procedure' becomes zeroed. As I understand it, LockRelation
> can invalidate part of the Relation structure, so I moved
> GET_REL_PROCEDURE after LockRelation.

Oooh, good catch.

> It seems to me that this patch
> should be backpatched, or another fix is needed.

No, it's not an issue in the back branches, because until recently GET_REL_PROCEDURE only fetched the function OID.

> And there is one more problem: approximately once per 2-4 million
> statements, I got traps:
> TRAP: FailedAssertion("!((*curpage)->offsets_used == num_tuples)", File:
> "vacuum.c", Line: 2766)
> LOG: server process (PID 15847) was terminated by signal 6

Odd. Will look at it later (after feature freeze), if you don't find the cause beforehand.

regards, tom lane
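The ordering problem discussed in this exchange can be illustrated with a toy model. This is only a sketch in Python, not PostgreSQL code: `ToyRelation`, `ProcInfo`, `pending_invalidation`, and both `beginscan_*` functions are hypothetical names invented for the illustration. The assumption is simply that taking the lock may process an invalidation that zeroes a cached lookup structure in place, so a reference fetched before the lock can end up pointing at zeroed data.

```python
class ProcInfo:
    """Toy cached function-lookup record; fn is None when 'zeroed'."""

    def __init__(self):
        self.fn = None


class ToyRelation:
    """Toy stand-in for a cached relation descriptor (hypothetical)."""

    def __init__(self):
        self.proc = ProcInfo()
        self.pending_invalidation = False

    def get_procedure(self):
        # Lazily fill in the cached lookup, then hand out a reference to it.
        if self.proc.fn is None:
            self.proc.fn = lambda key: "scan(%s)" % key
        return self.proc

    def lock(self):
        # Assumption: acquiring the lock processes pending invalidations,
        # which zero the cached record in place.
        if self.pending_invalidation:
            self.proc.fn = None
            self.pending_invalidation = False


def beginscan_fetch_then_lock(rel, key):
    proc = rel.get_procedure()   # reference taken before the lock
    rel.lock()                   # may zero what proc points at
    return proc.fn(key)          # fails if the record was zeroed


def beginscan_lock_then_fetch(rel, key):
    rel.lock()                   # invalidations are processed first
    proc = rel.get_procedure()   # the cached record is rebuilt if needed
    return proc.fn(key)
```

In this model the second ordering is safe because the cached record is only read after any invalidation triggered by the lock has been handled. The model also suggests why caching only an identifier, rather than a lookup structure, would not show the same failure, though that reading of the back branches is taken from the reply above, not from the code.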
I think the whole GiST limitations page can be removed now...

http://developer.postgresql.org/docs...mitations.html

Chris
>> And there is one more problem: approximately once per 2-4 million
>> statements, I got traps:
>> TRAP: FailedAssertion("!((*curpage)->offsets_used == num_tuples)", File:
>> "vacuum.c", Line: 2766)
>> LOG: server process (PID 15847) was terminated by signal 6
>
> Odd. Will look at it later (after feature freeze), if you don't find
> the cause beforehand.

It's definitely a bug in the vacuum code: I got the same trap without any GiST indexes (to reproduce, just comment out the 'create index' command in my script).

-- 
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
Sorry, fixed.

Qingqing Zhou wrote:
> "Teodor Sigaev" <teodor@sigaev.ru> writes
>
>> concur.pl - generator of SQL statements
>
> retrieving it is forbidden ...
>
> Regards,
> Qingqing

-- 
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
Christopher Kings-Lynne wrote:
> I think the whole GiST limitations page can be removed now...
>
> http://developer.postgresql.org/docs...mitations.html

Done.

-- 
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
Awhile back, Teodor Sigaev <teodor@sigaev.ru> wrote:
> And there is one more problem: approximately once per 2-4 million
> statements, I got traps:
> TRAP: FailedAssertion("!((*curpage)->offsets_used == num_tuples)", File:
> "vacuum.c", Line: 2766)
> LOG: server process (PID 15847) was terminated by signal 6
> Sorry, but I couldn't debug this trap, and my knowledge of this piece of code
> is very limited. Postgres didn't create a core file. I don't believe this
> problem is related to my GiST framework, because it is about heap pages. I
> suspect the trap occurs during a concurrent vacuum, but I am not sure.
> PS
> My concurrency testing scripts:
> http://www.sigaev.ru/gist/
> concur.pl - generator of SQL statements
> concur.sh - simple wrapper around concur.pl which reinitializes the db and creates the db and table.

I have committed changes that I believe fix this problem:
http://archives.postgresql.org/pgsql...8/msg00213.php
But it needs more testing. Would you update to CVS tip and see if you still see the failure? Also, if anyone else has some vacuum + concurrent update test cases, any testing you can do in CVS tip would be useful.

This patch is big and ugly enough that back-patching it into all the supported back branches is a pretty scary prospect. I don't think we have a lot of choice --- it is a data-loss risk --- but we need to beat the heck out of the CVS-tip version before we start pushing it into the release branches.

My current intention is to leave it just in CVS tip for the next few days, and not to start developing back-branch versions until after we've made the first 8.1 beta release. The back-ports are going to be painful (the code involved has changed often enough that I fear each branch will need a custom tailored patch) ... so I really don't want to start without some confidence that the CVS-tip patch is right.

In other words ... if you can test this ... HELP!!!

regards, tom lane
On Sat, 20 Aug 2005, Tom Lane wrote:
> I have committed changes that I believe fix this problem:
> http://archives.postgresql.org/pgsql...8/msg00213.php
> But it needs more testing. Would you update to CVS tip and see if you
> still see the failure?

I've written some quick scripts. One just vacuums constantly (999 vacuums to 1 vacuum full) while three other scripts randomly insert into, update and delete from 3 tables. There's a mix of small and large transactions. The tables have a single int column. It is set up to run 3 million transactions across the 3 scripts.

I will try and jump onto one of the larger OSDL machines to test as well.

Gavin
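A workload of the kind described in this post could be generated along the following lines. This is a minimal sketch in Python, not the actual scripts, which were not posted. The table names, the column name `i`, the value range, the transaction size ranges, and the share of large transactions are all assumptions made for the illustration; only the 999:1 vacuum ratio, the three tables, the single int column, and the mix of small and large transactions come from the description.

```python
import random


def vacuum_statements(count, full_every=1000):
    """Yield `count` vacuum commands: 999 plain VACUUMs per VACUUM FULL."""
    for n in range(1, count + 1):
        yield "VACUUM FULL;" if n % full_every == 0 else "VACUUM;"


def dml_statement(rng, tables, max_value=1000):
    """Return one random INSERT, UPDATE, or DELETE on a single int column."""
    table = rng.choice(tables)
    op = rng.choice(("insert", "update", "delete"))
    value = rng.randrange(max_value)
    if op == "insert":
        return "INSERT INTO %s (i) VALUES (%d);" % (table, value)
    if op == "update":
        new_value = rng.randrange(max_value)
        return "UPDATE %s SET i = %d WHERE i = %d;" % (table, new_value, value)
    return "DELETE FROM %s WHERE i = %d;" % (table, value)


def transaction(rng, tables, small=(1, 5), large=(100, 500), large_prob=0.1):
    """Return one transaction; size ranges and large_prob are assumed values."""
    low, high = large if rng.random() < large_prob else small
    size = rng.randint(low, high)
    body = [dml_statement(rng, tables) for _ in range(size)]
    return ["BEGIN;"] + body + ["COMMIT;"]


if __name__ == "__main__":
    rng = random.Random(2005)
    for statement in transaction(rng, ["t1", "t2", "t3"]):
        print(statement)
```

Each generator would feed its own database session, for example by piping the output into a separate client connection, so that the vacuum stream runs concurrently with the three DML streams.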
Gavin Sherry <swm@linuxworld.com.au> writes:
> I've written some quick scripts. One just vacuums constantly (999 vacuums
> to 1 vacuum full) while three other scripts randomly insert
> into, update and delete from 3 tables. There's a mix of small and large
> transactions. The tables have a single int column. It is set up to run 3
> million transactions across the 3 scripts.

Note that since the issues have mainly to do with update chains, it'd be good to stress cases where a row is updated multiple times before being deleted. And use at least one long-running transaction, so that VACUUM can't just throw away the update chain.

regards, tom lane
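The two suggestions in this reply can be sketched as statement generators. This is an illustrative Python sketch under stated assumptions: the table name `t1` and column name `i` are hypothetical, and using a serializable transaction that runs one query and then stays open is only one assumed way to keep a long-running transaction around; whether it holds back VACUUM in a given setup should be checked rather than taken from this example.

```python
def update_chain(table, start, updates):
    """Insert one row, update it `updates` times, then delete it."""
    statements = ["INSERT INTO %s (i) VALUES (%d);" % (table, start)]
    value = start
    for _ in range(updates):
        # Each update targets the version produced by the previous one,
        # so the same logical row is updated repeatedly.
        statements.append(
            "UPDATE %s SET i = %d WHERE i = %d;" % (table, value + 1, value)
        )
        value += 1
    statements.append("DELETE FROM %s WHERE i = %d;" % (table, value))
    return statements


def long_running_session(table):
    """Statements for a session that opens a transaction and holds it open."""
    return {
        # Assumption: run these first and keep the connection idle afterwards.
        "open": [
            "BEGIN ISOLATION LEVEL SERIALIZABLE;",
            "SELECT count(*) FROM %s;" % table,
        ],
        # Run this only after the update-chain workload has finished.
        "close": ["COMMIT;"],
    }


if __name__ == "__main__":
    for statement in update_chain("t1", 10, 3):
        print(statement)
```

In a real run, distinct chains would need non-overlapping value ranges (for example, `start` values spaced further apart than `updates`) so that one chain's `WHERE` clause does not match rows from another.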
|
|