vBulletin Search Engine Optimization
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Hi, I'm on Oracle 9.2.0.5, Open VMS. I perform an INSERT INTO target SELECT * FROM source If I do this noparallel, the result from the select is immediately inserted into the target. If I do this in parallel, the result of the selects from a slave set are first buffered in TEMP. After this is done, a slave set starts to insert the contents from the buffers into the target. Why this timeconsuming buffering? Can anything be done? While I am writing this, I wonder: I let Oracle decide which degree of parallellization to use. Maybe buffering only takes places if the degree differs between SELECT and INSERT? (I should test myself, but don't have the opportunity right now). Regards, Jaap. |
| |||
| On Nov 22, 11:23 am, j.w.vandijk.removet...@hetnet.nl (Jaap W. van Dijk) wrote: > Hi, > > I'm on Oracle 9.2.0.5, Open VMS. > > I perform an > > INSERT INTO target > SELECT * FROM source > > If I do this noparallel, the result from the select is immediately > inserted into the target. > > If I do this in parallel, the result of the selects from a slave set > are first buffered in TEMP. After this is done, a slave set starts to > insert the contents from the buffers into the target. > > Why this timeconsuming buffering? Can anything be done? > > While I am writing this, I wonder: I let Oracle decide which degree of > parallellization to use. Maybe buffering only takes places if the > degree differs between SELECT and INSERT? (I should test myself, but > don't have the opportunity right now). > > Regards, Jaap. IMHO parallelization is pretty useless if the affected tables aren't striped over several disks. I remember Tom Kyte stating somewhere you should use parallelization only if you can't get rid of full table scans at all. Also parallelization always will use parallel server slaves and a query coordinator (the original session), so I bet this is what you are looking at right now: data is retrieved by slaves and inserted by the coordinator. And yes, this is a 'feature', while you seem to qualify it as a bug. -- Sybrand Bakker Senior Oracle DBA |
| |||
| Jaap W. van Dijk <j.w.vandijk.removethis@hetnet.nl> schreef in bericht > Hi, > I'm on Oracle 9.2.0.5, Open VMS. > > I perform an > INSERT INTO target > SELECT * FROM source > > If I do this noparallel, the result from the select is immediately inserted into the target. > > If I do this in parallel, the result of the selects from a slave set > are first buffered in TEMP. After this is done, a slave set starts to > insert the contents from the buffers into the target. > > Why this timeconsuming buffering? Can anything be done? > > While I am writing this, I wonder: I let Oracle decide which degree of > parallellization to use. Maybe buffering only takes places if the > degree differs between SELECT and INSERT? (I should test myself, but > don't have the opportunity right now). > Regards, Jaap. Well Jaap, If you insert less then 100.000 rows a parallel insert might not be faster then a normal insert /*+APPEND */. Check out the APPEND hint. If you still want to do the parallel insert without TEMP usage then drop the indexes on the target table. Oracle first inserts the table data; then does some sorting on the new index data and rebuilds them. No index > no sorts needed. Cheers, Roelof Schierbeek; DBA |
| |||
| On 22 Nov 2006 03:54:40 -0800, "sybrandb" <sybrandb@gmail.com> wrote: > > >On Nov 22, 11:23 am, j.w.vandijk.removet...@hetnet.nl (Jaap W. van >Dijk) wrote: >> Hi, >> >> I'm on Oracle 9.2.0.5, Open VMS. >> >> I perform an >> >> INSERT INTO target >> SELECT * FROM source >> >> If I do this noparallel, the result from the select is immediately >> inserted into the target. >> >> If I do this in parallel, the result of the selects from a slave set >> are first buffered in TEMP. After this is done, a slave set starts to >> insert the contents from the buffers into the target. >> >> Why this timeconsuming buffering? Can anything be done? >> >> While I am writing this, I wonder: I let Oracle decide which degree of >> parallellization to use. Maybe buffering only takes places if the >> degree differs between SELECT and INSERT? (I should test myself, but >> don't have the opportunity right now). >> >> Regards, Jaap. > >IMHO parallelization is pretty useless if the affected tables aren't >striped over several disks. >I remember Tom Kyte stating somewhere you should use parallelization >only if you can't get rid of full table scans at all. >Also parallelization always will use parallel server slaves and a query >coordinator (the original session), so I bet this is what you are >looking at right now: data is retrieved by slaves and inserted by the >coordinator. >And yes, this is a 'feature', while you seem to qualify it as a bug. > >-- >Sybrand Bakker >Senior Oracle DBA > Maybe the coordinator wants to coordinate between select and insert , but it doesn't insert the records itself: that is also done by a slave set. I took a look at the Concept Manual. There inter-operation parallellism is discussed, where data flows directly from one slave set to the next (fig 18-3 on page 18-7). What is pictured there - two slave sets doing a select and a sort - could IMO as easily have been two slave sets doing a select and an insert, without any buffering. As I said before: maybe the optimizer needs to know beforehand that the degree of select and insert is the same, so when I have the opportunity, I perform a test. Regarding the parallel gain: selecting also takes an amount of CPU, and parallellizing that improves the select noticably, even if everything is read from the same disk, especially if the select is part of the transformation step of the load of a datawarehouse, with lots of complex case statements in the select list, consuming lots of CPU. Regards, Jaap. |
| |||
| Jaap W. van Dijk wrote: > Hi, > > I'm on Oracle 9.2.0.5, Open VMS. > > I perform an > > INSERT INTO target > SELECT * FROM source > > If I do this noparallel, the result from the select is immediately > inserted into the target. > > If I do this in parallel, the result of the selects from a slave set > are first buffered in TEMP. After this is done, a slave set starts to > insert the contents from the buffers into the target. > > Why this timeconsuming buffering? Can anything be done? > > While I am writing this, I wonder: I let Oracle decide which degree of > parallellization to use. Maybe buffering only takes places if the > degree differs between SELECT and INSERT? (I should test myself, but > don't have the opportunity right now). > > Regards, Jaap. The second time around is always faster if you've run the query and not flushed sga. Parallel always reads from disk. No parallel uses buffer cache. Parallel needs to sort hence the temp. Johnathan Lewis Cost-based Oracle has some good info on this. I like tkyes stuff too, but sometimes Kytes stuff is not as Warehouse as I like. Parallel is great if data is huge and the entire machine can be devoted to a single session. Otherwise, it can cause pain. Why is it an issue if you can do it fast w/o parallel? |
| |||
| R. Schierbeek wrote: > Jaap W. van Dijk <j.w.vandijk.removethis@hetnet.nl> schreef in bericht > > Hi, > > I'm on Oracle 9.2.0.5, Open VMS. > > > > I perform an > > INSERT INTO target > > SELECT * FROM source > > > > If I do this noparallel, the result from the select is immediately inserted into the target. > > > > If I do this in parallel, the result of the selects from a slave set > > are first buffered in TEMP. After this is done, a slave set starts to > > insert the contents from the buffers into the target. > > > > Why this timeconsuming buffering? Can anything be done? > > > > While I am writing this, I wonder: I let Oracle decide which degree of > > parallellization to use. Maybe buffering only takes places if the > > degree differs between SELECT and INSERT? (I should test myself, but > > don't have the opportunity right now). > > Regards, Jaap. > > Well Jaap, > If you insert less then 100.000 rows a parallel insert might not be faster then > a normal insert /*+APPEND */. Check out the APPEND hint. > > If you still want to do the parallel insert without TEMP usage then > drop the indexes on the target table. Oracle first inserts the table data; > then does some sorting on the new index data and rebuilds them. > No index > no sorts needed. > > Cheers, > Roelof Schierbeek; DBAb So if I offline my tempspace and run a parallel query it will work as long as the table being queried has not indexes? |
| |||
| EscVector wrote: > R. Schierbeek wrote: > > Jaap W. van Dijk <j.w.vandijk.removethis@hetnet.nl> schreef in bericht > > > Hi, > > > I'm on Oracle 9.2.0.5, Open VMS. > > > > > > I perform an > > > INSERT INTO target > > > SELECT * FROM source > > > > > > If I do this noparallel, the result from the select is immediately inserted into the target. > > > > > > If I do this in parallel, the result of the selects from a slave set > > > are first buffered in TEMP. After this is done, a slave set starts to > > > insert the contents from the buffers into the target. > > > > > > Why this timeconsuming buffering? Can anything be done? > > > > > > While I am writing this, I wonder: I let Oracle decide which degree of > > > parallellization to use. Maybe buffering only takes places if the > > > degree differs between SELECT and INSERT? (I should test myself, but > > > don't have the opportunity right now). > > > Regards, Jaap. > > > > Well Jaap, > > If you insert less then 100.000 rows a parallel insert might not be faster then > > a normal insert /*+APPEND */. Check out the APPEND hint. > > > > If you still want to do the parallel insert without TEMP usage then > > drop the indexes on the target table. Oracle first inserts the table data; > > then does some sorting on the new index data and rebuilds them. > > No index > no sorts needed. > > > > Cheers, > > Roelof Schierbeek; DBAb > > So if I offline my tempspace and run a parallel query it will work as > long as the table being queried has not indexes? Offline by meaning I kill the temp file to make sure it can't be used... |
| |||
| "EscVector" <Junk@webthere.com> wrote > R. Schierbeek wrote: >> Jaap W. van Dijk <j.w.vandijk.removethis@hetnet.nl> schreef in bericht >> > Hi, >> > I'm on Oracle 9.2.0.5, Open VMS. >> > I perform an >> > INSERT INTO target SELECT * FROM source >> > >> > If I do this noparallel, the result from the select is immediately inserted into the target. >> > >> > If I do this in parallel, the result of the selects from a slave set >> > are first buffered in TEMP. After this is done, a slave set starts to >> > insert the contents from the buffers into the target. >> > >> > Why this timeconsuming buffering? Can anything be done? >> > >> > While I am writing this, I wonder: I let Oracle decide which degree of >> > parallellization to use. Maybe buffering only takes places if the >> > degree differs between SELECT and INSERT? (I should test myself, but >> > don't have the opportunity right now). >> > Regards, Jaap. >> >> Well Jaap, >> If you insert less then 100.000 rows a parallel insert might not be faster then >> a normal insert /*+APPEND */. Check out the APPEND hint. >> >> If you still want to do the parallel insert without TEMP usage then >> drop the indexes on the target table. Oracle first inserts the table data; >> then does some sorting on the new index data and rebuilds them. >> No index > no sorts needed. >> >> Cheers, >> Roelof Schierbeek; DBAb > > So if I offline my tempspace and run a parallel query it will work as > long as the table being queried has not indexes? Yes it will work as long as the table being inserted into has not indexes. If you have indexes - well go ahead and make my day ! And take a look at temp/sort table while u're inserting, eg v$sort_usage or v$sort_segment: col current_users for 999 head 'curr|users' col FREED_EXTENTS for 999 head 'FREED|EXTENTS' col MAX_USED_SIZE head 'MAX_USED|SIZE' select current_users,total_extents , (USED_BLOCKS * &&block_size)/1024 used_temp , (MAX_USED_BLOCKS *&&block_size)/1024 max_used ,freed_extents, extent_hits ,max_size, max_used_size from v$sort_segment Cheers Roelof Schierbeek; DBA |
| |||
| On 22 Nov 2006 07:03:39 -0800, "EscVector" <Junk@webthere.com> wrote: > >Jaap W. van Dijk wrote: >> Hi, >> >> I'm on Oracle 9.2.0.5, Open VMS. >> >> I perform an >> >> INSERT INTO target >> SELECT * FROM source >> >> If I do this noparallel, the result from the select is immediately >> inserted into the target. >> >> If I do this in parallel, the result of the selects from a slave set >> are first buffered in TEMP. After this is done, a slave set starts to >> insert the contents from the buffers into the target. >> >> Why this timeconsuming buffering? Can anything be done? >> >> While I am writing this, I wonder: I let Oracle decide which degree of >> parallellization to use. Maybe buffering only takes places if the >> degree differs between SELECT and INSERT? (I should test myself, but >> don't have the opportunity right now). >> >> Regards, Jaap. > >The second time around is always faster if you've run the query and not >flushed sga. Parallel always reads from disk. No parallel uses buffer >cache. Parallel needs to sort hence the temp. Johnathan Lewis >Cost-based Oracle has some good info on this. I like tkyes stuff too, >but sometimes Kytes stuff is not as Warehouse as I like. Parallel is >great if data is huge and the entire machine can be devoted to a single >session. Otherwise, it can cause pain. Why is it an issue if you can >do it fast w/o parallel? > Because it needs to be faster. noparallel ==> slow select , slow insert, no buffering parallel ==> fast select, fast insert, but buffering. The fast select and insert more than outweigh the buffering, so this is faster than noparallel. But it would be really fast if I could do the select and insert parallel without the buffering. Regards, Jaap. |
| ||||
| On Wed, 22 Nov 2006 14:14:00 GMT, j.w.vandijk.removethis@hetnet.nl (Jaap W. van Dijk) wrote: >On 22 Nov 2006 03:54:40 -0800, "sybrandb" <sybrandb@gmail.com> wrote: > >> >> >>On Nov 22, 11:23 am, j.w.vandijk.removet...@hetnet.nl (Jaap W. van >>Dijk) wrote: >>> Hi, >>> >>> I'm on Oracle 9.2.0.5, Open VMS. >>> >>> I perform an >>> >>> INSERT INTO target >>> SELECT * FROM source >>> >>> If I do this noparallel, the result from the select is immediately >>> inserted into the target. >>> >>> If I do this in parallel, the result of the selects from a slave set >>> are first buffered in TEMP. After this is done, a slave set starts to >>> insert the contents from the buffers into the target. >>> >>> Why this timeconsuming buffering? Can anything be done? >>> >>> While I am writing this, I wonder: I let Oracle decide which degree of >>> parallellization to use. Maybe buffering only takes places if the >>> degree differs between SELECT and INSERT? (I should test myself, but >>> don't have the opportunity right now). >>> >>> Regards, Jaap. >> >>IMHO parallelization is pretty useless if the affected tables aren't >>striped over several disks. >>I remember Tom Kyte stating somewhere you should use parallelization >>only if you can't get rid of full table scans at all. >>Also parallelization always will use parallel server slaves and a query >>coordinator (the original session), so I bet this is what you are >>looking at right now: data is retrieved by slaves and inserted by the >>coordinator. >>And yes, this is a 'feature', while you seem to qualify it as a bug. >> >>-- >>Sybrand Bakker >>Senior Oracle DBA >> > >Maybe the coordinator wants to coordinate between select and insert , >but it doesn't insert the records itself: that is also done by a slave >set. > >I took a look at the Concept Manual. There inter-operation >parallellism is discussed, where data flows directly from one slave >set to the next (fig 18-3 on page 18-7). What is pictured there - two >slave sets doing a select and a sort - could IMO as easily have been >two slave sets doing a select and an insert, without any buffering. As >I said before: maybe the optimizer needs to know beforehand that the >degree of select and insert is the same, so when I have the >opportunity, I perform a test. > >Regarding the parallel gain: selecting also takes an amount of CPU, >and parallellizing that improves the select noticably, even if >everything is read from the same disk, especially if the select is >part of the transformation step of the load of a datawarehouse, with >lots of complex case statements in the select list, consuming lots of >CPU. > >Regards, Jaap. After some more reading and thinking: I thought that data got PIPED through the slave set processes, so the second slave set consumes immediately what the first set produces. This certainly is what figure 18-3 in the Concept manual is suggesting. But instead produce from the first slave set is buffered in totality first, and then this is processed by the second slave set. A similar difference as between the ordinary table function and the pipelined table function. I don't see any functional impossibilities, so I wonder why Oracle didn't implement the pipelined version, which would be much faster. Regards, Jaap. |