Unix Technical Forum

Re: how to copy large data fast

This is a discussion on Re: how to copy large data fast within the Informix forums, part of the Database Server Software category; --> >From: "Keith Simmons" <smiley73@googlemail.com> > > >Need to watch logging, both databases need to be the same so there ...


Go Back   Unix Technical Forum > Database Server Software > Informix

Register FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-20-2008, 02:58 PM
Ian Michael Gumby
 
Posts: n/a
Default Re: how to copy large data fast




>From: "Keith Simmons" <smiley73@googlemail.com>


> >

>Need to watch logging, both databases need to be the same so there may
>be a requirement for lots of log space on the target to avoid long tx,
>also the number of locks used may come into play if the tables have a
>large number of records. However that aside I agree this is the
>fastest method.
>
>Keith


Locks?
Hmmm. I guess you haven't tried to lock the table so that there is only 1
lock per table per database?

Ok, so here's you other "problem".

If you chose the simplest method, of a single insert statement with an
embedded select statement, yes, you will have an issue of a long query in
the IDS 10 database.
Considering that this a new box being built, there shouldn't be anyone else
on the box and you can control your log file issues. (Or use rowid to
control the select and insert size.... ;-)

If you were going to write your own esql/c program, you would run in to the
standard librarian/reader mutex issue. One thread to perform the select
against IDS 7.x and n threads to read.

Haven't played with HPL, but does it allow for a select statement as its
input or a named pipe?
While you'll have some parallelization, you'll also have additional overhead
to contend with.

Now I know that there are other methods that could be "faster" like using a
combination of a shared memory queue and multiple writers and readers for
each table. (Note: you could create a system where you have a shared memory
queue for each table you are downloading. For each shared memory queue, you
have multiple readers where each reader opens a TCP/IP socket to the other
(IDS 10) machine, and on each of those sockets, you have one reader and
multiple writers, each being able to control how many transactions they want
per commit. (Which would flush the logs.) This would be a fairly complex
solution, however, how much faster would it be? You're going to be limited
by network bandwidth, amount of memory, processor speed, and disk speed.)

So, to wit,
The simplest and probably most efficient solution would be to account for
the table sizes with respect to logging, and then use an insert statement
with an embedded select.

But hey! What do I know?
I'm talking in terms of theory. I'd take the lazy approach and do the insert
statement.
(And if necessary, use the rowid to help break down the table in to more
manageable transactions. ;-)

-G

__________________________________________________ _______________
Find sales, coupons, and free shipping, all in one place! *MSN Shopping
Sales & Deals
http://shopping.msn.com/content/shp/...tnrdata=200639

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-20-2008, 02:58 PM
Neil Truby
 
Posts: n/a
Default Re: how to copy large data fast

"Ian Michael Gumby" <im_gumby@hotmail.com> wrote in message
news:mailman.9.1168889830.10648.informix-list@iiug.org...

> I'm talking in terms of theory. I'd take the lazy approach and do the
> insert statement.


The lazy approach is to use dbexport/dbimport.


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-20-2008, 02:58 PM
Suppository Admin
 
Posts: n/a
Default Re: how to copy large data fast

If the new database is not being used by anyone, you could certainly turn logging off on the receiver database, run the load, then turn logging back on.

Same for the database with the data, if no one is in it, turn the logging off, run the unload, then turn the logging back on.


Ian Michael Gumby wrote:
>
>
>
>> From: "Keith Simmons" <smiley73@googlemail.com>

>
>> >

>> Need to watch logging, both databases need to be the same so there may
>> be a requirement for lots of log space on the target to avoid long tx,
>> also the number of locks used may come into play if the tables have a
>> large number of records. However that aside I agree this is the
>> fastest method.
>>
>> Keith

>
> Locks?
> Hmmm. I guess you haven't tried to lock the table so that there is only
> 1 lock per table per database?
>
> Ok, so here's you other "problem".
>
> If you chose the simplest method, of a single insert statement with an
> embedded select statement, yes, you will have an issue of a long query
> in the IDS 10 database.
> Considering that this a new box being built, there shouldn't be anyone
> else on the box and you can control your log file issues. (Or use rowid
> to control the select and insert size.... ;-)
>
> If you were going to write your own esql/c program, you would run in to
> the standard librarian/reader mutex issue. One thread to perform the
> select against IDS 7.x and n threads to read.
>
> Haven't played with HPL, but does it allow for a select statement as its
> input or a named pipe?
> While you'll have some parallelization, you'll also have additional
> overhead to contend with.
>
> Now I know that there are other methods that could be "faster" like
> using a combination of a shared memory queue and multiple writers and
> readers for each table. (Note: you could create a system where you have
> a shared memory queue for each table you are downloading. For each
> shared memory queue, you have multiple readers where each reader opens a
> TCP/IP socket to the other (IDS 10) machine, and on each of those
> sockets, you have one reader and multiple writers, each being able to
> control how many transactions they want per commit. (Which would flush
> the logs.) This would be a fairly complex solution, however, how much
> faster would it be? You're going to be limited by network bandwidth,
> amount of memory, processor speed, and disk speed.)
>
> So, to wit,
> The simplest and probably most efficient solution would be to account
> for the table sizes with respect to logging, and then use an insert
> statement with an embedded select.
>
> But hey! What do I know?
> I'm talking in terms of theory. I'd take the lazy approach and do the
> insert statement.
> (And if necessary, use the rowid to help break down the table in to more
> manageable transactions. ;-)
>
> -G
>
> __________________________________________________ _______________
> Find sales, coupons, and free shipping, all in one place! MSN Shopping
> Sales & Deals
> http://shopping.msn.com/content/shp/...tnrdata=200639
>


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 10:33 PM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com