Hi,

My thesis is about the storage of heart signals (e.g. .dat files). My goal is to prove that Oracle is the best choice...

I have written an application that can upload a file to, and download it from, a BLOB in the database (using Java 1.5.0: InputStream/OutputStream). The performance is about 15 MB/s on an AMD XP 3200+ with 512 MB RAM, a 120 GB disk with 8 MB cache, and WinXP, using Oracle 9.2. Just copying a file is a lot faster.

Question 1: Is this a normal speed, or can it get any faster?
Question 2: Maybe using interMedia?
Question 3: Can I adapt interMedia so that I can include my own MATLAB functions in the database and use them in my queries?
Question 4: Any other tips/tricks?

Thanks in advance!
Jan
| |||
Well, first of all, storing .dat files in an Oracle database has the definite advantage of data protection: if the db is professionally managed (backups, mirrored online redo logs, archivelog mode, etc.) you can recover from any hw or sw failure, or better, you can get as sophisticated as you like in order to lower the probability of data loss. Oracle is excellent at that, and heart signals are definitely a type of data that I would protect. Imagine phoning a patient: "ahem, Mr Smith, we lost your records ... we must repeat your session on the exerciser..".

As far as performance is concerned, you may also try SQL*Loader, which is very likely faster than custom Java code for loading BLOBs (you can load over the network with it). Or, if you can copy the .dat file to the server, you can easily load it as a BLOB using dbms_lob.loadfromfile().

In Java, consider writing in exact multiples of the chunk size of the BLOB; you can control the "writing buffer" size using the LOB Java APIs. This can have a dramatic impact on performance.

Consider also playing with the CACHE attribute of the BLOB. If caching, you are writing to the buffer cache (memory), so your perceived speed is that of memory (DBWR will asynchronously write to disk for you, perhaps later, and Oracle guarantees that your committed data will eventually get written to disk even if the machine aborts, which is not guaranteed by every OS filesystem). Clearly, if you are loading so massively that DBWR can't keep up with your speed, you will just end up waiting for DBWR to clean a full cache. In that case go for a NOCACHE BLOB, so that you write directly to disk, bypassing the buffer cache, which is more efficient (uses fewer resources) even if your perceived speed is that of disk, not memory. Try both and see.

If you're going to process the data, you may also want to consider loading the .dat file as a table (e.g. one row for every sample). If that's the case, consider using 10g, which includes support for fast IEEE 754 floating point operations (binary_float, binary_double datatypes), and analyzing your data using pure SQL (very, very fast) or perhaps a bit of PL/SQL (which has been improved as well in 10g). Take a quick look here for an example of math operations:

http://asktom.oracle.com/pls/ask/f?p... 6024461840033
(read the next followup too for binary_double)

For a thesis I would consider 10g for sure, even if I didn't need the IEEE 754 functionality (10g is a better 9i for common operations).

I don't know about using MATLAB in Oracle, but I feel that extending interMedia is best left to an Oracle/MATLAB team.

hth
Alberto Dell'Era
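The advice above about writing in exact multiples of the BLOB chunk size can be sketched in Java roughly as follows. This is only an illustration under assumptions: the class `ChunkAlignedCopy` and its methods are hypothetical names, the chunk size is passed in as a plain number (with the Oracle JDBC driver it would presumably come from something like `BLOB.getChunkSize()`, which is not called here), and the output stream is an in-memory stand-in for the stream obtained from the BLOB.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, not part of any Oracle API.
final class ChunkAlignedCopy {

    /** Largest multiple of chunkSize that fits in targetSize, but never less than one chunk. */
    static int alignedBufferSize(int chunkSize, int targetSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int chunks = Math.max(1, targetSize / chunkSize);
        return chunks * chunkSize;
    }

    /**
     * Copies in to out so that every write except the last one hands over
     * exactly bufferSize bytes. Returns the number of bytes copied.
     */
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int filled = 0;
        while (true) {
            // A single read may return fewer bytes than requested, so keep filling.
            int n = in.read(buffer, filled, buffer.length - filled);
            if (n < 0) {
                break;
            }
            filled += n;
            if (filled == buffer.length) {
                out.write(buffer, 0, filled);
                total += filled;
                filled = 0;
            }
        }
        if (filled > 0) { // final, possibly partial, write
            out.write(buffer, 0, filled);
            total += filled;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        int chunkSize = 8132; // assumed value; ask the driver for the real one
        int bufferSize = alignedBufferSize(chunkSize, 64 * 1024);
        byte[] fakeFile = new byte[200000]; // stand-in for the .dat file
        ByteArrayOutputStream fakeBlob = new ByteArrayOutputStream(); // stand-in for the BLOB stream
        long copied = copy(new ByteArrayInputStream(fakeFile), fakeBlob, bufferSize);
        System.out.println("buffer=" + bufferSize + " copied=" + copied);
    }
}
```

Whether the aligned buffer actually helps, and by how much, depends on the driver and the LOB storage settings, so it is worth timing a few buffer sizes against the real database.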
| |||
Alberto Dell'Era wrote:

<snip>

Also look at the BLOB loading built into interMedia. It is as simple as:

obj1.setSource('FILE', 'VIDEODIR','test.mov');
obj1.import(ctx);

--
Daniel A. Morgan
University of Washington
damorgan@x.washington.edu
(replace 'x' with 'u' to respond)
| |||
alberto.dellera@gmail.com (Alberto Dell'Era) wrote in message news:<4ef2fbf5.0411070235.30c1fedc@posting.google.com>...

<snip>
> As far as performance is concerned, you may also try sqlloader, which
> is very likely more performant than custom Java code for loading BLOBs
> (you can load over the network with it). Or, if you can copy the .dat
> on the server, you can easily load it as a blob using
> dbms_lob.loadfromfile().

The maximum I get using dbms_lob.loadfromfile() is 5 MB/s.

<snip>
> If you're going to process the data - you may also want to consider
> loading the .dat file as a table (eg one row for every sample). If
> that's the case, consider using 10g, which includes support for fast
> IEEE 754 floating point operations (binary_float, binary_double
> datatypes) and analyzing your data using pure sql (very very fast) or
> perhaps a bit of pl/sql (that has been improved as well in 10g). Take
> a quick look here for an example of math operations:

That's a no-go: the files cannot be converted, for safety reasons. The files are ECG files that are used for testing algorithms for pacemakers, and one of the requirements is that the file stays intact. Right now I'm calculating the MD5 and checking it after download.

> For a Thesis I would consider 10g for sure, even if i didn't need the
> IEEE 754 functionality (10g is a better 9i for common operations).
>
> I don't know about using matlab in oracle - but i feel like that
> extending intermedia is best left to an Oracle/matlab team

OK, upgrading to 10g... Maybe some tips on extending interMedia?

> hth
> Alberto Dell'Era

Thanks
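The integrity check described in the post above (compute an MD5 before upload, compare it after download) might look roughly like this in Java. It is a sketch, not the poster's actual code: `Md5Check`, `md5Hex` and `sameContent` are hypothetical names, and the streams in `main` are in-memory stand-ins for the original file and the stream read back from the BLOB.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for comparing a file with its downloaded copy.
final class Md5Check {

    /** Reads the stream to its end and returns the MD5 digest as lowercase hex. */
    static String md5Hex(InputStream in) throws IOException {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            md.update(buf, 0, n); // digest incrementally, so large files never sit in memory
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /** True when both streams carry content with the same MD5 digest. */
    static boolean sameContent(InputStream original, InputStream downloaded) throws IOException {
        return md5Hex(original).equals(md5Hex(downloaded));
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "abc".getBytes(StandardCharsets.US_ASCII);
        System.out.println(md5Hex(new ByteArrayInputStream(data)));
        // prints 900150983cd24fb0d6963f7d28e17f72
    }
}
```

In practice the first stream would be a `FileInputStream` on the .dat file and the second the binary stream obtained from the BLOB after download.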
| ||||
jan.bollen@gmail.com (Jan Bollen) wrote:

> The max that I get using dbms_lob.loadfromfile() with this method is
> 5MB/s..

What about SQL*Loader?

> That's no go, the files can not be converted because of safety
> reasons. The files are ECG-files that are used for testing algorithms
> for pacemakers, and one of the obligations is that the file keeps
> intact. Right now I'm calculating the MD5, and checking it after
> download.

Well, you could also calculate the MD5 after the conversion "as a table", so that you can be 100% confident that the information is the same.

Bye,
Alberto Dell'Era
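The idea in the last reply (convert the file to one row per sample, then prove via MD5 that nothing was lost) could be sketched like this. Everything about the file layout is an assumption made for illustration: the thread never says how the .dat files are encoded, so the code pretends they are a plain sequence of 16-bit little-endian samples with no header. `SampleRoundTrip` and its methods are hypothetical names, and the `short[]` stands in for the rows of the table.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical round-trip check; the sample format is assumed, not taken from the thread.
final class SampleRoundTrip {

    /** Splits the raw bytes into samples (assumed: 16-bit, little-endian, no header). */
    static short[] toSamples(byte[] raw) {
        if (raw.length % 2 != 0) {
            throw new IllegalArgumentException("length is not a whole number of 16-bit samples");
        }
        short[] samples = new short[raw.length / 2];
        ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().get(samples);
        return samples;
    }

    /** Rebuilds the byte sequence from the samples, using the same assumed layout. */
    static byte[] toBytes(short[] samples) {
        ByteBuffer buf = ByteBuffer.allocate(samples.length * 2).order(ByteOrder.LITTLE_ENDIAN);
        buf.asShortBuffer().put(samples);
        return buf.array();
    }

    static byte[] md5(byte[] data) {
        try {
            return MessageDigest.getInstance("MD5").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // MD5 is required on every Java platform
        }
    }

    /** True when converting to samples and back reproduces a file with the same MD5. */
    static boolean roundTripsExactly(byte[] raw) {
        return MessageDigest.isEqual(md5(raw), md5(toBytes(toSamples(raw))));
    }

    public static void main(String[] args) {
        byte[] raw = {0x01, 0x02, (byte) 0xFF, 0x7F}; // two made-up samples
        System.out.println(roundTripsExactly(raw));
    }
}
```

With a real file the samples would be written to and read back from the table between the two conversions, and any header or trailer bytes would have to be stored as well for the digest to match.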