This is a discussion on Optimize JOIN to gain speed within the MySQL forums, part of the Database Server Software category; --> Dear mysql-ians, I am performing a query that takes ages before showing a result. I suppose it is very ...
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Dear mysql-ians, I am performing a query that takes ages before showing a result. I suppose it is very time consuming because of the JOIN I perform. My question therefore is if you have any suggestions to optimize my query? What I want to do is divide a value of a cell (NDVI) by the median of its neighbours based on geographical coordinates (refgeo). The selection script is as follows: SELECT NDVI.id, NDVI.NDVI_99_24*IF(B.B_99_24<30,1,NULL)/MEDIAN(refNDVI.NDVI_99_24*IF(refB.B_99_24<30,1,NUL L)), NDVI.NDVI_99_25*IF(B.B_99_25<30,1,NULL)/MEDIAN(refNDVI.NDVI_99_25*IF(refB.B_99_25<30,1,NUL L)), NDVI.NDVI_99_26*IF(B.B_99_26<30,1,NULL)/MEDIAN(refNDVI.NDVI_99_26*IF(refB.B_99_26<30,1,NUL L)), ...., NDVI.NDVI_99_170*IF(B.B_99_170<30,1,NULL)/MEDIAN(refNDVI.NDVI_99_170*IF(refB.B_99_170<30,1,N ULL)) FROM vgt.NDVI AS NDVI INNER JOIN vgt.B AS B USING (id) INNER JOIN vgt.v_ecoclim AS eco USING (id) INNER JOIN vgt.v_vcf2001 AS vcf USING (id) , (vgt.meta_gsc AS refmgsc INNER JOIN vgt.gsc AS refgsc ON refgsc.gsc_XX = refmgsc.id) INNER JOIN vgt.geo AS refgeo ON refgeo.id = refgsc.id INNER JOIN vgt.B AS refB ON refgeo.id = refB.id INNER JOIN vgt.v_ecoclim AS refeco ON refgeo.id = refeco.id INNER JOIN vgt.v_vcf2001 AS refvcf ON refgeo.id = refvcf.id INNER JOIN (vgt.gba AS refgba INNER JOIN vgt.meta_gba AS refmgba ON refgba.gba_XX = refmgba.id) ON refgeo.id=refgba.id INNER JOIN vgt.NDVI AS refNDVI ON refgeo.id = refNDVI.id WHERE NDVI.id= 1133568 AND refgeo.X_coord > 1288 AND refgeo.X_coord < 1328 AND refgeo.Y_coord > 659 AND refgeo.Y_coord < 699 AND refmgba.burnt=0 AND refmgsc.burnt=0 AND refeco.v_landcov=eco.v_landcov AND refeco.v_lowreb=eco.v_lowreb AND refvcf.vcf_bare < vcf.vcf_bare+20 AND refvcf.vcf_bare > vcf.vcf_bare-20 AND refvcf.vcf_tree < vcf.vcf_tree+20 AND refvcf.vcf_tree > vcf.vcf_tree-20 AND refvcf.vcf_herb < vcf.vcf_herb+20 AND refvcf.vcf_herb > vcf.vcf_herb-20 GROUP BY refmgba.burnt I have thus several main-tables that have identical rows (and are linked by id in a 1to1-relatioship ) and I have 2 sub-tables in a manyto1-relationship. I have put it in a scheme below (view using Courier font for correct spacing). The main tables are: NDVI, B, eco, vcf, refgsc, refgeo, refeco, refgba, refB, refvcf, refNDVI. The 2 subtables: refmgba, refmgsc. Is my query correct? Do you have any solutions to make my query more efficient? I've studied the mysql-manual, but I don't see answers on the moment Thanks in advance, Stef Schematic overview tables: [=============] [=============] [============] [T_Main: gsc ] [T_Main: gba ] [T_Main:NDVI ] O [=============] [=============] [============] T [ ID ]<--->[ ID ]<->[ ID ]<->H |-->[ gsc_XX(many)] |->[ gba_XX(many)] [ NDVI_99_24 ] E | [ Others... ] | [ Others... ] [ NDVI_99_25 ] R | [=============] | [=============] [ Others... ] S | | [============] | | | | | [=============] | [=============] | [T: refmgsc ] | [T: refmgba ] | [=============] | [=============] |--> [ ID(1) ] |--> [ ID(1) ] [ BURNT ] [ BURNT ] [ Others... ] [ Others... ] [=============] [=============] |
| |||
| -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 stefaan.lhermitte@agr.kuleuven.ac.be schrieb: > Dear mysql-ians, Hi > >... > FROM vgt.NDVI AS NDVI > INNER JOIN vgt.B AS B > USING (id) > INNER JOIN vgt.v_ecoclim AS eco > USING (id) > INNER JOIN vgt.v_vcf2001 AS vcf > USING (id) > , > (vgt.meta_gsc AS refmgsc Whats this here? Seems to be a cross join. How do you combine the 3 combined tables with the ones below? > INNER JOIN vgt.gsc AS refgsc > ON refgsc.gsc_XX = refmgsc.id) > INNER JOIN vgt.geo AS refgeo > ON refgeo.id = refgsc.id > INNER JOIN vgt.B AS refB > ON refgeo.id = refB.id > INNER JOIN vgt.v_ecoclim AS refeco > ON refgeo.id = refeco.id > INNER JOIN vgt.v_vcf2001 AS refvcf > ON refgeo.id = refvcf.id > INNER JOIN (vgt.gba AS refgba > INNER JOIN vgt.meta_gba AS refmgba > ON refgba.gba_XX = refmgba.id) >... What indices are set? Regards Stefan >... -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.2.1 (MingW32) iD8DBQFEIgPxyeCLzp/JKjARAvOnAKDHXpMZhPuSBy574mu8S161EIOKTACgwsYh PgSa+xHKWPq1aN7K7EVNmVA= =IX8L -----END PGP SIGNATURE----- |
| |||
| > I am performing a query that takes ages before showing a result. Probably. It takes ages for me to understand what it should be doing. > I > suppose it is very time consuming because of the JOIN I perform. My > question therefore is if you have any suggestions to optimize my query? Split it up in comprehendable parts? Or, if your version of MySQL supports it, by organizing the comprehendable parts into stored procedures. From there, you can start optimizing. You might also take a good look at what the EXPLAIN command has to say about this query. Best regards > > What I want to do is divide a value of a cell (NDVI) by the median of > its neighbours based on geographical coordinates (refgeo). > The selection script is as follows: > > SELECT > NDVI.id, > NDVI.NDVI_99_24*IF(B.B_99_24<30,1,NULL)/MEDIAN(refNDVI.NDVI_99_24*IF(refB.B_99_24<30,1,NUL L)), > NDVI.NDVI_99_25*IF(B.B_99_25<30,1,NULL)/MEDIAN(refNDVI.NDVI_99_25*IF(refB.B_99_25<30,1,NUL L)), > NDVI.NDVI_99_26*IF(B.B_99_26<30,1,NULL)/MEDIAN(refNDVI.NDVI_99_26*IF(refB.B_99_26<30,1,NUL L)), > ..., > NDVI.NDVI_99_170*IF(B.B_99_170<30,1,NULL)/MEDIAN(refNDVI.NDVI_99_170*IF(refB.B_99_170<30,1,N ULL)) > > FROM vgt.NDVI AS NDVI > INNER JOIN vgt.B AS B > USING (id) > INNER JOIN vgt.v_ecoclim AS eco > USING (id) > INNER JOIN vgt.v_vcf2001 AS vcf > USING (id) > , > (vgt.meta_gsc AS refmgsc > INNER JOIN vgt.gsc AS refgsc > ON refgsc.gsc_XX = refmgsc.id) > INNER JOIN vgt.geo AS refgeo > ON refgeo.id = refgsc.id > INNER JOIN vgt.B AS refB > ON refgeo.id = refB.id > INNER JOIN vgt.v_ecoclim AS refeco > ON refgeo.id = refeco.id > INNER JOIN vgt.v_vcf2001 AS refvcf > ON refgeo.id = refvcf.id > INNER JOIN (vgt.gba AS refgba > INNER JOIN vgt.meta_gba AS refmgba > ON refgba.gba_XX = refmgba.id) > ON refgeo.id=refgba.id > INNER JOIN vgt.NDVI AS refNDVI > ON refgeo.id = refNDVI.id > WHERE NDVI.id= 1133568 > AND refgeo.X_coord > 1288 > AND refgeo.X_coord < 1328 > AND refgeo.Y_coord > 659 > AND refgeo.Y_coord < 699 > AND refmgba.burnt=0 > AND refmgsc.burnt=0 > AND refeco.v_landcov=eco.v_landcov > AND refeco.v_lowreb=eco.v_lowreb > AND refvcf.vcf_bare < vcf.vcf_bare+20 > AND refvcf.vcf_bare > vcf.vcf_bare-20 > AND refvcf.vcf_tree < vcf.vcf_tree+20 > AND refvcf.vcf_tree > vcf.vcf_tree-20 > AND refvcf.vcf_herb < vcf.vcf_herb+20 > AND refvcf.vcf_herb > vcf.vcf_herb-20 > GROUP BY refmgba.burnt > > I have thus several main-tables that have identical rows (and are > linked by id in a 1to1-relatioship ) and I have 2 sub-tables in a > manyto1-relationship. I have put it in a scheme below (view using > Courier font for correct spacing). > > The main tables are: NDVI, B, eco, vcf, refgsc, refgeo, refeco, refgba, > refB, refvcf, refNDVI. > The 2 subtables: refmgba, refmgsc. > > Is my query correct? Do you have any solutions to make my query more > efficient? > I've studied the mysql-manual, but I don't see answers on the moment > > Thanks in advance, > Stef > > Schematic overview tables: > > [=============] [=============] [============] > [T_Main: gsc ] [T_Main: gba ] [T_Main:NDVI ] O > [=============] [=============] [============] T > [ ID ]<--->[ ID ]<->[ ID ]<->H > |-->[ gsc_XX(many)] |->[ gba_XX(many)] [ NDVI_99_24 ] E > | [ Others... ] | [ Others... ] [ NDVI_99_25 ] R > | [=============] | [=============] [ Others... ] S > | | [============] > | | > | | > | [=============] | [=============] > | [T: refmgsc ] | [T: refmgba ] > | [=============] | [=============] > |--> [ ID(1) ] |--> [ ID(1) ] > [ BURNT ] [ BURNT ] > [ Others... ] [ Others... ] > [=============] [=============] > |
| ||||
| Stefan Rybacki schreef: > >... > > FROM vgt.NDVI AS NDVI > > INNER JOIN vgt.B AS B > > USING (id) > > INNER JOIN vgt.v_ecoclim AS eco > > USING (id) > > INNER JOIN vgt.v_vcf2001 AS vcf > > USING (id) > > , > > (vgt.meta_gsc AS refmgsc > > Whats this here? Seems to be a cross join. How do you combine the 3 > combined tables with the ones below? I combine the 3 tables with the ones below based on the WHERE clauses. Since I am calculating a ratio, I use a subset of for example vgt.B as B in the nominator and a different subset of of vgt.B as refB in the denominator. The selection of the 2 subsets is based on the WHERE clause. Is this incorrect? > > > INNER JOIN vgt.gsc AS refgsc > > ON refgsc.gsc_XX = refmgsc.id) > > INNER JOIN vgt.geo AS refgeo > > ON refgeo.id = refgsc.id > > INNER JOIN vgt.B AS refB > > ON refgeo.id = refB.id > > INNER JOIN vgt.v_ecoclim AS refeco > > ON refgeo.id = refeco.id > > INNER JOIN vgt.v_vcf2001 AS refvcf > > ON refgeo.id = refvcf.id > > INNER JOIN (vgt.gba AS refgba > > INNER JOIN vgt.meta_gba AS refmgba > > ON refgba.gba_XX = refmgba.id) > >... |