Jerry Stuckle wrote :
> I'm a bit confused about exactly what you're trying to do.
I not an English native speaker, adding confusions in my explanations
about a non-trivial problem.
If you give me a chance, I'll try to summarize it differently :
I need to run a query that have 3 worst use cases, depending on the
input (X and Y). X and Y values heavily determined the number of lines
to read: from few (1,2 lines) to a lot (1 million of lines).
worst use case 1 (called 'WUC1') :
X is high (1.000.000)
Y is low (10)
worst use case 2 (called 'WUC2') :
X is low (10)
Y is high (1.000.000)
worst use case 3 (called 'WUC3') :
X is high (1.000.000)
Y is high (1.000.000)
For WUC1, I found a query which is really fast (QueryA, using distinct).
But it's very slow on WUC2 (which does not surprise me).
For WUC2, I found a query which is really fast (QueryB, using exist).
But it's very slow on WUC1 (which does not surprise me neither).
For the moment, I have to figure out (in my outer Java code) if I'm in
WUC1 and then trigger the QueryA, or in WUC2 and then trigger the
QueryB. But I want to let the MySQL optimizer doing that job for me !
My question is: Does someone could think of a query construction that
could be quick in both WUC1 and WUC2 ?
(For WUC3, the query is inherently slow, due to cross join between 1M
and 1M lines, and I'm not looking for an improvement)
> What are the
> SELECT statements you're using now, and what do you get if you EXPLAIN
> those statements?
Here we go
Query A (get distinct fathers of occurrences in a given document list
that are direct children of an entity):
select distinct eh2.idParentEntity
from entity e
join entityhierarchy eh on eh.uidEntity=e.uidEntity
join entityhierarchy eh2 on eh2.uidEntity=e.uidEntity
join positionedElement pe on pe.uidEntity = e.uidEntity
where eh.idParentEntity=X
and eh2.depth=eh.depth-1
and pe.idKnowledgeSet < Y
Query B (from every direct children of an entity, select the one that
have occurrences in a given document list)
select e.uidEntity
from entity e
join entityhierarchy eh on eh.uidEntity=e.uidEntity
where eh.idParentEntity=X and depth=1
and exists
(select 1 from entity e
join entityhierarchy eh2 on eh2.uidEntity=e.uidEntity
join entity father on eh2.idParentEntity = father.uidEntity
join positionedElement pe on pe.uidEntity = e.uidEntity
where eh.uidEntity=eh2.idParentEntity
and pe.idKnowledgeSet < Y)
Now the explain statement (80 cols...)
*[CASE 1]*
| Query A in WUC1
| X=lots of children (1.3M)
| Y=few documents (10)
| average execution time: 80 ms
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: pe
type: range
possible_keys: pEntityIdx,ksIdx
key: ksIdx
key_len: 4
ref: NULL
rows: 155
Extra: Using where; Using temporary
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: eh
type: ref
possible_keys: entitiesIdx,fullHierarchyIdx,hIdx
key: entitiesIdx
key_len: 4
ref: tcell.pe.uidEntity
rows: 1
Extra: Using where
*************************** 3. row ***************************
id: 1
select_type: SIMPLE
table: e
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: tcell.eh.uidEntity
rows: 1
Extra: Using where; Using index
*************************** 4. row ***************************
id: 1
select_type: SIMPLE
table: eh2
type: ref
possible_keys: entitiesIdx
key: entitiesIdx
key_len: 4
ref: tcell.pe.uidEntity
rows: 1
Extra: Using where
************************************************** ************
*[CASE 2]*
| Query A, WUC2
| X=few children (8)
| Y=lots of documents (150.000)
| average execution time: 2 min
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: pe
type: range
possible_keys: pEntityIdx,ksIdx
key: ksIdx
key_len: 4
ref: NULL
rows: 2511168
Extra: Using where; Using temporary
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: eh
type: ref
possible_keys: entitiesIdx,fullHierarchyIdx,hIdx
key: entitiesIdx
key_len: 4
ref: tcell.pe.uidEntity
rows: 1
Extra: Using where
*************************** 3. row ***************************
id: 1
select_type: SIMPLE
table: e
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: tcell.eh.uidEntity
rows: 1
Extra: Using where; Using index
*************************** 4. row ***************************
id: 1
select_type: SIMPLE
table: eh2
type: ref
possible_keys: entitiesIdx
key: entitiesIdx
key_len: 4
ref: tcell.pe.uidEntity
rows: 1
Extra: Using where
************************************************** ************
*[CASE 3]*
| Query B, WUC1
| X=lots children (1.3M)
| Y=few documents (10)
| average execution time: 2 min
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: eh
type: ref
possible_keys: entitiesIdx,fullHierarchyIdx,hIdx
key: hIdx
key_len: 5
ref: const,const
rows: 340292
Extra: Using where; Using index
*************************** 2. row ***************************
id: 1
select_type: PRIMARY
table: e
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: tcell.eh.uidEntity
rows: 1
Extra: Using index
*************************** 3. row ***************************
id: 2
select_type: DEPENDENT SUBQUERY
table: father
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: tcell.eh.uidEntity
rows: 1
Extra: Using index
*************************** 4. row ***************************
id: 2
select_type: DEPENDENT SUBQUERY
table: eh2
type: ref
possible_keys: entitiesIdx,fullHierarchyIdx,hIdx
key: fullHierarchyIdx
key_len: 4
ref: tcell.eh.uidEntity
rows: 1
Extra: Using where
*************************** 5. row ***************************
id: 2
select_type: DEPENDENT SUBQUERY
table: pe
type: ref
possible_keys: pEntityIdx,ksIdx
key: pEntityIdx
key_len: 4
ref: tcell.eh2.uidEntity
rows: 20
Extra: Using where; Using index
*************************** 6. row ***************************
id: 2
select_type: DEPENDENT SUBQUERY
table: e
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: tcell.eh2.uidEntity
rows: 1
Extra: Using index
************************************************** ************
*[CASE 4]*
| Query B, WUC2
| X=few children (8)
| Y=lots of documents (150.000)
| average execution time: 80 ms
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: eh
type: ref
possible_keys: entitiesIdx,fullHierarchyIdx,hIdx
key: hIdx
key_len: 5
ref: const,const
rows: 8
Extra: Using where; Using index
*************************** 2. row ***************************
id: 1
select_type: PRIMARY
table: e
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: tcell.eh.uidEntity
rows: 1
Extra: Using index
*************************** 3. row ***************************
id: 2
select_type: DEPENDENT SUBQUERY
table: father
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: tcell.eh.uidEntity
rows: 1
Extra: Using index
*************************** 4. row ***************************
id: 2
select_type: DEPENDENT SUBQUERY
table: eh2
type: ref
possible_keys: entitiesIdx,fullHierarchyIdx,hIdx
key: fullHierarchyIdx
key_len: 4
ref: tcell.eh.uidEntity
rows: 1
Extra: Using where
*************************** 5. row ***************************
id: 2
select_type: DEPENDENT SUBQUERY
table: pe
type: ref
possible_keys: pEntityIdx,ksIdx
key: pEntityIdx
key_len: 4
ref: tcell.eh2.uidEntity
rows: 20
Extra: Using where; Using index
*************************** 6. row ***************************
id: 2
select_type: DEPENDENT SUBQUERY
table: e
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: tcell.eh2.uidEntity
rows: 1
Extra: Using index
************************************************** ************
Thanks for any in-depth look at this problem. It really is my last-ditch
effort on that one. I reach the limit of my knowledge about MySQL and DB
design :/
--
Hugo