Unix Technical Forum

How can I improve the performance of this kind of query?

This is a discussion on How can I improve the performance of this kind of query? within the MySQL forums, part of the Database Server Software category; --> Please consider the following stored procedure: DROP PROCEDURE IF EXISTS `get_etf_stocks_and_prices`; CREATE PROCEDURE `get_etf_stocks_and_prices`(id INTEGER) BEGIN SELECT stocks.stock_id AS ...


Go Back   Unix Technical Forum > Database Server Software > MySQL

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 02-28-2008, 11:29 AM
Ted
 
Posts: n/a
Default How can I improve the performance of this kind of query?

Please consider the following stored procedure:

DROP PROCEDURE IF EXISTS `get_etf_stocks_and_prices`;
CREATE PROCEDURE `get_etf_stocks_and_prices`(id INTEGER)
BEGIN
SELECT stocks.stock_id AS stock_id,
stocks.symbol AS symbol,
stocks.name AS stock_name,
A.`close` AS close_price,
A.price_date AS price_date
FROM etf_stocks LEFT JOIN stocks ON etf_stocks.stock_id =
stocks.stock_id
LEFT JOIN stockprices A ON stocks.stock_id = A.stock_id
LEFT JOIN stockprices B ON A.stock_id = B.stock_id
AND B.price_date > A.price_date
WHERE B.price_date IS NULL
AND etf_id = id;
END
//

While price_date is part of the primary key, creating an extra index
on it dramatically improved how long it takes to complete. However,
it takes close to 10 minutes for the results to be displayed where
there are 28 stocks in the etf required (out of almost 500 possible
stocks for almost 500 etfs), the times shown by MySQL Query browser
are 0.0037s (520.3263s): the latter s almost the 20 minutes I
experience waiting for the results to be displayed. Given that there
is daily data going back over 40 years, at least for some stocks, I
don't want to think how long it would take to get the final close
price for each week represented in the database for the stocks
requested.

The data I get appear to be correct, but I need to get it MUCH much
more quickly.

Thanks

Ted

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 02-28-2008, 11:29 AM
xenides@gmail.com
 
Posts: n/a
Default Re: How can I improve the performance of this kind of query?

Ted

The statement

AND B.price_date > A.price_date
WHERE B.price_date IS NULL

is contradictory. For B.price_date > A.price_date to return true,
then B.price_date has to be NOT NULL (otherwise it returns null). On
the next line you are asking B.price_date to be NULL. Seems like you
are doing a lot of comparison work for no purpose.

Also B.price_date is null will return unmatched rows in B or matched
rows where B.price_date is null. Not sure if that is what you want?
In fact I can't see a reason to have table B in there at all. What
was your purpose for including it?

X



Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 02-28-2008, 11:29 AM
Ted
 
Posts: n/a
Default Re: How can I improve the performance of this kind of query?

On Nov 6, 2:54 am, xeni...@gmail.com wrote:
> Ted
>
> The statement
>
> AND B.price_date > A.price_date
> WHERE B.price_date IS NULL
>
> is contradictory. For B.price_date > A.price_date to return true,
> then B.price_date has to be NOT NULL (otherwise it returns null). On
> the next line you are asking B.price_date to be NULL. Seems like you
> are doing a lot of comparison work for no purpose.
>
> Also B.price_date is null will return unmatched rows in B or matched
> rows where B.price_date is null. Not sure if that is what you want?
> In fact I can't see a reason to have table B in there at all. What
> was your purpose for including it?
>
> X


This is what I was advised to use in the recent thread " Trouble
getting latest record for a given item." Articles in that thread give
URLs to additional sources, including MySQL documentation, that
describe using this method for getting the most recent value. The
essence of the rationale for it is that there will be exactly one
record in A that has no price greater than that in that record. We're
selecting records from a LEFT JOIN, so that join will have records in
A even for comparisons for which ALL values on the B side are NULL.

I agree there is a lot of comparison, but the question is, is there a
faster way to get the result. This is a worry since the table in
question has millions of records containing over 100 MB of data.

The first objective was to get the most recent price for a given
item. The next step is to get the last price for a given item for
each week for which there is data. The complication for this is that
while there is data for every business day, statutory holidays mean
that in some weeks the last business day of the week is a Thursday
rather than a Friday. It takes so long to get the last value for just
a couple dozen items, that using the same kind of query to get the
last record in each week, for series that go back 50 years, seems
impracticable: the application would just sit there apparently doing
nothing, for hours.

Thanks,

Ted

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 02-28-2008, 11:29 AM
xenides@gmail.com
 
Posts: n/a
Default Re: How can I improve the performance of this kind of query?

Hmmm...

SELECT stocks.stock_id AS stock_id,
stocks.symbol AS symbol,
stocks.name AS stock_name,
A.`close` AS close_price,
A.price_date AS price_date
FROM etf_stocks LEFT JOIN stocks ON etf_stocks.stock_id =
stocks.stock_id
LEFT JOIN stockprices A ON stocks.stock_id = A.stock_id
/*This is grabbing the most recent date for that STOCK_ID from an
aliased STOCKPRICES table and using that as a criteria.
It's the same as using a max, but I've found on MySQL that MAX in a
sub-query sucks, and an order/limit doesn't - no idea why
*/
where A.price_date = (select price_date from stockprices A2 where
A2.stock_id = A.stock_id order by price_date desc limit 1)
AND etf_id = id;

An index on stock_id and possibly on stock_id,price_date would
probably help in this case to make the subquery run faster.

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 02-28-2008, 11:29 AM
Ted
 
Posts: n/a
Default Re: How can I improve the performance of this kind of query?

On Nov 9, 12:55 am, xeni...@gmail.com wrote:
> Hmmm...
>
> SELECT stocks.stock_id AS stock_id,
> stocks.symbol AS symbol,
> stocks.name AS stock_name,
> A.`close` AS close_price,
> A.price_date AS price_date
> FROM etf_stocks LEFT JOIN stocks ON etf_stocks.stock_id =
> stocks.stock_id
> LEFT JOIN stockprices A ON stocks.stock_id = A.stock_id
> /*This is grabbing the most recent date for that STOCK_ID from an
> aliased STOCKPRICES table and using that as a criteria.
> It's the same as using a max, but I've found on MySQL that MAX in a
> sub-query sucks, and an order/limit doesn't - no idea why
> */
> where A.price_date = (select price_date from stockprices A2 where
> A2.stock_id = A.stock_id order by price_date desc limit 1)
> AND etf_id = id;
>
> An index on stock_id and possibly on stock_id,price_date would
> probably help in this case to make the subquery run faster.


Thanks xeni. I tried my original with yours, and it seems an exercise
in frustration.

First note, while there is no index on stock_id, stock_id plus
price_date form the primary key. Adding an index on price_date has
made virtually no difference, neither in the times I see nor in the
timing numbers MySQL Query Browser reports.

For yours, applied to etf # 397, I get 28 rows in what Query Browser
reports as 0.0054s (1133.5915s). It felt like about 20 minutes. BTW:
do you know what the times are that the Query Browser reports?
Sometimes the number within parentheses corresponds to perceived
elapsed time on the wall and sometimes it does not.

Anyway, with my original query, the corresponding times (on the same
etf) are 0.0039s (516.0759s)

Thanks

Ted

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 06:33 PM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com