Re: Unit Hours of Activity vs. Unit Hours Scheduled On 6 Feb, 15:38, steven...@gmail.com wrote:
> On Feb 6, 5:34 am, "Captain Paralytic" <paul_laut...@yahoo.com> wrote:
>
>
>
>
>
> > On Feb 6, 1:09 am, steven...@gmail.com wrote:
>
> > > On Feb 5, 3:35 pm, "Paul Lautman" <paul.laut...@btinternet.com> wrote:
>
> > > > steven...@gmail.com wrote:
> > > > > Hi Everyone,
>
> > > > > I have a solution to creating a report of our company's unit hours of
> > > > > activity (time spent serving a call) based on an accompanying call log
> > > > > table, however, my queries take HOURS and HOURS to run (8+). As a
> > > > > result I can't provide the report as often as the powers that be would
> > > > > like it. Our table contains more than 1 million records and sensitive
> > > > > data so I can't provide test data. :/
>
> > > > > Let me explain the situation.
>
> > > > > - Start and Stop times for each call are available
> > > > > USE stevensjn;
>
> > > > > DROP TABLE IF EXISTS `tbluhadata`;
> > > > > CREATE TABLE `tbluhadata` (
> > > > > `CallId` bigint(20) default NULL,
> > > > > `CallStart` datetime default NULL,
> > > > > `CallEnd` datetime default NULL,
> > > > > PRIMARY KEY (`CallId`)
> > > > > ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
>
> > > > > DROP TABLE IF EXISTS `_helper`;
> > > > > CREATE TABLE `_helper` (
> > > > > `i` tinyint(3) unsigned NOT NULL,
> > > > > PRIMARY KEY (`i`)
> > > > > ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
>
> > > > > DROP TABLE IF EXISTS `tbluhahours`;
> > > > > CREATE TABLE tbluhahours (
> > > > > StartDate datetime NOT NULL,
> > > > > EndDate datetime NOT NULL,
> > > > > PRIMARY KEY (`StartDate`,`EndDate`)
> > > > > ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
>
> > > > > CREATE TABLE _helper (i TINYINT UNSIGNED NOT NULL PRIMARY KEY);
> > > > > INSERT INTO _helper (i) VALUES (0), (1), (2), (3), (4), (5), (6), (7),
> > > > > (8), (9);
>
> > > > > SET @counter := -1;
> > > > > SET @start := '2000-01-01 00:00:00';
> > > > > SET @finish := '2010-12-31 23:59:59';
> > > > > INSERT INTO tbluhahours (StartDate, EndDate)
> > > > > SELECT
> > > > > @start + INTERVAL H.c HOUR,
> > > > > @start + INTERVAL H.c + 1 HOUR - INTERVAL 1 SECOND
>
> > > > > FROM (
> > > > > SELECT @counter := @counter + 1 AS c
> > > > > FROM _helper h1 -- 10
> > > > > CROSS JOIN _helper h2 -- 100
> > > > > CROSS JOIN _helper h3 -- 1000
> > > > > CROSS JOIN _helper h4 -- 10000
> > > > > CROSS JOIN _helper h5 -- 100000
> > > > > ) H
>
> > > > > WHERE
> > > > > @start + INTERVAL H.c HOUR < @finish;
> > > > > ////////////////// end
>
> > > > > Here is a set of queries I have put together that uses joins to
> > > > > determine elapsed times for each hour. Perhaps optimizing these
> > > > > queries will be enough to get the script working at a workable speed?
> > > > > These are the 8 hour queries. Any suggestions at all would be
> > > > > fantastic.
>
> > > > > USE stevensjn;
>
> > > > > CREATE TABLE tblUHADurations(
> > > > > SELECT
> > > > > tblUHAData.CallId,
> > > > > tblUHAData.CallStart,
> > > > > tblUHAData.CallEnd,
> > > > > tblUHAHours.StartDate,
> > > > > DATE_FORMAT(tblUHAHours.StartDate, "%H") AS Hour,
> > > > > UNIX_TIMESTAMP(tblUHAHours.EndDate)+1 -
> > > > > UNIX_TIMESTAMP(tblUHAData.CallStart) AS Duration
> > > > > FROM
> > > > > tblUHAData INNER JOIN tblUHAHours ON
> > > > > tblUHAHours.StartDate <= tblUHAData.CallStart AND
> > > > > tblUHAData.CallStart <= tblUHAHours.EndDate AND
> > > > > tblUHAHours.EndDate < tblUHAData.CallEnd
> > > > > );
>
> > > > > INSERT INTO tblUHADurations (
> > > > > SELECT
> > > > > tblUHAData.CallId,
> > > > > tblUHAData.CallStart,
> > > > > tblUHAData.CallEnd,
> > > > > tblUHAHours.StartDate,
> > > > > DATE_FORMAT(tblUHAHours.StartDate, "%H") AS Hour,
> > > > > 3600 AS Duration
> > > > > FROM
> > > > > tblUHAData INNER JOIN tblUHAHours ON
> > > > > tblUHAHours.StartDate > tblUHAData.CallStart AND
> > > > > tblUHAHours.EndDate < tblUHAData.CallEnd
> > > > > );
>
> > > > > INSERT INTO tblUHADurations (
> > > > > SELECT
> > > > > tblUHAData.CallId,
> > > > > tblUHAData.CallStart,
> > > > > tblUHAData.CallEnd,
> > > > > tblUHAHours.StartDate,
> > > > > DATE_FORMAT(tblUHAHours.StartDate, "%H") AS Hour,
> > > > > UNIX_TIMESTAMP(tblUHAData.CallEnd)+1 -
> > > > > UNIX_TIMESTAMP(tblUHAHours.StartDate) AS Duration
> > > > > FROM
> > > > > tblUHAData INNER JOIN tblUHAHours ON
> > > > > tblUHAHours.StartDate <= tblUHAData.CallEnd AND
> > > > > tblUHAData.CallEnd <= tblUHAHours.EndDate AND
> > > > > tblUHAHours.StartDate > tblUHAData.CallStart
> > > > > );
>
> > > > > INSERT INTO tblUHADurations (
> > > > > SELECT
> > > > > tblUHAData.CallId,
> > > > > tblUHAData.CallStart,
> > > > > tblUHAData.CallEnd,
> > > > > tblUHAHours.StartDate,
> > > > > DATE_FORMAT(tblUHAHours.StartDate, "%H") AS Hour,
> > > > > UNIX_TIMESTAMP(tblUHAData.CallEnd)+1 -
> > > > > UNIX_TIMESTAMP(tblUHAData.CallStart) AS Duration
> > > > > FROM
> > > > > tblUHAData INNER JOIN tblUHAHours ON
> > > > > tblUHAHours.StartDate <= tblUHAData.CallStart AND
> > > > > tblUHAData.CallEnd <= tblUHAHours.EndDate
> > > > > );
>
> > > > > Thanks for reading. With any luck someone has had to do something like
> > > > > this in the past, of will find the challenge interesting. 
>
> > > > > - Jake
>
> > > > I seem to remember posting about this question the other day and saying that
> > > > group by was a better solution (I just checked and I certainly did, but you
> > > > never responded to my poost).
>
> > > Using your advice I went ahead and wrote what I thought was a great
> > > solution using a group by query. Unfortunately, I ended up with a
> > > query that would sum elapsed seconds for calls that began in each hour
> > > of the day, but with disregard for calls that overlapped more than a
> > > single hour. So for instance with an elapsed time of > 3600 seconds
> > > all of those seconds would be grouped into the hour that which the
> > > call began in.
>
> > > Is there some way of doing the group by taking into account the
> > > possibility that calls may be longer than 3600 seconds? I'm not sure
> > > how.
>
> > > I had something like this:
>
> > > SELECT
> > > MAX(Year(CallStart)) AS theYear,
> > > MAX(Month(CallStart)) AS theMonth,
> > > MAX(Hour(CallStart) AS theHour,
> > > COUNT(*) AS NumCalls,
> > > SUM(UNIX_TIMESTAMP(CallEnd)-UNIX_TIMESTAMP(CallStart)) AS
> > > ElapsedSecond
> > > FROM tblUHAData
> > > WHERE
> > > Year(CallStart)=2006
> > > GROUP BY
> > > Year(CallStart), Month(CallStart), Hour(CallStart)
> > > ORDER BY
> > > 1,2,3
>
> > > Which of course SUMs total elapsed second for calls that start in each
> > > of the hours of the 24 hour clock.
>
> > > Can we change the group by to somehow reflect the beginning and ends
> > > of each call? I would love to know how.
>
> > > Thanks.
>
> > > - Jake- Hide quoted text -
>
> > > - Show quoted text -
>
> > So a call that lasted 2.5 hours would appear in three of the hourly
> > slots?
>
> No, the group by would only count it for the hour that in which it
> started, and then for the entire duration of the call. So a 2.5 hour
> call would SUM the entire duration of that call and add it to the
> grouping of the hour in which it started. This would add an extra 1.5
> hours of time to that hour, a bad thing. - Hide quoted text -
>
> - Show quoted text -
Confused.
Are you saying that for a 2.5 hour call you want (upto) 60 minutes
counted in the hour slot in which the call began and the rest of the
time you don't want counted at all? So if a 2.5 hour call started at
14:55 you would only ever count 5 minutes of the call? |