View Single Post

   
  #6 (permalink)  
Old 02-29-2008, 08:09 AM
Erland Sommarskog
 
Posts: n/a
Default Re: Can an update statement be used for interpolating missing data?

[posted and mailed, please reply in news]

Vorpal (brumac@gmail.com) writes:
> Here is a small sample of data from a table of about 500 rows
> (Using MSSqlserver 2000)
>
> EntryTime Speed Gross Net
> ------------------ ----- -----
> 21:09:13.310 0 0 0
> 21:09:19.370 9000 NULL NULL
> 21:09:21.310 NULL 95 NULL
> 21:10:12.380 9000 NULL NULL
> 21:10:24.310 NULL 253 NULL
> 21:11:24.370 8000 NULL NULL
> 21:11:27.310 NULL 410 NULL
> 21:11:51.320 NULL 438 NULL
> 21:11:51.490 NULL NULL 10
>
> After the first row, every row has only one value of the three.
> I would like to replace all the NULL values with calculated
> interpolations.
>
> I can do it w/ cursors or while loops.
> I could do it w/ VB (I think)
>
> Can this be done w/ an Update statement using self joins?


Not "an", but a couple. In the below script I get the data into a temp
table with an IDENTITY column, which has a consecutive number. I then
find the next and previous row with a non-NULL value for speed, for those
rows that have a NULL value. Once I have these pointers I can make the
interpolation. There is no extrapolation for the NULL values at the end.

The number of UPDATE statements could be reduced if you have three
sets of pointer columns, but I'm not sure that is worth the pain.

The script does not include handling of Net. That is left as an exercise
to the reader. :-)

CREATE TABLE tbl (entrytime datetime NOT NULL PRIMARY KEY,
speed int NULL,
gross int NULL,
net int NULL)
go
INSERT tbl(entrytime, speed, gross, net)
SELECT '21:09:13.310', 0, 0, 0 UNION
SELECT '21:09:19.370', 9000, NULL, NULL UNION
SELECT '21:09:21.310', NULL, 95, NULL UNION
SELECT '21:10:12.380', 9000, NULL, NULL UNION
SELECT '21:10:24.310', NULL, 253, NULL UNION
SELECT '21:11:24.370', 8000, NULL, NULL UNION
SELECT '21:11:27.310', NULL, 410, NULL UNION
SELECT '21:11:51.320', NULL, 438, NULL UNION
SELECT '21:11:51.490', NULL, NULL, 10
go
CREATE TABLE #temp (ident int IDENTITY UNIQUE,
entrytime datetime NOT NULL PRIMARY KEY,
speed int NULL,
gross int NULL,
net int NULL,
prevval int NULL,
nextval int NULL)

INSERT #temp(entrytime, speed, gross, net)
SELECT entrytime, speed, gross, net
FROM tbl
ORDER BY entrytime

UPDATE #temp
SET prevval = (SELECT MAX(t2.ident)
FROM #temp t2
WHERE t2.ident < t.ident
AND t2.speed IS NOT NULL)
FROM #temp t
WHERE t.speed IS NULL

UPDATE #temp
SET nextval = (SELECT MIN(t2.ident)
FROM #temp t2
WHERE t2.ident > t.ident
AND t2.speed IS NOT NULL)
FROM #temp t
WHERE t.speed IS NULL

UPDATE t
SET speed = p.speed +
1E0 * (n.speed - p.speed) * (t.ident - t.prevval) /
(t.nextval - t.prevval)
FROM #temp t
JOIN #temp p ON t.prevval = p.ident
JOIN #temp n ON t.nextval = n.ident
WHERE t.speed IS NULL

UPDATE #temp
SET prevval = NULL, nextval = NULL


UPDATE #temp
SET prevval = (SELECT MAX(t2.ident)
FROM #temp t2
WHERE t2.ident < t.ident
AND t2.gross IS NOT NULL)
FROM #temp t
WHERE t.gross IS NULL

UPDATE #temp
SET nextval = (SELECT MIN(t2.ident)
FROM #temp t2
WHERE t2.ident > t.ident
AND t2.gross IS NOT NULL)
FROM #temp t
WHERE t.gross IS NULL

UPDATE t
SET gross = p.gross +
1E0 * (n.gross - p.gross) * (t.ident - t.prevval) /
(t.nextval - t.prevval)
FROM #temp t
JOIN #temp p ON t.prevval = p.ident
JOIN #temp n ON t.nextval = n.ident
WHERE t.gross IS NULL

UPDATE #temp
SET prevval = NULL, nextval = NULL

go
UPDATE tbl
SET speed = t.speed,
gross = t.gross,
net = t.net
FROM tbl
JOIN #temp t ON tbl.entrytime = t.entrytime
go
SELECT * FROM #temp
SELECT * FROM tbl ORDER BY entrytime
go
DROP TABLE tbl
DROP TABLE #temp




--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Reply With Quote