On Tue, 2005-04-26 at 15:00 -0700, Gurmeet Manku wrote:

> 2. In a single scan, it is possible to estimate n_distinct by using
>    a very simple algorithm:
>
>    "Distinct sampling for highly-accurate answers to distinct value
>    queries and event reports" by Gibbons, VLDB 2001.
>
>    http://www.aladdin.cs.cmu.edu/papers...dist_sampl.pdf

That looks like the one...

...though it looks like some more complex changes to the current
algorithm to use it, and we want the other stats as well...

> 3. In fact, Gibbon's basic idea has been extended to "sliding windows"
>    (this extension is useful in streaming systems like Aurora / Stream):
>
>    "Distributed streams algorithms for sliding windows"
>    by Gibbons and Tirthapura, SPAA 2002.
>
>    http://home.eng.iastate.edu/~snt/research/tocs.pdf

...and this offers the possibility of calculating statistics at load
time, as part of the COPY command

Best Regards, Simon Riggs