I need an ffs function for 64-bit arguments. Prompted by the fact that the libc ffs() only takes int arguments, I wrote a 3-line ffs64() function that uses popc, following the classical example in the SPARC V9 Architecture Manual and various other textbooks:

    neg   %o0, %o1
    xnor  %o0, %o1, %o0
    retl
    popc  %o0, %o0

(zero inputs are filtered out before calling the function).

Much to my surprise this takes 1800ns to 2000ns per call on a 1.2 GHz UltraSPARC III, depending on the number of bits set, compared to 50ns to 125ns for the naive C implementation (return ffs(low) if low != 0, otherwise return ffs(high) + 32).

Digging a little deeper into the matter, I timed POPC by itself, and sure enough it accounts for practically all the time.

Why is POPC so slow?

dk
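For readers following along, here is a minimal C sketch of the two approaches described in the post. The function names (ffs64_naive, popcount64, ffs64_popc) are illustrative and not from the original post, and the popcount is a portable bit-twiddling version standing in for the POPC instruction, so it says nothing about POPC timing. It assumes a 32-bit int and a POSIX ffs() from <strings.h>.

```c
#include <stdint.h>
#include <strings.h>   /* POSIX ffs() */

/* Naive version from the post: check the low word, then the high word.
 * Returns a 1-based bit index. Unlike the post's version, which relies on
 * the caller filtering out zero, this sketch returns 0 for x == 0. */
static int ffs64_naive(uint64_t x)
{
    uint32_t low  = (uint32_t)x;
    uint32_t high = (uint32_t)(x >> 32);

    if (low != 0)
        return ffs((int)low);
    if (high != 0)
        return ffs((int)high) + 32;
    return 0;
}

/* Portable 64-bit population count (software stand-in for POPC). */
static int popcount64(uint64_t v)
{
    v = v - ((v >> 1) & 0x5555555555555555ULL);
    v = (v & 0x3333333333333333ULL) + ((v >> 2) & 0x3333333333333333ULL);
    v = (v + (v >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
    return (int)((v * 0x0101010101010101ULL) >> 56);
}

/* C equivalent of the neg/xnor/popc sequence: ~(x ^ -x) sets the lowest
 * set bit of x and every bit below it, so its popcount is the 1-based
 * index of that bit. For x == 0 the mask is all ones and the result is 64,
 * which is why zero has to be filtered out beforehand. */
static int ffs64_popc(uint64_t x)
{
    uint64_t mask = ~(x ^ (0 - x));
    return popcount64(mask);
}
```

Both functions agree for every nonzero input; they differ only at zero, where the naive sketch returns 0 and the popcount form returns 64.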
"Dan" == Dan Koren <dankoren@yahoo.com> writes:

Dan> Why is POPC so slow?

Others more versed in the various UltraSPARC implementations should comment, but my understanding is that POPC is emulated in most implementations.

--
Dave Marquardt
Sun Microsystems, Inc.
Austin, TX
+1 512 401-1077