math - Why is fast inverse square root so odd and slow on Java?
I'm trying to implement fast inverse square root in Java in order to speed up vector normalization. However, when I implement the single-precision version in Java, its speed is about the same as 1f / (float) Math.sqrt() at first, and then it quickly drops to half that speed. This is interesting, because while Math.sqrt uses (I presume) a native method, the control involves a floating point division, which I've heard is slow. My code for computing the numbers is as follows:
    public static float fastInverseSquareRoot(float x) {
        float xhalf = 0.5f * x;
        // Reinterpret the float's bits as an int to form the initial guess.
        int temp = Float.floatToRawIntBits(x);
        temp = 0x5f3759df - (temp >> 1);
        float newX = Float.intBitsToFloat(temp);
        // One Newton-Raphson step to refine the guess.
        newX = newX * (1.5f - xhalf * newX * newX);
        return newX;
    }
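Since the stated goal is vector normalization, here is a minimal sketch of how the routine above might be used for it. The class name `FastNormalize` and the helper `normalize` are hypothetical and not part of the original program; the sketch assumes a non-zero, finite 3-component vector. With a single Newton-Raphson step the result is only approximately unit length (the error is on the order of a tenth of a percent).

```java
import java.util.Arrays;

// Hypothetical helper class; only fastInverseSquareRoot comes from the question.
final class FastNormalize {

    // Same routine as in the question: bit-level first guess plus one Newton-Raphson step.
    static float fastInverseSquareRoot(float x) {
        float xhalf = 0.5f * x;
        int temp = Float.floatToRawIntBits(x);
        temp = 0x5f3759df - (temp >> 1);
        float newX = Float.intBitsToFloat(temp);
        return newX * (1.5f - xhalf * newX * newX);
    }

    // Scales v in place to approximately unit length.
    // Assumption: v has three components and is neither zero nor non-finite.
    static void normalize(float[] v) {
        float lengthSquared = v[0] * v[0] + v[1] * v[1] + v[2] * v[2];
        float inverseLength = fastInverseSquareRoot(lengthSquared);
        v[0] *= inverseLength;
        v[1] *= inverseLength;
        v[2] *= inverseLength;
    }

    public static void main(String[] args) {
        float[] v = {3f, 0f, 4f};
        normalize(v);
        double length = Math.sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2]);
        // The printed length should be close to, but not exactly, 1.
        System.out.println(Arrays.toString(v) + " length = " + length);
    }
}
```

Whether this actually beats multiplying by 1f / (float) Math.sqrt(lengthSquared) is exactly what the timings in this thread are trying to establish.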
Using a short program I've written to run each version 16 million times, aggregate the results, and repeat, I get results like this:
    1f / Math.sqrt() took 65209490 nanoseconds.
    Fast inverse square root took 65456128 nanoseconds.
    Fast inverse square root was 0.378224 percent slower than 1f / Math.sqrt()

    1f / Math.sqrt() took 64131293 nanoseconds.
    Fast inverse square root took 26214534 nanoseconds.
    Fast inverse square root was 59.123647 percent faster than 1f / Math.sqrt()

    1f / Math.sqrt() took 27312205 nanoseconds.
    Fast inverse square root took 56234714 nanoseconds.
    Fast inverse square root was 105.895914 percent slower than 1f / Math.sqrt()

    1f / Math.sqrt() took 26493281 nanoseconds.
    Fast inverse square root took 56004783 nanoseconds.
    Fast inverse square root was 111.392402 percent slower than 1f / Math.sqrt()
I consistently get numbers that are about the same speed for both, followed by an iteration where fast inverse square root saves about 60 percent of the time required by 1f / Math.sqrt(), followed by several iterations where fast inverse square root takes about twice as long to run as the control. I'm confused why FISR would go from same -> 60 percent faster -> 100 percent slower, and it happens every time I run my program.
Edit: The above data is from running it in Eclipse. When I run the program with javac/java I get much different data:
    1f / Math.sqrt() took 57870498 nanoseconds.
    Fast inverse square root took 88206794 nanoseconds.
    Fast inverse square root was 52.421004 percent slower than 1f / Math.sqrt()

    1f / Math.sqrt() took 54982400 nanoseconds.
    Fast inverse square root took 83777562 nanoseconds.
    Fast inverse square root was 52.371599 percent slower than 1f / Math.sqrt()

    1f / Math.sqrt() took 21115822 nanoseconds.
    Fast inverse square root took 76705152 nanoseconds.
    Fast inverse square root was 263.259133 percent slower than 1f / Math.sqrt()

    1f / Math.sqrt() took 20159210 nanoseconds.
    Fast inverse square root took 80745616 nanoseconds.
    Fast inverse square root was 300.539585 percent slower than 1f / Math.sqrt()

    1f / Math.sqrt() took 21814675 nanoseconds.
    Fast inverse square root took 85261648 nanoseconds.
    Fast inverse square root was 290.845374 percent slower than 1f / Math.sqrt()
Edit 2: After a few responses, it seems that the speed stabilizes after several iterations, but the number it stabilizes to is highly volatile. Does anyone have any idea why?
Here's my code (not exactly concise, but here's the whole thing):
    public class FastInverseSquareRootTest {

        public static FastInverseSquareRootTest conductTest() {
            float result = 0f;
            long startTime, endTime, midTime;
            startTime = System.nanoTime();
            // Control: result is overwritten each iteration and never read.
            for (float x = 1f; x < 4_000_000f; x += 0.25f) {
                result = 1f / (float) Math.sqrt(x);
            }
            midTime = System.nanoTime();
            for (float x = 1f; x < 4_000_000f; x += 0.25f) {
                result = fastInverseSquareRoot(x);
            }
            endTime = System.nanoTime();
            return new FastInverseSquareRootTest(midTime - startTime, endTime - midTime);
        }

        public static float fastInverseSquareRoot(float x) {
            float xhalf = 0.5f * x;
            int temp = Float.floatToRawIntBits(x);
            temp = 0x5f3759df - (temp >> 1);
            float newX = Float.intBitsToFloat(temp);
            newX = newX * (1.5f - xhalf * newX * newX);
            return newX;
        }

        public static void main(String[] args) throws Exception {
            for (int i = 0; i < 7; i++) {
                System.out.println(conductTest().toString());
            }
        }

        private long controlDiff;
        private long experimentalDiff;
        private double percentError;

        public FastInverseSquareRootTest(long controlDiff, long experimentalDiff) {
            this.experimentalDiff = experimentalDiff;
            this.controlDiff = controlDiff;
            this.percentError = 100d * (experimentalDiff - controlDiff) / controlDiff;
        }

        @Override
        public String toString() {
            StringBuilder sb = new StringBuilder();
            sb.append(String.format("1f / Math.sqrt() took %d nanoseconds.%n", controlDiff));
            sb.append(String.format(
                    "Fast inverse square root took %d nanoseconds.%n", experimentalDiff));
            sb.append(String.format(
                    "Fast inverse square root was %f percent %s than 1f / Math.sqrt()%n",
                    Math.abs(percentError), percentError > 0d ? "slower" : "faster"));
            return sb.toString();
        }
    }
The JIT optimiser seems to have thrown the call to Math.sqrt away.
With your unmodified code, I got
    1f / Math.sqrt() took 65358495 nanoseconds.
    Fast inverse square root took 77152791 nanoseconds.
    Fast inverse square root was 18,045544 percent slower than 1f / Math.sqrt()

    1f / Math.sqrt() took 52872498 nanoseconds.
    Fast inverse square root took 75242075 nanoseconds.
    Fast inverse square root was 42,308531 percent slower than 1f / Math.sqrt()

    1f / Math.sqrt() took 23386359 nanoseconds.
    Fast inverse square root took 73532080 nanoseconds.
    Fast inverse square root was 214,422951 percent slower than 1f / Math.sqrt()

    1f / Math.sqrt() took 23790209 nanoseconds.
    Fast inverse square root took 76254902 nanoseconds.
    Fast inverse square root was 220,530610 percent slower than 1f / Math.sqrt()

    1f / Math.sqrt() took 23885467 nanoseconds.
    Fast inverse square root took 74869636 nanoseconds.
    Fast inverse square root was 213,452678 percent slower than 1f / Math.sqrt()

    1f / Math.sqrt() took 23473514 nanoseconds.
    Fast inverse square root took 73063699 nanoseconds.
    Fast inverse square root was 211,260168 percent slower than 1f / Math.sqrt()

    1f / Math.sqrt() took 23738564 nanoseconds.
    Fast inverse square root took 71917013 nanoseconds.
    Fast inverse square root was 202,954353 percent slower than 1f / Math.sqrt()
That is consistently slower times for fastInverseSquareRoot, with all of its times in the same ball-park, while the Math.sqrt calls sped up considerably.
Changing the code so that the calls to Math.sqrt couldn't be avoided,
    for (float x = 1f; x < 4_000_000f; x += 0.25f) {
        result += 1f / (float) Math.sqrt(x);
    }
    midTime = System.nanoTime();
    for (float x = 1f; x < 4_000_000f; x += 0.25f) {
        result -= fastInverseSquareRoot(x);
    }
    endTime = System.nanoTime();
    // Reading result afterwards keeps both loops from being treated as dead code.
    if (result == 0) System.out.println("Wow!");
I got
    1f / Math.sqrt() took 184884684 nanoseconds.
    Fast inverse square root took 85298761 nanoseconds.
    Fast inverse square root was 53,863804 percent faster than 1f / Math.sqrt()

    1f / Math.sqrt() took 182183542 nanoseconds.
    Fast inverse square root took 83040574 nanoseconds.
    Fast inverse square root was 54,419278 percent faster than 1f / Math.sqrt()

    1f / Math.sqrt() took 165269658 nanoseconds.
    Fast inverse square root took 81922280 nanoseconds.
    Fast inverse square root was 50,431143 percent faster than 1f / Math.sqrt()

    1f / Math.sqrt() took 163272877 nanoseconds.
    Fast inverse square root took 81906141 nanoseconds.
    Fast inverse square root was 49,834815 percent faster than 1f / Math.sqrt()

    1f / Math.sqrt() took 165314846 nanoseconds.
    Fast inverse square root took 81124465 nanoseconds.
    Fast inverse square root was 50,927296 percent faster than 1f / Math.sqrt()

    1f / Math.sqrt() took 164079534 nanoseconds.
    Fast inverse square root took 80453629 nanoseconds.
    Fast inverse square root was 50,966689 percent faster than 1f / Math.sqrt()

    1f / Math.sqrt() took 162350821 nanoseconds.
    Fast inverse square root took 79854355 nanoseconds.
    Fast inverse square root was 50,813704 percent faster than 1f / Math.sqrt()
That gives much slower times for Math.sqrt, and only moderately slower times for fastInverseSquareRoot (which now had to do a subtraction in each iteration).
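The two observations in this thread (timings only settle after a few rounds, and an unused result lets the JIT drop the work) can be folded into one harness. What follows is only a rough sketch under stated assumptions: the class name `InvSqrtBench`, the helpers `sumExact` and `sumFast`, and the choice of five warm-up rounds are all made up for illustration. It accumulates every result into a value that is returned and printed, and it runs untimed warm-up rounds before measuring. For measurements you intend to rely on, a dedicated harness such as JMH is the usual recommendation.

```java
// Hypothetical benchmark sketch; names and round counts are illustrative assumptions.
final class InvSqrtBench {

    // Same routine as in the question.
    static float fastInverseSquareRoot(float x) {
        float xhalf = 0.5f * x;
        int temp = Float.floatToRawIntBits(x);
        temp = 0x5f3759df - (temp >> 1);
        float newX = Float.intBitsToFloat(temp);
        return newX * (1.5f - xhalf * newX * newX);
    }

    // Sums 1/sqrt(x) over x = 1, 1.25, 1.5, ... (n values).
    // Returning the sum keeps the loop body from being dead code.
    static double sumExact(int n) {
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            float x = 1f + 0.25f * i;
            sum += 1f / (float) Math.sqrt(x);
        }
        return sum;
    }

    // Same loop, using the approximation instead.
    static double sumFast(int n) {
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            float x = 1f + 0.25f * i;
            sum += fastInverseSquareRoot(x);
        }
        return sum;
    }

    public static void main(String[] args) {
        final int n = 16_000_000; // roughly the iteration count used in the question
        double sink = 0.0;

        // Untimed warm-up rounds so the JIT can compile both loops first.
        for (int round = 0; round < 5; round++) {
            sink += sumExact(n) + sumFast(n);
        }

        // Timed rounds.
        for (int round = 0; round < 5; round++) {
            long start = System.nanoTime();
            sink += sumExact(n);
            long mid = System.nanoTime();
            sink += sumFast(n);
            long end = System.nanoTime();
            System.out.printf("exact: %d ns, fast: %d ns%n", mid - start, end - mid);
        }

        // Printing the accumulated value makes every result observable.
        System.out.println("sink = " + sink);
    }
}
```

Putting each loop in its own method is a deliberate choice here: each one can be compiled on its own, rather than both being compiled as parts of one long-running method as in the original `conductTest`.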