-
Notifications
You must be signed in to change notification settings - Fork 36
Tutorial 4 of 5: fixing lower order parameters
In the previous section, we took advantage of the symmetry of
Leading to the following coefficient, amongst others:
return u * x + 9.9999788400553332261e-1f;
This is an interesting value, because it is very close to 1. Many CPUs can load the value 1 very quickly, which can be a potential runtime gain.
We may wonder: what if we used 1
directly here?
return u * x + 1.f;
Let’s see the error value:
Duh. Pretty bad, actually. Maximum error is about 10 times worse.
The clever way involves some more maths. Instead of looking for polynomial
Dividing numerator and denominator by
Once again, we get a form suitable for the Remez algorithm.
lolremez --degree 3 --range "1e-50:pi*pi/4" "(sin(sqrt(x))-sqrt(x))/(x*sqrt(x))" "1/(x*sqrt(x))"
Only
After all the iterations the output should be as follows:
/* Approximation of f(x) = (sin(sqrt(x))-sqrt(x))/(x*sqrt(x))
* with weight function g(x) = 1/(x*sqrt(x))
* on interval [ 9.9999999999999999999e-51, 2.4674011002723396547 ]
* with a polynomial of degree 3. */
float f(float x)
{
float u = 2.5180282371807974667e-6f;
u = u * x + -1.9754457746087556056e-4f;
u = u * x + 8.3319148411123829805e-3f;
return u * x + -1.6666579517227416457e-1f;
}
We can therefore write the corresponding C++ function:
float fastsin2(float x)
{
return x + x * f(x * x);
}
Let’s see the new error curve:
Excellent! The loss of precision is clearly not as bad as before, and we need to load 4 constants instead of 5 previously.
You should now be able to fix lower-order coefficients in the minimax polynomial for possible performance improvements.
Please report any trouble you may have had with this document to sam@hocevar.net. You may then carry on to the next section: additional tips.