biasedInteger allows generating an integer that is biased by a given function. See #304
* Returns a biased integer between $min and $max (both inclusive).
* The distribution depends on $function.
*
* The algorithm creates two doubles, x, y ∈ [0, 1) and checks whether the
I don't understand why you do this. Why isn't the following sufficient?
public function biasedInteger($min, $max, $function)
{
    $randMax = mt_getrandmax();
    return floor($function(mt_rand(0, $randMax) / $randMax) * ($max - $min + 1) + $min);
}
It would make creating the “weighting” functions less intuitive. My method maps the distribution directly onto a graph, so one can easily see: “Oh, that value is higher, so the number is more likely to pop out”.
Off the top of my head, I cannot think of any function that would generate a linearLow distribution using your method.
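The accept/reject idea described here can be sketched roughly as follows. This is only a minimal sketch, assuming the truncated docblock above describes rejection sampling; `biasedIntegerSketch` and `$linearLow` are hypothetical names chosen for illustration and are not taken from the pull request.

```php
<?php
// Hypothetical sketch of rejection sampling; not the code from the pull request.
// Assumes $weight returns values in [0, 1] and is positive somewhere,
// otherwise the loop would never terminate.
function biasedIntegerSketch($min, $max, $weight)
{
    $scale = mt_getrandmax() + 1; // dividing by max + 1 keeps draws in [0, 1)
    do {
        $x = mt_rand() / $scale; // candidate position within the range
        $y = mt_rand() / $scale; // height to compare against the graph
    } while ($y >= $weight($x)); // discard points that lie on or above the graph

    return $min + (int) floor($x * ($max - $min + 1));
}

// Assumed linearLow weight: the graph is highest at the low end.
$linearLow = function ($x) {
    return 1 - $x;
};

echo biasedIntegerSketch(1, 10, $linearLow), "\n";
```

With this shape, a higher value of the weight function at some point means candidates near that point are accepted more often, which is the "read it off the graph" property argued for above. The cost is the discarded iterations of the loop.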
How about:
function linearLow(x)
{
    return x / 2;
}
Your approach is really not fast enough because of the loop.
return x / 2;
This would produce every value in the lower half with equal probability and none from the upper half (so it is essentially the same as numberBetween($min, $max / 2)).
Sorry, I shouldn't do maths after midnight :)
But anyway, I think power and root functions already deal with 90% of the required biases.
Well, even powers are completely unintuitive with your suggestion. What distribution would you expect when using pow($x, 4)? Are the low, the medium or the high numbers favored? Think about it for a moment.
I would say the high ones, but after thinking about it one will see that in fact the lower numbers are favored!
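A quick numeric check of the pow($x, 4) case, as a sketch: `directMap` is a hypothetical helper that applies the direct mapping suggested earlier in this thread to evenly spaced inputs instead of random ones, so the result is reproducible.

```php
<?php
// Hypothetical helper: the direct mapping suggested earlier in the thread,
// applied to a given $x rather than a random draw.
function directMap($x, $min, $max, $function)
{
    return $min + (int) floor($function($x) * ($max - $min + 1));
}

$pow4 = function ($x) {
    return pow($x, 4);
};

$steps = 10000;
$counts = array_fill(1, 10, 0); // one counter per result 1..10
for ($i = 0; $i < $steps; $i++) {
    $x = $i / $steps; // evenly spaced stand-in for uniform input in [0, 1)
    $counts[directMap($x, 1, 10, $pow4)]++;
}

// array_slice works by position, so this sums the counters for results 1..5.
$lowerHalf = array_sum(array_slice($counts, 0, 5));
printf("%.1f%% of results fall in 1..5\n", 100 * $lowerHalf / $steps);
```

Roughly 84% of the inputs land in 1..5, because x^4 < 0.5 whenever x < 0.5^(1/4) ≈ 0.84. So under the direct mapping a power function favors the low numbers.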
Therefore I will not change the actual algorithm unless the distribution stays (roughly) the same. The numbers show that, depending on the chosen function, the biased generator runs at between 16% and 33% of the speed of numberBetween (see the loop iterations percentage). There are worse generators (e.g. numerify, which generates each digit independently).
If I could convince you I will happily fix the test failures and rename the function and whatnot, but if you still think that the algorithm is too slow then you may close the pull request.
After thinking about it again, I agree with your approach. Do you have time to make the requested changes to make it mergeable?
I'll try to take a look at it before the end of the week.
@fzaninotto The remaining issues should be fixed.
As a note: the failing tests are not my fault.
Thanks!
Hi, would this function be able to return a number biased towards a third number using a function? E.g. I want to generate users for a demographic with a min, max and mean age, with a function like sqrt, gauss etc. With min = 0, max = 100, mean = 33, function = ?, could it return a "falloff curve" around the mean rather than at one end or the other?
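One possible direction, offered only as a sketch: if the generator accepts candidates in proportion to the weight function (an assumption based on the discussion above, not on the merged code), a bell-shaped weight centred on the desired value gives a falloff on both sides. `bellAround` and `sampleBiased` are hypothetical names, and the spread of 0.1 is an arbitrary choice.

```php
<?php
// Hypothetical weight factory: a bell curve over [0, 1) that peaks at $peak.
function bellAround($peak, $spread)
{
    return function ($x) use ($peak, $spread) {
        return exp(-pow($x - $peak, 2) / (2 * $spread * $spread));
    };
}

// Accept/reject loop assumed for the biased generator (not the merged code).
function sampleBiased($min, $max, $weight)
{
    $scale = mt_getrandmax() + 1; // keeps draws in [0, 1)
    do {
        $x = mt_rand() / $scale;
        $y = mt_rand() / $scale;
    } while ($y >= $weight($x)); // keep only points below the curve

    return $min + (int) floor($x * ($max - $min + 1));
}

// Peak at 33 within 0..100, expressed as a fraction of the range.
$age = sampleBiased(0, 100, bellAround(33 / 100, 0.1));
echo $age, "\n";
```

Note that the peak sets the most likely value, not the exact mean: because the curve is cut off at min and max, the average only lands close to 33 when the spread is small relative to the distance from the peak to either end.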
Hi, what other values can we use for the 3rd parameter? Tried
biasedInteger allows generating an integer that is biased by a given function.
See #304
See this example data:
Loop iterations are the number of times the contents of the loop were executed; 100000 is perfect, meaning no values were discarded. For even more fascination see this sqrt distribution of a higher number range 😄