Multi-dimensional weighted random roll (choice, dice, selection)
Suppose that you want to roll a 6-sided die in APL. That's trivial:
?6
0
?6
4
?6
0
?6
3
Now suppose that you wanted to roll a bunch of times. APL naturally scales, since the "roll" function ?
is a primitive scalar function:
?10⍴6
1 5 5 2 0 2 1 0 0 4
You could also roll a bunch of dice of different sides:
?3 4 5 6 8 10 12 20
0 3 2 3 4 6 9 3
But now suppose that we want to roll a weighted die, where each number comes up with a certain probability. Say we have 3 choices, and we want two of them to show up 1/4 of the time, and the other half of the time. A solution to this is to utilize the Roll function's ability to produce a uniformly distributed float on the interval (0, 1)
. We can slice this range into an arbitrary number of intervals, then use the Interval Index function ⍸
to select between them:
(+⍀0 0.25 0.5 0.25)⍸?7⍴0
1 1 1 2 1 0 1
That's a pretty fun trick, but it has a fatal flaw, at least apparently. We can produce as many values as we want in one go by changing the number of zeros that we pass to the roll function (7 in the above example). However, what if we want to have an arbitrary number of numbers where we want to choose an arbitrary number of different distributions for each roll? Suppose that we have 10 numbers, but two different distributions for a 6-side die? Suppose that the first distribution, we want the even numbers to show up twice as often as the odds, and the opposite for the other distribution? So, we have two different weighted dice, and we want to throw the odd-weighted die 7 times, but the even-weighted die only 3 times? How would we do that?
The initial difficulty here is that Interval Index (⍸
) only works on a single set of intervals, and it doesn't work multiple interval sets at the same time. Interval Index is a major-cell oriented function, so even if you pass it a matrix, it will treat that matrix as a single set of intervals between each major cell. That's not what we want in this case.
The solution here is to flatten the multiple intervals into a single interval set. Since each distribution is defined along the range (0, 1)
, we can make distribution N
to the range (N, N+1)
. Thus, the solution for our puzzle above might look like this:
even←(6⍴2 1)÷9
even
0.22222 0.11111 0.22222 0.11111 0.22222 0.11111
⊢odd←(6⍴1 2)÷9
0.11111 0.22222 0.11111 0.22222 0.11111 0.22222
+⍀0,even,odd
0 0.22222 0.33333 0.55556 0.66667 0.88889 1 1.1111 1.3333 1.4444 1.6667 1.7778 2
?10⍴0
0.54943 0.97562 0.09218 0.18439 0.17096 0.86066 0.27273 0.21014 0.99283 0.91454
(7⍴1),3⍴0
1 1 1 1 1 1 1 0 0 0
((7⍴1),3⍴0)+?10⍴0
1.7619 1.0723 1.5568 1.3648 1.5355 1.2023 1.6845 0.8264 0.72646 0.0029732
(+⍀0,even,odd)⍸((7⍴1),3⍴0)+?10⍴0
9 7 11 7 7 11 10 2 2 4
You'll notice that this method will work even when our weighted dice have different numbers of sides.
As an exercise to the reader, suppose that you have a general distribution of N
weighted sides for a die, but you want to "derive" a bunch of similar dice with the same basic distribution with certain "sides" disabled. Then, you want to roll all of these dice various numbers of times. How might you do this? I am using this technique to solve certain types of Markov-style generation problems.