Weighted Random Selection Optimizing
Posted: Wed Dec 16, 2015 2:19 pm
I have an LC project where I need to select a random number from a weighted array of numbers. The weighting is essentially a bias on randomly selected the element. I've written a routine that I think is an efficient solution to this problem. I'd love feedback/advice on how to make it faster as I am going to be calling this a significant number of times in the Application.
The logic is as as follows -- create a cumulative sum array of the weights, then select a random number between 1 and the cumulative sum total, then find the index that corresponds to this random number using a binary search.
Example:
put 1 into tDataA[1]
put 2 into tDataA[2]
put 4 into tDataA[3]
put 2 into tDataA[4]
put 1 into tDataA[5]
In a random draw (over a significant number of draws...)
Element 1 is selected 10%
Element 4 is selected 20%
Element 2 is selected 40%
Element 3 is selected 20%
Element 4 is selected 10%
Thanks in advance for any thoughts.
Bob Hall
The logic is as as follows -- create a cumulative sum array of the weights, then select a random number between 1 and the cumulative sum total, then find the index that corresponds to this random number using a binary search.
Example:
put 1 into tDataA[1]
put 2 into tDataA[2]
put 4 into tDataA[3]
put 2 into tDataA[4]
put 1 into tDataA[5]
In a random draw (over a significant number of draws...)
Element 1 is selected 10%
Element 4 is selected 20%
Element 2 is selected 40%
Element 3 is selected 20%
Element 4 is selected 10%
Thanks in advance for any thoughts.
Bob Hall
Code: Select all
function weightedRandomSelection pDataA
local tSum=0
local i=1
local tRandomNum, tResult, tResultA
-- create a new array with a running summation
repeat for each element aItem in pDataA
add pDataA[i] to tSum
put tSum into tResultA[i]
add 1 to i
end repeat
-- select a random number from 1 to the sum total of all weights
put random(tSum) into tRandomNum
-- find the index of the element in the array that the random number points to
put binarySearch( tResultA, tRandomNum, 1, (the number of elements of tResultA)) into tResult
return tResult
end weightedRandomSelection
/*--------------------------------------------------------------
returns the index of the Array where pKey is in the range of
--------------------------------------------------------------*/
function binarySearch @pArray pKey pMin pMax
local tMid
local tResult
-- find the mid point of the Min & Max
put round((pMin + pMax) / 2) into tMid
-- if we are at the first index then we are done
if tMid = 1 then
return tMid
-- if we have found the matching item then stop processing and return it
else if pArray[tMid-1] < pKey and pArray[tMid] >= pKey then
return tMid
--if any of the range pointers have the same value
--then we've hit the start of the list
else if (pMax = tMid or pMin = pMax or pMin = tMid) then
return 1 -- return the first element
-- split the search in half depending on whether we need
-- the upper or lower half.
else if pArray[tMid] > pKey then
put binarySearch(pArray, pKey, pMin, tMid) into tResult
else
put binarySearch(pArray, pkey, tMid, pMax) into tResult
end if
return tResult
end binarySearch