Page 1 of 1

Weighted Random Selection Optimizing

Posted: Wed Dec 16, 2015 2:19 pm
by bhall2001
I have an LC project where I need to select a random number from a weighted array of numbers. The weighting is essentially a bias on randomly selected the element. I've written a routine that I think is an efficient solution to this problem. I'd love feedback/advice on how to make it faster as I am going to be calling this a significant number of times in the Application.

The logic is as as follows -- create a cumulative sum array of the weights, then select a random number between 1 and the cumulative sum total, then find the index that corresponds to this random number using a binary search.

Example:
put 1 into tDataA[1]
put 2 into tDataA[2]
put 4 into tDataA[3]
put 2 into tDataA[4]
put 1 into tDataA[5]

In a random draw (over a significant number of draws...)
Element 1 is selected 10%
Element 4 is selected 20%
Element 2 is selected 40%
Element 3 is selected 20%
Element 4 is selected 10%

Thanks in advance for any thoughts.
Bob Hall

Code: Select all

function weightedRandomSelection pDataA
   local tSum=0
   local i=1
   local tRandomNum, tResult, tResultA
   
   -- create a new array with a running summation
   repeat for each element aItem in pDataA
      add pDataA[i] to tSum
      put tSum into tResultA[i]
      add 1 to i
   end repeat
   
   -- select a random number from 1 to the sum total of all weights
   put random(tSum) into tRandomNum
   
   -- find the index of the element in the array that the random number points to
   put binarySearch( tResultA, tRandomNum, 1, (the number of elements of tResultA)) into tResult
   
   return tResult
end weightedRandomSelection

/*--------------------------------------------------------------
returns the index of the Array where pKey is in the range of

--------------------------------------------------------------*/
function binarySearch @pArray pKey pMin pMax
   local tMid
   local tResult
   
   -- find the mid point of the Min & Max
   put round((pMin + pMax) / 2) into tMid
   
   -- if we are at the first index then we are done
   if tMid = 1 then
      return tMid
      
      -- if we have found the matching item then stop processing and return it
   else if pArray[tMid-1] < pKey and pArray[tMid] >= pKey then
      return tMid
      
      --if any of the range pointers have the same value
      --then we've hit the start of the list
   else if (pMax = tMid or pMin = pMax or pMin = tMid) then
      return 1 -- return the first element
      
      -- split the search in half depending on whether we need
      -- the upper or lower half.
   else if pArray[tMid] > pKey then
      put binarySearch(pArray, pKey, pMin, tMid) into tResult
   else
      put binarySearch(pArray, pkey, tMid, pMax) into tResult
   end if
   return tResult
end binarySearch

Re: Weighted Random Selection Optimizing

Posted: Thu Dec 17, 2015 5:08 am
by sritcp
HI Bob:

A simple weighted enumeration may work best if the weights are fixed beforehand but the randomization is going to be called repeatedly.

Code: Select all

# Say, tWeights[i] is the weight of option i

put empty into tSelectFrom

# Line up the weighted options in a single list
put 1 into i
repeat for each line tKey in the keys of tWeights
  repeat tWeights[tKey] times
    put tKey & comma after tSelectFrom
  end repeat
end repeat
delete the last char of tSelectFrom

# call randomization
put the number of items of tSelectFrom into tRange
put item random(tRange) of tSelectFrom into tRandom
Regards,
Sri

Note:
Your repeat structure

repeat for each element aItem in pDataA
      add pDataA to tSum
.......

is incorrect. You'll need to use aItem, not pData, inside the repeat loop.
(Or, use
repeat with i = 1 to ......
form)