English 中文(简体)
Replace duplicate values in array with new randomly generated values
原标题:

I have below a function (from a previous question that went unanswered) that creates an array with n amount of values. The sum of the array is equal to $max.

function randomDistinctPartition($n, $max) {
  $partition= array();
  for ($i = 1; $i < $n; $i++) {
    $maxSingleNumber = $max - $n;
    $partition[] = $number = rand(1, $maxSingleNumber);
    $max -= $number;
  }
  $partition[] = $max;
  return $partition;
}

For example: If I set $n = 4 and $max = 30. Then I should get the following.

array(5, 7, 10, 8);

However, this function does not take into account duplicates and 0s. What I would like - and have been trying to accomplish - is to generate an array with unique numbers that add up to my predetermined variable $max. No Duplicate numbers and No 0 and/or negative integers.

最佳回答

Ok, this problem actually revolves around linear sequences. With a minimum value of 1 consider the sequence:

f(n) = 1 + 2 + ... + n - 1 + n

The sum of such a sequence is equal to:

f(n) = n * (n + 1) / 2

so for n = 4, as an example, the sum is 10. That means if you re selecting 4 different numbers the minimum total with no zeroes and no negatives is 10. Now go in reverse: if you have a total of 10 and 4 numbers then there is only one combination of (1,2,3,4).

So first you need to check if your total is at least as high as this lower bound. If it is less there is no combination. If it is equal, there is precisely one combination. If it is higher it gets more complicated.

Now imagine your constraints are a total of 12 with 4 numbers. We ve established that f(4) = 10. But what if the first (lowest) number is 2?

2 + 3 + 4 + 5 = 14

So the first number can t be higher than 1. You know your first number. Now you generate a sequence of 3 numbers with a total of 11 (being 12 - 1).

1 + 2 + 3 = 6
2 + 3 + 4 = 9
3 + 4 + 5 = 12

The second number has to be 2 because it can t be one. It can t be 3 because the minimum sum of three numbers starting with 3 is 12 and we have to add to 11.

Now we find two numbers that add up to 9 (12 - 1 - 2) with 3 being the lowest possible.

3 + 4 = 7
4 + 5 = 9

The third number can be 3 or 4. With the third number found the last is fixed. The two possible combinations are:

1, 2, 3, 6
1, 2, 4, 5

You can turn this into a general algorithm. Consider this recursive implementation:

$all = all_sequences(14, 4);
echo "
All sequences:

";
foreach ($all as $arr) {
  echo implode( ,  , $arr) . "
";
}

function all_sequences($total, $num, $start = 1) {
  if ($num == 1) {
    return array($total);
  }
  $max = lowest_maximum($start, $num);
  $limit = (int)(($total - $max) / $num) + $start;
  $ret = array();
  if ($num == 2) {
    for ($i = $start; $i <= $limit; $i++) {
      $ret[] = array($i, $total - $i);
    }
  } else {
    for ($i = $start; $i <= $limit; $i++) {
      $sub = all_sequences($total - $i, $num - 1, $i + 1);
      foreach ($sub as $arr) {
        array_unshift($arr, $i);
        $ret[] = $arr;
      }
    }
  }
  return $ret;
}

function lowest_maximum($start, $num) {
  return sum_linear($num) + ($start - 1) * $num;
}

function sum_linear($num) {
  return ($num + 1) * $num / 2;
}

Output:

All sequences:

1, 2, 3, 8
1, 2, 4, 7
1, 2, 5, 6
1, 3, 4, 6
2, 3, 4, 5

One implementation of this would be to get all the sequences and select one at random. This has the advantage of equally weighting all possible combinations, which may or may not be useful or necessary to what you re doing.

That will become unwieldy with large totals or large numbers of elements, in which case the above algorithm can be modified to return a random element in the range from $start to $limit instead of every value.

问题回答

I would use area under triangle formula... like cletus(!?) Im really gonna have to start paying more attention to things...

Anyway, i think this solution is pretty elegant now, it applies the desired minimum spacing between all elements, evenly, scales the gaps (distribution) evenly to maintain the original sum and does the job non-recursively (except for the sort):

Given an array a() of random numbers of length n

Generate a sort index s()

and work on the sorted intervals a(s(0))-a(s(1)), a(s(1))-a(s(2)) etc

  1. increase each interval by the desired minimum separation size eg 1 (this necessarily warps their randomness )

  2. decrease each interval by a factor calculated to restore the series sum to what it is without the added spacing.

If we add 1 to each of a series we increase the series sum by 1 * len

1 added to each of series intervals increases sum by: len*(len+1)/2 //( ?pascal s triangle )

Draft code:

$series($length);        //the input sequence 
$seriesum=sum($series);  //its sum
$minsepa=1;              //minimum separation

$sorti=sort_index_of($series) //sorted index - php haz function?

$sepsum=$minsepa*($length*($length+1))/2; 
//sum of extra separation

$unsepfactor100=($seriesum*100)/($seriesum+sepsum); 
//scale factor for original separation to maintain size
//(*100~ for integer arithmetic)

$px=series($sorti(0)); //for loop needs the value of prev serie

for($x=1 ; $x < length; $x++)
{ $tx=$series($sorti($x));             //val of serie to
  $series($sorti($x))= ($minsepa*$x)   //adjust relative to prev
                     + $px 
                     + (($tx-$px)*$unsepfactor100)/100; 

  $px=$tx;                             //store for next iteration 
  }
  • all intervals are reduced by a constant (non-random-warping-factor)
  • separation can be set to values other than one
  • implementantions need to be carefuly tweaked (i usualy test& calibrate )
    to accomodate rounding errors. Probably scale everything up by ~15 then back down after. Intervals should survive if done right.

After sort index is generated, shuffle the order of indexes to duplicate values to avoid runs in the sequence of collided series. ( or just shuffle final output if order never mattered )

Shuffle indexes of dupes:

for($x=1; $x<$len; $x++)                   
{ if ($series($srt($x))==$series($srt($x-1)))
  { if( random(0,1) ) 
    { $sw= $srt($x);
      $srt($x)= $srt($x-1);
      $srt($x-1)= $sw;
  } } } 

A kind of minimal disturbance can be done to a random sequence by just parting dupes by the minimum required, rather than moving them more than minimum -some random amount that was sought by the question.

The code here separates every element by the min separation, whether duplicate or not, that should be kindof evenhanded, but overdone maybe. The code could be modified to only separate the dupes by looking through the series(sorti(n0:n1..len)) for them and calculating sepsum as +=minsep*(len-n) for each dupe. Then the adjustment loop just has to test again for dupe before applying adjustment.





相关问题
How to add/merge several Big O s into one

If I have an algorithm which is comprised of (let s say) three sub-algorithms, all with different O() characteristics, e.g.: algorithm A: O(n) algorithm B: O(log(n)) algorithm C: O(n log(n)) How do ...

Grokking Timsort

There s a (relatively) new sort on the block called Timsort. It s been used as Python s list.sort, and is now going to be the new Array.sort in Java 7. There s some documentation and a tiny Wikipedia ...

Manually implementing high performance algorithms in .NET

As a learning experience I recently tried implementing Quicksort with 3 way partitioning in C#. Apart from needing to add an extra range check on the left/right variables before the recursive call, ...

Print possible strings created from a Number

Given a 10 digit Telephone Number, we have to print all possible strings created from that. The mapping of the numbers is the one as exactly on a phone s keypad. i.e. for 1,0-> No Letter for 2->...

Enumerating All Minimal Directed Cycles Of A Directed Graph

I have a directed graph and my problem is to enumerate all the minimal (cycles that cannot be constructed as the union of other cycles) directed cycles of this graph. This is different from what the ...

Quick padding of a string in Delphi

I was trying to speed up a certain routine in an application, and my profiler, AQTime, identified one method in particular as a bottleneck. The method has been with us for years, and is part of a "...

热门标签