English 中文(简体)
Nth Combination
原标题:

Is there a direct way of getting the Nth combination of an ordered set of all combinations of nCr?

Example: I have four elements: [6, 4, 2, 1]. All the possible combinations by taking three at a time would be: [[6, 4, 2], [6, 4, 1], [6, 2, 1], [4, 2, 1]].

Is there an algorithm that would give me e.g. the 3rd answer, [6, 2, 1], in the ordered result set, without enumerating all the previous answers?

最佳回答

Note you can generate the sequence by recursively generating all combinations with the first element, then all combinations without. In both recursive cases, you drop the first element to get all combinations from n-1 elements. In Python:

def combination(l, r):
    if r == 0:
        yield []
    elif len(l) == r:
        yield l
    else:
        for c in (combination(l[1:], r-1)):
            yield l[0:1]+c
        for c in (combination(l[1:], r)):
            yield c

Any time you re generating a sequence by making a choice like this, you can recursively generate the kth element by counting how many elements a choice generates and comparing the count to k. If k is less than the count, you make that choice. Otherwise, subtract the count and repeat for the other possible choices you could make at that point. If there are always b choices, you can view this as generating a number in base b. The technique still works if the number of choices varies. In pseudocode (when all choices are always available):

kth(k, choicePoints)
    if choicePoints is empty
        return empty list
    for each choice in head of choicePoints:
        if k < size of choice
            return choice and kth(k, tail of choicePoints)
        else
            k -= size of choice
    signal exception: k is out-of-bounds

This gives you a 0-based index. If you want 1-based, change the comparison to k <= size of choice.

The tricky part (and what is unspecified in the pseudocode) is that the size of a choice depends on previous choices. Note the pseudocode can be used to solve a more general case than the problem.

For this specific problem, there are two choices (b= 2) and the size of the 1st choice (i.e. including the 1st element) is given by n-1Cr-1. Here s one implementation (which requires a suitable nCr):

def kthCombination(k, l, r):
    if r == 0:
        return []
    elif len(l) == r:
        return l
    else:
        i=nCr(len(l)-1, r-1)
        if k < i:
            return l[0:1] + kthCombination(k, l[1:], r-1)
        else:
            return kthCombination(k-i, l[1:], r)

If you reverse the order of the choices, you reverse the order of the sequence.

def reverseKthCombination(k, l, r):
    if r == 0:
        return []
    elif len(l) == r:
        return l
    else:
        i=nCr(len(l)-1, r)
        if k < i:
            return reverseKthCombination(k, l[1:], r)
        else:
            return l[0:1] + reverseKthCombination(k-i, l[1:], r-1)

Putting it to use:

>>> l = [6, 4, 2, 1]
>>> [kthCombination(k, [6, 4, 2, 1], 3) for k in range(nCr(len(l), 3)) ]
[[6, 4, 2], [6, 4, 1], [6, 2, 1], [4, 2, 1]]
>>> powOf2s=[2**i for i in range(4,-1,-1)]
>>> [sum(kthCombination(k, powOf2s, 3)) for k in range(nCr(len(powOf2s), 3))]
[28, 26, 25, 22, 21, 19, 14, 13, 11, 7]
>>> [sum(reverseKthCombination(k, powOf2s, 3)) for k in range(nCr(len(powOf2s), 3))]
[7, 11, 13, 14, 19, 21, 22, 25, 26, 28]
问题回答
  • TLDR? Just scroll to the very bottom for my final solution.

I stumbled across this question while I was looking for methods to both get the index a specified combination would be located at if it were in a lexicographically sorted list and vice versa, for a choice of objects from some potentially very large set of objects and couldn t find much on the latter (the inverse of your problem is not so elusive).

Since I also solved (what I thought was) your exact problem before I thought I d post my solutions to both here.

**
EDIT: My requirement is what your requirement was too - I saw the answers and thought recursion was fine. Well now, after six long years you have it; just scroll down.
**

For your requirement as (I thought it was) posed in the question this will do the job just fine:

def iterCombinations(n, k):
if k==1:
    for i in range(n):
        yield [i]
    return
result = []
for a in range(k-1, n):
    for e in iterCombinations(n, k-1):
        if e[-1] == a:
            break
        yield e + [a]

You can then lookup the item in a collection ordered in the descending order (or use some equivalent compare methodology), so for the case in question:

>>> itemsDescending = [6,4,2,1]
>>> for c in iterCombinations(4, 3):
...     [itemsDescending[i] for i in c]
...
[6, 4, 2]
[6, 4, 1]
[6, 2, 1]
[4, 2, 1]

This is also possible straight out of the box in Python, however:

>>> import itertools
>>> for c in itertools.combinations(itemsDescending, 3):
...     c
...
(6, 4, 2)
(6, 4, 1)
(6, 2, 1)
(4, 2, 1)

Here is what I did for my requirement (and really for yours!) of a non-recursive algorithm that does not create or traverse the ordered list for either direction, but rather uses a simple but effective non-recursive implementation of nCr, choose(n, k):

def choose(n, k):
       Returns the number of ways to choose k items from n items   
    reflect = n - k
    if k > reflect:
        if k > n:
            return 0
        k = reflect
    if k == 0:
        return 1
    for nMinusIPlus1, i in zip(range(n - 1, n - k, -1), range(2, k + 1)):
        n = n * nMinusIPlus1 // i
    return n

To get the combination at some (zero-based) index in a forward sorted list:

def iterCombination(index, n, k):
       Yields the items of the single combination that would be at the provided
    (0-based) index in a lexicographically sorted list of combinations of choices
    of k items from n items [0,n), given the combinations were sorted in 
    descending order. Yields in descending order.
       
    if index < 0 or index >= choose(n, k):
        return
    n -= 1
    for i in range(k):
        while choose(n, k) > index:
            n -= 1
        yield n
        index -= choose(n, k)
        n -= 1
        k -= 1

To get the (zero-based) index at which some combination would reside in a reverse ordered list:

def indexOfCombination(combination):
       Returns the (0-based) index the given combination would have if it were in
    a reverse-lexicographically sorted list of combinations of choices of
    len(combination) items from any possible number of items (given the
    combination s length and maximum value)
   - combination must already be in descending order,
     and it s items drawn from the set [0,n).
       
    result = 0
    for i, a in enumerate(combination):
        result += choose(a, i + 1)
    return result

It s overkill for your example (but I realise now that that was just an example); this is how that would go for each index in turn:

def exampleUseCase(itemsDescending=[6,4,2,1], k=3):
    n = len(itemsDescending)
    print("index -> combination -> and back again:")
    for i in range(choose(n, k)):
        c = [itemsDescending[j] for j in iterCombination(i, n, k)][-1::-1]
        index = indexOfCombination([itemsDescending.index(v) for v in c])
        print("{0} -> {1} -> {2}".format(i, c, index))

>>> exampleUseCase()
index -> combination -> and back again:
0 -> [6, 4, 2] -> 0
1 -> [6, 4, 1] -> 1
2 -> [6, 2, 1] -> 2
3 -> [4, 2, 1] -> 3

This can find the index given some long list or return the combination at some astronomical index in the blink of an eye, for example:

>>> choose(2016, 37)
9617597205504126094112265433349923026485628526002095715212972063686138242753600
>>> list(iterCombination(_-1, 2016, 37))
[2015, 2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003,
2002, 2001, 2000, 1999, 1998, 1997, 1996, 1995, 1994, 1993, 1992, 1991, 1990, 1989,
1988, 1987, 1986, 1985, 1984, 1983, 1982, 1981, 1980, 1979]

or, since that was the very last one and could be fast due to the reflection in choose(n, k), here s one from right in the middle and it seems just as fast...

>>> choose(2016, 37)//2
4808798602752063047056132716674961513242814263001047857606486031843069121376800
>>> list(iterCombination(_, 2016, 37))
[1978, 1973, 1921, 1908, 1825, 1775, 1747, 1635, 1613, 1598, 1529, 1528, 1521,
1445, 1393, 1251, 1247, 1229, 1204, 1198, 922, 901, 794, 699, 685, 633, 619, 598,
469, 456, 374, 368, 357, 219, 149, 93, 71]

This last example pauses for thought for a split second, but wouldn t you?

>>> import random
>>> rSet = set(random.randint(0, 10000000) for i in range(900))
>>> len(rSet)
900
>>> rList = sorted(rSet, reverse=True)
>>> combinations.indexOfCombination(rList)
61536587905102303838316048492163850175478325236595592744487336325506086930974887
88085020093159925576117511028315621934208381981476407812702689774826510322023536
58905845549371069786639595263444239118366962232872361362581506476113967993096033
00541202874946853699568596881200225925266331936183173583581021914595163799417151
30442624813775945054888304722079206982972852037480516813527237183254850056012217
59834465303543702263588008387352235149083914737690225710105023486226582087736870
38383323140972279867697434315252036074490127510158752080225274972225311906715033
86851377357968649982293794242170046400174118714525559851836064661141086690326842
25236658978135989907667078625869419802333512020715700514133380517628637151215549
05922388534567108671308819960483147825031620798631811671493891643972220604919591
22785587505280326638477135315176731640100473359830821781905546117103137944239120
34912084544221250309244925308316352643060056100719194985568284049903555621750881
39419639825279398618630525081169688672242833238889454445237928356800414839702024
66807635358129606994342005075585962080795273287472139515994244684088406544976674
84183671032002497594936116837768233617073949894918741875863985858049825755901232
89317507965160689287607868119414903299382093412911433254998227245783454244894604
83654290108678890682359278892580855226717964180806265176337132759167920384512456
91624558534942279041452960272707049107641475225516294235268581475735143470692000
78400891862852130481822509803019636619427631175355448729708451565341764545325720
79277290914349746541071731127111532099038538549697091038496002102703737347343739
96398832832674081286904287066696046621691978697914823322322650123025472624927566
99891468668052668317066769517155581261265629289158798073055495539590686279250097
27295943276536772955923599217742543093669565147228386873469711200278811335649924
13587219640724942441913695193417732608127949738209466313175361161142601108707568
19470026889319648128790363676253707359290547393198350533094409863254710237344552
47692325209744353688541868412075798500629908908768438513508959321262250985142709
19794478379412756202638771417821781240327337108495689300616872374578607430951230
96908870723878513999404242546015617238957825116802801618973562178005776911079790
22026655573872019955677676783191505879571719659770550759779880002320421606755826
75809722478174545846409923210824885805972611279030267270741509747224602604003738
30411365119180944456819762167312738395140461035991994771968906979578667047734952
21981545694935313345331923300019842406900689401417602004228459137311983483386802
30352489602769346000257761959413965109940729263098747702427952104316612809425394
85037536245288888254374135695390839718978818689595231708490351927063849922772653
26064826999661128817511630298712833048667406916285156973335575847429111697259113
53969532522640227276562651123634766230804871160471143157687290382053412295542343
14022687833967461351170188107671919648640149202504369991478703293224727284508796
06843631262345918398240286430644564444566815901074110609701319038586170760771099
41252989796265436701638358088345892387619172572763571929093224171759199798290520
71975442996399826830220944004118266689537930602427572308646745061258472912222347
18088442198837834539211242627770833874751143136048704550494404981971932449150098
52555927020553995188323691320225317096340687798498057634440618188905647503384292
79493920419695886724506109053220167190536026635080266763647744881063220423654648
36855624855494077960732944499038847158715263413026604773216510801253044020991845
89652657529729792772055725210165026891724511953666038764273616212464901231675592
46950937136633665320781952510620087284589083139308516989522633786063418913473703
96532777760440118656525488729217328376766171004246127636983612583177565603918697
15557602015171235214344399010185766876727226408494760175957535995025356361689144
85181975631986409708533731043231896096597038345028523539733981468056497208027899
6245509252811753667386001506195

However going back from that index to the combination of 900-choose-10,000,000 that it represents with the previous implementation would be very slow (since it simply subtracts one from n at each iteration).

For such large lists of combinations we can instead do a binary search of the space, and the overhead we add means it will only be a little slower for small lists of combinations:

def iterCombination(index, n, k):
       Yields the items of the single combination that would be at the provided
    (0-based) index in a lexicographically sorted list of combinations of choices
    of k items from n items [0,n), given the combinations were sorted in 
    descending order. Yields in descending order.
       
    if index < 0 or n < k or n < 1 or k < 1 or choose(n, k) <= index:
        return
    for i in range(k, 0, -1):
        d = (n - i) // 2 or 1
        n -= d
        while 1:
            nCi = choose(n, i)
            while nCi > index:
                d = d // 2 or 1
                n -= d
                nCi = choose(n, i)
            if d == 1:
                break
            n += d
            d //= 2
            n -= d
        yield n
        index -= nCi

From this one may notice that all the calls to choose have terms that cancel, if we cancel everything out we end up with a much faster implementation and what is, I think...

The optimal function for this problem

def iterCombination(index, n, k):
       Yields the items of the single combination that would be at the provided
    (0-based) index in a lexicographically sorted list of combinations of choices
    of k items from n items [0,n), given the combinations were sorted in 
    descending order. Yields in descending order.
       
    nCk = 1
    for nMinusI, iPlus1 in zip(range(n, n - k, -1), range(1, k + 1)):
        nCk *= nMinusI
        nCk //= iPlus1
    curIndex = nCk
    for k in range(k, 0, -1):
        nCk *= k
        nCk //= n
        while curIndex - nCk > index:
            curIndex -= nCk
            nCk *= (n - k)
            nCk -= nCk % k
            n -= 1
            nCk //= n
        n -= 1
        yield n

A final reminder that for the use case of the question one would do something like this:

def combinationAt(index, itemsDescending, k):
    return [itemsDescending[i] for i in
            list(iterCombination(index, len(itemsDescending), k))[-1::-1]]

>>> itemsDescending = [6,4,2,1]
>>> numberOfItemsBeingChosen = 3
>>> zeroBasedIndexWanted = 1
>>> combinationAt(zeroBasedIndexWanted, itemsDescending, numberOfItemsBeingChosen)
[6, 4, 1]

One way to do it would be by using properties of bits. This still requires some enumeration, but you wouldn t have to enumerate every set.

For your example, you have 4 numbers in your set. So if you were generating all the possible combinations of 4 numbers, you could enumerate them as follows:

{6, 4, 2, 1}

0000 - {(no numbers in set)}
0001 - {1}
0010 - {2}
0011 - {2, 1}
...
1111 - {6, 4, 2, 1}

See how each "bit" corresponds to "whether that number is in your set"? We see here that there are 16 possibilities (2^4).

So now we can go through and find all of the possibilities that have only 3 bits turned on. This will tell us all of the combinations of "3" that exist:

0111 - {4, 2, 1}
1011 - {6, 2, 1}
1101 - {6, 4, 1}
1110 - {6, 4, 2}

And lets rewrite each of our binary values as decimal values:

0111 = 7
1011 = 11
1101 = 13
1110 = 14

Now that we ve done that - well, you said you wanted the "3rd" enumeration. So lets look at the 3rd largest number: 11. Which has the bit pattern 1011. Which corresponds to... {6, 2, 1}

Cool!

Basically, you can use the same concept for any set. So now all we ve done is translate the problem from "enumerating all the sets" to "enumerating all of the integers". This might be a lot easier for your problem.

From the Python 3.6 itertools recipes:

def nth_combination(iterable, r, index):
     Equivalent to list(combinations(iterable, r))[index] 
    pool = tuple(iterable)
    n = len(pool)
    if r < 0 or r > n:
        raise ValueError
    c = 1
    k = min(r, n-r)
    for i in range(1, k+1):
        c = c * (n - k + i) // i
    if index < 0:
        index += c
    if index < 0 or index >= c:
        raise IndexError
    result = []
    while r:
        c, n, r = c*r//n, n-1, r-1
        while index >= c:
            index -= c
            c, n = c*(n-r)//n, n-1
        result.append(pool[-1-n])
    return tuple(result)

In practice:

iterable, r, index = [6, 4, 2, 1], 3, 2

nth_combination(iterable, r, index)
# (6, 2, 1)

Alternatively, as mentioned in the docstring:

import itertools as it


list(it.combinations(iterable, r))[index]
# (6, 2, 1)

See also more_itertools - a third party library that implements this recipe for you. Install via:

> pip install more_itertools

just a rough sketch: arrange your numbers into upper triangular matrix of tuples:

A(n-1,n-1)   
Aij = [i+1, j-1]

if you traverse matrix row first, you will get combinations in increasing order for two elements. To generalize to three elements, think of your matrix rows as another triangular matrix, rather than a vector. It kind of creates a corner of a cube.

At least this is how I have would approach the problem

let me clarify this, you do not have to store the matrix, you will need to compute index. Let me work out to dimensional example, which you in principle could expand to 20 dimensions(bookkeeping may be atrocious).

ij = (i*i + i)/2 + j // ij is also the combination number
(i,j) = decompose(ij) // from ij one can recover i,j components
I = i // actual first index
J = j + 1 // actual second index

this two-dimensional example works for any number n, and you dont have to tabulate permutations.

Yes there a direct way of getting the Nth combination of an ordered set of all combinations of nCr? Say you need to generate 0th, 3rd, 6th.. combinations of a given set. You can generate it directly without generating combinations in between using JNuberTools. You can even generate next billionth combination (if your set size is large) Here is the code example:

JNumberTools.combinationsOf(list)
        .uniqueNth(8,1000_000_000) //skip to billionth combination of size 8
        .forEach(System.out::println);

The maven dependency for JNumberTools is :

<dependency>
    <groupId>io.github.deepeshpatel</groupId>
    <artifactId>jnumbertools</artifactId>
    <version>1.0.0</version>
</dependency>




相关问题
How to manage a pageview DB

I am interested in tracking my users pageviews on my site. Being that traffic is expanding very quickly, I am worried about robots, etc, and I also want to be able to use tracked data live to alter ...

Statistics Question

Suppose I conduct a survey of 10 people asking whether to rank a movie as 0 to 4 stars. Allowable answers are 0, 1, 2, 3, and 4. The mean is 2.0 stars. How do I calculate the certainty (or ...

Calculating variance with large numbers

I haven t really used variance calculation that much, and I don t know quite what to expect. Actually I m not too good with math at all. I have a an array of 1000000 random numeric values in the ...

R statistical package: wrapping GOFrame objects

I m trying to generate GOFrame objects to generate a gene ontology mapping in R for unsupported organisms (see http://www.bioconductor.org/packages/release/bioc/vignettes/GOstats/inst/doc/...

Generating correlated numbers

Here is a fun one: I need to generate random x/y pairs that are correlated at a given value of Pearson product moment correlation coefficient, or Pearson r. You can imagine this as two arrays, array ...

Multivariate time series modelling in R

I want do fit some sort of multi-variate time series model using R. Here is a sample of my data: u cci bci cpi gdp dum1 dum2 dum3 dx 16.50 14.00 53.00 45.70 80....

热门标签