Selection

In this article will summarize mathematics related to selection of k items in n items and counting problems problems related to selection of items.

Let’s first understand what n choose k means, i.e. choosing/selecting k items out of n items.

Let’s take an example of 4 balls of Red(R), Blue(B), Green(G), White(W) color and try to understand from that. Whenever we choose k items from n items, we can create a bucket of size k and put the items in that bucket, so the remaining items outside of bucket will be n - k at the end of the selection.

  • Selecting No ball out of 4 balls :-> 1 way of doing that, as no selection is also a choice.
  • Selecting 1 ball out of 4 balls :-> This is quite simple, as each of the ball can be selected and but in the bucket. The bucket with all the four balls in each selection i.e. {R,G,B,W}
  • Selecting 2 balls out of 4 balls :-> This gives us multiple selections let’s iterate those, {RB, RG, RW, BG, BW, GW} this can be done by selecting and putting a ball in the bucket from {R,G,B,W}, and then selecting again with the remaining 3 balls to fill the bucket of size 2.
  • Selecting 3 balls out of 4 balls :-> There are 4 ways for selecting 3 balls, and can be given as : {RGB,RGW,RBW,GBW}. This step is in symmetry with selecting 1 ball out of 4 as the problem can also be rephrased as not selecting 1 ball from the 4 balls, thus remaining 3 balls left. In each iteration we don’t select one ball and the remaining sequence becomes the selection.
  • Selecting all 4 balls out of 4 balls :-> There is only one way to do so. This again is in symmetry with the no ball selection as no ball selection is same as all ball selection for the remaining balls.

Now the number of ways for this selection is denoted by ^n^C~r~ or (n r).

Mathematically ^n^C~k~ is equal to n!/k!(n-k)!. Let’s try to understand how we reach to this closed form formula.

The numerator of n! is defined by number of ways to arrange n items, this can be reasoned by thinking of arrangement as n blanks and the first blank can be filled with any of the n item so n ways, once the first blank is filled the second can be filled with any of the n-1 items and so forth the last nth blank can be filled with only 1 item remaining. This leads to n * (n - 1) * (n - 2) * (n - 3) ... 1 which is n!. For n = 4 these would be 24 arrangements.

Now an imaginary boundary can be thought between these n blanks such that on the left side there are k blanks and on the right side n - k blanks, this will help to understand the denominator. In case of n = 4 and k = 2 this would be _ _ | _ _. Now if we fill all the n! 24 arrangements in these spaces for n = 4. Then we can see that the first 2 blanks will contain all the selections for example {RG, GR, RB, BR, RW, WR, GB, BG, BW, WB, GW, WG}. But in these selections BW and WB are considered as two selections, and similarly for other like BG, GB or RB, BR etc. In fact there are only 6 unique selections without considering the order. That is why we need to divide it by the number of arrangements which are possible in the k blanks and that would be k!. This will make sure all the inter arrangements between the k blanks due to different order of the same items is only considered once. A similar analysis can be done for the remaining (n - k) blanks, so we take only the unique selection of elements only once irrespective of order. This can be done by dividing with (n-k)!.

  • Here to note we are not considering the order of selection as a separate selection, i.e. if there are 4 balls of Red(R), Blue(B), Green(G), White(W) color then selection of 2 balls such that Red, White {RW} or White, Red {WR} is considered as 1 selection. In some problems this will be considered as two separate selections, for example selecting 2 digits out of 1,2,3,4 as 12 and 21 are different numbers.
  • What would be the number of ways in such scenario? Hint: Here as the order is important, so we don’t need to divide n! by k! as every sequence is a separate count in the selection. And so what is remaining is n!/(n - k)! which is defined as ^n^P~k~ or can also be given as k! * ^n^C~k~. Which is defined as, number of ways to arrange k items out of n items.

Going further I will be referring ^n^C~k~ as C(n, k), just to make it simpler to write it.

Let’s look at the first problem

Given n and k, return all possible combinations of k selection in n number from 1 to n.

One of the approaches for these problems is to create a partial solution stack <st> and pass that solution to the following recursive calls. In the case of selecting k items in n items, the idea is to select by putting the element in the stack calling the recursive call with next item and unselecting it by removing the element from the stack and again calling the recursing call with next item. In that way each item is selected and unselected in each recursive call, till k items are present in the stack which is one of the solution.

def combine(self, n: int, k: int) -> List[List[int]]:
    sol = []
    def dfs(st: List[int], elem: int):
        if len(st) == k:
            sol.append(st[:])
            return
        if elem <= n:
            st.append(elem)
            dfs(st, elem + 1)
            st.pop()
            dfs(st, elem + 1)
    dfs([], 1)
    return sol

Another way to think about this problem, at first position of k blanks, each number of n can be selected and the following k - 1 numbers can be selected for further positions. In this way basically at each recursive call each of the k blanks are filled.

def combine(self, n: int, k: int) -> List[List[int]]:
    sol = []
    def dfs(st: List[int], elem: int):
        if len(st) == k:
            sol.append(st[:])
            return
        for i in range(elem, n + 1):
            st.append(i)
            dfs(st, i + 1)
            st.pop()
    dfs([], 1)
    return sol

Time Complexity: Complexity of both the solutions will be equal to sum of inner nodes and k * each leaf node, due to copy operation at each leaf node. There will be C(n, k) leaf nodes in the recursive tree which will be added to solution. So complexity will be O(k * n!/k!(n-k)!) which can be said to have an O(n!/k!(n-k)!) or a bigger upper bound to O(n^k/k!),

This can be proven as n!/k!(n-k)! => n*(n-1)*(n-2) ... (n - k + 1) / k! => Taking n as common in k numerator terms will give => n^k * (1 - 1/n) * (1 - 2/n) ... (1 - (k + 1)/n) which can be upper bounded by n ^ k. => Giving O(n^k/k!)

More information about the upper and lower bound of C(n, k) link

comments powered by Disqus