Random Subset Generation Algorithm: Proof of Uniformity and Termination
To prove that RANDOM-SAMPLE returns a subset of size m drawn uniformly at random, we need to show two things:
- Every possible subset of size m has an equal probability of being returned.
- The algorithm terminates and returns a valid subset.
Proof:
- To show that every possible subset of size m has an equal probability of being returned, we will use mathematical induction.
Base case: When m = 0, the algorithm immediately returns an empty set, which is a valid subset of size 0. Since there is only one possible empty set of size 0, it is returned with probability 1.
Inductive step: Assume that for m = k, every possible subset of size k has an equal probability of being returned.
Now, consider the case when m = k+1. The algorithm first calls RANDOM-SAMPLE with m-1 and n-1, which returns a subset S of size k drawn uniformly at random from the set {1, 2, ..., n-1}. The next step is to choose a random integer i from {1, 2, ..., n}.
If i is not in S, the algorithm returns S ∪ {i}, which is a subset of size k+1. The probability of choosing i is 1/n, and since S was chosen uniformly at random from {1, 2, ..., n-1}, the probability that i is not in S is (n-1)/n. Therefore, the probability of returning S ∪ {i} is (k/n) * ((n-1)/n) = k/(n(n-1)).
If i is in S, the algorithm returns S, which is a subset of size k. The probability of choosing i is 1/n, and since S was chosen uniformly at random from {1, 2, ..., n-1}, the probability that i is in S is 1/n. Therefore, the probability of returning S is (k/n) * (1/n) = k/(n^2).
The sum of these probabilities is (k/(n(n-1))) + (k/(n^2)) = k/n * (1/(n-1) + 1/n) = k/n, which is the probability of returning a subset of size k+1.
By the principle of mathematical induction, we can conclude that for any m ≤ n, every possible subset of size m has an equal probability of being returned.
- To show that the algorithm terminates and returns a valid subset, we need to consider the recursive calls and the base case.
The algorithm makes a recursive call to RANDOM-SAMPLE with m-1 and n-1. Since m is decremented by 1 in each recursive call, the algorithm will eventually reach the base case when m = 0. At this point, the algorithm immediately returns an empty set, which is a valid subset.
In each recursive call, the algorithm also calls RANDOM(1,n) to choose a random integer i. Since RANDOM takes O(1) worst-case time, this step does not affect the termination of the algorithm.
Therefore, we can conclude that the algorithm terminates and returns a valid subset.
In conclusion, RANDOM-SAMPLE returns a subset of size m drawn uniformly at random from the set {1, 2, ..., n}.
原文地址: https://www.cveoy.top/t/topic/plJz 著作权归作者所有。请勿转载和采集!