Skip to content
March 15, 2010 / steve8

Factoring Algorithm puzzle

What is the minimal number, X, of yes/no questions needed to find the smallest positive integer divisor (other than 1) of an integer between 2 and 166 (inclusive)?

We are asking for the exact answer in two cases:

  1. In the worst case, i.e., what is the smallest number X for which we can guarantee finding it in no more than X questions.
  2. On average, i.e., assuming that the number was chosen in uniform distribution from 2 to 166 and we want to minimize the expected number of questions.

This problem was IBM’s Ponder This puzzle of November 2009.

I’m going to write my solution in English, along with code (both python and java) so anyone knowing any one of the 3 languages should be able to follow along.

For both #1 and #2 there are some relevant thoughts:

the smallest integer dividers (above 1) of all integers (above 1) are prime, so the divisor we are looking for will be prime, specifically only a few prime numbers that will be relevant between 2 and 166 (inclusively):

to find out about these primes, I wrote a little code:

 

Python (3.1)  :

prime_list = []
maxx = 166
for n in range(2,maxx+1):
    isprime = True
    for p in prime_list:
        if n % p is 0;
            isprime = False;
            break;
    if isprime == True:
        prime_list.append(n)
print("the primes less than",maxx+1,"are:")
print(prime_list)
print("there are ", len(prime_list), "of them")

OR Java :

import java.util.ArrayList;
public class primes {
	public static ArrayList<Integer> prime_list = new ArrayList<Integer>();
	public static void main(String args[]){
		int maxx = 166;
		boolean isPrime;
		for(int n=2; n <= maxx; n++){
			isPrime = true;
			for(int p : prime_list){
				if(n % p == 0){
					isPrime = false;
					break;
				}
			}
			if(isPrime){
				prime_list.add(n);
			}
		}
		System.out.println("the primes less than"+ maxx+1+ " are:");
		System.out.println(prime_list);
		System.out.println("there are " + prime_list.size() + " of them");
	}
}

Both the Python and the Java version of the logic will output the following:

the primes less than 167 are:
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163]
there are  38 of them

these 38 primes are the only possible ‘smallest integer dividers’ which we are trying to determine with our questions.

Worst Case (part #1 of the Puzzle)  seems like the easier problem to solve, so lets do it.

In the worst case, i.e., what is the smallest number X for which we can guarantee finding it in no more than X questions.

well there are 38 possible solutions, and we only care about worst case, so lets gather half of them and ask (Question 1) if the smallest integer divisor (>1) is in that group.

if the answer is yes, then we have narrowed it down to 19, if the answer is 19 then we have narrowed it down to the other 19, either way we are left with 19.

 

Now take that 19 and group roughly half of them together (10) and ask (Question 2) if the smallest integer divisor (>1) is in that group.

if the answer is yes, then we have narrowed it down to 10, if the answer is no, then we have narrowed it down to 9.

 

Now take that 10 (or 9) and group together about half (5) and ask (Question 3) if the smallest integer divisor (>1) is in that group.

If the answer isyes, then we have narrowed it down to 5, otherwise we have also narrowed it down to 5 or 4 (depending if we had 10 or 9).

 

Now take those 5 or 4 and take roughly half (2) and ask (Question 4) if the smallest integer divisor is one of the two.

If the answer is yes, then we have narrowed it down to 2, otherwise we have narrowed it down to 3 or 2 (depending if we had 5 or 4)

 

Now take those 3 or 2 and take the first integer, and ask (Question 5) if that’s the greatest integer divisor (>1).

If yes, then we are done, that’s the number, if not and we started with 2, we are also done (the other is the number), if we had 3 then we have to ask one more question.

 

Take the 2 remaining and ask (Question 6 (not always required)) if that’s the greatest integer divisor (>1).

If yes, then we are done, that’s the number, if not then the other number is it.

 

So this method has a worst case of 6, a best case of 5, what about the average case?

here’s the code so far in Python 3.1:

prime_list = []
maxx = 166
for n in range(2,maxx+1):
    isprime = True
    for p in prime_list:
        if n % p is 0;
            isprime = False;
            break;
    if isprime == True:
        prime_list.append(n)
print("the primes less than",maxx+1,"are:")
print(prime_list)
print("there are ", len(prime_list), "of them")
def smallestPrimeFactor(n):
	for p in prime_list:
		if n/p == int(n/p):
			return p
question_history=[]
for n in range(2,maxx+1):
	prime_sublist = prime_list
	question_count=0
	while True:
		question_count+=1
		if smallestPrimeFactor(n) in prime_sublist[0:int(len(prime_sublist)/2)]:
			prime_sublist = prime_sublist[0:int(len(prime_sublist)/2)]
		else:
			prime_sublist = prime_sublist[int(len(prime_sublist)/2):]
		if len(prime_sublist)==1:
			print( "number =", n, " smallest factor (>1) =", prime_sublist[0], " questions asked =", question_count)
			question_history.append(question_count)
			break
print( "worst case   =", max(question_history) )
print( "best case    =", min(question_history) )
print( "average case =", sum(question_history) / (maxx-1) )

OR the program so far in Java:

import java.util.ArrayList;
import java.util.List;
public class primes {
	public static ArrayList<Integer> prime_list = new ArrayList<Integer>();
	public static void main(String args[]){
		int maxx = 166;
		boolean isPrime;
		for(int n=2; n <= maxx; n++){
			isPrime = true;
			for(int p : prime_list){
				if(n % p == 0){
					isPrime = false;
					break;
				}
			}
			if(isPrime){
				prime_list.add(n);
			}
		}
		System.out.println("the primes less than 167 are:");
		System.out.println(prime_list);
		System.out.println("there are " + prime_list.size() + " of them");
		int max_questions = -1;
		int min_questions = -1;
		int questions_sum = 0;
		for(int n=2; n <= maxx; n++){
			List<Integer> prime_sublist = prime_list;
			int question_count = 0;
			while(true){
				question_count++;
				if(prime_sublist.subList(0,prime_sublist.size()/2).contains(smallestPrimeFactor(n))){
					prime_sublist = prime_sublist.subList(0,prime_sublist.size()/2);
				}else{
					prime_sublist = prime_sublist.subList(prime_sublist.size()/2,prime_sublist.size());
				}
				if(prime_sublist.size() == 1){
					System.out.println("number = " + n + "  smallest factor (>1) = " + prime_sublist.get(0) + "  questions asked = " + question_count);
					if(question_count > max_questions){
						max_questions = question_count;
					}
					if(question_count < min_questions || min_questions == -1){
						min_questions = question_count;
					}
					questions_sum += question_count;
					break;
				}
			}
		}
		System.out.println("worst case   = "+max_questions);
		System.out.println("best case    = "+min_questions);
		System.out.println("average case = "+(float) questions_sum/(maxx-1));
	}
	public static int smallestPrimeFactor(int n){
		for(int p: prime_list){
			if((n / p) == (float)n / p){
				return p;
			}
		}
		return 0;
	}
}

The output for either of these programs is this:

the primes less than 167 are:
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163]
there are 38 of them
number = 2  smallest factor (>1) = 2  questions asked = 5
number = 3  smallest factor (>1) = 3  questions asked = 5
number = 4  smallest factor (>1) = 2  questions asked = 5
number = 5  smallest factor (>1) = 5  questions asked = 5
number = 6  smallest factor (>1) = 2  questions asked = 5
number = 7  smallest factor (>1) = 7  questions asked = 5
number = 8  smallest factor (>1) = 2  questions asked = 5
number = 9  smallest factor (>1) = 3  questions asked = 5
number = 10  smallest factor (>1) = 2  questions asked = 5
number = 11  smallest factor (>1) = 11  questions asked = 5
number = 12  smallest factor (>1) = 2  questions asked = 5
number = 13  smallest factor (>1) = 13  questions asked = 5
number = 14  smallest factor (>1) = 2  questions asked = 5
number = 15  smallest factor (>1) = 3  questions asked = 5
number = 16  smallest factor (>1) = 2  questions asked = 5
number = 17  smallest factor (>1) = 17  questions asked = 5
number = 18  smallest factor (>1) = 2  questions asked = 5
number = 19  smallest factor (>1) = 19  questions asked = 6
number = 20  smallest factor (>1) = 2  questions asked = 5
number = 21  smallest factor (>1) = 3  questions asked = 5
number = 22  smallest factor (>1) = 2  questions asked = 5
number = 23  smallest factor (>1) = 23  questions asked = 6
number = 24  smallest factor (>1) = 2  questions asked = 5
number = 25  smallest factor (>1) = 5  questions asked = 5
number = 26  smallest factor (>1) = 2  questions asked = 5
number = 27  smallest factor (>1) = 3  questions asked = 5
number = 28  smallest factor (>1) = 2  questions asked = 5
number = 29  smallest factor (>1) = 29  questions asked = 5
number = 30  smallest factor (>1) = 2  questions asked = 5
number = 31  smallest factor (>1) = 31  questions asked = 5
number = 32  smallest factor (>1) = 2  questions asked = 5
number = 33  smallest factor (>1) = 3  questions asked = 5
number = 34  smallest factor (>1) = 2  questions asked = 5
number = 35  smallest factor (>1) = 5  questions asked = 5
number = 36  smallest factor (>1) = 2  questions asked = 5
number = 37  smallest factor (>1) = 37  questions asked = 5
number = 38  smallest factor (>1) = 2  questions asked = 5
number = 39  smallest factor (>1) = 3  questions asked = 5
number = 40  smallest factor (>1) = 2  questions asked = 5
number = 41  smallest factor (>1) = 41  questions asked = 6
number = 42  smallest factor (>1) = 2  questions asked = 5
number = 43  smallest factor (>1) = 43  questions asked = 6
number = 44  smallest factor (>1) = 2  questions asked = 5
number = 45  smallest factor (>1) = 3  questions asked = 5
number = 46  smallest factor (>1) = 2  questions asked = 5
number = 47  smallest factor (>1) = 47  questions asked = 5
number = 48  smallest factor (>1) = 2  questions asked = 5
number = 49  smallest factor (>1) = 7  questions asked = 5
number = 50  smallest factor (>1) = 2  questions asked = 5
number = 51  smallest factor (>1) = 3  questions asked = 5
number = 52  smallest factor (>1) = 2  questions asked = 5
number = 53  smallest factor (>1) = 53  questions asked = 5
number = 54  smallest factor (>1) = 2  questions asked = 5
number = 55  smallest factor (>1) = 5  questions asked = 5
number = 56  smallest factor (>1) = 2  questions asked = 5
number = 57  smallest factor (>1) = 3  questions asked = 5
number = 58  smallest factor (>1) = 2  questions asked = 5
number = 59  smallest factor (>1) = 59  questions asked = 5
number = 60  smallest factor (>1) = 2  questions asked = 5
number = 61  smallest factor (>1) = 61  questions asked = 6
number = 62  smallest factor (>1) = 2  questions asked = 5
number = 63  smallest factor (>1) = 3  questions asked = 5
number = 64  smallest factor (>1) = 2  questions asked = 5
number = 65  smallest factor (>1) = 5  questions asked = 5
number = 66  smallest factor (>1) = 2  questions asked = 5
number = 67  smallest factor (>1) = 67  questions asked = 6
number = 68  smallest factor (>1) = 2  questions asked = 5
number = 69  smallest factor (>1) = 3  questions asked = 5
number = 70  smallest factor (>1) = 2  questions asked = 5
number = 71  smallest factor (>1) = 71  questions asked = 5
number = 72  smallest factor (>1) = 2  questions asked = 5
number = 73  smallest factor (>1) = 73  questions asked = 5
number = 74  smallest factor (>1) = 2  questions asked = 5
number = 75  smallest factor (>1) = 3  questions asked = 5
number = 76  smallest factor (>1) = 2  questions asked = 5
number = 77  smallest factor (>1) = 7  questions asked = 5
number = 78  smallest factor (>1) = 2  questions asked = 5
number = 79  smallest factor (>1) = 79  questions asked = 5
number = 80  smallest factor (>1) = 2  questions asked = 5
number = 81  smallest factor (>1) = 3  questions asked = 5
number = 82  smallest factor (>1) = 2  questions asked = 5
number = 83  smallest factor (>1) = 83  questions asked = 5
number = 84  smallest factor (>1) = 2  questions asked = 5
number = 85  smallest factor (>1) = 5  questions asked = 5
number = 86  smallest factor (>1) = 2  questions asked = 5
number = 87  smallest factor (>1) = 3  questions asked = 5
number = 88  smallest factor (>1) = 2  questions asked = 5
number = 89  smallest factor (>1) = 89  questions asked = 5
number = 90  smallest factor (>1) = 2  questions asked = 5
number = 91  smallest factor (>1) = 7  questions asked = 5
number = 92  smallest factor (>1) = 2  questions asked = 5
number = 93  smallest factor (>1) = 3  questions asked = 5
number = 94  smallest factor (>1) = 2  questions asked = 5
number = 95  smallest factor (>1) = 5  questions asked = 5
number = 96  smallest factor (>1) = 2  questions asked = 5
number = 97  smallest factor (>1) = 97  questions asked = 5
number = 98  smallest factor (>1) = 2  questions asked = 5
number = 99  smallest factor (>1) = 3  questions asked = 5
number = 100  smallest factor (>1) = 2  questions asked = 5
number = 101  smallest factor (>1) = 101  questions asked = 5
number = 102  smallest factor (>1) = 2  questions asked = 5
number = 103  smallest factor (>1) = 103  questions asked = 6
number = 104  smallest factor (>1) = 2  questions asked = 5
number = 105  smallest factor (>1) = 3  questions asked = 5
number = 106  smallest factor (>1) = 2  questions asked = 5
number = 107  smallest factor (>1) = 107  questions asked = 6
number = 108  smallest factor (>1) = 2  questions asked = 5
number = 109  smallest factor (>1) = 109  questions asked = 5
number = 110  smallest factor (>1) = 2  questions asked = 5
number = 111  smallest factor (>1) = 3  questions asked = 5
number = 112  smallest factor (>1) = 2  questions asked = 5
number = 113  smallest factor (>1) = 113  questions asked = 5
number = 114  smallest factor (>1) = 2  questions asked = 5
number = 115  smallest factor (>1) = 5  questions asked = 5
number = 116  smallest factor (>1) = 2  questions asked = 5
number = 117  smallest factor (>1) = 3  questions asked = 5
number = 118  smallest factor (>1) = 2  questions asked = 5
number = 119  smallest factor (>1) = 7  questions asked = 5
number = 120  smallest factor (>1) = 2  questions asked = 5
number = 121  smallest factor (>1) = 11  questions asked = 5
number = 122  smallest factor (>1) = 2  questions asked = 5
number = 123  smallest factor (>1) = 3  questions asked = 5
number = 124  smallest factor (>1) = 2  questions asked = 5
number = 125  smallest factor (>1) = 5  questions asked = 5
number = 126  smallest factor (>1) = 2  questions asked = 5
number = 127  smallest factor (>1) = 127  questions asked = 5
number = 128  smallest factor (>1) = 2  questions asked = 5
number = 129  smallest factor (>1) = 3  questions asked = 5
number = 130  smallest factor (>1) = 2  questions asked = 5
number = 131  smallest factor (>1) = 131  questions asked = 6
number = 132  smallest factor (>1) = 2  questions asked = 5
number = 133  smallest factor (>1) = 7  questions asked = 5
number = 134  smallest factor (>1) = 2  questions asked = 5
number = 135  smallest factor (>1) = 3  questions asked = 5
number = 136  smallest factor (>1) = 2  questions asked = 5
number = 137  smallest factor (>1) = 137  questions asked = 6
number = 138  smallest factor (>1) = 2  questions asked = 5
number = 139  smallest factor (>1) = 139  questions asked = 5
number = 140  smallest factor (>1) = 2  questions asked = 5
number = 141  smallest factor (>1) = 3  questions asked = 5
number = 142  smallest factor (>1) = 2  questions asked = 5
number = 143  smallest factor (>1) = 11  questions asked = 5
number = 144  smallest factor (>1) = 2  questions asked = 5
number = 145  smallest factor (>1) = 5  questions asked = 5
number = 146  smallest factor (>1) = 2  questions asked = 5
number = 147  smallest factor (>1) = 3  questions asked = 5
number = 148  smallest factor (>1) = 2  questions asked = 5
number = 149  smallest factor (>1) = 149  questions asked = 5
number = 150  smallest factor (>1) = 2  questions asked = 5
number = 151  smallest factor (>1) = 151  questions asked = 5
number = 152  smallest factor (>1) = 2  questions asked = 5
number = 153  smallest factor (>1) = 3  questions asked = 5
number = 154  smallest factor (>1) = 2  questions asked = 5
number = 155  smallest factor (>1) = 5  questions asked = 5
number = 156  smallest factor (>1) = 2  questions asked = 5
number = 157  smallest factor (>1) = 157  questions asked = 6
number = 158  smallest factor (>1) = 2  questions asked = 5
number = 159  smallest factor (>1) = 3  questions asked = 5
number = 160  smallest factor (>1) = 2  questions asked = 5
number = 161  smallest factor (>1) = 7  questions asked = 5
number = 162  smallest factor (>1) = 2  questions asked = 5
number = 163  smallest factor (>1) = 163  questions asked = 6
number = 164  smallest factor (>1) = 2  questions asked = 5
number = 165  smallest factor (>1) = 3  questions asked = 5
number = 166  smallest factor (>1) = 2  questions asked = 5
worst case   = 6
best case    = 5
average case = 5.072727

 

So for this algorithm (let’s call it the worst-case algorithm), we found the possible solution set of 38 primes, then we divided it in half and asked which half iteratively until we were had a set of 1, and that’s the solution.

This is an intuitive way to have the best worst-case in terms of number of questions.

Dividing the set in half for each question is the most amount of progress we can make, getting rid of half the dataset each question.

And that’s the goal, getting to the solution in the least amount of questions in the worst case…

Puzzle part 2:

On average, i.e., assuming that the number was chosen in uniform distribution from 2 to 166 and we want to minimize the expected number of questions.

In the average case, our worst-case method is not very productive, look at the first question:

well there are 38 possible solutions, and we only care about worst case, so lets gather half of them and ask (Question 1) if the smallest integer divisor (>1) is in that group.

so dividing the set in two we have [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67] and [71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163]

but since 2 is the solution more than half the time, and 3 being the 2nd most likely, you can see that we really aren’t asking a 50/50 question.  In this case we are asking a 146/19 question, in the average case we aren’t really making a lot of progress with that.

Le’t’s figure out the probabilty of the relevant primes in this dataset:

Python 3.1:

prime_count={p:0 for p in prime_list}
for n in range(2,maxx+1):
	prime_count[smallestPrimeFactor(n)]+=1
n=0
for p in prime_list:
	n+=1
	print("prime#",n,"prime =",p," # of times it is the solution =", prime_count[p])

OR in java:

Hashtable<Integer,Integer> prime_count = new Hashtable<Integer,Integer>();
for(int p:prime_list){
	prime_count.put(p,0);
}
for(int n=2;n<= maxx;n++){
	prime_count.put(smallestPrimeFactor(n),prime_count.get(smallestPrimeFactor(n))+1);
}
int n = 0;
for(int p: prime_list){
	n++;
	System.out.println("prime#" + n + " prime = " + p + " # of times it is the solution = " + prime_count.get(p));
}

Both programs output this:

prime#01 prime = 2    # of times it is the solution = 83
prime#02 prime = 3    # of times it is the solution = 28
prime#03 prime = 5    # of times it is the solution = 11
prime#04 prime = 7    # of times it is the solution = 7
prime#05 prime = 11   # of times it is the solution = 3
prime#06 prime = 13   # of times it is the solution = 1
prime#07 prime = 17   # of times it is the solution = 1
prime#08 prime = 19   # of times it is the solution = 1
prime#09 prime = 23   # of times it is the solution = 1
prime#10 prime = 29   # of times it is the solution = 1
prime#11 prime = 31   # of times it is the solution = 1
prime#12 prime = 37   # of times it is the solution = 1
prime#13 prime = 41   # of times it is the solution = 1
prime#14 prime = 43   # of times it is the solution = 1
prime#15 prime = 47   # of times it is the solution = 1
prime#16 prime = 53   # of times it is the solution = 1
prime#17 prime = 59   # of times it is the solution = 1
prime#18 prime = 61   # of times it is the solution = 1
prime#19 prime = 67   # of times it is the solution = 1
prime#20 prime = 71   # of times it is the solution = 1
prime#21 prime = 73   # of times it is the solution = 1
prime#22 prime = 79   # of times it is the solution = 1
prime#23 prime = 83   # of times it is the solution = 1
prime#24 prime = 89   # of times it is the solution = 1
prime#25 prime = 97   # of times it is the solution = 1
prime#26 prime = 101  # of times it is the solution = 1
prime#27 prime = 103  # of times it is the solution = 1
prime#28 prime = 107  # of times it is the solution = 1
prime#29 prime = 109  # of times it is the solution = 1
prime#30 prime = 113  # of times it is the solution = 1
prime#31 prime = 127  # of times it is the solution = 1
prime#32 prime = 131  # of times it is the solution = 1
prime#33 prime = 137  # of times it is the solution = 1
prime#34 prime = 139  # of times it is the solution = 1
prime#35 prime = 149  # of times it is the solution = 1
prime#36 prime = 151  # of times it is the solution = 1
prime#37 prime = 157  # of times it is the solution = 1
prime#38 prime = 163  # of times it is the solution = 1

To develop the best average-case algorithm we need to evenly divide the solution set in 2 for each question, but not even in the number of primes in each set, but even in probability. We will try to get the sum of the number of times the prime is a solution to be the same in each half.  Effectively choosing the division so that the question will be a 50/50 question.

 

Question 1 is easy (since 2 is the solution to just over half the numbers):

the set will be split into [2] and [ 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163]

this is a 83/82 question, nearly 50/50…

there is probability = 83/165 that the answer to this question will pinpoint the solution at 2.

 

Question 2 (if necessary): We now must divide the remaining into two nearly* equally probable sets:

so we can split the remaining up into sets like this:

[3,5,7] vs [11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163]

with weightings 28+11+7 vs 3+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1

so that’s 46 vs. 36

It’s very important to note that you can get more balanced groupings, for example grouping 3 (with occurrence score of 28) with 13 of the larger primes (each with 1 occurrence), yielding 42 vs. 42…

The problem with that is you will have a group with a 28 and 13 1’s, so the next matchup if that group is selected will be very imbalanced.

Since we cannot divide up the probability beyond per prime, we should need to consider these cases and not go for perfectly half every time, but instead find the best division for all the possible subsets that will be dealt with in later questions, that will give us the best average case.  This is a very significant and interesting puzzle in itself, It’s almost a shame that a little luck and a naive approach to this can still yield the optimal solution for this dataset, but in a different dataset, this problem could not be glossed over.

Here is the algorithm to optimally construct the tree, with the English description following…

 

Python:

keys = []
values = []
for p in sorted(prime_count):
	keys.append([p])
	values.append(prime_count[p])
print(keys)
print(values
while len(keys) > 2:
	min1_value = min(values)
	min1_index = values.index(min1_value)
	min1_key = keys[min1_index]
	del values[min1_index]
	del keys[min1_index]
	min2_value=min(values)
	min2_index = values.index(min2_value)
	min2_key = keys[min2_index]
	del keys[min2_index]
	del values[min2_index]
	keys.append(min1_key + min2_key)
	values.append(min1_value + min2_value)
	print()
	print(keys)
	print(values)

OR Java:

for(int p:prime_list){
	HashSet<Integer> temp = new HashSet<Integer>();
	temp.add(p);
	keys.add(temp);
	values.add(prime_count.get(p));
}
ArrayList<HashSet<Integer>> in_sets = new ArrayList<HashSet<Integer>>();
ArrayList<HashSet<Integer>> out_sets = new ArrayList<HashSet<Integer>>();
while(keys.size()>1){
	int tmp_index = 0;
	int min1_v = 0;
	int min2_v = 0;
	int min1_i = 0;
	int min2_i = 0;
	HashSet<Integer> min1_k = new HashSet<Integer>();
	HashSet<Integer> min2_k = new HashSet<Integer>();
	HashSet<Integer> min_union_k = new HashSet<Integer>();
	for(int v:values){
		if(tmp_index == 0 | v < min1_v){
			min2_v = min1_v;
			min2_k = min1_k;
			min2_i = min1_i;
			min1_v = v;
			min1_k = keys.get(tmp_index);
			min1_i = tmp_index;
		}else if(v < min2_v | tmp_index == 1){
			min2_v = v;
			min2_k = keys.get(tmp_index);
			min2_i = tmp_index;
		}
		tmp_index++;
	}
	min_union_k.addAll(min1_k);
	min_union_k.addAll(min2_k);
	values.remove(Math.max(min1_i, min2_i));
	values.remove(Math.min(min1_i, min2_i));
	keys.remove(Math.max(min1_i, min2_i));
	keys.remove(Math.min(min1_i, min2_i));
	values.add(min1_v+min2_v);
	keys.add(min_union_k);
	System.out.println();
	System.out.println(keys);
	System.out.println(values);
}

 

This code will incrementally merge the two smallest (in frequency score) sets of primes together, until there is just 2 sets,  keeping a record of this will provide us with an answer-key to how to optimally partition the possible primes when we want to ask a question to narrow down our options.

Here is the output, it starts with all the primes, each in their own set, and below each list is the corresponding frequency scores for each set, when two of the least-valued primes are merged, their scores are summed:

[[2], [3], [5], [7], [11], [13], [17], [19], [23], [29], [31], [37], [41], [43], [47], [53], [59], [61], [67], [71], [73], [79], [83], [89], [97], [101], [103], [107], [109], [113], [127], [131], [137], [139], [149], [151], [157], [163]]
[83, 28, 11, 7, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

[[2], [3], [5], [7], [11], [19], [23], [29], [31], [37], [41], [43], [47], [53], [59], [61], [67], [71], [73], [79], [83], [89], [97], [101], [103], [107], [109], [113], [127], [131], [137], [139], [149], [151], [157], [163], [13, 17]]
[83, 28, 11, 7, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2]

[[2], [3], [5], [7], [11], [29], [31], [37], [41], [43], [47], [53], [59], [61], [67], [71], [73], [79], [83], [89], [97], [101], [103], [107], [109], [113], [127], [131], [137], [139], [149], [151], [157], [163], [13, 17], [19, 23]]
[83, 28, 11, 7, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2]

[[2], [3], [5], [7], [11], [37], [41], [43], [47], [53], [59], [61], [67], [71], [73], [79], [83], [89], [97], [101], [103], [107], [109], [113], [127], [131], [137], [139], [149], [151], [157], [163], [13, 17], [19, 23], [29, 31]]
[83, 28, 11, 7, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2]

[[2], [3], [5], [7], [11], [43], [47], [53], [59], [61], [67], [71], [73], [79], [83], [89], [97], [101], [103], [107], [109], [113], [127], [131], [137], [139], [149], [151], [157], [163], [13, 17], [19, 23], [29, 31], [37, 41]]
[83, 28, 11, 7, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2]

[[2], [3], [5], [7], [11], [53], [59], [61], [67], [71], [73], [79], [83], [89], [97], [101], [103], [107], [109], [113], [127], [131], [137], [139], [149], [151], [157], [163], [13, 17], [19, 23], [29, 31], [37, 41], [43, 47]]
[83, 28, 11, 7, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2]

[[2], [3], [5], [7], [11], [61], [67], [71], [73], [79], [83], [89], [97], [101], [103], [107], [109], [113], [127], [131], [137], [139], [149], [151], [157], [163], [13, 17], [19, 23], [29, 31], [37, 41], [43, 47], [53, 59]]
[83, 28, 11, 7, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2]

[[2], [3], [5], [7], [11], [71], [73], [79], [83], [89], [97], [101], [103], [107], [109], [113], [127], [131], [137], [139], [149], [151], [157], [163], [13, 17], [19, 23], [29, 31], [37, 41], [43, 47], [53, 59], [61, 67]]
[83, 28, 11, 7, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2]

[[2], [3], [5], [7], [11], [79], [83], [89], [97], [101], [103], [107], [109], [113], [127], [131], [137], [139], [149], [151], [157], [163], [13, 17], [19, 23], [29, 31], [37, 41], [43, 47], [53, 59], [61, 67], [71, 73]]
[83, 28, 11, 7, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2]

[[2], [3], [5], [7], [11], [89], [97], [101], [103], [107], [109], [113], [127], [131], [137], [139], [149], [151], [157], [163], [13, 17], [19, 23], [29, 31], [37, 41], [43, 47], [53, 59], [61, 67], [71, 73], [79, 83]]
[83, 28, 11, 7, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2]

[[2], [3], [5], [7], [11], [101], [103], [107], [109], [113], [127], [131], [137], [139], [149], [151], [157], [163], [13, 17], [19, 23], [29, 31], [37, 41], [43, 47], [53, 59], [61, 67], [71, 73], [79, 83], [89, 97]]
[83, 28, 11, 7, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]

[[2], [3], [5], [7], [11], [107], [109], [113], [127], [131], [137], [139], [149], [151], [157], [163], [13, 17], [19, 23], [29, 31], [37, 41], [43, 47], [53, 59], [61, 67], [71, 73], [79, 83], [89, 97], [101, 103]]
[83, 28, 11, 7, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]

[[2], [3], [5], [7], [11], [113], [127], [131], [137], [139], [149], [151], [157], [163], [13, 17], [19, 23], [29, 31], [37, 41], [43, 47], [53, 59], [61, 67], [71, 73], [79, 83], [89, 97], [101, 103], [107, 109]]
[83, 28, 11, 7, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]

[[2], [3], [5], [7], [11], [131], [137], [139], [149], [151], [157], [163], [13, 17], [19, 23], [29, 31], [37, 41], [43, 47], [53, 59], [61, 67], [71, 73], [79, 83], [89, 97], [101, 103], [107, 109], [113, 127]]
[83, 28, 11, 7, 3, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]

[[2], [3], [5], [7], [11], [139], [149], [151], [157], [163], [13, 17], [19, 23], [29, 31], [37, 41], [43, 47], [53, 59], [61, 67], [71, 73], [79, 83], [89, 97], [101, 103], [107, 109], [113, 127], [131, 137]]
[83, 28, 11, 7, 3, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]

[[2], [3], [5], [7], [11], [151], [157], [163], [13, 17], [19, 23], [29, 31], [37, 41], [43, 47], [53, 59], [61, 67], [71, 73], [79, 83], [89, 97], [101, 103], [107, 109], [113, 127], [131, 137], [139, 149]]
[83, 28, 11, 7, 3, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]

[[2], [3], [5], [7], [11], [163], [13, 17], [19, 23], [29, 31], [37, 41], [43, 47], [53, 59], [61, 67], [71, 73], [79, 83], [89, 97], [101, 103], [107, 109], [113, 127], [131, 137], [139, 149], [151, 157]]
[83, 28, 11, 7, 3, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]

[[2], [3], [5], [7], [11], [19, 23], [29, 31], [37, 41], [43, 47], [53, 59], [61, 67], [71, 73], [79, 83], [89, 97], [101, 103], [107, 109], [113, 127], [131, 137], [139, 149], [151, 157], [163, 13, 17]]
[83, 28, 11, 7, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3]

[[2], [3], [5], [7], [11], [37, 41], [43, 47], [53, 59], [61, 67], [71, 73], [79, 83], [89, 97], [101, 103], [107, 109], [113, 127], [131, 137], [139, 149], [151, 157], [163, 13, 17], [19, 23, 29, 31]]
[83, 28, 11, 7, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 4]

[[2], [3], [5], [7], [11], [53, 59], [61, 67], [71, 73], [79, 83], [89, 97], [101, 103], [107, 109], [113, 127], [131, 137], [139, 149], [151, 157], [163, 13, 17], [19, 23, 29, 31], [37, 41, 43, 47]]
[83, 28, 11, 7, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 4, 4]

[[2], [3], [5], [7], [11], [71, 73], [79, 83], [89, 97], [101, 103], [107, 109], [113, 127], [131, 137], [139, 149], [151, 157], [163, 13, 17], [19, 23, 29, 31], [37, 41, 43, 47], [53, 59, 61, 67]]
[83, 28, 11, 7, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 4, 4, 4]

[[2], [3], [5], [7], [11], [89, 97], [101, 103], [107, 109], [113, 127], [131, 137], [139, 149], [151, 157], [163, 13, 17], [19, 23, 29, 31], [37, 41, 43, 47], [53, 59, 61, 67], [71, 73, 79, 83]]
[83, 28, 11, 7, 3, 2, 2, 2, 2, 2, 2, 2, 3, 4, 4, 4, 4]

[[2], [3], [5], [7], [11], [107, 109], [113, 127], [131, 137], [139, 149], [151, 157], [163, 13, 17], [19, 23, 29, 31], [37, 41, 43, 47], [53, 59, 61, 67], [71, 73, 79, 83], [89, 97, 101, 103]]
[83, 28, 11, 7, 3, 2, 2, 2, 2, 2, 3, 4, 4, 4, 4, 4]

[[2], [3], [5], [7], [11], [131, 137], [139, 149], [151, 157], [163, 13, 17], [19, 23, 29, 31], [37, 41, 43, 47], [53, 59, 61, 67], [71, 73, 79, 83], [89, 97, 101, 103], [107, 109, 113, 127]]
[83, 28, 11, 7, 3, 2, 2, 2, 3, 4, 4, 4, 4, 4, 4]

[[2], [3], [5], [7], [11], [151, 157], [163, 13, 17], [19, 23, 29, 31], [37, 41, 43, 47], [53, 59, 61, 67], [71, 73, 79, 83], [89, 97, 101, 103], [107, 109, 113, 127], [131, 137, 139, 149]]
[83, 28, 11, 7, 3, 2, 3, 4, 4, 4, 4, 4, 4, 4]

[[2], [3], [5], [7], [163, 13, 17], [19, 23, 29, 31], [37, 41, 43, 47], [53, 59, 61, 67], [71, 73, 79, 83], [89, 97, 101, 103], [107, 109, 113, 127], [131, 137, 139, 149], [151, 157, 11]]
[83, 28, 11, 7, 3, 4, 4, 4, 4, 4, 4, 4, 5]

[[2], [3], [5], [7], [37, 41, 43, 47], [53, 59, 61, 67], [71, 73, 79, 83], [89, 97, 101, 103], [107, 109, 113, 127], [131, 137, 139, 149], [151, 157, 11], [163, 13, 17, 19, 23, 29, 31]]
[83, 28, 11, 7, 4, 4, 4, 4, 4, 4, 5, 7]

[[2], [3], [5], [7], [71, 73, 79, 83], [89, 97, 101, 103], [107, 109, 113, 127], [131, 137, 139, 149], [151, 157, 11], [163, 13, 17, 19, 23, 29, 31], [37, 41, 43, 47, 53, 59, 61, 67]]
[83, 28, 11, 7, 4, 4, 4, 4, 5, 7, 8]

[[2], [3], [5], [7], [107, 109, 113, 127], [131, 137, 139, 149], [151, 157, 11], [163, 13, 17, 19, 23, 29, 31], [37, 41, 43, 47, 53, 59, 61, 67], [71, 73, 79, 83, 89, 97, 101, 103]]
[83, 28, 11, 7, 4, 4, 5, 7, 8, 8]

[[2], [3], [5], [7], [151, 157, 11], [163, 13, 17, 19, 23, 29, 31], [37, 41, 43, 47, 53, 59, 61, 67], [71, 73, 79, 83, 89, 97, 101, 103], [107, 109, 113, 127, 131, 137, 139, 149]]
[83, 28, 11, 7, 5, 7, 8, 8, 8]

[[2], [3], [5], [163, 13, 17, 19, 23, 29, 31], [37, 41, 43, 47, 53, 59, 61, 67], [71, 73, 79, 83, 89, 97, 101, 103], [107, 109, 113, 127, 131, 137, 139, 149], [151, 157, 11, 7]]
[83, 28, 11, 7, 8, 8, 8, 12]

[[2], [3], [5], [71, 73, 79, 83, 89, 97, 101, 103], [107, 109, 113, 127, 131, 137, 139, 149], [151, 157, 11, 7], [163, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67]]
[83, 28, 11, 8, 8, 12, 15]

[[2], [3], [5], [151, 157, 11, 7], [163, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67], [71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149]]
[83, 28, 11, 12, 15, 16]

[[2], [3], [163, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67], [71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149], [5, 151, 157, 11, 7]]
[83, 28, 15, 16, 23]

[[2], [3], [5, 151, 157, 11, 7], [163, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149]]
[83, 28, 23, 31]

[[2], [163, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149], [5, 151, 157, 11, 7, 3]]
[83, 31, 51]

[[2], [163, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 5, 151, 157, 11, 7, 3]]
[83, 82]

The ouptut shows the algorithm in action, essentially building the partition tree from the ground, in each pair of lists, the first is a list of groups of primes, the 2nd is the corresponding frequency value of each group.  The final code will record each merging, each step, as a guide to how to partition any remaining group.  It will map out the partition tree of the possible solutions in a fashion that is the optimal in average-case.

Question 3+

so we simply move the partition from left to right (in the ordered list of remaining possible least prime factors) until the left group has equal to or greater than half the total occurrences as solutions to the 165 numbers.

Here is the complete code implementing this average-case algorithm and applying it to each number over the range of this problem, computing average-case performance in the process.

The complete program in Python 3.1:

import time
t0=time.time()
prime_list = []
maxx = 166
for n in range(2, maxx+1):
    isprime = True
    for p in prime_list:
        if n % p is 0:
            isprime = False
            break
    if isprime == True:
        prime_list.append(n)
def smallestPrimeFactor(n):
	for p in prime_list:
		if n % p == 0:
			return p
prime_count={p : 0 for p in prime_list}
for n in range(2, maxx + 1):
	prime_count[smallestPrimeFactor(n)] += 1
keys = []
values = []
in_sets = []
out_sets = []
for p in sorted(prime_count):
	keys.append([p])
	values.append(prime_count[p])
while len(keys) > 1:
	min1_value = min(values)
	min1_index = values.index(min1_value)
	min1_key = keys[min1_index]
	del values[min1_index]
	del keys[min1_index]
	min2_value = min(values)
	min2_index = values.index(min2_value)
	min2_key = keys[min2_index]
	del keys[min2_index]
	del values[min2_index]
	in_sets.append(set(min1_key + min2_key))
	out_sets.append(set(min1_key))
	keys.append(min1_key + min2_key)
	values.append(min1_value + min2_value)
#now that we have a partition answer-key (mapping of in_sets to out_sets)
#let's increment through the entire range of integers and compute the algorithm's performance
print("done with partition answer-key, time elapsed =", int(1000*(time.time()-t0)), "ms")
question_history = []
for n in range(2, maxx + 1):
	prime_subset = set(prime_list)
	question_count = 0
	while True:
		question_count += 1
		checkset = out_sets[in_sets.index(prime_subset)]
		if smallestPrimeFactor(n) in checkset:
			prime_subset = checkset
		else:
			prime_subset = prime_subset-checkset
		if len(prime_subset) == 1:
			print( "number =", n, " smallest factor (>1) =", list(prime_subset)[0], " questions asked =", question_count)
			question_history.append(question_count)
			break
print( "worst case   =", max(question_history), "questions" )
print( "best case    =", min(question_history), "questoins" )
print( "average case =", sum(question_history), "/", (maxx-1), "~=",sum(question_history)/(maxx-1), "questions" )
print( "time elapsed =", int((time.time() - t0)*1000),"ms")

Alternatively, the complete Java program:

import java.util.*;
import java.lang.Math;
public class primes {
	public static ArrayList<Integer> prime_list = new ArrayList<Integer>();
	public static void main(String args[]){
		Long t0 = System.currentTimeMillis();
		int maxx = 166;
		boolean isPrime;
		for(int n=2; n <= maxx; n++){
			isPrime = true;
			for(int p : prime_list){
				if(n % p == 0){
					isPrime = false;
					break;
				}
			}
			if(isPrime){
				prime_list.add(n);
			}
		}
		Hashtable<Integer,Integer> prime_count = new Hashtable<Integer,Integer>();
		ArrayList<Integer> values = new ArrayList<Integer>();
		ArrayList<HashSet<Integer>> keys = new ArrayList<HashSet<Integer>>();
		HashSet<Integer> prime_set = new HashSet<Integer>();
		for(int p:prime_list){
			prime_count.put(p,0);
			prime_set.add(p);
		}
		for(int n=2; n <= maxx; n++){
			prime_count.put(smallestPrimeFactor(n),prime_count.get(smallestPrimeFactor(n))+1);
		}
		for(int p:prime_list){
			HashSet<Integer> temp = new HashSet<Integer>();
			temp.add(p);
			keys.add(temp);
			values.add(prime_count.get(p));
		}
		ArrayList<HashSet<Integer>> in_sets = new ArrayList<HashSet<Integer>>();
		ArrayList<HashSet<Integer>> out_sets = new ArrayList<HashSet<Integer>>();
		while(keys.size()>1){
			int tmp_index = 0;
			int min1_v = 0;
			int min2_v = 0;
			int min1_i = 0;
			int min2_i = 0;
			HashSet<Integer> min1_k = new HashSet<Integer>();
			HashSet<Integer> min2_k = new HashSet<Integer>();
			HashSet<Integer> min_union_k = new HashSet<Integer>();
			for(int v:values){
				if(tmp_index == 0 | v < min1_v){
					min2_v = min1_v;
					min2_k = min1_k;
					min2_i = min1_i;
					min1_v = v;
					min1_k = keys.get(tmp_index);
					min1_i = tmp_index;
				}else if(v < min2_v | tmp_index == 1){
					min2_v = v;
					min2_k = keys.get(tmp_index);
					min2_i = tmp_index;
				}
				tmp_index++;
			}
			min_union_k.addAll(min1_k);
			min_union_k.addAll(min2_k);
			in_sets.add(min_union_k);
			out_sets.add(min2_k);
			values.remove(Math.max(min1_i, min2_i));
			values.remove(Math.min(min1_i, min2_i));
			keys.remove(Math.max(min1_i, min2_i));
			keys.remove(Math.min(min1_i, min2_i));
			values.add(min1_v+min2_v);
			keys.add(min_union_k);
		}
		//now that we have a partition answer-key (mapping of in_sets to out_sets)
		//let's increment through the entire range of integers and compute the algorithm's performance
		System.out.println("done with partition answer-key, time elapsed = " + (System.currentTimeMillis()-t0) + " ms");
		int max_questions = -1;
		int min_questions = -1;
		int questions_sum = 0;
		for(int n=2; n <= maxx; n++){
			HashSet<Integer> prime_subset = new HashSet<Integer>();
			prime_subset.addAll(prime_set);
			int question_count = 0;
			int index = -1;
			while(true){
				question_count++;
				index = in_sets.indexOf(prime_subset);
				if( out_sets.get(index).contains(smallestPrimeFactor(n))){
					prime_subset.clear();
					prime_subset.addAll(out_sets.get(index));
				}else{
					prime_subset.removeAll(out_sets.get(index));
				}
				if(prime_subset.size() == 1){
					System.out.println("number = " + n + "  smallest factor (>1) = " + prime_subset.toArray()[0] + "  questions asked = " + question_count);
					if(question_count > max_questions){
						max_questions = question_count;
					}
					if(question_count < min_questions || min_questions == -1){
						min_questions = question_count;
					}
					questions_sum += question_count;
					break;
				}
			}
		}
		System.out.println("worst case   = " + max_questions + " questions");
		System.out.println("best case    = " + min_questions + " questions");
		System.out.println("average case = " + questions_sum + " / " + (maxx-1) + " ~= " + (float) questions_sum/(maxx-1) + " questions");
		System.out.println("time elapsed = " + (System.currentTimeMillis()-t0) + " ms");
	}
	public static int smallestPrimeFactor(int n){
		for(int p: prime_list){
			if(n % p == 0 ){
				return p;
			}
		}
		return 0;
	}
}

The code could be optimized for speed (particularly when obtaining the index in the partition list (in_sets) to use on out_sets, currently it’s dumb, but could be improved with some sort of hash-map).  With 166 integers, speed of execution was not an issue.

Both output:

done with partition answer-key, time elapsed = 0 ms
number = 2  smallest factor (>1) = 2  questions asked = 1
number = 3  smallest factor (>1) = 3  questions asked = 3
number = 4  smallest factor (>1) = 2  questions asked = 1
number = 5  smallest factor (>1) = 5  questions asked = 4
number = 6  smallest factor (>1) = 2  questions asked = 1
number = 7  smallest factor (>1) = 7  questions asked = 5
number = 8  smallest factor (>1) = 2  questions asked = 1
number = 9  smallest factor (>1) = 3  questions asked = 3
number = 10  smallest factor (>1) = 2  questions asked = 1
number = 11  smallest factor (>1) = 11  questions asked = 6
number = 12  smallest factor (>1) = 2  questions asked = 1
number = 13  smallest factor (>1) = 13  questions asked = 7
number = 14  smallest factor (>1) = 2  questions asked = 1
number = 15  smallest factor (>1) = 3  questions asked = 3
number = 16  smallest factor (>1) = 2  questions asked = 1
number = 17  smallest factor (>1) = 17  questions asked = 7
number = 18  smallest factor (>1) = 2  questions asked = 1
number = 19  smallest factor (>1) = 19  questions asked = 7
number = 20  smallest factor (>1) = 2  questions asked = 1
number = 21  smallest factor (>1) = 3  questions asked = 3
number = 22  smallest factor (>1) = 2  questions asked = 1
number = 23  smallest factor (>1) = 23  questions asked = 7
number = 24  smallest factor (>1) = 2  questions asked = 1
number = 25  smallest factor (>1) = 5  questions asked = 4
number = 26  smallest factor (>1) = 2  questions asked = 1
number = 27  smallest factor (>1) = 3  questions asked = 3
number = 28  smallest factor (>1) = 2  questions asked = 1
number = 29  smallest factor (>1) = 29  questions asked = 7
number = 30  smallest factor (>1) = 2  questions asked = 1
number = 31  smallest factor (>1) = 31  questions asked = 7
number = 32  smallest factor (>1) = 2  questions asked = 1
number = 33  smallest factor (>1) = 3  questions asked = 3
number = 34  smallest factor (>1) = 2  questions asked = 1
number = 35  smallest factor (>1) = 5  questions asked = 4
number = 36  smallest factor (>1) = 2  questions asked = 1
number = 37  smallest factor (>1) = 37  questions asked = 7
number = 38  smallest factor (>1) = 2  questions asked = 1
number = 39  smallest factor (>1) = 3  questions asked = 3
number = 40  smallest factor (>1) = 2  questions asked = 1
number = 41  smallest factor (>1) = 41  questions asked = 7
number = 42  smallest factor (>1) = 2  questions asked = 1
number = 43  smallest factor (>1) = 43  questions asked = 7
number = 44  smallest factor (>1) = 2  questions asked = 1
number = 45  smallest factor (>1) = 3  questions asked = 3
number = 46  smallest factor (>1) = 2  questions asked = 1
number = 47  smallest factor (>1) = 47  questions asked = 7
number = 48  smallest factor (>1) = 2  questions asked = 1
number = 49  smallest factor (>1) = 7  questions asked = 5
number = 50  smallest factor (>1) = 2  questions asked = 1
number = 51  smallest factor (>1) = 3  questions asked = 3
number = 52  smallest factor (>1) = 2  questions asked = 1
number = 53  smallest factor (>1) = 53  questions asked = 7
number = 54  smallest factor (>1) = 2  questions asked = 1
number = 55  smallest factor (>1) = 5  questions asked = 4
number = 56  smallest factor (>1) = 2  questions asked = 1
number = 57  smallest factor (>1) = 3  questions asked = 3
number = 58  smallest factor (>1) = 2  questions asked = 1
number = 59  smallest factor (>1) = 59  questions asked = 7
number = 60  smallest factor (>1) = 2  questions asked = 1
number = 61  smallest factor (>1) = 61  questions asked = 7
number = 62  smallest factor (>1) = 2  questions asked = 1
number = 63  smallest factor (>1) = 3  questions asked = 3
number = 64  smallest factor (>1) = 2  questions asked = 1
number = 65  smallest factor (>1) = 5  questions asked = 4
number = 66  smallest factor (>1) = 2  questions asked = 1
number = 67  smallest factor (>1) = 67  questions asked = 7
number = 68  smallest factor (>1) = 2  questions asked = 1
number = 69  smallest factor (>1) = 3  questions asked = 3
number = 70  smallest factor (>1) = 2  questions asked = 1
number = 71  smallest factor (>1) = 71  questions asked = 7
number = 72  smallest factor (>1) = 2  questions asked = 1
number = 73  smallest factor (>1) = 73  questions asked = 7
number = 74  smallest factor (>1) = 2  questions asked = 1
number = 75  smallest factor (>1) = 3  questions asked = 3
number = 76  smallest factor (>1) = 2  questions asked = 1
number = 77  smallest factor (>1) = 7  questions asked = 5
number = 78  smallest factor (>1) = 2  questions asked = 1
number = 79  smallest factor (>1) = 79  questions asked = 7
number = 80  smallest factor (>1) = 2  questions asked = 1
number = 81  smallest factor (>1) = 3  questions asked = 3
number = 82  smallest factor (>1) = 2  questions asked = 1
number = 83  smallest factor (>1) = 83  questions asked = 7
number = 84  smallest factor (>1) = 2  questions asked = 1
number = 85  smallest factor (>1) = 5  questions asked = 4
number = 86  smallest factor (>1) = 2  questions asked = 1
number = 87  smallest factor (>1) = 3  questions asked = 3
number = 88  smallest factor (>1) = 2  questions asked = 1
number = 89  smallest factor (>1) = 89  questions asked = 7
number = 90  smallest factor (>1) = 2  questions asked = 1
number = 91  smallest factor (>1) = 7  questions asked = 5
number = 92  smallest factor (>1) = 2  questions asked = 1
number = 93  smallest factor (>1) = 3  questions asked = 3
number = 94  smallest factor (>1) = 2  questions asked = 1
number = 95  smallest factor (>1) = 5  questions asked = 4
number = 96  smallest factor (>1) = 2  questions asked = 1
number = 97  smallest factor (>1) = 97  questions asked = 7
number = 98  smallest factor (>1) = 2  questions asked = 1
number = 99  smallest factor (>1) = 3  questions asked = 3
number = 100  smallest factor (>1) = 2  questions asked = 1
number = 101  smallest factor (>1) = 101  questions asked = 7
number = 102  smallest factor (>1) = 2  questions asked = 1
number = 103  smallest factor (>1) = 103  questions asked = 7
number = 104  smallest factor (>1) = 2  questions asked = 1
number = 105  smallest factor (>1) = 3  questions asked = 3
number = 106  smallest factor (>1) = 2  questions asked = 1
number = 107  smallest factor (>1) = 107  questions asked = 7
number = 108  smallest factor (>1) = 2  questions asked = 1
number = 109  smallest factor (>1) = 109  questions asked = 7
number = 110  smallest factor (>1) = 2  questions asked = 1
number = 111  smallest factor (>1) = 3  questions asked = 3
number = 112  smallest factor (>1) = 2  questions asked = 1
number = 113  smallest factor (>1) = 113  questions asked = 7
number = 114  smallest factor (>1) = 2  questions asked = 1
number = 115  smallest factor (>1) = 5  questions asked = 4
number = 116  smallest factor (>1) = 2  questions asked = 1
number = 117  smallest factor (>1) = 3  questions asked = 3
number = 118  smallest factor (>1) = 2  questions asked = 1
number = 119  smallest factor (>1) = 7  questions asked = 5
number = 120  smallest factor (>1) = 2  questions asked = 1
number = 121  smallest factor (>1) = 11  questions asked = 6
number = 122  smallest factor (>1) = 2  questions asked = 1
number = 123  smallest factor (>1) = 3  questions asked = 3
number = 124  smallest factor (>1) = 2  questions asked = 1
number = 125  smallest factor (>1) = 5  questions asked = 4
number = 126  smallest factor (>1) = 2  questions asked = 1
number = 127  smallest factor (>1) = 127  questions asked = 7
number = 128  smallest factor (>1) = 2  questions asked = 1
number = 129  smallest factor (>1) = 3  questions asked = 3
number = 130  smallest factor (>1) = 2  questions asked = 1
number = 131  smallest factor (>1) = 131  questions asked = 7
number = 132  smallest factor (>1) = 2  questions asked = 1
number = 133  smallest factor (>1) = 7  questions asked = 5
number = 134  smallest factor (>1) = 2  questions asked = 1
number = 135  smallest factor (>1) = 3  questions asked = 3
number = 136  smallest factor (>1) = 2  questions asked = 1
number = 137  smallest factor (>1) = 137  questions asked = 7
number = 138  smallest factor (>1) = 2  questions asked = 1
number = 139  smallest factor (>1) = 139  questions asked = 7
number = 140  smallest factor (>1) = 2  questions asked = 1
number = 141  smallest factor (>1) = 3  questions asked = 3
number = 142  smallest factor (>1) = 2  questions asked = 1
number = 143  smallest factor (>1) = 11  questions asked = 6
number = 144  smallest factor (>1) = 2  questions asked = 1
number = 145  smallest factor (>1) = 5  questions asked = 4
number = 146  smallest factor (>1) = 2  questions asked = 1
number = 147  smallest factor (>1) = 3  questions asked = 3
number = 148  smallest factor (>1) = 2  questions asked = 1
number = 149  smallest factor (>1) = 149  questions asked = 7
number = 150  smallest factor (>1) = 2  questions asked = 1
number = 151  smallest factor (>1) = 151  questions asked = 7
number = 152  smallest factor (>1) = 2  questions asked = 1
number = 153  smallest factor (>1) = 3  questions asked = 3
number = 154  smallest factor (>1) = 2  questions asked = 1
number = 155  smallest factor (>1) = 5  questions asked = 4
number = 156  smallest factor (>1) = 2  questions asked = 1
number = 157  smallest factor (>1) = 157  questions asked = 7
number = 158  smallest factor (>1) = 2  questions asked = 1
number = 159  smallest factor (>1) = 3  questions asked = 3
number = 160  smallest factor (>1) = 2  questions asked = 1
number = 161  smallest factor (>1) = 7  questions asked = 5
number = 162  smallest factor (>1) = 2  questions asked = 1
number = 163  smallest factor (>1) = 163  questions asked = 6
number = 164  smallest factor (>1) = 2  questions asked = 1
number = 165  smallest factor (>1) = 3  questions asked = 3
number = 166  smallest factor (>1) = 2  questions asked = 1
worst case   = 7 questions
best case    = 1 questions
average case = 494 / 165 ~= 2.99393939394 questions
time elapsed = 18 ms

Here we see that we can figure out the smallest positive integer divisor greater than 1 in under 3 questions on average with this methodology, even though the worst case is a little worse than our algorithm for part 1 of this puzzle.

 

I tried to keep the Java and Python code comparable (although not always asymptotically optimal) so we could have a rough comparison of performance.

With 166 numbers, python does the job a little faster (18ms vs. 58ms), but for 16,600 numbers, Java is about 50% faster.  Java becomes an order of magnitude faster as we get to 166,000 numbers.

 

This methodology impressively determines the smallest prime factor of any positive integer from 2 to 166,000 in ~4.04 questions on average (that’s over 15,000 primes)!

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: