CA318 - Advanced Algorithms and AI Search - 2023-24

Sample paper

Multiple choice

Answer all 20 multiple-choice questions. (2 marks each.) Total 40 marks. No negative marking.
Only the filled-in boxes will be looked at. No additional commentary or analysis will be read.
Remember to hand up the multiple-choice answer sheet!

The input to the 7th output node of a neural network is:
x₇ = w₁₇ y₁ + w₂₇ y₂ + w₃₇ y₃
Then ∂x₇ / ∂w₂₇ is:
1. y₂
2. y_j
3. w₂₇
4. w_jk
5. w_jk
6. y_j
A 3-dimensional AND gate perceptron looks like:
1. w₁=w₂=w₃=1, t=0
2. w₁=w₂=w₃=1, t=3
3. w₁=w₂=w₃=0, t=3
4. w₁=w₂=w₃=1, t=1
5. w₁=w₂=w₃=0, t=0
6. w₁=w₂=w₃=0, t=1
Why would we in practice not learn the function y = sin(x) + (1/x) from exemplars?
1. We would learn it from exemplars.
2. It would take too long.
3. Learning it from exemplars is impossible for some values of x.
4. Learning it from exemplars is impossible.
5. It is a chaotic function.
6. There is nothing to learn. The problem is solved.
7. Divide by zero error.
A problem has a branching factor of x. To examine all possible states between the current state and n steps into the future requires examining how many states?
1. nx
2. x+xⁿ
3. x
4. x + x² + x³ + ... + xⁿ
5. xⁿ
6. n
7. x^x
8. n^x
In machine evolution, we have 4 individuals, where individual i has fitness f(i):
```
 i     1  2  3  4 
```
```
 f(i)  0  5  4  5 
```
If we use a "Soft max" probability distribution with a "temperature" parameter T to choose who reproduces, then as T -> infinity the probability of each individual reproducing goes to:
1. ```
 0    0.9   0.1   0.9 		
```
2. ```
 0    5/14  4/14  5/14 		
```
3. ```
 0    1     0     0 			
```
4. ```
 1/4  1/4   1/4   1/4 		
```
5. ```
 0    1/2   0     1/2 		
```
6. ```
 0    1     0     1 			
```
In machine evolution, each individual has a genotype encoded in binary format. We repeatedly select 2 parents to make 2 children by copying the 2 genotypes, with the possibility of mutation and crossover. Let p_c be the probability of crossover between the 2 genotypes. Let p_m be the probability that any bit in the genotype mutates when it is inherited.
Each generation, we only pick the single best individual to be both parents of each child. Then:
1. We rely on mutation to produce new solutions.
2. Both mutation and crossover will produce new solutions.
3. This is a normal search.
4. No new solutions are produced.
5. We rely on crossover to produce new solutions.
Neural Networks and GAs can both be seen as hill-climbing on an unknown landscape by taking a finite number of samples. We consider E, the error in our solution, and the direction in which to change our solution. Which of the following is the case:
1. For Neural Networks, E is known but is unknown. For GAs, neither E nor are known.
2. For Neural Networks, E is known and is known. For GAs, E is known but is unknown.
3. For Neural Networks, E is known and is known. For GAs, neither E nor are known.
4. For Neural Networks, E is known and is known. For GAs, E is known and is known.
In machine evolution, let the probability of an individual reproducing be its fitness divided by the sum of fitnesses of all individuals. Then:
1. Fitness must be < 0
2. Fitness can have any value.
3. Fitness can have any value except zero.
4. Fitness must be > 0
5. Fitness must be ≤ 0
6. Fitness must be ≥ 0
In machine evolution, each individual has a genotype encoded in binary format. We repeatedly select 2 parents to make 2 children by copying the 2 genotypes, with the possibility of mutation and crossover. Let p_c be the probability of crossover between the 2 genotypes. Let p_m be the probability that any bit in the genotype mutates when it is inherited.
Only one of the following carries out a serious search of the state space. All the others are constrained by whatever happens to be the initial population. i.e. Only one of the following can generate all possible genotypes, given enough time:
1. p_m=0, p_c=1
2. p_m=0, p_c=0
3. p_m=0.1, p_c=0
4. p_m=1, p_c=0
5. p_m=1, p_c=1
We have a neural network where inputs are finite (positive or negative). If a node has a (positive) infinite threshold, then:
1. If weights are positive, you have constant output 0.
2. If weights are negative, you have constant output 1.
3. If weights are finite, you have constant output 1.
4. If weights are finite, you have constant output 0.
5. If weights are negative, you have constant output 0.
6. None of the above.
7. If weights are positive, you have constant output 1.
One of these is not a property of neural networks:
1. Input is compared to previous Input, and "nearby" Input should produce "nearby" Output.
2. It can be used to represent a function from real-valued single-dimensional input to real-valued single-dimensional output.
3. As learning progresses, the size of the network remains fixed.
4. It can be used to represent any normal linear, continuous, non-chaotic function.
5. It can be used to represent a function from real-valued multi-dimensional input to real-valued multi-dimensional output.
6. It can be used to represent a function from integer-valued multi-dimensional input to integer-valued multi-dimensional output.
7. It can be used to represent any normal non-linear, continuous, non-chaotic function.
8. Given a network architecture, an Input leads to an Output after a fixed number of calculations.
In neural networks, where w_ij is the weight connecting input node i with hidden node j, then input I_i is irrelevant to the workings of the network if:
1. w_ij = 0 for all j.
2. w_ij = 0 for no j.
3. It is never irrelevant.
4. w_ij = 0 for at least one j.
5. It is always irrelevant.
In a multi-layer neural network, we have n hidden units with binary outputs, 1 output unit, weights w₁,..,w_n connecting them, and threshold t for the output unit. Which one of the following implements an output unit that fires if and only if all the hidden units fire.
1. w₁=w₂=..=w_n=0, t=1
2. w₁=w₂=..=w_n=1, t=0
3. w₁=w₂=..=w_n=W, W > 0, t=W
4. w₁=w₂=..=w_n=W, W > 0, t=W+n
5. w₁=w₂=..=w_n=W, W > 0, t=nW
6. w₁=w₂=..=w_n=W, W > 0, t=n
Consider a heuristic for human mate choice:
Have a relationship with n partners of fitness > f
Marry the next partner you find of fitness > g
where fitness is ranked by the individual out of 10.
Which of the following values for n,f,g (respectively) makes the most sense?
1. 0, 8, 3
2. 10, 5, 8
3. 10, 8, 3
4. 10, 0, 8
5. 0, 0, 8
6. 0, 5, 8
Which is the correct ordering of these search methods, from the least work for the human programmer to the most work for the human programmer?
1. Random search, Metaheuristic, Domain-specific heuristic
2. Metaheuristic, Random search, Domain-specific heuristic
3. Domain-specific heuristic, Metaheuristic, Random search
4. Domain-specific heuristic, Random search, Metaheuristic
5. Metaheuristic, Domain-specific heuristic, Random search
6. Random search, Domain-specific heuristic, Metaheuristic
Consider a heuristic for human mate choice:
Have a relationship with n partners of fitness > f
Marry the next partner you find of fitness > g
where fitness is ranked by the individual out of 10.
Which of the following values for n,f,g (respectively) makes the most sense?
1. 100, 8, 3
2. 10, 0, 8
3. 100, 5, 8
4. 100, 0, 8
5. 10, 8, 3
6. 10, 5, 8
An upper bound on the state-space complexity of Xs and Os is:
1. 3³
2. 9!
3. 9³
4. 3⁹
5. 3!
6. 9⁹
In machine evolution with asexual reproduction, mutation might break the parent's genotype and make it unfit. So therefore:
1. We should set an appropriate mutation rate for the problem.
2. We should only mutate at points that will improve the parent.
3. We should only mutate about 1 in a million bits.
4. We should mutate the unfit, not the fit.
5. The original statement is not true.
6. We should not mutate.
7. We should mutate the fit, not the unfit.
8. It is not something to worry about.
In a perceptron, let inputs be 0 or 1. Let the output be y = 0 or 1, and the correct output be C. The Perceptron Learning Rule can be summarised as:
1. If y=0 and C=1, then reduce threshold and increase weights.
2. If y=0 and C=1, then increase threshold and reduce weights.
3. If y=1 and C=0, then increase threshold and increase weights.
4. If y=0 and C=1, then reduce threshold and reduce weights.
5. If y=1 and C=0, then reduce threshold and increase weights.
Define a perceptron that receives x and y as input and fires for these points:
a x - b y ≤ -c
1. weights: -a, -b, threshold: -c
2. weights: -a, b, threshold: -c
3. weights: a, -b, threshold: -c
4. weights: a, -b, threshold: c
5. weights: a, b, threshold: -c
6. weights: a, b, threshold: c
7. weights: -a, b, threshold: c
8. weights: -a, -b, threshold: c

Long questions

Answer 2 long questions out of 4. (25 marks each.) Total 50 marks.

Each question gives maximum number of sentences for the answer. Do not exceed this.
You might like to write your answers as rough work first to figure out what sentences to use. Make sure any rough work is crossed out.
Put your sentences as bullet points not a paragraph.
If you answer more than 2 questions, I will pick the best 2.

Question 1

[Total marks: 25]

1(a) Is this function y = f(x) easy or hard to maximise from exemplars?
f(1.0372) = 80.
f(1.0373) = 100.
f(x) = 0 for all other x.
Maximum length: 3 sentences
[5 marks]

1(b) Consider trying to maximise a function of x, where x is an integer from 1 to 10²⁰. There are seven (and only seven) values of x where f(x) = 10. For all other x, f(x) = 1. The machine is not told this but has to find out by trial and error.
What would maximising this function by trial and error look like?
Maximum length: 3 sentences
[5 marks]

1(c) Mathematicians have long-standing tools to maximise functions and calculate function values.
What is different about the "functions" AI programs try to maximise and calculate values for?
Maximum length: 3 sentences
[5 marks]

1(d) We believe that some function exists from n-dimensional input x to output f(x). That is, we can set up our system with input x and some process will give us a value for f(x). We have no equation for f.
Contrast representing f(x) with maximising f(x).
Maximum length: 3 sentences
[5 marks]

1(e) Discuss trying to maximise the following function from exemplars. Explain what would happen during learning.
Input x is a number. It does not include the clock time.
The function is defined as follows:
if ( x > 10 ) f(x) = 1 if clock time is PM. f(x) = -1 if clock time is AM. else if ( x < -10 ) f(x) = -1 if clock time is PM. f(x) = 1 if clock time is AM. else f(x) = 0

Maximum length: 3 sentences
[5 marks]

Question 2
[Total marks: 25]

2(a) When a problem is represented as a State Space Search problem, the "search tree" is drawn. True or false?
Maximum length: 1 sentence
[5 marks]

2(b) Imagine the following situation in Best-first search. The heuristic evaluation function tells us a state is 4 steps away from the goal. We examine the children of this state and the heuristic evaluation function tells us all of them are 5 steps away from the goal. Can this happen?
Maximum length: 2 sentences
[5 marks]

2(c) In Best-first search what would happen if you add a constant c to every result returned by the heuristic evaluation function?
Maximum length: 2 sentences
[5 marks]

2(d) In the Travelling Salesman Problem with n cities, we say there is an upper bound of (n-1)! on the number of paths to be searched. What assumption did we just make?
Maximum length: 2 sentences
[5 marks]

2(e) In the N-puzzle problem, we use a heuristic (distance to goal estimate) as follows: h(n) = number of pieces out of place.
If h(x) = 5 and h(y) = 3 then is y a better state than x?
Maximum length: 2 sentences
[5 marks]

Question 3
[Total marks: 25]

3(a) A mobile robot has Input state = (angle of direction of nearest wall, velocity value). Output behaviour = one of (turn left, turn right, stop, keep going).
The Input state is clearly missing some information. What are the consequences of this missing information?
Maximum length: 3 sentences
[5 marks]

3(b) We are trying to build a character recognition program. Characters are images of 100 x 100 pixels. We are trying to recognise the digits "0" to "9".
Output is 1 dimensional so therefore the output vector should be 1 dimensional. True or false?
Maximum length: 3 sentences
[5 marks]

3(c) In a neural network, the network generates a (vector) output y = (y₁,..,y_n). We then tell it the "correct" output O = (O₁,..,O_n).
Consider the following as a measure of "error" E:

Why is this a bad measure of the error?
Maximum length: 3 sentences
[5 marks]

3(d) In supervised learning with a neural network, we show the network an Input, and then show it the correct Output. The network modifies its weights so that next time it sees the input, it will be more likely to generate the correct output.
Why such a roundabout method? Why not simply remember the correct output and return it exactly next time it is asked?
Maximum length: 2 sentences
[5 marks]

3(e) Given a line in 2-dimensional space:

a x + b y = c

Define a perceptron that fires if and only if the point is on this side of the line:

a x + b y <= c

Maximum length: 3 sentences
[5 marks]

Question 4
[Total marks: 25]

4(a) In a genetic algorithm (GA), let p_c be the probability of crossover.
What would happen if p_c = 1?
Maximum length: 2 sentences
[5 marks]

4(b) In a genetic algorithm (GA), let each individual i have fitness f(i), and let its probability of reproducing, p(i), be:

where T is the "temperature".
Show that if all fitnesses are the same, we pick a random individual.
Maximum length: 2 sentences

[5 marks]

4(c) Consider crossing over two parents to produce two offspring. The parents are binary bitstrings.
Name two circumstances under which the crossover operation will not produce any new diversity.
Maximum length: 2 sentences
[5 marks]

4(d) In a genetic algorithm (GA), let p_m be the probability that any bit in the chromosome mutates at the point where it is inherited from the parent by the child. Let the chromosome be binary, so that mutation means a bit flip.
What would happen if p_m = 0 (but you still had crossover)?
Maximum length: 4 sentences
[5 marks]

4(e) In a genetic algorithm: What would happen if you had a variable (rather than a constant) population size?
Maximum length: 3 sentences
[5 marks]

Bonus question

Bonus question. (10 marks.)
It is "bonus" because I tell you it in advance. The question is to make you do some reading. You can prepare this question in advance.
Bonus question:

Tell me about some AI-related algorithm or software that was not mentioned on the course.

CA318 - Advanced Algorithms and AI Search

Answer sheet

Black out the square that matches the correct answer.

Exam No:_____________________

Class:_____________________

Seat No:_____________________

Question	1.	2.	3.	4.	5.	6.	7.	8.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20