BCS/CSC 229: Computer Models of Human Perception and Cognition

Homework Assignment #2

Instructions: Answer all questions below. Include all requested calculations and graphs.

Also include the Python code that you wrote to answer the questions. When writing text

or equations, please write NEATLY!

(0) (Part A) At the top of the document that you turn in, place your name and the date.

(Part B) Next, please take the honor pledge. That is, write (by hand using a pen): “I affirm

that I have not given or received any unauthorized help on this assignment, and that this

work is my own.” Then sign your name.

(1) [WARNING: This problem is mathematically challenging. Don’t be surprised if you

struggle with it. Indeed, it may be smart to first work on the other homework problems, and

then return to this problem if time permits.] (Problem 2.4 from the draft of the textbook

by Ma, Kording, and Goldreich) Many Bayesian inference problems involve a product of

two or more Gaussians. A convenient property of Gaussians is that their product is also

Gaussian. In this problem, we will lead you through an example to derive this property

yourself. Consider an observer who infers a stimulus s from a measurement x. Suppose that

the measurement distribution p(x|s) is a Gaussian distribution with standard deviation σ

and the prior distribution is a Gaussian with mean µ and standard deviation σs.

(a) Write down the equations for p(x|s) and p(s).

(b) Use Bayes’ rule to write down the equation for the posterior p(s|x). Substitute p(x|s)

and p(s), but do not simplify.

The numerator is a product of two Gaussians. The denominator p(x) is a normalization

factor that ensures that the integral equals 1. For now, we will ignore it and focus on the

numerator.

(c) Apply the rule $e^A e^B = e^{A+B}$ to simplify the numerator.

(d) Expand the two quadratic terms in the exponent.

(e) Rewrite the exponent in the form $as^2 + bs + c$.

(f) Show that any quadratic function of the form $as^2 + bs + c$ can be written as:

$$a\left(s + \frac{b}{2a}\right)^2 + c - \frac{b^2}{4a}.$$

This operation is known as “completing the square”.

(g) Rewrite your expression obtained in (e) by completing the square.

(h) Apply the rule $e^A e^B = e^{A+B}$ to rewrite this into the form

$$e^{Z}\, e^{-\frac{(s-\mu_{\text{combined}})^2}{2\sigma_{\text{combined}}^2}}.$$

Express µcombined and σcombined in terms of x, σ, µ, and σs.

(i) Why is µcombined the same as the maximum-a-posteriori (MAP) estimate of the stimulus (i.e., the s that maximizes the posterior distribution p(s|x))?

(j) Recall that p(s|x) is a distribution and that its integral should therefore be equal

to 1. However, the expression that you obtained in (e) is not properly normalized because

we ignored p(x). Modify the expression such that it is properly normalized, without using p(x).

(Hint: Does $e^Z$ depend on s?)
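The completing-the-square identity in part (f) can be sanity-checked numerically before attempting the algebra. The following is a minimal sketch and not a proof: the function names `quadratic` and `completed_square`, the random seed, and the sampling ranges are illustrative assumptions that do not come from the textbook.

```python
import random


def quadratic(a, b, c, s):
    """Evaluate a*s^2 + b*s + c."""
    return a * s**2 + b * s + c


def completed_square(a, b, c, s):
    """Evaluate a*(s + b/(2a))^2 + c - b^2/(4a); requires a != 0."""
    return a * (s + b / (2 * a)) ** 2 + c - b**2 / (4 * a)


rng = random.Random(0)  # fixed seed (assumption) so the check is repeatable
max_abs_diff = 0.0
for _ in range(1000):
    # a is drawn negative here because the s^2 coefficient in a Gaussian
    # exponent is negative; the identity itself holds for any nonzero a.
    a = rng.uniform(-5.0, -0.1)
    b = rng.uniform(-5.0, 5.0)
    c = rng.uniform(-5.0, 5.0)
    s = rng.uniform(-10.0, 10.0)
    diff = abs(quadratic(a, b, c, s) - completed_square(a, b, c, s))
    max_abs_diff = max(max_abs_diff, diff)

# Largest disagreement between the two forms over all random draws.
print(max_abs_diff)
```

If the two forms are algebraically identical, the printed value should be at the level of floating-point rounding error.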

(2) (Problem 2.12 from the draft of the textbook by Ma, Kording, and Goldreich) An observer infers a stimulus s from a measurement x. Let’s say that on a particular trial, the

measurement is x = 30. The measurement distribution p(x|s) is Gaussian with standard

deviation σ = 5. Assume a Gaussian stimulus distribution p(s) with mean 20 and standard

deviation 4; this also serves as the prior distribution. We are now going to calculate the

posterior pdf using Python.

(a) Define a vector of possible s-values: 0, 0.2, 0.4, . . . , 40.

(b) Compute the likelihood function and the prior on this vector of values of s. [Hint: The

values of the prior distribution will not sum to one (instead, they should sum to 1/stepsize

where stepsize = 0.2). That is because we are approximating a continuous distribution by a

discrete distribution. A similar comment applies to the likelihood function, though keep in

mind that the likelihood function is not a distribution, and thus its values do not need to

sum to one.]

(c) Multiply the likelihood and the prior. In Python, elementwise multiplication of two

NumPy arrays can be achieved using the “*” operator.

(d) Divide this product by its sum over all s (normalization step).

(e) Convert this posterior probability mass function into a probability density function

by dividing by the step size you used in your vector of s-values (e.g., 0.2).

(f) Plot the likelihood, prior, and posterior in the same plot. Is the posterior wider

or narrower than the likelihood and prior? Do you expect this based on the equations we

discussed?

(g) Change the standard deviation of the measurement distribution to a very large value.

What happens to the posterior? Can you explain this?

(h) Change the standard deviation of the measurement distribution to a very small value.

What happens to the posterior? Can you explain this?
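Steps (a) through (e) of Question (2) can be sketched as follows. This is one possible layout under stated assumptions, not a reference solution: it assumes NumPy is available, the helper `gaussian_pdf` and all variable names are hypothetical, and the plotting in step (f) is left out.

```python
import numpy as np


def gaussian_pdf(v, mean, sd):
    """Gaussian density with the given mean and standard deviation."""
    return np.exp(-0.5 * ((v - mean) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))


# (a) Grid of hypothesized stimulus values: 0, 0.2, ..., 40 (201 points).
step = 0.2
s = np.linspace(0.0, 40.0, 201)

# Values given in the problem statement.
x = 30.0                           # measurement on this trial
sigma = 5.0                        # sd of the measurement distribution
mu_prior, sigma_prior = 20.0, 4.0  # prior mean and sd

# (b) Likelihood: p(x|s) at the fixed x, evaluated for every s on the grid.
likelihood = gaussian_pdf(x, s, sigma)
prior = gaussian_pdf(s, mu_prior, sigma_prior)

# (c) Elementwise product of likelihood and prior.
unnormalized = likelihood * prior

# (d) Normalize so the values sum to one (a probability mass function).
posterior_pmf = unnormalized / unnormalized.sum()

# (e) Divide by the step size to convert the mass function into a density.
posterior_pdf = posterior_pmf / step
```

Summing `posterior_pdf * step` should return 1, which is a quick check that steps (d) and (e) were applied in the intended order. Steps (g) and (h) only require changing `sigma` and rerunning.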

(3) (Problem 2.13 from the draft of the textbook by Ma, Kording, and Goldreich) Repeat

Question (2), but instead of using a single value of the measurement x, start with a fixed

value of s = 10. From this value of s, draw 10 values of x from the measurement distribution.

You should observe that, from trial to trial, the likelihood function and posterior probability

density function “jump around”. Observe how the posterior shifts under the influence of the

“jumping” likelihood function and stationary prior. Explain.
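The repeated-measurement setup in Question (3) can be sketched as below. This is a hedged illustration only: it assumes NumPy, reuses the noise and prior parameters of Question (2), and the helper `gaussian_pdf`, the seed, and the variable names are hypothetical. It records where each posterior peaks rather than plotting the curves.

```python
import numpy as np


def gaussian_pdf(v, mean, sd):
    """Gaussian density with the given mean and standard deviation."""
    return np.exp(-0.5 * ((v - mean) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))


rng = np.random.default_rng(0)  # fixed seed (assumption) for repeatability

step = 0.2
s_grid = np.linspace(0.0, 40.0, 201)
s_true = 10.0                      # fixed stimulus from the problem statement
sigma = 5.0                        # measurement sd, as in Question (2)
mu_prior, sigma_prior = 20.0, 4.0  # prior, as in Question (2)

# The prior does not depend on the measurement, so compute it once.
prior = gaussian_pdf(s_grid, mu_prior, sigma_prior)

# Draw 10 noisy measurements of the same stimulus.
x_draws = rng.normal(loc=s_true, scale=sigma, size=10)

posterior_peaks = []
for x in x_draws:
    likelihood = gaussian_pdf(x, s_grid, sigma)   # changes on every trial
    posterior = likelihood * prior
    posterior = posterior / (posterior.sum() * step)  # normalized density
    posterior_peaks.append(s_grid[np.argmax(posterior)])

# Compare each measurement with the location of the corresponding peak.
for x, peak in zip(x_draws, posterior_peaks):
    print(round(float(x), 2), round(float(peak), 2))
```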

(4) (Problem 2.14 from the draft of the textbook by Ma, Kording, and Goldreich) Continuing

from Questions (2) and (3), generate a distribution of maximum-a-posteriori (MAP) and

maximum likelihood (ML) estimates by:

(a) drawing an s from the stimulus distribution;

(b) drawing a single x from the measurement distribution, and calculating the posterior

distribution.

(c) For each of 1000 repetitions of (a) and (b), plot the MAP estimate (y-axis) against

the true stimulus (x-axis). On a separate graph, plot the MLE (i.e., measurement x) against

the true stimulus.

(d) Repeat (a), (b), and (c) using different values of the noise standard deviation relative

to prior standard deviation. When the noise standard deviation is very small, the MAP and

MLE plots should look the same. Why? When the noise standard deviation is very large,

the MAP plot looks flat, whereas the MLE plot looks very scattered. Why?
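The simulation in Question (4) can be sketched as below, with the MAP estimate read off a grid on each trial. This is one possible approach under stated assumptions: it assumes NumPy, uses the prior from Question (2), and the function `simulate_estimates`, the widened grid from -20 to 60, the seed, and the three noise levels are all illustrative choices. Scatter plots are omitted; the spread of each estimator is printed instead.

```python
import numpy as np


def gaussian_pdf(v, mean, sd):
    """Gaussian density with the given mean and standard deviation."""
    return np.exp(-0.5 * ((v - mean) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))


def simulate_estimates(sigma, n_trials=1000, seed=0):
    """Return (true stimuli, MAP estimates, ML estimates) for n_trials trials."""
    rng = np.random.default_rng(seed)
    mu_prior, sigma_prior = 20.0, 4.0
    # Assumed grid, wider than in Question (2) so that estimates are not
    # clipped at the edges. sigma should not be far smaller than the grid
    # step (0.2), or the likelihood can underflow to zero on every grid point.
    s_grid = np.linspace(-20.0, 60.0, 401)

    # (a) Draw a stimulus per trial; (b) draw one measurement per stimulus.
    s_true = rng.normal(mu_prior, sigma_prior, size=n_trials)
    x = rng.normal(s_true, sigma)

    # Rows index trials, columns index grid points.
    likelihood = gaussian_pdf(x[:, None], s_grid[None, :], sigma)
    prior = gaussian_pdf(s_grid, mu_prior, sigma_prior)
    unnormalized = likelihood * prior  # normalizing would not move the peak

    map_est = s_grid[np.argmax(unnormalized, axis=1)]
    ml_est = x  # the likelihood p(x|s) peaks at s = x
    return s_true, map_est, ml_est


for sigma in (0.5, 5.0, 50.0):
    s_true, map_est, ml_est = simulate_estimates(sigma)
    print(sigma, float(np.std(map_est)), float(np.std(ml_est)))
```

The returned arrays can be passed directly to a scatter-plot routine to produce the two graphs requested in (c).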

(5) (Problem 3.7 from the draft of the textbook by Ma, Kording, and Goldreich) In Chapters

2 and 3 (of the Ma, Kording, and Goldreich textbook), we were able to derive analytical

expressions for the posterior distribution. For more complex psychophysical tasks, however,

analytical solutions often do not exist. In such a case, we can use numerical methods to

approximate the distribution of interest. To get some familiarity with this method, we

will reconsider the cue combination experiment described in this chapter, but we will now

compute the distribution of MAP estimates using numerical methods. We assume that the

experimenter introduces a cue conflict between the auditory and the visual stimuli: sA = 5

and sV = 10. The standard deviation of the auditory and of the visual noise is σA = 2 and

σV = 1, respectively. We assume a flat (uniform) prior over s.

(a) Randomly draw an auditory measurement xA and a visual measurement xV from

their respective distributions. (It’s okay if a measurement has a negative value.)

(b) Plot the corresponding elementary likelihood functions, p(xA|s) and p(xV |s), in one

figure.

(c) Calculate the combined likelihood function, p(xA, xV |s), by numerically multiplying

the elementary likelihood functions in Python. Plot this function.

(d) Calculate the posterior distribution by normalizing the combined likelihood function.

Plot this distribution in the same figure as the likelihood functions.

(e) Use Python to find the MAP estimate of s, i.e., the value of s at which the posterior

distribution is maximal.

(f) Compare with the MAP estimate of s computed from Eq. (3.3) using the measurements drawn in (a). For convenience, here is Eq. (3.3):

$$\hat{s}_{\text{MAP}} = \frac{\dfrac{x_A}{\sigma_A^2} + \dfrac{x_V}{\sigma_V^2}}{\dfrac{1}{\sigma_A^2} + \dfrac{1}{\sigma_V^2}}$$

(g) In the above, we simulated a single trial and computed the observer’s MAP estimate

of s, given the noisy measurements on that trial. If an analytical solution does not exist

for the distribution of MAP estimates, we can repeat the above procedure many times to

approximate this distribution. Here, we practice this method even though an analytical

solution is available in this case. Draw 100 pairs (xA, xV ) and numerically compute the

observer’s MAP estimate for each pair as in (e).

(h) Compute the mean of the MAP estimates obtained in (g) and compare with the mean

estimate predicted using Eq. (3.5). For convenience, here is Eq. (3.5):

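$$w_A = \frac{\dfrac{1}{\sigma_A^2}}{\dfrac{1}{\sigma_A^2} + \dfrac{1}{\sigma_V^2}}, \qquad w_V = \frac{\dfrac{1}{\sigma_V^2}}{\dfrac{1}{\sigma_A^2} + \dfrac{1}{\sigma_V^2}}, \qquad \langle \hat{s} \rangle = w_A s_A + w_V s_V$$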

(i) Make a histogram of the MAP estimate (in Python, use the “numpy.histogram”

function).

(j) Relative auditory bias is defined as the mean MAP estimate minus the true auditory

stimulus, divided by the true visual stimulus minus the true auditory stimulus. Compute

relative auditory bias for your set of estimates.
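The numerical procedure of Question (5) can be sketched end to end as below. This is a minimal illustration under stated assumptions, not a reference solution: it assumes NumPy, and the grid range (-10 to 25 in steps of 0.01), the seed, the bin count, and the helper names `gaussian_pdf`, `grid_map`, and `analytic_map` are hypothetical choices. Plotting is omitted.

```python
import numpy as np


def gaussian_pdf(v, mean, sd):
    """Gaussian density with the given mean and standard deviation."""
    return np.exp(-0.5 * ((v - mean) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))


# Values given in the problem statement.
s_A, s_V = 5.0, 10.0
sigma_A, sigma_V = 2.0, 1.0

s_grid = np.linspace(-10.0, 25.0, 3501)  # assumed grid, step 0.01
step = s_grid[1] - s_grid[0]
rng = np.random.default_rng(0)           # fixed seed (assumption)


def grid_map(x_A, x_V):
    """Steps (b)-(e): grid-based MAP estimate and the posterior density."""
    like_A = gaussian_pdf(x_A, s_grid, sigma_A)
    like_V = gaussian_pdf(x_V, s_grid, sigma_V)
    combined = like_A * like_V
    # With a flat prior, the posterior is the normalized combined likelihood.
    posterior = combined / (combined.sum() * step)
    return s_grid[np.argmax(posterior)], posterior


def analytic_map(x_A, x_V):
    """Eq. (3.3): reliability-weighted combination of the two measurements."""
    numerator = x_A / sigma_A**2 + x_V / sigma_V**2
    denominator = 1 / sigma_A**2 + 1 / sigma_V**2
    return numerator / denominator


# (a), (e), (f): a single trial, numerical versus analytical MAP.
x_A = rng.normal(s_A, sigma_A)
x_V = rng.normal(s_V, sigma_V)
map_single, posterior_single = grid_map(x_A, x_V)
eq33_single = analytic_map(x_A, x_V)

# (g): 100 simulated trials.
n_trials = 100
x_A_all = rng.normal(s_A, sigma_A, size=n_trials)
x_V_all = rng.normal(s_V, sigma_V, size=n_trials)
map_all = np.array([grid_map(a, v)[0] for a, v in zip(x_A_all, x_V_all)])

# (h): mean estimate predicted by Eq. (3.5).
w_A = (1 / sigma_A**2) / (1 / sigma_A**2 + 1 / sigma_V**2)
w_V = (1 / sigma_V**2) / (1 / sigma_A**2 + 1 / sigma_V**2)
predicted_mean = w_A * s_A + w_V * s_V

# (i): histogram counts and bin edges of the MAP estimates.
counts, bin_edges = np.histogram(map_all, bins=10)

# (j): relative auditory bias as defined in the problem statement.
relative_auditory_bias = (map_all.mean() - s_A) / (s_V - s_A)

print(float(map_single), float(eq33_single))
print(float(map_all.mean()), predicted_mean, float(relative_auditory_bias))
```

The grid-based and analytical estimates should agree to within roughly one grid step; a finer grid tightens the agreement at the cost of more computation.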
