Question 0 – Hello World!

We’ve included a little “Hello
World!” example. There will be an accompanying video on Moodle (the link
will be posted on the forum as an announcement) which you can follow to
learn how to use this Jupyter Notebook.
Even if you’ve used
Jupyter before, it’s highly recommended that you watch the video and go
through this introductory exercise, because this assignment uses
auto-marking and requires you to follow a certain (straightforward)
Question 0.a – Saying Hello in Markdown

For
this assignment, you will need to answer the questions in the “cells”
below each question. Some questions will require written work (which you
can either do on paper and scan, or do in the cell using Markdown), and
some will require R code (which you must write in the cell).
Watch the video to see how this should be done.

Question 0.b – Saying Hello in R

You
will need to write some R code in this assignment. In this section,
save a variable called “hello.world” (don’t include the quotes in the
variable name) and set it to the value “Hello World!”. Then run the cell
below the one you wrote your code in to verify that your answers have
been registered and given the correct variable names.
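As a minimal sketch of what such a cell might contain (the video remains the authoritative guide, and the two checks at the end are only illustrative assumptions, not the notebook's own verification cell):

```r
# Assign the string to a variable named hello.world.
# In R, a dot is an ordinary character in a variable name.
hello.world <- "Hello World!"

# Illustrative sanity checks (an assumption, not the auto-marking code).
exists("hello.world")
identical(hello.world, "Hello World!")
```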
Like before, follow the video tutorial to see how this is done.

Question 1 – Probabilities

Suppose
we are playing a simple collectable card game (e.g. Hearthstone
or Magic: The Gathering). In this game, each player has a card deck which
contains 30 cards (with no duplicate cards). At the start of this game,
both players shuffle their decks. Then the player going first draws
five cards, and the player going second draws six cards. After this, the
game starts, and players alternate turns.
Each player draws an
additional card at the start of their turn. So, for example, after their
third turn player one should have drawn eight cards in total (the 5
cards they started with, plus another three cards over three turns).
Player two should have drawn nine cards in total after their third turn.
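The draw counts described above can be sketched in R as follows. The function name `cards_drawn` is hypothetical and introduced only for illustration:

```r
# Total cards drawn after a player's given turn:
# the opening hand plus one card per turn taken.
cards_drawn <- function(opening_hand, turns) {
  opening_hand + turns
}

cards_drawn(5, 3)  # player one after their third turn: 8
cards_drawn(6, 3)  # player two after their third turn: 9
```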
For the following questions, suppose there is a special combination of five
cards, and if you have those five cards in your hand you instantly win
the game.

A bit of help

As a little hint for some of the questions below, you’re reminded that a product of the form $n \times (n-1) \times (n-2) \times \dots \times (n-k)$ can be expressed as $n!/(n-k-1)!$. That is,

$$n(n-1)(n-2)(n-3)\dots(n-k) = \frac{n!}{(n-k-1)!}$$

This is because we can think of the product $n \times (n-1) \times (n-2) \times \dots \times (n-k)$ as $n!$, which is $n \times (n-1) \times (n-2) \times \dots \times 2 \times 1$, but with the last $n-k-1$ terms removed. Because this is a multiplication of terms, we can think of removing terms as the same as dividing by them, meaning that

$$n \times (n-1) \times (n-2) \times \dots \times (n-k) = \frac{n \times (n-1) \times (n-2) \times \dots \times 2 \times 1}{(n-k-1)(n-k-2)\dots(2)(1)} = \frac{n!}{(n-k-1)!}$$
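A quick numerical check of this identity in R may help. The values of `n` and `k` below are arbitrary and chosen only for illustration:

```r
# Arbitrary illustrative values.
n <- 30
k <- 4

# Left-hand side: n * (n - 1) * ... * (n - k), a product of k + 1 terms.
lhs <- prod(n:(n - k))

# Right-hand side: n! / (n - k - 1)!
rhs <- factorial(n) / factorial(n - k - 1)

# Compare with a tolerance, since factorial() works in floating point.
isTRUE(all.equal(lhs, rhs))
```

Note that the product has $k + 1$ terms, not $k$, which is why the denominator is $(n-k-1)!$ rather than $(n-k)!$.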
Question 1.a

What is the probability that the first player will draw this combination on
their first turn and win the game immediately? What about the second
player?

Question 1.b

What is the probability that the
five cards required for victory are all at the bottom of a player’s deck
(i.e. they are the last five cards in their deck)?
Question 1.c

Suppose a player has drawn 15 cards from their deck. What is the probability
that all of the cards in the winning combination are still in their
deck?

Question 1.d

Suppose a player has drawn $n$ cards
from their deck, where $n$ is between 0 and 30. What is the probability that
all of the cards in the winning combination are still in their deck
(i.e. that they have not drawn any piece of the winning combination yet)
in terms of $n$?
Question 2 – PDFs and Expectations

Suppose we have defined a probability density function for a random variable $X$ as follows:

$$p(x) = \begin{cases} c x^2 & 0 \le x \le \alpha \\ 0 & \text{otherwise} \end{cases}$$

Notice that our PDF has two constants, $\alpha$ and $c$. $\alpha$ is a parameter, and $c$ is a coefficient which we will carefully choose so the integral of $c x^2$ between $0$ and $\alpha$ (with respect to $x$) is equal to $1$.
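Once a coefficient has been found by hand, one way to sanity-check it is numerical integration with R's built-in `integrate()`. The sketch below deliberately uses a different, hypothetical density, $q(x) = 2x$ on $[0, 1]$, rather than the one from this question:

```r
# Hypothetical density, NOT the one from this question: q(x) = 2x on [0, 1].
q <- function(x) 2 * x

# integrate() returns a list; $value holds the numerical estimate of the integral.
area <- integrate(q, lower = 0, upper = 1)$value
area
```

A valid density should give an area of 1, up to numerical error.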
Question 2.a

Suppose $\alpha = 1$. Find the value of $c$ which would cause the integral of $p(x)$ from $0$ to $\alpha$ with respect to $x$ to be equal to $1$. That is, find $c$ such that

$$\int_0^1 c x^2 \, dx = 1$$

Question 2.b

Find the value of $c$ for a general value of $\alpha$. That is, find $c$ such that

$$\int_0^\alpha c x^2 \, dx = 1$$

(you can do this in a way similar to how you answered question 2.a).

Question 2.c

Suppose $c = 3$ and $\alpha = 1$. Find $E(X)$, the expected value of our variable $X$.

Question 2.d

Suppose $c = 3$ and $\alpha = 1$. Find $\mathrm{Var}(X)$, the variance of our variable $X$.
Question 3 – Distributions

Suppose we are given the following information:

You are modelling the number of people visiting a particular doctor’s office within a day, with the hope of identifying a disease outbreak in the local area of the doctor.

It is known that, on an average day, 30 patients will see this doctor, with a standard deviation of 3 patients per day.
Question 3.a

Suggest a model you might use for the number of patients on a given day
(there might be more than one choice, so pick one and justify it). Also
give the parameters of this model based on the given information.
Question 3.b

On one particular day, 45 patients visit the doctor. Considering the model
you developed in your answer to the previous question, do you think
that this number of patients in a given day is cause for alarm? Use
calculations to back up your answer by determining the probability of
seeing 45 or more patients in a given day.
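When computing an "at least this many" probability for a discrete model in R, the upper-tail functions return $P(X > q)$, not $P(X \ge q)$, which is an easy off-by-one to make. The sketch below uses a Poisson distribution with made-up numbers purely to illustrate this. It is not a suggestion that Poisson is the right model here, and the values are not the ones from this question:

```r
# Hypothetical discrete example: X ~ Poisson(lambda = 10).
# These numbers are illustrative only.
lambda <- 10
threshold <- 15

# ppois(q, lambda, lower.tail = FALSE) returns P(X > q),
# so P(X >= 15) needs q = 14.
p_at_least <- ppois(threshold - 1, lambda, lower.tail = FALSE)

# Cross-check by summing the probability mass function directly.
p_check <- 1 - sum(dpois(0:(threshold - 1), lambda))

c(p_at_least, p_check)
```

For a continuous model such as the normal distribution, the distinction between "more than" and "at least" disappears, although a continuity correction may then be worth considering.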
Question 4 – Maximum Likelihood Estimation of Parameters

Suppose
we are developing a new plant treatment which will (hopefully) improve
crop yields. We have a dataset which contains weights for two candidate
treatments, as well as a control group (which receives neither of the
treatments).

Question 4.a

Suppose we want to
create models for the weight of each group. You think a normal
distribution would be suitable for this purpose, but a colleague has
suggested that you should use a binomial distribution instead. Someone
else has proposed a uniform distribution.
For both the binomial and uniform distributions, explain whether they would be a good choice (justifying your answer).
Also justify why using the normal distribution is a good choice here.
Question 4.b

Suppose, rather than modelling the weights directly, we instead want to model
the probability that a plant will grow to weigh over 6 units of weight,
for each of the three treatments we are testing (treatments 1 and 2, and
the control). Suggest a model that would be suitable for this purpose,
and justify your choice.
Question 4.c

After considering
our answers to questions 4.a and 4.b, we have decided to model the
weights directly (i.e. we will use the model discussed in question 4.a,
not 4.b). To do this, we will create three models: one for each of the
three groups. We will use normal distributions to model each of the
three groups, and then compare the estimated means of each group.
We now have to decide how we will calculate our estimates of the mean ($\mu$)
and standard deviation ($\sigma$) of each of our datasets. One approach is to
use the maximum likelihood method, where we wish to maximize the
likelihood of the data given the parameters $\mu$ and $\sigma$ (that is, we wish to
find the values of $\mu$ and $\sigma$ which cause $P(x \mid \mu, \sigma)$ to be maximized). Note
that maximizing something positive is the same as maximizing the log of that
thing, because log (for any base greater than 1) is “monotonically increasing”; that
is, if $a > b > 0$, then $\log(a) > \log(b)$. We’re actually going to maximize the
log-likelihood.

A colleague of yours seems to think
that maximizing the log-likelihood is the same as minimizing the mean
absolute error. Another colleague disagrees, saying that they are
misremembering and that maximizing the likelihood is the same as minimizing the mean
squared error. Yet another colleague seems to believe that we minimize
the negative log-likelihood by minimizing the log-cosh loss (since they
both have the word “log” in them; you are not convinced by this
argument).
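The claim that the likelihood and the log-likelihood are maximized at the same parameter value can be illustrated with a small grid search. This is only a sketch under stated assumptions: the data are simulated (not the assignment's dataset), the standard deviation is treated as known, and the names `x`, `mu_grid`, `lik` and `loglik` are hypothetical:

```r
set.seed(1)  # fixed seed so the sketch is reproducible

# Simulated data, NOT the assignment dataset.
x <- rnorm(20, mean = 5, sd = 2)

# Candidate values for the mean; sd is assumed known (2) to keep this one-dimensional.
mu_grid <- seq(3, 7, by = 0.01)

# Likelihood: product of the normal densities at each candidate mean.
lik <- sapply(mu_grid, function(mu) prod(dnorm(x, mean = mu, sd = 2)))

# Log-likelihood: sum of the log densities at each candidate mean.
loglik <- sapply(mu_grid, function(mu) sum(dnorm(x, mean = mu, sd = 2, log = TRUE)))

# Both curves peak at the same grid point.
mu_grid[which.max(lik)]
mu_grid[which.max(loglik)]
```

In practice the log-likelihood is preferred because a product of many small densities can underflow to zero, whereas a sum of logs stays numerically well behaved.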
Question 4.d

One of your colleagues in
Question 4.c is correct; which one does it appear to be based on our
calculations in the previous question? Prove this colleague correct
using algebra (you only have to prove them correct; you don’t have to
disprove the other two).

Question 4.e

Given your
maximum likelihood estimates for the mean of each population (and
keeping in mind that we have a very small number of samples for each
group), which treatment appears to work best?
Question 5 – Central Limit Theorem

Suppose
our company is trialling a new production method for phone cases, based
on 3D printing. 3D printing can be a volatile process, and the company has
decided to accept the fact that there will be a certain proportion of
failures out of the total number of 3D prints.
Before committing to the new process, management would like to estimate the
probability of failure by printing a number of phone cases. They have
asked you how many cases they should print to ensure they have a
reasonably good idea of the probability of failure.
Those developing the new production method assure management that the
probability of failure is somewhere between 1% and 20%, but they are
unwilling to make any guarantees beyond this without testing the method.

Question 5.a

We will model this problem with a
binomial distribution. Justify why the binomial distribution is a good
choice for this problem.
Question 5.b

Suppose that we are considering three potential failure probabilities:

0.01
0.05
0.2

We are also considering three potential sizes for our test production run
(i.e. the number of phone cases we will print in our test run):

50
200
800

For each combination of failure probability and number of cases printed,
calculate the limiting distribution for the sample mean. You should
calculate 9 limiting distributions in total.
For this question, do this using written calculations (i.e. not using R) and with the Central Limit Theorem.
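As a reminder, and not as the worked answer, the Central Limit Theorem in its standard general form says that for independent, identically distributed $X_1, \dots, X_n$ with mean $\mu$ and finite variance $\sigma^2$, the sample mean is approximately normal for large $n$:

```latex
\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i
\;\overset{\text{approx.}}{\sim}\;
\mathcal{N}\!\left(\mu,\ \frac{\sigma^2}{n}\right)
```

Applying it here amounts to identifying $\mu$ and $\sigma^2$ for a single print trial and substituting each sample size for $n$.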
Question 5.c

Verify the results you obtained by hand in the previous question using R code.
Question 5.d

For each of the sample sizes and potential failure probabilities listed
above, we now know the theoretical distribution by the Central Limit
Theorem (we calculated this in Questions 5.b and 5.c). However,
management is still not convinced and has asked us to develop a
simulation which will experimentally demonstrate our calculations were
correct.

R has a built-in function called rbinom, which takes
three arguments (the number of simulations you want to run, the number
of trials per simulation, and the probability of success for each
trial). Hint: you are allowed to use the rbinom function, although you
don’t have to.
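A minimal sketch of the argument order may be useful. The values below are hypothetical and much smaller than anything the question asks for, and the variable names are introduced only for illustration:

```r
set.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical values, chosen only to illustrate the argument order.
n_sims  <- 5     # number of simulations to run
n_trial <- 10    # number of trials per simulation
p       <- 0.3   # probability of "success" on each trial

# One count of successes per simulation.
successes <- rbinom(n = n_sims, size = n_trial, prob = p)

# Observed proportion of successes in each simulation.
proportions <- successes / n_trial
proportions
```

Note that `rbinom` returns counts, so dividing by the number of trials is what turns each simulation into an estimated probability. In this assignment's setting, "success" in the `rbinom` sense would correspond to a failed print.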
Question 5.e

We’re presenting our
findings to management; they have asked us to provide visualisations for
our results. For each failure probability discussed above (0.01, 0.05
and 0.2) and for each potential sample size discussed above (50, 200,
and 800), produce a histogram plot of the maximum likelihood estimates
of the failure probability (calculated 50,000 times through 50,000
simulations).

Question 5.f

Management has asked us
to recommend how many tests they should run. Based on all the information
we have computed, do you recommend 50, 200, 800, or even more tests than
that? Justify your answer using relevant calculations and/or by
referring to the above plots.