  • STAC51H3: Categorical Data Analysis
    Assignment 1
    Due: Thu Sep 28, 2017 in class
    All relevant work must be shown for credit.
    Note: In any question where you use R, all R code and R output must be
    included in your answers. You should assume that the reader is not familiar
    with R output, so explain all your findings, quoting the necessary values from
    your output.
    Whenever you use an R command to generate random numbers, set the
    seed to 123. This can be done by adding the command set.seed(123)
    immediately before your R command that generates the random numbers.
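    As a minimal sketch of the seeding instruction above, the snippet below resets the seed before each of two identical calls and checks that the draws agree. The distribution parameters are illustrative assumptions and are not taken from any question.

```r
# Fix the seed immediately before the random-number call so the draw is reproducible.
set.seed(123)
x1 <- rbinom(5, size = 10, prob = 0.5)  # illustrative parameters (assumed, not from the assignment)

# Resetting the same seed before the same call reproduces exactly the same draws.
set.seed(123)
x2 <- rbinom(5, size = 10, prob = 0.5)

identical(x1, x2)  # TRUE
```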
    Please note that academic integrity is fundamental to learning and scholarship.
    You may discuss questions with other students. However, the work you submit
    should be your own. If I feel suspicious of any assignment (e.g. if your work
    doesn’t appear to be consistent with what we have discussed in class), I will not
    mark the assignment. Instead, I will ask you to present your work in my office
    and your grade will be assigned based on your presentation.
    Total points for this assignment: 45
    1. Let Y ∼ Bin(n,π), where n = 20 and π = 0.8. Y can be interpreted as the number of
    successes in a sample of size n = 20 from a Bernoulli distribution with probability of
    success π = 0.8.
    (a) (12 points) y = 15 is an observed value of Y, where Y ∼ Bin(n,π) with n = 20
    and π = 0.8. Calculate the Wald, score (i.e. Wilson’s method), Agresti-Coull and
    Clopper-Pearson 95 percent confidence intervals for π.
    In this part (i.e. for all confidence intervals in part (a)), do not use R or any computer
    package. Show your work clearly.
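    For reference, the four intervals named in part (a) are usually written as follows. These are the standard textbook forms; the notation ($\hat\pi = y/n$, $z = z_{\alpha/2}$, and $B(q; a, b)$ for the $q$-quantile of a Beta$(a, b)$ distribution) is an assumption here and may differ from the notation used in class.

```latex
% Wald
\hat\pi \pm z \sqrt{\frac{\hat\pi (1 - \hat\pi)}{n}}

% Score (Wilson)
\frac{\hat\pi + \dfrac{z^2}{2n} \pm z \sqrt{\dfrac{\hat\pi (1 - \hat\pi)}{n} + \dfrac{z^2}{4 n^2}}}{1 + \dfrac{z^2}{n}}

% Agresti-Coull
\tilde n = n + z^2, \qquad
\tilde\pi = \frac{y + z^2 / 2}{\tilde n}, \qquad
\tilde\pi \pm z \sqrt{\frac{\tilde\pi (1 - \tilde\pi)}{\tilde n}}

% Clopper-Pearson (lower limit is 0 when y = 0, upper limit is 1 when y = n)
\left[\, B\!\left(\tfrac{\alpha}{2};\ y,\ n - y + 1\right),\
       B\!\left(1 - \tfrac{\alpha}{2};\ y + 1,\ n - y\right) \right]
```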
    (b) (3 points) Calculate a 95% confidence interval for π based on the likelihood ratio test.
    (For this part you may use the R code we discussed in class but do not use any R
    functions that give the confidence interval directly.)
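    One common way to obtain a likelihood-ratio interval is to find the two values of π at which the likelihood-ratio statistic equals the chi-squared critical value. The sketch below does this with uniroot. It is not necessarily the code discussed in class; the helper names lr_stat and lr_ci are hypothetical, and the values y = 7, n = 10 are illustrative assumptions.

```r
# Likelihood-ratio statistic for H0: pi = pi0, given y successes in n trials.
# Assumes 0 < y < n so that both log terms are finite.
lr_stat <- function(pi0, y, n) {
  p_hat <- y / n
  2 * (y * log(p_hat / pi0) + (n - y) * log((1 - p_hat) / (1 - pi0)))
}

# Hypothetical helper: the interval is the set of pi0 with lr_stat <= critical value,
# so its endpoints are the roots of lr_stat(pi0) - crit on either side of p_hat.
lr_ci <- function(y, n, conf = 0.95) {
  crit  <- qchisq(conf, df = 1)
  p_hat <- y / n
  g     <- function(pi0) lr_stat(pi0, y, n) - crit
  lower <- uniroot(g, c(1e-8, p_hat), tol = 1e-9)$root
  upper <- uniroot(g, c(p_hat, 1 - 1e-8), tol = 1e-9)$root
  c(lower = lower, upper = upper)
}

ci <- lr_ci(y = 7, n = 10)  # illustrative values (assumed, not the assignment's)
print(ci)
```

    The root search works because the statistic is zero at the sample proportion and grows as π moves towards 0 or 1, so each bracket contains exactly one sign change.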
    2. The observed (or true) coverage probability of a confidence interval is not
    necessarily equal to its targeted coverage probability. In this question we will
    calculate the observed (or true) coverage probability of Wald confidence intervals
    using two methods: Monte Carlo simulation and direct calculation.
    (a) (5 points) (Monte Carlo simulation) Generate N = 1000000 observations of Y,
    where Y ∼ Bin(n,π) with n = 20 and π = 0.8. From each generated observation,
    calculate a Wald 95% confidence interval for the population proportion (π).
    (Note: This means you are calculating 1000000 confidence intervals.) Calculate the
    proportion of these Wald intervals that contain 0.8 (the value of π). Comment on
    your results.
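    The Monte Carlo procedure described in part (a) can be sketched as a small vectorised function. The function name wald_coverage_mc is hypothetical, and the values n = 30, π = 0.5 and N = 100000 in the call are illustrative assumptions rather than the values the question asks for.

```r
# Hypothetical helper: estimate the coverage of the Wald interval by simulation.
wald_coverage_mc <- function(n, pi_true, N, conf = 0.95) {
  z     <- qnorm(1 - (1 - conf) / 2)
  y     <- rbinom(N, size = n, prob = pi_true)   # N simulated binomial counts
  p_hat <- y / n
  se    <- sqrt(p_hat * (1 - p_hat) / n)         # equals 0 when y is 0 or n
  lower <- p_hat - z * se
  upper <- p_hat + z * se
  mean(lower <= pi_true & pi_true <= upper)      # proportion of intervals covering pi_true
}

set.seed(123)
cov_mc <- wald_coverage_mc(n = 30, pi_true = 0.5, N = 1e5)  # illustrative values
print(cov_mc)
```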
    (b) (5 points) (Direct calculation) In order to calculate the coverage probability for a
    known value of π, calculate a confidence interval for every possible value of y (y =
    0,...,n) and check whether the true value of the parameter is in the confidence interval
    calculated. Identify those confidence intervals that contain the true parameter. For
    example if the interval with n = 20, y = 5 contains the true value of π (say
    π = 0.8), then the probability for that interval is
    $P(y = 5) = \binom{20}{5} \times 0.8^{5} \times 0.2^{20-5}$.
    The coverage probability is the sum of these probabilities over all the intervals that
    contain π (in this example 0.8). Use this method to calculate the coverage probability
    of 95% Wald confidence intervals based on a sample of size n = 20 if the true value
    of π is 0.8.
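    The direct calculation in part (b) amounts to summing binomial probabilities over the values of y whose interval covers the true π. A minimal sketch follows; the function name wald_coverage_exact is hypothetical, and the values n = 30 and π = 0.5 in the call are illustrative assumptions, not the values in the question.

```r
# Hypothetical helper: exact coverage of the Wald interval for a known pi_true.
wald_coverage_exact <- function(n, pi_true, conf = 0.95) {
  z      <- qnorm(1 - (1 - conf) / 2)
  y      <- 0:n                                   # every possible outcome
  p_hat  <- y / n
  se     <- sqrt(p_hat * (1 - p_hat) / n)
  covers <- (p_hat - z * se <= pi_true) & (pi_true <= p_hat + z * se)
  sum(dbinom(y[covers], size = n, prob = pi_true))  # add P(Y = y) over covering intervals
}

cov_exact <- wald_coverage_exact(n = 30, pi_true = 0.5)  # illustrative values
print(cov_exact)
```

    Because no random numbers are involved, this value is exact up to floating-point error and can be used to check a simulation estimate for the same n and π.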
    3. In this question we will again calculate the true coverage probabilities of Wald
    confidence intervals for proportions (i.e. the Binomial parameter) based on a sample of
    given size (n), but this time we calculate the coverage probabilities for many values of
    π and plot coverage probability versus π.
    (a) (5 points) For a Bernoulli sample of size n = 25, use the method in part (b) of
    the previous question (i.e. direct calculation) to calculate the coverage probability
    of a 95% Wald confidence interval for π = 0.01,0.02,...,0.99 and plot them against π.
    Draw a horizontal line through the target probability 0.95. Comment on what you
    learned from your plot.
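    Since the Wald limits depend only on y and n, they can be computed once and reused for every π on the grid. The sketch below follows that idea and draws the coverage curve with a reference line at the target level. The sample size n = 40 is an illustrative assumption, not the value in the question.

```r
n     <- 40                        # illustrative sample size (assumed)
conf  <- 0.95
z     <- qnorm(1 - (1 - conf) / 2)
y     <- 0:n
p_hat <- y / n
half  <- z * sqrt(p_hat * (1 - p_hat) / n)
lower <- p_hat - half              # Wald limits for each possible y
upper <- p_hat + half

pi_grid  <- seq(0.01, 0.99, by = 0.01)
coverage <- sapply(pi_grid, function(p) {
  covers <- lower <= p & p <= upper          # which intervals contain this pi
  sum(dbinom(y[covers], size = n, prob = p)) # total probability of those outcomes
})

plot(pi_grid, coverage, type = "l", ylim = c(0, 1),
     xlab = expression(pi), ylab = "Coverage probability")
abline(h = conf, lty = 2)          # target coverage
```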
    (b) (5 points) Repeat part (a) above with n = 100 and draw both curves on the
    same plot. Compare and comment on your findings.
    (c) (10 points) Repeat part (a) for Wald, Wilson, Agresti-Coull and Clopper-Pearson
    confidence intervals and plot the coverage probabilities versus π for all four
    confidence intervals on one graph (i.e. all four curves on the same system of axes). Use
    four different colours for easy comparison. Compare and comment on your results.
    (Note that in this part, we are using the same values as in part (a) above, i.e. n = 25,
    95% confidence interval and π = 0.01,0.02,...,0.99.)
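    A comparison of this kind can be organised by writing one function per interval that returns the limits for every y, then passing each set of limits to a shared coverage routine. The sketch below uses the standard textbook forms of the four intervals; all function names are hypothetical, and n = 40 is an illustrative assumption rather than the value in the question.

```r
# Each function returns a two-column matrix (lower, upper) for the given y values.
ci_wald <- function(y, n, z) {
  p <- y / n
  h <- z * sqrt(p * (1 - p) / n)
  cbind(p - h, p + h)
}
ci_wilson <- function(y, n, z) {
  p      <- y / n
  centre <- (p + z^2 / (2 * n)) / (1 + z^2 / n)
  h      <- z * sqrt(p * (1 - p) / n + z^2 / (4 * n^2)) / (1 + z^2 / n)
  cbind(centre - h, centre + h)
}
ci_agresti_coull <- function(y, n, z) {
  n_t <- n + z^2
  p_t <- (y + z^2 / 2) / n_t
  h   <- z * sqrt(p_t * (1 - p_t) / n_t)
  cbind(p_t - h, p_t + h)
}
ci_clopper_pearson <- function(y, n, alpha) {
  # Beta-quantile form; the limits are set to 0 and 1 at the boundary outcomes.
  lower <- ifelse(y == 0, 0, qbeta(alpha / 2, y, n - y + 1))
  upper <- ifelse(y == n, 1, qbeta(1 - alpha / 2, y + 1, n - y))
  cbind(lower, upper)
}

# Exact coverage on a grid of pi values, given the limits for y = 0, ..., n.
coverage <- function(ci, n, pi_grid) {
  y <- 0:n
  sapply(pi_grid, function(p) {
    sum(dbinom(y[ci[, 1] <= p & p <= ci[, 2]], size = n, prob = p))
  })
}

n       <- 40                       # illustrative sample size (assumed)
alpha   <- 0.05
z       <- qnorm(1 - alpha / 2)
y       <- 0:n
pi_grid <- seq(0.01, 0.99, by = 0.01)

cov_mat <- cbind(coverage(ci_wald(y, n, z), n, pi_grid),
                 coverage(ci_wilson(y, n, z), n, pi_grid),
                 coverage(ci_agresti_coull(y, n, z), n, pi_grid),
                 coverage(ci_clopper_pearson(y, n, alpha), n, pi_grid))
colnames(cov_mat) <- c("Wald", "Wilson", "Agresti-Coull", "Clopper-Pearson")

cols <- c("red", "blue", "darkgreen", "purple")
matplot(pi_grid, cov_mat, type = "l", lty = 1, col = cols,
        xlab = expression(pi), ylab = "Coverage probability")
abline(h = 1 - alpha, lty = 2)      # target coverage
legend("bottom", legend = colnames(cov_mat), col = cols, lty = 1)
```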