STAT 3401/STAT 4065
Assignment One
Important: This assignment is assessed, and carries weight 20% towards your final mark for the
STAT3401 unit. Your work for this assignment must be submitted to the unit lecturer by 5pm on
Monday, 29 April 2019.
You may hand in your work during the STAT3401 lectures or computer labs on 29 April, or electronically
at any time before the due date and time. Any student failing to submit work by the deadline
will receive a penalty for late submission (5% per day late) unless the unit lecturer is advised as soon
as possible of any extenuating circumstances. Please ensure that you include your name and student
number on your work.
Plagiarism: The work that you submit must be your sole effort (i.e. not copied from anyone else). If
you are found guilty of plagiarism you may be penalised.
The two assignment questions involve the analysis of some data. The data sets mentioned are available
from LMS. If you have difficulty accessing the data sets you should contact the lecturer immediately.
For each of the two questions you should hand in a mini-report. The mini-report for each question should
be no longer than a single side of A4 paper (excluding any relevant Figures, Tables or computer output,
which can be attached as an appendix). You will be marked down for exceeding this page limit. The
aim of the mini-report is to convey the aims, methodology and results of your data analysis in a concise,
readable fashion. It is strongly recommended that you structure your report into sections, along the
following lines.
Introduction: Summarise the data and the aims of the analysis.
Methodology: Describe the statistical methods that you use (technical details not required).
Results: Describe the results of your analysis and their interpretations.
Discussion: Draw conclusions (based on your results) as necessary. Discuss any interesting issues
arising from your analysis.
Marks for each mini-report will be awarded (with a relative weighting of 2:3) for
Exposition: your mini-reports should be well organised. You should aim to write in a concise, yet
readable, manner.
Statistical content: marks will be awarded for the correct use of appropriate statistical techniques,
and for the correct interpretation of results from these techniques.
Question one
The file asian.rda contains the data frame asian with data reported by Rabe-Hesketh and Skrondal
(2005, Multilevel and Longitudinal Modeling Using Stata, Stata Press) on the weight gain of Asian
children in a British community; between 1 and 5 weight measurements were taken for each child.
Specifically, the following variables are contained in this data frame:
id: a unique identifier for each child
occ: on which occasion (visit) the measurement was taken
age: the age (in years) of the child when the measurement was taken
weight: the weight (in kilograms) of the child
brthwt: the birth weight of the child (in grams)
gender: factor indicating whether the child is a boy or a girl
Aims of the analysis: To investigate which covariates (and possibly which interactions) have an influence
on how the weight of these children develops as they grow older. Use weight as the response, and age,
brthwt and gender as possible covariates to model the mean structure.
Note: You may also want to consider polynomial terms in some of the covariates (and interactions of
these polynomial terms with other covariates). Ensure that you have fitted an appropriate model for
the residuals (heterogeneity and correlation) before investigating the mean structure.
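To make the idea of polynomial terms and their interactions concrete, the following is a minimal sketch of how one row of a fixed-effects design matrix could be built by hand. It is an illustration only, not a prescribed model: the function name design_row, the quadratic term in age, the 0/1 coding of gender (girl = 1) and the input values are all assumptions made for this example. In practice your statistical software constructs these columns from a model formula.

```python
def design_row(age, brthwt, is_girl):
    """Return one row of a hypothetical fixed-effects design matrix.

    Columns: intercept, age, age^2, brthwt, gender indicator,
    age x gender, age^2 x gender.
    The 0/1 coding (girl = 1, boy = 0) is an assumption for illustration.
    """
    g = 1.0 if is_girl else 0.0
    return [
        1.0,          # intercept
        age,          # linear term in age
        age ** 2,     # quadratic (polynomial) term in age
        brthwt,       # birth weight in grams
        g,            # gender main effect
        age * g,      # interaction: age slope differs by gender
        age ** 2 * g, # interaction: curvature differs by gender
    ]


# Made-up values, not taken from the asian data frame.
row = design_row(age=2.0, brthwt=3200.0, is_girl=True)
print(row)
```

The two interaction columns are what allow the growth curve (slope and curvature in age) to differ between boys and girls; dropping them forces both genders to share one curve shape.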
Question two
The file ratpup.csv contains data on the birth weights of rat pups born to mothers who were randomly
assigned to receive one of three doses (high, low, control) of an experimental compound. When analysing
this data previously in lectures and computer labs, we noticed that the data set may contain outlying
values. Identify the data point corresponding to the largest (in absolute value) Pearson residual and
remove it from the data before re-analysing the data.
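The removal step can be sketched as follows. This is a minimal illustration under stated assumptions, not the required implementation: it takes a Pearson residual to be the raw residual divided by the estimated residual standard deviation (which may differ between observations if the fitted model allows heterogeneous variances), and the helper names and all numbers are made up rather than taken from ratpup.csv. In R's nlme package the corresponding residuals can be extracted from a fitted model with residuals(fit, type = "pearson").

```python
def pearson_residuals(observed, fitted, sigma):
    """Scale raw residuals by the estimated residual standard deviation.

    sigma may be a single number (homogeneous variance) or a list with
    one value per observation (heterogeneous variance).
    """
    if isinstance(sigma, (int, float)):
        sigma = [sigma] * len(observed)
    return [(y - f) / s for y, f, s in zip(observed, fitted, sigma)]


def largest_abs_index(values):
    """Index of the value that is largest in absolute value."""
    return max(range(len(values)), key=lambda i: abs(values[i]))


# Made-up illustrative numbers, not from ratpup.csv.
observed = [6.1, 5.9, 6.4, 3.7, 6.0]
fitted = [6.0, 6.0, 6.2, 5.8, 6.1]
sigma = 0.4  # hypothetical estimated residual standard deviation

res = pearson_residuals(observed, fitted, sigma)
worst = largest_abs_index(res)

# Drop the flagged observation; the model must then be refitted.
kept_observed = [y for i, y in enumerate(observed) if i != worst]
print(worst, kept_observed)
```

Note that the residuals must come from a fitted model, so the model has to be refitted after the observation is removed before any conclusions are compared.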
Aims of the analysis: To compare the birth weights of pups from litters born to female rats that received
the high- and low-dose treatments to the birth weights of pups from litters that received the control
treatment, and to quantify the influence the treatments have on the birth weights.
Note: In your report you should clearly state:
which observation you have removed;
whether this removal changed the results of the analysis qualitatively (i.e. did you arrive at a
different final model? If not, did the outlying observation have an undue influence on the estimated
fixed effects, especially those associated with treatment, in the original analysis?); and
whether there is any indication that further outlying observations are present in the data.