#### 联系方式

• QQ：99515681
• 邮箱：99515681@qq.com
• 工作时间：8:00-23:00
• 微信：codinghelp

#### 您当前位置：首页 >> Database作业Database作业

###### 日期：2020-10-25 11:24

MATH2392 Practice of Analytics

Assignment 3

Due date: Sunday, 4 October, 2020

1. The file juul2.txt (located on Canvas -> Assignment 3) contains data from

Anders Juul on insulin-like growth factors:

? Age – age in years

? Height – height in cm

? Menarche – did the menarche occur? (1/2 = no/yes)

? Sex – 1 = Male, 2 = Female

? IGF1 – Serum IGF-I

? Tanner – Tanner’s puberty classification (1-5)

? Testvol – testis volume

? Weight – weight in kg

Numbers are separated by blanks and missing values are coded by a dot.

a. Find the correlation coefficient between “Weight” and each of the

following variables: “Sex”, “Age”, “Height”, “IGF1”, “Tanner”.

b. Restrict the data to the variables IGF1, Age, Sex, Tanner, Weight and

Height to the subjects that are younger than or exactly 20 years old.

Remove all subjects that have a missing value in any of the variables.

c. Consider body weight as response variable and perform a multiple

regression analysis using the following variables as predictors: “Sex”,

“Age”, “Height”, “IGF1”, “Tanner”. Extract the regression coefficients

with standard error, 95% confidence intervals, and p-values. (Hint: use

OUTEST = option in PROC REG statement with TABLEOUT

requirement).

d. Consider body weight as response variable and do a linear regression

model with age and gender as predictors.

e. Consider body weight as response variable and do a linear regression

model with age and gender and an interaction term for age and gender

as predictors. (Hint: Use PROC GLM procedure).

Copy and Paste the program on your answer sheet for each part.

All reports should be created in RTF format.

(1 + 2 + 2 + 1 + 1 = 7 marks)

2. There are three datasets Inventory, Orders, and Transaction are located

on Canvas -> Assignment 3. This information has been extracted from a

clothing e-retailer’s database.

Use DATA step to read all three the datasets into SAS, then use PROC

SQL procedure answering the following questions.

a. Combine the three datasets as a single data set.

b. Identify the web_id which has the largest total order quantity.

c. Identify the web_id that has the lowest average cost.

Copy and Paste the program on your answer sheet for each part.