Date: 2019-08-17 10:44

ISYE 4140 – STATISTICAL ANALYSIS

Summer 19 – Final Exam

By my signature below, I attest that I completed this exam in accordance with the rules. I did not have the assistance of another person (student or other), use a solution aid prepared by another such as a spreadsheet template, copy the solution from an answer key or from homework completed by another, or offer assistance to others in the class. __________________________ (signature)

Question 1. (20 pts)

a) Use the bootstrap to calculate a 94% confidence interval (CI) for the mean of a lognormal random variable with a mean of 3.2 and a variance of 4. Use a sample of size 11 and 5,000 replications. Use a seed equal to the last 3 digits of your RIN. (8)
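A minimal sketch of the percentile bootstrap for part a), written in Python with only the standard library. The seed value `123` is a hypothetical placeholder for the RIN digits, and the `percentile` helper is introduced here for illustration. The stated mean and variance are first converted to the parameters of the underlying normal distribution, since that is how lognormal generators are usually parameterized. The percentile interval is one common choice among several bootstrap intervals, and the question does not say which one is expected.

```python
import math
import random
import statistics

# Target moments of the lognormal variable, taken from the question.
MEAN, VARIANCE = 3.2, 4.0

# Convert them to the parameters of the underlying normal distribution:
# sigma^2 = ln(1 + variance / mean^2), mu = ln(mean) - sigma^2 / 2.
sigma2 = math.log(1.0 + VARIANCE / MEAN**2)
mu = math.log(MEAN) - sigma2 / 2.0
sigma = math.sqrt(sigma2)

SEED = 123          # hypothetical placeholder for the last 3 digits of a RIN
N, B = 11, 5000     # sample size and number of bootstrap replications
ALPHA = 0.06        # 94% confidence level

rng = random.Random(SEED)
sample = [rng.lognormvariate(mu, sigma) for _ in range(N)]

# Resample with replacement and record the mean of each resample.
boot_means = sorted(
    statistics.fmean(rng.choices(sample, k=N)) for _ in range(B)
)

def percentile(sorted_values, p):
    """Linearly interpolated empirical quantile of an already sorted list."""
    pos = p * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])

ci_lower = percentile(boot_means, ALPHA / 2)
ci_upper = percentile(boot_means, 1 - ALPHA / 2)
print(f"sample mean = {statistics.fmean(sample):.3f}")
print(f"94% percentile bootstrap CI = ({ci_lower:.3f}, {ci_upper:.3f})")
```

Python's generator does not reproduce the random stream of R's `set.seed`, so the numbers printed here will differ from those of an R solution that uses the same seed.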

b) Carry out a simulation experiment in R with 10,000 replications to study the following scenario. A unit is made by attaching 3 parts as follows:

Base ~ N(20, 0.2²), Right ~ N(8, 0.3²), and Left ~ N(4, 0.4²)

The gap has to be between 7.8 and 8.2 to be acceptable.

i- Using a seed equal to the middle 3 digits of your RIN, estimate the percentage of these connections that will be acceptable. (6)

ii- Now, using probability theory, calculate the true percentage. (6)
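The two sub-parts of b) can be sketched side by side in Python, under two assumptions that the question does not state explicitly: the three parts are independent, and the gap equals Base minus Right minus Left (the original figure is not available). The distributions are read as normal with standard deviations 0.2, 0.3, and 0.4. The seed `456` is a hypothetical placeholder for the RIN digits.

```python
import math
import random
from statistics import NormalDist

SEED = 456        # hypothetical placeholder for the middle 3 digits of a RIN
REPS = 10_000     # number of replications from the question

rng = random.Random(SEED)

def gap(rng):
    # Assumed geometry: the gap is what remains of the base after
    # both attached parts are subtracted.
    base = rng.gauss(20, 0.2)
    right = rng.gauss(8, 0.3)
    left = rng.gauss(4, 0.4)
    return base - right - left

# Part i: fraction of simulated units whose gap lies in (7.8, 8.2).
accepted = sum(7.8 < gap(rng) < 8.2 for _ in range(REPS))
simulated = accepted / REPS

# Part ii: under the same assumptions the gap is normal with
# mean 20 - 8 - 4 = 8 and variance 0.2^2 + 0.3^2 + 0.4^2 = 0.29.
gap_dist = NormalDist(mu=20 - 8 - 4, sigma=math.sqrt(0.2**2 + 0.3**2 + 0.4**2))
exact = gap_dist.cdf(8.2) - gap_dist.cdf(7.8)

print(f"simulated acceptance rate:   {simulated:.4f}")
print(f"theoretical acceptance rate: {exact:.4f}")
```

If the gap is defined differently in the original figure, only the `gap` function and the mean of `gap_dist` would need to change. The variance of a sum or difference of independent normals is the sum of the variances either way.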

Question 2. (10 points)

Read in the R dataset “mtcars”, which records the fuel consumption (in mpg) of many car models along with 10 other design aspects.

a) Construct a 95% CI for the ratio of the variance of the 6-cylinder cars to that of the 8-cylinder cars.

b) Carry out a test of hypothesis at the .05 level of significance (l.o.s.) to check whether the mean mpg for these two engine types is the same.
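For part a), one standard construction can be written out as follows. It is a sketch that assumes the measurements in each cylinder group are independent normal samples. The symbols are introduced here and are not part of the exam: s_6² and s_8² are the sample variances, n_6 and n_8 the group sizes, and F_p the p-quantile of the F distribution with (n_6 − 1, n_8 − 1) degrees of freedom.

```latex
\[
\frac{S_6^2/\sigma_6^2}{S_8^2/\sigma_8^2} \sim F(n_6 - 1,\; n_8 - 1)
\quad\Longrightarrow\quad
\frac{s_6^2}{s_8^2}\cdot\frac{1}{F_{0.975}}
\;\le\; \frac{\sigma_6^2}{\sigma_8^2} \;\le\;
\frac{s_6^2}{s_8^2}\cdot\frac{1}{F_{0.025}}
\]
```

In R, `var.test` reports an interval of this form. Whether that interval contains 1 is also one way to decide between a pooled and an unpooled t-test in part b).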

Question 3. (45 points)

a) The pull strength of a wire bond is an important characteristic. The data in file strength.txt give information on pull strength y, die height x1, post height x2, loop height x3, wire length x4, bond width on the die x5, and bond width on the post x6. Fit a linear regression model between the response and the six independent variables and comment on the results. (20)

i. List and discuss the model assumptions. (2)

ii. Calculate the correlation coefficients and comment on multicollinearity. (3)

iii. Use partial sums of squares to find the contribution of each variable and test its significance at the .05 level of significance. (4)

iv. Using the step function, determine the value of the best model’s AIC. (3)

v. Check the equality of variance assumption and calculate the variance inflation factor for each independent variable. What are your conclusions? (2)

vi. Design the best model to use, showing the corresponding standard deviation, R-square, and adjusted R-square. (3)

vii. Construct a 99% confidence interval for the slopes of all the significant independent variables. (3)

b) Read in the file “website.csv”, which contains data about different websites. The file contains 551 rows and 11 columns. You are required to: (25)

i) Use test sets: divide the data into two groups, train and test, using a seed of 1776 and assigning to the train group the rows whose generated random numbers fall between .25 and .75. Build and check the validity of a model that uses “entertain”, “inspire”, and “trust” to estimate “sum” by reporting the MSE and the MSPR. (10)

ii) Use k-fold cross-validation with a seed of 1991 to check the validity of a model that uses “timeout” and “social” to estimate “sum”. Report the MSE and the MSPR. (15)
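The two data-splitting schemes in part b) can be sketched in Python without the data file, since they operate on row indices only. This is an illustration under stated assumptions, not the expected R solution: `K = 5` is a hypothetical fold count (the question does not fix k), and the helpers `mse` and `mspr` are names introduced here. They use the usual definitions, SSE over the residual degrees of freedom for the training fit and the average squared prediction error on held-out rows. Model fitting itself is left out.

```python
import random

N_ROWS = 551   # number of rows stated in the question
K = 5          # hypothetical number of folds; the question does not fix k

# Train/test split: draw one uniform number per row and send the row to the
# training set when the draw falls strictly between 0.25 and 0.75.
rng = random.Random(1776)
draws = [rng.random() for _ in range(N_ROWS)]
train_idx = [i for i, u in enumerate(draws) if 0.25 < u < 0.75]
test_idx = [i for i, u in enumerate(draws) if not 0.25 < u < 0.75]

# k-fold partition: shuffle the row indices once, then deal them into K folds.
rng = random.Random(1991)
order = list(range(N_ROWS))
rng.shuffle(order)
folds = [order[f::K] for f in range(K)]

def mse(y, fitted, n_params):
    """Training error: SSE divided by the residual degrees of freedom."""
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return sse / (len(y) - n_params)

def mspr(y_new, predicted):
    """Mean squared prediction error on held-out rows."""
    return sum((yi - pi) ** 2 for yi, pi in zip(y_new, predicted)) / len(y_new)

print(len(train_idx), len(test_idx), [len(f) for f in folds])
```

A model is judged valid in this sense when the MSPR on the held-out rows is close to the MSE of the training fit. As with any seeded example, Python's random stream differs from R's, so the row assignments will not match an R solution that uses the same seeds.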

Question 4. (25 points)

a) A study on the amount of dye needed to get the best color for a certain type of fabric was reported. The three amounts of dye, 1/3%, 1%, and 3% (of fabric weight), were each administered at two different plants. The color density of the fabric was then observed four times for each level of dye at each plant. The data are found in file fabric.txt. (10)

i. Perform an analysis of variance to test the hypothesis, at the 0.05 level of significance, that there is no difference in the color density of the fabric for the three levels of dye. Select the appropriate test, state your conclusion, and report a p-value. (5)

ii. Perform a Tukey test at the .05 level of significance and discuss your findings. (5)

b) Corrosion fatigue in metals has been defined as the simultaneous action of cyclic stress and chemical attack on a metal structure. A widely used technique for minimizing corrosion fatigue damage in aluminum involves the application of a protective coating. A study conducted by the Department of Mechanical Engineering at Virginia Tech used 3 different levels of humidity:

Low: 20–25% relative humidity
Medium: 55–60% relative humidity
High: 86–91% relative humidity

and 3 types of surface coatings:

Uncoated: no coating
Anodized: sulfuric acid anodic oxide coating
Conversion: chromate chemical conversion coating

The corrosion fatigue data, expressed in thousands of cycles to failure, are stored in file fatigue.txt. (15)

i) Perform an analysis of variance with α = 0.05 to test for significant main and interaction effects. (5)

ii) Use Tukey’s test at the 0.05 level of significance to determine which humidity levels result in different corrosion fatigue damage. (5)

iii) Use an interaction plot and comment on your findings. (5)

