
Date: 2020-03-03 10:01

MECH 203

Week 7 Jupyter Notebook Written Report

LINEAR REGRESSION

Due date: 11:59PM on Tuesday, March 3rd, 2020

Grading & Weight: This assignment is out of 50 marks, as further specified in the mark breakdown for each question. The assignment is worth 6% of your overall final grade in the course.

Late Penalty: Late submissions will be penalized 10% per day for up to 5 days, after which a grade of zero will be given.

1. Overview

This assignment is about applying simple linear regression to interpret data sets on model rockets, thermal expansion of Al, and hybrid cars in Questions 1, 2, and 3, respectively.

Before you start working on the assignment, make sure you:

• Review the online lecture videos, in-class lecture slides and the required reading.

This assignment aligns with the following CLOs:

CLO 4: Implement simple linear regression

CLO 4: Implement simple linear regression with error bars

CLO 4: Apply a statistical test to compare regression models

1.1 Time for completion

This assignment will take approximately 6 hours to complete.

2. Instructions

For each question, the corresponding data is available as both a *.csv file (comma-separated values) and a *.dat file (tab-delimited text). The *.csv and *.dat files with the same name contain the same data.
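
The two file formats described above can be read the same way, changing only the delimiter. The following is a minimal sketch, assuming two numeric columns (X, Y) and no header row; the helper name `load_xy`, the `skiprows` default, and the in-memory sample text are illustrative assumptions, not part of the provided data files.

```python
import io

import numpy as np

# Invented sample contents standing in for a *.csv / *.dat pair.
# The real files may contain a header row; in that case pass skiprows=1.
csv_text = "1.0,2.0\n2.0,4.1\n3.0,5.9\n"
dat_text = "1.0\t2.0\n2.0\t4.1\n3.0\t5.9\n"


def load_xy(source, delimiter, skiprows=0):
    """Read a two-column numeric file and return the X and Y columns."""
    data = np.loadtxt(source, delimiter=delimiter, skiprows=skiprows)
    return data[:, 0], data[:, 1]


# A file name such as "Q1_rocket_data.csv" could be passed instead of StringIO.
x_csv, y_csv = load_xy(io.StringIO(csv_text), ",")
x_dat, y_dat = load_xy(io.StringIO(dat_text), "\t")

# Both formats should yield identical arrays.
same_data = np.array_equal(x_csv, x_dat) and np.array_equal(y_csv, y_dat)
print("CSV and DAT contents match:", same_data)
```

Checking that both formats give the same arrays is a quick way to confirm the delimiter and header settings before starting the analysis.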

When you have completed the assignment, upload the Jupyter Notebook file to onQ.

TASKS

Question 1

The “Q1_rocket_data” file contains data on the performance of model rockets constructed by MME students. Each line represents a rocket launch, where X and Y correspond to the pressure of the propellant gas (measured in psi) and the maximum height (apogee, measured in m) reached by the rocket, respectively.

By applying linear regression to this data we can create an empirical model which can predict the expected apogee of such a rocket from the pressure of the propellant. To perform the linear regression, follow the steps below:

a. Calculate the values of $\bar{x}$, $\bar{y}$, $\overline{xy}$, $\overline{x^2}$, $\overline{y^2}$ (2/50)

b. Calculate the regression coefficients $\hat{\beta}_0$, $\hat{\beta}_1$ for the best-fitting regression line using the quantities above (2/50)

c. Calculate the sum of squares corresponding to the best-fitting regression line (3/50)

d. Calculate the standard error of the regression coefficients $\hat{\beta}_0$, $\hat{\beta}_1$ and comment on their value (3/50)

e. Make a plot which shows the data points and the best-fitting regression line (2/50)

f. Calculate $R^2$ and comment on its value (i.e. interpret its meaning) (2/50)

g. Perform the linear fit using Python (e.g. numpy.polyfit) and compare the coefficients $\hat{\beta}_0$, $\hat{\beta}_1$ for the best-fitting regression line and the $R^2$ value obtained this way to the values obtained above (2/50).

(The data was provided by Prof. Surgenor.)
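
As an illustration of steps a, b, c, f and g, here is a minimal sketch on invented numbers (not the rocket data). It assumes the textbook least-squares formulas expressed through the sample means, and it takes "sum of squares" in step c to mean the residual sum of squares; the course notes may define these quantities slightly differently.

```python
import numpy as np

# Invented stand-in data: x plays the role of pressure, y the role of apogee.
x = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
y = np.array([12.0, 19.0, 33.0, 38.0, 52.0])

# (a) Sample means of x, y, xy, x^2 and y^2.
x_bar, y_bar = x.mean(), y.mean()
xy_bar, x2_bar, y2_bar = (x * y).mean(), (x**2).mean(), (y**2).mean()

# (b) Least-squares slope and intercept written in terms of those means.
beta1 = (xy_bar - x_bar * y_bar) / (x2_bar - x_bar**2)
beta0 = y_bar - beta1 * x_bar

# (c) Residual sum of squares of the fitted line (assumed meaning of step c).
residuals = y - (beta0 + beta1 * x)
ss_res = np.sum(residuals**2)

# (f) Coefficient of determination: 1 - SS_res / SS_tot.
ss_tot = np.sum((y - y_bar) ** 2)
r2 = 1.0 - ss_res / ss_tot

# (g) Cross-check against numpy.polyfit, which returns the highest power first.
slope_np, intercept_np = np.polyfit(x, y, 1)

print("manual :", beta0, beta1, r2)
print("polyfit:", intercept_np, slope_np)
```

Note that numpy.polyfit returns only the coefficients, so $R^2$ for the polyfit result has to be computed from its fitted values in the same way as in step f.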

Question 2

The “Q2_Al-thermal-expansion_data” file contains data collected using neutron scattering on the crystal lattice parameter of an Al-based composite as a function of temperature (i.e. the data is on the thermal expansion of the material). Each line represents a measurement, where X and Y correspond to the crystal lattice parameter (measured in Angstroms, 1 Angstrom = $10^{-10}$ m) and the temperature (measured in °C), respectively.

By applying linear regression to this data we can create an empirical model which can predict the expected lattice parameter of this Al-composite if the temperature of the material is known. To perform the linear regression, follow the steps below:

a. Calculate the values of $\bar{x}$, $\bar{y}$, $\overline{xy}$, $\overline{x^2}$, $\overline{y^2}$ (2/50)

b. Calculate the regression coefficients $\hat{\beta}_0$, $\hat{\beta}_1$ for the best-fitting regression line using the quantities above (2/50)

c. Calculate the sum of squares corresponding to the best-fitting regression line (3/50)

d. Calculate the standard error of the regression coefficients $\hat{\beta}_0$, $\hat{\beta}_1$ and comment on their value (3/50)

e. Make a plot which shows the data points and the best-fitting regression line (2/50)

f. Calculate $R^2$ and comment on its value (i.e. interpret its meaning) (2/50)

g. Perform the linear fit using Python (e.g. numpy.polyfit) and compare the coefficients $\hat{\beta}_0$, $\hat{\beta}_1$ for the best-fitting regression line and the $R^2$ value obtained this way to the values obtained above (2/50).

(The data was collected by E. Tulk.)
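
Step d (standard errors of the coefficients) can be sketched as follows. This assumes the common simple-regression formulas with the residual variance estimated as SS_res / (n - 2); the function name `fit_with_standard_errors` and the numbers are invented for illustration and are not the neutron scattering measurements. Here x denotes whichever variable is used as the regressor.

```python
import numpy as np


def fit_with_standard_errors(x, y):
    """Return (beta0, beta1, se_beta0, se_beta1) for a simple linear fit."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    x_bar, y_bar = x.mean(), y.mean()

    # Slope and intercept from centred sums.
    s_xx = np.sum((x - x_bar) ** 2)
    beta1 = np.sum((x - x_bar) * (y - y_bar)) / s_xx
    beta0 = y_bar - beta1 * x_bar

    # Residual variance estimate with n - 2 degrees of freedom (assumed convention).
    ss_res = np.sum((y - (beta0 + beta1 * x)) ** 2)
    sigma2 = ss_res / (n - 2)

    # Standard errors of the slope and the intercept.
    se_beta1 = np.sqrt(sigma2 / s_xx)
    se_beta0 = np.sqrt(sigma2 * (1.0 / n + x_bar**2 / s_xx))
    return beta0, beta1, se_beta0, se_beta1


# Invented example values, used only to exercise the function.
x_demo = [10.0, 20.0, 30.0, 40.0, 50.0]
y_demo = [12.0, 19.0, 33.0, 38.0, 52.0]
b0, b1, se0, se1 = fit_with_standard_errors(x_demo, y_demo)
print(f"beta0 = {b0:.3f} +/- {se0:.3f}")
print(f"beta1 = {b1:.3f} +/- {se1:.3f}")
```

Comparing each standard error with the size of its coefficient gives a first indication of how well that coefficient is determined by the data.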

Question 3

The “Q3_hybrid-cars_data” file contains data on hybrid cars from various manufacturers which came out in the years between 1997 and 2013. Each line represents a specific car. The columns denoted year, msrp, accelrate and mpg represent the model year, the manufacturer's suggested retail price in 2013 $, the maximum acceleration rate in km/hour/second and the fuel economy in miles/gallon, respectively.

Using this data set we would like to investigate how the characteristics listed above correlate with each other. Use $R^2$ to quantify and investigate these correlations while answering the questions below:

a. How much does the year the car was manufactured affect its retail price? I.e. what is the $R^2$ for year vs msrp? Make a plot which shows the data points and the best-fitting regression line (3/50)

b. How much does the retail price of the car affect its maximum acceleration rate? I.e. what is the $R^2$ for msrp vs accelrate? Make a plot which shows the data points and the best-fitting regression line (3/50)

c. How much does the fuel economy of the car affect its maximum acceleration rate? I.e. what is the $R^2$ for mpg vs accelrate? Make a plot which shows the data points and the best-fitting regression line (3/50)

d. How much does the year the car was manufactured affect its fuel economy? I.e. what is the $R^2$ for year vs mpg? Make a plot which shows the data points and the best-fitting regression line (3/50)

e. Compare the $R^2$ values obtained above and comment on their relative value: In your subjective opinion, which cases from above show a noteworthy effect (correlation) and which don’t? Explain why (6/50).

(Note: the value of $R^2$ is independent of the choice of the response and regressor variables for the data set pairs above, i.e. X vs Y and Y vs X have identical $R^2$ values. This does not hold for the regression coefficients and sum of squares.)

(Source of the data: D-J. Lim, S.R. Jahromi, T.R. Anderson, A-A. Tudorie (2014) "Comparing Technological Advancement of Hybrid Electric Vehicles (HEV) in Different Market Segments", Technological Forecasting & Social Change, http://dx.doi.org/10.1016/j.techfore.2014.05.008)
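
The note on the symmetry of $R^2$ can be checked numerically. The sketch below uses invented values standing in for two columns (named `year` and `mpg` only for illustration, not taken from the hybrid cars file) and a hypothetical helper `linear_fit_r2`; for a simple linear fit, $R^2$ also equals the squared Pearson correlation coefficient.

```python
import numpy as np


def linear_fit_r2(x, y):
    """Fit y = slope * x + intercept and return (slope, intercept, R^2)."""
    slope, intercept = np.polyfit(x, y, 1)
    ss_res = np.sum((y - (slope * x + intercept)) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Invented numbers, used only to demonstrate the symmetry property.
year = np.array([1997.0, 2001.0, 2005.0, 2009.0, 2013.0])
mpg = np.array([41.0, 45.0, 38.0, 47.0, 50.0])

slope_xy, intercept_xy, r2_xy = linear_fit_r2(year, mpg)  # mpg as response
slope_yx, intercept_yx, r2_yx = linear_fit_r2(mpg, year)  # year as response

# Squared Pearson correlation, which does not depend on the variable order.
r2_corr = np.corrcoef(year, mpg)[0, 1] ** 2

print("R^2 (year -> mpg):", r2_xy)
print("R^2 (mpg -> year):", r2_yx)
print("slopes:", slope_xy, slope_yx)
```

The two $R^2$ values agree to within rounding, while the two slopes differ, which is the behaviour the note describes.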

3. Evaluation

The report will be evaluated based on the completeness of the answers/solutions provided for each question.

