
DDA3020: Assignment I

February 26, 2023

This assignment accounts for 14/100 of the final score. Homework due: 11:59 pm, March 12, 2023.

1 Written Problems (50 points)

1.1 Given the following denominator-layout derivatives: (13 points)

• Differentiation of a scalar function w.r.t. a vector: if f(w) is a scalar function of d variables and w is a d × 1 vector, then differentiating f(w) w.r.t. w yields the d × 1 vector df(w)/dw.

• Differentiation of a vector function w.r.t. a vector: if f(w) is a vector function of size h × 1 and w is a d × 1 vector, then differentiating f(w) w.r.t. w yields the d × h matrix df(w)/dw.

Please prove the following derivatives.

Consider X ∈ R^{h×d} and y ∈ R^{h×1}, which are not functions of w:

d(y⊤Xw)/dw = X⊤y, (4 points)

d(w⊤w)/dw = 2w. (4 points)

Consider X ∈ R^{d×d} and w ∈ R^{d×1} (5 points):

d(w⊤Xw)/dw = (X + X⊤)w.
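The proofs themselves must be analytic, but the third identity can be sanity-checked numerically with NumPy (a sketch, with X and w as arbitrary small random examples; since w⊤Xw is quadratic, a central finite difference matches the analytic gradient up to rounding):

```python
import numpy as np

# Check d(w'Xw)/dw = (X + X')w by central finite differences.
rng = np.random.default_rng(0)
d = 4
X = rng.standard_normal((d, d))
w = rng.standard_normal(d)

analytic = (X + X.T) @ w

eps = 1e-6
numeric = np.zeros(d)
for i in range(d):
    e = np.zeros(d)
    e[i] = eps
    # central difference of the scalar w'Xw along coordinate i
    numeric[i] = ((w + e) @ X @ (w + e) - (w - e) @ X @ (w - e)) / (2 * eps)

print(np.allclose(analytic, numeric, atol=1e-4))  # True
```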

1.2 Suppose we have training data {(x1, y1), (x2, y2), . . . , (xN, yN)}, where xi ∈ R^d and yi ∈ R, i = 1, 2, . . . , N. Consider fw,b(x) = x⊤w + b. (12 points)

(1) Find the closed-form solution of the following problem:

min_{w,b} Σ_{i=1}^{N} (fw,b(xi) − yi)² + λw̄⊤w̄,   (1)

where w̄ = Îd w = [0, w1, w2, . . . , wd]⊤. (6 points)


(2) Show how to use gradient descent to solve the problem. (6 points)
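(Your answer should derive the gradients of objective (1) itself; as a generic template only, gradient descent iterates the standard update with step size η:)

```latex
\begin{aligned}
w^{(t+1)} &= w^{(t)} - \eta \,\frac{\partial L}{\partial w}\Big|_{w^{(t)},\, b^{(t)}}, \\
b^{(t+1)} &= b^{(t)} - \eta \,\frac{\partial L}{\partial b}\Big|_{w^{(t)},\, b^{(t)}}.
\end{aligned}
```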

1.3 Prove that:

(1) f(x) = x² is convex; (4 points)

(2) every affine function f(x) = ax + b is convex, but not strictly convex; (4 points)

(3) f(x) = |x| is convex, but not strictly convex. (5 points)

1.4 Suppose x1, x2, . . . , xN are drawn from Laplace(µ, b). Calculate the MLE (maximum likelihood estimate) of µ and b. Hint: use the logarithmic trick to handle the product of exponential terms. (12 points)
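For reference (the problem does not state it), the Laplace density and the resulting log-likelihood of an i.i.d. sample are:

```latex
f(x \mid \mu, b) = \frac{1}{2b}\exp\!\left(-\frac{|x-\mu|}{b}\right),
\qquad
\log L(\mu, b) = -N\log(2b) - \frac{1}{b}\sum_{i=1}^{N} |x_i - \mu|.
```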

2 Programming (50 points)

The boston.csv file contains the Boston Housing Dataset, which was derived from information collected by the U.S. Census Service concerning housing in the area of Boston, MA. The dataset columns are described below:

• CRIM - per capita crime rate by town

• ZN - proportion of residential land zoned for lots over 25,000 sq.ft.

• INDUS - proportion of non-retail business acres per town.

• CHAS - Charles River dummy variable (1 if tract bounds river; 0 otherwise)

• NOX - nitric oxides concentration (parts per 10 million)

• RM - average number of rooms per dwelling

• AGE - proportion of owner-occupied units built prior to 1940

• DIS - weighted distances to five Boston employment centres

• RAD - index of accessibility to radial highways

• TAX - full-value property-tax rate per $10,000

• PTRATIO - pupil-teacher ratio by town

• B - 1000(Bk − 0.63)² where Bk is the proportion of blacks by town

• LSTAT - % lower status of the population

• MEDV - Median value of owner-occupied homes in $1000’s

You need to use appropriate attributes among 'crim, zn, indus, chas, nox, rm, age, dis, rad, tax, ptratio, b, lstat' to predict the last attribute, 'MEDV'. You need to finish the following steps:

• Step 1: use the pandas library to inspect the data in the dataset. Process incomplete data points such as 'NaN' or 'Null' values. Briefly summarize the characteristics of this dataset and guess which attribute is most relevant to MEDV.
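A minimal sketch of Step 1, using a tiny hypothetical frame in place of boston.csv (all values here are made up for illustration; on the real data, start from `pd.read_csv("boston.csv")` and inspect every column):

```python
import numpy as np
import pandas as pd

# Stand-in for pd.read_csv("boston.csv"); the values are fabricated.
df = pd.DataFrame({
    "rm":    [6.5, 5.9, np.nan, 7.1, 6.2],
    "lstat": [4.9, 9.1, 4.0, 2.9, np.nan],
    "medv":  [24.0, 21.6, 34.7, 33.4, 36.2],
})

print(df.isnull().sum())      # count NaN/Null per column
clean = df.dropna()           # one way to process incomplete rows
print(clean.describe())       # summary statistics of the dataset
print(clean.corr()["medv"])   # first guess at the most relevant attribute
```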

• Step 2: use the seaborn library to visualize the dataset. Plot the MEDV distributions over each attribute. Briefly analyze the characteristics of the attributes and revise the assumption from Step 1 if necessary.
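One way to sketch Step 2 for a single attribute (hypothetical mini-frame; on the real data, loop over every attribute column instead of just "rm"):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd
import seaborn as sns

# Fabricated stand-in for the boston.csv columns.
df = pd.DataFrame({"rm":   [5.9, 6.2, 6.5, 7.1, 8.0],
                   "medv": [21.6, 24.0, 28.7, 33.4, 43.1]})

# MEDV plotted over one attribute; repeat for each column in the real data.
ax = sns.scatterplot(data=df, x="rm", y="medv")
ax.figure.savefig("medv_vs_rm.png")
```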

• Step 3: use the seaborn.heatmap function to plot the pairwise correlations of the data. Select the attributes that are good candidates for use as predictors. Report your findings.

• Step 4: use the sklearn.preprocessing.MinMaxScaler function to scale the columns you selected in Step 3. Then use seaborn.regplot to plot the relevance of these columns against MEDV with a 95% confidence interval.
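Step 4 can be sketched as follows (the columns `rm` and `lstat` stand in for whatever Step 3 actually selects; `ci=95` is seaborn's default confidence interval):

```python
import matplotlib
matplotlib.use("Agg")
import pandas as pd
import seaborn as sns
from sklearn.preprocessing import MinMaxScaler

df = pd.DataFrame({"rm":    [5.9, 6.2, 6.5, 7.1, 8.0],
                   "lstat": [14.9, 9.1, 6.0, 4.9, 2.9],
                   "medv":  [21.6, 24.0, 28.7, 33.4, 43.1]})

cols = ["rm", "lstat"]                             # hypothetical selection
df[cols] = MinMaxScaler().fit_transform(df[cols])  # scale to [0, 1]

# Regression plot with a 95% confidence band; repeat per selected column.
ax = sns.regplot(data=df, x="rm", y="medv", ci=95)
ax.figure.savefig("regplot_rm.png")
```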


• Step 5: randomly split the data into two parts, one containing 80% of the samples and the other containing 20%. Use the first part as training data to train a linear regression model with gradient descent, and make predictions on the second part. X should be the attributes you selected in previous steps.
Report the training error and testing error in terms of RMSE. Plot the loss curves of the training process. Notice: you need to write the code for learning the parameters yourself. Do not use the regression packages of sklearn.
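The split/train/evaluate loop can be sketched as below, with synthetic data standing in for the scaled boston.csv attributes and MEDV; the gradient-descent parameter updates are written by hand, with no sklearn regression involved:

```python
import numpy as np

# Synthetic stand-in: replace X, y with the selected attributes and MEDV.
rng = np.random.default_rng(0)
N, d = 200, 3
X = rng.random((N, d))
y = X @ np.array([3.0, -2.0, 1.0]) + 0.5 + 0.1 * rng.standard_normal(N)

# Random 80/20 split.
idx = rng.permutation(N)
tr, te = idx[: int(0.8 * N)], idx[int(0.8 * N):]

# Batch gradient descent on the mean squared loss.
w, b = np.zeros(d), 0.0
lr, steps = 0.1, 2000
losses = []
for _ in range(steps):
    err = X[tr] @ w + b - y[tr]
    losses.append(np.mean(err ** 2))            # loss curve for plotting
    w -= lr * 2 * X[tr].T @ err / len(tr)       # dL/dw
    b -= lr * 2 * err.mean()                    # dL/db

def rmse(pred, target):
    return np.sqrt(np.mean((pred - target) ** 2))

print("train RMSE:", rmse(X[tr] @ w + b, y[tr]))
print("test  RMSE:", rmse(X[te] @ w + b, y[te]))
# `losses` can be plotted with matplotlib to get the training loss curve.
```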

• Step 6: repeat the splitting, training, and testing 10 times with different parameters such as step size, number of iterations, etc. Use a loop and print the RMSE of each trial. Analyze the influence of the different parameters on RMSE.
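A sketch of the Step 6 loop, varying the step size across 10 trials (synthetic, noiseless data as a stand-in; iteration count could be varied the same way):

```python
import numpy as np

rng = np.random.default_rng(1)
N, d = 200, 3
X = rng.random((N, d))
y = X @ np.array([3.0, -2.0, 1.0]) + 0.5   # noiseless stand-in target

def run_trial(lr, steps=500):
    """One random 80/20 split, hand-written GD training, test RMSE."""
    idx = rng.permutation(N)
    tr, te = idx[:160], idx[160:]
    w, b = np.zeros(d), 0.0
    for _ in range(steps):
        err = X[tr] @ w + b - y[tr]
        w -= lr * 2 * X[tr].T @ err / len(tr)
        b -= lr * 2 * err.mean()
    return np.sqrt(np.mean((X[te] @ w + b - y[te]) ** 2))

step_sizes = np.linspace(0.01, 0.3, 10)
rmses = [run_trial(lr) for lr in step_sizes]
for lr, r in zip(step_sizes, rmses):
    print(f"lr={lr:.2f}  test RMSE={r:.4f}")
```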

2.1 Submission Format

• code.ipynb (without input data included) - 25 pts - your Jupyter notebook file should contain the running output of each step (numbers, plots, etc.). If your notebook has only code but no output results, you will get a discounted score.

• Submit a report containing the linear model description, loss function, hyperparameter settings, RMSE equation, outputs (errors, plots, figures), and the relevant analysis required in the above steps. This should be included as part of the written assignment in the PDF file. - 25 pts.

• Note: please include all results from your model in the report. You will receive no credit if we have to run the code to get outputs. The recommended length of the report is about 3-5 pages. If the report is too short, points will be deducted for insufficient content; there is also no bonus credit for overly long reports. The overall submission is one ipynb file and one PDF file containing both the written answers and the report.


