Data programming assignment help: Python, Java, and C++ coursework (Python programming)


Date: 2020-11-28 11:14

HW1 (Due date: 11/12).

Upload your answers to Canvas as a Word document with your name in the filename. For code questions, paste your code into Word together with the corresponding results (print the needed outputs and figures) and add some explanation. Alternatively, you can submit a Jupyter notebook file with some comments. For hand-calculation problems, you may write the answers in Word, or write them on a piece of paper, take a picture, and paste the picture into Word.

1. Gradient accumulation for a 2-layer neural network with 2-dimensional input (5 points).

x = [11; 5]; W1 = [1.1, 2.1, 4; 0.8, 3.2, 3.3] (a 2x3 matrix); b1 = -1.3; W2 = [1/8; 1/6; 1/7] (a 3x1 matrix); b2 = 0.1. The network output is $y = W_2^{\top}(W_1^{\top} x + b_1) + b_2$. The observation is y* = 20. Use squared loss. Calculate:

(a) What is dL/dW1? (Hint: you should have a matrix of 6 values here.)

(b) With a learning rate of 0.01, what are the new weights after one update?
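A hand calculation for problem 1 can be checked numerically. The sketch below is one possible check, not an official solution. It assumes the loss is L = (y - y*)^2 with no factor of 1/2 (the handout only says "squared loss"), that the scalar b1 is added to every hidden unit, and that "the weights" in (b) covers W1, W2 and both biases. Names such as `dL_dW1` and `W1_new` are illustrative only.

```python
import numpy as np

# Values given in the problem statement.
x = np.array([[11.0], [5.0]])                       # 2x1 input
W1 = np.array([[1.1, 2.1, 4.0], [0.8, 3.2, 3.3]])   # 2x3
b1 = -1.3                                           # scalar, broadcast over 3 units
W2 = np.array([[1 / 8], [1 / 6], [1 / 7]])          # 3x1
b2 = 0.1
y_star = 20.0
lr = 0.01

# Forward pass: y = W2^T (W1^T x + b1) + b2
h = W1.T @ x + b1                # 3x1 hidden values (no activation in the handout)
y = (W2.T @ h).item() + b2

# Backward pass, assuming L = (y - y*)^2.
dL_dy = 2.0 * (y - y_star)
dL_dW2 = dL_dy * h               # 3x1
dL_db2 = dL_dy
dL_dh = dL_dy * W2               # 3x1
dL_dW1 = x @ dL_dh.T             # 2x3, the six values asked for in (a)
dL_db1 = dL_dh.sum()             # b1 feeds all three hidden units

# One gradient-descent step for (b).
W1_new = W1 - lr * dL_dW1
W2_new = W2 - lr * dL_dW2
b1_new = b1 - lr * dL_db1
b2_new = b2 - lr * dL_db2

print("y =", y)
print("dL/dW1 =\n", dL_dW1)
print("W1_new =\n", W1_new)
```

If the course defines squared loss as (1/2)(y - y*)^2 instead, every gradient above is halved.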

2. Clustering and dimensionality reduction. From the CAMELS dataset (attributes.csv), we can extract the following attributes: (i) aridity: annual potential evapotranspiration (PET) divided by precipitation; (ii) precipitation seasonality index; (iii) fraction of precipitation falling as snow.

(a) Define the distance as the Euclidean distance over the above three indices. Run a k-means clustering of the CAMELS basins. How many clusters should you set? Show the total_sum_of_squared_distance vs. k plot to justify your choice.

(b) There are 17 attributes in attributes.csv. Use principal component analysis to find the first two principal components. Make a scatter plot of the basins in 2D with PC-1 and PC-2 as the axes. Better yet, use colors to indicate which cluster each basin belongs to.
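The workflow in problem 2 can be sketched with scikit-learn as follows. This is only an illustration: the column names in attributes.csv are not given here, so a synthetic array stands in for the data, and treating its first three columns as the three indices is a made-up convention. Standardizing before PCA is also an assumption; the k-means step uses the raw indices because the handout defines the distance directly on them.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def elbow_inertias(X, k_values, seed=0):
    """Total within-cluster sum of squared distances for each k."""
    return [
        KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X).inertia_
        for k in k_values
    ]


def pca_2d(X_all):
    """Standardize (an assumption) and project onto PC-1 and PC-2."""
    X_std = StandardScaler().fit_transform(X_all)
    return PCA(n_components=2).fit_transform(X_std)


# Synthetic stand-in for attributes.csv: 300 "basins" x 17 attributes,
# drawn around three made-up centers so that some cluster structure exists.
rng = np.random.default_rng(0)
centers = rng.normal(scale=3.0, size=(3, 17))
X_all = np.vstack([c + rng.normal(size=(100, 17)) for c in centers])
X3 = X_all[:, :3]  # hypothetical positions of the three indices

k_values = list(range(1, 9))
inertias = elbow_inertias(X3, k_values)

k_chosen = 3  # matches how this toy data was generated; read it off the plot for real data
labels = KMeans(n_clusters=k_chosen, n_init=10, random_state=0).fit_predict(X3)
pcs = pca_2d(X_all)


def plot_results():
    """Elbow plot and PC-1/PC-2 scatter colored by cluster label."""
    import matplotlib.pyplot as plt

    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
    ax1.plot(k_values, inertias, marker="o")
    ax1.set_xlabel("k")
    ax1.set_ylabel("total sum of squared distances")
    ax2.scatter(pcs[:, 0], pcs[:, 1], c=labels, cmap="viridis", s=10)
    ax2.set_xlabel("PC-1")
    ax2.set_ylabel("PC-2")
    plt.show()
```

Calling `plot_results()` draws both figures. With the real file, `X_all` and `X3` would come from the loaded table instead of the random generator.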

3. Boosting and feature importance. Still working with the CAMELS dataset, extract the annual average runoff from runoff_mm.csv (we did this in hw1). Together with the 17 attributes, you have 18 attributes. Normalize the attributes first.

(a) Write a loop that, in each iteration, predicts one of the attributes using xgboost with the remaining 17 attributes as inputs. You can predict all 18 attributes with this loop. Which attribute has the highest predictability?

(b) For the most predictable attribute you found in (a), use permutation_importance to rank the feature importance.

4. Neural network training. Use 80% of the basins for training and 20% for testing. Report both the train and the test metrics.

(a) For the problem of predicting annual average runoff_mm using the other 17 attributes, write PyTorch code to train a 2-layer neural network.

(b) Write a two-layer MLP as an autoencoder for the 17 catchment attributes (not including runoff) with a hidden size of 4 or 6. What kind of reconstruction error do you get for these two setups?
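Both parts of problem 4 can share one training routine, as the following PyTorch sketch suggests. Everything beyond the handout is an assumption: the ReLU activation, the hidden width of 16 for the regressor, Adam with full-batch updates, mean squared error as the metric, and the random tensors standing in for the normalized CAMELS attributes and runoff. The helpers `make_mlp` and `train_and_score` are hypothetical names.

```python
import torch
from torch import nn


def make_mlp(n_in, n_hidden, n_out):
    """Two linear layers with a ReLU in between."""
    return nn.Sequential(
        nn.Linear(n_in, n_hidden), nn.ReLU(), nn.Linear(n_hidden, n_out)
    )


def train_and_score(model, X_tr, Y_tr, X_te, Y_te, epochs=200, lr=1e-2):
    """Full-batch training with Adam; returns (train MSE, test MSE)."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    model.train()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(X_tr), Y_tr)
        loss.backward()
        opt.step()
    model.eval()
    with torch.no_grad():
        train_mse = loss_fn(model(X_tr), Y_tr).item()
        test_mse = loss_fn(model(X_te), Y_te).item()
    return train_mse, test_mse


# Synthetic stand-in: 500 "basins", 17 attributes, one runoff-like target.
torch.manual_seed(0)
n = 500
X = torch.randn(n, 17)
y = X[:, :3].sum(dim=1, keepdim=True) + 0.1 * torch.randn(n, 1)

# 80/20 split by random permutation of the rows.
perm = torch.randperm(n)
n_train = int(0.8 * n)
tr, te = perm[:n_train], perm[n_train:]

# (a) Regression: 17 attributes -> runoff.
reg_train_mse, reg_test_mse = train_and_score(
    make_mlp(17, 16, 1), X[tr], y[tr], X[te], y[te]
)

# (b) Autoencoder: the input is also the target; bottleneck of 4 or 6.
ae_mse = {}
for hidden in (4, 6):
    ae_mse[hidden] = train_and_score(
        make_mlp(17, hidden, 17), X[tr], X[tr], X[te], X[te]
    )

print("regression train/test MSE:", reg_train_mse, reg_test_mse)
print("autoencoder train/test MSE by hidden size:", ae_mse)
```

On real data the attributes would be normalized before training, and the reconstruction error for the two bottleneck sizes would be read from `ae_mse[4]` and `ae_mse[6]`.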
