
###### Date: 2020-10-12 11:29

THE HONG KONG UNIVERSITY OF SCIENCE & TECHNOLOGY

Department of Computer Science and Engineering

MSBD5008: Introduction to Social Computing

Fall 2020 Assignment 1

IMPORTANT NOTES

Late submission: 25 marks will be deducted for every 24 hours after the deadline.

Zero tolerance on plagiarism: all involved parties will receive a zero mark.

NetworkX

In this question, you are required to use NetworkX to do basic data analysis on a Wikipedia vote network dataset. It contains 7,115 nodes and 103,689 (directed) edges. The dataset can be downloaded from http://snap.stanford.edu/data/wiki-Vote.html.
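A quick way to confirm that the download is intact is to parse the edge list and compare the counts with the 7,115 nodes and 103,689 edges quoted above. The sketch below assumes the file follows SNAP's usual plain-text layout (comment lines starting with `#`, then one whitespace-separated `source target` pair per line). `parse_edge_list` and the `sample` lines are illustrative stand-ins, not part of the assignment or the real data.

```python
from typing import Iterable, List, Tuple


def parse_edge_list(lines: Iterable[str]) -> List[Tuple[int, int]]:
    """Parse 'source target' pairs, skipping blank lines and '#' comments."""
    edges = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        source, target = line.split()[:2]
        edges.append((int(source), int(target)))
    return edges


# Tiny stand-in for the real file (assumed layout, made-up edges).
sample = [
    "# FromNodeId\tToNodeId",
    "1\t2",
    "1\t3",
    "2\t3",
    "",
    "3\t1",
]
edges = parse_edge_list(sample)
nodes = {node for edge in edges for node in edge}
print(len(nodes), "nodes,", len(edges), "edges")
```

For the real data, pass an open file handle instead of `sample`. In NetworkX, `nx.read_edgelist(path, comments="#", create_using=nx.DiGraph, nodetype=int)` should build the directed graph from the same file directly.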

2. Output the following information related to degree:

average degree, average in-degree, average out-degree;

degree distribution (plot both the degree and frequency in log scale);

density ($E/N^2$), where $E$ is the number of edges, and $N$ is the number of nodes;

3. Find the largest strongly connected component (giant component), and output the number of nodes in it;

distribution of path length;

average path length;

distribution of clustering coefficient;

average clustering coefficient.
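To make the giant-component and path-length quantities concrete, here is a minimal plain-Python sketch on a made-up five-node graph. It uses Kosaraju's algorithm for the strongly connected components and breadth-first search for the shortest directed path lengths. Both function names are hypothetical, and clustering is left out to keep the example short. In an actual submission the NetworkX routines (`nx.strongly_connected_components`, `nx.average_shortest_path_length`, `nx.clustering`, `nx.average_clustering`) would normally do this work.

```python
from collections import Counter, defaultdict, deque


def strongly_connected_components(edges):
    """Kosaraju's algorithm (iterative); returns a list of node sets."""
    forward, backward = defaultdict(list), defaultdict(list)
    nodes = set()
    for u, v in edges:
        forward[u].append(v)
        backward[v].append(u)
        nodes.update((u, v))

    # Pass 1: record nodes in order of DFS completion on the forward graph.
    visited, order = set(), []
    for start in nodes:
        if start in visited:
            continue
        visited.add(start)
        stack = [(start, iter(forward[start]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(forward[nxt])))
                    break
            else:  # every neighbour explored: this node is finished
                order.append(node)
                stack.pop()

    # Pass 2: sweep the reversed graph in decreasing finish time.
    assigned, components = set(), []
    for start in reversed(order):
        if start in assigned:
            continue
        assigned.add(start)
        component, stack = {start}, [start]
        while stack:
            node = stack.pop()
            for nxt in backward[node]:
                if nxt not in assigned:
                    assigned.add(nxt)
                    component.add(nxt)
                    stack.append(nxt)
        components.append(component)
    return components


def path_length_distribution(edges, component):
    """Count shortest directed path lengths over ordered pairs in `component`."""
    adjacency = defaultdict(list)
    for u, v in edges:
        if u in component and v in component:
            adjacency[u].append(v)
    counts = Counter()
    for source in component:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in adjacency[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        counts.update(d for d in dist.values() if d > 0)
    return counts


# Made-up example: a 3-cycle with a two-node tail hanging off it.
edges = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 5)]
components = strongly_connected_components(edges)
giant = max(components, key=len)
lengths = path_length_distribution(edges, giant)
average = sum(d * c for d, c in lengths.items()) / sum(lengths.values())
print(len(giant), dict(lengths), average)
```

Restricting the search to edges inside the component loses nothing: any node on a path between two members of a strongly connected component belongs to that component too.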

5. Treat the network as undirected. Output the following information related to degree:

average degree;

degree distribution (plot both the degree and frequency in log scale);

density ($E/N^2$).
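The degree statistics are requested twice, once for the directed graph and once for its undirected reading, and the numbers differ because reciprocal votes collapse into a single undirected edge. The sketch below works this through by hand on a made-up four-node graph. `degree_summary` is a hypothetical helper, and the density follows the handout's $E/N^2$ definition.

```python
from collections import Counter


def degree_summary(edges, directed=True):
    """Average degree, degree histogram and density E / N**2 of an edge list."""
    if directed:
        edge_set = set(edges)
    else:
        # Collapse reciprocal pairs such as (1, 2) and (2, 1) into one edge.
        edge_set = {tuple(sorted(edge)) for edge in edges}

    # For the undirected case these are just the two endpoints of each edge.
    out_deg, in_deg = Counter(), Counter()
    for u, v in edge_set:
        out_deg[u] += 1
        in_deg[v] += 1

    nodes = set(in_deg) | set(out_deg)
    degree = {node: in_deg[node] + out_deg[node] for node in nodes}
    n, e = len(nodes), len(edge_set)
    summary = {
        "average_degree": sum(degree.values()) / n,   # equals 2E / N
        "histogram": Counter(degree.values()),        # degree -> node count
        "density": e / n ** 2,
    }
    if directed:
        # Every edge adds one in-degree and one out-degree, so both are E / N.
        summary["average_in_degree"] = sum(in_deg.values()) / n
        summary["average_out_degree"] = sum(out_deg.values()) / n
    return summary


edges = [(1, 2), (2, 1), (1, 3), (3, 4)]  # made-up example
directed = degree_summary(edges, directed=True)
undirected = degree_summary(edges, directed=False)
print(directed)
print(undirected)
```

The keys and values of `histogram` are the two series to put on the log-log axes. Note that `nx.density` normalises by $N(N-1)$ for directed graphs and returns $2E/(N(N-1))$ for undirected ones, so it does not reproduce $E/N^2$ and the ratio is better computed explicitly.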


Deep Graph Library (DGL)

In this question, you are required to use DGL to build a graph neural network for node classification. The dataset

view?usp=sharing

1. Load the dataset with the following command:

This file contains a dictionary object with the following information of a directed graph:

nodes: a list containing the id’s of all the nodes in the graph;

labels: a list containing the label of each node;

num classes: the total number of node labels;

features: a matrix of size: number-of-nodes × feature-dimensionality;

source nodes: a list containing the source node-id of each (directed) edge;

target nodes: a list containing the target node-id of each (directed) edge;

train mask: a list (of values “True” or “False”) indicating whether each node is used in the training set or not;

val mask: This has the same format as train mask, and shows whether each node is used in the validation set or not.
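Before building a graph from this dictionary, it can help to check that the parallel lists line up. The following is a minimal sketch on a made-up toy dictionary. The underscored key names (`num_classes`, `source_nodes`, `train_mask`, and so on) are an assumption about how the fields listed above are spelled in the real file, and `summarize_dataset` is a hypothetical helper.

```python
def summarize_dataset(data):
    """Check that per-node and per-edge lists agree in length; report sizes."""
    num_nodes = len(data["nodes"])
    for key in ("labels", "features", "train_mask", "val_mask"):
        if len(data[key]) != num_nodes:
            raise ValueError(f"{key} has {len(data[key])} rows, expected {num_nodes}")
    if len(data["source_nodes"]) != len(data["target_nodes"]):
        raise ValueError("source and target lists differ in length")

    both = sum(1 for t, v in zip(data["train_mask"], data["val_mask"]) if t and v)
    return {
        "num_nodes": num_nodes,
        "num_edges": len(data["source_nodes"]),
        "feature_dim": len(data["features"][0]),
        "train": sum(bool(m) for m in data["train_mask"]),
        "val": sum(bool(m) for m in data["val_mask"]),
        "train_val_overlap": both,
    }


# Toy dictionary with assumed key names and made-up values.
toy = {
    "nodes": [0, 1, 2, 3],
    "labels": [0, 1, 1, 0],
    "num_classes": 2,
    "features": [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]],
    "source_nodes": [0, 1, 2],
    "target_nodes": [1, 2, 3],
    "train_mask": [True, True, False, False],
    "val_mask": [False, False, True, False],
}
summary = summarize_dataset(toy)
print(summary)
```

Nodes that appear in neither mask are presumably the ones held out for the hidden test set. Recent DGL releases accept the two edge lists as a pair, along the lines of `dgl.graph((source_nodes, target_nodes))`, but check the signature against the version you have installed.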

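2. You have to use the graph neural network model dgl.nn.pytorch.conv.GINConv in DGL. It implements the following neighborhood aggregation:

$$h_i^{(l+1)} = f_\Theta\left((1+\epsilon)\,h_i^{(l)} + \mathrm{aggregate}\left(\{h_j^{(l)} : j \in \mathcal{N}(i)\}\right)\right)$$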

This model includes the graph neural network model discussed in class, but is more general. For details, read https://docs.dgl.ai/api/python/nn.pytorch.html#dgl.nn.pytorch.conv.GINConv.
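As the linked DGL documentation describes it, a GIN layer scales each node's own feature vector by $(1+\epsilon)$, adds an aggregate (sum, mean, or max) of its neighbours' vectors, and passes the result through a learnable function. The sketch below reproduces only the aggregation step in plain Python so the arithmetic is visible. `gin_aggregate` is a hypothetical helper, not DGL code. It assumes node ids are 0-based row indices, that messages travel from source to target (so a node aggregates over its in-neighbours), and that a node with no in-neighbours receives a zero vector.

```python
def gin_aggregate(features, source_nodes, target_nodes, eps=0.0, reduce="sum"):
    """Compute (1 + eps) * h_i + aggregate({h_j}) for every node i.

    The learnable map f_Theta is deliberately omitted (treated as identity).
    """
    dim = len(features[0])
    incoming = [[] for _ in features]
    for s, t in zip(source_nodes, target_nodes):
        incoming[t].append(features[s])  # message from s arrives at t

    output = []
    for h_i, messages in zip(features, incoming):
        if not messages:
            pooled = [0.0] * dim  # assumption: isolated targets get zeros
        elif reduce == "sum":
            pooled = [sum(column) for column in zip(*messages)]
        elif reduce == "mean":
            pooled = [sum(column) / len(messages) for column in zip(*messages)]
        elif reduce == "max":
            pooled = [max(column) for column in zip(*messages)]
        else:
            raise ValueError(f"unknown reduce: {reduce!r}")
        output.append([(1 + eps) * own + agg for own, agg in zip(h_i, pooled)])
    return output


# Made-up graph: 0 -> 2, 1 -> 2, 2 -> 0.
features = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
summed = gin_aggregate(features, [0, 1, 2], [2, 2, 0])
averaged = gin_aggregate(features, [0, 1, 2], [2, 2, 0], eps=0.5, reduce="mean")
print(summed)
print(averaged)
```

In DGL itself the learnable function and the reducer are passed to the layer as its `apply_func` and `aggregator_type` arguments, for example a small `torch.nn.Linear` or MLP together with `"sum"`.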

classification accuracy on a test set (which is hidden from you). We will use the following code to test your model.

Your code should include a test function (with your model and a mask as inputs) so that we do not need to retrain

print("Testing Acc {:.4}".format(accuracy))
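The quantity printed above is the share of correctly classified nodes among those selected by the mask. A minimal plain-Python sketch of that calculation follows. `masked_accuracy` is a hypothetical helper, and a real test function would first obtain `predicted` from the model's output (for instance, the arg-max over class scores) rather than take it as a list.

```python
def masked_accuracy(predicted, labels, mask):
    """Fraction of correct predictions among nodes whose mask entry is true."""
    selected = [p == y for p, y, m in zip(predicted, labels, mask) if m]
    if not selected:
        raise ValueError("mask selects no nodes")
    return sum(selected) / len(selected)


# Made-up predictions: node 2 is masked out, node 1 is misclassified.
accuracy = masked_accuracy(
    predicted=[0, 2, 1, 1],
    labels=[0, 1, 1, 1],
    mask=[True, True, False, True],
)
print("Testing Acc {:.4}".format(accuracy))
```

Evaluating with the validation mask during development gives the closest available estimate of the hidden test score.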

Please also use the following functions

    import torch

    def save_checkpoint(checkpoint_path, model):
        # state_dict: a Python dictionary object that:
        # - for a model, maps each layer to its parameter tensor;
        state = {'state_dict': model.state_dict()}
        torch.save(state, checkpoint_path)
        print('model saved to %s' % checkpoint_path)

    save_checkpoint("best_model.pth", model)


    print('model loaded from %s' % checkpoint_path)

Submission Guidelines

Please submit two Python notebooks (A1.ipynb and A2.ipynb) and a report (report.pdf) for your results and conclusions.

Zip all the files into a single submission folder named A1 awangab 12345678 (replace awangab with your ust