
Date: 2019-12-14 10:50

Machine Learning Coursework

The coursework aims to use machine learning techniques to classify objects in images from the CIFAR-10 dataset. The CIFAR-10 dataset consists of 60000 32x32 color images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images. The dataset is divided into five training batches and one test batch, each with 10000 images. The test batch contains exactly 1000 randomly selected images from each class. The training batches contain the remaining images in random order, so an individual training batch may contain more images from one class than another. Between them, the training batches contain exactly 5000 images from each class. The ten classes are airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck.

You will perform the following tasks using Python with the necessary libraries (Scikit-learn and PyTorch). You can find the CIFAR-10 dataset at the link above. Download the dataset and load the training and test data as described there, or use PyTorch to download the dataset (see Task 3).
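As a starting point for the manual route, the sketch below loads the batch files with `pickle`. It assumes the "CIFAR-10 python version" archive, in which each batch file is a pickled dictionary whose `b"data"` entry is a 10000x3072 `uint8` array and whose `b"labels"` entry is a list of class indices. The function names `load_batch` and `load_cifar10` and the scaling of pixel values to [0, 1] are illustrative choices, not part of the brief.

```python
import pickle
from pathlib import Path

import numpy as np


def load_batch(path):
    """Load one CIFAR-10 batch file and return (features, labels)."""
    with open(path, "rb") as f:
        # The files were pickled under Python 2, so the keys come back as bytes.
        batch = pickle.load(f, encoding="bytes")
    # Each row is one flattened 32x32x3 image; scale pixels to [0, 1].
    data = np.asarray(batch[b"data"], dtype=np.float32) / 255.0
    labels = np.asarray(batch[b"labels"], dtype=np.int64)
    return data, labels


def load_cifar10(root):
    """Load the five training batches and the test batch found under `root`."""
    root = Path(root)
    parts = [load_batch(root / f"data_batch_{i}") for i in range(1, 6)]
    X_train = np.concatenate([features for features, _ in parts])
    y_train = np.concatenate([labels for _, labels in parts])
    X_test, y_test = load_batch(root / "test_batch")
    return X_train, y_train, X_test, y_test
```

With the extracted archive in a folder (for example `cifar-10-batches-py`, the folder name the archive normally unpacks to), `load_cifar10("cifar-10-batches-py")` should return arrays of shape (50000, 3072) and (10000, 3072) together with their label vectors.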


Task 1: Apply PCA to reduce the original input features to new feature vectors that keep different amounts of information, e.g. 10%, 30%, 50%, 70%, and 100% of the dimensions.
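One possible reading of Task 1 is sketched below, assuming that "10% dimensions" means keeping the first 10% of the principal components (not 10% of the variance). The helper name `pca_feature_sets` is hypothetical. PCA is fitted once on the training data only, and each reduced feature set is obtained by slicing the leading columns of the full projection, which is valid because the components are ordered by explained variance.

```python
import numpy as np
from sklearn.decomposition import PCA


def pca_feature_sets(X_train, X_test, fractions=(0.1, 0.3, 0.5, 0.7, 1.0)):
    """Return {fraction: (train_features, test_features)} for each fraction."""
    n_features = X_train.shape[1]
    # PCA cannot return more components than min(n_samples, n_features).
    max_k = min(X_train.shape)
    pca = PCA(n_components=max_k).fit(X_train)  # fit on training data only
    Z_train = pca.transform(X_train)
    Z_test = pca.transform(X_test)

    feature_sets = {}
    for frac in fractions:
        k = min(max_k, max(1, int(round(frac * n_features))))
        # The first k columns are the projection onto the top-k components.
        feature_sets[frac] = (Z_train[:, :k], Z_test[:, :k])
    return feature_sets
```

If the intended meaning is instead "keep a given share of the variance", scikit-learn also accepts a float, as in `PCA(n_components=0.9)`, which selects the number of components needed to explain 90% of the variance.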


Task 2: Design and implement an object recognition system using SVM. Do the following:

1. Apply a linear SVM to the training data, using 10-fold cross-validation to train and validate your models on each set of input feature vectors (the original input and the reduced inputs computed in Task 1).

2. Use the test data to compute the F1 score (for each class) and the accuracy of your models, and plot figures showing the results against feature dimension.

3. Use polynomial and RBF kernels to train different SVM models on the original input features (non-PCA), again using 10-fold cross-validation to train and validate your models.

Note that each kernel has its own parameters to set, for example the order of the polynomial kernel and sigma for the RBF kernel. You should try different parameter values as well.

4. Use the test data to compute the F1 score for each class and the accuracy of your models with the different kernels and parameters.
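The four steps above could be organised roughly as in the following sketch, which uses scikit-learn. The helper names `candidate_models` and `evaluate_svm`, the parameter grids, and the `StandardScaler` step are assumptions made for illustration. Note that scikit-learn parameterises the RBF kernel by `gamma` and not sigma; the two are related by gamma = 1 / (2 * sigma^2).

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC, LinearSVC


def candidate_models():
    """Build a small, illustrative set of SVMs; the grids are assumptions."""
    models = {"linear": LinearSVC(C=1.0, max_iter=5000)}
    for degree in (2, 3):  # polynomial order
        models[f"poly(degree={degree})"] = SVC(kernel="poly", degree=degree)
    for gamma in (0.001, 0.01):  # gamma = 1 / (2 * sigma**2)
        models[f"rbf(gamma={gamma})"] = SVC(kernel="rbf", gamma=gamma)
    return models


def evaluate_svm(model, X_train, y_train, X_test, y_test, folds=10):
    """Cross-validate on the training data, then score once on the test data."""
    clf = make_pipeline(StandardScaler(), model)
    # For classifiers, an integer cv means stratified k-fold cross-validation.
    cv_scores = cross_val_score(clf, X_train, y_train, cv=folds, scoring="accuracy")
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    return {
        "cv_accuracy": float(cv_scores.mean()),
        "test_accuracy": float(accuracy_score(y_test, y_pred)),
        # average=None returns one F1 value per class, in sorted label order.
        "test_f1_per_class": f1_score(y_test, y_pred, average=None),
    }
```

Calling `evaluate_svm` once per feature set from Task 1 with the linear model gives the accuracy and F1 values to plot against feature dimension; calling it with the polynomial and RBF models on the original features covers steps 3 and 4.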


Task 3: Design and implement an object recognition system using a CNN. You should use PyTorch as the deep learning framework. There is no specific requirement on the architecture of your CNN. However, please do not use LeNet, since this is the one we used before. You should experiment with the convolutional and pooling layers (for example, more layers or more kernel windows) to get the best result you can. Use the test data to compute the F1 score for each class and the accuracy of your CNN.

Note that PyTorch includes classes and functions for downloading and using the CIFAR-10 dataset. See https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html#sphx-glr-beginner-blitzcifar10-tutorial-py for more details.
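To make the Task 3 requirements concrete, here is a minimal sketch of a CNN that is deeper than LeNet, together with a training loop and a prediction helper. The architecture (`SmallCNN`, three blocks of two 3x3 convolutions with batch normalisation and max pooling) is only one arbitrary choice among many and is not prescribed by the brief. All names here are hypothetical.

```python
import torch
import torch.nn as nn


def conv_block(in_ch, out_ch):
    """Two 3x3 convolutions followed by 2x2 max pooling (halves H and W)."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(2),
    )


class SmallCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        # Spatial size: 32x32 -> 16x16 -> 8x8 -> 4x4.
        self.features = nn.Sequential(
            conv_block(3, 32), conv_block(32, 64), conv_block(64, 128)
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Dropout(0.3),
            nn.Linear(128 * 4 * 4, 256),
            nn.ReLU(inplace=True),
            nn.Linear(256, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))


def train_one_epoch(model, loader, optimizer, device="cpu"):
    """Run one pass over `loader` and return the mean training loss."""
    model.train()
    criterion = nn.CrossEntropyLoss()
    total_loss, seen = 0.0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
        total_loss += loss.item() * images.size(0)
        seen += images.size(0)
    return total_loss / seen


@torch.no_grad()
def predict(model, loader, device="cpu"):
    """Return (predicted_labels, true_labels) as NumPy arrays."""
    model.eval()
    preds, targets = [], []
    for images, labels in loader:
        logits = model(images.to(device))
        preds.append(logits.argmax(dim=1).cpu())
        targets.append(labels.cpu())
    return torch.cat(preds).numpy(), torch.cat(targets).numpy()
```

The loaders can be built from `torchvision.datasets.CIFAR10(root, train=..., download=True, transform=...)` wrapped in `torch.utils.data.DataLoader`, as in the tutorial linked above. The arrays returned by `predict` can be passed to scikit-learn's `f1_score(y_true, y_pred, average=None)` and `accuracy_score` to get the per-class F1 values and the accuracy.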


Task 4: Based on your experience of performing Task 2 and Task 3 and your findings, compare and contrast, in your own words, the two approaches in terms of performance (accuracy, precision, recall, and F1 score), computational cost (time), and level of overfitting. To assess the level of overfitting, compare the performance of a given model on the training data with its performance on the test data and see how much they differ. State which approach you think would be better for this problem in which situations, and explain why.
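The quantities Task 4 asks about can be collected in one place, as in the sketch below. It assumes a model is exposed through two callables, `fit_fn(X, y)` and `predict_fn(X)`, so the same helper works for a scikit-learn SVM and for a wrapper around a PyTorch CNN. The helper name `compare_train_test`, the macro averaging, and the use of the accuracy gap as an overfitting indicator are illustrative choices.

```python
import time

from sklearn.metrics import accuracy_score, precision_recall_fscore_support


def compare_train_test(fit_fn, predict_fn, X_train, y_train, X_test, y_test):
    """Time the fit, then report metrics on the training and test data."""
    start = time.perf_counter()
    fit_fn(X_train, y_train)
    report = {"fit_seconds": time.perf_counter() - start}

    for split, X, y in (("train", X_train, y_train), ("test", X_test, y_test)):
        y_pred = predict_fn(X)
        # Macro averaging weights every class equally.
        precision, recall, f1, _ = precision_recall_fscore_support(
            y, y_pred, average="macro", zero_division=0
        )
        report[split] = {
            "accuracy": float(accuracy_score(y, y_pred)),
            "precision": float(precision),
            "recall": float(recall),
            "f1": float(f1),
        }

    # A large positive gap suggests the model fits the training data
    # much better than it generalises to unseen data.
    report["accuracy_gap"] = report["train"]["accuracy"] - report["test"]["accuracy"]
    return report
```

For a scikit-learn estimator `clf`, the call would be `compare_train_test(clf.fit, clf.predict, X_train, y_train, X_test, y_test)`.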


Important note: CIFAR-10 contains 60000 images, so training may take a long time. Depending on your computer, using the whole dataset may take too long for both Task 2 and Task 3.
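If the full dataset is too slow on your machine, one option is to work on a class-balanced subset. The brief does not say how a subset should be chosen, so the following is only an assumption. The helper `stratified_subsample` is hypothetical and draws the same number of images from every class.

```python
import numpy as np


def stratified_subsample(X, y, per_class, seed=0):
    """Return a shuffled subset with exactly `per_class` samples per class."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    keep = []
    for label in np.unique(y):
        idx = np.flatnonzero(y == label)
        if per_class > idx.size:
            raise ValueError(f"class {label} has only {idx.size} samples")
        keep.append(rng.choice(idx, size=per_class, replace=False))
    # Shuffle so the classes are not grouped together in the output.
    keep = rng.permutation(np.concatenate(keep))
    return X[keep], y[keep]
```

For example, `stratified_subsample(X_train, y_train, per_class=500)` would keep 5000 of the 50000 training images. Whatever size you choose, state it in your report so that the results can be reproduced.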



Marking scheme: Marks are distributed on a 100-point scale as follows. Note that you should organize your code properly, with appropriate comments, for ease of marking and running.

1) Completeness of Task 1 (10 marks)

2) Completeness of Task 2 (35 marks)

3) Completeness of Task 3 (35 marks)

4) Completeness of Task 4 (15 marks)

5) Coding with proper comments and organization (5 marks)

