Date: 2018-10-23 10:00

MSC-BDT5002 Knowledge Discovery and Data Mining, Fall 2018

Assignment 2

Deadline: Oct. 25th, 2018, 11:59pm

Task Description

The dataset comes from the 1994 Census database. The prediction task is to determine whether a person makes over 50K a year.

Files Description

1. trainFeatures.csv: basic information for 34189 individuals, with 14 attributes each, for training.

2. testFeatures.csv: basic information for 14653 individuals, with 14 attributes each, for testing.

3. trainLabels.csv: incomes of the 34189 training individuals, 0: <=50k, 1: >50k.

4. sampleSubmission.csv: a sample submission file you may refer to.

5. dataDescription.pdf: information about the 14 attributes.
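Before any modelling, it may help to confirm that the files match the description above. The following is a minimal sketch using only the Python standard library. The helper names load_rows and check_shapes are hypothetical, and the has_header flag is an assumption, since the description does not say whether the CSV files carry a header row.

```python
import csv


def load_rows(path, has_header=True):
    """Read a CSV file into a list of rows (each row is a list of strings).

    has_header is an assumption: pass False if the file has no header row.
    """
    with open(path, newline="") as f:
        rows = list(csv.reader(f))
    return rows[1:] if has_header and rows else rows


def check_shapes(features, labels, n_attributes=14):
    """Check that features and labels line up as the file description states."""
    if len(features) != len(labels):
        raise ValueError(
            "feature rows: %d, label rows: %d" % (len(features), len(labels))
        )
    # Every feature row should have the same number of attributes.
    bad = [i for i, row in enumerate(features) if len(row) != n_attributes]
    if bad:
        raise ValueError("unexpected row width, first at index %d" % bad[0])
    return len(features)
```

For example, check_shapes(load_rows("trainFeatures.csv"), load_rows("trainLabels.csv")) would be expected to return 34189 if the files are as described and the header assumption holds.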

Notes

1. You must use an ensemble learning algorithm to do the prediction.

2. Real-world data contains noise, missing values, or even mistakes. Preprocessing is necessary.

3. Your assignment will be graded on testing accuracy and on the explanation of your feature engineering (in readme.pdf).

4. The TA will check your source code carefully, so your code must be runnable. Keep your code clean and comment it clearly.

5. You can use any programming language. In principle, Python is preferred.

6. Cheating is not allowed. Your result MUST be reproducible.

7. Plagiarism will lead to a zero mark.
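To illustrate what notes 1, 2 and 6 ask for, here is a small standard-library sketch of one possible pipeline: fill in missing values, train a bagged ensemble of decision stumps with a fixed random seed, and predict by majority vote. This is only an illustration under stated assumptions, not the expected solution. The "?" missing-value marker is an assumption (check dataDescription.pdf), bagged stumps are just one simple kind of ensemble, and all function names are hypothetical. The stump search is brute force, so it is meant for small numeric examples rather than the full dataset.

```python
import random
from collections import Counter


def impute_most_common(rows, missing="?"):
    """Replace the missing marker in each column with that column's most common value."""
    fills = []
    for col in zip(*rows):
        seen = Counter(v for v in col if v != missing)
        fills.append(seen.most_common(1)[0][0] if seen else missing)
    return [[fills[j] if v == missing else v for j, v in enumerate(row)]
            for row in rows]


def fit_stump(X, y):
    """Find the (feature, threshold, label-above) rule with the fewest training errors."""
    best = None
    for j in range(len(X[0])):
        for t in sorted(set(row[j] for row in X)):
            for above in (0, 1):
                errors = sum((above if row[j] > t else 1 - above) != label
                             for row, label in zip(X, y))
                if best is None or errors < best[0]:
                    best = (errors, j, t, above)
    return best[1:]


def predict_stump(stump, row):
    j, t, above = stump
    return above if row[j] > t else 1 - above


def fit_bagging(X, y, n_models=25, seed=0):
    """Train one stump per bootstrap resample; the fixed seed keeps runs reproducible."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(len(X)) for _ in X]  # sample with replacement
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models


def predict_bagging(models, row):
    """Majority vote over the ensemble members."""
    votes = sum(predict_stump(m, row) for m in models)
    return 1 if 2 * votes > len(models) else 0


def accuracy(models, X, y):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict_bagging(models, r) == t for r, t in zip(X, y)) / len(y)
```

Because fit_bagging draws its bootstrap samples from random.Random(seed), calling it twice with the same data and seed yields the same models, which is one way to keep a result reproducible. Categorical attributes would need to be encoded as numbers before a threshold-based learner like this could use them.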

Submission Guidelines

1. The assignment should be submitted to mscbdt5002fall18@gmail.com as an attachment.

2. You need to zip the following three files together:

A2_itsc_stuid_readme.pdf: Describe your feature engineering in it.

A2_itsc_stuid_code.zip: The zip file contains all your source code.

A2_itsc_stuid_prediction.csv: The prediction result.

3. The attachment should be named in the format A2_itsc_stuid.zip, e.g. A2_lliny_20181234.zip.

4. Submissions made after the deadline or not following the rules above are NOT accepted.
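The naming rules above can be checked mechanically before sending the email. The following is a minimal sketch using the standard zipfile module. The function names submission_names and build_submission are hypothetical, and it assumes the three required files already sit in one directory under their final names.

```python
import os
import zipfile


def submission_names(itsc, stuid):
    """Build the file names required by the submission guidelines."""
    prefix = "A2_%s_%s" % (itsc, stuid)
    return {
        "archive": prefix + ".zip",
        "members": [
            prefix + "_readme.pdf",
            prefix + "_code.zip",
            prefix + "_prediction.csv",
        ],
    }


def build_submission(itsc, stuid, directory="."):
    """Zip the three required files found in `directory`; fail if one is missing."""
    names = submission_names(itsc, stuid)
    missing = [m for m in names["members"]
               if not os.path.isfile(os.path.join(directory, m))]
    if missing:
        raise FileNotFoundError("missing: " + ", ".join(missing))
    archive = os.path.join(directory, names["archive"])
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for m in names["members"]:
            # arcname keeps the files at the top level of the archive
            zf.write(os.path.join(directory, m), arcname=m)
    return archive
```

With the example from the guidelines, submission_names("lliny", "20181234")["archive"] gives "A2_lliny_20181234.zip".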

