Date: 2025-05-02 08:07


DS 3003 Data Processing Workshop II

2025 Spring Project

There are two projects and one presentation, together worth 50% of your overall mark: Project 1 – 20%, Project 2 – 20%, Presentation – 10%.

• Project group

You will form a three-person group and work together to finish the project. Cooperation is therefore crucial to the success of this project. You must also clearly specify the role and contribution percentage of each member in the project report. Grading will consider the overall result of the teamwork, as well as the workload, output quality, and presentation of each team member.

• Project deliverable

a) Code with proper comments.

b) A written report in PDF format named “GROUP_NO_GROUP_NAME.pdf”, recording what you did, how you did it, and what result you got in each step. Please include important code and screenshots with comments.

c) A presentation of about 8 minutes per group that clearly lists the contribution percentage of each member.

d) Zip your code, report, and presentation as GROUP_NO_GROUP_NAME.zip and submit it to iSpace.
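As one possible way to package deliverable d), the sketch below bundles a code folder, the report, and the slides into a single archive using Python's standard `zipfile` module. The function name `make_submission`, its parameters, and the reading of GROUP_NO_GROUP_NAME as "group number, underscore, group name" are assumptions made for illustration; the handout does not prescribe a tool or an internal archive layout.

```python
import zipfile
from pathlib import Path


def make_submission(group_no: str, group_name: str,
                    sources: list[Path], out_dir: Path) -> Path:
    """Bundle files and folders into <group_no>_<group_name>.zip.

    Hypothetical helper: the naming pattern is an assumed reading of
    GROUP_NO_GROUP_NAME.zip from the handout.
    """
    archive = out_dir / f"{group_no}_{group_name}.zip"
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for src in sources:
            if src.is_dir():
                # Keep the folder name as the top-level entry inside the archive.
                for path in sorted(src.rglob("*")):
                    if path.is_file():
                        zf.write(path, path.relative_to(src.parent).as_posix())
            else:
                # Single files (report, slides) go at the archive root.
                zf.write(src, src.name)
    return archive
```

A call such as `make_submission("07", "DataMiners", [Path("code"), Path("07_DataMiners.pdf")], Path("dist"))` would write `dist/07_DataMiners.zip`, where the group number, group name, and paths are placeholders. The output directory should already exist and should sit outside the folders being archived, so the archive does not try to include itself.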

Projects (20% + 20% + 10%)

1. Goal: Each group must complete TWO of the following projects (20% each) according to the corresponding document and requirements. Show your code and screenshots with comments in the report and presentation.

2. Related datasets and some of the code can be found here:

  

3. Project options:

  

1) Data analytics for the Bilibili website: (1) follow the document and implement everything, (2) retrieve the latest data from the Bilibili website and redo the previous step, (3) analyze the data related to “UIC”, (4) analyze the data with regression or tree methods, (5) describe any interesting results you found in the datasets.

2) CO2 emissions analysis: (1) follow the document and implement everything, (2) retrieve the carbon emission data for 2022 and 2023 and redo the previous step, (3) analyze the data with regression or tree methods, (4) describe any interesting results you found in the datasets.

3) Music album analytics: (1) follow the document and implement everything, (2) retrieve another album dataset from Kaggle and redo the previous step, (3) analyze the data with regression or tree methods, (4) describe any interesting results you found in the datasets.

4) Weather analysis: (1) follow the document and implement everything, (2) retrieve the latest data from the NMC website and redo the previous step, (3) analyze the data related to “Guangdong” and “Zhuhai”, (4) analyze the data with regression or tree methods, (5) describe any interesting results you found in the datasets.

5) Movie data analytics: (1) follow the document and implement everything, (2) retrieve another TMDB dataset (size > 5000) from Kaggle and redo the previous step, (3) analyze the movie data related to China and Chinese, (4) analyze the data with regression or tree methods, (5) describe any interesting results you found in the datasets.

6) Credit card data analysis: (1) follow the document and implement everything, (2) retrieve a different credit risk dataset from Kaggle and redo the previous step, (3) analyze the data with regression or tree methods, (4) describe any interesting results you found in the datasets.

7) NBA player data analysis: (1) follow the document and implement everything, (2) retrieve the latest NBA season dataset and redo the previous step, (3) build a model and output the best players according to the dataset from the previous step, (4) analyze the data with regression or tree methods, (5) describe any interesting results you found in the datasets.

8) PUBG game data analysis: (1) follow the document and implement everything, (2) retrieve another game dataset and redo the previous step, (3) analyze the data with regression or tree methods, (4) describe any interesting results you found in the datasets.

9) E-Commerce data analytics: (1) follow the document and implement everything, (2) retrieve another e-commerce dataset from Kaggle and redo the previous step, (3) analyze the data with regression or tree methods, (4) describe any interesting results you found in the datasets.

10) COVID data analytics: (1) follow the document and implement everything, (2) retrieve a global COVID dataset covering 2020-2022 from Kaggle and redo the previous step, (3) analyze the data related to China, (4) analyze the data with regression or tree methods, (5) describe any interesting results you found in the datasets.

11) Book recommendation analytics: (1) follow the document and implement everything, (2) retrieve another book dataset (preferably related to China) and redo the previous step, (3) analyze the data with regression or tree methods, (4) describe any interesting results you found in the datasets.

12) Stroke data analytics: (1) follow the document and implement everything, (2) retrieve another disease dataset and analyze it accordingly, (3) analyze the data with regression or tree methods, (4) describe any interesting results you found in the datasets.

13) Taobao AD data analytics: (1) follow the document and implement everything, (2) predict whether a user will buy the product based on the dataset, (3) retrieve another AD dataset and analyze it accordingly, (4) analyze the data with regression or tree methods, (5) describe any interesting results you found in the datasets.

14) British Airlines review data analytics: (1) follow the document and implement everything, (2) retrieve another airline review dataset and analyze it accordingly, (3) analyze the data with regression or tree methods, (4) describe any interesting results you found in the datasets.

15) Netflix data analytics: (1) follow the document and implement everything, (2) retrieve another movie/TV show dataset and analyze it accordingly, (3) analyze the data with regression or tree methods, (4) describe any interesting results you found in the datasets.

16) Stock market & LLM data analytics: (1) follow the document and implement everything, (2) retrieve another stock, fund, or index dataset and analyze it accordingly, (3) analyze the data with regression or tree methods, (4) describe any interesting results you found in the datasets.
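Every option above asks for analysis "with regression or tree methods". The sketch below illustrates the difference between the two on made-up toy data, using only the Python standard library: an ordinary least-squares line and a depth-1 regression tree (a "stump"). The helper names `fit_line`, `fit_stump`, and `mse`, as well as the toy data, are assumptions for illustration. The handout does not name a library, and a real project would more likely rely on one such as scikit-learn.

```python
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for one feature; returns a predict function."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def fit_stump(xs, ys):
    """Depth-1 regression tree: pick the split that minimizes squared error."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between two equal x values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        left_mean, right_mean = statistics.fmean(left), statistics.fmean(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        raise ValueError("all x values are identical")
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def mse(predict, xs, ys):
    """Mean squared error of a predict function on the given points."""
    return statistics.fmean([(predict(x) - y) ** 2 for x, y in zip(xs, ys)])


# Toy data (made up): one linear response and one step-shaped response.
xs = [float(i) for i in range(10)]
linear_ys = [2.0 * x + 1.0 for x in xs]
step_ys = [1.0 if x < 5 else 4.0 for x in xs]

for name, ys in [("linear", linear_ys), ("step", step_ys)]:
    print(name,
          "line MSE:", mse(fit_line(xs, ys), xs, ys),
          "stump MSE:", mse(fit_stump(xs, ys), xs, ys))
```

The errors here are measured on the training points only, which is enough to show that each model suits a different shape of relationship. For a real dataset, evaluating on a held-out split would give a fairer comparison.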
