Date: 2020-01-17 09:58

Final Project - IA626

Summary

The final project for IA626 will be an open-ended project covering the topics below. It will be graded primarily on complexity, analysis, and documentation.

Requirements

The project should contain the following:

● ETL - The methods used to fetch and store source data should be clearly outlined and repeatable. Some thought should go into the storage format.

○ Example - News stories were scraped from 3 webpages hourly using the getStories.py script. Stories were stored as raw HTML files. The files were then parsed by parseHTML.py, which loaded them into JSON objects with the following schema. We stored the news stories in this JSON schema rather than in MySQL because we wanted flexibility in the schema.

● Analysis - What is the primary question you are asking? This might be just an initial question that leads to more in-depth analysis.

○ Example - We looked at the frequency of posts but noticed that the frequency varies between two cities of the same size in the same time zone. We then looked at demographic information to see if there was a correlation.

● Two or more data sources - Projects should contain 2 or more data sources. One of these “sources” can be an API that translates results.

○ Example - We took each post containing a word in our keyword list and sent it to an API that categorized its popularity score.
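
The ETL example above can be sketched roughly as follows. This is only an illustration of a repeatable "raw HTML files in, JSON records out" step, not the actual parseHTML.py: the names StoryParser, parse_story, and parse_directory, the choice of the <title> and <p> tags, and the source/title/body fields are all assumptions made for this sketch. It uses only the Python standard library.

```python
import json
from html.parser import HTMLParser
from pathlib import Path


class StoryParser(HTMLParser):
    """Collect the <title> text and the text of every <p> in one HTML page."""

    def __init__(self):
        super().__init__()
        self._tag = None        # "title" or "p" while inside one of those tags
        self.title = ""
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "p"):
            self._tag = tag
            if tag == "p":
                self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == self._tag:
            self._tag = None

    def handle_data(self, data):
        if self._tag == "title":
            self.title += data
        elif self._tag == "p":
            # Text inside nested inline tags (<b>, <a>, ...) is kept as well.
            self.paragraphs[-1] += data


def parse_story(html, source):
    """Turn one raw HTML string into a JSON-serializable record (assumed schema)."""
    parser = StoryParser()
    parser.feed(html)
    parser.close()
    body = "\n".join(p.strip() for p in parser.paragraphs if p.strip())
    return {"source": source, "title": parser.title.strip(), "body": body}


def parse_directory(raw_dir, out_path):
    """Parse every *.html file in raw_dir and write one JSON object per line."""
    records = [
        parse_story(path.read_text(encoding="utf-8"), path.name)
        for path in sorted(Path(raw_dir).glob("*.html"))
    ]
    with open(out_path, "w", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record, ensure_ascii=False) + "\n")
    return len(records)
```

Keeping the raw HTML on disk and regenerating the JSON from it means the parsing step can be rerun whenever the schema changes, which is one way to make the pipeline repeatable.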

Waiver of requirements

Some of these requirements can be waived for projects that contain:

● Custom data visualization

● Unique or novel analysis

● UI Application

Included code appendix

All students must supply an appendix of APIs and code they have used.

Deliverable content

● Summary / initial question

● Outline - general approach

○ For multi-step approaches, use diagrams to describe the data flow.

● Python code

● Figures

● Results

● Code / API appendix


https://docs.google.com/document/u/0/d/1xkoDmsR4IWFpc9iKphMEzhLrNCknRl35B7rHdGn3U50/mobilebasic?pli=1 1/15/20, 10:53


Here are a few data sources and APIs to consider:

● Files

○ Reddit Comments - 1 month

○ Reddit Comments - 1 year (TBD)

○ Taxi Trips (see me for complete set)

○ Taxi Fares

○ NYS Data

○ NYC Data

● APIs

○ Google Places API

○ Google Geolocation API

○ Forecast.io weather API

○ Energy data - bulk
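
As a minimal sketch of combining two of the file sources listed above, the code below joins a trips file to a fares file on a shared key. The column names (trip_id, distance_miles, fare_amount) are hypothetical and chosen only for illustration; the real files may use different columns and may need a composite key.

```python
import csv
import io


def join_sources(trips_file, fares_file, key="trip_id"):
    """Inner-join two CSV sources on `key` (a hypothetical shared column)."""
    trips = list(csv.DictReader(trips_file))
    # Index the second source by key so each lookup is a dictionary access.
    fares = {row[key]: row for row in csv.DictReader(fares_file)}
    joined = []
    for trip in trips:
        fare = fares.get(trip[key])
        if fare is not None:  # drop trips that have no matching fare record
            joined.append({**trip, **fare})
    return joined


if __name__ == "__main__":
    # Tiny in-memory stand-ins for the two files; values are made up.
    trips_csv = io.StringIO("trip_id,distance_miles\n1,2.5\n2,0.8\n3,5.1\n")
    fares_csv = io.StringIO("trip_id,fare_amount\n1,11.00\n3,19.50\n")
    for row in join_sources(trips_csv, fares_csv):
        print(row)
```

Because csv.DictReader reads every value as a string, numeric columns would still need to be converted before any analysis.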

Deliverable format

The project should be delivered as a PDF including images, figures, code snapshots, etc. If your project requires another content type, please consult me.
