Date: 2020-04-16 11:12

Data Bootcamp Problem Set 6 Spring 2020

Instructions:

• After completing the assignment, submit your .ipynb file to NYU Classes using the following naming convention: Lastname_Firstname_NetID_ProblemSet# (e.g., Smith_John_js123_ProblemSet2).
• Submit your answers in a Jupyter notebook, using Markdown cells to indicate problem numbers.
• Write each question in Markdown before you provide your answer.
• When copying the dictionary or any values directly from this file, make sure that all quotation marks and brackets are in the right form in Jupyter Notebook. (This matters especially for strings: text copied directly from a PDF file can carry broken quotation marks, so it won't be read as a string in Jupyter Notebook.)
• See the Grading Guidelines under Problem Set 1 Instructions on NYU Classes.
• Before starting the problems, import all_data_master.csv and replace all \N values with NaN. Name this data frame "all_data".
• For problems 1 to 6, use all_data; do not change this data frame at any point.
• For problems that ask you to order by a variable, always use ascending order unless stated otherwise.
• For problem 6, the overall median is the median of all salaries in all_data.
• For problems 7 and 8, import the CSV files core_data and salary_grid into data frames employee and salary, respectively. From employee, drop rows where all fields are null. (Carries credit.)
• In-line comments are preferred for this assignment but not mandatory.
• No explanations are expected at the end of answers unless requested.
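
The two data-preparation steps above (reading \N as NaN, and dropping rows where every field is null) could be sketched roughly as follows. This is only an illustration on toy data: the column names, the values, and the in-memory CSV are assumptions standing in for the real files, whose layout is not shown here.

```python
import io

import pandas as pd

# Toy stand-in for all_data_master.csv; column names and values are hypothetical.
raw_csv = io.StringIO(
    "job_id,company,state,salary\n"
    "1,Acme,NY,50000\n"
    "2,\\N,TX,\\N\n"
    "3,Globex,\\N,72000\n"
)

# Treat the literal two-character string \N as missing while reading,
# so numeric columns such as salary are parsed as floats with NaN.
all_data = pd.read_csv(raw_csv, na_values=["\\N"])

# Toy stand-in for the employee table; the middle row is entirely null.
employee_raw = pd.DataFrame(
    {"name": ["Ann", None, "Bob"], "state": ["MA", None, None]}
)

# how="all" drops a row only when every field in it is null,
# so Bob's row (one missing field) is kept.
employee = employee_raw.dropna(how="all")
```

With the real file, the in-memory buffer would be replaced by the path to all_data_master.csv. Passing `na_values` at read time is one option; replacing the string after reading also works, but then columns that contained \N are read as text and would need converting back to numbers.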

Problems:

1. Display the total number of job postings in each year. Print the year that had the most jobs. Plot a simple line graph to see whether jobs rise with each passing year.
2. Display the mean salary per year for the company Wells Fargo in a single data frame (company, year, mean_salary). Plot a graph to determine whether Wells Fargo mean salaries rise with each passing year.
3. Display the standard deviation of salaries for the states AZ, TX and DC in descending order. Then visualize this data in a bar chart.
4. Display all_data without the states that have fewer than 1000 job postings. The final data frame must include all the columns of the original data frame.
5. For each state, find the company that posted the job with the highest salary (among all job postings in that state alone). The final data frame must have the columns job_id, company, salary. There will be only one record per state.
6. Display all_data without the companies whose highest salary was lower than the overall median. The final data frame must include all columns of the original data frame.
7. Get salary information for all employees. Display the employee name, state, age, position and Hourly_Max salary offered.
8. Who are the top 20 highest-paid employees based on the Hourly_Max salary column? Print the percentage of the top 20 employees that fully meet their performance score.
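
Several of the problems above rest on a few pandas grouping and joining patterns. The sketch below shows those patterns on a tiny toy table and is not a solution to the assignment: the data, the threshold of 2, and the use of `position` as the join key between the two employee tables are all assumptions made for illustration.

```python
import pandas as pd

# Toy postings table; column names and values are hypothetical.
postings = pd.DataFrame(
    {
        "job_id": [1, 2, 3, 4, 5],
        "company": ["Acme", "Globex", "Acme", "Initech", "Globex"],
        "state": ["NY", "NY", "TX", "TX", "DC"],
        "salary": [50000.0, 80000.0, 60000.0, 45000.0, 40000.0],
    }
)

# Keep only groups with enough rows; every original column is preserved.
min_postings = 2  # assumption for the toy table; the assignment uses 1000
big_states = postings.groupby("state").filter(lambda g: len(g) >= min_postings)

# One row per state: idxmax gives the index label of the top salary in each group.
top_rows = postings.loc[postings.groupby("state")["salary"].idxmax()]
top_per_state = top_rows[["job_id", "company", "salary"]]

# Broadcast each company's highest salary to its rows, then compare it
# with the median of all salaries to decide which rows to keep.
overall_median = postings["salary"].median()
company_max = postings.groupby("company")["salary"].transform("max")
kept = postings[company_max >= overall_median]

# Joining two tables; "position" as the shared key is an assumption.
employee = pd.DataFrame(
    {"name": ["Ann", "Bob"], "position": ["Analyst", "Engineer"]}
)
salary = pd.DataFrame(
    {"position": ["Analyst", "Engineer"], "Hourly_Max": [30.0, 45.0]}
)
merged = employee.merge(salary, on="position", how="left")

# The n rows with the largest values in a column (n=1 here, 20 in problem 8).
top_paid = merged.nlargest(1, "Hourly_Max")
```

`filter` and `transform` both return rows from the original frame, which is what keeps all columns intact, whereas an aggregation such as `max()` would collapse each group to one value. When two rows tie for the highest salary, `idxmax` returns the first one it encounters.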

