Date: 2019-03-27 10:19

Analysis of FinTech Data

Assignment #3

Download the data from the following link.

http://www.dropbox.com/sh/62m5jr0t4vpbeyp/AAASXqr3ZUlC71b3FYDxmLJJa?dl=0

NOTE: Use all available resources to solve the problems. You can find solutions to most of the coding problems on the Internet. Google it if you get stuck.

Q1. Load the data into R. How many variables and observations are in the data?

Q2. How many applicants are currently employed? Among the currently employed, how many are self-employed?

Q3. What is the average monthly income of the whole sample? What is the average monthly income of the currently employed?

Q4. Generate the histogram of “loan_amount.” Can you find any interesting patterns in the graph? Can you guess why the graph has this shape?

Q5. Replace the value of “friends_facebook” with NA if the value is 0. What is the average number of Facebook friends among applicants who have a Facebook account?
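The recoding step in Q5 can be sketched on toy data as follows. The data frame `toy` and its values are made up for illustration; only the column name “friends_facebook” comes from the assignment.

```r
# Toy data standing in for the real file; the values are invented.
toy <- data.frame(friends_facebook = c(0, 150, 300, 0, 450))

# Treat 0 as "no Facebook account" and recode it to NA.
# which() skips entries that are already NA instead of indexing with them.
toy$friends_facebook[which(toy$friends_facebook == 0)] <- NA

# na.rm = TRUE averages only over applicants who have an account.
avg_friends <- mean(toy$friends_facebook, na.rm = TRUE)
print(avg_friends)
```

Without `na.rm = TRUE`, `mean()` would return NA as soon as one value is missing.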

Q6. Generate the scatterplot of “month_of_service” and “credit_score”. Can you find any relationship between them? What about “monthly_income” and “credit_score”? Confirm the relationships with correlation tests.
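A minimal sketch of the scatterplot and correlation test in Q6, assuming a Pearson correlation is what is wanted. The six toy observations are invented; only the column names come from the assignment.

```r
# Toy data; the numbers are invented for illustration.
toy <- data.frame(
  month_of_service = c(1, 5, 10, 20, 30, 40),
  credit_score     = c(520, 560, 600, 640, 700, 690)
)

# Scatterplot of the two variables (x first, then y).
plot(toy$month_of_service, toy$credit_score,
     xlab = "month_of_service", ylab = "credit_score")

# cor.test() defaults to Pearson's product-moment correlation.
ct <- cor.test(toy$month_of_service, toy$credit_score)
print(ct$estimate)  # correlation coefficient
print(ct$p.value)   # p-value of the test that the correlation is zero
```

The same two calls apply to “monthly_income” and “credit_score”. If the relationship looks monotonic but not linear, `method = "spearman"` is an alternative worth considering.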

Q7. Make a new variable, named “automatic_approved,” which has the value “t” if approved by the decision engine, “f” if rejected by the decision engine, and “NA” if reviewed manually. How many cases are approved or rejected by their decision engine? How many are classified as “manual review”?
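One way to build the variable in Q7 is sketched below. The assignment does not name the column that holds the decision engine's outcome, so the column `decision` and its labels ("approved", "rejected", "manual review") are hypothetical and should be replaced with whatever the real data uses.

```r
# Toy data; `decision` and its three labels are hypothetical stand-ins.
toy <- data.frame(
  decision = c("approved", "rejected", "manual review",
               "approved", "manual review"),
  stringsAsFactors = FALSE
)

# Start from NA so that manually reviewed cases stay missing.
toy$automatic_approved <- NA_character_
toy$automatic_approved[toy$decision == "approved"] <- "t"
toy$automatic_approved[toy$decision == "rejected"] <- "f"

# useNA = "ifany" keeps the NA (manual review) cases in the count.
counts <- table(toy$automatic_approved, useNA = "ifany")
print(counts)
```

Starting from NA rather than from "f" avoids silently counting manual reviews as rejections.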

Q8. Compare the automatically approved cases and the automatically rejected cases. Conduct t-tests on variables available in the dataset to answer the following subquestions.

1) Are they different in “loan_amount”?

2) Are they different in “tenor”?

3) Are they different in “age”?

4) Are they different in “month_of_service”?

5) Are they different in “residential_status”?

6) Are they different in “monthly_income”?

7) Are they different in “bankrupted”?

8) Are they different in “currently_employed”?

9) Are they different in “channel”?

10) Are they different in “language”?

11) Are they different in “credit_score”?

12) Are they different in “friends_facebook”?

13) Are they different in “location_application”?
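The comparison in Q8 can be sketched for one variable as below, assuming “automatic_approved” was built as in Q7. The loan amounts are invented. Note that `t.test()` runs Welch's unequal-variance test by default, which is a choice of this sketch and not something the assignment specifies.

```r
# Toy data; the loan amounts are invented for illustration.
toy <- data.frame(
  automatic_approved = c("t", "t", "t", "t", "f", "f", "f", "f", NA),
  loan_amount        = c(100, 120, 110, 130, 200, 220, 210, 230, 150)
)

# The formula interface splits loan_amount by group; the row with
# NA in automatic_approved (manual review) is dropped automatically.
result <- t.test(loan_amount ~ automatic_approved, data = toy)
print(result$estimate)  # group means
print(result$p.value)   # small value suggests the groups differ
```

The same call works for the other numeric variables by changing the left-hand side of the formula. Non-numeric variables would need to be converted to a numeric indicator first.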

Q9. Make a new variable, named “automatic_approved_dummy,” which has the value of 1 if automatic_approved = t, and 0 otherwise. Develop a regression model for approval by the decision engine using the DV of “automatic_approved_dummy.” Include all relevant independent variables in the model.
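A minimal sketch of Q9 follows. The assignment only says "regression model"; using logistic regression for the binary DV is an assumption of this sketch, and a single predictor is used only to keep the toy example short. The data values are invented.

```r
# Toy data; the values are invented for illustration.
toy <- data.frame(
  automatic_approved = c("f", "f", "t", "f", "t", "t", "t", NA),
  credit_score       = c(500, 550, 600, 650, 700, 750, 800, 620)
)

# 1 if automatic_approved is "t", 0 otherwise (including NA, as Q9 states).
toy$automatic_approved_dummy <- as.integer(
  !is.na(toy$automatic_approved) & toy$automatic_approved == "t"
)

# Logistic regression (assumed model choice) of approval on credit_score.
fit <- glm(automatic_approved_dummy ~ credit_score,
           data = toy, family = binomial)
print(summary(fit)$coefficients)
```

With the real data, further predictors would be added on the right-hand side of the formula, separated by `+`.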

Q10. Based on the analysis results above, describe the logic the decision engine uses to decide “approve”.

Q11. Develop the best classification model to reduce their manual jobs. Which classification models will you choose? What are the sensitivity and specificity of your model? Provide a table that contains the sensitivity and specificity of your models.
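Sensitivity and specificity, as asked for in Q11, can be computed from predicted and actual labels as sketched below. The two label vectors are invented, and "toy classifier" is a placeholder name; 1 is taken to mean "approve".

```r
# Toy labels; 1 = approve, 0 = reject. Values are invented.
actual    <- c(1, 1, 1, 0, 0, 0, 0, 1)
predicted <- c(1, 0, 1, 0, 0, 1, 0, 1)

tp <- sum(predicted == 1 & actual == 1)  # true positives
fn <- sum(predicted == 0 & actual == 1)  # false negatives
tn <- sum(predicted == 0 & actual == 0)  # true negatives
fp <- sum(predicted == 1 & actual == 0)  # false positives

sensitivity <- tp / (tp + fn)  # share of true approvals that were caught
specificity <- tn / (tn + fp)  # share of true rejections that were caught

# One row per model; more rows can be appended with rbind().
results <- data.frame(model = "toy classifier",
                      sensitivity = sensitivity,
                      specificity = specificity)
print(results)
```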

Q12. Given that your classification model is not perfect, the managers are concerned that a new decision engine based on your classification model may accept applications that should be rejected, or reject applications that should be accepted. What is your suggestion to address their concerns?

Guideline for Assignment 3:

Submit the answer sheet and the R code used for the analysis to the course Blackboard. Please include your student number and name in the header of the document. Format your answer sheet as follows: Times New Roman, 12-point font, double-spaced only (not 1.5), 1-inch margins all around, on 8.5 x 11-inch paper (or A4). Your answer sheet should not exceed two A4 pages.

