联系方式

  • QQ:99515681
  • 邮箱:99515681@qq.com
  • 工作时间:8:00-23:00
  • 微信:codinghelp

您当前位置:首页 >> Python编程Python编程

日期:2023-02-19 10:48

Assignment #1

Due: Monday, Feb. 20, 11:59pm, 2023


Introduction


The assignment will be done in groups of up to 6 students. Please submit 1 single report, no

more than 20 pages, double spaced. Report should be concise. Additional tables and

figures can be included in the appendix but cannot be guaranteed that it will be read.


The assignment will use both Tableau and AET. The data file is available on Moodle. It is

divided into 6 main parts – exploratory and descriptive analysist, market basket analysis,

assortment planning, RFM & Clustering, Predictions, and Price Elasticity. Read the

assignment in advance. The time required on each part is not equal.


The assignment assumes basic knowledge of Tableau. There are a number of video tutorials

online to help you get started. You will also likely need to learn more advance functions like

how to join tables in Tableau. One tutorial here https://youtu.be/Xbg2WabUIKI .


Answer the following questions.


Part 1 – Exploratory data analysis


1. [10 points] Perform descriptive analytics of the orders. Specifically, plot the

distributions for number of orders, quantity and unit price over time. Note that you

will need to create a new variable for unit price which is the Sales divided by

Quantity. Describe what you see. Are there any purchasing patterns over time like

holidays?

2. [15 points] For a particular product, is there variance in unit price? Which product

has the biggest variance in unit price? How does the unit price vary over time? Can

you show a few examples of products which varying a lot over time? Which

products vary little over time? Which subcategory has the largest variance? Can you

explain why some of the products vary in unit price? Why do you think it happens?


Part 2 – market basket analysis

3. [25 points] Perform market basket analysis at the sub-category level. You do not

need to compute the support, lift and confidence. Rather, a simple quantity count is

sufficient. What sub categories are purchased the most together?

4. [25 points] Now, conduct a market basket analysis at the product level for the top 10

selling products. What do you find? Are these high or low priced products?


Part 3 – Assortment Planning

5. [25 points] For the bakery example done in class, what is the estimated demand for

Fudge, Family Size, Tiny Tina? What was the actual substitution effect with Fudge,

Family Size, Yummy Cake? Show your work.



Part 4 – RFM & Clustering


1. [5 points] Generate the RFM variables using the superstore data. Remember, you

will need to group by customer id first. Then, assign a score of 1 for the “best”

customer for the top 20 percentile of customers for that dimension. Show your and

procedure how you did this.

2. [20 points] Once you have created a worksheet with proper variables and

visualization, you can export the worksheet data using the “Crosstab to Excel” in the

“Worksheet” menu item. Make sure that the rows and columns in the exported files

are in the correct. Open the file in Excel to check that it is what you want. Then,

export the file to csv, and import into the AET clustering module.

a. Make plots for R&F, F&M, and M&R. Perform clustering on these plots. The

AET shows What do you choose for the number of clusters? How and why

did you come up with those values?

b. Now, we will cluster on all 3 RFM dimensions. Use Step 3 in the AET. How

many clusters did you choose? Why?

c. Now, repeat part (b) with more variables from Tableau. Which variables do

you add and why?


Part 5 - Prediction

3. [5 points] Now we will predict customers who are likely to return items. In Tableau,

join the order table with the returns table. Do a cross tabulation or “group by”

Customer ID, and count how many returns each customer does. Check that your

table join includes customers who had no returns. Now perform a “slice-and-dice”

by visualizing the number of returns by each group of R, F, M like in lecture for the

campaign response rates.

4. [20 points] Now export your worksheet data like in part 4, into excel, then into csv.

Upload the csv file into the Decision Tree module of the AET. Note that for decision

tree, the value can only be 0 or 1. You will need to convert your returns count into 0

if they have never returned an item, and 1 if they have returned 1 or more. In the

Decision Tree module, select your new returns binary variable as the “target”, and

one more more for the features. Plot the decision tree1. What do you see? Explain

the outcomes. What other variables can you add?


1 For explanation of the results and plots, see “Results Explained” below. Note that the AET reports the

“value” as a proportion, not a count as mentioned in the website:

https://www.w3schools.com/python/python_ml_decision_tree.asp

5. [25 points] Now we will try to use the Regression Module to predict the returns. You

can use the total returns count. What independent variables do you use? Which

ones are statistically significant? What other variables can you add?


Part 6 – Price Elasticity


6. [25 points] For this part, we will estimate the price elasticity of each sub-category

from the superstore dataset. Prepare your data by creating separate csv files for

each sub-categories, and each row should be aggregated by product. Product the

average unit price for each product, and total units (quantity) sold. Remember to

also produce the log of these values. You will need to do regression analysis. There

are several options to do so, and you can choose which you are most familiar with:

a. Tableau. You can use the Analytics and trend line function. A quick tutorial

https://youtu.be/lc76P-C205o

b. AET. Import the csv into the Regression Module of the AET and compute the

regression on the log values.

c. R. If you are familiar with R, you can use


What are the price elasticities for each of the sub-categories? Can you explain what

you see? Does it make sense?


版权所有:留学生编程辅导网 2020 All Rights Reserved 联系方式:QQ:99515681 微信:codinghelp 电子信箱:99515681@qq.com
免责声明:本站部分内容从网络整理而来,只供参考!如有版权问题可联系本站删除。 站长地图

python代写
微信客服:codinghelp