Assignment 5
Due: 3/6
Note: Show all your work.
Problem 1 (10 points) Consider the following confusion matrix.
                 predicted class
actual class     C1      C2
C1               628     137
C2               59      394
Note: C1 is positive and C2 is negative.
Compute sensitivity, specificity, precision, accuracy, F-measure, and F2.
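The definitions behind these measures can be sketched in a few lines of Python. This is only an illustrative helper, not part of the assignment: the function name confusion_metrics and the example counts are hypothetical, and it assumes that "F-measure" means the F-beta score with beta = 1 and "F2" means beta = 2. With rows as the actual class and C1 as the positive class, the first row of the matrix supplies TP and FN, and the second row supplies FP and TN.

```python
def confusion_metrics(tp, fn, fp, tn, beta=1.0):
    """Return standard metrics for a binary confusion matrix.

    tp, fn: actual positives predicted positive / negative
    fp, tn: actual negatives predicted positive / negative
    beta:   weight of recall relative to precision in the F-beta score
    """
    sensitivity = tp / (tp + fn)            # recall, true positive rate
    specificity = tn / (tn + fp)            # true negative rate
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    b2 = beta ** 2
    f_beta = (1 + b2) * precision * sensitivity / (b2 * precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f_beta": f_beta,
    }


# Hypothetical counts, chosen only to illustrate the call.
example_f1 = confusion_metrics(tp=8, fn=2, fp=4, tn=6)
example_f2 = confusion_metrics(tp=8, fn=2, fp=4, tn=6, beta=2.0)
print(example_f1)
print(example_f2["f_beta"])
```

A hand calculation should still show each formula with the counts substituted, since the assignment asks for all work.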
Problem 2 (10 points) Suppose you built two classifier models M1 and M2 from the
same training dataset and tested them on the same test dataset using 10-fold cross-validation.
The error rates obtained over 10 iterations (in each iteration the same
training and test partitions were used for both M1 and M2) are given in the table
below. Determine whether there is a significant difference between the two models
using the statistical method discussed in Section 6 of the online lecture Module 4 (also
in Section 8.5.5, pp. 372-373 of the textbook). Use a significance level of 1%. If there
is a significant difference, which model is better?
Iteration   M1      M2
1           0.21    0.13
2           0.12    0.10
3           0.09    0.20
4           0.15    0.20
5           0.03    0.15
6           0.07    0.05
7           0.13    0.14
8           0.14    0.21
9           0.05    0.23
10          0.14    0.17
Note: When you calculate var(M1 – M2), calculate a sample variance (not a
population variance).
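The test statistic can be sketched in Python as follows. This is a minimal illustration that assumes the method in question is the paired t-test on per-fold error rates, t = mean(d) / sqrt(var(d) / k), where d is the list of per-iteration differences and k is the number of folds. The function name paired_t_statistic and the sample numbers are hypothetical. statistics.variance computes the sample variance (divisor k - 1), which matches the note above.

```python
from math import sqrt
from statistics import mean, variance


def paired_t_statistic(err_a, err_b):
    """Paired t statistic for two models evaluated on the same k folds."""
    if len(err_a) != len(err_b) or len(err_a) < 2:
        raise ValueError("need two equally long lists with at least 2 folds")
    # Per-fold differences in error rate (model A minus model B).
    diffs = [a - b for a, b in zip(err_a, err_b)]
    k = len(diffs)
    # statistics.variance divides by k - 1 (sample variance).
    # Raises ZeroDivisionError if all differences are identical.
    return mean(diffs) / sqrt(variance(diffs) / k)


# Hypothetical error rates over 3 folds, for illustration only.
t = paired_t_statistic([0.2, 0.1, 0.3], [0.1, 0.1, 0.1])
print(t)
```

The absolute value of t is then compared with the critical value from a t-distribution table for k - 1 degrees of freedom at the chosen significance level. If the difference is significant, the sign of the mean difference shows which model has the lower average error.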
Problem 3 (20 points). For this problem, you are required to run, in Weka, the Naive
Bayes, J48, SimpleLogistic, RandomForest, neural network (Multilayer Perceptron),
and OneR classification algorithms on the german-bank.arff dataset and compare the
performance of the models built by these six algorithms. Make sure that you select
“Cross-validation” for “Test options.” If you had to choose one model, which one
would you choose and why? Note that the neural network algorithm takes longer
to run than the other algorithms.