Date: 2023-03-16 08:57

DATA ANALYSIS AND MACHINE LEARNING 4 – COURSEWORK 2 RUBRIC


Content: Data Analysis (20% of total mark)


Outstanding 100% The student satisfies the criteria for “Excellent” and goes beyond. Their analysis

is extremely detailed and insightful.

Excellent 80% The student concretely establishes the problem of sentiment analysis in

context, and justifies its importance, citing relevant literature. The “Sentiment

Soup” dataset is analysed in depth and clearly summarised using helpful figures

and tables. Attention and detail will be given to how text samples differ by

source, and by sentiment. To perform this analysis the student will draw upon

data analysis techniques. They will use, and demonstrate an understanding of, a

data analysis technique that wasn’t taught on the course.

Good 60% The student tries to establish the problem of sentiment analysis and why it is

important. This will draw on literature, but this may not be particularly relevant.

The “Sentiment Soup” dataset is analysed in some depth and summarised using

figures and tables. There will be a reasonable attempt at analysing how samples

differ by source, and by sentiment. The student will draw upon data analysis

techniques for this summary. They will not consider techniques beyond those

taught on the course, or will use such techniques without demonstrating full

understanding.

Satisfactory 40% The student provides little context for the problem of sentiment analysis. The

“Sentiment Soup” dataset is analysed at a surface level and summarised using

figures and tables. The student won’t consider how samples differ by source or

by sentiment. There will be some use of data analysis techniques although it

will be obvious to the reader that the student doesn’t fully understand how

these techniques work.

Fail 20% The student provides no context for the problem of sentiment analysis. The

“Sentiment Soup” dataset undergoes very little analysis. The dataset is not

summarised using any figures or tables, and any reference to data analysis

techniques is superficial and demonstrates little understanding.

Bad Fail 0% There is little or no content relating to data analysis.
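
As an illustration of what analysing "how text samples differ by source, and by sentiment" could look like at its simplest, the sketch below groups samples and reports a count and mean token length per group. The rubric does not describe the "Sentiment Soup" schema, so the (text, source, sentiment) layout, the sample rows, and the function name summarise_by_group are all assumptions made for this example only.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (text, source, sentiment). The real "Sentiment Soup"
# schema is not given in the rubric, so these fields are assumptions.
SAMPLES = [
    ("great film , loved it", "movie", "positive"),
    ("dull plot and weak acting", "movie", "negative"),
    ("battery died after a week", "product", "negative"),
    ("works well", "product", "positive"),
    ("fast delivery and works perfectly", "product", "positive"),
]


def summarise_by_group(samples):
    """Return {(source, sentiment): (sample count, mean token count)}."""
    lengths = defaultdict(list)
    for text, source, sentiment in samples:
        # Whitespace tokenisation keeps the sketch dependency-free.
        lengths[(source, sentiment)].append(len(text.split()))
    return {
        group: (len(values), mean(values))
        for group, values in lengths.items()
    }


if __name__ == "__main__":
    for group, (count, avg_len) in sorted(summarise_by_group(SAMPLES).items()):
        print(group, count, round(avg_len, 2))
```

A table like this is only a starting point; a report aiming at the higher bands would be expected to go further, for example by comparing vocabulary or applying a technique from outside the course.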


Content: Machine Learning (40% of total mark)


Outstanding 100% The student satisfies the criteria for “Excellent” and goes beyond. Their

experimental setup is flawless, and they have clearly established the extent to

which machine learning can be used for sentiment analysis on “Sentiment

Soup” and have concretely shown how their findings can be deployed

elsewhere.

Excellent 80% The student constructs a set of appropriate, clearly described classification tasks

using “Sentiment Soup”. There is a strict separation of training and test data for

each task with no leakage. The student explores the performance of different

models on these tasks using held-out validation data, or through cross-

validation, before evaluating a final chosen model on test data. The student will

explore the interpretability of their chosen models where possible. The student

considers, and demonstrates an understanding of, a classification model that

was not taught on the course. The student will compare several different vector

representations for text and establish which parts of the classification pipeline

have the largest effect on performance. They will show an awareness of how

different performance criteria may be more suitable for certain tasks. The

student examines the effectiveness of their chosen models on relevant external

data that they have sourced.

Good 60% The student constructs a set of appropriate tasks using the “Sentiment Soup”

dataset which are reasonably well described. The student makes a concerted

effort to separate training and test data but there may be some unintentional

leakage. The student explores the performance of different models on these

tasks using held-out validation data, or through cross-validation. There will be

some consideration of the interpretability of their chosen models. The student

will attempt to use a classification model that was not taught on the course but

there won’t be evidence of clear understanding. The student will consider a few

different vector representations of text but this analysis may not be particularly

in-depth and the student may not identify how important this part of the

classification pipeline is relative to the model used. They won’t consider performance

criteria beyond accuracy. The student will apply their models to some external

data they have sourced.

Satisfactory 40% The student constructs a set of classification tasks using the “Sentiment Soup”

dataset. Some of these tasks might not be appropriate, and the descriptions of

the tasks may be lacking detail. There is some effort to measure generalisation

by separating training and test data although this won’t be strictly enforced. The

student explores the performance of a few models on these tasks, but

evaluation may be performed directly on the test set, leading to unintentional

overfitting. The student won’t comment on the interpretability of their chosen

models. They won’t examine any representations beyond Bag-of-words, and

won’t consider performance criteria beyond accuracy. The student will not

apply their models to any external data.

Fail 20% The student tries to construct some classification tasks from “Sentiment Soup”

but these are not appropriate. The experiments conducted are substantially

flawed such that there is no meaningful way of measuring the generalisation

performance of any classification model.

Bad Fail 0% There is little or no content relating to machine learning.
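
The criteria above on strict train/test separation, cross-validation, and performance criteria beyond accuracy can be sketched in a few lines. This is a minimal standard-library illustration, not a recommended pipeline: the word-count classifier is a toy stand-in for a real model, and the (text, label) row layout, the sample rows, and all function names are assumptions introduced for this example.

```python
import random
from collections import Counter


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle once with a fixed seed and set the test rows aside."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def k_folds(rows, k):
    """Yield (train, validation) pairs that partition `rows` into k folds."""
    for i in range(k):
        validation = rows[i::k]
        train = [row for j, row in enumerate(rows) if j % k != i]
        yield train, validation


def fit_word_counts(train_rows):
    """Fit per-class word counts on the training rows only (no leakage)."""
    counts = {}
    for text, label in train_rows:
        counts.setdefault(label, Counter()).update(text.split())
    return counts


def predict(counts, text):
    """Toy classifier: pick the class whose training words match most often."""
    scores = {
        label: sum(words[token] for token in text.split())
        for label, words in counts.items()
    }
    # Sorting first makes ties resolve to the alphabetically first label.
    return max(sorted(scores), key=scores.get)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1; exposes weak minority-class recall."""
    f1_scores = []
    for label in sorted(set(y_true)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denominator = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(f1_scores) / len(f1_scores)


if __name__ == "__main__":
    # Hypothetical labelled rows; not taken from "Sentiment Soup".
    rows = [
        ("loved it great fun", "pos"), ("great acting", "pos"),
        ("really fun and great", "pos"), ("loved the ending", "pos"),
        ("fun from start to end", "pos"), ("awful and dull", "neg"),
        ("dull plot", "neg"), ("awful waste of time", "neg"),
        ("boring and dull", "neg"), ("a boring waste", "neg"),
    ]
    train, test = train_test_split(rows)

    # Model selection uses validation folds drawn from the training rows only.
    for fold_train, fold_val in k_folds(train, k=3):
        model = fit_word_counts(fold_train)
        preds = [predict(model, text) for text, _ in fold_val]
        labels = [label for _, label in fold_val]
        print("validation", accuracy(labels, preds), macro_f1(labels, preds))

    # The test rows are scored once, after all choices have been made.
    final_model = fit_word_counts(train)
    test_preds = [predict(final_model, text) for text, _ in test]
    test_labels = [label for _, label in test]
    print("test", accuracy(test_labels, test_preds))
```

The point of the structure is that anything learned from data, including the vocabulary, is fitted inside each training fold, and that the test rows play no part in choosing between models. Macro-averaged F1 is shown alongside accuracy because accuracy can look healthy on imbalanced labels while one class is never predicted.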


Report (20% of total mark)


Outstanding 100% The report satisfies the criteria for “Excellent” and goes beyond. It is

immaculate, and of a publishable standard.

Excellent 80% The report will be easy for the reader to understand. It will be well-written and

use paragraphing. Sentences will be grammatically correct and contain minimal

typos. The report will be tidy and well-formatted. It will be partitioned into

sections with clear titles and begin with an abstract. It will have a clear

narrative. The report will contain high-quality figures and tables with detailed

captions that are clearly referenced in the text. It will have a tidy, consistent

bibliography that is referenced by the main text. The report will be of the

correct length. Overall, it will be aesthetically pleasing.

Good 60% The report will be straightforward for the reader to understand. Writing will be

largely clear, but there may be some sentences that cause confusion.

Paragraphs will be present. There may be minor grammatical errors, or

excessive typos. The report will be largely tidy but may have minor formatting

issues. The report will be partitioned into sections but may be missing an

abstract. There will be a good attempt at forming a narrative. Figures and tables

will be present, although these may be untidy with short captions. There will be

a bibliography although there may be inconsistent formatting between entries.

The report will be of the correct length.

Satisfactory 40% The writing in the report will get across the essence of what the writer is trying

to convey but will only be of passable quality, and unclear in places.

Paragraphing will not be used effectively. Grammatical errors and typos will be

noticeable and occur quite frequently. The report will look quite messy and

have formatting issues. The report won’t have a clear structure with different

sections and there won’t be a clear narrative. Figures and tables will be present

but these will be messy, low quality, and lack captions. There won’t be a

bibliography, and the report may be slightly over- or under-length.

Fail 20% The report will be of a poor quality, and badly written. In many places it won’t

be clear to the reader what the writer is trying to convey. There will be no

paragraphs, and the report will only consist of walls of text. Grammatical errors

and typos will be commonplace. The report will be messy and poorly formatted

and be lacking in any structure. It will be devoid of a narrative. Figures and

tables will not be present, or be of such a quality that they don’t provide any

value. There won’t be a bibliography and the report may be significantly over-

or under-length.

Bad Fail 0% The report looks unfinished and does not convey anything of value.


Code (20% of total mark)


Outstanding 100% The code satisfies the criteria for “Excellent” and goes beyond. It is of the same

standard as code produced by a professional software engineer.

Excellent 80% The code in the appendix will be clear and easy to read. It will not be in the form

of screenshots taken from a Jupyter notebook or IDE. The code will be high

quality, efficient, and will easily generalise to other text datasets. It will adhere

to the PEP 8 style guide for Python code. The code will be well commented. It

will be presented and structured in such a way that it is very easy for the reader

to understand which part of the code produced which figures and results in the

main report.

Good 60% The code in the appendix will be reasonably clear. It will not be in the form of

screenshots taken from a Jupyter notebook or IDE. It will be relatively high

quality but may contain some inefficiencies. It would take a reasonable amount

of work to make the code generalise to other text datasets. The code will

contain comments although some of these may be unhelpful. The code will be

presented and structured in such a way that it doesn’t take too much effort for

the reader to identify which parts of the code produced which figures and

results in the main report.

Satisfactory 40% The code in the appendix will be readable. However, it may be in the form of

screenshots taken from a Jupyter notebook or IDE such that it isn’t possible for

the reader to isolate the text. The code will work but will be messy and

inefficient. It would take a lot of work to make the code generalise to other text

datasets. Comments will be missing or too sparse to be of any help. It won’t be

clear how different parts of the code produced the figures and results in the

main report.

Fail 20% The code in the appendix will be difficult to read and may be in the form of a

low-quality screenshot. The code will work but will be messy and borderline

indecipherable. It would be more beneficial to start from scratch than try to

modify the code to generalise to other text datasets.

Bad Fail 0% There is no code provided, or the code provided does not work.

