Date: 2023-05-25 08:43

Advanced (Business) Data Analytics – 2023 S1 – Assignment 2

Advanced (Business) Data Analytics

ASSIGNMENT 2

Summary

• Type: Project report, individual assignment
• Deliverable:
o Report in the form of a Python notebook (.ipynb file) only (you need to use the provided A2 template)

The aim is to provide experience in the steps involved in text preparation, text feature generation, topic modeling and profiling, and using the text features in model building and evaluation. Feel free to discuss concepts and ideas with peers, but remember that your submission must be your own work. Be careful not to allow anyone to copy your work. You will need to research text analytics and Python functionality if you aim to achieve excellent marks.

Specification

The focus of this ML project is to predict the discharge decision after hospitalization. The hospital environment is challenging because the severity of each patient's illness evolves constantly and multiple independent measurement devices often produce conflicting and false alarms, which negatively impacts the quality of care. Previous work on discharge decision models aimed to consolidate data from these devices and transform the information streams into knowledge, but this approach overlooked a valuable source of medical information: free-text clinical notes and reports.

Clinical notes provide health care staff with a quick overview of the most important aspects of a patient's health conditions. Integrating features extracted from these notes with standard health measurements yields a more comprehensive representation of a patient's health state, resulting in improved outcome prediction. However, free-text data is challenging to incorporate into predictive models due to its lack of structure. To overcome this challenge, latent variable models such as topic models can be used to infer intermediary representations that can be used as structured features for a prediction task.

In this project, you aim to demonstrate the value of incorporating information from clinical notes, via latent topic features, in predicting the discharge decision after hospitalization.

Dataset

The A2 dataset consists of clinical notes along with some structured data. It uses hospitalization data, which includes electronic medical records (EMRs) for patients. It includes patients’ information and their health metrics along with clinical notes. In this data, the discharge decision after hospitalization indicates which patients died in hospital, required extended care, etc.

Your task is to
• prepare text,
• generate text features,
• apply topic modeling & generate topic profiles, and
• develop a predictive model & evaluate it.

You will need to analyze the clinical notes, extract their features, and then develop and evaluate predictive model(s).
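The first two tasks, text preparation and text feature generation, can be sketched with the standard library alone. This is only one possible approach: the stop-word list, the function names `prepare` and `tfidf`, and the two example notes are all assumptions made for illustration, and the TF-IDF weighting shown is the plain textbook variant rather than any particular library's formula.

```python
import math
import re
from collections import Counter

# A tiny illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "is", "was", "with", "for", "in", "on"}


def prepare(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    weight = (term count / document length) * log(n_docs / document frequency)
    """
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    features = []
    for doc in docs:
        counts = Counter(doc)
        features.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return features


# Hypothetical example notes, not taken from the A2 dataset.
notes = [
    "Patient was admitted with chest pain.",
    "Chest pain resolved; patient is stable for discharge.",
]
tokenized = [prepare(note) for note in notes]
features = tfidf(tokenized)
print(tokenized[0])
print(features[0])
```

Terms that occur in every note receive a weight of zero under this weighting, which is why TF-IDF tends to highlight the words that distinguish one note from another.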

Note: In your final notebook, you should use only one classification technique (e.g., SVM) along with 6-fold cross-validation to show that the extracted features can predict the target variable.
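A minimal sketch of evaluating a single classifier with 6-fold cross-validation is given below, using scikit-learn. The feature matrix is synthetic random data standing in for the real extracted features, the labels are binary for simplicity (the actual target may have more classes), and the SVM settings, the scaling step, and the stratified shuffling are illustrative choices rather than requirements of the assignment.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for the real feature matrix (e.g. topic proportions
# plus structured health metrics). Two classes with shifted means.
n_per_class = 30
X = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(n_per_class, 5)),
    rng.normal(loc=1.5, scale=1.0, size=(n_per_class, 5)),
])
y = np.array([0] * n_per_class + [1] * n_per_class)

# One classification technique only: an SVM, with scaling fitted inside
# each training fold so that no information leaks from the held-out fold.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))

# 6 folds, stratified so each fold keeps roughly the same class balance.
cv = StratifiedKFold(n_splits=6, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")

print("accuracy per fold:", scores.round(3))
print("mean accuracy:", scores.mean().round(3))
```

Reporting the per-fold scores together with their mean shows how stable the prediction is across folds, not just how good it is on average.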

Deliverables

A notebook template is provided to show how you can structure your work. You need to use the template and strictly follow its format, which is designed based on the provided A2 rubric.

It is useful to add brief comments next to your code to explain it. Text analytics can be challenging, and you will need to do your own research. Your report should be delivered in the .ipynb file.

You will get higher marks if your approach is innovative. For a method to be innovative, you need to customize it to the provided context and elaborate on it in terms of that context. Usually a novel method is unique: no one else, or only a few others, have used it, and then with some differences. It is highly advised that you do not share your creative work with anyone else. You can still discuss ideas and help each other.

Submission

Submission is to be done through Blackboard Assignment Submission, as indicated in Learn.UQ. The only acceptable submission format is an .ipynb file. The file should be named in the format YourStudentID.ipynb.

You need to submit only one .ipynb file. Before submission, make sure that all the important outputs are shown in your notebook. Avoid showing trivial outputs such as df1, df2, etc. in the notebook, so make sure to remove code such as head() before submission.

Note: Your marker will first look at your generated output as a reference without running your notebook. So your significant outputs should be generated and the elaboration should be provided in the notebook, as shown in the template.

Then your marker will use the “Restart & Run All” option from the Kernel tab. If there is an error when running your notebook, you will not receive any marks for the parts after the cell that returns the error.

