Note: This unit version is currently under review and is subject to change!

DATA3406: Human-in-the-Loop Data Analytics (2019 - Semester 2)


Unit: DATA3406: Human-in-the-Loop Data Analytics (6 CP)
Mode: Normal-Day
On Offer: Yes
Level: Senior
Faculty/School: School of Computer Science
Unit Coordinator/s: Professor Kay, Judy
Session options: Semester 2
Campus: Camperdown/Darlington
Pre-Requisites: DATA2001 AND DATA2002.
Brief Handbook Description: DATA3406, Human-in-the-loop Data Analytics (HILDA), deals with the critical topic of people's involvement in every aspect of data science. People are central to defining the problems that drive the data analysis, and people may be affected by the outcomes, either as decision makers or as those affected by data-driven decisions such as those made by politicians, law makers, teachers ... In addition, it is people who actually do the data analysis, often in analysis teams and as part of larger teams that need the analysis. People own data and are the sources of much of the data that people care most about. Critically, data analysts need to consider the implications of all the technical steps of data engineering (wrangling, cleansing and preparation) that typically account for 50-80% of the time for data analytics projects. It is human data analysts who then use many methods to gain insights from the data; these range from the highly human-centred visual analytic methods to diverse statistical, machine learning and data mining methods.

This subject introduces human-centred methods for all these aspects, from stakeholder analysis and problem definition right to visualisation methods for exploration and reporting. All these are underpinned by study of human aspects of ethics and values, privacy and data management, cognition and perception, management of teams, literate programming and the profoundly difficult task of dealing with and understanding uncertainty.

The practical work is based on literate programming using Python-based collaborative notebooks.

On completion of this unit, students will be able to identify and analyse the humans in data analytics, and will be able to draw upon theory and methods that are human-centred.
Assumed Knowledge: Basic statistics, database management, and programming.
Lecturer/s: Professor Kay, Judy
Timetable: DATA3406 Timetable
Time Commitment:
# | Activity Name | Hours per Week | Sessions per Week | Weeks per Semester
1 | Lecture | 2.00 | 1 | 13
2 | Laboratory | 1.00 | 1 | 13
3 | Independent Study | 3.00 | - | 13
4 | Project Work - own time | 6.00 | - | 12

Learning outcomes are the key abilities and knowledge that will be assessed in this unit. They are listed according to the course goal supported by each. See Assessment Tab for details of how each outcome is assessed.

(6) Communication and Inquiry/ Research (Level 3)
1. Ability to communicate the process used to analyse a large data set, and to justify the methods used in the context of the humans gathering the data and interpreting the analysis.
2. Ability and experience to use interactive visualisation to communicate the thought process behind complex analytical questions.
3. Ability to communicate the results produced by an analysis pipeline, in oral and written form, including meaningful diagrams.
(8) Professional Effectiveness and Ethical Conduct (Level 2)
4. Ability to identify ethical and legal issues that may relate to a data analytics task.
(5) Interdisciplinary, Inclusiveness, Influence (Level 3)
5. Understanding of the diverse roles of humans in the data analysis process.
6. Understanding of the technical issues that are present when data is gathered from or used by humans.
(2) Engineering/ IT Specialisation (Level 3)
7. Experience with appropriate technologies to address the technical issues of human-centred data analysis.
8. Ability to carry out (in guided stages) the whole design and implementation cycle for creating a human-in-the-loop pipeline to analyse a dataset.
(3) Problem Solving and Inventiveness (Level 3)
9. Ability to identify explicit and implicit requirements for carrying out a data analysis task to address specific stakeholder purposes.
(1) Maths/ Science Methods and Tools (Level 3)
10. Ability to select statistical techniques appropriate for modelling uncertainty and bias in data, and ability to justify the choice.
11. Ability to select appropriate techniques for validating their uncertain models, and ability to justify the choice.
Assessment Methods:
# | Name | Group | Weight | Due Week | Outcomes
1 | Assignment 1: HILDA planning report | Yes | 5.00 | Week 5 (Wednesday, 11 pm) | 1, 4, 5, 9
2 | Assignment 2: HILDA project | Yes | 20.00 | Week 12 (Wednesday, 11 pm) | 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
3 | Peerwise questions - individual mark | No | 10.00 | Multiple Weeks (Wednesday, 11 pm) | 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
4 | In-lecture and in-lab activities each week | No | 5.00 | Multiple Weeks (Thursday) | 1, 2, 3, 5, 6, 7, 8, 9, 10, 11
5 | Final Exam | No | 60.00 | Exam Period | 2, 3, 4, 5, 6, 10, 11
Assessment Description: There are two group projects through the semester:

• Assignment 1: HILDA (Human-in-the-loop data analytics) planning report (group) – this has two parts: each group (1) analyses an allocated case study and (2) identifies a data set for use in Assignment 2. Both are reported in a set of slides, which serve as the report and are used in a Week 5 lab presentation.

• Assignment 2: HILDA code, action report and presentation (group) – this involves conducting data analysis using methods studied in class to produce (1) a literate programming notebook that documents the processes and provides intermediate analysis steps; (2) visualisations of the raw data and exploratory analyses; (3) a visual and text presentation of the final results; and (4) a presentation of these in the Week 12 lab.

There is individual grading of work that consolidates the lecture material and provides formative feedback on set preparation for classes as well as work done in class:

• Peerwise questions - each student will create questions for allocated weeks of lecture material and answer a broad set of questions as a core part of learning lecture and lab content.

• In-lecture and in-lab activities each week - there will be class activities in each of the 13 lectures and 12 labs and each will be graded as satisfactory or not. Each is of equal weight.

• Written exam: Final examination covering all materials in lectures, tutorials, laboratories, and assignments.
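
As a rough illustration of the equal weighting of in-lecture and in-lab activities described above, the sketch below works out what each activity could contribute. It assumes the 5% component is split evenly over all 25 activities (13 lectures and 12 labs) and that an activity graded as not satisfactory earns nothing. The function name and the all-or-nothing scoring are assumptions made for illustration, not a statement of the official marking scheme.

```python
from fractions import Fraction

# Figures taken from the assessment table and description above.
COMPONENT_WEIGHT = Fraction(5)  # percent of the unit mark for this component
N_LECTURES = 13
N_LABS = 12
N_ACTIVITIES = N_LECTURES + N_LABS  # 25 activities of equal weight


def activity_component(satisfactory_count: int) -> float:
    """Percent of the unit mark earned from in-class activities.

    Hypothetical helper: assumes each satisfactory activity earns an
    equal share of the 5% and an unsatisfactory one earns zero.
    """
    if not 0 <= satisfactory_count <= N_ACTIVITIES:
        raise ValueError("satisfactory_count must be between 0 and 25")
    return float(COMPONENT_WEIGHT * satisfactory_count / N_ACTIVITIES)


per_activity = float(COMPONENT_WEIGHT / N_ACTIVITIES)
print(per_activity)            # 0.2 (percent of the unit mark per activity)
print(activity_component(20))  # 4.0
```

Under these assumptions, each activity is worth 0.2% of the unit mark, so missing a single class has a small effect while missing many adds up.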

This course will use text-based similarity detecting software (Turnitin) for all text-based written assignments.

Deadlines for assignments are set on the assumption that students may experience minor setbacks caused by sickness, computer breakdown etc. In this context, ‘minor’ means ‘causing a delay of up to three working days’. Extensions will not be granted for minor setbacks. Since the projects are group based, individuals need to complete their contributions in order to earn the group mark.

Late work: In the interests of fairness to all students, the School of Information Technologies policy states that late work cannot be accepted. In exceptional cases, late work must be submitted directly to the unit of study coordinator accompanied by an application for Special Consideration.
Assessment Feedback: The teaching team will provide feedback on the assessment tasks. Assignment results will be published on the course web site. Students are required to check their results. Any errors or omissions must be reported to the unit coordinator, with appropriate evidence, within 5 working days (a week) of publication. After that period, marks are considered confirmed and will not subsequently be altered.
Grading:
Grade Type Description
Standards Based Assessment Final grades in this unit are awarded at levels of HD for High Distinction, DI (previously D) for Distinction, CR for Credit, PS (previously P) for Pass and FA (previously F) for Fail as defined by University of Sydney Assessment Policy. Details of the Assessment Policy are available on the Policies website at http://sydney.edu.au/policies . Standards for grades in individual assessment tasks and the summative method for obtaining a final mark in the unit will be set out in a marking guide supplied by the unit coordinator.
Minimum Pass Requirement It is a policy of the School of Computer Science that in order to pass this unit, a student must achieve at least 40% in the written examination. For subjects without a final exam, the 40% minimum requirement applies to the corresponding major assessment component specified by the lecturer. A student must also achieve an overall final mark of 50 or more. Any student not meeting these requirements may be given a final mark of no more than 45, regardless of their average.
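
The minimum pass requirement can be sketched as a small calculation. The example below assumes the final mark is a plain weighted sum of the five assessment components, using the weights from the Assessment Methods table, and that the cap of 45 is always applied when a requirement is missed. The policy only says the cap "may" be applied, and the actual summative method is set out in the marking guide, so treat the function and its names as hypothetical.

```python
# Weights (percent) from the Assessment Methods table; keys are illustrative names.
WEIGHTS = {
    "assignment_1": 5.0,
    "assignment_2": 20.0,
    "peerwise": 10.0,
    "class_activities": 5.0,
    "final_exam": 60.0,
}
EXAM_HURDLE = 40.0   # minimum percent required in the written examination
PASS_MARK = 50.0     # minimum overall final mark
CAPPED_MARK = 45.0   # ceiling applied when a requirement is not met


def final_mark(scores: dict) -> float:
    """Return an indicative final mark out of 100.

    `scores` maps each component name to a percentage (0-100).
    Assumption: a simple weighted sum, with the cap always applied
    when either requirement is missed.
    """
    weighted = sum(WEIGHTS[name] * scores[name] / 100.0 for name in WEIGHTS)
    meets_requirements = (
        scores["final_exam"] >= EXAM_HURDLE and weighted >= PASS_MARK
    )
    return weighted if meets_requirements else min(weighted, CAPPED_MARK)


# Strong coursework but an exam result below the 40% hurdle:
# the weighted sum is 53, which is then capped.
example = {
    "assignment_1": 80.0,
    "assignment_2": 80.0,
    "peerwise": 80.0,
    "class_activities": 80.0,
    "final_exam": 35.0,
}
print(final_mark(example))  # 45.0
```

The example shows why the exam hurdle matters: coursework alone lifts the weighted sum above 50, yet the indicative result is still a fail.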
Policies & Procedures: IMPORTANT: School policy relating to Academic Dishonesty and Plagiarism.

In assessing a piece of submitted work, the School of Computer Science may reproduce it entirely, may provide a copy to another member of faculty, and/or to an external plagiarism checking service or in-house computer program and may also maintain a copy of the assignment for future checking purposes and/or allow an external service to do so.

Other policies

See the policies page of the faculty website at http://sydney.edu.au/engineering/student-policies/ for information regarding university policies and local provisions and procedures within the Faculty of Engineering and Information Technologies.
Recommended Reference/s: Note: References are provided for guidance purposes only. Students are advised to consult these books in the university library. Purchase is not required.
Note on Resources: The VizMaster Book is available online at no cost.

There will also be readings posted for each topic. These are all available via the library or on the web.

Note that the "Weeks" referred to in this Schedule are those of the official university semester calendar https://web.timetable.usyd.edu.au/calendar.jsp

Week Description
Week 1 Lecture: Introductions: big picture, assessment overview, survey on pre-knowledge and values, Asst 1 spec, Critical definitions, preregistration - protocols, group work and communication.
Week 2 Lecture: Analysis: introducing the 4 case studies for Asst 1 - more definitions and examples.
Lab: Form groups and select case study.
Week 3 Lecture: Data collection: crowdsourcing, human issues of data
Lab: Assignment 1 work on lecture topics.
Week 4 Lecture: Literate programming and Colab overview
Lab: Colab - exploring data to understand it.
Week 5 Lecture: Data engineering 2.
Lab: Assignment 1 presentations.
Assessment Due: Assignment 1: HILDA planning report
Week 6 Lecture: Guest lecturer.
Lab: Data engineering
Week 7 Lecture: Effective visualisations - principles and people
Lab: Form groups and select datasets.
Assessment Due: Project: Presentation
Week 8 Lecture: Effective visualisations - exploration.
Lab: Asst 2 stage 1 demo to tutor.
Week 9 Lecture: Effective visualisations - reporting - Understanding uncertainty, Truth decay
Lab: Colab - visualisation
Week 10 Lecture: Machine learning in the loop
Lab: Asst 2 stage 2 demo to tutor.
Week 11 Lecture: Interfaces for machine learning, end user programming, personal informatics, personal hypothesis testing.
Lab: Tableau.
Week 12 Lecture: Leading edge - immersive analytics (tabletops, large displays, VR, AR), personalised scaffolding.
Lab: Asst 2 presentation
Assessment Due: Assignment 2: HILDA project
Week 13 Lecture: Revision
Lab: Revision.
Exam Period Assessment Due: Final Exam

Course Relations

The following is a list of courses which have added this Unit to their structure.

Course Year(s) Offered
Bachelor of Advanced Computing/Bachelor of Commerce 2018, 2019, 2020
Bachelor of Advanced Computing/Bachelor of Science 2018, 2019, 2020
Bachelor of Advanced Computing/Bachelor of Science (Health) 2018, 2019, 2020
Bachelor of Advanced Computing/Bachelor of Science (Medical Science) 2018, 2019, 2020
Bachelor of Advanced Computing (Computational Data Science) 2018, 2019, 2020
Bachelor of Advanced Computing (Computer Science Major) 2018, 2019, 2020
Bachelor of Advanced Computing (Information Systems Major) 2018, 2019, 2020
Bachelor of Advanced Computing (Software Development) 2018, 2019, 2020
Biomedical Mid-Year 2016, 2017, 2018, 2019, 2020
Biomedical 2016, 2017, 2018, 2019, 2020

Course Goals

This unit contributes to the achievement of the following course goals:

Attribute | Practiced | Assessed
(6) Communication and Inquiry/ Research (Level 3) | No | 31.5%
(8) Professional Effectiveness and Ethical Conduct (Level 2) | No | 10%
(5) Interdisciplinary, Inclusiveness, Influence (Level 3) | No | 23.5%
(4) Design (Level 3) | No | 0%
(2) Engineering/ IT Specialisation (Level 3) | No | 7%
(3) Problem Solving and Inventiveness (Level 3) | No | 3%
(1) Maths/ Science Methods and Tools (Level 3) | No | 25%

These goals are selected from Engineering & IT Graduate Outcomes Table 2018 which defines overall goals for courses where this unit is primarily offered. See Engineering & IT Graduate Outcomes Table 2018 for details of the attributes and levels to be developed in the course as a whole. Percentage figures alongside each course goal provide a rough indication of their relative weighting in assessment for this unit. Note that not all goals are necessarily part of assessment. Some may be more about practice activity. See Learning outcomes for details of what is assessed in relation to each goal and Assessment for details of how the outcome is assessed. See Attributes for details of practice provided for each goal.