Note: This unit version is currently under review and is subject to change!

DATA5207: Data Analysis in the Social Sciences (2019 - Semester 1)

Unit: DATA5207: Data Analysis in the Social Sciences (6 CP)
Mode: Normal-Day
On Offer: Yes
Level: Postgraduate
Faculty/School: School of Computer Science
Unit Coordinator/s: Ratcliff, Shaun
Session options: Semester 1, Int December
Campus: Camperdown/Darlington
Pre-Requisites: None.
Brief Handbook Description: We are in the middle of a data revolution. A new laptop can run processes impossible for a supercomputer a few generations ago. The internet makes data collection and distribution easier and cheaper than ever, with terabytes of information on consumer behaviour, public transport use, crime statistics and election results sourced from across the world now available almost anywhere in seconds.

These advances in modern computing allow us to begin to answer important questions about the world, including when disease outbreaks are expected, why certain choices were made by voters during elections, and whether individuals convicted of serious crimes are likely to reoffend. To take advantage of these possibilities, collecting, examining, understanding and communicating the meaning of data are becoming vital skills for modern life. They provide the tools for investigating and making sense of the world, and have become increasingly significant for gaining the best jobs in industry, government and the not-for-profit sector.

Taught by quantitative political scientists from the United States Studies Centre and data scientists from the Centre for Translational Data Science, DATA5207 is one of the few subjects in Australia training students to use data science to address real questions in the social world. Participants will gain a critical understanding of the strengths and weaknesses of quantitative research. They will acquire practical skills using different research methods and tools to answer relevant questions from politics, criminology, public health and economics.

Classes will be structured around a one-hour lecture and a two-hour lab session each week. In lectures, we will explain problems we seek to solve in the social sciences (broadly defined to include economics, journalism, industry, academia and government) and outline the methods we can use to solve them. Lab sessions will be built around small group work, with students working on real-life problems and data.

This unit is designed to provide practical skills in data analysis and lessons will be problem based, with a focus on practical examples. Participants of this course will learn to analyse data and present their results. As students develop their methodological skills in data analysis within the Master of Data Science, DATA5207 will concurrently provide resources to show the diversity of how this knowledge can be practically applied outside of the university in a range of contexts.

Course requirements are listed below. If you are interested in taking this unit but unsure whether you have the necessary assumed knowledge, please contact the unit coordinator, Shaun Ratcliff: shaun.ratcliff@sydney.edu.au

For those not familiar with the R working environment, we will also be running at least one pre-semester workshop in November to help get students up to speed.
Assumed Knowledge: To undertake this unit, you should be able to competently work with data. This includes being able to undertake some of the following: collecting, cleaning, analysing, modelling and visualising data. Ability to code is helpful, but not a requirement.
Department Permission: Department permission is required for enrollment in this session.
Lecturer/s: Ratcliff, Shaun
Tutor/s: Tim Nallaiah
Timetable: DATA5207 Timetable
Time Commitment:
# | Activity Name | Hours per Week | Sessions per Week | Weeks per Semester
1 | Lecture | 1.00 | 1 | 13
2 | Laboratory | 2.00 | 1 | 13
3 | Independent Study | 6.00 | - | 13

Learning outcomes are the key abilities and knowledge that will be assessed in this unit. They are listed according to the course goal supported by each. See Assessment Tab for details of how each outcome is assessed.

Unassigned Outcomes
1. Students will be well-versed in the various ethical issues and professional standards around the gathering of data.
2. During the unit, students will be required to deliver a small-scale group project. Students will be proficient in the delivery of a small-scale project, and the management of the project from initial conception to delivery to evaluation.
3. Upon completion, students will be able to present data and reports of a high standard.
4. Students will be trained in the autonomous collection, collation, assessment and comparison of data from multiple sources, such as the Australian Bureau of Statistics and the Australian Data Archive. Students will be able to discern the quality of data to a minute level, and be able to draw a broad range of insights from data of various degrees of statistical significance.
5. Students will be trained in the sophisticated application of established data analytical methodology. Students will be expected to have a medium degree of proficiency in methodological procedures, and will be tasked with complex problems specifically related to the social sciences.
6. Students will be utilising industry-leading concepts and frameworks in their pedagogy. Students will be directing formidable amounts of data for protracted, complex insights into areas such as polling data and demography.
7. Students will be expected to apply their theoretical understanding of statistical methods to practical problems around data gathering methodology, statistical significance and sample sizing. Students will be expected to autonomously create basic design frameworks for statistical modelling problems.
Assessment Methods:
# | Name | Group | Weight (%) | Due Week | Outcomes
1 | Class Test 1 | No | 10.00 | Week 5 | 5, 6, 7
2 | Class Test 2 | No | 10.00 | Week 9 | 5, 6, 7
3 | Class Test 3 | No | 10.00 | Week 13 | 5, 6, 7
4 | Group Work | Yes | 30.00 | Multiple Weeks | 1, 2, 3, 4, 5, 6, 7
5 | Research Plan | No | 10.00 | Week 6 | 3, 4, 5, 6, 7
6 | Research Project | No | 30.00 | Exam Period | 3, 4, 5, 6, 7
Assessment Description: In-class tests:

These are short, 20-minute in-class tests designed to ensure students are progressing as expected. Each test will consist of three questions requiring written answers. Approximately half the material in each test will explicitly cover social science applications rather than simply the methods involved; the other half will be specifically methods-related.

Group work:

The graded component of class participation, these are in-class group projects (of ~4 members), run over multiple weeks, to encourage peer-assisted learning. The group work will generally provide the opportunity to practice and learn the methods covered that week, as well as design the type of research project that comprises the final assessment of the unit.

Group composition is not fixed for the entire semester. Rather, students will be able to form new groups each week. This differs from typical group assessments at the university, which involve either one-off group formation or assigned groups. Instead, group work is a repeated game (in the game-theoretic sense), in which students play the same game (participating in group work) again and again. Repeated games capture the idea that an actor must consider the impact of their current actions on the future actions of other actors (here, other students). This provides an incentive for students to be constructive members of groups: constructive behaviour in the early classes of the semester will increase the probability that they will be included in effective groups later in the semester.
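
As a rough illustration of the repeated-game incentive described above, the following Python sketch simulates weekly group formation. It is not part of the unit materials (which use R), and every name and rule in it is a hypothetical assumption: each student has a fixed strategy (constructive or not), peers observe that behaviour each week, and new groups are formed by ranking students on the reputation they have accumulated so far.

```python
from dataclasses import dataclass


@dataclass
class Student:
    name: str
    constructive: bool   # hypothetical fixed strategy for the whole semester
    reputation: int = 0  # weeks in which peers saw this student contribute


def form_groups(students, size=4):
    """Assumed rule: rank students by reputation, then chunk into groups."""
    ranked = sorted(students, key=lambda s: s.reputation, reverse=True)
    return [ranked[i:i + size] for i in range(0, len(ranked), size)]


def group_quality(group):
    """Share of constructive members, used here as a stand-in for effectiveness."""
    return sum(s.constructive for s in group) / len(group)


def simulate(students, weeks, size=4):
    """Return, per student, the quality of the group they joined each week."""
    history = {s.name: [] for s in students}
    for _ in range(weeks):
        for group in form_groups(students, size):
            quality = group_quality(group)
            for s in group:
                history[s.name].append(quality)
        # Behaviour this week becomes visible to peers before next week's groups.
        for s in students:
            if s.constructive:
                s.reputation += 1
    return history


if __name__ == "__main__":
    # Hypothetical cohort of eight: every second student is constructive.
    cohort = [Student(f"s{i}", constructive=(i % 2 == 0)) for i in range(8)]
    for name, qualities in simulate(cohort, weeks=3).items():
        print(name, qualities)
```

Under these assumptions, students start in mixed groups, and from the second week onwards the constructive students cluster together in the more effective groups. Real group formation is far less mechanical, so this should be read only as a toy model of the incentive.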

Prior to the assessable component of the group activity, there will be discrete non-assessed group projects. These will give students the opportunity to work with each other on non-graded projects, providing information on how the graded portions of lab work will operate and on how class members cooperate in groups. Assessable group work will be conducted in half the labs, mostly in the second half of the semester.

Students will be required to work together to focus on different parts of the project, and quickly analyse data and write up their results. The finished product will be submitted for grading at the end of the seminar, with marks returned in the following lab. These assessments are designed to ensure students are understanding the material, as well as providing opportunities for them to grow their skills of collaboration and teamwork, which are increasingly important across most fields.

Major research project:

Forty per cent of your final grade for this unit comes from the completion of the major research project. This consists of a research plan, which requires you to outline how you intend to complete the project, and a final paper due at the beginning of the exam period. These assessments are designed to track your competence in the core skills developed in this unit, and provide you with the opportunity to apply them on practical research tasks.
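
The weighting arithmetic can be checked with a short Python sketch. The weights are copied from the Assessment Methods table; the `final_mark` helper and the assumption that the final mark is a plain weighted average (with no rounding, scaling or hurdle rules) are illustrative only and are not stated in this outline.

```python
# Weights (per cent) copied from the Assessment Methods table.
WEIGHTS = {
    "Class Test 1": 10.0,
    "Class Test 2": 10.0,
    "Class Test 3": 10.0,
    "Group Work": 30.0,
    "Research Plan": 10.0,
    "Research Project": 30.0,
}

# The major research project is the research plan plus the final paper.
major_project_share = WEIGHTS["Research Plan"] + WEIGHTS["Research Project"]


def final_mark(marks):
    """Weighted average of marks (each 0-100); an assumed, simplified rule."""
    missing = set(WEIGHTS) - set(marks)
    if missing:
        raise ValueError(f"missing marks for: {sorted(missing)}")
    total_weight = sum(WEIGHTS.values())
    return sum(WEIGHTS[name] * marks[name] for name in WEIGHTS) / total_weight


if __name__ == "__main__":
    print("Total weight:", sum(WEIGHTS.values()))
    print("Major research project share:", major_project_share)
    # Hypothetical marks, for illustration only.
    example = {name: 70.0 for name in WEIGHTS}
    print("Final mark:", final_mark(example))
```

The six weights sum to 100, and the research plan and research project together account for the forty per cent mentioned above.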

Research plan - The first component of this assessment is a 600-word research plan, which is designed to encourage you to think about the question you will answer, and the data and methods you will use, for the major research project. You will be provided with three possible questions and several sets of data before classes start. Time will be made in the first few weeks of the semester to discuss the questions and our expectations for this assessment in detail. You will be required to select the best way to answer your chosen question. You will need to outline your approach to the question and the literature that informs it (only five sources are required for your plan), and the methodology you intend to use to answer the question. At the end of your plan you should include a reference list outlining all the data and literature you have used. This will not count towards your word limit. Assessments will need to be completed in R Markdown, which we will show you how to use.

Research project - The final assessment will be a 2000-word research report answering the question chosen for your plan. You must provide a (brief) literature review (only 10 pieces of literature are required; these are not limited to academic papers and can include government and media reports), outline your data and methodology, specify why you have chosen the methods used, and present your results. Grades will be awarded for quality of analysis and presentation, and how well you use the methods and material covered in this class.
Assessment Feedback: Written feedback will be provided for group work, and the research plan and project.
Policies & Procedures: IMPORTANT: School policy relating to Academic Dishonesty and Plagiarism.

In assessing a piece of submitted work, the School of Computer Science may reproduce it entirely, may provide a copy to another member of faculty, and/or to an external plagiarism checking service or in-house computer program and may also maintain a copy of the assignment for future checking purposes and/or allow an external service to do so.

Other policies

See the policies page of the faculty website at http://sydney.edu.au/engineering/student-policies/ for information regarding university policies and local provisions and procedures within the Faculty of Engineering and Information Technologies.

Note that the "Weeks" referred to in this Schedule are those of the official university semester calendar https://web.timetable.usyd.edu.au/calendar.jsp

Week Description
Week 1 Understanding the social world using data science
Week 2 Visualising social science data
Week 3 Confounding factors and human behaviour
Week 4 Understanding economic behaviour
Week 5 Understanding the probability of real world problems
Assessment Due: Class Test 1
Week 6 Predicting outcomes in the social world
Assessment Due: Research Plan
Week 7 Non-linear problems in the social sciences
Week 8 Understanding human behaviour through survey design
Week 9 Measuring latent variables in the social world
Assessment Due: Class Test 2
Week 10 Regularisation and variable selection in the social sciences
Week 11 Causality in the social world, and using spatial data
Week 12 Quantitative social science in the wild (data journalism)
Week 13 Conclusion, and developing your research project
Assessment Due: Class Test 3
Pre-Semester We will look at holding one or two pre-semester workshops for students who require assistance with R and some of the basic methods we are covering during semester.
Post-Semester We will look at holding one or two post-semester workshops for students who require assistance with their major research projects.
Exam Period Assessment Due: Research Project

Course Relations

The following is a list of courses which have added this Unit to their structure.

Course Year(s) Offered
Bachelor of Advanced Computing/Bachelor of Commerce 2018, 2019, 2020
Bachelor of Advanced Computing/Bachelor of Science 2018, 2019, 2020
Bachelor of Advanced Computing/Bachelor of Science (Health) 2018, 2019, 2020
Bachelor of Advanced Computing/Bachelor of Science (Medical Science) 2018, 2019, 2020
Bachelor of Advanced Computing (Computational Data Science) 2018, 2019, 2020
Bachelor of Advanced Computing (Computer Science Major) 2018, 2019, 2020
Bachelor of Advanced Computing (Information Systems Major) 2018, 2019, 2020
Bachelor of Advanced Computing (Software Development) 2018, 2019, 2020
Graduate Certificate in Information Technology 2016, 2017, 2018, 2019, 2020
Graduate Certificate in Information Technology Management 2015, 2016, 2017, 2018, 2019, 2020
Graduate Diploma in Health Technology Innovation 2017, 2018, 2019, 2020
Graduate Diploma in Information Technology 2016, 2017, 2018, 2019, 2020
Graduate Diploma in Information Technology Management 2015, 2016, 2017, 2018, 2019, 2020
Graduate Diploma in Complex Systems 2017, 2018, 2019, 2020
Master of Complex Systems 2017, 2018, 2019, 2020
Master of Data Science 2018, 2019, 2020
Master of Health Technology Innovation 2015, 2016, 2017, 2018, 2019, 2020
Master of Information Technology 2015, 2016, 2017, 2018, 2019, 2020
Master of Information Technology Management 2015, 2016, 2017, 2018, 2019, 2020
Master of IT/Master of IT Management 2015, 2016, 2017, 2018, 2019, 2020

Course Goals

This unit contributes to the achievement of the following course goals:

Attribute Practiced Assessed
(6) Communication and Inquiry/ Research (Level 4) No 0%
(7) Project and Team Skills (Level 3) No 0%
(8) Professional Effectiveness and Ethical Conduct (Level 2) No 0%
(5) Interdisciplinary, Inclusiveness, Influence (Level 2) No 0%
(4) Design (Level 2) No 0%
(2) Engineering/ IT Specialisation (Level 4) No 0%
(3) Problem Solving and Inventiveness (Level 2) No 0%
(1) Maths/ Science Methods and Tools (Level 3) No 0%

These goals are selected from Engineering & IT Graduate Outcomes Table 2018 which defines overall goals for courses where this unit is primarily offered. See Engineering & IT Graduate Outcomes Table 2018 for details of the attributes and levels to be developed in the course as a whole. Percentage figures alongside each course goal provide a rough indication of their relative weighting in assessment for this unit. Note that not all goals are necessarily part of assessment. Some may be more about practice activity. See Learning outcomes for details of what is assessed in relation to each goal and Assessment for details of how the outcome is assessed. See Attributes for details of practice provided for each goal.