Note: This unit version is currently being edited and is subject to change!
INFO3406: Introduction to Data Analytics (2021 - Semester 2)
Unit: | INFO3406: Introduction to Data Analytics [not offered in 2020] (6 CP) |
Mode: | Normal-Day |
On Offer: | Yes |
Level: | Senior |
Faculty/School: | School of Computer Science |
Unit Coordinator/s: | Anaissi, Ali |
Session options: | Semester 2 |
Versions for this Unit: | |
Site(s) for this Unit: | https://canvas.sydney.edu.au/courses/4545/pages/data-analysis-skills#OLEO1300 |
Campus: | Camperdown/Darlington |
Pre-Requisites: | (MATH1005 OR MATH1905 OR BUSS1020) AND (INFO2120 OR INFO2820). |
Brief Handbook Description: | Big Data refers to datasets that are so massive, heterogeneous, and dynamic that they are beyond current approaches for the capture, storage, management, and analysis of data. The focus of this unit is on understanding and applying relevant concepts, techniques, algorithms, and tools for the analysis, management, and visualization of big data, with the goal of keeping abreast of the continual increase in the volume and complexity of data sets and enabling the discovery of information and knowledge to guide effective decision making. Core data analytics content will be taught in the normal lecture and tutorial delivery mode. Python programming will be taught through an online learning platform in addition to the weekly face-to-face lectures and tutorials. The unit of study will include hands-on exercises covering the range of data science skills above. |
Assumed Knowledge: | Basic statistics and database management. |
Lecturer/s: | Liu, Tongliang; Anaissi, Ali |
Timetable: | INFO3406 Timetable |
Time Commitment: |
|
Attributes listed here represent the key course goals (see Course Map tab) designated for this unit. The list below describes how these attributes are developed through practice in the unit. See Learning Outcomes and Assessment tabs for details of how these attributes are assessed.
Attribute Development Method | Attribute Developed |
Students learn and practice the design of a pipeline of processes to analyse a huge set of complex data. | Design (Level 3) |
Students are given scenario(s) that require them to use various components and tools to create a pipeline to process a set of complex data. Students have to articulate and substantiate their choice of computational methods and tools in light of the technical, social and application constraints of the given setting. | Engineering/IT Specialisation (Level 3) |
Students are required to identify appropriate tools and methods to pre-process heterogeneous data coming from different channels in the practical assessment. In their assignment, different tools are provided to students, but they have to make a reasoned choice of tools to analyse and clean the data. | Maths/Science Methods and Tools (Level 3) |
Students are required to perform requirements analysis through the practical assessment. They have to identify implicit & explicit requirements in a given project brief. Students should also explore the implied constraints through literature after synthesising the given requirements. | Information Seeking (Level 3) |
Students practice their written and oral communication skills through the assessments. They need to clearly articulate the aim and issues of the problems, the social and technical constraints, and the reasons behind their decisions. They should be able to discuss and draw insights from the results of their analytical work. | Communication (Level 3) |
For explanation of attributes and levels see Engineering & IT Graduate Outcomes Table 2018.
Learning outcomes are the key abilities and knowledge that will be assessed in this unit. They are listed according to the course goal supported by each. See the Assessment tab for details of how each outcome is assessed.
Engineering/IT Specialisation (Level 3)
Assessment Methods: |
|
Assessment Description: |
Participation: Complete and submit lab exercises [10 marks].
Project Stage 1: Obtain data, clean it, load and summarize [13 marks, individual work; due week 6].
Project Stage 2: Analyse the data, develop and test a predictive model [20 marks, individual work; due week 12].
Project Stage 3: Presentation of results [7 marks, individual work; due week 12]. |
Grading: |
|
Policies & Procedures: | IMPORTANT: School policy relating to Academic Dishonesty and Plagiarism. In assessing a piece of submitted work, the School of IT may reproduce it entirely, may provide a copy to another member of faculty and/or to an external plagiarism checking service or in-house computer program, and may also maintain a copy of the assignment for future checking purposes and/or allow an external service to do so. Other policies: See the policies page of the faculty website at http://sydney.edu.au/engineering/student-policies/ for information regarding university policies and local provisions and procedures within the Faculty of Engineering and Information Technologies. |
Online Course Content: |
This unit uses Python as its programming language throughout. An online tutorial on Python programming and database management is available through the Grok learning platform. For the best learning outcome, students should start working on this Python tutorial before the semester starts. Please go to https://canvas.sydney.edu.au/courses/4545/pages/data-analysis-skills#OLEO1300 and enrol in the following free courses:
- Beginner Programming for Data Analysis (OLEO1306)
- Managing and Analysing Data: Introduction to SQL (OLEO1300) |
Note on Resources: | Lecture notes, tutorial notes and links to online questions will be provided on Canvas. |
Note that the "Weeks" referred to in this Schedule are those of the official university semester calendar: https://web.timetable.usyd.edu.au/calendar.jsp
Week | Description |
Week 1 | Introduction to Data Science and Big Data |
Week 2 | Data Exploration with Spreadsheets |
Week 3 | Data Exploration with Python |
Week 4 | Cleaning and Storing Data |
Week 5 | Querying and Summarising Data |
Week 6 | Hypothesis Testing and Evaluation |
Assessment Due: Project Stage 1 | |
Week 7 | Data Mining - Association Rules and Dimensionality Reduction |
Week 8 | Data Mining - Clustering |
Week 9 | Machine Learning - Regression |
Week 10 | Machine Learning - Classification |
Week 11 | Unstructured Data |
Week 12 | Information, actionable knowledge from data, and link to effective decision making. |
Assessment Due: Project Stage 2 | |
Assessment Due: Project Stage 3 | |
Week 13 | Revision |
Exam Period | Assessment Due: Written exam |
Course Relations
The following is a list of courses which have added this Unit to their structure.
Course Goals
This unit contributes to the achievement of the following course goals:
Attribute | Practiced | Assessed |
Engineering/IT Specialisation (Level 3) | Yes | 30.6% |
Maths/Science Methods and Tools (Level 3) | Yes | 34.9% |
Information Seeking (Level 3) | Yes | 17% |
Design (Level 3) | Yes | 13.3% |
Communication (Level 3) | Yes | 4.2% |
These goals are selected from Engineering & IT Graduate Outcomes Table 2018 which defines overall goals for courses where this unit is primarily offered. See Engineering & IT Graduate Outcomes Table 2018 for details of the attributes and levels to be developed in the course as a whole. Percentage figures alongside each course goal provide a rough indication of their relative weighting in assessment for this unit. Note that not all goals are necessarily part of assessment. Some may be more about practice activity. See Learning outcomes for details of what is assessed in relation to each goal and Assessment for details of how the outcome is assessed. See Attributes for details of practice provided for each goal.