Note: This unit version is currently under review and is subject to change!
DATA1902: Informatics: Data and Computation (Advanced) (2019 - Semester 2)
Unit: | DATA1902: Informatics: Data and Computation (Advanced) (6 CP) |
Mode: | Normal-Day |
On Offer: | Yes |
Level: | Junior |
Faculty/School: | School of Computer Science |
Unit Coordinator/s: | Prof Fekete, Alan |
Session options: | Semester 2 |
Versions for this Unit: | |
Site(s) for this Unit: |
Campus: | Camperdown/Darlington |
Pre-Requisites: | None. |
Prohibitions: | INFO1903 OR DATA1002. |
Brief Handbook Description: | This unit covers computation and data handling, integrating sophisticated use of existing productivity software, e.g. spreadsheets, with the development of custom software using the general-purpose Python language. It will focus on skills directly applicable to data-driven decision-making. Students will see examples from many domains, and be able to write code to automate the common processes of data science, such as data ingestion, format conversion, cleaning, summarization, creation and application of a predictive model. |
Assumed Knowledge: | None. |
Lecturer/s: | Prof Fekete, Alan |
Timetable: | DATA1902 Timetable |
Time Commitment: |
T&L Activities: | Each laboratory session includes an online multiple-choice quiz, as well as time devoted to an advanced topic that is not part of the DATA1002 material. Attendance at labs is therefore crucial. |
Learning outcomes are the key abilities and knowledge that will be assessed in this unit. They are listed according to the course goal supported by each. See the Assessment tab for details of how each outcome is assessed.
Unassigned Outcomes
Assessment Methods: |
Assessment Description: |
Weekly Python tasks: The material on the GrokLearning platform includes tasks where the student must write a Python program to produce precisely described output. Each program is graded automatically: it is run against several input datasets (only one of which is visible to the student before submission), and its output is compared with what is required. *In case of special consideration, reweighting will be applied, taking the grade from those tasks not covered by the consideration. However, students should still complete the tasks even if the due date has passed.
Weekly quizzes: Held during the student's scheduled tutorial session each week. Each quiz consists of multiple-choice questions on the lecture content from the previous week and on the extra Advanced content from the previous week's lab session. The quizzes are done through the Canvas system. Each quiz is worth 1 point, and the total mark is the sum of these, capped at 10. *In case of special consideration, an extension or alternative assessment is not possible; instead, reweighting will replace each affected quiz score with the average score on the non-affected quizzes.
Practice Python coding test: Held during scheduled tutorial sessions. Each student will be required to produce Python code that calculates precisely described output from data in a file. This test carries no weight in the final grade; it is intended to accustom students to the setting in preparation for the later coding test. *In case of special consideration, no action is needed.
Python coding test: Held during scheduled tutorial sessions. Each student will be required to produce Python code that calculates precisely described output from data in a file. *In case of special consideration, alternative assessment will be arranged.
Project Stage 1: This is the first part of a group project (the students in a group should all attend the same scheduled tutorial session). This stage involves finding data from a domain of interest to the students, cleaning the data and importing it into a tool, and doing a very simple analysis of some of the data. A report is required that describes the dataset, how it was obtained, and how it was processed by the tool. If this stage is missed or badly done, the group can be given a clean dataset, for a domain chosen by the instructor, to use in the rest of the project. It is crucial that each group manages its internal working effectively; groups need mechanisms to detect problems and report them to the coordinator early.
Project Stage 2: The group will use computational tools to analyse the data and offer interactive visualisation, build a useful predictive model of some kind, and report on both what was done and what was found. This stage cannot be reweighted in response to special consideration; an extension or alternative assessment is needed. It is crucial that each group manages its internal working effectively; groups need mechanisms to detect problems and report them to the coordinator early.
Exam: A written exam covering conceptual content, skills, and experiences.
Except for tasks where late work is not accepted at all, as noted above, late submission of a progressive assessment (up to 10 days late) will attract a penalty of 5 percent of the available mark for each calendar day after the due date. Work that is not submitted within 10 calendar days will receive a mark of zero. |
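The marking arithmetic described above (the capped quiz total, quiz reweighting under special consideration, and the late penalty) can be sketched in Python, the language taught in this unit. This is an illustrative sketch only, not the unit's official marking code: the function names are hypothetical, and flooring a penalised mark at zero is an assumption that the description does not state.

```python
def quiz_total(scores, cap=10.0):
    """Sum the per-quiz scores (each out of 1 point) and cap the total at `cap`."""
    return min(sum(scores), cap)


def reweight_quizzes(scores, affected):
    """Replace each affected quiz score with the mean of the non-affected scores.

    `scores` is a list of per-quiz scores; `affected` is a set of list
    indexes covered by special consideration (hypothetical representation).
    """
    unaffected = [s for i, s in enumerate(scores) if i not in affected]
    if not unaffected:
        raise ValueError("no non-affected quizzes to average")
    mean = sum(unaffected) / len(unaffected)
    return [mean if i in affected else s for i, s in enumerate(scores)]


def late_penalised_mark(raw_mark, available_mark, days_late):
    """Apply 5% of the available mark per calendar day late.

    Work more than 10 calendar days late receives zero. Flooring the
    result at zero is an assumption, not stated in the description.
    """
    if days_late <= 0:
        return raw_mark
    if days_late > 10:
        return 0.0
    penalty = 0.05 * available_mark * days_late
    return max(raw_mark - penalty, 0.0)


if __name__ == "__main__":
    # Twelve perfect quizzes still total 10 because of the cap.
    print(quiz_total([1.0] * 12))
    # The second quiz (index 1) is covered by special consideration.
    print(reweight_quizzes([1.0, 0.0, 0.5], {1}))
    # A task marked 16 out of 20, submitted 3 calendar days late.
    print(late_penalised_mark(16, 20, 3))
```

For a task marked out of 20, each late day costs 1 mark (5 percent of 20), so the penalty depends on the available mark rather than on the mark the student earned.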
Grading: |
Policies & Procedures: | IMPORTANT: School policy relating to Academic Dishonesty and Plagiarism. In assessing a piece of submitted work, the School of IT may reproduce it entirely, may provide a copy to another member of faculty and/or to an external plagiarism checking service or in-house computer program, and may also maintain a copy of the assignment for future checking purposes and/or allow an external service to do so. Other policies: see the policies page of the faculty website at http://sydney.edu.au/engineering/student-policies/ for information regarding university policies and local provisions and procedures within the Faculty of Engineering and Information Technologies. |
Recommended Reference/s: |
Note: References are provided for guidance purposes only. Students are advised to consult these books in the university library. Purchase is not required.
|
Online Course Content: | The unit's Canvas site will contain copies of lecture slides (and lecture recordings if the technology works as it ought to), tutorial instructions, assessment instructions, and a discussion forum. The Python teaching will be done in two independent ways: through lectures, and through labwork where students work on the GrokLearning platform, following a sequence that integrates expository material with frequent, automatically graded exercises. |
Note that the "Weeks" referred to in this Schedule are those of the official university semester calendar https://web.timetable.usyd.edu.au/calendar.jsp
Week | Description |
Week 1 | Introduction and administrivia; data science lifecycle and pipeline; how to learn to program. [Advanced lab: Unix tools] |
Week 2 | Data science with spreadsheets; Python as a calculator, variables and expressions; assignment, simplified notional machine model for Python. [Advanced lab: Unix tools] |
Week 3 | More spreadsheet techniques; Decisions and conditionals; Strings, text files and loops. [Advanced lab: regular expressions] |
Week 4 | Pivot tables and lookup in spreadsheets; Lists and tuples; Dictionaries [Advanced lab: AWK] |
Week 5 | Communication and charts; data management, metadata and data quality; Writing a function in Python. [Advanced lab: combining tools] |
Week 6 | Storage and number formats; Charts in spreadsheets; Prepare for practice coding test. [Advanced lab: comparing tools] |
Week 7 | Data persistence and recovery; Intro to Pandas and Dataframes; More Pandas capabilities. [Advanced lab: comparing tools] |
Assessment Due: Practice Python coding test* | |
Week 8 | Optimisation and simulation; Plotting with Python; Scope and notional machine for Python functions. [Advanced lab: Interactive visualisation] |
Week 9 | [PUBLIC HOLIDAY]; Predicting a category (classification) and evaluating a classifier; scikit-learn Python library [Advanced lab: Interactive visualisation] |
Assessment Due: Project stage 1 | |
Week 10 | Sharing data; Predicting a numeric value (regression) and evaluating a regression; Exception-handling in Python. [Advanced lab: Interactive visualisation] |
Assessment Due: Python coding test* | |
Week 11 | Data management policies; Clustering; Introduction to classes and objects in Python. [Advanced lab: Interactive visualisation] |
Week 12 | Notebooks, workflow, provenance; Recommendation; Software quality issues. [Advanced lab: Interactive visualisation] |
Assessment Due: Project stage 2 | |
Week 13 | Review of semester; further study of data science or programming; preview of exam |
Exam Period | Assessment Due: Exam |
Course Relations
The following is a list of courses which have added this Unit to their structure.
Course | Year(s) Offered |
Software Engineering (mid-year) | 2018, 2019, 2020, 2021, 2022, 2023, 2024, 2025 |
Software / Project Management 2019+ | 2023, 2024, 2025 |
Software Engineering | 2017, 2018, 2019, 2020, 2021, 2022, 2023, 2024, 2025 |
Software / Arts 2023+ | 2023, 2024, 2025 |
Software / Commerce 2023+ | 2023, 2024, 2025 |
Software / Science | 2023, 2024, 2025 |
Software / Science - Mid Year | 2023, 2024, 2025 |
Software / Law 2023+ | 2023, 2024, 2025 |
Course Goals
This unit contributes to the achievement of the following course goals:
Attribute | Practiced | Assessed |
Unit has not been assigned any attributes yet. |
These goals are selected from Engineering & IT Graduate Outcomes Table 2018 which defines overall goals for courses where this unit is primarily offered. See Engineering & IT Graduate Outcomes Table 2018 for details of the attributes and levels to be developed in the course as a whole. Percentage figures alongside each course goal provide a rough indication of their relative weighting in assessment for this unit. Note that not all goals are necessarily part of assessment. Some may be more about practice activity. See Learning outcomes for details of what is assessed in relation to each goal and Assessment for details of how the outcome is assessed. See Attributes for details of practice provided for each goal.