Note: This unit version is currently under review and is subject to change!

COMP5046: Natural Language Processing (2019 - Semester 1)

Unit: COMP5046: Natural Language Processing (6 CP)
Mode: Normal-Day
On Offer: Yes
Level: Postgraduate
Faculty/School: School of Computer Science
Unit Coordinator/s: Dr Han, Caren
Session options: Semester 1
Campus: Camperdown/Darlington
Pre-Requisites: None.
Brief Handbook Description: This unit introduces computational linguistics and the statistical techniques and algorithms used to automatically process natural languages. It will review the core statistics and information theory, and the basic linguistics, required to understand natural language processing (NLP).

NLP is used in a wide range of applications, including information retrieval and extraction; question answering; machine translation; and classifying and clustering of documents. This unit will explore the key challenges of natural language to computational modelling, and the state of the art approaches to the key NLP sub-tasks, including tokenisation, morphological analysis, word sense representation, part-of-speech tagging, named entity recognition and other information extraction, text categorisation and syntactic parsing.

Students will implement many of these sub-tasks in labs and assignments that can be applied to real-world cases. The unit will also investigate the annotation process that is central to creating training data for interesting applications. With this unit, students can develop innovative applications that can be used in the real world.
Assumed Knowledge: Knowledge of an OO programming language
Lecturer/s: Dr Han, Caren
Tutor/s: All email should be directed to sit.comp5046@sydney.edu.au rather than personal staff addresses.
Timetable: COMP5046 Timetable
Time Commitment:
#  Activity Name      Hours per Week  Sessions per Week  Weeks per Semester
1  Lecture            2.00            1                  12
2  Laboratory         1.00            1                  12
3  Independent Study  6.00            -                  14
T&L Activities: Tutorial: practical software development exercises and in-class discussions.

Independent Study: study of texts and completion of assignments.

Practical work will be demonstrated in a programming language, and the use of relevant libraries such as the Natural Language Toolkit is encouraged.

Learning outcomes are the key abilities and knowledge that will be assessed in this unit. They are listed according to the course goal supported by each. See the Assessment tab for details of how each outcome is assessed.

(4) Design (Level 4)
1. Apply basic linguistic knowledge to identify the structure of language.
2. Develop formal models to express natural language phenomena.
3. Develop machine learning and statistical methods for solving natural language tasks.
(3) Problem Solving and Inventiveness (Level 4)
4. Evaluate the performance of natural language processing systems.
5. Implement and debug large NLP systems in a clean and structured manner.
(1) Maths/ Science Methods and Tools (Level 4)
6. Apply basic statistical methods and information theory principles to modelling language.
Assessment Methods:
#  Name           Group  Weight  Due Week          Outcomes
1  Lab Exercises  No     10.00   Multiple Weeks    1, 2, 3, 4, 6
2  Assignment 1   No     20.00   Week 8            2, 3, 4, 5
3  Assignment 2   No     20.00   STUVAC (Week 14)  2, 3, 4, 5
4  Final Exam     No     50.00   Exam Period       1, 2, 3, 4, 6
Assessment Description: Lab Exercises: Programming tasks completed in weekly computer labs.

Two individual assignments take place during the teaching period, followed by a final written exam.

Penalties for lateness: 10% of the available marks per day late; maximum 7 days late (after that: 0).
Grading:
Grade Type: Description
Standards Based Assessment: Final grades in this unit are awarded at levels of HD for High Distinction, DI (previously D) for Distinction, CR for Credit, PS (previously P) for Pass and FA (previously F) for Fail as defined by University of Sydney Assessment Policy. Details of the Assessment Policy are available on the Policies website at http://sydney.edu.au/policies. Standards for grades in individual assessment tasks and the summative method for obtaining a final mark in the unit will be set out in a marking guide supplied by the unit coordinator.
Minimum Pass Requirement: It is a policy of the School of Computer Science that in order to pass this unit, a student must achieve at least 40% in the written examination. For subjects without a final exam, the 40% minimum requirement applies to the corresponding major assessment component specified by the lecturer. A student must also achieve an overall final mark of 50 or more. Any student not meeting these requirements may be given a maximum final mark of no more than 45 regardless of their average.
Policies & Procedures: IMPORTANT: School policy relating to Academic Dishonesty and Plagiarism.

In assessing a piece of submitted work, the School of Computer Science may reproduce it entirely, may provide a copy to another member of faculty, and/or to an external plagiarism checking service or in-house computer program and may also maintain a copy of the assignment for future checking purposes and/or allow an external service to do so.

Other policies

See the policies page of the faculty website at http://sydney.edu.au/engineering/student-policies/ for information regarding university policies and local provisions and procedures within the Faculty of Engineering and Information Technologies.
Recommended Reference/s: Note: References are provided for guidance purposes only. Students are advised to consult these books in the university library. Purchase is not required.
Online Course Content: Via Canvas
Note on Resources: Students may be interested in the 3rd edition-in-draft of Jurafsky and Martin's "Speech and Language Processing" (https://web.stanford.edu/~jurafsky/slp3/), which covers some recent developments and bridges some gaps relative to Manning and Schütze. Because some material is not yet available in SLP3, we rely on Manning and Schütze for some of the foundational material.

Note that the "Weeks" referred to in this Schedule are those of the official university semester calendar https://web.timetable.usyd.edu.au/calendar.jsp

Week Description
Week 1 Introduction to Natural Language Processing (No Lab)
Week 2 Word Embedding (Word Vector for Meaning)
Week 3 Text Classification with Machine Learning I
Week 4 Text Classification with Machine Learning II
Week 5 Language Fundamentals
Week 6 Part of Speech Tagging
Week 7 Dependency Parsing
Week 8 Language Model
Assessment Due: Assignment 1
Week 9 Information Extraction I: Named Entity Recognition
Week 10 Information Extraction II: Relation Extraction
Week 11 Application I: Question Answering
Week 12 Application II: Machine Translation
Week 13 Future of NLP and Exam Review
STUVAC (Week 14) Assessment Due: Assignment 2
Exam Period Assessment Due: Final Exam

Course Relations

The following is a list of courses which have added this Unit to their structure.

Course Year(s) Offered
Bachelor of Advanced Computing/Bachelor of Commerce 2018, 2019, 2020
Bachelor of Advanced Computing/Bachelor of Science 2018, 2019, 2020
Bachelor of Advanced Computing/Bachelor of Science (Health) 2018, 2019, 2020
Bachelor of Advanced Computing/Bachelor of Science (Medical Science) 2018, 2019, 2020
Bachelor of Advanced Computing (Computational Data Science) 2018, 2019, 2020
Bachelor of Advanced Computing (Computer Science Major) 2018, 2019, 2020
Bachelor of Advanced Computing (Information Systems Major) 2018, 2019, 2020
Bachelor of Advanced Computing (Software Development) 2018, 2019, 2020
Bachelor of Computer Science and Technology (Honours) 2015, 2016, 2017
Bachelor of Computer Science and Technology (Honours) 2014 2013, 2014
Bachelor of Information Technology 2015, 2016, 2017
Bachelor of Information Technology/Bachelor of Arts 2015, 2016, 2017
Bachelor of Information Technology/Bachelor of Commerce 2015, 2016, 2017
Bachelor of Information Technology/Bachelor of Medical Science 2015, 2016, 2017
Bachelor of Information Technology/Bachelor of Science 2015, 2016, 2017
Bachelor of Information Technology (Computer Science) 2014 and earlier 2009, 2010, 2011, 2012, 2013, 2014
Information Technology (Computer Science)/Arts 2012, 2013, 2014
Information Technology (Computer Science) / Commerce 2012, 2013, 2014
Information Technology (Computer Science) / Medical Science 2012, 2013, 2014
Information Technology (Computer Science) / Science 2012, 2013, 2014
Information Technology (Computer Science) / Law 2012, 2013, 2014
Bachelor of Information Technology (Information Systems) 2014 and earlier 2010, 2011, 2012, 2013, 2014
Information Technology (Information Systems)/Arts 2012, 2013, 2014
Information Technology (Information Systems) / Commerce 2012, 2013, 2014
Information Technology (Information Systems) / Medical Science 2012, 2013, 2014
Information Technology (Information Systems) / Science 2012, 2013, 2014
Information Technology (Information Systems) / Law 2012, 2013, 2014
Bachelor of Information Technology/Bachelor of Laws 2015, 2016, 2017
Graduate Certificate in Information Technology 2015, 2016, 2017, 2018, 2019, 2020
Graduate Certificate in Information Technology Management 2015, 2016, 2017, 2018, 2019, 2020
Graduate Diploma in Computing 2015, 2016, 2017, 2018, 2019, 2020
Graduate Diploma in Health Technology Innovation 2015, 2016, 2017, 2018, 2019, 2020
Graduate Diploma in Information Technology 2015, 2016, 2017, 2018, 2019, 2020
Graduate Diploma in Information Technology Management 2015, 2016, 2017, 2018, 2019, 2020
Graduate Certificate in Information Technology (till 2014) 2012, 2013, 2014
Graduate Diploma in Information Technology (till 2014) 2012, 2013, 2014
Master of Data Science 2016, 2017, 2018, 2019, 2020
Master of Health Technology Innovation 2015, 2016, 2017, 2018, 2019, 2020
Master of Information Technology 2015, 2016, 2017, 2018, 2019, 2020
Master of Information Technology Management 2015, 2016, 2017, 2018, 2019, 2020
Master of IT/Master of IT Management 2015, 2016, 2017, 2018, 2019, 2020
Master of Information Technology (till 2014) 2014

Course Goals

This unit contributes to the achievement of the following course goals:

Attribute Practiced Assessed
(5) Interdisciplinary, Inclusiveness, Influence (Level 4) No 0%
(4) Design (Level 4) No 57%
(3) Problem Solving and Inventiveness (Level 4) No 32%
(1) Maths/ Science Methods and Tools (Level 4) No 11%

These goals are selected from Engineering & IT Graduate Outcomes Table 2018 which defines overall goals for courses where this unit is primarily offered. See Engineering & IT Graduate Outcomes Table 2018 for details of the attributes and levels to be developed in the course as a whole. Percentage figures alongside each course goal provide a rough indication of their relative weighting in assessment for this unit. Note that not all goals are necessarily part of assessment. Some may be more about practice activity. See Learning outcomes for details of what is assessed in relation to each goal and Assessment for details of how the outcome is assessed. See Attributes for details of practice provided for each goal.