COMP7607B - Natural language processing

Semester 1, 2025-26

Instructor
Lingpeng Kong
Syllabus Natural language processing (NLP) is the study of human language from a computational perspective. The course focuses on machine learning and corpus-based methods and algorithms. We will cover syntactic, semantic and discourse processing models, and describe how these methods and models are used in applications including syntactic parsing, information extraction, statistical machine translation, dialogue systems, and summarization. The course starts with language models (LMs), which are front and center in NLP, and then introduces key machine learning (ML) ideas that students should grasp (e.g., feature-based models, log-linear models, and then neural models). We will land on modern generic meaning representation methods (e.g., BERT/GPT-3) and the idea of pretraining / finetuning.
Introduction by Professor Natural language processing (NLP) is the study of human language from a computational perspective. The course will investigate modern NLP algorithms from the machine learning perspective. We will cover syntactic, semantic and other useful models for language, and describe how these methods and models are used in applications including syntactic parsing, information extraction, statistical machine translation, dialogue systems, and summarization. The course starts with language models (LMs), which are front and center in NLP, and then introduces key machine learning (ML) ideas that students should grasp (e.g., feature-based models, log-linear models, and then neural models). We will land on modern generic meaning representation methods (e.g., BERT/GPT-3) and the idea of pretraining / finetuning / prompt-based learning.
Learning Outcomes
Course Learning Outcomes
CLO1. Able to understand the motivations and principles for building natural language processing systems
CLO2. Able to master a set of key machine learning / statistical methods which are widely used in and beyond NLP
CLO3. Able to implement practical applications of NLP using tools such as NLTK, PyTorch and DyNet
Pre-requisites -
Compatibility Nil
Prior knowledge expected Basic knowledge about Machine Learning, Probability, Statistics, and Programming
Topics covered
Course Content No. of Hours Course Learning Outcomes
1. Introduction to NLP, Language Models, RNNLMs 3 CLO1
2. BERT, Pretraining + Fine-tuning 3 CLO1, CLO2, CLO3
3. Computational Graphs and Sequence to Sequence Model 3 CLO1, CLO2
4. Attention Mechanism and Transformers 3 CLO1, CLO2
5. Parsing, Context-free Grammars, Probabilistic Context-free Grammars 3 CLO1, CLO2
6. Recursive Neural Networks, Shift-reduce Parsing and Stack-LSTMs, Dependency Parsing, Recurrent Neural Network Grammars 3 CLO1, CLO2
7. Large Pretrained Models, Prompt, Prefix-Tuning and Adapters 3 CLO1, CLO2
8. Natural Language Generation, Controllable Text Generation 3 CLO1, CLO2, CLO3
9. Question Answering 2 CLO2, CLO3
10. Multilinguality, Multimodality, NLP + Vision 2 CLO2, CLO3
11. Model Interpretability, Social NLP 2 CLO2, CLO3
 
Assessment
Description Type Weighting * Tentative Assessment Period / Examination Period ^ Course Learning Outcomes
Quiz-based Assignment Continuous Assessment 25% - CLO1, CLO2
Programming-based Assignment Continuous Assessment 25%   CLO1, CLO2, CLO3
Project-based Assignment (Final Project) Continuous Assessment 50%   CLO1, CLO2, CLO3
* The weighting of coursework and examination marks is subject to approval
^ The exact examination date is typically announced by the Examinations Office seven weeks prior to the scheduled exam date (three weeks for the summer semester). Students are obliged to follow the examination schedule. If you are unsure of your availability during the examination period, you should NOT enroll in the course. Absence from the examination may result in failure of the course. Please note that there is no supplementary examination for this course.
Course materials Prescribed textbook:
  •  Jurafsky, Daniel, and James H. Martin. "Speech and Language Processing."
Session dates
Date Time Venue Remark
Session 1 4 Sep 2025 (Thu) 1:00pm - 4:00pm MW-T2
Session 2 11 Sep 2025 (Thu) 1:00pm - 4:00pm MW-T2
Session 3 18 Sep 2025 (Thu) 1:00pm - 4:00pm MW-T2
Session 4 25 Sep 2025 (Thu) 1:00pm - 4:00pm MW-T2
Session 5 2 Oct 2025 (Thu) 1:00pm - 4:00pm MW-T2
Session 6 4 Oct 2025 (Sat) 5:00pm - 8:00pm CYP-P2
Session 7 9 Oct 2025 (Thu) 1:00pm - 4:00pm MW-T2
Session 8 16 Oct 2025 (Thu) 1:00pm - 4:00pm MW-T2
Session 9 18 Oct 2025 (Sat) 5:00pm - 8:00pm CYP-P2
Session 10 20 Nov 2025 (Thu) 1:00pm - 4:00pm MW-T2
CYP - Chong Yuet Ming Building; MW - Meng Wah Complex
Add/drop 1 September, 2025 - 14 September, 2025