COMP7103B - Data mining

Semester 1, 2025-26

Instructor
Mauro Sozio
Syllabus Data mining is the automatic discovery of statistically interesting and potentially useful patterns from large amounts of data.  The goal of the course is to study the main methods used today for data mining and on-line analytical processing.  Topics include Data Mining Architecture; Data Preprocessing; Mining Association Rules; Classification; Clustering; On-Line Analytical Processing (OLAP); Data Mining Systems and Languages; Advanced Data Mining (Web, Spatial, and Temporal data).
Introduction by Professor Advances in data collection and generation technologies are producing massive amounts of data from which valuable information and knowledge can be derived. In this course we study various data mining techniques, which are powerful tools for data analysts to process data and extract interesting patterns and models from it. These models allow new scientific discoveries and intelligent business decisions to be made.
Learning Outcomes
Course Learning Outcomes
CLO1. Understand the knowledge discovery process, which includes data collection, data cleaning, model building, model testing and evaluation.
CLO2. Understand the various data mining tasks and the fundamental algorithms for achieving those tasks.
View Programme Learning Outcomes - MSc(CompSc)
View Programme Learning Outcomes - MSc(FTDA)
Pre-requisites Nil
Compatibility Nil
Topics covered
Course Content No. of Hours Course Learning Outcomes
1. Data Cleaning 4 CLO1
2. Data Exploration 4 CLO1
3. Ranking 4 CLO1
4. Clustering 6 CLO2
5. Association Rules 4 CLO2
6. Recommender Systems 3 CLO2
7. Advanced applications 5 CLO1, CLO2
 
Assessment
Description Type Weighting * Examination Period ^ Course Learning Outcomes
Assignments Continuous Assessment 40% -   
Midterm exam Continuous Assessment 10% -   
Written exam Written Examination 50% 3 - 23 December 2025 CLO1, CLO2
* The weighting of coursework and examination marks is subject to approval
^ The exact examination date is typically announced by the Examinations Office seven weeks prior to the scheduled exam date (three weeks for the summer semester). Students are obliged to follow the examination schedule. If you are unsure of your availability during the examination period, you should NOT enroll in the course. Absence from the examination may result in failure of the course. Please note that there is no supplementary examination for this course.
Course materials Prescribed textbooks:
  • Introduction to Data Mining, by Tan, Steinbach, and Kumar, Addison Wesley, 2006
  • Data Mining: Concepts and Techniques, by Han and Kamber, Morgan Kaufmann
Session dates
Date Time Venue Remark
Session 1 2 Oct 2025 (Thu) 7:00pm - 10:00pm KK-102
Session 2 9 Oct 2025 (Thu) 7:00pm - 10:00pm KK-102
Session 3 12 Oct 2025 (Sun) 7:00pm - 10:00pm LE-4
Session 4 16 Oct 2025 (Thu) 7:00pm - 10:00pm KK-102
Session 5 23 Oct 2025 (Thu) 7:00pm - 10:00pm KK-102
Session 6 30 Oct 2025 (Thu) 7:00pm - 10:00pm KK-102
Session 7 6 Nov 2025 (Thu) 7:00pm - 10:00pm KK-102
Session 8 9 Nov 2025 (Sun) 7:00pm - 10:00pm LE-4
Session 9 13 Nov 2025 (Thu) 7:00pm - 10:00pm KK-102
Session 10 20 Nov 2025 (Thu) 7:00pm - 10:00pm KK-102
Exam 11 Dec 2025 (Thu) 6:30pm - 8:30pm CPD-LG.07-10, CPD-LG.60, CPD-LG.61 & CPD-LG.62
CPD - Central Podium Levels (Centennial Campus); KK - K.K. Leung Building; LE - Library Extension Building
Add/drop 1 September, 2025 - 9 October, 2025