| Instructor | Nikos Mamoulis |
| Teaching Assistant | |
| Syllabus | In the era of data, numerous real-world applications are best represented as networks. This perspective is vital because analyzing these networks lets us uncover valuable insights, extract interesting information, and make informed decisions. Modern technologies have significantly enhanced our ability to access vast volumes of data and have made storage simpler and cheaper. Understanding the importance of data is crucial in addressing diverse challenges, such as traffic congestion, fraud detection in financial networks, and the spread of misinformation in social networks, to name a few. Consequently, developing advanced tools that can address these challenges and deepen our understanding of data is more necessary than ever. Examples of such tools include machine learning techniques (e.g., modeling different problems using GNNs) and natural language processing (NLP) techniques (e.g., text preprocessing and sentiment analysis). |
| Introduction by Professor | The main objective of this course is to provide a comprehensive overview of data management tasks and resources, with a focus on working with big data effectively and efficiently. Specifically, this course will review state-of-the-art technologies for managing and storing data, as well as different approaches to data representation. In the era of big data analytics, many real-world applications can be represented as networks; in a financial network, for example, nodes represent users and edges represent monetary transactions. Such network representations enable us to uncover valuable insights, extract meaningful information, and make informed decisions. This course covers big data systems (Apache Spark, Apache Flink), provenance analytics for graphs, and various types of network analysis. We will also explore different database paradigms, including graph databases (Neo4j, TigerGraph), relational databases (PostgreSQL, DuckDB), and NoSQL databases (MongoDB, Cassandra). |
| Learning Outcomes | |
| Pre-requisites | Very good knowledge of programming (Python and C are recommended) and of fundamental data science concepts and techniques (e.g., linear algebra). |
| Compatibility | - |
| Topics covered | |
| Assessment | |
| Course materials | |
| Session dates | |
| Add/drop | 5 January, 2026 - 14 February, 2026 |
| Maximum class size | |
| Moodle course website | |