This advanced data analytics module explores state-of-the-art techniques for analysing complex, high-dimensional, and multimodal data at scale. Students will apply modern methods in data integration, embedding, visualisation, and privacy-preserving analytics using scalable tools. The module aims to deepen students' understanding of applied machine learning and data infrastructure in practical, data-intensive environments.
The module builds on Foundations of Computational Data Analytics, equipping students with a deep understanding of state-of-the-art methods for analysing complex, high-dimensional, and multimodal data at scale. Data Analytics is a core discipline within computer science, of increasing importance and significant economic impact in an age of digital transformation and emerging technologies. Because Data Analytics is highly interdisciplinary, students will be well placed to work in a wide range of application domains.
This is an indicative module outline only, intended to show the sort of topics that may be covered. Actual sessions held may differ.
Introduction to Data Representations: Structured, semi-structured, and unstructured data; metadata, provenance, and data lineage standards.
Multimodal Data and Fusion: Hybrid data structures (e.g. GeoJSON + time series), basics of multimodal fusion and alignment.
Scalable Data Cleaning: Outlier detection in high dimensions; standardisation, encoding, and processing with Dask, Spark, and Polars.
Exploratory Data Analysis (EDA): Techniques for text, time series, and geospatial data; correlation analysis and common EDA pitfalls.
Entity Resolution & Data Integration: Blocking, hashing, probabilistic matching, and schema alignment.
Dimensionality Reduction & Embedding: Linear and nonlinear projections (PCA, UMAP, autoencoders); manifold learning and interpretability.
Advanced Data Visualisation: Interactive, geospatial, network, and high-dimensional visualisation using tools such as Plotly and Dash.
Privacy-Aware Analytics: Differential privacy, federated learning, homomorphic encryption, and legal frameworks (GDPR, CCPA).
Modern Data Infrastructure: Data warehousing (BigQuery, Snowflake), ETL/ELT pipelines, and lakehouse architectures for large-scale analytics.
Applied ML for Complex Data: Contrastive and masked modelling, pretraining across modalities, and trends in large-scale ML for complex data.
By the end of the module, students should be able to:
Reading lists can be found in Talis.
Coursework will contain a research element.
Ability to analyse and transform complex, high-dimensional, and multimodal datasets using advanced data cleaning, integration, and embedding techniques.
Ability to apply scalable tools and frameworks to perform privacy-aware analytics and visualisations on large, heterogeneous data sources.
Skills to evaluate and implement modern data infrastructure and machine learning approaches for real-world, data-intensive applications.
Being able to apply Data Analytics knowledge and understanding of specialist theoretical and methodological approaches, suggesting and incorporating interrelationships with other relevant disciplines in abstract and unpredictably complex contexts.
Students will obtain the cognitive skills to critically contribute to existing discourses and methodologies in Data Analytics, suggesting new ideas, and designing systematic studies in Data Analytics based on critical analysis and evaluation.
Students will obtain practical skills in organising and communicating information, and will improve their interpersonal, team, and networking skills by engaging in classes and computer laboratories. Formative assessment will allow students to strategically enhance their own learning.
Data Analytics has immediate relevance for raising ethical awareness and applying it in practice, particularly regarding privacy concerns. The associated values will help students understand the importance of personal responsibility and ethical leadership.
| Type | Required |
|---|---|
| Lectures | 30 sessions of 1 hour (10%) |
| Seminars | 5 sessions of 2 hours (3%) |
| Supervised practical classes | 9 sessions of 2 hours (6%) |
| Private study | 116 hours (39%) |
| Assessment | 126 hours (42%) |
| Total | 300 hours |
Private study, background reading and revision.
No further costs have been identified for this module.
You do not need to pass all assessment components to pass the module.
| Assessment component | Weighting | Study time | Eligible for self-certification |
|---|---|---|---|
| Computational Data Analytics Coursework | 30% | 36 hours | No |
| Computational Data Analytics Exam | 70% | 90 hours | No |

The coursework will test competency in applying tools and techniques to perform tasks in advanced data analytics and demonstrate in-depth understanding of methods and concepts.

| Assessment component | Weighting | Study time | Eligible for self-certification |
|---|---|---|---|
| Computational Data Analytics Resit Exam | 100% | | No |
Individual feedback will be provided for the coursework. Collective feedback will be provided for the exam.
This module is Core optional for: