CS909-15 Data Mining
Introductory description
Data Mining.
Module aims
Understanding of the value of data mining in solving real-world problems;
Understanding of foundational concepts underlying data mining;
Understanding of algorithms commonly used in data mining tools;
Ability to apply data mining tools to real-world problems.
Outline syllabus
This outline is indicative only and shows the sort of topics that may be covered. Actual sessions may differ.
Introduction to machine learning, basic concepts and motivation;
Data preprocessing and basic data transformations;
Regression models (linear regression, logistic regression);
Classification: decision trees, probabilistic generative models;
Model evaluation, bias-variance trade-off;
Ensemble methods: boosting, bagging & random forests;
Dimensionality reduction: Principal Component Analysis (PCA), t-distributed Stochastic Neighbour Embedding (t-SNE);
Introduction to deep learning, backpropagation, gradient descent;
Convolutional neural networks;
Word embeddings;
Sequence-to-sequence models;
Attention mechanisms and memory networks;
Unsupervised deep learning and generative models;
Transfer learning.
Learning outcomes
By the end of the module, students should be able to:
 Display a comprehensive understanding of different data mining tasks and the algorithms most appropriate for addressing them.
 Evaluate models/algorithms with respect to their accuracy.
 Demonstrate the capacity to perform a self-directed piece of practical work that requires the application of data mining techniques.
 Critique the results of a data mining exercise.
 Develop hypotheses based on the analysis of the results obtained and test them.
 Conceptualise a data mining solution to a practical problem.
Indicative reading list
Please see the Talis Aspire link for the most up-to-date list.
View reading list on Talis Aspire
Research element
Students will be required to explore the literature on the latest methods in classification and deep learning.
Interdisciplinary
Data mining lies at the intersection of statistics, computer science and mathematics.
Subject specific skills
Design of data mining solutions
Learning to develop novel algorithms related to machine learning
Conducting proper experiment design in machine learning
Transferable skills
Experiment design
Critical thinking
How to conduct literature reviews
Study time
Type               Required
Lectures           30 sessions of 1 hour (20%)
Practical classes  10 sessions of 1 hour (7%)
Private study      110 hours (73%)
Total              150 hours
Private study description
Private study should focus on the following components:
a. Assigned reading
b. Coding exercises
c. Assignment solution
d. Review of the lab component
e. Revision of the lecture slides
Costs
No further costs have been identified for this module.
You do not need to pass all assessment components to pass the module.
Students can register for this module without taking any assessment.
Assessment group D2
Component              Weighting  Study time

Assignment 2           35%
Assignment 2. This assignment is worth more than 3 CATS and is therefore not eligible for self-certification.

Assignment 1           25%
Assignment 1. This assignment is worth more than 3 CATS and is therefore not eligible for self-certification.

In-person Examination  40%
CS909 Examination

Assessment group R1
Component                      Weighting  Study time

In-person Examination (Resit)  100%
CS909 MSc resit examination (platform: AEP)

Feedback on assessment
Formative feedback will be provided in lab sessions and during lectures, where answers to short exercises are given in class.
Summative feedback:
 Written feedback on the practical assignment will be provided electronically, with an explanation of the mark awarded.
Prerequisites
No Warwick module is required as a prerequisite. However, familiarity with basic probability and statistics (for example: discrete and continuous random variables, densities and distributions, common distributions including the Bernoulli, binomial, uniform and normal distributions, and expectations) will be needed.
Courses
This module is Core for:
 Year 1 of TPSSC803 Postgraduate Taught Behavioural and Data Science
 Year 1 of TCSAG5PA Postgraduate Taught Data Analytics
 Year 1 of TCSAG5PB Postgraduate Taught Data Analytics (CUSP)
This module is Optional for:
 Year 2 of TIMSL990 Postgraduate Big Data and Digital Futures
 Year 1 of TESAH641 Postgraduate Taught Communications and Information Engineering

 TCSAG5PD Postgraduate Taught Computer Science
  Year 1 of G5PD Computer Science
 TIMAL995 Postgraduate Taught Data Visualisation
  Year 1 of L995 Data Visualisation
  Year 2 of L995 Data Visualisation
 Year 1 of TMAAG1PF Postgraduate Taught Mathematics of Systems
 Year 1 of TSTAG4P1 Postgraduate Taught Statistics
 Year 1 of TIMAL99D Postgraduate Taught Urban Analytics and Visualisation
 Year 2 of TIMAL99C Postgraduate Urban Informatics and Analytics
This module is Option list A for:
 Year 5 of UCSAG504 MEng Computer Science (with intercalated year)
 Year 1 of TIMSL990 Postgraduate Big Data and Digital Futures
 Year 1 of TMAAG1PF Postgraduate Taught Mathematics of Systems
 Year 1 of TIMAL99C Postgraduate Urban Informatics and Analytics
 Year 4 of UCSAG503 Undergraduate Computer Science MEng