WM931-15 Data Science & Machine Learning

Department
WMG
Level
Taught Postgraduate Level
Module leader
Michael Mortenson
Credit value
15
Module duration
2 weeks
Assessment
Multiple
Study locations
  • University of Warwick main campus, Coventry (Primary)
  • Distance or Online Delivery

Introductory description

Data Science and Machine Learning have become key drivers of business change and value generation in the modern digital economy. The ability to derive insights and recommendations, and to automate actions, from a wide range of datasets (both traditional and non-traditional, i.e. Big Data) is integral to the competitive advantage of many of the world's largest businesses. This module provides practical exposure to these methods, as well as the underlying theories and concepts.

Module aims

This module aims to enable participants to select, implement and evaluate machine learning algorithms in data science. In particular, the module highlights several of the most common and in-demand families of modern algorithms, including classification, regression, clustering, dimension reduction and ensemble methods. Alongside technical knowledge, participants should develop an understanding of the applicability of different types of machine learning to common problems, and of best practice for data science and Big Data analytics projects.

Outline syllabus

This is an indicative module outline only, intended to show the sort of topics that may be covered. Actual sessions held may differ.

Data Science Foundations: Core concepts of Data Science & Machine Learning; Data pre-processing & feature engineering.
Unsupervised Learning: K-means clustering; DBSCAN; Principal Component Analysis.
Classification: Theoretical background; Naïve Bayes; Decision Trees; Support Vector Machines; Model selection and evaluation.
Regression: Theoretical background; Linear models; Ridge Regression; Lasso Regression; Gaussian Process Regression; Model selection and evaluation.
Ensemble Methods: Bagging; Boosting; Voting.
Computational Complexity and High Performance Computing: Analysis of algorithms; Apache Spark and PySpark; Batch/Online algorithms.

Learning outcomes

By the end of the module, students should be able to:

  • Interpret and evaluate various use-cases and the applicability of data science and machine learning.
  • Develop a comprehensive understanding of best practices for data processing and feature engineering.
  • Implement, interpret and critique current, professional-standard learning models.
  • Automate deployment-ready data science pipelines and algorithms.
  • Evaluate and interpret the results of machine learning models and tune them to optimise performance.
  • Develop comprehension of the core topics of data science, machine learning and artificial intelligence.

Interdisciplinary

Statistics and computer science topics

International

Data science topics and skills are in high demand internationally.

Subject specific skills

Data science, machine learning, statistics, ensemble learning, software development, data analysis

Transferable skills

Programming, statistics and modelling, teamwork, critical analysis

Study time

Type                             Required
Lectures                         12 sessions of 1 hour (8%)
Seminars                         10 sessions of 1 hour (7%)
Practical classes                8 sessions of 1 hour (5%)
Online learning (independent)    15 sessions of 1 hour (10%)
Assessment                       105 hours (70%)
Total                            150 hours

Private study description

No private study requirements defined for this module.

Costs

No further costs have been identified for this module.

You do not need to pass all assessment components to pass the module.

Assessment group A
Weighting Study time Eligible for self-certification
Model Development 10% 7 hours 30 minutes No

In teams, participants create a data science solution on a real-world dataset and present their approach.

Post Module Assignment 80% 90 hours Yes (extension)

A two-part submission: the first part is an essay-style question on a data science/machine learning topic; the second is a working program that can model a given dataset.

Feature engineering programming task 10% 7 hours 30 minutes No

Time-limited test of programming and data skills via a feature engineering task.

Assessment group R
Weighting Study time Eligible for self-certification
Post Module Assignment 100% Yes (extension)

A two-part submission: the first part is an essay-style question on a data science/machine learning topic; the second is a working program that can model a given dataset.

Feedback on assessment

For in-module work: test scores and verbal feedback after the presentation.
For post-module work: annotated scripts returned to students and generic written feedback to the group.

Courses

This module is Optional for:

  • Year 1 of TWMS-H1SH Postgraduate Taught Cyber Security Management (Full-time)
  • Year 1 of TESA-H7PK Postgraduate Taught e-Business Management