ST349-15 Machine Learning frameworks

Department
Statistics
Level
Undergraduate Level 3
Module leader
Wenkai Xu
Credit value
15
Module duration
10 weeks
Assessment
Multiple
Study location
University of Warwick main campus, Coventry

Introductory description

This module introduces students to the contemporary practice of Machine Learning and Deep Learning. The module takes a hands-on approach where concepts are introduced using the various Machine Learning and Deep Learning frameworks developed in Python. This module is offered as an optional module to Statistics students, and as an unusual option to students from other departments, space permitting.

Prerequisites. This module assumes you have studied and completed ST231 Linear Statistical Modelling with R.

Recommended. It is recommended that you have studied and completed ST246 Python for Data Analytics Tasks and ST340 Programming for Data Science.

Module aims

The module aims to develop knowledge of:

  1. Machine Learning and Deep Learning concepts, tasks, models, and workflows.
  2. Machine Learning frameworks in Python.
  3. Deep learning frameworks in Python.
  4. Communicating outputs from Machine Learning and Deep Learning frameworks to literate audiences.

Outline syllabus

This is an indicative outline only, intended to show the sort of topics that may be covered. Actual sessions held may differ.

This module covers the following topics.

  1. Modes of statistical learning / Common Machine learning tasks. Unsupervised Learning; Semi-supervised learning.

  2. Machine learning frameworks. Machine learning frameworks in Python (Scikit-Learn, JAX, graphlearning).

  3. Unsupervised learning / dimensionality reduction algorithms: for example, principal component analysis, singular value decomposition, t-SNE embeddings, k-means clustering, spectral clustering, Fokker-Planck clustering, Hierarchical clustering, mixture models, DBSCAN, OPTICS.

  4. Semi-Supervised Learning algorithms: for example, Transductive Support Vector Machines, Modularity MBO, Generative Models, Nearest Neighbour, Laplace Learning, Poisson Learning, Centered Kernel Method, p-Eikonal Classifier.

Learning outcomes

By the end of the module, students should be able to:

  • Appraise and decide on the appropriate type of statistical learning required to address a real-world challenge.
  • Create data-analytic pipelines and workflows for the various modes of statistical learning.
  • Implement common Machine Learning and Deep Learning frameworks in Python.
  • Collaborate and disseminate fully documented Python code with reproducible outputs.
  • Interpret and communicate the outputs of a completed analysis to a range of audiences.

Indicative reading list

Reading lists can be found in Talis

Interdisciplinary

This module requires students to develop a balanced combination of familiarity with machine learning frameworks and data-analytic skills for solving real-world problems across disciplines.

Subject specific skills

  1. Demonstrate advanced facility with data handling and analysis methods in R and Python.

  2. Create readable, valid, reliable, reproducible and well-documented code.

  3. Appraise problems, abstracting their essential information to make judgements on the appropriate concepts to facilitate their solution.

  4. Demonstrate programming skills and knowledge of programming concepts, both explicitly and by applying them to the solution of real-world problems.

Transferable skills

  1. Problem-solving skills: The module requires students to solve problems and present their conclusions as logical and coherent arguments.

  2. Written communication skills: Students complete written assessments that require precise and unambiguous communication in the manner and style expected in mathematical sciences.

  3. Verbal communication skills: Students are encouraged to discuss and debate formative assessment and lecture material within small-group tutorial sessions. Students can continually discuss specific aspects of the module with the module leader. This is facilitated by statistics staff office hours.

  4. Team working and working effectively with others: Students are encouraged to discuss and debate formative assessment and lecture material within small-group tutorial sessions.

  5. Professionalism: Students work autonomously by developing and sustaining effective approaches to learning, including time management, organisation, flexibility, creativity, collaboration and intellectual integrity.

Study time

Type Required
Lectures 20 sessions of 1 hour (13%)
Practical classes 10 sessions of 1 hour (6%)
Private study 80 hours (52%)
Assessment 44 hours (29%)
Total 154 hours

Private study description

Weekly revision of lecture notes and materials, wider reading and practice exercises, working on problem sets and preparing for examination.

Costs

No further costs have been identified for this module.

You do not need to pass all assessment components to pass the module.

Assessment group C
Weighting Study time Eligible for self-certification
Group Assignment 1 25% 10 hours No

A formal group report, to professional standards, presenting the analysis, interpretation and conclusion of the task set. All code must be version-controlled, well-documented and reproducible. The target audience is decision-makers who do not necessarily have advanced statistical training. For the purposes of this assessment, 500 words is equivalent to one page of text, diagrams, formulae or equations. Submitted code will be part of the report's appendix and will not count toward the page limit. This report must not exceed 8 pages in length.

Group Assignment 2 25% 10 hours No

A formal group report, to professional standards, presenting the analysis, interpretation and conclusion of the task set. All code must be version-controlled, well-documented and reproducible. The target audience is decision-makers who do not necessarily have advanced statistical training. For the purposes of this assessment, 500 words is equivalent to one page of text, diagrams, formulae or equations. Submitted code will be part of the report's appendix and will not count toward the page limit. This report must not exceed 8 pages in length.

Programming examination 50% 24 hours No

You will be given a problem that requires a programming solution. The output will present the analysis, interpretation and conclusion of the task set. All code must be well-documented and reproducible.

The problem must be completed in a specified time window.

Assessment group R
Weighting Study time Eligible for self-certification
Programming Examination 100% No

You will be given a problem that requires a programming solution. The output will present the analysis, interpretation and conclusion of the task set. All code must be well-documented and reproducible.

The problem must be completed within a specified time window.

Feedback on assessment

Individual feedback will be provided on problem sheets by class tutors.

Cohort level feedback will be provided for the examination.

Students are actively encouraged to make use of office hours to build up their understanding and to view all their interactions with lecturers and class tutors as feedback.

Past exam papers for ST349

Courses

This module is Optional for:

  • USTA-G300 Undergraduate Master of Mathematics, Operational Research, Statistics and Economics
    • Year 3 of G300 Mathematics, Operational Research, Statistics and Economics
    • Year 4 of G300 Mathematics, Operational Research, Statistics and Economics

This module is Option list A for:

  • Year 4 of USTA-G1G3 Undergraduate Mathematics and Statistics (BSc MMathStat)
  • Year 3 of USTA-GG14 Undergraduate Mathematics and Statistics (BSc)
  • Year 3 of USTA-Y602 Undergraduate Mathematics, Operational Research, Statistics and Economics

This module is Option list B for:

  • Year 3 of USTA-G302 Undergraduate Data Science
  • USTA-G304 Undergraduate Data Science (MSci)
    • Year 3 of G304 Data Science (MSci)
    • Year 4 of G304 Data Science (MSci)
  • Year 4 of USTA-G303 Undergraduate Data Science (with Intercalated Year)
  • Year 3 of USTA-G1G3 Undergraduate Mathematics and Statistics (BSc MMathStat)

This module is Option list G for:

  • Year 3 of USTA-G300 Undergraduate Master of Mathematics, Operational Research, Statistics and Economics