ST349-15 Machine Learning frameworks
Introductory description
This module introduces students to the contemporary practice of Machine Learning and Deep Learning. The module takes a hands-on approach where concepts are introduced using the various Machine Learning and Deep Learning frameworks developed in Python. This module is offered as an optional module to Statistics students, and as an unusual option to students from other departments, space permitting.
Prerequisites. This module assumes you have studied and completed ST231 Linear Statistical Modelling with R.
Recommended. It is recommended that you have studied and completed ST246 Python for Data Analytics Tasks and ST340 Programming for Data Science.
Module aims
The module aims to develop knowledge of:
- Machine Learning and Deep Learning concepts, tasks, models, and workflows.
- Machine Learning frameworks in Python.
- Deep learning frameworks in Python.
- Communicating outputs from Machine Learning and Deep Learning frameworks to literate audiences.
Outline syllabus
This is an indicative module outline only, intended to show the sort of topics that may be covered. Actual sessions held may differ.
This module covers the following topics.
- Modes of statistical learning / Common Machine learning tasks. Unsupervised Learning; Semi-supervised learning.
- Machine learning frameworks. Machine learning frameworks in Python (Scikit-Learn, JAX, graphlearning).
- Unsupervised learning / dimensionality reduction algorithms: for example, principal component analysis, singular value decomposition, t-SNE embeddings, k-means clustering, spectral clustering, Fokker-Planck clustering, hierarchical clustering, mixture models, DBSCAN, OPTICS.
- Semi-Supervised Learning algorithms: for example, Transductive Support Vector Machines, Modularity MBO, Generative Models, Nearest Neighbour, Laplace Learning, Poisson Learning, Centered Kernel Method, p-Eikonal Classifier.
Learning outcomes
By the end of the module, students should be able to:
- Appraise and decide on the appropriate type of statistical learning required to address a real-world challenge.
- Create data-analytic pipelines and workflows for the various modes of statistical learning.
- Implement common Machine Learning and Deep Learning frameworks in Python.
- Collaborate and disseminate fully documented Python code with reproducible outputs.
- Interpret and communicate the outputs of a completed analysis to a range of audiences.
Indicative reading list
Reading lists can be found in Talis
Interdisciplinary
This module requires students to develop a balanced facility with both machine learning frameworks and data-analytic skills for solving real-world problems across disciplines.
Subject specific skills
- Demonstrate advanced facility with data handling and analysis methods in R and Python.
- Create readable, valid, reliable, reproducible and well-documented code.
- Appraise problems, abstracting their essential information to make judgements on the appropriate concepts to facilitate their solution.
- Demonstrate programming skills and knowledge of programming concepts, both explicitly and by applying them to the solution of real-world problems.
Transferable skills
- Problem-solving skills: The module requires students to solve problems and present their conclusions as logical and coherent arguments.
- Written communication skills: Students complete written assessments that require precise and unambiguous communication in the manner and style expected in mathematical sciences.
- Verbal communication skills: Students are encouraged to discuss and debate formative assessment and lecture material within small-group tutorial sessions. Students can continually discuss specific aspects of the module with the module leader. This is facilitated by statistics staff office hours.
- Team working and working effectively with others: Students are encouraged to discuss and debate formative assessment and lecture material within small-group tutorial sessions.
- Professionalism: Students work autonomously by developing and sustaining effective approaches to learning, including time management, organisation, flexibility, creativity, collaboration and intellectual integrity.
Study time
| Type | Required |
|---|---|
| Lectures | 20 sessions of 1 hour (13%) |
| Practical classes | 10 sessions of 1 hour (6%) |
| Private study | 80 hours (52%) |
| Assessment | 44 hours (29%) |
| Total | 154 hours |
Private study description
Weekly revision of lecture notes and materials, wider reading and practice exercises, working on problem sets and preparing for examination.
Costs
No further costs have been identified for this module.
You do not need to pass all assessment components to pass the module.
Assessment group C
| Component | Weighting | Study time | Eligible for self-certification |
|---|---|---|---|
| Group Assignment 1 | 25% | 10 hours | No |
| Group Assignment 2 | 25% | 10 hours | No |
| Programming examination | 50% | 24 hours | No |

Group Assignment 1: A formal group report, to professional standards, presenting the analysis, interpretation and conclusion of the task set. All code must be version-controlled, well-documented and reproducible. The target audience is decision-makers who do not necessarily have advanced statistical training. For the purposes of this assessment, 500 words is equivalent to one page of text, diagrams, formulae or equations. Submitted code will be part of the report's appendix and will not count toward the page limit. This report must not exceed 8 pages in length.

Group Assignment 2: A formal group report, to professional standards, presenting the analysis, interpretation and conclusion of the task set. All code must be version-controlled, well-documented and reproducible. The target audience is decision-makers who do not necessarily have advanced statistical training. For the purposes of this assessment, 500 words is equivalent to one page of text, diagrams, formulae or equations. Submitted code will be part of the report's appendix and will not count toward the page limit. This report must not exceed 8 pages in length.

Programming examination: You will be given a problem that requires a programming solution. The output will present the analysis, interpretation and conclusion of the task set. All code must be well-documented and reproducible. The problem must be completed within a specified time window.
Assessment group R
| Component | Weighting | Study time | Eligible for self-certification |
|---|---|---|---|
| Programming Examination | 100% | | No |

Programming Examination: You will be given a problem that requires a programming solution. The output will present the analysis, interpretation and conclusion of the task set. All code must be well-documented and reproducible. The problem must be completed within a specified time window.
Feedback on assessment
Individual feedback will be provided on problem sheets by class tutors.
Cohort level feedback will be provided for the examination.
Students are actively encouraged to make use of office hours to build up their understanding and to view all their interactions with lecturers and class tutors as feedback.
Courses
This module is Optional for:
- USTA-G300 Undergraduate Master of Mathematics, Operational Research, Statistics and Economics
  - Year 3 of G300 Mathematics, Operational Research, Statistics and Economics
  - Year 4 of G300 Mathematics, Operational Research, Statistics and Economics
This module is Option list A for:
- Year 4 of USTA-G1G3 Undergraduate Mathematics and Statistics (BSc MMathStat)
- Year 3 of USTA-GG14 Undergraduate Mathematics and Statistics (BSc)
- Year 3 of USTA-Y602 Undergraduate Mathematics, Operational Research, Statistics and Economics
This module is Option list B for:
- Year 3 of USTA-G302 Undergraduate Data Science
- USTA-G304 Undergraduate Data Science (MSci)
  - Year 3 of G304 Data Science (MSci)
  - Year 4 of G304 Data Science (MSci)
- Year 4 of USTA-G303 Undergraduate Data Science (with Intercalated Year)
- Year 3 of USTA-G1G3 Undergraduate Mathematics and Statistics (BSc MMathStat)
This module is Option list G for:
- Year 3 of USTA-G300 Undergraduate Master of Mathematics, Operational Research, Statistics and Economics