ST231-10 Linear Statistical Modelling with R
Introductory description
This module builds on ideas in statistical modelling introduced in Year 1 and embeds them in the framework of linear models. Linear regression models are widely used in statistical practice and aim to explain or predict a continuous response variable using a collection of explanatory variables. Students will learn the theoretical background of such models, how to fit linear models to a given data set using R, and how to interpret and evaluate the results.
Pre-requisites:
- Statistics Students:
  - First Year Statistics Core (including ST117 Introduction to Statistical Modelling, ST118 Probability 1, and ST119 Probability 2) AND
  - ST229 Probability for Mathematical Statistics and ST230 Mathematical Statistics.
- External Students:
  - ST120 Introduction to Probability, ST121 Statistical Laboratory and ST232 Introduction to Mathematical Statistics.
Leads to
- ST340 Fundamentals of Machine Learning
- ST344 Professional Practice of Data Analysis
- ST346 Generalised Linear Models for Regression and Classification
- Other third-year statistics modules
Module aims
- Introduce the application of statistical modelling and statistical model exploration.
- Introduce the R software as a tool for statistical modelling, specifically for working with linear models in a variety of scenarios.
Outline syllabus
This is an indicative outline of the topics that may be covered. Actual sessions held may differ.
This module introduces the theory of normal linear models and their practical application in R.
- Normal linear models: definition and model assumptions.
- Estimators for normal linear models and their sampling distributions.
- Diagnostics and model building.
- Confidence intervals and t-tests for normal linear models.
- F-tests and analysis of variance; model selection and diagnostics.
- Variable selection.
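For orientation, the normal linear model named in the outline is usually written as follows. This is the standard textbook formulation rather than the module's own notation: the symbols $y$, $X$, $\beta$, $\sigma^2$, $n$ and $p$ are assumptions introduced here, and $X$ is assumed to be an $n \times p$ design matrix of full column rank.

```latex
% Normal linear model: n observations, p regression coefficients
y = X\beta + \varepsilon, \qquad \varepsilon \sim N_n(0, \sigma^2 I_n)

% Least-squares (and maximum-likelihood) estimator and its sampling distribution
\hat\beta = (X^\top X)^{-1} X^\top y, \qquad
\hat\beta \sim N_p\bigl(\beta, \sigma^2 (X^\top X)^{-1}\bigr)

% Unbiased variance estimator, independent of \hat\beta
\hat\sigma^2 = \frac{\lVert y - X\hat\beta \rVert^2}{n - p}, \qquad
\frac{(n - p)\,\hat\sigma^2}{\sigma^2} \sim \chi^2_{n-p}

% t-statistic underlying confidence intervals and tests for a single coefficient
\frac{\hat\beta_j - \beta_j}{\hat\sigma \sqrt{[(X^\top X)^{-1}]_{jj}}} \sim t_{n-p}
```

F-tests for comparing nested models follow from the same distributional results, using ratios of residual sums of squares.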
Learning outcomes
By the end of the module, students should be able to:
- Define a (normal) linear model and describe its modelling assumptions;
- Derive the properties of estimators for normal linear models; compute confidence intervals and perform hypothesis tests for normal linear models;
- Fit, diagnostically check, improve and compare regression models in R;
- Interpret and critically evaluate various linear models;
- Communicate solutions to problems accurately with structured and coherent arguments.
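As a rough illustration of the "fit, check, improve and compare" workflow in these outcomes, the sketch below uses only base R. The `cars` data set ships with R and is used here as an assumption for illustration; it is not one of the module's data sets, and the object names (`fit_linear`, `fit_quadratic`, and so on) are hypothetical.

```r
# Illustrative only: `cars` is a base R data set (speed vs stopping distance),
# not a data set taken from the module.
fit_linear <- lm(dist ~ speed, data = cars)

# Coefficient estimates, standard errors and t-tests
coef_table <- summary(fit_linear)$coefficients
print(coef_table)

# 95% confidence intervals for the coefficients
ci <- confint(fit_linear, level = 0.95)
print(ci)

# Diagnostic check: residuals against fitted values
plot(fitted(fit_linear), resid(fit_linear),
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

# Candidate improvement: add a quadratic term, then compare
# the two nested models with an F-test
fit_quadratic <- lm(dist ~ speed + I(speed^2), data = cars)
model_comparison <- anova(fit_linear, fit_quadratic)
print(model_comparison)
```

Whether the larger model is preferred depends on the F-test together with the diagnostic plots, not on the p-value alone.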
Indicative reading list
Reading lists can be found in Talis
Research element
Students complete guided exploration of data sets as part of the coursework, which provides a foundation for applied statistics research in later years.
Interdisciplinary
While not explicitly interdisciplinary, students are exposed to data sets from a variety of application contexts.
Subject specific skills
- Demonstrate facility with advanced mathematical and probabilistic methods.
- Demonstrate knowledge of key mathematical and statistical concepts, both explicitly and by applying them to the solution of mathematical problems.
- Select and apply appropriate mathematical and/or statistical techniques.
- Create structured and coherent arguments and communicate them in written form.
- Select and apply appropriate computational techniques in a statistical programming language (for example, R) to build and evaluate linear models.
Transferable skills
- Problem solving skills: The module requires students to solve problems and present their conclusions as logical and coherent arguments.
- Written communication skills: Students complete written assessments that require precise and unambiguous communication in the manner and style expected in mathematical sciences.
- Verbal communication skills: Students are encouraged to discuss and debate formative assessment and lecture material within small-group tutorial sessions. Students can continually discuss specific aspects of the module with the module leader; this is facilitated by statistics staff office hours.
- Team working and working effectively with others: Students are encouraged to discuss and debate formative assessment and lecture material within small-group tutorial sessions.
- Professionalism: Students work autonomously by developing and sustaining effective approaches to learning, including time-management, organisation, flexibility, creativity, collaboration and intellectual integrity.
Study time
| Type | Required |
|---|---|
| Lectures | 20 sessions of 1 hour (20%) |
| Practical classes | 9 sessions of 1 hour (9%) |
| Private study | 46 hours (46%) |
| Assessment | 25 hours (25%) |
| Total | 100 hours |
Private study description
Weekly revision of lecture notes and materials, wider reading and practice exercises, working on problem sets and preparing for examination.
Other activity description
Revision support.
Costs
No further costs have been identified for this module.
You do not need to pass all assessment components to pass the module.
Assessment group D3
| Component | Weighting | Study time | Eligible for self-certification |
|---|---|---|---|
| Assignment | 20% | 18 hours | No |
| Set of short lab reports | 10% | 5 hours | No |
| Centrally-timetabled examination (On-campus) | 70% | 2 hours | No |

Assignment: You will use the statistical programming language R to carry out calculations and fit models on provided data sets in response to a set of questions. You will present, discuss, and evaluate the results. The length of the report will not exceed 18 pages, including figures, tables, code and R output. The preparation and completion time noted refers to the amount of time in hours that a well-prepared student who has attended lectures and carried out an appropriate amount of independent study on the material could expect to spend on this assignment.

Set of short lab reports: There will be approximately weekly problem sets. Each set will contain a number of individual questions based on the material delivered in the lectures. Problem sheets are supported by practical classes, including analytical and computational tasks and computer-based work. Assessment is based on solutions to the problems and engagement in practical classes. The preparation and completion time noted refers to the amount of time in hours that a well-prepared student who has attended lectures and carried out an appropriate amount of independent study on the material could expect to spend on this assessment.

Centrally-timetabled examination (On-campus): You will be required to answer all questions on this examination paper. The study time noted refers to the length of the exam in hours.
Assessment group R3
| Component | Weighting | Study time | Eligible for self-certification |
|---|---|---|---|
| In-person Examination - Resit | 100% | | No |

In-person Examination - Resit: You will be required to answer all questions on this examination paper.
Feedback on assessment
Individual feedback will be provided on problem sheets by class tutors.
Solutions and cohort-level feedback will be provided for the examination.
Students are actively encouraged to make use of office hours to build up their understanding, and to view all their interactions with lecturers and class tutors as feedback.
Anti-requisite modules
If you take this module, you cannot also take:
- ST240-15 Linear Statistical Modelling
- ST351-15 Linear Statistical Modelling (For Finalists)
Courses
This module is Core for:
- USTA-G302 Undergraduate Data Science
  - Year 2 of G302 Data Science
- Year 2 of USTA-G304 Undergraduate Data Science (MSci)
- USTA-G300 Undergraduate Master of Mathematics, Operational Research, Statistics and Economics
  - Year 2 of G30A Master of Maths, Op.Res, Stats & Economics (Actuarial and Financial Mathematics Stream)
  - Year 2 of G30J Master of Maths, Op.Res, Stats & Economics (Data Analysis Stream)
  - Year 2 of G30B Master of Maths, Op.Res, Stats & Economics (Econometrics and Mathematical Economics Stream)
  - Year 2 of G30C Master of Maths, Op.Res, Stats & Economics (Operational Research and Statistics Stream)
  - Year 2 of G30D Master of Maths, Op.Res, Stats & Economics (Statistics with Mathematics Stream)
  - Year 2 of G300 Mathematics, Operational Research, Statistics and Economics
- Year 2 of USTA-G1G3 Undergraduate Mathematics and Statistics (BSc MMathStat)
- USTA-GG14 Undergraduate Mathematics and Statistics (BSc)
  - Year 2 of GG14 Mathematics and Statistics
- USTA-Y602 Undergraduate Mathematics, Operational Research, Statistics and Economics
  - Year 2 of Y602 Mathematics, Operational Research, Stats, Economics
This module is Optional for:
- UMAA-G100 Undergraduate Mathematics (BSc)
  - Year 2 of G100 Mathematics
- UMAA-G103 Undergraduate Mathematics (MMath)
  - Year 2 of G100 Mathematics
  - Year 2 of G103 Mathematics (MMath)