ST231-10 Linear Statistical Modelling with R

Department
Statistics
Level
Undergraduate Level 2
Module leader
Paul Jenkins
Credit value
10
Module duration
10 weeks
Assessment
Multiple
Study location
University of Warwick main campus, Coventry
Introductory description

This module introduces the ideas and methods of statistical modelling and statistical model exploration.

Pre-requisites:

First Year Statistics Core (including ST117 Introduction to Statistical Modelling, ST118 Probability 1, and ST119 Probability 2) or equivalents.
AND
(Both ST229 Probability for Mathematical Statistics and ST230 Mathematical Statistics), or ST232/ST233 Introduction to Mathematical Statistics.

Leads to:
ST340 Programming for Data Science
ST344 Professional Practice of Data Analysis
ST346 Generalised Linear Models for Regression and Classification

Module aims

To introduce students to the application of R software and its use as a tool for statistical modelling, specifically for working with linear models in a variety of different scenarios.

Outline syllabus

This is an indicative module outline only, intended to show the sort of topics that may be covered. Actual sessions held may differ.

This module introduces the study of applied statistics and the use of R to perform inference.

  1. Introduction to R, including variables, functions, vectors, matrices, lists and control flow.
  2. Exploratory data analysis using R, including summary statistics and a wide array of plots.
  3. Linear regression: model assumptions; least squares estimators; fits, residuals and predictions; diagnostics; maximum likelihood estimators for normal linear regression and their sampling distributions; confidence intervals and hypothesis tests.
  4. Multiple linear regression: model assumptions; least squares estimators and their properties; estimators for normal linear models and their sampling distributions; confidence intervals and hypothesis tests; F-tests and analysis of variance; model selection and diagnostics.
Learning outcomes

By the end of the module, students should be able to:

  • Know the fundamentals of using R for statistical computing.
  • Describe and analyse data sets by performing exploratory data analysis using R.
  • Analyse estimators for linear regression and multiple linear regression.
  • Analyse and interpret data using suitable linear models, and understand the limits of such models.
  • Communicate solutions to problems accurately with structured and coherent arguments.
Indicative reading list
  1. Linear Models in Statistics, Rencher and Schaalje, Wiley (2008).

Subject specific skills

Demonstrate facility with advanced mathematical and probabilistic methods.

Demonstrate knowledge of key mathematical and statistical concepts, both explicitly and by applying them to the solution of mathematical problems. 

Select and apply appropriate mathematical and/or statistical techniques.

Create structured and coherent arguments, communicating them in written form.

Select and apply appropriate computational techniques in a statistical programming language (for example, R) for exploratory data analysis.

Transferable skills

Problem solving skills: The module requires students to solve problems, presenting their conclusions as logical and coherent arguments.

Written communication skills: Students complete written assessments that require precise and unambiguous communication in the manner and style expected in mathematical sciences.

Verbal communication skills: Students are encouraged to discuss and debate formative assessment and lecture material within small-group tutorial sessions. Students can also discuss specific aspects of the module with the module leader at any point; this is facilitated by statistics staff office hours.

Team working and working effectively with others: Students are encouraged to discuss and debate formative assessment and lecture material within small-group tutorial sessions.

Professionalism: Students work autonomously by developing and sustaining effective approaches to learning, including time-management, organisation, flexibility, creativity, collaboration and intellectual integrity.

Study time

Type               Required                       Optional
Lectures           20 sessions of 1 hour (20%)    2 sessions of 1 hour
Practical classes  10 sessions of 1 hour (10%)
Private study      40 hours (40%)
Assessment         30 hours (30%)
Total              100 hours
Private study description

Weekly revision of lecture notes and materials, wider reading and practice exercises, working on problem sets and preparing for examination.

Costs

No further costs have been identified for this module.

You do not need to pass all assessment components to pass the module.

Assessment group D
Weighting Study time
Assignment 1 20% 14 hours

You will use the statistical programming language R to carry out calculations and fit models on provided data sets in response to a set of questions. You will present, discuss, and evaluate the results. The length of the report will not exceed 18 pages, including figures, tables, code and R output.

The preparation and completion time noted below refers to the amount of time in hours that a well-prepared student who has attended lectures and carried out an appropriate amount of independent study on the material could expect to spend on this assignment.

Assignment 2 20% 14 hours

You will use the statistical programming language R to carry out calculations and fit models on provided data sets in response to a set of questions. You will present, discuss, and evaluate the results. The length of the report will not exceed 18 pages, including figures, tables, code and R output.

The preparation and completion time noted below refers to the amount of time in hours that a well-prepared student who has attended lectures and carried out an appropriate amount of independent study on the material could expect to spend on this assignment.

Linear Statistical Modelling with R examination 60% 2 hours

You will be required to answer all questions on this examination paper.


  • Answerbook Pink (12 page)
  • Students may use a calculator
  • Graph paper
Assessment group R
Weighting Study time
In-person Examination - Resit 100%

You will be required to answer all questions on this examination paper. To account for the absence of an assignment in the reassessment, the paper will include questions on using R for statistical computing, and performing exploratory data analysis using R.


  • Answerbook Pink (12 page)
  • Students may use a calculator
  • Graph paper
  • Cambridge Statistical Tables (blue)
Feedback on assessment

Individual feedback will be provided on problem sheets by class tutors.

Solutions and cohort-level feedback will be provided for the examination.

Students are actively encouraged to make use of office hours to build up their understanding, and to view all their interactions with lecturers and class tutors as feedback.

Courses

This module is Core for:

  • Year 2 of USTA-G302 Undergraduate Data Science
  • Year 2 of USTA-G304 Undergraduate Data Science (MSci)
  • USTA-G300 Undergraduate Master of Mathematics,Operational Research,Statistics and Economics
    • Year 2 of G30A Master of Maths, Op.Res, Stats & Economics (Actuarial and Financial Mathematics Stream)
    • Year 2 of G30B Master of Maths, Op.Res, Stats & Economics (Econometrics and Mathematical Economics Stream)
    • Year 2 of G30C Master of Maths, Op.Res, Stats & Economics (Operational Research and Statistics Stream)
    • Year 2 of G30D Master of Maths, Op.Res, Stats & Economics (Statistics with Mathematics Stream)
    • Year 2 of G300 Mathematics, Operational Research, Statistics and Economics
  • Year 2 of USTA-G1G3 Undergraduate Mathematics and Statistics (BSc MMathStat)
  • Year 2 of USTA-GG14 Undergraduate Mathematics and Statistics (BSc)
  • Year 2 of USTA-Y602 Undergraduate Mathematics,Operational Research,Statistics and Economics