ST411-15 Dynamic Stochastic Control

Department: Statistics
Level: Undergraduate Level 4
Module leader: Gechun Liang
Credit value: 15
Assessment: 100% exam
Study location: University of Warwick main campus, Coventry

Introductory description

An example of stochastic control is AlphaGo, an autonomous agent trained to play the board game Go. Its goal is to learn a policy that maximises its chance of winning by choosing the best available move in each game state. Through iterative learning, AlphaGo refines its decision-making, demonstrating the application of stochastic control theory in reinforcement learning.

This module is available to students on a course where it is a listed option, and as an Unusual Option to students who have completed the prerequisite modules.

Prerequisites: ST318 Probability Theory and ST333 Applied Stochastic Processes.

Module web page

Module aims

This module focuses on stochastic control in applied probability, including its applications in mathematical finance and reinforcement learning. It aims to prepare students for careers in business, industry, or government, while also exploring cutting-edge research in the field.

Outline syllabus

This is an indicative module outline only, intended to show the sort of topics that may be covered. Actual sessions held may differ.

This module will cover the following areas:

Markov Decision Processes (Markov chains, dynamic programming for controlled Markov chains and optimal stopping)
Reinforcement Learning (temporal difference, Q-learning method, actor-critic method, policy gradient)
Continuous-Time Deterministic Optimal Control (calculus of variations, dynamic programming and Hamilton-Jacobi equations)
Continuous-Time Stochastic Optimal Control (controlled diffusion processes, dynamic programming and Hamilton-Jacobi-Bellman equations)
Continuous-Time Reinforcement Learning (entropy-regularized exploratory formulation)
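
As a rough illustration of the first syllabus item (dynamic programming for controlled Markov chains), the Python sketch below runs value iteration on a hypothetical two-state, two-action chain. The transition matrices, rewards, discount factor and the function name value_iteration are assumptions chosen for illustration; they are not taken from the module materials.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-10, max_iter=10_000):
    """Value iteration for a finite controlled Markov chain (illustrative).

    P[a, s, t] : probability of moving from state s to state t under action a
    R[a, s]    : expected one-step reward for taking action a in state s
    gamma      : discount factor in (0, 1)
    """
    n_actions, n_states, _ = P.shape
    V = np.zeros(n_states)
    for _ in range(max_iter):
        # Bellman optimality update: Q[a, s] = R[a, s] + gamma * sum_t P[a, s, t] V[t]
        Q = R + gamma * (P @ V)
        V_new = Q.max(axis=0)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    # Greedy policy with respect to the final value estimate
    Q = R + gamma * (P @ V)
    return V, Q.argmax(axis=0)

# Hypothetical toy problem: action 0 = "stay", action 1 = "move to the other state".
P = np.array([
    [[1.0, 0.0], [0.0, 1.0]],  # stay
    [[0.0, 1.0], [1.0, 0.0]],  # move
])
# Reward 1 only for staying in state 1; every other (action, state) pair pays 0.
R = np.array([
    [0.0, 1.0],  # stay
    [0.0, 0.0],  # move
])

V, policy = value_iteration(P, R, gamma=0.9)
print("values:", V)
print("policy:", policy)
```

In this toy problem the greedy policy moves out of state 0 and then stays in state 1, so the values approach 9 and 10 respectively (1 / (1 - 0.9) = 10 for state 1, discounted once for state 0).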

Learning outcomes

By the end of the module, students should be able to:

Identify and deal with stochastic control and optimal stopping problems.
Solve simple Hamilton-Jacobi-Bellman equations both explicitly and numerically.
Apply the above techniques to mathematical finance and reinforcement learning domains.
Select and apply appropriate reinforcement learning techniques and their basic algorithms to solve problems.

Indicative reading list

Reading lists can be found in Talis.

Specific reading list for the module

Subject specific skills

Evaluate, select and apply appropriate mathematical and/or probabilistic techniques.
Demonstrate knowledge of and facility with formal probability concepts, both explicitly and by applying them to the solution of problems.
Create structured and coherent arguments and communicate them in written form.
Construct logical mathematical arguments with clear identification of assumptions and conclusions.
Reason critically, carefully, and logically and derive (prove) mathematical results.

Transferable skills

Problem solving: Use rational and logical reasoning to deduce appropriate and well-reasoned conclusions. Retain an open mind, optimistic of finding solutions, thinking laterally and creatively to look beyond the obvious. Know how to learn from failure.
Self awareness: Reflect on learning, seeking feedback on and evaluating personal practices, strengths and opportunities for personal growth.
Communication: Present arguments, knowledge and ideas in a range of formats.
Professionalism: Prepared to operate autonomously. Aware of how to be efficient and resilient. Manage priorities and time. Self-motivated, setting and achieving goals, prioritising tasks.

Study time

Type	Required	Optional
Lectures	30 sessions of 1 hour (60%)	2 sessions of 1 hour
Assessment	20 hours (40%)
Total	50 hours

Private study description

Weekly revision of lecture notes and materials, wider reading, practice exercises and preparing for examination.

Costs

No further costs have been identified for this module.

You must pass all assessment components to pass the module.

Students can register for this module without taking any assessment.

Assessment group B4

Assessment component	Weighting	Study time	Eligible for self-certification
Centrally-timetabled examination (On-campus)	100%	20 hours	No
The examination paper will contain four questions, of which the best marks of THREE questions will be used to calculate your grade. Answerbook Pink (12 page)
Reassessment component is the same

Feedback on assessment

Solutions and cohort level feedback will be provided for the examination.

Past exam papers for ST411

Courses

This module is Optional for:

Year 1 of TMAA-G1P0 Postgraduate Taught Mathematics
Year 4 of USTA-G304 Undergraduate Data Science (MSci)
Year 5 of USTA-G305 Undergraduate Data Science (MSci) (with Intercalated Year)
UMAA-G105 Undergraduate Master of Mathematics (with Intercalated Year)
- Year 3 of G105 Mathematics (MMath) with Intercalated Year
- Year 5 of G105 Mathematics (MMath) with Intercalated Year
USTA-G300 Undergraduate Master of Mathematics, Operational Research, Statistics and Economics
- Year 4 of G30A Master of Maths, Op.Res, Stats & Economics (Actuarial and Financial Mathematics Stream)
- Year 4 of G30J Master of Maths, Op.Res, Stats & Economics (Data Analysis Stream)
- Year 4 of G30B Master of Maths, Op.Res, Stats & Economics (Econometrics and Mathematical Economics Stream)
- Year 4 of G30C Master of Maths, Op.Res, Stats & Economics (Operational Research and Statistics Stream)
- Year 4 of G30D Master of Maths, Op.Res, Stats & Economics (Statistics with Mathematics Stream)
- Year 4 of G300 Mathematics, Operational Research, Statistics and Economics
USTA-G301 Undergraduate Master of Mathematics, Operational Research, Statistics and Economics (with Intercalated
- Year 5 of G301 BSc Master of Mathematics, Operational Research, Statistics and Economics (with Intercalated Year)
- Year 5 of G30E Master of Maths, Op.Res, Stats & Economics (Actuarial and Financial Mathematics Stream) Int
- Year 5 of G30K Master of Maths, Op.Res, Stats & Economics (Data Analysis Stream) Int
- Year 5 of G30F Master of Maths, Op.Res, Stats & Economics (Econometrics and Mathematical Economics Stream) Int
- Year 5 of G30G Master of Maths, Op.Res, Stats & Economics (Operational Research and Statistics Stream) Int
- Year 5 of G30H Master of Maths, Op.Res, Stats & Economics (Statistics with Mathematics Stream)
UMAA-G100 Undergraduate Mathematics (BSc)
- Year 3 of G100 Mathematics
UMAA-G103 Undergraduate Mathematics (MMath)
- Year 3 of G100 Mathematics
- Year 3 of G103 Mathematics (MMath)
- Year 4 of G100 Mathematics
- Year 4 of G103 Mathematics (MMath)
UMAA-G106 Undergraduate Mathematics (MMath) with Study in Europe
- Year 3 of G106 Mathematics (MMath) with Study in Europe
- Year 4 of G106 Mathematics (MMath) with Study in Europe
Year 4 of USTA-G1G3 Undergraduate Mathematics and Statistics (BSc MMathStat)
Year 5 of USTA-G1G4 Undergraduate Mathematics and Statistics (BSc MMathStat) (with Intercalated Year)
Year 4 of UMAA-G101 Undergraduate Mathematics with Intercalated Year