ST411-15 Dynamic Stochastic Control

Department: Statistics
Level: Undergraduate Level 4
Module leader: Gechun Liang
Credit value: 15
Assessment: 100% exam
Study location: University of Warwick main campus, Coventry

Introductory description

An example of stochastic control is AlphaGo, an autonomous agent trained to play the board game Go. Its goal is to learn a policy that maximises its chance of winning by choosing optimal moves in each game state. Through iterative learning, AlphaGo refines its decision-making, demonstrating the application of stochastic control theory in reinforcement learning.

This module is available for students on a course where it is a listed option and as an Unusual Option to students who have completed the prerequisite modules.

Prerequisites: ST318 Probability Theory and ST333 Applied Stochastic Processes.

Module web page

Module aims

This module focuses on stochastic control in applied probability, including its applications in mathematical finance and reinforcement learning. It aims to prepare students for careers in business, industry, or government, while also exploring cutting-edge research in the field.

Outline syllabus

This is an indicative module outline only, intended to show the sort of topics that may be covered. Actual sessions held may differ.

This module will cover the following areas:

Markov Decision Processes (Markov chains, dynamic programming for controlled Markov chains and optimal stopping)
Reinforcement Learning (temporal difference, Q-learning method, actor-critic method, policy gradient)
Continuous-Time Deterministic Optimal Control (calculus of variations, dynamic programming and Hamilton-Jacobi equations)
Continuous-Time Stochastic Optimal Control (controlled diffusion processes, dynamic programming and Hamilton-Jacobi-Bellman equations)
Continuous-Time Reinforcement Learning (entropy-regularized exploratory formulation)
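
As an illustration of the first topic above (dynamic programming for controlled Markov chains), the following is a minimal sketch of value iteration in Python. It is not taken from the module materials: the two-state chain, its rewards, the discount factor and all function and variable names are hypothetical and chosen only to keep the example small.

```python
# Value iteration for a small discounted Markov decision process.
# All numbers below are hypothetical and chosen only for illustration.

# TRANSITIONS[state][action] is a list of (probability, next_state, reward).
TRANSITIONS = [
    [  # state 0
        [(1.0, 0, 0.0)],                 # action 0: stay in state 0, no reward
        [(0.5, 0, 0.0), (0.5, 1, 0.0)],  # action 1: reach state 1 with probability 0.5
    ],
    [  # state 1
        [(1.0, 1, 1.0)],                 # action 0: stay in state 1, reward 1
        [(1.0, 0, 0.0)],                 # action 1: return to state 0, no reward
    ],
]


def q_value(transitions, values, state, action, gamma):
    """Expected one-step reward plus discounted value of the next state."""
    return sum(
        prob * (reward + gamma * values[next_state])
        for prob, next_state, reward in transitions[state][action]
    )


def value_iteration(transitions, gamma=0.9, tol=1e-10, max_iter=10_000):
    """Iterate the Bellman optimality operator until successive iterates agree."""
    n_states = len(transitions)
    values = [0.0] * n_states
    for _ in range(max_iter):
        new_values = [
            max(
                q_value(transitions, values, s, a, gamma)
                for a in range(len(transitions[s]))
            )
            for s in range(n_states)
        ]
        delta = max(abs(new - old) for new, old in zip(new_values, values))
        values = new_values
        if delta < tol:
            break
    # Greedy policy with respect to the converged value function.
    policy = [
        max(
            range(len(transitions[s])),
            key=lambda a: q_value(transitions, values, s, a, gamma),
        )
        for s in range(n_states)
    ]
    return values, policy


values, policy = value_iteration(TRANSITIONS)
print("values:", values)
print("policy:", policy)
```

Under these assumed numbers the greedy policy moves from state 0 towards state 1 and then stays there, since state 1 is the only source of reward. The Bellman equations for this chain can also be solved by hand, which gives a check on the numerical output.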

Learning outcomes

By the end of the module, students should be able to:

Identify and deal with stochastic control and optimal stopping problems.
Solve simple Hamilton-Jacobi-Bellman equations both explicitly and numerically.
Apply the above techniques to mathematical finance and reinforcement learning domains.
Select and apply appropriate reinforcement learning techniques and their basic algorithms to solve problems.

Indicative reading list

Reading lists can be found in Talis

Specific reading list for the module

Subject specific skills

Evaluate, select and apply appropriate mathematical and/or probabilistic techniques.
Demonstrate knowledge of and facility with formal probability concepts, both explicitly and by applying them to the solution of problems.
Create structured and coherent arguments, communicating them in written form.
Construct logical mathematical arguments with clear identification of assumptions and conclusions.
Reason critically, carefully, and logically and derive (prove) mathematical results.

Transferable skills

Problem solving: Use rational and logical reasoning to deduce appropriate and well-reasoned conclusions. Retain an open mind, optimistic of finding solutions, thinking laterally and creatively to look beyond the obvious. Know how to learn from failure.
Self awareness: Reflect on learning, seeking feedback on and evaluating personal practices, strengths and opportunities for personal growth.
Communication: Present arguments, knowledge and ideas, in a range of formats.
Professionalism: Prepared to operate autonomously. Aware of how to be efficient and resilient. Manage priorities and time. Self-motivated, setting and achieving goals, prioritising tasks.

Study time

Type	Required	Optional
Lectures	30 sessions of 1 hour (100%)	2 sessions of 1 hour
Total	30 hours

Private study description

Weekly revision of lecture notes and materials, wider reading, practice exercises and preparing for examination.

Costs

No further costs have been identified for this module.

You must pass all assessment components to pass the module.

Students can register for this module without taking any assessment.

Assessment group B4

Assessment component	Weighting	Study time	Eligible for self-certification
Centrally-timetabled examination (On-campus)	100%	20 hours	No
The examination paper will contain four questions, of which the best marks of THREE questions will be used to calculate your grade. Answerbook Pink (12 page)
Reassessment component is the same

Feedback on assessment

Solutions and cohort level feedback will be provided for the examination.

Past exam papers for ST411

Courses

This module is Optional for:

Year 1 of TMAA-G1P0 Postgraduate Taught Mathematics
TMAA-G1PC Postgraduate Taught Mathematics (Diploma plus MSc)
- Year 1 of G1PC Mathematics (Diploma plus MSc)
- Year 2 of G1PC Mathematics (Diploma plus MSc)
Year 1 of TSTA-G4P1 Postgraduate Taught Statistics
Year 4 of USTA-G300 Undergraduate Master of Mathematics, Operational Research, Statistics and Economics

This module is Option list A for:

Year 1 of TSTA-G4P1 Postgraduate Taught Statistics
Year 4 of USTA-G1G3 Undergraduate Mathematics and Statistics (BSc MMathStat)

This module is Option list B for:

TSTA-G4P1 Postgraduate Taught Statistics
- Year 1 of G40B Statistics with Data Science (Taught)
- Year 1 of G40C Statistics with Finance (Taught)
Year 4 of USTA-G304 Undergraduate Data Science (MSci)

This module is Option list C for:

UMAA-G105 Undergraduate Master of Mathematics (with Intercalated Year)
- Year 4 of G105 Mathematics (MMath) with Intercalated Year
- Year 5 of G105 Mathematics (MMath) with Intercalated Year
UMAA-G103 Undergraduate Mathematics (MMath)
- Year 3 of G103 Mathematics (MMath)
- Year 4 of G103 Mathematics (MMath)
Year 4 of UMAA-G107 Undergraduate Mathematics (MMath) with Study Abroad

This module is Option list F for:

Year 4 of USTA-G300 Undergraduate Master of Mathematics, Operational Research, Statistics and Economics