ST123-15 Foundations of Data Science 2
Introductory description
This module builds on the material introduced in Foundations of Data Science 1 to provide an enhanced set of data analysis tools including pattern prediction using machine learning. These enhanced tools provide a richer suite of techniques to explore and question real-world data.
This module is designed for those who have taken Foundations of Data Science 1, but who are otherwise not taking a mathematics or statistics course.
Prerequisite module: Foundations of Data Science 1.
Pre-registration required. Pre-registration takes place in Week 1 of Term 1; the module itself runs in Term 2. To pre-register, please visit the module page and complete the pre-registration form.
Availability: This module can be taken at either level 1 or level 2. Students interested in the level 2 version should consider ST239 Principles of Data Science 2. Students may not take both versions of this module. Students in Year 3 must take ST239 Principles of Data Science 2.
Module aims
This module aims to:
- explore hypothesis testing, prediction and regression through a Data Science lens.
- provide an introduction to classification as a route to machine learning.
- work hands-on with real data.
- develop Python programming skills to investigate data.
Outline syllabus
This is an indicative outline only, intended to show the sort of topics that may be covered. Actual sessions held may differ.
This module covers statistical areas such as hypothesis testing, estimation, prediction, inference and classification. These areas will be explored using real data sets and Python. Python will be used as a tool to deliver a pipeline from data, to analysis, to conclusions, to a presentation of results that supports decision-making under uncertainty.
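As a flavour of the hypothesis-testing part of the syllabus, the following is a minimal sketch of a permutation test for a difference in group means, using only the Python standard library. It is an illustration rather than module material: the function name `permutation_test`, its parameters and the two small samples are all hypothetical, and the numbers are made up.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference in means (illustrative)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the group labels are exchangeable,
        # so reshuffle the pooled values and split them again.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Proportion of shuffles at least as extreme as the observed difference.
    return observed, extreme / n_permutations


# Made-up measurements for two hypothetical groups.
group_a = [12.1, 14.3, 13.8, 15.2, 14.9, 13.5]
group_b = [11.0, 12.4, 11.9, 12.8, 11.5, 12.2]

observed, p_value = permutation_test(group_a, group_b)
print(f"observed difference in means: {observed:.2f}")
print(f"approximate p-value: {p_value:.4f}")
```

A small p-value here would suggest that a difference this large is unlikely if the group labels were assigned at random; how to weigh that evidence is the kind of judgement under uncertainty the module describes.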
Learning outcomes
By the end of the module, students should be able to:
- Apply Python to test hypotheses, estimate unknowns, make predictions and draw inferences.
- Evaluate the outcomes of data analysis to make judgements under uncertainty.
- Communicate the outcomes of an analysis coherently and clearly, in different forms, for a variety of audiences.
- Describe the main principles underlying inference from data.
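To illustrate what "estimate unknowns" and "make predictions" might look like in practice, here is a minimal sketch of fitting a least-squares line and using it to predict a new value. The helper names `fit_line` and `predict`, and the study-hours data, are hypothetical and chosen only for illustration; the module itself may use different tools or libraries.

```python
from statistics import mean


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(slope, intercept, x):
    """Predicted response at x under the fitted line."""
    return intercept + slope * x


# Made-up data: hours of study and a test score for five hypothetical students.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

slope, intercept = fit_line(hours, scores)
predicted = predict(slope, intercept, 6)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, prediction at 6 hours={predicted:.2f}")
```

The fitted slope and intercept are estimates of unknown quantities, and the prediction at 6 hours extrapolates beyond the observed data, so it should be reported with appropriate caution.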
Indicative reading list
Reading lists can be found in Talis.
Specific reading list for the module
Research element
There will be the opportunity to conduct an analysis of a contemporary data set, drawing conclusions from it and presenting new insights.
Interdisciplinary
Data Science is an interdisciplinary endeavour that draws on mathematics, statistics and computer science to provide new opportunities to explore other disciplines. Data sets and students will be drawn together by a shared interest in data exploration.
Subject specific skills
- Select and apply appropriate data techniques.
- Create structured and coherent arguments and communicate them in written form.
- Construct and develop logical arguments with clear identification of assumptions and conclusions.
- Communicate subject-specific information effectively and coherently.
- Analyse problems, abstracting their essential information and formulating them using appropriate language to facilitate their solution.
- Select and apply an appropriate statistical programming language for data analysis.
- Understand major aspects of data collection, generation, and quality, and how this influences analyses and conclusions.
Transferable skills
- Critical thinking: extracting patterns from incomplete data and using them to form evidence-based conclusions.
- Problem solving: use of logical reasoning to build arguments grounded in evidence and with explicit underlying assumptions.
- Self-awareness: monitoring of your own learning and seeking feedback.
- Communication: verbal discussion of ideas in seminars and among peers; written communication in assignments and the final project.
- Teamwork: collaboration with peers in seminars, during self-study and during the completion of extended tasks.
- Information literacy: evaluation of data and uncertainty in a model-based way.
- Digital literacy: use of computational tools to understand and visualise data, and to produce reports.
- Professionalism: self-motivation, taking charge of your own learning, and prioritising effectively.
- Ethics: reflect on professional responsibilities as a statistician in conjunction with the generation and dissemination of information.
Study time
| Type | Required |
|---|---|
| Practical classes | 10 sessions of 2 hours (13%) |
| Online learning (scheduled sessions) | 20 sessions of 1 hour (13%) |
| Private study | 60 hours (40%) |
| Assessment | 50 hours (33%) |
| Total | 150 hours |
Private study description
Studying online learning materials.
Preparation for and consolidation of practical sessions.
Costs
No further costs have been identified for this module.
You do not need to pass all assessment components to pass the module.
Assessment group A
| Assessment component | Weighting | Study time | Eligible for self-certification |
|---|---|---|---|
| Data Science Project | 100% | 50 hours | Yes (extension) |

A project carried out over the term that builds evidence of data analysis of a selected data set. The project requires:
Due to the nature of the work undertaken and the difficulty in assigning a word count to equations, figures, tables, graphics, data output and computer code, the word count is an approximation and an individual word count may vary depending on the nature of the analysis undertaken. The total length should not exceed 5 pages, not including appendices.

Reassessment component is the same.
Feedback on assessment
Grades and feedback will be returned online within 20 working days of the submission deadline.
Pre-requisites
The module is open to all students on all UG courses across the university, except for students taking mathematics or statistics degrees, who are already familiar with its key themes.
Pre-registration is required. This is available during Week 1 Term 1 via the Department of Statistics Module Information pages.
Only available to first- and second-year students.
The module Principles of Data Science 2 is a Level 5 (Year 2) version of this module with different learning and assessment outcomes. You cannot take both Foundations of Data Science 2 and Principles of Data Science 2.
Anti-requisite modules
If you take this module, you cannot also take:
- ST239-15 Principles of Data Science 2
There is currently no information about the courses for which this module is core or optional.