IB99Z-15 Programming and Big Data Analytics

Department
Warwick Business School
Level
Taught Postgraduate Level
Module leader
Yi Ding
Credit value
15
Module duration
10 weeks
Assessment
20% coursework, 80% exam
Study location
University of Warwick main campus, Coventry

Introductory description

This is a foundational module offered in Term 1, specifically designed to establish a robust programming skillset for students. This immersive and hands-on module aims to provide a comprehensive understanding of the Python programming language, with a specific focus on its applications in data analysis and machine learning.

Module aims

Throughout this module, students will explore the entire spectrum of big data analysis, starting from essential descriptive analysis tasks such as data cleaning, data preparation, data wrangling, and visualization. They will then progress towards advanced prescriptive analysis elements, including time series analysis and machine learning. The module also explores the essential differences between traditional and big data approaches, and the key data engineering and architectural patterns associated with each.

To equip students with the necessary tools, they will gain fundamental programming skills using key Python libraries such as NumPy and Pandas. These libraries are essential for manipulating complex datasets, particularly when working with large volumes of data. Additionally, students will delve into data management and machine learning libraries, which are pivotal for constructing, evaluating, and optimizing machine learning models.
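As an illustration only, and not taken from the module materials, the kind of dataset manipulation described above might look like the following minimal sketch. The DataFrame, its column names, and its values are all hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical raw data: one duplicated row and one missing price.
raw = pd.DataFrame(
    {
        "ticker": ["AAA", "AAA", "BBB", "BBB", "BBB"],
        "price": [10.0, 10.0, 20.0, np.nan, 22.0],
    }
)

# Data cleaning: drop exact duplicate rows, then fill the missing price
# with the mean price of the same ticker.
clean = raw.drop_duplicates().copy()
clean["price"] = clean["price"].fillna(
    clean.groupby("ticker")["price"].transform("mean")
)

# NumPy works on the underlying array, here a vectorised log transform.
clean["log_price"] = np.log(clean["price"].to_numpy())

print(clean)
```

Filling with a per-group mean is only one of several possible imputation choices; it is used here because it keeps the example short.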

By blending theoretical knowledge with practical applications, this module prepares students to tackle real-world challenges in data analysis, machine learning, and big data analytics. While no prior Python experience is required, a basic understanding of programming concepts will be beneficial.

This module prepares the foundations for the Generative AI and FinTech Applications, and Machine Learning for Finance modules.

Outline syllabus

This is an indicative outline of the sort of topics that may be covered. Actual sessions held may differ.

Introduction to Data Science and Python

Data Structures and Python Language Basics

Big Data and Data Engineering

Data Loading, Cleaning, Wrangling, and Visualization

Exploratory Data Analysis using Python

Introduction to Machine Learning

Predictive Modeling and Decision Making

Network Analysis and Real-Time Analytics

Ethical Considerations in Data Science

Data Science in the Real World

Learning outcomes

By the end of the module, students should be able to:

  • Demonstrate comprehensive understanding of the foundational principles of data science, its methodologies, and its role in facilitating data-driven decision-making
  • Demonstrate comprehensive understanding of the theoretical underpinnings of predictive modeling and decision theory
  • Demonstrate an intuitive grasp of Python syntax and its diverse applications in data analysis
  • Demonstrate critical thinking abilities, evaluating data validity and drawing logical conclusions from diverse data sets
  • Cultivate creativity, using acquired knowledge to design and execute unique data science projects

Indicative reading list

Buisson, F. (2021). Behavioral Data Analysis with R and Python. O'Reilly Media.
Géron, A. (2022). Hands-on machine learning with Scikit-Learn, Keras, and TensorFlow. O'Reilly Media.
McKinney, W. (2012). Python for data analysis: Data wrangling with Pandas, NumPy, and IPython. O'Reilly Media.
Mount, G. (2021). Advancing into analytics: from Excel to Python and R.
Nielsen, A. (2019). Practical time series analysis: Prediction with statistics and machine learning. O'Reilly Media.
Provost, F., & Fawcett, T. (2013). Data Science for Business: What you need to know about data mining and data-analytic thinking. O'Reilly Media.
Raschka, S., Liu, Y. H., Mirjalili, V., & Dzhulgakov, D. (2022). Machine Learning with PyTorch and Scikit-Learn: Develop machine learning and deep learning models with Python. Packt Publishing.
Sarkar, D. (2019). Text analytics with Python: a practitioner's guide to natural language processing. Apress.
Viafore, P. (2021). Robust Python. O'Reilly Media.

Subject specific skills

Use Python proficiently as a tool for data analysis
Demonstrate descriptive statistics, data aggregation, group operations, and time series analysis techniques to derive insights from data
Implement and evaluate machine learning models
Create a complete data science project using Python
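For orientation, the descriptive statistics, group operations, and time series techniques listed above could be exercised in pandas roughly as follows. This is a hedged sketch rather than module content: the sales data, column names, and window size are assumptions made for the example.

```python
import pandas as pd

# Hypothetical daily sales for two regions.
sales = pd.DataFrame(
    {
        "date": pd.date_range("2024-01-01", periods=6, freq="D"),
        "region": ["north", "south", "north", "south", "north", "south"],
        "units": [10, 20, 30, 40, 50, 60],
    }
)

# Descriptive statistics: count, mean, std, min, quartiles, max.
summary = sales["units"].describe()

# Data aggregation and group operations: total and average units per region.
by_region = sales.groupby("region")["units"].agg(["sum", "mean"])

# Time series analysis: 3-day rolling mean over the date index.
rolling = sales.set_index("date")["units"].rolling(window=3).mean()

print(summary)
print(by_region)
print(rolling)
```

The first two entries of the rolling mean are missing because a 3-day window is not yet full on those dates.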

Transferable skills

Communication skills
Problem solving

Study time

Type                                    Required
Practical classes                       10 sessions of 2 hours (13%)
Online learning (scheduled sessions)    10 sessions of 1 hour (7%)
Private study                           48 hours (32%)
Assessment                              72 hours (48%)
Total                                   150 hours

Private study description

No private study requirements defined for this module.

Costs

No further costs have been identified for this module.

You do not need to pass all assessment components to pass the module.

Assessment group D

Assessment component    Weighting    Study time    Eligible for self-certification
Group Work              20%          14 hours      Yes (extension)
Written Exam            80%          58 hours      No

Group Work: 15-minute presentation + 1,000 word report
Reassessment component for Group Work: Individual Assignment, eligible for self-certification: Yes (extension)
Reassessment component for Written Exam: the same as the assessment component

Feedback on assessment

Feedback is provided via my.wbs.

Past exam papers for IB99Z

There is currently no information about the courses for which this module is core or optional.