IB99Z-15 Programming and Big Data Analytics
Introductory description
This is a foundational module offered in Term 1, specifically designed to establish a robust programming skillset for students. This immersive and hands-on module aims to provide a comprehensive understanding of the Python programming language, with a specific focus on its applications in data analysis and machine learning.
Module aims
Throughout this module, students will explore the entire spectrum of big data analysis, starting from essential descriptive analysis tasks such as data cleaning, data preparation, data wrangling, and visualization. They will then progress towards advanced prescriptive analysis elements, including time series analysis and machine learning. We will also explore the essential difference between traditional and big data approaches, and the key data engineering and architectural patterns associated with each.
To equip students with the necessary tools, the module teaches fundamental programming skills using key Python libraries such as NumPy and Pandas. These libraries are essential for manipulating complex datasets, particularly when working with large volumes of data. Students will also work with data management and machine learning libraries, which are central to constructing, evaluating, and optimizing machine learning models.
By blending theoretical knowledge with practical applications, this module prepares students to tackle real-world challenges in data analysis, machine learning, and big data analytics. No prior Python experience is required, but a basic understanding of programming concepts will be beneficial.
This module prepares the foundations for the Generative AI and FinTech Applications, and Machine Learning for Finance modules.
Outline syllabus
This is an indicative outline of the sort of topics that may be covered. Actual sessions held may differ.
Introduction to Data Science and Python
Data Structures and Python Language Basics
Big Data and Data Engineering
Data Loading, Cleaning, Wrangling, and Visualization
Exploratory Data Analysis using Python
Introduction to Machine Learning
Predictive Modeling and Decision Making
Network Analysis and real-time analytics
Ethical Considerations in Data Science
Data Science in the Real World
Learning outcomes
By the end of the module, students should be able to:
- Demonstrate comprehensive understanding of the foundational principles of data science, its methodologies, and its role in facilitating data-driven decision-making
- Demonstrate comprehensive understanding of the theoretical underpinnings of predictive modeling and decision theory
- Demonstrate an intuitive grasp of Python syntax and its diverse applications in data analysis
- Demonstrate critical thinking abilities, evaluating data validity and drawing logical conclusions from diverse data sets
- Cultivate creativity, using acquired knowledge to design and execute unique data science projects
Indicative reading list
Buisson, F. (2021). Behavioral Data Analysis with R and Python. O'Reilly Media.
Géron, A. (2022). Hands-on machine learning with Scikit-Learn, Keras, and TensorFlow. O'Reilly Media.
McKinney, W. (2012). Python for data analysis: Data wrangling with Pandas, NumPy, and IPython. O'Reilly Media.
Mount, G. (2021). Advancing into analytics: From Excel to Python and R. O'Reilly Media.
Nielsen, A. (2019). Practical time series analysis: Prediction with statistics and machine learning. O'Reilly Media.
Provost, F., & Fawcett, T. (2013). Data Science for Business: What you need to know about data mining and data-analytic thinking. O'Reilly Media.
Raschka, S., Liu, Y. H., Mirjalili, V., & Dzhulgakov, D. (2022). Machine Learning with PyTorch and Scikit-Learn: Develop machine learning and deep learning models with Python. Packt Publishing Ltd.
Sarkar, D. (2019). Text analytics with Python: A practitioner's guide to natural language processing. Apress.
Viafore, P. (2021). Robust Python. O'Reilly Media.
Subject specific skills
Use Python proficiently as a tool for data analysis
Demonstrate descriptive statistics, data aggregation, group operations, and time series analysis techniques to derive insights from data
Implement and evaluate machine learning models
Create a complete data science project using Python
Transferable skills
Communication skills
Problem solving
Study time
Type | Required |
---|---|
Practical classes | 10 sessions of 2 hours (13%) |
Online learning (scheduled sessions) | 10 sessions of 1 hour (7%) |
Private study | 48 hours (32%) |
Assessment | 72 hours (48%) |
Total | 150 hours |
Private study description
No private study requirements defined for this module.
Costs
No further costs have been identified for this module.
You do not need to pass all assessment components to pass the module.
Assessment group D
Component | Weighting | Study time | Eligible for self-certification |
---|---|---|---|
Assessment component: Group Work (15-minute presentation + 1,000 word report) | 20% | 14 hours | Yes (extension) |
Reassessment component: Individual Assignment | | | Yes (extension) |
Assessment component: Written Exam | 80% | 58 hours | No |
Reassessment component: same as the assessment component | | | |
Feedback on assessment
via my.wbs
There is currently no information about the courses for which this module is core or optional.