PS382-15 Large Language Data
Introductory description
Psychologists are increasingly using large text and speech datasets to investigate the human mind. This trend raises two key questions: How can we effectively analyse such vast language data? And what unique insights can large language data offer into cognitive processes and human behaviour?
Module aims
This module equips psychology students with fundamental data science skills, enabling them to analyse and interpret large language datasets to gain deeper insights into cognitive processes and human behaviour. By incorporating programming—a highly sought-after skill in many jobs—this module enhances students’ employability in research, industry, and data-driven roles.
Outline syllabus
This is an indicative outline of the sort of topics that may be covered. Actual sessions may differ.
A 2-hour lecture and a 1-hour seminar each week.
Best practices and ethical issues in collecting and analysing large data; Basic Programming; Corpus Analysis; Data Visualisation; Text Mining; Sentiment Analysis; Network Analysis; Speech Analysis.
Learning outcomes
By the end of the module, students should be able to:
- Critically evaluate quantitative psychological research that uses large language datasets.
- Organise, process, visualise, and analyse large language datasets.
- Communicate, report, and interpret large language data science findings accurately and concisely.
Indicative reading list
Balota, D. A., Yap, M. J., Cortese, M. J., Hutchison, K. A., Kessler, B., Loftis, B., Neely, J. H., Nelson, D. L., Simpson, G. B., & Treiman, R. (2007). The English Lexicon Project. Behavior Research Methods, 39(3), 445–459. https://doi.org/10.3758/BF03193014
De Deyne, S., Navarro, D. J., Perfors, A., Brysbaert, M., & Storms, G. (2019). The “Small World of Words” English word association norms for over 12,000 cue words. Behavior Research Methods, 51(3), 987–1006. https://doi.org/10.3758/s13428-018-1115-7
Frank, M. C., Braginsky, M., Yurovsky, D., & Marchman, V. A. (2021). Variability and consistency in early language learning: The Wordbank project. MIT Press.
Gablasova, D., Brezina, V., & McEnery, T. (2017). Collocations in corpus-based language learning research: Identifying, comparing, and interpreting the evidence. Language Learning, 67, 155–179. https://doi.org/10.1111/lang.12225
Gries, S. Th. (2009). Quantitative corpus linguistics with R: A practical introduction. Routledge.
Hills, T. T., Proto, E., Sgroi, D., & Seresinhe, C. I. (2019). Historical analysis of national subjective wellbeing using millions of digitized books. Nature Human Behaviour, 3(12), 1271–1275. https://doi.org/10.1038/s41562-019-0750-z
Wickham, H. (2010). ggplot2: Elegant graphics for data analysis. Springer.
Winter, B. (2020). Statistics for linguists: An introduction using R. Routledge.
Research element
Seminars will provide students with opportunities to practise new data science skills, present their projects, and attend data clinics for support with their reports and presentations.
Interdisciplinary
This module is inherently interdisciplinary, integrating psychology, data science, and computational linguistics to explore both psychological and linguistic phenomena.
Subject specific skills
Students will develop subject-specific skills in data management, programming, data visualisation, and quantitative analysis. These skills will enable them to analyse large datasets, critically evaluate data-driven research, and apply data science techniques to psychological research.
Transferable skills
Students will develop transferable skills in critical evaluation, data analysis, and communication. These skills will enable them to critically assess data-driven research, analyse and interpret large datasets, and report findings accurately and concisely.
Study time
Type | Required | Optional |
---|---|---|
Lectures | 11 sessions of 2 hours (15%) | |
Seminars | 10 sessions of 1 hour (7%) | |
Online learning (independent) | (0%) | 15 sessions of 1 hour |
Private study | 40 hours (27%) | |
Assessment | 78 hours (52%) | |
Total | 150 hours |
Private study description
Students are encouraged to locate related reading materials online or in the library and engage with them as part of private and independent learning.
Costs
No further costs have been identified for this module.
You do not need to pass all assessment components to pass the module.
Assessment group D
Assessment component | Weighting | Study time | Eligible for self-certification |
---|---|---|---|
Oral presentation | 5% | 8 hours | No |
Written report | 40% | 30 hours | Yes (extension) |
Exam | 55% | 40 hours | No |
Oral presentation: In the presentation, students will introduce their chosen dataset and research question(s) for the written report. Students are expected to outline their analytical approach for feedback. Reassessment component is the same.
Written report: In the written report, students will demonstrate their proficiency in conducting a large language data analysis to explore their research question(s). As part of the assessment, students are expected to document reproducible code. Reassessment component is the same.
Exam: The exam will consist of a variety of question types, such as multiple-choice questions, short-answer questions, and fill-in-the-blank questions. Platforms: AEP, Moodle, WAS. Reassessment component is the same.
Feedback on assessment
Oral and written feedback on the oral presentation.
Written feedback on the script and in a feedback form for the written report.
No feedback on the MCQ exam.
Feedback on the presentation will be provided within 20 working days. Grades on the report and exam will be provided in line with Department of Psychology procedure.
Pre-requisites
To take this module, you must have passed:
Courses
This module is Optional for:
- Year 3 of UPHA-VL78 BA in Philosophy with Psychology
- UPSA-C800 Undergraduate Psychology
  - Year 3 of C800 Psychology
- Year 4 of UPSA-C801 Undergraduate Psychology (with Intercalated year)
- Year 3 of UIPA-C8L8 Undergraduate Psychology and Global Sustainable Development
- Year 3 of UPSA-C804 Undergraduate Psychology with Education Studies
- UPSA-C802 Undergraduate Psychology with Linguistics
  - Year 3 of C802 Psychology with Linguistics