Research programmes traditionally assume that students will find their own way to analyse and visualise their data. This assumption causes problems for students and their supervisors and wastes a significant amount of time. Many students are intimidated by their data rather than curious about it. They often skip exploratory data analysis and go straight to advanced statistical models that they cannot later explain, because they do not understand their data in depth. This course provides the basic knowledge, skills and tools needed to perform exploratory data analysis, with a major focus on publication-ready data visualisation: detecting patterns and trends, extracting meaningful information and preparing for further inferential analysis.
On successful completion of the course, the students will be able to understand/perform:
On successful completion of the course, students will have the knowledge and practical skills to apply R and its essential functions and packages to wrangle and transform their research data, carry out informative exploratory analyses and produce publication-ready visualisations, enabling them to interpret and communicate their research results effectively to the scientific community.
To pass the course, each student must present a final project, together with its code script, containing an exploratory analysis of an original dataset of their choice.
Please tune into class with a laptop that has the following installed:
A recent version of R (>= 4.0.0), which is available for free at https://cran.r-project.org/
A recent version of RStudio Desktop (>=1.3.0), available for free at https://www.rstudio.com/download (RStudio Desktop Open Source License)
The R packages we will use, which you can install by connecting to the internet, opening RStudio, and running the following in the R console:
```r
install.packages(c("tidyverse", "gtsummary",
                   "janitor", "ggpubr",
                   "ggthemes", "naniar", "NHANES"))
```
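To confirm before class that the installation worked, a short check along the following lines may help. This is only a sketch using base R. The variable names `required` and `missing_pkgs` are illustrative and not part of the course material.

```r
# Packages listed in the installation command above
required <- c("tidyverse", "gtsummary", "janitor", "ggpubr",
              "ggthemes", "naniar", "NHANES")

# Names of all packages currently installed in your R library
installed <- rownames(installed.packages())

# Packages from the list that are not yet installed
missing_pkgs <- setdiff(required, installed)

if (length(missing_pkgs) == 0) {
  message("All course packages are installed.")
} else {
  message("Still missing: ", paste(missing_pkgs, collapse = ", "))
}
```

If any packages are reported as missing, rerunning `install.packages()` for just those names is usually enough.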
Session | Theme | Contents |
---|---|---|
1 | Data Visualization: Why | Introduction to R - The grammar of graphics: data, geoms, aes - Basic visualisations - How to deal with errors |
2 | Data Visualization II: How | geom_point - geom_histogram - geom_col - geom_bar - facets - Locating and dealing with NA |
3 | Data Wrangling: Why | Filter - Select - Mutate - Summarise I - Arrange - lubridate |
4 | Data Wrangling II: How | Pivoting - group_by - Summarise II - gtsummary |
5 | Data Management | Git, OSF - Data workflow: data validation, data entry forms, when and when not to use spreadsheets - File naming - Basics of code management: tidylog - Basics of data cleaning: janitor - Good coding practices and data sharing - Codebook (dataMaid) |
6 | Final project | Student presentations, 15 min per project, max. 6 projects |
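
To give a flavour of the kind of workflow sessions 1 to 4 build towards, here is a minimal sketch that locates missing values, wrangles the data with a few dplyr verbs and draws a faceted scatter plot. It is not taken from the course material. It assumes R's built-in `airquality` dataset so that it runs without any extra data, and the names `na_counts`, `monthly`, `temp_c` and `p` are purely illustrative.

```r
library(dplyr)
library(ggplot2)

# Locate missing values: number of NA entries in each column
na_counts <- colSums(is.na(airquality))
print(na_counts)

# Wrangle: drop rows without an ozone reading, derive temperature in
# Celsius, then summarise per month and sort by mean ozone
monthly <- airquality %>%
  filter(!is.na(Ozone)) %>%
  mutate(temp_c = (Temp - 32) * 5 / 9) %>%
  group_by(Month) %>%
  summarise(n = n(),
            mean_ozone = mean(Ozone),
            mean_temp_c = mean(temp_c)) %>%
  arrange(desc(mean_ozone))
print(monthly)

# Visualise: ozone against temperature, one panel per month
p <- ggplot(filter(airquality, !is.na(Ozone)),
            aes(x = Temp, y = Ozone)) +
  geom_point() +
  facet_wrap(~ Month) +
  labs(x = "Temperature (°F)", y = "Ozone (ppb)")
print(p)
```

The same pattern of inspecting missingness, transforming, summarising and plotting should carry over to the dataset chosen for the final project, although the specific columns and geoms will differ.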