Most of the data analysis tasks you will perform won’t rely on base R alone: you will most often load additional packages for specialized functions. Packages are installed from CRAN with install.packages(), for example:
install.packages("effsize")
For development or personal use, you may occasionally install packages from outside CRAN, such as from GitHub:
After installing, you need to load the packages using the library() function:
After loading a package, its functions are directly callable throughout the R session:
You may also call any function from any installed package directly, even without loading the package, using “::”. This is especially useful when there is a risk of functions with conflicting names, or if you just don’t want to load an entire package to use a single function:
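A minimal sketch of both ways of calling package functions. It uses the tools and stats packages only because they ship with every R installation, so nothing needs to be installed; the file name is just an illustrative string, and the same pattern applies to any CRAN package.

```r
# Load a package that ships with R (chosen only so the example runs
# without installing anything)
library(tools)

# After library(), the package's functions can be called directly
ext_loaded <- file_ext("data/Performance.csv")

# With "::" a function is called without loading its package first
ext_direct <- tools::file_ext("data/Performance.csv")
sd_direct  <- stats::sd(c(2, 4, 6))

ext_loaded   # "csv"
ext_direct   # "csv"
sd_direct    # 2
```

The “::” form also makes it explicit which package a function comes from, which helps when two loaded packages export functions with the same name.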
| Package | Used for what | Examples of functions |
|---|---|---|
| base (base R) | Basic functions | sum, mean, sqrt, abs, c, data.frame, summary, scale, plot, +, - |
| stats (base R) | Basic statistical calculations and functions | sd, cor, cor.test, t.test, lm, glm, AIC, rnorm, rbinom |
| graphics (base R) | Basic plotting functions | plot, boxplot, hist, barplot |
(You may actually use these “base” packages very often without even realizing that they are packages)
| Package | Used for what | Examples of functions |
|---|---|---|
| lme4 | Fitting (generalized) (non-)linear mixed-effects models | lmer, glmer, ranef |
| performance | Useful tools for models | check_collinearity, r2_nagelkerke, icc |
| effects | Display effects for various statistical models | allEffects, effect |
| emmeans | Estimate marginal means for various models | emmeans |
| effectsize | Compute or convert different effect sizes | cohens_d, hedges_g, cohens_f, d_to_r |
| Package | Used for what | Examples of functions |
|---|---|---|
| ggplot2 | Create beautiful plots using The Grammar of Graphics | ggplot, geom_point, geom_line, … |
| lavaan | Structural Equation Models (SEM) | sem, cfa |
| semTools | Useful tools for SEMs | compRelSEM, measEq.syntax |
| metafor | Perform meta-analysis | rma, rma.mv, forest, funnel, regtest |
| brms | Fitting practically any Bayesian model via MCMC with Stan | brm, set_prior |
| blavaan | Fitting Bayesian SEMs | bcfa, bsem |
When sharing your full R project, also share info on your R version, OS, and loaded package versions for reproducibility. The sessionInfo() function prints all of this:
sessionInfo()
R version 4.5.1 (2025-06-13 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 22631)
Matrix products: default
LAPACK version 3.12.1
locale:
[1] LC_COLLATE=English_United Kingdom.utf8
[2] LC_CTYPE=English_United Kingdom.utf8
[3] LC_MONETARY=English_United Kingdom.utf8
[4] LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.utf8
time zone: Europe/Rome
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] ggplot2_3.5.2 semTools_0.5-6 lme4_1.1-36 Matrix_1.6-2 lavaan_0.6-19
[6] psych_2.5.6
loaded via a namespace (and not attached):
[1] generics_0.1.3 sandwich_3.1-1 lattice_0.22-6 digest_0.6.37
[5] magrittr_2.0.3 evaluate_1.0.3 grid_4.5.1 estimability_1.5.1
[9] RColorBrewer_1.1-3 mvtnorm_1.3-3 fastmap_1.2.0 jsonlite_1.9.1
[13] survival_3.8-3 multcomp_1.4-28 scales_1.4.0 TH.data_1.1-3
[17] pbivnorm_0.6.0 codetools_0.2-20 mnormt_2.1.1 reformulas_0.4.0
[21] Rdpack_2.6.2 cli_3.6.4 rlang_1.1.5 rbibutils_2.3
[25] splines_4.5.1 withr_3.0.2 yaml_2.3.10 tools_4.5.1
[29] parallel_4.5.1 nloptr_2.1.1 coda_0.19-4.1 dplyr_1.1.4
[33] minqa_1.2.8 boot_1.3-31 vctrs_0.6.5 R6_2.6.1
[37] stats4_4.5.1 zoo_1.8-13 lifecycle_1.0.4 emmeans_1.10.7
[41] MASS_7.3-60 pkgconfig_2.0.3 pillar_1.10.2 gtable_0.3.6
[45] glue_1.8.0 Rcpp_1.0.14 tidyselect_1.2.1 xfun_0.51
[49] tibble_3.3.0 rstudioapi_0.17.1 knitr_1.49 farver_2.1.2
[53] xtable_1.8-4 htmltools_0.5.8.1 nlme_3.1-167 rmarkdown_2.29
[57] compiler_4.5.1 quadprog_1.5-8
Functions typically take some input parameters, known as arguments, process them, and yield some output/result(s)
For example, seq() generates a sequence of numbers; “from” and “to” are arguments, and by default it returns the numbers between these two extremes in steps of 1:
length.out controls how many equally spaced numbers must be generated:
Alternatively, by defines the step size between numbers:
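The three uses of seq() described above could look like this. The specific numbers are illustrative choices, not taken from the original slides.

```r
# "from" and "to" only: steps of 1 between the two extremes
s1 <- seq(from = 1, to = 10)

# length.out: how many equally spaced numbers to generate
s2 <- seq(from = 0, to = 1, length.out = 5)

# by: the step size between consecutive numbers
s3 <- seq(from = 0, to = 10, by = 2)

s1   # 1 2 3 4 5 6 7 8 9 10
s2   # 0.00 0.25 0.50 0.75 1.00
s3   # 0 2 4 6 8 10
```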
rnorm() will generate “n” random numbers from a normal distribution with “mean” as the average and “sd” as the standard deviation:
Positional matching - arguments names may be omitted if placed in the correct order
Default arguments - a function might still work even if some arguments are omitted, if it can use its own default values (in this case “mean=0, sd=1”)
Errors - however, omitting mandatory arguments will result in an Error
Error in rnorm(mean = 100, sd = 15): argument "n" is missing, with no default
Warnings - Some inputs may cause the function to produce Warnings and bad output, but do not stop code execution
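The points above can be sketched with rnorm() as follows. The seed and the negative sd are illustrative choices made here: the seed only makes the two calls comparable, and a negative standard deviation is one input known to trigger a warning.

```r
# Named arguments vs positional matching: same result when the order is correct
set.seed(123)
x_named <- rnorm(n = 5, mean = 100, sd = 15)
set.seed(123)
x_positional <- rnorm(5, 100, 15)
identical(x_named, x_positional)   # TRUE

# Default arguments: mean = 0 and sd = 1 are used when omitted
x_default <- rnorm(5)

# Error: "n" has no default; tryCatch() captures the message
# instead of stopping the script
err <- tryCatch(rnorm(mean = 100, sd = 15),
                error = function(e) conditionMessage(e))
err

# Warning: execution continues, but the output is unusable (all NaN)
x_bad <- rnorm(3, mean = 0, sd = -1)
x_bad
```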
There are two ways to access documentation: using “?” and using help()
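Both forms, shown here for rnorm(), which is only an example topic:

```r
# Shortcut form
?rnorm

# Function form; the topic is given as a string
help("rnorm")

# Optionally restrict the search to one package
help("rnorm", package = "stats")
```

In RStudio the help page opens in the “Help” panel; in a plain R console it is shown as text or in the browser, depending on your settings.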
The Working Directory (WD) is the location of the folder in your computer where R reads and saves files by default.
Let’s see my own WD using the getwd() function:
If you import/export anything (data, figures, workspaces, etc.) you need to know your WD!
As a general rule:
When you open R or the RStudio app, the default WD may be the documents folder (in Windows) or the home directory (e.g., /home/username; in Linux or macOS);
This default may be changed at any time from inside RStudio via Tools > Global Options... > General;
If you open a file (e.g., a .R script) using RStudio, the WD is set to that file’s location (unless the RStudio app was already open before);
However, you can set a new WD at any time from within the R code, using the setwd() function, for example:
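A minimal sketch of reading and changing the WD. To stay runnable on any machine it switches to R's temporary folder, which here merely stands in for your own project folder (an assumption made for illustration).

```r
old_wd <- getwd()     # remember the current WD
old_wd

new_wd <- tempdir()   # placeholder for a real folder such as your project folder
setwd(new_wd)         # from now on, files are read and saved here by default
getwd()

setwd(old_wd)         # go back to the original WD
```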
⚠️ Windows users, be careful!
When you copy-and-paste a folder address, Windows will probably use “\” as the path separator, but “\” is the escape character in R, so you incur an error. To avoid it, either replace each “\” with “/” or double each one as “\\”.
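A short sketch of the backslash issue and of ways around it. The "project" subfolder is a hypothetical name; raw strings are an additional option not mentioned above, available from R 4.0.0.

```r
# Pasting "C:\Users\enric\project" as-is does not even parse,
# because "\U" is read as the start of an escape sequence.

# Option 1: forward slashes (they also work on Windows)
path_fwd <- "C:/Users/enric/project"

# Option 2: doubled backslashes
path_dbl <- "C:\\Users\\enric\\project"

# Option 3 (R >= 4.0.0): a raw string, pasted unchanged
path_raw <- r"(C:\Users\enric\project)"

identical(path_dbl, path_raw)                           # TRUE
gsub("\\", "/", path_dbl, fixed = TRUE) == path_fwd     # TRUE
```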
ABSOLUTE paths: indicate the full path from the root, e.g., "C:/Users/enric/" (in Linux it might be "/home/enric/"; in macOS "/Users/enric/")
RELATIVE paths: indicate the path starting from (relative to) the current directory. Most often, you prefer this for import/export, so the same script works on any device; e.g., "figures/" or "../figures/":
png(filename = "figures/Fig1.png") saves a file Fig1.png into figures, which is a subfolder inside the current WD;
png(filename = "../figures/Fig1.png") saves a file Fig1.png into figures, which is a folder outside, one level up ("../") from the current WD.
An .Rproj file (RStudio project) may eliminate the need to use setwd() within scripts, so you can use only relative paths
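To see how "../" resolves, here is a small sketch that builds a hypothetical folder layout (paths_demo/scripts and paths_demo/figures, names invented for illustration) inside R's temporary folder and checks relative paths from within scripts.

```r
# Hypothetical layout: paths_demo/scripts and paths_demo/figures
root <- file.path(tempdir(), "paths_demo")
dir.create(file.path(root, "scripts"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(root, "figures"), showWarnings = FALSE)

old_wd <- setwd(file.path(root, "scripts"))   # setwd() returns the previous WD

up_exists   <- dir.exists("../figures")   # one level up, then into figures
down_exists <- dir.exists("figures")      # no such subfolder inside scripts

up_exists     # TRUE
down_exists   # FALSE

# Show the absolute path that the relative one points to
normalizePath("../figures", winslash = "/")

setwd(old_wd)   # restore the original WD
```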
You can create a new project in RStudio with File > New Project..., then choose a specific folder
Keep all materials of your project in the same folder as the newly created .Rproj file
As you open the .Rproj, it will automatically start a new RStudio session with the WD set into that folder.
Opening the .Rproj file ensures that RStudio automatically sets the project folder as the working directory, so all files are managed relative to here:
Now let’s see how to perform import/export operations for:
The Workspace: all objects that exist in your current R session, all results and computations stored so far (see them in the “Environment” panel or with ls());
Data: SUPER IMPORTANT! we will focus especially on tabular (Excel-like) data, that we treat as dataframes;
Figures: save your plots for reports and more in .pdf, .png, and other formats.
All your R code (script) is generally stored in text files with a .R extension. But where do you save your results and objects?! Maybe you just don’t…
However, you can export the entire workspace using save.image()
Specifying "myWS.RData" is not mandatory but recommended, otherwise your file will simply be named ".RData". (By the way… where will it be saved?)
Alternatively, you can save just one or a few workspace objects using save()
This will save only objects df and fit into a newly created file named myWS.RData. This is useful when you have an overcrowded workspace and want to save only a few objects containing the final results
Once you open a new R session, you may load the previously stored workspace using the load() function, specifying load("workspace_name.RData"), like this:
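A minimal sketch of the full save/load cycle. The objects df and fit are small made-up examples, and the file is written to R's temporary folder rather than to the WD so the sketch leaves nothing behind in your project.

```r
# Two example objects (invented data, for illustration only)
df  <- data.frame(x = 1:5, y = c(2.1, 3.9, 6.2, 8.1, 9.8))
fit <- lm(y ~ x, data = df)

ws_file <- file.path(tempdir(), "myWS.RData")

# Export the entire workspace ...
save.image(file = ws_file)

# ... or only selected objects (this overwrites the file written above)
save(df, fit, file = ws_file)

# Simulate a fresh session by removing the objects, then load them back
rm(df, fit)
loaded <- load(ws_file)
loaded   # names of the restored objects
```

Note that load() restores objects under their original names, silently overwriting any objects with the same names in the current workspace.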
Arguably a fundamental skill for anyone working in data science!
Most people use MS Excel or similar software (e.g., LibreOffice Calc) for handling data, which produce their own file formats (e.g., .xlsx). That’s perfectly fine. However… the most versatile data format is .csv (comma-separated values), a simple text (no formatting, no licences required) file format for storing tabular data/dataframes.
Export your data to .csv format from your software of choice before importing it in R.
Here’s an example of using read.csv() for importing data:
# IMPORT csv data from a "data" subfolder, and store it in an object named "df"
df = read.csv("data/Performance.csv")
# OR if you want to be explicit on settings:
df = read.csv("data/Performance.csv", header=TRUE, sep=",", dec=".")
head(df) # have a look at the first few rows
  id name anx acc time
1 1 nydga 20 15 2.077932
2 2 bwknr 14 9 2.436858
3 3 sauuj 18 12 2.549814
4 4 vnjgi 27 15 4.386718
5 5 oueiy 21 11 5.248933
6 6 neebj 12 13 3.463094
⚠️ With Italian Excel export settings, the separator character (sep) may be “;” and the decimal point character (dec) may be “,”, so be aware of your settings!
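A small sketch of importing such a file. The file is generated on the fly with a few made-up rows so the example is self-contained; read.csv2() is base R's variant of read.csv() that uses “;” and “,” as defaults.

```r
# Write a tiny semicolon-separated file with decimal commas (invented content)
csv_file <- tempfile(fileext = ".csv")
writeLines(c("id;name;time",
             "1;nydga;2,08",
             "2;bwknr;2,44"),
           csv_file)

# Option 1: state the settings explicitly
df_a <- read.csv(csv_file, header = TRUE, sep = ";", dec = ",")

# Option 2: read.csv2() uses sep = ";" and dec = "," by default
df_b <- read.csv2(csv_file)

str(df_a)               # "time" is numeric, not character
identical(df_a, df_b)   # TRUE
```

If the wrong settings are used, the import typically still runs without error but produces a single column or character values instead of numbers, so checking the result with str() or head() is worthwhile.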
If you absolutely want to import your data directly from a MS Excel document (.xlsx), you may use function read_excel() from the package readxl:
library(readxl)
df = data.frame( read_excel("data/Performance.xlsx") )
# data.frame() forces it to be a dataframe, otherwise it's a tibble
head(df)
  id name anx acc time
1 1 nydga 20 15 2.077932
2 2 bwknr 14 9 2.436858
3 3 sauuj 18 12 2.549814
4 4 vnjgi 27 15 4.386718
5 5 oueiy 21 11 5.248933
6 6 neebj 12 13 3.463094
You may also import SPSS data files (.sav) using the read.spss() function from the foreign package.
A good trick if you don’t want to specify any relative or absolute path, and want to manually select data each time, is using the file.choose() function:
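A sketch of how file.choose() could be combined with read.csv(). Because it opens a file-selection dialog, the call is wrapped in interactive() here so that the code does nothing when run as a non-interactive script; that wrapper is an addition for safety, not a requirement.

```r
# file.choose() opens a dialog and returns the path of the selected file
if (interactive()) {
  df <- read.csv(file.choose())
  head(df)
}
```

The drawback is that the script no longer records which file was used, so this is convenient for quick exploration but less suitable for reproducible analyses.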
Other “tricks” for importing data involve using the functions in the RStudio menu, particularly:
File > Import Dataset > From text (base)…
File > Import Dataset > From Excel
File > Import Dataset > From SPSS…
However, using these menu options is not best practice, because they are specific to the RStudio IDE. It’s better to use code for reproducibility
You have processed data with R, now… how to export it?
When collaborating with someone also using R, you may choose to exchange data directly by exporting the object or the entire workspace as a .RData file, using the save() or save.image() function respectively.
However, if you need to export a dataframe in a more universally readable tabular format, such as .csv, you may use write.csv():
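A minimal sketch of exporting and re-importing a dataframe. The dataframe mimics the first rows of the example data shown earlier, and the file name and temporary folder are illustrative choices.

```r
# A small dataframe to export (values mimic the example data above)
df <- data.frame(id   = 1:3,
                 name = c("nydga", "bwknr", "sauuj"),
                 anx  = c(20L, 14L, 18L))

out_file <- file.path(tempdir(), "Performance_export.csv")

# row.names = FALSE avoids writing the row numbers as an extra first column
write.csv(df, file = out_file, row.names = FALSE)

# Read the file back to check that the export worked
df_back <- read.csv(out_file)
all.equal(df, df_back)   # TRUE
```

For files meant to be opened with “;” as separator and “,” as decimal mark, write.csv2() is the corresponding base R function.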
R has a collection of functions for exporting figures in different formats: pdf(), png(), jpeg(), bmp(), tiff(), svg().
Here is an example using png() :
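A sketch of the usual three steps: open the device, draw the plot, close the device. The plotted data, the size and resolution values, and the temporary "figures" folder are all illustrative assumptions; in a real project you would typically write to a relative path such as "figures/Fig1.png".

```r
# Folder for the figure (a temporary one here, so the sketch runs anywhere)
fig_dir <- file.path(tempdir(), "figures")
dir.create(fig_dir, showWarnings = FALSE)
fig_file <- file.path(fig_dir, "Fig1.png")

# 1. Open the graphics device: everything plotted now goes into the file
png(filename = fig_file, width = 1600, height = 1200, res = 200)

# 2. Draw the plot
plot(x = 1:10, y = (1:10)^2, main = "Example plot")

# 3. Close the device: only now is the file completely written
dev.off()

file.exists(fig_file)   # TRUE
```

Forgetting dev.off() is a common source of empty or unreadable figure files, since the device stays open and later plots keep going to it.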