
Dealing with Duplicate Data in R workshop

Join our workshop on Dealing with Duplicate Data in R, which is a part of our workshops for Ukraine series!

Here’s some more info:

Title: Dealing with Duplicate Data in R

Date: Thursday, April 25th, 18:00 – 20:00 CET (Rome, Berlin, Paris timezone)

Speaker: Erin Grand works as a freelancer and Data Scientist at TRAILS to Wellness. Before TRAILS, she worked as a Data Scientist at Uncommon Schools, Crisis Text Line, and a software programmer at NASA. In the distant past, Erin researched star formation and taught introductory courses in astronomy and physics at the University of Maryland. In her free time, Erin enjoys reading, Scottish country dancing, and singing loudly to musical theatre.

Description: Maintaining high data quality is essential for accurate analyses and decision-making. Unfortunately, high data quality is often hard to come by. This talk will focus on some “how-tos” of cleaning data and removing duplicates to enhance data integrity. We’ll go over common data quality issues, how to use the {janitor} package to identify and remove duplicates, and business practices that can help prevent data issues from happening in the first place.

Minimal registration fee: 20 euro (or 20 USD or 800 UAH)

Please note that the registration confirmation is sent 1 day before the workshop to all registered participants rather than immediately after registration

How can I register?

Go to https://bit.ly/3wvwMA6 or https://bit.ly/4aD5LMC or https://bit.ly/3PFxtNA and donate at least 20 euro. Feel free to donate more if you can, all proceeds go directly to support Ukraine.

Save your donation receipt (after the donation is processed, there is an option to enter your email address on the website to which the donation receipt is sent)

Fill in the registration form, attaching a screenshot of a donation receipt (please attach the screenshot of the donation receipt that was emailed to you rather than the page you see after donation).

If you are not personally interested in attending, you can also contribute by sponsoring a participation of a student, who will then be able to participate for free. If you choose to sponsor a student, all proceeds will also go directly to organisations working in Ukraine. You can either sponsor a particular student or you can leave it up to us so that we can allocate the sponsored place to students who have signed up for the waiting list.

How can I sponsor a student?

Go to https://bit.ly/3wvwMA6 or https://bit.ly/4aD5LMC or https://bit.ly/3PFxtNA and donate at least 20 euro (or 17 GBP or 20 USD or 800 UAH). Feel free to donate more if you can, all proceeds go to support Ukraine!

Save your donation receipt (after the donation is processed, there is an option to enter your email address on the website to which the donation receipt is sent)

Fill in the sponsorship form, attaching the screenshot of the donation receipt (please attach the screenshot of the donation receipt that was emailed to you rather than the page you see after the donation). You can indicate whether you want to sponsor a particular student or we can allocate this spot ourselves to the students from the waiting list. You can also indicate whether you prefer us to prioritize students from developing countries when assigning place(s) that you sponsored.

If you are a university student and cannot afford the registration fee, you can also sign up for the waiting list here. (Note that you are not guaranteed to participate by signing up for the waiting list).

You can also find more information about this workshop series, a schedule of our future workshops, and a list of our past workshops (with recordings & materials available) here.

Looking forward to seeing you during the workshop!

Thanks, on its way to CRAN

The generic seal of approval from the CRAN team – countless hours spent tabbing between R CMD check and R CMD build logs, ‘Writing R Extensions’ and Stack Overflow, approved with a single line. The equivalent of “Noted, thanks” after a painstakingly well-written e-mail to your professor – except this has an amazing feeling and a clear meaning: {SLmetrics} (finally) found its way to CRAN!

What is {SLmetrics}? Why should we even care?

{SLmetrics} is a collection of AI/ML performance metrics written in ‘C++’ with three things in mind: scalability, speed and simplicity – all well-known buzzwords on LinkedIn. Below are the results of a benchmark on computing a 2×2 confusion matrix:

Figure: Median execution time for constructing 2×2 confusion matrices across R packages. For each N, 1000 measurements were taken with {microbenchmark}.

{SLmetrics} is much faster and more memory-efficient than the R packages in question when computing the confusion matrix. This is an essential difference, as many, if not most, classification metrics are based on the confusion matrix.
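To make the term concrete, here is a minimal base R sketch of a 2×2 confusion matrix and one metric derived from it. The toy labels are invented for illustration, and the code uses table() rather than anything from {SLmetrics}, so it says nothing about how the package computes the matrix internally.

```r
## toy labels, invented for illustration only
actual    <- factor(c("a", "a", "b", "b", "a", "b"), levels = c("a", "b"))
predicted <- factor(c("a", "b", "b", "b", "a", "a"), levels = c("a", "b"))

## 2x2 confusion matrix: rows are actual classes, columns are predicted classes
confusion_matrix <- table(actual = actual, predicted = predicted)

## accuracy is the share of observations on the diagonal
accuracy <- sum(diag(confusion_matrix)) / sum(confusion_matrix)

print(confusion_matrix)
print(accuracy)
```

Metrics such as precision or recall can be read off the same four cells, which is why the cost of building the matrix matters.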

What’s new?

Since the blog post on scalability and efficiency in January, many new features have been added. Below is an example of the Relative Root Mean Squared Error:

## 1) actual and predicted
##    values
actual    <- c(0.43, 0.85, 0.22, 0.48, 0.12, 0.88)
predicted <- c(0.46, 0.77, 0.12, 0.63, 0.18, 0.78)

## 2) calculate
##    metric and print
##    values
cat(
  "Mean Relative Root Mean Squared Error", SLmetrics::rrmse(
    actual        = actual,
    predicted     = predicted,
    normalization = 0
  ),
  "Range Relative Root Mean Squared Error (weighted)", SLmetrics::rrmse(
    actual        = actual,
    predicted     = predicted,
    normalization = 1
  ),
  sep = "\n"
)
#> Mean Relative Root Mean Squared Error
#> 0.3284712
#> Range Relative Root Mean Squared Error (weighted)
#> 0.3284712

Created on 2025-03-24 with reprex v2.1.1
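For readers who want to see the arithmetic behind a relative RMSE, the following base R sketch computes the RMSE and divides it by the mean and by the range of the actual values. That these two divisors are what "mean" and "range" normalization refer to is an assumption here, not something taken from the {SLmetrics} source, and the sketch is not meant to reproduce the values printed above. The names rrmse_mean and rrmse_range are hypothetical.

```r
## same vectors as in the example above
actual    <- c(0.43, 0.85, 0.22, 0.48, 0.12, 0.88)
predicted <- c(0.46, 0.77, 0.12, 0.63, 0.18, 0.78)

## root mean squared error
rmse <- sqrt(mean((actual - predicted)^2))

## assumed normalizations (hypothetical names, not the package API)
rrmse_mean  <- rmse / mean(actual)         # RMSE relative to the mean
rrmse_range <- rmse / diff(range(actual))  # RMSE relative to max - min

cat(
  "RMSE:", rmse,
  "RMSE / mean(actual):", rrmse_mean,
  "RMSE / range(actual):", rrmse_range,
  sep = "\n"
)
```

Consult the online docs for the definitions the package actually uses for each normalization value.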

Visit the online docs for a quick overview of all the available metrics and features.

Installing {SLmetrics}

{SLmetrics} can be installed via CRAN, or built from source using, for example, {pak}. See below:

Via CRAN

install.packages("SLmetrics")

Build from source

pak::pak(
    pkg = "serkor1/SLmetrics",
    ask = FALSE
)

Effective Data Visualization in R in Scientific Contexts workshop

Join our workshop on Effective Data Visualization in R in Scientific Contexts, which is a part of our workshops for Ukraine series!

Here’s some more info:

Title: Effective Data Visualization in R in Scientific Contexts

Date: Thursday, April 10th, 18:00 – 20:00 CET (Rome, Berlin, Paris timezone)

Speaker: Christian Gebhard is a specialist in medical genetics. His daily practice of communicating complex scientific facts to both laypersons and healthcare professionals has fostered a deep passion for clear information presentation and effective data visualization. Striving for both clarity and reproducibility, he primarily utilizes R and ggplot2 to create impactful and accessible visualization of scientific data.

Description: The workshop will start by establishing a structured approach to transforming complex data into clear, informative visual representations. We’ll address common challenges and visualization pitfalls in different presentation formats. This part is applicable across different scientific fields and independent of visualization tools. The second part applies those principles to real-world examples using R and ggplot2. Participants will gain hands-on experience applying the learned principles to improve data communication in various presentation settings.

Minimal registration fee: 20 euro (or 20 USD or 800 UAH)

Please note that the registration confirmation is sent 1 day before the workshop to all registered participants rather than immediately after registration

How can I register?

Go to https://bit.ly/3wvwMA6 or https://bit.ly/4aD5LMC or https://bit.ly/3PFxtNA and donate at least 20 euro. Feel free to donate more if you can, all proceeds go directly to support Ukraine.

Save your donation receipt (after the donation is processed, there is an option to enter your email address on the website to which the donation receipt is sent)

Fill in the registration form, attaching a screenshot of a donation receipt (please attach the screenshot of the donation receipt that was emailed to you rather than the page you see after donation).

How can I sponsor a student?

Go to https://bit.ly/3wvwMA6 or https://bit.ly/4aD5LMC or https://bit.ly/3PFxtNA and donate at least 20 euro (or 17 GBP or 20 USD or 800 UAH). Feel free to donate more if you can, all proceeds go to support Ukraine!

Save your donation receipt (after the donation is processed, there is an option to enter your email address on the website to which the donation receipt is sent)

Fill in the sponsorship form, attaching the screenshot of the donation receipt (please attach the screenshot of the donation receipt that was emailed to you rather than the page you see after the donation). You can indicate whether you want to sponsor a particular student or we can allocate this spot ourselves to the students from the waiting list. You can also indicate whether you prefer us to prioritize students from developing countries when assigning place(s) that you sponsored.

You can also find more information about this workshop series, a schedule of our future workshops, and a list of our past workshops (with recordings & materials available) here.

Looking forward to seeing you during the workshop!

Devops for Data Scientists (R & Python) workshop

Join our workshop on Devops for Data Scientists (R & Python), which is a part of our workshops for Ukraine series!

Here’s some more info:

Title: Devops for Data Scientists (R & Python)

Date: Thursday, April 3rd, 18:00 – 20:00 CET (Rome, Berlin, Paris timezone)

Speaker: Rika Gorn is a Senior Platform Engineer at Posit where she helps customers and organizations create infrastructure for data analytics and data science. Her background is in data science and data engineering.

Description: In this workshop we will learn the key principles of DevOps and the problems it intends to solve for data scientists. We will discuss how DevOps practices such as CI/CD enhance collaboration, automation, and reproducibility. We will learn common workflows for environment management, package management, containerization, monitoring & logging, and version control. Participants will get hands-on experience with a variety of tools including Docker, GitHub Actions, and APIs.

Minimal registration fee: 20 euro (or 20 USD or 800 UAH)

Please note that the registration confirmation is sent 1 day before the workshop to all registered participants rather than immediately after registration

How can I register?

Go to https://bit.ly/3wvwMA6 or https://bit.ly/4aD5LMC or https://bit.ly/3PFxtNA and donate at least 20 euro. Feel free to donate more if you can, all proceeds go directly to support Ukraine.

Save your donation receipt (after the donation is processed, there is an option to enter your email address on the website to which the donation receipt is sent)

Fill in the registration form, attaching a screenshot of a donation receipt (please attach the screenshot of the donation receipt that was emailed to you rather than the page you see after donation).

How can I sponsor a student?

Go to https://bit.ly/3wvwMA6 or https://bit.ly/4aD5LMC or https://bit.ly/3PFxtNA and donate at least 20 euro (or 17 GBP or 20 USD or 800 UAH). Feel free to donate more if you can, all proceeds go to support Ukraine!

Save your donation receipt (after the donation is processed, there is an option to enter your email address on the website to which the donation receipt is sent)

Fill in the sponsorship form, attaching the screenshot of the donation receipt (please attach the screenshot of the donation receipt that was emailed to you rather than the page you see after the donation). You can indicate whether you want to sponsor a particular student or we can allocate this spot ourselves to the students from the waiting list. You can also indicate whether you prefer us to prioritize students from developing countries when assigning place(s) that you sponsored.

You can also find more information about this workshop series, a schedule of our future workshops, and a list of our past workshops (with recordings & materials available) here.

Looking forward to seeing you during the workshop!

Frame-by-Frame Modeling and Validation of NFL geospatial data using gganimate in R workshop

Join our workshop on Frame-by-Frame Modeling and Validation of NFL geospatial data using gganimate in R, which is a part of our workshops for Ukraine series!

Here’s some more info:

Title: Frame-by-Frame Modeling and Validation of NFL geospatial data using gganimate in R

Date: Thursday, March 27th, 18:00 – 20:00 CET (Rome, Berlin, Paris timezone)

Speaker: Pablo L. Landeras has an Applied Mathematics BSc from ITAM (Mexico City). His background as both an athlete and an analyst has shaped his approach to sports research, blending firsthand experience with cutting-edge data science to drive innovation. Today, he is a Data Scientist at Zelus Analytics, where he specializes in R&D in both ice hockey and basketball.

His career has spanned a variety of projects—from public health initiatives to data-driven scouting for soccer teams like FC Toluca. Before joining Zelus, he worked as a Data Scientist at Coca-Cola.

Description: This talk will explore the validation and visualization of spatio-temporal data in sports, focusing on the NFL tracking dataset and the application of frame-by-frame modeling. After a brief introduction to spatio-temporal data and its significance, we’ll highlight common errors in tracking datasets, such as missing data and implausible trajectories, emphasizing the importance of validation. The session will delve into the capabilities of gganimate, showcasing how it transforms static plots into dynamic animations to validate data and enhance storytelling. We’ll provide an overview of the NFL tracking dataset, its structure, and key challenges like data noise and synchronization issues. Through step-by-step examples, participants will learn to build animations that visualize player movements, pass probabilities, and pass rush models, while using techniques to identify anomalies and combine multiple data sources.

Minimal registration fee: 20 euro (or 20 USD or 800 UAH)

Please note that the registration confirmation is sent 1 day before the workshop to all registered participants rather than immediately after registration

How can I register?

Go to https://bit.ly/3wvwMA6 or https://bit.ly/4aD5LMC or https://bit.ly/3PFxtNA and donate at least 20 euro. Feel free to donate more if you can, all proceeds go directly to support Ukraine.

Save your donation receipt (after the donation is processed, there is an option to enter your email address on the website to which the donation receipt is sent)

Fill in the registration form, attaching a screenshot of a donation receipt (please attach the screenshot of the donation receipt that was emailed to you rather than the page you see after donation).

How can I sponsor a student?

Go to https://bit.ly/3wvwMA6 or https://bit.ly/4aD5LMC or https://bit.ly/3PFxtNA and donate at least 20 euro (or 17 GBP or 20 USD or 800 UAH). Feel free to donate more if you can, all proceeds go to support Ukraine!

Save your donation receipt (after the donation is processed, there is an option to enter your email address on the website to which the donation receipt is sent)

Fill in the sponsorship form, attaching the screenshot of the donation receipt (please attach the screenshot of the donation receipt that was emailed to you rather than the page you see after the donation). You can indicate whether you want to sponsor a particular student or we can allocate this spot ourselves to the students from the waiting list. You can also indicate whether you prefer us to prioritize students from developing countries when assigning place(s) that you sponsored.

You can also find more information about this workshop series, a schedule of our future workshops, and a list of our past workshops (with recordings & materials available) here.

Looking forward to seeing you during the workshop!

Introduction to Empirical Macroeconomics with R workshop

Join our workshop on Introduction to Empirical Macroeconomics with R, which is a part of our workshops for Ukraine series!

Here’s some more info:

Title: Introduction to Empirical Macroeconomics with R

Date: Thursday, March 20th, 14:00 – 16:00 CET (Rome, Berlin, Paris timezone)

Speaker: Xiaolei (Adam) Wang is an Economics PhD student at the University of Melbourne. His research focuses on Bayesian econometrics. He is the author of the R package bsvarSIGNs, which implements algorithms for macroeconomic analyses with C++ code.

Description: Structural Vector Autoregressions (SVARs) are multivariate time series models commonly used in empirical macroeconomics. By imposing a minimal set of assumptions, such as sign restrictions, these models allow us to recover meaningful economic shocks and their dynamic causal effects from real data. This workshop will provide a gentle introduction to SVARs and their estimation techniques. Then, we will show how to apply these models with a simple workflow using the R package “bsvarSIGNs”. No prior knowledge of macroeconomics is required; any R user interested in analysing macro data can benefit from this workshop.

Minimal registration fee: 20 euro (or 20 USD or 800 UAH)

Please note that the registration confirmation is sent 1 day before the workshop to all registered participants rather than immediately after registration

How can I register?

Go to https://bit.ly/3wvwMA6 or https://bit.ly/4aD5LMC or https://bit.ly/3PFxtNA and donate at least 20 euro. Feel free to donate more if you can, all proceeds go directly to support Ukraine.

Save your donation receipt (after the donation is processed, there is an option to enter your email address on the website to which the donation receipt is sent)

Fill in the registration form, attaching a screenshot of a donation receipt (please attach the screenshot of the donation receipt that was emailed to you rather than the page you see after donation).

How can I sponsor a student?

Go to https://bit.ly/3wvwMA6 or https://bit.ly/4aD5LMC or https://bit.ly/3PFxtNA and donate at least 20 euro (or 17 GBP or 20 USD or 800 UAH). Feel free to donate more if you can, all proceeds go to support Ukraine!

Save your donation receipt (after the donation is processed, there is an option to enter your email address on the website to which the donation receipt is sent)

Fill in the sponsorship form, attaching the screenshot of the donation receipt (please attach the screenshot of the donation receipt that was emailed to you rather than the page you see after the donation). You can indicate whether you want to sponsor a particular student or we can allocate this spot ourselves to the students from the waiting list. You can also indicate whether you prefer us to prioritize students from developing countries when assigning place(s) that you sponsored.

You can also find more information about this workshop series, a schedule of our future workshops, and a list of our past workshops (with recordings & materials available) here.

Looking forward to seeing you during the workshop!

Hitting web APIs with {httr2} in R workshop

Join our workshop on Hitting web APIs with {httr2} in R, which is a part of our workshops for Ukraine series!

Here’s some more info:

Title: Hitting web APIs with {httr2} in R

Date: Thursday, March 13th, 18:00 – 20:00 CET (Rome, Berlin, Paris timezone)

Speaker: Ted Laderas is the Director of Training and Community at the Data Science Lab at Fred Hutch Cancer Center. He has taught R and Python for over 10 years. He believes that research should not be lonely, and that building communities of practice in science and research that are psychologically safe and inclusive is the key to doing better, more robust science.

Description: Do the words “Web API” sound intimidating to you? This talk is a gentle introduction to what Web APIs are and how to get data out of them using the {httr2}, {jsonlite}, and {tidyjson} packages. You’ll learn how to request data from an endpoint and get the data out. We’ll do this using an API that gives us facts about cats. By the end of this talk, web APIs will seem much less intimidating and you will be empowered to access data from them.

Minimal registration fee: 20 euro (or 20 USD or 800 UAH)

Please note that the registration confirmation is sent 1 day before the workshop to all registered participants rather than immediately after registration

How can I register?

Go to https://bit.ly/3wvwMA6 or https://bit.ly/4aD5LMC or https://bit.ly/3PFxtNA and donate at least 20 euro. Feel free to donate more if you can, all proceeds go directly to support Ukraine.

Save your donation receipt (after the donation is processed, there is an option to enter your email address on the website to which the donation receipt is sent)

Fill in the registration form, attaching a screenshot of a donation receipt (please attach the screenshot of the donation receipt that was emailed to you rather than the page you see after donation).

How can I sponsor a student?

Go to https://bit.ly/3wvwMA6 or https://bit.ly/4aD5LMC or https://bit.ly/3PFxtNA and donate at least 20 euro (or 17 GBP or 20 USD or 800 UAH). Feel free to donate more if you can, all proceeds go to support Ukraine!

Save your donation receipt (after the donation is processed, there is an option to enter your email address on the website to which the donation receipt is sent)

Fill in the sponsorship form, attaching the screenshot of the donation receipt (please attach the screenshot of the donation receipt that was emailed to you rather than the page you see after the donation). You can indicate whether you want to sponsor a particular student or we can allocate this spot ourselves to the students from the waiting list. You can also indicate whether you prefer us to prioritize students from developing countries when assigning place(s) that you sponsored.

You can also find more information about this workshop series, a schedule of our future workshops, and a list of our past workshops (with recordings & materials available) here.

Looking forward to seeing you during the workshop!

Decomposing within and between person effects in longitudinal data with SEM in R workshop

Join our workshop on Decomposing within and between person effects in longitudinal data with SEM in R, which is a part of our workshops for Ukraine series!

Here’s some more info:

Title: Decomposing within and between person effects in longitudinal data with SEM in R

Date: Thursday, February 27th, 18:00 – 20:00 CET (Rome, Berlin, Paris timezone)

Speaker: Dustin Haraden, PhD is a clinical psychologist interested in examining risk factors for depression in youth with an emphasis on sleep, circadian rhythms and pubertal development. He has a special interest in measurement, statistics and open science as it relates to research methods in psychology. Currently, Dustin is an assistant professor in psychology at the Rochester Institute of Technology.

Description: This workshop will introduce problems that arise when researchers fail to consider within vs. between sources of variance. We will explore the implementation of the Random Intercept Cross-Lagged Panel Model through model identification, comparing model fit, interpreting parameters and reporting results.

Minimal registration fee: 20 euro (or 20 USD or 800 UAH)

How can I register?

Go to https://bit.ly/3wvwMA6 or https://bit.ly/4aD5LMC or https://bit.ly/3PFxtNA and donate at least 20 euro. Feel free to donate more if you can, all proceeds go directly to support Ukraine.

Save your donation receipt (after the donation is processed, there is an option to enter your email address on the website to which the donation receipt is sent)

Fill in the registration form, attaching a screenshot of a donation receipt (please attach the screenshot of the donation receipt that was emailed to you rather than the page you see after donation).

How can I sponsor a student?

Go to https://bit.ly/3wvwMA6 or https://bit.ly/4aD5LMC or https://bit.ly/3PFxtNA and donate at least 20 euro (or 17 GBP or 20 USD or 800 UAH). Feel free to donate more if you can, all proceeds go to support Ukraine!

Save your donation receipt (after the donation is processed, there is an option to enter your email address on the website to which the donation receipt is sent)

Fill in the sponsorship form, attaching the screenshot of the donation receipt (please attach the screenshot of the donation receipt that was emailed to you rather than the page you see after the donation). You can indicate whether you want to sponsor a particular student or we can allocate this spot ourselves to the students from the waiting list. You can also indicate whether you prefer us to prioritize students from developing countries when assigning place(s) that you sponsored.

You can also find more information about this workshop series, a schedule of our future workshops, and a list of our past workshops (with recordings & materials available) here.

Looking forward to seeing you during the workshop!

Tabular ML in R: an overview of tidymodels in R for tabularized data workshop

Join our workshop on Tabular ML in R: an overview of tidymodels in R for tabularized data, which is a part of our workshops for Ukraine series!

Here’s some more info:

Title: Tabular ML in R: an overview of tidymodels in R for tabularized data

Date: Thursday, February 20th, 18:00 – 20:00 CET (Rome, Berlin, Paris timezone)

Speaker: Frank Hull is currently Director of Analytics at ACES. Frank oversees ACES’ Data Science department, which works directly with Portfolio Strategy, Portfolio Modeling, Transmission, Resource Planning, Fundamentals, and Trading & Operations. Frank leads & advises various initiatives such as weather-driven stochastics (WDS), long-term load forecasting (LTLF), peak prediction services (PPS), dark calm (DC) and extreme weather event (EWE) analyses. Frank also hosts internal R meetings for programmers at ACES. Prior to his current role, Frank held various roles related to data science, systems, modeling, and quantitative analysis at AES & ACES. Frank holds a degree in physics with a concentration in engineering physics.

Description: In this workshop, we will discuss 1) what we mean by tabular ML in R, 2) why it’s important, 3) when it can be applicable, and 4) how to set up a robust pipeline for iterative machine learning workflows. We will start off by defining and discussing the prevalence of tabular data across sectors, followed by data exploration to understand and interpret any known relationships within our example dataset. Lastly, we will establish key practices within the tidymodels ecosystem to create a predictive framework and benchmark various ML engines.

Minimal registration fee: 20 euro (or 20 USD or 800 UAH)

How can I register?

Go to https://bit.ly/3wvwMA6 or https://bit.ly/4aD5LMC or https://bit.ly/3PFxtNA and donate at least 20 euro. Feel free to donate more if you can, all proceeds go directly to support Ukraine.

Save your donation receipt (after the donation is processed, there is an option to enter your email address on the website to which the donation receipt is sent)

Fill in the registration form, attaching a screenshot of a donation receipt (please attach the screenshot of the donation receipt that was emailed to you rather than the page you see after donation).

How can I sponsor a student?

Go to https://bit.ly/3wvwMA6 or https://bit.ly/4aD5LMC or https://bit.ly/3PFxtNA and donate at least 20 euro (or 17 GBP or 20 USD or 800 UAH). Feel free to donate more if you can, all proceeds go to support Ukraine!

Save your donation receipt (after the donation is processed, there is an option to enter your email address on the website to which the donation receipt is sent)

Fill in the sponsorship form, attaching the screenshot of the donation receipt (please attach the screenshot of the donation receipt that was emailed to you rather than the page you see after the donation). You can indicate whether you want to sponsor a particular student or we can allocate this spot ourselves to the students from the waiting list. You can also indicate whether you prefer us to prioritize students from developing countries when assigning place(s) that you sponsored.

You can also find more information about this workshop series, a schedule of our future workshops, and a list of our past workshops (with recordings & materials available) here.

Looking forward to seeing you during the workshop!

Flowcharts made easy with the package {flowchart}

In health research, a flowchart is the best way to show the flow of participants in a study when reporting results. But flowcharts can be tedious to prepare and can get on your nerves.

Fortunately, there are several packages in R for drawing flowcharts using different approaches. The problem is that the programming is generally quite complex, and the numbers have to be entered manually or parameterized beforehand. These flowcharts can have reproducibility problems, because if the data changes, we have to manually change the parameters again.

To make our lives easier, there’s a new {flowchart} package that uses the tidyverse workflow and lets us create many different types of flowcharts in just a few steps.

The package provides a set of functions designed to be combined with a pipe operator (the magrittr %>% or the native |>) to create different flowchart designs directly from the study database. These functions are highly customizable and allow the user to create reproducible flowcharts in an easier and tidier way. We no longer need to manually set flowchart parameters such as the box coordinates or the numbers to display, because the flowchart automatically adapts to the data we have.

For example, we can create a flowchart of the entire participant study flow with this simple tidy workflow:

Here, we will describe the steps involved in creating the flowchart in this example. We will use the built-in safo dataset that comes with the package, which is a randomly generated dataset from the SAFO clinical trial. For more information and other examples, you can visit the package vignette.

Installing and loading the package

As of March 2024, the package is available on CRAN:

install.packages("flowchart")

You can always install the development version from GitHub:

remotes::install_github("bruigtp/flowchart")

Initialize the flowchart

The first step is the initialisation of the flowchart with the function as_fc():

library(flowchart) 

x <- safo |> 
  as_fc(label = "Patients assessed for eligibility")

This will create an object of class fc, the class created for this package. Objects of this class consist of a list containing the dataset together with the information related to the flowchart being generated. Let’s see it for our example:

str(x, max.level = 1)

List of 2
 $ data: tibble [925 × 21] (S3: tbl_df/tbl/data.frame)
 $ fc  : tibble [1 × 17] (S3: tbl_df/tbl/data.frame)
 - attr(*, "class")= chr "fc"
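The "list with a class attribute" idea can be illustrated with a toy object in base R. This is only a sketch of the general S3 pattern: the class name toy_fc and the columns shown are hypothetical, and the real fc object carries many more fields, as the output above shows.

```r
## hypothetical toy object; "toy_fc" is not a class from {flowchart}
toy_data <- data.frame(id = 1:5)

toy <- structure(
  list(
    ## the dataset the flowchart is built from
    data = toy_data,
    ## one row per box: here a single initial box covering all rows
    fc   = data.frame(
      id   = 1,
      n    = nrow(toy_data),
      N    = nrow(toy_data),
      text = "All patients"
    )
  ),
  class = "toy_fc"
)

## same kind of overview as str(x, max.level = 1) above
str(toy, max.level = 1)
```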

The data tibble contains the entire SAFO dataset, as we haven’t done any further operations:

x$data

# A tibble: 925 × 21
      id inclusion_crit exclusion_crit chronic_heart_failure expected_death_24h
   <int> <fct>          <fct>          <fct>                 <fct>             
 1     1 Yes            No             No                    No                
 2     2 No             No             No                    No                
 3     3 No             No             No                    No                
 4     4 No             Yes            No                    No                
 5     5 No             No             No                    No                
 6     6 No             Yes            No                    No                
 7     7 No             No             No                    No                
 8     8 No             Yes            No                    Yes               
 9     9 No             No             No                    No                
10    10 No             No             No                    No                
# ℹ 915 more rows
# ℹ 16 more variables: polymicrobial_bacteremia <fct>,
#   conditions_affect_adhrence <fct>, susp_prosthetic_valve_endocard <fct>,
#   severe_liver_cirrhosis <fct>, acute_sars_cov2 <fct>,
#   blactam_fosfomycin_hypersens <fct>, other_clinical_trial <fct>,
#   pregnancy_or_breastfeeding <fct>, previous_participation <fct>,
#   myasthenia_gravis <fct>, decline_part <fct>, group <fct>, itt <fct>, …

The fc tibble holds the information on the generated flowchart, which so far only contains an initial box indicating the total number of patients assessed for eligibility in the SAFO trial:

x$fc

# A tibble: 1 × 17
     id     x     y     n     N perc  text  type  group just  text_color text_fs
  <dbl> <dbl> <dbl> <int> <int> <chr> <chr> <chr> <lgl> <chr> <chr>        <dbl>
1     1   0.5   0.5   925   925 100   "Pat… init  NA    cent… black            8
# ℹ 5 more variables: text_fface <dbl>, text_ffamily <lgl>, text_padding <dbl>,
#   bg_fill <chr>, border_color <chr>

Drawing the flowchart

We can always use the fc_draw() function to draw the flowchart associated with an fc object:

x |> 
  fc_draw()

Building the flowchart

To build the entire flowchart, we would need to combine the initialized fc object with the desired functions until we obtain the final flowchart.

The second box, showing the randomized patients, can be obtained using the fc_filter() function:

safo |> 
  as_fc(label = "Patients assessed for eligibility") |> 
  fc_filter(!is.na(group), label = "Randomized", show_exc = TRUE) |> 
  fc_draw()

Here, show_exc = TRUE is used to show the box of excluded subjects as well. Now $data contains the database filtered to the randomized subjects only, while $fc contains the information for these new boxes.
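The bookkeeping behind such a filter step can be sketched in base R. The data frame below is invented, and the code only mimics the idea (count who passes the condition and who does not, relative to the previous box); it is not how fc_filter() is implemented.

```r
## invented toy data: NA in `group` means the patient was not randomized
toy <- data.frame(
  id    = 1:8,
  group = c("A", NA, "B", "A", NA, NA, "B", "A")
)

N    <- nrow(toy)          # size of the previous box
keep <- !is.na(toy$group)  # same condition as in the fc_filter() call above

randomized <- toy[keep, ]  # rows that move on to the next box
n_in  <- nrow(randomized)
n_out <- N - n_in

## counts and percentages for the "Randomized" and "Excluded" boxes
boxes <- data.frame(
  text = c("Randomized", "Excluded"),
  n    = c(n_in, n_out),
  N    = N,
  perc = round(100 * c(n_in, n_out) / N, 2)
)
print(boxes)
```

Because the counts are derived from the data, rerunning the code on an updated dataset updates the boxes without any manual edits.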

Now, we can split the flowchart by the study group, using the fc_split() function:

safo |> 
  as_fc(label = "Patients assessed for eligibility") |> 
  fc_filter(!is.na(group), label = "Randomized", show_exc = TRUE) |> 
  fc_split(group) |> 
  fc_draw()

Now, $data contains the previously filtered database that has been grouped by the group variable.
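In the same spirit, the per-group counts that a split produces can be mimicked with table(). Again, the data are invented and this is a sketch of the idea, not the internals of fc_split().

```r
## invented toy data, already restricted to randomized patients
randomized <- data.frame(
  id    = 1:5,
  group = factor(c("A", "B", "A", "B", "A"))
)

## one box per level of the splitting variable
n_by_group    <- table(randomized$group)
perc_by_group <- round(100 * prop.table(n_by_group), 2)

print(n_by_group)
print(perc_by_group)
```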

Finally, we can apply the fc_filter() function two more times to generate the complete flowchart we want:

safo |> 
    as_fc(label = "Patients assessed for eligibility") |> 
    fc_filter(!is.na(group), label = "Randomized", show_exc = TRUE) |> 
    fc_split(group) |> 
    fc_filter(itt == "Yes", label = "Included in intention-to-treat\n population") |> 
    fc_filter(pp == "Yes", label = "Included in per-protocol\n population") |> 
    fc_draw()

The idea is to combine these basic functions, fc_filter() and fc_split(), in any way we want to create the desired flowchart. The resulting flowchart can be further customized and enhanced using the fc_modify() function, or combined with other flowcharts either horizontally or vertically using the fc_merge() and fc_stack() functions, respectively. Finally, once the final flowchart is drawn, it can be exported to the desired image format using the fc_export() function.

More information about these features and other examples can be found on the package website: https://bruigtp.github.io/flowchart/.