Unlock your next move with DataCamp: Save up to 67% on in-demand data upskilling


For a limited time, save up to 67% on a DataCamp Premium subscription and unlock 410+ interactive courses for all levels in R, Python, SQL, Power BI, and more. Alongside, access bespoke career and skills tracks, projects, challenges, and industry-leading certifications to stand out.

Simply follow the link here.

Upcoming free DataCamp content to enhance your data learning journey 

RADAR: Thrive in the era of data

 Presented by DataCamp, RADAR is a free data science summit of industry experts designed to help aspiring data professionals accelerate their data learning and build stronger careers in 2023.

Gain a deeper understanding of the skills industry leaders are looking for. Learn how to navigate the evolving data talent pool, and uncover insights on data’s most pressing opportunities through a mix of sessions from world-class organizations such as Tableau, Alteryx, Qlik, Salesforce, JetBrains, Google, CBRE, and more.

An unmissable event for anyone looking to strengthen their wider data skillset and accelerate their careers.

March 22-23 2023, 9 AM – 3 PM EST. Register now for free.

  The State of Data Literacy 2023

According to 87% of business leaders, data literacy ranks as the most important skill behind basic computer skills.

Commissioned by DataCamp, The State of Data Literacy is a free-to-download report that examines the current state of the global data skills revolution. For real-world accuracy, DataCamp surveyed over 550 business leaders to uncover how they are approaching the data skills revolution, including:

  • What data literacy means and what transformational benefits it brings
  • Crucial skills business leaders are looking for and their reasons why (great for discovering upskilling gaps to help frame your data learning)
  • What the future of data skills holds for individuals, organizations, and society at large

For the full report, including a deep dive into the key data skills employers are looking for, download your free copy today.

Download Now For Free

Introduction to data analysis with {Statgarten}.





Overview

Data analysis is a useful way to help solve problems in quite a few situations.

Many things go into effective data analysis, but three are commonly mentioned:

1. defining the problem you want to solve through data analysis
2. collecting meaningful data
3. having the skills (and expertise) to analyze the data

R is often mentioned as a way to effectively fill the third of these, but at the same time, it’s often seen as a big barrier for people who haven’t used R before (or have no programming experience).

In my previous work experience, there were many situations where I was able to turn experiences into insights and produce meaningful results with a little data analysis, even if I was “not a data person”.

For this purpose, we have developed an open source R package called “Statgarten” that allows you to utilize the features of R without having to use R directly, and I would like to introduce it.

Here’s the repo link (note: some of the documentation is still written in Korean).


👣 Flow of data analysis

The order and components may vary depending on your situation, but I like to define it as five broad flows.

1. data preparation
2. EDA
3. data visualization
4. calculate statistics
5. share results

In this article, I’ll share a lightweight data analysis example that follows these steps (while utilizing R’s features and not typing R code whenever possible).

Note: since our work is still in progress, including deployment in the form of a web application, we will use the R packages directly.

Install

With this code, you can install all components of the statgarten system.

# requires the remotes package: install.packages("remotes")
remotes::install_github('statgarten/statgarten')
library(statgarten)

Run

The core of the statgarten ecosystem is door, which bundles the other functional packages together. (Of course, you can also use each package as a separate shiny module.)

Let’s load the door library and run it via run_app.

library(door)

run_app() # OR door::run_app()

If you didn’t change any settings, the shiny application will run in RStudio’s viewer panel, but we recommend running it in a web browser such as Chrome via the Show in new window icon (the icon to the left of the Stop button).

If you don’t have any problems running it (please raise an issue on DOOR to let us know if you do), you should see the screen below.

Statgarten app main page

1. Data preparation

There are four ways to prepare data for Statgarten: 1) upload a file from your local PC, 2) enter the URL of a file, 3) enter the URL of a Google Sheet, or 4) use the public data included in statgarten. These can be found in the tabs File, URL, Google Sheet, and Datatoys, respectively.

In this example, we will use the public data named bloodTest.

bloodTest contains blood test data from 2014-15 provided by the National Health Insurance Service in South Korea.

1.5 Define the problem

Using the bloodTest data, we’ll look for clues to this question:

“Are people with high total cholesterol more likely to be diagnosed with anemia and cerebrovascular disease, and does the incidence vary by gender?”

With a few clicks, select the data as shown below. (After selection, click the Import data button.)

statgarten data select


Before we start EDA, let’s process the data for analysis.

In keeping with the theme, we will “remove” data that is not needed and change some numeric values to the type of factor.

This can be done with the Update Data button, where data selection is done with the checkbox. The type can be changed in the New class.

2. EDA

You can see the structure of the data in the EDA pane, where we see that the genders are coded as 1 and 2, so we’ll use the Replace function under the Transform Data button to change them to M/F.
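
For readers curious what these clicks correspond to in plain R, here is a minimal sketch of the same steps: dropping unneeded columns, converting codes to factors, and relabelling gender. It runs on a tiny made-up data frame rather than the real bloodTest data. The column names SEX, ANE, STK and TCHOL follow this article; the object name toy, the EXTRA column, and the assumption that 1 means male and 2 means female are illustrative only.

```r
# Toy stand-in for bloodTest (all values are invented for illustration)
toy <- data.frame(
  SEX   = c(1, 2, 2, 1, 2, 1),
  ANE   = c(0, 0, 1, 0, 1, 0),
  STK   = c(0, 1, 0, 0, 0, 0),
  TCHOL = c(182, 240, 199, 210, 175, 230),
  EXTRA = c(9, 9, 9, 9, 9, 9)   # hypothetical column we do not need
)

# "Update Data": keep only the needed columns, turn 0/1 codes into factors
toy <- toy[, c("SEX", "ANE", "STK", "TCHOL")]
toy$ANE <- factor(toy$ANE)
toy$STK <- factor(toy$STK)

# "Replace": relabel 1/2 as M/F (assumes 1 = male, 2 = female)
toy$SEX <- factor(toy$SEX, levels = c(1, 2), labels = c("M", "F"))

str(toy)
```

Storing the codes as factors is what lets later steps treat them as groups rather than as numbers.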


3. Data visualization

In the Vis Panel, you can visualize anemia (ANE) against total cholesterol (TCHOL) by dragging the variables, as well as total cholesterol by cerebrovascular disease (STK) status.



However, it’s hard to tell from the figures whether there is a significant difference (in either case).

4. Statistics

You can view the distribution of each variable and its key statistics via Distribution in the EDA panel.


For the anemia (ANE) and cerebrovascular disease variables (STK), we see that 0 (never diagnosed) is 92.2% and 93.7%, respectively, and 1 (diagnosed) is 7.8% and 6.3%, respectively.


In the Stat Panel, let’s create a “Table 1” to represent the baseline characteristics of the data, based on anemia status (ANE).


For cerebrovascular disease status (STK), again from a Table 1, we can see that the difference in total cholesterol (TCHOL) by gender (SEX) is significant, with a p-value less than 0.05.
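
The percentages and p-values in this section correspond to fairly standard calls in plain R. The sketch below uses simulated data, so its numbers say nothing about the real bloodTest results. I have not checked which tests Statgarten applies internally; a two-sample t-test for a numeric variable and a chi-squared test for a categorical one are common Table 1 defaults and are assumed here.

```r
set.seed(42)
n <- 200

# Simulated stand-in for bloodTest: 10% anemia, random gender and cholesterol
toy <- data.frame(
  ANE   = factor(rep(c(0, 1), times = c(180, 20))),
  SEX   = factor(sample(c("M", "F"), n, replace = TRUE)),
  TCHOL = rnorm(n, mean = 195, sd = 35)
)

# Share of each level in percent, similar to the Distribution view
ane_pct <- round(100 * prop.table(table(toy$ANE)), 1)
ane_pct

# Table 1 style comparison by anemia status:
# mean total cholesterol per group, compared with a two-sample t-test
aggregate(TCHOL ~ ANE, data = toy, FUN = mean)
p_tchol <- t.test(TCHOL ~ ANE, data = toy)$p.value

# Gender counts per group, compared with a chi-squared test
p_sex <- chisq.test(table(toy$SEX, toy$ANE))$p.value

c(TCHOL = p_tchol, SEX = p_sex)
```

Swapping ANE for STK in the formulas would give the equivalent comparison by cerebrovascular disease status.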


5. Share results

I think Quarto (or R Markdown) is the most effective way to share data analysis results in R, but utilizing it in a shiny app is another matter.

As a result, statgarten’s results sharing is limited to exporting a data table or downloading an image.



⛳ Statgarten as Open source

The goal of the statgarten project is to help people process and utilize data in a rapidly growing data economy, and to foster data literacy for all.

The project is being developed with the support of the Ministry of Science and ICT of the Republic of Korea, and has been selected as a target for the 2022 Information and Communication Technology Development Project and the Standards Development Support Project.

But at the same time, it is an open source project that everyone can use and contribute to freely. (We’ve also used other open source projects in the development process)

It is being developed in various forms such as web app, docker, and R package, and is open to various forms of contributions such as development, case sharing, and suggestions.

Please try it out, raise an issue, fork or stargaze it, or suggest what you need, and we’ll do our best to incorporate it, so please support us 🙂

For more information, you can check out our github page or drop us an email.

Thanks.

(Translated with DeepL ❤️)

Spatial Data Wrangling with R workshop

Learn how to wrangle spatial data in R! Join our workshop on Spatial Data Wrangling with R: A Comprehensive Guide, which is a part of our workshops for Ukraine series.


Here’s some more info: 

Title: Spatial Data Wrangling with R: A Comprehensive Guide

Date: Thursday, April 6th, 18:00 – 20:00 CEST (Rome, Berlin, Paris timezone) 

Speaker: Long Nguyen is a PhD student at SOEP RegioHub at Bielefeld University. He likes to make pretty maps.

Description: This workshop is designed to provide a solid foundation for working with spatial data in R. Starting with fundamental concepts of spatial data types and structures, the workshop provides a systematic overview of techniques for manipulating spatial data, such as spatial aggregation, spatial joins, spatial geometry transformations, and distance calculations. With this focus, the workshop’s aim is to give participants a skill set that is easily extendable and transferable to new data and tools. The data wrangling techniques presented will be accompanied by instructions on creating maps – both static and interactive – to quickly explore and present the results of the operations performed.

Minimal registration fee: 20 euro (or 20 USD or 800 UAH)

How can I register?

  • Save your donation receipt (after the donation is processed, there is an option to enter your email address on the website to which the donation receipt is sent)
  • Fill in the registration form, attaching a screenshot of a donation receipt (please attach the screenshot of the donation receipt that was emailed to you rather than the page you see after donation).

If you are not personally interested in attending, you can also contribute by sponsoring the participation of a student, who will then be able to participate for free. If you choose to sponsor a student, all proceeds will also go directly to organisations working in Ukraine. You can either sponsor a particular student or you can leave it up to us so that we can allocate the sponsored place to students who have signed up for the waiting list.

How can I sponsor a student?

  • Save your donation receipt (after the donation is processed, there is an option to enter your email address on the website to which the donation receipt is sent)
  • Fill in the sponsorship form, attaching the screenshot of the donation receipt (please attach the screenshot of the donation receipt that was emailed to you rather than the page you see after the donation). You can indicate whether you want to sponsor a particular student or we can allocate this spot ourselves to the students from the waiting list. You can also indicate whether you prefer us to prioritize students from developing countries when assigning place(s) that you sponsored.

If you are a university student and cannot afford the registration fee, you can also sign up for the waiting list here. (Note that you are not guaranteed to participate by signing up for the waiting list as only those students who are sponsored can participate). Since the number of sponsored places is usually lower than the number of people signing up for the waitlist, we ask you to sign up via the regular registration process to ensure your participation if you can.

You can also find more information about this workshop series, a schedule of our future workshops, and a list of our past workshops, for which you can get the recordings and materials, here.

Looking forward to seeing you during the workshop!


Dataviz with R and ggplot: Using colour and annotations for effective story telling workshop

Learn how to use annotations and colors in your ggplot plots! Join our workshop on Dataviz with R and ggplot: Using colour and annotations for effective story telling, which is a part of our workshops for Ukraine series.


Here’s some more info: 

Title: Dataviz with R and ggplot: Using colour and annotations for effective story telling

Date: Thursday, April 20th, 18:00 – 20:00 CEST (Rome, Berlin, Paris timezone)

Speaker: Cara Thompson. Cara is a freelance data consultant with an academic background, specialising in dataviz and in “enhanced” reproducible outputs. She lives in Edinburgh, Scotland, and is passionate about maximising the impact of other people’s expertise.

Description: If we’re passionate about our data and the patterns we’ve found, a key part of our job is to find effective ways of communicating what we’ve discovered. Intuitive and compelling data visualisations are a great way to draw attention to our main story, and illustrate some of the details. 

In this workshop, we’ll talk about how we can make use of colour, fonts and a few other tricks to make it easier for readers to understand and remember our main story and make our plots publication-ready. We’ll be using R and ggplot to create, modify and annotate the plots we discuss, but the principles apply regardless of the tools you use to plot your data. 

Attendees are encouraged to bring along a plot of their own (which doesn’t need to be made with ggplot!) so that they can think about how best to apply the principles to their own context – and for a chance to get some live feedback during our Q&A session.

Minimal registration fee: 20 euro (or 20 USD or 800 UAH)


How can I register?


  • Save your donation receipt (after the donation is processed, there is an option to enter your email address on the website to which the donation receipt is sent)
  • Fill in the registration form, attaching a screenshot of a donation receipt (please attach the screenshot of the donation receipt that was emailed to you rather than the page you see after donation). You can also submit a plot made by you for a chance to get feedback!

If you are not personally interested in attending, you can also contribute by sponsoring the participation of a student, who will then be able to participate for free. If you choose to sponsor a student, all proceeds will also go directly to organisations working in Ukraine. You can either sponsor a particular student or you can leave it up to us so that we can allocate the sponsored place to students who have signed up for the waiting list.


How can I sponsor a student?

  • Save your donation receipt (after the donation is processed, there is an option to enter your email address on the website to which the donation receipt is sent)
  • Fill in the sponsorship form, attaching the screenshot of the donation receipt (please attach the screenshot of the donation receipt that was emailed to you rather than the page you see after the donation). You can indicate whether you want to sponsor a particular student or we can allocate this spot ourselves to the students from the waiting list. You can also indicate whether you prefer us to prioritize students from developing countries when assigning place(s) that you sponsored.

If you are a university student and cannot afford the registration fee, you can also sign up for the waiting list here. (Note that you are not guaranteed to participate by signing up for the waiting list).


You can also find more information about this workshop series, a schedule of our future workshops, and a list of our past workshops, for which you can get the recordings and materials, here.


Looking forward to seeing you during the workshop!




Structural Equation Modeling in R with the Lavaan package workshop

Learn how to use Structural Equation modeling in R! Join our workshop on Structural Equation Modeling in R with the Lavaan package which is a part of our workshops for Ukraine series. 


Here’s some more info: 


Title: Structural Equation Modeling in R with the Lavaan package


Date: Thursday, March 30th, 18:00 – 20:00 CEST (Rome, Berlin, Paris timezone) 


Speaker: Nino Gugushvili is a post-Doc researcher at the Department of Work and Social Psychology at Maastricht University.


Description: In this workshop, we will go over the basics of structural equation modelling (SEM). We will talk about what SEM is and cover the essential steps of SEM. Next, we will learn path analysis (SEM with observed variables), confirmatory factor analysis, and full SEM (SEM with latent variables + observed variables). Along the way, we will also talk about revising our models and interpreting the results, and we’ll do all this in R, using the Lavaan package.


Minimal registration fee: 20 euro (or 20 USD or 800 UAH)




How can I register?



  • Save your donation receipt (after the donation is processed, there is an option to enter your email address on the website to which the donation receipt is sent)

  • Fill in the registration form, attaching a screenshot of a donation receipt (please attach the screenshot of the donation receipt that was emailed to you rather than the page you see after donation).

If you are not personally interested in attending, you can also contribute by sponsoring the participation of a student, who will then be able to participate for free. If you choose to sponsor a student, all proceeds will also go directly to organisations working in Ukraine. You can either sponsor a particular student or you can leave it up to us so that we can allocate the sponsored place to students who have signed up for the waiting list.


How can I sponsor a student?


  • Save your donation receipt (after the donation is processed, there is an option to enter your email address on the website to which the donation receipt is sent)

  • Fill in the sponsorship form, attaching the screenshot of the donation receipt (please attach the screenshot of the donation receipt that was emailed to you rather than the page you see after the donation). You can indicate whether you want to sponsor a particular student or we can allocate this spot ourselves to the students from the waiting list. You can also indicate whether you prefer us to prioritize students from developing countries when assigning place(s) that you sponsored.


If you are a university student and cannot afford the registration fee, you can also sign up for the waiting list here. (Note that you are not guaranteed to participate by signing up for the waiting list).



You can also find more information about this workshop series, a schedule of our future workshops, and a list of our past workshops, for which you can get the recordings and materials, here.


Looking forward to seeing you during the workshop!





Generalized Additive Models in R workshop

Learn how to fit Generalized Additive Models in R! Join our workshop on Generalized Additive Models in R which is a part of our workshops for Ukraine series. 


Here’s some more info: 


Title: Generalized Additive Models in R


Date: Thursday, April 13th, 18:00 – 20:00 CEST (Rome, Berlin, Paris timezone)


Speaker: Gavin Simpson. Gavin is a statistical ecologist and freshwater ecologist/palaeoecologist. He has a B.Sc. in Environmental Geography and a Ph.D. in Geography from University College London (UCL), UK. After submitting his Ph.D. thesis in 2001, Gavin worked as an environmental consultant and research scientist in the Department of Geography, UCL, before moving, in 2013, to a research position at the Institute of Environmental Change and Society, University of Regina, Canada. Gavin moved back to Europe in 2021 and is now Assistant Professor of Applied Statistics in the Department of Animal and Veterinary Sciences at Aarhus University, Denmark. Gavin’s research broadly concerns how populations and ecosystems change over time and respond to disturbance, at time scales from minutes and hours, to centuries and millennia. Gavin has developed several R packages, including gratia, analogue, and cocorresp. He helps maintain the vegan package and can often be found answering R- and GAM-related questions on StackOverflow and CrossValidated.



Description: Generalized Additive Models (GAMs) were introduced as an extension to linear and generalized linear models, where the relationships between the response and covariates are not specified up-front by the analyst but are learned from the data themselves. This learning is achieved by representing the effect of a covariate on the response as a smooth function, rather than following a fixed form (linear, quadratic, etc). GAMs are a large and flexible class of models that are widely used in applied research because of their flexibility and interpretability.

The workshop will explain what a GAM is and how penalized splines and automatic smoothness selection methods work, before focusing on the practical aspects of fitting GAMs to data using the mgcv R package, and will be most useful to people who already have some familiarity with linear and generalized linear models.



Minimal registration fee: 20 euro (or 20 USD or 750 UAH)




How can I register?



  • Save your donation receipt (after the donation is processed, there is an option to enter your email address on the website to which the donation receipt is sent)

  • Fill in the registration form, attaching a screenshot of a donation receipt (please attach the screenshot of the donation receipt that was emailed to you rather than the page you see after donation).

If you are not personally interested in attending, you can also contribute by sponsoring the participation of a student, who will then be able to participate for free. If you choose to sponsor a student, all proceeds will also go directly to organisations working in Ukraine. You can either sponsor a particular student or you can leave it up to us so that we can allocate the sponsored place to students who have signed up for the waiting list.


How can I sponsor a student?


  • Save your donation receipt (after the donation is processed, there is an option to enter your email address on the website to which the donation receipt is sent)

  • Fill in the sponsorship form, attaching the screenshot of the donation receipt (please attach the screenshot of the donation receipt that was emailed to you rather than the page you see after the donation). You can indicate whether you want to sponsor a particular student or we can allocate this spot ourselves to the students from the waiting list. You can also indicate whether you prefer us to prioritize students from developing countries when assigning place(s) that you sponsored.


If you are a university student and cannot afford the registration fee, you can also sign up for the waiting list here. (Note that you are not guaranteed to participate by signing up for the waiting list).



You can also find more information about this workshop series, a schedule of our future workshops, and a list of our past workshops, for which you can get the recordings and materials, here.


Looking forward to seeing you during the workshop!




Using R in a High Performance Computing environment

In a common R workflow, one works only on a desktop or laptop machine. This PC environment is convenient because R users can focus mainly on coding, but a program may take a long time to run (more than 1 hr, for instance) and you may need many repetitions of the same simulation. In some cases, the program can eat up the available memory of the PC. On a PC, tools such as Task Manager (Windows), Activity Monitor (Mac), and top/htop (Linux) can help you monitor resource usage.

High Performance Computing (HPC) centers offer the possibility of increasing the resources (memory/CPU power) your program can utilize. If you opt for moving your workflow to an HPC environment, you would need to learn how to deal with it to take full advantage of the provided resources. In this post, I will write some recommendations that we offer to our users at the High Performance Computing Center North (HPC2N) but that could be applied to other centers as well.

One important aspect that, in my experience, tends to create issues when moving to HPC is the terminology. Some of the common terms used in HPC, such as cores, CPUs, nodes, shared memory, and distributed memory computing, among others, are covered in an R for HPC course that we delivered previously in collaboration with the Parallelldatorcentrum (PDC) in Stockholm.

In an HPC environment, one allocates some resources (cores and memory) for running an R program. On a PC this step is in most cases hidden from the user, but under the hood the R program assumes that all resources in that machine are available and tries to use them. Because in HPC this step must be done explicitly (through batch text files or a web server such as Open OnDemand), you will need to consciously decide how many cores and how much memory your R program can use efficiently. For instance, if you request 10 cores and 20 GB of RAM but your application is not parallelized (serial code) and uses < 1 GB, 9 cores will be idle during the simulation. Sometimes this type of setup is fine, though, if your application needs more memory than what is provided with a single core. Also, take into account that most HPC centers work in a project-based manner with some possible cost (monetary or in job priority, for instance).
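
One practical consequence is that an R script should take its worker count from the allocation rather than from the hardware. The sketch below assumes a SLURM job submitted with --cpus-per-task, in which case SLURM sets the environment variable SLURM_CPUS_PER_TASK; other schedulers and site configurations may expose the allocation differently. The helper name pick_workers is hypothetical.

```r
library(parallel)

# Pick a worker count that respects the scheduler's allocation.
# `allocated` is the raw value of an environment variable such as
# SLURM_CPUS_PER_TASK; an empty string means the variable is not set.
pick_workers <- function(allocated = Sys.getenv("SLURM_CPUS_PER_TASK"),
                         fallback = 1L) {
  n <- suppressWarnings(as.integer(allocated))
  if (is.na(n) || n < 1L) fallback else n
}

pick_workers()   # inside such a job: the cores allocated to the task
detectCores()    # all cores on the node, whether allocated to you or not
```

Using detectCores() directly on a shared node can start more workers than the job was granted, which is why the fallback here is a conservative single worker.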

Some R packages that make use of Linear Algebra libraries, such as BLAS and LAPACK, can automatically trigger the use of several threads. One way to explicitly control the number of threads to be used is with the package RhpcBLASctl as follows:

library(RhpcBLASctl)
blas_set_num_threads(8) #set the number of threads to 8

In some packages, a parallelization layer has been introduced by using a backend (such as the parallel package), for instance in heavy routines like bootstrapping (boot package). Other packages opted for a threaded mechanism; for clustering, for instance, there is the clusternor package. Examples of the usage of these packages can be found here.

In the cases already mentioned, someone did the job of parallelizing the application for us, and we only need to set the number of threads or workers. If we are the R code developers and want to port a serial program into a parallel one, we will most likely need to refactor the code and change our programming paradigm. It is important to mention that not all parts of a program are suitable for parallelization, and some parts, although parallelizable, may not show a significant speedup (the ratio of the simulation time with 1 core to the time with N cores). Thus, one important aspect of code parallelization is to analyze the code (profiling) by timing its parts and locating the bottlenecks that are suitable for parallelization.
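
As a minimal sketch of this kind of analysis, base R's system.time can time an individual piece of code, and the speedup definition above becomes a one-line function. The function names slow_part, speedup and efficiency, and the timings passed to them, are made up for illustration; efficiency (speedup divided by the number of cores) is a common companion measure that I add here as an assumption.

```r
# Time one piece of code; "elapsed" is wall-clock time in seconds
slow_part <- function(n) {
  total <- 0
  for (i in seq_len(n)) total <- total + sqrt(i)
  total
}
timing <- system.time(slow_part(1e6))
timing[["elapsed"]]

# Speedup: time on 1 core divided by time on N cores
speedup <- function(t_serial, t_parallel) t_serial / t_parallel

# Parallel efficiency: speedup per core (1 would be ideal scaling)
efficiency <- function(t_serial, t_parallel, n_cores) {
  speedup(t_serial, t_parallel) / n_cores
}

speedup(120, 40)         # hypothetical timings in seconds -> 3
efficiency(120, 40, 4)   # -> 0.75
```

An efficiency well below 1 suggests that adding more cores to that part of the code is unlikely to pay off.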

In the following serial (unoptimized) code, I compute the 2D integral of the sine function between 0 and π in both the x and y ranges:

∫∫sin(x+y)dxdy = 0 

integral <- function(N) {
  # Function for computing a 2D sine integral
  h <- pi / N  # Size of grid
  mySum <- 0   # Camel convention for variables' names

  for (i in 1:N) {        # Discretization in the x direction
    x <- h * (i - 0.5)    # x coordinate of the grid cell
    for (j in 1:N) {      # Discretization in the y direction
      y <- h * (j - 0.5)  # y coordinate of the grid cell
      mySum <- mySum + sin(x + y)  # Computing the integral
    }
  }

  return(mySum * h * h)
}
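
Before parallelizing, it can help to have an independent check of the expected result. The following is a minimal sketch of a loop-free variant of the same midpoint sum, assuming there is enough memory for an N x N matrix; the name integral_vectorized is illustrative.

```r
integral_vectorized <- function(N) {
  h <- pi / N                      # size of a grid cell
  mid <- h * (seq_len(N) - 0.5)    # cell midpoints, the same in x and y
  # outer() builds the N x N matrix of all x + y combinations
  sum(sin(outer(mid, mid, "+"))) * h * h
}

integral_vectorized(1000)   # very close to 0, up to rounding error
```

This trades memory for speed, so it suits moderate N; for very large grids the loop over x remains the natural unit of work to distribute.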

One way to parallelize this code is to divide the workload (the for loop in the x direction) evenly among some number of workers. In the present case, I will make use of the foreach function, which becomes available when loading the doParallel package and allows running tasks in parallel mode. Once I have decided which part of the code to parallelize (the x integration) and the tools (foreach), I can refactor my original code. One possible parallel version is:

integral_parallel <- function(N, i) {
  # Parallel function for computing a 2D sine integral
  h <- pi / N  # Size of grid (computed here so every worker has it)
  myPartialSum <- 0.0
  x <- h * (i - 0.5)      # x coordinate of the grid cell
  for (j in 1:N) {        # Discretization in the y direction
    y <- h * (j - 0.5)    # y coordinate of the grid cell
    myPartialSum <- myPartialSum + sin(x + y)  # Computing the integral
  }

  return(myPartialSum)
}

Notice that here I changed the original programming paradigm because now my function only computes a partial value for each worker. The total value will be known only after all the workers finish their tasks and the results are combined at the end. The doParallel package requires the initialization of a cluster, and the foreach function requires the %dopar% operator to run tasks in parallel mode:

library(doParallel)

N <- 1000    # Number of grid cells per direction (example value)
M <- 4       # Number of workers (example value; match your allocation)
h <- pi / N  # Size of grid, needed for the final scaling

cl <- makeCluster(M) # Create the cluster with M workers
registerDoParallel(cl)
r <- foreach(i = 1:N, .combine = 'c') %dopar% integral_parallel(N, i)
stopCluster(cl)
result <- sum(r) * h * h # Summarize and print out final result
result

The complete example can be found here

A common mistake of HPC users is to reuse batch scripts from other centers, assuming that the SLURM or PBS job schedulers behave identically everywhere. Although that is true for the standard features, system administrators at one center may activate switches that are unavailable or behave slightly differently at other centers.

One recommendation is to use the HPC tools available in your center to monitor the resources’ usage by a simulation. If you have access to the computing nodes the most straightforward way to obtain this information is with top/htop commands. Otherwise, tools such as Grafana or Ganglia would be handy if they are available in your center.

Additional resources:
  • R in HPC course offered by HPC2N/PDC 

The State of Data Literacy 2023, by DataCamp


In 2023, 87% of leaders recognize data literacy as the most important skill behind basic computer skills. However, only a third of organizations are offering data upskilling.

For most teams, bridging the data literacy skills gap is a universal challenge across modern businesses. Just as workforces adopted computers in the 1980s, and the internet in the 2000s, now organizations must embrace data skills to stay competitive, drive innovation, and attract top talent.

To help close this gap, DataCamp invested months into compiling The State of Data Literacy 2023 report, an expert-led and free-to-download guide to navigating the current data skills revolution, including a foreword from CEO and co-founder, Jonathan Cornelissen.

DataCamp independently surveyed over 550 business leaders across the UK and US to shed light on the most pressing data skills gaps facing modern organizations. In doing so, they uncovered key insights into the strategies data-first organizations are using to upskill their workforces. 

From companies taking their first steps into data literacy to data mature organizations, the report takes multiple leadership perspectives and dives into the business and individual benefits of data upskilling.

A key highlight revealed that leaders who engaged in data upskilling programs experienced more than 70% improvement in quality and speed of decision-making, innovation, customer experience, and employee retention across the board.

Meanwhile, three of the top five fastest-growing skills in the past five years were data skills: business intelligence (41%), data science (37%), and data literacy (30%). In addition, 77% of leaders agreed they would pay a salary premium to candidates with data literacy skills.

Download the report now to discover key insights that you can start applying in your organization today.

Download Now

RADAR 2023 | Free Annual Summit of the World’s Data Leaders


Join a selection of the world’s data leaders for a two-day digital event, presented by DataCamp and designed to help data professionals build stronger careers in 2023.

From gaining a deeper understanding of which skills industry leaders are looking for to navigating the evolving data talent pool, uncover insights on data’s most pressing opportunities through a mix of keynotes, fireside chats, and panels.

Across these expert-led sessions, learn from the people at the forefront of data transformation with leaders from world-class organizations such as Tableau, Alteryx, Qlik, Salesforce, JetBrains, Google, CBRE, and more.

From R to Python, Jupyter, and beyond, this is an unmissable event for anyone looking to strengthen their wider data skillset and accelerate their careers.

March 22-23 2023, 9 AM – 3 PM EST: Save your seat now.

Key sessions aimed at up-and-coming data scientists:

Breaking Into Data in 2023: How Building a Personal Brand Can Accelerate Data Careers
The secrets to a successful data career with the founder of DATAcated, Kate Strachnyi. Learn how to build a personal brand, create opportunities through networking, and build lasting connections within the data community.

How The Data Job Market Is Evolving in 2023
Stay informed on how the data job market is evolving in 2023. Join the CEO of Orbition Group to learn about breaking into a competitive market, and the importance of soft skills and value creation in building a successful data career.

An In-depth Guide to the DataCamp Certifications
DataCamp’s certification program is ranked the #1 data certification program by Forbes. In this session, DataCamp’s VP of Certification, Vicky Kennedy, discusses how a DataCamp certification can accelerate your data career. You’ll learn about the two levels of certification and how to prepare for exams. You’ll also uncover insider secrets to acing the case study, a take-home exercise based on real-world data scenarios.

Tips For Building An Effective Data Science Portfolio
Portfolio projects are the silver bullet for a lack of work experience when it comes to finding data roles. Naledi Hollbruegge, Data Analytics Consultant, and James Le, Developer Advocate at Twelve Labs, outline how to effectively present your portfolio projects to highlight your technical and soft skills.

Ask a Hiring Manager: The Keys to Landing a Job in Data Science
Google’s Director of Ads Safety, Lukas Tencer, and DataCamp’s Director of Analytics, Jorge Vasquez, discuss what drives successful data applicants. Throughout, they’ll answer audience questions on the key characteristics of successful data applicants, the questions hiring managers expect, and more.

View the full agenda and register here

Working with ChatGPT in R workshop

Learn how to use ChatGPT to improve your coding skills in R! Join our workshop on Working with ChatGPT in R which is a part of our workshops for Ukraine series. 


Here’s some more info: 


Title: Working with ChatGPT in R


Date: Thursday, March 9th, 18:00 – 20:00 CET (Rome, Berlin, Paris timezone) 


Speaker: Dariia Mykhailyshyna, PhD Economics student at the University of Bologna. She previously worked at a Ukrainian think tank, Centre of Economic Strategy.


Description: In this workshop we will learn how you can fully harness the power of ChatGPT to improve your R coding. We will learn how to access ChatGPT directly from R, how to make it write R code (including fairly long and complicated commands), debug its (and your) code, translate code from one programming language to another, comment your code, make it more efficient, and more! We will also explore some of the drawbacks of ChatGPT and examine when and why you can’t always rely on it.


Minimal registration fee: 20 euro (or 20 USD or 800 UAH)




How can I register?



  • Save your donation receipt (after the donation is processed, there is an option to enter your email address on the website to which the donation receipt is sent)

  • Fill in the registration form, attaching a screenshot of a donation receipt (please attach the screenshot of the donation receipt that was emailed to you rather than the page you see after donation).

If you are not personally interested in attending, you can also contribute by sponsoring a student’s participation, allowing that student to attend for free. If you choose to sponsor a student, all proceeds will also go directly to organisations working in Ukraine. You can either sponsor a particular student or leave it up to us to allocate the sponsored place to a student who has signed up for the waiting list.


How can I sponsor a student?


  • Save your donation receipt (after the donation is processed, there is an option to enter your email address on the website to which the donation receipt is sent)

  • Fill in the sponsorship form, attaching the screenshot of the donation receipt (please attach the screenshot of the donation receipt that was emailed to you rather than the page you see after the donation). You can indicate whether you want to sponsor a particular student or we can allocate this spot ourselves to the students from the waiting list. You can also indicate whether you prefer us to prioritize students from developing countries when assigning place(s) that you sponsored.


If you are a university student and cannot afford the registration fee, you can also sign up for the waiting list here. (Note that signing up for the waiting list does not guarantee participation.)



You can also find more information about this workshop series, a schedule of our future workshops, and a list of our past workshops (for which you can get the recordings and materials) here.


Looking forward to seeing you during the workshop!