ease_aes() Demo

Easing

In R, easing is the interpolation, or tweening, between successive states of a plot (1). It is used to control the motion of data elements in animated data displays (2), with different easing functions giving different appearances or dynamics to the display’s animation.

The ease_aes() Function

The ease_aes() function controls the easing of aesthetics or variables in gganimate. The default, ease_aes(), models a linear transition between states. Other easing functions are specified by the easing function name with one of three modifiers appended (3):
Easing Functions
quadratic models a power function with exponent 2.
cubic models a power function with exponent 3.
quartic models a power function with exponent 4.
quintic models a power function with exponent 5.
sine models a sine function.
circular models a pi/2 circle arc.
exponential models an exponential function of base 2.
elastic models an elastic release of energy.
back models a pullback and release.
bounce models the bouncing of a ball.
Modifiers
-in applies the easing function as-is.
-out applies the easing function in reverse.
-in-out applies the easing function as-is in the first half of the transition and in reverse in the last half.
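
As a rough illustration of how the modifiers act on a base function, here is a minimal base-R sketch for the power family (cubic by default). The helper names ease_in(), ease_out() and ease_in_out() are made up for this example and are not part of gganimate or tweenr; the actual implementation is in the C source linked in (4).

```r
# Hypothetical helpers for illustration only (not gganimate/tweenr functions).
# p is the progress of the transition, from 0 (start state) to 1 (end state).

# "-in": the power function applied as-is (slow start, fast finish)
ease_in <- function(p, exponent = 3) p^exponent

# "-out": the same curve applied in reverse (fast start, slow finish)
ease_out <- function(p, exponent = 3) 1 - ease_in(1 - p, exponent)

# "-in-out": "-in" over the first half, "-out" over the last half
ease_in_out <- function(p, exponent = 3) {
  ifelse(p < 0.5,
         ease_in(2 * p, exponent) / 2,
         0.5 + ease_out(2 * p - 1, exponent) / 2)
}

# compare the three cubic variants at a few points of the transition
p <- c(0, 0.25, 0.5, 0.75, 1)
cubic <- data.frame(
  p      = p,
  in_    = ease_in(p),
  out    = ease_out(p),
  in_out = ease_in_out(p)
)
print(cubic)
```

Every variant starts at 0 and ends at 1; they differ only in where along the transition the change is concentrated.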

The formulas used to implement these options can be found here (4).  They are illustrated below using animated scatter plots and bar charts.

Data File Description

‘data.frame’: 40 obs. of 7 variables:
Cat : chr “A” “A” “A” “A” …
OrdInt: int 1 2 3 4 5 1 2 3 4 5 …
X : num 70.5 78.1 70.2 78.1 70.5 30.7 6.9 26.7 6.9 30.7 …
Y : num 1.4 7.6 -7.9 7.6 1.4 -7 -23.8 19.8 -23.8 -7 …
Rank : int 2 2 2 2 2 7 8 8 8 7 … 
OrdDat: chr “01/01/2019” “04/01/2019” “02/01/2019” “05/01/2019” …
Ord : Date, format: “2019-01-01” “2019-04-01” “2019-02-01” “2019-05-01” …

The data used for this demo are a heavily abbreviated and genericized version of this (5) data set.

Scatter Plots

Here is the code used for the ease_aes(‘cubic-in’) animated scatter plot:
# load libraries
library(gganimate) 
library(tidyverse)

# the data file format is indicated above
data <- read.csv('Data.csv')

# convert date to proper format
data$Ord <- as.Date(data$OrdDat, format='%m/%d/%Y')

# specify the animation length (30 frames) and rate (10 frames per second)
options(gganimate.nframes = 30)
options(gganimate.fps = 10)

# loop the animation without pausing on the last frame
options(gganimate.end_pause = 0)

# specify the data source and the X and Y variables
ggplot(data, aes(X, Y)) + 

# specify the plot format
theme(panel.background = element_rect(fill = 'white'))+ 
theme(axis.line = element_line()) +
theme(axis.text = element_blank())+
theme(axis.ticks = element_blank())+
theme(axis.title = element_blank()) +
theme(plot.title = element_text(size = 20)) +
theme(plot.margin = margin(25, 25, 25, 25)) +
theme(legend.position = 'none') +

# create a scatter plot
geom_point(aes(color = Cat), size = 5) +

# indicate the color scale (the points are mapped to color, not fill)
scale_color_viridis_d(option = "D", begin = 0, end = 1) +

# group the points by the 'Cat' variable
aes(group = Cat) +

# animate the plot on the basis of the 'Ord' variable
transition_time(Ord) +

# the ease_aes() function
ease_aes('cubic-in') +

# title the plot
labs(title = "'ease_aes(cubic-in)'")
ease_aes('linear') scatter plot
This is the default value, equivalent to ease_aes(). It is the only easing function that does not take a modifier.
ease_aes('quadratic-in') scatter plot
ease_aes('quadratic-out') scatter plot
ease_aes('quadratic-in-out') scatter plot
ease_aes('cubic-in') scatter plot
ease_aes('cubic-out') scatter plot
ease_aes('cubic-in-out') scatter plot
ease_aes('quartic-in') scatter plot
ease_aes('quartic-out') scatter plot
ease_aes('quartic-in-out') scatter plot
ease_aes('quintic-in') scatter plot
ease_aes('quintic-out') scatter plot
ease_aes('quintic-in-out') scatter plot
ease_aes('sine-in') scatter plot
ease_aes('sine-out') scatter plot
ease_aes('sine-in-out') scatter plot
ease_aes('circular-in') scatter plot
ease_aes('circular-out') scatter plot
ease_aes('circular-in-out') scatter plot
ease_aes('exponential-in') scatter plot
ease_aes('exponential-out') scatter plot
ease_aes('exponential-in-out') scatter plot
ease_aes('elastic-in') scatter plot
ease_aes('elastic-out') scatter plot
ease_aes('elastic-in-out') scatter plot
ease_aes('back-in') scatter plot
ease_aes('back-out') scatter plot
ease_aes('back-in-out') scatter plot
ease_aes('bounce-in') scatter plot
ease_aes('bounce-out') scatter plot
ease_aes('bounce-in-out') scatter plot
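
The 31 variants listed above follow a regular naming pattern, so the strings accepted by ease_aes() can be generated rather than typed. A small base-R sketch (the object names are arbitrary and chosen for this example):

```r
# the ten easing functions and three modifiers described earlier
easing_functions <- c("quadratic", "cubic", "quartic", "quintic", "sine",
                      "circular", "exponential", "elastic", "back", "bounce")
modifiers <- c("in", "out", "in-out")

# paste every function with every modifier; t() keeps the three
# variants of each function next to each other
variants <- as.vector(t(outer(easing_functions, modifiers, paste, sep = "-")))

# 'linear' is the only easing that takes no modifier
easings <- c("linear", variants)
length(easings)
head(easings, 4)
```

Each string could then be passed to ease_aes() in a loop to render one animation per easing, reusing the plotting code shown above.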

Bar Charts

Here is the code used for the ease_aes(‘cubic-in’) animated bar chart:
# load libraries
library(gganimate) 
library(tidyverse)  

# the data file format is indicated above
data <- read.csv('Data.csv')

# convert date to proper format 
data$Ord <- as.Date(data$OrdDat, format='%m/%d/%Y')
            
# specify the animation length and rate
options(gganimate.nframes = 30)
options(gganimate.fps = 10)

# loop the animation 
options(gganimate.end_pause = 0)

# specify the data source
ggplot(data) +

# specify the plot format
    theme(panel.background = element_rect(fill = 'white'))+
    theme(panel.grid.major.x  = element_line(color='gray'))+
    theme(axis.text = element_blank())+
    theme(axis.ticks = element_blank())+
    theme(axis.title = element_blank()) +
    theme(plot.title = element_text(size = 20)) +
    theme(plot.margin = margin(25, 25, 25, 25)) +
    theme(legend.position = 'none') +

# specify the extent of each bar (rectangle)
    aes(xmin = 0, xmax = X + 2) +
    aes(ymin = Rank - .45,
        ymax = Rank + .45,
        y = Rank) +

# create a bar chart
    geom_rect() +

# indicate the fill color scale
    scale_fill_viridis_d(option = "D", begin = 0, end = 1) +

# place larger values at the top
    scale_y_reverse() +

# apply the fill to the 'Cat' variable
    aes(fill = Cat) +

# animate the plot on the basis of the 'Ord' variable
    transition_time(Ord) +

# the ease_aes() function
    ease_aes('cubic-in') +

# title the plot
    labs(title = "'ease_aes(cubic-in)'")

ease_aes('linear') bar chart
This is the default value, equivalent to ease_aes(). It is the only easing function that does not take a modifier.
ease_aes('quadratic-in') bar chart
ease_aes('quadratic-out') bar chart
ease_aes('quadratic-in-out') bar chart
ease_aes('cubic-in') bar chart
ease_aes('cubic-out') bar chart
ease_aes('cubic-in-out') bar chart
ease_aes('quartic-out') bar chart
ease_aes('quartic-in') bar chart
ease_aes('quartic-in-out') bar chart
ease_aes('quintic-in') bar chart
ease_aes('quintic-out') bar chart
ease_aes('quintic-in-out') bar chart
ease_aes('sine-in') bar chart
ease_aes('sine-out') bar chart
ease_aes('sine-in-out') bar chart
ease_aes('circular-in') bar chart
ease_aes('circular-out') bar chart
ease_aes('circular-in-out') bar chart
ease_aes('exponential-in') bar chart
ease_aes('exponential-out') bar chart
ease_aes('exponential-in-out') bar chart
ease_aes('elastic-in') bar chart
ease_aes('elastic-out') bar chart
ease_aes('elastic-in-out') bar chart
ease_aes('back-in') bar chart
ease_aes('back-out') bar chart
ease_aes('back-in-out') bar chart
ease_aes('bounce-in') bar chart
ease_aes('bounce-out') bar chart
ease_aes('bounce-in-out') bar chart

Other Resources

Here (6) is a nice illustration of the various easing options presented as animated paths.

Here (7) and here (8) are articles presenting various animated plots using ease_aes().

Here (9) are some other animations using ease_aes().

References

1 https://www.rdocumentation.org/packages/gganimate/versions/1.0.7/topics/ease_aes

2 https://rdrr.io/cran/tweenr/

3 https://gganimate.com/reference/ease_aes.html

4 https://github.com/thomasp85/tweenr/blob/master/src/easing.c

5 https://academic.udayton.edu/kissock/http/Weather/default.htm

6 https://easings.net/

7 https://github.com/ropenscilabs/learngganimate/blob/master/ease_aes.md

8 https://www.statworx.com/de/blog/animated-plots-using-ggplot-and-gganimate/

9 https://www.r-graph-gallery.com/animation.html

Isovists using uniform ray casting in R

Isovists are polygons of the areas visible from a point. They exclude views that are blocked by objects, typically buildings. They can be used to understand the existing impact of urban design features that can change people’s behaviour (e.g. advertising boards, security cameras or trees), or to decide where to place them. Here I present a custom function that creates a visibility polygon (isovist) using a uniform ray casting “physical” algorithm in R.

First we load the required packages (use install.packages() first if these are not already installed in R):

library(sf)
library(dplyr)
library(ggplot2) 

Data generation

First we create and plot an example footway with viewpoints and a set of buildings which block views. All data used should be in the same Coordinate Reference System (CRS). We generate one viewpoint every 50 m (note that density here is a function of the st_crs() units, in this case meters).
library(sf)
footway <- st_sfc(st_linestring(rbind(c(-50,0),c(150,0))))
st_crs(footway) = 3035 
viewpoints <- st_line_sample(footway, density = 1/50)
viewpoints <- st_cast(viewpoints,"POINT")

buildings <- rbind(c(1,7,1),c(1,31,1),c(23,31,1),c(23,7,1),c(1,7,1),
                   c(2,-24,2),c(2,-10,2),c(14,-10,2),c(14,-24,2),c(2,-24,2),
                   c(21,-18,3),c(21,-10,3),c(29,-10,3),c(29,-18,3),c(21,-18,3),
                   c(27,7,4),c(27,17,4),c(36,17,4),c(36,7,4),c(27,7,4),
                   c(18,44,5), c(18,60,5),c(35,60,5),c(35,44,5),c(18,44,5),
                   c(49,-32,6),c(49,-20,6),c(62,-20,6),c(62,-32,6),c(49,-32,6),
                   c(34,-32,7),c(34,-10,7),c(46,-10,7),c(46,-32,7),c(34,-32,7),
                   c(63,9,8),c(63,40,8),c(91,40,8),c(91,9,8),c(63,9,8),
                   c(133,-71,9),c(133,-45,9),c(156,-45,9),c(156,-71,9),c(133,-71,9),
                   c(152,10,10),c(152,22,10),c(164,22,10),c(164,10,10),c(152,10,10),
                   c(44,8,11),c(44,24,11),c(59,24,11),c(59,8,11),c(44,8,11),
                   c(3,-56,12),c(3,-35,12),c(27,-35,12),c(27,-56,12),c(3,-56,12),
                   c(117,11,13),c(117,35,13),c(123,35,13),c(123,11,13),c(117,11,13),
                   c(66,50,14),c(66,55,14),c(86,55,14),c(86,50,14),c(66,50,14),
                   c(67,-27,15),c(67,-11,15),c(91,-11,15),c(91,-27,15),c(67,-27,15))

buildings <- lapply( split( buildings[,1:2], buildings[,3] ), matrix, ncol=2)
buildings   <- lapply(X = 1:length(buildings), FUN = function(x) {
  st_polygon(buildings[x])
})

buildings <- st_sfc(buildings)
st_crs(buildings) = 3035 

# plot raw data
ggplot() +
  geom_sf(data = buildings,colour = "transparent",aes(fill = 'Building')) +
  geom_sf(data = footway, aes(color = 'Footway')) +
  geom_sf(data = viewpoints, aes(color = 'Viewpoint')) +
  scale_fill_manual(values = c("Building" = "grey50"), 
                    guide = guide_legend(override.aes = list(linetype = c("blank"), 
                                         shape = c(NA)))) +
  
  scale_color_manual(values = c("Footway" = "black", 
                                "Viewpoint" = "red",
                                "Visible area" = "red"),
                     labels = c("Footway", "Viewpoint","Visible area"))+
  guides(color = guide_legend(
    order = 1,
    override.aes = list(
      color = c("black","red"),
      fill  = c("transparent","transparent"),
      linetype = c("solid","blank"),
      shape = c(NA,16))))+
  theme_minimal()+
  coord_sf(datum = NA)+
  theme(legend.title=element_blank())

Isovist function

Function inputs

Buildings should be cast to "POLYGON" if they are not already:
buildings <- st_cast(buildings,"POLYGON")

Creating the function

A few parameters can be set before running the function. rayno is the number of observer view angles from the viewpoint; more rays are more precise, but slow down processing. raydist is the maximum view distance. The function takes sfc_POLYGON and sfc_POINT objects as inputs for the buildings and the viewpoint respectively. If points have a variable view distance, the function can be modified by creating a vector of view distances of length(viewpoints) here and then selecting raydist[x] in st_buffer below.

Each ray is intersected with the building data within its raycast distance, creating one or more ray line segments. The ray line segment closest to the viewpoint is then extracted, and the vertex of this line segment furthest from the viewpoint is taken as a boundary vertex for the isovist. The boundary vertices are joined in a clockwise direction to create the isovist.
st_isovist <- function(
  buildings,
  viewpoint,
  
  # Defaults
  rayno = 20,
  raydist = 100) {
  
  # Input checks (stop with an error if the inputs are of the wrong type)
  if(!inherits(buildings, "sfc_POLYGON")) stop('Buildings must be sfc_POLYGON')
  if(!inherits(viewpoint, "sfc_POINT"))   stop('Viewpoint must be sfc_POINT')
  
  rayends     <- st_buffer(viewpoint,dist = raydist,nQuadSegs = (rayno-1)/4)
  rayvertices <- st_cast(rayends,"POINT")
  
  # Buildings in raydist
  buildintersections <- st_intersects(buildings,rayends,sparse = FALSE)
  
  # If no buildings block max view, return view
  if (!TRUE %in% buildintersections){
    isovist <- rayends
  }
  
  # Calculate isovist if buildings block view from viewpoint
  if (TRUE %in% buildintersections){
    
    rays <- lapply(X = 1:length(rayvertices), FUN = function(x) {
      pair      <- st_combine(c(rayvertices[x],viewpoint))
      line      <- st_cast(pair, "LINESTRING")
      return(line)
    })
    
    rays <- do.call(c,rays)
    rays <- st_sf(geometry = rays,
                  id = 1:length(rays))
    
    buildsinmaxview <- buildings[buildintersections]
    buildsinmaxview <- st_union(buildsinmaxview)
    raysioutsidebuilding <- st_difference(rays,buildsinmaxview)
    
    # Getting each ray segment closest to viewpoint
    multilines  <- dplyr::filter(raysioutsidebuilding, st_is(geometry, c("MULTILINESTRING")))
    singlelines <- dplyr::filter(raysioutsidebuilding, st_is(geometry, c("LINESTRING")))
    multilines  <- st_cast(multilines,"MULTIPOINT")
    multilines  <- st_cast(multilines,"POINT")
    singlelines <- st_cast(singlelines,"POINT")
    
    # Getting furthest vertex of ray segment closest to viewpoint
    singlelines <- singlelines %>% 
      group_by(id) %>%
      dplyr::slice_tail(n = 2) %>%
      dplyr::slice_head(n = 1) %>%
      summarise(do_union = FALSE,.groups = 'drop') %>%
      st_cast("POINT")
    
    multilines  <- multilines %>% 
      group_by(id) %>%
      dplyr::slice_tail(n = 2) %>%
      dplyr::slice_head(n = 1) %>%
      summarise(do_union = FALSE,.groups = 'drop') %>%
      st_cast("POINT")
    
    # Combining vertices, ordering clockwise by ray angle and casting to polygon
    alllines <- rbind(singlelines,multilines)
    alllines <- alllines[order(alllines$id),] 
    isovist  <- st_cast(st_combine(alllines),"POLYGON")
  }
  isovist
}
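
The core of the approach, keeping only the nearest obstruction along each ray, can also be sketched without sf. The base-R code below is a simplified illustration and not the method used by st_isovist(): it treats obstacles as plain wall segments, and ray_hit_distance() and its arguments are hypothetical names introduced for this example.

```r
# Distance from 'origin' along a ray at 'angle' (radians) to the nearest wall,
# capped at 'raydist'. 'walls' is a matrix with one segment per row and
# columns x1, y1, x2, y2. Hypothetical helper, for illustration only.
ray_hit_distance <- function(origin, angle, walls, raydist = 100) {
  d <- c(cos(angle), sin(angle))          # unit direction of the ray
  nearest <- raydist                      # unobstructed rays reach raydist
  for (i in seq_len(nrow(walls))) {
    a  <- walls[i, 1:2]                   # segment start point
    e  <- walls[i, 3:4] - a               # segment direction vector
    ao <- a - origin
    denom <- d[1] * e[2] - d[2] * e[1]    # cross(d, e); zero if parallel
    if (abs(denom) < 1e-12) next
    along_ray <- (ao[1] * e[2] - ao[2] * e[1]) / denom  # distance along the ray
    along_seg <- (ao[1] * d[2] - ao[2] * d[1]) / denom  # position on segment, 0..1
    if (along_ray >= 0 && along_seg >= 0 && along_seg <= 1) {
      nearest <- min(nearest, along_ray)
    }
  }
  nearest
}

# toy scene: a single wall 5 units east of the viewpoint
wall      <- matrix(c(5, -1, 5, 1), nrow = 1)
viewpoint <- c(0, 0)
rayno     <- 8
angles    <- seq(0, 2 * pi, length.out = rayno + 1)[-(rayno + 1)]
dists     <- vapply(angles,
                    function(th) ray_hit_distance(viewpoint, th, wall),
                    numeric(1))

# boundary vertices of the isovist, ordered by ray angle
vertices <- cbind(x = viewpoint[1] + dists * cos(angles),
                  y = viewpoint[2] + dists * sin(angles))
```

In this toy scene only the ray pointing due east is cut short by the wall; the other rays run to the full raydist, as happens in st_isovist() when no building lies along a ray.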

Running the function in a loop

It is possible to wrap the function in a loop to get multiple isovists for a multirow sfc_POINT object. There is no need to heed the "repeating attributes for all sub-geometries" warning, as that is what we want to happen in this case.
isovists   <- lapply(X = 1:length(viewpoints), FUN = function(x) {
  viewpoint   <- viewpoints[x]
  st_isovist(buildings = buildings,
             viewpoint = viewpoint,
             rayno = 41,
             raydist = 100)
})
All isovists are unioned to create a visible area polygon, which can be seen plotted over the original path, viewpoint and building data below.
isovists <- do.call(c,isovists)
visareapoly <- st_union(isovists) 

ggplot() +
  geom_sf(data = buildings,colour = "transparent",aes(fill = 'Building')) +
  geom_sf(data = footway, aes(color = 'Footway')) +
  geom_sf(data = viewpoints, aes(color = 'Viewpoint')) +
  geom_sf(data = visareapoly,fill="transparent",aes(color = 'Visible area')) +
  scale_fill_manual(values = c("Building" = "grey50"), 
                    guide = guide_legend(override.aes = list(linetype = c("blank"), 
                                         shape = c(NA)))) +
  scale_color_manual(values = c("Footway" = "black", 
                                "Viewpoint" = "red",
                                "Visible area" = "red"),
                     labels = c("Footway", "Viewpoint","Visible area"))+
  guides( color = guide_legend(
    order = 1,
    override.aes = list(
      color = c("black","red","red"),
      fill  = c("transparent","transparent","white"),
      linetype = c("solid","blank", "solid"),
      shape = c(NA,16,NA))))+
  theme_minimal()+
  coord_sf(datum = NA)+
  theme(legend.title=element_blank())

Zoom talk on “Alternatives to Rstudio” from the Grenoble (FR) R user group

The next talk of Grenoble’s R user group will be on December 17th, 2020 at 5PM (FR) and is free and open to all:

Alternatives to Rstudio
RStudio is the most widely used IDE designed to optimize the workflow with the R language. However, there exist many alternatives designed for more specific use cases. Among the most popular are VS Code, Jupyter, Atom and many others. What advantages do they offer? Can they be more useful than RStudio?
Hope to see you there!

Members of the R community: be part of the response to COVID-19 (and future epidemic outbreaks)

Dear R users,

We are thrilled to present to you a tool we have been developing with the R Epidemics Consortium (RECON) thanks to a grant from the R Consortium: the COVID-19 challenge.

It is an online platform whose general goal is to connect members of the R community, R package developers, and field agents working on the response to COVID-19 who use R (such as epidemiologists, statisticians or mathematical modellers), to help them meet their R-related needs. It provides a single place for field agents to give real-time feedback on their analytical needs (such as requests for specific analysis templates, new functions, new method implementations, etc.). These requests are then compiled and ordered by priority (here) for package developers and (hopefully many!) members of the R community to browse and contribute to.

Many COVID-19 field agents use R to develop their analysis pipelines, but may lack specific knowledge or time to implement some of their needs. That’s why trying to involve the R community in providing them help could turn out to be very important.

For members of the R community it is not only a great opportunity to contribute to the worldwide response to COVID-19 and apply their skills with direct benefit to the community, but also a chance to encourage free, open and citizen science through the development of free and open source professional tools that aim to become the new standards in epidemic outbreak response.

These packages have already been successfully used in outbreaks such as the Ebola outbreaks in West Africa (2014-2016) and Eastern Democratic Republic of the Congo (2018-2020), and are currently used by various public health institutions and academic modelling groups in the COVID-19 response.

Although this platform has been developed specifically to contribute to the response to COVID-19 we hope to create a dynamic community that will outlast this epidemic, and become a long term methodological contributor.

If you have any questions or suggestions, feel free to write to me at [email protected]; we welcome all helpful feedback. Please also share this with your local R user group: the more you help us spread the word, the more successful the project will be 🙂



Granger-causality without assuming linear regression, enhancements to generalCorr package

Consider the regression Y(t) = a0 + a1 Y(t-1) + ... + ap Y(t-p) + b1 X(t-1) + ... + bp X(t-p) + e(t)

Let (X --g--> Y) denote that the time series X(t) Granger-causes the Y(t) series. The R package 'lmtest' has a function grangertest() for testing (X --g--> Y). It tests the Granger non-causality null hypothesis H0: b1 = b2 = ... = bp = 0, that is, that certain regression coefficients are all zero. This is a standard procedure in econometrics textbooks and assumes linear regression and the F-test. However, the F-test is correct only if the underlying distribution of the regression errors e(t) is Normal. Normality is a strong assumption and is easily relaxed by using the bootstrap.

generalCorr::bootGcRsq relaxes the Normality assumption and considers kernel regressions, which provide far better fits (higher R-squares). generalCorr::causeSummary(mtx) is a powerful tool for assessing concurrent causality, which is not covered by Granger causality.

Measures of dependence in statistics are symmetric. Why? Dependence relations in nature or data are almost never symmetric. (a) An infant depends on its mother for survival, but the mother’s survival does not equally depend on the infant. (b) New York’s rainfall depends on its latitude, but the latitude does not depend on New York’s rainfall at all.

As a measure of dependence, the 100+ year old Pearson correlation coefficient miserably underestimates dependence. For example, if x = 1:10 and y = sin(x), then y depends perfectly on x, so a good measure of dependence should be 1. Instead, the Pearson correlation coefficient is -0.17, which underestimates the dependence by 83% in magnitude. The gmcmtx0(mtx) function in the 'generalCorr' package provides a non-symmetric matrix of generalized correlation coefficients with the correct measure of dependence. depMeas(x,y) gives a correct measure of dependence.
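
Two of the claims above can be checked with base R alone. The sketch below first computes the Pearson correlation for x = 1:10 and y = sin(x), then runs the classical F-test of H0: b1 = ... = bp = 0 by comparing the restricted and unrestricted regressions. The simulated series, the lag order p = 2 and all object names are assumptions made for this illustration; it does not use lmtest or generalCorr, and it is the Normal-theory test, not the bootstrap version.

```r
# Pearson correlation for a perfectly dependent but nonlinear pair
x <- 1:10
y <- sin(x)
round(cor(x, y), 2)   # about -0.17, as quoted in the text

# Simulated example in which X(t-1) helps predict Y(t) (illustrative data)
set.seed(123)
n  <- 200
p  <- 2
xs <- as.numeric(arima.sim(model = list(ar = 0.5), n = n))
ys <- numeric(n)
for (i in 2:n) ys[i] <- 0.3 * ys[i - 1] + 0.8 * xs[i - 1] + rnorm(1)

# embed() puts the current value in column 1 and lags 1..p in the next columns
Ym   <- embed(ys, p + 1)
Xm   <- embed(xs, p + 1)
Ycur <- Ym[, 1]
Ylag <- Ym[, -1]
Xlag <- Xm[, -1]

# restricted model (H0: b1 = ... = bp = 0) versus unrestricted model
restricted   <- lm(Ycur ~ Ylag)
unrestricted <- lm(Ycur ~ Ylag + Xlag)

# F-test of the Granger non-causality null; exact only under Normal errors
ftest   <- anova(restricted, unrestricted)
p_value <- ftest[["Pr(>F)"]][2]
p_value
```

A small p-value rejects the null that the lagged X terms add nothing, which is the conclusion expected here because the simulated Y was built from lagged X.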

Tracking Indonesia’s economic recovery from COVID-19

While help is still on the way, it is still important to track a country’s progress to recover from the COVID-19 pandemic.

But the lack of a single platform providing the necessary data in Indonesia makes it challenging to do just that. The relevant data, such as consumer confidence and confirmed cases and deaths, are publicly accessible but are scattered across several websites.

In early November, I used R to develop a website that I hope can overcome that challenge. By aggregating the data on one platform, the website seeks to serve as a recovery tracker for not only the economy but also public health.

You can visit the tracker at https://dzulfiqarfr.github.io/indonesia-recovery-tracker/

A summary table of the economic indicators on the website.

I use R Markdown to build the website. To create the interactive graphs, I use Plotly through the plotly package. I also use the gt package to create the tables on the website.

The indicators are as follows:
  • Gross domestic product (GDP);
  • Inflation;
  • Unemployment rate;
  • Poverty rate;
  • Consumer confidence;
  • Retail sales;
  • Movement trends;
  • Confirmed cases and deaths; and
  • Coronavirus tests

A graph of the retail sales index on the website.

The website does not cover all available economic and public health indicators, but I try to make sure it has timely and relevant data to see how the country is doing as the pandemic unfolds.

I may add another indicator to the website in the future, if necessary.

The website is still far from perfect. So if you have any suggestions or find a bug, please let me know!

R Shiny in the Classroom

Three years ago, while I was teaching elementary statistics at a community college in California, finally fed up with the cost and deficiencies of the technology foisted on students by proprietary textbook sites and math homework websites, I decided to create a suite of R Shiny applications which would provide the students with a free computing resource to accomplish three things. It would:

(1) give students the means to perform all basic statistical computations associated with an elementary statistics course. In particular, it would allow students to perform tests and construct confidence intervals starting either from raw data or, as is often the case in textbook problems, from summary statistics,

(2) provide data simulations and interactive demonstrations to illustrate fundamental statistical concepts, and

(3) assist students in deciding what sort of statistical tool/computation is appropriate to use in a given situation through the use of an ‘expert system’ front end.
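
To make item (1) concrete, here is a minimal base-R sketch of a confidence interval computed from summary statistics rather than raw data. The function name ci_from_summary() and the example numbers are hypothetical; this is not code taken from the apps themselves.

```r
# t-based confidence interval for a mean, from summary statistics only
# (hypothetical helper, for illustration)
ci_from_summary <- function(mean, sd, n, level = 0.95) {
  se     <- sd / sqrt(n)                         # standard error of the mean
  tcrit  <- qt(1 - (1 - level) / 2, df = n - 1)  # two-sided critical value
  margin <- tcrit * se
  c(lower = mean - margin, upper = mean + margin)
}

# textbook-style problem: sample mean 50, sample sd 10, n = 25
ci <- ci_from_summary(mean = 50, sd = 10, n = 25)
round(ci, 2)
```

In a Shiny app, a function like this would typically sit behind a few numeric inputs and a text output.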

As matters eventuated, I ended up quitting my teaching job and going on a very long bike ride across Western Europe before I could implement my idea (a different and perhaps more interesting tale). But I finally did so when I returned to the US, deploying my apps using an R Shiny server set up with an NGINX webserver on a cloud instance of Ubuntu 16.04, which I launched specifically for the purpose. The result is here:

http://104.236.178.75:3838/sort/sorta/

Many of the apps could certainly be refined and stabilized. I am completely self-educated in all things computing, in particular as regards R programming and R Shiny app development. These apps were written as learning projects for me as much as they were intended to address the three goals enumerated above. But as an outgrowth of writing the statistics apps, it became clear to me that Shiny apps are widely adaptable to a host of pedagogical purposes, really across disciplines, although, because of the nature of R, applications for use in the teaching of math, statistics, and the sciences come first to mind.

I went on to create a few dozen Shiny apps for non-statistical but mostly pedagogical applications. Those of vaguely mathematical theme are here


https://www.apclam.com/shinymath.html


whereas an over-arching point of access to all such apps is here

https://www.apclam.com/shinydir.html .

My point here is not so much to draw attention to the apps I have developed and deployed, which, as I freely admit, are rudimentary in many respects, but rather to call the attention of deans, department chairwomen and chairmen, and faculty members at high schools, colleges and universities to the opportunity to use R Shiny to liberate curricula, and budgets, from the strictures imposed by proprietary websites and software served up by textbook publishers and various other academic hangers-on. Here are some key points militating in favor of doing so:

(1) The ‘pro’ version of the R Shiny server is available to academic institutions at no to very low cost.

(2) The learning curve involved in writing and deploying R Shiny apps is not steep. For people who already have experience writing R scripts it can be fairly characterized as easy. Even for people who have limited programming background, it is a manageable task, particularly if given institutional guidance in the form of workshops or training materials.

(3) Web activities can be created which are tuned to the very specific needs of one’s own curriculum, institution and even classroom. In particular, R’s extensive graphics capabilities in its base graphics package can be leveraged to create learning tools which integrate numerical, symbolic and graphical approaches to the demonstration of and interaction with key concepts.

(4) Developing and deploying R Shiny apps gives faculty and staff a constructive outlet for creativity which provides opportunities for collaboration within and across disciplines and results in a tangible (OK, internet ‘tangible’) product which can be used by students and faculty throughout the world.

I would be happy to share my experiences in this realm to help anyone or any institution pursue this path. I think the possibilities in this direction are almost without limit. Feel free to email me at [email protected]

COVID-19 Posts: A Public Dataset Containing 400+ COVID-19 Blog Posts

Over the last few months, we’ve been collecting hundreds of COVID-19 blog posts from the R community. Today, we are excited to share this dataset publicly, so that bloggers who want to analyze COVID-19 data with R can research these posts and draw on the resources of the community.

So far, we have found and recorded 423 COVID posts in English. In an effort to encourage others to explore such posts, we’ve published a Shiny web app which allows users to find the names of the 231 bloggers who wrote those posts, their roles, and their country of focus. The app also lets users interactively search the collection of posts by primary topic, post title, date, and whether the post uses a particular mathematical technique or data source. To learn more about the evolution of this dataset, one of the authors (Rees) has published nine articles on Medium, which you can find here.

We encourage users to submit their own posts, or others’ posts, for inclusion, which can be done on this Google Form. Our dataset, as well as the code for the Shiny app, is available on GitHub. If anyone has corrections to the dataset, please write Rees (at) ReesMorrison (dot) com.

The remainder of this post highlights some of the findings from the dataset of COVID-19 posts. As will be made evident by the plots that follow, this is by no means a comprehensive review of every COVID-19 R blog post, but rather an overview of the data that we have found.

Posts Over Time

As the pandemic has progressed, fewer bloggers have engaged with COVID-related data, as we notice that blog posts peaked in March of 2020.

Some bloggers have been prolific; many more have been one and done. The plot below shows the names and posts of the 23 bloggers who have so far published at least four posts. For an example of how to read the plot, Tim Churches, at the bottom of the y-axis, has published a total of nine posts, but none after early April.

The color of the points corresponds to the work role of the blogger as explained in the legend at the bottom. It is immediately apparent that professors and academic researchers predominate in this group of bloggers. If you include the postgraduate students, universities writ large account for nearly all of the prolific bloggers.

Roles of Authors

The bloggers in our dataset describe their work-day roles in a variety of ways. One of the authors (Rees) standardized these job roles by categorizing the multitude of terms and descriptions, but it is quite possible that this effort misrepresented what some of these bloggers do for a living. We welcome corrections.

We’ve further categorized roles into a broad typology where professions fall into one of five categories: university, corporate, professional, government, and nonprofit. Those broader categories are represented as columns in the following chart.

Data Sources

A greater number of data sources related to COVID-19 will yield richer insights. Combining different datasets can shed new light on an issue, yield improvements, and allow authors to construct better indices and measures. For that reason, one of the authors (Rees) extracted dataset information from our collection of blog posts.

For the most part, bloggers identified the data source they drew on for their analysis. On occasion, we had to apply some effort to standardize the 140 data sources.

By far the most prevalent data source is Johns Hopkins University, which early, comprehensively, and consistently set the standard for COVID-19 data collection and dissemination to the public.

Blog Post Topics

It may also be the case that readers want a summary of blogs, or to only look at posts that pertain to a certain topic. Assigning each blog post a primary topic introduces a fair amount of subjectivity, to be sure, but the hope is that these broad topics will help researchers find content and colleagues who share similar interests.

Here, a balloon plot shows various categories that the 423 posts address as their primary topic. Topics fall on the y axis and the blogger’s category of employment is on the x axis. The size (and opacity) of each bubble represents the count of posts that match that combination. Epidemiology leads the way, as might be expected, but quite a few posts seem to use COVID data to showcase something else, or apply R in novel ways.
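As a rough sketch of how a balloon plot like this can be built with ggplot2: the data frame below is toy data invented for illustration (the topics, categories, and counts are not taken from our dataset), and the column names are assumptions rather than the names used in the Shiny app's code.

```r
library(ggplot2)

# Toy data for illustration only; these rows are NOT from the real dataset.
posts <- data.frame(
  topic    = c("Epidemiology", "Epidemiology", "Epidemiology",
               "Epidemiology", "Visualization", "Forecasting"),
  category = c("university", "university", "university",
               "corporate", "university", "government"),
  stringsAsFactors = FALSE
)

# Count posts for each topic/category combination and drop empty cells.
counts <- as.data.frame(table(topic = posts$topic, category = posts$category),
                        stringsAsFactors = FALSE)
counts <- counts[counts$Freq > 0, ]

# Map the count to both point size and opacity, as described above.
balloon <- ggplot(counts, aes(x = category, y = topic)) +
  geom_point(aes(size = Freq, alpha = Freq)) +
  labs(x = "Category of employment", y = "Primary topic",
       size = "Posts", alpha = "Posts")
```

Printing `balloon` draws the plot; mapping the same count to size and alpha makes sparse combinations fade into the background.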

Concluding Thoughts

We encourage you to use our Shiny application to explore the data for yourself. If you’d like to submit your post to be included, fill out this Google Form.

As we note in the footer of the application, the R community is intelligent and produces interesting content, but not all of us are experts when it comes to COVID-19. Engaging with these posts will allow you to better understand the application of R to our current moment, and perhaps provide feedback to post authors. We do not endorse the findings of any particular author and encourage you to find accurate, relevant, and recent information from reputable sources such as the CDC and the WHO.


Originally published on my blog. Follow the authors Rees and Connor on Twitter.

Most popular on Netflix, Disney+, Hulu and HBOmax. Weekly Tops for last 60 days

A couple of months ago I published Most popular on Netflix. Daily Tops for last 60 days, a small study based on daily scraped data that answers the following questions: How many movies (titles) made the Netflix Daily Tops? What movie was the longest #1 on Netflix? For how many days do movies / TV shows stay in Tops and as #1? etc.
This time I am sharing an analysis of the most popular movies / TV shows across Netflix, Disney+, Hulu and HBOmax on a weekly basis instead of daily, in the hope of catching trends better.

So, let's count how many movies made the Top5. I assume it is fewer than 5 * 60…
library(tidyverse)
library (gt)

platforms <- c('Disney+','HBOmax', 'Hulu', 'Netflix') # additionally, load CSV data using readr 
Wrangle raw data – reverse (fresh date first), take top 5, take last 60 days
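# fjune_raw, fdjune_raw, hdjune_raw and hulu_raw are placeholder names (assumed) for the data frames loaded from CSV above
fjune_dt <- fjune_raw %>% rev () %>% slice (1:5) %>% select (1:60) # Netflix
fdjune_dt <- fdjune_raw %>% rev () %>% slice (1:5) %>% select (1:60) # Disney+
hdjune_dt <- hdjune_raw %>% rev () %>% slice (1:5) %>% select (1:60) # HBO Max
hulu_dt <- hulu_raw %>% rev () %>% slice (1:5) %>% select (1:60) # Hulu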
Gather it together and count the number of unique titles in Top5 for 60 days
fjune_dt_gathered <- gather (fjune_dt)
fdjune_dt_gathered <- gather (fdjune_dt)
hdjune_dt_gathered <- gather (hdjune_dt)
hulu_dt_gathered <- gather (hulu_dt)
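unique_fjune_gathered <- unique (fjune_dt_gathered$value) %>% length ()
unique_fdjune_gathered <- unique (fdjune_dt_gathered$value) %>% length ()
unique_hdjune_gathered <- unique (hdjune_dt_gathered$value) %>% length ()
unique_hulu_gathered <- unique (hulu_dt_gathered$value) %>% length ()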
unique_gathered <- c(unique_fdjune_gathered, unique_hdjune_gathered, unique_hulu_gathered, unique_fjune_gathered)
unique_gathered <- as.data.frame (t(unique_gathered), stringsAsFactors = F)
colnames (unique_gathered) <- platforms
Let's make a nice table for the results
unique_gathered_gt <- unique_gathered %>% gt () %>%
tab_header(
  title = "Number of unique movies (titles) in Top5")%>%
  tab_style(
    style = list(
      cell_text(color = "purple")),
    locations = cells_column_labels(
      columns = vars(HBOmax)))%>%
  tab_style(
    style = list(
      cell_text(color = "green")),
    locations = cells_column_labels(
      columns = vars(Hulu))) %>%
  tab_style(
    style = list(
      cell_text(color = "red")),
    locations = cells_column_labels(
      columns = vars(Netflix)))
unique_gathered_gt


Using similar code, we can count the number of unique titles that were #1 for one or more days.
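A minimal sketch of that #1 count, assuming the same layout as above (one column per day, one row per rank, so row 1 holds the #1 title). The data frame `toy_dt` and its titles are made up for illustration and are not the scraped data.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: one column per day, one row per rank (row 1 = #1).
toy_dt <- data.frame(
  d1 = c("Title A", "Title B", "Title C"),
  d2 = c("Title A", "Title C", "Title B"),
  d3 = c("Title B", "Title A", "Title C"),
  stringsAsFactors = FALSE
)

# Keep only the first row (the #1 title of each day), then reshape to long form.
toy_top1_gathered <- toy_dt %>% slice(1) %>% gather()

# Number of distinct titles that were #1 on at least one day.
unique_toy_top1 <- unique(toy_top1_gathered$value) %>% length()
```

The only change from the Top5 version is `slice (1)` in place of `slice (1:5)`; the gathering and counting steps stay the same.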


What movie was the longest in Tops / #1?
table_fjune_top5 <- sort (table (fjune_dt_gathered$value), decreasing = T) # Top5
table_fdjune_top5 <- sort (table (fdjune_dt_gathered$value), decreasing = T)
table_hdjune_top5 <- sort (table (hdjune_dt_gathered$value), decreasing = T)
table_hulu_top5 <- sort (table (hulu_dt_gathered$value), decreasing = T)
Plotting the results
bb5fdjune <- barplot (table_fdjune_top5 [1:5], ylim=c(0,62), main = "Days in Top5, Disney+", las = 1, col = 'blue')
text(bb5fdjune,table_fdjune_top5 [1:5] +2,labels=as.character(table_fdjune_top5 [1:5]))
bb5hdjune <- barplot (table_hdjune_top5 [1:5], ylim=c(0,60), main = "Days in Top5, HBO Max", las = 1, col = 'grey', cex.names=0.7)
text(bb5hdjune,table_hdjune_top5 [1:5] +2,labels=as.character(table_hdjune_top5 [1:5]))
bb5hulu <- barplot (table_hulu_top5 [1:5], ylim=c(0,60), main = "Days in Top5, Hulu", las = 1, col = 'green')
text(bb5hulu,table_hulu_top5 [1:5] +2,labels=as.character(table_hulu_top5 [1:5]))
bb5fjune <- barplot (table_fjune_top5 [1:5], ylim=c(0,60), main = "Days in Top5, Netflix", las = 1, col = 'red')
text(bb5fjune,table_fjune_top5 [1:5] +2,labels=as.character(table_fjune_top5 [1:5]))

The same for the movies / TV shows that reached first place in the weekly count.
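For the #1 case, the tally and bar plot could look roughly like this. The vector `top1_titles` is a hypothetical stand-in for the gathered first-row values of one platform; the titles and counts are placeholders, not real results.

```r
# Hypothetical #1 titles, one per day (placeholder values, not real results).
top1_titles <- c("Title A", "Title A", "Title B", "Title A", "Title C", "Title B")

# Days at #1 per title, most frequent first.
table_top1 <- sort(table(top1_titles), decreasing = TRUE)

# Bar plot with the day counts printed above each bar.
bb1 <- barplot(table_top1, ylim = c(0, max(table_top1) + 2),
               main = "Days as #1 (toy data)", las = 1, col = "red")
text(bb1, as.vector(table_top1) + 0.5, labels = as.character(table_top1))
```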


Average days in top distribution
#top 5
ad5_fjune <- as.data.frame (table_fjune_top5, stringsAsFactors = FALSE)
ad5_fdjune <- as.data.frame (table_fdjune_top5, stringsAsFactors = FALSE)
ad5_hdjune <- as.data.frame (table_hdjune_top5, stringsAsFactors = FALSE)
ad5_hulu <- as.data.frame (table_hulu_top5, stringsAsFactors = FALSE)
par (mfcol = c(1,4))
boxplot (ad5_fdjune$Freq, ylim=c(0,20), main = "Days in Top5, Disney+")
boxplot (ad5_hdjune$Freq, ylim=c(0,20), main = "Days in Top5, HBO Max")
boxplot (ad5_hulu$Freq, ylim=c(0,20), main = "Days in Top5, Hulu")
boxplot (ad5_fjune$Freq, ylim=c(0,20), main = "Days in Top5, Netflix")

The same for the movies / TV shows that reached first place in the weekly count (#1).
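A short sketch of the #1 distribution for a single platform, again on made-up data (`top1_titles` is a hypothetical placeholder, not the scraped results). It also computes the mean and median, which put numbers next to what the boxplot shows.

```r
# Hypothetical #1 titles, one per day (placeholder values, not real results).
top1_titles <- c("Title A", "Title A", "Title B", "Title A", "Title C", "Title B")

# Tally days at #1 and convert to a data frame with a Freq column.
ad1_toy <- as.data.frame(sort(table(top1_titles), decreasing = TRUE),
                         stringsAsFactors = FALSE)

# Mean and median days at #1 across titles.
mean_days   <- mean(ad1_toy$Freq)
median_days <- median(ad1_toy$Freq)

boxplot(ad1_toy$Freq, ylim = c(0, 5), main = "Days as #1 (toy data)")
```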