Introduction
In the social sciences, variables of interest are often conceptualized as latent variables: hidden continuous variables measured through Likert scale questions, whose response categories are typically Strongly disagree, Disagree, Neutral, Agree, and Strongly agree. Researchers frequently aim to recover these latent variables using various statistical techniques.
Accurate modeling of survey data is crucial for comparative analysis through simulation, especially when applying statistical techniques that require metric data. The latent2likert package addresses this need by providing an effective algorithm to simulate Likert response variables from hypothetical latent variables. This post introduces the features of the latent2likert package.
Simulating Likert Scale Responses
Using the rlikert function, you can generate random responses to Likert scale questions based on specified means and standard deviations of latent variables, with optional settings for skewness and correlations.
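To make the idea concrete, here is a minimal base R sketch of the general approach: draw values of a normally distributed latent variable and cut them into ordered categories. It does not call the package. The helper name simulate_likert and the cut points are assumptions chosen for illustration; they are not the optimal intervals that latent2likert computes.

```r
# Hypothetical helper (not part of latent2likert): discretize a normal
# latent variable into ordered Likert levels using fixed cut points.
simulate_likert <- function(size, mean = 0, sd = 1,
                            cutpoints = c(-1.5, -0.5, 0.5, 1.5)) {
  latent <- rnorm(size, mean = mean, sd = sd)
  # findInterval() returns 0..length(cutpoints); add 1 to get levels 1..K
  findInterval(latent, cutpoints) + 1L
}

set.seed(1)
responses <- simulate_likert(1000, mean = 0.5, sd = 1)

# Frequency of each of the five response levels
print(table(factor(responses, levels = 1:5)))
```

Shifting the latent mean upward moves probability mass toward the higher categories, which is how a latent mean and standard deviation translate into a response distribution.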
Reproducing Rating-Scale Data
From existing survey data, you can estimate the values of latent parameters using the estimate_params function. You can then generate new responses using the estimated parameters to create a new dataset with very similar properties.
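The following base R sketch illustrates what estimating latent parameters from observed responses can look like. It is not the algorithm used by estimate_params. It assumes a normal latent variable and known cut points (both illustrative assumptions), and uses the relation P(X <= c_k) = Phi((c_k - mu) / sigma), so that c_k = mu + sigma * z_k, where z_k is the normal quantile of the observed cumulative proportion.

```r
set.seed(42)

# Illustrative setup: cut points assumed known, true parameters hidden below
cutpoints <- c(-1.5, -0.5, 0.5, 1.5)
latent    <- rnorm(5000, mean = 0.4, sd = 1.2)
responses <- findInterval(latent, cutpoints) + 1L

# Cumulative proportion of responses at or below levels 1..4
cum_prop <- cumsum(tabulate(responses, nbins = 5))[1:4] / length(responses)
z <- qnorm(cum_prop)

# c_k = mu + sigma * z_k: intercept estimates mu, slope estimates sigma
est      <- unname(coef(lm(cutpoints ~ z)))
est_mean <- est[1]
est_sd   <- est[2]

# Generate a new dataset from the estimated parameters
new_responses <- findInterval(rnorm(5000, est_mean, est_sd), cutpoints) + 1L
print(round(c(mean = est_mean, sd = est_sd), 2))
```

The regenerated responses should have category proportions close to those of the original data, which is the sense in which the new dataset has similar properties.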
Further Reading
For more detailed information and practical examples, please refer to the package website and vignette. The implemented algorithms are described in the function reference.
Related R Packages
To simulate Likert scale responses, the draw_likert function from the fabricatr package can recode a latent variable into a Likert response variable by specifying intervals that subdivide the continuous range. However, the latent2likert package offers an advantage by automatically calculating optimal intervals that minimize distortion between the latent variable and the Likert response variable for both normal and skew normal latent distributions, eliminating the need to manually specify the intervals.
There are also alternative approaches that do not rely on latent distributions. One method involves directly defining a discrete probability distribution and sampling from it using the sample function in R or the likert function from the wakefield package. Another approach is to specify the means, standard deviations, and correlations among Likert response variables. For this, you can use LikertMakeR or SimCorMultRes to generate correlated multinomial responses.
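As a minimal sketch of the first of these approaches, the base R sample function can draw responses directly from a discrete distribution. The probabilities below are made up for illustration.

```r
set.seed(123)

labels <- c("Strongly disagree", "Disagree", "Neutral",
            "Agree", "Strongly agree")
probs  <- c(0.10, 0.20, 0.30, 0.25, 0.15)  # illustrative; sums to 1

# Sample 500 responses with replacement and keep the category order
responses <- factor(
  sample(labels, size = 500, replace = TRUE, prob = probs),
  levels = labels, ordered = TRUE
)

# Observed proportions should be close to probs
print(round(prop.table(table(responses)), 2))
```

This gives full control over the marginal distribution of a single item, but on its own it says nothing about correlations between items.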
Additionally, you can define a data-generating process. For those familiar with item response theory, the mirt package allows users to specify discrimination and difficulty parameters for each response category.
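To show what such a data-generating process can look like, here is a base R sketch of one common item response model, the graded response model, for a single item. It does not use mirt. The helper name simulate_grm and all parameter values are assumptions for illustration.

```r
# Hypothetical helper (not part of mirt): simulate one item under a
# graded response model with discrimination a and ordered thresholds b.
simulate_grm <- function(theta, a, b) {
  # P(Y >= k | theta) for k = 2..K, one column per threshold
  p_star <- plogis(a * outer(theta, b, "-"))
  # One uniform draw per respondent; count how many thresholds are passed
  u <- runif(length(theta))
  1L + as.integer(rowSums(u < p_star))
}

set.seed(2024)
theta <- rnorm(1000)                  # latent trait of each respondent
a     <- 1.5                          # illustrative discrimination
b     <- c(-1.5, -0.5, 0.5, 1.5)      # illustrative difficulty thresholds

y <- simulate_grm(theta, a, b)
print(table(factor(y, levels = 1:5)))
```

Because the thresholds are increasing, the cumulative probabilities decrease across columns, so comparing them all against a single uniform draw yields the correct category probabilities.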