Session: Analyzing Categorical and Continuous Limited Dependent Variables with R (Society for Social Work and Research 20th Annual Conference - Grand Challenges for Social Work: Setting a Research Agenda for the Future)

198 Analyzing Categorical and Continuous Limited Dependent Variables with R

Schedule:
Saturday, January 16, 2016: 9:45 AM-11:15 AM
Lobby Level-Penn Quarter (Renaissance Washington, DC Downtown Hotel)
Cluster: Research Design and Measurement
Speakers/Presenters:
Wendy Zeitlin, PhD, Yeshiva University; Charles Auerbach, PhD, Yeshiva University; Matthew J. Cuellar, MSW, University of Tennessee, Knoxville; and John G. Orme, PhD, University of Tennessee, Knoxville
Linear regression is a member of a family of statistical models known as the general linear model (GLM). The GLM incorporates a number of different statistical models for use with one or more continuous DVs. Despite its versatility, the GLM does not handle discrete or otherwise limited DVs (Smithson & Merkle, 2014; Orme & Combs-Orme, 2009).

Discrete variables have a finite number of indivisible values; they cannot take on all possible values within the limits of the variable. They include variables that are dichotomous, polytomous (three or more unordered categories), ordinal, or counts. In addition, many continuous DVs are “limited” in that they are bounded in some way. For example, continuous DVs might be censored. Censored variables are variables whose values are known over some range but are unknown beyond a certain value because they were recorded or collected only up (or down) to that value (e.g., income recorded only up to 100,000 per year). Continuous DVs also might be truncated, meaning that cases with values beyond a certain threshold are excluded from the sample altogether. Continuous variables also might be bounded in other ways. For example, percentages and proportions have boundaries of 0 to 100 and 0 to 1, respectively.
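The difference between censoring and truncation can be illustrated with a small simulation. The following is a minimal sketch in base R, not material from the workshop; the simulated incomes, the variable names, and the cap of 100,000 are all assumptions chosen to mirror the income example above.

```r
# Illustrative simulation only: the data, names, and cap are hypothetical.
set.seed(1)
income <- rlnorm(1000, meanlog = 11, sdlog = 0.6)  # simulated "true" incomes
cap <- 100000

# Censoring: every case stays in the sample, but any value above the cap
# is recorded as the cap itself.
income_censored <- pmin(income, cap)

# Truncation: cases above the cap never enter the sample at all.
income_truncated <- income[income <= cap]

length(income_censored)        # same number of cases as the original
length(income_truncated)       # fewer cases than the original
mean(income_censored == cap)   # share of cases piled up at the cap
```

In the censored version the sample size is unchanged and values pile up at the cap; in the truncated version the high-income cases are simply missing.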

The generalized linear model (GZLM) extends linear regression to DVs that are not continuous and to DVs whose errors are not normally distributed with constant variance, both of which are assumptions underlying the GLM. The GZLM subsumes the GLM and thus provides a framework for analyzing a wide-ranging class of models using unified techniques (Smithson & Merkle, 2014). At the same time, it uses many familiar ideas from linear regression.
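One common way to write this framework is sketched below. The notation is an assumption introduced here for illustration and does not come from the workshop materials: \mu_i is the expected value of the DV for case i, \eta_i is the linear predictor familiar from linear regression, and g is a link function chosen to suit the DV. Multinomial logistic regression uses a multi-equation extension of the logit link and is omitted for brevity.

```latex
\begin{aligned}
\eta_i &= \beta_0 + \beta_1 x_{i1} + \cdots + \beta_p x_{ip} \\
g(\mu_i) &= \eta_i, \qquad \mu_i = E(Y_i \mid x_i) \\[4pt]
g(\mu_i) &= \mu_i && \text{(identity link: linear regression)} \\
g(\mu_i) &= \ln\frac{\mu_i}{1-\mu_i} && \text{(logit link: binary logistic regression)} \\
g(\mu_i) &= \ln \mu_i && \text{(log link: negative binomial regression)}
\end{aligned}
```

With the identity link and normally distributed errors, the model reduces to ordinary linear regression, which is the sense in which the GZLM subsumes the GLM.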

The purpose of this workshop is twofold. First, we will provide an overview of the GZLM in relation to DVs of interest to social work researchers and give participants an extensive set of resources they can use to learn more about the GZLM. Topics covered will include binary and multinomial logistic regression and negative binomial regression.

Second, we will demonstrate the ease with which GZLM regression models can be estimated using R, an open-source language for statistical computing and graphics (Bilder & Loughin, 2015; Smithson & Merkle, 2014; The R Project for Statistical Computing, n.d.). This will include a brief introduction to downloading R and its freely available components, and will then move on to demonstrate GZLM functions available in R and user-contributed packages. Scripts and datasets used in the workshop will be provided to participants.
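To give a sense of what estimating such models in R can look like, here is a minimal sketch on simulated data. It is not one of the workshop scripts: the data frame, variable names, and coefficient values are hypothetical, and the choice of `glm()` (base R), `glm.nb()` (MASS package), and `multinom()` (nnet package) is an assumption about which functions one might use for the three model types named above.

```r
# Simulated data; every variable name and parameter value is hypothetical.
library(MASS)  # provides glm.nb() for negative binomial regression
library(nnet)  # provides multinom() for multinomial logistic regression

set.seed(42)
n <- 500
d <- data.frame(x = rnorm(n))
d$binary <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.0 * d$x))  # 0/1 outcome
d$count  <- rnbinom(n, size = 2, mu = exp(0.3 + 0.5 * d$x))       # overdispersed count
d$group  <- factor(sample(c("a", "b", "c"), n, replace = TRUE))   # unordered categories

# Binary logistic regression (logit link)
fit_logit <- glm(binary ~ x, family = binomial(link = "logit"), data = d)

# Negative binomial regression (log link by default)
fit_nb <- glm.nb(count ~ x, data = d)

# Multinomial logistic regression; trace = FALSE silences iteration output
fit_multi <- multinom(group ~ x, data = d, trace = FALSE)

summary(fit_logit)     # coefficients on the log-odds scale
exp(coef(fit_logit))   # the same coefficients expressed as odds ratios
```

All three calls share the formula interface used by `lm()`, which is part of what makes the move from linear regression to these models relatively smooth.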

R offers a number of advantages for social work practitioners and researchers, such as its packages and their adaptable resources, the flexibility it gives users when executing statistical functions, its ability to integrate statistical results easily into publishable documents (e.g., LaTeX and Word documents), and its easy-to-use graphical interface options (Dalzell, 2013; Muncheon, 2009). Social work practitioners can access R at no cost and use it to improve their practice and research efforts. Thus, this workshop will provide the foundation attendees need to get started with R by introducing GZLM functions relevant to DVs of interest in social work research.
