---
title: "hR: Toolkit for Data Analytics in Human Resources"
author: "Dale Kube"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{hR: Toolkit for Data Analytics in Human Resources}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
library(hR)
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

Transform and analyze workforce data in meaningful ways for human resources (HR) analytics. The use of two functions, `hierarchy` and `hierarchyStats`, is demonstrated below. Convert standard employee and supervisor relationship data into useful formats for robust analytics, summary statistics, and span of control metrics.

Install the package from CRAN by running the `install.packages("hR")` command.

## workforceHistory data

The examples in this vignette use the sample `workforceHistory` data set. This data set reflects an artificial organization's historical workforce/employment data. The sample is reduced to a data.table containing one row per active employee and contractor in order to properly iterate over the current hierarchy structure in the following sections.
```{r workforceHistory}
data("workforceHistory")

# Reduce to DATE <= today to exclude future-dated records
dt = workforceHistory[DATE<=Sys.Date()]

# Reduce to max DATE and SEQ per person
# Keep every row on the latest DATE first, so that the SEQ step can break same-day ties
dt = dt[dt[,.I[DATE==max(DATE)],by=.(EMPLID)]$V1]
dt = dt[dt[,.I[which.max(SEQ)],by=.(EMPLID,DATE)]$V1]

# Only consider workers who are currently active
# This provides a reliable 'headcount' data set that reflects today's active workforce
dt = dt[STATUS=="Active"]

# Exclude the CEO because she does not have a supervisor
CEO = dt[TITLE=="CEO",EMPLID]
dt = dt[EMPLID!=CEO]

# Show the prepared table
# This represents an example, active workforce
print(dt[,.(EMPLID,NAME,TITLE,SUPVID)])
```

## hierarchy

The `hierarchy` convenience function transforms a standard set of unique employee and supervisor identifiers (employee IDs, email addresses, etc.) into a long or wide format that can be used to aggregate employee data by a particular line of leadership (e.g. include everyone who rolls up to Susan).

When `format = "long"`, the function returns a long data.table consisting of one row per employee for every supervisor above them, up to the top of the tree.

```{r hierarchyLong}
hLong = hierarchy(dt$EMPLID,dt$SUPVID,format="long")
print(hLong)

# Who reports up through Susan? (direct and indirect reports)
print(hLong[Supervisor==CEO])
```

When `format = "wide"`, the function returns a wide data.table with a column for every level in the hierarchy, starting from the top of the tree (i.e. "Supv1" is the top person in the hierarchy).

```{r hierarchyWide}
hWide = hierarchy(dt$EMPLID,dt$SUPVID,format="wide")
print(hWide)

# Who reports up through Pablo? (direct and indirect reports)
print(hWide[Supv2==199827])
```

## hierarchyStats

The `hierarchyStats` function computes summary statistics and span of control metrics from a standard set of unique employee and supervisor identifiers (employee IDs, email addresses, etc.). The resulting metrics and table are accessible from a list object.
```{r hierarchyStats}
hStats = hierarchyStats(dt$EMPLID,dt$SUPVID)

# Total Levels:
print(hStats$levelsCount$value)

# Total Individual Contributors:
print(hStats$individualContributorsCount$value)

# Total People Managers:
print(hStats$peopleManagersCount$value)

# Median Direct Reports:
print(hStats$medianDirectReports$value)

# Median Span of Control (Direct and Indirect Reports):
print(hStats$medianSpanOfControl$value)

# Span of Control Table
print(hStats$spanOfControlTable)
```
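To make the long format and the span of control metrics more concrete, the following is a minimal base R sketch of the underlying idea on a hypothetical five-person tree. It is not the package's implementation: the `ee` and `supv` vectors, the `chain_up` helper, and the `Employee` and `Level` column names are assumptions made for illustration, and the counts are computed with simple definitions that may differ in detail from those used by `hierarchyStats`.

```r
# Hypothetical toy data: each employee and their direct supervisor.
# "A" is the top of the tree and has no row of its own,
# mirroring how the CEO is excluded from the data above.
ee   <- c("B", "C", "D", "E", "F")
supv <- c("A", "A", "B", "B", "D")

# Hypothetical helper: walk up the tree from one employee and collect
# every supervisor above them. Assumes the data contain no cycles.
chain_up <- function(id, ee, supv) {
  chain <- character(0)
  current <- id
  while (current %in% ee) {
    current <- supv[match(current, ee)]
    chain <- c(chain, current)
  }
  chain
}

# Long format: one row per employee for every supervisor above them.
long <- do.call(rbind, lapply(ee, function(id) {
  chain <- chain_up(id, ee, supv)
  data.frame(
    Employee = id,
    Level = rev(seq_along(chain)),  # 1 = top of the tree
    Supervisor = chain,
    stringsAsFactors = FALSE
  )
}))
print(long)

# Span of control per manager: direct reports, and direct plus indirect reports.
managers <- unique(supv)
direct <- vapply(managers, function(m) sum(supv == m), integer(1))
total  <- vapply(managers, function(m) sum(long$Supervisor == m), integer(1))

print(direct)          # A = 2, B = 2, D = 1
print(total)           # A = 5, B = 3, D = 1
print(median(direct))  # 2
print(median(total))   # 3
```

Counting rows per supervisor in the long table gives each manager's direct and indirect reports in one step, which is why the long format is convenient for aggregating by line of leadership.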