| Title: | Data Sets for Craig Starbuck's Book, "The Fundamentals of People Analytics: With Applications in R" |
| Version: | 0.1.0 |
| Description: | Data sets associated with modeling examples in Craig Starbuck's book, "The Fundamentals of People Analytics: With Applications in R". |
| License: | MIT + file LICENSE |
| Encoding: | UTF-8 |
| RoxygenNote: | 7.2.1 |
| Depends: | R (≥ 3.5.0) |
| LazyData: | true |
| NeedsCompilation: | no |
| Packaged: | 2022-10-21 15:03:52 UTC; craig.starbuck |
| Author: | Craig Starbuck |
| Maintainer: | Craig Starbuck <cstarbuck@orgacuity.com> |
| Repository: | CRAN |
| Date/Publication: | 2022-10-25 17:22:35 UTC |
benefits
Description
Fictitious benefits data for employees in a mid-size company
Usage
data("benefits")
Format
A data frame with 1470 observations on the following 3 variables.
employee_id: Unique identifier for each employee
stock_opt_lvl: Stock option level
trainings: Number of trainings completed within the past year
Examples
data(benefits)
demographics
Description
Fictitious demographics data for employees in a mid-size company
Usage
data("demographics")
Format
A data frame with 1470 observations on the following 7 variables.
employee_id: Unique identifier for each employee
age: Employee age in years
commute_dist: Commute distance in miles
ed_lvl: Education level, where 1 = 'High School', 2 = 'Associate Degree', 3 = 'Bachelor's Degree', 4 = 'Master's Degree', and 5 = 'Doctoral Degree'
ed_field: Education field associated with most recent degree
gender: Gender self-identification
marital_sts: Marital status
Examples
data(demographics)
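The integer codes in ed_lvl can be turned into a labeled, ordered factor before tabulating or plotting. The following is a minimal sketch using a made-up vector of codes rather than the packaged data; the names ed_lvl_toy, ed_labels, and ed_factor are illustrative only.

```r
# Toy education-level codes (made up for illustration, not taken from the data set).
ed_lvl_toy <- c(3, 1, 5, 2, 4, 3)

# Labels in the order given in the Format section (1 through 5).
ed_labels <- c("High School", "Associate Degree", "Bachelor's Degree",
               "Master's Degree", "Doctoral Degree")

# Ordered factor so that comparisons such as ">" respect the coding.
ed_factor <- factor(ed_lvl_toy, levels = 1:5, labels = ed_labels, ordered = TRUE)

print(table(ed_factor))
```

The same pattern should apply to other coded columns such as perf_rating, with the labels swapped accordingly.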
employees
Description
Fictitious data on employees in a mid-size company
Usage
data("employees")
Format
A data frame with 1470 observations on the following 36 variables.
employee_id: Unique identifier for each employee
active: Flag set to 'Yes' for active employees and 'No' for inactive employees
stock_opt_lvl: Stock option level
trainings: Number of trainings completed within the past year
age: Employee age in years
commute_dist: Commute distance in miles
ed_lvl: Education level, where 1 = 'High School', 2 = 'Associate Degree', 3 = 'Bachelor's Degree', 4 = 'Master's Degree', and 5 = 'Doctoral Degree'
ed_field: Education field associated with most recent degree
gender: Gender self-identification
marital_sts: Marital status
dept: Department of which an employee is a member
engagement: Employee engagement score measured on a 4-point Likert scale, where 1 = 'Highly Disengaged' and 4 = 'Highly Engaged'
job_lvl: Job level, where 1 = 'Junior' and 5 = 'Senior'
job_title: Job title
overtime: Flag set to 'Yes' if the employee is nonexempt and works overtime and 'No' if the employee does not work overtime
business_travel: Business travel frequency
hourly_rate: Hourly rate, calculated for all employees whether hourly or salaried
daily_comp: Hourly rate * 8
monthly_comp: Hourly rate * 2080 / 12
annual_comp: Hourly rate * 2080
ytd_leads: Year-to-date (YTD) number of leads generated for employees in Sales Executive and Sales Representative positions
ytd_sales: Year-to-date (YTD) sales measured in USD for employees in Sales Executive and Sales Representative positions
standard_hrs: Expected working hours over a two-week payroll cycle
salary_hike_pct: The percent increase in salary for the employee's most recent compensation adjustment (whether due to a standard merit increase, off-cycle adjustment, or promotion)
perf_rating: Most recent performance rating, where 1 = 'Needs Improvement', 2 = 'Core Contributor', 3 = 'Noteworthy', and 4 = 'Exceptional'
prior_emplr_cnt: Number of prior employers
env_sat: Environment satisfaction score measured on a 4-point Likert scale, where 1 = 'Highly Dissatisfied' and 4 = 'Highly Satisfied'
job_sat: Job satisfaction score measured on a 4-point Likert scale, where 1 = 'Highly Dissatisfied' and 4 = 'Highly Satisfied'
rel_sat: Colleague relationship satisfaction score measured on a 4-point Likert scale, where 1 = 'Highly Dissatisfied' and 4 = 'Highly Satisfied'
wl_balance: Work-life balance score measured on a 4-point Likert scale, where 1 = 'Poor Balance' and 4 = 'Excellent Balance'
work_exp: Total years of work experience
org_tenure: Years at current company
job_tenure: Years in current job
last_promo: Years since last promotion
mgr_tenure: Years under current manager
interview_rating: Average rating across the interview loop for the onsite stage of the employee's recruiting process, where 1 = 'Definitely Not' and 5 = 'Definitely Yes'
Examples
data(employees)
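Most columns of employees also appear in the smaller tables keyed by employee_id (status, job, tenure, and so on), so a wide table of this kind can presumably be assembled by joining on that key. The sketch below uses small made-up data frames, not the packaged data, and assumes that identifiers match across tables; all object names ending in _toy are hypothetical.

```r
# Toy stand-ins for three of the component tables (values are made up).
status_toy <- data.frame(employee_id = c(1, 2, 3),
                         active = c("Yes", "No", "Yes"))
job_toy <- data.frame(employee_id = c(3, 1, 2),
                      dept = c("Sales", "Research", "Sales"),
                      job_lvl = c(2, 4, 1))
tenure_toy <- data.frame(employee_id = c(2, 3, 1),
                         org_tenure = c(5, 1, 12))

# Successively inner-join every table on the shared key.
employees_toy <- Reduce(
  function(x, y) merge(x, y, by = "employee_id"),
  list(status_toy, job_toy, tenure_toy)
)

print(employees_toy)
```

Base merge() performs an inner join by default, so an employee missing from any one table would be dropped; passing all = TRUE keeps unmatched rows instead.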
job
Description
Fictitious job data for employees in a mid-size company
Usage
data("job")
Format
A data frame with 1470 observations on the following 6 variables.
employee_id: Unique identifier for each employee
dept: Department of which an employee is a member
job_lvl: Job level, where 1 = 'Junior' and 5 = 'Senior'
job_title: Job title
overtime: Flag set to 'Yes' if the employee is nonexempt and works overtime and 'No' if the employee does not work overtime
business_travel: Business travel frequency
Examples
data(job)
payroll
Description
Fictitious payroll data for employees in a mid-size company
Usage
data("payroll")
Format
A data frame with 1470 observations on the following 6 variables.
employee_id: Unique identifier for each employee
hourly_rate: Hourly rate, calculated for all employees whether hourly or salaried
daily_comp: Hourly rate * 8
monthly_comp: Hourly rate * 2080 / 12
annual_comp: Hourly rate * 2080
standard_hrs: Expected working hours over a two-week payroll cycle
Examples
data(payroll)
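The compensation columns are all simple multiples of hourly_rate, where 2080 corresponds to 40 hours per week over 52 weeks. The following sketch reproduces those formulas on made-up hourly rates; payroll_toy is a hypothetical data frame, not the packaged data.

```r
# Toy hourly rates (made up for illustration).
payroll_toy <- data.frame(
  employee_id = c(1001, 1002, 1003),
  hourly_rate = c(25, 40, 62.5)
)

# Derive each compensation column from the formulas in the Format section.
payroll_toy$daily_comp   <- payroll_toy$hourly_rate * 8
payroll_toy$monthly_comp <- payroll_toy$hourly_rate * 2080 / 12
payroll_toy$annual_comp  <- payroll_toy$hourly_rate * 2080

print(payroll_toy)
```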
performance
Description
Fictitious performance data for employees in a mid-size company
Usage
data("performance")
Format
A data frame with 1470 observations on the following 3 variables.
employee_id: Unique identifier for each employee
salary_hike_pct: The percent increase in salary for the employee's most recent compensation adjustment (whether due to a standard merit increase, off-cycle adjustment, or promotion)
perf_rating: Most recent performance rating, where 1 = 'Needs Improvement', 2 = 'Core Contributor', 3 = 'Noteworthy', and 4 = 'Exceptional'
Examples
data(performance)
prior_employment
Description
Fictitious prior employment data for employees in a mid-size company
Usage
data("prior_employment")
Format
A data frame with 1470 observations on the following 2 variables.
employee_id: Unique identifier for each employee
prior_emplr_cnt: Number of prior employers
Examples
data(prior_employment)
sentiment
Description
Fictitious sentiment data for employees in a mid-size company
Usage
data("sentiment")
Format
A data frame with 1470 observations on the following 6 variables.
employee_id: Unique identifier for each employee
env_sat: Environment satisfaction score measured on a 4-point Likert scale, where 1 = 'Highly Dissatisfied' and 4 = 'Highly Satisfied'
engagement: Employee engagement score measured on a 4-point Likert scale, where 1 = 'Highly Disengaged' and 4 = 'Highly Engaged'
job_sat: Job satisfaction score measured on a 4-point Likert scale, where 1 = 'Highly Dissatisfied' and 4 = 'Highly Satisfied'
rel_sat: Colleague relationship satisfaction score measured on a 4-point Likert scale, where 1 = 'Highly Dissatisfied' and 4 = 'Highly Satisfied'
wl_balance: Work-life balance score measured on a 4-point Likert scale, where 1 = 'Poor Balance' and 4 = 'Excellent Balance'
Examples
data(sentiment)
status
Description
Fictitious data on the active status of employees in a mid-size company
Usage
data("status")
Format
A data frame with 1470 observations on the following 2 variables.
employee_id: Unique identifier for each employee
active: Flag set to 'Yes' for active employees and 'No' for inactive employees
Examples
data(status)
survey_responses
Description
Fictitious survey responses for anonymized employees in a mid-size company
Usage
data("survey_responses")
Format
A data frame with 400 observations on the following 12 variables.
belong: Belonging score measured on a 5-point Likert scale, where 1 = 'Highly Unfavorable' and 5 = 'Highly Favorable'
effort: Discretionary Effort score measured on a 5-point Likert scale, where 1 = 'Highly Unfavorable' and 5 = 'Highly Favorable'
incl: Inclusion score measured on a 5-point Likert scale, where 1 = 'Highly Unfavorable' and 5 = 'Highly Favorable'
eng_1: Engagement score on item 1 of 3 measured on a 5-point Likert scale, where 1 = 'Highly Disengaged' and 5 = 'Highly Engaged'
eng_2: Engagement score on item 2 of 3 measured on a 5-point Likert scale, where 1 = 'Highly Disengaged' and 5 = 'Highly Engaged'
eng_3: Engagement score on item 3 of 3 measured on a 5-point Likert scale, where 1 = 'Highly Disengaged' and 5 = 'Highly Engaged'
happ: Happiness score measured on a 5-point Likert scale, where 1 = 'Highly Unfavorable' and 5 = 'Highly Favorable'
psafety: Psychological Safety score measured on a 5-point Likert scale, where 1 = 'Highly Unfavorable' and 5 = 'Highly Favorable'
ret_1: Retention score on item 1 of 3 measured on a 5-point Likert scale, where 1 = 'Highly Unfavorable' and 5 = 'Highly Favorable'
ret_2: Retention score on item 2 of 3 measured on a 5-point Likert scale, where 1 = 'Highly Unfavorable' and 5 = 'Highly Favorable'
ret_3: Retention score on item 3 of 3 measured on a 5-point Likert scale, where 1 = 'Highly Unfavorable' and 5 = 'Highly Favorable'
ldrshp: Senior Leadership score measured on a 5-point Likert scale, where 1 = 'Highly Unfavorable' and 5 = 'Highly Favorable'
Examples
data(survey_responses)
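Engagement and retention are each measured with three items, so an analysis will often need a single score per respondent. Averaging the items is one common convention; it is an assumption here, not something this documentation prescribes. The sketch uses made-up responses, and survey_toy and eng_mean are hypothetical names.

```r
# Toy responses on the three engagement items (made up for illustration).
survey_toy <- data.frame(
  eng_1 = c(4, 2, 5),
  eng_2 = c(5, 2, 4),
  eng_3 = c(3, 1, 5)
)

# Composite score per respondent: the unweighted mean of the three items.
eng_items <- c("eng_1", "eng_2", "eng_3")
survey_toy$eng_mean <- rowMeans(survey_toy[, eng_items])

print(survey_toy)
```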
tenure
Description
Fictitious tenure data for employees in a mid-size company
Usage
data("tenure")
Format
A data frame with 1470 observations on the following 6 variables.
employee_id: Unique identifier for each employee
work_exp: Total years of work experience
org_tenure: Years at current company
job_tenure: Years in current job
last_promo: Years since last promotion
mgr_tenure: Years under current manager
Examples
data(tenure)
turnover_trends
Description
Fictitious monthly employee turnover rates by several dimensions
Usage
data("turnover_trends")
Format
A data frame with 3000 observations on the following 6 variables.
year: Integer representing the year, which ranges from 1 (earliest) to 5 (most recent)
month: Integer representing the month, which ranges from 1 (January) to 12 (December)
job: Job title
level: Job level, where 1 = 'Junior' and 5 = 'Senior'
remote: Flag set to 'Yes' for a remote worker and 'No' for a non-remote worker
turnover_rate: Monthly turnover rate, calculated by dividing the termination count by the average headcount ((beginning headcount + ending headcount) / 2) for the respective month
Examples
data(turnover_trends)
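The turnover_rate definition can be sketched as a small function, reading it as terminations divided by average headcount, with average headcount taken as the mean of the beginning and ending headcounts. The data set ships only the resulting rate, so the function name, its arguments, and the counts below are hypothetical.

```r
# Hypothetical helper: monthly turnover rate from raw counts.
monthly_turnover_rate <- function(terminations, begin_headcount, end_headcount) {
  # Average headcount is the mean of the month's starting and ending headcount.
  avg_headcount <- (begin_headcount + end_headcount) / 2
  terminations / avg_headcount
}

# Made-up month: 6 terminations, headcount moving from 195 to 205.
rate <- monthly_turnover_rate(terminations = 6,
                              begin_headcount = 195,
                              end_headcount = 205)
print(rate)
```

Note the parentheses around the sum: without them, only the ending headcount would be halved.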