socialrisk Package
The goal of socialrisk is to create an efficient way to identify social risk from administrative health care data using ICD-10 diagnosis codes.
We’ve created a sample dataset of ICD-10 administrative data which we can load in.
i10_wide
#> patient_id sex date_of_serv dx1 dx2 dx3 dx4 dx5 visit_type
#> 1 1001 male 2020-02-14 E876 Z560 Z6372 Z654 E440 ip
#> 2 1001 male 2021-05-15 J189 Z644 A408 I10 G309 ip
#> 3 1001 male 2021-01-10 I119 Z628 I10 <NA> <NA> ot
#> 4 1001 male 2021-04-02 G309 K731 Z591 <NA> <NA> ot
#> 5 1001 male 2021-05-06 E039 I10 J189 <NA> <NA> ot
#> 6 1001 male 2021-06-04 J189 Z604 F329 <NA> <NA> ot
#> 7 1001 male 2021-10-01 E0800 G309 I10 <NA> <NA> ot
#> 8 1001 male 2021-11-05 I6011 I10 F329 R930 <NA> ot
#> 9 1001 male 2022-02-01 M546 G309 I10 I6011 <NA> ot
#> 10 1001 male 2022-03-15 E0800 I10 J189 F329 <NA> ot
#> 11 1002 female 2020-01-09 G459 Z598 E840 <NA> <NA> ip
#> 12 1002 female 2020-03-23 E840 Z591 <NA> <NA> <NA> ot
#> 13 1002 female 2020-09-07 E119 Z558 <NA> <NA> <NA> ot
#> 14 1002 female 2020-12-05 E840 E119 <NA> <NA> <NA> ot
#> 15 1002 female 2022-03-25 F419 E119 G459 <NA> <NA> ot
#> 16 1003 male 2020-02-15 F3010 F1910 I10 G40909 R296 ip
#> 17 1003 male 2020-03-31 F3010 Z562 E109 <NA> <NA> ot
#> 18 1003 male 2020-12-31 K762 R569 Z576 <NA> <NA> ot
#> 19 1003 male 2021-12-22 E109 R569 F1910 F4310 <NA> ot
#> 20 1003 male 2021-12-25 G40909 F1910 R569 <NA> <NA> ot
#> 21 1003 male 2022-08-28 K762 Z564 <NA> <NA> <NA> ot
#> 22 1003 male 2022-09-05 E109 K762 F4310 <NA> <NA> ot
#> 23 1004 female 2021-01-09 C50111 F1020 F330 <NA> <NA> ot
#> 24 1004 female 2021-04-15 C50111 F330 <NA> <NA> <NA> ot
#> 25 1004 female 2021-06-08 F329 C50111 F1020 <NA> <NA> ot
#> 26 1005 female 2020-01-27 K4000 G839 R1030 R251 G43909 ip
#> 27 1005 female 2020-11-13 G43909 K4000 G839 <NA> <NA> ot
#> 28 1005 female 2021-12-07 J22 G839 G43909 <NA> <NA> ot
#> 29 1005 female 2021-12-26 B2790 J22 G839 <NA> <NA> ot
#> hcpcs icd_version
#> 1 E2201 10
#> 2 E2201 10
#> 3 E2201 10
#> 4 E2201 10
#> 5 E2201 10
#> 6 E2201 10
#> 7 E2201 10
#> 8 E2201 10
#> 9 E2201 10
#> 10 E2201 10
#> 11 E0159 10
#> 12 E0159 10
#> 13 E0159 10
#> 14 E0159 10
#> 15 E0159 10
#> 16 E1353 10
#> 17 E1353 10
#> 18 E1353 10
#> 19 E1353 10
#> 20 E1353 10
#> 21 E1353 10
#> 22 E1353 10
#> 23 A7047 10
#> 24 A7047 10
#> 25 A7047 10
#> 26 K0669 10
#> 27 K0669 10
#> 28 <NA> 10
#> 29 <NA> 10
We use the built-in clean_data() function to specify the dataset, the patient ID, the current data format (wide or long), and the prefix of the diagnosis variables.
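# Reshape the wide claims data into one row per patient/diagnosis pair
data <- clean_data(dat = i10_wide,
                   id = patient_id,
                   style = "wide",
                   prefix_dx = "dx")
# Preview the first 10 rows of the result
head(data, 10)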
#> # A tibble: 10 × 2
#> patient_id dx
#> <fct> <chr>
#> 1 1001 E876
#> 2 1001 Z560
#> 3 1001 Z6372
#> 4 1001 Z654
#> 5 1001 E440
#> 6 1001 J189
#> 7 1001 Z644
#> 8 1001 A408
#> 9 1001 I10
#> 10 1001 G309
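To make the wide-to-long step concrete, here is a minimal base R sketch of the same kind of reshaping. It is not the implementation of clean_data(); it only illustrates how diagnosis columns sharing a prefix can be stacked into a single dx column. The toy data frame copies dx1 to dx3 from rows 1 and 12 of the sample data above, and the object names (wide, dx_cols, long) are illustrative assumptions.

```r
# Toy wide-format claims: dx1-dx3 from rows 1 and 12 of the sample data
wide <- data.frame(
  patient_id = c(1001, 1002),
  dx1 = c("E876", "E840"),
  dx2 = c("Z560", "Z591"),
  dx3 = c("Z6372", NA),
  stringsAsFactors = FALSE
)

# Find the columns whose names start with the diagnosis prefix
dx_cols <- grep("^dx", names(wide), value = TRUE)

# Stack each diagnosis column into patient_id / dx pairs
long <- do.call(rbind, lapply(dx_cols, function(col) {
  data.frame(patient_id = wide$patient_id,
             dx = wide[[col]],
             stringsAsFactors = FALSE)
}))

# Drop empty diagnosis slots and group rows by patient
long <- long[!is.na(long$dx), ]
long <- long[order(long$patient_id), ]
rownames(long) <- NULL
long
```

Each non-missing diagnosis ends up on its own row, which is the shape shown in the tibble above.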
Social Risk
Now, we can run our various social risk functions, with varying taxonomies.
Centers for Medicare and Medicaid Services (CMS)
Missouri Hospital Association
SIREN - UCSF
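Independent of any of the taxonomies named above, the basic idea of flagging social risk from diagnosis codes can be sketched in a few lines of base R. This is a rough illustration, not the package's functions or any of the listed taxonomies: it assumes that ICD-10-CM categories Z55 through Z65 (socioeconomic and psychosocial circumstances) are the codes of interest. The toy input repeats the ten cleaned rows shown earlier, and the names is_social and flagged are illustrative.

```r
# Long-format diagnoses: the ten cleaned rows shown earlier for patient 1001
long <- data.frame(
  patient_id = 1001,
  dx = c("E876", "Z560", "Z6372", "Z654", "E440",
         "J189", "Z644", "A408", "I10", "G309"),
  stringsAsFactors = FALSE
)

# Assumption: treat ICD-10-CM categories Z55 through Z65 as social risk codes
is_social <- grepl("^Z(5[5-9]|6[0-5])", long$dx)

# Keep the flagged rows and record their three-character category
flagged <- long[is_social, ]
flagged$category <- substr(flagged$dx, 1, 3)
flagged
```

A real taxonomy would map each code to a named domain rather than a bare category, so treat the regular expression here as a placeholder for that lookup.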