| Type: | Package |
| Title: | Data to Accompany Smithson & Merkle, 2013 |
| Version: | 1.2 |
| Date: | 2018-05-22 |
| Author: | Ed Merkle and Michael Smithson |
| Maintainer: | Ed Merkle <merklee@missouri.edu> |
| Description: | Contains data files to accompany Smithson & Merkle (2013), Generalized Linear Models for Categorical and Continuous Limited Dependent Variables. |
| License: | GPL-2 |
| NeedsCompilation: | no |
| Packaged: | 2018-05-22 16:32:15 UTC; merkle |
| Repository: | CRAN |
| Date/Publication: | 2018-05-22 16:38:40 UTC |
Babies gaze data
Description
Gaze patterns of four babies in a group.
Usage
data("babies")
Format
A data frame with 1180 observations on the following 6 variables.
row: a numeric vector
time: a numeric vector indexing the target baby
id: a numeric vector indexing the observations
gaze: a factor indicating whether a baby was looked at, with levels no, yes
babies: a factor indexing which baby was chosen to be looked at, with levels baby1, baby2, baby3, baby4
lookedat: a numeric vector registering whether gaze was initiated by the target baby, with 0 indicating “no” and 1 indicating “yes”
Source
These are hypothetical data.
Examples
data("babies", package="smdata")
Car salesperson problem
Description
Replication of the car salesperson problem in See, Fox, and Rottenstreich (2006)
Usage
data("carsales")
Format
A data frame with 155 observations on the following 4 variables.
initial: a numeric vector taking the value 0 for the Car condition and 1 for the Salesperson condition
prob: a numeric vector recording the respondent's probability estimate that the car was purchased from Carlos
NFCC: a numeric vector recording respondents' scores on the Need for Certainty and Closure scale
ctrNFCC: a numeric vector that is NFCC standardized to have a mean of 0 and standard deviation of 1
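As a rough illustration of the standardization described for ctrNFCC, the sketch below centers and scales a small vector of scores. The scores are hypothetical and not taken from the carsales data, and the use of the sample standard deviation (R's sd()) is an assumption, since the source does not say which divisor was used.

```r
# Hypothetical NFCC scores (illustration only, not from the carsales data).
nfcc <- c(2.8, 3.5, 4.1, 3.0, 3.9)

# Center at the mean and divide by the sample standard deviation.
# Assumption: sd() (n - 1 divisor) is the intended scaling.
ctr_nfcc <- (nfcc - mean(nfcc)) / sd(nfcc)

mean(ctr_nfcc)  # zero up to floating-point error
sd(ctr_nfcc)    # one up to floating-point error
```

The same pattern would apply to other standardized variables in this package, such as z.email and ziq, under the same assumption.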
Source
Data provided by Gurr, M. (2009).
References
Gurr, M. (2009). Partition dependence: Investigating the principle of insufficient reason, uncertainty and dispositional predictors. (Unpublished Honours thesis: The Australian National University, Canberra, Australia)
See, K. E., Fox, C. R., & Rottenstreich, Y. S. (2006). Between ignorance and truth: Partition dependence and learning in judgment under uncertainty. Journal of Experimental Psychology: Learning, Memory, and Cognition, 32, 1385-1402.
Examples
data("carsales", package="smdata")
Sex by method of cocaine ingestion
Description
Data from the 1991-1994 Drug Abuse Treatment Outcome Study on cocaine usage patterns.
Usage
data("cocaine")
Format
A data frame with 7592 observations on the following 2 variables.
sex: a factor with levels female, male
mode: a factor recording self-reported method of cocaine ingestion, with levels crack, freebase, inhale, inject
Source
The data are extracted from the 1991-1994 Drug Abuse Treatment Outcome Study (2010) data (DATOS).
References
United States Department of Health and Human Services and National Institute of Health and National Institute on Drug Abuse (2010). Drug Abuse Treatment Outcome Study, 1991-1994. Ann Arbor, MI: Inter-university Consortium for Political and Social Research.
Examples
data("cocaine", package="smdata")
Sex and race by method of cocaine ingestion
Description
Data from the 1991-1994 Drug Abuse Treatment Outcome Study on cocaine usage patterns.
Usage
data("cocaineplus")
Format
A data frame with 7592 observations on the following 8 variables.
sexsrt: a factor with levels FEMALE, MALE
age: a numeric vector
mstatstr: a factor with levels BLANK, DIVORCED, LIVINGASMARRIED, MARRIED, NEVERMARRIED, SEPARATED, WIDOWED
modestr: a factor with levels crack, freebase, inhale, inject
racestr: a factor with levels AfroAmerican, Caucasian, Hispanic, Other
sex: a numeric vector that takes the value 1 if male and 0 if female
mode: a numeric vector that takes the value 1 if cocaine usage method is crack, 2 if method is freebase, 3 if method is inhale, and 4 if method is inject
race: a numeric vector that takes the value 1 if AfroAmerican, 2 if Caucasian, 3 if Hispanic, and 4 if Other
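The numeric variables sex and mode mirror the factor variables sexsrt and modestr. A minimal sketch of how such codes could be derived from factors is given below. The toy vectors are hypothetical, and the source does not state how the package authors actually produced the numeric columns.

```r
# Hypothetical factor values (illustration only).
modestr <- factor(c("inject", "crack", "inhale", "freebase", "crack"),
                  levels = c("crack", "freebase", "inhale", "inject"))
sexsrt <- factor(c("MALE", "FEMALE", "FEMALE", "MALE", "MALE"),
                 levels = c("FEMALE", "MALE"))

# Integer codes follow the level order: 1 = crack, 2 = freebase,
# 3 = inhale, 4 = inject.
mode_num <- as.integer(modestr)

# 1 if male, 0 if female.
sex_num <- as.integer(sexsrt == "MALE")

data.frame(modestr, mode_num, sexsrt, sex_num)
```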
Source
The data were extracted from the 1991-1994 Drug Abuse Treatment Outcome Study (2010) data (DATOS).
References
United States Department of Health and Human Services and National Institute of Health and National Institute on Drug Abuse (2010). Drug Abuse Treatment Outcome Study, 1991-1994. Ann Arbor, MI: Inter-university Consortium for Political and Social Research.
Examples
data("cocaineplus", package="smdata")
Depression, Anxiety, and Stress
Description
Depression, Anxiety, and Stress Scale data.
Usage
data("dass")
Format
A data frame with 166 observations on the following 3 variables.
depress: a numeric vector measuring depression, scored from 0 to 20
anxiety: a numeric vector measuring anxiety, scored from 0 to 20
stress: a numeric vector measuring stress, scored from 0 to 20
Source
Data from a pilot study by Michael Smithson.
References
Lovibond, P. F., & Lovibond, S. H. (1995). The structure of negative emotional states: Comparison of the Depression Anxiety Stress Scales with the Beck Depression and Anxiety Inventories. Behaviour Research and Therapy, 33, 335-343.
Examples
data("dass", package="smdata")
Dyslexic readers data
Description
Reading scores and nonverbal IQ scores for gender- and age-matched dyslexic and non-dyslexic readers.
Usage
data("dyslexic3")
Format
A data frame with 44 observations on the following 3 variables.
score: a numeric vector recording children's scores on a reading accuracy test
dys: a numeric vector taking the value 1 if dyslexic and 0 if not
ziq: a numeric vector recording children's nonverbal IQ scores, standardized to have a mean of 0 and standard deviation of 1
Details
The reading accuracy scores have a maximum score of 1, indicating a perfect score on the test. In the Example 6.2 analysis, these are recoded to .99; whereas in the 1's inflated model in Ch. 6 and the censored regression model in Ch. 7 they have a value of 1.
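The recoding mentioned for the Example 6.2 analysis can be sketched as follows. The scores are hypothetical and the variable name score_open is introduced only for illustration. The source states only that perfect scores are recoded to .99.

```r
# Hypothetical reading accuracy scores; a value of 1 is a perfect score.
score <- c(0.46, 0.73, 1, 0.88, 1)

# Recode perfect scores to .99 so every value lies strictly inside (0, 1).
score_open <- ifelse(score == 1, 0.99, score)

score_open
```

For the 1's inflated and censored regression models, the original score values would be kept unchanged.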
Source
Data provided from Pammer and Kevan (2007), first analyzed in Smithson and Verkuilen (2006).
References
Pammer, K., & Kevan, A. (2007). The contribution of visual sensitivity, phonological processing, and nonverbal IQ to children's reading. Scientific Studies of Reading, 11, 33-53.
Smithson, M. J., & Verkuilen, J. (2006). A better lemon squeezer? Maximum-likelihood regression with beta-distributed dependent variables. Psychological Methods, 11, 54-71.
Examples
data("dyslexic3", package="smdata")
Marital Status and Email Usage
Description
Data from the U.S. General Social Surveys on marital status (ordinal; see details) and email usage.
Usage
data("email")
Format
A data frame with 3967 observations on the following 3 variables.
marital: Marital status, an ordered factor with levels never.married < married < divorced.
email.hrs: Reported weekly hours spent emailing.
z.email: Standardized version of email.hrs.
Details
In creation of this dataset, an additional GSS item (DIVORCE) was used to ensure that married people in the sample had not been previously divorced or widowed. Thus, the marital status variable in this dataset is truly ordinal, as individuals can only progress through the statuses in one order.
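To make the ordinal coding concrete, the following sketch builds an ordered factor with the same level order as marital. The values are hypothetical and are not drawn from the email data.

```r
# Hypothetical marital statuses (illustration only).
marital <- factor(c("married", "never.married", "divorced", "married"),
                  levels = c("never.married", "married", "divorced"),
                  ordered = TRUE)

is.ordered(marital)   # TRUE
levels(marital)       # level order defines the ordinal scale
as.integer(marital)   # 2 1 3 2
```

An ordered factor of this kind can be supplied as the response in ordinal regression functions that expect one.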
Source
The Survey Documentation and Analysis system hosted at UC, Berkeley: http://sda.berkeley.edu/GSS/.
References
Smith, T. W., Marsden, P. V., Hout, M., & Kim, J. (2011). General Social Surveys, 1972 - 2010. Principal Investigator, Tom W. Smith; Co-Principal-Investigators, Peter V. Marsden and Michael Hout, NORC ed. Chicago: National Opinion Research Center, producer, 2005; Storrs, CT: The Roper Center for Public Opinion Research, University of Connecticut, distributor. 1 data file (55,087 logical records) and 1 codebook (3,610 pp).
Examples
data("email", package="smdata")
Euthanasia Scale
Description
Euthanasia scale and Christian identification scale data.
Usage
data("euthan")
Format
A data frame with 351 observations on the following 3 variables.
mident: a numeric vector measuring the degree to which respondents identify themselves as Christian, on a scale from 0 to 1
teuth: a numeric vector measuring the degree to which respondents favor euthanasia, on a scale from 0 to 1
status: a numeric vector taking the value 0 if the observation is censored and 1 if not
Source
Data obtained from Mavor's (2004) study.
References
Mavor, K. (2004). Religious orientation, social identity and attitudes to homosexuality. Unpublished doctoral dissertation, School of Psychology, The Australian National University, Canberra, A.C.T., Australia.
Examples
data("euthan", package="smdata")
Exam data
Description
Grades achieved by second-year psychology students at The Australian National University in an introductory research methods course and the percentage marks they received in the laboratory component of that course.
Usage
data("exam")
Format
A data frame with 154 observations on the following 3 variables.
Labs: a numeric vector recording the percentage mark for the laboratory component of the course
Final: a numeric vector recording the percentage mark for the final exam
cens: a numeric vector taking the value 100 to indicate the value of censored observations
Source
Data obtained from Michael Smithson.
Examples
data("exam", package="smdata")
Confidence in financial knowledge
Description
Choice and confidence data from a study of financial knowledge involving U.S. undergraduates.
Usage
data("finance")
Format
A data frame with 4230 observations on the following 11 variables.
sub: Participant number.
jmeth: Experimental condition, with levels 1cd, 2ci, 3ei (see details).
item: Item number.
easyfoil: Equals 1 if the foil (incorrect alternative) was easy, 0 if the foil was hard (see details).
targtop: Equals 1 if the correct alternative was the first one displayed (on top), 0 otherwise.
cho: Participant's choice (equals 1 for the first alternative, 0 for the second alternative).
corr: Participant's accuracy (essentially targtop==cho).
iproba: For conditions 2ci and 3ei, the participant's confidence in the first alternative.
iprobb: For conditions 2ci and 3ei, the participant's confidence in the second alternative.
probc: The participant's confidence in his/her choice (see details).
nchorev: The number of choice revisions that the participant made.
Details
The data come from Study 2 of Sieck, Merkle, and Van Zandt (2007). Experimental participants completed a 30-item, 2-alternative test of financial knowledge. For each item, the participant first chose an alternative and then made a confidence judgment.
The confidence elicitation method varied across three between-subjects conditions. For condition 1cd, participants reported confidence in their chosen alternative on a scale from 50% to 100%. For conditions 2ci and 3ei, participants reported independent confidence judgments for each alternative on scales from 0% to 100%. These independent confidence judgments are contained in iproba and iprobb. In these conditions, probc is obtained by normalizing confidence in the chosen alternative by the sum of independent judgments. In addition to reporting independent confidence judgments in condition 3ei, participants wrote an explanation in response to the question "Why is this option true?" prior to reporting each confidence judgment.
For each item, the incorrect alternative was manipulated to sometimes be easy (easyfoil==1) and sometimes be difficult (easyfoil==0). Foil difficulty was defined by the accuracy of an independent group of students on a four-alternative version of the financial knowledge test; see Sieck et al. (2007) for more detail.
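The normalization behind probc in conditions 2ci and 3ei, and the accuracy relation corr = (targtop == cho), can be sketched on a toy data frame. The rows below are hypothetical, and the exact computation used by the study authors is assumed from the description above rather than confirmed by the source.

```r
# Hypothetical trials from an independent-judgment condition (illustration only).
toy <- data.frame(
  targtop = c(1, 1, 0),            # 1 if the correct alternative was on top
  cho     = c(1, 0, 1),            # 1 = first alternative chosen, 0 = second
  iproba  = c(0.80, 0.30, 0.60),   # confidence in the first alternative
  iprobb  = c(0.40, 0.90, 0.60)    # confidence in the second alternative
)

# Confidence attached to whichever alternative was chosen.
chosen_conf <- ifelse(toy$cho == 1, toy$iproba, toy$iprobb)

# Normalize by the sum of the two independent judgments.
toy$probc <- chosen_conf / (toy$iproba + toy$iprobb)

# Accuracy: the choice is correct when it matches the target position.
toy$corr <- as.integer(toy$targtop == toy$cho)

toy
```

Because probc is a ratio, the result is the same whether the independent judgments are stored as proportions or as percentages.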
Source
Provided by Ed Merkle.
References
Sieck, W.R., Merkle, E.C., & Van Zandt, T. (2007). Option fixation: A cognitive contributor to overconfidence. Organizational Behavior and Human Decision Processes, 103, 68-83.
Examples
data("finance", package="smdata")
Word Color and Fixations
Description
Summary eyetracking data from a study examining the impact of text saliency on eye movements.
Usage
data("fixations")
Format
A data frame with 48 observations on the following 6 variables.
id: Participant ID label.
condition: Condition, signifying whether a channel had a red title (see details).
countleft: Count of fixations in the middle, left channel.
countright: Count of fixations in the middle, right channel.
gazetime: Total gaze time on the webpage.
rt.cond: Equals red if the middle, right channel title was red; black otherwise.
Details
The data are taken from Owens, Shrestha, & Chaparro (2009). A webpage was divided into 9 channels (sections), and the title colors of the "middle, left" and "middle, right" channels were manipulated.
The variable condition takes the value Control if all title colors were black; Left if the "middle, left" channel title was red; and Right if the "middle, right" channel title was red.
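The relationship between condition and rt.cond described above can be sketched as a simple recode. The condition values are hypothetical, and rt_cond is an illustrative name rather than the column shipped with the package.

```r
# Hypothetical condition labels (illustration only).
condition <- c("Control", "Left", "Right", "Right", "Control")

# The middle, right title is red only in the Right condition.
rt_cond <- ifelse(condition == "Right", "red", "black")

table(condition, rt_cond)
```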
Source
Provided by Justin W. Owens.
References
Owens, J.W., Shrestha, S., & Chaparro, B.S. (2009). Effects of text saliency on eye movements while browsing a web portal. In Proceedings of the Human Factors and Ergonomics Society Annual Meeting (Vol. 53, pp. 1257-1261).
Examples
data("fixations", package="smdata")
Grades and marks for an undergraduate course
Description
Lab percentage mark, letter grade, lower and upper grade thresholds, a censored variable value, and the final percentage course mark.
Usage
data("grades")
Format
A data frame with 165 observations on the following 6 variables.
lab: a numeric vector recording the percentage mark for the laboratory component of the course
gradecat: a factor denoting the letter grade for the course, with levels CR, D, HD, N, P
lower: a numeric vector denoting the lower threshold for the corresponding letter grade
upper: a numeric vector denoting the upper threshold for the corresponding letter grade
cens: a numeric vector listing the censoring value of a mark, 3
finmark: a numeric vector recording the final percentage mark for the course
Source
Data obtained from Michael Smithson.
Examples
data("grades", package="smdata")
Study 1 judged probabilities of guilt
Description
Judged probabilities of guilt in a criminal trial scenario (Study 1).
Usage
data("guilt1")
Format
A data frame with 104 observations on the following 7 variables.
observ: a numeric vector indexing cases
crguilt: a numeric vector recording the judged probability of guilt in a criminal trial scenario
cigult: a numeric vector recording the judged probability of guilt in a civil trial scenario
crvd1: a numeric vector taking the value 1 if the respondent returned a “guilty” verdict in the criminal trial and 0 otherwise
crvd2: a numeric vector taking the value 1 if the respondent returned a “not guilty” verdict in the criminal trial and 0 otherwise
civd1: a numeric vector taking the value 1 if the respondent returned a “guilty” verdict in the civil trial and 0 otherwise
civd2: a numeric vector taking the value 1 if the respondent returned a “not guilty” verdict in the civil trial and 0 otherwise
Source
Data provided from Study 1 of Smithson, Deady and Gracik (2007).
References
Smithson, M., Gracik, L., & Deady, S. (2007). Guilty, not guilty, or ...? Multiple verdict options in jury verdict choices. Journal of Behavioral Decision Making, 20, 481-498.
Examples
data("guilt1", package="smdata")
Study 3 judged probabilities of guilt
Description
Judged probabilities of guilt in a criminal trial scenario (Study 3).
Usage
data("guilt3")
Format
A data frame with 96 observations on the following 3 variables.
pguilt: a numeric vector recording the judged probability of guilt in a criminal trial scenario
v1: a numeric vector taking the value 1 if the respondent returned a “guilty” verdict in the criminal trial and 0 otherwise
v2: a numeric vector taking the value 1 if the respondent returned a “not guilty” verdict in the criminal trial and 0 otherwise
Source
Data provided from Study 3 of Smithson, Deady and Gracik (2007).
References
Smithson, M., Gracik, L., & Deady, S. (2007). Guilty, not guilty, or ...? Multiple verdict options in jury verdict choices. Journal of Behavioral Decision Making, 20, 481-498.
Examples
data("guilt3", package="smdata")
Lower and upper probability estimates
Description
Lower and upper probability estimates provided by the Budescu et al. (2009) respondents in their interpretations of the phrase “very likely” in an IPCC report statement, along with dummy variables indicating the experimental condition.
Usage
data("intervalbeta")
Format
A data frame with 220 observations on the following 5 variables.
t: a numeric vector taking the value 1 if the respondent is in the Translation condition, and 0 otherwise
n: a numeric vector taking the value 1 if the respondent is in the Narrow condition, and 0 otherwise
w: a numeric vector taking the value 1 if the respondent is in the Wide condition, and 0 otherwise
y1: a numeric vector recording the respondent's lower probability estimate
y2: a numeric vector recording the respondent's upper probability estimate
Source
Data provided by D. V. Budescu from the Budescu et al. (2009) study.
References
Budescu, D. V., Broomell, S., & Por, H.-H. (2009). Improving the communication of uncertainty in the reports of the Intergovernmental Panel on Climate Change. Psychological Science, 20, 299-308.
Examples
data("intervalbeta", package="smdata")
Word and non-word response data
Description
Frequency with which respondents correctly identified 0, 1, 2, 3, or 4 letters (in correct versus incorrect order) of a word or non-word based on a cue.
Usage
data("phono")
Format
A data frame with 16 observations on the following 3 variables.
treeid: a numeric vector, a tree identification code needed by the R package for estimating MPT models
resp: a factor denoting whether a respondent correctly identified 0, 1, 2, 3, or 4 letters, with CO denoting that the 4 letters were in the correct order and IO indicating that they were not, with levels 0L, 1L, 2L, 3L, 4LCO, 4LIO
fr: a numeric vector recording the frequency of each response type
Source
These data are extracted from Maris (2002) figure 7, pg. 1421.
References
Maris, E. (2002). The role of orthographic and phonological codes in the word and the pseudoword superiority effect: An analysis by means of multinomial processing tree models. Journal of Experimental Psychology: Human Perception and Performance, 28, 1409-1431.
Examples
data("phono", package="smdata")
Censored response time data
Description
Response times for a task that timed out at 1200 ms, along with a prime condition (respondents were primed to use either intuition or deliberation in the task).
Usage
data("rtime")
Format
A data frame with 300 observations on the following 3 variables.
RT: a numeric vector, response time in milliseconds
prime: a numeric vector taking the value 0 if primed to use intuition or 1 if primed to use deliberation
status: a numeric vector taking the value 0 if the observation is censored and 1 if not
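The censoring scheme can be illustrated with a small sketch: response times that reach the 1200 ms time-out are recorded at the time-out and flagged with status 0. The latent times are hypothetical, and treating a time of exactly 1200 ms as censored is an assumption not stated in the source.

```r
# Hypothetical uncensored ("latent") response times in ms (illustration only).
latent  <- c(850, 1310, 1200, 640, 1975)
timeout <- 1200

# Observed time is capped at the time-out.
RT <- pmin(latent, timeout)

# status = 1 if the response was observed before the time-out, 0 if censored.
# Assumption: a time equal to the time-out counts as censored.
status <- as.integer(latent < timeout)

data.frame(RT, status)
```

This 0/1 coding follows the usual event-indicator convention, where 1 marks a fully observed value.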
Source
These are hypothetical data.
Examples
data("rtime", package="smdata")
School Skipping
Description
Data from the U.S. National Survey on Drug Use and Health on the frequency with which individuals skip school and other covariates.
Usage
data("skipping")
Format
A data frame with 252 observations on the following 6 variables.
income: Reported household income, where 1 means < $20k; 2 means >= $20k and < $50k; 3 means >= $50k and < $75k; 4 means >= $75k.
irsex: Gender; 1 is male and 2 is female.
educatn2: Grade in school (see details).
schdskip: Reported number of school days skipped out of the past 30.
wrkhrsw2: Reported number of hours worked in the past week.
anyskip: A binary version of schdskip, signifying whether the respondent skipped any days of school out of the past 30.
Details
Variable names match those from the National Survey on Drug Use and Health, so more details can be obtained from the survey codebook. Missing data codes have been changed to NA. Additionally, the variable educatn2 has been recoded to generally match the actual grade in which the respondent is enrolled. The only exceptions to this are that 14 means the second and third years in college, and 15 means the fourth or higher year in college.
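As a small illustration of how a binary indicator such as anyskip relates to the count schdskip, the sketch below dichotomizes a hypothetical vector of skipped days. The 0/1 coding is an assumption, since the source describes anyskip only as a binary version of schdskip.

```r
# Hypothetical counts of school days skipped in the past 30 (illustration only).
schdskip <- c(0, 3, NA, 1, 0)

# 1 if any days were skipped, 0 if none; missing values stay NA.
any_skip <- as.integer(schdskip > 0)

any_skip
```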
Source
Obtained from the Inter-University Consortium for Political and Social Research, University of Michigan, http://www.icpsr.umich.edu.
References
United States Department of Health and Human Services. Substance Abuse and Mental Health Services Administration. Center for Behavioral Health Statistics and Quality. National Survey on Drug Use and Health, 2010. ICPSR32722-v1. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2011-12-05. doi:10.3886/ICPSR32722.v1
Examples
data("skipping", package="smdata")
Transportation mode choice
Description
Choice of transportation mode by gender.
Usage
data("trchoice")
Format
A data frame with 10 observations on the following 4 variables.
treeid: a numeric vector needed for identifying a tree in the MPT algorithm
sex: a numeric vector taking the value 1 if male and 0 if female
resp: a factor denoting the transport mode choice, where D denotes driving one's own vehicle, F denotes getting a ride with a friend, O denotes other, P denotes using public transport, and W denotes walking
fr: a numeric vector recording the frequency with which each transport mode is chosen
Source
The data are extracted from the 1991-1994 Drug Abuse Treatment Outcome Study (2010) data (DATOS).
References
United States Department of Health and Human Services and National Institute of Health and National Institute on Drug Abuse (2010). Drug Abuse Treatment Outcome Study, 1991-1994. Ann Arbor, MI: Inter-university Consortium for Political and Social Research.
Examples
data("trchoice", package="smdata")
Chest Pain Treatment Preferences
Description
Experimental data in which participants were presented with statistical information about two treatments for chest pain, then asked about their preference for a treatment.
Usage
data("treatment")
Format
A data frame with 235 observations on the following 4 variables.
cond: Condition, referring to the way that statistical information was presented (see details).
choice: Treatment preference on an ordinal, 6-level scale from "definitely angioplasty" to "definitely bypass".
effectiveness: Participant ratings of the importance of treatment effectiveness on treatment choice (1 is extremely unimportant; 6 is extremely important).
invasiveness: Participant ratings of the importance of treatment invasiveness on treatment choice (1 is extremely unimportant; 6 is extremely important).
Details
The data were taken from Hulsey (2010). Study participants were asked to make a hypothetical decision between two treatments for chest pain: bypass surgery or balloon angioplasty. Bypass is generally more effective, but it is also more invasive and has a longer recovery time.
Conditions were defined by the way participants received statistical information concerning the two treatments. In condition pictograph, participants viewed visual information via a pictograph. In condition statistics, participants viewed numerical information.
Source
Provided by Lukas Hulsey.
References
Hulsey, L. (2010). Testimonials and statistics in patient medical decision aids. Unpublished master's thesis, Wichita State University.
Examples
data("treatment", package="smdata")
Transportation mode choice, long format
Description
Choice of transportation mode by gender, in long format so that each choice occupies 5 rows.
Usage
data("trlong")
Format
A data frame with 31680 observations on the following 6 variables.
obs: a numeric vector
case: a numeric vector
sex: a numeric vector taking the value 1 if male and 0 if female
resp: a factor indicating the transport mode choice, where B denotes taking the bus, D denotes driving one's own vehicle, F denotes getting a ride with a friend, O denotes other, and W denotes walking
chosen: a numeric vector taking the value 1 if the transport mode was chosen and 0 if not
pubpriv: a numeric vector that takes a value of 1 if the transportation mode is private and 0 if it is public
Source
The data are extracted from the 1991-1994 Drug Abuse Treatment Outcome Study (2010) data (DATOS).
References
United States Department of Health and Human Services and National Institute of Health and National Institute on Drug Abuse (2010). Drug Abuse Treatment Outcome Study, 1991-1994. Ann Arbor, MI: Inter-university Consortium for Political and Social Research.
Examples
data("trlong", package="smdata")
Work Days Missed
Description
Data from the U.S. National Survey on Drug Use and Health on the frequency with which individuals miss work due to mental health issues and other covariates.
Usage
data("workdays")
Format
A data frame with 777 observations on the following 8 variables.
cigtry: Reported age that the respondent first smoked a cigarette.
impydays: Reported days in the past year the respondent was unable to work due to mental health (see details).
age2: Respondent age (see details).
service: Has the respondent been in the U.S. Armed Forces? (1=yes, 0=no)
health: Rating of overall health, where 1 is excellent and 5 is poor.
movespy2: Number of times the respondent moved in the past 12 months.
schenrl: Whether the respondent is enrolled in any school (1=yes, 0=no).
coutyp2: Type of county in which the respondent resides: large metro (large), small metro (small), nonmetro (nonmetro).
Details
Variable names match those from the National Survey on Drug Use and Health, so more details can be obtained from the survey codebook. Missing data codes have been changed to NA. Additionally, age2 is coded so that 7 means 18 years of age, 8 means 19 years of age, ..., 11 means 22 or 23 years of age, 12 means 24 or 25 years, 13 means 26-29, 14 means 30-34, 15 means 35-49, 16 means 50-64, and 17 means 65 and over.
The variable impydays contains responses to the question "About how many days out of 365 in the past 12 months were you totally unable to work or carry out your normal activities because of your emotions, nerves, or mental health?"
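The age2 coding can be turned into readable labels with a lookup vector. The sketch below includes only the codes spelled out above; codes covered by the "..." are deliberately left out and map to NA, and the survey codebook should be consulted for them. The input values are hypothetical.

```r
# Lookup for the age2 codes stated explicitly in the details.
# Codes elided by "..." in the description are intentionally omitted.
age2_labels <- c(
  "7"  = "18",    "8"  = "19",
  "11" = "22-23", "12" = "24-25", "13" = "26-29",
  "14" = "30-34", "15" = "35-49", "16" = "50-64", "17" = "65+"
)

# Hypothetical age2 values, including an unlisted code and a missing value.
age2 <- c(7, 12, 17, 9, NA)

# Unlisted codes and missing values both come back as NA.
age_label <- unname(age2_labels[as.character(age2)])

age_label
```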
Source
Obtained from the Inter-University Consortium for Political and Social Research, University of Michigan, http://www.icpsr.umich.edu.
References
United States Department of Health and Human Services. Substance Abuse and Mental Health Services Administration. Center for Behavioral Health Statistics and Quality. National Survey on Drug Use and Health, 2010. ICPSR32722-v1. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2011-12-05. doi:10.3886/ICPSR32722.v1
Examples
data("workdays", package="smdata")