kappa.Rd
This function returns Cohen's kappa (k) for two raters. Kappa indicates the inter-rater reliability for categorical items: values near one indicate strong agreement between raters, values near zero indicate agreement no better than chance, and negative values indicate agreement worse than chance.
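As a minimal sketch of the underlying idea (the vectors r1 and r2 below are made up for illustration and are not part of the package), kappa is the observed agreement corrected for the agreement expected by chance:
#Hypothetical 0/1 ratings from two raters (illustration only)
r1 <- c(1, 1, 0, 1, 0, 1, 0, 0)
r2 <- c(1, 0, 0, 1, 0, 1, 1, 0)
p_observed <- mean(r1 == r2)              #observed agreement (0.75)
p_chance <- mean(r1) * mean(r2) +
  (1 - mean(r1)) * (1 - mean(r2))         #chance agreement (0.50)
(p_observed - p_chance) / (1 - p_chance)  #Cohen's kappa
#> [1] 0.5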
kappa(rater1, rater2, confidence = 0.95)
Scores or categorical labels from rater 1
Scores or categorical labels from rater 2
Confidence level for the kappa interval estimate, supplied as a proportion between 0 and 1 (defaults to 0.95).
Percent agreement between raters
Cohen's kappa for 0/1 (yes/no) scoring
Standard error for kappa, computed as sqrt((agree * (1 - agree)) / (N * (1 - random agreement)^2)), where agree is the proportion of observed agreement and random agreement is the proportion expected by chance (see the sketch after these value descriptions)
Lower limit for the confidence interval of kappa
Upper limit for the confidence interval of kappa
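The interval estimate can be sketched as follows, assuming a normal approximation around kappa; the values of p_agree, p_chance, and N below are illustrative choices that reproduce the example output shown later, and note that the formula uses proportions even though $p_agree is reported as a percentage.
#Sketch of the standard error and confidence limits described above
p_agree <- 0.40    #proportion of observed agreement
p_chance <- 0.50   #proportion of agreement expected by chance
N <- 20            #number of rated items
kappa_est <- (p_agree - p_chance) / (1 - p_chance)  #-0.2
se_kappa <- sqrt((p_agree * (1 - p_agree)) /
                   (N * (1 - p_chance)^2))          #0.219089
z <- qnorm(1 - (1 - 0.95) / 2)                      #about 1.96 for confidence = 0.95
kappa_est - z * se_kappa                            #lower limit
#> [1] -0.6294066
kappa_est + z * se_kappa                            #upper limit
#> [1] 0.2294066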
Note: All missing values are ignored. Kappa is calculated only for 0/1 scoring. If you pass categorical variables, the function instead returns the percent match between the two raters' values (see the sketch below).
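For categorical answers, one way to think about the percent match (a rough sketch, not necessarily the function's exact internals; w1 and w2 are made-up stand-ins for the raters' word columns) is the share of non-missing items on which both raters gave the same label:
w1 <- c("cat", "dog", "bird", NA, "fish")
w2 <- c("cat", "dog", "byrd", "tree", "fish")
100 * mean(w1 == w2, na.rm = TRUE)  #percent match, missing pairs dropped
#> [1] 75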
#This dataset includes two raters who each recorded the word given
#by the participant and rated whether the word was correct in the
#recall experiment.
data(rater_data)
#Consider normalizing the text if raters used different styles
#Calculate percent match for categorical answers
kappa(rater_data$rater1_word, rater_data$rater2_word)
#> [1] 90
kappa(rater_data$rater1_score, rater_data$rater2_score)
#> $p_agree
#> [1] 40
#>
#> $kappa
#> [1] -0.2
#>
#> $se_kappa
#> [1] 0.219089
#>
#> $kappa_LL
#> [1] -0.6294066
#>
#> $kappa_UL
#> [1] 0.2294066
#>