This function returns Cohen's kappa for two raters. Kappa measures inter-rater reliability for categorical items. Values close to one indicate strong agreement between raters, values close to zero indicate agreement no better than chance, and negative values indicate less agreement than would be expected by chance.

kappa(rater1, rater2, confidence = 0.95)

Arguments

rater1

Rater 1 scores or categorical listings

rater2

Rater 2 scores or categorical listings

confidence

Confidence level for the kappa interval estimate; supply a proportion between 0 and 1 (for example, 0.95 for a 95% interval).

Value

p_agree

Percent agreement between raters

kappa

Cohen's kappa for 0/1 (yes/no) scoring

se_kappa

Standard error for kappa, computed as the square root of (observed agreement * (1 - observed agreement)) / (N * (1 - random agreement)^2), where N is the number of rated items; see the worked sketch in Details below

kappa_LL

Lower limit for the confidence interval of kappa

kappa_UL

Upper limit for the confidence interval of kappa

Details

Note: all missing values are ignored. Kappa is calculated only for 0/1 (yes/no) scoring. If you pass other categorical variables, the function returns only the percent agreement between the two raters.
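
As a worked illustration of the standard error formula above, here is a minimal sketch using hypothetical 0/1 vectors r1 and r2 and base R only; the chance-agreement term shown is the standard two-rater formulation assumed for this illustration, not necessarily the package's internal code.

#Hypothetical 0/1 ratings from two raters (illustration only)
r1 <- c(1, 1, 0, 1, 0, 1, 1, 0, 1, 1)
r2 <- c(1, 0, 0, 1, 0, 1, 1, 1, 1, 1)

p_o <- mean(r1 == r2)                  #observed agreement
p_e <- mean(r1) * mean(r2) +
  (1 - mean(r1)) * (1 - mean(r2))      #chance (random) agreement for 0/1 scoring
k <- (p_o - p_e) / (1 - p_e)           #Cohen's kappa
se_k <- sqrt((p_o * (1 - p_o)) / (length(r1) * (1 - p_e)^2))
z <- qnorm(1 - (1 - 0.95) / 2)         #critical value for a 95% interval
c(kappa = k, lower = k - z * se_k, upper = k + z * se_k)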

Examples


#This dataset includes two raters who each transcribed the word
#produced by the participant and rated whether the word was correct
#in the recall experiment.

data(rater_data)

#Consider normalizing the text if raters used different styles; see the sketch after this call
#Calculate percent match for categorical answers
kappa(rater_data$rater1_word, rater_data$rater2_word)
#> [1] 90
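
#One way to normalize the text before matching (a sketch, not part
#of the original example): lower-case and trim whitespace with base
#R so that formatting differences do not count as disagreements.
rater1_clean <- trimws(tolower(rater_data$rater1_word))
rater2_clean <- trimws(tolower(rater_data$rater2_word))
kappa(rater1_clean, rater2_clean)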

kappa(rater_data$rater1_score, rater_data$rater2_score)
#> $p_agree
#> [1] 40
#> 
#> $kappa
#> [1] -0.2
#> 
#> $se_kappa
#> [1] 0.219089
#> 
#> $kappa_LL
#> [1] -0.6294066
#> 
#> $kappa_UL
#> [1] 0.2294066
#>
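
#The 0/1 call returns a named list, so its components can be stored
#and reused; a minimal sketch (the object name res is illustrative):
res <- kappa(rater_data$rater1_score, rater_data$rater2_score)
res$kappa                        #point estimate
c(res$kappa_LL, res$kappa_UL)    #confidence interval limits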