Kappa as a Measure of Agreement

Suppose you are analyzing data on a group of 50 people applying for a grant. Each request was read by two readers, and each reader answered "yes" or "no" to the proposal. Suppose the agreement data are arranged in a 2 x 2 table for readers A and B: the cells on the main diagonal (a and d) count the agreements, and the off-diagonal cells (b and c) count the disagreements. A similar statistic, called pi, was proposed by Scott (1955); Cohen's kappa and Scott's pi differ in how pe, the expected chance agreement, is calculated. Kappa is a form of correlation coefficient. Correlation coefficients cannot be interpreted directly, but a squared correlation coefficient, called the coefficient of determination (COD), is directly interpretable: the COD is the amount of variation in the dependent variable that can be explained by the independent variable. While the true COD is calculated only for the Pearson r, an estimate of the variance accounted for can be obtained for any correlation statistic by squaring the correlation value. By analogy, squaring kappa translates conceptually into the accuracy (i.e., the inverse of error) in the data that is attributable to congruence between data collectors. Figure 2 shows an estimate of the amount of correct and erroneous data in research data sets as a function of the degree of congruence, measured either by percent agreement or by kappa.
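The setup above can be sketched in a few lines of Python. The cell counts below are illustrative only (the text does not give the actual 2 x 2 table); a and d are agreements, b and c are disagreements, and kappa follows the standard definition κ = (po − pe) / (1 − pe), with pe built from the marginal proportions:

```python
# Cohen's kappa for a 2x2 agreement table.
# Counts are hypothetical (not from the text): a and d are agreements,
# b and c are disagreements between readers A and B; N = 50 proposals.
a, b, c, d = 20, 5, 10, 15
n = a + b + c + d

p_o = (a + d) / n                        # observed agreement
# Expected chance agreement: for each category, the product of the two
# marginal proportions, summed over categories.
p_yes = ((a + b) / n) * ((a + c) / n)
p_no = ((c + d) / n) * ((b + d) / n)
p_e = p_yes + p_no

kappa = (p_o - p_e) / (1 - p_e)
print(round(p_o, 3), round(p_e, 3), round(kappa, 3))  # 0.7 0.5 0.4

# Squaring kappa (by analogy with a coefficient of determination) gives
# a rough estimate of the share of the data that is accurate because of
# congruence between the data collectors.
print(round(kappa ** 2, 3))  # 0.16
```

With these illustrative counts, observed agreement is 70% but chance agreement is already 50%, so kappa is only 0.4, which is why kappa is preferred over raw percent agreement.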

If the observed agreement is due only to chance, that is, if the two sets of ratings are completely independent, then each diagonal cell proportion equals the product of its two marginal proportions. As a worked example, Cohen's κ was run to determine whether two police officers agreed on whether each of 100 people in a shopping mall showed normal or suspicious behaviour. There was moderate agreement between the two officers' judgments, κ = .593 (95% CI, .300 to .886), p < .0005. The standard error of kappa for the data in Figure 3 was computed with po = 0.94, pe = 0.57, and N = 222. For benchmarks for interpreting kappa values, see Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33, 159-174.

As Marusteri and Bacarea note (9), there is never 100% certainty about research results, even when statistical significance is achieved. Statistical results testing hypotheses about the relationship between independent and dependent variables become meaningless if the raters were inconsistent in scoring the variables. If agreement is less than 80%, more than 20% of the analyzed data are flawed; with a reliability of only 0.50 to 0.60, it must be understood that 40% to 50% of the analyzed data are flawed. When kappa falls below 0.60, the confidence intervals around the obtained kappa are wide enough that one can assume roughly half of the data may be incorrect (10). Clearly, statistical significance means little when so much error exists in the results being tested. Interrater reliability is, to some extent, a concern in most large studies, because the several people collecting data may experience and interpret the phenomena of interest differently. Variables subject to interrater error are easy to find in the clinical research and diagnostic literature. For example, studies of pressure ulcers (1, 2) include items such as redness, edema, and erosion in the affected area. While data collectors can use measurement tools for size, judging color is subjective, as is judging edema.
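The rough rule in the paragraph above treats the complement of the agreement (or reliability) level as the estimated share of flawed data; a minimal sketch of that arithmetic (the function name is mine, not from the text):

```python
def estimated_flawed_share(agreement):
    """Rough rule from the text: the estimated share of flawed data is
    the complement of the agreement (or reliability) level."""
    if not 0.0 <= agreement <= 1.0:
        raise ValueError("agreement must be a proportion in [0, 1]")
    return 1.0 - agreement

# 80% agreement leaves an estimated 20% of the data flawed;
# a reliability of 0.50 leaves an estimated 50% potentially flawed.
print(round(estimated_flawed_share(0.80), 2))  # 0.2
print(round(estimated_flawed_share(0.50), 2))  # 0.5
```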

In head trauma research, data collectors estimate the size of the patient's pupils and the degree to which the pupils constrict in response to light. In the laboratory, readers of Papanicolaou (Pap) smears for cervical cancer have been found to vary in their interpretations of the cells on the slides (3). Because data collectors are a potential source of error, researchers are expected to provide them with training that reduces variability in how they view, interpret, and record phenomena on data collection tools. . . .