What Is a Chi-Square Test?
A chi-square (χ²) test is a statistical method for determining whether categorical variables are associated, that is, whether the observed distribution of data across categories differs from the distribution expected under the null hypothesis by more than chance alone would explain. In polygraph research, the chi-square test is one of the most frequently used statistical tools for evaluating technique performance, comparing classification outcomes, and assessing the significance of accuracy differences.
Applications in PDD Research
The chi-square test is particularly suited to polygraph research because the primary outcome data are categorical. Examinees are classified as DI (deception indicated), NDI (no deception indicated), or INC (inconclusive), which are categories rather than continuous measurements, and ground truth is also categorical (deceptive or truthful). Common applications include comparing the accuracy rates of two techniques, testing whether a scoring method produces a classification distribution that differs significantly from the expected one, and evaluating whether the rates of false positives and false negatives differ significantly between experimental conditions.
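As a rough illustration of how such categorical outcomes become input for the test, the sketch below cross-tabulates ground truth against examiner decisions. The records are invented for illustration only, and the names `records` and `cross_tabulate` are hypothetical rather than part of any polygraph scoring system.

```python
from collections import Counter

# Hypothetical records: (ground truth, examiner decision). Invented for illustration.
records = [
    ("deceptive", "DI"), ("deceptive", "DI"), ("deceptive", "INC"),
    ("deceptive", "NDI"), ("truthful", "NDI"), ("truthful", "NDI"),
    ("truthful", "DI"), ("truthful", "INC"),
]

TRUTH_LEVELS = ("deceptive", "truthful")
DECISION_LEVELS = ("DI", "NDI", "INC")


def cross_tabulate(pairs):
    """Count each (truth, decision) pair and return a rows-by-columns table."""
    counts = Counter(pairs)
    return [[counts[(t, d)] for d in DECISION_LEVELS] for t in TRUTH_LEVELS]


table = cross_tabulate(records)
for truth, row in zip(TRUTH_LEVELS, table):
    print(truth, dict(zip(DECISION_LEVELS, row)))
```

Each row of `table` holds the decision counts for one ground-truth group, which is the contingency-table layout a chi-square test takes as input. A real study would need far more cases than this toy sample.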
How the Chi-Square Test Works
The test compares observed frequencies in a contingency table against the frequencies expected under the null hypothesis of no association. For example, a 2×2 table might cross Technique A and Technique B with correct and incorrect classifications. The chi-square statistic (χ²) is calculated as the sum of (observed − expected)² / expected across all cells. With degrees of freedom determined by the table dimensions [(rows−1) × (columns−1)], the statistic is compared to the chi-square distribution to obtain a p-value indicating statistical significance.
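The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the counts for Technique A and Technique B are hypothetical, the two groups are assumed to be independent samples, no continuity correction is applied, and the p-value helper covers only the 1-degree-of-freedom case of a 2×2 table.

```python
import math


def chi_square_test(observed):
    """Return (statistic, degrees of freedom, expected counts) for a table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one observation")

    # Expected count under no association: row total * column total / N.
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # Sum of (observed - expected)^2 / expected over all cells.
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected


def p_value_df1(statistic):
    """Upper-tail p-value, valid only for 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2))


# Hypothetical counts: rows are Technique A and Technique B,
# columns are correct and incorrect classifications.
observed = [[45, 5], [35, 15]]
statistic, df, expected = chi_square_test(observed)
p_value = p_value_df1(statistic)
print(f"chi2 = {statistic:.2f}, df = {df}, p = {p_value:.4f}")
```

With these invented counts every expected cell is 40 or 10, the statistic is 6.25 on 1 degree of freedom, and the p-value is about 0.012. Statistical packages may report a slightly different result for 2×2 tables because some apply a continuity correction by default.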
Limitations and Complementary Methods
The chi-square test requires sufficient expected cell frequencies (typically ≥5 per cell) and applies only to categorical counts. For polygraph research with small sample sizes, Fisher’s exact test may be more appropriate. For studies evaluating the agreement between examiners, a kappa statistic (Cohen’s kappa for two examiners, Fleiss’ kappa for three or more) provides a better measure of interrater reliability. For continuous score data, parametric methods such as t-tests or regression may be preferred. The chi-square test is one tool in the statistical toolkit used to evaluate polygraph research, best suited for comparing classification distributions across groups.
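To make the small-sample caveat concrete, the sketch below checks the smallest expected cell count of a 2×2 table and computes a two-sided Fisher's exact p-value from the hypergeometric distribution. The counts are hypothetical, the function names are illustrative, and the two-sided rule used here (summing all tables no more probable than the observed one) is one common convention among several.

```python
from math import comb


def min_expected_count(table):
    """Smallest expected cell count under the no-association hypothesis."""
    row_totals = [sum(r) for r in table]
    col_totals = [sum(c) for c in zip(*table)]
    n = sum(row_totals)
    return min(r * c / n for r in row_totals for c in col_totals)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))


# Hypothetical small study: rows are techniques, columns are correct/incorrect.
small_table = [[8, 2], [3, 7]]
smallest = min_expected_count(small_table)
fisher_p = fisher_exact_two_sided(8, 2, 3, 7)
print(f"smallest expected count = {smallest}, Fisher p = {fisher_p:.4f}")
```

For this invented table the smallest expected count is 4.5, below the usual threshold of 5, so the exact test (p ≈ 0.070 here) would be the safer choice over the chi-square approximation.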