BIOSTATISTICS SERIES 
https://doi.org/10.5005/jp-journals-10028-1618
Statistics Corner: Chi-squared Test
^{1}Department of Biostatistics, Postgraduate Institute of Medical Education and Research (PGIMER), Chandigarh, India
^{2}Department of Psychology, Mehr Chand Mahajan DAV College for Women, Chandigarh, India
Corresponding Author: Kamal Kishore, Department of Biostatistics, Postgraduate Institute of Medical Education and Research (PGIMER), Chandigarh, India, Phone: +91 9591349768, email: kkishore.pgi@gmail.com
Received on: 01 February 2023; Accepted on: 25 February 2023; Published on: 10 April 2023
ABSTRACT
It is desirable to collect quantitative data and analyze them with parametric tests. Researchers, however, also gather categorical variables such as cured vs non-cured and diseased vs non-diseased. Many times, they also convert continuous variables into categories, for example, body mass index (BMI) into high, regular, and low BMI, or quality of life into good, average, and poor. When both the independent and dependent variables are categorical, the Chi-squared test is the standard choice. Consider a researcher who designs a study and collects data with many categorical outcome variables. A literature search recommends applying the Chi-squared test. The researcher, however, has a few vital questions related to the Chi-squared test:
What is Yates’ correction?
When to apply Fisher’s exact test?
What is the post hoc Chi-squared test?
How to assess the strength of an association?
How to cite this article: Kishore K, Jaswal V. Statistics Corner: Chi-squared Test. J Postgrad Med Edu Res 2023;57(1):40–44.
Source of support: Nil
Conflict of interest: None
Keywords: Categorical data, Chi-square, Fisher’s exact test, Nonparametric, Test of association.
INTRODUCTION
Health researchers frequently collect nominal variables such as recovered (yes vs no) or diseased (yes vs no) in routine investigations. The t-test and Wilcoxon–Mann–Whitney test discussed in previous articles are suited to continuous or ordinal outcome variables; they do not apply to nominal data.^{1}^{,}^{2} Karl Pearson proposed the Chi-squared test in 1900 to analyze nominal data.^{3} The test has become one of the most popular nonparametric tests because it is easy to understand and calculate. The results of the Chi-squared test are not valid when the sample size is small, and various researchers have proposed extensions or alternatives for that situation.
This manuscript extends the discussion of how to analyze, report, and interpret study findings from two independent groups covered in the previous articles.^{1}^{,}^{2} The current article will discuss (1) the problem statement, (2) the Chi-squared test, (3) the types of Chi-squared tests, (4) post hoc Chi-squared tests, (5) the strength of association, and (6) the interpretation and reporting of study findings. We will begin by framing a research question. All data analysis was conducted using R Commander (Rcmdr), a graphical user interface for the free, open-source, command-driven R software.
PROBLEM STATEMENT
The Postgraduate Institute of Medical Education and Research (PGIMER), Chandigarh, India is a tertiary care institute whose mandate is to undertake intensive research in patient care. Statistical analysis training is an integral part of students’ academic learning. The literature has shown increasing statistical anxiety among medical students. The researcher selected the Survey of Attitudes Toward Statistics (SATS-36) questionnaire, a reliable and valid structured instrument, to capture statistical anxiety. We segregated SATS anxiety scores into mild, moderate, and high anxiety categories. The survey’s objective is to determine whether there is any significant association between the levels of designation and statistical anxiety. To clarify, “faculty, research staff, and students” are the three levels of designation (the independent variable), and mild, moderate, and high statistical anxiety is the outcome variable in the study.
Disclaimer
For demonstration, Excel^{®} was used to generate the data for the analysis. However, the SATS-36 questionnaire genuinely assesses statistics anxiety.
Chi-squared Test (χ^{2})
Chi-square is a test of significance for a nominal dependent variable; it does not tell the strength of the association. Further, the order of categories does not affect Chi-square; it is affected only by differences between groups. The Chi-squared test can be extended to interval or ratio data that researchers collapse into ordinal categories. The Chi-square distribution is continuous, whereas the test applies to discrete counts. Because a continuous distribution approximates the discrete probabilities of the observed frequencies (e.g., binomial yes vs no), the approximation tends to overestimate the Chi-square value, which yields a smaller p-value than it should and thus inflates the type I error. For large samples, the continuity correction makes little difference; in a small sample, however, a small expected value in a few cells can skew the Chi-square statistic and hence the p-value. Yates’ continuity correction is applied to reduce the error in the approximation. For a 2 × 2 table, the test assumes that every cell has an expected value (not an observed value) of five or more. For tables with more than two rows or columns, the Chi-squared test assumes that no more than 20% of cells have an expected value less than five and that no cell has an expected value less than one. Researchers do not have to apply Yates’ correction for tables with more than two rows or columns or when the test result is not statistically significant. Unlike two-sided tests such as the Z-test and t-test, Chi-square is a one-tailed (right side) test: only large positive values can reject the null hypothesis.^{4}
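The calculation described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the Rcmdr workflow used in this article; the function names (`chi2_sf`, `expected_counts`, `chi_squared_test`, `expected_count_rule_ok`) are hypothetical helpers introduced here, and the counts are those of the 3 × 3 table shown in Table 1.

```python
import math

def chi2_sf(x, df):
    """Right-tail probability P(X >= x) of a Chi-square variable (closed-form series)."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-h) * sum_{k < df/2} h^k / k!
        term = math.exp(-h)
        total = term
        for k in range(1, df // 2):
            term *= h / k
            total += term
        return total
    # Odd df: erfc(sqrt(h)) plus a finite series of half-integer gamma terms
    total = math.erfc(math.sqrt(h))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-h)
    for k in range(1, (df - 1) // 2 + 1):
        total += term
        term *= h / (k + 0.5)
    return total

def expected_counts(table):
    """Expected count of each cell = row total * column total / grand total."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    return [[r * c / n for c in cols] for r in rows]

def chi_squared_test(table, yates=False):
    """Pearson Chi-square; yates=True subtracts 0.5 from each |O - E| (meant for 2 x 2)."""
    stat = 0.0
    for obs_row, exp_row in zip(table, expected_counts(table)):
        for o, e in zip(obs_row, exp_row):
            diff = abs(o - e)
            if yates:
                diff = max(0.0, diff - 0.5)
            stat += diff * diff / e
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df, chi2_sf(stat, df)

def expected_count_rule_ok(table):
    """Check the expected-count assumptions stated in the text."""
    flat = [e for row in expected_counts(table) for e in row]
    if len(table) == 2 and len(table[0]) == 2:
        return min(flat) >= 5
    share_below_5 = sum(e < 5 for e in flat) / len(flat)
    return share_below_5 <= 0.20 and min(flat) >= 1

table_3x3 = [[12, 14, 27], [7, 18, 27], [6, 8, 15]]
stat, df, p = chi_squared_test(table_3x3)
print(f"chi2 = {stat:.2f}, df = {df}, p = {p:.2f}")  # chi2 = 1.93, df = 4, p = 0.75
print("expected-count rule satisfied:", expected_count_rule_ok(table_3x3))
```

Only the right tail of the distribution is used for the p-value, which reflects the one-tailed nature of the test. Passing `yates=True` for a 2 × 2 table always gives a statistic no larger than the uncorrected one.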
In general, there are three types of Chi-squared tests:
Test of Association: It Assesses the Association between Two Categorical Variables

Research question: Is there any association between designation (faculty, research staff, and students) and statistical anxiety (mild, moderate, and high)?

Null hypothesis (H0): No statistically significant association exists between designation and statistical anxiety.

Alternative hypothesis (H1): There is a statistically significant association between designation and statistical anxiety.
Test of Homogeneity: It Assesses whether the Proportion of Statistical Anxiety Outcomes (Mild, Moderate, and High) is Similar between the Designations

Research question: Is the proportion of statistical anxiety outcomes (mild, moderate, and high) similar between designations?

Null hypothesis (H0): The proportion of statistical anxiety outcomes is not statistically significantly different between designations.

Alternative hypothesis (H1): The proportions of statistical anxiety outcomes are statistically significantly different between designations.
Test of Goodness of Fit: It Assesses whether Proportions of Outcomes, such as the Proportion of Diseased in Intervention and Control Groups, are Distributed According to a Prespecified Set of Population Proportions

Research question: Are the proportions of mild, moderate, and severe statistical anxiety distributed 15, 5, and 10% in each faculty, research staff, and student category, respectively? Please note that the specified proportions may come from a theoretical distribution, such as the binomial or Poisson, or from previous years’ population standards, such as 30% of students opting for medical, 20% for engineering, and 30% for others.

Null hypothesis (H0): The proportion of statistical anxiety is not statistically significantly different than specified proportions (15: 5: 10%) among designations.

Alternative hypothesis (H1): The proportion of statistical anxiety is statistically significantly different than specified proportions among designations.
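As a rough illustration of the goodness-of-fit test, the sketch below compares observed counts with counts expected under prespecified proportions. The observed counts and the 20:50:30 proportions are hypothetical values chosen for easy arithmetic; they are not data from this study, and the helper names are assumptions introduced for this example.

```python
import math

def chi2_sf_even_df(x, df):
    """Right-tail Chi-square probability; this sketch only covers even df."""
    if df % 2 != 0:
        raise ValueError("this sketch only handles even degrees of freedom")
    h = x / 2.0
    term = math.exp(-h)
    total = term
    for k in range(1, df // 2):
        term *= h / k
        total += term
    return total

def goodness_of_fit(observed, proportions):
    """Chi-square goodness of fit of observed counts to hypothesised proportions."""
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("hypothesised proportions must sum to 1")
    n = sum(observed)
    expected = [n * p for p in proportions]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1  # number of categories minus one
    return stat, df, chi2_sf_even_df(stat, df)

# Hypothetical example: 100 respondents with mild, moderate, and high anxiety
observed = [18, 55, 27]
hypothesised = [0.20, 0.50, 0.30]
stat, df, p = goodness_of_fit(observed, hypothesised)
print(f"chi2 = {stat:.2f}, df = {df}, p = {p:.2f}")  # chi2 = 1.00, df = 2, p = 0.61
```

Note that the hypothesised proportions must sum to one; the function rejects inputs that do not.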
Assumptions

The patients are randomly selected.

Patients in the sample are independent of each other.

The data in the cells are frequencies or counts—no percentages or proportions.

The variable levels are mutually exclusive—a single subject contributes data to one and only one group.

Independent and dependent variables should be categorical (nominal or ordinal).
Limitations

Small sample size: the test is overly sensitive and gives a smaller p-value. Apply Yates’ correction or Fisher’s exact test.

Interpretation is challenging for large numbers of categories (20 or more).

May obtain low association despite significant results.
Fisher’s Exact Test
When the sample size is small, or some cells have a frequency of zero or an expected value <5, apply Fisher’s exact test. The test is more precise than Chi-square because it avoids approximating the p-value from a continuous distribution; it calculates the p-value directly from the data. It is most commonly applied to 2 × 2 contingency tables. Extensions to tables with more than two categories of the independent or dependent variable exist (e.g., the Freeman–Halton extension), but they are computationally more demanding.
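To show what "calculating the p-value directly from the data" means, the following sketch enumerates every 2 × 2 table with the same margins and sums the hypergeometric probabilities of those no more likely than the observed one. This is one common two-sided convention and an assumption of this example; other software may define the two-sided p-value differently. The function name `fisher_exact_2x2` is hypothetical.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)   # smallest feasible top-left count
    highest = min(row1, col1)      # largest feasible top-left count
    p_obs = prob(a)
    # Add up all tables that are no more probable than the observed table
    p = sum(prob(x) for x in range(lowest, highest + 1)
            if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)

# Research staff vs students by clinical vs surgical (first partition of Table 1)
print(f"p = {fisher_exact_2x2(12, 14, 7, 18):.3f}")
```

No expected-count condition is needed here, which is why the exact test remains usable when cells are sparse or contain zeros.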
Post hoc Test
The Chi-squared test for three or more groups is a global test; it does not tell which rows and columns are significantly associated. The practitioner is often interested in the association between specific groups. Many researchers inspect the data subjectively by eyeballing them and then decide whether specific cells differ, which is not a reliable or valid method. Sharpe discussed four variations of post hoc tests to follow a statistically significant omnibus Chi-squared test: (1) residual analysis, (2) comparing cells, (3) ransacking, and (4) partitioning.^{5} We will discuss only “partitioning” for its ease of understanding and application; interested readers can consult Sharpe for more detail about the other methods.^{5} In the partition test, the researcher systematically partitions the original r × c contingency table into an orthogonal set of 2 × 2 tables. Table 1 displays the partitioning of a 3 × 3 table, and Table 2 that of a 3 × 2 table. The number of partitions depends on the degrees of freedom (df) of the original contingency table; a 3 × 2 table with two df allows two partitions, and a 3 × 3 table with four df allows four. The likelihood ratio Chi-square values of the partitioned tables sum to the likelihood ratio Chi-square of the original contingency table. A vital rule is that each marginal total of the original table must be a marginal total for one and only one subtable.
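Table 1: Partitioning of a 3 × 3 contingency table (designation by department)

Original table

| Designation | Clinical | Surgical | Basic |
|---|---|---|---|
| Research staff | 12 | 14 | 27 |
| Students | 7 | 18 | 27 |
| Faculty | 6 | 8 | 15 |

χ^{2} = 1.93, p = 0.75

Partition 1

| Designation | Clinical | Surgical |
|---|---|---|
| Research staff | 12 | 14 |
| Students | 7 | 18 |

χ^{2} = 1.80, p = 0.18, 2 × 2 table

Partition 2

| Designation | Clinical + Surgical | Basic |
|---|---|---|
| Research staff | 12 + 14 | 27 |
| Students | 7 + 18 | 27 |

χ^{2} = 0.01, p = 0.92

Partition 3

| Designation | Clinical | Surgical |
|---|---|---|
| Research staff + Students | 12 + 7 | 14 + 18 |
| Faculty | 6 | 8 |

χ^{2} = 0.15, p = 0.70

Partition 4

| Designation | Clinical + Surgical | Basic |
|---|---|---|
| Research staff + Students | 12 + 14 + 7 + 18 | 27 + 27 |
| Faculty | 6 + 8 | 15 |

χ^{2} = 0.00, p = 0.98

Cells written as a sum are added together into a single cell before calculating Chi-square.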
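Table 2: Partitioning of a 3 × 2 contingency table (designation by department)

Original table

| Designation | Clinical | Surgical |
|---|---|---|
| Research staff | 12 | 14 |
| Students | 7 | 18 |
| Faculty | 6 | 8 |

χ^{2} = 1.92, p = 0.38

Partition 1

| Designation | Clinical | Surgical |
|---|---|---|
| Research staff | 12 | 14 |
| Students | 7 | 18 |

χ^{2} = 1.80, p = 0.18

Partition 2

| Designation | Clinical | Surgical |
|---|---|---|
| Research staff + Students | 12 + 7 | 14 + 18 |
| Faculty | 6 | 8 |

χ^{2} = 0.15, p = 0.70

Cells written as a sum are added together into a single cell before calculating Chi-square.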
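The partitioning rule can be sketched as a short Python routine. It builds one 2 × 2 subtable per degree of freedom by pooling all rows above and all columns to the left of each cell, then computes the Pearson Chi-square of each subtable and checks that the likelihood ratio Chi-square values add up to that of the full table. The helper names are hypothetical, and this pooling scheme is one standard way to obtain orthogonal partitions, assumed here because it matches the subtables in Table 1.

```python
import math

def pearson_2x2(a, b, c, d):
    """Pearson Chi-square (no continuity correction) and p-value for a 2 x 2 table."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return stat, math.erfc(math.sqrt(stat / 2.0))  # right tail for df = 1

def likelihood_ratio(table):
    """Likelihood ratio Chi-square: 2 * sum of O * ln(O / E) over non-empty cells."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    g2 = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            if o > 0:
                g2 += 2.0 * o * math.log(o * n / (rows[i] * cols[j]))
    return g2

def partition(table):
    """Return one pooled 2 x 2 subtable (a, b, c, d) per degree of freedom."""
    parts = []
    for i in range(1, len(table)):
        for j in range(1, len(table[0])):
            a = sum(table[r][k] for r in range(i) for k in range(j))  # rows above, columns left
            b = sum(table[r][j] for r in range(i))                    # rows above, this column
            c = sum(table[i][k] for k in range(j))                    # this row, columns left
            d = table[i][j]                                           # this cell
            parts.append((a, b, c, d))
    return parts

table_3x3 = [[12, 14, 27], [7, 18, 27], [6, 8, 15]]
parts = partition(table_3x3)
for a, b, c, d in parts:
    stat, p = pearson_2x2(a, b, c, d)
    print(f"[[{a}, {b}], [{c}, {d}]]: chi2 = {stat:.2f}, p = {p:.2f}")

g2_parts = sum(likelihood_ratio([[a, b], [c, d]]) for a, b, c, d in parts)
print(f"sum of partition G2 = {g2_parts:.4f}, overall G2 = {likelihood_ratio(table_3x3):.4f}")
```

The Pearson values of the partitions need not sum exactly to the overall Pearson Chi-square; only the likelihood ratio statistic is exactly additive.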
Strength of Association
The Chi-squared test, Yates’ correction, and Fisher’s exact test only give statistical significance (the p-value); they do not tell the researcher about the strength of the relationship between variables. Φ and Cramer’s V are the counterparts of the correlation coefficient (used for continuous data) for two nominal variables. The Φ coefficient for 2 × 2 tables and Cramer’s V for tables with more than two rows or columns measure how strongly two categorical variables are associated. The magnitude ranges from 0 to 1, where 0 indicates no association and 1 indicates a perfect association between the two variables. A heuristic for interpreting the association is (1) <0.2, weak association, (2) 0.2–0.6, moderate association, and (3) >0.6, strong association.^{6} Cramer’s V does not show direction. On a 2 × 2 table, Φ shows direction with a positive or negative sign, but directionality does not make much sense in a larger table of nominal categories. Rcmdr does not calculate Φ and Cramer’s V; the interested reader can calculate them with the VassarStats online calculator (VassarStats: Website for Statistical Computation).
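Both coefficients are simple functions of the Chi-square statistic and can be computed by hand or with a short script. The sketch below uses the usual textbook definitions, V = sqrt(χ^{2} / (n × min(r − 1, c − 1))) and a signed Φ for 2 × 2 tables; the function names and the `strength_label` helper, which encodes the heuristic cut-offs quoted above, are assumptions of this example.

```python
import math

def pearson_chi2(table):
    """Pearson Chi-square statistic and total sample size of an r x c table."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    stat = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = rows[i] * cols[j] / n
            stat += (o - e) ** 2 / e
    return stat, n

def cramers_v(table):
    """Cramer's V = sqrt(chi2 / (n * min(r - 1, c - 1))); always between 0 and 1."""
    stat, n = pearson_chi2(table)
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(stat / (n * k))

def phi_2x2(a, b, c, d):
    """Signed phi coefficient for the 2 x 2 table [[a, b], [c, d]]."""
    return (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))

def strength_label(value):
    """Heuristic labels from the text: <0.2 weak, 0.2-0.6 moderate, >0.6 strong."""
    v = abs(value)
    if v < 0.2:
        return "weak"
    if v <= 0.6:
        return "moderate"
    return "strong"

table_3x3 = [[12, 14, 27], [7, 18, 27], [6, 8, 15]]
v = cramers_v(table_3x3)
phi = phi_2x2(12, 14, 7, 18)
print(f"Cramer's V (3 x 3) = {v:.3f} ({strength_label(v)})")
print(f"phi (2 x 2) = {phi:.3f} ({strength_label(phi)})")
```

For a 2 × 2 table, Cramer’s V equals the absolute value of Φ, so the two measures agree wherever both are defined.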
Chi-squared Test in Rcmdr
After opening Rcmdr, researchers can access the Chi-squared test through the menu “statistical analysis > discrete variables”, which offers “enter and analyze two-way table” (option 1) and “create a two-way table and compare two proportions (Fisher’s exact test)” (option 2). Option 1, displayed in Fig. 1, needs summarized data to run a Chi-square comparison, whereas option 2, depicted in Fig. 2, works with raw data. By default, option 1 displays two rows and two columns; the user can change this by dragging the square grid, as demonstrated in Fig. 1. Bonferroni and Holm’s corrections for pairwise (2 × 2) comparisons are available only with raw data (option 2). Further, option 2 can run Chi-square with or without continuity correction, whereas continuity correction is mandatory with option 1.
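For readers curious about what the Bonferroni and Holm corrections do to pairwise p-values, a minimal Python sketch follows. It is not the Rcmdr implementation; it applies the textbook definitions of both adjustments to the three pairwise 2 × 2 comparisons of the 3 × 2 table in Table 2, using uncorrected Pearson Chi-square p-values. All function names are hypothetical.

```python
import math
from itertools import combinations

def pearson_2x2_p(a, b, c, d):
    """p-value of the uncorrected Pearson Chi-square for [[a, b], [c, d]] (df = 1)."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return math.erfc(math.sqrt(stat / 2.0))

def bonferroni(pvals):
    """Multiply every p-value by the number of comparisons, capped at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]

def holm(pvals):
    """Holm step-down: scale the k-th smallest p by (m - k + 1), keep results monotone."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        running_max = max(running_max, (m - rank) * pvals[idx])
        adjusted[idx] = min(1.0, running_max)
    return adjusted

# Counts (clinical, surgical) for each designation, taken from Table 2
groups = {"Research staff": (12, 14), "Students": (7, 18), "Faculty": (6, 8)}
pairs = list(combinations(groups, 2))
raw = [pearson_2x2_p(*groups[g1], *groups[g2]) for g1, g2 in pairs]

for (g1, g2), p_raw, p_bonf, p_holm in zip(pairs, raw, bonferroni(raw), holm(raw)):
    print(f"{g1} vs {g2}: raw = {p_raw:.2f}, Bonferroni = {p_bonf:.2f}, Holm = {p_holm:.2f}")
```

Holm’s adjusted p-values are never larger than the Bonferroni ones, so Holm’s method is the less conservative of the two while still controlling the family-wise error rate.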
Reporting and Interpretation
We intend to find an association between designation (faculty, students, and research staff) and statistical anxiety (mild, moderate, and severe). For parsimony, we report only the Chi-square for the 3 × 3 table. On inspection, we found that all expected cell frequencies are more than five, but the sample size is relatively small; therefore, we applied the Chi-squared test with continuity correction. The result shows no statistically significant association between designation and statistical anxiety (p = 0.75). The association between designation and statistical anxiety was weak (Φ = 0.09); this value is shown for demonstration only, and there is no need to report it when the researcher does not find a statistically significant association.
CONCLUSION
The Chi-squared test is one of the most popular nonparametric statistical tests. The conventional wisdom that Chi-squared tests make no assumptions is wrong; the test makes crucial assumptions, though not about homogeneity and normality of the data. Further, many researchers do not use Yates’ correction for small sample sizes to obtain the correct p-value. Most researchers conclude by stating a statistically significant association between variables; it is advisable to also report the strength of association with Φ or Cramer’s V. There are at least four post hoc Chi-squared tests available. Most researchers, however, do not apply post hoc Chi-square to identify the specific sources of association for three or more rows and columns. We hope the current manuscript will motivate researchers to adopt and report correct practices.
ACKNOWLEDGMENT
We acknowledge Mr Tejinder Singh from the Department of Biostatistics, Postgraduate Institute of Medical Education and Research (PGIMER), Chandigarh, India, for his valuable time and input to improve the quality of the article.
ORCID
Kamal Kishore https://orcid.org/0000-0001-8936-0843
Vidushi Jaswal https://orcid.org/0000-0002-8914-4283
REFERENCES
1. Kishore K, Jaswal V. Statistics corner: comparing two unpaired groups. J Postgrad Med Educ Res 2022;56(3):145–148. DOI: 10.5005/jp-journals-10028-1594
2. Kishore K, Jaswal V. Statistics Corner: Wilcoxon–Mann–Whitney Test. J Postgrad Med Educ Res 2022;56(4):199–201. DOI: 10.5005/jp-journals-10028-1613
3. Crack TF. A note on Karl Pearson’s 1900 Chi-squared test: two derivations of the asymptotic distribution, and uses in goodness of fit and contingency tests of independence, and a comparison with the exact sample variance Chi-square result. Res Methods Methodol Account Ejournal 2018:1–29. DOI: 10.2139/ssrn.3284255
4. Driscoll P, Lecky F. Article 8. An introduction to hypothesis testing. Nonparametric comparison of two groups—1. Emerg Med J 2001;18(4):276–282. DOI: 10.1136/emj.18.4.276
5. Sharpe D. Chi-square test is statistically significant: now what? Pract Assess Res Evaluation 2015;20(8):1–10. DOI: 10.7275/tbfa-x148
6. Kotrlik J, Williams H, Jabor K. Reporting and interpreting effect size in quantitative agricultural education research. J Agric Educ 2011;52(1):132–142. DOI: 10.5032/jae.2011.01132
________________________
© The Author(s). 2023 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by-nc/4.0/), which permits unrestricted use, distribution, and noncommercial reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.