Chi-squared test

Last modified by 14zunde on 2024/02/13 07:41

Chi-squared tests are used to test the significance of statistical findings. Formally, chi-squared (or χ²) denotes a family of statistical tests whose test statistic follows the chi-squared distribution if the hypothesis under test is true; they are used to assess whether the distribution of values in a statistical sample conforms to an expected theoretical distribution. If no further specification is given, 'chi-squared' usually means Pearson's chi-squared test, which tests whether the frequencies of observed events are distributed according to an expected theoretical distribution. A simple example in stemmatology would be to calculate whether the variants found in a set of witnesses are evenly distributed across those witnesses:

Witness    Variants
W1         80
W2         54
W3         34
W4         16
W5         36

If the appearance of spelling variants is random, we expect the frequencies in the cells of this table to be evenly distributed. We would therefore expect the values not to deviate significantly from 44, the average of all values found (220 variants over 5 witnesses). Chi-squared allows us to judge how far the frequencies deviate from that average, or in other words how far the deviations should or should not be attributed to chance. Chi-squared can be computed as follows:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

Here O is the observed value and E is the expected value. χ² is thus the sum, over all cells, of the squared difference between the observed and the expected value, each divided by the expected value. In this case:

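$$\chi^2 = \frac{(80-44)^2 + (54-44)^2 + (34-44)^2 + (16-44)^2 + (36-44)^2}{44} = \frac{2344}{44} \approx 53.27$$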
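The same calculation can be reproduced in a few lines of Python. This is only a minimal sketch of the arithmetic described above; the variable names are chosen for illustration and the counts are those of the table.

```python
# Observed variant counts per witness, taken from the table above.
observed = {"W1": 80, "W2": 54, "W3": 34, "W4": 16, "W5": 36}

# Under the hypothesis of an even distribution, every witness is expected
# to carry the same number of variants: the mean of the observed counts.
expected = sum(observed.values()) / len(observed)

# Pearson's statistic: the sum of (O - E)^2 / E over all cells.
chi_squared = sum((o - expected) ** 2 / expected for o in observed.values())

print(f"expected per witness: {expected}")   # 44.0
print(f"chi-squared: {chi_squared:.2f}")     # 53.27
```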

Usually a chi-squared distribution table is used to check whether the computed value is larger than a critical χ² value. If it is, we conclude that the distribution of the values found is not due to chance. (An example of such a table can be found at http://www.medcalc.org/manual/chi-square-table.php.) To use the table we need to know the degrees of freedom of the test, which here is simply the number of cells containing values minus 1, so 4. The row for 4 degrees of freedom in the look-up table shows that the value found is larger even than the critical χ² value (18.47) associated with a probability (p) of 0.001, that is, a chance of 1 in 1,000 of seeing so large a deviation if the variants were in fact evenly distributed. So for these spelling variants we can conclude that their distribution across the witnesses is not due to chance.
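Instead of consulting a printed table, the probability can also be computed directly. The sketch below is one possible way to do so using only the Python standard library; it relies on the closed form of the upper-tail probability of the chi-squared distribution that holds for an even number of degrees of freedom (which happens to be the case here, with 4). The function name chi2_sf_even_df is a hypothetical helper introduced for this illustration.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X >= x) of a chi-squared distribution.

    Uses the closed form exp(-x/2) * sum_{j < df/2} (x/2)^j / j!,
    which is valid only when df is a positive even integer.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for even, positive df")
    half = x / 2
    terms = sum(half ** j / math.factorial(j) for j in range(df // 2))
    return math.exp(-half) * terms


# Counts from the table above; expected value is their mean.
observed = [80, 54, 34, 16, 36]
expected = sum(observed) / len(observed)
chi_squared = sum((o - expected) ** 2 / expected for o in observed)

# Degrees of freedom: number of cells minus 1.
df = len(observed) - 1

p_value = chi2_sf_even_df(chi_squared, df)
print(f"chi-squared = {chi_squared:.2f}, df = {df}, p = {p_value:.2e}")
print("significant at the 0.001 level:", p_value < 0.001)
```

For an odd number of degrees of freedom this shortcut does not apply. A statistics library would then be the more natural choice, for example scipy.stats.chisquare in SciPy, which by default also assumes equal expected frequencies.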

In a similar vein, Pearson's chi-squared test can be applied in cases where two variables need to be tested against each other, e.g. whether the distribution of spelling variants and that of grammatical variants across the witnesses are independent. For more details, however, readers are referred to existing statistics tutorials.
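To give a rough idea of the two-variable case, the following sketch computes Pearson's statistic for a small table of counts. The counts are invented purely for illustration and do not come from any real tradition. The function name chi_squared_independence is likewise hypothetical. The expected value of each cell is its row total times its column total divided by the grand total, and the degrees of freedom are (rows - 1) × (columns - 1).

```python
def chi_squared_independence(table):
    """Return (statistic, degrees of freedom) for a table of counts.

    `table` is a list of rows; each row lists the counts for one category
    of the first variable across the categories of the second variable.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Hypothetical counts (invented for this example only):
# rows = variant type, columns = three witnesses.
counts = [
    [30, 20, 10],  # spelling variants
    [10, 20, 30],  # grammatical variants
]

statistic, df = chi_squared_independence(counts)
print(f"chi-squared = {statistic:.2f}, df = {df}")  # chi-squared = 20.00, df = 2
```

The resulting value is then compared with the critical value for the given degrees of freedom in exactly the same way as in the one-variable example.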

Reference

– Columbia Center for New Media Teaching and Learning (CCNMTL). “D. The Chi-Square Test.” Quantitative Methods in the Social Sciences e-Lessons. Accessed 11 October 2015. http://ccnmtl.columbia.edu/projects/qmss/the_chisquare_test/about_the_chisquare_test.html.

In other languages

DE: Chi-Quadrat (χ²)-Test
FR: test du chi carré (χ²)
IT: test del chi-quadrato (χ²)

JZ