Chi Square Formula
The chi-square formula is used to compare two or more statistical data sets. It applies to data that consist of variables distributed across various categories, and it helps us determine whether that distribution differs from what one would expect by chance.
Example: You research two groups of women and put them in categories of student, employed or self-employed.
Category | Group 1 | Group 2 |
Student | 40 | 30 |
Employed | 89 | 67 |
Self-employed | 3 | 7 |
The numbers collected are different, but you now want to know
- Is that just a random occurrence? Or
- Is there any correlation?
What is the Chi Square Formula?
The chi-squared test checks the difference between the observed value and the expected value. It tests the relationship between two categorical variables and is calculated from the observed frequency and the expected frequency.
Chi Square Formula
The Chi-Square is denoted by χ^{2}. The chi-square formula is:
χ^{2} = ∑(O_{i} – E_{i})^{2}/E_{i}
where
- O_{i} = observed value (actual value)
- E_{i} = expected value.
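As a minimal sketch of the formula above, the following Python function sums (O_{i} – E_{i})^{2}/E_{i} over all categories. The function name `chi_square` and the sample counts (60 rolls of a die assumed to be fair) are illustrative assumptions, not data from this article.

```python
def chi_square(observed, expected):
    """Return the chi-square statistic: the sum of (O_i - E_i)^2 / E_i."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 60 rolls of a die, 10 expected per face if the die is fair.
observed_rolls = [8, 12, 9, 11, 10, 10]
expected_rolls = [10] * 6

statistic = chi_square(observed_rolls, expected_rolls)
print(round(statistic, 3))
```

Each term is divided by the expected value, so the same absolute difference counts for more in a category where few observations were expected.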
Finding P-value
The Chi-Square test gives a P-value that helps you judge whether there is any association.
A hypothesis is a statement that might be true and that we can test. The size of the test statistic indicates how well the data fit the hypothesis:
- A very small Chi-Square test statistic indicates that the collected data match the expected data extremely well.
- A very large Chi-Square test statistic indicates that the data do not match very well. If the chi-square value is large, the null hypothesis is rejected.
The P-value is calculated from the Chi-Square test statistic; it is not the test statistic itself. P-value is short for probability value. It is the probability, assuming the null hypothesis is true, of getting a result that is the same as or more extreme than the one actually observed. The P-value is used as an alternative to the rejection point: it gives the smallest significance level at which the null hypothesis would be rejected. The smaller the P-value, the stronger the evidence in favor of the alternative hypothesis.
P-value | Description | Hypothesis Interpretation |
P-value ≤ 0.05 | The observed data would be very unlikely if the null hypothesis were true. | Rejected |
P-value > 0.05 | The observed data are consistent with the null hypothesis. | Not rejected (it “fails to reject”). |
P-value ≈ 0.05 | The P-value is near the cut-off. It is considered marginal. | The hypothesis needs more attention. |
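To make the link between the test statistic and the P-value concrete, here is a hedged sketch that computes the upper-tail probability of the chi-square distribution using only the Python standard library. It needs the degrees of freedom, which this article does not define: for a table of counts it is (rows – 1) × (columns – 1), and for a single list of categories it is the number of categories minus 1. The names `chi_square_p_value` and `interpret` are hypothetical helpers; in practice a statistics library (for instance `scipy.stats.chi2.sf`) would normally be used instead.

```python
import math


def chi_square_p_value(statistic, df):
    """Upper-tail probability of a chi-square distribution with df degrees of freedom."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if statistic <= 0:
        return 1.0
    half = statistic / 2.0
    # Closed-form starting points: df = 2 gives exp(-x/2), df = 1 gives erfc(sqrt(x/2)).
    if df % 2 == 0:
        a, tail = 1.0, math.exp(-half)
    else:
        a, tail = 0.5, math.erfc(math.sqrt(half))
    # Add two degrees of freedom per step using the incomplete-gamma recurrence.
    while a < df / 2.0:
        tail += half ** a * math.exp(-half) / math.gamma(a + 1.0)
        a += 1.0
    return tail


def interpret(p_value, alpha=0.05):
    """Apply the 0.05 cut-off from the table above."""
    return "reject" if p_value <= alpha else "fail to reject"


# 3.841 is the usual 5% critical value for 1 degree of freedom.
p = chi_square_p_value(3.841, 1)
print(round(p, 3), interpret(0.068))
```

The recurrence is numerically fine for the small degrees of freedom typical of textbook examples; it is not meant for very large values.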
Applications of Chi Square Formula
Given below are a few of the most common applications of the chi-square formula:
- Used by biologists to determine whether there is a significant association between two variables, such as the association between two species in a community.
- Used by genetic analysts to interpret the numbers in various phenotypic classes.
- Used in various statistical procedures to help decide whether to retain or reject a hypothesis.
- Used in the medical literature to compare the incidence of the same characteristic in two or more groups.
Examples Using Chi Square Formula
Example 1: Calculate the Chi-square value for the following data of incidences of water-borne diseases in three tropical regions.
Disease | India | Ecuador | South America | Total |
Typhoid | 31 | 14 | 45 | 90 |
Cholera | 2 | 5 | 53 | 60 |
Diarrhoea | 53 | 45 | 2 | 100 |
Total | 86 | 64 | 100 | 250 |
Solution:
Each expected value is (row total × column total)/grand total. For example, for Typhoid in India it is 90 × 86/250 = 30.96.
Setting up the following table:
Observed | Expected | O_{i} – E_{i} | (O_{i} – E_{i})^{2} | (O_{i} – E_{i})^{2}/E_{i} |
31 | 30.96 | 0.04 | 0.0016 | 0.0000517 |
14 | 23.04 | -9.04 | 81.72 | 3.547 |
45 | 36.00 | 9.00 | 81.00 | 2.25 |
2 | 20.64 | -18.64 | 347.45 | 16.83 |
5 | 15.36 | -10.36 | 107.33 | 6.99 |
53 | 24.00 | 29.00 | 841.00 | 35.04 |
53 | 34.40 | 18.60 | 345.96 | 10.06 |
45 | 25.60 | 19.40 | 376.36 | 14.70 |
2 | 40.00 | -38.00 | 1444.00 | 36.10 |
Therefore, χ^{2} = ∑(O_{i} – E_{i})^{2}/E_{i} ≈ 125.52
Answer: Chi Square ≈ 125.52
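The Expected column in this table follows from the row and column totals of the observed data. As a rough sketch, the script below derives each expected count as (row total × column total)/grand total and then sums the chi-square terms. The variable names are illustrative; the counts are those of Example 1.

```python
# Observed counts from Example 1 (rows: Typhoid, Cholera, Diarrhoea).
observed = [
    [31, 14, 45],
    [2, 5, 53],
    [53, 45, 2],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell: (row total x column total) / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Sum (O - E)^2 / E over every cell of the table.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

print([round(e, 2) for e in expected[0]])  # expected counts for the Typhoid row
print(round(chi_square, 2))
```

Squaring removes the sign of each difference, so cells above and below their expected count both add to the total.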
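Example 2: A chi-square test gives a p-value of 0.068. Decide whether to reject the null hypothesis at the 0.05 significance level.
Solution:
Since the p-value of 0.068 is greater than 0.05, we fail to reject the null hypothesis.
Answer: As the value of p > 0.05, the null hypothesis is not rejected.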
Example 3: As per a survey on the number of cars owned by each family in a locality, the data have been arranged in the following table. Calculate the chi-square value.
Number of cars | O_{i} | E_{i} |
One car | 30 | 25.6 |
Two cars | 14 | 15.1 |
Three cars | 6 | 5.2 |
Total | 50 |
Solution:
Setting up the following table:
Number of cars | O_{i} | E_{i} | (O_{i} – E_{i})^{2} | (O_{i} – E_{i})^{2}/E_{i} |
One car | 30 | 25.6 | 19.36 | 0.756 |
Two cars | 14 | 15.1 | 1.21 | 0.080 |
Three cars | 6 | 5.2 | 0.64 | 0.123 |
Total | 50 | | | 0.959 |
Therefore, χ^{2} = ∑(O_{i} – E_{i})^{2}/E_{i} = 0.959
Answer: Chi Square = 0.959
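The calculation for this example can be checked with a short script. This is a sketch under two assumptions: E_{i} = 15.1 is used for two cars (the value in the worked table), and each squared difference is divided by the expected value E_{i}, as the formula specifies. The variable names are illustrative.

```python
# Survey data from Example 3; E_i = 15.1 for two cars, as in the worked table.
categories = ["One car", "Two cars", "Three cars"]
observed = [30, 14, 6]
expected = [25.6, 15.1, 5.2]

# Each category contributes (O_i - E_i)^2 / E_i; the divisor is the expected value.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_square = sum(contributions)

for name, term in zip(categories, contributions):
    print(f"{name}: {term:.3f}")
print(f"Chi-square: {chi_square:.3f}")
```

Printing the per-category terms makes it easy to compare the script against a hand-built table row by row.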
FAQs on Chi Square Formula
What Is Chi Square Formula in Statistics?
The chi-square formula is a statistical formula used to compare two or more statistical data sets. It is used for data that consist of variables distributed across various categories and is denoted by χ^{2}. The chi-square formula is: χ^{2} = ∑(O_{i} – E_{i})^{2}/E_{i}, where O_{i} = observed value (actual value) and E_{i} = expected value.
How To Calculate p value from Chi Square Formula?
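The P-value is obtained by comparing the Chi-Square test statistic with the chi-square distribution for the appropriate degrees of freedom. It helps determine whether there is an association.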
- A very small Chi-Square test statistic indicates that the collected data matches the expected data extremely well.
- A very large Chi-Square test statistic indicates that the data does not match very well. If the chi-square value is large, the null hypothesis is rejected.
When To Use Chi Square Formula?
The chi-square formula is used for statistical analysis when the given data are frequencies (counts), rather than percentages or some other transformation of the data.
What Are the Applications of Chi Square Formula?
The applications of the chi-square formula are as follows:
- Used by biologists to determine whether there is a significant association between two variables, such as the association between two species in a community.
- Used by genetic analysts to interpret the numbers in various phenotypic classes.
- Used in various statistical procedures to help decide whether to retain or reject a hypothesis.
- Used in the medical literature to compare the incidence of the same characteristic in two or more groups.