# Pearson Correlation Formula

The correlation coefficient measures the strength and direction of the relationship between two variables. The Pearson correlation formula is used to check how strongly two sets of data are linearly related. It is also known as the Pearson product-moment correlation coefficient. The value of the Pearson correlation coefficient lies between -1 and +1. A coefficient of zero indicates that there is no linear relationship between the two data sets. A value of +1 indicates a perfect positive linear correlation and a value of -1 indicates a perfect negative linear correlation.


## What Is Pearson Correlation Formula?

The Pearson correlation coefficient is symbolised by the letter “r”. The Pearson correlation formula for the coefficient r is given by:

### \(r=\frac{n\left(\sum x y\right)-\left(\sum x\right)\left(\sum y\right)}{\sqrt{\left[n \sum x^{2}-\left(\sum x\right)^{2}\right]\left[n \sum y^{2}-\left(\sum y\right)^{2}\right]}}\)

Where,

\(r=\) Pearson correlation coefficient

\(x=\) Values in the first set of data

\(y=\) Values in the second set of data

\(n=\) Total number of values
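The formula above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `pearson_r` and its error handling are assumptions made here, and the variable names simply mirror the sums in the formula.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient using the raw-sum form of the formula."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")

    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    # Numerator: n * sum(xy) - sum(x) * sum(y)
    numerator = n * sum_xy - sum_x * sum_y
    # Denominator: sqrt([n * sum(x^2) - sum(x)^2] * [n * sum(y^2) - sum(y)^2])
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    if denominator == 0:
        # Happens when either data set is constant (or empty); r is undefined.
        raise ValueError("r is undefined when either data set has no variation")
    return numerator / denominator


# y is exactly 2 * x, so the data are perfectly positively correlated.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
```

Reversing one of the lists, for example `pearson_r([1, 2, 3], [3, 2, 1])`, gives a perfectly negative correlation instead.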

Let's work through a few examples based on the Pearson correlation formula.

## Solved Examples Using Pearson Correlation Formula

**Example 1:** A survey was conducted in your city. The following sample data gives each person's age and their corresponding income. Using the correlation coefficient formula, find out whether income increases with age. (Use \(\frac{1}{\sqrt{181}}\) as 0.074 and \(\frac{1}{\sqrt{209}}\) as 0.07)

| Age | 25 | 30 | 36 | 43 |
|---|---|---|---|---|
| Income | 30000 | 44000 | 52000 | 70000 |

**Solution:**

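To simplify the calculation, we divide each income by 1000 and denote the result by \(y_i\). Rescaling a variable by a positive constant does not change the correlation coefficient.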

| Age (\(x_i\)) | Income/1000 (\(y_i\)) | \(x_i - \bar{x}\) | \(y_i - \bar{y}\) | \((x_i - \bar{x})^2\) | \((y_i - \bar{y})^2\) | \((x_i - \bar{x})(y_i - \bar{y})\) |
|---|---|---|---|---|---|---|
| 25 | 30 | -8.5 | -19 | 72.25 | 361 | 161.5 |
| 30 | 44 | -3.5 | -5 | 12.25 | 25 | 17.5 |
| 36 | 52 | 2.5 | 3 | 6.25 | 9 | 7.5 |
| 43 | 70 | 9.5 | 21 | 90.25 | 441 | 199.5 |
| \(\bar{x} = 33.5\) | \(\bar{y} = 49\) | | | \(\Sigma (x_i - \bar{x})^2 = 181\) | \(\Sigma (y_i - \bar{y})^2 = 836\) | \(\Sigma(x_i-\bar{x})(y_i - \bar{y}) = 386\) |

Pearson correlation coefficient for the sample = \(\dfrac{\Sigma (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\Sigma (x_i - \bar{x})^2 \, \Sigma (y_i - \bar{y})^2 }}\) = \(\dfrac{386}{\sqrt{181}\sqrt{836}}\) = \(\dfrac{193}{\sqrt{181}\sqrt{209}}\) \(\approx 0.99\)

**Answer: Yes. In this sample, income increases with age, since the Pearson correlation coefficient between age and income is very close to +1.**
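The arithmetic in Example 1 can be checked with a short Python sketch. It uses the deviation-from-mean form of the coefficient from the solution and the incomes in thousands as they appear in the solution table (30, 44, 52, 70). The variable names are illustrative choices, not part of the original example.

```python
import math

ages = [25, 30, 36, 43]
incomes = [30, 44, 52, 70]  # income divided by 1000, as in the solution table

mean_x = sum(ages) / len(ages)
mean_y = sum(incomes) / len(incomes)

# Sums of squared deviations and of cross-products of deviations.
s_xx = sum((x - mean_x) ** 2 for x in ages)
s_yy = sum((y - mean_y) ** 2 for y in incomes)
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, incomes))

r = s_xy / math.sqrt(s_xx * s_yy)

print(mean_x, mean_y)      # 33.5 49.0
print(s_xx, s_yy, s_xy)    # 181.0 836.0 386.0
print(round(r, 2))         # 0.99
```

The printed means and sums match the last row of the solution table, and the rounded coefficient matches the value 0.99 obtained by hand.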

**Example 2:** The marks obtained by 5 students in science and geometry are given below:

\(\begin{array}{|c|c|c|c|c|c|} \hline \text { Science } & 16 & 15 & 12 & 10 & 8 \\ \hline \text { Geometry } & 11 & 18 & 10 & 20 & 17 \\ \hline \end{array}\)

Calculate the Pearson correlation coefficient.

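**Solution:**

Construct the following table, where \(x\) denotes the marks in the first row of the data and \(y\) the marks in the second row:

| \(x\) | \(y\) | \(xy\) | \(x^2\) | \(y^2\) |
|---|---|---|---|---|
| 16 | 11 | 176 | 256 | 121 |
| 15 | 18 | 270 | 225 | 324 |
| 12 | 10 | 120 | 144 | 100 |
| 10 | 20 | 200 | 100 | 400 |
| 8 | 17 | 136 | 64 | 289 |
| \(\sum x = 61\) | \(\sum y = 76\) | \(\sum xy = 902\) | \(\sum x^2 = 789\) | \(\sum y^2 = 1234\) |

The formula for the Pearson correlation coefficient is:

\(r=\frac{n\left(\sum x y\right)-\left(\sum x\right)\left(\sum y\right)}{\sqrt{\left[n \sum x^{2}-\left(\sum x\right)^{2}\right]\left[n \sum y^{2}-\left(\sum y\right)^{2}\right]}}\)

Substituting \(n = 5\) and the sums from the table:

\(r=\frac{5 \times 902-61 \times 76}{\sqrt{\left[5 \times 789-(61)^{2}\right]\left[5 \times 1234-(76)^{2}\right]}}\)

\(r=\frac{-126}{\sqrt{224 \times 394}} \approx -0.424\)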

**Answer: r = -0.424**
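As a quick check of Example 2, the sums and the final value can be reproduced in Python. This sketch applies the raw-sum formula directly to the two rows of marks from the data table. The names `x_marks` and `y_marks` are assumptions introduced here for illustration.

```python
import math

x_marks = [16, 15, 12, 10, 8]    # first row of the data table
y_marks = [11, 18, 10, 20, 17]   # second row of the data table
n = len(x_marks)

sum_x = sum(x_marks)
sum_y = sum(y_marks)
sum_xy = sum(x * y for x, y in zip(x_marks, y_marks))
sum_x2 = sum(x * x for x in x_marks)
sum_y2 = sum(y * y for y in y_marks)

numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(sum_x, sum_y, sum_xy, sum_x2, sum_y2)  # 61 76 902 789 1234
print(round(r, 3))                           # -0.424
```

The negative sign reflects that, in this small sample, higher marks in the first subject tend to go with lower marks in the second, though the relationship is only moderate.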