Chapter 1 Sigma notation
1.1 🧠Sigma Notation Refresher🧠
“\(\Sigma\)” means “sum”, “add everything together”. In the context of statistics, we’re usually adding numbers from a data set together.
Assume \(x = [1, 2, 3, 4]\)
What is \(\Sigma x\)?
\(1 + 2 + 3 + 4 = 10\)
Oftentimes, you will need to perform an operation on each observation in some data:
What is \(\Sigma (x+1)\)?
\((1+1) + (2+1) + (3+1) + (4+1) = 14\)
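In code, \(\Sigma\) is just a running total. Here is a minimal Python sketch of the two sums above, assuming the data set is stored in a plain list (the variable names are made up for illustration):

```python
x = [1, 2, 3, 4]

# Sigma x: add every observation together
sum_x = sum(x)

# Sigma (x + 1): add 1 to each observation first, then add up the results
sum_x_plus_1 = sum(xi + 1 for xi in x)

print(sum_x)         # 10
print(sum_x_plus_1)  # 14
```

Notice that the operation inside the parentheses happens to each observation before anything gets added together.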
We can also have two variables.
Suppose \(x = [1, 2, 3, 4]\)
and \(y = [0, 1, 2, 3]\)
What is \(\Sigma (x+y)\)?
\((1+0)+(2+1)+(3+2)+(4+3)=16\)
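With two variables, the observations are paired up by position: the first x goes with the first y, and so on. One way to sketch that in Python, assuming both lists have the same length, is with `zip`:

```python
x = [1, 2, 3, 4]
y = [0, 1, 2, 3]

# zip pairs the first x with the first y, the second x with the second y, ...
sum_x_plus_y = sum(xi + yi for xi, yi in zip(x, y))

print(sum_x_plus_y)  # 16
```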
One very basic application of sigma notation is the mean, \(\bar{x} = \frac{\Sigma x}{N}\), where x might be some exam grades \([87, 95, 88, 85]\) and N is the number of data points in x. In this case, N would be the number of exams in a class.
\(x = [87, 95, 88, 85]\)
What is \(\frac{\Sigma x}{N}\), where N is the number of elements in x?
\(\frac{\Sigma x}{N} = \frac{87+95+88+85}{4}\)
\(= \frac{355}{4} = 88.75\)
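The same calculation can be sketched in Python, where `len` plays the role of N (the names `n` and `mean_x` are just illustrative choices):

```python
x = [87, 95, 88, 85]

n = len(x)           # N: the number of data points
mean_x = sum(x) / n  # Sigma x divided by N

print(mean_x)  # 88.75
```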
1.2 💪Worked examples💪
1.2.1
Suppose \(x = [2, 4, 6, 8]\)
What is \(\Sigma x\)?
Answer: 20
\(2 + 4 + 6 + 8 = 20\)
1.2.2
Suppose \(x = [3, 7, 7]\)
What is \(\Sigma (x+N)\), given that N is the number of elements in x?
Answer: 26
\((3+3) + (7+3) + (7+3) = 26\)
1.2.3
Suppose \(x = [2, 3, 4]\)
What is \(\Sigma x^2\)?
Answer: 29
\(2^2 + 3^2 + 4^2 = 4 + 9 + 16 = 29\)
1.3 📝Homework problems📝
For the following problems, assume
x = [1, 2, 3, 4]
and
y = [0, 1, 2, 3]
- \(\Sigma y\)
- \(\Sigma y^2\)
- \((\Sigma y)^2\)
- \(\Sigma(x-y)\)
- \(\Sigma x - \Sigma y\)
- \(\Sigma xy\)
- \((\Sigma x) - \bar{x}\)
- \(\Sigma (x-\bar{x})\)
- \(\Sigma (y-\bar{y})\)
- \(\frac{\Sigma (y-\bar{y})}{N}\)
1.4 😤Secrets they don’t tell you before the homework is due😤
For numbers 2 and 3, it’s very important to figure out whether you’re squaring each number in the data set and then adding them together, or adding all the data together and then squaring that number. The two procedures result in completely different numbers!
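To see the difference without giving away the homework, here is a small Python sketch that uses the data from worked example 1.2.3 instead. The variable names are made up for illustration:

```python
x = [2, 3, 4]

# Sigma x^2: square each number first, then add them together
sum_of_squares = sum(xi ** 2 for xi in x)

# (Sigma x)^2: add everything together first, then square the total
square_of_sum = sum(x) ** 2

print(sum_of_squares)  # 29
print(square_of_sum)   # 81
```

Same data, same two operations, different order, very different answers.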
You also may have noticed that for any data set \(\Sigma(x-\bar{x})\) always equals zero. All the data points in x that are above the mean are cancelled out by all the data points below the mean. Thus, \(\Sigma(x-\bar{x})\) divided by anything is 0. \(\Sigma(x-\bar{x})\) squared is \(0^2\), which is 0.
Thus, you could’ve skipped numbers 8, 9, and 10 and just put 0 for all three. 😈
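You can check the cancelling-out for yourself. This Python sketch uses the exam grades from section 1.1 rather than the homework data. One caveat: for these particular numbers the arithmetic is exact, but with other data a computer may report a tiny rounding error instead of exactly zero.

```python
x = [87, 95, 88, 85]

mean_x = sum(x) / len(x)  # 88.75

# How far each grade sits above (+) or below (-) the mean
deviations = [xi - mean_x for xi in x]
sum_of_deviations = sum(deviations)

print(deviations)         # [-1.75, 6.25, -0.75, -3.75]
print(sum_of_deviations)  # 0.0
```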
The standard deviation is a measure of how much each data point deviates from the mean, on average. But we can’t get the average deviation with \(\frac{\Sigma(x-\bar{x})}{N}\).
To get an average, you have to divide by N, and since \(\Sigma(x-\bar{x})\) always equals zero, \(\frac{\Sigma(x-\bar{x})}{N}\) will always equal zero too.
That’s why we square all the deviations: \(\frac{\Sigma(x-\bar{x})^2}{N}\)
That little two at the end of the numerator does something very important. It forces every squared deviation to be zero or positive, so the deviations above and below the mean can no longer cancel each other out.
\(\frac{\Sigma(x-\bar{x})^2}{N}\) is called the variance. Why “the variance” and not “the standard deviation”? That’s because the variance is the average SQUARED distance between each data point and the mean. You have to UN-square it!
\[\sqrt{\frac{\Sigma(x-\bar{x})^2}{N}}\]
^ The standard deviation! :::chef’s kiss:::
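Putting the whole recipe together, here is a minimal Python sketch that computes the variance and the standard deviation for the exam grades from section 1.1. It divides by N, exactly as in the formulas above, and the variable names are illustrative assumptions:

```python
import math

x = [87, 95, 88, 85]

n = len(x)
mean_x = sum(x) / n

# Square each deviation so nothing can cancel out
squared_deviations = [(xi - mean_x) ** 2 for xi in x]

variance = sum(squared_deviations) / n  # average squared deviation
std_dev = math.sqrt(variance)           # UN-square it

print(variance)           # 14.1875
print(round(std_dev, 2))  # 3.77
```

Python’s standard library also has `statistics.pvariance` and `statistics.pstdev`, which divide by N in the same way, so they make a handy cross-check.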