Chapter 3 Baby’s first statistical tests

3.1 Z-Scores

3.1.1 🧠Refresher🧠

A z-score tells you how far away a data point is from its population mean, in standard deviation units. In other words, it answers, “how many standard deviations away from the mean are you?”

\(z = \frac{x-\mu}{\sigma}\)

\(z\) is the test statistic you will calculate

\(x\) is the observation you want to calculate the z-score for

\(\mu\) is the population mean

\(\sigma\) is the population standard deviation
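The formula translates almost word for word into code. The snippet below is a minimal sketch: the function name `z_score` and the example numbers are illustrative assumptions, not part of the chapter's materials.

```python
def z_score(x, mu, sigma):
    """Return how many standard deviations x lies from the population mean."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


# Hypothetical example: an observation of 130 in a population with mean 100 and SD 15.
print(z_score(130, 100, 15))  # 2.0
```

A positive result means the observation sits above the mean; a negative result means it sits below.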

3.1.2 💪Worked examples💪

3.1.2.1 Problem 1

How many base hits can a baseball team expect per game? Frohlich (1994) recorded the number of hits by the 28 Major League Baseball teams for all of the games played from 1989 to 1993. He found that the number of hits the teams made in the games was normally distributed, with a mean of 8.72 and a standard deviation of 1.10.

What would the z-score be for a player who hit 13 times in a game?

Show answer

Step 1: Figure out what you have and note it down. For a z-score, you need a single observation, x, the population mean, \(\mu\), and the population standard deviation, \(\sigma\).

\(x = 13\), the number of hits for this particular player.

\(\mu = 8.72\), the population average hits for all players

\(\sigma = 1.10\), the population standard deviation of hits for all players

Step 2: Input the values into the correct positions in the equation

\(z =\frac{13-8.72}{1.10}\)

Step 3: Final calculation

\(z = 3.89\)

This \(z\)-score suggests this batter is exceptional: he hit nearly 4 standard deviations above the average. Recall from the empirical rule that about 99.7% of observations on a normal curve fall between -3 and +3 standard deviations of the mean, so a value this far out is very rare.
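To put a rough number on how unusual this is, the area of the normal curve above the z-score can be computed with Python's standard library. This is only a sketch: it assumes the normal model stated in the problem holds, it uses `statistics.NormalDist` (available in Python 3.8+), and the variable names are illustrative.

```python
from statistics import NormalDist

x, mu, sigma = 13, 8.72, 1.10
z = (x - mu) / sigma

# Proportion of a standard normal distribution that lies above z.
upper_tail = 1 - NormalDist().cdf(z)

print(f"z = {z:.2f}")  # z = 3.89
print(f"proportion above z: {upper_tail:.6f}")
```

The proportion comes out far below 0.1%, which is consistent with what the empirical rule suggests for values beyond 3 standard deviations.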

3.1.2.2 Problem 2

You are curious about where you sit in relation to the rest of your class on the most recent exam. You ask the instructor for the average and standard deviation for all student scores on this particular exam. The instructor reports that, on average, students scored a 60. The standard deviation for grades was 12. You scored a 72. Calculate your z-score.

Show answer

Given:
- \(x = 72\), the specific observation you want a z-score for
- \(\mu = 60\), the population average
- \(\sigma = 12\), the population standard deviation

Calculation:
\(z = \frac{72 - 60}{12} = 1.00\)

Answer: \(z = 1.00\)

3.1.3 📝Homework problems📝

The population average IQ is 100. The population standard deviation is 15. What is the z-score for an individual with an IQ score of 124?

A coding website states that, on average, it takes 4.5 months for someone to complete their course. They note a standard deviation of 2.5 months in completion time. If you completed the course in 6 months, where would you fall in relation to the average user? Calculate a z-score to find out.
Josh has a z-score of 1.50 for his puzzle solution time. The population mean is 3 minutes, with a standard deviation of 6 minutes. How long did it take Josh to solve the puzzle?
Jaime took 10 tries to make a free throw. The population standard deviation for the number of tries needed to make a free throw is 2. Jaime’s z-score was -1. What’s the population mean for number of tries needed to make a free throw?
Luka scored an 8 on the GNC exam, which is above the population average of 4. Her z-score was 2. What is the population standard deviation of GNC exam scores?

3.2 Group z-test

3.2.1 🧠Refresher🧠

A z-score tells you where a single observation falls in relation to the rest of the population. But if you’re comparing a sample average to a population, you’ll need a one-sample z-test instead. These tests account for sampling error—the natural variation that occurs from sample to sample. For example, if you randomly select 10 band kids and calculate their average IQ, that average likely won’t match the population average. If you repeated this process with many random samples of 10 band kids, you’d see each group’s mean IQ fluctuate slightly.

Among all these random sample means from band kids, you’ll see certain scores more than others. We call this distribution of sample means the “sampling distribution”.

With a z-score, you divide by \(\sigma\), the population standard deviation. For a group z-test, we instead divide by the standard error of the mean (SEM). For a one-sample z-test, SEM = \(\frac{\sigma}{\sqrt{n}}\).

The SEM is the standard deviation of the sampling distribution.
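A small simulation can make the sampling distribution concrete. The sketch below assumes the band kids are drawn from a normal population with the usual IQ parameters (mean 100, SD 15); the seed, the number of repeated samples, and all variable names are arbitrary choices for illustration.

```python
import math
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

MU, SIGMA = 100, 15   # assumed IQ population parameters
N = 10                # size of each random sample
NUM_SAMPLES = 5000    # how many samples to draw (arbitrary choice)

# Draw many random samples of size N and record each sample's mean.
sample_means = []
for _ in range(NUM_SAMPLES):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    sample_means.append(statistics.mean(sample))

simulated_sem = statistics.stdev(sample_means)  # SD of the sample means
theoretical_sem = SIGMA / math.sqrt(N)          # sigma / sqrt(n)

print(f"SD of the simulated sample means: {simulated_sem:.2f}")
print(f"sigma / sqrt(n):                  {theoretical_sem:.2f}")
```

The two printed values should land close to each other, which is the sense in which the SEM is the standard deviation of the sampling distribution.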

3.2.2 💪Worked examples💪

3.2.2.1 Problem 1

The population average SAT score for all students is 1050 with a population standard deviation of 195. Your special, gifted class of 30 students had an average SAT score of 1250.

Calculate the z-score comparing the average SAT score from this specific class of 30 students to the general population.

Show answer

Step 1: Organize all the information you have.

\(\bar{x} = 1250\), the sample mean we want to compare to the population mean
\(\mu = 1050\), the population mean
\(\sigma = 195\), the population standard deviation
\(n = 30\), the sample size

Step 2: Plug them into the correct formulas.

The one-sample z-test formula is:
\(z = \frac{\bar{x} - \mu}{SEM}\)

First, calculate the standard error of the mean (SEM):
\(SEM = \frac{\sigma}{\sqrt{n}} = \frac{195}{\sqrt{30}} \approx 35.60\)

Now compute the z-score:
\(z = \frac{1250 - 1050}{35.60} = \frac{200}{35.60} \approx 5.62\)

Your class is elite!
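The same two steps (SEM first, then z) can be wrapped in a short function to check the arithmetic. This is a minimal sketch; the function name `one_sample_z` and its parameter names are assumptions made for illustration.

```python
import math


def one_sample_z(sample_mean, mu, sigma, n):
    """Return (z, SEM) for a one-sample z-test with a known population sigma."""
    sem = sigma / math.sqrt(n)          # standard error of the mean
    z = (sample_mean - mu) / sem        # distance from mu in SEM units
    return z, sem


z, sem = one_sample_z(sample_mean=1250, mu=1050, sigma=195, n=30)
print(f"SEM = {sem:.2f}, z = {z:.2f}")  # SEM = 35.60, z = 5.62
```

Returning the SEM alongside z keeps the intermediate step visible, which makes it easier to compare against hand calculations.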

3.2.2.2 Problem 2

The national average ACT score is 19.6 with a standard deviation of 5.9. Calculate a group z score for a class of 30 students, whose average ACT score was 22 and whose sample standard deviation was 4.

Show answer

Step 1: Organize all your info

\(\bar{x} = 22\), the class’s average ACT score

\(\mu = 19.6\), the population average ACT score

\(\sigma = 5.9\), the population standard deviation of ACT scores

\(n = 30\), the sample size (of the class)

\(s = 4\), the sample standard deviation for the class, WHICH IS IRRELEVANT 🤪 (we already know the population standard deviation, \(\sigma\))

Step 2: Plug everything into the right places

For a one-sample z-test, \(z = \frac{\bar{x}-\mu}{SEM}\). So, here \(z=\frac{22-19.6}{SEM}\), which simplifies to \(z = \frac{2.4}{SEM}\).

For a one-sample z-test, SEM = \(\frac{\sigma}{\sqrt{n}}\). So, SEM \(= \frac{5.9}{\sqrt{30}}\).

Thus, SEM here is approximately \(1.0772\)

\(z = \frac{2.4}{1.0772} \approx 2.228\)

According to the empirical rule (68-95-99.7), we are just beyond \(\pm\) 2 standard errors from the population mean, so this would be statistically significant at \(\alpha = .05\)

3.2.3 📝Homework problems📝

The average mile time is 10 minutes (SD = 2). Your 4-week average is 8 minutes. What’s the z-score for your average run time?
Suppose the average number of songs on a one hour and fifteen minute concert set list is 15, with a standard deviation of 3. You look at 9 concerts by a band you like, and they average 11 songs per show. What is their z-score compared to the population?
Suppose the average adult watches 20 hours of TV per week, with a population standard deviation of 5 hours. You collect data from a random sample of 16 college students and find that they watch an average of 16 hours of TV per week. The sample had a standard deviation of 4.2 hours. What is the z-score for the sample mean compared to the population?
The average number of hours people sleep per night is 7.5 with a standard deviation of 1.2 hours. A sample of 25 nurses yields a z-score of -1.25 when compared to the population. What was the average number of hours slept in the sample?
In a study of 64 teachers, the sample mean stress score was 58. The population standard deviation is 8, and the z-score was calculated to be 1. What was the assumed population mean?
A sample of 36 students had an average math score of 82. The population mean is 88, and the z-score comparing this sample to the population was -3. What is the population standard deviation?
A sample had a mean of 74 compared to a population mean of 70. The population standard deviation is 4. The resulting z-score was 2. What was the sample size?

3.3 Confidence Intervals for the Group Z-Test

3.3.1 🧠Refresher🧠

It’s appropriate to construct confidence intervals (CIs) using z-scores when the population standard deviation is known and the population is normally distributed (or the sample size is large enough for the Central Limit Theorem to apply).

A CI takes a point estimate—such as a sample mean—and transforms it into a range of plausible values for the population parameter. It’s a way of hedging your bets: by accounting for sample size and variability, you generate a margin of error around your estimate that reflects the uncertainty inherent in sampling.

You want to take your point estimate, \(\bar{x}\), and add a “plus or minus” (\(\pm\)) amount to express the uncertainty around it. That “amount” is determined by two things: (a) the standard error of the mean, and (b) the z-score that corresponds to your desired level of confidence. For a 95% confidence interval, we’ll always use the same z-score: \(\pm 1.96\), which captures the middle 95% of the normal distribution. While we’ll need to calculate (a) each time based on the data, (b) stays constant. So, the 95% confidence interval for \(\bar{x}\) is:

\[ \bar{x} \pm SEM \times 1.96 \]
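For the curious, the 1.96 can be recovered from the standard normal distribution, and the interval formula fits in a few lines. This is a hedged sketch using `statistics.NormalDist` from the Python standard library (3.8+); the function name `z_confidence_interval` and the example data are hypothetical.

```python
import math
from statistics import NormalDist


def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Return (lower, upper) for a z-based CI with a known population sigma."""
    # The critical z leaves (1 - confidence) / 2 in each tail of the standard normal.
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin


# The z-score that captures the middle 95% of the normal distribution.
print(round(NormalDist().inv_cdf(0.975), 2))  # 1.96

# Hypothetical data: sample mean 50, sigma 10, n = 25, so SEM = 2.
lower, upper = z_confidence_interval(50, 10, 25)
print(f"[{lower:.2f}, {upper:.2f}]")  # [46.08, 53.92]
```

The function computes the critical value instead of hard-coding 1.96, so other confidence levels can be tried by changing the `confidence` argument.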

3.3.2 💪Worked examples💪

3.3.2.1 Problem 1

You think goths have a higher IQ than the population. Recall, for IQ, \(\mu\) = 100 and \(\sigma\) = 15. You sample 8 goths and find an average sample IQ of 109. Calculate a 95% CI for this sample mean.

Show answer

Step 1: Organize your info

\(\bar{x} = 109\), sample mean
\(n = 8\), sample size
\(\mu = 100\), population mean
\(\sigma = 15\), population standard deviation for IQ

Step 2: Plug everything into the right place

We want the 95% confidence interval for \(\bar{x}\), which is:
\(\bar{x} \pm (SEM \times 1.96)\)

Compute SEM:
\(SEM = \frac{\sigma}{\sqrt{n}} = \frac{15}{\sqrt{8}} \approx 5.3033\)

Now plug into the CI formula:
\(109 \pm (5.3033 \times 1.96) = 109 \pm 10.39\)

Final result:
The 95% confidence interval is approximately [98.61, 119.39]

So, we observed an average sample IQ of 109 for our goths, but given randomness, uncertainty, and a small sample size, we must admit that the population average IQ for goths could plausibly be anywhere between 98.61 and 119.39.

3.3.2.2 Problem 2

You think that chess enthusiasts might have a higher average IQ than the general population. You find 500 chess enthusiasts and their average sample IQ was 101. Calculate a 95% CI for this sample mean.

Show answer

The 95% confidence interval (CI) for \(\bar{x}\) is calculated as:
\[ \bar{x} \pm SEM \times 1.96 \]

Plugging in the numbers:
\[ 101 \pm \frac{15}{\sqrt{500}} \times 1.96 \]

Calculate SEM:
\[ SEM = \frac{15}{\sqrt{500}} \approx 0.6708 \]

So the CI becomes:
\[ 101 \pm 0.6708 \times 1.96 = 101 \pm 1.3148 \]

Interpretation:
The chess enthusiasts’ average IQ is likely between approximately 99.69 and 102.31. Because this interval includes the population mean of 100, we cannot rule out that their average IQ is the same as the general population. Given how narrow the interval is, we can be confident their IQ is very close to the population average.
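Comparing the two worked examples side by side shows how much the sample size drives the width of the interval. The loop below is a small sketch that reuses the numbers from both problems; the variable names and the `results` dictionary are illustrative.

```python
import math

SIGMA = 15    # population SD for IQ
Z_95 = 1.96   # critical z for a 95% interval

results = {}
for label, n, sample_mean in [("goths", 8, 109), ("chess enthusiasts", 500, 101)]:
    margin = Z_95 * SIGMA / math.sqrt(n)  # margin of error shrinks as n grows
    results[label] = (sample_mean - margin, sample_mean + margin)
    print(f"{label}: n = {n}, margin = {margin:.2f}, "
          f"CI = [{results[label][0]:.2f}, {results[label][1]:.2f}]")
```

With the same \(\sigma\), going from 8 people to 500 cuts the margin of error from roughly 10 IQ points to a little over 1.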

3.3.3 📝Homework problems📝

1. A random sample of 36 students at a university had an average exam score of 82. Assume the population standard deviation is 6 and the population is normally distributed. Construct a 95% confidence interval for the average exam score of all students at the university.
2. A researcher wants to estimate the average number of hours people sleep per night. A random sample of 25 adults had a mean of 7.2 hours. The population standard deviation is known to be 0.8 hours. (The sample standard deviation was 0.75.) What is the 95% confidence interval for the population mean number of hours slept?
3. You are told that a 95% confidence interval for a population mean is (12.5, 15.5), and that the standard error of the mean (SEM) is 1.5. What was the sample mean?
4. A researcher found a sample mean of 50, and reported a 95% confidence interval of (47.04, 52.96). What was the standard error of the mean (SEM) used to construct this interval?
5. A sample of students at my charter school had an average ACT score of 28.3 (SEM = 0.7). What is my best (95%) estimate for the upper bound of the true average ACT score at the school?
6. For Question #5, what is my best (95%) estimate for the lower bound of the true average ACT score at the school?
7. A researcher calculates a 95% confidence interval for the average test anxiety score of homeschooled students and finds it to be (53.2, 59.8). The national average anxiety score for high school students is 50. Based on this confidence interval, is the difference statistically significant?
8. A psychologist computes a 95% confidence interval for the average hours of sleep college students get during finals week and finds it to be (5.6, 7.4). The typical college student is reported to get 7 hours of sleep. Based on this confidence interval, is the difference statistically significant?

3.3.4 😤Secrets they don’t tell you before the homework is due😤

Turns out questions 7 and 8 had shortcuts! If you’re asked whether a result is statistically significant at the 0.05 level and you’ve already calculated a 95% confidence interval, there’s no need to do a separate hypothesis test. Just check whether the population mean is outside the interval. If it is, the result is significant. If it’s inside, it’s not. Confidence intervals and one-sample z-tests are two sides of the same coin!
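The "two sides of the same coin" claim can be checked directly: for a two-tailed test at the 0.05 level, the z-test and the 95% confidence interval should always give the same verdict. The sketch below uses the two worked examples from this section plus one hypothetical sample (mean 106, n = 36) added purely to show a significant case; the function names are illustrative.

```python
import math

Z_95 = 1.96  # critical z for alpha = .05, two-tailed


def significant_by_z(sample_mean, mu, sigma, n):
    """True if the one-sample z statistic falls beyond +/- 1.96."""
    z = (sample_mean - mu) / (sigma / math.sqrt(n))
    return abs(z) > Z_95


def significant_by_ci(sample_mean, mu, sigma, n):
    """True if mu lies outside the 95% confidence interval around the sample mean."""
    margin = Z_95 * sigma / math.sqrt(n)
    return mu < sample_mean - margin or mu > sample_mean + margin


# (sample mean, n): goths, chess enthusiasts, and one hypothetical sample.
for sample_mean, n in [(109, 8), (101, 500), (106, 36)]:
    by_z = significant_by_z(sample_mean, 100, 15, n)
    by_ci = significant_by_ci(sample_mean, 100, 15, n)
    print(f"mean = {sample_mean}, n = {n}: z-test says {by_z}, CI says {by_ci}")
```

Both checks rely on the same SEM and the same 1.96 cutoff, which is why they cannot disagree.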