Semester 1, 2025
After completing this tutorial you should be comfortable with
You should have also revisited
marginal, joint and conditional distributions
linear transformations of probability distributions
using Summation and Expectation operators
Tutor: Richard Hayes (Richard)
Email: rjhayes@unimelb.edu.au
Consultation Time: Wednesdays, 02:00-03:00 pm
Room 473, 4th floor FBE Building
First, you should already have the data file, tute2_crime.csv, and the \(\texttt{R}\) script file, tute2.R, in your T2 folder (as we discussed last week).
Then open \(\texttt{RStudio}\), go to the top menu bar and select
\[ \text{Session} \rightarrow \text{Set Working Directory} \rightarrow \text{Choose Directory ...} \]
then highlight your T2 folder and hit Open.
Now, in \(\texttt{RStudio}\), click on the \(\texttt{R}\) script file, tute2.R, to open it in the Script Window.
Highlight then Run the following lines
The line library(stargazer) loads the stargazer package.
The next (executable) line, data=read.csv(file="tute2_crime.csv"), creates a data frame named data.
The variables are
We are then asked what a “typical” state looks like. Use stargazer to produce a table of summary statistics:
Statistic | N | Mean | St. Dev. | Median | Pctl(25) | Pctl(75) | Min | Max |
---|---|---|---|---|---|---|---|---|
State | 45 | 23.000 | 13.134 | 23 | 12 | 34 | 1 | 45 |
Violent Rate | 45 | 431.484 | 209.541 | 382.800 | 275.500 | 570.000 | 66.900 | 854.000 |
Robbery Rate | 45 | 106.656 | 64.193 | 100.900 | 75.300 | 152.500 | 8.800 | 240.800 |
Pop. Density | 45 | 105.656 | 97.664 | 76.530 | 34.542 | 157.042 | 1.086 | 385.441 |
Ave. PC Income | 45 | 15.816 | 1.937 | 15.797 | 13.919 | 17.114 | 12.370 | 20.273 |
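For readers who want to see how each column of the table is computed, the following is a minimal base-R sketch of the same statistics. It does not use stargazer, and it runs on a small made-up data frame (demo), not on tute2_crime.csv, so the numbers will not match the table. The helper summarise_var is a hypothetical name, and the percentiles use R's default quantile method, which may differ slightly from stargazer's.

```r
# Stand-in data frame: made-up values, NOT the tutorial's tute2_crime.csv.
# Column names follow the variables used in this tutorial.
set.seed(1)
demo <- data.frame(
  rob    = runif(45, 10, 240),
  dens   = rexp(45, rate = 1 / 100),
  avginc = rnorm(45, mean = 16, sd = 2)
)

# One set of summary statistics per variable, in the same column order
# as the summary table above (hypothetical helper).
summarise_var <- function(x) {
  c(N      = length(x),
    Mean   = mean(x),
    StDev  = sd(x),
    Median = median(x),
    P25    = unname(quantile(x, 0.25)),
    P75    = unname(quantile(x, 0.75)),
    Min    = min(x),
    Max    = max(x))
}

# sapply() returns one column per variable; t() puts variables in rows.
summary_table <- t(sapply(demo, summarise_var))
print(round(summary_table, 3))
```

With the real data loaded, replacing demo with data should give values close to those in the table.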
Interpretation:
Now, we are asked to produce probability density plots for these variables. This can be done using the plot(density()) command in R.
Better looking Probability Density Plots:
You should also add a title and label the X and Y axes (particularly when including graphs in an assignment). For example, use
Interpretation
# create a new plotting window and set the plotting area into a 1*2 array
par(mfrow = c(1, 2))
# create density plots
plot(density(data$rob),
main="Density of Robbery Rate",
xlab="Robberies per 100,000 People",
ylab="Density",
col="orange",
lwd=2)
plot(density(data$avginc),
main="Density of Per Capita Income",
xlab="Per Capita Income (in $000's)",
ylab="Density",
col="orange", lwd=2)
The distributions of vio, rob and avginc appear roughly symmetric around their respective means.
plot(density(data$dens),
main="Density of People Per Square Mile",
xlab="People Per Square Mile",
ylab="Density",
col="orange", lwd=2)
#### The following is optional you do NOT need to include
#### these lines in your assignment code
# to add the mean and median as dashed vertical lines, use
abline(v=mean(data$dens),col="red", lty=2)
abline(v=median(data$dens),col="blue", lty=2)
# and add a legend (line types and widths match the lines drawn above:
# solid for the PDF, dashed for the mean and median)
legend(220, 0.005, legend=c("PDF", "mean of pop density",
"median of pop density"),
col=c("orange", "red","blue"), lty=c(1, 2, 2), cex=0.8, lwd=c(2, 1, 1))
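The population density variable, dens, is right-skewed: many US states have similar, relatively low densities, while a few states in the right tail of the distribution, such as New York and California, are very densely populated. This is consistent with the mean (105.656) being well above the median (76.530).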
Histograms are closely related to probability density plots. For example, again looking at the dens variable, we have:
Another method we could use is a box plot, e.g.
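The histogram and box plot mentioned above could be produced along the following lines. This is only a sketch: dens_demo is a simulated, right-skewed stand-in for data$dens (not the real data), and the titles, colours and labels are illustrative choices rather than the tutorial's own.

```r
# Stand-in for data$dens: simulated right-skewed values, NOT the real data.
set.seed(42)
dens_demo <- rexp(45, rate = 1 / 100)

# Put the two plots side by side.
par(mfrow = c(1, 2))

# Histogram on the density scale (freq = FALSE) so that the smoothed
# probability density estimate can be overlaid on the same axes.
h <- hist(dens_demo,
          freq = FALSE,
          main = "Histogram of People Per Square Mile",
          xlab = "People Per Square Mile",
          col  = "grey90")
lines(density(dens_demo), col = "orange", lwd = 2)

# Box plot: the box spans the lower to upper quartile and the thick line
# inside the box marks the median.
b <- boxplot(dens_demo,
             horizontal = TRUE,
             main = "Box Plot of People Per Square Mile",
             xlab = "People Per Square Mile",
             col  = "orange")
```

With the real data loaded, replace dens_demo with data$dens. In a right-skewed variable the histogram has a long right tail, and the box plot shows a longer right whisker or outlying points on the right.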
Scatter plots are used to visualise the relationship between two variables.
Again, we have to be careful about formatting the plots correctly.
Let’s start by looking at the relationship (if any) between rob and vio.
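A formatted scatter plot can be sketched as follows. The vectors rob_demo and vio_demo are simulated stand-ins that are built to be positively related, so the picture says nothing about the real rob and vio columns; the axis label for the violent crime rate is also an assumption.

```r
# Stand-in data: simulated values, NOT the tutorial's rob and vio columns.
# vio_demo is constructed to rise with rob_demo purely for illustration.
set.seed(7)
rob_demo <- runif(45, 10, 240)
vio_demo <- 150 + 2.5 * rob_demo + rnorm(45, sd = 80)

# Scatter plot with a title and labelled axes.
plot(rob_demo, vio_demo,
     main = "Violent Crime Rate vs Robbery Rate",
     xlab = "Robberies per 100,000 People",
     ylab = "Violent Crime Rate",
     pch  = 19,
     col  = "orange")

# The sample correlation summarises the direction and strength of any
# linear relationship between the two variables.
r <- cor(rob_demo, vio_demo)
print(r)
```

With the real data loaded, use data$rob and data$vio in place of the stand-in vectors.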
What do you think?
Positive/Negative/No Relationship?
Linear/Non-Linear?
Other examples:
You have been asked to offer an economic explanation of why a relationship may exist.
Meaning
Economic explanations focus on the costs and benefits of a particular behaviour to explain empirical patterns
There may be multiple explanations; one is fine!
There is no single “correct” explanation, so as long as the one you come up with makes sense, go with it.
However, if the explanation does not make sense you would lose marks in an assignment.
Example
We see a positive relationship between robbery rates and urban density.
This could potentially reflect:
The cost of robbery being lower in denser states, as potential robbery targets are more plentiful and in closer proximity.
The benefit of robbery being higher if denser locations attract more retail shops and merchants (the “agglomeration” benefits of urban density), which provide more opportunities for, and hence greater benefits from, robbery.
It being more difficult for police to identify potential robbers in more crowded places, which again lowers the expected cost of robbery since robbers are less likely to be caught.
Some useful rules for Summations
If \(\text{a}\) and \(\text{b}\) are constants and \(X\) and \(Y\) random variables then:
\(\qquad \sum\limits_{i=1}^n \text{a}X_i = \text{a}\sum\limits_{i=1}^n X_i\)
\(\qquad \sum\limits_{i=1}^n (X_i+Y_i) = \sum\limits_{i=1}^n X_i+\sum\limits_{i=1}^n Y_i\)
\(\qquad \sum\limits_{i=1}^n \text{b} = n\text{b}\)
\(\qquad \overline{X} = \dfrac{\sum\limits_{i=1}^n X_i}{n} \Leftrightarrow \sum\limits_{i=1}^n X_i=n\overline {X}\)
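The four summation rules can be checked numerically in R. The vectors and constants below are made up purely for illustration; they are chosen as whole numbers so that the exact equality checks are safe.

```r
# Small made-up vectors and constants, chosen only for illustration.
x <- c(2, 4, 6, 8)
y <- c(1, 3, 5, 7)
a <- 3
b <- 5
n <- length(x)

# Rule 1: a constant factors out of the sum.
rule1 <- sum(a * x) == a * sum(x)

# Rule 2: the sum of (x + y) equals the sum of x plus the sum of y.
rule2 <- sum(x + y) == sum(x) + sum(y)

# Rule 3: summing the constant b over n terms gives n * b.
rule3 <- sum(rep(b, n)) == n * b

# Rule 4: the sum of x equals n times the sample mean of x.
rule4 <- sum(x) == n * mean(x)

# Each entry is TRUE when the corresponding rule holds for these values.
print(c(rule1 = rule1, rule2 = rule2, rule3 = rule3, rule4 = rule4))
```

For non-integer data, all.equal() is a safer comparison than ==, because floating-point rounding can make two mathematically equal sums differ in the last digits.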
Useful rules for Expectation and Variance operators can be found in the solutions (HTML version).
Also see Lecture 2 slides 23-28.
Let’s go through an example of how these rules can be applied (see Part 2 Summation example 3 in the tutorial questions).
Show the following equality is true
\[\sum\limits_{i=1}^{n}\left(x_i - \bar{x} \right)^2 = \sum\limits_{i=1}^{n} x_i^2 - n\bar{x}^2 \] \[\begin{align}
\displaystyle \sum\left( x_i - \overline{x}\right)^2 &= \sum \left( x_i^2 - 2\overline x x_i + \overline {x}^2 \right) \tag{1}\\
&= \displaystyle \sum x_i^2 -\sum\left( 2 \overline{x} x_i \right) + \sum\left( \overline {x} ^2 \right) \tag{2}\\
&= \displaystyle \sum x_i^2 - 2 \overline {x } \sum x_i + n \overline {x}^2 \tag{3} \\
&= \displaystyle \sum x_i^2 - 2 \overline {x }n \overline {x}+n \overline {x}^2 \tag{4}\\
&= \displaystyle \sum x_i^2 - n \overline {x}^2
\end{align}\] Line (4) uses \(\sum x_i = n\overline{x}\). Equivalently, in line (3) you could multiply the term \(2 \overline {x } \sum x_i\) by \(\dfrac{n}{n}\), giving \(\displaystyle \sum x_i^2 - 2 n \overline {x } \frac{\sum x_i}{n} + n \overline {x}^2\), which leads to the same result as above.
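As a quick sanity check of the equality, both sides can be computed in R for a small made-up sample (the values of x below are illustrative only).

```r
# Made-up sample, chosen only for illustration.
x    <- c(3, 7, 8, 12, 15)
n    <- length(x)
xbar <- mean(x)

# Left-hand side: sum of squared deviations from the mean.
lhs <- sum((x - xbar)^2)

# Right-hand side: sum of squares minus n times the squared mean.
rhs <- sum(x^2) - n * xbar^2

print(c(lhs = lhs, rhs = rhs))

# The sample variance is the left-hand side divided by (n - 1).
print(c(from_identity = lhs / (n - 1), from_var = var(x)))
```

Both sides equal 86 for this sample, and dividing by \(n-1\) reproduces the value returned by var(x).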
In Part 2 Qn1, we have a random variable, \(X\), that is i.i.d. from a \(N(\mu_X,1)\) distribution and another random variable, \(Y\), defined as \(Y=2+2X\).
It turns out that \(Y \thicksim N(2+2\mu_X,4)\)
How did we get this?
In general (using the Expectation rules outlined in Lecture 2), if a random variable \(Y\) is a linear function of another random variable \(X\), such that \[ Y = a + bX \] then the mean of \(Y\) is \[\mu_Y = a + b \mu_X \] and the variance of \(Y\) is \[ \sigma_Y^2 = b^2 \sigma_X^2 \] In this case \(a=2\), \(b=2\) and \(\sigma_X^2=1\), so \(\mu_Y = 2+2\mu_X\) and \(\sigma_Y^2 = 2^2 \times1=4\).
A linear function of a normally distributed random variable is itself normally distributed, which is why \(Y\) is also normal.
If \(\mu_X=2\), \(5\) or \(10\), then the distribution of \(Y\) is \(N(6,4)\), \(N(12,4)\) or \(N(22,4)\), respectively.
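The mean and variance rules can also be checked by simulation. The sketch below assumes \(\mu_X = 5\) (one of the three cases above); the seed and the number of draws are arbitrary choices.

```r
# Simulate X ~ N(mu_x, 1) and transform it to Y = a + b * X.
set.seed(123)
a    <- 2
b    <- 2
mu_x <- 5
x <- rnorm(100000, mean = mu_x, sd = 1)
y <- a + b * x

# Theoretical mean and variance of Y from the rules above.
theory <- c(mean = a + b * mu_x, var = b^2 * 1)

# Sample mean and variance of the simulated Y values.
simulated <- c(mean = mean(y), var = var(y))

print(rbind(theory, simulated))
```

The simulated mean and variance should be close to the theoretical values of 12 and 4, with small differences due to sampling variation.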
The following graph plots the distributions of \(Y\), conditional on the three \(\mu_X\) values.
Larger values of \(\mu_X\) shift the distribution of \(Y\) to the right.
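A graph of this kind can be drawn with dnorm(). The following is a sketch only; the colours, grid range and legend text are illustrative choices, not necessarily those used for the tutorial's figure.

```r
# Parameters of Y = a + b * X when X ~ N(mu_x, 1).
a    <- 2
b    <- 2
mu_x <- c(2, 5, 10)
mu_y <- a + b * mu_x     # means of Y: 6, 12, 22
sd_y <- sqrt(b^2 * 1)    # standard deviation of Y: 2
cols <- c("orange", "red", "blue")

# Grid of Y values wide enough to show all three curves.
y_grid <- seq(0, 30, length.out = 400)

# Draw the first density, then add the other two on the same axes.
plot(y_grid, dnorm(y_grid, mean = mu_y[1], sd = sd_y),
     type = "l", col = cols[1], lwd = 2,
     ylim = c(0, 0.3),
     main = "Distribution of Y = 2 + 2X",
     xlab = "Y", ylab = "Density")
for (k in 2:3) {
  lines(y_grid, dnorm(y_grid, mean = mu_y[k], sd = sd_y),
        col = cols[k], lwd = 2)
}
legend("topright",
       legend = paste0("mu_X = ", mu_x, ": N(", mu_y, ", 4)"),
       col = cols, lwd = 2, cex = 0.8)
```

All three curves have the same spread because the variance of \(Y\) does not depend on \(\mu_X\); only the centre of the distribution moves.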
Study_Grade | High Grade | Medium Grade | Low Grade | Total |
---|---|---|---|---|
Study Hard | 0.20 | 0.10 | 0.02 | 0.32 |
Sometimes | 0.07 | 0.30 | 0.10 | 0.47 |
Never Study | 0.01 | 0.05 | 0.15 | 0.21 |
Total | 0.28 | 0.45 | 0.27 | 1.00 |
Study_Grade | High | Medium | Low | Total |
---|---|---|---|---|
Hard | joint | joint | joint | marginal |
Sometimes | joint | joint | joint | marginal |
Never | joint | joint | joint | marginal |
Total | marginal | marginal | marginal | 1.00 |
Marginal and Conditional Probabilities
The marginal distribution for studying is
P(Study Hard)= 0.32
P(Study Sometimes)= 0.47
P(Study Never) = 0.21
The marginal distribution for performance is
P(High Grade)= 0.28
P(Medium Grade)= 0.45
P(Low Grade) = 0.27
The probability distribution for performance, conditional on Studying Hard is
P(High Grade|Study Hard)= 0.20/0.32 = 0.625
P(Medium Grade|Study Hard)= 0.10/0.32=0.3125
P(Low Grade|Study Hard) = 0.02/0.32=0.0625
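The marginal and conditional probabilities above can be reproduced in R by storing the joint probabilities from the table as a matrix. The object names (joint, p_study, p_grade, p_grade_given_hard) are illustrative.

```r
# Joint probabilities from the table: rows are study habits, columns are grades.
joint <- matrix(c(0.20, 0.10, 0.02,
                  0.07, 0.30, 0.10,
                  0.01, 0.05, 0.15),
                nrow = 3, byrow = TRUE,
                dimnames = list(Study = c("Hard", "Sometimes", "Never"),
                                Grade = c("High", "Medium", "Low")))

# Marginal distributions: sum the joint probabilities across rows or columns.
p_study <- rowSums(joint)   # P(Study Hard), P(Study Sometimes), P(Study Never)
p_grade <- colSums(joint)   # P(High Grade), P(Medium Grade), P(Low Grade)

# Conditional distribution of grade given Study Hard:
# each joint probability in the "Hard" row divided by P(Study Hard).
p_grade_given_hard <- joint["Hard", ] / p_study["Hard"]

print(p_study)
print(p_grade)
print(p_grade_given_hard)
```

The conditional probabilities sum to one, as any probability distribution must.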
Statistical Independence
If, for example, Studying and Performance were independent, then the joint probability P(Study Hard, High Grade) would equal the product of the respective marginal probabilities,
P(Study Hard) \(\times\) P(High Grade).
Computing this product we get \(0.32 \times 0.28=0.0896\).
This is not equal to the joint probability P(Study Hard, High Grade), which is, from the table, \(0.20\).
Therefore, the random variables Studying and Performance are not independent.
You can pick any other joint probability and compare it with the product of the respective marginal probabilities; in this example you will reach the same conclusion.
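The comparison can be done for every cell at once: under independence, the whole joint table would equal the outer product of the two marginal distributions. The sketch below uses the joint probabilities from the table; the object names are illustrative.

```r
# Joint probabilities from the table: rows are study habits, columns are grades.
joint <- matrix(c(0.20, 0.10, 0.02,
                  0.07, 0.30, 0.10,
                  0.01, 0.05, 0.15),
                nrow = 3, byrow = TRUE,
                dimnames = list(Study = c("Hard", "Sometimes", "Never"),
                                Grade = c("High", "Medium", "Low")))

# Joint probabilities that WOULD hold if Studying and Performance were
# independent: product of the row marginal and the column marginal.
expected_if_indep <- outer(rowSums(joint), colSums(joint))

# Difference between the actual and the "independent" joint probabilities.
difference <- joint - expected_if_indep

print(round(expected_if_indep, 4))
print(round(difference, 4))

# Independence requires every difference to be zero (up to rounding).
independent <- isTRUE(all.equal(joint, expected_if_indep,
                                check.attributes = FALSE))
print(independent)
```

A single cell with a non-zero difference is enough to rule out independence; here no cell matches its product of marginals.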