Chapter 3 Block 2

3.1 Lesson 16: Introduction to Vectors I

3.1.1 Objectives

Describe the difference between a vector and a scalar.
Given a 2-D vector, draw that vector by hand on a set of coordinate axes.
Given a vector, compute the length (or magnitude) of that vector.
Perform vector addition, vector subtraction, and scalar multiplication by hand.
Graphically represent the sum or difference of two or more 2-D vectors by hand.

3.1.2 Reading

Section 3.1.

3.1.3 AD’s Note

Begin your lesson with an introduction to Block 2 – what are we doing in this block and why are we doing it? Students will see some of these linear algebra concepts in future Math courses and we are also building towards an understanding of how fitModel works.

3.1.4 In class

Intro. The course takes a bit of a pivot now. In this block we will introduce some basic vector and linear algebra concepts en route to a “behind-the-curtain” look into the fitModel function that we used in the Functions & Modeling block.
Vectors vs Scalars, Vector Length, and Vector Arithmetic. We’ll start with a basic intro to vectors and how they are different from scalars. Later, we will use R for vector operations, but today, cadets should focus on hand calculations. You should demonstrate how to find the length of a vector by using a 2-D example, and you should graphically demonstrate what it means to multiply a vector by a scalar and what it means to add two vectors together.
A Note on the Text. While the cadets should read the whole chapter, the most important material starts on page 267. Also, the text denotes vectors as \(\overline{v}\) (a letter with a bar overhead).

3.1.5 R Commands

Calculator functions only

3.1.6 Problems & Activities

Note: When you draw vectors, you can tell students that vectors can be represented with their base anywhere; it doesn’t matter. We will draw them anywhere that is convenient to get the point across.

Start with a simple example of a vector, say \(\overline{v} = \begin{pmatrix} 1 \\ 3 \end{pmatrix}\). Plot this on a set of axes on the board. Ask the cadets how they would find the length of this vector. Note that the length of a vector is denoted by double bars around the name of the vector:
```
blank.canvas(c(-3,4),c(-3,8))
place.vector(c(1,3))
mathaxis.on()
grid.on()
```
\[\left\| \overline{v}\ \right\| = \sqrt{1^{2} + 3^{2}} = \sqrt{10}\]
```
sqrt(10)
```
```
## [1] 3.162278
```
At this point, cadets can practice finding the length of vectors on their own. They can practice Exercises 29-36 in Section 3.1. Alternatively, you can get through the rest of vector arithmetic and defer this practice to the end.
Again consider the vector \(\overline{v} = \begin{pmatrix} 1 \\ 3 \end{pmatrix}\). We can multiply this vector by a scalar. To do so simply involves multiplying each element of the vector by that scalar. The result is a scaled version of the vector \(\overline{v}\). For example, let \(m = 2\). Plot \(\overline{v}\) and \(2\overline{v}\) on the same set of axes. What is the length of \(\overline{v}\)? What is the length of \(2\overline{v}\)?
```
place.vector(c(2,6),col="cadetblue")
place.vector(c(0.5,1.5),col="magenta")
```
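To check the two lengths asked for above, a minimal base-R sketch follows. The helper name `vec_length` is an illustrative assumption and not one of the course commands; cadets are still expected to do this by hand today.

```r
# Hypothetical helper (not a course command): Euclidean length of a vector
vec_length <- function(x) sqrt(sum(x^2))

v <- c(1, 3)
m <- 2

len_v  <- vec_length(v)       # sqrt(10)
len_mv <- vec_length(m * v)   # length of the scaled vector 2v

# Scaling a vector by m scales its length by |m|
isTRUE(all.equal(len_mv, abs(m) * len_v))
```

The same check with \(m = 0.5\) matches the magenta vector drawn above.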

Next consider the vectors \(\overline{u} = \begin{pmatrix} -2 \\ 3 \end{pmatrix}\) and \(\overline{w} = \begin{pmatrix} 1 \\ 2 \end{pmatrix}\). Plot these on the board and find \(\overline{u} + \overline{w}\) and \(\overline{u} - \overline{w}\). Also, represent these vectors graphically. For subtraction, note that \(\overline{u} - \overline{w} = \overline{u} + (-\overline{w})\). This makes it easier to visualize the difference between vectors.

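```
# v1 plays the role of u and v2 the role of w
v1=c(-2,3)
v2=c(1,2)

blank.canvas(c(-3,3),c(-1,6))
mathaxis.on()
grid.on()
place.vector(v1,col="magenta")
place.vector(v2,col="firebrick3")
place.text("$v_1$",-1.2,1.2)
place.text("$v_2$",0.4,0.4)

# draw each vector again, based at the tip of the other (parallelogram)
place.vector(v2,base=v1,col="firebrick3")
place.vector(v1,base=v2,col="magenta")

# the sum is the diagonal of the parallelogram
place.vector(v1+v2,col="dodgerblue")
place.text("$v_1+v_2$",-1,3.5)

# the difference v2-v1 runs from the tip of v1 to the tip of v2
place.vector(v2-v1,base=v1,col="forestgreen")
place.text("$v_2-v_1$",0.3,2.6)
```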

For the rest of the time, cadets should do practice: Section 3.1 exercises 29-36, 41-48, and 57-64.

3.2 Lesson 17: Intro to Vectors II

3.2.1 Objectives

Use R to define a vector.
Use R to perform vector addition, vector subtraction, and scalar multiplication.
Given a vector field and an input point, evaluate the vector field at the specified point and draw the 2-D output vector on a set of coordinate axes by hand, with the output vector originating from the input point.
Use R to plot a given vector field.

3.2.2 Reading

Section 3.1.

3.2.3 In class

Vector Arithmetic. Spend about 10 minutes reviewing basic vector arithmetic (length, addition, subtraction, scalar multiplication) and graphical representation in 2-D.
Vectors in R. Demonstrate how to define vectors in R, and how to perform the basic vector arithmetic operations in R.
Vector Fields. Discuss how a 2-D vector field can be used to describe a process that behaves differently based on the values of the variables. Think of the flow of water. This anticipates the Block 6 topic of systems of differential equations in 2 dimensions. Emphasize that a vector field is another type of function, where the output is a vector.

3.2.4 R Commands

c, basic arithmetic in R, plotVectorField

3.2.5 Problems & Activities

Consider the vectors \(\overline{u} = \begin{pmatrix} -1 \\ 1 \end{pmatrix}\), \(\overline{v} = \begin{pmatrix} 4 \\ 0 \end{pmatrix}\) and \(\overline{w} = \begin{pmatrix} 1 \\ 2 \end{pmatrix}\). Plot these on the board. Each cadet should plot and find the following:
1. \(\overline{u} + \overline{v}\)
  
  \[\overline{u} + \overline{v} = \begin{pmatrix} -1 \\ 1 \end{pmatrix} + \begin{pmatrix} 4 \\ 0 \end{pmatrix} = \begin{pmatrix} -1 + 4 \\ 1 + 0 \end{pmatrix} = \begin{pmatrix} 3 \\ 1 \end{pmatrix}\]
2. \(2\overline{v} + \overline{w}\)
3. \(-\overline{w} + 3\overline{u}\)
4. \(\left\| \overline{u} \right\|\)
5. \(\left\| 2\overline{w} \right\|\)
We can use our fancy calculator (R) to conduct vector arithmetic. Consider the vectors from #1 above. First, we should define these in R. Creating a vector in R is easy. You simply use the c function (the c stands for combine) to combine a list of numbers into a vector.
```
u = c(-1,1)
v = c(4,0)
w = c(1,2)
```
Now we can use these vectors defined in R to perform the same vector operations from #1:
```
u + v
```
```
## [1] 3 1
```
```
2*v + w
```
```
## [1] 9 2
```
```
-w + 3*u
```
```
## [1] -4  1
```
```
sqrt(sum(u^2))
```
```
## [1] 1.414214
```
```
sqrt(sum((2*w)^2))
```
```
## [1] 4.472136
```
You might need to take some time explaining the operation for finding the length of a vector in R.
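One way to explain the length calculation is to split `sqrt(sum(u^2))` into its three steps. The intermediate names `squares`, `total`, and `len_u` below are illustrative only.

```r
u <- c(-1, 1)

squares <- u^2           # element-wise squares: 1 1
total   <- sum(squares)  # sum of the squares: 2
len_u   <- sqrt(total)   # square root of the sum: the length of u

len_u
```

This mirrors the hand calculation \(\sqrt{(-1)^{2} + 1^{2}}\) one operation at a time.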
The next topic is a bit of a pivot; the purpose is to point out that functions can have vectors as outputs as well. A vector field is a function that takes in more than one input and outputs a vector of the same dimension. These are often meant to model processes that flow, like liquids. Consider the vector field \(\overline{F}(x,y) = \begin{pmatrix} x + y \\ x - y \end{pmatrix}\) from Exercise 95 in Section 3.1. Demonstrate how to sketch the vector field “by hand”.

On the board, or using R, do the following (students are not expected to know place.vector, but it is fine for you to use it in class):
```
blank.canvas(c(-5,5),c(-5,5))
mathaxis.on()
grid.on()
```
First, evaluate \(F(1,0)=\begin{pmatrix} 1 \\ 1 \end{pmatrix}\). So, we should draw the vector \(\begin{pmatrix} 1 \\ 1 \end{pmatrix}\) with base at the input point \(\begin{pmatrix} 1 \\ 0 \end{pmatrix}\).
```
place.vector(c(1,1),base=c(1,0),col="cadetblue")
```
Now evaluate \(F(2,0) = \begin{pmatrix} 2 \\ 2 \end{pmatrix}\), so plot:
```
place.vector(c(2,2),base=c(2,0),col="cadetblue")
```
Evaluate \(F(1,1)= \begin{pmatrix} 2 \\ 0 \end{pmatrix}\), so plot:
```
place.vector(c(2,0),base=c(1,1),col="cadetblue")
```
Evaluate \(F(0,2)=\begin{pmatrix} 2 \\ -2 \end{pmatrix}\), so plot:
```
place.vector(c(2,-2),base=c(0,2),col="cadetblue")
```
Evaluate \(F(-1,-2)=\begin{pmatrix} -3 \\ 1 \end{pmatrix}\), so plot
```
place.vector(c(-3,1),base=c(-1,-2),col="cadetblue")
```
If we repeated this process over and over from different input vectors, we would have a complete picture of the vector field.
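The hand evaluations can be checked with a plain R function. This is a sketch in base R; the name `F_field` is an assumption chosen for illustration and is separate from the makeFun definition used with plotVectorField.

```r
# Base-R stand-in for the vector field F(x, y) = (x + y, x - y)
F_field <- function(x, y) c(x + y, x - y)

# Input points used in the by-hand sketch
pts <- list(c(1, 0), c(2, 0), c(1, 1), c(0, 2), c(-1, -2))

for (p in pts) {
  out <- F_field(p[1], p[2])
  cat("F(", p[1], ",", p[2], ") = (", out[1], ",", out[2], ")\n")
}
```

Each output vector is drawn with its base at the corresponding input point.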
Next, demonstrate using plotVectorField. Define \(F\) like this:
```
F=makeFun(c(x+y,x-y)~x&y)

plotVectorField(F(x,y)~x&y,xlim=c(-5,5),ylim=c(-5,5))
mathaxis.on(); grid.on()
```
plotVectorField takes the normal optional arguments (xlim, ylim, col, lwd, etc).

Note that plotVectorField scales down the vectors that it draws so that they don’t overlap each other, but still give some sense of length. That is, if two vectors are drawn the same length by plotVectorField, they indeed have the same length. If one vector is drawn shorter than another, then the first vector really is shorter than the second.
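To illustrate the idea of shrinking every vector by a common factor, here is a rough base-R sketch. It is not how plotVectorField is implemented; the grid spacing, the `shrink` factor, and the variable names are all assumptions made for this example.

```r
# Base-R sketch of a scaled vector field plot for F(x, y) = (x + y, x - y)
grid <- expand.grid(x = seq(-5, 5, by = 1), y = seq(-5, 5, by = 1))
dx <- grid$x + grid$y   # first component of F at every grid point
dy <- grid$x - grid$y   # second component of F at every grid point

# Shrink every vector by the same factor so arrows do not overlap;
# a common factor preserves relative lengths
shrink <- 0.9 / max(sqrt(dx^2 + dy^2))
keep <- (dx != 0) | (dy != 0)   # skip the zero vector at the origin

plot(NA, xlim = c(-6, 6), ylim = c(-6, 6), xlab = "x", ylab = "y", asp = 1)
arrows(grid$x[keep], grid$y[keep],
       grid$x[keep] + shrink * dx[keep], grid$y[keep] + shrink * dy[keep],
       length = 0.05)
```

Because one factor is applied to every arrow, an arrow drawn twice as long as another corresponds to a vector that is twice as long.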
Section 3.1, exercises 49-56, 65-72, 90-95

3.3 Lesson 18: Linear Combinations

3.3.1 Objectives

Describe the definition of linear combination and what vector operations can be included in a linear combination.
Compute a specified linear combination using given vectors and scalars, by hand and in R.
Rewrite a system of equations as an equivalent vector equation (and vice-versa).
Given a vector equation, solve for the scalars of the linear combination by hand.
Determine the size (rows and columns) of a given matrix.
Compute the product of a vector and a matrix, or determine that the product is not defined.

3.3.2 Reading

Section 3.2.

3.3.3 In class

Motivating Problem. Start with Example 1 in Section 3.2. We would like to build a linear model fitting Twitter users to year. This means determining whether a linear combination of a vector of year values and a vector of ones exists that equals a vector of Twitter users.
Linear Combinations. Introduce the concept of linear combinations as the sum of multiple vectors scaled by constants. We have already dealt with linear combinations in this block. Note that a linear combination only applies to vectors of the same dimension. If vectors differ in dimension, we cannot add them.
Vector Equations. Next, introduce the concept of vector equations. Our motivating example was a vector equation: a linear combination of multiple vectors set equal to another vector. For simple examples, demonstrate how to find the solution of a vector equation.
Matrix Equations. Now go over how a vector equation can be re-written as a matrix equation. This will likely require a review of what a matrix is and the rules of matrix multiplication.

3.3.4 R Commands

c, basic arithmetic in R

3.3.5 Problems & Activities

Example 1, Section 3.2. The table contains five data points representing year and number of Twitter users. We think this is a linear relationship. To show this, we can plot the points by hand on the board or in R.
```
Year = c(11,11.25,11.5,11.75,12.75)
Users = c(68,85,101,117,185)
plotPoints(Users~Year)
```
In the last block, we used the fitModel function to determine the “best fit” line to describe this relationship, but how did the function obtain that model?

We need values m and b such that \(Users = m\cdot Year + b\), for all pairs of users and year. Write this in equation form (see page 281, example 1). Note that we can write this in vector form too:

\[\begin{pmatrix} 68 \\ 85 \\ 101 \\ 117 \\ 185 \end{pmatrix} = m\begin{pmatrix} 11 \\ 11.25 \\ 11.5 \\ 11.75 \\ 12.75 \end{pmatrix} + b\begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{pmatrix}\]

This is known as a vector equation. The right side of the equation is a linear combination. Before we explore this example further, we need to discuss these two terms in greater detail.
A linear combination is the sum of multiple vectors scaled by constants. Have them work through exercises 5-8 and 13-16. This will serve as review of vector arithmetic and will reinforce the concept of linear combinations.
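A linear combination can also be computed directly in R with `c` and basic arithmetic. The vectors and scalars below are made up for illustration and are not taken from the text's exercises.

```r
# Illustrative vectors and scalars (not from the text)
u <- c(1, 2)
w <- c(-3, 0)
a <- 2
b <- -1

lin_comb <- a * u + b * w   # 2*(1,2) + (-1)*(-3,0) = (5, 4)
lin_comb
```

R would recycle or warn if `u` and `w` had different lengths, which is a reminder that a linear combination requires vectors of the same dimension.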
For our purposes, a vector equation is an equality between linear combinations of vectors. Our opening example with the Twitter data is a vector equation. For illustration consider the following vector equation:

\(\begin{pmatrix} 5 \\ 1 \end{pmatrix} = a\begin{pmatrix} 2 \\ 0 \end{pmatrix} + b\begin{pmatrix} -1 \\ 1 \end{pmatrix}\)

The solutions to this equation are the values \(a\) and \(b\) that, together, satisfy this equation. Note that we can re-write this as a system of equations:

\[5 = 2a + (-b)\]

\[1 = 0a + b\]

Thus, \(b = 1\) and \(a = 3\).
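A quick way to confirm a proposed solution is to substitute it back into the vector equation. A minimal sketch in base R, with `lhs` and `rhs` as illustrative names:

```r
a <- 3
b <- 1
lhs <- c(5, 1)
rhs <- a * c(2, 0) + b * c(-1, 1)   # (6, 0) + (-1, 1) = (5, 1)

all(lhs == rhs)   # TRUE when a and b solve the vector equation
```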
We are going to define a matrix to be a grouping of columns. So we’ll think of

\[\begin{pmatrix} 1 & 4 \\ 6 & 0 \end{pmatrix}\]

as a matrix holding two vectors, \(\left( \begin{array}{r} 1 \\ 6 \end{array} \right)\) and \(\left( \begin{array}{r} 4 \\ 0 \end{array} \right)\).

Similarly, \[\begin{pmatrix} 4 & 6 \\ 3 & 3 \\ 2 & 2 \end{pmatrix}\]

has two column vectors placed side by side in it: \(\left( \begin{array}{r} 4 \\ 3 \\ 2 \end{array} \right)\) and \(\left( \begin{array}{r} 6 \\ 3 \\ 2 \end{array} \right)\).
We are going to define the matrix-vector product like this. Suppose \(A\) has columns \(u_{1}\), \(u_{2}\), \(u_{3}\), \(\ldots\) (these are vectors; the overbars are omitted here).

So, \(A = \begin{pmatrix} u_{1} & u_{2} & u_{3} \end{pmatrix}\)

Then the matrix-vector product \(Ax\), where \(x\) has entries \(x_{1}\), \(x_{2}\), \(x_{3}\), is defined to be

\(Ax = x_{1}u_{1} + x_{2}u_{2} + x_{3}u_{3},\) generalized of course for however many columns \(A\) has.

Do several examples like: \[\begin{pmatrix} 1 & 3 \\ 7 & 5 \end{pmatrix}\begin{pmatrix} 2 \\ 3 \end{pmatrix} = 2\begin{pmatrix} 1 \\ 7 \end{pmatrix} + 3\begin{pmatrix} 3 \\ 5 \end{pmatrix} = \begin{pmatrix} 2 \\ 14 \end{pmatrix} + \begin{pmatrix} 9 \\ 15 \end{pmatrix} = \begin{pmatrix} 11 \\ 29 \end{pmatrix}\]

You’ll need to do several more.
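For instructors who want to check examples quickly, the column definition can be compared with R's built-in matrix product. Note that the `%*%` operator is base R but is not among this lesson's listed commands, so treat its use here as an optional aside.

```r
A <- matrix(c(1, 7, 3, 5), nrow = 2, ncol = 2)   # columns (1,7) and (3,5)
x <- c(2, 3)

by_columns  <- x[1] * A[, 1] + x[2] * A[, 2]   # definition: linear combination of columns
by_operator <- as.vector(A %*% x)              # R's built-in matrix product

by_columns
by_operator
```

Both lines give the vector (11, 29) from the worked example.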

Some things to point out:
1. Only \(Ax\) is defined. \(xA\) is not the same thing.
2. \(Ax\) is a vector. Not a scalar, not a matrix.
3. You can write things down that don’t make sense. For instance,
  
  \[\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}\]
  
  is not defined. You’ll need the number of columns of \(A\) to match the number of rows of \(x\) in order for that product to be defined. Talk about a few examples.
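The size mismatch in the example above can be demonstrated in R. This sketch uses base R's `%*%` and `tryCatch`, neither of which is a listed command for this lesson; the variable name `result` is illustrative.

```r
A <- diag(2)        # the 2 x 2 identity matrix from the example
x <- c(1, 2, 3)

dim(A)              # 2 rows, 2 columns
length(x)           # 3 entries, so columns of A (2) do not match entries of x (3)

# Capture the error instead of stopping, so the message can be shown in class
result <- tryCatch(A %*% x, error = function(e) conditionMessage(e))
result              # an error message rather than a product
```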
Once you have the definition of the matrix-vector product, it follows directly from that definition that our original problem:

Find \(a\), \(b\) such that

\(\begin{pmatrix} 5 \\ 1 \end{pmatrix} = a\begin{pmatrix} 2 \\ 0 \end{pmatrix} + b\begin{pmatrix} -1 \\ 1 \end{pmatrix}\)

is equivalent to the problem written in matrix form:

Find \(a\), \(b\) such that \[\begin{pmatrix} 2 & -1 \\ 0 & 1 \end{pmatrix}\left( \begin{array}{r} a \\ b \end{array} \right) = \left( \begin{array}{r} 5 \\ 1 \end{array} \right)\]
They should end class by practicing Exercises 23-30 in Section 3.2. Emphasize that when we multiply a matrix by a vector, it is equivalent to a linear combination of the vectors that make up the matrix.

If there is still time, they can go back to our motivating example and re-write the vector equation as a matrix equation:

\[\begin{pmatrix} 68 \\ 85 \\ 101 \\ 117 \\ 185 \end{pmatrix} = \begin{pmatrix} 11 & 1 \\ 11.25 & 1 \\ 11.5 & 1 \\ 11.75 & 1 \\ 12.75 & 1 \end{pmatrix}\begin{pmatrix} m \\ b \end{pmatrix}\]

3.4 Lesson 19: Matrix Equations

3.4.1 Objectives

Rewrite a vector equation as an equivalent matrix equation (and vice-versa).
In R, use the matrix command to define a matrix.
Rewrite a system of equations as an equivalent matrix equation (and vice-versa).
In R, use the solve command to find the solution of a given matrix equation.

3.4.2 Reading

Section 3.2.

3.4.3 In class

Review. Remind them of the relationship between systems of linear equations, vector equations, and matrix equations. Demonstrate with a simple example with unknown coefficients. In this lesson, we will use R to solve systems of linear equations. This will require us to build matrices in R.
Matrices in R. Go over how to solve a matrix equation and note how this requires finding the matrix’s inverse. The text goes over how to do this for simple two-equation, two-unknown, systems. However, we are not requiring our cadets to do this by hand. Instead, they should be able to convert a system of linear equations to matrix form and use R to solve. Show them how to build a matrix in R using matrix.
Solving Systems in R. Demonstrate the solve function in R and have them practice using it on example problems. End the lesson with an example that returns an error. This happens because there is no solution to the system. We will explore that more in-depth next time.

3.4.4 R Commands

c, matrix, solve

3.4.5 Problems & Activities

Start with a simple system of equations:

\[-3x + y = 15\]

\[2x + 4y = 18\]

For review, have them write this in vector equation and matrix equation form:

\[\begin{pmatrix} 15 \\ 18 \end{pmatrix} = x\begin{pmatrix} -3 \\ 2 \end{pmatrix} + y\begin{pmatrix} 1 \\ 4 \end{pmatrix}\]

\[\begin{pmatrix} 15 \\ 18 \end{pmatrix} = \begin{pmatrix} -3 & 1 \\ 2 & 4 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}\]

What values of \(x\) and \(y\) satisfy this vector/matrix equation?

We can think of this algebraically; our next logical step would be to “divide” both sides of the equation by the matrix, leaving us with just the vector \(\begin{pmatrix} x \\ y \end{pmatrix}\). But we can’t really “divide” by a matrix. Rather, we would multiply both sides by the inverse of the matrix. The text discusses how to do this for \(2 \times 2\) matrices, but we will rely on R.

The solve function takes two inputs: 1) a square matrix representing the coefficients of the linear system of equations; and 2) a vector representing the outputs of the linear system. It returns a vector that represents the solution to that system.

In order to use this, we need to know how to define a matrix:
```
U = matrix(c(-3,2,1,4),nrow=2,ncol=2)  # values are listed column by column
```
OR
```
u1 = c(-3,2)
u2 = c(1,4)
U = matrix(c(u1,u2),nrow=2,ncol=2)
```
The first argument of this function is a vector of numbers that will be in our matrix. The function will arrange these values in the specified number of rows and columns working by column. So when you read in the values, make sure you read by column and not across by rows.
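The column-by-column fill is easy to get wrong, so it can help to print the matrix both ways. A small sketch follows; `byrow` is an existing argument of `matrix`, while the names `by_col` and `by_row` are illustrative.

```r
vals <- c(-3, 2, 1, 4)

by_col <- matrix(vals, nrow = 2, ncol = 2)                # default: fill down columns
by_row <- matrix(vals, nrow = 2, ncol = 2, byrow = TRUE)  # fill across rows instead

by_col   # columns are (-3, 2) and (1, 4): the coefficient matrix of the system
by_row   # rows are (-3, 2) and (1, 4): a different matrix
```

Printing the matrix and comparing it with the system of equations is a quick check before calling solve.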

Now we can use the solve function to obtain a solution:
```
v = c(15,18)
solve(U,v)
```
```
## [1] -3  6
```
This returns the vector -3 6. This means that the values \(x = -3\) and \(y = 6\) form a solution to this system.
Most of the rest of class time should be dedicated to practice, using Exercises 65-74, and 83-86.
Finish with an example with no solution. The target vector is not a linear combination of the columns of the matrix.

\[\begin{pmatrix} 2 \\ 3 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 & 1 & 2 \\ 1 & 1 & 2 \\ 2 & 0 & 0 \end{pmatrix}\begin{pmatrix} x_{1} \\ x_{2} \\ x_{3} \end{pmatrix}\]
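A sketch of this closing example in R follows. It wraps solve in base R's `tryCatch` (not a course command) so the error message is captured rather than halting a script; the exact wording of the message may vary by R version, so it is not reproduced here.

```r
A <- matrix(c(0, 1, 2,   # first column
              1, 1, 0,   # second column
              2, 2, 0),  # third column (twice the second)
            nrow = 3, ncol = 3)
v <- c(2, 3, 1)

result <- tryCatch(solve(A, v), error = function(e) conditionMessage(e))
result   # an error message, because A is singular
```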

3.5 Lesson 20: Existence of Linear Combinations

3.5.1 Objectives

Use errors returned by the solve command to determine whether or not a unique solution exists to a system of equations.
Given a vector equation, using R, determine whether a linear combination of the vectors exists to satisfy the vector equation.
Given a set of data, write a vector equation that represents a linear model fitting the data perfectly. Determine if that vector equation has a solution or not and what can be concluded about the dataset.

3.5.2 Reading

Section 3.3.

3.5.3 Course Director’s Note

Now that we are several lessons into Block 2, I recommend you provide students a look ahead. There is no quiz this block, but since it’s our shortest block, GR 2 is only 3 weeks away. Project 2 is due in about 2 weeks, and we are not dedicating any class time towards working on Project 2.

3.5.4 In class

Note on the Text. This section is titled “Existence of Linear Combinations” but is approaching this from two different angles. One is when there is no solution to a system of linear equations (think \(x + y = 1\) and \(x + y = 2\)). The other context is when data points do not fall exactly on the line. We are more concerned with the second context. In this lesson, we are essentially verifying that observed data points do not fall exactly on a proposed model.
Review of solve Function. Start with some practice on using R to find solutions to systems of linear equations.
No Solutions. Note that when certain types of error result from using the solve function (see page 301), there is no solution to the system.

Technically speaking, the error message

Error in solve.default(M, v2) : Lapack routine dgesv: system is exactly singular: U[2,2] = 0

means only that the problem does not have a unique solution. The same error message occurs whether you try to solve

\[x + y = 2\] \[2x + 2y = 4\]

which has infinitely many solutions, or if you try to solve

\[x + y = 2\] \[2x + 2y = 5\]

which has no solutions. Many will remember from high school that linear systems of equations either have exactly one solution, no solutions, or infinitely many solutions. That error message means that R thinks we are in one of the last two cases.
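The two systems above can be run side by side to show that solve does not distinguish between them. This is a sketch using base R's `tryCatch`; the helper name `try_solve` is hypothetical.

```r
M <- matrix(c(1, 2, 1, 2), nrow = 2, ncol = 2)   # coefficients of x + y and 2x + 2y

# Hypothetical helper: return the solution, or the error message if solve fails
try_solve <- function(rhs) {
  tryCatch(solve(M, rhs), error = function(e) conditionMessage(e))
}

msg_infinite <- try_solve(c(2, 4))   # system with infinitely many solutions
msg_none     <- try_solve(c(2, 5))   # system with no solutions

identical(msg_infinite, msg_none)    # same message in both cases
```

The message depends only on the coefficient matrix, which is the same in both systems.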
Linear Combinations and Data. The rest of this section is dedicated to verifying that datasets do not usually fit an exact model. Return to our motivating example. Introduce the term “target vector” for the vector of user counts. We can use R to fit a line that connects two of our data points; however, we can quickly verify that this conjectured line does not pass through our other points. Therefore, no perfect solution exists.

3.5.5 R Commands

c, matrix, solve

3.5.6 Problems & Activities

Start with a review of solving systems of linear equations with solve. Have them take a few minutes practicing exercises 36-40 in Section 3.2.
Now that they have practiced dealing with systems of linear equations in R, transition to other such systems where there may or may not be a solution. Start with an example (say Exercise 45 in Section 3.3). Then have them work through some exercises from 37-52 in this section.
Let’s revisit our motivating example from a couple lessons ago. We were given year and number of users (in millions) and we would like to fit a line modeling this relationship. We had expressed our data in matrix equation form:

\[\begin{pmatrix} 68 \\ 85 \\ 101 \\ 117 \\ 185 \end{pmatrix} = \begin{pmatrix} 11 & 1 \\ 11.25 & 1 \\ 11.5 & 1 \\ 11.75 & 1 \\ 12.75 & 1 \end{pmatrix}\begin{pmatrix} m \\ b \end{pmatrix}\]

We can refer to the vector of user counts as the target vector, which in the text is often denoted by \(\overline{v}\). We want to know whether some linear combination of the year vector (often denoted by \(\overline{u}\)) and the vector of ones (denoted by \(\overline{1}\)) equals the target vector. In other words, do any \(m\) and \(b\) exist that exactly fit this data? Does any straight line exactly fit this data? Well, any such line would have to go through the first two points, so let’s find an \(m\) and \(b\) that satisfy:

\[\begin{pmatrix} 68 \\ 85 \end{pmatrix} = \begin{pmatrix} 11 & 1 \\ 11.25 & 1 \end{pmatrix}\begin{pmatrix} m \\ b \end{pmatrix}\]
```
solve(matrix(c(11,11.25,1,1),2,2),c(68,85))
```
```
## [1]   68 -680
```
This gives us \(m = 68\) and \(b = -680\). For now, we can speculate that our model is

\[Users(Year) = 68Year - 680\]

If this model is accurate, then each other point will lie directly on this line. So \(Users(11.5)\) should be equal to exactly 101.
```
11.5*68-680
```
```
## [1] 102
```
This gives us 102. Thus, no line exists that goes exactly through all five of the points observed. In other words, there is no linear combination of the year vector and the intercept vector that gives us the users vector.
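The same check can be run on all five points at once. A minimal sketch in base R, where `predicted` and `misfit` are illustrative names:

```r
Year  <- c(11, 11.25, 11.5, 11.75, 12.75)
Users <- c(68, 85, 101, 117, 185)

predicted <- 68 * Year - 680    # line through the first two points
misfit    <- Users - predicted  # zero wherever a point lies on the line

data.frame(Year, Users, predicted, misfit)
```

Only the first two rows have a misfit of zero, so the line misses the remaining three points.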

This presents a problem. We still want to linearly model the relationship between Year and Users, but no exact line exists. So we need one that is “close”. In order to determine what line is best for a given set of data, we will use the method of least squares. In order to use that, we need to understand the concepts of projection and dot product.
Exercises 75-83, 87-89 (exponential example) in Section 3.3.

3.6 Lesson 21: Dot Product

3.6.1 Objectives

Given two vectors with the same number of components, find their dot product by hand and by using R.
Describe the length of a vector as being equivalent to the square root of the dot product of the vector with itself.
Use the dot product to determine whether two vectors are orthogonal to one another.

3.6.2 Reading

Section 3.4.

3.6.3 In class

Review of Motivating Problem. Very rarely in modeling do data fit exactly according to some proposed model. When no linear combination of the \(\overline{u}\) and \(\overline{1}\) vectors equals the target vector \(\overline{v}\), then we need to find something “close”. To do this, we need an understanding of dot products and projections. Today, we will talk about dot products and orthogonality and next time we will cover projections.
Dot Product. Introduce the dot product as a way to “multiply” two vectors together. (Another is cross product, but we won’t cover that in this course). Emphasize that the dot product of two vectors is a scalar. Cadets will often ask what this value represents. One generalized interpretation of the dot product is as a measure of how “related” two vectors are.
Length. Demonstrate that the square root of the dot product of a vector with itself is equivalent to the length of the vector.
Orthogonality. Go over the definition of orthogonality: two vectors are orthogonal when their dot product is zero. We can think of orthogonal vectors as being perpendicular to one another.

3.6.4 R Commands

dot, c, arithmetic

3.6.5 Problems & Activities

Start with a review of our motivating problem. There is no linear combination of \(\overline{u}\) and \(\overline{1}\) that equals our target vector \(\overline{v}\).

\[\begin{pmatrix} 68 \\ 85 \\ 101 \\ 117 \\ 185 \end{pmatrix} = \begin{pmatrix} 11 & 1 \\ 11.25 & 1 \\ 11.5 & 1 \\ 11.75 & 1 \\ 12.75 & 1 \end{pmatrix}\begin{pmatrix} m \\ b \end{pmatrix}\]

Our goal now is to find values \(m\) and \(b\) that get us “close” to the target vector \(\overline{v}\). In order to do this, we will need to get a “projection” of our target vector onto the space composed of linear combinations of \(\overline{u}\) and \(\overline{1}\). In order to understand this idea, we need to start with a discussion of dot products.
Demonstrate the dot product as the sum of the element-wise products of two vectors. Note that we cannot take the dot product of two vectors of different dimensions.

Example: Find the following dot product:

\[\begin{pmatrix} 1 \\ -2 \\ 2 \end{pmatrix} \cdot \begin{pmatrix} 4 \\ 0 \\ -1 \end{pmatrix}\]

This involves simply multiplying corresponding elements and summing the results:

\[(1 \times 4) + (-2 \times 0) + (2 \times -1) = 2\]

We can do this in R in a couple ways:
```
a = c(1,-2,2)
b = c(4,0,-1)

sum(a*b)
```
```
## [1] 2
```
```
dot(a,b)
```
```
## [1] 2
```
Now show that the length of a vector is equivalent to the square root of the dot product of the vector with itself. Use the vector \(\overline{a} = \begin{pmatrix} 1 \\ -2 \\ 2 \end{pmatrix}\):

\[\left\| \overline{a} \right\| = \sqrt{1^{2} + (-2)^{2} + 2^{2}} = \sqrt{9} = 3\]

\[\sqrt{\overline{a} \cdot \overline{a}} = \sqrt{1^{2} + (-2)^{2} + 2^{2}} = \sqrt{9} = 3\]
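The two calculations can be compared in R using only `c` and arithmetic. The names `len_direct` and `len_dot` are illustrative.

```r
a <- c(1, -2, 2)

len_direct <- sqrt(sum(a^2))    # length from the definition
len_dot    <- sqrt(sum(a * a))  # square root of the dot product of a with itself

c(len_direct, len_dot)
```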
Note that if two vectors have a dot product of 0, we consider them orthogonal. We can demonstrate this graphically (on the board) with two-dimensional vectors \(\overline{u} = \begin{pmatrix} 1 \\ 2 \end{pmatrix}\) and \(\overline{v} = \begin{pmatrix} -1 \\ 1/2 \end{pmatrix}\).
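The orthogonality check for these two vectors takes one line in R. The sketch below uses `sum(u*v)` for the dot product; the name `dot_uv` is illustrative.

```r
u <- c(1, 2)
v <- c(-1, 1/2)

dot_uv <- sum(u * v)   # (1)(-1) + (2)(1/2)
dot_uv == 0            # TRUE: u and v are orthogonal
```

For vectors with non-terminating decimal entries, comparing with `all.equal(dot_uv, 0)` is a safer test than `==` because of rounding.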
The rest of the class should be devoted to practice. Exercises 1-24, Section 3.4.

3.7 Lesson 22: Vector Projection

3.7.1 Objectives

Given two non-zero vectors, compute the projection of one vector onto the other, by hand and in R, or describe why this is not possible.
Describe a projection as the closest possible linear combination of one vector to the other.
Calculate the residual vector between a vector and its projection.
Given two 2-D non-zero vectors, sketch the projection and residual of one vector onto the other by hand.

3.7.2 Reading

Section 3.4.

3.7.3 In class

Review of Motivating Problem. Again, very rarely in modeling do data fit exactly according to some proposed model. When no linear combination of the \(\overline{u}\) and \(\overline{1}\) vectors equals the target vector \(\overline{v}\), then we need to find something “close”. Today, we will talk about projections of one vector onto one other vector; next time we will talk about projections of one vector onto two others.
Review of Dot Products. Spend a few minutes allowing them to practice a few dot product operations.
Projections. Discuss the definition and interpretation of a projection of a vector \(\overline{v}\) onto another vector \(\overline{u}\). Draw a picture in 2-D and emphasize that a projection of \(\overline{v}\) onto \(\overline{u}\) is a scalar multiple of the vector \(\overline{u}\). We are finding the linear combination \(x\overline{u}\) that is closest to \(\overline{v}\). Note that we can do this “by hand” or by using the project function in R.
Residual. The difference between the vector \(\overline{v}\) and its projection \(x\overline{u}\) is known as the residual vector. Represent it graphically and demonstrate how to find it “by hand” and in R.

3.7.4 R Commands

project, dot, c, arithmetic

3.7.5 Problems & Activities

Start with a review of our motivating problem. There is no linear combination of \(\overline{u}\) and \(\overline{1}\) that equals our target vector \(\overline{v}\).

\[\begin{pmatrix} 68 \\ 85 \\ 101 \\ 117 \\ 185 \end{pmatrix} = \begin{pmatrix} 11 & 1 \\ 11.25 & 1 \\ 11.5 & 1 \\ 11.75 & 1 \\ 12.75 & 1 \end{pmatrix}\begin{pmatrix} m \\ b \end{pmatrix}\]

Our goal now is to find values \(m\) and \(b\) that get us “close” to the target vector \(\overline{v}\). In order to do this, we will need to get a “projection” of our target vector onto the space composed of linear combinations of \(\overline{u}\) and \(\overline{1}\).
In order to find projections, we need to understand dot products. A dot product is a measure of how two vectors are related and can be found by multiplying corresponding elements of vectors and summing the products. Select a couple of Exercises from 1-24 in Section 3.4 and give them a few minutes to practice.

The projection of \(v\) onto \(u\) is defined to be the multiple of \(u\) that is closest to being \(v\). Draw a figure on the board or projector with a \(v\) vector, which is your target, and a \(u\) vector. All you are allowed to “make” are multiples of \(u\), which would be vectors along the same line as \(u\) but with varying lengths. Which multiple of \(u\) is closest to being \(v\)? It turns out (state this without proof) that the multiple of \(u\) that we want is \(x\overline{u}\) where \(x = \frac{\overline{v} \cdot \overline{u}}{\overline{u} \cdot \overline{u}}\). Graphically demonstrate this by letting \(\overline{v} = \begin{pmatrix} 3 \\ 4 \end{pmatrix}\) and \(\overline{u} = \begin{pmatrix} 2 \\ 1 \end{pmatrix}\) (see page 321).

blank.canvas(xlim=c(-5,5),ylim=c(-1,5))
mathaxis.on()

u=c(2,1)
v=c(3,4)
place.vector(v)
place.text("$v$",x=1,y=1.9)

place.vector(1.8*u,col="gray47",lty=3)
place.text("$1.8u$",x=1.5*1.8*1,1.5*1.8*0.4)

place.vector(2.4*u,col="gray47",lty=3)
place.text("$2.4u$",x=1.8*2.4*1,1.8*2.4*0.4)

place.vector(-0.75*u,col="gray47",lty=3)
place.text("$-0.75u$",x=-1.8*0.75*1+0.8,-1.8*0.75*0.4)

place.vector(u,col="cadetblue")
place.text("$u$",x=1,0.4)

blank.canvas(xlim=c(-5,5),ylim=c(-1,5))
mathaxis.on()

place.vector(v)
place.text("$v$",x=1,y=1.9)

place.vector(1.8*u,col="firebrick")
place.text("$1.8u$",x=1.5*1.8*1,1.5*1.8*0.4)

r=v-1.8*u
place.vector(r,base=1.8*u,lty=3,col="gray47")
place.text("$r=v-1.8u$",x=3.5,y=2.5)
place.text(paste0("$||r||=",round(sqrt(sum(r^2)),2),"$"),x=3.5,y=2.1)

place.vector(u,col="cadetblue")
place.text("$u$",x=1,0.4)

blank.canvas(xlim=c(-5,5),ylim=c(-1,5))
mathaxis.on()

place.vector(v)
place.text("$v$",x=1.6,y=1.9)

place.vector(2.4*u,col="firebrick")
place.text("$2.4u$",x=1.5*1.8*1,1.5*1.8*0.4)

r=v-(2.4*u)
place.vector(r,base=2.4*u,lty=3,col="gray47")
place.text("$r=v-2.4u$",x=4.1,y=3.3)
place.text(paste0("$||r||=",round(sqrt(sum(r^2)),2),"$"),x=4.1,y=3.0)

place.vector(u,col="cadetblue")
place.text("$u$",x=1,0.4)

blank.canvas(xlim=c(-5,5),ylim=c(-1,5))
mathaxis.on()

place.vector(v)
place.text("$v$",x=1.6,y=1.9)

place.vector(-0.75*u,col="firebrick")
place.text("$-0.75u$",x=-1.8*0.75*1+0.8,-1.8*0.75*0.4)

r=v-(-0.75*u)
place.vector(r,base=-0.75*u,lty=3,col="gray47")
place.text("$r=v-(-0.75)u$",x=-1.2,y=1)
place.text(paste0("$||r||=",round(sqrt(sum(r^2)),2),"$"),x=-1,y=0.5)

place.vector(u,col="cadetblue")
place.text("$u$",x=1,0.4)

blank.canvas(xlim=c(-5,5),ylim=c(-1,5))
mathaxis.on()

place.vector(v)
place.text("$v$",x=1.6,y=1.9)

x=dot(u,v)/dot(u,u)

place.vector(x*u,col="firebrick")
place.text("$xu$",x=3.0,1.2)
place.text("$x=\\frac{u\\cdot v}{u\\cdot u}$",x=3,y=0.4)
r=v-(x*u)
place.vector(r,base=x*u,lty=3,col="gray47")
place.text("$r=v-(x)u$",x=4,y=3)

place.text(paste0("$||r||=",round(sqrt(sum(r^2)),2),"$"),x=4,y=2.6)

place.vector(u,col="cadetblue")
place.text("$u$",x=1,0.4)
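The residual lengths shown in the figures can also be compared numerically. The following is a minimal base R sketch, assuming the same \(\overline{u}\) and \(\overline{v}\) as in the plots; the names `multipliers`, `resid_len`, and `best` are illustrative only.

```r
u <- c(2, 1)
v <- c(3, 4)

# The multiplier given by the projection formula.
x <- sum(u * v) / sum(u * u)

# Candidate multiples of u: the three trial values from the plots, plus x.
multipliers <- c(-0.75, 1.8, 2.4, x)

# Length of the residual v - k*u for each candidate k.
resid_len <- sapply(multipliers, function(k) sqrt(sum((v - k * u)^2)))
names(resid_len) <- multipliers

# The candidate with the shortest residual.
best <- multipliers[which.min(resid_len)]
```

Among these four candidates, the multiplier from the formula gives the shortest residual, which is the point the sequence of figures is making.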

Another way to write the expression for a projection is:

\[\left( \frac{\overline{v} \cdot \overline{u}}{\left\| \overline{u} \right\|} \right)\frac{\overline{u}}{\left\| \overline{u} \right\|}\]

The first part of this expression is the (signed) length of the projected vector: the dot product (a measure of how related two vectors are) scaled by the length of \(\overline{u}\). The second part is the unit vector in the direction of \(\overline{u}\).
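If cadets ask why the two forms agree, the connection is that \(\left\| \overline{u} \right\|^{2} = \overline{u} \cdot \overline{u}\). Written out, using only the symbols already defined:

\[\left( \frac{\overline{v} \cdot \overline{u}}{\left\| \overline{u} \right\|} \right)\frac{\overline{u}}{\left\| \overline{u} \right\|} = \frac{\overline{v} \cdot \overline{u}}{\left\| \overline{u} \right\|^{2}}\,\overline{u} = \frac{\overline{v} \cdot \overline{u}}{\overline{u} \cdot \overline{u}}\,\overline{u} = x\overline{u}\]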

Using either expression, find the projection of \(\overline{v}\) onto \(\overline{u}\).

v = c(3,4)
u = c(2,1)

dot(v,u)/dot(u,u)*u

## [1] 4 2

u.len = sqrt(dot(u,u))

dot(v,u)/u.len * u/u.len

## [1] 4 2

x = project(v~u)

x*u

## [1] 4 2

Whichever method is used, the projection turns out to be the vector \(\begin{pmatrix} 4 \\ 2 \end{pmatrix}\).

Note the syntax of the project function. You enter a tilde expression with the target vector first. The function returns \(x\), the scalar multiplier of \(\overline{u}\).

IMPORTANT: the projection of \(\overline{u}\) onto \(\overline{v}\) IS NOT the same as the projection of \(\overline{v}\) onto \(\overline{u}\).
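A quick numerical check can make this concrete. The sketch below uses base R only, with `sum(a * b)` standing in for `dot(a, b)`; the variable names are illustrative.

```r
u <- c(2, 1)
v <- c(3, 4)

# Projection of v onto u: a multiple of u.
proj_v_on_u <- sum(v * u) / sum(u * u) * u

# Projection of u onto v: a multiple of v, so it points along v instead.
proj_u_on_v <- sum(u * v) / sum(v * v) * v
```

The first result is \(\begin{pmatrix} 4 \\ 2 \end{pmatrix}\), as found earlier, while the second is approximately \(\begin{pmatrix} 1.2 \\ 1.6 \end{pmatrix}\). The numerators match, but the denominators and the vector being scaled do not.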

The residual vector is denoted by \(\overline{r}\) and is the difference between the target vector and its projection onto \(\overline{u}\):

\[\overline{r} = \overline{v} - x\overline{u}\]

Find the residual vector for the projection of \(\overline{v}\) onto \(\overline{u}\):

\[\overline{r} = \begin{pmatrix} 3 \\ 4 \end{pmatrix} - \begin{pmatrix} 4 \\ 2 \end{pmatrix} = \begin{pmatrix} -1 \\ 2 \end{pmatrix}\]

Using R:
```
proj.vu <- dot(v,u)/dot(u,u)*u

v-proj.vu
```
```
## [1] -1  2
```
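One property worth checking here, since Lesson 23 relies on it, is that the residual is orthogonal to \(\overline{u}\). A minimal base R sketch, again using `sum(a * b)` in place of `dot(a, b)`:

```r
u <- c(2, 1)
v <- c(3, 4)

# Projection of v onto u and the resulting residual.
proj_vu <- sum(v * u) / sum(u * u) * u
r <- v - proj_vu

# Orthogonality check: the dot product of r and u should be zero.
r_dot_u <- sum(r * u)
```

Here \(\overline{r} \cdot \overline{u} = (-1)(2) + (2)(1) = 0\), so the residual meets \(\overline{u}\) at a right angle.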
Exercises 39-54 in Section 3.4.

3.8 Lesson 23: The Method of Least Squares I

3.8.1 Objectives

Use the method of least squares to obtain the best-fit linear model for a given set of data.
Given a dataset and a linear model, compute the residual vector between the data and the model.
Given a residual vector for a linear model, compute the length of the residual vector and the associated RMSE.

3.8.2 Reading

Section 3.5.

3.8.3 In class

Review of Motivating Problem. When no linear combination of the \(\overline{u}\) and \(\overline{1}\) vectors equals the target vector \(\overline{v}\), then we need to find something “close”. Today, we will use what we’ve learned to find the “best” solution to our problem.
Least Squares Method. The book gets straight to the equation for finding the values \(m\) and \(b\) that best fit our data. Be sure to build up from last time, when we projected a target vector \(\overline{v}\) onto one other vector \(\overline{u}\). This time we would like to find the linear combination of \(\overline{u}\) and \(\overline{1}\) that is closest to \(\overline{v}\). I would point them straight to page 336, figure 3. Rather than projecting \(\overline{v}\) onto one other vector, we are projecting \(\overline{v}\) onto the “model space”, which is a fancy way of saying all possible linear combinations of the year vector and \(\overline{1}\). Spend some time on how we get to the matrix form on page 337. However, note that we are NOT requiring them to memorize or apply the matrix methodology in the text. When asked to find the \(m\) and \(b\) that give a best fit, they will use project.
Residual Vector. Given a best fit linear model, demonstrate how to obtain the residual vector. No model is perfect, and we can think of this as a vector of “errors”.
Length of Residual Vector. The text does not discuss measures of model fit. The length of the residual vector gives us a measure of this model’s fit. However, it needs to be scaled based on how many observations we have. This leads us to root mean squared error (RMSE). Clarify that even though this is not in the text, they need to be able to obtain the RMSE of a linear model. We will not introduce \(R^{2}\) in this course.
Practice. We have two days on this topic, so if you don’t get to RMSE, it can wait until next time, along with practice.

3.8.4 R Commands

project, dot, c, plotPoints, plotFun, arithmetic

3.8.5 Problems & Activities

Start with a review of our motivating problem. There is no linear combination of \(\overline{u}\) and \(\overline{1}\) that equals our target vector \(\overline{v}\).

\[\begin{pmatrix} 68 \\ 85 \\ 101 \\ 117 \\ 185 \end{pmatrix} = \begin{pmatrix} 11 & 1 \\ 11.25 & 1 \\ 11.5 & 1 \\ 11.75 & 1 \\ 12.75 & 1 \end{pmatrix}\begin{pmatrix} m \\ b \end{pmatrix}\]

Our goal now is to find values \(m\) and \(b\) that get us “close” to the target vector \(\overline{v}\). In order to do this, we will need to get a “projection” of our target vector onto the space composed of linear combinations of \(\overline{u}\) and \(\overline{1}\).
Obtaining a projection of a target vector onto a model space requires a little more explanation. Following along with the text, we need to find \(m\) and \(b\) that yield \(\overline{y} = m\overline{x} + b\overline{1} + \overline{r}\), where \(\overline{r}\) is the residual vector, which is orthogonal to both \(\overline{x}\) and \(\overline{1}\). (Note that the text switches from \(\overline{v}\) and \(\overline{u}\) to \(\overline{y}\) and \(\overline{x}\).) We can take the dot product of both sides of this expression with \(\overline{x}\) and then repeat with \(\overline{1}\). This gives us a system of two equations in two unknowns, \(m\) and \(b\):

\[\overline{y} \cdot \overline{x} = m\left( \overline{x} \cdot \overline{x} \right) + b\left( \overline{1} \cdot \overline{x} \right) + \overline{r} \cdot \overline{x}\]

\[\overline{y} \cdot \overline{1} = m\left( \overline{x} \cdot \overline{1} \right) + b\left( \overline{1} \cdot \overline{1} \right) + \overline{r} \cdot \overline{1}\]

But again, \(\overline{r}\) is orthogonal to both \(\overline{x}\) and \(\overline{1}\), so \(\overline{r} \cdot \overline{x}\) and \(\overline{r} \cdot \overline{1}\) both equal 0. Thus, we need \(m\) and \(b\) such that:

\[\overline{y} \cdot \overline{x} = m\left( \overline{x} \cdot \overline{x} \right) + b\left( \overline{1} \cdot \overline{x} \right)\]

\[\overline{y} \cdot \overline{1} = m\left( \overline{x} \cdot \overline{1} \right) + b\left( \overline{1} \cdot \overline{1} \right)\]

This system of two equations and two unknowns can be expressed in matrix form and solved using methods from earlier in this block. However, we can use the project function as a shortcut.
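For instructors who want to check the shortcut against the two equations above, here is a minimal base R sketch for the motivating data. It assumes `sum(a * b)` as a stand-in for `dot(a, b)` and base R's `solve` for the 2×2 system; the names `ones`, `A`, `rhs`, and `coefs` are illustrative only and not from the text.

```r
Year  <- c(11, 11.25, 11.5, 11.75, 12.75)   # plays the role of x
Users <- c(68, 85, 101, 117, 185)           # plays the role of y
ones  <- rep(1, length(Year))               # the vector of 1s

# Coefficient matrix of the two equations, one row per equation.
A <- rbind(c(sum(Year * Year), sum(ones * Year)),
           c(sum(Year * ones), sum(ones * ones)))

# Left-hand sides: y . x and y . 1.
rhs <- c(sum(Users * Year), sum(Users * ones))

# Solve the 2x2 system for m and b.
coefs <- solve(A, rhs)
names(coefs) <- c("m", "b")
```

The resulting `m` and `b` should agree, up to rounding, with the values that project reports for the same data.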

For our example, find the values \(m\) and \(b\) that result in the best linear fit to our data.

Year = c(11,11.25,11.5,11.75,12.75)
Users = c(68,85,101,117,185)

project(Users~Year+1)

## (Intercept)        Year 
##  -666.63699    66.76712

The best fit linear model is \(Users = -666.637 + 66.767 \cdot Year\).

Note that the same model is obtained using fitModel:

fitModel(Users~b+m*Year)

## function (Year, ..., transformation = function (x) 
## x) 
## return(transformation(predict(model, newdata = data.frame(Year = Year), 
##     ...)))
## <environment: 0x000001fd0b06e660>
## attr(,"coefficients")
##          b          m 
## -666.63699   66.76712 
## attr(,"class")
## [1] "nlsfunction" "function"

We can visualize this using plotPoints and plotFun.

plotPoints(Users~Year)
plotFun(-666.63699+66.76712*Year~Year,add=TRUE)

Find the residual vector for our model.
```
r = Users-(-666.63699+66.76712*Year)
```
Note that this vector is simply a vector of errors. In order to summarize the fit of this model, we can find the length of this vector, or more appropriately, the root mean squared error:

\[RMSE = \sqrt{\frac{\left\| \overline{r} \right\|^{2}}{n}} = \frac{\left\| \overline{r} \right\|}{\sqrt{n}}\]

where \(n\) is the number of observations in the dataset (for this example, 5).
```
r.length = sqrt(dot(r,r))

rmse = r.length/sqrt(5)

rmse
```
```
## [1] 0.4951823
```
The resulting value (0.4952) gives a measure of error on the same scale as our data. In our case, 0.4952 is a remarkably low RMSE given that the observed values of “Users” range from 68 to 185.
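Hard-coding \(n = 5\) works for this example, but a small helper that reads \(n\) from the residual vector is less error-prone when cadets move on to larger datasets. A minimal sketch in base R; `rmse_of` is a hypothetical helper name, not part of the course's packages, and the coefficients are the rounded values reported above.

```r
Year  <- c(11, 11.25, 11.5, 11.75, 12.75)
Users <- c(68, 85, 101, 117, 185)

# Residuals from the fitted line, using the rounded coefficients.
r <- Users - (-666.63699 + 66.76712 * Year)

# RMSE = ||r|| / sqrt(n), with n taken from the length of r.
rmse_of <- function(r) sqrt(sum(r^2) / length(r))

rmse <- rmse_of(r)   # approximately 0.495 for this data
```

The same helper can be reused unchanged for any residual vector, since it never needs the number of observations typed in by hand.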
Summarize. What is it that fitModel is doing? It is projecting a vector of observations (the things I want to predict) onto a line or plane (depending on how many parameters you have) that is made out of vectors of the predictor variables. Projection is an idea that is lurking about in many areas of mathematics and statistics. The RMSE is one possible measure of how good our model is.

3.9 Lesson 24: The Method of Least Squares II

3.9.1 Objectives

Given a dataset, represent the method of least squares (“target problem”) as a vector equation and as a matrix equation.
Evaluate a least squares model at a given input and interpret the meaning of the answer.
Use residual vectors and/or RMSEs to compare the accuracy of multiple linear models for the same dataset and determine which linear model is the best-fit.

3.9.2 Reading

Section 3.5.

3.9.3 In class

Review & Catch-up. Use the first part of class to catch up if necessary and to review the concepts from last time.
Least Squares Practice. The remainder of class should be dedicated to practice obtaining best linear models. They will practice using simple tabular data (like the motivating example) as well as using built-in datasets.

3.9.4 R Commands

project, dot, c, plotPoints, plotFun, arithmetic

3.9.5 Problems & Activities

Section 3.5, Exercises 69-72. In addition to building the model and plotting the fit, obtain the RMSE.

project(Rate~Months+1,data=MonthlyUnemployment)

## (Intercept)      Months 
## 10.17802260 -0.07053626

plotPoints(Rate~Months,data=MonthlyUnemployment)

plotFun(10.178-0.0705*Months~Months,add=TRUE)

10.178-0.0705*36

## [1] 7.64

At month 36, we expect an unemployment rate of 7.64%.

10.178-0.0705*65

## [1] 5.5955

At month 65, we expect an unemployment rate of about 5.60%.

r = MonthlyUnemployment$Rate-(10.178-0.0705*MonthlyUnemployment$Months)

r.length = sqrt(dot(r,r))

rmse = r.length/sqrt(60)
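The third objective asks cadets to compare linear models by RMSE. One way to illustrate this, sketched in base R with the Lesson 23 data: compare the best-fit line against an alternative line. The alternative coefficients (`m = 60`, `b = -590`) are made up purely for illustration, and `rmse_line` is a hypothetical helper name.

```r
Year  <- c(11, 11.25, 11.5, 11.75, 12.75)
Users <- c(68, 85, 101, 117, 185)

# RMSE of the line Users = m*Year + b against the observed data.
rmse_line <- function(m, b) {
  r <- Users - (m * Year + b)
  sqrt(sum(r^2) / length(r))
}

rmse_best <- rmse_line(66.76712, -666.63699)  # least squares fit from Lesson 23
rmse_alt  <- rmse_line(60, -590)              # made-up alternative line

# The model with the smaller RMSE fits the data more closely.
better <- if (rmse_best < rmse_alt) "least squares" else "alternative"
```

Because least squares minimizes the length of the residual vector, no other line can produce a smaller RMSE on the same data.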

Section 3.5 Exercises 15-22, 73-96.