Mathematical intuition behind Dot product & Cosine similarity in vector databases
In my previous article, we explored the definition of a vector and how to calculate the Euclidean distance between two vectors. In this article, we will delve into the mathematical intuition behind the dot product and cosine similarity, which, alongside Euclidean distance, are commonly used for similarity search in vector databases.
What is the cos function?
Let us start by understanding the cos function. Take a right-angled triangle ABC with an angle θ, a hypotenuse c, a side a opposite to θ and a side b adjacent to θ. The value of cos θ is calculated as Adjacent/Hypotenuse:
$$cos\;\theta\;=b/c$$
Definition of the Law of Cosines
According to the law of cosines, the square of one side of a triangle equals the sum of the squares of the other two sides minus twice the product of those two sides and the cosine of the angle between them.
For the triangle ABC above, where θ is the angle between sides b and c, the law of cosines gives the square of side a as:
$$a^2=b^2+c^2-2bc\;cos\;\theta$$
Similarly, the formulas for sides b and c each use the angle opposite that side. Writing β for the angle opposite b and γ for the angle opposite c:
$$b^2=a^2+c^2-2ac\;cos\;\beta$$
$$c^2=a^2+b^2-2ab\;cos\;\gamma$$
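As a quick numerical check of the law of cosines, here is a minimal Python sketch. The function name `third_side` is made up for this illustration; it takes the two sides that enclose an angle and returns the length of the side opposite that angle.

```python
import math

def third_side(b: float, c: float, theta_deg: float) -> float:
    """Length of the side opposite the angle theta (in degrees),
    given the two sides b and c that enclose that angle."""
    theta = math.radians(theta_deg)
    return math.sqrt(b**2 + c**2 - 2 * b * c * math.cos(theta))

# With a 90 degree angle the cosine term vanishes and the
# formula reduces to the Pythagorean theorem (result is close to 5).
print(third_side(3, 4, 90))

# Two equal sides enclosing 60 degrees form an equilateral triangle
# (result is close to 2).
print(third_side(2, 2, 60))
```

The results are floating point values, so they may differ from the exact answers in the last few digits.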
Magnitude of a vector
To understand the dot product of two vectors, it is important to first understand the magnitude of a vector. The magnitude of a vector is sometimes also called its absolute value, and it is nothing but the vector's length. For a vector a, the magnitude is written as |a|.
For a vector in two dimensions, as in the following diagram, the magnitude of
$$a=(a_1,a_2)\;\;\text{is}\;\; |a|=\sqrt{a_1^2 +a_2^2 }$$
For a vector in three dimensions, the magnitude of
$$a=(a_1,a_2,a_3)\;\;\text{is}\;\; |a|=\sqrt{a_1^2 +a_2^2 +a_3^2 }$$
For a vector in N dimensions, the equation generalizes so that the magnitude of
$$a=(a_1,a_2,\ldots,a_N)\;\;\text{is}\;\; |a|=\sqrt{a_1^2 +a_2^2 +a_3^2 +\ldots + a_N^2}$$
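The magnitude formula translates almost directly into code. The following is a small Python sketch, with `magnitude` as an illustrative helper name, that works for a vector with any number of components.

```python
import math

def magnitude(v):
    """Euclidean length of a vector given as a sequence of numbers:
    the square root of the sum of the squared components."""
    return math.sqrt(sum(x * x for x in v))

print(magnitude([3, 4]))     # sqrt(9 + 16) = 5.0
print(magnitude([1, 2, 2]))  # sqrt(1 + 4 + 4) = 3.0
```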
Algebraic definition of Dot Product of two vectors
One way to multiply two vectors is the dot product. The result of this multiplication is a single number, that is, a scalar.
Let's take an example of two vectors
$$A=(a_1,a_2,a_3)\;\text{and}\;B=(b_1,b_2,b_3)$$
The dot product of the above two vectors is computed by multiplying the first component of A with the first component of B, the second component of A with the second component of B, and the third component of A with the third component of B, and then adding the products together.
So the dot product of vectors A and B can be calculated as
$$A .B =a_1b_1+a_2b_2+a_3b_3$$
If the vectors have N components, the dot product can be computed with the following formula, where i is the index that runs over the components of the two vectors
$$A .B =\sum_{i=1}^{N}a_ib_i$$
The above equation is called the algebraic definition of the dot product.
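The algebraic definition can be sketched in a few lines of Python. The helper name `dot` and the length check are choices made for this illustration.

```python
def dot(a, b):
    """Algebraic dot product: multiply matching components and add the results."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of components")
    return sum(x * y for x, y in zip(a, b))

print(dot([1, 2, 3], [4, 5, 6]))  # 1*4 + 2*5 + 3*6 = 32
```

The length check matters because `zip` silently stops at the shorter input, which would otherwise hide a mismatch in dimensions.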
Geometric definition of Dot Product of two vectors
The dot product of two vectors a and b with magnitudes |a| and |b| is given by |a||b| cos θ, where θ is the angle between the vectors a and b.
We can express the geometric dot product of two-dimensional vectors a and b as
$$a.b=|a||b|\; cos \;\theta$$
where |a| and |b| are the magnitudes of the two vectors.
Now, what is the dot product of a vector with itself? It is
$$\overrightarrow{a}\;.\;\overrightarrow{a}\;=\;|\overrightarrow{a}|^2$$
This is because the angle between a vector and itself is 0° and cos 0° = 1.
Let's now derive the geometric dot product formula
$$a.b=|a||b|\; cos \;\theta$$
using the law of cosines.
In the following diagram we have two vectors, a and b, with angle θ between them. Together with the vector c = a - b they form a triangle whose side lengths are the magnitudes |a|, |b| and |c|.
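Now the dot product of c with itself is the square of its magnitude, and by the law of cosines applied to the triangle with sides |a|, |b| and |c|:
$$c.c=|c|^2=|a|^2+|b|^2-2|a||b|\;cos\;\theta$$
The above equation is the law of cosines that was stated earlier in this article.
Since c=a-b, the dot product of the two vectors a and b can be derived by substituting c=a-b into the cosine equation:
$$(a-b).(a-b)=|a|^2+|b|^2-2|a||b|\;cos\;\theta$$
$$a.a-2\,a.b+b.b=|a|^2+|b|^2-2|a||b|\;cos\;\theta$$
$$|a|^2-2\,a.b+|b|^2=|a|^2+|b|^2-2|a||b|\;cos\;\theta$$
$$-2\,a.b=-2|a||b|\;cos\;\theta$$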
So the geometric definition of the dot product of two vectors a and b is:
$$a.b=|a||b|\;cos\;\theta$$
In words, the dot product of vectors a and b is the product of their magnitudes multiplied by the cosine of the angle between them. This equation is called the geometric definition of the dot product.
The value of cos θ, which is the cosine similarity of the two vectors, can thus be calculated by rearranging the above equation:
$$cos\;\theta =\frac{a.b}{|a||b|}$$
and the angle θ can be calculated using:
$$\theta = cos^{-1} \left( \frac{a.b}{|a|.|b|}\right)$$
where a.b is the dot product of vectors a and b, and |a| and |b| are the magnitudes of vectors a and b respectively.
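Putting the two formulas together, here is a minimal Python sketch of cosine similarity and the angle between two vectors. The function names are illustrative, and the zero-vector check and the clamping step are defensive choices made for this sketch, not part of the formulas themselves.

```python
import math

def dot(a, b):
    """Algebraic dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def magnitude(v):
    """Length of a vector, using the fact that v.v = |v|^2."""
    return math.sqrt(dot(v, v))

def cosine_similarity(a, b):
    """cos(theta) = a.b / (|a||b|)."""
    denom = magnitude(a) * magnitude(b)
    if denom == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot(a, b) / denom

def angle_degrees(a, b):
    """theta = arccos(a.b / (|a||b|)), returned in degrees."""
    # Clamp to [-1, 1] so floating point rounding cannot push the
    # value outside the domain of acos.
    c = max(-1.0, min(1.0, cosine_similarity(a, b)))
    return math.degrees(math.acos(c))

print(angle_degrees([1, 0], [0, 1]))      # perpendicular vectors: about 90 degrees
print(cosine_similarity([1, 2], [2, 4]))  # same direction: very close to 1
```

Vectors that point in the same direction have a cosine similarity close to 1, perpendicular vectors have 0, and vectors pointing in opposite directions have -1.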
Compute angle θ between two vectors in two dimensions
Suppose we have two vectors, written in terms of the unit vectors i and j along the x and y axes
$$A = 3i + 5j \;\; \text{and } \; B = 4i +8j$$
To find cos θ between A and B we will use the following equation, which was derived earlier:
$$cos\;\theta =\frac{a.b}{|a||b|}$$
We first calculate the dot product between them :
$$A⋅B=(3)(4)+(5)(8)=12+40=52$$
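and then calculate the magnitudes |A| and |B|:
$$|A|=\sqrt{3^2+5^2}=\sqrt{34}\approx 5.831$$
$$|B|=\sqrt{4^2+8^2}=\sqrt{80}\approx 8.944$$
Then we calculate the value of cos(θ):
$$cos\;\theta=\frac{52}{\sqrt{34}\;\sqrt{80}}\approx\frac{52}{52.154}\approx 0.9971$$
We get cos(θ) ≈ 0.9971.
To find the value of θ we use the following equation
$$\theta = cos^{-1} \left( \frac{a.b}{|a|.|b|}\right)$$
The value we get for θ is ≈ 4.4°.
So for the vectors
$$A = 3i + 5j \;\; \text{and } \; B = 4i +8j$$
the value of cos(θ) ≈ 0.9971 and θ ≈ 4.4°.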
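The worked example can be reproduced with a short Python sketch. Representing the vectors as tuples of their i and j components is a choice made for this illustration.

```python
import math

# A = 3i + 5j and B = 4i + 8j as (i, j) component tuples
A = (3, 5)
B = (4, 8)

dot_ab = A[0] * B[0] + A[1] * B[1]  # (3)(4) + (5)(8)
mag_a = math.hypot(*A)              # sqrt(3^2 + 5^2) = sqrt(34)
mag_b = math.hypot(*B)              # sqrt(4^2 + 8^2) = sqrt(80)

cos_theta = dot_ab / (mag_a * mag_b)
theta_deg = math.degrees(math.acos(cos_theta))

print(dot_ab)               # 52
print(round(cos_theta, 3))  # 0.997
print(round(theta_deg, 1))  # 4.4
```

A value of cos θ this close to 1 shows that the two vectors point in almost the same direction.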
Closing Notes
Now that we've explored the mathematical intuition behind the dot product and cosine similarity, along with how to calculate the angle between two vectors, the next article will look at how similarity searches in vector databases leverage these concepts.
Stay tuned and thank you very much for reading.