Demystifying the standard basis.

Abstract

This post contains mental scaffolding I documented to ease the transition from coordinate-based linear algebra to coordinate-free linear algebra. In particular, I look at the distinction between vectors and their coordinate representations, their conflation in the standard basis, and how this can cause conceptual issues when working with arbitrary bases and polynomial vector spaces.

As a starting point to motivate these notes, when we are first introduced to coordinate-based linear algebra, say in secondary school, we are told that “vectors $v \in \R^n$ are columns of numbers”, possibly with the somewhat mysterious caveat that we are working in the “standard basis”.

When transitioning to coordinate-free linear algebra, where one considers vectors as abstract objects obeying vector space axioms, and where vectors are no longer necessarily in $\R^n$, this conflation, if not explicitly addressed, can cause conceptual issues downstream.

Abstract vectors, bases, and coordinate representations.

In transitioning to coordinate-free linear algebra, we shift viewpoints from conceiving of vectors as being arrays of numbers, to conceiving of vectors as being any abstract object that obeys vector space properties.

We have an abstract vector $v$ in a vector space $V$ of dimension $n$, where the vector space is defined over a field $\F$ of scalars, which may be real or complex.

This vector $v \in V$ is an object that can be defined without using a basis of the vector space $V$. The two main examples we will look at are ordered lists in $\F^n$, and polynomials in $\mathcal{P}_{n-1}(\F)$.

Other examples of vectors in vector spaces, although not finite-dimensional vector spaces, are continuous functions in one real variable in $C(\R)$, and also infinite sequences in $\F^{\infty}$.

As these examples show, broadening the idea of a vector in this way allows us to bring a wider class of objects than previously expected under linear algebraic study - objects like polynomials, which most likely would not have been conceived of as vectors prior to this formalism.

That a vector $v \in V$ can be defined without reference to a basis lies at the heart of the coordinate-free approach. Simply put, it encapsulates the following important idea,

Abstract vectors, as well their behaviour under linear maps and transformations, can be described without adopting a specific coordinate reference frame.

As an example, take some vector $v \in V$, which can be visualised as an arrow pointing freely in space, with its tail rooted at an origin point. Applying the linear operator $Tv = 2v$ stretches this arrow by a factor of 2 in the direction it points. Notice – no coordinates, no basis. We do not need to overlay a coordinate grid on the space in which the arrow resides to describe this transformation.

However, if we need to do any kind of numerical computations, then defining a specific coordinate reference frame via a basis becomes essential.

We now define a basis $B = \ls{v}{1}{n}$ of the vector space $V$. By definition, these vectors are linearly independent, and span $V$, and because $V$ has dimension $n$, the basis consists of $n$ vectors. This allows us to uniquely express our vector $v \in V$ via scalar coefficients $a_k \in \F$ as a linear combination of these basis vectors,

\[v = \lc{a}{v}{1}{n}. \tag{1}\]

Doing so is referred to as “expanding $v$ in basis $B$”, and is one of the most fundamental acts one can perform in linear algebra.

The scalar coefficient $a_k \in \F$ of the $k$th basis vector is defined as the $k$th coordinate. In practice, as we will see in the later sections, these coordinates are often the outcome of solving a system of equations.

For computations involving vectors and matrices as data structures, we can collect the $n$ coordinates from the basis expansion into a column vector $[v]_B \in \F^{n \times 1}$, where $\F^{n \times 1}$ is the vector space of $n \times 1$ dimensional “column matrices”,

\[\cv{v}{B} = \begin{bmatrix} a_1\\ \vdots \\ a_n \end{bmatrix} \in \F^{n \times 1}.\tag{2}\]

$\cv{v}{B}$ is defined as the coordinate vector of $v$ with respect to the basis $B$. Collecting the $n$ coordinates from the basis expansion into a coordinate vector is referred to as “reading off the coordinates of $v$ in basis $B$”.

Coordinate vectors like $[v]_B$ reside in coordinate vector space. We’ve chosen to represent $[v]_B$ as a column vector in $\F^{n \times 1}$, whereas some texts represent $[v]_B$ as an ordered list $(a_1, \dots, a_n)$ in $\F^n$. A common convention is to denote coordinate vector space as $\F^n$, and yet represent coordinate vectors $[v]_B$ as column vectors rather than ordered lists.

The key takeaway here is that coordinate vector space can be either $\F^n$ or $\F^{n \times 1}$, and one needs to be mindful of context and notational convention in the text one is working in to avoid confusion.

From hereon, we will follow standard convention and represent coordinate vectors $\cv{v}{B}$ as column vectors, and denote coordinate vector space containing these column vectors as $\F^n$.

We now have a distinction between an abstract vector $v$, and its coordinate vector $[v]_B$ relative to basis $B$, and also between an abstract vector space $V$ and coordinate vector space $\F^n$. As a slogan,

Vectors are not their coordinate representations. For a vector to have a coordinate representation, we must first choose a basis.

This is fairly conventional in many linear algebra textbooks such as Axler (2024) and Friedberg et al. (2014). However, in moving from coordinate-based to coordinate-free perspectives, we will scrutinise this distinction in more detail.

In my own experience, moving between abstract vectors and their coordinate representations for the first few times, across coordinate-free and coordinate-based linear algebra texts, can feel tricky. This is primarily because some coordinate-based texts assume the standard basis, but do not always make it explicit that they are doing so.

These difficulties can be remedied by explicitly defining the relationship using a coordinate map from the abstract vector space $V$ to coordinate vector space $\F^n$, following Roman (2008).

Coordinate maps.

In expanding the vector $v$ in basis $B$ in $(1)$ and reading off the coordinates to get $(2)$, we are defining the mapping,

\[\lc{a}{v}{1}{n} \mapsto \begin{bmatrix} a_1\\ \vdots \\ a_n \end{bmatrix}. \tag{3}\]

Since $v = \lc{a}{v}{1}{n}$ by the basis expansion, we can write this more compactly as the mapping $v \mapsto [a_1, \dots, a_n]^{\top}$. Given a basis $B$, we can name this mapping as the coordinate map $\phi_B = [\cdot]_B$ from our abstract vector space $V$ to coordinate vector space $\F^n$,

\[\begin{aligned} &\phi_B = [\cdot]_B: V \rightarrow \F^n\\ &\phi_B(v) = [v]_B = \begin{bmatrix} a_1\\ \vdots \\ a_n \end{bmatrix}. \tag{4} \end{aligned}\]

This coordinate map $\phi_B$ is both bijective and linear; a proof can be found in A1 in the appendix. Because the coordinate map $\phi_B$ is bijective, i.e. invertible, not only can we read off the column vector $\cv{v}{B}$ from the basis expansion of $v$ in $B$, but equally, the column vector $\cv{v}{B}$ tells us the basis expansion of $v$ in basis $B$, via $\phi_B^{-1}$. And because the coordinate map $\phi_B$ is a linear map $\phi_B: V \rightarrow \F^n$, it acts in a way that preserves vector space structure.

Taken together, this implies that the coordinate map $\phi_B$ is a linear isomorphism, and the existence of this isomorphism implies that the vector spaces $V$ and $\F^n$ are isomorphic, denoted as,

\[V \cong \F^n.\]

From a linear algebraic perspective, this means that the abstract vector space $V$ and its corresponding coordinate vector space $\F^n$ are structurally identical. With a choice of basis $B$, then any abstract vector $v \in V$ can be represented in coordinates as column vectors.

A powerful consequence is that we can harness the full range of matrix-vector numerical and algorithmic methods on coordinate representations of objects not previously conceived of as vectors. As a very simple example, if we choose a basis to represent the differentiation operator as a matrix, and a polynomial as a column vector, then differentiation can then be reduced to a matrix-vector multiplication problem. An algebraic manipulation then becomes a problem that can be dealt with efficiently and quickly using the full spectrum of techniques in numerical linear algebra.
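To make this concrete, here is a minimal NumPy sketch (not from the post; the degree bound and example polynomial are chosen for illustration) of differentiation on $\mathcal{P}_3(\R)$ as a matrix acting on coordinate vectors in the monomial basis:

```python
import numpy as np

# A minimal sketch: differentiation on P_3(R), polynomials of degree <= 3,
# as a matrix acting on coordinate vectors in the monomial basis {1, z, z^2, z^3}.
# Since d/dz(z^k) = k*z^(k-1), column k of D holds the coordinates of the
# derivative of the k-th basis polynomial.
n = 4
D = np.zeros((n, n))
for k in range(1, n):
    D[k - 1, k] = k

# p(z) = 5 + 3z + 2z^2 has coordinate vector [5, 3, 2, 0] in this basis.
p = np.array([5.0, 3.0, 2.0, 0.0])
dp = D @ p  # coordinates of p'(z) = 3 + 4z
print(dp)   # [3. 4. 0. 0.]
```

The calculus problem has been reduced entirely to a matrix-vector multiplication on coordinate representations.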

The power of this isomorphism blooms further when we start introducing things like geometry into the picture - see A2 in the appendix.

The standard basis in $\F^n$.

Consider the case where our abstract vector space $V$ is the vector space of ordered lists, $\F^n$. Then our vector $v \in \F^n$ will be $v = (\ls{x}{1}{n})$, an ordered list of length $n$ consisting of scalar entries $x_k \in \F$. In keeping with our earlier slogan, this vector can be defined without reference to a basis.

We now choose the basis of our vector space $\F^n$ to be the standard basis $E = \ls{e}{1}{n}$. We now seek a coordinate representation $\cv{v}{E} \in \F^n$ with respect to the standard basis $E$ of our vector $v \in \F^n$. Although we are using $\F^n$ to denote both the vector space of ordered lists and coordinate vector space, we can still distinguish these two spaces by noting that elements of the latter space are column vectors.

Using our standard basis vectors, we can now uniquely express $v = (\ls{x}{1}{n})$ as a linear combination of standard basis vectors in scalar coefficients $a_k \in \F$,

\[v = (\ls{x}{1}{n}) = \lc{a}{e}{1}{n} = a_1(1, 0, \dots, 0) + \dots + a_n(0, \dots, 0, 1).\tag{5}\]

“Solving” for our coordinates $a_k \in \F$, we have that for all $k = 1, \dots, n$,

\[x_k = a_k. \tag{6}\]

In this setup, the scalar entries $x_k$ of $v$ are exactly the coordinates $a_k$ required to represent $v$ in the standard basis $E$. That is,

\[v = (\ls{x}{1}{n}) = \lc{x}{e}{1}{n}. \tag{7}\]

Reading off the coordinates of $v$ in the standard basis $E$, our coordinate vector of $v$ with respect to the standard basis $E$ is,

\[[v]_E = \begin{bmatrix} x_1\\ \vdots \\ x_n \end{bmatrix}. \tag{8}\]

The key thing to observe here is that the vector $v = (x_1, \dots, x_n) \in \F^n$ is element-wise identical to its coordinate representation $\cv{v}{E} = [x_1, \dots, x_n]^{\top}$ in the standard basis $E$. This is the defining feature of the standard basis, and one that does not hold for any other choice of basis.

As an example in $\R^3$, we have that $v = (2, 3, 7) = 2e_1 + 3e_2 + 7e_3$, and so reading off coordinates in the standard basis $E$ yields $\cv{v}{E} = [2, 3, 7]^{\top}$.
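A minimal NumPy sketch of this example (assuming the post's convention of stacking basis vectors as columns):

```python
import numpy as np

# Reading off coordinates in the standard basis of R^3: stacking the
# standard basis vectors e1, e2, e3 as columns gives the identity matrix,
# so solving for the coordinates just returns the entries of v unchanged.
v = np.array([2.0, 3.0, 7.0])
E = np.eye(3)                    # columns are e1, e2, e3
coords = np.linalg.solve(E, v)   # the coordinate vector [v]_E
print(coords)                    # [2. 3. 7.]
```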

In choosing our basis to be the standard basis $E$, we have defined the following coordinate map,

\[\begin{aligned} &\phi_E: \F^n \rightarrow \F^n, \\ &\phi_E(v) = \phi_E(\lc{x}{e}{1}{n}) = \begin{bmatrix} x_1\\ \vdots \\ x_n \end{bmatrix}. \tag{9} \end{aligned}\]

When $V = \F^n$, and we work in the standard basis $E$, then every vector’s coordinate representation coincides with its entries. Applying the coordinate map $\phi_E$ directly transcribes these entries into a coordinate vector, without changing them.

As a short one-liner,

In the standard basis, the coordinate map “tips ordered lists onto their sides”.

Because the entries of a vector $v \in \F^n$ are identical to its coordinates in the standard basis, provided that we confine ourselves only to the standard basis, then we can “temporarily forget” that we need the coordinate map altogether, thereby treating a coordinate representation $\cv{v}{E}$ as if it were the vector $v \in \F^n$.

It is precisely this mechanism that allows us to conflate a vector $v = (x_1, \dots, x_n)$ in $\F^n$ with its coordinate representation $\cv{v}{E} = [x_1, \dots, x_n]^{\top}$ in the standard basis. However, once we leave the realm of the standard basis, say by using an arbitrary basis of the vector space $\F^n$, then this conflation leads to issues.

The following is a brief reminder to help anchor the differences between coordinate-free texts and coordinate-based texts.

“Standard basis vertigo”.

One particularly thorny point of confusion across some linear algebra texts, which I refer to as “standard basis vertigo”, occurs when our vector space $V$ is the vector space of ordered lists, $\F^n$, and we also choose to represent coordinate vectors $\cv{v}{B}$ as ordered lists $(a_1, \dots, a_n)$, rather than as column vectors $[a_1, \dots, a_n]^{\top}$.

When our vector space is $\F^n$ and coordinate vectors are also represented as ordered lists, then is a vector $v = (x_1, \dots, x_n)$ specified with reference to a basis of $\F^n$, or not?

At the beginning of this post, we claimed that any vector $v \in V$, thereby including the vector $v = (x_1, \dots, x_n)$, could be defined without reference to a basis of $V$, in this case $\F^n$. And yet, isn’t $(x_1, \dots, x_n)$, which is also our coordinate vector $\cv{v}{E}$, already specified in the standard basis $E$?

The problematic move here is in using an ordered list to represent coordinate vectors $\cv{v}{B}$ in coordinate vector space, rather than column vectors. The vector $v = (x_1, \dots, x_n)$ in vector space $\F^n$ has scalar entries $x_k \in \F$. Knowing these scalar entries $x_k \in \F$ implicitly tells us the coordinates of the coordinate vector $[v]_E$ in the standard basis $E$ - the coordinates are the scalar entries $x_k$ themselves. But because our coordinate vectors $\cv{v}{E}$ are also ordered lists, so that $\cv{v}{E}$ is also $(x_1, \dots, x_n)$, is $(x_1, \dots, x_n)$ a vector, or its coordinate representation?

There are two main routes out of this confusion, and the answer to the above is,

It depends on whether we choose to represent coordinate vectors also as ordered lists, or as column vectors.

As the confusion is primarily notational, we now break from standard convention, and distinguish between ordered lists residing in $\F^n$, and column vectors residing in $\F^{n \times 1}$.

Route 1.

If we work with the vector space of ordered lists $\F^n$, and choose to represent our coordinate vectors as ordered lists, so that coordinate vector space is also represented by $\F^n$, then we have to be content with the fact that when we work with a vector $v \in \F^n$, then this always implicitly invokes the standard basis $E$ in the background.

In this setup, the coordinate map $\phi_E$ is the identity $I: \F^n \rightarrow \F^n$. An ordered list $(x_1, \dots, x_n)$ in this case is simultaneously the vector $v$ in $\F^n$ and also the coordinate representation $\cv{v}{E}$, and it is notationally impossible to distinguish the two. Asking whether the ordered list is the vector or the coordinate representation, when it is both at the same time, is the precise source of the “standard basis vertigo”.

Route 2.

If we work with the vector space of ordered lists $\F^n$, but instead choose to represent coordinate vectors as column vectors, so that coordinate vector space is represented by $\F^{n \times 1}$, then the vertigo completely dissolves. This typographical distinction ensures that the vector $v \in \F^n$ is an ordered list, specifiable without any reference to a basis of $\F^n$. And because coordinate representations $\cv{v}{E}$ are now column vectors, they are distinct from the ordered list $v \in \F^n$, even if both are populated by the same numbers.

Arbitrary bases in $\F^n$.

If we again let our abstract vector space $V$ be $\F^n$ itself, then our vector $v = (\ls{x}{1}{n})$ is an ordered list of length $n$ consisting of entries $x_k$, independent of a basis. Choosing an arbitrary basis $B = \ls{v}{1}{n}$ of $\F^n$, we can still uniquely express $v = (\ls{x}{1}{n})$ as a linear combination in scalar coefficients $a_i \in \F$,

\[v = (\ls{x}{1}{n}) = \lc{a}{v}{1}{n} \tag{10}.\]

Choosing an arbitrary basis $B$ that isn’t the standard basis implies that we now have to solve for coordinates $a_i \in \F$ in a system of $n$ linear equations. As each basis vector $v_i \in \F^n$ is itself an ordered list, then we have,

\[v_i = (v_{i1}, \dots, v_{in}). \tag{11}\]

We now need to solve for coordinates $a_i \in \F$ for $i = 1, \dots, n$, where the $k$-th entry of $v$ is,

\[x_k = a_1v_{1k} + \dt+ a_nv_{nk}. \tag{12}\]

As we haven’t chosen to use the standard basis, it generally will not be the case that the scalar entries $x_k$ will be the same as our coordinates $a_k$, so that in general,

\[x_k \neq a_k. \tag{13}\]

As an example in $\R^3$, say we wish to find the coordinate representation of $v = (2, 3, 7)$ in the basis $B$, given by $(1,1,0), (0,1,0), (1,1,2)$. We set up the following system and solve for each $a_i \in \F$, resulting in the following basis expansion of $v$ in $B$,

\[\begin{aligned} (2, 3, 7) &= a_1(1,1,0) + a_2(0,1,0) + a_3(1,1,2) \\ &= \fr{-3}{2}(1,1,0) + 1(0,1,0) + \fr{7}{2}(1,1,2). \end{aligned}\]

Reading off coordinates yields the coordinate representation $\cv{v}{B} = [-3/2, 1, 7/2]^{\top}$. Note that the entries of $v$ do not coincide with the coordinates in $\cv{v}{B}$, and that the same vector $v = (2, 3, 7)$ now has more than one coordinate representation: one in the standard basis $E$ and one in the above arbitrary basis $B$.
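Numerically, solving for the coordinates is one call to a linear solver. A minimal NumPy sketch (assuming the convention of stacking basis vectors as columns):

```python
import numpy as np

# Coordinates of v = (2, 3, 7) in the basis B = {(1,1,0), (0,1,0), (1,1,2)}:
# stack the basis vectors as columns of a matrix and solve B_mat @ a = v.
B_mat = np.array([[1.0, 0.0, 1.0],
                  [1.0, 1.0, 1.0],
                  [0.0, 0.0, 2.0]])   # column i is the i-th basis vector
v = np.array([2.0, 3.0, 7.0])
a = np.linalg.solve(B_mat, v)
print(a)  # [-1.5  1.   3.5]  -- the coordinate vector [v]_B
```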

Choosing an arbitrary basis $B$ amounts to using the coordinate map,

\[\begin{aligned} &\phi_B: \F^n \rightarrow \F^n, \\ &\phi_B(v) = \phi_B(\lc{a}{v}{1}{n}) = \begin{bmatrix} a_1\\ \vdots \\ a_n \end{bmatrix}. \end{aligned}\]

When $V = \F^n$ and we work in an arbitrary basis $B$, a vector’s coordinate representation will in general not coincide with its entries. Applying the coordinate map $\phi_B$ first requires solving a linear system to obtain a basis expansion.

As a short one-liner,

In arbitrary bases, the coordinate map requires basis book-keeping.

The key takeaway here is that when we choose an arbitrary basis $B$, then we can no longer conflate a vector $v = (\ls{x}{1}{n})$ in $\F^n$ with its coordinate representation $[v]_B = [\ls{a}{1}{n}]^{\top}$, because the entries of $v$ and the coordinates of $\cv{v}{B}$ no longer coincide.

This can be a source of a number of conceptual issues in coordinate-based linear algebra. Having worked in the standard basis, where the coordinate map is “invisible” and where vectors “are” their coordinate representations, as soon as one moves out of this comfort zone into a situation requiring arbitrary bases, a sense of dislocation can often be felt.

Similarly, another sticking point in coordinate-based linear algebra is a general intuitive discomfort in performing changes of basis. An example is when one changes basis from the standard basis to another arbitrary basis, and possibly back to the standard basis again. If one hasn’t fully grasped that a vector is independent of its coordinate representation, and that the same vector may have multiple coordinate representations, then one often has to accept on trust from a textbook or instructor that two distinct “columns of numbers” can refer to the same entity.

From personal experience with machine learning material, this is particularly keenly felt when dealing with techniques involving subspaces and orthogonal projections, such as principal components analysis and even linear regression. Here, the orthogonal projection of a vector onto a subspace has two distinct coordinate representations, one in the subspace basis and one in the parent vector space basis, where not only are the “columns of numbers” distinct, but they are of differing “size”.

We conclude this part with a slogan,

A single vector has as many coordinate representations as there are bases. A single coordinate representation also corresponds to as many vectors as there are bases.

Bases in polynomial vector spaces.

Here is a classic example in linear algebra where the conflation of a vector and its coordinate representation starts to strain. Let our abstract vector space $V$ be the vector space of polynomials of degree at most $n-1$, $\mathcal{P}_{n-1}(\F)$. Then our vector $v$ would be a polynomial function $p: \F \rightarrow \F$ with coefficients $a_k \in \F$ such that for all $z \in \F$,

\[p(z) = a_0 + a_1z + \dt+ a_{n-1}z^{n-1}. \tag{14}\]

If we conflate the above vector with its representation, then we might be tempted to represent this polynomial as a column vector $[a_0, \dots, a_{n-1}]^{\top}$. Let’s not do so just yet. Instead, noting that a polynomial is simply a mapping rule, nothing prevents us from specifying $p$ as a polynomial $q: \F \rightarrow \F$ with the following coefficients, such that for some fixed $c \in \F$ and for all $z \in \F$,

\[\begin{aligned} q(z) &= p(c) + \frac{p^{(1)}(c)}{1!}(z-c) + \dots + \frac{p^{(n-1)}(c)}{(n-1)!}(z-c)^{n-1}. \tag{15} \end{aligned}\]

At this stage, if we similarly conflate the polynomial and its coordinate representation, then we might write $[\ls{b}{0}{n-1}]^{\top}$, where,

\[b_k = \frac{p^{(k)}(c)}{k!}. \tag{16}\]

Now algebraically, polynomials $p$ and $q$ are equivalent because $p(z) = q(z)$ for all $z \in \F$, and there is no need to view the polynomials with vector space terminology to see this. And even when we view $p$ and $q$ as vectors in $\mathcal{P}_{n-1}(\F)$, then they are still the same vector, and so we now refer to both as the polynomial $p$.

If we are used to thinking of vectors as lists of numbers, and that there is no difference between a vector and its coordinate representation, then the following question reveals some issues,

Which of the column vectors $[\ls{a}{0}{n-1}]^{\top}$ or $[\ls{b}{0}{n-1}]^{\top}$ is the polynomial $p$?

At its core, this question amounts to a category error, but one which can be dealt with by applying our slogans,

Neither column vector, as a coordinate representation residing in $\F^n$, “is” the polynomial $p$, which resides in $\mathcal{P}_{n-1}(\F)$. However, both column vectors are valid coordinate representations of the polynomial $p$, provided we declare the basis in which each coordinate representation is recorded.

We can clarify all of this by being explicit in how we are using the linear algebra we have established so far.

Define the standard basis $E$ of polynomials $E = \set{e_0, \dots, e_{n-1}}$, where each basis function $e_k : \F \rightarrow \F$ is defined as $e_k(z) = z^k$. Also define a shifted basis $B$ of polynomials $B = \set{p_0, \dots, p_{n-1}}$, where each shifted basis function $p_k: \F \rightarrow \F$ is defined as $p_k(z) = (z-c)^k$.

In linear algebra texts like Axler (2024), one will often witness the above notation being shortened so as to conflate a function $f$ and its evaluation $f(z)$. Both bases are then written in terms of evaluations rather than functions, so that $E = \set{1, z, \dots, z^{n-1}}$ and $B = \set{1, (z-c), \dots, (z-c)^{n-1}}$.

Expanding the polynomial $p$ in the standard basis $E$ then gives,

\[p = \lc{a}{e}{0}{n-1}. \tag{17}\]

To see that this linear algebraic formulation recovers our expression in $(14)$, we can evaluate this at some fixed value $z \in \F$ to get,

\[\begin{aligned} p(z) &= a_0e_0(z) + \dots + a_{n-1}e_{n-1}(z) \\ &= a_0z^0 + a_1z^1 + \dots + a_{n-1}z^{n-1} \\ &= a_0(1) + a_1(z) + \dots + a_{n-1}(z^{n-1}). \tag{18} \end{aligned}\]

Similarly, expanding polynomial $p$ in the shifted basis $B$ gives,

\[p = \lc{b}{p}{0}{n-1}. \tag{19}\]

Evaluating at some fixed point $z \in \F$ then recovers our expression in $(15)$,

\[\begin{aligned} p(z) &= b_0p_0(z) + b_1p_1(z) + \dots + b_{n-1}p_{n-1}(z) \\ &= b_0 (z-c)^0 + b_1(z-c)^1 + \dots + b_{n-1}(z-c)^{n-1}\\ &= p(c) + \frac{p^{(1)}(c)}{1!}(z-c) + \dots + \frac{p^{(n-1)}(c)}{(n-1)!}(z-c)^{n-1}. \tag{20} \end{aligned}\]

Using $(17)$ to read off the coordinates of the basis expansion of $p$ in the standard basis $E$ amounts to applying the coordinate map $\phi_E$, resulting in,

\[\phi_E(p) = \phi_E(\lc{a}{e}{0}{n-1}) = \begin{bmatrix} a_0\\ \vdots \\ a_{n-1} \end{bmatrix}. \tag{21}\]

Similarly, using $(19)$ to read off the coordinates of the basis expansion of $p$ in the shifted basis $B$ amounts to applying the coordinate map $\phi_B$, and this gives,

\[\phi_B(p) = \phi_B(\lc{b}{p}{0}{n-1}) = \begin{bmatrix} b_0\\ \vdots \\ b_{n-1} \end{bmatrix}. \tag{22}\]

From this we can see that the same polynomial in the vector space $\mathcal{P}_{n-1}(\F)$ can have distinct coordinate representations, depending on our choice of basis. In terms of the machinery we established earlier, we have that

\[\mathcal{P}_{n-1}(\F) \cong \F^n. \tag{23}\]
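A minimal NumPy sketch of the two representations, using an illustrative polynomial $p(z) = 5 + 3z + 2z^2$ in $\mathcal{P}_2(\R)$ and the shift $c = 1$ (both chosen here, not taken from the post):

```python
import numpy as np

# The same polynomial p(z) = 5 + 3z + 2z^2, read off in two bases:
# the monomial basis E = {1, z, z^2} and the shifted basis
# B = {1, (z-c), (z-c)^2} with c = 1.
c = 1.0
a = np.array([5.0, 3.0, 2.0])        # [p]_E
b = np.array([5 + 3*c + 2*c**2,      # b_0 = p(c)
              3 + 4*c,               # b_1 = p'(c)/1!
              2.0])                  # b_2 = p''(c)/2!
# Distinct coordinate vectors, yet both describe the same function:
z = np.linspace(-2.0, 2.0, 9)
p_E = a[0] + a[1]*z + a[2]*z**2
p_B = b[0] + b[1]*(z - c) + b[2]*(z - c)**2
print(np.allclose(p_E, p_B))  # True
```

Neither column of numbers “is” the polynomial; each records it relative to a declared basis.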

Our category error then amounts to mistaking this isomorphism for identity (a vector is its coordinate representation). In attempting to write down a coordinate representation without specifying a basis, then the mistake is to believe that there exists only one isomorphism between $\mathcal{P}_{n-1}(\F)$ and $\F^n$, when in fact there are many valid isomorphisms, all of which are contingent on the choice of $B$ in $\phi_B$.

Non-canonically isomorphic vector spaces.

Earlier, we established that an abstract vector space $V$ of dimension $n$, and coordinate vector space $\F^n$, both as instances of linear algebraic structures, are structurally identical, i.e. isomorphic,

\[V \cong \F^n. \tag{24}\]

All this means is that a linear bijection between these two vector spaces exists. We showed this by proving that given a choice of basis $B$, the coordinate map $\phi_B: V \rightarrow \F^n$ is linear and bijective. But we need one final qualification to round everything off.

Recalling that we are free to choose any basis $B$, and that our coordinate map is contingent on this choice, choosing a different basis $B'$ to construct our coordinate map $\phi_{B'}$ will in general result in a different isomorphism, i.e.

\[\phi_B \neq \phi_{B'}. \tag{25}\]

Out of all possible isomorphisms that can be defined via our choice of basis, there is nothing in the structure of the vector space $V$ that suggests a preference for one basis $B$, distinguished over all other possible bases $B'$ – in this restricted sense, our choice of basis $B$ is arbitrary.

Precisely because there are as many isomorphisms between $V$ and $\F^n$ as there are choices of bases, and because any isomorphism we define must involve an arbitrary choice of basis, we further say that $V$ and $\F^n$ are not canonically isomorphic.

If on the other hand $V$ and $\F^n$ were canonically isomorphic, then we wouldn’t need to make an arbitrary choice of basis to define the isomorphism between $V$ and $\F^n$, and the isomorphism would be intrinsic to the nature of the spaces themselves.

We’ve been circling a number of ideas in coordinate-free linear algebra via slogans, all of which are instantiations of one central idea. Binding these slogans into one succinct statement, we have,

The vector spaces $V$ and $\F^n$ are not canonically isomorphic.

Appendix.

A1. Proof of linearity of the coordinate map $\phi_B$.

For proof that the coordinate map $\phi_B$ is linear, we need to show that it is additive and homogeneous, and we do so all in one go.

Let $u, v \in V$. As $B$ is a basis of $V$, we can express $u$ and $v$ uniquely as linear combinations in the basis vectors. That is, there exist scalars $a_i \in \F$ and $b_i \in \F$, such that

\[u = \lc{a}{v}{1}{n}, \quad v = \lc{b}{v}{1}{n}. \tag{26}\]

Now let $\lambda, \mu \in \F$, and we have,

\[\begin{aligned} \phi_B(\lambda u + \mu v) &= \phi_B\sqlr{\lambda(\lc{a}{v}{1}{n}) + \mu(\lc{b}{v}{1}{n})} \\ &= \phi_B\sqlr{(\lambda a_1 + \mu b_1) v_1 + \dots +(\lambda a_n + \mu b_n)v_n} \\ &= \begin{bmatrix}\lambda a_1 + \mu b_1 \\ \vdots \\ \lambda a_n + \mu b_n \end{bmatrix} \\ &= \lambda\begin{bmatrix}a_1 \\ \vdots \\ a_n \end{bmatrix} + \mu \begin{bmatrix}b_1 \\ \vdots \\ b_n \end{bmatrix}\\ &= \lambda \phi_B(u) + \mu \phi_B(v). \tag{27} \end{aligned}\]
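As a numerical sanity check of the algebra above, here is a minimal NumPy sketch, taking $V = \R^3$ with an arbitrary (illustrative) basis, and computing $\phi_B$ by solving a linear system:

```python
import numpy as np

# Spot-check of linearity: phi_B(lam*u + mu*v) = lam*phi_B(u) + mu*phi_B(v),
# with phi_B(v) computed by solving B_mat @ a = v for an arbitrary basis.
rng = np.random.default_rng(0)
B_mat = np.array([[1.0, 0.0, 1.0],
                  [1.0, 1.0, 1.0],
                  [0.0, 0.0, 2.0]])  # columns are the basis vectors
u = rng.normal(size=3)
v = rng.normal(size=3)
lam, mu = 2.0, -0.5
lhs = np.linalg.solve(B_mat, lam*u + mu*v)                          # phi_B(lam*u + mu*v)
rhs = lam*np.linalg.solve(B_mat, u) + mu*np.linalg.solve(B_mat, v)  # lam*phi_B(u) + mu*phi_B(v)
print(np.allclose(lhs, rhs))  # True
```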

For proof that the coordinate map $\phi_B$ is bijective, we first show that it is injective and then surjective.

To prove that $\phi_B$ is injective, we only need to prove that $\nul \phi_B \subseteq \set{0}$. Now if $v \in \nul \phi_B$, then by definition,

\[\phi_B(v) = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}. \tag{28}\]

By definition of $B$ as a basis, using the coordinates to form the following linear combination results in a unique vector $v \in V$, which is the zero vector $0_V$,

\[0v_1 + \dots + 0v_n = v = 0_V. \tag{29}\]

And so it must be the case that $v \in \set{0}$ and therefore $\nul \phi_B = \set{0}$, showing that $\phi_B$ is injective.

For proof that $\phi_B$ is surjective, we only need to prove that $\F^n \subseteq \range \phi_B$. Let $\cv{v}{B} = [a_1, \dots, a_n]^{\top} \in \F^n$, and we need to show that there exists a $v \in V$ such that,

\[\cv{v}{B} = \begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix} = \phi_B(v). \tag{30}\]

By definition of $B$ as a basis, using the coordinates $a_k$ to form the following linear combination results in a unique vector $v \in V$,

\[\lc{a}{v}{1}{n} = v. \tag{31}\]

Applying $\phi_B$ to this $v$ returns exactly the column vector $\cv{v}{B}$, and therefore $\range \phi_B = \F^n$, showing that $\phi_B$ is surjective.

A2. Transport of geometric structure via isometries.

As the vector spaces $V$ and $\F^n$ are isomorphic, the coordinate map for any choice of basis preserves linear combinations, and therefore linear algebraic structure.

This becomes more powerful when we not only endow both vector spaces $V$ and $\F^n$ with geometry individually, but do so in a way that ensures the coordinate map additionally preserves geometry across both spaces, allowing us to speak of “one geometry” across both spaces.

Here is a brief description of one direction in which this works. Assume that $\F^n$ has the Euclidean inner product, and we work with the standard basis, which is orthonormal. We can then define the inner product of $V$ to be equivalent to the inner product on $\F^n$, so that for all $u, v \in V$,

\[\inner{u, v}_V := \inner{\phi_B(u), \phi_B(v)}_{\F^n} = \bfu^{\top}\bfv. \tag{32}\]

In doing so, the coordinate map $\phi_B$ becomes an isometry. As we assumed an orthonormal basis of $\F^n$, this is equivalent to choosing the basis $B$ of $V$ to be orthonormal.

In pulling back the Euclidean inner product on $\F^n$ to $V$ via the isomorphism $\phi_B$, we have transported the Euclidean geometric structure of $\F^n$ to $V$. Instead of computing distances, lengths, angles etc. in $V$, we can simply do so in Euclidean vector space $\F^n$, as well as reason geometrically about vectors $v \in V$ as if they were in $\F^n$. We are also not confined to transporting Euclidean structure - we are free to choose other inner products on $\F^n$.
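A minimal NumPy sketch of the isometry property (the orthonormal basis here is generated via a QR factorisation purely for illustration):

```python
import numpy as np

# With an orthonormal basis of R^3 (the columns Q of a QR factorisation
# of a random matrix), the coordinate map v -> Q.T @ v preserves the
# Euclidean inner product: <u, v> = <[u]_Q, [v]_Q>, i.e. it is an isometry.
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # orthonormal columns
u = rng.normal(size=3)
v = rng.normal(size=3)
u_Q, v_Q = Q.T @ u, Q.T @ v                   # coordinate vectors [u]_Q, [v]_Q
print(np.allclose(u @ v, u_Q @ v_Q))          # True
```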

This idea is not confined to linear algebra, and is broadly referred to in mathematics as transport of structure. In machine learning, an elementary example of transport of structure from $\F^n$ to $V$ is in feature-engineering, such as one-hot-encoding of categorical data.

References.

[1] Axler, S. (2024). Linear Algebra Done Right (4th ed.). Springer.

[2] Friedberg, S., Insel, A., Spence, L. (2014). Linear Algebra (4th ed.). Pearson Education.

[3] Roman, S. (2008). Advanced Linear Algebra (3rd ed.). Springer.

[4] Strang, G. (2023). Introduction to Linear Algebra (6th ed.). Wellesley-Cambridge Press.