But what is a differential?

A simple explanation using basic tools

Motivation

Often in introductory math and physics courses, the term “differential” is used informally to refer to a small change in a variable. Moreover, on occasion we hear about exact and inexact differentials, no less informally. For example, the first law of thermodynamics is usually presented in a form similar to this:

\[dU = đQ - đW\]

where $dU$ is the differential of the internal energy of a thermodynamic system, \(đQ\) is the differential of heat supplied to the system, and \(đW\) is the differential of work done by the system. There, without precise definitions, the symbol $d$ is used to denote an exact differential, a bona fide differential, while \(đ\) denotes its sinister counterpart, an inexact differential.

However, faced with such informality, the question naturally arises as to what precisely all this means, if anything. What really is a differential, and what distinguishes an exact differential from an inexact one? These questions belong to the domain of differential geometry, a beautiful area of mathematics which studies the geometry of very general spaces known as smooth manifolds (in very simple terms, surfaces of arbitrary dimension on which smooth coordinate patches can be consistently overlapped: think of a sphere, a cylinder, a torus, etc.). However, the full machinery of differential geometry is usually out of reach for students in introductory courses, when they are first exposed to these concepts, and that is why the details are generally not elaborated on. Therefore, this post is an effort to shed light on these issues from a simple, accessible point of view, without invoking the entire apparatus of differential geometry, restricting instead to basic tools of calculus and linear algebra.

The ideas I will present have profound roots in differential geometry and extend naturally to manifolds of any dimension. However, in order to keep complexity to a minimum, we will work in three-dimensional Euclidean space, $ \mathbb{R}^3 $, with the usual Cartesian coordinates $ (x, y, z) $. This space (or part of it) can represent, for example, a room with rectangular coordinates along the horizontal and vertical directions, just as well as a thermodynamic space with variables $ (x, y, z) = (S, V, N) $ representing the entropy, volume, and number of particles of a thermodynamic system.

Some tools from geometry and linear algebra

Vectors

In the space $ \mathbb{R}^3 $, each point $ p = (x_0, y_0, z_0) $ has an associated set of unit vectors $ (e_x, e_y, e_z)_p $ that are parallel to the respective coordinate axes: $ x $, $ y $, and $ z $. We use the subscript $p$ to indicate that these vectors are based at point $p$. These vectors form a basis which allows us to describe any other vector at $ p $ in terms of them. Consider a generic vector $ v_p $ at point $ p $. This vector can be expressed as a linear combination of the basis vectors:

\[v_p = v_x \, (e_x)_p + v_y \, (e_y)_p + v_z \, (e_z)_p\]

where $ v_x $, $ v_y $, and $ v_z $ are the components of $ v_p $ in the directions of $ (e_x)_p $, $ (e_y)_p $, and $ (e_z)_p $, respectively. For simplicity, from now on we will omit the subscript $p$ on the basis vectors, as it is understood that they refer to vectors at the corresponding point. The set of all possible vectors at point $ p $ forms what is known as the tangent space of $ \mathbb{R}^3 $ at $p$, denoted as $T_p\mathbb{R}^3$. Additionally, although it is not strictly necessary for our construction, we will think of this space as endowed with the usual dot product, $ v_p \cdot w_p = v_x w_x + v_y w_y + v_z w_z $. This will allow us to connect the new ideas with other more familiar ones from vector calculus.

A vector field in $ \mathbb{R}^3 $ can be visualized as a map that assigns a vector in $T_p\mathbb{R}^3$ to each point $ p \in \mathbb{R}^3 $. A generic vector field on $ \mathbb{R}^3 $ can be expressed as

\[v = v_x(x, y, z) \, e_x + v_y(x, y, z) \, e_y + v_z(x, y, z) \, e_z\]

where the components are now functions of the coordinates in $ \mathbb{R}^3 $. As usual, we will restrict to smooth vector fields, i.e., those whose components are smooth functions.
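To fix ideas, here is a minimal Python sketch (the particular components are an arbitrary choice of mine, purely for illustration) modeling a vector field as a function from points to component triples:

```python
import math

def v(x, y, z):
    """A sample smooth vector field on R^3: returns (v_x, v_y, v_z) at (x, y, z)."""
    return (y * z, math.cos(x), x**2 + z)

print(v(1.0, 2.0, 3.0))  # the components of the field at the point (1, 2, 3)
```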

Covectors

At each point $p \in \mathbb{R}^3 $, the dual space of $T_p\mathbb{R}^3$, denoted $T_p^* \mathbb{R}^3$, is another three-dimensional vector space, consisting of all the covectors that act on the vectors in $T_p\mathbb{R}^3$. A covector $ \omega_p \in T_p^* \mathbb{R}^3$ is a continuous linear function $ \omega_p : T_p\mathbb{R}^3 \to \mathbb{R} $, acting on vectors to return scalars. The space $T_p^* \mathbb{R}^3$ is usually called the cotangent space of $ \mathbb{R}^3 $ at $ p $. The vector basis $ (e_x, e_y, e_z) $ of $T_p\mathbb{R}^3$ has a natural counterpart in the cotangent space $T_p^* \mathbb{R}^3$, called the dual basis. This dual basis is denoted $(dx, dy, dz)_p$, and satisfies:

\[\begin{aligned} dx(e_x) &= 1, & dx(e_y) &= 0, & dx(e_z) &= 0, \\ dy(e_x) &= 0, & dy(e_y) &= 1, & dy(e_z) &= 0, \\ dz(e_x) &= 0, & dz(e_y) &= 0, & dz(e_z) &= 1 \end{aligned}\]

where we omit the subscript $p$ from the covectors of the dual basis, as it is clear they refer to covectors at point $ p $. An intuitive way to think of these covectors is essentially as “gauges” that tell us how much of each component, $ x $, $ y $, or $ z $, there is in a given vector. Indeed, the action of these covectors on a generic vector $ v $ is given by:

\[dx(v) = v_x, \quad dy(v) = v_y, \quad dz(v) = v_z,\]

where we have used the property of linearity.

It is important to note that, despite the notation used ($ dx $, $ dy $, $ dz $), up to this point we have made no reference to differentiation, let alone to “small changes” in coordinates. What we have established are purely algebraic definitions. The choice of these names for the elements of the dual basis is intentionally suggestive, but at this point they are just names. The motivation for this choice will become clear later on.

Just as a generic vector in $T_p\mathbb{R}^3$ can be expressed in terms of the unit vectors $ (e_x, e_y, e_z) $, a generic covector $\omega_p \in T_p^* \mathbb{R}^3$ can be expressed as a linear combination of the basis covectors:

\[\omega_p = \omega_x \, dx + \omega_y \, dy + \omega_z \, dz\]

where $ \omega_x $, $ \omega_y $, and $ \omega_z $ are the components of $ \omega_p $ with respect to the dual basis covectors $ dx $, $ dy $, and $ dz $, respectively. The action of a generic covector $ \omega $ on a generic vector $ v $ is then given by:

\[\omega(v) = \omega_x dx(v) + \omega_y dy(v) + \omega_z dz(v) = \omega_x v_x + \omega_y v_y + \omega_z v_z\]

where we have successively applied the property of linearity.
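At a fixed point, all of this is elementary arithmetic on component triples. Here is a minimal Python sketch (the representation and names are my own, purely illustrative):

```python
def apply_covector(omega, v):
    """Action of a covector on a vector: omega(v) = omega_x v_x + omega_y v_y + omega_z v_z."""
    return sum(o * c for o, c in zip(omega, v))

# The dual basis covectors dx, dy, dz, represented by their components:
dx, dy, dz = (1, 0, 0), (0, 1, 0), (0, 0, 1)

v = (3.0, -1.0, 2.0)             # v = 3 e_x - e_y + 2 e_z
print(apply_covector(dx, v))     # 3.0  -> dx(v) = v_x
print(apply_covector(dy, v))     # -1.0 -> dy(v) = v_y

omega = (2.0, 0.5, -1.0)         # omega = 2 dx + 0.5 dy - dz
print(apply_covector(omega, v))  # 2*3 + 0.5*(-1) + (-1)*2 = 3.5
```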

1-forms

Just as with vector fields, we can define covector fields, which assign a covector in $T_p^* \mathbb{R}^3$ to each point $ p \in \mathbb{R}^3 $. Such a field can be expressed as:

\[\omega = \omega_x(x, y, z) \, dx + \omega_y(x, y, z) \, dy + \omega_z(x, y, z) \, dz\]

where the components are now functions of the coordinates in $ \mathbb{R}^3 $. A smooth covector field is known as a 1-form (1-forms are part of a broader class of objects known as differential forms; the number in the name indicates the tensor rank, so in addition to 1-forms there are also 2-forms, 3-forms, etc. Differential forms are fundamental in differential geometry, but we will not delve into their territory, as it exceeds the scope of this post; we simply adopt the terminology to maintain consistency with the general context in which these topics belong). These objects are a crucial ingredient for what follows.

Correspondence between covectors and vectors

Recall the action of a covector $\omega$ on a vector $v$:

\[\omega(v) = \omega_x v_x + \omega_y v_y + \omega_z v_z\]

The resemblance of this expression to a standard dot product suggests an interesting fact: a covector $ \omega \in T_p^* \mathbb{R}^3$ can be associated with a vector $ w \in T_p\mathbb{R}^3$ such that, for any other vector $ v \in T_p\mathbb{R}^3$, it holds that

\[\omega(v) = w \cdot v\]

Explicitly, this vector is $ w = \omega_x e_x + \omega_y e_y + \omega_z e_z $. This relation shows that the vectors associated with the covectors $ dx $, $ dy $, and $ dz $ are precisely $ e_x $, $ e_y $, and $ e_z $, respectively, i.e.,

\[\begin{aligned} dx(v) &= e_x \cdot v, \\ dy(v) &= e_y \cdot v, \\ dz(v) &= e_z \cdot v, \end{aligned}\]

for a generic vector $ v $, which can be easily verified. This relation is the content of the so-called Riesz Representation Theorem from linear algebra, which, given a dot product, establishes a special correspondence between vectors and covectors. It is important to highlight that this correspondence depends on the dot product, a structure that is foreign to the construction of vectors and covectors themselves. That is, the space $T_p^* \mathbb{R}^3$, and particularly the covectors $ dx $, $ dy $, $ dz $, are intrinsic to the space, as they can be constructed without reference to the dot product. (Strictly speaking, we used the dot product when defining the basis vectors $ (e_x, e_y, e_z) $ as unit vectors parallel to the coordinate axes. However, this choice is dispensable; the construction can be carried out, with a bit more effort, without appealing to the dot product, as is standard in differential geometry. Here we use the natural dot product to simplify the treatment. For the completely intrinsic construction of the spaces $T_p\mathbb{R}^3$ and $T_p^* \mathbb{R}^3$, see any reference on differential geometry, for example, Lee's Introduction to Smooth Manifolds.) There is no intrinsic correspondence between vectors and covectors: it is only after a dot product is introduced that the correspondence arises, and it is different for different dot products.

The differential of a function

Definition

We are now in a position to introduce the fundamental concept of this discussion: the differential of a function. Consider a smooth scalar function defined over our space $ \mathbb{R}^3 $, denoted as $ f: \mathbb{R}^3 \to \mathbb{R} $. At each point $ p \in \mathbb{R}^3 $, we define the differential of $ f $ at $ p $ as follows:

\[df_p = \left(\frac{\partial f}{\partial x}\right)_p \, dx + \left(\frac{\partial f}{\partial y}\right)_p \, dy + \left(\frac{\partial f}{\partial z}\right)_p \, dz\]

Thus, $ df_p $ is simply the covector whose components are the partial derivatives of $ f $ evaluated at $ p $. Similarly, we define the differential of $ f $ as the following 1-form:

\[df = \frac{\partial f}{\partial x} \, dx + \frac{\partial f}{\partial y} \, dy + \frac{\partial f}{\partial z} \, dz\]

where we now understand $ \frac{\partial f}{\partial x} $, $ \frac{\partial f}{\partial y} $, and $ \frac{\partial f}{\partial z} $ as functions of the coordinates in $ \mathbb{R}^3 $.

Interpretation

To understand the utility of the differential of a function, let us see how it acts on vectors. It can be easily verified that, on the basis vectors, it acts as follows:

\[df(e_x) = \frac{\partial f}{\partial x}, \quad df(e_y) = \frac{\partial f}{\partial y}, \quad df(e_z) = \frac{\partial f}{\partial z}\]

That is, the differential of $ f $ applied to the basis vectors returns precisely the partial derivatives of $ f $ with respect to the corresponding coordinates. Furthermore, the action of $ df $ on a generic vector $v $ is given by

\[df(v) = v_x \frac{\partial f}{\partial x} + v_y \frac{\partial f}{\partial y} + v_z \frac{\partial f}{\partial z}\]

Upon closely analyzing this expression, we see that it is nothing other than the directional derivative of $ f $ along $v $. That is, the differential of $ f $ encodes the information about the derivatives of $ f $, such that when applied to a given vector, it returns the directional derivative along it. Moreover, this action can also be expressed as

\[df(v) =v \cdot \nabla f\]

where $\nabla f$ is the gradient of $f$. Therefore, recalling the association between vectors and covectors introduced before, the gradient of a function is the vector associated with the differential through the dot product.
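As a quick sanity check, here is a sympy sketch (with an arbitrarily chosen $ f $ and $ v $, purely for illustration) verifying that the action of $ df $ on $ v $ agrees with the directional derivative of $ f $ along $ v $, computed as the derivative of $ f $ along a straight line in the direction of $ v $:

```python
import sympy as sp

x, y, z, t = sp.symbols('x y z t')
f = x**2 * y + sp.exp(y) * sp.cos(z)   # an arbitrary smooth function
v = (1, 2, -1)                         # an arbitrary vector

# df(v) = v_x df/dx + v_y df/dy + v_z df/dz
df_v = sum(vc * sp.diff(f, var) for vc, var in zip(v, (x, y, z)))

# Directional derivative along v: d/dt f(p + t v) evaluated at t = 0
f_along_line = f.subs({x: x + t * v[0], y: y + t * v[1], z: z + t * v[2]})
directional = sp.diff(f_along_line, t).subs(t, 0)

print(sp.simplify(df_v - directional) == 0)  # True
```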

Differential vs. gradient

Both the differential and the gradient of a function contain the information about the directional derivatives of the function. In other words, both allow the directional derivative along a vector to be obtained by acting on that vector. However, the way the gradient “acts” on the vector is by taking a dot product, as can be seen in the expression above.

On the other hand, the differential acts intrinsically as a covector, which means it does not require any additional structure such as a dot product; it is the very nature of covectors to act on vectors. Furthermore, if we take a different dot product, the gradient of the same function will be different. (In a general context, the gradient of a function is defined as the unique vector that, under a given dot product, or more precisely a metric, satisfies the relationship $df(v) = v \cdot \nabla f$ for any vector $ v $.) In contrast, the differential of a function does not depend on the dot product. This shows that the differential is a more intrinsic object than the gradient for encoding derivatives.

Examples

As an example, consider the function $ f: \mathbb{R}^3 \to \mathbb{R} $ defined by

\[f(x, y, z) = x^2 y + e^y \cos(z)\]

We calculate its differential using the rules of differentiation:

\[df = 2xy \, dx + \left(x^2 + e^y \cos(z)\right) dy - e^y \sin(z) \, dz.\]
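As a quick check, the components of $ df $ are just the partial derivatives, which can be computed with sympy (a small sketch; the comments show how sympy prints the results):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
f = x**2 * y + sp.exp(y) * sp.cos(z)

# The components of df are the partial derivatives of f:
print(sp.diff(f, x))  # 2*x*y
print(sp.diff(f, y))  # x**2 + exp(y)*cos(z)
print(sp.diff(f, z))  # -exp(y)*sin(z)
```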

As another example, if we consider the coordinate $ x $ as a function in $ \mathbb{R}^3 $, its differential is trivially

\[d(x) = dx.\]

Notice that, while the left-hand side of the equation involves a differential operation, the right-hand side is purely algebraic. This highlights the convenience of the chosen notation, as it allows us to overlook these details and proceed seamlessly.

It is important to emphasize that the concepts we introduced have precise definitions, and we have not needed to refer to less rigorous notions such as “small changes” in variables and functions.

Exact and inexact differentials

Finally, having established the theoretical framework, we can precisely explain what the notions of “exact differential” and “inexact differential” really mean. An exact differential is defined as a 1-form that is the differential of a smooth scalar function, as we have previously discussed. In other words, a 1-form

\[\omega = \omega_x \, dx + \omega_y \, dy + \omega_z \, dz\]

is an exact differential if and only if there exists a smooth function $ f : \mathbb{R}^3 \to \mathbb{R} $ such that $ \omega = df $. When this is the case, $ f $ is called a potential function for $ \omega $.

For example, the 1-form $ \omega = x \, dx + y \, dy + z \, dz $ is an exact differential, as it can be written $\omega = df$ where $ f = \frac{1}{2}(x^2 + y^2 + z^2) $ is the potential function.

For a 1-form to be an exact differential, its components must satisfy the following conditions:

\[\omega_x = \frac{\partial f}{\partial x}, \quad \omega_y = \frac{\partial f}{\partial y}, \quad \omega_z = \frac{\partial f}{\partial z}\]

where $ f $ is the potential. The symmetry of the second partial derivatives of $ f $ imposes integrability conditions on $ \omega $ for this to be possible, namely:

\[\frac{\partial \omega_x}{\partial y} = \frac{\partial \omega_y}{\partial x}, \quad \frac{\partial \omega_y}{\partial z} = \frac{\partial \omega_z}{\partial y}, \quad \frac{\partial \omega_z}{\partial x} = \frac{\partial \omega_x}{\partial z}\]
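To see where the first of these conditions comes from, substitute $ \omega_x = \frac{\partial f}{\partial x} $ and $ \omega_y = \frac{\partial f}{\partial y} $ and use the symmetry of second partial derivatives:

\[\frac{\partial \omega_x}{\partial y} = \frac{\partial^2 f}{\partial y \, \partial x} = \frac{\partial^2 f}{\partial x \, \partial y} = \frac{\partial \omega_y}{\partial x}\]

and likewise for the other two conditions. Conversely, over all of $ \mathbb{R}^3 $ these conditions are in fact also sufficient for exactness (a special case of the Poincaré lemma); on more general domains the converse can fail, but we will not need that subtlety here.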

These conditions are not trivial at all, and it is relatively simple to construct 1-forms that do not satisfy them and, thus, are not exact differentials. For example, the 1-form $ \omega = y \, dx $ is not an exact differential, as it does not satisfy the integrability conditions:

\[\frac{\partial \omega_x}{\partial y} = 1 \neq \frac{\partial \omega_y}{\partial x} = 0\]
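These checks are mechanical, so they are easy to automate. Here is a sympy sketch (the helper function and its name are my own) that tests the three conditions for a given 1-form:

```python
import sympy as sp

x, y, z = sp.symbols('x y z')

def is_closed(wx, wy, wz):
    """Test the integrability conditions for omega = wx dx + wy dy + wz dz."""
    wx, wy, wz = map(sp.sympify, (wx, wy, wz))
    return (sp.simplify(sp.diff(wx, y) - sp.diff(wy, x)) == 0
            and sp.simplify(sp.diff(wy, z) - sp.diff(wz, y)) == 0
            and sp.simplify(sp.diff(wz, x) - sp.diff(wx, z)) == 0)

print(is_closed(x, y, z))   # True:  x dx + y dy + z dz (potential f = (x^2+y^2+z^2)/2)
print(is_closed(y, 0, 0))   # False: y dx fails the first condition
```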

A 1-form that is not exact is called an inexact differential. In less formal contexts, they are denoted, e.g., as \(đQ\), as we did at the beginning of this post, where the symbol đ is used to emphasize that the 1-form is not the differential of any function $ Q $. However, here we have uncovered that, beyond the mysterious notation, it is merely a 1-form that does not meet the criteria to be the differential of a function.

Integrating 1-forms

The concept of a 1-form is closely related to integration and is essential for defining line integrals in a way that does not depend on any additional structure, such as a dot product. This contrasts with the usual presentations in physics, where line integrals of vector fields rely on dot products. Here, we will explain how to intrinsically integrate a 1-form over a curve in $ \mathbb{R}^3 $.

Consider a 1-form $ \omega $ and a simple smooth curve $ \gamma: I \to \mathbb{R}^3 $, where $ I = [a, b] $ is a real interval. The integral of the 1-form $ \omega $ over the curve $ \gamma $ is defined by

\[\int_\gamma \omega = \int_a^b \omega(\gamma'(t)) \, dt\]

where \(\gamma'(t)\) represents the tangent vector to the curve at the point $ \gamma(t) $, and the integral on the right-hand side is a standard real integral. This means we are integrating over $ I $ the function that results from evaluating $ \omega $ on the tangent vectors along the curve $\gamma$. This definition is noteworthy because, as mentioned, it does not involve a dot product, which is a foreign structure.
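This definition translates directly into computation. Here is a sympy sketch (the helper function is illustrative, not a library routine) that integrates a 1-form over a parameterized curve exactly as defined above:

```python
import sympy as sp

x, y, z, t = sp.symbols('x y z t')

def integrate_1form(omega, gamma, a, b):
    """Integral over gamma of omega = (wx, wy, wz), where gamma(t) = (x(t), y(t), z(t)):
    the ordinary integral of omega(gamma'(t)) dt from a to b."""
    on_curve = {x: gamma[0], y: gamma[1], z: gamma[2]}
    integrand = sum(sp.sympify(w).subs(on_curve) * sp.diff(c, t)
                    for w, c in zip(omega, gamma))
    return sp.integrate(integrand, (t, a, b))

# omega = x dx + y dy + z dz over the segment gamma(t) = (t, t, t), 0 <= t <= 1
print(integrate_1form((x, y, z), (t, t, t), 0, 1))  # 3/2
```

The result, $ 3/2 $, equals $ f(1,1,1) - f(0,0,0) $ for the potential $ f = \frac{1}{2}(x^2 + y^2 + z^2) $ from the earlier example, anticipating the formula below.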

When $ \omega $ is an exact differential, i.e., $ \omega = df $ for some function $ f $, the integral over $ \gamma $ is remarkably simple to calculate:

\[\int_\gamma df = \int_a^b df(\gamma'(t)) \, dt = \int_a^b \frac{d}{dt} f(\gamma(t)) \, dt = f(\gamma(b)) - f(\gamma(a))\]

where the first equality is the definition of the integral, the second uses the chain rule, $ \frac{d}{dt} f(\gamma(t)) = df(\gamma'(t)) $, and the last is the fundamental theorem of calculus. This implies that the integral of an exact differential over a curve depends only on the values of $ f $ at the endpoints $ \gamma(a) $ and $ \gamma(b) $, and not on the specific path between them.

Another consequence is that the integral of an exact differential over any closed curve is zero. In fact, this property characterizes exactness: a 1-form is exact if and only if its integral over every closed curve is zero.

On the contrary, for an inexact differential, the result of the integral can depend on the specific path taken between two points.
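Here is a concrete illustration with the inexact differential $ \omega = y \, dx $ from before (a sympy sketch): two different paths joining $ (0, 0, 0) $ to $ (1, 1, 0) $ give two different values for the integral:

```python
import sympy as sp

t = sp.symbols('t')

def integral_of_y_dx(gamma, a, b):
    """Integral of omega = y dx over gamma(t) = (x(t), y(t), z(t))."""
    xt, yt, _ = gamma
    return sp.integrate(yt * sp.diff(xt, t), (t, a, b))

straight = (t, t, 0 * t)       # straight segment: y = x
parabola = (t, t**2, 0 * t)    # parabolic path:   y = x^2

print(integral_of_y_dx(straight, 0, 1))  # 1/2
print(integral_of_y_dx(parabola, 0, 1))  # 1/3
```

Since the two values differ, no potential function for $ y \, dx $ can exist, consistent with the failed integrability condition found earlier.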

Application to thermodynamics

To conclude, I will apply the concepts introduced to thermodynamics. The state of a thermodynamic system can be represented as a point in thermodynamic space, where a possible set of variables is, for example, $(S, V, N)$. These variables represent the entropy, volume, and number of particles of the system, respectively, and are referred to as state variables. (These state variables cannot take arbitrary real values, as they must be positive. Therefore, they do not cover the entire $ \mathbb{R}^3 $, but only a part of it. Nevertheless, all the theory presented here regarding 1-forms applies without modification, with the sole exception that we are restricted to the appropriate subset of the space where the variables have physical meaning.) The functions defined on this space are called state functions. An example is the internal energy, $U = U(S,V,N)$. The differential of the internal energy can be written as follows:

\[dU = \left(\frac{\partial U}{\partial S}\right)_{V,N} dS + \left(\frac{\partial U}{\partial V}\right)_{S,N} dV + \left(\frac{\partial U}{\partial N}\right)_{S,V} dN\]

The subscripts $V, N, S$ in the partial derivatives indicate which variables are held constant during differentiation. This detail is important because there are other sets of variables that can parameterize the thermodynamic space (temperature, pressure, Gibbs free energy, etc.), and the partial derivative with respect to the same physical variable can differ depending on which other variables are used to parameterize the space. It is interesting to observe that the expression above, to which we have given a precise meaning, coincides with what one might write informally, albeit without having the tools presented here.

In thermodynamics, the partial derivatives of internal energy can be used to define other commonly used state variables, such as

\[T := \left(\frac{\partial U}{\partial S}\right)_{V,N}, \quad P := -\left(\frac{\partial U}{\partial V}\right)_{S,N}, \quad \mu := \left(\frac{\partial U}{\partial N}\right)_{S,V}\]

the temperature, pressure, and chemical potential, respectively. Commonly, the following notation is used:

\[đQ = TdS, \quad đW = PdV - \mu dN\]

These 1-forms represent the heat supplied to the system and the work done by the system, respectively, and it is emphasized that they are inexact differentials, i.e., they do not derive directly from a state function. In this way, the differential of the internal energy takes the form

\[dU = đQ - đW\]

which is the standard formulation of the first law of thermodynamics.

During a quasi-static thermodynamic process, represented by a curve $\gamma$ in thermodynamic space, the change in internal energy of the system is calculated as

\[\Delta U = \int_{\gamma} dU = U(b) - U(a)\]

where $a$ and $b$ represent the initial and final states of the system, respectively. This is because $dU$ is an exact differential, and its integral along a curve depends only on the values of $U$ at the endpoints.

On the other hand, the total heat supplied to the system and the total work done by the system along the process are expressed as

\[\Delta Q = \int_{\gamma} đQ, \quad \Delta W = \int_{\gamma} đW\]

and depend on the specific path taken between the endpoints, i.e., they depend on the process and not only on the initial and final states, as is well known in thermodynamics.
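To see this failure of exactness concretely, here is a final sympy sketch using a toy internal energy $ U(S, V, N) $ (chosen purely for illustration; it is not a physical equation of state):

```python
import sympy as sp

S, V, N = sp.symbols('S V N', positive=True)

# Toy internal energy, purely illustrative (NOT a physical equation of state):
U = S**2 * V / N

T = sp.diff(U, S)    # temperature,        T = 2*S*V/N
P = -sp.diff(U, V)   # pressure,           P = -S**2/N
mu = sp.diff(U, N)   # chemical potential, mu = -S**2*V/N**2

# dQ = T dS has components (T, 0, 0) in the coordinates (S, V, N).
# One integrability condition requires d(T)/dV == d(0)/dS == 0:
print(sp.diff(T, V))  # 2*S/N != 0, so dQ is an inexact differential
```

Since $ đQ = T \, dS $ fails an integrability condition, it cannot be the differential of any state function, and its integral over a process is genuinely path-dependent. By contrast, $ dU $ passes all the conditions automatically, since its components are the partial derivatives of the state function $ U $.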

We have come to the end of this post, where I have tried to elucidate the concepts of exact and inexact differential, and their application to thermodynamics. We have seen that these notions are not merely informal terms, but have precise definitions that can be understood in terms of basic concepts of linear algebra and calculus after some preparation. I hope this post has been useful in clarifying this, and that it has sparked your curiosity to delve deeper into the fascinating world of differential geometry and its applications to physics.