Thursday, April 6, 2023

Dual Numbers

A while back, when I was writing a simple NN code to improve my understanding of neural networks and deep learning in general, I came across dual numbers. They're a type of number that extends the real numbers, much as complex numbers do, but with a different rule for the extra unit. What makes them so interesting is that they can encode both a function value and its derivative in a single number. This means that we can use them to compute derivatives as a by-product of ordinary arithmetic, without deriving them by hand.

So how does one think of dual numbers? What's the difference between a dual number and a complex number? One way to think about dual numbers is that they consist of two parts: a scalar part and a skew part. The scalar part is just a regular real number, while the skew part is a multiple of a new number, often denoted as $\epsilon$, that is nonzero but satisfies the property $\epsilon^2=0$. This is the difference from complex numbers, where the imaginary unit instead satisfies $i^2=-1$. Every dual number can be written as $a+b\epsilon$, where $a$ and $b$ are real numbers.
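To make the rule $\epsilon^2=0$ concrete, here is how two dual numbers add and multiply. These are the standard results of expanding the products term by term; the symbols $c$ and $d$ are just a second pair of real numbers introduced for illustration.

$$(a+b\epsilon)+(c+d\epsilon)=(a+c)+(b+d)\epsilon$$

$$(a+b\epsilon)(c+d\epsilon)=ac+(ad+bc)\epsilon+bd\,\epsilon^2=ac+(ad+bc)\epsilon$$

For comparison, the complex product is $(a+bi)(c+di)=(ac-bd)+(ad+bc)i$. The skew part of the dual product, $ad+bc$, has the same shape as the product rule for derivatives, which is a first hint of why dual numbers track derivatives.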

What's most interesting about the skew part of a dual number is that it carries the first derivative of a function evaluated at a particular point. This is the exact derivative (up to floating-point rounding), not a finite-difference approximation. By evaluating the function on a dual number at that point, one can calculate both the function value and its derivative in one shot. One reason dual numbers have applications in deep learning is that algebra on dual numbers reproduces the chain rule of calculus, so they can be used to compute derivatives of complicated functions built from many interdependent operations.
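A short way to see why the derivative ends up in the skew part is a Taylor expansion around the scalar part. This sketch assumes $f$ is smooth enough to have such an expansion (a polynomial, for instance):

$$f(a+b\epsilon)=f(a)+f'(a)\,b\epsilon+\frac{f''(a)}{2!}\,b^2\epsilon^2+\cdots=f(a)+f'(a)\,b\epsilon$$

Every term beyond the first order contains a factor of $\epsilon^2=0$ and vanishes. Setting $b=1$ gives $f(a+\epsilon)=f(a)+f'(a)\epsilon$.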

As an example, say I want to evaluate the function $f(x)=x^2+2x$ at $x=3$. Evaluating $f$ at the dual number $3+\epsilon$ gives $f(3+\epsilon)=f(3)+f'(3)\epsilon$, where $f'(x)=\frac{df(x)}{dx}$. Expanding directly and using $\epsilon^2=0$, we get $f(3+\epsilon)=(3+\epsilon)^2+2(3+\epsilon)=9+6\epsilon+\epsilon^2+6+2\epsilon=15+8\epsilon$. As a check, we can compute $f(3)$ directly as $f(3)=3^2+2\cdot3=9+6=15$, and take the derivative of $f$ with respect to $x$ by hand: $f'(x)=2x+2$, so $f'(3)=2\cdot3+2=8$. Both routes agree: the dual number representation of $f$ at $x=3$ is $15+8\epsilon$.
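The same calculation can be mechanized. Below is a minimal Python sketch, not a full implementation: the `Dual` class and its field names `real` and `eps` are hypothetical names chosen for this illustration, and only addition and multiplication are supported, which is enough for $f(x)=x^2+2x$.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A dual number a + b*eps with eps**2 == 0 (illustrative helper)."""
    real: float       # scalar part a
    eps: float = 0.0  # skew part b

    @staticmethod
    def _lift(other):
        # Treat plain numbers as dual numbers with zero skew part.
        return other if isinstance(other, Dual) else Dual(float(other), 0.0)

    def __add__(self, other):
        other = Dual._lift(other)
        return Dual(self.real + other.real, self.eps + other.eps)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual._lift(other)
        # (a + b eps)(c + d eps) = ac + (ad + bc) eps, because eps^2 = 0
        return Dual(self.real * other.real,
                    self.real * other.eps + self.eps * other.real)

    __rmul__ = __mul__


def f(x):
    # Written once; works for plain floats and for Dual inputs alike.
    return x * x + 2 * x


# Seed the skew part with 1 so that the result's skew part is df/dx.
result = f(Dual(3.0, 1.0))
print(result.real, result.eps)
```

The scalar part of `result` is $f(3)$ and the skew part is $f'(3)$, matching $15+8\epsilon$ from the hand calculation. Note that `f` itself contains no derivative code; the derivative falls out of the overloaded arithmetic.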

One of the benefits of dual numbers is that the derivative of the composition of two functions, $f(g(x))$, requires only the derivatives of the individual functions. Specifically, if $f$ and $g$ are two functions, then evaluating their composition at $x+\epsilon$ gives $f(g(x+\epsilon))=f(g(x))+f'(g(x))g'(x)\epsilon$, which is exactly the chain rule. This is especially useful when dealing with complicated functions built from many nested operations and interdependencies.
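As a rough illustration of the chain rule emerging from dual arithmetic, the Python sketch below composes $\sin$ with the polynomial $g(x)=x^2+2x$. The `Dual` class and the `dsin` helper are hypothetical names for this example; `dsin` hard-codes the one fact it needs, $\sin(a+b\epsilon)=\sin(a)+\cos(a)\,b\epsilon$.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A dual number a + b*eps with eps**2 == 0 (illustrative helper)."""
    real: float
    eps: float = 0.0

    @staticmethod
    def _lift(other):
        return other if isinstance(other, Dual) else Dual(float(other), 0.0)

    def __add__(self, other):
        other = Dual._lift(other)
        return Dual(self.real + other.real, self.eps + other.eps)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual._lift(other)
        return Dual(self.real * other.real,
                    self.real * other.eps + self.eps * other.real)

    __rmul__ = __mul__


def dsin(x):
    # Elementary function lifted to duals: value sin(a), derivative cos(a) * b.
    return Dual(math.sin(x.real), math.cos(x.real) * x.eps)


def g(x):
    return x * x + 2 * x  # inner function, g'(x) = 2x + 2


def h(x):
    return dsin(g(x))     # composition h(x) = sin(g(x))


out = h(Dual(3.0, 1.0))

# Chain rule by hand for comparison: h'(3) = cos(g(3)) * g'(3) = cos(15) * 8
by_hand = math.cos(15.0) * 8.0
print(out.eps, by_hand)
```

Each function only knows its own local derivative, yet the skew part of `out` agrees with the hand-applied chain rule. This is the idea behind forward-mode automatic differentiation.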

Dual numbers are a genuinely useful mathematical concept, with practical applications wherever derivatives need to be computed alongside function values. It's pretty cool that one can encode function values and derivatives in a single number, so that derivatives come out of ordinary arithmetic rather than a separate hand derivation. On my computational blog, I have an example using dual numbers to calculate the derivative of an interatomic potential: Dual Numbers Pluto blog.

