Last edit: 17 Feb 2025
First, we need to know a tough word: infinitesimal, which means infinitely small.
dy = infinitesimal y
dx = infinitesimal x
The 'd' here can be read as delta or simply 'd' as you like. So, dy/dx = infinitesimal y divided by infinitesimal x.
Δy = change in y
Δx = change in x
lim Δx → 0 means limit Δx approaching 0
dy/dx also means the change in y divided by the change in x as the change in x approaches 0, written as dy/dx = lim Δx → 0 (Δy/Δx).
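If you'd like to watch that limit happen with actual numbers, here is a minimal Python sketch. The names difference_quotient and square are made up for this illustration, and the point x = 3 is an arbitrary choice; the function y = x² is borrowed from the example that follows.

```python
def difference_quotient(f, x, delta_x):
    """Return Δy/Δx for the function f at x, using the step Δx."""
    delta_y = f(x + delta_x) - f(x)  # change in y caused by the change in x
    return delta_y / delta_x


def square(x):
    """Illustrative function: y = x²."""
    return x * x


# Shrink Δx and watch Δy/Δx settle toward a single value at x = 3.
for delta_x in (1.0, 0.1, 0.01, 0.001):
    print(delta_x, difference_quotient(square, 3.0, delta_x))
```

The printed ratios creep closer and closer to 6 as Δx shrinks. Because this uses ordinary floating-point numbers, expect tiny rounding noise in the last digits.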
Let's begin with an example. Given y = x². Wait, wait. I know you just recited the infamous power rule. Forget about the rule you have memorized. Screw it. Ask: what is y = x²? It means that in a Cartesian coordinate system, at any point (x, y) on the curve of this equation, the value of y is the square of x. Then what happens when x is incremented by an infinitesimal change called dx? That makes y also increment by an infinitesimal change called dy. Starting from (x, y), the increment dx moves us to a new point (x + dx, y + dy). Substitute this new coordinate back into the original equation y = x²:
y + dy = (x + dx)²
y + dy = x² + 2x dx + (dx)²
Substitute y = x²
x² + dy = x² + 2x dx + (dx)²
dy = 2x dx + (dx)²
Because dx is infinitely small, (dx)² is negligible next to 2x dx, so (dx)² → 0
dy = 2x dx
dy/dx = 2x
dy/dx, also written d/dx (y), or d/dx (x²) in this case, is called the 1st derivative of y with respect to x. It also represents the gradient of y = x².
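To see why throwing away (dx)² is harmless, here is a rough sketch that does the same algebra with exact fractions instead of symbols. The helper name dy_terms and the choice x = 3 are assumptions made only for this illustration.

```python
from fractions import Fraction


def dy_terms(x, dx):
    """Split dy for y = x² into its 2x·dx part and its (dx)² part."""
    dy = (x + dx) ** 2 - x ** 2
    linear_part = 2 * x * dx
    squared_part = dx ** 2
    # Same identity as in the derivation: dy = 2x dx + (dx)²
    assert dy == linear_part + squared_part
    return dy, linear_part, squared_part


x = Fraction(3)
for dx in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    dy, linear_part, squared_part = dy_terms(x, dx)
    # dy/dx works out to 2x + dx, so the leftover after removing 2x is dx itself.
    print(dx, dy / dx, dy / dx - 2 * x)
```

The last column is always equal to dx, so as dx heads toward 0 the leftover vanishes and only 2x remains.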
Now, what is the gradient of dy/dx = 2x itself? Again, I bet you can already tell the answer, but wait, that's not why we are here today. Starting from dy/dx = 2x, when x is incremented by an infinitesimal dx, the value on the vertical axis, which is now dy/dx, also increments by an infinitesimal d (dy/dx), so the new coordinate is (x + dx, dy/dx + d (dy/dx)). Substitute this new coordinate back into the equation dy/dx = 2x:
dy/dx + d (dy/dx) = 2 (x + dx)
dy/dx + d (dy/dx) = 2x + 2dx
Substitute dy/dx = 2x
2x + d (dy/dx) = 2x + 2dx
d (dy/dx) = 2dx
d (dy/dx) / dx = 2
d²y/dx² = 2
By a writing convention, d (dy/dx) / dx or d/dx (dy/dx) is written as the infamous d²y/dx², the 2nd derivative of y with respect to x. Notice how the graph is a constant value? That's because dy/dx = 2x has a constant gradient, aka slope, at every value of x.
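As a quick sanity check on that constant, here is a small sketch that repeats the step d (dy/dx) / dx with exact fractions at a few different x values. The names gradient and gradient_change_ratio, the sample x values, and the step 1/1000 are all picked just for this illustration.

```python
from fractions import Fraction


def gradient(x):
    """The first derivative found earlier: dy/dx = 2x."""
    return 2 * x


def gradient_change_ratio(x, dx):
    """Return d(dy/dx) / dx for a small step dx."""
    return (gradient(x + dx) - gradient(x)) / dx


# Try a few different starting points with the same small step.
for x in (Fraction(-2), Fraction(0), Fraction(5)):
    print(x, gradient_change_ratio(x, Fraction(1, 1000)))
```

Every line prints 2 in the second column, no matter which x you start from, which matches d²y/dx² = 2.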