What is the Chain Rule?

The Chain Rule handles composed functions — functions inside other functions. If y = f(g(x)), then dy/dx = f'(g(x)) · g'(x). In words: differentiate the outer function (leaving the inner alone), then multiply by the derivative of the inner function.
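
The formula can be checked numerically. The sketch below is a minimal illustration, not part of the original article: the helper names central_difference and chain_rule, and the choice f(u) = u³ with g(x) = 2x + 1, are assumptions made for the example. It compares f'(g(x)) · g'(x) against a finite-difference estimate of the derivative.

```python
def central_difference(func, x, h=1e-6):
    """Estimate func'(x) with a symmetric finite difference."""
    return (func(x + h) - func(x - h)) / (2 * h)


def chain_rule(f_prime, g, g_prime, x):
    """Evaluate f'(g(x)) * g'(x): outer derivative at the inner value, times inner derivative."""
    return f_prime(g(x)) * g_prime(x)


# Illustrative composition (an assumption for this sketch): y = (2x + 1)^3
def g(x):
    return 2 * x + 1          # inner function


def g_prime(x):
    return 2.0                # derivative of the inner function


def f(u):
    return u ** 3             # outer function


def f_prime(u):
    return 3 * u ** 2         # derivative of the outer function


x0 = 0.7
analytic = chain_rule(f_prime, g, g_prime, x0)
numeric = central_difference(lambda x: f(g(x)), x0)
print("chain rule:", analytic)
print("finite difference:", numeric)
```

The two printed values should agree to several decimal places; a mismatch usually means the inner derivative was forgotten.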

The Leibniz Form

In Leibniz notation, writing u = g(x) for the inner function: dy/dx = (dy/du)·(du/dx). This looks like fraction cancellation, as if the du terms 'cancel'. The cancellation is a mnemonic rather than a rigorous proof, but it is very useful for remembering the rule and setting up Chain Rule applications.

Step-by-Step Method

Worked Examples

Example 1: d/dx[sin(x²)]. Outer = sin(u), inner = u = x². d/du[sin(u)] = cos(u). d/dx[x²] = 2x. Answer: cos(x²)·2x = 2x·cos(x²).

Example 2: d/dx[e^(3x+1)]. Outer = eᵘ, inner = u = 3x+1. d/du[eᵘ] = eᵘ. d/dx[3x+1] = 3. Derivative: e^(3x+1)·3 = 3e^(3x+1).

Example 3: d/dx[(x²+5)⁶]. Outer = u⁶, inner = x²+5. Derivative: 6(x²+5)⁵·2x = 12x(x²+5)⁵.
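
The three answers above can be sanity-checked numerically. This is a minimal sketch, assuming a central finite difference is accurate enough at the chosen test point; the helper central_difference and the test point x0 = 0.5 are choices made for illustration.

```python
import math


def central_difference(func, x, h=1e-6):
    """Estimate func'(x) with a symmetric finite difference."""
    return (func(x + h) - func(x - h)) / (2 * h)


# Each entry: (label, original function, derivative claimed in the worked example)
examples = [
    ("sin(x^2)", lambda x: math.sin(x ** 2), lambda x: 2 * x * math.cos(x ** 2)),
    ("e^(3x+1)", lambda x: math.exp(3 * x + 1), lambda x: 3 * math.exp(3 * x + 1)),
    ("(x^2+5)^6", lambda x: (x ** 2 + 5) ** 6, lambda x: 12 * x * (x ** 2 + 5) ** 5),
]

x0 = 0.5
results = {}
for label, func, claimed_derivative in examples:
    numeric = central_difference(func, x0)
    analytic = claimed_derivative(x0)
    # Relative tolerance, because the third example has large values
    results[label] = math.isclose(numeric, analytic, rel_tol=1e-6)
    print(label, "matches:", results[label])
```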

Chain Rule with Product Rule

When a composed function is also multiplied by something, use both rules. For example, d/dx[x·sin(x²)] needs the Product Rule for the product and the Chain Rule for the sin(x²) factor: (x)'·sin(x²) + x·(sin(x²))' = 1·sin(x²) + x·cos(x²)·2x = sin(x²) + 2x²cos(x²).
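
A short sketch of the same computation, with the Product Rule and Chain Rule pieces kept separate so each step is visible. The function names and the test point are assumptions for illustration; the finite-difference comparison is only a sanity check.

```python
import math


def h_func(x):
    """The function being differentiated: x * sin(x^2)."""
    return x * math.sin(x ** 2)


def h_derivative(x):
    """Product Rule on x * sin(x^2), with the Chain Rule applied to sin(x^2)."""
    first = x
    first_prime = 1.0
    second = math.sin(x ** 2)
    second_prime = math.cos(x ** 2) * 2 * x   # Chain Rule: cos(inner) * inner'
    return first_prime * second + first * second_prime


def central_difference(func, x, step=1e-6):
    """Estimate func'(x) with a symmetric finite difference."""
    return (func(x + step) - func(x - step)) / (2 * step)


x0 = 1.3
analytic = h_derivative(x0)
simplified = math.sin(x0 ** 2) + 2 * x0 ** 2 * math.cos(x0 ** 2)
numeric = central_difference(h_func, x0)
print(analytic, simplified, numeric)
```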

Quick Reference

The Chain Rule states: if y = f(g(x)), then dy/dx = f'(g(x)) · g'(x). You differentiate the outer function leaving the inner intact, then multiply by the derivative of the inner function.

Why the Chain Rule Exists

Every differentiation rule so far — Power, Product, Quotient — handles functions built by arithmetic operations. But what happens when functions are composed? When you nest one function inside another, like sin(x²) or e^(3x+1) or (x²+5)⁶, none of those rules apply directly. The Chain Rule was created specifically for this situation.

The intuition: if a small change in x causes a change in u = g(x), and that change in u causes a change in y = f(u), then the total rate of change dy/dx is the product of these two individual rates. This is why the Leibniz form dy/dx = (dy/du)·(du/dx) looks like fraction cancellation — it captures exactly this chaining of rates.
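
The chaining of rates can be made concrete with small finite changes. The sketch below assumes the composition sin(x²) at x = 1 and a step of 1e-5, both chosen only for illustration. It measures the change in u per change in x, the change in y per change in u, and shows that their product is the overall change in y per change in x.

```python
import math


def g(x):
    return x ** 2          # inner function, u = g(x)


f = math.sin               # outer function, y = f(u)

x0 = 1.0
dx = 1e-5                  # small change in x (illustrative step size)

du = g(x0 + dx) - g(x0)            # resulting change in u
dy = f(g(x0) + du) - f(g(x0))      # resulting change in y

rate_u_per_x = du / dx     # approximates du/dx = 2x
rate_y_per_u = dy / du     # approximates dy/du = cos(u)
rate_y_per_x = dy / dx     # approximates dy/dx

print("du/dx ~", rate_u_per_x)
print("dy/du ~", rate_y_per_u)
print("product ~", rate_u_per_x * rate_y_per_u)
print("dy/dx ~", rate_y_per_x)
```

For finite changes the du terms cancel exactly, as ordinary fractions; the Chain Rule is the statement that this survives in the limit as dx shrinks to zero.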

Identifying When to Use It

Before computing any derivative, ask: is the argument of every function just plain x? If the answer is no — if you see sin(3x), or e^(x²), or √(x+1), or (2x−5)⁸ — the Chain Rule is required. A reliable diagnostic: if you were to evaluate the function at x=2 by hand, would you need to compute something inside before applying the outer function? If yes, that "inside computation" is your inner function g(x).

The Three-Layer Method

For complex expressions with multiple nested layers, work strictly from outside to inside. Each layer contributes one factor in the final product.

Example (three layers): d/dx[sin²(3x)]
Layer 1 (outermost): (·)². Derivative contribution: 2(sin 3x)¹ = 2 sin(3x)
Layer 2 (middle): sin(·). Derivative contribution: cos(3x)
Layer 3 (innermost): 3x. Derivative contribution: 3
Result (multiply all layers): 2 sin(3x) · cos(3x) · 3 = 6 sin(3x)cos(3x) = 3 sin(6x)
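
The layer-by-layer product generalises to any number of layers. The sketch below is one possible way to code it, not a standard routine: differentiate_layers is a hypothetical helper that takes (function, derivative) pairs. It walks the layers from the innermost outward, because each layer's derivative must be evaluated at that layer's input; the factors are the same as in the outside-in listing, just multiplied in the opposite order.

```python
import math


def differentiate_layers(layers, x):
    """Return (value, derivative) of a composition at x.

    layers: list of (function, derivative) pairs, ordered innermost first.
    """
    value = x
    derivative = 1.0
    for func, func_prime in layers:
        derivative *= func_prime(value)   # this layer's factor, evaluated at its input
        value = func(value)               # pass the output on to the next layer
    return value, derivative


# sin^2(3x) as three layers, innermost first
layers = [
    (lambda x: 3 * x, lambda x: 3.0),      # innermost: 3x
    (math.sin, math.cos),                  # middle: sin(.)
    (lambda u: u ** 2, lambda u: 2 * u),   # outermost: (.)^2
]

x0 = 0.4
value, derivative = differentiate_layers(layers, x0)
print("sin^2(3x) at x0:", value)
print("derivative at x0:", derivative)
print("3 sin(6x) at x0:", 3 * math.sin(6 * x0))
```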

Chain Rule with Implicit Differentiation

The Chain Rule is the mechanism behind implicit differentiation. When you differentiate y² with respect to x, you are applying the Chain Rule: the outer function is (·)², the inner function is y(x). Result: 2y · (dy/dx). This is why every y-term in implicit differentiation picks up a dy/dx factor — it is the Chain Rule, every time.
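
As an illustration (the circle is an assumed example, not one from the text above), take x² + y² = 1. Differentiating both sides with respect to x gives 2x + 2y·(dy/dx) = 0, so dy/dx = −x/y. The sketch below checks both the 2y·(dy/dx) term and the resulting slope on the upper half of the circle, where y can be written explicitly.

```python
import math


def y_upper(x):
    """Explicit form of the upper half of the circle x^2 + y^2 = 1."""
    return math.sqrt(1 - x ** 2)


def implicit_slope(x, y):
    """dy/dx obtained from 2x + 2y*(dy/dx) = 0."""
    return -x / y


def central_difference(func, x, h=1e-6):
    """Estimate func'(x) with a symmetric finite difference."""
    return (func(x + h) - func(x - h)) / (2 * h)


x0 = 0.6
y0 = y_upper(x0)

slope_implicit = implicit_slope(x0, y0)
slope_numeric = central_difference(y_upper, x0)

# Chain Rule term: d/dx[y^2] should equal 2y * dy/dx
d_y_squared_numeric = central_difference(lambda x: y_upper(x) ** 2, x0)
d_y_squared_chain = 2 * y0 * slope_implicit

print(slope_implicit, slope_numeric)
print(d_y_squared_chain, d_y_squared_numeric)
```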

Common Chain Rule Mistakes

Real-World Context

The Chain Rule is the mathematical foundation of backpropagation, the algorithm used to train most modern neural networks. When a network computes a prediction, it chains together many composed functions (its layers and their activation functions). Computing how the loss changes with respect to each weight requires applying the Chain Rule recursively through every layer. Training a language model such as those behind ChatGPT involves many billions of Chain Rule computations.
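
To make the connection concrete, here is a deliberately tiny sketch: a "network" with one input, one hidden unit and two scalar weights. Everything in it (the names w1 and w2, the tanh activation, the squared-error loss, the numeric values) is an assumption chosen for illustration, and real frameworks automate these steps. The backward pass is nothing more than the Chain Rule applied from the loss back to each weight.

```python
import math


def forward(w1, w2, x, target):
    """Forward pass: x -> z = w1*x -> a = tanh(z) -> pred = w2*a -> loss."""
    z = w1 * x
    a = math.tanh(z)
    pred = w2 * a
    loss = (pred - target) ** 2
    return z, a, pred, loss


def backward(w1, w2, x, target):
    """Gradients of the loss with respect to w1 and w2 via the Chain Rule."""
    z, a, pred, loss = forward(w1, w2, x, target)
    dloss_dpred = 2 * (pred - target)      # outermost layer: squared error
    dpred_dw2 = a                          # pred = w2 * a
    dpred_da = w2
    da_dz = 1 - a ** 2                     # derivative of tanh
    dz_dw1 = x                             # z = w1 * x
    grad_w2 = dloss_dpred * dpred_dw2
    grad_w1 = dloss_dpred * dpred_da * da_dz * dz_dw1
    return grad_w1, grad_w2


# Hypothetical values for the sketch
w1, w2, x, target = 0.8, -1.2, 0.5, 1.0
grad_w1, grad_w2 = backward(w1, w2, x, target)

# Finite-difference check of both gradients
h = 1e-6
numeric_w1 = (forward(w1 + h, w2, x, target)[3] - forward(w1 - h, w2, x, target)[3]) / (2 * h)
numeric_w2 = (forward(w1, w2 + h, x, target)[3] - forward(w1, w2 - h, x, target)[3]) / (2 * h)

print("grad w1:", grad_w1, "numeric:", numeric_w1)
print("grad w2:", grad_w2, "numeric:", numeric_w2)
```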

In physics, the Chain Rule underlies the relationship between different coordinate systems. Converting between Cartesian and polar coordinates, or between laboratory and rotating frames of reference, requires the Chain Rule applied to coordinate transformations.

Practice Problems

Answers:
  1. −15x²sin(5x³)
  2. 2x/(x²+1)
  3. 21(3x−1)⁶
  4. cos(x)·e^(sin x)
  5. 2/(1+4x²)

Frequently Asked Questions
How do I know when to use the Chain Rule?
Use the Chain Rule whenever you have a function of a function — any time there is something other than a plain x inside a function. sin(3x), (x²+1)⁵, e^(x³), ln(cos x) — all require the Chain Rule. If the argument of the outer function is just x, you don't need it.
Can I apply the Chain Rule more than once?
Yes — for deeply nested functions, apply it repeatedly from the outside in. d/dx[sin²(3x)] = 2sin(3x)·cos(3x)·3 = 6sin(3x)cos(3x). Three layers: power, then sin, then 3x.
References & Further Reading
  • Stewart, J. (2015). Calculus, §3.4. Cengage.
  • Spivak, M. (2006). Calculus, Ch. 10. Publish or Perish.
  • Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning, §6.5. MIT Press.
Dr. Aisha Malik, PhD Mathematics
Senior Lecturer in Applied Mathematics · 12 years teaching calculus at university level

Dr. Malik holds a PhD in Applied Mathematics from the University of Edinburgh and has taught calculus to over 4,000 students at both undergraduate and postgraduate level. Her research focuses on numerical methods for differential equations. She has reviewed this article for mathematical accuracy and pedagogical clarity.

Technically reviewed by: Prof. James Chen, Stanford Mathematics Department