# 4.9: Newton’s Method - Mathematics

Learning Objectives

• Describe the steps of Newton’s method.
• Explain what an iterative process means.
• Recognize when Newton’s method does not work.
• Apply iterative processes to various situations.

In many areas of pure and applied mathematics, we are interested in finding solutions to an equation of the form (f(x)=0.) For most functions, however, it is difficult—if not impossible—to calculate their zeroes explicitly. In this section, we take a look at a technique that provides a very efficient way of approximating the zeroes of functions. This technique makes use of tangent line approximations and is behind the method used often by calculators and computers to find zeroes.

## Describing Newton’s Method

Consider the task of finding the solutions of (f(x)=0.) If (f) is the first-degree polynomial (f(x)=ax+b), then the solution of (f(x)=0) is given by the formula (x=−frac{b}{a}). If (f) is the second-degree polynomial (f(x)=ax^2+bx+c), the solutions of (f(x)=0) can be found by using the quadratic formula. However, for polynomials of degree 3 or more, finding roots of (f) becomes more complicated. Although formulas exist for third- and fourth-degree polynomials, they are quite complicated. Also, if f is a polynomial of degree 5 or greater, it is known that no such formulas exist. For example, consider the function

[f(x)=x^5+8x^4+4x^3−2x−7. onumber]

No formula exists that allows us to find the solutions of (f(x)=0.) Similar difficulties exist for nonpolynomial functions. For example, consider the task of finding solutions of (tan(x)−x=0.)No simple formula exists for the solutions of this equation. In cases such as these, we can use Newton’s method to approximate the roots.

Newton’s method makes use of the following idea to approximate the solutions of (f(x)=0.) By sketching a graph of (f), we can estimate a root of (f(x)=0). Let’s call this estimate (x_0). We then draw the tangent line to (f) at (x_0). If (f′(x_0)≠0), this tangent line intersects the (x)-axis at some point ((x_1,0)). Now let (x_1) be the next approximation to the actual root. Typically, (x_1) is closer than (x_0) to an actual root. Next we draw the tangent line to (f) at (x_1). If (f′(x_1)≠0), this tangent line also intersects the (x)-axis, producing another approximation, (x_2). We continue in this way, deriving a list of approximations: (x_0,, x_1,, x_2,, ….) Typically, the numbers (x_0,, x_1,, x_2,, …) quickly approach an actual root (x*), as shown in the following figure. Now let’s look at how to calculate the approximations (x_0,, x_1,, x_2,, ….) If (x_0) is our first approximation, the approximation (x_1) is defined by letting ((x_1,0)) be the (x)-intercept of the tangent line to (f) at (x_0). The equation of this tangent line is given by

[y=f(x_0)+f′(x_0)(x−x_0). onumber]

Therefore, (x_1) must satisfy

[f(x_0)+f′(x_0)(x_1−x_0)=0. onumber]

Solving this equation for (x_1), we conclude that

[x_1=x_0−frac{f(x_0)}{f'(x_0)}. onumber]

Similarly, the point ((x_2,0)) is the (x)-intercept of the tangent line to (f) at (x_1). Therefore, (x_2) satisfies the equation

[x_2=x_1−frac{f(x_1)}{f'(x_1)}. onumber]

In general, for (n>0,x_n) satisfies

[x_n=x_{n−1}−frac{f(x_{n−1})}{f'(x_{n−1})}.label{Newton}]

Next we see how to make use of this technique to approximate the root of the polynomial (f(x)=x^3−3x+1.)

Example (PageIndex{1}): Finding a Root of a Polynomial

Use Newton’s method to approximate a root of (f(x)=x^3−3x+1) in the interval ([1,2]). Let (x_0=2) and find (x_1,, x_2, ,x_3, ,x_4,) and (x_5).

Solution

From Figure (PageIndex{2}), we see that (f) has one root over the interval ((1,2)). Therefore (x_0=2) seems like a reasonable first approximation. To find the next approximation, we use Equation ef{Newton}. Since (f(x)=x^3−3x+1), the derivative is (f′(x)=3x^2−3). Using Equation ef{Newton} with (n=1) (and a calculator that displays (10) digits), we obtain

[x_1=x_0−frac{f(x_0)}{f'(x_0)}=2−frac{f(2)}{f'(2)}=2−frac{3}{9}≈1.666666667. onumber]

To find the next approximation, (x_2), we use Equation with (n=2) and the value of (x_1) stored on the calculator. We find that

[x_2=x_1-frac{f(x_1)}{f'(x_1)}≈1.548611111. onumber]

Continuing in this way, we obtain the following results:

• (x_1≈1.666666667)
• (x_2≈1.548611111)
• (x_3≈1.532390162)
• (x_4≈1.532088989)
• (x_5≈1.532088886)
• (x_6≈1.532088886.)

We note that we obtained the same value for (x_5) and (x_6). Therefore, any subsequent application of Newton’s method will most likely give the same value for (x_n). Exercise (PageIndex{1})

Letting (x_0=0), let’s use Newton’s method to approximate the root of (f(x)=x^3−3x+1) over the interval ([0,1]) by calculating (x_1) and (x_2).

Hint

Use Equation ef{Newton}.

(x_1≈0.33333333)
(x_2≈0.347222222)

Newton’s method can also be used to approximate square roots. Here we show how to approximate (sqrt{2}). This method can be modified to approximate the square root of any positive number.

Example (PageIndex{2}): Finding a Square Root

Use Newton’s method to approximate (sqrt{2}) (Figure (PageIndex{3})). Let (f(x)=x^2−2), let (x_0=2), and calculate (x_1,, x_2,, x_3,, x_4,, x_5). (We note that since (f(x)=x^2−2) has a zero at (sqrt{2}), the initial value (x_0=2) is a reasonable choice to approximate (sqrt{2})). Solution

For (f(x)=x^2−2,; f′(x)=2x.) From ef{Newton}, we know that

[egin{align*} x_n&=x_{n−1}−frac{f(x_{n−1})}{f'(x_{n−1})}[4pt]
&=x_{n−1}−frac{x^2_{n−1}−2}{2x_{n−1}}[4pt]
&=frac{1}{2}x_{n−1}+frac{1}{x_{n−1}}[4pt]
&=frac{1}{2}left(x_{n−1}+frac{2}{x_{n−1}} ight).end{align*} ]

Therefore,

(x_1=frac{1}{2}left(x_0+frac{2}{x_0} ight)=frac{1}{2}left(2+frac{2}{2} ight)=1.5)

(x_2=frac{1}{2}left(x_1+frac{2}{x_1} ight)=frac{1}{2}left(1.5+frac{2}{1.5} ight)≈1.416666667.)

Continuing in this way, we find that

(x_1=1.5)

(x_2≈1.416666667)

(x_3≈1.414215686)

(x_4≈1.414213562)

(x_5≈1.414213562.)

Since we obtained the same value for (x_4) and (x_5), it is unlikely that the value (x_n) will change on any subsequent application of Newton’s method. We conclude that (sqrt{2}≈1.414213562.)

Exercise (PageIndex{2})

Use Newton’s method to approximate (sqrt{3}) by letting (f(x)=x^2−3) and (x_0=3). Find (x_1) and (x_2).

Hint

For (f(x)=x^2−3), Equation ef{Newton} reduces to (x_n=frac{x_{n−1}}{2}+frac{3}{2x_{n−1}}).

(x_1=2)
(x_2=1.75)

When using Newton’s method, each approximation after the initial guess is defined in terms of the previous approximation by using the same formula. In particular, by defining the function (F(x)=x−left[frac{f(x)}{f′(x)} ight]), we can rewrite Equation ef{Newton} as (x_n=F(x_{n−1})). This type of process, where each (x_n) is defined in terms of (x_{n−1}) by repeating the same function, is an example of an iterative process. Shortly, we examine other iterative processes. First, let’s look at the reasons why Newton’s method could fail to find a root.

## Failures of Newton’s Method

Typically, Newton’s method is used to find roots fairly quickly. However, things can go wrong. Some reasons why Newton’s method might fail include the following:

1. At one of the approximations (x_n), the derivative (f′) is zero at (x_n), but (f(x_n)≠0). As a result, the tangent line of (f) at (x_n) does not intersect the (x)-axis. Therefore, we cannot continue the iterative process.
2. The approximations (x_0,, x_1,, x_2,, …) may approach a different root. If the function (f) has more than one root, it is possible that our approximations do not approach the one for which we are looking, but approach a different root (see Figure (PageIndex{4})). This event most often occurs when we do not choose the approximation (x_0) close enough to the desired root.
3. The approximations may fail to approach a root entirely. In Example (PageIndex{3}), we provide an example of a function and an initial guess (x_0) such that the successive approximations never approach a root because the successive approximations continue to alternate back and forth between two values. Example (PageIndex{3}): When Newton’s Method Fails

Consider the function (f(x)=x^3−2x+2). Let (x_0=0). Show that the sequence (x_1,, x_2,, …) fails to approach a root of (f).

Solution

For (f(x)=x^3−2x+2,) the derivative is (f′(x)=3x^2−2).Therefore,

[x_1=x_0−frac{f(x_0)}{f′(x_0)}=0−frac{f(0)}{f′(0)}=−frac{2}{−2}=1. onumber]

In the next step,

[x_2=x_1−frac{f(x_1)}{f'(x_1)}=1−frac{f(1)}{f′(1)}=1−frac{1}{1}=0. onumber]

Consequently, the numbers (x_0,, x_1,, x_2,, …) continue to bounce back and forth between (0) and (1) and never get closer to the root of (f) which is over the interval ([−2,−1]) (Figure (PageIndex{5})). Fortunately, if we choose an initial approximation (x_0) closer to the actual root, we can avoid this situation. Exercise (PageIndex{3})

For (f(x)=x^3−2x+2,) let (x_0=−1.5) and find (x_1) and (x_2).

Hint

Use Equation ef{Newton}.

(x_1≈−1.842105263)
(x_2≈−1.772826920)

From Example (PageIndex{3}), we see that Newton’s method does not always work. However, when it does work, the sequence of approximations approaches the root very quickly. Discussions of how quickly the sequence of approximations approach a root found using Newton’s method are included in texts on numerical analysis.

## Other Iterative Processes

As mentioned earlier, Newton’s method is a type of iterative process. We now look at an example of a different type of iterative process.

Consider a function (F) and an initial number (x_0). Define the subsequent numbers (x_n) by the formula (x_n=F(x_{n−1})). This process is an iterative process that creates a list of numbers (x_0,, x_1,, x_2,, …,, x_n,, ….) This list of numbers may approach a finite number (x^*) as (n) gets larger, or it may not. In Example (PageIndex{4}), we see an example of a function (F) and an initial guess (x_0) such that the resulting list of numbers approaches a finite value.

Example (PageIndex{4}): Finding a Limit for an Iterative Process

Let (F(x)=frac{1}{2}x+4) and let (x_0=0). For all (n≥1), let (x_n=F(x_{n−1})). Find the values (x_1,, x_2,, x_3,, x_4,, x_5). Make a conjecture about what happens to this list of numbers (x_1,, x_2,, x_3,, …,, x_n,, …) as (n→∞). If the list of numbers (x_1,, x_2,, x_3,, …) approaches a finite number (x^*), then (x^*) satisfies (x^*=F(x^*)), and (x^*) is called a fixed point of (F).

Solution

If (x_0=0), then

• (x_1=frac{1}{2}(0)+4=4)
• (x_2=frac{1}{2}(4)+4=6)
• (x_3=frac{1}{2}(6)+4=7)
• (x_4=frac{1}{2}(7)+4=7.5)
• (x_5=frac{1}{2}(7.5)+4=7.75)
• (x_6=frac{1}{2}(7.75)+4=7.875)
• (x_7=frac{1}{2}(7.875)+4=7.9375)
• (x_8=frac{1}{2}(7.9375)+4=7.96875)
• (x _9=frac{1}{2}(7.96875)+4=7.984375.)

From this list, we conjecture that the values (x_n) approach (8).

Figure (PageIndex{6}) provides a graphical argument that the values approach (8) as (n→∞). Starting at the point ((x_0,x_0)), we draw a vertical line to the point ((x_0,F(x_0))). The next number in our list is (x_1=F(x_0)). We use (x_1) to calculate (x_2). Therefore, we draw a horizontal line connecting ((x_0,x_1)) to the point ((x_1,x_1)) on the line (y=x), and then draw a vertical line connecting ((x_1,x_1)) to the point ((x_1,F(x_1))). The output (F(x_1)) becomes (x_2). Continuing in this way, we could create an infinite number of line segments. These line segments are trapped between the lines (F(x)=frac{x}{2}+4) and (y=x). The line segments get closer to the intersection point of these two lines, which occurs when (x=F(x)). Solving the equation (x=frac{x}{2}+4,) we conclude they intersect at (x=8). Therefore, our graphical evidence agrees with our numerical evidence that the list of numbers (x_0,, x_1,, x_2,, …) approaches (x*=8) as (n→∞). Exercise (PageIndex{4})

Consider the function (F(x)=frac{1}{3}x+6). Let (x_0=0) and let (x_n=F(x_{n−1})) for (n≥2). Find (x_1,, x_2,, x_3,, x_4,, x_5). Make a conjecture about what happens to the list of numbers (x_1,, x_2,, x_3,, …, x_n,, … ) as (n→∞.)

Hint

Consider the point where the lines (y=x) and (y=F(x)) intersect.

(x_1=6,;x_2=8,;x_3=frac{26}{3},;x_4=frac{80}{9},;x_5=frac{242}{27};;x^*=9)

Iterative Processes and Chaos

Iterative processes can yield some very interesting behavior. In this section, we have seen several examples of iterative processes that converge to a fixed point. We also saw in Example (PageIndex{4}) that the iterative process bounced back and forth between two values. We call this kind of behavior a 2-cycle. Iterative processes can converge to cycles with various periodicities, such as 2−cycles, 4−cycles (where the iterative process repeats a sequence of four values), 8-cycles, and so on.

Some iterative processes yield what mathematicians call chaos. In this case, the iterative process jumps from value to value in a seemingly random fashion and never converges or settles into a cycle. Although a complete exploration of chaos is beyond the scope of this text, in this project we look at one of the key properties of a chaotic iterative process: sensitive dependence on initial conditions. This property refers to the concept that small changes in initial conditions can generate drastically different behavior in the iterative process.

Probably the best-known example of chaos is the Mandelbrot set (see Figure), named after Benoit Mandelbrot (1924–2010), who investigated its properties and helped popularize the field of chaos theory. The Mandelbrot set is usually generated by computer and shows fascinating details on enlargement, including self-replication of the set. Several colorized versions of the set have been shown in museums and can be found online and in popular books on the subject. In this project we use the logistic map

[f(x)=rx(1−x)]

where (x∈[0,1]) and (r>0)

as the function in our iterative process. The logistic map is a deceptively simple function; but, depending on the value of (r), the resulting iterative process displays some very interesting behavior. It can lead to fixed points, cycles, and even chaos.

To visualize the long-term behavior of the iterative process associated with the logistic map, we will use a tool called a cobweb diagram. As we did with the iterative process we examined earlier in this section, we first draw a vertical line from the point ((x_0,0)) to the point ((x_0,f(x_0))=(x_0,x_1)). We then draw a horizontal line from that point to the point ((x_1,x_1),) then draw a vertical line to ((x_1,f(x_1))=(x_1,x_2)), and continue the process until the long-term behavior of the system becomes apparent. Figure shows the long-term behavior of the logistic map when (r=3.55) and (x_0=0.2). (The first (100) iterations are not plotted.) The long-term behavior of this iterative process is an (8)-cycle. 1. Let (r=0.5) and choose (x_0=0.2). Either by hand or by using a computer, calculate the first (10) values in the sequence. Does the sequence appear to converge? If so, to what value? Does it result in a cycle? If so, what kind of cycle (for example, (2)−cycle, (4)−cycle.)?
2. What happens when (r=2)?
3. For (r=3.2) and (r=3.5), calculate the first (100) sequence values. Generate a cobweb diagram for each iterative process. (Several free applets are available online that generate cobweb diagrams for the logistic map.) What is the long-term behavior in each of these cases?
4. Now let (r=4.) Calculate the first (100) sequence values and generate a cobweb diagram. What is the long-term behavior in this case?
5. Repeat the process for (r=4,) but let (x_0=0.201.) How does this behavior compare with the behavior for (x_0=0.2)?

## Key Concepts

• Newton’s method approximates roots of (f(x)=0) by starting with an initial approximation (x_0), then uses tangent lines to the graph of (f) to create a sequence of approximations (x_1,, x_2,, x_3,, ….)
• Typically, Newton’s method is an efficient method for finding a particular root. In certain cases, Newton’s method fails to work because the list of numbers (x_0,, x_1,, x_2,, …) does not approach a finite value or it approaches a value other than the root sought.
• Any process in which a list of numbers (x_0,, x_1,, x_2,, …) is generated by defining an initial number (x_0) and defining the subsequent numbers by the equation (x_n=F(x_{n−1})) for some function (F) is an iterative process. Newton’s method is an example of an iterative process, where the function (F(x)=x−left[frac{f(x)}{f′(x)} ight]) for a given function (f).

## Glossary

iterative process
process in which a list of numbers (x_0,x_1,x_2,x_3…) is generated by starting with a number (x_0) and defining (x_n=F(x_{n−1})) for (n≥1)
Newton’s method
method for approximating roots of (f(x)=0;) using an initial guess (x_0); each subsequent approximation is defined by the equation (x_n=x_{n−1}−frac{f(x_{n−1})}{f'(x_{n−1})})

### Calculus Revisited #9: Newton's Method

Welcome to blog entry #9 of 21 of our Calculus Revisited. Today we will cover Newton's Method: an effective and powerful way of finding roots of functions.

Process: Newton's Method

Goal: Find roots of the function f(x). That is, solve for x: f(x) = 0

2. Calculate x_(n+1) = x_n - f(x_n)/f'(x_n) (function over its derivative)

3. If x_n+1 matches x_n, you are done, with the root being x_n+1. You determine how accurate the answer is by selecting how many decimal points to match. Calculator accuracy (what you see on the screen) varies from 10 to 12 decimal points.

Caution: The choice of initial value is important. Usually, selecting initial guesses close to the root works the best, but this does not always work.

Also, this method is not 100% in finding roots. If you get x_n and x_n+1 flopping between two distinct values, you are caught in a loop - a root will not be found.

I recommend that you use a calculator when working with Newton's Method. On scientific calculators, you may be able to take advantage of the last answer feature, setting x_n = ans.

Example: (TI -36X Pro, TI-30 Multiview, Casio fx-300, Casio fx-115ES, most graphing calculators)
2 ENTER (2 is stored in ans)
ans - f(ans)/f'(ans) (calculates x_1 with x_n = 2)
Keep pressing ENTER

Problems
Newton's Method will often be used to estimate nth roots (square root, cube root, etc). The first problem will be an example of such.

1. Estimate 𕔊 to 5 decimal places.

Observe that 𕔈 = 2 and 𕔍 = 3, which means 𕔊 lies in between 2 and 3. Let's make an initial guess 2.5.

Use the function f(x) = x^2 - 6. Why?

Finding the root to the above equation leads to 𕔊.

Having x_0 as the initial guess,
x_0 = 2.50000
x_1 = 2.45000
x_2 = 2.44949
x_3 = 2.44949

(x_1 = 2.5 - (2.5^2 - 6)/(2*2.5) = 2.45,
x_2 = 2.45 - (2.45^2 - 6)/(2*2.45) = 2.44949, and so on)

So to five decimal places, 𕔊 ≈ 2.44949

2. Find a root of the equation e^x - x = 5 for any x > 0. Let x_0 = 2. (the initial guess)

Get the equation in form of f(x) = 0:
e^x - x = 5
e^x - x - 5 = 0

Then f(x) = e^x - x - 5, f'(x) = e^x - 1, and

x_n+1 = x_n - (e^x_n - x_n - 5)/(e^x_n - 1)

x_0 = 2.00000
x_1 = 1.93911
x_2 = 1.93685
x_3 = 1.93685

Then a root of f(x) is x𕛑.93685.

3. Solve x^4 - 4x - 4 = 0 with the condition -1 Hint: You may want to graph f(x) to estimate an appropriate initial guess.

Then f(x) = x^4 - 4x - 4, f'(x) = 4x^3 - 4, and

x_n+1 = x_n - (x_n^4 - 4x_n - 4)/(4x_n^3 - 4)

By calculation, we get:
x_0 = -0.50000
x_1 = -0.93056
x_2 = -0.86520
x_3 = -0.86198
x_4 = -0.86198

That wraps it up on Newton's Method. Next we head into integration! - Eddie

## 4.9: Newton’s Method - Mathematics

Solving algebraic equations is a common exercise in introductory Mathematics classes. However, sometimes equations cannot be solved using simple algebra and we might be required to find a good, accurate $estimate$ of the exact solution. A common and easily used algorithm to find a good estimate to an equation's exact solution is Newton's Method (also called the Newton-Raphson Method), which was developed in the late 1600's by the English Mathematicians Sir Isaac Newton and Joseph Raphson .

The algorithm for Newton's Method is simple and easy-to-use. It uses the the first derivative of a function and is based on the basic Calculus concept that the derivative of a function $f$ at $x=c$ is the slope of the line tangent to the graph of $y=f(x)$ at the point $(c, f(c))$. Let's carefully construct Newton's Method.

Let $y=f(x)$ be a differentiable function. Our goal is to solve the equation $f(x)=0$ for $x$. Let's call the exact solution to this equation $x=r$. See the diagram below.

We begin with an $initial guess$ $x_<0>$. At the point $(x_<0>, f(x_<0>))$ draw the tangent line. Denote $x_<1>$ as the point where this tangent line crosses the $x$-axis. The point $x_<1>$ is our second guess. Repeat. At the point $(x_<1>, f(x_<1>))$ draw the tangent line. Denote $x_<2>$ as the point where this tangent line crosses the $x$-axis. The point $x_<2>$ is our third guess. Repeat . etc. See the diagram below.

We can see that these successive guesses, $x_<0>, x_<1>, x_<2>, x_<3>, cdots$ zig-zag their way and get closer and closer to the exact solution $x=r$. Let's create a $recursion$ which will generate a sequence of successive guesses. Let's first consider how we can go from one guess to the next, i.e., from guess $x_$ to guess $x_$. See the diagram below.

The $SLOPE$ of the tangent line at the point $(x_, f(x_))$ is $I.) m = f'(x_)$ and $II.) m = displaystyle < rise over run >= < f(x_) over x_ - x_ >$ Now set the two slopes equal to each other getting $f'(x_) = < f(x_) over x_ - x_ > longrightarrow$ $x_ - x_ = < f(x_) over f'(x_)> longrightarrow$ $x_ = x_ - < f(x_) over f'(x_)> longrightarrow$ i.e., the recursion for Newton's Method is $x_ = x_ - < f(x_) over f'(x_)>$

In the list of Newton's Method Problems which follows, most problems are average and a few are somewhat challenging. I recommend using the Desmos Graphing Calculator if you want to graph functions. It's fun and easy to use.

PROBLEM 1 : Apply Newton's Method to the equation $x^3+x-5=0$. Begin with the given initial guess, $x_ <0>$, and find $x_ <1>$ and $x_ <2>$. Then use a spreadsheet or some other technology tool to find the solution to this equation to five decimal places.

a.) Use the initial guess $x_<0>=0$.
b.) Use the initial guess $x_<0>=1$.
c.) Use the initial guess $x_<0>=-1$.

a.) Use the initial guess $x_<0>=1$.
b.) Use the initial guess $x_<0>=2$.
c.) Use the initial guess $x_<0>=0.5$.
d.) Use the initial guess $x_<0>=1000$ !
e.) Use the initial guess $x_<0>=-500$ !

A heartfelt "Thank you" goes to The MathJax Consortium for making the construction of this webpage fun and easy.

We want to minimize $f(mathbf)$ . Firstly, we consider its second-order truncated Taylor series:

$f(mathbf) approx hat(mathbf)= f(mathbf) + [mathbf-mathbf]^T, abla f(mathbf) + frac<1><2>,[mathbf-mathbf]^T, abla^2 f(mathbf),[mathbf-mathbf]$

Considering the Hessian Matrix $abla^2 f(mathbf)$ is positve-definite, $hat(mathbf)$ can be easily minimized:

So we have the minimized value:

$hat(mathbf)= f(mathbf) - [ abla^2 f(mathbf)^<-1> abla f(mathbf)]^T abla f(mathbf) + frac<1><2>[ abla^2 f(mathbf)^<-1> abla f(mathbf)]^T abla^2 f(mathbf)[ abla^2 f(mathbf)^<-1> abla f(mathbf)]$

So the decrement on the approximation $hat(mathbf)$ is given by:

(A Hessian Matrix is always symmetric for "well behaved" functions)

## An Introduction to Optimization, 4th Edition Praise for the Third Edition ". . . guides and leads the reader through the learning path . . . [e]xamples are stated very clearly and the results are presented with attention to detail."  —MAA Reviews

Fully updated to reflect new developments in the field, the Fourth Edition of Introduction to Optimization fills the need for accessible treatment of optimization theory and methods with an emphasis on engineering design. Basic definitions and notations are provided in addition to the related fundamental background for linear algebra, geometry, and calculus.

• A new chapter on integer programming
• Expanded coverage of one-dimensional methods
• Updated and expanded sections on linear matrix inequalities
• Numerous new exercises at the end of each chapter
• MATLAB exercises and drill problems to reinforce the discussed theory and algorithms
• Numerous diagrams and figures that complement the written presentation of key concepts
• MATLAB M-files for implementation of the discussed theory and algorithms (available via the book's website)

## Contents

The idea is to start with an initial guess which is reasonably close to the true root, then to approximate the function by its tangent line using calculus, and finally to compute the x -intercept of this tangent line by elementary algebra. This x -intercept will typically be a better approximation to the original function's root than the first guess, and the method can be iterated.

More formally, suppose f : (a, b) → ℝ is a differentiable function defined on the interval (a, b) with values in the real numbers ℝ , and we have some current approximation xn . Then we can derive the formula for a better approximation, xn + 1 by referring to the diagram on the right. The equation of the tangent line to the curve y = f (x) at x = xn is

where f′ denotes the derivative. The x -intercept of this line (the value of x which makes y = 0 ) is taken as the next approximation, xn + 1 , to the root, so that the equation of the tangent line is satisfied when ( x , y ) = ( x n + 1 , 0 ) ,0)> :

Solving for xn + 1 gives

We start the process with some arbitrary initial value x0 . (The closer to the zero, the better. But, in the absence of any intuition about where the zero might lie, a "guess and check" method might narrow the possibilities to a reasonably small interval by appealing to the intermediate value theorem.) The method will usually converge, provided this initial guess is close enough to the unknown zero, and that f ′(x0) ≠ 0 . Furthermore, for a zero of multiplicity 1, the convergence is at least quadratic (see rate of convergence) in a neighbourhood of the zero, which intuitively means that the number of correct digits roughly doubles in every step. More details can be found in the analysis section below.

Householder's methods are similar but have higher order for even faster convergence. However, the extra computations required for each step can slow down the overall performance relative to Newton's method, particularly if f or its derivatives are computationally expensive to evaluate.

The name "Newton's method" is derived from Isaac Newton's description of a special case of the method in De analysi per aequationes numero terminorum infinitas (written in 1669, published in 1711 by William Jones) and in De metodis fluxionum et serierum infinitarum (written in 1671, translated and published as Method of Fluxions in 1736 by John Colson). However, his method differs substantially from the modern method given above. Newton applied the method only to polynomials, starting with an initial root estimate and extracting a sequence of error corrections. He used each correction to rewrite the polynomial in terms of the remaining error, and then solved for a new correction by neglecting higher-degree terms. He did not explicitly connect the method with derivatives or present a general formula. Newton applied this method to both numerical and algebraic problems, producing Taylor series in the latter case.

Newton may have derived his method from a similar but less precise method by Vieta. The essence of Vieta's method can be found in the work of the Persian mathematician Sharaf al-Din al-Tusi, while his successor Jamshīd al-Kāshī used a form of Newton's method to solve x PN = 0 to find roots of N (Ypma 1995). A special case of Newton's method for calculating square roots was known since ancient times and is often called the Babylonian method.

Newton's method was used by 17th-century Japanese mathematician Seki Kōwa to solve single-variable equations, though the connection with calculus was missing. 

Newton's method was first published in 1685 in A Treatise of Algebra both Historical and Practical by John Wallis.  In 1690, Joseph Raphson published a simplified description in Analysis aequationum universalis.  Raphson also applied the method only to polynomials, but he avoided Newton's tedious rewriting process by extracting each successive correction from the original polynomial. This allowed him to derive a reusable iterative expression for each problem. Finally, in 1740, Thomas Simpson described Newton's method as an iterative method for solving general nonlinear equations using calculus, essentially giving the description above. In the same publication, Simpson also gives the generalization to systems of two equations and notes that Newton's method can be used for solving optimization problems by setting the gradient to zero.

Arthur Cayley in 1879 in The Newton–Fourier imaginary problem was the first to notice the difficulties in generalizing Newton's method to complex roots of polynomials with degree greater than 2 and complex initial values. This opened the way to the study of the theory of iterations of rational functions.

Newton's method is a powerful technique—in general the convergence is quadratic: as the method converges on the root, the difference between the root and the approximation is squared (the number of accurate digits roughly doubles) at each step. However, there are some difficulties with the method.

### Difficulty in calculating derivative of a function Edit

Newton's method requires that the derivative can be calculated directly. An analytical expression for the derivative may not be easily obtainable or could be expensive to evaluate. In these situations, it may be appropriate to approximate the derivative by using the slope of a line through two nearby points on the function. Using this approximation would result in something like the secant method whose convergence is slower than that of Newton's method.

### Failure of the method to converge to the root Edit

It is important to review the proof of quadratic convergence of Newton's method before implementing it. Specifically, one should review the assumptions made in the proof. For situations where the method fails to converge, it is because the assumptions made in this proof are not met.

#### Overshoot Edit

If the first derivative is not well behaved in the neighborhood of a particular root, the method may overshoot, and diverge from that root. An example of a function with one root, for which the derivative is not well behaved in the neighborhood of the root, is

In some cases, Newton's method can be stabilized by using successive over-relaxation, or the speed of convergence can be increased by using the same method.

#### Stationary point Edit

If a stationary point of the function is encountered, the derivative is zero and the method will terminate due to division by zero.

#### Poor initial estimate Edit

A large error in the initial estimate can contribute to non-convergence of the algorithm. To overcome this problem one can often linearize the function that is being optimized using calculus, logs, differentials, or even using evolutionary algorithms, such as the stochastic tunneling. Good initial estimates lie close to the final globally optimal parameter estimate. In nonlinear regression, the sum of squared errors (SSE) is only "close to" parabolic in the region of the final parameter estimates. Initial estimates found here will allow the Newton–Raphson method to quickly converge. It is only here that the Hessian matrix of the SSE is positive and the first derivative of the SSE is close to zero.

#### Mitigation of non-convergence Edit

In a robust implementation of Newton's method, it is common to place limits on the number of iterations, bound the solution to an interval known to contain the root, and combine the method with a more robust root finding method.

### Slow convergence for roots of multiplicity greater than 1 Edit

If the root being sought has multiplicity greater than one, the convergence rate is merely linear (errors reduced by a constant factor at each step) unless special steps are taken. When there are two or more roots that are close together then it may take many iterations before the iterates get close enough to one of them for the quadratic convergence to be apparent. However, if the multiplicity m of the root is known, the following modified algorithm preserves the quadratic convergence rate: 

This is equivalent to using successive over-relaxation. On the other hand, if the multiplicity m of the root is not known, it is possible to estimate m after carrying out one or two iterations, and then use that value to increase the rate of convergence.

If the multiplicity m of the root is finite then g(x) = f(x) / f′(x) will have a root at the same location with multiplicity 1. Applying Newton's method to find the root of g(x) recovers quadratic convergence in many cases although it generally involves the second derivative of f(x) . In a particularly simple case, if f(x) = x m then g(x) = x / m and Newton's method finds the root in a single iteration with

Suppose that the function f has a zero at α , i.e., f (α) = 0 , and f is differentiable in a neighborhood of α .

If f is continuously differentiable and its derivative is nonzero at α , then there exists a neighborhood of α such that for all starting values x0 in that neighborhood, the sequence <xn> will converge to α . 

If the function is continuously differentiable and its derivative is not 0 at α and it has a second derivative at α then the convergence is quadratic or faster. If the second derivative is not 0 at α then the convergence is merely quadratic. If the third derivative exists and is bounded in a neighborhood of α , then:

If the derivative is 0 at α , then the convergence is usually only linear. Specifically, if f is twice continuously differentiable, f ′(α) = 0 and f ″(α) ≠ 0 , then there exists a neighborhood of α such that, for all starting values x0 in that neighborhood, the sequence of iterates converges linearly, with rate 1/2  Alternatively, if f ′(α) = 0 and f ′(x) ≠ 0 for xα , x in a neighborhood U of α , α being a zero of multiplicity r , and if fC r (U) , then there exists a neighborhood of α such that, for all starting values x0 in that neighborhood, the sequence of iterates converges linearly.

However, even linear convergence is not guaranteed in pathological situations.

In practice, these results are local, and the neighborhood of convergence is not known in advance. But there are also some results on global convergence: for instance, given a right neighborhood U+ of α , if f is twice differentiable in U+ and if f ′ ≠ 0 , f · f ″ > 0 in U+ , then, for each x0 in U+ the sequence xk is monotonically decreasing to α .

### Proof of quadratic convergence for Newton's iterative method Edit

According to Taylor's theorem, any function f (x) which has a continuous second derivative can be represented by an expansion about a point that is close to a root of f (x) . Suppose this root is α . Then the expansion of f (α) about xn is:

## Non-polynomial Functions with Multiple Roots

When using a computer to find roots of more complicated functions it's best to understand what is going on and give the computer a guess close to your desired answer.

### Example 2

[Certain math software is not able to find the solution directly for us. We need to know how to properly use the tool to get the solution, either with graphs or setting up Newton's Method. This could involve giving an initial estimate for the root.]

There appear to be 2 roots, one near t = &minus1 and the other near t = 3 . However, if we look more carefully in the region near t = 3 (by zooming in), we see that there is more than one root there.

By simple substitution, we can see that one of the roots is exactly t = 3 :

Now for the root near t = 3.4 .

We will use Newton's Method to approximate the root. We need to differentiate y = 1&minus t 2 + 2 t . Since we have t as an exponent in our expression, we need to use logarithms when differentiating. [See how to differentiate logarithms in Derivative of the Logarithmic Function].

Let's differentiate 2 t by itself first.

Take natural log of both sides:

So for Newton's Method in this example, we would have:

We can write this more conveniently (for later steps) showing the substitution as:

Now, doing another step of Newton's Method:

We can conclude that correct to 7 decimal places, t = 3.4074505 .

Using a computer algebra system, we can zoom into the root and we can see (from where the graph cuts the y-axis) that t is indeed close to 3.40745.

Now for the negative case. Let t0 = &minus1 be our initial guess.

t_2 =-1.213076633- [(1-t^2+2^t)/(-2t+2^t ln 2)]_(t=-1.213076633) =-1.198322474

t_3 =-1.198322474- [(1-t^2+2^t)/(-2t+2^t ln 2)]_(t=-1.198322474) =-1.198250199

We could continue until we obtained the required accuracy.

Comparing this to the zoomed in graph, we can see that the solution is t = &minus1.198250197 , correct to 9 decimal places.

So the solutions for 1&minus t 2 + 2 t = 0 are

correct to 5 decimal places.

## BIBLIOGRAPHY

Aristotle, Physics, Book I Nicomachean Ethics, Book III
cf. Thomas L. Health, Mathematics in Aristotle (Oxford,
1949), pp. 270-72. F. Bacon, The Philosophical Works of
Francis Bacon,
ed. J. M. Robertson (London, 1905), p. 249.
Isaac Barrow, Mathematical Lectures read in the Publick
Schools at the University of Cambridge,
trans. John Kirkby
(London, 1734) for Barrow's familiarity with Galileo's works
see Marie Boas Hall, “Galileo's Influence on Seventeenth-
Century English Scientists,” in Galileo, Man of Science, ed.
Ernan McMullin (New York and London, 1967), pp. 411-12.
Ernst Cassirer, Das Erkenntnisproblem, 3 vols. (Berlin,
1922-23), I, 136-44 for Randall's view, see his well-known
paper “Scientific Method in the School of Padua,” Journal
of the History of Ideas,
1 (1940), 177-206, and its revision
in his The School of Padua and the Emergence of Modern
Science
(Padua, 1961). Cassirer accepted the results of
Randall's inquiry but could not “subscribe to his conclu-
sions,” for he believed Galileo's conception of the dual
method, despite the identity of the terms used, to be more
influenced by the mathematical tradition than by the phi-
losophers of Padua see his “Galileo's Platonism,” in M. P.
Ashley Montagu, ed. Studies and Essays in the History of
Science and Learning
(New York, 1946), pp. 279-97. I. B.
Cohen, Franklin and Newton, An Inquiry into Speculative
Newtonian Experimental Science
M. R. Cohen and I. Drabkin, eds., A Source Book in
Greek Science
(New York, 1948 Cambridge, Mass. 1959). A.
Crombie, Robert Grosseteste and the Origins of Experimental
Science
(Oxford, 1953). Descartes, Oeuvres de Descartes, eds.
C. Adam and P. Tannery, 13 vols. (Paris, 1891-1912). Galen,
Claudii Galeni Opera omnia, ed. C. G. Kühn, 20 vols.
(Leipzig, 1821-33) idem, Galen on Medical Experience, ed.
and trans. R. Walzer (London and New York, 1944). Galileo
Galilei, “Letter to the Grand Duchess Christina” (1615),
in S. Drake, Discoveries and Opinions of Galileo (Garden
City, N.Y., 1957) idem, Opere di Galileo Galilei, ed. A.
Favaro, 20 vols. (Florence, 1890-1909 reprint 1929-39)
idem, Dialogues Concerning Two New Sciences, trans. H.
Crew and A. de Salvio (New York, 1914) idem, Dialogue
Concerning the Two Chief World Systems,
trans. S. Drake
(Berkeley and Los Angeles, 1953). Neal Gilbert, Renaissance
Concepts of Method
(New York and London, 1960). Thomas
L. Hankins, Jean D'Alembert—Science and the Enlighten-
ment
(Oxford, 1970), passim. Jaako Hintikka, “Kant and the
Tradition of Analysis,” Deskription, Analytizität und Ex-
istenz, 3-4 Forschungsgespräch des internationalen For-
schungszentrum für Grundfragen der Wissenschaften Salz-
burg,
ed. Paul Weingartner (Pustet-Verlag, Salzburg, and
Munich, 1966), pp. 254-72. Robert Hooke, Micrographia
(London, 1665) Hooke used Bacon's term instantia crucis
but in one place modified it to read experimentum crucis.
See Richard S. Westfall, “The Development of Newton's
Theory of Color,” Isis, 53 (1962), 354, and note 46. For a
skeptical appraisal of Newton's famous experiment see
A. I. Sabra, Theories of Light from Descartes to Newton
(London, 1967), pp. 294-97. Sabra's argument that only
Newton's adherence to a corpuscular doctrine permitted
him to infer the heterogeneity of white light from this
experiment is inconclusive. W. S. Jevons, The Principles of
Science
(London and New York, 1905). P. S. de Laplace,
Exposition du système du monde, 6th ed. (Paris, 1835).
A. L. Lavoisier, Traité élémentaire de chimie (Paris, 1789).
Colin Maclaurin, Account of Sir Isaac Newton's Philo-
sophical Discoveries,
3rd ed. (London, 1775). Paul Mouy,
Le Développement de la physique cartésienne, 1646-1712
(Paris, 1934). Isaac Newton, Mathematical Principles of
Natural Philosophy,
ed. F. Cajori (Berkeley, 1934) idem,
Opticks, 4th ed. (1730) idem, Isaac Newton's Papers and
Letters on Natural Philosophy,
ed. I. B. Cohen (Cambridge,
Mass., 1958) idem, “Account of the Booke entituled Com-
mercium Epistolicum, etc.,” Philosophical Transactions, 19,
no. 342 (1717) trans. “Recensio,” in the second edition
(1722) of the Commercium for Newton's authorship of this
“Account” see Louis T. More, Isaac Newton (New York,
1934), pp. 590-91, note 43 idem, Universal Arithmetick
(London, 1728). Jacques Rohault, Traité de physique (Paris,
1671). W. J. 'sGravesande, Introductio ad philosophiam,
metaphysicam et logicam continens
(Leiden, 1736) trans.
into French as Oeuvres philosophiques et mathématiques de
Mr G. J. 'sGravesande,
ed. J. N. S. Allamand, two parts in
one vol. (Amsterdam, 1774). E. W. Strong, “Newton's
'Mathematical Way'” in Roots of Scientific Thought, eds.
Philip P. Wiener and Aaron Noland (New York, 1957) the
article is reprinted, somewhat abridged, from the Journal
of the History of Ideas,
12 (1951), 90-110. Henry G. Van
Leeuwen, The Problem of Certainty in English Thought
(The Hague, 1963). Basil Willey, The Seventeenth Century
Background
(London, 1949).

Cosmology Experimental Science Newton's Opticks
Number Optics Unity of Science.]

## Related Resources

##### Instructor  #### Buy Both and Save 25%!

This item: An Introduction to Optimization, 4th Edition

Cannot be combined with any other offers.  #### Buy Both and Save 25%!

This item: An Introduction to Optimization, 4th Edition

Cannot be combined with any other offers.  #### Buy Both and Save 25%!

This item: An Introduction to Optimization, 4th Edition

Cannot be combined with any other offers.

## Damped Newton's Method

To help improve convergence, newton's method may be dampened with a constant α from (0,1]. Ideally, the each value of α should have the next iteration get as close to the root as possible. Only possible method of determining α is the Bank-Rose algorithm. Ώ]