Articles

13.8: Optimization of Functions of Several Variables - Mathematics


One of the most useful applications for derivatives of a function of one variable is the determination of maximum and/or minimum values. This application is also important for functions of two or more variables, but as we have seen in earlier sections of this chapter, the introduction of more independent variables leads to more possible outcomes for the calculations. The main ideas of finding critical points and using derivative tests are still valid, but new wrinkles appear when assessing the results.

Critical Points

For functions of a single variable, we defined critical points as the values of the variable at which the derivative equals zero or does not exist. For functions of two or more variables, the concept is essentially the same, except that we are now working with partial derivatives.

Definition: Critical Points

Let (z=f(x,y)) be a function of two variables that is differentiable on an open set containing the point ((x_0,y_0)). The point ((x_0,y_0)) is called a critical point of a function of two variables (f) if one of the two following conditions holds:

  1. (f_x(x_0,y_0)=f_y(x_0,y_0)=0)
  2. Either (f_x(x_0,y_0)) or (f_y(x_0,y_0)) does not exist.

Example (PageIndex{1}): Finding Critical Points

Find the critical points of each of the following functions:

  1. (f(x,y)=sqrt{4y^2−9x^2+24y+36x+36})
  2. (g(x,y)=x^2+2xy−4y^2+4x−6y+4)

Solution:

a. First, we calculate (f_x(x,y)) and (f_y(x,y)):

[egin{align*} f_x(x,y)&=dfrac{1}{2}(−18x+36)(4y^2−9x^2+24y+36x+36)^{−1/2} &=dfrac{−9x+18}{sqrt{4y^2−9x^2+24y+36x+36}} end{align*}]

[egin{align*} f_y(x,y)&=dfrac{1}{2}(8y+24)(4y^2−9x^2+24y+36x+36)^{−1/2} &=dfrac{4y+12}{sqrt{4y^2−9x^2+24y+36x+36}} end{align*}.]

Next, we set each of these expressions equal to zero:

[egin{align*} dfrac{−9x+18}{sqrt{4y^2−9x^2+24y+36x+36}}&=0 dfrac{4y+12}{sqrt{4y^2−9x^2+24y+36x+36}}&=0. end{align*}]

Then, multiply both sides of each equation by its denominator (to clear the denominators):

[egin{align*} −9x+18&=0 4y+12&=0. end{align*}]

Therefore, (x=2) and (y=−3,) so ((2,−3)) is a critical point of (f).

We must also check for the possibility that the denominator of each partial derivative can equal zero, thus causing the partial derivative not to exist. Since the denominator is the same in each partial derivative, we need only do this once:

[4y^2−9x^2+24y+36x+36=0. onumber]

This equation represents a hyperbola. We should also note that the domain of (f) consists of points satisfying the inequality

[4y^2−9x^2+24y+36x+36≥0. onumber]

Therefore, any points on the hyperbola are not only critical points, they are also on the boundary of the domain. To put the hyperbola in standard form, we use the method of completing the square:

[egin{align*} 4y^2−9x^2+24y+36x+36&=0 4y^2−9x^2+24y+36x&=−36 4y^2+24y−9x^2+36x&=−36 4(y^2+6y)−9(x^2−4x)&=−36 4(y^2+6y+9)−9(x^2−4x+4)&=−36−36+36 4(y+3)^2−9(x−2)^2&=−36.end{align*}]

Dividing both sides by (−36) puts the equation in standard form:

[egin{align*} dfrac{4(y+3)^2}{−36}−dfrac{9(x−2)^2}{−36}&=1 dfrac{(x−2)^2}{4}−dfrac{(y+3)^2}{9}&=1. end{align*}]

Notice that point ((2,−3)) is the center of the hyperbola.

Thus, the critical points of the function (f) are ( (2, -3) ) and all points on the hyperbola, (dfrac{(x−2)^2}{4}−dfrac{(y+3)^2}{9}=1).
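As a quick sanity check (not part of the original text), the critical point ((2,−3)) found above can be verified numerically. The Python sketch below approximates (f_x) and (f_y) at ((2,−3)) with central finite differences; the helper `partials` is an ad hoc name, not a library function.

```python
import math

# f from Example 1a; valid where the square root's argument is nonnegative,
# which holds in a neighborhood of (2, -3) since the argument equals 36 there.
def f(x, y):
    return math.sqrt(4*y**2 - 9*x**2 + 24*y + 36*x + 36)

def partials(fn, x, y, h=1e-6):
    """Approximate (f_x, f_y) at (x, y) with central differences."""
    fx = (fn(x + h, y) - fn(x - h, y)) / (2*h)
    fy = (fn(x, y + h) - fn(x, y - h)) / (2*h)
    return fx, fy

fx, fy = partials(f, 2, -3)
print(fx, fy)  # both approximately zero, as expected at a critical point
```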

b. First, we calculate (g_x(x,y)) and (g_y(x,y)):

[egin{align*} g_x(x,y)&=2x+2y+4 g_y(x,y)&=2x−8y−6. end{align*}]

Next, we set each of these expressions equal to zero, which gives a system of equations in (x) and (y):

[egin{align*} 2x+2y+4&=0 2x−8y−6&=0. end{align*}]

Subtracting the second equation from the first gives (10y+10=0), so (y=−1). Substituting this into the first equation gives (2x+2(−1)+4=0), so (x=−1).

Therefore ((−1,−1)) is a critical point of (g). There are no points in (mathbb{R}^2) that make either partial derivative not exist.
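Since setting (g_x) and (g_y) to zero produces a 2×2 linear system, it can also be solved mechanically. The sketch below (not part of the original text) uses Cramer's rule; `solve_2x2` is a hypothetical helper written for this check.

```python
# g_x = 2x + 2y + 4 = 0 and g_y = 2x - 8y - 6 = 0, rewritten as
# a1*x + b1*y = c1 and a2*x + b2*y = c2, solved by Cramer's rule.
def solve_2x2(a1, b1, c1, a2, b2, c2):
    det = a1*b2 - a2*b1
    if det == 0:
        raise ValueError("system is singular")
    return ((c1*b2 - c2*b1) / det, (a1*c2 - a2*c1) / det)

x, y = solve_2x2(2, 2, -4, 2, -8, 6)
print((x, y))  # the critical point of g: (-1.0, -1.0)
```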

Figure (PageIndex{1}) shows the behavior of the surface at the critical point.

Exercise (PageIndex{1}):

Find the critical point of the function (f(x,y)=x^3+2xy−2x−4y.)

Hint

Calculate (f_x(x,y)) and (f_y(x,y)), then set them equal to zero.

Answer

The only critical point of (f) is ((2,−5)).
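To see why ((2,−5)) is the answer (a check added here, not part of the original text): the exact partials of (f(x,y)=x^3+2xy−2x−4y) are (f_x=3x^2+2y−2) and (f_y=2x−4), and both vanish at ((2,−5)).

```python
# Partial derivatives of f(x, y) = x^3 + 2xy - 2x - 4y
def fx(x, y):
    return 3*x**2 + 2*y - 2

def fy(x, y):
    return 2*x - 4

print(fx(2, -5), fy(2, -5))  # 0 0
```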

Determining Global and Local Extrema

The main purpose for determining critical points is to locate relative maxima and minima, as in single-variable calculus. When working with a function of one variable, the definition of a local extremum involves finding an interval around the critical point such that the function value is either greater than or less than all the other function values in that interval. When working with a function of two or more variables, we work with an open disk around the point.

Definition: Global and Local Extrema

Let (z=f(x,y)) be a function of two variables that is defined and continuous on an open set containing the point ((x_0,y_0).) Then (f) has a local maximum at ((x_0,y_0)) if

[f(x_0,y_0)≥f(x,y)]

for all points ((x,y)) within some disk centered at ((x_0,y_0)). The number (f(x_0,y_0)) is called a local maximum value. If the preceding inequality holds for every point ((x,y)) in the domain of (f), then (f) has a global maximum (also called an absolute maximum) at ((x_0,y_0).)

The function (f) has a local minimum at ((x_0,y_0)) if

[f(x_0,y_0)≤f(x,y)]

for all points ((x,y)) within some disk centered at ((x_0,y_0)). The number (f(x_0,y_0)) is called a local minimum value. If the preceding inequality holds for every point ((x,y)) in the domain of (f), then (f) has a global minimum (also called an absolute minimum) at ((x_0,y_0)).

If (f(x_0,y_0)) is either a local maximum or local minimum value, then it is called a local extremum (Figure (PageIndex{2})).

In Calculus 1, we showed that extrema of functions of one variable occur at critical points. The same is true for functions of more than one variable, as stated in the following theorem.

Fermat’s Theorem for Functions of Two Variables

Let (z=f(x,y)) be a function of two variables that is defined and continuous on an open set containing the point ((x_0,y_0)). Suppose (f_x) and (f_y) each exists at ((x_0,y_0)). If f has a local extremum at ((x_0,y_0)), then ((x_0,y_0)) is a critical point of (f).

Consider the function (f(x)=x^3.) This function has a critical point at (x=0), since (f'(0)=3(0)^2=0). However, (f) does not have an extreme value at (x=0). Therefore, the existence of a critical point at (x=x_0) does not guarantee a local extremum at (x=x_0). The same is true for a function of two or more variables. One way this can happen is at a saddle point. An example of a saddle point appears in the following figure.

Figure (PageIndex{3} label{saddlefigure}): Graph of the function (z=x^2−y^2). This graph has a saddle point at the origin.

In this graph, the origin is a saddle point. This is because the first partial derivatives of (f(x,y)=x^2−y^2) are both equal to zero at this point, but it is neither a maximum nor a minimum for the function. Furthermore, the vertical trace corresponding to (y=0) is (z=x^2) (a parabola opening upward), but the vertical trace corresponding to (x=0) is (z=−y^2) (a parabola opening downward). Therefore, the origin is a global minimum for one trace and a global maximum for another.

Definition: Saddle Point

Given the function (z=f(x,y),) the point (ig(x_0,y_0,f(x_0,y_0)ig)) is a saddle point if both (f_x(x_0,y_0)=0) and (f_y(x_0,y_0)=0), but (f) does not have a local extremum at ((x_0,y_0).)

Classifying Critical Points

In order to develop a general method for classifying the behavior of a function of two variables at its critical points, we need to begin by classifying the behavior of quadratic polynomial functions of two variables at their critical points.

To see why this will help us, consider that the quadratic approximation of a function of two variables (its 2nd-degree Taylor polynomial) shares the same first and second partials as the function it approximates at the chosen point of tangency (or center point). Since sharing the same second partials means the two surfaces will share the same concavity (or curvature) at the critical point, this causes these quadratic approximation surfaces to share the same behavior as the function (z = f(x, y)) that they approximate at the point of tangency. In other words, if the original function has a relative maximum at this point, so will the quadratic approximation. If the original function has a relative minimum at this point, so will the quadratic approximation, and if the original function has a saddle point at this point, so will the quadratic approximation.

Now there are really three basic behaviors of a quadratic polynomial in two variables at a point where it has a critical point. It will fit one of the following three forms, often being a transformation of one of the following functions.

  1. A sum of two squared terms, like (z = x^2 + y^2), producing a paraboloid that opens up and has a relative (absolute) minimum at its vertex. See the plot on the left side of Figure (PageIndex{4}).
  2. The negative of a sum of two squared terms, like (z = -left(x^2 + y^2 ight)), producing a paraboloid that opens down and has a relative (absolute) maximum at its vertex. See the plot on the right side of Figure (PageIndex{4}).
  3. The difference of two squared terms, like (z = f(x, y) = x^2 - y^2) or (z = f(x, y) = y^2 - x^2), producing a saddle with a saddle point at its critical point. See Figure (PageIndex{3}).

Figure (PageIndex{4}): (z = x^2 + y^2) has an absolute minimum of (0) at ( (0,0)), while (z = -(x^2 + y^2)) has an absolute maximum of (0) at ( (0,0)).

Example (PageIndex{1}): Classifying the critical points of a function

Use completing the square to identify local extrema or saddle points of the following quadratic polynomial functions:

  1. (f(x,y) = x^2 - 6x + y^2 + 10y + 20)
  2. (f(x,y) = 12 - 3x^2 - 6x - y^2 + 12y)
  3. (f(x,y) = x^2 + 8x - 2y^2 + 16y)
  4. (f(x,y) = x^2 + 6xy + y^2)

Solution

a. To determine the critical points of this function, we start by setting the partials of (f) equal to (0). [ egin{align*} ext{Set}quad f_x(x,y) &= 2x -6 = 0 & implies x &= 3 ext{and}quad f_y(x,y) &= 2y + 10 = 0 & implies y &= -5 end{align*} ]We obtain a single critical point with coordinates ( (3, -5) ). Next we need to determine the behavior of the function (f) at this point.

Completing the square, we get: [egin{align*} f(x,y) &= x^2 - 6x + y^2 + 10y + 20 &= x^2 - 6x + 9 + y^2 + 10y + 25 + 20 - 9 - 25 &= (x - 3)^2 + (y + 5)^2 - 14 end{align*}]Notice that this function is really just a translated version of (z = x^2 + y^2), so it is a paraboloid that opens up with its vertex (minimum point) at the critical point ( (3, -5) ). We can argue that it has an absolute minimum value of (-14) at the point ( (3, -5) ), since we are adding squared terms to (-14) and thus cannot get a value less than (-14) for any values of (x) and (y), while we do obtain this minimum value of (-14) at the vertex point ( (3, -5) ).

b. Setting the partials of (f) equal to (0), we obtain: [ egin{align*} ext{Set}quad f_x(x,y) &= -6x -6 = 0 & implies x &= -1 ext{and}quad f_y(x,y) &= -2y + 12 = 0 & implies y &= 6 end{align*} ]We obtain a single critical point with coordinates ( (-1, 6) ). Next we need to determine the behavior of the function (f) at this point.

To complete the square here, we first need to factor out the factors of the squared terms. Doing this and reordering the terms some gives us: [egin{align*} f(x,y) &= 12 - 3x^2 - 6x - y^2 + 12y &= - 3left(x^2 + 2xquadquad ight) - 1left(y^2 - 12y quadquad ight) + 12 &= -3left(x^2 + 2x + 1 ight) - 1left(y^2 - 12y +36 ight) + 12 +3+36 &= 51 - 3(x + 1)^2 - (y - 6)^2 end{align*}]Notice that this function is an elliptic paraboloid that opens down with its vertex (maximum point) at the critical point ( (-1, 6) ). We can argue that it has an absolute maximum value of (51) at the point ( (-1, 6) ), since we are subtracting squared terms from (51) and thus cannot get a value more than (51) for any values of (x) and (y), while we do obtain this maximum value of (51) at the vertex point ( (-1, 6) ).

c. Setting the partials of (f) equal to (0), we obtain: [ egin{align*} ext{Set}quad f_x(x,y) &= 2x + 8 = 0 & implies x &= -4 ext{and}quad f_y(x,y) &= -4y + 16 = 0 & implies y &= 4 end{align*} ]This gives us a critical point with coordinates ( (-4, 4) ). To determine if (f) has a local extremum or saddle point at this point, we complete the square.

Factoring out (-2) from the (y)-squared term gives us: [egin{align*} f(x,y) &= x^2 + 8x - 2y^2 + 16y &= x^2 + 8x +16 - 2left(y^2 - 8y + 16 ight) - 16 + 32 &= (x + 4)^2 - 2(y - 4)^2 +16end{align*}]Since one squared term is positive and one is negative, we see that this function has the form of (z = x^2 - y^2) and so it has a saddle point at its critical point. That is, (f) has a saddle point at ( (-4, 4, 16) ).

d. Setting the partials of (f) equal to (0), we get: [ egin{align*} ext{Set}quad f_x(x,y) &= 2x + 6y = 0 & ext{and}quad f_y(x,y) &= 6x + 2y = 0 & implies y &= -3x end{align*} ]Substituting (-3x) into the first equation for (y) gives us, [egin{align*}2x + 6(-3x) &= 0 -16x &= 0 x &= 0end{align*}]Since (y = -3x), we have ( y = -3(0) = 0), so the critical point of (f) is ( (0,0) ). To determine the behavior of (f) at this critical point, we complete the square.

[egin{align*} f(x,y) &= x^2 + 6xy + y^2 &= (x^2 + 6xy + 9y^2) + y^2 - 9y^2 &= (x + 3y)^2 - 8y^2 end{align*}]As this produces a difference of squares with one positive squared term and the other a negative squared term, we see that (f) takes a form similar to (z = x^2 - y^2) and will have a saddle point at ( (0, 0, 0) ).
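The classifications above can also be confirmed empirically (a check added here, not part of the original text) by sampling each function on a small grid around its critical point: a minimum should be the smallest nearby value, and a saddle should increase in one direction and decrease in another.

```python
import itertools

def f_a(x, y):  # part (a): claimed minimum of -14 at (3, -5)
    return x**2 - 6*x + y**2 + 10*y + 20

def f_c(x, y):  # part (c): claimed saddle at (-4, 4)
    return x**2 + 8*x - 2*y**2 + 16*y

# Sample offsets in [-1, 1] around each critical point.
steps = [i * 0.1 for i in range(-10, 11)]
grid_a = [f_a(3 + dx, -5 + dy) for dx, dy in itertools.product(steps, steps)]

# (3, -5) gives the smallest sampled value, namely -14.
assert min(grid_a) == f_a(3, -5) == -14

# Saddle behavior at (-4, 4): f_c rises along x and falls along y.
assert f_c(-4 + 0.1, 4) > f_c(-4, 4) > f_c(-4, 4 + 0.1)
print(f_a(3, -5), f_c(-4, 4))  # -14 16
```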

Now let's consider the quadratic approximation to a function (z = f(x, y)) centered at a critical point ( (x_0, y_0) ) of this function.

[Q(x, y) = f (x_0, y_0) + f_x(x_0, y_0) (x - x_0) + f_y(x_0, y_0) (y - y_0) + frac{f_{xx}(x_0, y_0)}{2}(x-x_0)^2 + f_{xy}(x_0, y_0)(x-x_0)(y-y_0) + frac{f_{yy}(x_0, y_0)}{2}(y-y_0)^2]

But, since the point ( (x_0, y_0) ), in this case, is a critical point of (f), we know that (f_x(x_0, y_0) = 0) and (f_y(x_0, y_0) = 0).

This allows us to simplify (Q(x, y)) to just:

[Q(x, y) = f (x_0, y_0) + frac{f_{xx}(x_0, y_0)}{2}(x-x_0)^2 + f_{xy}(x_0, y_0)(x-x_0)(y-y_0) + frac{f_{yy}(x_0, y_0)}{2}(y-y_0)^2]

Now we need to complete the square on this quadratic polynomial in two variables to learn how we can classify the behavior of this function at this critical point. Remember that the original function will share the same behavior (max, min, saddle point) as this 2nd-degree Taylor polynomial at this critical point.

To make this process easier, let's make some substitutions. Let's choose to let (u = x - x_0) and (v = y - y_0),

and let [egin{align*} a &= frac{f_{xx}(x_0, y_0)}{2}, b &= f_{xy}(x_0, y_0), c &= frac{f_{yy}(x_0, y_0)}{2} , ext{and} d &= f (x_0, y_0) end{align*}]

Then we need to complete the square on the polynomial: [ Q(x,y) = au^2 +buv + cv^2 + d]

Completing the square:

First we factor out the coefficient of (u^2): [= aleft[ u^2 + frac{b}{a}uv + frac{c}{a}v^2 ight] + d]

Next, we complete the square using the first two terms: [= aleft[ left(u^2 + frac{b}{a}uv + left(frac{b}{2a}v ight)^2 ight) + frac{c}{a}v^2 - left(frac{b}{2a}v ight)^2 ight] + d]

Rewriting the perfect square trinomial as the square of a binomial and combining the (v^2) terms yields:

[egin{align*} &= aleft[ left(u+ frac{b}{2a}v ight)^2 + left(frac{c}{a} - frac{b^2}{4a^2} ight)v^2 ight] + d
&= aleft[ left(u+ frac{b}{2a}v ight)^2 + left(frac{4ac}{4a^2} - frac{b^2}{4a^2} ight)v^2 ight] + d
&= aleft[ left(u+ frac{b}{2a}v ight)^2 + left(frac{4ac-b^2}{4a^2} ight)v^2 ight] + d end{align*}]

Note that the shape of this function's graph depends on the sign of the coefficient of (v^2). And the sign of this coefficient is determined only by its numerator, as the denominator is always positive (being a perfect square). This expression, (4ac-b^2), is called the discriminant, as it helps us discriminate (tell the difference between) which behavior the function has at this critical point.

If (D = 4ac-b^2gt 0), then the two squared terms inside the brackets are both positive, and

  • if (a = frac{f_{xx}(x_0, y_0)}{2} gt 0), the function (f) opens upwards with a local minimum at the critical point ( (x_0, y_0) ). Note it would be similar to the form, (z = x^2 + y^2).
  • if (a = frac{f_{xx}(x_0, y_0)}{2} lt 0), the function (f) opens downwards with a local maximum at the critical point ( (x_0, y_0) ). Note it would be similar to the form, (z = -left(x^2 + y^2 ight)).

If (D = 4ac-b^2 lt 0), then either

  • the two squared terms inside the brackets have opposite signs (meaning (f) is concave up along a line parallel to the (x)-axis and concave down along a line parallel to the (y)-axis, or vice-versa) or
  • the (b^2) term, representing the square of the mixed partial (f_{xy}(x_0, y_0)), is larger than the positive product of the two 2nd-partials (f_{xx}(x_0, y_0)) and (f_{yy}(x_0, y_0)). This means that even if the surface is concave up in both (x)- and (y)-directions, or concave down in both (x)- and (y)-directions, a large mixed partial can offset these and cause the surface to have a saddle point at the point ((x_0, y_0)).

In either case, the quadratic polynomial will be in the form of (z = x^2 - y^2) or (z = y^2 - x^2) (i.e., it will be the difference of two squared terms), so we get a saddle point at the critical point ( (x_0, y_0) ).

But if (D = 4ac-b^2 = 0), the quadratic polynomial reduces to (Q(x,y) = aleft(u+ frac{b}{2a}v ight)^2 + d), whose graph is a parabolic cylinder, so the behavior of the function is not clear at the critical point ( (x_0, y_0) ).

Now remembering the values of the constants (a), (b), and (c) from above, we see that: [egin{align*} D(x_0, y_0) &= 4frac{f_{xx}(x_0, y_0)}{2}frac{f_{yy}(x_0, y_0)}{2} - ig(f_{xy}(x_0, y_0)ig)^2 &= f_{xx}(x_0, y_0)f_{yy}(x_0, y_0) - ig(f_{xy}(x_0, y_0)ig)^2 end{align*}]

This formula is called the Second Partials Test, and it can be used to classify the behavior of any function at its critical points, as long as its second partials exist there and as long as the value of this discriminant is not zero.

The Second Partials Test

The second derivative test for a function of one variable provides a method for determining whether an extremum occurs at a critical point of a function. When extending this result to a function of two variables, an issue arises related to the fact that there are, in fact, four different second-order partial derivatives, although equality of mixed partials reduces this to three. The second partials test for a function of two variables, stated in the following theorem, uses a discriminant (D) that replaces (f''(x_0)) in the second derivative test for a function of one variable.

Second Partials Test

Let (z=f(x,y)) be a function of two variables for which the first- and second-order partial derivatives are continuous on some disk containing the point ((x_0,y_0)). Suppose (f_x(x_0,y_0)=0) and (f_y(x_0,y_0)=0.) Define the quantity

[D=f_{xx}(x_0,y_0)f_{yy}(x_0,y_0)−ig(f_{xy}(x_0,y_0)ig)^2.]

Then:

  1. If (D>0) and (f_{xx}(x_0,y_0)>0), then (f) is concave up at this critical point, so (f) has a local minimum at ((x_0,y_0)).
  2. If (D>0) and (f_{xx}(x_0,y_0)<0), then (f) is concave down at this critical point, so (f) has a local maximum at ((x_0,y_0)).
  3. If (D<0), then (f) has a saddle point at ((x_0,y_0)).
  4. If (D=0), then the test is inconclusive.

See Figure (PageIndex{4}).
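The four cases above translate directly into code. The sketch below (not part of the original text) takes the three second partials already evaluated at a critical point and returns the classification; the function name `second_partials_test` is my own.

```python
# Direct transcription of the four cases of the Second Partials Test,
# given fxx, fyy, fxy evaluated at a critical point.
def second_partials_test(fxx, fyy, fxy):
    D = fxx * fyy - fxy**2
    if D > 0 and fxx > 0:
        return "local minimum"
    if D > 0 and fxx < 0:
        return "local maximum"
    if D < 0:
        return "saddle point"
    return "inconclusive"

# z = x^2 + y^2 at (0,0): fxx = 2, fyy = 2, fxy = 0
print(second_partials_test(2, 2, 0))   # local minimum
# z = x^2 - y^2 at (0,0): fxx = 2, fyy = -2, fxy = 0
print(second_partials_test(2, -2, 0))  # saddle point
```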

To apply the second partials test, it is necessary that we first find the critical points of the function. There are several steps involved in the entire procedure, which are outlined in a problem-solving strategy.

Problem-Solving Strategy: Using the Second Partials Test for Functions of Two Variables

Let (z=f(x,y)) be a function of two variables for which the first- and second-order partial derivatives are continuous on some disk containing the point ((x_0,y_0).) To apply the second partials test to find local extrema, use the following steps:

  1. Determine the critical points ((x_0,y_0)) of the function (f) where (f_x(x_0,y_0)=f_y(x_0,y_0)=0.) If you find any critical points where at least one of the partial derivatives does not exist, you will need to find and justify extrema in another way, as you can't use the second partials test.
  2. Calculate the discriminant (D=f_{xx}(x_0,y_0)f_{yy}(x_0,y_0)−ig(f_{xy}(x_0,y_0)ig)^2) for each critical point of (f).
  3. Apply the four cases of the test to determine whether each critical point is a local maximum, local minimum, or saddle point, or whether the test is inconclusive. If the test is inconclusive, you will need to analyze and classify the behavior at the critical point another way.

Example (PageIndex{2}): Using the Second Partials Test

Find the critical points for each of the following functions, and use the second partials test to find any local extrema or saddle points.

  1. (f(x,y)=4x^2+9y^2+8x−36y+24)
  2. (g(x,y)=dfrac{1}{3}x^3+y^2+2xy−6x−3y+4)

Solution:

a. Step 1 of the problem-solving strategy requires us to find the critical points of (f). To do this, we first calculate (f_x(x,y)) and (f_y(x,y)) and then set each of them equal to zero:

[egin{align*} f_x(x,y)&=8x+8 f_y(x,y)&=18y−36. end{align*}]

Setting them equal to zero yields the system of equations

[egin{align*} 8x+8&=0 18y−36&=0. end{align*}]

The solution to this system is (x=−1) and (y=2). Therefore ((−1,2)) is the only critical point of (f).

Step 2 of the problem-solving strategy involves calculating (D.) To do this, we first calculate the second partial derivatives of (f:)

[egin{align*} f_{xx}(x,y)&=8 f_{xy}(x,y)&=0 f_{yy}(x,y)&=18. end{align*}]

Therefore, (D(-1,2)=f_{xx}(−1,2)f_{yy}(−1,2)−ig(f_{xy}(−1,2)ig)^2=(8)(18)−(0)^2=144>0.)

Step 3 tells us to apply the four cases of the test to classify the function's behavior at this critical point.

Since (D>0) and (f_{xx}(−1,2)=8>0), (f) is concave up, so (f) has a local minimum of (f(-1,2) = -16) at ((−1,2)), as shown in the following figure. (Note that this corresponds to case 1 of the second partials test.)

Figure (PageIndex{5}): The function (f(x,y)) has a local minimum at ((−1,2,−16).) Note the scale on the (y)-axis in this plot is in thousands.

b. For step 1, we first calculate (g_x(x,y)) and (g_y(x,y)), then set each of them equal to zero:

[egin{align*} g_x(x,y)&=x^2+2y−6 g_y(x,y)&=2y+2x−3. end{align*}]

Setting them equal to zero yields the system of equations

[egin{align*} x^2+2y−6&=0 2y+2x−3&=0. end{align*}]

To solve this system, first solve the second equation for (y). This gives (y=dfrac{3−2x}{2}). Substituting this into the first equation gives

[egin{align*} x^2+3−2x−6&=0 x^2−2x−3&=0 (x−3)(x+1)&=0. end{align*}]

Therefore, (x=−1) or (x=3). Substituting these values into the equation (y=dfrac{3−2x}{2}) yields the critical points (left(−1,frac{5}{2} ight)) and (left(3,−frac{3}{2} ight)).

Step 2 involves calculating the second partial derivatives of (g):

[egin{align*} g_{xx}(x,y)&=2x g_{xy}(x,y)&=2 g_{yy}(x,y)&=2. end{align*}]

Next, we substitute each critical point into the discriminant formula:

[egin{align*} Dleft(−1, frac{5}{2} ight)&=(2(−1))(2)−(2)^2=−4−4=−8 Dleft(3,− frac{3}{2} ight)&=(2(3))(2)−(2)^2=12−4=8. end{align*}]

In step 3, we use the second partials test to classify the behavior of the function at each of its critical points.

At point (left(−1,frac{5}{2} ight)), we see that (Dleft(−1, frac{5}{2} ight)=-8<0) (case 3 of the test), which means that (g) has a saddle point at the point (left(−1,frac{5}{2} ight)). The coordinates of this saddle point are (left(−1,frac{5}{2}, frac{41}{12} ight)).

Applying the theorem to point (left(3,−frac{3}{2} ight)) leads to case (1). That is, since (Dleft(3,- frac{3}{2} ight)=8>0) and (g_{xx}left(3,- frac{3}{2} ight)=2(3)=6>0), we know that (g) is concave up at this critical point, so (g) has a local minimum of (-frac{29}{4}) at the point (left(3,−frac{3}{2} ight)), as shown in the following figure.

Note: Sometimes it can be helpful to find a general formula for (D). For example, here we could have used the following formula:

[egin{align*} D(x_0, y_0) &=g_{xx}(x_0,y_0)g_{yy}(x_0,y_0)−ig(g_{xy}(x_0,y_0)ig)^2 &=(2x_0)(2)−2^2 &=4x_0−4.end{align*}]

Then we would have:

[egin{align*} Dleft(−1, frac{5}{2} ight)&=4(-1)-4=−4−4=−8 Dleft(3,− frac{3}{2} ight)&=4(3)-4=12−4=8. end{align*}]

Note that the final values of the discriminant at each critical point are the same.
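The numbers in Example 2b can be double-checked with exact rational arithmetic (a check added here, not part of the original text), using Python's `fractions` module so that values like (41/12) come out exactly.

```python
from fractions import Fraction as F

# g(x, y) = (1/3)x^3 + y^2 + 2xy - 6x - 3y + 4 from Example 2b
def g(x, y):
    return F(1, 3)*x**3 + y**2 + 2*x*y - 6*x - 3*y + 4

def D(x, y):
    # Discriminant: g_xx * g_yy - g_xy^2 = (2x)(2) - 2^2 = 4x - 4
    return 4*x - 4

p1 = (F(-1), F(5, 2))    # saddle point
p2 = (F(3), F(-3, 2))    # local minimum

print(D(*p1), g(*p1))    # -8 and 41/12
print(D(*p2), g(*p2))    # 8 and -29/4
```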

Exercise (PageIndex{2})

Use the second partials test to find the local extrema of the function

[ f(x,y)=x^3+2xy−6x−4y^2. onumber]

Hint

Follow the problem-solving strategy for applying the second partials test.

Answer

(left(frac{4}{3},frac{1}{3} ight)) is a saddle point, (left(−frac{3}{2},−frac{3}{8} ight)) is a local maximum.
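To see where that answer comes from (a check added here, not part of the original text): for (f(x,y)=x^3+2xy−6x−4y^2) we have (f_x=3x^2+2y−6), (f_y=2x−8y), (f_{xx}=6x), (f_{yy}=−8), and (f_{xy}=2), so (D(x,y)=−48x−4). The sketch below verifies both critical points and their discriminants exactly.

```python
from fractions import Fraction as F

# Partials and discriminant of f(x, y) = x^3 + 2xy - 6x - 4y^2
def fx(x, y): return 3*x**2 + 2*y - 6
def fy(x, y): return 2*x - 8*y
def D(x, y):  return -48*x - 4   # f_xx*f_yy - f_xy^2 = (6x)(-8) - 4

p_saddle = (F(4, 3), F(1, 3))
p_max    = (F(-3, 2), F(-3, 8))

for p in (p_saddle, p_max):
    assert fx(*p) == 0 and fy(*p) == 0   # both are critical points

print(D(*p_saddle), D(*p_max))  # -68 (saddle) and 68 (with f_xx < 0: max)
```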

  • A critical point of the function (f(x,y)) is any point ((x_0,y_0)) where either (f_x(x_0,y_0)=f_y(x_0,y_0)=0), or at least one of (f_x(x_0,y_0)) and (f_y(x_0,y_0)) does not exist.
  • A saddle point is a point ((x_0,y_0)) where (f_x(x_0,y_0)=f_y(x_0,y_0)=0), but (f) has neither a local maximum nor a local minimum at that point.
  • To find extrema of functions of two variables, first find the critical points, then calculate the discriminant and apply the second partials test.

Key Equations

  • Discriminant

(D=f_{xx}(x_0,y_0)f_{yy}(x_0,y_0)−(f_{xy}(x_0,y_0))^2)

Glossary

critical point of a function of two variables

the point ((x_0,y_0)) is called a critical point of (f(x,y)) if one of the two following conditions holds:

1. (f_x(x_0,y_0)=f_y(x_0,y_0)=0)

2. At least one of (f_x(x_0,y_0)) and (f_y(x_0,y_0)) does not exist

discriminant
the discriminant of the function (f(x,y)) is given by the formula (D=f_{xx}(x_0,y_0)f_{yy}(x_0,y_0)−(f_{xy}(x_0,y_0))^2)
saddle point
given the function (z=f(x,y),) the point ((x_0,y_0,f(x_0,y_0))) is a saddle point if both (f_x(x_0,y_0)=0) and (f_y(x_0,y_0)=0), but (f) does not have a local extremum at ((x_0,y_0))

Contributors

  • Gilbert Strang (MIT) and Edwin “Jed” Herman (Harvey Mudd) with many contributing authors. This content by OpenStax is licensed with a CC-BY-SA-NC 4.0 license. Download for free at http://cnx.org.

  • Paul Seeburger (Monroe Community College) edited and adapted this section extensively.
    Paul also wrote the entire subsection titled Classifying Critical Points.


In single-variable calculus we were concerned with functions that map the real numbers $R$ to $R$, sometimes called "real functions of one variable'', meaning the "input'' is a single real number and the "output'' is likewise a single real number. In the last chapter we considered functions taking a real number to a vector, which may also be viewed as functions $fcolon R o R^3$: for each input value we get a position in space. Now we turn to functions of several variables, meaning several input variables, functions $fcolon R^n o R$. We will deal primarily with $n=2$ and to a lesser extent $n=3$; in fact many of the techniques we discuss can be applied to larger values of $n$ as well.

A function $fcolonR^2 oR$ maps a pair of values $(x,y)$ to a single real number. The three-dimensional coordinate system we have already used is a convenient way to visualize such functions: above each point $(x,y)$ in the $x$-$y$ plane we graph the point $(x,y,z)$, where of course $z=f(x,y)$.

Example 14.1.1 Consider $f(x,y)=3x+4y-5$. Writing this as $z=3x+4y-5$ and then $3x+4y-z=5$ we recognize the equation of a plane. In the form $f(x,y)=3x+4y-5$ the emphasis has shifted: we now think of $x$ and $y$ as independent variables and $z$ as a variable dependent on them, but the geometry is unchanged.

Example 14.1.2 We have seen that $x^2+y^2+z^2=4$ represents a sphere of radius 2. We cannot write this in the form $f(x,y)$, since for each $x$ and $y$ in the disk $x^2+y^2<4$ there are two corresponding points on the sphere.

Example 14.1.3 Consider $f=sqrt x+sqrt y$. This function is defined only when both $x$ and $y$ are non-negative. When $y=0$ we get $f(x,y)=sqrt x$, the familiar square root function in the $x$-$z$ plane, and when $x=0$ we get the same curve in the $y$-$z$ plane. Generally speaking, we see that starting from $f(0,0)=0$ this function gets larger in every direction in roughly the same way that the square root function gets larger. For example, if we restrict attention to the line $x=y$, we get $f(x,y)=2sqrt x$, and along the line $y=2x$ we have $f(x,y)=sqrt x+sqrt{2x}=(1+sqrt2)sqrt x$.

A computer program that plots such surfaces can be very useful, as it is often difficult to get a good idea of what they look like. Still, it is valuable to be able to visualize relatively simple surfaces without such aids. As in the previous example, it is often a good idea to examine the function on restricted subsets of the plane, especially lines. It can also be useful to identify those points $(x,y)$ that share a common $z$-value.

Example 14.1.4 Consider $f(x,y)=x^2+y^2$. When $x=0$ this becomes $f=y^2$, a parabola in the $y$-$z$ plane; when $y=0$ we get the "same'' parabola $f=x^2$ in the $x$-$z$ plane. Now consider the line $y=kx$. If we simply replace $y$ by $kx$ we get $f(x,y)=(1+k^2)x^2$, which is a parabola, but it does not really "represent'' the cross-section along $y=kx$, because the cross-section has the line $y=kx$ where the horizontal axis should be. In order to pretend that this line is the horizontal axis, we need to write the function in terms of the distance from the origin, which is $sqrt{x^2+y^2}=sqrt{x^2+k^2x^2}$. Now $f(x,y)=x^2+k^2x^2=(sqrt{x^2+k^2x^2})^2$. So the cross-section is the "same'' parabola as in the $x$-$z$ and $y$-$z$ planes, namely, the height is always the distance from the origin squared. This means that $f(x,y)=x^2+y^2$ can be formed by starting with $z=x^2$ and rotating this curve around the $z$ axis.

Finally, picking a value $z=k$, at what points does $f(x,y)=k$? This means $x^2+y^2=k$, which we recognize as the equation of a circle of radius $sqrt k$. So the graph of $f(x,y)$ has parabolic cross-sections, and the same height everywhere on concentric circles with center at the origin. This fits with what we have already discovered.
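The level-curve claim is easy to test numerically (an illustration added here, not part of the original text): every point on the circle of radius $sqrt k$ should give the same value $f(x,y)=k$.

```python
import math

# Level curves of f(x, y) = x^2 + y^2: points on the circle
# x^2 + y^2 = k all share the function value k.
def f(x, y):
    return x**2 + y**2

k = 3.0
r = math.sqrt(k)
angles = [2 * math.pi * i / 12 for i in range(12)]
values = [f(r * math.cos(t), r * math.sin(t)) for t in angles]
print(values)  # every entry is (numerically) 3.0
```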

As in this example, the points $(x,y)$ such that $f(x,y)=k$ usually form a curve, called a level curve of the function. A graph of some level curves can give a good idea of the shape of the surface; it looks much like a topographic map of the surface. In figure 14.1.2 both the surface and its associated level curves are shown. Note that, as with a topographic map, the heights corresponding to the level curves are evenly spaced, so that where curves are closer together the surface is steeper.

Functions $fcolon R^n o R$ behave much like functions of two variables; we will on occasion discuss functions of three variables. The principal difficulty with such functions is visualizing them, as they do not "fit'' in the three dimensions we are familiar with. For three variables there are various ways to interpret functions that make them easier to understand. For example, $f(x,y,z)$ could represent the temperature at the point $(x,y,z)$, or the pressure, or the strength of a magnetic field. It remains useful to consider those points at which $f(x,y,z)=k$, where $k$ is some constant value. If $f(x,y,z)$ is temperature, the set of points $(x,y,z)$ such that $f(x,y,z)=k$ is the collection of points in space with temperature $k$; in general this is called a level set. For three variables, a level set is typically a surface, called a level surface.

Example 14.1.5 Suppose the temperature at $(x,y,z)$ is $T(x,y,z)=e^{-(x^2+y^2+z^2)}$. This function has a maximum value of 1 at the origin, and tends to 0 in all directions. If $k$ is positive and at most 1, the set of points for which $T(x,y,z)=k$ is those points satisfying $x^2+y^2+z^2=-ln k$, a sphere centered at the origin. The level surfaces are the concentric spheres centered at the origin.


Problem 1

You decide to build a box that has the shape of a rectangular prism with a volume of 1000 cubic centimeters. Find the dimensions x, y and z of the box so that the total surface area of all 6 faces of the box is minimum.
Solution to Problem 1:
The total area A of all six faces of the prism is given by
A = 2xy + 2yz + 2zx

The volume of the box is given; hence
xyz = 1000
Solve the above for z.
z = 1000 / (xy)
Substitute z in the expression of the area A to obtain
A(x,y) = 2xy + 2y(1000/(xy)) + 2x(1000/(xy)) = 2xy + 2000/x + 2000/y
We now need to find x and y that minimize the area A. We first need to find the critical points and then test the second partial derivatives. The first order partial derivatives of A are given by
Ax(x,y) = 2y - 2000/x²
Ay(x,y) = 2x - 2000/y²
The critical points are found by setting Ax(x,y) = 0 and Ay(x,y) = 0 and solving the resulting system, which gives
2y - 2000/x² = 0 which gives 2yx² = 2000
2x - 2000/y² = 0 which gives 2xy² = 2000
Dividing the two equations gives x = y; substituting into the first gives 2x³ = 2000, hence
x = 10 and y = 10
We now need to find the second order partial derivatives.
Axx(x,y) = 4000/x³
Ayy(x,y) = 4000/y³
Axy(x,y) = 2
We now need to test the values of Axx, Ayy and Axy at the point (10,10) in order to use the theorem on minima and maxima of functions with 2 variables.
D = Axx(10,10) Ayy(10,10) - Axy²(10,10) = 4 × 4 - 2² = 12
Since D is positive and Axx(10,10) = 4 is positive, the area A has a local minimum at (10,10). The box of minimum surface area therefore has dimensions x = 10 cm, y = 10 cm and z = 1000/(10 × 10) = 10 cm: a cube.
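As a quick numerical sanity check (a Python sketch, not part of the original solution), we can confirm that the critical point (10, 10) beats nearby points:

```python
def area(x, y):
    """Total surface area after substituting z = 1000/(x*y):
    A(x, y) = 2xy + 2000/x + 2000/y."""
    return 2*x*y + 2000/x + 2000/y

print(area(10, 10))  # 600.0 square centimeters at the critical point

# Nearby points all give a larger area, consistent with a local minimum.
for dx, dy in [(0.5, 0.0), (-0.5, 0.0), (0.0, 0.5), (0.3, -0.4)]:
    assert area(10 + dx, 10 + dy) > area(10, 10)
```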

Problem 2

You decide to build a box that has the shape of a rectangular prism, using 12 square meters of cardboard with all of the cardboard used. Find the dimensions x, y and z of the box so that its volume is maximum.
Solution to Problem 2:
Using all available cardboard to make the box, the total area A of all six faces of the prism is given by
A = 2xy + 2yz + 2zx = 12
The volume V of the box is given by
V = xyz
Solve the equation 2xy + 2yz + 2zx = 12 for z
z = (6 - xy) / (x + y)
Substitute z in the expression of the volume V to obtain
V(x,y) = xy(6 - xy)/(x + y)
Let us find the critical points by first finding the first order partial derivatives
Vx(x,y) = -y²(x² + 2xy - 6)/(x + y)²
Vy(x,y) = -x²(y² + 2xy - 6)/(x + y)²
We now solve the system of equations given by Vx = 0 and Vy = 0. One obvious solution is given by the point (0,0), but it is not physically possible. Other solutions are found by setting
x² + 2xy - 6 = 0
and
y² + 2xy - 6 = 0
Subtracting the equations term by term, we obtain
x² - y² = 0
Solve to obtain
x = y and x = - y
The solution x = -y is not valid for this problem since both x and y are dimensions and cannot be negative. Using x = y in the equation x² + 2xy - 6 = 0, we obtain
x² + 2x² - 6 = 0
Solve for x
x = √2
Find y
y = x = √2
Let us now find the second order partial derivatives
Vxx(x,y) = -2y²(y² + 6)/(x + y)³
Vyy(x,y) = -2x²(x² + 6)/(x + y)³
Vxy(x,y) = -2xy(x² + 3xy + y² - 6)/(x + y)³
We now need the values of Vxx, Vyy and Vxy to find the value of D = Vxx(√2,√2) Vyy(√2,√2) - Vxy²(√2,√2) in order to use the theorem on minima and maxima of functions with 2 variables.

D = Vxx(√2,√2) Vyy(√2,√2) - Vxy²(√2,√2) = (-√2)(-√2) - (-1/√2)² = 2 - 1/2 = 3/2
D is positive and Vxx(√2,√2) = -√2 is negative, and therefore the volume V is maximum for
x = √2 meters
y = √2 meters
z = (6 - xy) / (x + y) = √2 meters.
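The maximum can again be checked numerically with a short Python sketch (assumptions as in the solution above: z is eliminated via the area constraint):

```python
import math

def volume(x, y):
    """Box volume after substituting z = (6 - xy)/(x + y)."""
    return x * y * (6 - x * y) / (x + y)

s = math.sqrt(2)
print(round(volume(s, s), 6))  # 2*sqrt(2) ≈ 2.828427 cubic meters

# The surface-area constraint 2(xy + yz + zx) = 12 holds at the optimum.
z = (6 - s * s) / (s + s)
assert abs(2 * (s * s + s * z + z * s) - 12) < 1e-9

# Nearby points give a smaller volume, consistent with a maximum.
for dx, dy in [(0.1, 0.0), (0.0, -0.1), (0.05, 0.05)]:
    assert volume(s + dx, s + dy) < volume(s, s)
```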

Problem 3

Find the distance from the point (1 , 2 , -1) to the plane x - y + z = 3.
Solution to Problem 3:
One way to find the distance from a point to a plane is to take a point (x,y,z) on the plane, find the distance between this point and the given point, and minimize it. Because the distance involves a square root, it is easier to minimize the square of the distance. Let the square of the distance between the given point and the point (x,y,z) on the plane be f.
f(x,y,z) = (x - 1)² + (y - 2)² + (z + 1)²
We now solve the equation of the plane for z to obtain
z = 3 - x + y
Substitute z by 3 - x + y in f to obtain a function of two variables:
F(x,y) = (x - 1)² + (y - 2)² + (-x + y + 4)²
We now find the first order partial derivatives
Fx(x,y) = 2(x - 1) + 2(-1)(-x + y + 4)
Fy(x,y) = 2(y - 2) + 2(-x + y + 4)
We now need to find the critical points by setting the first partial derivatives equal to zero.
2(x - 1) + 2(-1)(-x + y + 4) = 0 and 2(y - 2) + 2(-x + y + 4) = 0
We now solve the system of equations to obtain
x = 8/3 and y = 1/3
We now calculate the second order derivatives
Fxx(x,y) = 4
Fyy(x,y) = 4
Fxy(x,y) = -2
We now need to find the sign of D = Fxx(8/3,1/3) Fyy(8/3,1/3) - Fxy²(8/3,1/3) in order to use the theorem on minima and maxima of functions with 2 variables
D = 4 × 4 - (-2)² = 12
Since D is positive and Fxx is positive, F has a minimum at the point (8/3 , 1/3), which corresponds to the point on the plane given by
(8/3 , 1/3 , 2/3)
The distance d between the given point and the plane is given by
d = √[ (1 - 8/3)² + (2 - 1/3)² + (-1 - 2/3)² ]
= 5/√3
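The result agrees with the standard point-to-plane distance formula |ax0 + by0 + cz0 - d| / √(a² + b² + c²); a quick Python check:

```python
import math

# Plane x - y + z = 3 in the form a*x + b*y + c*z = d, and the given point.
a, b, c, d = 1, -1, 1, 3
x0, y0, z0 = 1, 2, -1

# Standard point-to-plane distance formula.
dist = abs(a * x0 + b * y0 + c * z0 - d) / math.sqrt(a * a + b * b + c * c)
print(round(dist, 6))  # 5/sqrt(3) ≈ 2.886751

assert abs(dist - 5 / math.sqrt(3)) < 1e-12
```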



Functions of Optimization (6 Functions With Diagram)

Managerial economics is concerned with decision making by managers of a firm.

Managers of a firm have to take decisions regarding the level of output of a product to be produced, the price for a product to be charged, the size of the sales force to be engaged, the technique to be used for the production, the level of advertising expenditure to be incurred and many other such things.

In business decision making, a large number of options are open to a manager, from which he has to make a choice.

Obviously, a manager will try to make a best choice from among different options available to him. A best or optimum choice is one that best achieves the desired goal or objective of the firm. For example, a manager might consider what level of output of a product he should produce.

The manager will produce the level of output which maximises the firm's profit if he has set before himself the objective of profit-maximisation. This is a maximisation problem which he has to solve. Similarly, he may be considering choosing among the various combinations of factors or inputs that can be used for producing a level of output.

To maximise profits he will choose the combination of inputs that minimises cost for producing a given level of output. Evidently, this is a minimisation problem which he has to solve.

Decision making that involves solving of maximisation and minimisation problems is called optimisation. Therefore, for making efficient decisions it is necessary for a successful manager to learn the techniques of optimisation. It may however be noted that popular techniques of optimisation are mathematical in nature. In recent years the use of analytical models of business decision making has increased the importance of the knowledge of techniques of optimisation for the students of business management.

Mathematical formulations of these analytical models of managerial decision-making are expressed in terms of functions which describe economic relationship between various variables.

Therefore, to begin with we will explain the concept of a function and its various important types.

2. Functions:

A function describes the relation between two or more variables. That is, a function expresses the dependence of one variable on one or more other variables. Thus, if the value of a variable Y depends on another variable X, we may write

Y = ƒ(X) …(1)

Where ƒ stands for function.

This expression (1) is read as 'Y is a function of X'. This implies that every value of the variable X determines a unique value of the variable Y. In function (1), Y is known as the dependent variable and X is the independent variable; the value of Y depends on the value of X.

Further, the independent variable is interpreted as the cause and the dependent variable as the effect. An important function which is extensively used in economics is the demand function, which expresses the quantity demanded of a commodity as a function of its price, other factors being held constant. Thus, demand for a commodity X is described as under:

Dx = ƒ(Px)

Where Dx is the quantity demanded of commodity X and Px is its price.

Similarly, the supply function of a commodity X is expressed as

Sx = ƒ(Px)

When the value of the variable Y depends on more than two variables X1, X2 ……………………………… Xn, this function is written in general form as:

Y = ƒ(X1, X2 ……………………………… Xn)

where n is the number of independent variables. Again note that in economics we write 'causes' as the independent variables and 'effect' as the dependent variable.

For example, demand for a product is generally considered to be a function of its own price, prices of other commodities (which may be substitutes or complements), income of the consumers, tastes and preferences of the consumers, and advertising expenditure made by a firm to promote its product. Thus,

Dx = ƒ(Px, Py, M, T, A)

Where:
Dx = demand for the commodity X

Px = price of the commodity X.

Py = price of a substitute product Y.

M = income of the consumers

T = tastes and preferences of the consumer for the product.

A = advertising expenditure incurred by the firm.

The exact nature of relation of dependent variable with the independent variables can be known from the specific form of the function. The specific form of a function can take a variety of mathematical forms.

We explain below some specific types of functions:

1. Linear and Power Functions:

A widely used mathematical form of a function is a linear function.

A linear function can be stated in the following general form:

Y = a + bX

Where a and b are positive constants and are called parameters of the function. Note that the parameters of a function are fixed and given in a specific function. The values of the constants a and b determine the specific nature of a linear function. The linear demand function with price as the only independent variable is written as

Qd = a – bP

The minus sign before the coefficient b indicates that the quantity demanded of a commodity is negatively related to the price of the commodity. That is, if the price of a commodity falls, its quantity demanded increases, and vice versa. If a equals 7 and b equals 0.5, the linear demand function can be expressed in the following specific form:

Qd = 7 – 0.5P

The above specific demand function shows that a unit fall in the price of the commodity will cause a 0.5 unit increase in the quantity demanded of the commodity. If price (P) is zero, the second term (0.5P) in the demand function drops out and the quantity demanded is equal to 7.

We can take various values of P and find out different quantities (Qd) of a commodity demanded at them. In Figure 3.1 we have plotted these price-quantity combinations on a graph and have obtained demand curve DD of the commodity representing the given demand function (Qd = 7 – 0.5P).
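The tabulation of price-quantity combinations described above is easy to reproduce; a small Python sketch (illustrative, not from the original article):

```python
def quantity_demanded(p):
    """Specific linear demand function Qd = 7 - 0.5P."""
    return 7 - 0.5 * p

# Price-quantity combinations along the demand curve DD.
for p in [0, 2, 4, 6]:
    print(p, quantity_demanded(p))

# A unit fall in price raises quantity demanded by 0.5 units.
assert quantity_demanded(3) - quantity_demanded(4) == 0.5
```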

It should be noted that, contrary to mathematical practice, the convention in economics is to represent the independent variable (price, in the above case of the demand function) on the y-axis and the dependent variable (the quantity demanded, in the present case) on the x-axis. The graph of the linear demand function is shown in Figure 3.1. It is worth noting that the slope of the demand curve in Figure 3.1 represents ∆P/∆Q. However, if we represent quantity demanded (Qd) on the y-axis and price (Px) on the x-axis, the slope of the demand curve so drawn would be equal to ∆Q/∆P.

2. Multivariate Linear Demand Function:

Linear demand function with more than one independent variable can be written in the following way:

Qx = a + b1Px + b2Py + b3M + b4A

Where b1, b2, b3 and b4 are the coefficients of the respective variables. In economics the effects of variables other than the own price of a commodity in the demand function are depicted by shifts in the demand curve. For instance, when income (M) of the consumers increases, consumers will demand more of the product X at a given price. This implies a shift of the demand curve to the right.

The linear multivariate function is written in the following form:

Y = 0.4X1 + 0.2X2 + 0.3X3 + 0.5X4

In this function the coefficients 0.4, 0.2, 0.3 and 0.5 show the precise impact of the independent variables X1, X2, X3 and X4 on the dependent variable Y.

3. Power Functions:

The linear functions stated above are known as first degree functions where the independent variables X1, X2, X3, etc are raised to the first power only. We now turn to explain power functions. In economics power functions of the quadratic and cubic forms are extensively used.

4. Quadratic Functions:

In a quadratic function one or more of the independent variables are squared, that is, raised to the second power. Note that a power is also referred to as an exponent. A quadratic function may be written as

Y = a + bX + cX²

This implies that the value of the dependent variable Y depends on the constant a, plus the coefficient b times the value of the independent variable X, plus the coefficient c times the square of the variable X. Suppose a = 4, b = 3 and c = 2; then the quadratic function takes the following specific form:

Y = 4 + 3X + 2X²

We can obtain the different values of Y by taking different values of the independent variable X.

Quadratic functions are of two types:

Convex quadratic functions and concave quadratic functions. The form of a quadratic function depends on the sign of the coefficient c of X². The quadratic function Y = a + bX + cX², where the coefficient c of X² is positive (i.e. c > 0), is called a convex quadratic function, because its graph is U-shaped as shown in Figure 3.2. On the other hand, if the coefficient of X² is negative (c < 0), that is, when Y = a + bX - cX², then we have a concave quadratic function, because its graph is of inverted U-shape (i.e. ∩-shaped) as shown in Figure 3.3.

It is worth noting that, as is evident from its U-shaped graph, the slope of a convex quadratic function (where the coefficient of X² is positive) is increasing everywhere. On the other hand, in the case of a concave quadratic function, where the coefficient of X² is negative (c < 0), the slope of the graph is decreasing everywhere.

It should be further noted that in analytical geometry it is proved that the graph of any quadratic function is a parabola, which may be either convex or concave. A parabola is a curve which has a turning point and, unlike the curve of a linear function, its slope changes at different values of X.

5. Multivariable Quadratic Function:

When there is more than one independent variable, such as X1 and X2, and they have a quadratic relationship with the dependent variable Y, such a function is called a multivariable quadratic function.

In case of two independent variables X1 and X2 such a function may be expressed as under:

If such a function is graphically shown, it will be represented by a three – dimensional surface and not a two dimensional curve.

6. Cubic Function:

A cubic function is a power function in which there is a third degree term relating to an independent variable. Thus, cubic functions may have first degree, second degree and third degree terms.

A cubic function may have the following form:

Y = a + bX + cX² + dX³

a is the intercept term; the independent variable X has first degree, second degree and third degree terms. When the signs of all the coefficients a, b, c and d are positive, the value of Y will increase by progressively larger increments as the value of X increases.

However, when the signs of various coefficients differ in the cubic function, that is, some have positive signs and some have negative signs, then the graph of the function may have both convex and concave segments depending on the values of the coefficients.

Such a cubic function, where the signs of the coefficients of the variables differ, may be expressed as follows:

Y = a + bX – cX² + dX³

in which the sign of the coefficient c of the variable X² is negative whereas the coefficients of the others are positive.




The objective function would have to be written in a separate m-file which takes a vector x as its input and returns a scalar output. Multiple variables are passed as x(1), x(2), x(3), etc.

x0 is a vector of initial guesses for the variables. Here they are all initialized to 1. It is of length 12 here as there seem to be 12 variables (including 'M').
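The same pattern in Python, for comparison (a hypothetical objective; the variable roles and the count of 12 are illustrative, not taken from the original question):

```python
def objective(x):
    """Objective of several variables packed into one vector x;
    returns a single scalar, as an optimizer expects.
    The quadratic form here is purely illustrative."""
    return (x[0] - 1) ** 2 + (x[1] - 2) ** 2 + sum(v ** 2 for v in x[2:])

# Vector of initial guesses, one entry per variable, all initialized to 1.
x0 = [1.0] * 12
print(objective(x0))  # value of the objective at the initial guess
```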




A multi-objective optimization problem is an optimization problem that involves multiple objective functions. [1] [2] [3] In mathematical terms, a multi-objective optimization problem can be formulated as

min (f_1(x), f_2(x), …, f_k(x)) subject to x ∈ X,

where k ≥ 2 is the number of objective functions and X is the set of feasible decision vectors. The nadir objective vector z^nad has components z_i^nad equal to the maximum of f_i over the Pareto optimal set, and the ideal objective vector z^* has components z_i^* = min f_i(x) over x ∈ X.

In other words, the components of a nadir and an ideal objective vector define upper and lower bounds for the objective function values of Pareto optimal solutions, respectively. In practice, the nadir objective vector can only be approximated as, typically, the whole Pareto optimal set is unknown. In addition, a utopian objective vector z^utopian, with components z_i^utopian = z_i^* − ε for some small ε > 0, is often defined as a vector strictly better than the ideal objective vector.

Economics

In economics, many problems involve multiple objectives along with constraints on what combinations of those objectives are attainable. For example, consumer's demand for various goods is determined by the process of maximization of the utilities derived from those goods, subject to a constraint based on how much income is available to spend on those goods and on the prices of those goods. This constraint allows more of one good to be purchased only at the sacrifice of consuming less of another good; therefore, the various objectives (more consumption of each good is preferred) are in conflict with each other. A common method for analyzing such a problem is to use a graph of indifference curves, representing preferences, and a budget constraint, representing the trade-offs that the consumer is faced with.

Another example involves the production possibilities frontier, which specifies what combinations of various types of goods can be produced by a society with certain amounts of various resources. The frontier specifies the trade-offs that the society is faced with — if the society is fully utilizing its resources, more of one good can be produced only at the expense of producing less of another good. A society must then use some process to choose among the possibilities on the frontier.

Macroeconomic policy-making is a context requiring multi-objective optimization. Typically a central bank must choose a stance for monetary policy that balances competing objectives — low inflation, low unemployment, low balance of trade deficit, etc. To do this, the central bank uses a model of the economy that quantitatively describes the various causal linkages in the economy; it simulates the model repeatedly under various possible stances of monetary policy, in order to obtain a menu of possible predicted outcomes for the various variables of interest. Then in principle it can use an aggregate objective function to rate the alternative sets of predicted outcomes, although in practice central banks use a non-quantitative, judgement-based, process for ranking the alternatives and making the policy choice.

Finance

In finance, a common problem is to choose a portfolio when there are two conflicting objectives — the desire to have the expected value of portfolio returns be as high as possible, and the desire to have risk, often measured by the standard deviation of portfolio returns, be as low as possible. This problem is often represented by a graph in which the efficient frontier shows the best combinations of risk and expected return that are available, and in which indifference curves show the investor's preferences for various risk-expected return combinations. The problem of optimizing a function of the expected value (first moment) and the standard deviation (square root of the second central moment) of portfolio return is called a two-moment decision model.

Optimal control

In engineering and economics, many problems involve multiple objectives which are not describable as the-more-the-better or the-less-the-better; instead, there is an ideal target value for each objective, and the desire is to get as close as possible to the desired value of each objective. For example, energy systems typically have a trade-off between performance and cost; [4] [5] or one might want to adjust a rocket's fuel usage and orientation so that it arrives both at a specified place and at a specified time; or one might want to conduct open market operations so that both the inflation rate and the unemployment rate are as close as possible to their desired values.

Often such problems are subject to linear equality constraints that prevent all objectives from being simultaneously perfectly met, especially when the number of controllable variables is less than the number of objectives and when the presence of random shocks generates uncertainty. Commonly a multi-objective quadratic objective function is used, with the cost associated with an objective rising quadratically with the distance of the objective from its ideal value. Since these problems typically involve adjusting the controlled variables at various points in time and/or evaluating the objectives at various points in time, intertemporal optimization techniques are employed. [6]

Optimal design

Product and process design can be largely improved using modern modeling, simulation and optimization techniques. The key question in optimal design is the measure of what is good or desirable about a design. Before looking for optimal designs it is important to identify characteristics which contribute the most to the overall value of the design. A good design typically involves multiple criteria/objectives such as capital cost/investment, operating cost, profit, quality and/or recovery of the product, efficiency, process safety, operation time etc. Therefore, in practical applications, the performance of process and product design is often measured with respect to multiple objectives. These objectives typically are conflicting, i.e. achieving the optimal value for one objective requires some compromise on one or more of other objectives.

For example, when designing a paper mill, one can seek to decrease the amount of capital invested in a paper mill and enhance the quality of paper simultaneously. If the design of a paper mill is defined by large storage volumes and paper quality is defined by quality parameters, then the problem of optimal design of a paper mill can include objectives such as: i) minimization of expected variation of those quality parameters from their nominal values, ii) minimization of expected time of breaks and iii) minimization of investment cost of storage volumes. Here, the maximum volumes of the towers are the design variables. This example of optimal design of a paper mill is a simplification of the model used in [7]. Multi-objective design optimization has also been implemented in engineering systems in circumstances such as control cabinet layout optimization, [8] airfoil shape optimization using scientific workflows, [9] design of nano-CMOS semiconductors, [10] system on chip design, design of solar-powered irrigation systems, [11] optimization of sand mould systems, [12] [13] engine design, [14] [15] optimal sensor deployment [16] and optimal controller design. [17] [18]

Process optimization

Multi-objective optimization has been increasingly employed in chemical engineering and manufacturing. In 2009, Fiandaca and Fraga used the multi-objective genetic algorithm (MOGA) to optimize the pressure swing adsorption process (cyclic separation process). The design problem involved the dual maximization of nitrogen recovery and nitrogen purity. The results provided a good approximation of the Pareto frontier with acceptable trade-offs between the objectives. [19]

In 2010, Sendín et al. solved a multi-objective problem for the thermal processing of food. They tackled two case studies (bi-objective and triple objective problems) with nonlinear dynamic models and used a hybrid approach consisting of the weighted Tchebycheff and the Normal Boundary Intersection approach. The novel hybrid approach was able to construct a Pareto optimal set for the thermal processing of foods. [20]

In 2013, Ganesan et al. carried out the multi-objective optimization of the combined carbon dioxide reforming and partial-oxidation of methane. The objective functions were methane conversion, carbon monoxide selectivity and hydrogen to carbon monoxide ratio. Ganesan used the Normal Boundary Intersection (NBI) method in conjunction with two swarm-based techniques (Gravitational Search Algorithm (GSA) and Particle Swarm Optimization (PSO)) to tackle the problem. [21] Applications involving chemical extraction [22] and bioethanol production processes [23] have posed similar multi-objective problems.

In 2013, Abakarov et al proposed an alternative technique to solve multi-objective optimization problems arising in food engineering. [24] The Aggregating Functions Approach, the Adaptive Random Search Algorithm, and the Penalty Functions Approach were used to compute the initial set of the non-dominated or Pareto-optimal solutions. The Analytic Hierarchy Process and Tabular Method were used simultaneously for choosing the best alternative among the computed subset of non-dominated solutions for osmotic dehydration processes. [25]

In 2018, Pearce et al. formulated task allocation to human and robotic workers as a multi-objective optimization problem, considering production time and the ergonomic impact on the human worker as the two objectives considered in the formulation. Their approach used a Mixed-Integer Linear Program to solve the optimization problem for a weighted sum of the two objectives to calculate a set of Pareto optimal solutions. The application of the approach to several manufacturing tasks showed improvements in at least one objective in most tasks and in both objectives in some of the processes. [26]

Radio resource management

The purpose of radio resource management is to satisfy the data rates that are requested by the users of a cellular network. [27] The main resources are time intervals, frequency blocks, and transmit powers. Each user has its own objective function that, for example, can represent some combination of the data rate, latency, and energy efficiency. These objectives are conflicting since the frequency resources are very scarce; thus, there is a need for tight spatial frequency reuse, which causes immense inter-user interference if not properly controlled. Multi-user MIMO techniques are nowadays used to reduce the interference by adaptive precoding. The network operator would like to provide both good coverage and high data rates; thus, the operator would like to find a Pareto optimal solution that balances the total network data throughput and the user fairness in an appropriate subjective manner.

Radio resource management is often solved by scalarization; that is, selection of a network utility function that tries to balance throughput and user fairness. The choice of utility function has a large impact on the computational complexity of the resulting single-objective optimization problem. [27] For example, the common utility of weighted sum rate gives an NP-hard problem with a complexity that scales exponentially with the number of users, while the weighted max-min fairness utility results in a quasi-convex optimization problem with only a polynomial scaling with the number of users. [28]

Electric power systems

Reconfiguration, by exchanging the functional links between the elements of the system, represents one of the most important measures which can improve the operational performance of a distribution system. The problem of optimization through the reconfiguration of a power distribution system, in terms of its definition, is a historical single objective problem with constraints. Since 1975, when Merlin and Back [29] introduced the idea of distribution system reconfiguration for active power loss reduction, until nowadays, a lot of researchers have proposed diverse methods and algorithms to solve the reconfiguration problem as a single objective problem. Some authors have proposed Pareto optimality based approaches (including active power losses and reliability indices as objectives). For this purpose, different artificial intelligence based methods have been used: microgenetic, [30] branch exchange, [31] particle swarm optimization [32] and non-dominated sorting genetic algorithm. [33]

Inspection of Infrastructure

Autonomous inspection of infrastructure has the potential to reduce costs, risks and environmental impacts, as well as ensuring better periodic maintenance of inspected assets. Typically, planning such missions has been viewed as a single-objective optimization problem, where one aims to minimize the energy or time spent in inspecting an entire target structure. [34] For complex, real-world structures, however, covering 100% of an inspection target is not feasible, and generating an inspection plan may be better viewed as a multiobjective optimization problem, where one aims to both maximize inspection coverage and minimize time and costs. A recent study has indicated that multiobjective inspection planning indeed has the potential to outperform traditional methods on complex structures. [35]

As there usually exist multiple Pareto optimal solutions for multi-objective optimization problems, what it means to solve such a problem is not as straightforward as it is for a conventional single-objective optimization problem. Therefore, different researchers have defined the term "solving a multi-objective optimization problem" in various ways. This section summarizes some of them and the contexts in which they are used. Many methods convert the original problem with multiple objectives into a single-objective optimization problem. This is called a scalarized problem. If Pareto optimality of the single-objective solutions obtained can be guaranteed, the scalarization is characterized as done neatly.
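As an illustration of scalarization (a minimal Python sketch with made-up objectives, not from the article), the weighted-sum method turns two conflicting objectives into a single one, and varying the weights picks out different Pareto optimal solutions:

```python
def f1(x):
    return x ** 2            # first objective: prefers x near 0

def f2(x):
    return (x - 2) ** 2      # second objective: prefers x near 2

def scalarized(x, w1, w2):
    """Weighted-sum scalarization: one objective built from two."""
    return w1 * f1(x) + w2 * f2(x)

# Coarse grid search over the feasible interval [0, 2].
grid = [i / 1000 for i in range(2001)]
for w1, w2 in [(1.0, 0.0), (0.5, 0.5), (0.0, 1.0)]:
    best = min(grid, key=lambda x: scalarized(x, w1, w2))
    print(w1, w2, best)  # each weighting selects a different trade-off
```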

Solving a multi-objective optimization problem is sometimes understood as approximating or computing all or a representative set of Pareto optimal solutions. [36] [37]

When decision making is emphasized, the objective of solving a multi-objective optimization problem is referred to as supporting a decision maker in finding the most preferred Pareto optimal solution according to his/her subjective preferences. [1] [38] The underlying assumption is that one solution to the problem must be identified to be implemented in practice. Here, a human decision maker (DM) plays an important role. The DM is expected to be an expert in the problem domain.

The most preferred solution can be found using different philosophies. Multi-objective optimization methods can be divided into four classes. [2] In so-called no-preference methods, no DM is expected to be available, and a neutral compromise solution is identified without preference information. [1] The other classes are the so-called a priori, a posteriori and interactive methods, and they all involve preference information from the DM in different ways.

In a priori methods, preference information is first asked from the DM and then a solution best satisfying these preferences is found. In a posteriori methods, a representative set of Pareto optimal solutions is first found and then the DM must choose one of them. In interactive methods, the decision maker is allowed to iteratively search for the most preferred solution. In each iteration of the interactive method, the DM is shown Pareto optimal solution(s) and describes how the solution(s) could be improved. The information given by the decision maker is then taken into account while generating new Pareto optimal solution(s) for the DM to study in the next iteration. In this way, the DM learns about the feasibility of his/her wishes and can concentrate on solutions that are interesting to him/her. The DM may stop the search whenever he/she wants to. More information and examples of different methods in the four classes are given in the following sections.

Scalarizing a multi-objective optimization problem is an a priori method, which means formulating a single-objective optimization problem such that optimal solutions to the single-objective optimization problem are Pareto optimal solutions to the multi-objective optimization problem. [2] In addition, it is often required that every Pareto optimal solution can be reached with some parameters of the scalarization. [2] With different parameters for the scalarization, different Pareto optimal solutions are produced. A general formulation for a scalarization of a multi-objective optimization problem is thus

\[ \min_{x \in X_\theta} \; g\bigl(f_1(x), \ldots, f_k(x); \theta\bigr), \]

where \(\theta\) is a parameter vector of the scalarization and the feasible set \(X_\theta\) may also depend on \(\theta\).

Very well-known examples are the so-called linear scalarization

\[ \min_{x \in X} \; \sum_{i=1}^{k} w_i f_i(x), \]

where the weights \(w_i > 0\) are the parameters of the scalarization, and the \(\varepsilon\)-constraint method, in which one objective is minimized while the others are bounded from above.
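As a concrete illustration, the linear (weighted-sum) scalarization can be sketched as follows. This is a minimal sketch on a hypothetical two-objective problem, \(f_1(x) = x^2\) and \(f_2(x) = (x-2)^2\) (not from the text), whose Pareto optimal set is the interval \([0, 2]\); a brute-force grid search stands in for a proper single-objective solver.

```python
# Weighted-sum scalarization: min_x  w1*f1(x) + w2*f2(x).
# Hypothetical two-objective problem: f1(x) = x^2, f2(x) = (x - 2)^2,
# whose Pareto optimal set is the interval [0, 2].

def f1(x): return x * x
def f2(x): return (x - 2.0) ** 2

def weighted_sum_minimizer(w1, w2, xs):
    """Return the grid point minimizing the scalarized objective."""
    return min(xs, key=lambda x: w1 * f1(x) + w2 * f2(x))

xs = [i / 1000.0 for i in range(-1000, 3001)]  # grid on [-1, 3]

# Sweeping the weight parameters traces out (a discretization of)
# the Pareto front; extreme weights recover the individual optima.
front = [weighted_sum_minimizer(w / 10.0, 1.0 - w / 10.0, xs)
         for w in range(11)]
print(front[0], front[-1])
```

Each choice of weights is one parameterization of the scalarization; every minimizer it returns is Pareto optimal for the original problem.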

Somewhat more advanced examples are:

  • achievement scalarizing problems of Wierzbicki. [39] One example of the achievement scalarizing problems can be formulated as

    \[ \min_{x \in S} \; \max_{i=1,\ldots,k} w_i \bigl( f_i(x) - \bar{z}_i \bigr) + \rho \sum_{i=1}^{k} w_i \bigl( f_i(x) - \bar{z}_i \bigr), \]

    where \(\bar{z}\) is a reference point of desirable objective values and \(\rho > 0\) is a small augmentation coefficient guaranteeing Pareto optimality
  • Sen's Multi-Objective Programming[40]
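An achievement scalarizing problem of the Wierzbicki type can be sketched as follows. This is a minimal sketch, assuming a hypothetical two-objective problem \(f_1(x) = x^2\), \(f_2(x) = (x-2)^2\) (not from the text) and a grid search in place of a real solver; the reference point `zbar` plays the role of the DM's aspiration levels.

```python
# Achievement scalarizing function (Wierzbicki-style sketch): given a
# reference point zbar in objective space, minimize
#   max_i w_i*(f_i(x) - zbar_i) + rho * sum_i w_i*(f_i(x) - zbar_i).
# The small augmentation term rho steers the result toward (proper)
# Pareto optimality. Hypothetical problem: f1(x)=x^2, f2(x)=(x-2)^2.

def objectives(x):
    return (x * x, (x - 2.0) ** 2)

def achievement(x, zbar, w=(1.0, 1.0), rho=1e-4):
    terms = [wi * (fi - zi) for fi, zi, wi in zip(objectives(x), zbar, w)]
    return max(terms) + rho * sum(terms)

xs = [i / 1000.0 for i in range(-1000, 3001)]
zbar = (0.5, 0.5)  # decision maker's aspiration levels
best = min(xs, key=lambda x: achievement(x, zbar))
print(best, objectives(best))
```

Moving `zbar` around objective space produces different Pareto optimal solutions, which is how reference-point methods let the DM explore the front.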

When a decision maker does not explicitly articulate any preference information, the multi-objective optimization method can be classified as a no-preference method. [2] A well-known example is the method of global criterion, [41] in which a scalarized problem of the form

\[ \min_{x \in S} \; \bigl\| f(x) - z^{\text{ideal}} \bigr\| \]

is solved, where \(z^{\text{ideal}}\) is the ideal objective vector whose components are the individual optima of the objectives. In the above problem, \(\| \cdot \|\) can be any \(L_p\) norm, with common choices including \(L_1\), \(L_2\) and \(L_\infty\). [1] The method of global criterion is sensitive to the scaling of the objective functions; thus, it is recommended that the objectives be normalized into a uniform, dimensionless scale. [1] [38]
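The method of global criterion can be sketched as follows. This is a minimal sketch, assuming a hypothetical two-objective problem \(f_1(x) = x^2\), \(f_2(x) = (x-2)^2\) (not from the text) whose objectives are already on comparable scales, with a grid search in place of a real solver.

```python
# Method of global criterion (sketch): minimize the L_p distance
# between f(x) and the ideal objective vector z_ideal, whose
# components are the individual minima of each objective.

def objectives(x):
    return (x * x, (x - 2.0) ** 2)

xs = [i / 1000.0 for i in range(-1000, 3001)]  # grid on [-1, 3]

# Ideal point: each objective minimized separately.
z_ideal = (min(objectives(x)[0] for x in xs),
           min(objectives(x)[1] for x in xs))

def lp_dist(z, zi, p):
    diffs = [abs(a - b) for a, b in zip(z, zi)]
    if p == float("inf"):
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)

# Compare the common norm choices L1, L2 and L-infinity.
results = {}
for p in (1, 2, float("inf")):
    results[p] = min(xs, key=lambda x: lp_dist(objectives(x), z_ideal, p))
print(results)
```

On this symmetric toy problem all three norms happen to pick the same compromise; on real problems with differently scaled objectives the choice of norm (and the normalization the text recommends) matters.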

A priori methods require that sufficient preference information is expressed before the solution process. [2] Well-known examples of a priori methods include the utility function method, lexicographic method, and goal programming.

In the utility function method, it is assumed that the decision maker's utility function is available and that the most preferred solution maximizes it, but in practice it is very difficult to construct a utility function that would accurately represent the decision maker's preferences [1], particularly since the Pareto front is unknown before the optimization begins.

The lexicographic method assumes that the objectives can be ranked in order of importance. We can assume, without loss of generality, that the objective functions are in the order of importance, so that \(f_1\) is the most important and \(f_k\) the least important to the decision maker. The lexicographic method consists of solving a sequence of single-objective optimization problems of the form

\[ \min_{x \in S} \; f_\ell(x) \quad \text{subject to} \quad f_j(x) \le y_j^*, \;\; j = 1, \ldots, \ell - 1, \]

for \(\ell = 1, \ldots, k\), where \(y_j^*\) denotes the optimal value of the \(j\)-th problem in the sequence.
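The lexicographic idea is easiest to see on a finite candidate set, where each constrained subproblem reduces to filtering. This is a minimal sketch with hypothetical objective vectors (not from the text); `f1` is the most important objective and `f3` the least.

```python
# Lexicographic method (sketch): optimize the objectives one at a time
# in order of importance, each time keeping only the candidates that
# attain the optimum of the current level.

def lexicographic_min(candidates, objectives):
    """candidates: finite list of decision alternatives;
    objectives: list of functions, ordered by decreasing importance."""
    feasible = list(candidates)
    for f in objectives:
        best = min(f(c) for c in feasible)
        # Keep only candidates attaining the optimum at this level.
        feasible = [c for c in feasible if f(c) == best]
    return feasible

# Hypothetical candidates, each a vector of three objective values.
cands = [(1, 5, 9), (1, 3, 7), (2, 0, 0), (1, 3, 4)]
order = [lambda c: c[0], lambda c: c[1], lambda c: c[2]]
print(lexicographic_min(cands, order))  # -> [(1, 3, 4)]
```

Note how (2, 0, 0) is discarded at the first level despite its excellent lower-priority values: lexicographic ordering never trades any amount of a more important objective for a less important one.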

A posteriori methods aim at producing all the Pareto optimal solutions or a representative subset of the Pareto optimal solutions. Most a posteriori methods fall into either one of the following two classes: mathematical programming-based a posteriori methods, where an algorithm is repeated and each run of the algorithm produces one Pareto optimal solution, and evolutionary algorithms where one run of the algorithm produces a set of Pareto optimal solutions.

Well-known examples of mathematical programming-based a posteriori methods are the Normal Boundary Intersection (NBI), [42] Modified Normal Boundary Intersection (NBIm), [43] Normal Constraint (NC), [44] [45] Successive Pareto Optimization (SPO) [46] and Directed Search Domain (DSD) [47] methods, which solve the multi-objective optimization problem by constructing several scalarizations. The solution to each scalarization yields a Pareto optimal solution, whether locally or globally. The scalarizations of the NBI, NBIm, NC and DSD methods are constructed so as to obtain evenly distributed Pareto points that give a good approximation of the real set of Pareto points.

Evolutionary algorithms are popular approaches to generating Pareto optimal solutions to a multi-objective optimization problem. Currently, most evolutionary multi-objective optimization (EMO) algorithms apply Pareto-based ranking schemes. Evolutionary algorithms such as the Non-dominated Sorting Genetic Algorithm-II (NSGA-II) [48] and Strength Pareto Evolutionary Algorithm 2 (SPEA-2) [49] have become standard approaches, although some schemes based on particle swarm optimization and simulated annealing [50] are also significant. The main advantage of evolutionary algorithms, when applied to solve multi-objective optimization problems, is that they typically generate sets of solutions, allowing computation of an approximation of the entire Pareto front. Their main disadvantages are their lower speed and the fact that Pareto optimality of the generated solutions cannot be guaranteed: it is only known that none of the generated solutions dominates the others.
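The Pareto-based ranking that EMO algorithms rely on reduces to a dominance test. The sketch below (minimization assumed, hypothetical objective vectors) shows the dominance relation and the non-dominated filter that extracts the first front of a population, as in the ranking step of NSGA-II-style algorithms.

```python
# Pareto dominance (minimization): a dominates b if a is no worse in
# every objective and strictly better in at least one.

def dominates(a, b):
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def non_dominated(points):
    """Return the points not dominated by any other point
    (the first non-domination front of the population)."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]

# Hypothetical population of objective vectors.
pts = [(1.0, 4.0), (2.0, 2.0), (3.0, 3.0), (4.0, 1.0)]
print(non_dominated(pts))
```

Here (3.0, 3.0) is dominated by (2.0, 2.0) and is removed; the survivors are mutually non-dominated, which is exactly the guarantee the text says EMO output carries.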

Another paradigm for multi-objective optimization based on novelty using evolutionary algorithms was recently improved upon. [51] This paradigm searches for novel solutions in objective space (i.e., novelty search [52] on objective space) in addition to the search for non-dominated solutions. Novelty search acts like stepping stones, guiding the search to previously unexplored places. It is especially useful in overcoming bias and plateaus, as well as in guiding the search in many-objective optimization problems.
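A common way to quantify novelty, sketched here under the assumption of a standard novelty-search metric (mean distance to the k nearest archived points; the archive values are hypothetical), is:

```python
# Novelty score in objective space (sketch): a candidate's novelty is
# its mean distance to the k nearest objective vectors in an archive of
# previously visited points; high-novelty candidates steer the search
# toward unexplored regions of objective space.
import math

def novelty(z, archive, k=3):
    dists = sorted(math.dist(z, a) for a in archive)
    nearest = dists[:k]
    return sum(nearest) / len(nearest)

# Hypothetical archive of visited objective vectors.
archive = [(0.0, 0.0), (0.1, 0.1), (0.2, 0.0), (5.0, 5.0)]
print(novelty((0.05, 0.05), archive))  # low: crowded region
print(novelty((3.0, 3.0), archive))    # high: unexplored region
```

Selecting partly on this score, rather than on dominance alone, is what provides the "stepping stones" across plateaus described above.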

Commonly known a posteriori methods are listed below:

  • ε-constraint method [53][54]
  • Multiple-objective Branch-and-Bound [55][56][57]
  • Normal Boundary Intersection (NBI) [42]
  • Modified Normal Boundary Intersection (NBIm) [43]
  • Normal Constraint (NC) [44][45]
  • Successive Pareto Optimization (SPO) [46]
  • Directed Search Domain (DSD) [47]
  • NSGA-II [48]
  • PGEN (Pareto surface generation for convex multi-objective instances) [58]
  • IOSO (Indirect Optimization on the basis of Self-Organization)
  • SMS-EMOA (S-metric selection evolutionary multi-objective algorithm) [59]
  • Approximation-Guided Evolution (the first algorithm to directly implement and optimize the formal concept of approximation from theoretical computer science) [60]
  • Reactive Search Optimization (using machine learning for adapting strategies and objectives), [61][62] implemented in LIONsolver
  • Benson's algorithm for multiple objective linear programs and multiple objective convex programs
  • Subpopulation Algorithm based on Novelty [51]
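The ε-constraint method listed above can be sketched as follows: keep one objective, turn the others into constraints bounded by ε, and sweep ε to trace the front. This is a minimal sketch on a hypothetical two-objective problem \(f_1(x) = x^2\), \(f_2(x) = (x-2)^2\) (not from the text), with a grid search in place of a constrained solver.

```python
# ε-constraint method (sketch): minimize f1 subject to f2(x) <= eps,
# sweeping the bound eps to generate different Pareto optimal points.

def f1(x): return x * x
def f2(x): return (x - 2.0) ** 2

xs = [i / 1000.0 for i in range(-1000, 3001)]  # grid on [-1, 3]

def eps_constraint_min(eps):
    feasible = [x for x in xs if f2(x) <= eps]
    return min(feasible, key=f1) if feasible else None

# Tightening eps trades f1 quality for f2 quality.
sweep = {eps: eps_constraint_min(eps) for eps in (4.0, 1.0, 0.25)}
print(sweep)
```

Unlike the weighted sum, the ε-constraint scalarization can reach Pareto optimal points even on non-convex fronts, which is one reason it appears among the a posteriori methods.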

In interactive methods of optimizing multiple objective problems, the solution process is iterative and the decision maker continuously interacts with the method when searching for the most preferred solution (see e.g. Miettinen 1999 [1] and Miettinen 2008 [63]). In other words, the decision maker is expected to express preferences at each iteration in order to obtain Pareto optimal solutions that are of interest to him/her and to learn what kind of solutions are attainable.

The following steps are commonly present in interactive methods of optimization: [63]

  1. initialize (e.g. calculate ideal and approximated nadir objective vectors and show them to the decision maker)
  2. generate a Pareto optimal starting point (by using e.g. some no-preference method or solution given by the decision maker)
  3. ask for preference information from the decision maker (e.g. aspiration levels or number of new solutions to be generated)
  4. generate new Pareto optimal solution(s) according to the preferences and show it/them and possibly some other information about the problem to the decision maker
  5. if several solutions were generated, ask the decision maker to select the best solution so far
  6. stop (if the decision maker wants to; otherwise, go to step 3).

The above aspiration levels refer to desirable objective function values forming a reference point. Instead of the mathematical convergence often used as a stopping criterion in mathematical optimization methods, psychological convergence is often emphasized in interactive methods. Generally speaking, a method is terminated when the decision maker is confident that he/she has found the most preferred solution available.
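The interactive loop above can be sketched as follows. This is a minimal sketch, assuming a hypothetical two-objective problem \(f_1(x) = x^2\), \(f_2(x) = (x-2)^2\) (not from the text) whose Pareto set is \([0, 2]\); the "decision maker" is a scripted stand-in that supplies a sequence of reference points (aspiration levels), and each iteration returns the Pareto optimal point closest to them.

```python
# Interactive-method skeleton (sketch): at each iteration the DM gives
# aspiration levels (a reference point in objective space) and is shown
# the Pareto optimal solution closest to them.
import math

def objectives(x):
    return (x * x, (x - 2.0) ** 2)

pareto_xs = [i / 100.0 for i in range(0, 201)]  # Pareto set, discretized

def closest_to_reference(zbar):
    return min(pareto_xs,
               key=lambda x: math.dist(objectives(x), zbar))

# Scripted DM: starts from ambitious aspirations, then relaxes f2.
dm_references = [(0.0, 0.0), (0.5, 1.0), (0.2, 2.0)]
for zbar in dm_references:
    x = closest_to_reference(zbar)
    print(zbar, "->", x, objectives(x))
# In a real interactive method the DM would inspect each shown solution
# and revise the reference point until psychologically converged.
```

The loop makes the learning aspect concrete: by watching how the returned solutions move as the aspirations change, the DM discovers which wishes are attainable, exactly as described in the steps above.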

Types of preference information

There are different interactive methods involving different types of preference information. Three of those types can be identified based on trade-off information, reference points, and the classification of objective functions.


G. M. Ademenko, “A minimization method for functions of n variables,” Izv. Akad. Nauk BSSR, Ser. Fiz.-Mat. Nauk, No. 2, 131–132 (1972).

G. M. Ademenko, R. Gabasov, and A. A. Edinovich, “Minimization of functions of a finite number of variables by second-order methods,” Zh. Vychisl. Matem. I. Matem. Fiz.,11, No. 5, 1139–1149 (1971).

G. M. Adel'son-Vel'skii, V. L. Arlazarov, and F. M. Filler, “Search for the extrema of a function of several variables,” in: Proc. First Winter School on Mathematical Programming, 1968 [in Russian], No. 2, Moscow (1969), pp. 205–215.

S. I. Al'ber and Ya. I. Al'ber, “Application of the method of differential descent for the solution of nonlinear systems,” Zh. Vychisl. Matem. i Matem. Fiz.,7, No. 1, 14–32 (1967).

Ya. I. Al'ber, “Implicit difference scheme for the minimization of functionals and the solution of nonlinear systems,” Izv. Vyssh. Ucheb. Zaved., Radiofiz.,10, No. 7, 1035–1941 (1967).

G. E. Antonov and V. Ya. Katkovnik, “Filtering and smoothing of functions of several variables in order to find a global extremum,” Avtomat. i Vychisl. Tekh., No. 4, 32–38 (1970).

E. A. Arais and M. N. Golovchiner, “Optimization of ravine functions,” in: Aspects of Programming and Design Automation [in Russian], Tomsk. Univ., Tomsk (1971), pp. 98–108.

G. M. Bakan, “The use of certain extremum-locating methods for a function of several variables in optimization problems,” in: Complex Control Systems (All-Republic Inter-institutional Collection, No. 3) [in Russian] (1967), pp. 106–122.

P. I. Bartolomei, “Use of the Monte Carlo approach for calculation of the optimal regime of an energy system on a digital computer,” Trudy Ural'sk. Politekh. Inst., No. 154, 72–82 (1966).

D. I. Batishev, “Test functions for the comparison of extremum-locating methods for functions of several variables,” Avtomat. i Vychisl. Tekh., No. 1, 16–19 (1968).

A. V. Buledza, “Improvement of estimates for the method of steepest descent,” Dopovidi Akad. Nauk Ukr. RSR, No. 2, 150–153 (1962).

L. M. Vasserman, B. B. Levi, and Yu. A. Nazarkin, “Steepest descent with variation of dimensions,” Nauch. Trudy Kursk. Politekh. Inst., No. I, Part 2, 14–21 (1971).

M. K. Gavurin and Yu. B. Farforovskaya, “An iterative method for locating the minimum of the sum of the squares,” Zh. Vychisl. Matem. i Matem. Fiz.,6, No. 6, 1094–1097 (1966).

A. L. Gaidukov, “Random search in optimal planning,” in: Applied Problems of Engineering Cybernetics [in Russian], Sov. Radio, Moscow (1966), pp. 420–434.

B. A. Galanov, “A method of connected differential descent,” in: Mathematical Methods in Cybernetic Engineering [in Russian], No. 1, Kiev (1970), pp. 8–12.

B. A. Galanov, “Gradient methods for the solution of systems of linear algebraic equations,” Dopcvidi Akad. Nauk Ukr. RSR, No. 12, 1519–1522 (1966).

B. A. Galanov and Yu. M. Krivonos, “A differential-descent method in extremal problems,” in: Proc. Semin. Mathematical Methods in Special-Purpose Computer Engineering [in Russian], No. 1, Kiev (1969), pp. 38–47.

A. I. Galushkin, “Analysis of an iterative extremum-locating method,” Avtomat. i Vychisl. Tekh., No. 2, 38–40 (1970).

I. M. Gel'fand and S. V. Fomin, Variational Calculus [in Russian], Fizmatgiz, Moscow (1961), 228 pages.

I. M. Gel'fand and M. L. Tsetlin, “Some control techniques for complex systems,” Usp. Matem. Nauk,17, No. 1, 3–25 (1962).

I. M. Gel'fand and M. L. Tsetlin, “A nonlocal search principle for automatic optimization systems,” Dokl. Akad. Nauk SSSR,137, No. 2, 295–298 (1961).

S. M. Gerashchenko, “Assignment of right-hand sides in a system of differential equations for the gradient method,” Differents. Uravnen.,3, No. 12, 2144–2150 (1967).

S. M. Gerashchenko, “Application of automatic control theory to the differential-descent method,” in: Proc. Second All-Republic Conf. Belorussian Mathematicians [in Russian], Belorussk. Univ., Minsk (1969), pp. 179–191.

S. M. Gerashchenko, “Comparison of certain modifications of differential descent in terms of convergence rate,” Differents. Uravnen.,6, No. 10, 1810–1817 (1970).

G. I. Gershengom, “Method of ravines and random directions,” Inform. Sborn. Trudov Vychisl. Tsentra Irkutsk. Univ., No. 1, 59–71 (1966).

L. S. Gurin, “Experience in the use of Monte Carlo methods for finding the extremal values of functions,” in: Aspects of Computational Mathematics and Computer Engineering [in Russian], Mashgiz, Moscow (1963), pp. 118–123.

L. S. Gurin, “Optimization in stochastic models,” Zh. Vychisl. Matem. i Matem. Fiz.,4, No. 6, 1134–1137 (1964).

L. S. Gurin and V. P. Lobach, “Combination of the Monte Carlo method with the steepest-descent method for the solution of certain extremal problems,” Zh. Vychisl. Matem. i Matem. Fiz.,2, No. 3, 499–502 (1962).

Yu. M. Danilin, “Minimization methods based on approximation of the objective functional by convex functionals,” in: Proc. Semin. Optimal Decision Theory [in Russian], No. 1, Kiev (1969), pp. 45–71.

Yu. M. Danilin, “Estimate of the efficiency of an algorithm for the location of an absolute minimum,” Zh. Vychisl. Matem. i Matem. Fiz.,11, No. 4, 1026–1031 (1971).

Yu. M. Danilin and S. A. Piyavskii, “An-algorithm for the absolute minimum,” in: Proc. Semin. Optimal Decision Theory [in Russian], No. 2, Kiev (1967), pp. 25–37.

Yu. M. Danilin and B. N. Pshenichnyi, “Minimization techniques with accelerated convergence,” Zh. Vychisl. Matem. i Matem. Fiz.,10, No. 6, 1341–1354 (1970).

V. F. Dem'yanov and A. M. Rubinov, Approximation Methods for the Solution of Extremal Problems [in Russian], Leningrad Univ. (1968), 180 pages.

V. I. Denisov, “Combination of two methods for locating an extremum of a multifactorial system,” Avtomat. i Vychisl. Tekh., No. 6, 32–36 (1968).

A. D. Dobysh, “An algorithm for minimization of a function computed with random error,” Sborn. Trudov Moskov. Inzh.-Stroit. Inst., No. 83, 124–139 (1970).

Yu. G. Evtushenko, “Numerical method for locating the global extremum of a function (direct search on a nonuniform net),” Zh. Vychisl. Matem. i Matem. Fiz.,11, No. 6, 1390–1403 (1971).

N. P. Zhidkov and B. M. Shchedrin, “A method for finding the minimum of a function of several variables,” in: Computational Methods and Programming [in Russian], No. 10, Moskov. Univ., Moscow (1968), pp. 203–210.

G. Zoutendijk, Methods of Feasible Directions, Elsevier, New York-Amsterdam (1960).

A. K. Zuev, L. A. Rastrigin, and K. K. Ripa, “Correction of random-search algorithms for the solution of wandering-extremum tracking problems,” in: Statistical Optimization Problems [in Russian], Zinatne, Riga (1971), pp. 93–96.

V. V. Ivanov, “Problems in the optimization of computations,” in: Digital Computer Software [in Russian], Kiev (1972), pp. 149–172.

V. V. Ivanov, “Rapid-descent algorithms,” Dokl. Akad. Nauk SSSR,143, No. 4, 775–778 (1962).

V. V. Ivanov and K. Dzhumaev, “Analysis of the precision of Powell's algorithm (I),” in: Proc. Semin. Mathematical Methods and Special-Purpose Computer Engineering [in Russian], No. 1, Kiev (1968–69), pp. 3–12.

V. V. Ivanov and K. Dzhumaev, “Analysis of the precision of Powell's algorithm (II),” in: Mathematical Methods in Cybernetic Engineering [in Russian], No. 1, Kiev (1970), pp. 21–25.

V. V. Ivanov arid G. A. Kiyashko, “Optimal algorithms for minimization in the class of unimodal functions,” Dopovidi Akad. Nauk Ukr. RSR, No. 4, 313–316 (1972).

V. N. Il'in, “Modifications of the method of steepest descent,” Trudy Moskov. Inst. Radiotekh. Élektron. Avtomat., No. 46, 5–10 (1970).

A. N. Ioseliani, “An extremal-locating method for functions of several variables,” Trudy Gruz. Politekh. Inst., No. 5(110), 203–207 (1966).

A. N. Ioseliani, “Convergence of an extremum-locating algorithm for functions of several variables,” Trudy Gruz. Politekh. Inst., No. 2(114), 59–64 (1967).

A. N. Ioseliani, “A modification of search for an extremum by the method of tangent planes,” Trudy Inst. Élektron. Avtomat. Telemekhan. Akad. Nauk Gruz. SSR,8, No. 1, 74–80 (1970).

A. N. Ioseliani and M. E. Salukvadze, “Comparison of certain deterministic extremum-locating methods,” Trudy Inst. Élektron Avtomat. Telemekhan. Akad. Nauk Gruz. SSR,8, No. 1, 61–73 (1970).

L. V. Kantorovich and G. P. Akilov, Functional Analysis in Normed Spaces [in Russian], Fizmatgiz, Moscow (1959), 634 pages.

V. Ya. Katkovnik, “Sensitivity of gradient schemes,” Avtomat. i Telemekhan., No. 12, 87–94 (1970).

V. Ya. Katkovnik, “A generalized gradient descent algorithm,” Trudy Leningrad. Politekh. Inst., No. 318, 143–146 (1971).

A. B. Kovrigin, “Estimation of the rate of convergence of the K-step gradient method,” Vestnik Leningrad Univ., No. 13, 34–36 (1970).

Yu. M. Krivonos and B. A. Galanov, “A differential descent method,” Differents. Uravnen.,5, No. 8, 1426–1430 (1969).

A. I. Kuzovkin and V. M. Tikhomirov, “Volume of computations for finding the minimum of a convex function,” Ekonom. i Matem. Metody,3, No. 1, 95–103 (1967).

V. I. Kuptsov and E. G. Shurshkova, “Convergence of a modified Newton method,” Shorn. Rabot Vychisl. Tsentra Moskov. Univ.,14, 157–165 (1970).

V. A. Lavrov and V. A. Khokhlov, “Program for finding a local extremum of a function of several variables on computers of the Mir series,” in: Machines for Engineering Calculations [in Russian], No. 5, Kiev (1972), pp. 8–17.

G. S. Lbov, “Adaptive search for an extremum of a function of variables measured on a scale of denominations,” in: Computing Systems [in Russian], No. 44, Nauka, Novosibirsk (1971), pp. 13–22.

V. V. Leonov, “Computational method for the synthesis of optimization processes,” in: Second All-Union Congr. Theoretical and Applied Mechanics [in Russian], Nauka, Moscow (1964).

V. V. Leonov, “A covering method for finding the global maximum of a function of several variables,” in: Cybernetics Research [in Russian], Sov. Radio, Moscow (1970), pp. 41–52.

V. V. Leonov, “A method for finding the global maximum of a function of several vari-ables,” Sborn. Trudov Inst. Matem., Sibirsk. Otd. Akad. Nauk SSSR, No. 8, 43–50 (1971).

É. É. Loiter, L. A. Brichkin, and É. A. Nedel'chik, “A technique for accelerating the convergence of the iterative procedure in the solution of certain optimization problems,” Nauch. Trudy Kazakhsk. Politekh. Inst. (Alma-Ata) (1971), pp. 87–91.

Yu. I. Lyubich, “Convergence of the steepest-descent process,” Dokl. Akad. Nauk SSSR,179, No. 5, 1054–1056 (1968).

M. D. Maergoiz, “Parameter selection in the minimization problem,” Dokl. Akad. Nauk SSSR,188, No. 4, 752–755 (1969).

M. D. Maergoiz, “Application of the generalized Aitkin-Steffensen method to the function minimization problem,” in: Numerical Analysis [in Russian], No. 1, Kiev (1970), pp. 56–78.

M. D. Maergoiz, “Application of the generalized Aitkin-Steffensen method to the function minimization problem,” Sibirsk. Matem. Zh.,13, No. 1, 133–141 (1972).

G. D. Maistrovskii, “Convergence of the conjugate-gradient method,” Zh. Vychisl. Matem. i Matem. Fiz.,11, No. 5, 1291–1294 (1971).

N. N. Moiseev, Optimization Methods [in Russian], Vychisl. Tsentr., Moscow (1969), 96 pages, Chap. I: “The extremum-locating problem for functions of several variables,”

I. B. Motskus, Multiple-Extremum Problems in Design [in Russian], Nauka, Moscow (1967).

A. M. Myalkovskii and M. Kh. Tishavaeva, “An algorithm for finding the global maxima of multiple-extremum functions,” in: Aspects of Cybernetics and Computational Mathematics [in Russian], No. 11, Fan, Tashkent (1967), pp. 56–63.

Yu. A. Nazarkin, “Survey and classification of investigative methods for optimal systems,” Nauch. Trudy Kursk. Politekh. Inst., No. 1, Part 2, 5–13 (1971).

V. V. Nalimov and N. A. Chernova, Statistical Methods in Extremal Experimental Design [in Russian], Nauka, Moscow (1965), 340 pages.

I. I. Narozhnyi, “Combination of the gradient method and Euler equation for the optimization of aircraft flight paths,” Vestnik Leningrad. Univ., No. 13, 118–125 (1971).

Yu. I. Neimark and R. G. Strongin, “Search for an extremum of a function by the maximum-information principle,” Avtomat. i Telemekhan., No. 1, 113–118 (1966).

E. G. Nikolaev, “Steepest descent based on the random m-gradient method,” Avtomat. i Vychisl. Tekh., No. 3, 40–46 (1970).

E. G. Nikolaev, “A method for the random selection of descent direction,” in: Aspects of Cybernetics and Computational Mathematics [in Russian], No. 28, Fan, Tashkent (1969), pp. 160–167.

L. Ya. Oblomskaya, “Rate of convergence of the conjugate-gradient method for quadratic functionals,” in: Proc. Second Winter School on Mathematical Programming and Related Topics, 1969 [in Russian], No. 3, Moscow (1969), pp. 550–568.

E. V. Oganesyan, G. S. Gekchyan, and S. A. Piliposyan, “Nonequivalence of extremum-locating methods,” in: Automation of Chemical-Engineering Processes [in Russian], Erevan (1966), pp. 68–74.

A. A. Pervozvanskii, Search [in Russian], Nauka, Moscow (1970), 164 pages.

I. Sh. Pinsker and B. M. Tseitlin, “A nonlinear optimization problem,” Avtomat. i Tele-mekhan.,13, No. 12, 1611–1619 (1962).

I. Sh. Pinsker and B. M. Tseitlin, “Solution of the optimization problem by the method of independent search,” in: Automatic Control and Computer Engineering [in Russian], No. 6, Mashinostroenie, Moscow (1964), pp. 213–231.

S. A. Piyavskii, “Algorithm for finding the absolute minimum of a function,” in: Proc. Semin. Optimal Decision Theory [in Russian], No. 2, Kiev (1967), pp. 13–24.

S. A. Piyavskii, “An algorithm for finding the absolute extremum of a function,” Zh. Vychisl. Matem. i Matem. Fiz.,12, No. 4, 888–896 (1972).

V. Poll, “Methods for locating the stationary points of functions of several variables,” Eesti NSV Tead. Akad. Toimetised. Füüs.-Mat.,16, No. 1, 35–44 (1967).

V. Poll, “Convergence of certain methods for locating stationary points of functions of several variables,” Eesti NSV Tead. Akad. Toimetised. Füüs.-Mat.,16, No. 2, 157–167 (1967).

V. Poll, “Methods for locating stationary points,” Eesti NSV Tead. Akad. Toimetised. Füüs.-Mat.,16, No. 3, 382–384 (1967).

B. T. Polyak, “Techniques for accelerating the convergence of iterative methods,” Zh. Vychisl. Matem. i Matem. Fiz.,4, No. 5, 791–803 (1964).

B. T. Polyak, “Minimization methods for functions of several variables,” Ékonom. i Matem. Metody,3, No. 6, 881–901 (1967).

B. T. Polyak, “The conjugate-gradient method,” in: Proc. Second Winter School on Mathematical Programming and Related Topics, 1969 [in Russian], No. 1, Moscow (1969), pp. 152–201.

B. T. Polyak, “The conjugate-gradient method in extremal problems,” Zh. Vychisl. Matem. i Matem. Fiz.,9, No. 4, 807–821 (1969).

B. T. Polyak, “Convergence of methods of feasible directions in extremal problems,” Zh. Vychisl. Matem. i Matem. Fiz.,11, No. 4, 855–869 (1971).

B. T. Polyak and B. I. Shostakovskii, “A problem on the maximum of a function of several variables,” in: Sborn. Rabot Vychisl. Tsentra Moskov. Univ.,5, 107–114 (1966).

A. F. Potapova, “Accelerating the convergence of the steepest-descent method,” Zh. Vychisl. Matem. i Matem. Fiz.,11, No. 3, 749–752 (1971).

B. N. Pshenichnyi, “A descent algorithm,” Zh. Vychisl. Matem. i Matem. Fiz.,8, No. 3, 647–652 (1968).

B. N. Pshenichnyi and D. N. Marchenko, “An approach to the location of a global minimum,” in: Proc. Semin. Optimal Decision Theory [in Russian], No. 2, Kiev (1967), pp. 3–12.

T. L. Razinkova, “An algorithm for the computer-aided location of a function extremum,” Zh. Vychisl. Matem. i Matem. Fiz.,5, No. 4, 734–742 (1965).

L. A. Rastrigin, Random Search in Optimization Problems for Multiple-Parameter Systems [in Russian], Zinatne, Riga (1965).

L. A. Rastrigin, “Markovian and non-Markovian descent for extremal control in a noisy environment (II),” Izv. Akad. Nauk Latv. SSR, Ser. Fiz. Tekh. Nauk, No. 6, 80–87 (1965).

L. A. Rastrigin, Statistical Search Methods [in Russian], Nauka, Moscow (1968).

L. A. Rastrigin, Random Search with Linear Timing [in Russian], Zinatne, Riga (1971), 192 pages.

L. A. Rastrigin, “Principles of random search,” in: Theory and Application of Adaptive Systems [in Russian], Alma-Ata (1971), pp. 55–74.

V. P. Revyakin, G. S. Lbov, Yu. S. Lbov, and G. M. Shishkin, “An adaptive random-search algorithm for the solution of extremal problems,” Izv. Irkutsk. Sel'skokhoz. Inst.,3, No. 28, 97–113 (1970).

O. K. Romanov, “Heuristic method for successive improvement of the solution of a general optimization problem,” Uch. Zap. Moskov. Obl. Pedagog. Inst.,202, No. 6, 247–260 (1968).

M. E. Salukadze, “Optimum search for an extremum of a function of several variables,” in: Automatic Control [in Russian], Tbilisi (1967).

B. A. Samokish, “Analysis of the rate of convergence of the steepest-descent method,” Usp. Matem. Nauk,12, No. 1, 238–240 (1957).

I. V. Serienko, “Amethod for the solution of extremal-value problems,” Avtomatika (Kiev), No. 5, 15–24 (1964).

B. P. Serebryakov and V. F. Gurskii, “Interpolation algorithms for determining an extremum of a symmetric function,” Trudy Moskov. Aviats. Inst., No. 229, 90–99 (1971).

V. A. Skokov, “An algorithm for the minimization of functions of several variables,” in: Abstr. Sci. Conf. Scientists of Moscow State University [in Russian], MGU, Moscow (1968), pp. 18–19.

V. S. Spiridonov, “Application of the gradient relaxation method for the solution of systems of nonlinear equations,” Zh. Vychisl. Matem. i Matem. Fiz.,8, No. 4, 872–873 (1968).

L. A. Starosel'skii, G. A. Shelud'ko, and B. Ya. Kantor, “One realization of the method of ravines with adaptation of the ravine spacing by an exponential law,” Zh. Vychisl. Matem. i Matem. Fiz.,8, No. 5, 1161–1167 (1968).

R. I. Stakhovskii, “Comparison of certain search methods for an automatic optimizer,” in: Theory and Application of Discrete Automatic Systems [in Russian], Izd. AN SSSR, Moscow (1960).

R. G. Strongin, “Search for an extremum of a function of several variables by the maximum-information principle with an automatically adapted model,” in: Proc. All-Union Interinstitutional Sympos. Applied Mathematics and Cybernetics [in Russian], Gor'kii (1967), pp. 113–130.

R. G. Strongin, “Minimization of functions on the basis of the maximum-difference hypothesis,” in: Abstr. Fourth All-Union Semin. Statistical Optimization Problems [in Russian], Zinatne, Riga (1967).

R. G. Strongin, “Minimization of functions on the basis of the maximum-difference hypothesis,” in: Statistical Optimization Problems [in Russian], Riga (1968), pp. 41–50.

R. G. Strongin, “A global minimization algorithm,” Izv. Vyssh. Ucheb. Zaved., Radiofiz.,13, No. 4, 539–545 (1970).

R. G. Strongin, “Minimization of multiple-extremum functions of several variables,” Izv. Akad. Nauk SSSR, Tekh. Kibernet., No. 6, 39–46 (1971).

R. G. Strongin, “Algorithms for finding an absolute minimum,” in: Statistical Optimization Problems [in Russian], Zinatne, Riga (1971), pp. 51–68.

A. G. Sukharev, “Optimal extremum-seeking strategies,” Zh. Vychisl. Matem. i Matem. Fiz.,11, No. 4, 910–924 (1971).

L. T. Tarushkina, “A method for determining the extremum of a functional composed of bilinear forms,” Trudy Sibirsk. Fiz.-Tekh. Inst. pri Tomsk. Univ., No. 51, 100–103 (1970).

V. E. Truten', “Estimating the total error of an algorithm for finding a global minimum,” Vychisl. i Prikl. Matem., Mezhved. Nauch. Sborn., No. 8, 182–186 (1969).

D. J. Wilde, Optimum Seeking Methods, Prentice-Hall, Englewood Cliffs, N. J. (1964).

D. K. Faddeev and V. N. Faddeeva, Computational Methods of Linear Algebra [in Russian], Fizmatgiz, Moscow (1960), 656 pages.

A. V. Fiacco and G. P. McCormick, Nonlinear Programming: Sequential Unconstrained Minimization Techniques, Wiley, New York (1968).

K. Fujii, Y. Nishimura, and H. Taguchi, “Improvement of steepest-descent method for optimizing control systems,” J. Japan. Assoc. Automat. Control Engrs.,11, No. 1, 38–43 (1967).

O. K. Khanmamedov, “On the global search problem,” Avtomat. i Vychisl. Tekh., No. 3, 66–67 (1972).

Hsin-kuei Ho, “On the Newton method and the gradient method,” Intyun Shusyué Yui Tszi-suan' Shusyué,3, No. 3, 167–173 (1966).

I. V. Tsaritsyna, “A technique for accelerating the convergence of minimum-seeking methods for functions of several variables,” Vestnik Leningrad Univ., No. 7, 47–51 (1971).

B. M. Tseitlin and L. B. Lasovskaya, “An algorithm and program for the quadratic method of finding optimum parameters,” in: Automated Analysis and Control of Precision in Mechanical Engineering [in Russian], Nauka, Moscow (1967), pp. 53–68.

A. Tsuruo, “The status and the future of the Monte Carlo methods,” Nippon Genshiryoku Gakkaishi,6, No. 9, 523–527 (1964).

F. L. Chernous'ko, “Optimal search for an extremum of a unimodular function,” Zh. Vychisl. Matem. i Matem. Fiz.,10, No. 4, 922–933 (1970).


Using Symbolic Mathematics with Optimization Toolbox™ Solvers

This example shows how to use the Symbolic Math Toolbox™ functions jacobian and matlabFunction to provide analytical derivatives to optimization solvers. Optimization Toolbox™ solvers are usually more accurate and efficient when you supply gradients and Hessians of the objective and constraint functions.

Problem-based optimization can calculate and use gradients automatically; see Automatic Differentiation in Optimization Toolbox. For a problem-based example using automatic differentiation, see Constrained Electrostatic Nonlinear Optimization, Problem-Based.

There are several considerations in using symbolic calculations with optimization functions:

Optimization objective and constraint functions should be defined in terms of a vector, say x. However, symbolic variables are scalar or complex-valued, not vector-valued. This requires you to translate between vectors and scalars.

Optimization gradients, and sometimes Hessians, are supposed to be calculated within the body of the objective or constraint functions. This means that a symbolic gradient or Hessian has to be placed in the appropriate place in the objective or constraint function file or function handle.

Calculating gradients and Hessians symbolically can be time-consuming. Therefore, you should perform this calculation only once and generate code, via matlabFunction, to call during execution of the solver.

Evaluating symbolic expressions with the subs function is time-consuming. It is much more efficient to use matlabFunction.

matlabFunction generates code that depends on the orientation of input vectors. Since fmincon calls the objective function with column vectors, you must be careful to call matlabFunction with column vectors of symbolic variables.
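The same workflow (differentiate symbolically once, then generate a fast numeric function that a solver can call with a single vector argument) can be sketched outside MATLAB as well. The following Python/SymPy fragment is an illustrative analog, with lambdify playing the role of matlabFunction; the Rosenbrock-style objective is purely for demonstration and is not part of this example:

```python
import sympy as sp

# Scalar symbolic variables stand in for the components of the solver's vector x.
x1, x2 = sp.symbols('x1 x2')
f = (x1 - 1)**2 + 10*(x2 - x1**2)**2   # illustrative objective, minimized at (1, 1)

# Differentiate symbolically once...
grad = [sp.diff(f, v) for v in (x1, x2)]

# ...then generate plain numeric code (SymPy's analog of matlabFunction).
grad_num = sp.lambdify((x1, x2), grad)

# Wrap the generated function so a solver can pass one vector, not separate scalars.
def gradient(x):
    return grad_num(x[0], x[1])

print(gradient([1.0, 1.0]))  # the gradient vanishes at the minimum (1, 1)
```

The wrapper illustrates the scalar-versus-vector translation described above: the symbolic work is done with scalar variables, while the numeric interface matches what a solver expects.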

First Example: Unconstrained Minimization with Hessian

The objective function to minimize is:

f(x1, x2) = log(1 + 3*(x2 - (x1^3 - x1))^2 + (x1 - 4/3)^2).

This function is nonnegative, attaining its unique minimum value of zero at x1 = 4/3, x2 = (4/3)^3 - 4/3 ≈ 1.0370.

We write the independent variables as x1 and x2 because in this form they can be used as symbolic variables. As components of a vector x they would be written x(1) and x(2). The function has a narrow, twisting valley.
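A hedged sketch of this example in Python/SymPy (standing in for the Toolbox's jacobian and matlabFunction workflow) differentiates the objective once and confirms the stated minimizer exactly, using rational arithmetic:

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
f = sp.log(1 + 3*(x2 - (x1**3 - x1))**2 + (x1 - sp.Rational(4, 3))**2)

# Gradient and Hessian, computed symbolically once.
grad = sp.Matrix([f]).jacobian([x1, x2])
hess = sp.hessian(f, (x1, x2))

# At x1 = 4/3 and x2 = (4/3)^3 - 4/3 both squared terms vanish, so f = log(1) = 0.
xstar = {x1: sp.Rational(4, 3), x2: sp.Rational(4, 3)**3 - sp.Rational(4, 3)}
print(f.subs(xstar))     # prints 0
print(grad.subs(xstar))  # the gradient is zero at the minimizer

# The Hessian at xstar is positive definite, confirming a strict local minimum.
H = hess.subs(xstar)
```

In the Toolbox itself, grad and hess would next be passed to matlabFunction to generate the gradient and Hessian code that the solver calls at each iteration.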


Example 1

Show that the function $f(x, y) = x^2 - y^2$ does not have any extreme values.

We first note that $f$ has no singular or boundary points, so $f$ can attain extreme values only at critical points. Let's first determine the gradient of $f$. We have that:

$\nabla f(x, y) = (2x, -2y)$

Now $\nabla f(x, y) = 0$ only at the point $(0, 0) \in D(f)$, so $(0, 0)$ is our only critical point. Note that $f(x, 0) = x^2 > 0$ for $(x, 0) \in D(f)$ with $x \neq 0$. Similarly, $f(0, y) = -y^2 < 0$ for $(0, y) \in D(f)$ with $y \neq 0$. Thus $f$ does not attain an extreme value at $(0, 0)$. The following image shows a portion of the graph of $f$; as you can see, $f$ has neither a maximum nor a minimum at the origin.
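The saddle behavior argued above can be checked numerically with a few lines of plain Python (the helper names here are illustrative):

```python
# f(x, y) = x^2 - y^2 has a single critical point at the origin,
# yet rises along the x-axis and falls along the y-axis.
def f(x, y):
    return x**2 - y**2

# Gradient computed by hand: f_x = 2x, f_y = -2y.
def grad(x, y):
    return (2*x, -2*y)

assert grad(0, 0) == (0, 0)   # (0, 0) is the only critical point
assert f(0.1, 0) > f(0, 0)    # f(x, 0) > 0 for x != 0 ...
assert f(0, 0.1) < f(0, 0)    # ... and f(0, y) < 0 for y != 0: a saddle
```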

