The gradient descent algorithm iteratively adjusts the values of \( x \) and \( y \) to find the minimum of the function \( f(x, y) \). Starting from initial values \( x_0 \) and \( y_0 \), it uses the gradients of the function to make steps towards the minimum, controlled by the learning rate \( r \). The process continues until the changes in the values are sufficiently small or another stopping criterion is met. Here is a step-by-step description of the gradient descent process for a function \( f(x, y) \) with initial values \( x_0 \) and \( y_0 \), and a learning rate \( r \):

1. Initialization:

- Start with an initial guess for the variables \( x \) and \( y \). Let these initial values be \( x_0 \) and \( y_0 \).

- Choose a learning rate \( r \), which controls the size of the steps taken towards the minimum.

2. Compute Gradients:

- For the function \( f(x, y) \), compute the partial derivatives with respect to \( x \) and \( y \): \[ \dfrac{\partial f}{\partial x} \quad \text{and} \quad \dfrac{\partial f}{\partial y} \]

- Evaluate these partial derivatives at the current values of \( x \) and \( y \). Denote them \( \dfrac{\partial f}{\partial x}(x_k, y_k) \) and \( \dfrac{\partial f}{\partial y}(x_k, y_k) \), where \( k \) is the current iteration index.

3. Update Rules:

- Adjust the values of \( x \) and \( y \) by moving in the direction opposite to the gradients (since we are minimizing the function): \[ x_{k+1} = x_k - r \dfrac{\partial f}{\partial x}(x_k, y_k) \] \[ y_{k+1} = y_k - r \dfrac{\partial f}{\partial y}(x_k, y_k) \]

- Here, \( x_{k+1} \) and \( y_{k+1} \) are the updated values of \( x \) and \( y \) after one iteration.

4. Iteration:

- Repeat the process of computing the gradients and updating the values of \( x \) and \( y \) until a stopping criterion is met. The stopping criterion can be:

- A maximum number of iterations.

- The change in the values of \( x \) and \( y \) between iterations is smaller than a predefined threshold.

- The gradient magnitudes \( \dfrac{\partial f}{\partial x}(x_k, y_k) \) and \( \dfrac{\partial f}{\partial y}(x_k, y_k) \) are smaller than a predefined threshold.

5. Convergence:

- Once the stopping criterion is satisfied, the values of \( x \) and \( y \) should be close to a local minimizer of the function \( f(x, y) \).
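The five steps above can be sketched in Python. This is a minimal illustration, not the calculator's implementation; the example function \( f(x, y) = x^2 + y^2 \), the starting point, the learning rate, and the thresholds are all assumed choices for demonstration.

```python
def gradient_descent(df_dx, df_dy, x0, y0, r, tol=1e-6, max_iter=1000):
    """Minimize f(x, y) given its partial derivatives, following steps 1-5."""
    x, y = x0, y0                            # step 1: initialization
    for k in range(max_iter):                # step 4: iterate (max_iter is one stopping criterion)
        gx = df_dx(x, y)                     # step 2: gradients evaluated at (x_k, y_k)
        gy = df_dy(x, y)
        if abs(gx) < tol and abs(gy) < tol:  # stop when gradient magnitudes fall below tol
            break
        x = x - r * gx                       # step 3: update rules, moving against the gradient
        y = y - r * gy
    return x, y                              # step 5: approximate minimizer

# Example: f(x, y) = x^2 + y^2, whose minimum is at (0, 0)
x_min, y_min = gradient_descent(lambda x, y: 2 * x, lambda x, y: 2 * y,
                                x0=3.0, y0=-2.0, r=0.1)
```

With this learning rate each update scales \( x \) and \( y \) by a factor of \( 0.8 \), so both coordinates shrink geometrically toward the minimizer \( (0, 0) \).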

Another Gradient Descent Calculator to investigate convergence is also included.

Enter the function \( f(x,y) \), the initial values \( x_0 \) and \( y_0 \) of \( x \) and \( y \), and the learning rate \( r \).

The outputs are:

1) The partial derivatives \( \dfrac{\partial f}{\partial x} \; \text{and} \; \dfrac{\partial f}{\partial y} \) of \( f(x,y) \),

2) The step number along with the updated values of \( x \) and \( y \),

3) The values of the partial derivatives evaluated at the updated values of \( x \) and \( y \),

4) A graph of all the updated values of \( x \) and \( y \),

5) A table listing \( x \) and \( y \), the values of the partial derivatives, and the value of \( f(x,y) \).

Note from the table of values below that when there is convergence, the partial derivatives \( \dfrac{\partial f}{\partial x} \; \text{and} \; \dfrac{\partial f}{\partial y} \) approach zero and \( f(x,y) \) approaches its local minimum value.
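This behavior can be reproduced with a short Python loop that prints a table in the same format. The function \( f(x, y) = x^2 + y^2 \), the starting point, and the learning rate \( r = 0.1 \) are illustrative choices, not values taken from the calculator:

```python
# Example inputs (assumed for illustration): f(x, y) = x^2 + y^2
f = lambda x, y: x**2 + y**2      # minimum value 0 at (0, 0)
df_dx = lambda x, y: 2 * x
df_dy = lambda x, y: 2 * y

x, y, r = 3.0, -2.0, 0.1
print(f"{'Step':>4} {'x':>10} {'y':>10} {'df/dx':>10} {'df/dy':>10} {'f(x,y)':>10}")
for k in range(20):
    gx, gy = df_dx(x, y), df_dy(x, y)
    print(f"{k:>4} {x:>10.5f} {y:>10.5f} {gx:>10.5f} {gy:>10.5f} {f(x, y):>10.5f}")
    x, y = x - r * gx, y - r * gy   # update rules from step 3
```

Each printed row shows the partial derivatives shrinking in magnitude while \( f(x, y) \) decreases toward its minimum, which is exactly the pattern to look for in the calculator's output table.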

Enter \( f(x,y) = \)

Enter initial point \( (x_0, y_0) \):

Enter learning rate \( r \) =

Step | \( x \) | \( y \) | \( \dfrac{\partial f}{\partial x} \) | \( \dfrac{\partial f}{\partial y} \) | \( f(x, y) \)
---|---|---|---|---|---