optim #

Custom parameter optimizers.

Classes:

  • LevenbergMarquardtConfig

    Configuration for the Levenberg-Marquardt optimizer.

Functions:

  • levenberg_marquardt

    Optimize a given set of parameters using the Levenberg-Marquardt algorithm.

LevenbergMarquardtConfig pydantic-model #

Bases: BaseModel

Configuration for the Levenberg-Marquardt optimizer.

Fields:

mode pydantic-field #

mode: Mode = _ADAPTIVE

The mode to run the optimizer in: the default adaptive trust-radius mode, or the Hessian diagonal search mode (mode='hessian-search').

trust_radius pydantic-field #

trust_radius: float = 0.2

Target trust radius.

trust_radius_min pydantic-field #

trust_radius_min: float = 0.05

Minimum trust radius.

min_eigenvalue pydantic-field #

min_eigenvalue: float = 0.0001

Lower bound on the Hessian eigenvalues. If the smallest eigenvalue falls below this, a small amount of steepest descent is mixed in before taking the next step to try to correct it.
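
A hypothetical sketch of such a guard (not the package's exact implementation): shifting the Hessian diagonal by the eigenvalue deficit is equivalent to blending a little steepest descent into the Newton step.

import torch

def guarded_step(
    gradient: torch.Tensor, hessian: torch.Tensor, min_eigenvalue: float = 1e-4
) -> torch.Tensor:
    # hypothetical sketch, not descent's implementation: if the Hessian is
    # close to singular, shift its diagonal so the smallest eigenvalue is at
    # least ``min_eigenvalue``, i.e. mix in a little steepest descent.
    smallest = torch.linalg.eigvalsh(hessian).min()

    if smallest < min_eigenvalue:
        shift = (min_eigenvalue - smallest) * torch.eye(
            hessian.shape[0], device=hessian.device, dtype=hessian.dtype
        )
        hessian = hessian + shift

    # solve the (possibly shifted) Newton system for the step
    return torch.linalg.solve(hessian, -gradient)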

min_damping_factor pydantic-field #

min_damping_factor: float = 1.0

Minimum damping factor.

adaptive_factor pydantic-field #

adaptive_factor: float = 0.25

Adaptive trust radius adjustment factor to use when running in adaptive mode.

adaptive_damping pydantic-field #

adaptive_damping: float = 1.0

Adaptive trust radius adjustment damping to use when running in adaptive mode.

search_tolerance pydantic-field #

search_tolerance: float = 0.0001

The tolerance used when searching for the optimal damping factor with the Hessian diagonal search (i.e. mode='hessian-search').

search_trust_radius_max pydantic-field #

search_trust_radius_max: float = 0.001

The maximum trust radius to use when falling back to a second line search if the loss would increase after the first.

search_trust_radius_factor pydantic-field #

search_trust_radius_factor: float = 0.1

The factor to scale the trust radius by when falling back to a second line search.

error_tolerance pydantic-field #

error_tolerance: float = 1.0

Steps that increase the loss more than this amount are rejected.

quality_threshold_low pydantic-field #

quality_threshold_low: float = 0.25

The threshold below which the step is considered low quality.

quality_threshold_high pydantic-field #

quality_threshold_high: float = 0.75

The threshold above which the step is considered high quality.
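
These thresholds act on the step 'quality', i.e. the actual change in loss divided by the improvement predicted by the quadratic model (the ratio that appears in the source below). A sketch of how they might drive the adaptive trust-radius update follows; the shrink and grow rules here are illustrative assumptions, not the package's exact logic.

def adapt_trust_radius(loss, loss_prev, expected_improvement, trust_radius, config):
    # the quality ratio matches the source below; the update rule is assumed
    step_quality = (loss - loss_prev) / expected_improvement

    if step_quality < config.quality_threshold_low:
        # poor agreement with the quadratic model - shrink, but not below the minimum
        trust_radius = max(trust_radius * config.adaptive_factor, config.trust_radius_min)
    elif step_quality > config.quality_threshold_high:
        # very good agreement - allow a larger step next time
        trust_radius = trust_radius / config.adaptive_factor

    return trust_radius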

convergence_loss pydantic-field #

convergence_loss: float = 0.1

The loss will be considered converged if its std deviation over the last n_convergence_steps steps is less than this value.

convergence_gradient pydantic-field #

convergence_gradient: float = 0.1

The gradient will be considered converged if its norm is less than this value.

convergence_step pydantic-field #

convergence_step: float = 0.01

The step size will be considered converged if its norm is less than this value.

n_convergence_steps pydantic-field #

n_convergence_steps: int = 2

The number of steps to consider when checking for convergence in the loss.

n_convergence_criteria pydantic-field #

n_convergence_criteria: int = 2

The number of convergence criteria that must be satisfied before the optimization is considered converged. If 0, no convergence criteria will be used and the optimizer will run for max_steps full steps.
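
A hypothetical sketch of how these criteria might be combined (the actual check lives in a private helper and may differ):

import torch

def sketch_has_converged(dx, loss_history, gradient, config) -> bool:
    # hypothetical sketch only - combines the three documented criteria and
    # requires at least ``n_convergence_criteria`` of them to hold
    loss_converged = (
        len(loss_history) >= config.n_convergence_steps
        and torch.stack(loss_history[-config.n_convergence_steps :]).std()
        < config.convergence_loss
    )
    gradient_converged = torch.linalg.norm(gradient) < config.convergence_gradient
    step_converged = torch.linalg.norm(dx) < config.convergence_step

    n_satisfied = sum(
        bool(flag) for flag in (loss_converged, gradient_converged, step_converged)
    )
    return config.n_convergence_criteria > 0 and n_satisfied >= config.n_convergence_criteria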

max_steps pydantic-field #

max_steps: int

The maximum number of full steps to perform.
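
A minimal construction sketch, assuming the class is exposed from descent.optim; max_steps is the only field without a default:

from descent.optim import LevenbergMarquardtConfig

# only ``max_steps`` is required - every other field falls back to the
# defaults documented above
config = LevenbergMarquardtConfig(max_steps=100, trust_radius=0.2)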

levenberg_marquardt #

levenberg_marquardt(
    x: Tensor,
    config: LevenbergMarquardtConfig,
    closure_fn: ClosureFn,
    correct_fn: CorrectFn | None = None,
    report_fn: ReportFn | None = None,
) -> Tensor

Optimize a given set of parameters using the Levenberg-Marquardt algorithm.

Notes
  • This optimizer assumes a least-squares loss function.
  • This is a reimplementation of the Levenberg-Marquardt optimizer from the ForceBalance package, and so may differ from a standard implementation.

Parameters:

  • x (Tensor) –

    The initial guess of the parameters with shape=(n,).

  • config (LevenbergMarquardtConfig) –

    The optimizer config.

  • closure_fn (ClosureFn) –

    A function that computes the loss (shape=()), its gradient (shape=(n,)), and hessian (shape=(n, n)). It should accept as arguments the current parameter tensor, and two booleans indicating whether the gradient and hessian are required.

  • correct_fn (CorrectFn | None, default: None ) –

    A function that can be used to correct the parameters after each step is taken and before the new loss is computed. This may include, for example, ensuring that vdW parameters are all positive. It should accept as arguments the current parameter tensor and return the corrected parameter tensor.

  • report_fn (ReportFn | None, default: None ) –

    An optional function that should be called at the end of every step. This can be used to report the current state of the optimization. It should accept as arguments the step number, the current parameter tensor, the loss, gradient and hessian, the step 'quality', and a bool indicating whether the step was accepted or rejected.

Returns:

  • Tensor

    The parameters that minimize the loss.
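
An end-to-end sketch with a toy least-squares closure; the quadratic problem, the correction, and the import path are illustrative assumptions:

import torch

from descent.optim import LevenbergMarquardtConfig, levenberg_marquardt


def closure_fn(x: torch.Tensor, compute_gradient: bool, compute_hessian: bool):
    # toy least-squares loss ||x - 2||^2 with analytic gradient and Hessian
    residual = x - 2.0

    loss = (residual**2).sum()
    gradient = 2.0 * residual if compute_gradient else None
    hessian = 2.0 * torch.eye(len(x)) if compute_hessian else None

    return loss, gradient, hessian


def correct_fn(x: torch.Tensor) -> torch.Tensor:
    # illustrative correction: keep every parameter non-negative
    return torch.clamp(x, min=0.0)


x_0 = torch.tensor([1.0, 3.0, 0.5])
config = LevenbergMarquardtConfig(max_steps=50)

x_final = levenberg_marquardt(x_0, config, closure_fn, correct_fn)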

Source code in descent/optim/_lm.py
@torch.no_grad()
def levenberg_marquardt(
    x: torch.Tensor,
    config: LevenbergMarquardtConfig,
    closure_fn: ClosureFn,
    correct_fn: CorrectFn | None = None,
    report_fn: ReportFn | None = None,
) -> torch.Tensor:
    """Optimize a given set of parameters using the Levenberg-Marquardt algorithm.

    Notes:
        * This optimizer assumes a least-squares loss function.
        * This is a reimplementation of the Levenberg-Marquardt optimizer from the
          ForceBalance package, and so may differ from a standard implementation.

    Args:
        x: The initial guess of the parameters with ``shape=(n,)``.
        config: The optimizer config.
        closure_fn: A function that computes the loss (``shape=()``), its
            gradient (``shape=(n,)``), and hessian (``shape=(n, n)``). It should
            accept as arguments the current parameter tensor, and two booleans
            indicating whether the gradient and hessian are required.
        correct_fn: A function that can be used to correct the parameters after
            each step is taken and before the new loss is computed. This may
            include, for example, ensuring that vdW parameters are all positive.
            It should accept as arguments the current parameter tensor and return
            the corrected parameter tensor.
        report_fn: An optional function that should be called at the end of every
            step. This can be used to report the current state of the optimization.
            It should accept as arguments the step number, the current parameter tensor,
            the loss, gradient and hessian, the step 'quality', and a bool indicating
            whether the step was accepted or rejected.

    Returns:
        The parameters that minimize the loss.
    """

    x = x.clone().detach().requires_grad_(x.requires_grad)

    correct_fn = correct_fn if correct_fn is not None else lambda y: y
    closure_fn = torch.enable_grad()(closure_fn)

    report_fn = report_fn if report_fn is not None else lambda *_, **__: None

    closure_prev = closure_fn(x, True, True)
    trust_radius = torch.tensor(config.trust_radius).to(x.device)

    loss_history = []
    has_converged = False

    best_x, best_loss = x.clone(), closure_prev[0]

    for step in range(config.max_steps):
        loss_prev, gradient_prev, hessian_prev = closure_prev

        dx, expected_improvement, damping_adjusted, damping_factor = _step(
            gradient_prev, hessian_prev, trust_radius, config
        )

        if config.mode.lower() == _HESSIAN_SEARCH:
            dx, expected_improvement = _hessian_diagonal_search(
                x,
                closure_prev,
                closure_fn,
                correct_fn,
                damping_factor,
                trust_radius,
                config,
            )

        dx_norm = torch.linalg.norm(dx)
        _LOGGER.info(f"{config.mode} step found (length {dx_norm:.4e})")

        x_next = correct_fn(x + dx).requires_grad_(x.requires_grad)

        loss, gradient, hessian = closure_fn(x_next, True, True)
        loss_delta = loss - loss_prev

        step_quality = loss_delta / expected_improvement
        accept_step = True

        if loss > (loss_prev + config.error_tolerance):
            # reject the 'bad' step and try again from where we were
            loss, gradient, hessian = (loss_prev, gradient_prev, hessian_prev)
            trust_radius = _reduce_trust_radius(dx_norm, config)

            accept_step = False
        elif config.mode.lower() == _ADAPTIVE:
            # this was a 'good' step - we can maybe increase the trust radius
            trust_radius = _update_trust_radius(
                dx_norm, step_quality, trust_radius, damping_adjusted, config
            )

        if accept_step:
            x.data.copy_(x_next.data)
            loss_history.append(loss.detach().cpu().clone())

        if loss < best_loss:
            best_x, best_loss = x.clone(), loss.detach().clone()

        closure_prev = (loss, gradient, hessian)

        report_fn(step, x, loss, gradient, hessian, step_quality, accept_step)
        _LOGGER.info(f"step={step} loss={loss.detach().cpu().item():.4e}")

        if _has_converged(dx, loss_history, gradient, step_quality, config):
            _LOGGER.info(f"optimization has converged after {step + 1} steps.")
            has_converged = True

            break

    if not has_converged:
        _LOGGER.info(f"optimization has not converged after {config.max_steps} steps.")

    return best_x