optim #

Custom parameter optimizers.

Classes:

  • LevenbergMarquardtConfig

    Configuration for the Levenberg-Marquardt optimizer.

Functions:

  • levenberg_marquardt

    Optimize a given set of parameters using the Levenberg-Marquardt algorithm.

LevenbergMarquardtConfig pydantic-model #

Bases: BaseModel

Configuration for the Levenberg-Marquardt optimizer.

Fields:

mode pydantic-field #

mode: Mode = _ADAPTIVE

The mode to run the optimizer in: the default adaptive trust-radius mode, or the Hessian diagonal search mode (mode='hessian-search').

trust_radius pydantic-field #

trust_radius: float = 0.2

Target trust radius.

trust_radius_min pydantic-field #

trust_radius_min: float = 0.05

Minimum trust radius.

min_eigenvalue pydantic-field #

min_eigenvalue: float = 0.0001

Lower bound on the Hessian eigenvalues. If the smallest eigenvalue falls below this, a small amount of steepest descent is mixed in before taking the next step to try to correct it.
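
A hypothetical sketch of such a guard (not the package's exact implementation): shifting the Hessian diagonal by the eigenvalue deficit is equivalent to blending a little steepest descent into the Newton step.

import torch

def guarded_step(
    gradient: torch.Tensor, hessian: torch.Tensor, min_eigenvalue: float = 1e-4
) -> torch.Tensor:
    # hypothetical sketch, not descent's implementation: if the Hessian is
    # close to singular, shift its diagonal so the smallest eigenvalue is at
    # least ``min_eigenvalue``, i.e. mix in a little steepest descent.
    smallest = torch.linalg.eigvalsh(hessian).min()

    if smallest < min_eigenvalue:
        shift = (min_eigenvalue - smallest) * torch.eye(
            hessian.shape[0], device=hessian.device, dtype=hessian.dtype
        )
        hessian = hessian + shift

    # solve the (possibly shifted) Newton system for the step
    return torch.linalg.solve(hessian, -gradient)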

min_damping_factor pydantic-field #

min_damping_factor: float = 1.0

Minimum damping factor.

adaptive_factor pydantic-field #

adaptive_factor: float = 0.25

Adaptive trust radius adjustment factor to use when running in adaptive mode.

adaptive_damping pydantic-field #

adaptive_damping: float = 1.0

Adaptive trust radius adjustment damping to use when running in adaptive mode.

search_tolerance pydantic-field #

search_tolerance: float = 0.0001

The tolerance used when searching for the optimal damping factor with the Hessian diagonal search (i.e. mode='hessian-search').

search_trust_radius_max pydantic-field #

search_trust_radius_max: float = 0.001

The maximum trust radius to use when falling back to a second line search if the loss would increase after the first.

search_trust_radius_factor pydantic-field #

search_trust_radius_factor: float = 0.1

The factor to scale the trust radius by when falling back to a second line search.

error_tolerance pydantic-field #

error_tolerance: float = 1.0

Steps that increase the loss more than this amount are rejected.

quality_threshold_low pydantic-field #

quality_threshold_low: float = 0.25

The threshold below which the step is considered low quality.

quality_threshold_high pydantic-field #

quality_threshold_high: float = 0.75

The threshold above which the step is considered high quality.
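
These thresholds act on the step 'quality', i.e. the actual change in loss divided by the improvement predicted by the quadratic model (the ratio that appears in the source below). A sketch of how they might drive the adaptive trust-radius update follows; the shrink and grow rules here are illustrative assumptions, not the package's exact logic.

def adapt_trust_radius(loss, loss_prev, expected_improvement, trust_radius, config):
    # the quality ratio matches the source below; the update rule is assumed
    step_quality = (loss - loss_prev) / expected_improvement

    if step_quality < config.quality_threshold_low:
        # poor agreement with the quadratic model - shrink, but not below the minimum
        trust_radius = max(trust_radius * config.adaptive_factor, config.trust_radius_min)
    elif step_quality > config.quality_threshold_high:
        # very good agreement - allow a larger step next time
        trust_radius = trust_radius / config.adaptive_factor

    return trust_radius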

convergence_loss pydantic-field #

convergence_loss: float = 0.1

The loss will be considered converged if its std deviation over the last n_convergence_steps steps is less than this value.

convergence_gradient pydantic-field #

convergence_gradient: float = 0.1

The gradient will be considered converged if its norm is less than this value.

convergence_step pydantic-field #

convergence_step: float = 0.01

The step size will be considered converged if its norm is less than this value.

n_convergence_steps pydantic-field #

n_convergence_steps: int = 2

The number of steps to consider when checking for convergence in the loss.

n_convergence_criteria pydantic-field #

n_convergence_criteria: int = 2

The number of convergence criteria that must be satisfied before the optimization is considered converged. If 0, no convergence criteria will be used and the optimizer will run for max_steps full steps.
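
A hypothetical sketch of how these criteria might be combined (the actual check lives in a private helper and may differ):

import torch

def sketch_has_converged(dx, loss_history, gradient, config) -> bool:
    # hypothetical sketch only - combines the three documented criteria and
    # requires at least ``n_convergence_criteria`` of them to hold
    loss_converged = (
        len(loss_history) >= config.n_convergence_steps
        and torch.stack(loss_history[-config.n_convergence_steps :]).std()
        < config.convergence_loss
    )
    gradient_converged = torch.linalg.norm(gradient) < config.convergence_gradient
    step_converged = torch.linalg.norm(dx) < config.convergence_step

    n_satisfied = sum(
        bool(flag) for flag in (loss_converged, gradient_converged, step_converged)
    )
    return config.n_convergence_criteria > 0 and n_satisfied >= config.n_convergence_criteria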

max_steps pydantic-field #

max_steps: int

The maximum number of full steps to perform.
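
A minimal construction sketch, assuming the class is exposed from descent.optim; max_steps is the only field without a default:

from descent.optim import LevenbergMarquardtConfig

# only ``max_steps`` is required - every other field falls back to the
# defaults documented above
config = LevenbergMarquardtConfig(max_steps=100, trust_radius=0.2)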

levenberg_marquardt #

levenberg_marquardt(
    x: Tensor,
    config: LevenbergMarquardtConfig,
    closure_fn: ClosureFn,
    correct_fn: CorrectFn | None = None,
    report_fn: ReportFn | None = None,
) -> Tensor

Optimize a given set of parameters using the Levenberg-Marquardt algorithm.

Notes
  • This optimizer assumes a least-squares loss function.
  • This is a reimplementation of the Levenberg-Marquardt optimizer from the ForceBalance package, and so may differ from a standard implementation.

Parameters:

  • x (Tensor) –

    The initial guess of the parameters with shape=(n,).

  • config (LevenbergMarquardtConfig) –

    The optimizer config.

  • closure_fn (ClosureFn) –

    A function that computes the loss (shape=()), its gradient (shape=(n,)), and hessian (shape=(n, n)). It should accept as arguments the current parameter tensor, and two booleans indicating whether the gradient and hessian are required.

  • correct_fn (CorrectFn | None, default: None ) –

    A function that can be used to correct the parameters after each step is taken and before the new loss is computed. This may include, for example, ensuring that vdW parameters are all positive. It should accept as arguments the current parameter tensor and return the corrected parameter tensor.

  • report_fn (ReportFn | None, default: None ) –

    An optional function that should be called at the end of every step. This can be used to report the current state of the optimization. It should accept as arguments the step number, the current parameter tensor, the loss, gradient and hessian, the step 'quality', and a bool indicating whether the step was accepted or rejected.

Returns:

  • Tensor

    The parameters that minimize the loss.
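
An end-to-end sketch with a toy least-squares closure; the quadratic problem, the correction, and the import path are illustrative assumptions:

import torch

from descent.optim import LevenbergMarquardtConfig, levenberg_marquardt


def closure_fn(x: torch.Tensor, compute_gradient: bool, compute_hessian: bool):
    # toy least-squares loss ||x - 2||^2 with analytic gradient and Hessian
    residual = x - 2.0

    loss = (residual**2).sum()
    gradient = 2.0 * residual if compute_gradient else None
    hessian = 2.0 * torch.eye(len(x)) if compute_hessian else None

    return loss, gradient, hessian


def correct_fn(x: torch.Tensor) -> torch.Tensor:
    # illustrative correction: keep every parameter non-negative
    return torch.clamp(x, min=0.0)


x_0 = torch.tensor([1.0, 3.0, 0.5])
config = LevenbergMarquardtConfig(max_steps=50)

x_final = levenberg_marquardt(x_0, config, closure_fn, correct_fn)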

Source code in descent/optim/_lm.py
@torch.no_grad()
def levenberg_marquardt(
    x: torch.Tensor,
    config: LevenbergMarquardtConfig,
    closure_fn: ClosureFn,
    correct_fn: CorrectFn | None = None,
    report_fn: ReportFn | None = None,
) -> torch.Tensor:
    """Optimize a given set of parameters using the Levenberg-Marquardt algorithm.

    Notes:
        * This optimizer assumes a least-squares loss function.
        * This is a reimplementation of the Levenberg-Marquardt optimizer from the
          ForceBalance package, and so may differ from a standard implementation.

    Args:
        x: The initial guess of the parameters with ``shape=(n,)``.
        config: The optimizer config.
        closure_fn: A function that computes the loss (``shape=()``), its
            gradient (``shape=(n,)``), and hessian (``shape=(n, n)``). It should
            accept as arguments the current parameter tensor, and two booleans
            indicating whether the gradient and hessian are required.
        correct_fn: A function that can be used to correct the parameters after
            each step is taken and before the new loss is computed. This may
            include, for example, ensuring that vdW parameters are all positive.
            It should accept as arguments the current parameter tensor and return
            the corrected parameter tensor.
        report_fn: An optional function that should be called at the end of every
            step. This can be used to report the current state of the optimization.
            It should accept as arguments the step number, the current parameter tensor,
            the loss, gradient and hessian, the step 'quality', and a bool indicating
            whether the step was accepted or rejected.

    Returns:
        The parameters that minimize the loss.
    """

    x = x.clone().detach().requires_grad_(x.requires_grad)

    correct_fn = correct_fn if correct_fn is not None else lambda y: y
    closure_fn = torch.enable_grad()(closure_fn)

    report_fn = report_fn if report_fn is not None else lambda *_, **__: None

    closure_prev = closure_fn(x, True, True)
    trust_radius = torch.tensor(config.trust_radius).to(x.device)

    loss_history = []
    has_converged = False

    best_x, best_loss = x.clone(), closure_prev[0]

    for step in range(config.max_steps):
        loss_prev, gradient_prev, hessian_prev = closure_prev

        dx, expected_improvement, damping_adjusted, damping_factor = _step(
            gradient_prev, hessian_prev, trust_radius, config
        )

        if config.mode.lower() == _HESSIAN_SEARCH:
            dx, expected_improvement = _hessian_diagonal_search(
                x,
                closure_prev,
                closure_fn,
                correct_fn,
                damping_factor,
                trust_radius,
                config,
            )

        dx_norm = torch.linalg.norm(dx)
        _LOGGER.info(f"{config.mode} step found (length {dx_norm:.4e})")

        x_next = correct_fn(x + dx).requires_grad_(x.requires_grad)

        loss, gradient, hessian = closure_fn(x_next, True, True)
        loss_delta = loss - loss_prev

        step_quality = loss_delta / expected_improvement
        accept_step = True

        if loss > (loss_prev + config.error_tolerance):
            # reject the 'bad' step and try again from where we were
            loss, gradient, hessian = (loss_prev, gradient_prev, hessian_prev)
            trust_radius = _reduce_trust_radius(dx_norm, config)

            accept_step = False
        elif config.mode.lower() == _ADAPTIVE:
            # this was a 'good' step - we can maybe increase the trust radius
            trust_radius = _update_trust_radius(
                dx_norm, step_quality, trust_radius, damping_adjusted, config
            )

        if accept_step:
            x.data.copy_(x_next.data)
            loss_history.append(loss.detach().cpu().clone())

        if loss < best_loss:
            best_x, best_loss = x.clone(), loss.detach().clone()

        closure_prev = (loss, gradient, hessian)

        report_fn(step, x, loss, gradient, hessian, step_quality, accept_step)
        _LOGGER.info(f"step={step} loss={loss.detach().cpu().item():.4e}")

        if _has_converged(dx, loss_history, gradient, step_quality, config):
            _LOGGER.info(f"optimization has converged after {step + 1} steps.")
            has_converged = True

            break

    if not has_converged:
        _LOGGER.info(f"optimization has not converged after {config.max_steps} steps.")

    return best_x