mm #

Compute differentiable ensemble averages using OpenMM and SMEE.

Classes:

GenerateCoordsConfig –

Configure how coordinates should be generated for a system using PACKMOL.
MinimizationConfig –

Configure how a system should be energy minimized.
SimulationConfig –
NotEnoughSamplesError –

An error raised when an ensemble average is attempted with too few samples.
TensorReporter –

A reporter which stores coords, box vectors, reduced potentials and kinetic

Functions:

generate_dg_solv_data –

Run a solvation free energy calculation using absolv, and saves the output
generate_system_coords –

Generate coordinates for a system of molecules using PACKMOL.
simulate –

Simulate a SMEE system of molecules or topology.
compute_dg_solv –

Computes ∆G_solv from existing FEP data.
compute_ensemble_averages –

Compute ensemble average of the potential energy, volume, density,
reweight_dg_solv –

Computes ∆G_solv by re-weighting existing FEP data.
reweight_ensemble_averages –

Compute the ensemble average of the potential energy, volume, density,
tensor_reporter –

Create a TensorReporter capable of writing frames to a file.
unpack_frames –

Unpack frames saved by a TensorReporter.

GenerateCoordsConfig `pydantic-model` #

Bases: BaseModel

Configure how coordinates should be generated for a system using PACKMOL.

Fields:

target_density (OpenMMQuantity[_GRAMS_PER_ML])
scale_factor (float)
padding (OpenMMQuantity[angstrom])
tolerance (OpenMMQuantity[angstrom])
seed (int | None)

target_density `pydantic-field` #

target_density: OpenMMQuantity[_GRAMS_PER_ML] = (
    0.95 * _GRAMS_PER_ML
)

Target mass density for final system with units compatible with g / mL.

scale_factor `pydantic-field` #

scale_factor: float = 1.1

The amount to scale the approximate box size by to help alleviate issues with packing larger molecules.

padding `pydantic-field` #

padding: OpenMMQuantity[angstrom] = 2.0 * angstrom

The amount of padding to add to the final box size to help alleviate PBC issues.

tolerance `pydantic-field` #

tolerance: OpenMMQuantity[angstrom] = 2.0 * angstrom

The minimum spacing between molecules during packing.

seed `pydantic-field` #

seed: int | None = None

The random seed to use when generating the coordinates.

MinimizationConfig `pydantic-model` #

Bases: BaseModel

Configure how a system should be energy minimized.

Fields:

tolerance (OpenMMQuantity[_KCAL_PER_MOL / _ANGSTROM])
max_iterations (int)

tolerance `pydantic-field` #

tolerance: OpenMMQuantity[_KCAL_PER_MOL / _ANGSTROM] = (
    10.0 * _KCAL_PER_MOL / _ANGSTROM
)

Minimization will be halted once the root-mean-square value of all force components reaches this tolerance.

max_iterations `pydantic-field` #

max_iterations: int = 0

The maximum number of iterations to perform. If 0, minimization will continue until the tolerance is met.

SimulationConfig `pydantic-model` #

Bases: BaseModel

Fields:

temperature (OpenMMQuantity[kelvin])
pressure (OpenMMQuantity[atmospheres] | None)
n_steps (int)
timestep (OpenMMQuantity[femtoseconds])
friction_coeff (OpenMMQuantity[1.0 / picoseconds])

temperature `pydantic-field` #

temperature: OpenMMQuantity[kelvin]

The temperature to simulate at.

pressure `pydantic-field` #

pressure: OpenMMQuantity[atmospheres] | None

The pressure to simulate at, or none to run in NVT.

n_steps `pydantic-field` #

n_steps: int

The number of steps to simulate for.

timestep `pydantic-field` #

timestep: OpenMMQuantity[femtoseconds] = 2.0 * femtoseconds

The timestep to use during the simulation.

friction_coeff `pydantic-field` #

friction_coeff: OpenMMQuantity[1.0 / picoseconds] = (
    1.0 / picoseconds
)

The integrator friction coefficient.

NotEnoughSamplesError #

Bases: ValueError

An error raised when an ensemble average is attempted with too few samples.

TensorReporter #

TensorReporter(
    output_file: BinaryIO,
    report_interval: int,
    beta: Quantity,
    pressure: Quantity | None,
)

A reporter which stores coords, box vectors, reduced potentials and kinetic energy using msgpack.

report_interval: The interval (in steps) at which to write frames.
beta: The inverse temperature the simulation is being run at.
pressure: The pressure the simulation is being run at, or None if NVT /
    vacuum.

Source code in smee/mm/_reporters.py

def __init__(
    self,
    output_file: typing.BinaryIO,
    report_interval: int,
    beta: openmm.unit.Quantity,
    pressure: openmm.unit.Quantity | None,
):
    """

    Args:
        output_file: The file to write the frames to.
        report_interval: The interval (in steps) at which to write frames.
        beta: The inverse temperature the simulation is being run at.
        pressure: The pressure the simulation is being run at, or None if NVT /
            vacuum.
    """
    self._output_file = output_file
    self._report_interval = report_interval

    self._beta = beta
    self._pressure = (
        None if pressure is None else pressure * openmm.unit.AVOGADRO_CONSTANT_NA
    )

generate_dg_solv_data #

generate_dg_solv_data(
    solute: TensorTopology,
    solvent_a: TensorTopology | None,
    solvent_b: TensorTopology | None,
    force_field: TensorForceField,
    temperature: Quantity = 298.15 * kelvin,
    pressure: Quantity = 1.0 * atmosphere,
    solvent_a_protocol: Optional[
        EquilibriumProtocol
    ] = None,
    solvent_b_protocol: Optional[
        EquilibriumProtocol
    ] = None,
    n_solvent_a: int = 216,
    n_solvent_b: int = 216,
    output_dir: Path | None = None,
)

Run a solvation free energy calculation using absolv, and saves the output such that a differentiable free energy can be computed.

The free energy will correspond to the free energy of transferring a solute from solvent A to solvent B.

Parameters:

solute (TensorTopology) –

The solute topology.
solvent_a (TensorTopology | None) –

The topology of solvent A, or None if solvent A is vacuum.
solvent_b (TensorTopology | None) –

The topology of solvent B, or None if solvent B is vacuum.
force_field (TensorForceField) –

The force field to parameterize the system with.
temperature (Quantity, default: 298.15 * kelvin ) –

The temperature to simulate at.
pressure (Quantity, default: 1.0 * atmosphere ) –

The pressure to simulate at.
solvent_a_protocol (Optional[EquilibriumProtocol], default: None ) –

The protocol to use to decouple the solute in solvent A.
solvent_b_protocol (Optional[EquilibriumProtocol], default: None ) –

The protocol to use to decouple the solute in solvent B.
n_solvent_a (int, default: 216 ) –

The number of solvent A molecules to use.
n_solvent_b (int, default: 216 ) –

The number of solvent B molecules to use.
output_dir (Path | None, default: None ) –

The directory to write the output FEP data to.

Source code in smee/mm/_fe.py

def generate_dg_solv_data(
    solute: smee.TensorTopology,
    solvent_a: smee.TensorTopology | None,
    solvent_b: smee.TensorTopology | None,
    force_field: smee.TensorForceField,
    temperature: openmm.unit.Quantity = 298.15 * openmm.unit.kelvin,
    pressure: openmm.unit.Quantity = 1.0 * openmm.unit.atmosphere,
    solvent_a_protocol: typing.Optional["absolv.config.EquilibriumProtocol"] = None,
    solvent_b_protocol: typing.Optional["absolv.config.EquilibriumProtocol"] = None,
    n_solvent_a: int = 216,
    n_solvent_b: int = 216,
    output_dir: pathlib.Path | None = None,
):
    """Run a solvation free energy calculation using ``absolv``, and saves the output
    such that a differentiable free energy can be computed.

    The free energy will correspond to the free energy of transferring a solute from
    solvent A to solvent B.

    Args:
        solute: The solute topology.
        solvent_a: The topology of solvent A, or ``None`` if solvent A is vacuum.
        solvent_b: The topology of solvent B, or ``None`` if solvent B is vacuum.
        force_field: The force field to parameterize the system with.
        temperature: The temperature to simulate at.
        pressure: The pressure to simulate at.
        solvent_a_protocol: The protocol to use to decouple the solute in solvent A.
        solvent_b_protocol: The protocol to use to decouple the solute in solvent B.
        n_solvent_a: The number of solvent A molecules to use.
        n_solvent_b: The number of solvent B molecules to use.
        output_dir: The directory to write the output FEP data to.
    """
    import absolv.config
    import absolv.runner
    import femto.md.config
    import femto.md.system

    output_dir = pathlib.Path.cwd() if output_dir is None else output_dir

    vacuum_protocol = absolv.config.EquilibriumProtocol(
        production_protocol=absolv.config.HREMDProtocol(
            n_steps_per_cycle=500,
            n_cycles=2000,
            integrator=femto.md.config.LangevinIntegrator(
                timestep=1.0 * openmm.unit.femtosecond
            ),
            trajectory_interval=1,
        ),
        lambda_sterics=absolv.config.DEFAULT_LAMBDA_STERICS_VACUUM,
        lambda_electrostatics=absolv.config.DEFAULT_LAMBDA_ELECTROSTATICS_VACUUM,
    )
    solution_protocol = absolv.config.EquilibriumProtocol(
        production_protocol=absolv.config.HREMDProtocol(
            n_steps_per_cycle=500,
            n_cycles=1000,
            integrator=femto.md.config.LangevinIntegrator(
                timestep=4.0 * openmm.unit.femtosecond
            ),
            trajectory_interval=1,
            trajectory_enforce_pbc=True,
        ),
        lambda_sterics=absolv.config.DEFAULT_LAMBDA_STERICS_SOLVENT,
        lambda_electrostatics=absolv.config.DEFAULT_LAMBDA_ELECTROSTATICS_SOLVENT,
    )

    config = absolv.config.Config(
        temperature=temperature,
        pressure=pressure,
        alchemical_protocol_a=solvent_a_protocol
        if solvent_a_protocol is not None
        else (vacuum_protocol if solvent_a is None else solution_protocol),
        alchemical_protocol_b=solvent_b_protocol
        if solvent_b_protocol is not None
        else (vacuum_protocol if solvent_b is None else solution_protocol),
    )

    solute_mol = openff.toolkit.Molecule.from_rdkit(
        smee.mm._utils.topology_to_rdkit(solute),
        allow_undefined_stereo=True,
    )

    system_config = absolv.config.System(
        solutes={solute_mol.to_smiles(mapped=True): 1},
        solvent_a=_to_solvent_dict(solvent_a, n_solvent_a),
        solvent_b=_to_solvent_dict(solvent_b, n_solvent_b),
    )

    topologies = {
        "solvent-a": smee.TensorSystem([solute], [1], is_periodic=False)
        if solvent_a is None
        else smee.TensorSystem([solute, solvent_a], [1, n_solvent_a], is_periodic=True),
        "solvent-b": smee.TensorSystem([solute], [1], is_periodic=False)
        if solvent_b is None
        else smee.TensorSystem([solute, solvent_b], [1, n_solvent_b], is_periodic=True),
    }
    pressures = {
        "solvent-a": None
        if solvent_a is None
        else pressure.value_in_unit(openmm.unit.atmosphere),
        "solvent-b": None
        if solvent_b is None
        else pressure.value_in_unit(openmm.unit.atmosphere),
    }
    n_solvent = {
        "solvent-a": None if solvent_a is None else n_solvent_a,
        "solvent-b": None if solvent_b is None else n_solvent_b,
    }

    for phase in topologies:
        state = {
            "temperature": temperature.value_in_unit(openmm.unit.kelvin),
            "pressure": pressures[phase],
            "n_solvent": n_solvent[phase],
        }

        (output_dir / phase).mkdir(exist_ok=True, parents=True)
        (output_dir / phase / "state.yaml").write_text(yaml.safe_dump(state))

    def _parameterize(
        top, coords, phase: typing.Literal["solvent-a", "solvent-b"]
    ) -> openmm.System:
        return smee.converters.convert_to_openmm_system(force_field, topologies[phase])

    prepared_system_a, prepared_system_b = absolv.runner.setup(
        system_config, config, _parameterize
    )

    femto.md.system.apply_hmr(
        prepared_system_a.system,
        parmed.openmm.load_topology(prepared_system_a.topology),
    )
    femto.md.system.apply_hmr(
        prepared_system_b.system,
        parmed.openmm.load_topology(prepared_system_b.topology),
    )

    result = absolv.runner.run_eq(
        config, prepared_system_a, prepared_system_b, "CUDA", output_dir, parallel=True
    )

    if solvent_a is not None:
        solvent_a_output = _extract_pure_solvent(
            solute, solvent_a, force_field, output_dir / "solvent-a"
        )
        (output_dir / "solvent-a" / "pure.pkl").write_bytes(
            pickle.dumps(solvent_a_output)
        )
    if solvent_b is not None:
        solvent_b_output = _extract_pure_solvent(
            solute, solvent_b, force_field, output_dir / "solvent-b"
        )
        (output_dir / "solvent-b" / "pure.pkl").write_bytes(
            pickle.dumps(solvent_b_output)
        )

    return result

generate_system_coords #

generate_system_coords(
    system: TensorSystem,
    force_field: TensorForceField | None,
    config: Optional[GenerateCoordsConfig] = None,
) -> tuple[Quantity, Quantity]

Generate coordinates for a system of molecules using PACKMOL.

Parameters:

system (TensorSystem) –

The system to generate coordinates for.
force_field (TensorForceField | None) –

The force field that describes the geometry of any virtual sites.
config (Optional[GenerateCoordsConfig], default: None ) –

Configuration of how to generate the system coordinates.

Returns:

tuple[Quantity, Quantity] –

The coordinates with shape=(n_atoms, 3) and box vectors with shape=(3, 3)

Source code in smee/mm/_mm.py

def generate_system_coords(
    system: smee.TensorSystem,
    force_field: smee.TensorForceField | None,
    config: typing.Optional["smee.mm.GenerateCoordsConfig"] = None,
) -> tuple[openmm.unit.Quantity, openmm.unit.Quantity]:
    """Generate coordinates for a system of molecules using PACKMOL.

    Args:
        system: The system to generate coordinates for.
        force_field: The force field that describes the geometry of any virtual sites.
        config: Configuration of how to generate the system coordinates.

    Raises:
        * PACKMOLRuntimeError

    Returns:
        The coordinates with ``shape=(n_atoms, 3)`` and box vectors with
        ``shape=(3, 3)``
    """

    system = system.to("cpu")
    force_field = None if force_field is None else force_field.to("cpu")

    config = config if config is not None else smee.mm.GenerateCoordsConfig()

    box_size = _approximate_box_size(system, config)

    with tempfile.TemporaryDirectory() as tmp_dir:
        tmp_dir = pathlib.Path(tmp_dir)

        for i, topology in enumerate(system.topologies):
            xyz = _topology_to_xyz(topology, force_field)
            (tmp_dir / f"{i}.xyz").write_text(xyz)

        input_file = tmp_dir / "input.txt"
        input_file.write_text(
            _generate_packmol_input(system.n_copies, box_size, config)
        )

        with input_file.open("r") as file:
            result = subprocess.run(
                "packmol", stdin=file, capture_output=True, text=True, cwd=tmp_dir
            )

        if result.returncode != 0 or not result.stdout.find("Success!") > 0:
            raise PACKMOLRuntimeError(result.stdout)

        output_lines = (tmp_dir / "output.xyz").read_text().splitlines()

    coordinates = (
        numpy.array(
            [
                [float(coordinate) for coordinate in coordinate_line.split()[1:]]
                for coordinate_line in output_lines[2:]
                if len(coordinate_line) > 0
            ]
        )
        * openmm.unit.angstrom
    )

    box_vectors = numpy.eye(3) * (box_size + config.padding)
    return coordinates, box_vectors

simulate #

simulate(
    system: TensorSystem | TensorTopology,
    force_field: TensorForceField,
    coords: Quantity,
    box_vectors: Quantity | None,
    equilibrate_configs: list[
        Union[MinimizationConfig, SimulationConfig]
    ],
    production_config: SimulationConfig,
    production_reporters: list[Any] | None = None,
    apply_hmr: bool = False,
) -> State

Simulate a SMEE system of molecules or topology.

Parameters:

system (TensorSystem | TensorTopology) –

The system / topology to simulate.
force_field (TensorForceField) –

The force field to simulate with.
coords (Quantity) –

The coordinates [Å] to use for the simulation. This should be a unit wrapped numpy array with shape=(n_atoms, 3).
box_vectors (Quantity | None) –

The box vectors [Å] to use for the simulation if periodic. This should be a unit wrapped numpy array with shape=(3, 3).
equilibrate_configs (list[Union[MinimizationConfig, SimulationConfig]]) –

A list of configurations defining the steps to run for equilibration. No data will be stored from these simulations.
production_config (SimulationConfig) –

The configuration defining the production simulation to run.
production_reporters (list[Any] | None, default: None ) –

A list of additional OpenMM reporters to use for the production simulation.
apply_hmr (bool, default: False ) –

Whether to apply Hydrogen Mass Repartitioning to the system prior to simulation.

Source code in smee/mm/_mm.py

def simulate(
    system: smee.TensorSystem | smee.TensorTopology,
    force_field: smee.TensorForceField,
    coords: openmm.unit.Quantity,
    box_vectors: openmm.unit.Quantity | None,
    equilibrate_configs: list[
        typing.Union["smee.mm.MinimizationConfig", "smee.mm.SimulationConfig"]
    ],
    production_config: "smee.mm.SimulationConfig",
    production_reporters: list[typing.Any] | None = None,
    apply_hmr: bool = False,
) -> openmm.State:
    """Simulate a SMEE system of molecules or topology.

    Args:
        system: The system / topology to simulate.
        force_field: The force field to simulate with.
        coords: The coordinates [Å] to use for the simulation. This should be
            a unit wrapped numpy array with ``shape=(n_atoms, 3)``.
        box_vectors: The box vectors [Å] to use for the simulation if periodic. This
            should be a unit wrapped numpy array with ``shape=(3, 3)``.
        equilibrate_configs: A list of configurations defining the steps to run for
            equilibration. No data will be stored from these simulations.
        production_config: The configuration defining the production simulation to run.
        production_reporters: A list of additional OpenMM reporters to use for the
            production simulation.
        apply_hmr: Whether to apply Hydrogen Mass Repartitioning to the system prior
            to simulation.
    """

    assert isinstance(coords.value_in_unit(openmm.unit.angstrom), numpy.ndarray)
    assert isinstance(box_vectors.value_in_unit(openmm.unit.angstrom), numpy.ndarray)

    force_field = force_field.to("cpu")

    system: smee.TensorSystem = (
        system
        if isinstance(system, smee.TensorSystem)
        else smee.TensorSystem([system], [1], False)
    ).to("cpu")

    requires_pbc = any(
        config.pressure is not None
        for config in equilibrate_configs + [production_config]
        if isinstance(config, smee.mm.SimulationConfig)
    )

    if not system.is_periodic and requires_pbc:
        raise ValueError("pressure cannot be specified for a non-periodic system")

    platform = _get_platform(system.is_periodic)

    omm_state = coords, box_vectors

    omm_system = smee.converters.convert_to_openmm_system(force_field, system)
    omm_topology = smee.converters.convert_to_openmm_topology(system)

    if apply_hmr:
        _apply_hmr(omm_system, system)

    for i, config in enumerate(equilibrate_configs):
        _LOGGER.info(f"running equilibration step {i + 1} / {len(equilibrate_configs)}")

        if isinstance(config, smee.mm.MinimizationConfig):
            omm_state = _energy_minimize(omm_system, omm_state, platform, config)

        elif isinstance(config, smee.mm.SimulationConfig):
            omm_state = _run_simulation(
                omm_system, omm_topology, omm_state, platform, config, None
            )
        else:
            raise NotImplementedError

        _LOGGER.info(_get_state_log(omm_state))

    _LOGGER.info("running production simulation")
    omm_state = _run_simulation(
        omm_system,
        omm_topology,
        omm_state,
        platform,
        production_config,
        production_reporters,
    )
    _LOGGER.info(_get_state_log(omm_state))

    return omm_state

compute_dg_solv #

compute_dg_solv(
    solute: TensorTopology,
    solvent_a: TensorTopology | None,
    solvent_b: TensorTopology | None,
    force_field: TensorForceField,
    fep_dir: Path,
) -> Tensor

Computes ∆G_solv from existing FEP data.

Notes

It is assumed that FEP data was generated using the same force field as force_field, and using generate_dg_solv_data. No attempt is made to validate this assumption, so proceed with extreme caution.

Parameters:

solute (TensorTopology) –

The solute topology.
solvent_a (TensorTopology | None) –

The topology of the solvent in phase A.
solvent_b (TensorTopology | None) –

The topology of the solvent in phase B.
force_field (TensorForceField) –

The force field used to generate the FEP data.
fep_dir (Path) –

The directory containing the FEP data.

Returns:

Tensor –

∆G_solv [kcal/mol].

Source code in smee/mm/_ops.py

def compute_dg_solv(
    solute: smee.TensorTopology,
    solvent_a: smee.TensorTopology | None,
    solvent_b: smee.TensorTopology | None,
    force_field: smee.TensorForceField,
    fep_dir: pathlib.Path,
) -> torch.Tensor:
    """Computes ∆G_solv from existing FEP data.

    Notes:
        It is assumed that FEP data was generated using the same force field as
        ``force_field``, and using ``generate_dg_solv_data``. No attempt is made to
        validate this assumption, so proceed with extreme caution.

    Args:
        solute: The solute topology.
        solvent_a: The topology of the solvent in phase A.
        solvent_b: The topology of the solvent in phase B.
        force_field: The force field used to generate the FEP data.
        fep_dir: The directory containing the FEP data.

    Returns:
        ∆G_solv [kcal/mol].
    """

    tensors, parameter_lookup, attribute_lookup, has_v_sites = _pack_force_field(
        force_field
    )

    kwargs = {
        "solute": solute,
        "solvent_a": solvent_a,
        "solvent_b": solvent_b,
        "force_field": force_field,
        "parameter_lookup": parameter_lookup,
        "attribute_lookup": attribute_lookup,
        "has_v_sites": has_v_sites,
        "fep_dir": fep_dir,
    }
    return _ComputeDGSolv.apply(kwargs, *tensors)

compute_ensemble_averages #

compute_ensemble_averages(
    system: TensorSystem,
    force_field: TensorForceField,
    frames_path: Path,
    temperature: Quantity,
    pressure: Quantity | None,
) -> tuple[dict[str, Tensor], dict[str, Tensor]]

Compute ensemble average of the potential energy, volume, density, and enthalpy (if running NPT) over an MD trajectory.

Parameters:

system (TensorSystem) –

The system to simulate.
force_field (TensorForceField) –

The force field to use.
frames_path (Path) –

The path to the trajectory to compute the average over.
temperature (Quantity) –

The temperature that the trajectory was simulated at.
pressure (Quantity | None) –

The pressure that the trajectory was simulated at.

Returns:

tuple[dict[str, Tensor], dict[str, Tensor]] –

A dictionary containing the ensemble averages of the potential energy [kcal/mol], volume [Å^3], density [g/mL], and enthalpy [kcal/mol], and a dictionary containing their standard deviations.

Source code in smee/mm/_ops.py

def compute_ensemble_averages(
    system: smee.TensorSystem,
    force_field: smee.TensorForceField,
    frames_path: pathlib.Path,
    temperature: openmm.unit.Quantity,
    pressure: openmm.unit.Quantity | None,
) -> tuple[dict[str, torch.Tensor], dict[str, torch.Tensor]]:
    """Compute ensemble average of the potential energy, volume, density,
    and enthalpy (if running NPT) over an MD trajectory.

    Args:
        system: The system to simulate.
        force_field: The force field to use.
        frames_path: The path to the trajectory to compute the average over.
        temperature: The temperature that the trajectory was simulated at.
        pressure: The pressure that the trajectory was simulated at.

    Returns:
        A dictionary containing the ensemble averages of the potential energy
        [kcal/mol], volume [Å^3], density [g/mL], and enthalpy [kcal/mol],
        and a dictionary containing their standard deviations.
    """
    tensors, parameter_lookup, attribute_lookup, has_v_sites = _pack_force_field(
        force_field
    )

    beta = 1.0 / (openmm.unit.MOLAR_GAS_CONSTANT_R * temperature)
    beta = beta.value_in_unit(openmm.unit.kilocalorie_per_mole**-1)

    if pressure is not None:
        pressure = (pressure * openmm.unit.AVOGADRO_CONSTANT_NA).value_in_unit(
            openmm.unit.kilocalorie_per_mole / openmm.unit.angstrom**3
        )

    kwargs: _EnsembleAverageKwargs = {
        "force_field": force_field,
        "parameter_lookup": parameter_lookup,
        "attribute_lookup": attribute_lookup,
        "has_v_sites": has_v_sites,
        "system": system,
        "frames_path": frames_path,
        "beta": beta,
        "pressure": pressure,
    }

    *avg_outputs, columns = _EnsembleAverageOp.apply(kwargs, *tensors)

    avg_values = avg_outputs[: len(avg_outputs) // 2]
    avg_std = avg_outputs[len(avg_outputs) // 2 :]

    return (
        {column: avg for avg, column in zip(avg_values, columns, strict=True)},
        {column: avg for avg, column in zip(avg_std, columns, strict=True)},
    )

reweight_dg_solv #

reweight_dg_solv(
    solute: TensorTopology,
    solvent_a: TensorTopology | None,
    solvent_b: TensorTopology | None,
    force_field: TensorForceField,
    fep_dir: Path,
    dg_0: Tensor,
    min_samples: int = 50,
) -> tuple[Tensor, float]

Computes ∆G_solv by re-weighting existing FEP data.

Notes

It is assumed that FEP data was generated using generate_dg_solv_data.

Parameters:

solute (TensorTopology) –

The solute topology.
solvent_a (TensorTopology | None) –

The topology of the solvent in phase A.
solvent_b (TensorTopology | None) –

The topology of the solvent in phase B.
force_field (TensorForceField) –

The force field to reweight to.
fep_dir (Path) –

The directory containing the FEP data.
dg_0 (Tensor) –

∆G_solv [kcal/mol] computed with the force field used to generate the FEP data.
min_samples (int, default: 50 ) –

The minimum number of effective samples required to re-weight.

Raises:

NotEnoughSamplesError –

If the number of effective samples is less than min_samples.

Returns:

tuple[Tensor, float] –

The re-weighted ∆G_solv [kcal/mol], and the minimum number of effective samples between the two phases.

Source code in smee/mm/_ops.py

def reweight_dg_solv(
    solute: smee.TensorTopology,
    solvent_a: smee.TensorTopology | None,
    solvent_b: smee.TensorTopology | None,
    force_field: smee.TensorForceField,
    fep_dir: pathlib.Path,
    dg_0: torch.Tensor,
    min_samples: int = 50,
) -> tuple[torch.Tensor, float]:
    """Computes ∆G_solv by re-weighting existing FEP data.

    Notes:
        It is assumed that FEP data was generated using ``generate_dg_solv_data``.

    Args:
        solute: The solute topology.
        solvent_a: The topology of the solvent in phase A.
        solvent_b: The topology of the solvent in phase B.
        force_field: The force field to reweight to.
        fep_dir: The directory containing the FEP data.
        dg_0: ∆G_solv [kcal/mol] computed with the force field used to generate the
            FEP data.
        min_samples: The minimum number of effective samples required to re-weight.

    Raises:
        NotEnoughSamplesError: If the number of effective samples is less than
            ``min_samples``.

    Returns:
        The re-weighted ∆G_solv [kcal/mol], and the minimum number of effective samples
        between the two phases.
    """
    tensors, parameter_lookup, attribute_lookup, has_v_sites = _pack_force_field(
        force_field
    )

    kwargs = {
        "solute": solute,
        "solvent_a": solvent_a,
        "solvent_b": solvent_b,
        "force_field": force_field,
        "parameter_lookup": parameter_lookup,
        "attribute_lookup": attribute_lookup,
        "has_v_sites": has_v_sites,
        "fep_dir": fep_dir,
        "dg_0": dg_0,
    }

    dg, n_eff = _ReweightDGSolv.apply(kwargs, *tensors)

    if n_eff < min_samples:
        raise NotEnoughSamplesError

    return dg, n_eff

reweight_ensemble_averages #

reweight_ensemble_averages(
    system: TensorSystem,
    force_field: TensorForceField,
    frames_path: Path,
    temperature: Quantity,
    pressure: Quantity | None,
    min_samples: int = 50,
) -> dict[str, Tensor]

Compute the ensemble average of the potential energy, volume, density, and enthalpy (if running NPT) by re-weighting an existing MD trajectory.

Parameters:

system (TensorSystem) –

The system that was simulated.
force_field (TensorForceField) –

The new force field to use.
frames_path (Path) –

The path to the trajectory to compute the average over.
temperature (Quantity) –

The temperature that the trajectory was simulated at.
pressure (Quantity | None) –

The pressure that the trajectory was simulated at.
min_samples (int, default: 50 ) –

The minimum number of samples required to compute the average.

Raises:

NotEnoughSamplesError –

If the number of effective samples is less than min_samples.

Returns:

dict[str, Tensor] –

A dictionary containing the ensemble averages of the potential energy [kcal/mol], volume [Å^3], density [g/mL], and enthalpy [kcal/mol].

Source code in smee/mm/_ops.py

def reweight_ensemble_averages(
    system: smee.TensorSystem,
    force_field: smee.TensorForceField,
    frames_path: pathlib.Path,
    temperature: openmm.unit.Quantity,
    pressure: openmm.unit.Quantity | None,
    min_samples: int = 50,
) -> dict[str, torch.Tensor]:
    """Compute the ensemble average of the potential energy, volume, density,
    and enthalpy (if running NPT) by re-weighting an existing MD trajectory.

    Args:
        system: The system that was simulated.
        force_field: The new force field to use.
        frames_path: The path to the trajectory to compute the average over.
        temperature: The temperature that the trajectory was simulated at.
        pressure: The pressure that the trajectory was simulated at.
        min_samples: The minimum number of samples required to compute the average.

    Raises:
        NotEnoughSamplesError: If the number of effective samples is less than
            ``min_samples``.

    Returns:
        A dictionary containing the ensemble averages of the potential energy
        [kcal/mol], volume [Å^3], density [g/mL], and enthalpy [kcal/mol].
    """
    tensors, parameter_lookup, attribute_lookup, has_v_sites = _pack_force_field(
        force_field
    )

    beta = 1.0 / (openmm.unit.MOLAR_GAS_CONSTANT_R * temperature)
    beta = beta.value_in_unit(openmm.unit.kilocalorie_per_mole**-1)

    if pressure is not None:
        pressure = (pressure * openmm.unit.AVOGADRO_CONSTANT_NA).value_in_unit(
            openmm.unit.kilocalorie_per_mole / openmm.unit.angstrom**3
        )

    kwargs: _ReweightAverageKwargs = {
        "force_field": force_field,
        "parameter_lookup": parameter_lookup,
        "attribute_lookup": attribute_lookup,
        "has_v_sites": has_v_sites,
        "system": system,
        "frames_path": frames_path,
        "beta": beta,
        "pressure": pressure,
        "min_samples": min_samples,
    }

    *avg_outputs, columns = _ReweightAverageOp.apply(kwargs, *tensors)
    return {column: avg for avg, column in zip(avg_outputs, columns, strict=True)}

tensor_reporter #

tensor_reporter(
    output_path: PathLike,
    report_interval: int,
    beta: Quantity,
    pressure: Quantity | None,
) -> TensorReporter

Create a TensorReporter capable of writing frames to a file.

Parameters:

output_path (PathLike) –

The path to write the frames to.
report_interval (int) –

The interval (in steps) at which to write frames.
beta (Quantity) –

The inverse temperature the simulation is being run at.
pressure (Quantity | None) –

The pressure the simulation is being run at, or None if NVT / vacuum.

Source code in smee/mm/_reporters.py

@contextlib.contextmanager
def tensor_reporter(
    output_path: os.PathLike,
    report_interval: int,
    beta: openmm.unit.Quantity,
    pressure: openmm.unit.Quantity | None,
) -> TensorReporter:
    """Create a ``TensorReporter`` capable of writing frames to a file.

    Args:
        output_path: The path to write the frames to.
        report_interval: The interval (in steps) at which to write frames.
        beta: The inverse temperature the simulation is being run at.
        pressure: The pressure the simulation is being run at, or ``None`` if NVT /
            vacuum.
    """
    with open(output_path, "wb") as output_file:
        reporter = TensorReporter(output_file, report_interval, beta, pressure)
        yield reporter

unpack_frames #

unpack_frames(
    file: BinaryIO,
) -> Generator[tuple[Tensor, Tensor, float], None, None]

Unpack frames saved by a TensorReporter.

Source code in smee/mm/_reporters.py

def unpack_frames(
    file: typing.BinaryIO,
) -> typing.Generator[tuple[torch.Tensor, torch.Tensor, float], None, None]:
    """Unpack frames saved by a ``TensorReporter``."""

    unpacker = msgpack.Unpacker(file, object_hook=_decoder)

    for frame in unpacker:
        yield frame

mm #

GenerateCoordsConfig pydantic-model #

target_density pydantic-field #

scale_factor pydantic-field #

padding pydantic-field #

tolerance pydantic-field #

seed pydantic-field #

MinimizationConfig pydantic-model #

tolerance pydantic-field #

max_iterations pydantic-field #

SimulationConfig pydantic-model #

temperature pydantic-field #

pressure pydantic-field #

n_steps pydantic-field #

timestep pydantic-field #

friction_coeff pydantic-field #

NotEnoughSamplesError #

TensorReporter #

generate_dg_solv_data #

generate_system_coords #

simulate #

compute_dg_solv #

compute_ensemble_averages #

reweight_dg_solv #

reweight_ensemble_averages #

tensor_reporter #

unpack_frames #

GenerateCoordsConfig `pydantic-model` #

target_density `pydantic-field` #

scale_factor `pydantic-field` #

padding `pydantic-field` #

tolerance `pydantic-field` #

seed `pydantic-field` #

MinimizationConfig `pydantic-model` #

tolerance `pydantic-field` #

max_iterations `pydantic-field` #

SimulationConfig `pydantic-model` #

temperature `pydantic-field` #

pressure `pydantic-field` #

n_steps `pydantic-field` #

timestep `pydantic-field` #

friction_coeff `pydantic-field` #