xtb_step v1 – Scope Document
- Status: Draft, agreed in chat
- Plug-in: xtb_step
- Top-level class: xTB
- Step display name: xTB
- Sub-steps: Energy, Optimization, Frequencies
Goal of v1
Produce a SEAMM plug-in that lets a flowchart run xTB single-point
energies, geometry optimizations, and vibrational frequencies (with the
thermochemistry table that xTB prints alongside a Hessian run), for
molecular systems, with optional implicit solvation, using xTB’s
GFN0/GFN1/GFN2/GFN-FF methods. The plug-in installs xTB automatically
via conda (seamm-xtb environment from conda-forge), integrates with
the SEAMM property database, and fails gracefully on periodic input.
It is not intended to be production-grade or to expose every xTB option; subsequent releases will add MD, metadynamics, reaction-path, mode-following, and electronic-property workflows.
Why this is not redundant with the existing DFTB+ xTB driver
dftbplus_step already exposes GFN1-xTB and GFN2-xTB through DFTB+’s
internal xTB driver, and the SEAMM paper [Saxe2025] uses that route for
the methylisocyanide benchmark. A standalone xtb_step adds:
- GFN0-xTB and GFN-FF, neither of which is available via DFTB+
- Native ALPB and CPCM-X solvation [Ehlert2021]
- Native xTB Hessian and the xTB thermochemistry block
- A direct path to MD, metadynamics, and other xTB drivers in v2+
Naming and Packaging
- Repository name: xtb_step
- Top-level class: xTB (preserving the code’s case)
- Step display name: xTB
- Conda environment: seamm-xtb (deps: python, xtb from conda-forge; requires verification that xtb is published on conda-forge under that exact name)
- Group in the menus:
  - Top-level: Simulations
  - Sub-steps: Calculations
- Stevedore entry points (in setup.py):
  - org.molssi.seamm -> top-level xTB
  - org.molssi.seamm.tk -> top-level xTB
  - org.molssi.seamm.xtb -> three sub-steps
  - org.molssi.seamm.xtb.tk -> three sub-steps
Architecture
Subflowchart pattern modeled on fhi_aims_step. The top-level
xTB step contains a subflowchart; the user adds Energy,
Optimization, or Frequencies sub-step nodes into it.
Class hierarchy:
- Energy is the base class with the parameters and machinery shared by all sub-steps (method, charge, multiplicity, solvation, accuracy, threading, the JSON parser, the executable-finding, the property storage).
- Optimization inherits from Energy and adds optimizer parameters (level, max iterations) and structure-handling controls.
- Frequencies inherits from Energy and adds the optimize-first toggle and thermochemistry temperature/pressure.
Each sub-step invokes the xtb binary itself with its own command
line. The top-level run() method only iterates the subflowchart
and lets each sub-step run; it does not build a single combined input
file (this is the pattern in fhi_aims_step, not the pattern in
mopac_step or lammps_step).
The cookiecutter-generated top-level run() uses the wrong
(LAMMPS-style) template and must be replaced before the plug-in will
run.
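The hierarchy and the iterate-the-subflowchart control flow can be sketched as below. This is a minimal illustration under stated assumptions, not the SEAMM API: the real classes would derive from seamm.Node, and the extra_args() hook, the command() method, the run_subflowchart() helper, and the plain list standing in for the subflowchart are all hypothetical. The --gfn and --opt flags reflect the xtb command line as commonly documented and should be checked against the installed release.

```python
# Illustrative stand-ins only; the real sub-steps would subclass seamm.Node.
class Energy:
    """Base sub-step: owns the options shared by every xTB run."""

    def __init__(self, method_args=("--gfn", "2")):
        self.method_args = list(method_args)

    def extra_args(self):
        # Hook for subclasses; a single-point energy adds nothing.
        return []

    def command(self, xyz="coord.xyz"):
        # Each sub-step builds its own complete xtb command line.
        return ["xtb", xyz, *self.method_args, *self.extra_args()]

    def run(self):
        # A real sub-step would launch xtb here; the sketch returns the argv.
        return self.command()


class Optimization(Energy):
    def __init__(self, level="normal", **kwargs):
        super().__init__(**kwargs)
        self.level = level

    def extra_args(self):
        return ["--opt", self.level]


class Frequencies(Energy):
    def __init__(self, optimize_first=False, **kwargs):
        super().__init__(**kwargs)
        self.optimize_first = optimize_first

    def extra_args(self):
        return ["--ohess"] if self.optimize_first else ["--hess"]


def run_subflowchart(substeps):
    """Top-level run(): iterate the sub-steps; never merge them into one input."""
    return [step.run() for step in substeps]


commands = run_subflowchart([Energy(), Optimization(level="tight"), Frequencies()])
```

Keeping the command construction inside each sub-step is what distinguishes this pattern from the single-combined-input approach: the top-level loop needs no knowledge of xTB options.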
Methods
Single enumeration parameter, exposed at the Energy level so all
sub-steps inherit it:
| Choice | xTB CLI flag | Citation |
|---|---|---|
| GFN2-xTB (default) | | |
| GFN1-xTB | | |
| GFN0-xTB | | |
| GFN-FF | | |
The general xTB review [Bannwarth2021] is cited at level 1 whenever any of these methods is used.
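A lookup table is one way to hold the choice-to-flag mapping in code. The sketch below assumes the flags --gfn 2, --gfn 1, --gfn 0, and --gfnff, which is how the xtb command line is commonly documented; they should be confirmed with xtb --help for the installed release. METHOD_FLAGS, DEFAULT_METHOD, and method_args() are illustrative names, not part of the plug-in.

```python
# Assumed mapping from the menu choice to xtb command-line arguments.
# Verify against `xtb --help` for the release actually installed.
METHOD_FLAGS = {
    "GFN2-xTB": ["--gfn", "2"],
    "GFN1-xTB": ["--gfn", "1"],
    "GFN0-xTB": ["--gfn", "0"],
    "GFN-FF": ["--gfnff"],
}

DEFAULT_METHOD = "GFN2-xTB"


def method_args(choice=DEFAULT_METHOD):
    """Return the xtb arguments for a method choice, failing loudly on typos."""
    try:
        return list(METHOD_FLAGS[choice])
    except KeyError:
        known = ", ".join(METHOD_FLAGS)
        raise ValueError(
            f"Unknown xTB method {choice!r}; expected one of: {known}"
        ) from None
```

Returning a fresh list each time lets callers append further arguments without mutating the table.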
Tasks and CLI mapping
| Sub-step | xTB invocation |
|---|---|
| Energy | |
| Optimization | |
| Frequencies | |
The Frequencies sub-step harvests xTB’s thermochemistry block
(ZPE, S, Cp, H, G at a chosen temperature/pressure) directly from the
xTB output. No separate Thermochemistry sub-step in v1. The standalone
thermochemistry_step plug-in remains usable downstream for users who
want finite-difference verification, per [Saxe2025] Section 3.
Solvation
Two parameters at the Energy level:
- solvation model: enumeration none (default), ALPB, GBSA, CPCM-X
- solvent: enumeration of xTB’s supported solvent list (water, methanol, DMSO, acetone, acetonitrile, chloroform, dichloromethane, DMF, ether, hexane, octanol, THF, toluene, …). The exact list depends on the xTB version and on the solvation model; we will pull the canonical list from the xTB documentation (https://xtb-docs.readthedocs.io/) at implementation time.
CLI mapping:
- ALPB -> --alpb <solvent>
- GBSA -> --gbsa <solvent>
- CPCM-X -> --cpcmx <solvent>
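The CLI mapping above translates directly into a small helper. This is a sketch only: SOLVATION_FLAGS and solvation_args() are hypothetical names, and the solvent string is passed through unvalidated because the accepted list depends on the xTB version and the model.

```python
# Flags as listed in the CLI mapping; solvent names are not validated here
# because the accepted list depends on the xTB version and the model.
SOLVATION_FLAGS = {"ALPB": "--alpb", "GBSA": "--gbsa", "CPCM-X": "--cpcmx"}


def solvation_args(model="none", solvent=None):
    """Translate the two solvation parameters into xtb arguments."""
    if model == "none":
        return []
    if model not in SOLVATION_FLAGS:
        raise ValueError(f"Unknown solvation model: {model!r}")
    if not solvent:
        raise ValueError(f"Solvation model {model} requires a solvent")
    return [SOLVATION_FLAGS[model], solvent]
```

For example, solvation_args("ALPB", "water") yields the two arguments --alpb and water, while the default call contributes nothing to the command line.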
Periodic Systems
xTB’s PBC support is limited and not part of v1. Behavior on periodic input:
- At the start of every sub-step’s run(), check configuration.periodicity.
- If non-zero, printer.important() an explanatory message.
- raise RuntimeError so the flowchart stops with a clear error.
This matches the convention used elsewhere in the SEAMM plug-in ecosystem.
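The guard can be sketched as follows. FakeConfiguration is a hypothetical stand-in for a SEAMM configuration object, and the important argument stands in for printer.important(); both are assumptions made so the example runs without SEAMM installed. The wording of the message is illustrative.

```python
from dataclasses import dataclass


@dataclass
class FakeConfiguration:
    """Stand-in for a SEAMM configuration; only periodicity is modeled."""

    periodicity: int = 0


def check_not_periodic(configuration, important=print):
    """Stop the sub-step early when handed a periodic system.

    `important` stands in for printer.important(); it defaults to print
    so the sketch runs without SEAMM installed.
    """
    if configuration.periodicity != 0:
        message = (
            "xTB sub-steps in v1 handle only molecular (non-periodic) systems; "
            f"this configuration has periodicity {configuration.periodicity}."
        )
        important(message)
        raise RuntimeError(message)
```

Calling the check first in run() means no files are written and no xtb process is started for input the plug-in cannot handle.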
Installation
Default installation = conda, conda environment seamm-xtb from
data/seamm-xtb.yml. The .ini follows the
dftbplus.ini template (DFTB+ is also conda-default since it’s on
conda-forge). Sections: [docker] and [local], with [local]
supporting installation modes conda, modules, local, and
docker.
Following the mopac_step pattern, the plug-in provides an
installer.py that knows how to:
- create the conda environment
- check whether xTB is callable
- report the installed xTB version
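The last two installer duties, checking that xTB is callable and reporting its version, might look like the sketch below. It assumes that xtb accepts --version and prints a banner containing the word "version" followed by a dotted number; that format should be confirmed against a real installation. parse_version() and installed_xtb_version() are hypothetical helper names.

```python
import re
import shutil
import subprocess


def parse_version(text):
    """Pull the first dotted version number that follows the word 'version'."""
    match = re.search(r"version\s+(\d+(?:\.\d+)+)", text)
    return match.group(1) if match else None


def installed_xtb_version(executable="xtb"):
    """Return the version string, or None if xtb is missing or unreadable."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    # Search both streams because the banner's destination is not assumed.
    return parse_version(result.stdout + result.stderr)
```

Separating the parsing from the subprocess call keeps the version logic testable on machines where xTB is not installed.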
Input / Output Strategy
- Input: Write coord.xyz (XYZ format). All other options are passed on the xtb command line; no input deck file is needed. Charge and multiplicity go through --chrg N and --uhf M (where M is the number of unpaired electrons; xTB convention).
- Output: Primary parser uses the JSON output produced by --json (file xtbout.json). Energies, gradients, dipole, Mulliken/CM5 charges, HOMO/LUMO are pulled from there. The thermochemistry block (ZPE, entropy, Cp, Gibbs free energy) is parsed from the text xtb.out because, as of recent xTB versions, the thermo block is not part of the JSON output. This needs verification at implementation time against a current xtb release.
- Working directory layout: Each sub-step gets its own directory under the SEAMM job tree (handled automatically by seamm.Node). The directory contains coord.xyz, xtbout.json, xtb.out, stdout.txt, stderr.txt, and (for optimization) xtbopt.xyz and xtbopt.log.
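The input side can be sketched in a few lines. The example assumes the sub-step holds a spin multiplicity rather than an unpaired-electron count, so it converts one to the other for --uhf (unpaired electrons = multiplicity - 1). xyz_text(), write_xyz(), and charge_spin_args() are hypothetical names, and the column widths are an arbitrary choice.

```python
def xyz_text(symbols, coordinates, comment=""):
    """Format a standard XYZ file: atom count, comment line, one atom per line."""
    if len(symbols) != len(coordinates):
        raise ValueError("symbols and coordinates differ in length")
    lines = [str(len(symbols)), comment]
    for symbol, (x, y, z) in zip(symbols, coordinates):
        lines.append(f"{symbol:<2s} {x:15.8f} {y:15.8f} {z:15.8f}")
    return "\n".join(lines) + "\n"


def write_xyz(path, symbols, coordinates, comment=""):
    """Write the XYZ text to `path` (coord.xyz in the sub-step directory)."""
    with open(path, "w") as handle:
        handle.write(xyz_text(symbols, coordinates, comment))


def charge_spin_args(charge, multiplicity):
    """Convert net charge and spin multiplicity to --chrg/--uhf arguments.

    --uhf takes the number of unpaired electrons, which is
    multiplicity - 1 (singlet -> 0, doublet -> 1, triplet -> 2).
    """
    if multiplicity < 1:
        raise ValueError("spin multiplicity must be at least 1")
    return ["--chrg", str(charge), "--uhf", str(multiplicity - 1)]
```

A neutral singlet therefore maps to --chrg 0 --uhf 0, and a doublet anion to --chrg -1 --uhf 1.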
Properties Seed
Initial data/properties.csv entries, using the
<name>#xTB#{model} convention from mopac_step. {model} is
filled with the active method (GFN2-xTB etc.) at runtime.
- total energy#xTB#{model} (float, E_h)
- electronic energy#xTB#{model} (float, E_h)
- HOMO energy#xTB#{model} (float, eV)
- LUMO energy#xTB#{model} (float, eV)
- band gap#xTB#{model} (float, eV)
- dipole moment#xTB#{model} (float, D)
- gradients#xTB#{model} (json, E_h/Bohr)
- force constants#xTB#{model} (json, E_h/Bohr^2)
- enthalpy of formation#xTB#{model} (float, kJ/mol)
- zero point energy#xTB#{model} (float, kJ/mol)
- entropy#xTB#{model} (float, J/mol/K)
- constant pressure heat capacity#xTB#{model} (float, J/mol/K)
- Gibbs free energy#xTB#{model} (float, kJ/mol)
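Filling the {model} placeholder at runtime amounts to a string substitution. The sketch below shows it for three of the entries; PROPERTY_TEMPLATES, property_names(), and split_property() are hypothetical names, and how SEAMM itself performs the substitution is not assumed here.

```python
# A few of the seed entries, written with the {model} placeholder intact.
PROPERTY_TEMPLATES = [
    "total energy#xTB#{model}",
    "HOMO energy#xTB#{model}",
    "Gibbs free energy#xTB#{model}",
]


def property_names(model, templates=PROPERTY_TEMPLATES):
    """Fill the {model} placeholder with the active method name."""
    return [template.format(model=model) for template in templates]


def split_property(name):
    """Split '<name>#xTB#<model>' back into its three parts."""
    label, code, model = name.split("#")
    return label, code, model
```

With the default method, the first entry becomes total energy#xTB#GFN2-xTB, so results from different GFN levels are stored under distinct property names.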
Out of Scope for v1
The following xTB capabilities are deliberately deferred:
- Molecular dynamics (--md)
- Metadynamics (--metadyn)
- Nudged elastic band / reaction path (--path)
- Mode-following (--modef)
- Vertical IP/EA, Fukui (--vipea, --vfukui)
- Custom xcontrol files
- Periodic systems
- ONIOM / QM-MM via wrappers
Open Items to Verify at Implementation Time
- Exact xtbout.json schema in the current xTB release. Will verify against https://xtb-docs.readthedocs.io/ before writing the parser. Fall back to text scraping if needed.
- Whether xtb is actually published on conda-forge under the package name xtb. Will check with conda search -c conda-forge xtb.
- Whether to expose --acc N (accuracy multiplier, default 1.0). Recommend yes, since it’s the main quality knob users actually turn.
- Whether to expose OMP thread count as a parameter. Recommend yes, mirroring mopac_step and dftbplus_step.
- Whether --bhess (biased single-point Hessian) is worth exposing alongside --hess/--ohess. Recommend no for v1.
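If the accuracy and thread-count items are both accepted, the plumbing is small. The sketch below assumes the thread count is passed through the standard OpenMP variable OMP_NUM_THREADS and that --acc is omitted when left at its default; xtb_environment() and accuracy_args() are hypothetical names.

```python
import os


def xtb_environment(n_threads, base_env=None):
    """Copy the environment and pin the OpenMP thread count for xtb."""
    if n_threads < 1:
        raise ValueError("n_threads must be at least 1")
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_NUM_THREADS"] = str(n_threads)
    return env


def accuracy_args(acc=1.0):
    """Emit --acc only when the multiplier differs from the default of 1.0."""
    if acc <= 0:
        raise ValueError("accuracy multiplier must be positive")
    return [] if acc == 1.0 else ["--acc", str(acc)]
```

The returned dictionary is meant to be handed to the process launcher as its environment, leaving the parent process's own settings untouched.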
Skeleton Issues to Fix Before First Useful Build
The cookiecutter-generated skeleton has two issues that need to be addressed before any code-fill-in:
- Top-level run() template is wrong. xtb_step/xtb.py uses the LAMMPS-style “build single molssi.dat from node.get_input() and run the binary once” template. This is not the right pattern for xTB. Replace with the fhi_aims_step-style iterate-and-let-substeps-run pattern.
- pkg_resources is deprecated. Replace pkg_resources imports with importlib.resources in xtb.py, energy.py, optimization.py, and frequencies.py.
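For the pkg_resources replacement, importlib.resources.files() (Python 3.9 and later) covers the usual case of locating a data file inside the package. The sketch below is an assumption about how the skeleton uses pkg_resources; package_data_path() is a hypothetical helper, and the standard-library json package stands in for xtb_step so the example runs anywhere.

```python
from importlib.resources import files  # available in Python 3.9+


def package_data_path(package, *parts):
    """Locate a resource shipped inside an installed package.

    Roughly replaces pkg_resources.resource_filename(package, "a/b").
    """
    resource = files(package)
    for part in parts:
        resource = resource / part
    return resource


# The plug-in would ask for ("xtb_step", "data", "properties.csv");
# the stdlib json package stands in so the sketch runs anywhere.
example = package_data_path("json", "__init__.py")
```

The returned object supports is_file(), read_text(), and open(); if a real filesystem path is required for a package installed from a zip archive, importlib.resources.as_file() can wrap it.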
References
Saxe, P.; et al. SEAMM: A Simulation Environment for Atomistic and Molecular Modeling. J. Phys. Chem. A 2025, 129, 6973-6993. https://doi.org/10.1021/acs.jpca.5c03164
Bannwarth, C.; Caldeweyher, E.; Ehlert, S.; Hansen, A.; Pracht, P.; Seibert, J.; Spicher, S.; Grimme, S. Extended tight-binding quantum chemistry methods. WIREs Comput. Mol. Sci. 2021, 11, e1493. https://doi.org/10.1002/wcms.1493
Bannwarth, C.; Ehlert, S.; Grimme, S. GFN2-xTB – An Accurate and Broadly Parametrized Self-Consistent Tight-Binding Quantum Chemical Method with Multipole Electrostatics and Density-Dependent Dispersion Contributions. J. Chem. Theory Comput. 2019, 15, 1652-1671. https://doi.org/10.1021/acs.jctc.8b01176
Grimme, S.; Bannwarth, C.; Shushkov, P. A Robust and Accurate Tight-Binding Quantum Chemical Method for Structures, Vibrational Frequencies, and Noncovalent Interactions of Large Molecular Systems Parametrized for All spd-Block Elements (Z = 1-86). J. Chem. Theory Comput. 2017, 13, 1989-2009. https://doi.org/10.1021/acs.jctc.7b00118
Pracht, P.; Caldeweyher, E.; Ehlert, S.; Grimme, S. A Robust Non-Self-Consistent Tight-Binding Quantum Chemistry Method for Large Molecules. ChemRxiv, 2019. https://doi.org/10.26434/chemrxiv.8326202.v1 (Note: GFN0-xTB has no formal journal paper as of last check; verify the citation status at implementation time.)
Spicher, S.; Grimme, S. Robust Atomistic Modeling of Materials, Organometallic, and Biochemical Systems. Angew. Chem. Int. Ed. 2020, 59, 15665-15673. https://doi.org/10.1002/anie.202004239
Ehlert, S.; Stahn, M.; Spicher, S.; Grimme, S. Robust and Efficient Implicit Solvation Model for Fast Semiempirical Methods. J. Chem. Theory Comput. 2021, 17, 4250-4261. https://doi.org/10.1021/acs.jctc.1c00471