Merge latest mindlessgen features into gp3_ipea #84

Merged
merged 10 commits into from
Nov 18, 2024
6 changes: 3 additions & 3 deletions .github/CODEOWNERS
Original file line number Diff line number Diff line change
Expand Up @@ -2,11 +2,11 @@
# the repo. Unless a later match takes precedence,
# @marcelmbn and @thfroitzheim will be requested for
# review when someone opens a pull request.
* @marcelmbn @thfroitzheim
* @marcelmbn @jonathan-schoeps

# These parts are specifically owned by some people
/src/mindlessgen/cli @marcelmbn
/src/mindlessgen/generator @marcelmbn
/src/mindlessgen/molecules @marcelmbn
/src/mindlessgen/prog @marcelmbn
/src/mindlessgen/molecules @marcelmbn @jonathan-schoeps
/src/mindlessgen/prog @marcelmbn @jonathan-schoeps
/src/mindlessgen/qm @marcelmbn
24 changes: 16 additions & 8 deletions CHANGELOG.md
Expand Up @@ -7,26 +7,34 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [Unreleased]
### Changed
- vdW radii scaling parameter can now be adjusted via `mindlessgen.toml` or CLI
- The check_distance function now checks based on the sum of the van der Waals radii and a scaling factor accessible via `mindlessgen.toml` or CLI
- check_distance function now checks based on the sum of the van der Waals radii and a scaling factor accessible via `mindlessgen.toml` or CLI
- better type hints for `Callables`
- A clearer differentiation between the distinct scaling factors for the van der Waals radii.
- `README.md` with more detailed explanation of the element composition function.
- clearer differentiation between the distinct scaling factors for the van der Waals radii
- `README.md` with more detailed explanation of the element composition function
- Default `max_cycles` for the generation & refinement set to 200

### Fixed
- Unit conversion for (currently unused) vdW radii from the original Fortran project
- unit conversion for (currently unused) vdW radii from the original Fortran project
- minor print output issues (no new line breaks, more consistent verbosity differentiation, ...)
- bug in `postprocess_mol` which led to an unassigned return variable in the single-point case
- bug leading to `UnicodeDecodeError` when reading `xtb` output files
- bug with all atom lists being initialized with a length of 102 instead of 103
- inconsistent default values for the `mindlessgen.toml` and the `ConfigManager` class
- legacy pseudo random number generation removed and replaced by `np.random.default_rng()` for avoiding interference with other packages

### Added
- Support for the novel "g-xTB" method (working title: GP3-xTB)
- A function which contracts the coordinates after the initial generation.
- A function which is able to printout the xyz coordinates to the terminal similar to the `.xyz` layout.
- Elements 87 to 103 are accessible via the element composition. If `xtb` is the engine, the elements will be replaced by their lighter homologues.
- support for the novel "g-xTB" method (working title: GP3-xTB)
- function which contracts the coordinates after the initial generation
- function which is able to printout the xyz coordinates to the terminal similar to the `.xyz` layout
- elements 87 to 103 are accessible via the element composition. If `xtb` is the engine, the elements will be replaced by their lighter homologues.
- support for `python-3.13`
- option to set a fixed molecular charge, while ensuring `uhf = 0`

### Breaking Changes
- Removal of the `dist_threshold` flag from the CLI and from the `.toml` file.
- The number of unpaired electrons (`Molecule.uhf`) is now set to 0 if `xtb` is used as `QMMethod` and a lanthanide is within the molecule to match the `f-in-core` approximation.
- "Contract Coordinates" functionality set to `true` by default in the `mindlessgen.toml` file.
- `basename.UHF` and `basename.CHRG` are only written to disk if they differ from the default value (0 and 0, respectively).
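
The last item above could look roughly like the following sketch. The helper name `write_charge_files` and its signature are hypothetical and not taken from the `mindlessgen` source; only the `.CHRG` and `.UHF` suffixes and the default value of 0 come from the changelog entry.

```python
from pathlib import Path


def write_charge_files(basename: Path, charge: int, uhf: int) -> list[Path]:
    """Write `<basename>.CHRG` / `<basename>.UHF` only for non-default values.

    Hypothetical helper: name and signature are assumptions for illustration.
    """
    written: list[Path] = []
    for suffix, value in ((".CHRG", charge), (".UHF", uhf)):
        if value != 0:  # 0 is the default, so no file is needed
            path = basename.parent / (basename.name + suffix)
            path.write_text(f"{value}\n", encoding="utf-8")
            written.append(path)
    return written
```

With this approach, a neutral closed-shell molecule produces no extra files, while a molecule with charge -1 produces only `basename.CHRG`.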

## [0.4.0] - 2024-09-19
### Changed
Expand Down
75 changes: 47 additions & 28 deletions README.md
Expand Up @@ -4,8 +4,8 @@
<a href="http://www.apache.org/licenses/LICENSE-2.0">
<img src="https://img.shields.io/badge/License-Apache%202.0-orange.svg" alt="Apache-2.0"/>
</a>
<a href="https://img.shields.io/badge/Python-3.10%20|%203.11%20|%203.12-blue.svg">
<img src="https://img.shields.io/badge/Python-3.10%20|%203.11|%203.12-blue.svg" alt="Python Versions"/>
<a href="https://img.shields.io/badge/Python-3.10%20|%203.11%20|%203.12%20|%203.13-blue.svg">
<img src="https://img.shields.io/badge/Python-3.10%20|%203.11%20|%203.12%20|%203.13-blue.svg" alt="Python Versions"/>
</a>
<img align="right" src="assets/C1H2N1O2Te2Er1Lu2_89bd3e.png" height="150" />

Expand Down Expand Up @@ -49,10 +49,10 @@ Both installation methods work in principle also without a virtual environment,
### Development purposes

For working on the code of `mindlessgen`, the following setup is recommended:
```
```bash
mamba create -n mindlessgen python=3.12
mamba activate mindlessgen
git clone {link to the MindlessGen repository}
git clone https://github.com/grimme-lab/MindlessGen.git # or the analogous SSH link
pip install -e '.[dev]'
```
Thereby, all necessary development tools (e.g., `ruff`, `mypy`, `tox`, `pytest`, and `pre-commit`) are installed.
Expand Down Expand Up @@ -82,11 +82,12 @@ If the path is not specified with `-c/--config`, `mindlessgen.toml` will be sear
1. Current working directory (`$CWD`)
2. Home directory (`$HOME/`)

The active configuration can be printed using `--print-config`.
If neither a corresponding CLI command nor an entry in the configuration file is provided, the default values are used.
The active configuration, including the default values, can be printed using `--print-config`.

### Element composition
There are two related aspects of the element composition:
1. _Which elements_ should occur within the generated molecule?
1. **Which elements** should occur within the generated molecule?
2. **How many atoms** of the specified element should occur?
- **Example 1**: `C:1-3, O:1-1, H:1-*` would result in a molecule with 1, 2, or 3 carbon atoms, exactly 1 oxygen atom, and between 1 and an undefined number of hydrogen atoms (i.e., at least 1).
- **Example 2**: `Na:10-10, In:10-10, O:20-20`. This example would result in a molecule with exactly 10 sodium atoms, 10 indium atoms, and 20 oxygen atoms. **For a fixed element composition, the number of atoms (40) has to be within the min_num_atoms and max_num_atom interval.** `mindlessgen` will consequently always return a molecule with exactly 40 atoms.
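
To make the `<element>:<min>-<max>` format concrete, here is a minimal parsing sketch. The function names `parse_element_composition` and `fixed_total` are assumptions for illustration and do not reflect how `mindlessgen` parses this option internally; a `*` bound is represented as `None`.

```python
from __future__ import annotations


def parse_element_composition(spec: str) -> dict[str, tuple[int | None, int | None]]:
    """Parse e.g. "C:1-3, O:1-1, H:1-*" into {symbol: (min, max)}.

    A `*` bound is returned as None, meaning "no limit on this side".
    """
    result: dict[str, tuple[int | None, int | None]] = {}
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        symbol, _, count_range = item.partition(":")
        low, sep, high = count_range.partition("-")
        if not sep:
            raise ValueError(f"Expected '<element>:<min>-<max>', got {item!r}")
        bounds = tuple(None if b.strip() == "*" else int(b) for b in (low, high))
        if None not in bounds and bounds[0] > bounds[1]:
            raise ValueError(f"min > max in {item!r}")
        result[symbol.strip()] = (bounds[0], bounds[1])
    return result


def fixed_total(comp: dict[str, tuple[int | None, int | None]]) -> int | None:
    """Return the total atom count if every element has min == max, else None."""
    if all(lo is not None and lo == hi for lo, hi in comp.values()):
        return sum(lo for lo, _ in comp.values())
    return None


composition = parse_element_composition("C:1-3, O:1-1, H:1-*")
print(composition)  # {'C': (1, 3), 'O': (1, 1), 'H': (1, None)}
```

For Example 2, `fixed_total(parse_element_composition("Na:10-10, In:10-10, O:20-20"))` gives 40, which is the number that has to lie within the allowed atom-count interval.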
Expand All @@ -98,28 +99,46 @@ There are two related aspects of the element composition:

## Citation

When using the program for academic purposes, please cite:

_J. Chem. Theory Comput._ 2009, **5**, 4, 993–1003

or in `BibTeX` format:
```
@article{doi:10.1021/ct800511q,
author = {Korth, Martin and Grimme, Stefan},
title = {“Mindless” DFT Benchmarking},
journal = {Journal of Chemical Theory and Computation},
volume = {5},
number = {4},
pages = {993-1003},
year = {2009},
doi = {10.1021/ct800511q},
note ={PMID: 26609608},
URL = {https://doi.org/10.1021/ct800511q},
eprint = {https://doi.org/10.1021/ct800511q}
}
```

## Acknowdledgements
When using the program for academic purposes, please cite _i)_ the original idea and _ii)_ the new Python implementation.

1. _J. Chem. Theory Comput._ 2009, **5**, 4, 993–1003
```
@article{korth_mindless_2009,
title = {Mindless {DFT} benchmarking},
volume = {5},
issn = {15499618},
url = {https://pubs.acs.org/doi/full/10.1021/ct800511q},
doi = {10.1021/ct800511q},
number = {4},
urldate = {2022-11-07},
journal = {J. Chem. Theo. Comp.},
author = {Korth, Martin and Grimme, Stefan},
month = apr,
year = {2009},
note = {Publisher: American Chemical Society},
pages = {993--1003},
}
```

2. A new publication featuring all functionalities and improvements of `mindlessgen` is in preparation.
In the meantime, please refer to the original publication and to the following preprint, which uses the `mindlessgen` program for the first time:
Müller, M.; Froitzheim, T.; Hansen, A.; Grimme, S. _ChemRxiv_ October 28, 2024. https://doi.org/10.26434/chemrxiv-2024-h76ms.
```
@misc{muller_advanced_2024,
title = {Advanced {Charge} {Extended} {Hückel} ({CEH}) {Model} and a {Consistent} {Adaptive} {Minimal} {Basis} {Set} for the {Elements} {Z}=1-103},
url = {https://chemrxiv.org/engage/chemrxiv/article-details/671a92581fb27ce1247466ad},
doi = {10.26434/chemrxiv-2024-h76ms},
urldate = {2024-10-28},
publisher = {ChemRxiv},
author = {Müller, Marcel and Froitzheim, Thomas and Hansen, Andreas and Grimme, Stefan},
month = oct,
year = {2024},
keywords = {DFT, Basis sets, EHT, SQM},
}
```

## Acknowledgements

[T. Gasevic](https://github.com/gasevic) for creating an initial `GitHub` [migration](https://github.com/gasevic/mlmgen) of the code and making important adjustments to the workflow.
[S. Grimme](https://www.chemie.uni-bonn.de/grimme/de/grimme) and M. Korth for the original code written in Fortran associated to the publication in [J. Chem. Theory Comput.](https://pubs.acs.org/doi/full/10.1021/ct800511q).
[T. Froitzheim](https://github.com/thfroitzheim) for helpful discussions during the development of the program.
24 changes: 13 additions & 11 deletions mindlessgen.toml
@@ -1,21 +1,21 @@
# Default configuration for the 'Mindless Molecule GENerator' (MindlessGen) package
# Following file locations are searched for in the following order:
# Default configuration for the 'Mindless Molecule Generator' (MindlessGen) package
# The following file locations are searched for in ascending order:
# 1. Location specified by the `--config < str | Path >` command-line argument
# 2. Current working directory (`Path.cwd()`)
# 3. User's home directory (`Path.home()`)

[general]
# > Verbosity level defining the printout: Options: 0 = silent, 1 = default, 2 = verbose, 3 = debug
# > Verbosity level defining the printout: Options: -1 = super-silent, 0 = silent, 1 = default, 2 = verbose, 3 = debug
verbosity = 1
# > Number of parallel processes to use. Corresponds to the number of physical CPU cores used. Options: <int>
parallel = 1
# > Maximum number of generation/optimization try-and-error cycles per molecule. Options: <int>
max_cycles = 100
# > Maximum number of generation & optimization try-and-error cycles per molecule. Options: <int>
max_cycles = 200
# > Number of molecules to generate. Options: <int>
num_molecules = 1
# > Do post-processing (checking for HL gap, etc.) after the optimization. Options: <bool>
# > Do post-processing after the optimization with another engine (e.g., `orca`). Default: false. Options: <bool>
postprocess = false
# > Switch molecule structure XYZ writing on and off (default: true). Options: <bool>
# > Switch molecule structure XYZ writing on and off. Default: true. Options: <bool>
write_xyz = true

[generate]
Expand All @@ -26,22 +26,24 @@ max_num_atoms = 10
# > Initial coordinate scaling factor. Options: <float>
init_scaling = 3.0
# > Increase in the coordinate scaling factor per trial after check_distance was not met. Options: <float>
increase_scaling_factor = 1.3
increase_scaling_factor = 1.1
# > Scaling factor for the van der Waals radii employed for the fragment detection. Options: <float>
scale_fragment_detection = 1.25
# > Scaling factor for the minimal distance between two atoms based on the sum of the van der Waals radii. Options: <float>
scale_minimal_distance = 0.8
# > Contract the coordinates after the initial generation. Leads to more cluster-like and less extended structures. Options: <bool>
contract_coords = false
# > Contract the coordinates after the initial generation. Leads to more cluster-like and less extended structures
# and can speed-up the generation for larger molecules significantly. Options: <bool>
contract_coords = true
# > Atom types and their minimum and maximum occurrences. Format: "<element>:<min_count>-<max_count>"
# > Elements that are not specified are only added by random selection.
# > A star sign (*) can be used as a wildcard for integer value.
element_composition = "C:2-3, H:1-2, O:1-2, N:1-*"
# > Atom types that are not chosen for random selection. Format: "<element1>, <element2>, ..."
# > CAUTION: This option is overridden by the 'element_composition' option.
# > I.e., if an element is specified in 'element_composition' with an occurrence > 0, it will be added to the molecule anyway.
# > Example: forbidden_elements = "18,57-*"
forbidden_elements = "57-71, 81-*"
# > Set a charge for the molecule generation. Options: "none" (random charge assignment), "int" or <int> (fixed charge assignment)
molecular_charge = "none"
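
The range syntax of `forbidden_elements` (e.g., `"57-71, 81-*"`) can be expanded as in the following sketch. The function `parse_forbidden_elements` is hypothetical and not taken from the package. The sketch assumes that the numbers are 1-based atomic numbers and that `*` stands for the heaviest supported element (Z = 103, per the changelog).

```python
MAX_Z = 103  # heaviest element mentioned in the changelog (assumed upper bound)


def parse_forbidden_elements(spec: str, max_z: int = MAX_Z) -> set[int]:
    """Expand a string such as "18,57-*" into a set of atomic numbers."""
    forbidden: set[int] = set()
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        if "-" in item:
            start, end = item.split("-", 1)
            # A star on either side means "open-ended" in that direction.
            low = 1 if start.strip() == "*" else int(start)
            high = max_z if end.strip() == "*" else int(end)
            forbidden.update(range(low, high + 1))
        else:
            forbidden.add(int(item))
    return forbidden


default_forbidden = parse_forbidden_elements("57-71, 81-*")
```

Under these assumptions, the default value excludes the lanthanides (57 to 71) and every element from 81 upwards from random selection.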

[refine]
# > Maximum number of fragment optimization cycles. Options: <int>
Expand Down
6 changes: 4 additions & 2 deletions pyproject.toml
Expand Up @@ -6,17 +6,19 @@ build-backend = "setuptools.build_meta"
name = "mindlessgen"
authors = [
{ name = "Marcel Müller", email = "marcel.mueller@thch.uni-bonn.de" },
{ name = "Jonathan Schöps", email = "s6jtscho@uni-bonn.de" },
]
description = "Mindless Molecule GENerator"
description = "Mindless Molecule Generator"
readme = "README.md"
requires-python = ">=3.10"
license = { file = "LICENSE.md" }
classifiers = [
"License :: OSI Approved :: MIT License",
"License :: OSI Approved :: Apache-2.0 License",
"Programming Language :: Python :: 3 :: Only",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
"Programming Language :: Python :: 3.13",
"Topic :: Scientific/Engineering",
"Typing :: Typed",
]
Expand Down
7 changes: 7 additions & 0 deletions src/mindlessgen/cli/cli_parser.py
Expand Up @@ -143,6 +143,12 @@ def cli_parser(argv: Sequence[str] | None = None) -> dict:
required=False,
help="Contract the coordinates of the molecule after the coordinate generation.",
)
parser.add_argument(
"--molecular-charge",
type=str,
required=False,
help="Define the charge of the molecule.",
)

### Refinement arguments ###
parser.add_argument(
Expand Down Expand Up @@ -280,6 +286,7 @@ def cli_parser(argv: Sequence[str] | None = None) -> dict:
"scale_fragment_detection": args_dict["scale_fragment_detection"],
"scale_minimal_distance": args_dict["scale_minimal_distance"],
"contract_coords": args_dict["contract_coords"],
"molecular_charge": args_dict["molecular_charge"],
}
# XTB specific arguments
rev_args_dict["xtb"] = {"xtb_path": args_dict["xtb_path"]}
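
Because `--molecular-charge` is declared with `type=str`, the value still has to be mapped to either "random charge" or a fixed integer somewhere downstream. This diff does not show how `mindlessgen` does that. The sketch below uses a hypothetical helper `normalize_charge` to illustrate one way, following the `"none"` / `<int>` options described in `mindlessgen.toml`.

```python
from __future__ import annotations

import argparse


def normalize_charge(raw: str | None) -> int | None:
    """Return None for random charge assignment, otherwise a fixed integer.

    Hypothetical helper; the actual conversion in mindlessgen is not shown here.
    """
    if raw is None or raw.strip().lower() == "none":
        return None
    return int(raw)


parser = argparse.ArgumentParser()
parser.add_argument(
    "--molecular-charge",
    type=str,
    required=False,
    help="Define the charge of the molecule.",
)

args = parser.parse_args(["--molecular-charge", "2"])
fixed_charge = normalize_charge(args.molecular_charge)  # fixed charge of 2

args = parser.parse_args([])
random_charge = normalize_charge(args.molecular_charge)  # None: flag not given
```

Keeping the argument a string at the parser level allows both `none` and integer values on the command line; a non-integer value other than `none` raises a `ValueError` in this sketch.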
Expand Down
6 changes: 6 additions & 0 deletions src/mindlessgen/generator/main.py
Expand Up @@ -173,6 +173,12 @@ def single_molecule_generator(
print(e)
stop_event.set()
return None
except RuntimeError as e:
if config.general.verbosity > 0:
print(f"Generation failed for cycle {cycle + 1}.")
if config.general.verbosity > 1:
print(e)
return None

try:
# ____ _ _ _
Expand Down
4 changes: 3 additions & 1 deletion src/mindlessgen/molecules/__init__.py
Expand Up @@ -2,7 +2,7 @@
This module contains all molecule-related functionality.
"""

from .molecule import Molecule, PSE_NUMBERS, PSE_SYMBOLS
from .molecule import Molecule, PSE_NUMBERS, PSE_SYMBOLS, ati_to_atlist, atlist_to_ati
from .generate_molecule import (
generate_random_molecule,
generate_coordinates,
Expand Down Expand Up @@ -40,5 +40,7 @@
"get_alkaline_earth_metals",
"PSE_NUMBERS",
"PSE_SYMBOLS",
"ati_to_atlist",
"atlist_to_ati",
"postprocess_mol",
]