Avoid memory leaks and other tensorflow issues (#68)
This PR addresses two sources of memory leaks apparent when repeatedly
encoding many molecules in a loop, both originating from `tensorflow`:
- First, there is a very mild leak, caused by `tensorflow` not fully
cleaning up some of its internals, which appears across many
`tensorflow` versions.
- Second, there is also a bigger leak introduced in `tensorflow` version
`2.10`.

The first issue is addressed by manually clearing
`_py_funcs_used_in_graph`; for the second, I temporarily pin the
supported `tensorflow` version to `<2.10` until the issue is fixed
upstream. The pin also avoids backward compatibility problems that
start to appear in `2.14` and prevent the pretrained checkpoint from
being loaded (see #67).
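
As a rough illustration of the first fix, the sketch below is a toy model and not TensorFlow code. It assumes the mild leak comes from callables accumulating in a list held by a long-lived graph object, so that everything they close over stays reachable until the list is reset. `ToyGraph`, `Batch`, and `encode_once` are hypothetical names; only the attribute name `_py_funcs_used_in_graph` is taken from this change.

```python
import gc
import weakref


class ToyGraph:
    """Hypothetical stand-in for a long-lived graph object; not TensorFlow code."""

    def __init__(self):
        # Reuses the attribute name cleared in this PR, purely for illustration.
        self._py_funcs_used_in_graph = []


class Batch:
    """Placeholder for per-call data captured by a generator."""


def encode_once(graph):
    """Simulate one encoding call that registers a generator with the graph."""
    batch = Batch()

    def generator():
        yield batch

    # The registry keeps `generator` (and the `batch` it closes over) alive.
    graph._py_funcs_used_in_graph.append(generator)
    return weakref.ref(batch)


graph = ToyGraph()
refs = [encode_once(graph) for _ in range(100)]
gc.collect()
alive_before = sum(ref() is not None for ref in refs)

# The workaround: drop the accumulated references once the work is done.
graph._py_funcs_used_in_graph = []
gc.collect()
alive_after = sum(ref() is not None for ref in refs)

print(alive_before, alive_after)  # prints: 100 0
```

In this toy model all 100 batches stay alive until the list is reset, which is the pattern the one-line workaround in `_encode_from_smiles` is meant to break.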
kmaziarz authored Dec 12, 2023
1 parent a9a3fc9 commit 5fa9c3c
Showing 5 changed files with 11 additions and 3 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
@@ -21,7 +21,7 @@ jobs:
- environment-file: environment-py39.yml
build-name: "python 3.9.16, tf 2.9.1, rdkit 2022.09.1"
- environment-file: environment.yml
- build-name: "python 3.10, tf latest, rdkit latest"
+ build-name: "python 3.10, tf 2.9.*, rdkit latest"
defaults:
run:
shell: bash -l {0}
3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -10,6 +10,9 @@ and the project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.
### Changed
- Relax `protobuf` version requirement ([#62](https://github.com/microsoft/molecule-generation/pull/62))

+ ### Fixed
+ - Avoid memory leaks and other `tensorflow` issues ([#68](https://github.com/microsoft/molecule-generation/pull/68))
+
## [0.4.0] - 2023-06-16

### Added
2 changes: 1 addition & 1 deletion environment.yml
@@ -6,6 +6,6 @@ dependencies:
- pip
- python=3.10
- rdkit
- - tensorflow
+ - tensorflow<2.10
- pip:
- numpy
5 changes: 5 additions & 0 deletions molecule_generation/utils/moler_inference_server.py
@@ -9,6 +9,7 @@
from typing import Any, DefaultDict, Iterator, List, Optional, Tuple, Union

import numpy as np
+ import tensorflow as tf
from more_itertools import chunked, ichunked
from rdkit import Chem

@@ -81,6 +82,10 @@ def _encode_from_smiles(
else:
result.extend(graph_rep_mean.numpy())

+ # Hack below avoids memory leaks caused by repeated calls to `tf.data.Dataset.from_generator`
+ # (see https://github.com/tensorflow/tensorflow/issues/37653 for details).
+ tf.compat.v1.get_default_graph()._py_funcs_used_in_graph = []
+
return result


2 changes: 1 addition & 1 deletion setup.py
@@ -23,7 +23,7 @@
"numpy>=1.19.2",
"protobuf<4.21", # Avoid the breaking 4.21.0 release.
"scikit-learn>=0.24.1",
- "tensorflow>=2.1.0,<3",
+ "tensorflow>=2.1,<2.10", # Avoid versions 2.10+ which suffer from memory leaks.
"tf2_gnn>=2.13.0",
],
packages=setuptools.find_packages(),
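
The new upper bound is easy to misread, since `2.10` is minor version 10 and not `2.1`. Below is a minimal sketch of how the range `>=2.1,<2.10` behaves, assuming plain dotted integer versions (pre-release tags such as `rc0` are not handled). `parse_version` and `is_supported_tensorflow` are hypothetical helpers for illustration and are not part of this repository.

```python
def parse_version(version: str) -> tuple:
    """Turn '2.9.1' into (2, 9, 1); only handles plain dotted integers."""
    return tuple(int(part) for part in version.split("."))


def is_supported_tensorflow(version: str) -> bool:
    """Check a version against the pinned range >=2.1,<2.10."""
    return (2, 1) <= parse_version(version) < (2, 10)


for candidate in ["2.0.4", "2.1.0", "2.9.1", "2.10.0", "2.14.0"]:
    print(candidate, is_supported_tensorflow(candidate))

# Comparing raw strings gets this wrong: "2.10.0" sorts before "2.9.1".
print("2.10.0" < "2.9.1")
```

Comparing integer tuples avoids the lexicographic trap shown on the last line. In real tooling, `packaging.version.Version` handles the full version grammar, including pre-releases.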