update versions of python packages #560

Merged (3 commits, Sep 2, 2024)
5 changes: 4 additions & 1 deletion Docker/Dockerfile
@@ -72,7 +72,7 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

ARG PYTHON_VERSION=3.10
-ARG FORGE_VERSION=23.11.0-0
+ARG FORGE_VERSION=24.3.0-0

# Install conda
RUN wget --no-check-certificate -qO ~/miniforge.sh \
@@ -189,6 +189,9 @@ SHELL ["/bin/bash", "--login", "-c"]
COPY --from=selected_freesurfer_build_image /opt/freesurfer /opt/freesurfer
COPY --from=selected_conda_build_image /venv /venv

+# Fix for cuda11.8+cudnn8.7 bug+warning: https://github.com/pytorch/pytorch/issues/97041
+RUN if [[ "$DEVICE" == "cu118" ]] ; then cd /venv/python3.10/site-packages/torch/lib && ln -s libnvrtc-*.so.11.2 libnvrtc.so ; fi

# Copy fastsurfer over from the build context and add PYTHONPATH
COPY . /fastsurfer/
ENV PYTHONPATH=/fastsurfer:/opt/freesurfer/python/packages \
16 changes: 8 additions & 8 deletions Docker/README.md
@@ -139,7 +139,7 @@ As you can see, only the tag of the image is changed from gpu to cpu and the sta

Here we build an experimental image to test performance when running on AMD GPUs. Note that you need a supported OS, a supported kernel version, and a supported GPU for ROCm to work correctly. You need to install the kernel drivers into
your host machine kernel (amdgpu-install --usecase=dkms) for the AMD Docker image to work. For this follow:
-https://docs.amd.com/en/latest/deploy/linux/quick_start.html
+https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/quick-start.html#rocm-install-quick, https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/amdgpu-install.html#amdgpu-install-dkms and https://rocm.docs.amd.com/projects/install-on-linux/en/latest/how-to/docker.html

```bash
PYTHONPATH=<FastSurferRoot>
```

@@ -149,22 +149,22 @@ python build.py --device rocm --tag my_fastsurfer:rocm
and run segmentation only:

```bash
-docker run --rm --cap-add=SYS_PTRACE --security-opt seccomp=unconfined \
-    --device=/dev/kfd --device=/dev/dri --group-add video --ipc=host \
-    --shm-size 8G \
+docker run --rm --security-opt seccomp=unconfined \
+    --device=/dev/kfd --device=/dev/dri --group-add video \
-v /home/user/my_mri_data:/data \
-v /home/user/my_fastsurfer_analysis:/output \
my_fastsurfer:rocm \
--t1 /data/subjectX/t1-weighted.nii.gz \
--sid subjectX --sd /output
```

-Note, we tested on an AMD Radeon Pro W6600, which is [not officially supported](https://docs.amd.com/en/latest/release/gpu_os_support.html), but setting `HSA_OVERRIDE_GFX_VERSION=10.3.0` [inside docker did the trick](https://en.opensuse.org/AMD_OpenCL#ROCm_-_Running_on_unsupported_hardware):
+Contrary to the official ROCm documentation (above), we also needed to add the `render` group (`--group-add render`) in addition to `--group-add video`.
+
+Note, we tested on an AMD Radeon Pro W6600, which is [not officially supported](https://docs.amd.com/en/latest/release/gpu_os_support.html), but setting `HSA_OVERRIDE_GFX_VERSION=10.3.0` [inside docker did the trick](https://en.opensuse.org/SDB:AMD_GPGPU#Using_CUDA_code_with_ZLUDA_and_ROCm):

```bash
-docker run --rm --cap-add=SYS_PTRACE --security-opt seccomp=unconfined \
-    --device=/dev/kfd --device=/dev/dri --group-add video --ipc=host \
-    --shm-size 8G \
+docker run --rm --security-opt seccomp=unconfined \
+    --device=/dev/kfd --device=/dev/dri --group-add video --group-add render \
-v /home/user/my_mri_data:/data \
-v /home/user/my_fastsurfer_analysis:/output \
-e HSA_OVERRIDE_GFX_VERSION=10.3.0 \
```
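To check that a ROCm build of the image actually sees the GPU from inside the container, a quick interactive test can help. This is a sketch, not part of the FastSurfer scripts; the version strings assume the torch 2.4 / rocm6.1 build introduced in this PR:

```python
import torch

# ROCm builds of PyTorch expose AMD GPUs through the regular torch.cuda API;
# torch.version.hip is set instead of torch.version.cuda.
print(torch.__version__)            # e.g. "2.4.0+rocm6.1" in the rocm image
print(torch.version.hip)            # HIP/ROCm version string, None on CUDA/CPU builds
print(torch.cuda.is_available())    # True when /dev/kfd and /dev/dri are mapped in
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))   # e.g. "AMD Radeon Pro W6600"
```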
17 changes: 9 additions & 8 deletions Docker/build.py
@@ -30,9 +30,8 @@
Target = Literal['runtime', 'build_common', 'build_conda', 'build_freesurfer',
'build_base', 'runtime_cuda']
CacheType = Literal["inline", "registry", "local", "gha", "s3", "azblob"]
-AllDeviceType = Literal["cpu", "cuda", "cu116", "cu117", "cu118", "rocm", "rocm5.1.1",
-    "rocm5.4.2"]
-DeviceType = Literal["cpu", "cu116", "cu117", "cu118", "rocm5.1.1", "rocm5.4.2"]
+AllDeviceType = Literal["cpu", "cuda", "cu118", "cu121", "cu124", "rocm", "rocm6.1"]
+DeviceType = Literal["cpu", "cu118", "cu121", "cu124", "rocm6.1"]

CREATE_BUILDER = "Create builder with 'docker buildx create --name fastsurfer'."
CONTAINERD_MESSAGE = (
@@ -59,10 +58,11 @@ class DEFAULTS:
# and rocm versions, if pytorch comes with new versions.
# torch 1.12.0 comes compiled with cu113, cu116, rocm5.0 and rocm5.1.1
# torch 2.0.1 comes compiled with cu117, cu118, and rocm5.4.2
+# torch 2.4 comes compiled with cu118, cu121, cu124 and rocm6.1
MapDeviceType: Dict[AllDeviceType, DeviceType] = dict(
((d, d) for d in get_args(DeviceType)),
rocm="rocm5.1.1",
cuda="cu117",
rocm="rocm6.1",
cuda="cu124",
)
BUILD_BASE_IMAGE = "ubuntu:22.04"
RUNTIME_BASE_IMAGE = "ubuntu:22.04"
@@ -185,12 +185,12 @@ def make_parser() -> argparse.ArgumentParser:

parser.add_argument(
"--device",
choices=["cpu", "cuda", "cu117", "cu118", "rocm", "rocm5.4.2"],
choices=["cpu", "cuda", "cu118", "cu121", "cu124", "rocm", "rocm6.1"],
required=True,
help="""selection of internal build stages to build for a specific platform.<br>
-- cuda: defaults to cu118, cuda 11.8<br>
+- cuda: defaults to cu124, cuda 12.4<br>
- cpu: only cpu support<br>
-- rocm: defaults to rocm5.4.2 (experimental)""",
+- rocm: defaults to rocm6.1 (experimental)""",
)
parser.add_argument(
"--tag",
@@ -231,6 +231,7 @@ def make_parser() -> argparse.ArgumentParser:
--cache type=registry,ref=server/fastbuild,mode=max.
Will default to the environment variable FASTSURFER_BUILD_CACHE:
{cache_kwargs.get('default', 'N/A')}""",
metavar="type={inline,local,...}[,<param>=<value>[,...]]",
**cache_kwargs,
)
parser.add_argument(
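For illustration, the `MapDeviceType` dictionary above is what resolves the generic `cuda` and `rocm` aliases to concrete torch 2.4 build targets. A runnable sketch of that lookup, re-stated outside the `DEFAULTS` class:

```python
from typing import Dict, Literal, get_args

DeviceType = Literal["cpu", "cu118", "cu121", "cu124", "rocm6.1"]

# Concrete device strings map to themselves; the generic aliases fall back to
# the newest toolkit that torch 2.4 ships wheels for.
MAP_DEVICE_TYPE: Dict[str, str] = dict(
    ((d, d) for d in get_args(DeviceType)),
    rocm="rocm6.1",
    cuda="cu124",
)

print(MAP_DEVICE_TYPE["cuda"])    # cu124
print(MAP_DEVICE_TYPE["rocm"])    # rocm6.1
print(MAP_DEVICE_TYPE["cu118"])   # cu118 (explicit choices pass through unchanged)
```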
2 changes: 1 addition & 1 deletion Docker/install_env.py
@@ -19,7 +19,7 @@
def mode(arg: str) -> str:
if arg in ["base", "cpu"] or \
re.match("^cu\\d+$", arg) or \
re.match("^rocm\\d+\\.\\d+(\\.\\d+)?$"):
re.match("^rocm\\d+\\.\\d+(\\.\\d+)?$", arg):
return arg
else:
raise argparse.ArgumentTypeError(f"The mode was '{arg}', but should be "
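The one-line fix above supplies the missing `arg` argument to the second `re.match` call; `re.match(pattern)` without a string raises a `TypeError`, so the old code failed whenever that branch was reached (i.e. for any `rocm*` mode or invalid value). A small sketch of the corrected validator in isolation; the error message is illustrative:

```python
import argparse
import re


def mode(arg: str) -> str:
    # Accept "base", "cpu", "cuNNN" or "rocmX.Y[.Z]"; reject everything else.
    if arg in ["base", "cpu"] or \
            re.match("^cu\\d+$", arg) or \
            re.match("^rocm\\d+\\.\\d+(\\.\\d+)?$", arg):
        return arg
    raise argparse.ArgumentTypeError(f"The mode was '{arg}', but should be base, cpu, cuNNN or rocmN.N[.N]")


print(mode("cu124"))    # cu124
print(mode("rocm6.1"))  # rocm6.1
```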
2 changes: 1 addition & 1 deletion FastSurferCNN/data_loader/data_utils.py
@@ -623,7 +623,7 @@ def read_classes_from_lut(lut_file: str | Path):
if lut_file.suffix == ".csv":
kwargs["sep"] = ","
elif lut_file.suffix == ".txt":
kwargs["delim_whitespace"] = True
kwargs["sep"] = "\\s+"
else:
raise RuntimeError(
f"Unknown LUT file extension {lut_file}, must be csv, txt or tsv."
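`delim_whitespace=True` is deprecated in pandas 2.x in favor of passing a whitespace regex as the separator, which is exactly what the replacement `sep="\s+"` does. A minimal sketch of the equivalent `read_csv` call; the file path and column names are illustrative, not taken from the FastSurfer code:

```python
import pandas as pd

# Old (deprecated in pandas 2.x):
#   pd.read_csv("lut.txt", delim_whitespace=True, header=None, comment="#")
# New, equivalent behaviour: split columns on any run of whitespace.
lut = pd.read_csv(
    "lut.txt",
    sep=r"\s+",
    header=None,
    comment="#",
    names=["ID", "LabelName", "R", "G", "B", "A"],  # illustrative column names
)
print(lut.head())
```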
4 changes: 3 additions & 1 deletion FastSurferCNN/inference.py
@@ -213,7 +213,9 @@ def load_checkpoint(self, ckpt: Union[str, os.PathLike]):
# make sure the model is, where it is supposed to be
self.model.to(self.device)

-        model_state = torch.load(ckpt, map_location=device)
+        # WARNING: weights_only=False can cause unsafe code execution, but here the
+        # checkpoint can be considered to be from a safe source
+        model_state = torch.load(ckpt, map_location=device, weights_only=False)
self.model.load_state_dict(model_state["model_state"])

# workaround for mps (move the model back to mps)
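Background for the `weights_only=False` additions here and in the `checkpoint.py` and HypVINN changes below: torch 2.4 warns that the default of `weights_only` will change to `True`, which restricts unpickling to tensors and a small safelist of types. Passing `weights_only=False` explicitly keeps the previous behaviour (and silences the warning); it is only appropriate for checkpoints from a trusted source, as the FastSurfer checkpoints are treated here. A hedged sketch of the pattern; the path is illustrative:

```python
import torch

ckpt_path = "checkpoints/some_fastsurfer_checkpoint.pkl"  # illustrative path

# Trusted checkpoint: it stores more than raw tensors (model state plus
# metadata), so full unpickling is needed and weights_only must stay False.
# Never do this for checkpoints from untrusted sources.
state = torch.load(ckpt_path, map_location="cpu", weights_only=False)
model_state = state["model_state"]    # key used by the loaders in this PR
# model.load_state_dict(model_state)  # then load into the instantiated model
```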
4 changes: 3 additions & 1 deletion FastSurferCNN/utils/checkpoint.py
@@ -228,7 +228,9 @@ def load_from_checkpoint(
loaded_epoch : int
Epoch number.
"""
-    checkpoint = torch.load(checkpoint_path, map_location="cpu")
+    # WARNING: weights_only=False can cause unsafe code execution, but here the
+    # checkpoint can be considered to be from a safe source
+    checkpoint = torch.load(checkpoint_path, map_location="cpu", weights_only=False)

if drop_classifier:
classifier_conv = ["classifier.conv.weight", "classifier.conv.bias"]
4 changes: 3 additions & 1 deletion HypVINN/inference.py
@@ -181,7 +181,9 @@ def load_checkpoint(self, ckpt: str):
of a model.
"""
logger.info("Loading checkpoint {}".format(ckpt))
-        model_state = torch.load(ckpt, map_location=self.device)
+        # WARNING: weights_only=False can cause unsafe code execution, but here the
+        # checkpoint can be considered to be from a safe source
+        model_state = torch.load(ckpt, map_location=self.device, weights_only=False)
self.model.load_state_dict(model_state["model_state"])

def get_modelname(self):
14 changes: 7 additions & 7 deletions env/export_pip-r.sh
@@ -48,21 +48,23 @@ echo "Exporting versions from $2..."
echo "#"
} > $1

pip_cmd="python --version && pip list --format=freeze --no-color --all --disable-pip-version-check --no-input"
pip_cmd="python --version && pip list --format=freeze --no-color --disable-pip-version-check --no-input"
if [ "${2/#.sif}" != "$2" ]
then
# singularity
cmd="singularity exec $2 /bin/bash -c '$pip_cmd'"
cmd=("singularity" "exec" "$2" "/bin/bash" -c "$pip_cmd")
clean_cmd="singularity exec $2 /bin/bash -c '$pip_cmd'"
else
# docker
cmd="docker run --entrypoint /bin/bash $2 -c '$pip_cmd'"
clean_cmd="docker run --rm -u <user_id>:<group_id> --entrypoint /bin/bash $2 -c '$pip_cmd'"
cmd=("docker" "run" --rm -u "$(id -u):$(id -g)" --entrypoint /bin/bash "$2" -c "$pip_cmd")
fi
{
echo "# Which ran the following command:"
echo "# $cmd"
echo "# $clean_cmd"
echo "#"
} >> $1
-out=$($cmd)
+out=$("${cmd[@]}")
hardware=$(echo "$out" | grep "torch==" | cut -d"+" -f2)
pyversion=$(echo "$out" | head -n 1 | cut -d" " -f2)
{
@@ -73,5 +75,3 @@ pyversion=$(echo "$out" | head -n 1 | cut -d" " -f2)
echo ""
echo "# $out"
} >> $1

-}
50 changes: 25 additions & 25 deletions env/fastsurfer.yml
@@ -5,28 +5,28 @@ channels:
- defaults

dependencies:
-- h5py=3.7.0
-- lapy=1.0.1
-- matplotlib=3.7.1
-- nibabel=5.1.0
-- numpy=1.25.0
-- pandas=1.5.3
-- pillow=10.0.1
-- pip=23.1.2
-- python=3.10
-- python-dateutil=2.8.2
-- pyyaml=6.0
-- scikit-image=0.19.3
-- scikit-learn=1.2.2
-- scipy=1.10.1
-- setuptools=67.8.0
-- tensorboard=2.12.1
-- tqdm=4.66
-- yacs=0.1.8
-- pip
-- pip:
--   --extra-index-url https://download.pytorch.org/whl/cu117
--   simpleitk==2.2.1
--   torch==2.0.1
--   torchio==0.18.83
--   torchvision==0.15.2
+- h5py=3.11.0
+- lapy=1.1.0
+- matplotlib=3.9.2
+- nibabel=5.2.1
+- numpy=1.26.4
+- pandas=2.2.2
+- pillow=10.4.0
+- pip=24.2
+- python=3.10
+- python-dateutil=2.9.0
+- pyyaml=6.0.2
+- requests=2.32.3
+- scikit-image=0.24.0
+- scikit-learn=1.5.1
+- scipy=1.14.1
+- setuptools=72.2.0
+- tensorboard=2.17.1
+- tqdm=4.66.5
+- yacs=0.1.8
+- pip:
+-   --extra-index-url https://download.pytorch.org/whl/cu124
+-   simpleitk==2.4.0
+-   torch==2.4.0+cu124
+-   torchio==0.19.9
+-   torchvision==0.19.0+cu124
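As a quick sanity check after building an environment from this file, the pinned PyTorch build can be verified from Python. A sketch; the expected values assume the cu124 pins above:

```python
import numpy
import torch

print(numpy.__version__)           # expect 1.26.4 per the pin above
print(torch.__version__)           # expect "2.4.0+cu124" in the CUDA image
print(torch.version.cuda)          # expect "12.4"
print(torch.cuda.is_available())   # True only with a visible NVIDIA GPU and driver
```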
104 changes: 104 additions & 0 deletions requirements.cpu.txt
@@ -0,0 +1,104 @@
#
# This file is autogenerated by kueglerd from deepmi/fastsurfer:cpu-v2.3.0
# by the following command from FastSurfer:
#
# ./requirements.cpu.txt deepmi/fastsurfer:cpu-v2.3.0
#
# Which ran the following command:
# docker run --rm -u <user_id>:<group_id> --entrypoint /bin/bash deepmi/fastsurfer:cpu-v2.3.0 -c 'python --version && pip list --format=freeze --no-color --disable-pip-version-check --no-input'
#
#
# Image was configured for cpu using python version 3.10.14
#
--extra-index-url https://download.pytorch.org/whl/cpu

# Python 3.10.14
absl-py==2.1.0
Brotli==1.1.0
cached-property==1.5.2
certifi==2024.7.4
cffi==1.17.0
charset-normalizer==3.3.2
click==8.1.7
colorama==0.4.6
contourpy==1.2.1
cycler==0.12.1
Deprecated==1.2.14
filelock==3.15.4
fonttools==4.53.1
fsspec==2024.6.1
grpcio==1.62.2
h2==4.1.0
h5py==3.11.0
hpack==4.0.0
humanize==4.10.0
hyperframe==6.0.1
idna==3.8
imagecodecs==2024.6.1
imageio==2.35.1
importlib_metadata==8.4.0
importlib_resources==6.4.4
Jinja2==3.1.4
joblib==1.4.2
kiwisolver==1.4.5
lapy==1.1.0
lazy_loader==0.4
Markdown==3.6
markdown-it-py==3.0.0
MarkupSafe==2.1.5
matplotlib==3.9.2
mdurl==0.1.2
mpmath==1.3.0
munkres==1.1.4
networkx==3.3
nibabel==5.2.1
numpy==1.26.4
packaging==24.1
pandas==2.2.2
pillow==10.4.0
pip==24.2
plotly==5.23.0
protobuf==4.25.3
psutil==6.0.0
pycparser==2.22
Pygments==2.18.0
pyparsing==3.1.4
PySide6==6.7.2
PySocks==1.7.1
python-dateutil==2.9.0
pytz==2024.1
PyWavelets==1.7.0
PyYAML==6.0.2
requests==2.32.3
rich==13.8.0
scikit-image==0.24.0
scikit-learn==1.5.1
scikit-sparse==0.4.14
scipy==1.14.1
setuptools==72.2.0
shellingham==1.5.4
shiboken6==6.7.2
SimpleITK==2.4.0
six==1.16.0
sympy==1.13.2
tenacity==9.0.0
tensorboard==2.17.1
tensorboard-data-server==0.7.0
threadpoolctl==3.5.0
tifffile==2024.8.24
torch==2.4.0+cpu
torchio==0.19.9
torchvision==0.19.0+cpu
tornado==6.4.1
tqdm==4.66.5
typer==0.12.5
typing_extensions==4.12.2
tzdata==2024.1
unicodedata2==15.1.0
urllib3==2.2.2
Werkzeug==3.0.4
wheel==0.44.0
wrapt==1.16.0
yacs==0.1.8
zipp==3.20.0
zstandard==0.23.0