
[INFRA] Python test automation #44

Open · wants to merge 22 commits into main
Conversation

Remi-Gau
Contributor

@Remi-Gau Remi-Gau commented Apr 7, 2022

Because this has not been merged, the results of the GitHub continuous integration workflows are for now only viewable in the Actions tab of my fork.

The initial commit of this PR is common to #44 (some changes are common to both PRs, which might help prevent merge conflicts).

relates to #33

TODO

  • investigate the test failure with Python 3.10 (see here)
  • set up code coverage upload on codecov

DONE

  • make sure the CI workflow shows as failed if any of its jobs fail
  • ensure that the demo workflow shows as failed if any of the demos fail
  • add a Makefile (to help install test data, clean up code...)
  • developer documentation
  • add badges to the README
  • set up system tests
  • automate system tests via GitHub CI
    • for Python 3.7, 3.8, 3.9
    • on Linux, macOS, Windows
  • get and upload code coverage to codecov
  • automate testing of the Jupyter notebooks every 15 days
  • only run tests in CI on pushes to master and on PRs for all branches

@Remi-Gau Remi-Gau marked this pull request as draft April 7, 2022 11:47
runs-on: ${{ matrix.os }}
strategy:
  matrix:
    os: [ubuntu-20.04]
Contributor Author

Do we want to test on different OSes?

Member

Probably the higher priority is different Python versions (since Python versions seem to cause more problems)... I suppose, assuming that the tests can run pretty quickly.

Contributor Author

Now set up to test Python 3.7 to 3.9 in parallel on Ubuntu, macOS, and Windows.
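For reference, a minimal sketch of what such a matrix could look like (the job and step names here are illustrative, not necessarily the ones used in this PR):

```yaml
jobs:
  test:
    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: false  # let the remaining jobs run even if one fails
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
        python-version: ["3.7", "3.8", "3.9"]
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v3
        with:
          python-version: ${{ matrix.python-version }}
      - run: pip install -e .
      - run: pytest
```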

Contributor Author

Running the tests themselves takes about a minute in CI, but the whole job takes a bit longer than that because of the set-up and tear-down.

tests/test_glmsingleunit.py (review thread resolved)
@Remi-Gau
Contributor Author

Remi-Gau commented Apr 7, 2022

Python demos run in 35-40 minutes in CI (see here).

So running them on every commit of every pull request might be a bit much.

Maybe a better approach could be to run it:

  1. only on push to the master branch (so only once when the PR is merged)
  2. at regular intervals with a CRON job (see here)
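For illustration, the corresponding workflow triggers could look something like this (the cron interval is an assumption based on the "every 15 days" item in the TODO list):

```yaml
on:
  push:
    branches: [master]
  schedule:
    # roughly every 15 days: the 1st and 15th of each month, at midnight UTC
    - cron: "0 0 1,15 * *"
```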

@Remi-Gau
Contributor Author

@iancharest

Trying to test against actual values, I realized that the HRFindex in the output files of the different models has different shapes.

Is this intentional?

In [65]: tmp = np.load("TYPEB_FITHRF.npy", allow_pickle=True)
    ...: print(tmp.item(0)["HRFindex"].shape)
    ...: 
    ...: tmp = np.load("TYPEC_FITHRF_GLMDENOISE.npy", allow_pickle=True)
    ...: print(tmp.item(0)["HRFindex"].shape)
    ...: 
    ...: tmp = np.load("TYPED_FITHRF_GLMDENOISE_RR.npy", allow_pickle=True)
    ...: print(tmp.item(0)["HRFindex"].shape)
(20, 20, 1)
(20, 20, 1)
(400,)

@Remi-Gau
Contributor Author

@iancharest

Trying to test against actual values, I realized that the HRFindex in the output files of the different models has different shapes.

Looks like this is because there is not a

'HRFindex': HRFindex.reshape(xyz),

in an "else" block for model 3 as there is for model 2.

@Remi-Gau
Contributor Author

Maybe a better approach could be to run it:

1. only on `push` to the master branch (so only once when the PR is merged)

2. at regular intervals with a CRON job (see [here](https://jasonet.co/posts/scheduled-actions/))

This is now set up.


[![Python tests](https://github.com/cvnlab/GLMsingle/actions/workflows/run_tests_python.yml/badge.svg)](https://github.com/cvnlab/GLMsingle/actions/workflows/run_tests_python.yml)
[![Python demos](https://github.com/cvnlab/GLMsingle/actions/workflows/run_demos_python.yml/badge.svg)](https://github.com/cvnlab/GLMsingle/actions/workflows/run_demos_python.yml)
[![Python-coverage](https://codecov.io/gh/Remi-Gau/GLMsingle/branch/main/graph/badge.svg?token=H75TAAUVSW)](https://codecov.io/gh/Remi-Gau/GLMsingle)
Contributor Author

The code coverage is not yet uploaded, so this might require some tweaking after merging:

  • setting up an account for cvnlab on codecov (it can simply be linked to the GitHub account)
  • changing the URL of this badge

But once set up, you can see how a pull request changes the code coverage: see Remi-Gau#3 (comment) for an example.
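For reference, the upload step in the test workflow might look something like this (the report path and settings depend on how codecov ends up being configured):

```yaml
- name: Upload coverage to codecov
  uses: codecov/codecov-action@v2
  with:
    files: coverage.xml
    fail_ci_if_error: false
```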


Notes:
* Please note that GLMsingle is not (yet) compatible with Python 3.9 (due to an incompatibility between scikit-learn and Python 3.9). Please use Python 3.8 or earlier.
Contributor Author

I have removed this since the tests pass with Python 3.9.

But there seems to be a different issue with Python 3.10 and numpy: I can investigate, but I would not consider it necessary for this PR.

Comment on lines +1 to +8
fracridge>=1.4.3
h5py>=3.1.0
matplotlib>=3.3.4
nibabel>=3.2.2
numpy>=1.17.0
scikit-learn>=0.23.2
scipy>=1.5.4
tqdm>=4.63.1
Contributor Author

I have pinned the minimum versions of the GLMsingle dependencies to those that were installed by default when using Python 3.6.

-    version='0.0.1',
+    version='1.0.0',
Contributor Author

To match what the README and the GitHub tag say.

@@ -28,5 +28,6 @@
     packages=find_packages(),
     include_package_data=True,
     zip_safe=False,
-    install_requires=requires
+    install_requires=requires,
+    python_requires='>=3.6, <3.10'
Contributor Author

Specifies which Python versions can be used.

Comment on lines +19 to +22
# TODO use same expected data for Python and MATLAB ?
# TODO in both cases expected results should probably be in a set of CSV files
# tmp = sio.loadmat(join(expected_dir, "TYPEB_FITHRF.mat"))
# expected["typeb"]["HRFindex"] = tmp["HRFindex"]
Contributor Author

In the long run we probably do not want different expected data for MATLAB and Python. A set of CSV files that both languages read from would be easier to handle, more interoperable, and easier to inspect.
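As a sketch of that idea, the Python tests could then read the shared expected values like this (the file and column names below are hypothetical):

```python
import numpy as np
import pandas as pd

# hypothetical CSV shared between the MATLAB and Python test suites
expected = pd.read_csv("expected_typeb.csv")

tmp = np.load("TYPEB_FITHRF.npy", allow_pickle=True)
actual = tmp.item(0)["HRFindex"].ravel()

np.testing.assert_array_equal(actual, expected["HRFindex"].to_numpy())
```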

@Remi-Gau Remi-Gau changed the title [WIP][INFRA] Python test automation [INFRA] Python test automation Apr 12, 2022
@Remi-Gau Remi-Gau marked this pull request as ready for review April 12, 2022 15:37
@Remi-Gau
Contributor Author

@iancharest
I am pretty much done with this so feel free to have a look when you can.

Comment on lines +59 to +69
[black](https://black.readthedocs.io/en/stable/) (code formatter) and
[flake8](https://flake8.pycqa.org/en/latest/) (style guide enforcement) are used
on the test code base.

You can use make to run them automatically with

```bash
make lint/black # to run black
make lint/flake8 # to run flake8
make lint # to run both
```
Contributor Author

Some of this is for code styling, so it does not have to be done manually. It only applies to the test code base.

Can remove it if you want.
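For reference, the corresponding Makefile targets could be as simple as this (a sketch; the actual targets and paths in this PR may differ, and recipe lines must start with a tab):

```makefile
lint/black:
	black tests
lint/flake8:
	flake8 tests
lint: lint/black lint/flake8
```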

@kendrickkay
Member

@iancharest
Trying to test against actual values, I realized that the HRFindex in the output files of the different models has different shapes.

Looks like this is because there is not a

'HRFindex': HRFindex.reshape(xyz),

in an "else" block for model 3 as there is for model 2.

I think that sounds about right. Basically, the code tries to flexibly handle X Y Z (volume) data or A (surface) data, and there are funny reshapings that have to happen. Hopefully you and @iancharest can sort this out and fold the fix into this pull request.
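To make the volume/surface distinction concrete, here is a hedged sketch of that kind of shape handling (not the actual GLMsingle code):

```python
import numpy as np

def to_spatial(flat, input_shape):
    """Reshape a flattened per-voxel array back to the input's spatial
    dimensions: (X, Y, Z) for volume data, (A,) for surface data."""
    return np.asarray(flat).reshape(input_shape)

print(to_spatial(np.arange(400), (20, 20, 1)).shape)  # volume:  (20, 20, 1)
print(to_spatial(np.arange(400), (400,)).shape)       # surface: (400,)
```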
