
Developer guide

Martin Trans edited this page Apr 13, 2022 · 10 revisions

Interested in contributing to 4CAT or extending it with your own modules? The following pages offer the information you need to do this:

Developing in Docker

Docker's containerized approach allows you to easily change 4CAT and its environment and redeploy your local application. You can make changes locally in your cloned 4CAT repository, shut down your running 4CAT instance with docker-compose down, and then recreate the application with docker-compose up --build -d. This rebuilds your application following the docker-compose.yml file (make sure you use that file rather than docker-compose_prod.yml, since the prod file pulls images directly from Docker Hub). Note: if you also wish to remove all collected data, use docker-compose down -v, which removes the volumes as well.

Often it is not necessary to rebuild the entire application. If you only change certain files or create new processors/datasources, you can copy those files directly into your Docker 4CAT containers. docker cp path/to/file 4cat_backend:/usr/src/app/path/to/file will copy a file to the 4cat_backend container. Remember to also copy the same files to the 4cat_frontend container, which works as a mirror of the backend. The frontend will generally pick up all changed files automatically as long as you refresh your browser; the backend, however, needs to be restarted. You can either restart the container or run docker exec 4cat_backend python3 4cat-daemon.py restart.
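The copy-and-restart routine described above can be scripted. The following is a minimal sketch, not part of 4CAT itself: the function names build_sync_commands and sync_file are hypothetical, and it assumes the script is run from the root of your cloned repository so that a repository-relative path maps onto the same path under /usr/src/app. By default it only prints the commands (a dry run).

```python
import shlex
import subprocess

# Container names and application root as mentioned in this guide.
CONTAINERS = ("4cat_backend", "4cat_frontend")
APP_ROOT = "/usr/src/app"


def build_sync_commands(relative_path):
    """Return the docker commands that copy one file into both containers
    and then restart the backend daemon (hypothetical helper)."""
    commands = [
        ["docker", "cp", relative_path, f"{name}:{APP_ROOT}/{relative_path}"]
        for name in CONTAINERS
    ]
    commands.append(
        ["docker", "exec", "4cat_backend", "python3", "4cat-daemon.py", "restart"]
    )
    return commands


def sync_file(relative_path, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in build_sync_commands(relative_path):
        print(shlex.join(command))
        if not dry_run:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical file path, for illustration only.
    sync_file("processors/example/my_processor.py")
```

Call sync_file(path, dry_run=False) to actually run the commands once the printed output looks right.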

You can also connect directly to the containers with either the Docker GUI or the command docker exec -it 4cat_backend /bin/bash where you can run commands, test your python3 environment and files, and even edit files directly. (You could install your preferred text editor if you wish with a command such as apt-get install nano directly in the container.)

4CAT tips

  • 4CAT containers inherit the git repository you cloned, so if you are using git to track changes and manage branches, you can use all of git's functionality inside the containers.
  • Logs are stored by default in /usr/src/app/logs. This is a shared volume, so it is accessible via both 4cat_backend and 4cat_frontend. Main logs are generally shown in the docker logs as well, but this is not always the case (particularly for the backend).
  • Additional backend commands:
    • python3 4cat-daemon.py status
    • python3 4cat-daemon.py stop
    • python3 4cat-daemon.py start
  • Containers do not need to be running to copy files to and from them (helpful if an error causes a container to crash!)
  • docker-compose essentially relies on the docker-compose.yml file and your .env file, but it also uses docker/Dockerfile to build the 4CAT image environments and docker/docker-entrypoint.sh, which runs every time the 4cat_backend container starts.
  • You can open additional ports by adding them to docker-compose.yml. For example, if you wished to directly connect to the database from your local machine, just add the line ports: - 5432:5432 to the db container under services and recreate 4CAT.
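After adding a port mapping such as the database example above and recreating 4CAT, you may want to confirm that the port is reachable from your local machine. The following is a small sketch using only Python's standard library. The function name port_is_open is hypothetical, and a successful TCP connection only shows that something is listening on that port, not that the database accepts your credentials.

```python
import socket


def port_is_open(host="localhost", port=5432, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts.
        return False


if __name__ == "__main__":
    print("database port reachable:", port_is_open())
```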

Helpful Docker commands

  1. View container logs: docker container logs container_name
  2. Stop a running container: docker stop container_name
  3. Start a stopped container: docker start container_name
  4. Remove a container: docker container rm container_name (useful to remove and then recreate it with new parameters, e.g. port mappings)
  5. Remove an image: docker image rm image_name:image_tag (useful if you need to change the Dockerfile or docker-entrypoint.sh and rebuild)
    • Note: you must also remove any containers that depend on the image; alternatively, you could create a new image with a different name:tag
  6. Copy files into a container: docker cp path/to/file container_name:/full/path/to/container/directory/ (can be used to update files such as config.py or other configuration files; note that the container may need to be restarted for changes to take effect)

Documentation

Documentation in this documentation branch is generated (semi-)automatically from docstrings formatted in reStructuredText, using Sphinx (v4.5) and its autodoc and autosummary modules. When merging commits and general code updates into this branch, be aware that the overall code structure has changed slightly, mostly in that directories and files have been renamed with underscores taking the place of hyphens throughout. This is because Sphinx needs to import all relevant directories and .py files as modules and packages. This is also why there are a lot of empty "__init__.py" files.

Currently, many functions lack thorough documentation, or any documentation at all. Please follow the steps below to update docstrings as you encounter unfinished documentation!

Docstring format:

Please follow the official reStructuredText formatting for your docstrings:

"""
[Summary]

[Detailed description if needed]

:param [ParamName]: [ParamDescription], defaults to [DefaultParamVal]
:type [ParamName]: [ParamType](, optional)
:raises [ErrorType]: [ErrorDescription]
:return: [ReturnDescription]
:rtype: [ReturnType]
"""

Make sure to keep, at a minimum, the empty lines between the summary and the parameter list.
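As an illustration of the template, here is a filled-in docstring on a small function. The function count_matches is hypothetical and not part of 4CAT; it only serves to show each field in use, including the empty lines that separate the summary, the description and the parameter list.

```python
def count_matches(items, keyword, case_sensitive=False):
    """
    Count the items that contain a keyword.

    Each item is converted to a string before matching.

    :param items: Values to search through
    :type items: list
    :param keyword: Text to look for in each item
    :type keyword: str
    :param case_sensitive: Whether to match case exactly, defaults to False
    :type case_sensitive: bool, optional
    :raises ValueError: If the keyword is empty
    :return: Number of items containing the keyword
    :rtype: int
    """
    if not keyword:
        raise ValueError("keyword must not be empty")
    if not case_sensitive:
        keyword = keyword.lower()
    total = 0
    for item in items:
        # Compare on the same case as the (possibly lowered) keyword.
        text = str(item) if case_sensitive else str(item).lower()
        if keyword in text:
            total += 1
    return total
```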

How to update the documentation

For more info, this short guide might be helpful. Make sure that you have sphinx, sphinx-mdinclude, and sphinx-rtd-theme installed via pip in your environment:

$ (sudo) pip install sphinx

$ (sudo) pip install sphinx-rtd-theme

$ (sudo) pip install sphinx-mdinclude

Scenario: I changed or added new information to existing docstrings:

  1. Update project files and merge commits into this documentation branch. Be aware of possible naming-convention conflicts and resolve them accordingly
  2. Open the "docs" folder in your terminal
  3. Run make clean && make html. This clears the existing HTML and regenerates it, including your new docstrings. You can then do what you want with the newly generated HTML files in the /build directory - success!

Scenario: I added a new submodule (i.e. a new datasource)

  1. Update project files and merge commits into this documentation branch. Be aware of possible naming-convention conflicts and resolve them accordingly

  2. Open the "docs" folder in your terminal

  3. Either:

    • Delete the rst file(s) in question within the "source"-directory (leave the index.rst!)

    or

    • Manually add the stub in the relevant rst file within the "source"-directory as required
  4. Regenerate the missing rst files by running sphinx-apidoc -o ./source .. "/*setup*" "/*4cat-daemon*" "/*config*" in your terminal, then reformat them for layout as needed, since all formatting is lost upon deletion. The last part of the command lists the parts of the code that we want to exclude, as these work poorly with Sphinx documentation.

  5. Run make clean && make html. This clears the existing HTML and regenerates it, including your new docstrings. You can then do what you want with the newly generated HTML files in the /build directory - success!

Documentation Troubleshooting

If you encounter errors, make sure your current Python environment (venv?) has the pip packages installed that 4CAT lists in setup.py. If you have issues installing psycopg2==2.8.6, try pip install psycopg2-binary==2.8.6 instead.

Flask needs to be executable from Sphinx's end, so it might ask for FLASK_KEY to be filled in. In that case, create a config.py from config.py-example and fill in that value (see this on how to generate one).
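One common way to generate a random secret value is Python's standard secrets module. The following is a minimal sketch: the function name generate_secret and the 32-byte length are assumptions for illustration, and it is not necessarily the method the guide referenced above describes. Where exactly the value goes in config.py depends on config.py-example.

```python
import secrets


def generate_secret(num_bytes=32):
    """Return a random hex string (two hex characters per byte)."""
    return secrets.token_hex(num_bytes)


if __name__ == "__main__":
    # Prints a different value on every run; copy it into your config.
    print(generate_secret())
```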

🐈🐈🐈🐈
