
Demo on how to develop data science projects for production deployments

spkelle2/diet

diet

This repository demonstrates how to develop data science projects for production deployment. The concepts shown here are an abridgment of Pete Cacioppi's book Tidy, Tested, Safe, combined with DevOps best practices I've picked up over the years. Together, they include:

  • Enforcing data science model input data integrity
  • Writing unit tests
  • Writing assert statements
  • Using version control
  • Automating installation for:
    • Running in development
    • Running as a Conda package
    • Running behind a web service in a Docker container
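As a rough illustration of the first three items, the sketch below separates checks on user-supplied input data (reported back as a list of failures) from assert statements that guard the developer's own assumptions. The function names, table layout, and field names are hypothetical and are not taken from this repository's code.

```python
# Hypothetical input tables for a diet model. The names and layout are
# illustrative assumptions, not this repository's actual schema.
def find_integrity_failures(foods, nutrients, quantities):
    """Return a list of readable input data problems (empty if the data is clean)."""
    failures = []
    for name, cost in foods.items():
        if not isinstance(cost, (int, float)) or cost < 0:
            failures.append(f"food {name!r} has invalid cost {cost!r}")
    for (food, nutrient), qty in quantities.items():
        if food not in foods:
            failures.append(f"quantity row references unknown food {food!r}")
        if nutrient not in nutrients:
            failures.append(f"quantity row references unknown nutrient {nutrient!r}")
        if not isinstance(qty, (int, float)) or qty < 0:
            failures.append(f"quantity for {(food, nutrient)!r} is invalid: {qty!r}")
    return failures


def total_cost(foods, amounts):
    """Compute the cost of a purchase plan, asserting the developer's assumptions."""
    # Asserts document conditions that should already hold once the input
    # data has passed find_integrity_failures.
    assert set(amounts) <= set(foods), "amounts must only contain known foods"
    cost = sum(foods[food] * amount for food, amount in amounts.items())
    assert cost >= 0, "cost cannot be negative for non-negative inputs"
    return cost
```

Bad input data is the user's problem and is reported rather than raised, while a failed assert signals a bug in the code itself.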

For more information, see the companion Medium article.

Running in Development

If Conda is not already installed on your machine, install it first. Then, run the following commands from this project's root directory:

conda env create -f environment.yml
conda activate diet

To confirm that the environment was created correctly, run:

coverage run -m unittest discover
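By default, unittest discovery collects test cases from files whose names match test*.py. The sketch below shows the kind of test case it would pick up. The `calories` function and the test class are hypothetical stand-ins, not tests from this repository.

```python
import unittest


def calories(servings, calories_per_serving):
    """Hypothetical function under test: total calories for a number of servings."""
    return servings * calories_per_serving


class TestCalories(unittest.TestCase):
    def test_scales_with_servings(self):
        self.assertEqual(calories(2, 150), 300)

    def test_zero_servings(self):
        self.assertEqual(calories(0, 150), 0)
```

Saved as a file such as test_calories.py (an assumed name), these tests would run as part of the command above.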

Running as a Conda Package

To run as a package, first install Conda as described above if you haven't already. Then, run the following command to download and install the package:

conda install spkelle2::diet

To confirm that the package installed correctly, run:

python -m diet -i /path/to/diet_sample_data -o /path/to/diet_sample_solution
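For readers curious how a `python -m` entry point with `-i` and `-o` flags can be wired up, here is a minimal sketch using argparse. It only assumes the two short flags shown in the command above. The long option names, help text, and function name are hypothetical, and the package's real entry point may be built differently.

```python
import argparse


def parse_args(argv=None):
    """Parse input and output paths (a sketch, not the package's actual parser)."""
    parser = argparse.ArgumentParser(prog="diet")
    # The long option names below are assumptions made for illustration.
    parser.add_argument("-i", "--input", required=True,
                        help="path to the input data")
    parser.add_argument("-o", "--output", required=True,
                        help="path where the solution is written")
    return parser.parse_args(argv)
```

Placing a call to a function like this in a package's `__main__.py` is what makes the `python -m` invocation style work.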

Running Behind a Web Service in a Docker Container

To run as a Docker container, run the following commands from this project's root directory:

docker build --no-cache -t diet:1.0.0 .
docker run -p 8080:8080 diet:1.0.0

To confirm the container works as expected, run:

curl -X POST -H "content-type:application/json" -d @examples/diet_sample_data.json http://localhost:8080
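The same request can be sent from Python using only the standard library. This is a sketch that mirrors the curl command above. The function names are hypothetical, and since the shape of the service's response is not described here, the response body is returned as raw text.

```python
import json
import urllib.request


def build_request(payload, url="http://localhost:8080"):
    """Build a JSON POST request equivalent to the curl command."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(payload, url="http://localhost:8080"):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(build_request(payload, url)) as response:
        return response.read().decode("utf-8")
```

With the container running, loading examples/diet_sample_data.json with `json.load` and passing the result to `post_json` should reproduce the curl call.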
