Skip to content

doordash-oss/DataQualityReport

Repository files navigation

DataQualityReport

DataQualityReport (DQR) endeavors to quickly, both in machine and human attention time, make users aware of potential data quality issues to be addressed in Pandas Dataframes. It is a priority that generating diagnostics should be FAST (<1 minute), such that users are encouraged to use this utility without interuption to interactive development.

Usage

Installation

pip install dataqualityreport

Quick Start

from dataqualityreport import dqr_table
dqr_table(my_df) # supply your own dataframe: my_df

Example Table

Image of DQR Table

Learn More

View the tutorial.

Developing / Contributing

DataQualityReport uses Poetry for python library management.

All contributors must agree to the Contribution License Agreement.

# To create a venv with required dependencies
poetry install

This will use the existing poetry.lock file to know what dependencies (including versions) should be installed.

# run a command within the venv
poetry run ipython

# open a shell using the venv
poetry shell

A list of dependencies can be found in /pyproject.toml

# add a library using the CLI
poetry add matplotib

# To update the `poetry.lock` file based on new dependencies:
poetry update

Code Style and Standards

# Includes mypy, pydocstring, etc.
make lint

# automatically run on every `git commit`
poetry run pre-commit install

# Tests
make unittest

About

No description, website, or topics provided.

Resources

License

Code of conduct

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published