Issue 159 new evaluation #160
Merged
Commits: 13 (this view shows changes from 11 commits)
7db0420 Update makefile with new test types (csala)
2ab600e Add github actions (csala)
b56b9c4 Install graphviz (csala)
81bc8b8 Add graphviz to travis (csala)
cebfc7d Install graphviz (csala)
e4d8814 Replace old evaluation with new sdmetrics (csala)
895e864 Install dev version of sdmetrics (csala)
7de1555 Minor fixes for new demo datasets (csala)
b1c6292 Document evaluation (csala)
7773ed6 Fix tests (csala)
f3861b2 Update links (csala)
0d836e9 Typos (csala)
b9ebc20 Update sdmetrics version (csala)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
name: Generate Docs | ||
|
||
on: | ||
push: | ||
branches: [ master ] | ||
|
||
jobs: | ||
|
||
docs: | ||
runs-on: ubuntu-latest | ||
steps: | ||
- uses: actions/checkout@v2 | ||
|
||
- name: Python | ||
uses: actions/setup-python@v1 | ||
with: | ||
python-version: '3.7' | ||
|
||
- name: Build | ||
run: | | ||
python -m pip install --upgrade pip | ||
pip install -e .[dev] | ||
make docs | ||
- name: Deploy | ||
uses: peaceiris/actions-gh-pages@v3 | ||
with: | ||
github_token: ${{secrets.GITHUB_TOKEN}} | ||
publish_dir: docs/_build/html |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
name: Run Tests | ||
|
||
on: | ||
push: | ||
branches: [ '*' ] | ||
pull_request: | ||
branches: [ master ] | ||
|
||
jobs: | ||
build: | ||
runs-on: ${{ matrix.os }} | ||
strategy: | ||
matrix: | ||
python-version: [3.5, 3.6, 3.7] | ||
os: [ubuntu-latest, macos-latest] | ||
|
||
steps: | ||
- uses: actions/checkout@v1 | ||
- name: Set up Python ${{ matrix.python-version }} | ||
uses: actions/setup-python@v1 | ||
with: | ||
python-version: ${{ matrix.python-version }} | ||
|
||
- if: matrix.os == 'ubuntu-latest' | ||
name: Install graphviz - Ubuntu | ||
run: | | ||
sudo apt-get install graphviz | ||
|
||
- if: matrix.os == 'macos-latest' | ||
name: Install graphviz - MacOS | ||
run: | | ||
brew install graphviz | ||
|
||
- name: Install dependencies | ||
run: | | ||
python -m pip install --upgrade pip | ||
pip install tox tox-gh-actions | ||
|
||
- name: Test with tox | ||
run: tox |
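Note on the test workflow above: the `strategy.matrix` block is what turns one job definition into several. The following is a minimal Python sketch of that expansion, assuming the usual behaviour of one job per combination of matrix values. The `jobs` list and the `GRAPHVIZ_COMMANDS` mapping are illustrative names only and are not part of the workflow file.

```python
from itertools import product

# Values copied from the `strategy.matrix` section of the workflow above.
python_versions = ["3.5", "3.6", "3.7"]
operating_systems = ["ubuntu-latest", "macos-latest"]

# One job is scheduled per (python-version, os) combination.
jobs = [
    {"python-version": version, "os": os_name}
    for version, os_name in product(python_versions, operating_systems)
]

# Mirrors the two `if: matrix.os == ...` steps: each job runs only the
# graphviz install command that matches its operating system.
GRAPHVIZ_COMMANDS = {
    "ubuntu-latest": "sudo apt-get install graphviz",
    "macos-latest": "brew install graphviz",
}

for job in jobs:
    job["graphviz"] = GRAPHVIZ_COMMANDS[job["os"]]
    print(job["python-version"], job["os"], "->", job["graphviz"])
```

With three Python versions and two operating systems, this yields six jobs, each of which then runs `tox` once.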
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,32 +1,18 @@ | ||
# Config file for automatic testing at travis-ci.org | ||
dist: trusty | ||
dist: bionic | ||
language: python | ||
python: | ||
- 3.7 | ||
- 3.6 | ||
- 3.5 | ||
|
||
matrix: | ||
include: | ||
- python: 3.7 | ||
dist: xenial | ||
sudo: required | ||
|
||
# Command to install dependencies | ||
install: pip install -U tox-travis codecov | ||
install: | ||
- sudo apt-get update | ||
- sudo apt-get install graphviz | ||
- pip install -U tox-travis codecov | ||
|
||
after_success: codecov | ||
|
||
# Command to run tests | ||
script: tox | ||
|
||
deploy: | ||
|
||
- provider: pages | ||
skip-cleanup: true | ||
github-token: "$GITHUB_TOKEN" | ||
keep-history: true | ||
local-dir: docs/_build/html | ||
target-branch: gh-pages | ||
on: | ||
branch: master | ||
python: 3.6 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
# SDV Evaluation | ||
|
||
After using SDV to model your database and generate a synthetic version of it you | ||
might want to evaluate how similar the syntehtic data is to your real data. | ||
|
||
SDV has an evaluation module with a simple function that allows you to compare | ||
the syntehtic data to your real data using [SDMetrics](https://github.com/sdv-dev/SDMetrics) and | ||
Review comment: typo: syntehtic -> synthetic |
||
generate a simple standardized score. | ||
|
||
## Evaluating your synthetic data | ||
|
||
After you have modeled your database and generated samples out of the SDV models | ||
you will be left with a dictionary that contains table names and dataframes. | ||
|
||
For example, if we model and sample the demo dataset: | ||
|
||
```python3 | ||
from sdv import SDV | ||
from sdv.demo import load_demo | ||
|
||
metadata, tables = load_demo(metadata=True) | ||
|
||
sdv = SDV() | ||
sdv.fit(metadata, tables) | ||
|
||
samples = sdv.sample_all(10) | ||
``` | ||
|
||
`samples` will contain a dictionary with three tables, just like the `tables` dict. | ||
|
||
|
||
At this point, you can evaluate how similar the two sets of tables are by using the | ||
`sdv.evaluation.evaluate` function as follows: | ||
|
||
``` | ||
from sdv.evaluation import evaluate | ||
|
||
score = evaluate(samples, tables, metadata) | ||
``` | ||
|
||
The output will be a maximization score that indicates how good the modeling was: | ||
the higher the value, the more similar the two sets of tables are. Notice that in most cases | ||
the value will be negative. | ||
|
||
For further options, including visualizations and more detailed reports, please refer to | ||
the [SDMetrics](https://github.com/sdv-dev/SDMetrics) library. |
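Note on the documented score: because it is a maximization score that is usually negative, the better result is the one closest to zero from below, not the one with the largest magnitude. The following is a minimal sketch of that comparison. The score values and the `samples_a` / `samples_b` names are made up for illustration and are not real `evaluate` output.

```python
# Hypothetical scores, as if returned by separate `evaluate(...)` calls
# for two sets of synthetic tables. The numbers are invented for illustration.
scores = {
    "samples_a": -1.20,
    "samples_b": -0.35,
}

# A maximization score: the largest value wins, even when every value is negative.
best = max(scores, key=scores.get)
print(best)
```

Here `samples_b` would be preferred, since -0.35 is greater than -1.20.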