splink_comparison_viewer

Understanding the tool

There's a tutorial video available here.

Usage

To generate a dashboard:

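# Score the pairwise record comparisons with Splink.
# settings_obj, df and spark are assumed to be defined already.
from splink import Splink
linker = Splink(settings_obj.settings_dict, df, spark)
df_e = linker.get_scored_comparisons()

# Build the dashboard data from the scored comparisons, then write it to out.html.
from splink_comparison_viewer import get_vis_data, render_html_vis
settings_dict = linker.model.current_settings_obj.settings_dict
edges_data = get_vis_data(df_e, settings_dict, 3)
render_html_vis(edges_data, settings_dict, "out.html", True)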

For a large df_e, it is probably worth saving it to disk and reading it back before passing it to get_vis_data(), so that Spark does not have to recompute the scored comparisons.
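
One way to save df_e out and read it back is sketched below. It assumes df_e is a Spark DataFrame and spark is the active SparkSession; the helper name persist_and_reload and the Parquet path are hypothetical and not part of splink or splink_comparison_viewer.

```python
def persist_and_reload(df, spark, path):
    """Write a Spark DataFrame to Parquet, then read it back from disk.

    The reloaded DataFrame is backed by the saved files, so later
    actions read from disk rather than re-running the scoring job.
    """
    # Overwrite any previous output at this (hypothetical) path.
    df.write.mode("overwrite").parquet(path)
    return spark.read.parquet(path)
```

It could then be used as df_e = persist_and_reload(df_e, spark, "df_e.parquet") before calling get_vis_data(). Parquet is only one choice here; any format Spark can write and read back would serve the same purpose.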

For a very large df_e with many distinct comparison vector patterns (more than about 20k), you may want to filter edges_data before passing it to render_html_vis, e.g. by removing entries with low counts.
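
A minimal sketch of such a filter follows. It rests on assumptions that should be checked against the actual output of get_vis_data(): that edges_data is a list of dicts, and that each dict stores its count under a key named "count". The helper drop_low_count_patterns and the key name are hypothetical.

```python
def drop_low_count_patterns(edges_data, min_count, count_key="count"):
    """Keep only the entries whose count is at least min_count.

    Assumes edges_data is a list of dicts and that count_key names the
    field holding each entry's count; both are assumptions.
    """
    return [row for row in edges_data if row[count_key] >= min_count]
```

For instance, edges_data = drop_low_count_patterns(edges_data, min_count=5) would discard entries seen fewer than five times before rendering. The threshold is arbitrary and would need tuning to the data.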

Example

Example output here
