Drive data quality metrics from enso code #11638

Merged
merged 15 commits into from
Nov 25, 2024
@@ -13,8 +13,7 @@ export class TableVisualisationTooltip implements ITooltipComp {
*/
init(
params: ITooltipParams & {
- numberOfNothing: number
- numberOfWhitespace: number
+ dataQualityMetrics: Record<string, number>[]
total: number
showDataQuality: boolean
},
@@ -32,7 +31,6 @@ export class TableVisualisationTooltip implements ITooltipComp {
})

const getPercentage = (value: number) => ((value / params.total) * 100).toFixed(2)
- const getDisplay = (value: number) => (value > 0 ? 'block' : 'none')
const createIndicator = (value: number) => {
const color =
value < 33 ? 'green'
@@ -41,20 +39,29 @@ export class TableVisualisationTooltip implements ITooltipComp {
return `<div style="display: inline-block; width: 10px; height: 10px; border-radius: 50%; background-color: ${color}; margin-left: 5px;"></div>`
}

- const dataQualityTemplate = `
- <div style="display: ${getDisplay(params.numberOfNothing)};">
- Nulls/Nothing: ${getPercentage(params.numberOfNothing)}% ${createIndicator(+getPercentage(params.numberOfWhitespace))}
- </div>
- <div style="display: ${getDisplay(params.numberOfWhitespace)};">
- Trailing/Leading Whitespace: ${getPercentage(params.numberOfWhitespace)}% ${createIndicator(+getPercentage(params.numberOfWhitespace))}
- </div>
- `
+ const getDataQualityTemplate = () => {
+ let template = ''
+ params.dataQualityMetrics.forEach((obj) => {
+ const key = Object.keys(obj)[0]
+ const value = key ? obj[key] : null
+ if (key && value) {
Contributor

value truthiness check here seems to make getDisplay redundant.

Member

Can we get an else branch that does some console.log telling us that a metric format is malformed?

This was one of the problems with debugging visualizations - there are similar situations where it just works or just doesn't work but doesn't tell us why. Let's try to log these bad conditions so that we can actually figure out what is wrong. IMHO every if that checks a required condition should have an else branch logging that it is violated.

+ const metricTemplate = `<div>
+ ${key}: ${getPercentage(value)}% ${createIndicator(+getPercentage(value))}
+ </div>`
+ template = template + metricTemplate
+ } else {
+ console.warn(
+ 'Data quality metric is missing a valid key-value pair. Ensure each object in data_quality_pairs contains a single valid key with a numeric value.',
+ )
+ }
+ })
+ return template
+ }

this.eGui.innerHTML = `
<div><b>Column value type:</b> ${params.value}</div>
<div style="display: ${params.showDataQuality ? 'block' : 'none'};"">
- <b>Data Quality Indicators</b>
Contributor

Removed intentionally?

Member

Yes - we didn't see the point of having the header in the tooltip.

- ${dataQualityTemplate}
+ ${getDataQualityTemplate()}
</div>
`
}
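
The review thread above asks for malformed metrics to be logged. As far as can be read from the hunk, the merged check `if (key && value)` also sends a value of 0 to the `else` branch, so a column with no issues would trigger the warning too. The following is a minimal sketch, not the PR's implementation, that separates "zero" from "malformed". The names `MetricEntry`, `RenderResult` and `renderMetrics` are hypothetical, warnings are collected in an array instead of being passed to `console.warn` so they can be inspected, and the colored indicator is left out for brevity.

```typescript
// Hypothetical helper types; only the Record<string, number>[] shape comes from the diff.
type MetricEntry = Record<string, number>

interface RenderResult {
  html: string
  warnings: string[]
}

function renderMetrics(metrics: MetricEntry[], total: number): RenderResult {
  const warnings: string[] = []
  let html = ''
  for (const entry of metrics) {
    const [key] = Object.keys(entry)
    const value = key != null ? entry[key] : undefined
    // Malformed: no key at all, or a value that is not a usable number.
    if (key == null || typeof value !== 'number' || Number.isNaN(value)) {
      warnings.push(`Malformed data quality metric: ${JSON.stringify(entry)}`)
      continue
    }
    // A count of zero is valid data, so it is skipped silently instead of being logged.
    if (value === 0 || total <= 0) continue
    const percentage = ((value / total) * 100).toFixed(2)
    html += `<div>${key}: ${percentage}%</div>`
  }
  return { html, warnings }
}
```

A caller could forward each entry of `warnings` to `console.warn`, which keeps the rendering logic testable without a browser console.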
@@ -81,12 +81,12 @@ interface UnknownTable {
get_child_node_action: string
get_child_node_link_name: string
link_value_type: string
- data_quality_pairs?: DataQualityPairs
+ data_quality_metrics?: DataQualityMetric[]
}

- interface DataQualityPairs {
- number_of_nothing: number[]
- number_of_whitespace: number[]
+ type DataQualityMetric = {
+ name: string
+ percentage_value: number[]
}

export type TextFormatOptions = 'full' | 'partial' | 'off'
@@ -356,23 +356,15 @@ function toField(
const displayValue = valueType ? valueType.display_text : null
const icon = valueType ? getValueTypeIcon(valueType.constructor) : null

- const dataQuality =
- typeof props.data === 'object' && 'data_quality_pairs' in props.data ?
- props.data.data_quality_pairs
- // eslint-disable-next-line camelcase
- : { number_of_nothing: [], number_of_whitespace: [] }
-
- const nothingIsNonZero =
- index != null && dataQuality?.number_of_nothing ?
- (dataQuality.number_of_nothing[index] ?? 0) > 0
- : false
-
- const whitespaceIsNonZero =
- index != null && dataQuality?.number_of_nothing ?
- (dataQuality.number_of_whitespace[index] ?? 0) > 0
- : false
+ const dataQualityMetrics =
+ typeof props.data === 'object' && 'data_quality_metrics' in props.data ?
+ props.data.data_quality_metrics.map((metric: DataQualityMetric) => {
+ return { [metric.name]: metric.percentage_value[index!] ?? 0 }
+ })
+ : []
Contributor

Log an error in this case as well? (Assuming it is a malformed data error...)

Member

It's not an error, just no metrics sent.


- const showDataQuality = nothingIsNonZero || whitespaceIsNonZero
+ const showDataQuality =
+ dataQualityMetrics.filter((obj) => (Object.values(obj)[0] as number) > 0).length > 0

const getSvgTemplate = (icon: string) =>
`<svg viewBox="0 0 16 16" width="16" height="16"> <use xlink:href="${icons}#${icon}"/> </svg>`
@@ -401,8 +393,7 @@ function toField(
tooltipComponent: TableVisualisationTooltip,
headerTooltip: displayValue ? displayValue : '',
tooltipComponentParams: {
- numberOfNothing: index != null ? dataQuality.number_of_nothing[index] : null,
- numberOfWhitespace: index != null ? dataQuality.number_of_whitespace[index] : null,
+ dataQualityMetrics,
total: typeof props.data === 'object' ? props.data.all_rows_count : 0,
showDataQuality,
},
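
The `toField` change turns the per-metric arrays into one single-key object per metric for the current column, then shows the tooltip section only if at least one value is above zero. A rough standalone sketch of that mapping follows. The helper names `metricsForColumn` and `hasNonZeroMetric` are hypothetical, and two things are assumptions rather than behavior taken from the PR: the index is checked explicitly instead of using the `index!` assertion, and `percentage_value` is allowed to be `null`, since the Enso side assigns `Nothing` for database tables and how that arrives on the front end is not shown here.

```typescript
// Mirrors the DataQualityMetric type from the diff, with `| null` added as an assumption.
type DataQualityMetric = { name: string; percentage_value: number[] | null }

// Builds one { [name]: value } object per metric for the given column index.
function metricsForColumn(
  metrics: DataQualityMetric[] | undefined,
  index: number | null,
): Record<string, number>[] {
  if (metrics == null || index == null) return []
  return metrics.map((metric) => ({
    [metric.name]: metric.percentage_value?.[index] ?? 0,
  }))
}

// True when at least one metric for the column is greater than zero.
function hasNonZeroMetric(perColumn: Record<string, number>[]): boolean {
  return perColumn.some((entry) => Object.values(entry).some((value) => value > 0))
}
```

Using `some` gives the same result as the `filter(...).length > 0` expression in the hunk, but stops at the first non-zero value.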
@@ -199,8 +199,10 @@ make_json_for_table dataframe all_rows_count include_index_col is_db_table =
links = ["get_child_node_action", "get_row"]
number_of_nothing = if is_db_table then Nothing else columns.map c-> c.count_nothing
number_of_whitespace= if is_db_table then Nothing else columns.map c-> whitespace_count c
- data_quality_pairs = JS_Object.from_pairs [["number_of_nothing", number_of_nothing], ["number_of_whitespace", number_of_whitespace]]
- pairs = [header, value_type, data, all_rows, has_index_col, links, ["data_quality_pairs", data_quality_pairs] ,["type", "Table"]]
+ nothing_p = JS_Object.from_pairs [["name", "Number of nothings"], ["percentage_value", number_of_nothing]]
+ whitespace_p = JS_Object.from_pairs [["name", "Number of untrimmed whitespace"], ["percentage_value",number_of_whitespace]]
+ data_quality_metrics = [nothing_p, whitespace_p]
+ pairs = [header, value_type, data, all_rows, has_index_col, links, ["data_quality_metrics", data_quality_metrics] ,["type", "Table"]]
JS_Object.from_pairs pairs

## PRIVATE
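
On the Enso side, the payload now carries a list of `{name, percentage_value}` objects instead of one object with fixed keys, so new metrics can be added without touching the front end. Despite the field name, the values in this hunk are per-column counts (`count_nothing`, `whitespace_count`), and the tooltip divides them by the total row count. The sketch below shows what the relevant JSON fragment might look like and how a consumer could check its shape. The numbers are made up, `isDataQualityMetric` is a hypothetical helper, and treating `null` as a valid `percentage_value` is an assumption about how `Nothing` is serialized for database tables.

```typescript
type DataQualityMetric = { name: string; percentage_value: number[] | null }

// Example fragment for a hypothetical three-column table; other payload fields are omitted.
const examplePayload = {
  type: 'Table',
  data_quality_metrics: [
    { name: 'Number of nothings', percentage_value: [0, 2, 1] },
    { name: 'Number of untrimmed whitespace', percentage_value: [1, 0, 0] },
  ],
}

// Type guard: accepts objects with a string name and a number array (or null) of values.
function isDataQualityMetric(x: unknown): x is DataQualityMetric {
  if (typeof x !== 'object' || x === null) return false
  const candidate = x as Record<string, unknown>
  const values = candidate['percentage_value']
  return (
    typeof candidate['name'] === 'string' &&
    (values === null ||
      (Array.isArray(values) && values.every((v) => typeof v === 'number')))
  )
}
```

An object in the old `data_quality_pairs` shape, such as `{ number_of_nothing: [1] }`, fails this guard, which makes a stale payload easy to spot.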
8 changes: 4 additions & 4 deletions test/Visualization_Tests/src/Table_Spec.enso
@@ -54,10 +54,10 @@ add_specs suite_builder =
p_value_type = ["value_type", value_type]
p_has_index_col = ["has_index_col", has_index_col]
p_get_child_node = ["get_child_node_action", get_child_node]
- p_number_of_nothing = ["number_of_nothing", number_of_nothing]
- p_number_of_whitespace = ["number_of_whitespace", number_of_whitespace]
- data_quality_pairs = JS_Object.from_pairs [p_number_of_nothing, p_number_of_whitespace]
- pairs = [p_header, p_value_type, p_data, p_all_rows, p_has_index_col, p_get_child_node, ["data_quality_pairs", data_quality_pairs], ["type", "Table"]]
+ p_number_of_nothing = JS_Object.from_pairs [["name", "Number of nothings"], ["percentage_value", number_of_nothing]]
+ p_number_of_whitespace = JS_Object.from_pairs [["name", "Number of untrimmed whitespace"], ["percentage_value", number_of_whitespace]]
+ data_quality_metrics = [p_number_of_nothing, p_number_of_whitespace]
+ pairs = [p_header, p_value_type, p_data, p_all_rows, p_has_index_col, p_get_child_node, ["data_quality_metrics", data_quality_metrics], ["type", "Table"]]
JS_Object.from_pairs pairs . to_text

suite_builder.group "Table Visualization" group_builder->